* [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support
@ 2018-02-17 18:46 Eric Auger
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
                   ` (14 more replies)
  0 siblings, 15 replies; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

This series implements the emulation code for ARM SMMUv3.

SMMUv3 gets instantiated by adding ",iommu=smmuv3" to the virt
machine option.

VHOST integration will be handled in a separate series. VFIO
integration is not targeted at the moment. Only stage 1 and
AArch64 PTW are supported.

Main changes since v8:
- fix mingw compilation (qemu/log.h)
- put gpl v2 license on all files to respect initial license
- change proto of smmu_ptw* to clarify inputs/outputs and
  prepare for iotlb emulation
- fix hash table lookup
- cleanup access type handling during ptw
- cleanup reset infra (parent_reset)
- replace some inline functions by macros
- fix some CMD fields
- increment cmdq cons only after cmd execution
- replace some remaining error_report by qemu_log_mask

Best Regards

Eric

This series can be found at:
v9: https://github.com/eauger/qemu/tree/v2.11.0-SMMU-v9
Previous version at:
v8: https://github.com/eauger/qemu/tree/v2.11.0-SMMU-v8

History:

v8 -> v9:
- see above description

v7 -> v8:
Took into account Peter's comments:
- revisit queue data structures
- use registerfields.h and got rid of reg array
- use dma_memory_read for all descriptor fetches
- got rid of page table walk for an iova range and
  implemented standard page table walk for single IOVA
- revisit event data structure
- report events in many more situations and pass the event
  handle all along the decode and ptw phases
- fix gerror/gerrorn computations
- completely got rid of stage2 decoding
- use a machine option for instantiation
- get rid of VFIO integration
- get rid of VHOST integration (this will be added in a
  second step together with TLB emulation)
- abort in case vhost/vfio notifiers get detected
- Tested migration
- fixed TTBR index computation (issue reported by Tomasz)

v6 -> v7:
- DPDK testpmd now running on guest with 2 assigned VFs
- Changed the instantiation method: add the following option to
  the QEMU command line
  -device smmu # for virtio/vhost use cases
  -device smmu,caching-mode # for vfio use cases (based on [1])
- split the series into smaller patches to allow the review
- the VFIO integration based on "tlbi-on-map" smmuv3 driver
  is isolated from the rest: last 2 patches, not for upstream.
  This is shipped for testing/bench until a better solution is found.
- Reworked permission flag checks and event generation

v5 -> v6:
- Rebase on 2.10 and IOMMUMemoryRegion
- add ACPI TLBI_ON_MAP support (VFIO integration also works in
  ACPI mode)
- fix block replay
- handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
  (goes along with TLBI_ON_MAP FW quirk)
- replay systematically unmaps the whole range first
- smmuv3_map_hook does not unmap anymore and the unmap is done
  before the replay
- add and use smmuv3_context_device_invalidate instead of
  blindly replaying everything

v4 -> v5:
- initial_level now part of SMMUTransCfg
- smmu_page_walk_64 takes into account the max input size
- implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
- smmuv3_translate: bug fix: don't walk on bypass
- smmu_update_qreg: fix PROD index update
- I did not yet address Peter's comments as the code is not mature enough
  to be split into sub patches.

v3 -> v4 [Eric]:
- page table walk rewritten to allow scan of the page table within a
  range of IOVA. This prepares for VFIO integration and replay.
- configuration parsing partially reworked.
- do not advertise unsupported/untested features: S2, S1 + S2, HYP,
  PRI, ATS, ..
- added ACPI table generation
- migrated to dynamic traces
- mingw compilation fix

v2 -> v3 [Eric]:
- rebased on 2.9
- mostly code and patch reorganization to ease the review process
- optional patches removed. They may be handled separately. I am currently
  working on ACPI enablement.
- optional instantiation of the smmu in mach-virt
- removed [2/9] (fdt functions) since not mandated
- start splitting main patch into base and derived object
- no new functional feature added

v1 -> v2 [Prem]:
- Adopted review comments from Eric Auger
        - Make SMMU_DPRINTF to internally call qemu_log
            (since translation requests are too many, we need control
             on the type of log we want)
        - SMMUTransCfg modified to suit simplicity
        - Change RegInfo to uint64 register array
        - Code cleanup
        - Test cleanups
- Reshuffled patches

v0 -> v1 [Prem]:
- As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
- Reworked register access/update logic
- Factored out translation code for
        - single point bug fix
        - sharing/removal in future
- (optional) Unit tests added, with PCI test device
        - S1 with 4k/64k, S1+S2 with 4k/64k
        - (S1 or S2) only can be verified by Linux 4.7 driver
        - (optional) Preliminary ACPI support

v0 [Prem]:
- Implements SMMUv3 spec 11.0
- Supported for PCIe devices,
- Command Queue and Event Queue supported
- LPAE only, S1 is supported and Tested, S2 not tested
- BE mode Translation not supported
- IRQ support (legacy, no MSI)
- Tested with DPDK and e1000


Eric Auger (11):
  hw/arm/smmu-common: smmu base device and datatypes
  hw/arm/smmu-common: IOMMU memory region and address space setup
  hw/arm/smmu-common: VMSAv8-64 page table walk
  hw/arm/smmuv3: Wired IRQ and GERROR helpers
  hw/arm/smmuv3: Queue helpers
  hw/arm/smmuv3: Implement MMIO write operations
  hw/arm/smmuv3: Event queue recording helper
  hw/arm/smmuv3: Implement translate callback
  hw/arm/smmuv3: Abort on vfio or vhost case
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  hw/arm/virt: Handle iommu in 2.12 machine type

Prem Mallappa (3):
  hw/arm/smmuv3: Skeleton
  hw/arm/virt: Add SMMUv3 to the virt board
  hw/arm/virt-acpi-build: Add smmuv3 node in IORT table

 default-configs/aarch64-softmmu.mak |    1 +
 hw/arm/Makefile.objs                |    1 +
 hw/arm/smmu-common.c                |  371 ++++++++++++
 hw/arm/smmu-internal.h              |   96 +++
 hw/arm/smmuv3-internal.h            |  611 +++++++++++++++++++
 hw/arm/smmuv3.c                     | 1118 +++++++++++++++++++++++++++++++++++
 hw/arm/trace-events                 |   36 ++
 hw/arm/virt-acpi-build.c            |   56 +-
 hw/arm/virt.c                       |  109 +++-
 include/hw/acpi/acpi-defs.h         |   15 +
 include/hw/arm/smmu-common.h        |  136 +++++
 include/hw/arm/smmuv3.h             |   91 +++
 include/hw/arm/virt.h               |   11 +
 target/arm/kvm.c                    |   27 +
 target/arm/trace-events             |    3 +
 15 files changed, 2674 insertions(+), 8 deletions(-)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 hw/arm/smmu-internal.h
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmu-common.h
 create mode 100644 include/hw/arm/smmuv3.h

-- 
2.5.5


* [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-06 12:09   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup Eric Auger
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

The patch introduces the smmu base device and class for the ARM
smmu. Devices for specific versions will be derived from this
base device.

We also introduce some important datatypes.
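
For readers unfamiliar with the pattern, the base/derived split described
above can be sketched in plain C, outside of QEMU's QOM machinery. This is
an illustration only: BaseClass, BaseDev, V3Dev, base_realize and v3_realize
are hypothetical names and are not part of this series, which relies on
TypeInfo, DeviceClass and the ARM_SMMU* cast macros instead.

```c
#include <assert.h>
#include <stddef.h>

typedef struct BaseDev BaseDev;
typedef void (*RealizeFn)(BaseDev *dev);

/* Hypothetical per-type data, shared by every instance of a type. */
typedef struct BaseClass {
    RealizeFn realize;          /* entry point invoked on the device */
    RealizeFn parent_realize;   /* saved so a derived realize can chain up */
} BaseClass;

/* Hypothetical base device state (stands in for SMMUState). */
struct BaseDev {
    const BaseClass *klass;
    int base_ready;
};

/* Hypothetical version-specific device derived from the base one. */
typedef struct V3Dev {
    BaseDev parent;             /* first member, so a V3Dev * is also a BaseDev * */
    int v3_ready;
} V3Dev;

/* Base realize: set up the state common to all versions. */
static void base_realize(BaseDev *dev)
{
    dev->base_ready = 1;
}

/* Derived realize: run the base step first, then the specific one. */
static void v3_realize(BaseDev *dev)
{
    V3Dev *v3 = (V3Dev *)dev;

    assert(dev->klass->parent_realize != NULL);
    dev->klass->parent_realize(dev);
    v3->v3_ready = 1;
}

static const BaseClass v3_class = {
    .realize = v3_realize,
    .parent_realize = base_realize,
};
```

Calling `d.parent.klass->realize(&d.parent)` on a V3Dev whose class is
v3_class runs the common setup first and the version-specific setup second.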

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>

---
v8 -> v9:
- remove page walk callback type from this patch (vhost related)
- add a new hash table for caching configuration data
- add reset function
- add asid

v7 -> v8:
- add bus_num property
- add primary-bus property
- add realize and remove instance_init
- rename TYPE and related macros to match naming convention used
  for GIC
- add SMMUPageTableWalkEventInfo
- tt[2] in translation config

v3 -> v4:
- added smmu_find_as_from_bus_num
- SMMU_PCI_BUS_MAX and SMMU_PCI_DEVFN_MAX in smmu-common header
- new fields in SMMUState:
  - iommu_ops, smmu_as_by_busptr, smmu_as_by_bus_num
- add aa64[] field in SMMUTransCfg

v3:
- moved the base code in a separate patch to ease the review.
- clearer separation between base class and smmuv3 class
- translate_* only implemented as class methods

Conflicts:
	default-configs/aarch64-softmmu.mak
---
 default-configs/aarch64-softmmu.mak |   1 +
 hw/arm/Makefile.objs                |   1 +
 hw/arm/smmu-common.c                |  80 +++++++++++++++++++++++
 include/hw/arm/smmu-common.h        | 124 ++++++++++++++++++++++++++++++++++++
 4 files changed, 206 insertions(+)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 include/hw/arm/smmu-common.h

diff --git a/default-configs/aarch64-softmmu.mak b/default-configs/aarch64-softmmu.mak
index 9ddccf8..6f790f0 100644
--- a/default-configs/aarch64-softmmu.mak
+++ b/default-configs/aarch64-softmmu.mak
@@ -8,3 +8,4 @@ CONFIG_DDC=y
 CONFIG_DPCD=y
 CONFIG_XLNX_ZYNQMP=y
 CONFIG_XLNX_ZYNQMP_ARM=y
+CONFIG_ARM_SMMUV3=y
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 1c896ba..c84c5ac 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -20,3 +20,4 @@ obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
 obj-$(CONFIG_MSF2) += msf2-soc.o msf2-som.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
new file mode 100644
index 0000000..86a5aab
--- /dev/null
+++ b/hw/arm/smmu-common.c
@@ -0,0 +1,80 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Author: Prem Mallappa <pmallapp@broadcom.com>
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "exec/target_page.h"
+#include "qom/cpu.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+
+static void smmu_base_realize(DeviceState *dev, Error **errp)
+{
+    SMMUState *s = ARM_SMMU(dev);
+
+    s->configs = g_hash_table_new_full(NULL, NULL, NULL, g_free);
+    s->iotlb = g_hash_table_new_full(NULL, NULL, NULL, g_free);
+}
+
+static void smmu_base_reset(DeviceState *dev)
+{
+    SMMUState *s = ARM_SMMU(dev);
+
+    g_hash_table_remove_all(s->configs);
+    g_hash_table_remove_all(s->iotlb);
+}
+
+static Property smmu_dev_properties[] = {
+    DEFINE_PROP_UINT8("bus_num", SMMUState, bus_num, 0),
+    DEFINE_PROP_LINK("primary-bus", SMMUState, primary_bus, "PCI", PCIBus *),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void smmu_base_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    SMMUBaseClass *sbc = ARM_SMMU_CLASS(klass);
+
+    dc->props = smmu_dev_properties;
+    sbc->parent_realize = dc->realize;
+    dc->realize = smmu_base_realize;
+    dc->reset = smmu_base_reset;
+}
+
+static const TypeInfo smmu_base_info = {
+    .name          = TYPE_ARM_SMMU,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(SMMUState),
+    .class_data    = NULL,
+    .class_size    = sizeof(SMMUBaseClass),
+    .class_init    = smmu_base_class_init,
+    .abstract      = true,
+};
+
+static void smmu_base_register_types(void)
+{
+    type_register_static(&smmu_base_info);
+}
+
+type_init(smmu_base_register_types)
+
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
new file mode 100644
index 0000000..8a9d931
--- /dev/null
+++ b/include/hw/arm/smmu-common.h
@@ -0,0 +1,124 @@
+/*
+ * ARM SMMU Support
+ *
+ * Copyright (C) 2015-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef HW_ARM_SMMU_COMMON_H
+#define HW_ARM_SMMU_COMMON_H
+
+#include <hw/sysbus.h>
+#include "hw/pci/pci.h"
+
+#define SMMU_PCI_BUS_MAX      256
+#define SMMU_PCI_DEVFN_MAX    256
+
+#define SMMU_MAX_VA_BITS      48
+
+/*
+ * Page table walk error types
+ */
+typedef enum {
+    SMMU_PTW_ERR_NONE,
+    SMMU_PTW_ERR_WALK_EABT,   /* Translation walk external abort */
+    SMMU_PTW_ERR_TRANSLATION, /* Translation fault */
+    SMMU_PTW_ERR_ADDR_SIZE,   /* Address Size fault */
+    SMMU_PTW_ERR_ACCESS,      /* Access fault */
+    SMMU_PTW_ERR_PERMISSION,  /* Permission fault */
+} SMMUPTWEventType;
+
+typedef struct SMMUPTWEventInfo {
+    SMMUPTWEventType type;
+    dma_addr_t addr; /* fetched address that induced an abort, if any */
+} SMMUPTWEventInfo;
+
+typedef struct SMMUTransTableInfo {
+    bool disabled;             /* is the translation table disabled? */
+    uint64_t ttb;              /* TT base address */
+    uint8_t tsz;               /* input range, ie. 2^(64 -tsz)*/
+    uint8_t granule_sz;        /* granule page shift */
+    uint8_t initial_level;     /* initial lookup level */
+} SMMUTransTableInfo;
+
+/*
+ * Generic structure populated by derived SMMU devices
+ * after decoding the configuration information and used as
+ * input to the page table walk
+ */
+typedef struct SMMUTransCfg {
+    int      stage;            /* translation stage */
+    bool     aa64;             /* aarch64 or aarch32 translation table */
+    bool     disabled;         /* smmu is disabled */
+    bool     bypassed;         /* translation is bypassed */
+    bool     aborted;          /* translation is aborted */
+    uint64_t ttb;              /* TT base address */
+    uint8_t oas;               /* output address width */
+    uint8_t  tbi;              /* Top Byte Ignore */
+    uint16_t asid;
+    SMMUTransTableInfo tt[2];
+} SMMUTransCfg;
+
+typedef struct SMMUDevice {
+    void               *smmu;
+    PCIBus             *bus;
+    int                devfn;
+    IOMMUMemoryRegion  iommu;
+    AddressSpace       as;
+} SMMUDevice;
+
+typedef struct SMMUNotifierNode {
+    SMMUDevice *sdev;
+    QLIST_ENTRY(SMMUNotifierNode) next;
+} SMMUNotifierNode;
+
+typedef struct SMMUPciBus {
+    PCIBus       *bus;
+    SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
+} SMMUPciBus;
+
+typedef struct SMMUState {
+    /* <private> */
+    SysBusDevice  dev;
+    char *mrtypename;
+    MemoryRegion iomem;
+
+    GHashTable *smmu_as_by_busptr;
+    GHashTable *configs; /* cache for configuration data */
+    GHashTable *iotlb;
+    SMMUPciBus *smmu_as_by_bus_num[SMMU_PCI_BUS_MAX];
+    PCIBus *pci_bus;
+    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
+    uint8_t bus_num;
+    PCIBus *primary_bus;
+} SMMUState;
+
+typedef struct {
+    /* <private> */
+    SysBusDeviceClass parent_class;
+
+    /*< public >*/
+
+    DeviceRealize parent_realize;
+
+} SMMUBaseClass;
+
+#define TYPE_ARM_SMMU "arm-smmu"
+#define ARM_SMMU(obj) OBJECT_CHECK(SMMUState, (obj), TYPE_ARM_SMMU)
+#define ARM_SMMU_CLASS(klass)                                    \
+    OBJECT_CLASS_CHECK(SMMUBaseClass, (klass), TYPE_ARM_SMMU)
+#define ARM_SMMU_GET_CLASS(obj)                              \
+    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_ARM_SMMU)
+
+#endif  /* HW_ARM_SMMU_COMMON */
-- 
2.5.5


* [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-06 14:08   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk Eric Auger
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

We enumerate all the PCI devices attached to the SMMU and
initialize an associated IOMMU memory region and address space.
This happens on SMMU base instance init.

This information is stored in SMMUDevice objects. The devices are
grouped according to the PCIBus they belong to. A hash table
indexed by the PCIBus pointer is used. Also, an array indexed by
the bus number allows finding the list of SMMUDevices.
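
The two lookup paths can be sketched in standalone C as follows. This is a
simplified illustration, not the code of this patch: Bus, Dev, BusEntry and
Smmu are hypothetical stand-ins for PCIBus, SMMUDevice, SMMUPciBus and
SMMUState, and a plain array replaces the GHashTable keyed by bus pointer.
stream_id() mirrors the bus/devfn packing done by smmu_get_sid() in the diff
below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_MAX   256   /* mirrors SMMU_PCI_BUS_MAX */
#define DEVFN_MAX 256   /* mirrors SMMU_PCI_DEVFN_MAX */

/* Hypothetical stand-ins for PCIBus, SMMUDevice and SMMUPciBus. */
typedef struct Bus { uint8_t num; } Bus;
typedef struct Dev { const Bus *bus; int devfn; } Dev;
typedef struct BusEntry {
    const Bus *bus;
    Dev *devs[DEVFN_MAX];       /* sparse, filled on first use */
} BusEntry;

typedef struct Smmu {
    BusEntry *by_ptr[BUS_MAX];  /* stand-in for the table keyed by bus pointer */
    size_t n_entries;
    BusEntry *by_num[BUS_MAX];  /* cache indexed by bus number */
} Smmu;

/* Register a bus entry (plays the role of the hash table insert). */
static void smmu_add_bus(Smmu *s, BusEntry *e)
{
    assert(s->n_entries < BUS_MAX);
    s->by_ptr[s->n_entries++] = e;
}

/* Look up by bus number: try the array first, then scan and cache. */
static BusEntry *smmu_find_by_bus_num(Smmu *s, uint8_t bus_num)
{
    BusEntry *e = s->by_num[bus_num];

    if (e) {
        return e;
    }
    for (size_t i = 0; i < s->n_entries; i++) {
        if (s->by_ptr[i]->bus->num == bus_num) {
            s->by_num[bus_num] = s->by_ptr[i];
            return s->by_ptr[i];
        }
    }
    return NULL;
}

/* Stream ID: bus number in bits [15:8], devfn in bits [7:0]. */
static uint16_t stream_id(uint8_t bus_num, uint8_t devfn)
{
    return (uint16_t)((bus_num << 8) | devfn);
}
```

The array acts as a cache: the first lookup for a bus number falls back to
scanning the pointer-keyed table, and later lookups are a direct index.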

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v8 -> v9:
- fix key value for lookup

v7 -> v8:
- introduce SMMU_MAX_VA_BITS
- use PCI bus handle as a key
- do not clear s->smmu_as_by_bus_num
- use g_new0 instead of g_malloc0
- use primary_bus field
---
 hw/arm/smmu-common.c         | 59 ++++++++++++++++++++++++++++++++++++++++++++
 include/hw/arm/smmu-common.h |  6 +++++
 2 files changed, 65 insertions(+)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 86a5aab..d0516dc 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -28,12 +28,71 @@
 #include "qemu/error-report.h"
 #include "hw/arm/smmu-common.h"
 
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
+{
+    SMMUPciBus *smmu_pci_bus = s->smmu_as_by_bus_num[bus_num];
+
+    if (!smmu_pci_bus) {
+        GHashTableIter iter;
+
+        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
+        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
+            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
+                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
+                return smmu_pci_bus;
+            }
+        }
+    }
+    return smmu_pci_bus;
+}
+
+static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+{
+    SMMUState *s = opaque;
+    SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_as_by_busptr, bus);
+    SMMUDevice *sdev;
+
+    if (!sbus) {
+        sbus = g_malloc0(sizeof(SMMUPciBus) +
+                         sizeof(SMMUDevice *) * SMMU_PCI_DEVFN_MAX);
+        sbus->bus = bus;
+        g_hash_table_insert(s->smmu_as_by_busptr, bus, sbus);
+    }
+
+    sdev = sbus->pbdev[devfn];
+    if (!sdev) {
+        char *name = g_strdup_printf("%s-%d-%d",
+                                     s->mrtypename,
+                                     pci_bus_num(bus), devfn);
+        sdev = sbus->pbdev[devfn] = g_new0(SMMUDevice, 1);
+
+        sdev->smmu = s;
+        sdev->bus = bus;
+        sdev->devfn = devfn;
+
+        memory_region_init_iommu(&sdev->iommu, sizeof(sdev->iommu),
+                                 s->mrtypename,
+                                 OBJECT(s), name, 1ULL << SMMU_MAX_VA_BITS);
+        address_space_init(&sdev->as,
+                           MEMORY_REGION(&sdev->iommu), name);
+    }
+
+    return &sdev->as;
+}
+
 static void smmu_base_realize(DeviceState *dev, Error **errp)
 {
     SMMUState *s = ARM_SMMU(dev);
 
     s->configs = g_hash_table_new_full(NULL, NULL, NULL, g_free);
     s->iotlb = g_hash_table_new_full(NULL, NULL, NULL, g_free);
+    s->smmu_as_by_busptr = g_hash_table_new(NULL, NULL);
+
+    if (s->primary_bus) {
+        pci_setup_iommu(s->primary_bus, smmu_find_add_as, s);
+    } else {
+        error_setg(errp, "SMMU is not attached to any PCI bus!");
+    }
 }
 
 static void smmu_base_reset(DeviceState *dev)
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 8a9d931..aee96c2 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -121,4 +121,10 @@ typedef struct {
 #define ARM_SMMU_GET_CLASS(obj)                              \
     OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_ARM_SMMU)
 
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num);
+
+static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
+{
+    return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
+}
 #endif  /* HW_ARM_SMMU_COMMON */
-- 
2.5.5


* [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-06 19:43   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton Eric Auger
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

This patch implements the page table walk for VMSAv8-64.
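
As a rough guide to the arithmetic behind the walk, the standalone sketch
below shows how an IOVA is split into per-level table indexes and how the
output address is taken from a descriptor. It is written under assumptions:
descriptors are 8 bytes, so a table holds 2^(granule_sz - 3) entries, and
the output address occupies bits [47:shift]. The helper names (level_shift,
level_index, pte_address, level_size) are illustrative and may differ from
the macros in smmu-internal.h.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position of the IOVA field that indexes the table at this level. */
static inline int level_shift(int level, int granule_sz)
{
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/* Index of the descriptor to fetch in the table at this level. */
static inline uint64_t level_index(uint64_t iova, int level, int granule_sz)
{
    uint64_t entries_mask = (1ULL << (granule_sz - 3)) - 1;

    return (iova >> level_shift(level, granule_sz)) & entries_mask;
}

/* Output address bits [47:shift] of a descriptor, attribute bits dropped. */
static inline uint64_t pte_address(uint64_t pte, int shift)
{
    assert(shift >= 0 && shift < 48);
    uint64_t field = (pte >> shift) & ((1ULL << (48 - shift)) - 1);

    return field << shift;
}

/* Size of the region mapped by one descriptor at this level. */
static inline uint64_t level_size(int level, int granule_sz)
{
    return 1ULL << level_shift(level, granule_sz);
}
```

With a 4KB granule (granule_sz = 12) the shifts for levels 0 to 3 are 39,
30, 21 and 12. The level 1 and level 2 values match the block sizes used by
get_block_pte_address() in the diff below, as do 25 and 29 for level 2 with
16KB and 64KB granules.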

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v8 -> v9:
- remove guest error log on PTE fetch fault
- rename  trace functions
- fix smmu_page_walk_level_res_invalid_pte last arg
- fix PTE_ADDRESS
- turn functions into macros
- make sure to return the actual pte access permission
  into tlbe->perm
- change proto of smmu_ptw*

v7 -> v8:
- rework get_pte
- use LOG_LEVEL_ERROR
- remove error checking in get_block_pte_address
- page table walk simplified (no VFIO replay anymore)
- handle PTW error events
- use dma_memory_read

v6 -> v7:
- fix wrong error handling in walk_page_table
- check perm in smmu_translate

v5 -> v6:
- use IOMMUMemoryRegion
- remove initial_lookup_level()
- fix block replay

v4 -> v5:
- add initial level in translation config
- implement block pte
- rename must_translate into nofail
- introduce call_entry_hook
- small changes to dynamic traces
- smmu_page_walk code moved from smmuv3.c to this file
- remove smmu_translate*

v3 -> v4:
- reworked page table walk to prepare for VFIO integration
  (capability to scan a range of IOVA). Same function is used
  to translate a single iova. This is largely inspired
  by intel_iommu.c
- as the translate function was not straightforward to me,
  I tried to stick more closely to the VMSA spec.
- remove support of nested stage (kernel driver does not
  support it anyway)
- use error_report and trace events
- add aa64[] field in SMMUTransCfg
---
 hw/arm/smmu-common.c         | 232 +++++++++++++++++++++++++++++++++++++++++++
 hw/arm/smmu-internal.h       |  96 ++++++++++++++++++
 hw/arm/trace-events          |  10 ++
 include/hw/arm/smmu-common.h |   6 ++
 4 files changed, 344 insertions(+)
 create mode 100644 hw/arm/smmu-internal.h

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index d0516dc..24cc4ba 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -27,6 +27,238 @@
 
 #include "qemu/error-report.h"
 #include "hw/arm/smmu-common.h"
+#include "smmu-internal.h"
+
+/* VMSAv8-64 Translation */
+
+/**
+ * get_pte - Get the content of a page table entry located at
+ * @base_addr[@index]
+ */
+static int get_pte(dma_addr_t baseaddr, uint32_t index, uint64_t *pte,
+                   SMMUPTWEventInfo *info)
+{
+    int ret;
+    dma_addr_t addr = baseaddr + index * sizeof(*pte);
+
+    ret = dma_memory_read(&address_space_memory, addr,
+                          (uint8_t *)pte, sizeof(*pte));
+
+    if (ret != MEMTX_OK) {
+        info->type = SMMU_PTW_ERR_WALK_EABT;
+        info->addr = addr;
+        return -EINVAL;
+    }
+    trace_smmu_get_pte(baseaddr, index, addr, *pte);
+    return 0;
+}
+
+/* VMSAv8-64 Translation Table Format Descriptor Decoding */
+
+/**
+ * get_page_pte_address - returns the L3 descriptor output address,
+ * ie. the page frame
+ * ARM ARM spec: Figure D4-17 VMSAv8-64 level 3 descriptor format
+ */
+static inline hwaddr get_page_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_table_pte_address - return table descriptor output address,
+ * ie. address of next level table
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_block_pte_address - return block descriptor output address and block size
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static hwaddr get_block_pte_address(uint64_t pte, int level, int granule_sz,
+                                    uint64_t *bsz)
+{
+    int n = 0;
+
+    switch (granule_sz) {
+    case 12:
+        if (level == 1) {
+            n = 30;
+        } else if (level == 2) {
+            n = 21;
+        }
+        break;
+    case 14:
+        if (level == 2) {
+            n = 25;
+        }
+        break;
+    case 16:
+        if (level == 2) {
+            n = 29;
+        }
+        break;
+    }
+    if (!n) {
+        error_setg(&error_fatal,
+                   "wrong granule/level combination (%d/%d)",
+                   granule_sz, level);
+    }
+    *bsz = 1 << n;
+    return PTE_ADDRESS(pte, n);
+}
+
+static inline bool check_perm(int access_attrs, int mem_attrs)
+{
+    if (((access_attrs & IOMMU_RO) && !(mem_attrs & IOMMU_RO)) ||
+        ((access_attrs & IOMMU_WO) && !(mem_attrs & IOMMU_WO))) {
+        return false;
+    }
+    return true;
+}
+
+SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova)
+{
+    if (!extract64(iova, 64 - cfg->tt[0].tsz, cfg->tt[0].tsz - cfg->tbi)) {
+        return &cfg->tt[0];
+    }
+    return &cfg->tt[1];
+}
+
+/**
+ * smmu_ptw_64 - VMSAv8-64 Walk of the page tables for a given IOVA
+ * @cfg: translation config
+ * @iova: iova to translate
+ * @perm: access type
+ * @tlbe: IOMMUTLBEntry (out)
+ * @info: handle to an error info
+ *
+ * Return 0 on success, < 0 on error. In case of error, @info is filled
+ * and tlbe->perm is set to IOMMU_NONE.
+ * Upon success, @tlbe is filled with translated_addr and entry
+ * permission rights.
+ */
+static int smmu_ptw_64(SMMUTransCfg *cfg,
+                       dma_addr_t iova, IOMMUAccessFlags perm,
+                       IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
+{
+    dma_addr_t baseaddr;
+    int stage = cfg->stage;
+    SMMUTransTableInfo *tt = select_tt(cfg, iova);
+    uint8_t level;
+    uint8_t granule_sz;
+
+    if (tt->disabled) {
+        info->type = SMMU_PTW_ERR_TRANSLATION;
+        goto error;
+    }
+
+    level = tt->initial_level;
+    granule_sz = tt->granule_sz;
+    baseaddr = extract64(tt->ttb, 0, 48);
+
+    tlbe->iova = iova;
+    tlbe->addr_mask = (1 << tt->granule_sz) - 1;
+
+    while (level <= 3) {
+        uint64_t subpage_size = 1ULL << level_shift(level, granule_sz);
+        uint64_t mask = subpage_size - 1;
+        uint32_t offset = iova_level_offset(iova, level, granule_sz);
+        uint64_t pte;
+        dma_addr_t pte_addr = baseaddr + offset * sizeof(pte);
+        uint8_t ap;
+
+        if (get_pte(baseaddr, offset, &pte, info)) {
+                goto error;
+        }
+        trace_smmu_ptw_level(level, iova, subpage_size,
+                             baseaddr, offset, pte);
+
+        if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
+            trace_smmu_ptw_invalid_pte(stage, level, baseaddr,
+                                       pte_addr, offset, pte);
+            info->type = SMMU_PTW_ERR_TRANSLATION;
+            goto error;
+        }
+
+        if (is_page_pte(pte, level)) {
+            uint64_t gpa = get_page_pte_address(pte, granule_sz);
+
+            ap = PTE_AP(pte);
+            if (is_permission_fault(ap, perm)) {
+                info->type = SMMU_PTW_ERR_PERMISSION;
+                goto error;
+            }
+
+            tlbe->translated_addr = gpa + (iova & mask);
+            tlbe->perm = PTE_AP_TO_PERM(ap);
+            trace_smmu_ptw_page_pte(stage, level, iova,
+                                    baseaddr, pte_addr, pte, gpa);
+            return 0;
+        }
+        if (is_block_pte(pte, level)) {
+            uint64_t block_size;
+            hwaddr gpa = get_block_pte_address(pte, level, granule_sz,
+                                               &block_size);
+
+            ap = PTE_AP(pte);
+            if (is_permission_fault(ap, perm)) {
+                info->type = SMMU_PTW_ERR_PERMISSION;
+                goto error;
+            }
+
+            trace_smmu_ptw_block_pte(stage, level, baseaddr,
+                                     pte_addr, pte, iova, gpa,
+                                    (int)(block_size >> 20));
+
+            tlbe->translated_addr = gpa + (iova & mask);
+            tlbe->perm = PTE_AP_TO_PERM(ap);
+            return 0;
+        }
+
+        /* table pte */
+        ap = PTE_APTABLE(pte);
+
+        if (is_permission_fault(ap, perm)) {
+            info->type = SMMU_PTW_ERR_PERMISSION;
+            goto error;
+        }
+        baseaddr = get_table_pte_address(pte, granule_sz);
+        level++;
+    }
+
+    info->type = SMMU_PTW_ERR_TRANSLATION;
+
+error:
+    tlbe->perm = IOMMU_NONE;
+    return -EINVAL;
+}
+
+/**
+ * smmu_ptw - Walk the page tables for an IOVA, according to @cfg
+ *
+ * @cfg: translation configuration
+ * @iova: iova to translate
+ * @perm: tentative access type
+ * @tlbe: returned entry
+ * @info: ptw event handle
+ *
+ * return 0 on success
+ */
+int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
+             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
+{
+    if (!cfg->aa64) {
+        error_setg(&error_fatal,
+                   "SMMUv3 model does not support VMSAv8-32 page walk yet");
+    }
+
+    return smmu_ptw_64(cfg, iova, perm, tlbe, info);
+}
 
 SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
 {
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
new file mode 100644
index 0000000..3ed97ee
--- /dev/null
+++ b/hw/arm/smmu-internal.h
@@ -0,0 +1,96 @@
+/*
+ * ARM SMMU support - Internal API
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_INTERNAL_H
+#define HW_ARM_SMMU_INTERNAL_H
+
+#define ARM_LPAE_MAX_ADDR_BITS          48
+#define ARM_LPAE_MAX_LEVELS             4
+
+/* PTE Manipulation */
+
+#define ARM_LPAE_PTE_TYPE_SHIFT         0
+#define ARM_LPAE_PTE_TYPE_MASK          0x3
+
+#define ARM_LPAE_PTE_TYPE_BLOCK         1
+#define ARM_LPAE_PTE_TYPE_TABLE         3
+
+#define ARM_LPAE_L3_PTE_TYPE_RESERVED   1
+#define ARM_LPAE_L3_PTE_TYPE_PAGE       3
+
+#define ARM_LPAE_PTE_VALID              (1 << 0)
+
+#define PTE_ADDRESS(pte, shift) \
+    (extract64(pte, shift, 47 - shift + 1) << shift)
+
+#define is_invalid_pte(pte) (!(pte & ARM_LPAE_PTE_VALID))
+
+#define is_reserved_pte(pte, level)                                      \
+    ((level == 3) &&                                                     \
+     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_L3_PTE_TYPE_RESERVED))
+
+#define is_block_pte(pte, level)                                         \
+    ((level < 3) &&                                                      \
+     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_BLOCK))
+
+#define is_table_pte(pte, level)                                        \
+    ((level < 3) &&                                                     \
+     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_TABLE))
+
+#define is_page_pte(pte, level)                                         \
+    ((level == 3) &&                                                    \
+     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_L3_PTE_TYPE_PAGE))
+
+#define PTE_AP(pte) \
+    (extract64(pte, 6, 2))
+
+#define PTE_APTABLE(pte) \
+    (extract64(pte, 61, 2))
+
+#define is_permission_fault(ap, perm) \
+    (((perm) & IOMMU_WO) && ((ap) & 0x2))
+
+#define PTE_AP_TO_PERM(ap) \
+    (IOMMU_ACCESS_FLAG(true, !((ap) & 0x2)))
+
+/* Level Indexing */
+
+static inline int level_shift(int level, int granule_sz)
+{
+    return granule_sz + (3 - level) * (granule_sz - 3);
+}
+
+static inline uint64_t level_page_mask(int level, int granule_sz)
+{
+    return ~((1ULL << level_shift(level, granule_sz)) - 1);
+}
+
+/**
+ * TODO: handle the case where the level resolves less than
+ * granule_sz -3 IA bits.
+ */
+static inline
+uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
+{
+    return (iova >> level_shift(level, granule_sz)) &
+            ((1ULL << (granule_sz - 3)) - 1);
+}
+
+#endif
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 193063e..3584974 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -2,3 +2,13 @@
 
 # hw/arm/virt-acpi-build.c
 virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
+
+# hw/arm/smmu-common.c
+
+smmu_page_walk(int stage, uint64_t baseaddr, int first_level, uint64_t start, uint64_t end) "stage=%d, baseaddr=0x%"PRIx64", first level=%d, start=0x%"PRIx64", end=0x%"PRIx64
+smmu_lookup_table(int level, uint64_t baseaddr, int granule_sz, uint64_t start, uint64_t end, int flags, uint64_t subpage_size) "level=%d baseaddr=0x%"PRIx64" granule=%d, start=0x%"PRIx64" end=0x%"PRIx64" flags=%d subpage_size=0x%"PRIx64
+smmu_ptw_level(int level, uint64_t iova, size_t subpage_size, uint64_t baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%"PRIx64" subpage_sz=0x%zx baseaddr=0x%"PRIx64" offset=%d => pte=0x%"PRIx64
+smmu_ptw_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" offset=%d pte=0x%"PRIx64
+smmu_ptw_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
+smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
+smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index aee96c2..0fb27f7 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -127,4 +127,10 @@ static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
 {
     return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
 }
+
+int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
+             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info);
+
+SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova);
+
 #endif  /* HW_ARM_SMMU_COMMON */
-- 
2.5.5
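
For reference, the level indexing arithmetic added in smmu-internal.h can
be checked in isolation. The sketch below restates level_shift() and
iova_level_offset() as standalone C; the names lvl_shift() and lvl_index()
are illustrative only and are not part of the patch. It assumes 8-byte
descriptors, so each level resolves (granule_sz - 3) IOVA bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit position of the lowest IOVA bit resolved at @level, for a granule
 * of 2^granule_sz bytes (12 for 4KB, 16 for 64KB). Same formula as
 * level_shift() in the patch.
 */
static inline int lvl_shift(int level, int granule_sz)
{
    assert(level >= 0 && level <= 3);
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/* Index of the descriptor that @iova selects in the table at @level. */
static inline uint64_t lvl_index(uint64_t iova, int level, int granule_sz)
{
    return (iova >> lvl_shift(level, granule_sz)) &
           ((1ULL << (granule_sz - 3)) - 1);
}
```

With a 4KB granule the shifts are 39, 30, 21 and 12 for levels 0 to 3, so
an IOVA built as (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) selects
indexes 1, 2, 3 and 4 at the successive levels.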


* [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (2 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-08 14:27   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers Eric Auger
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

From: Prem Mallappa <prem.mallappa@broadcom.com>

This patch implements a skeleton for the smmuv3 device.
Datatypes and register definitions are introduced. The MMIO
region, the interrupts and the queues are initialized.

Only the MMIO read operation is implemented here.
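
As an illustration of the read path, 64-bit registers such as STRTAB_BASE
or CMDQ_BASE can be read either with one 64-bit access or as two 32-bit
halves. The sketch below is a standalone variant of smmu_read64();
reg64_read() is a hypothetical name, and unlike the helper in this patch
it returns zero for any unsupported offset/size combination instead of
logging a guest error.

```c
#include <stdint.h>

/*
 * Return the part of the 64-bit register value @r selected by a byte
 * @offset (0 or 4) and an access @size (4 or 8). Hypothetical helper
 * modelled on smmu_read64(); invalid combinations read as zero.
 */
static inline uint64_t reg64_read(uint64_t r, unsigned offset, unsigned size)
{
    if (size == 8 && offset == 0) {
        return r;                       /* full 64-bit access */
    }
    if (size != 4 || (offset != 0 && offset != 4)) {
        return 0;                       /* unsupported access */
    }
    /* 32-bit access: offset 0 is the low half, offset 4 the high half */
    return (r >> (offset * 8)) & 0xffffffffULL;
}
```

For r = 0x1122334455667788, a 32-bit read at offset 0 yields 0x55667788
and a 32-bit read at offset 4 yields 0x11223344.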

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v8 -> v9:
- add #include "qemu/log.h"
- add parent_reset

v7 -> v8:
- remove __smmu_data structs
- revisit struct SMMUQueue
- do not advertise stage 2 support anymore
- use the register definition API and get rid of REG array
- get rid of queue structs

v6 -> v7:
- split into several patches

v5 -> v6:
- Use IOMMUMemoryregion
- regs become uint32_t and fix 64b MMIO access (.impl)
- trace_smmuv3_write/read_mmio take the size param

v4 -> v5:
- change smmuv3_translate proto (IOMMUAccessFlags flag)
- has_stagex replaced by is_ste_stagex
- smmu_cfg_populate removed
- added smmuv3_decode_config and reworked error management
- rework the naming of IOMMU mrs
- fix SMMU_CMDQ_CONS offset

v3 -> v4
- smmu_irq_update
- fix hash key allocation
- set smmu_iommu_ops
- set SMMU_REG_CR0,
- smmuv3_translate: ret.perm not set in bypass mode
- use trace events
- renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
- rework smmu_find_ste
- fix tg2granule in TT0/0b10 corresponds to 16kB

v2 -> v3:
- move creation of include/hw/arm/smmuv3.h to this patch to fix compilation issue
- compilation allowed
- fix sbus allocation in smmu_init_pci_iommu
- restructure code into headers
- misc cleanups
---
 hw/arm/Makefile.objs     |   2 +-
 hw/arm/smmuv3-internal.h | 155 +++++++++++++++++++++
 hw/arm/smmuv3.c          | 348 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |   3 +
 include/hw/arm/smmuv3.h  |  91 +++++++++++++
 5 files changed, 598 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmuv3.h

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index c84c5ac..676b222 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -20,4 +20,4 @@ obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
 obj-$(CONFIG_MSF2) += msf2-soc.o msf2-som.o
-obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
new file mode 100644
index 0000000..5be8303
--- /dev/null
+++ b/hw/arm/smmuv3-internal.h
@@ -0,0 +1,155 @@
+/*
+ * ARM SMMUv3 support - Internal API
+ *
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_V3_INTERNAL_H
+#define HW_ARM_SMMU_V3_INTERNAL_H
+
+#include "qemu/log.h"
+#include "trace.h"
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+
+/* MMIO Registers */
+
+REG32(IDR0,                0x0)
+    FIELD(IDR0, S1P,         1 , 1)
+    FIELD(IDR0, TTF,         2 , 2)
+    FIELD(IDR0, COHACC,      4 , 1)
+    FIELD(IDR0, ASID16,      12, 1)
+    FIELD(IDR0, TTENDIAN,    21, 2)
+    FIELD(IDR0, STALL_MODEL, 24, 2)
+    FIELD(IDR0, TERM_MODEL,  26, 1)
+    FIELD(IDR0, STLEVEL,     27, 2)
+
+REG32(IDR1,                0x4)
+    FIELD(IDR1, SIDSIZE,      0 , 6)
+    FIELD(IDR1, EVENTQS,      16, 5)
+    FIELD(IDR1, CMDQS,        21, 5)
+
+#define SMMU_IDR1_SIDSIZE 16
+
+REG32(IDR2,                0x8)
+REG32(IDR3,                0xc)
+REG32(IDR4,                0x10)
+REG32(IDR5,                0x14)
+    FIELD(IDR5, OAS,         0, 3)
+    FIELD(IDR5, GRAN4K,      4, 1)
+    FIELD(IDR5, GRAN16K,     5, 1)
+    FIELD(IDR5, GRAN64K,     6, 1)
+
+#define SMMU_IDR5_OAS 4
+
+REG32(IIDR,                0x1c)
+REG32(CR0,                 0x20)
+    FIELD(CR0, SMMU_ENABLE,   0, 1)
+    FIELD(CR0, EVENTQEN,      2, 1)
+    FIELD(CR0, CMDQEN,        3, 1)
+
+REG32(CR0ACK,              0x24)
+REG32(CR1,                 0x28)
+REG32(CR2,                 0x2c)
+REG32(STATUSR,             0x40)
+REG32(IRQ_CTRL,            0x50)
+    FIELD(IRQ_CTRL, GERROR_IRQEN,        0, 1)
+    FIELD(IRQ_CTRL, PRI_IRQEN,           1, 1)
+    FIELD(IRQ_CTRL, EVENTQ_IRQEN,        2, 1)
+
+REG32(IRQ_CTRL_ACK,        0x54)
+REG32(GERROR,              0x60)
+    FIELD(GERROR, CMDQ_ERR,           0, 1)
+    FIELD(GERROR, EVENTQ_ABT_ERR,     2, 1)
+    FIELD(GERROR, PRIQ_ABT_ERR,       3, 1)
+    FIELD(GERROR, MSI_CMDQ_ABT_ERR,   4, 1)
+    FIELD(GERROR, MSI_EVENTQ_ABT_ERR, 5, 1)
+    FIELD(GERROR, MSI_PRIQ_ABT_ERR,   6, 1)
+    FIELD(GERROR, MSI_GERROR_ABT_ERR, 7, 1)
+    FIELD(GERROR, MSI_SFM_ERR,        8, 1)
+
+REG32(GERRORN,             0x64)
+
+#define A_GERROR_IRQ_CFG0  0x68 /* 64b */
+REG32(GERROR_IRQ_CFG1, 0x70)
+REG32(GERROR_IRQ_CFG2, 0x74)
+
+#define A_STRTAB_BASE      0x80 /* 64b */
+
+#define SMMU_BASE_ADDR_MASK 0xffffffffffe0
+
+REG32(STRTAB_BASE_CFG,     0x88)
+    FIELD(STRTAB_BASE_CFG, FMT,      16, 2)
+    FIELD(STRTAB_BASE_CFG, SPLIT,    6 , 5)
+    FIELD(STRTAB_BASE_CFG, LOG2SIZE, 0 , 6)
+
+#define A_CMDQ_BASE        0x90 /* 64b */
+REG32(CMDQ_PROD,           0x98)
+REG32(CMDQ_CONS,           0x9c)
+    FIELD(CMDQ_CONS, ERR, 24, 7)
+
+#define A_EVENTQ_BASE      0xa0 /* 64b */
+REG32(EVENTQ_PROD,         0xa8)
+REG32(EVENTQ_CONS,         0xac)
+
+#define A_EVENTQ_IRQ_CFG0  0xb0 /* 64b */
+REG32(EVENTQ_IRQ_CFG1,     0xb8)
+REG32(EVENTQ_IRQ_CFG2,     0xbc)
+
+REG32(CIDR0,               0xff0)
+REG32(CIDR1,               0xff4)
+REG32(CIDR2,               0xff8)
+REG32(CIDR3,               0xffc)
+REG32(PIDR0,               0xfe0)
+REG32(PIDR1,               0xfe4)
+REG32(PIDR2,               0xfe8)
+REG32(PIDR3,               0xfec)
+REG32(PIDR4,               0xfd0)
+
+static inline int smmu_enabled(SMMUv3State *s)
+{
+    return FIELD_EX32(s->cr[0], CR0, SMMU_ENABLE);
+}
+
+typedef struct Cmd {
+    uint32_t word[4];
+} Cmd;
+
+typedef struct Evt  {
+    uint32_t word[8];
+} Evt;
+
+static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
+                                   unsigned size)
+{
+    if (size == 8 && !offset) {
+        return r;
+    }
+
+    /* 32 bit access */
+
+    if (offset && offset != 4)  {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "SMMUv3 MMIO read: bad offset/size %u/%u\n",
+                      offset, size);
+        return 0;
+    }
+
+    return extract64(r, offset << 3, 32);
+}
+
+#endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
new file mode 100644
index 0000000..dc03c9e
--- /dev/null
+++ b/hw/arm/smmuv3.c
@@ -0,0 +1,348 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/boards.h"
+#include "sysemu/sysemu.h"
+#include "hw/sysbus.h"
+#include "hw/qdev-core.h"
+#include "hw/pci/pci.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+
+#include "hw/arm/smmuv3.h"
+#include "smmuv3-internal.h"
+
+static void smmuv3_init_regs(SMMUv3State *s)
+{
+    /**
+     * IDR0: stage1 only, AArch64 only, coherent access, 16b ASID,
+     *       multi-level stream table
+     */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S1P, 1); /* stage 1 supported */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTF, 2); /* AArch64 PTW only */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, COHACC, 1); /* IO coherent */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ASID16, 1); /* 16-bit ASID */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTENDIAN, 2); /* little endian */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STALL_MODEL, 1); /* No stall */
+    /* terminated transaction will always be aborted/error returned */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TERM_MODEL, 1);
+    /* 2-level stream table supported */
+    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, 1);
+
+    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SIDSIZE, SMMU_IDR1_SIDSIZE);
+    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, EVENTQS, 19);
+    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS,   19);
+
+    /* 4K and 64K granule support */
+    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, 1);
+    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, 1);
+    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, OAS, SMMU_IDR5_OAS); /* 44 bits */
+
+    s->cmdq.base = deposit64(s->cmdq.base, 0, 5, 19); /* LOG2SIZE = 19 */
+    s->cmdq.prod = 0;
+    s->cmdq.cons = 0;
+    s->cmdq.entry_size = sizeof(struct Cmd);
+    s->eventq.base = deposit64(s->eventq.base, 0, 5, 19); /* LOG2SIZE = 19 */
+    s->eventq.prod = 0;
+    s->eventq.cons = 0;
+    s->eventq.entry_size = sizeof(struct Evt);
+
+    s->features = 0;
+    s->sid_split = 0;
+}
+
+static void smmu_write_mmio(void *opaque, hwaddr addr,
+                            uint64_t val, unsigned size)
+{
+    /* not yet implemented */
+}
+
+static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
+{
+    SMMUState *sys = opaque;
+    SMMUv3State *s = ARM_SMMUV3(sys);
+    uint64_t val = 0; /* unhandled registers read as zero */
+
+    /* CONSTRAINED UNPREDICTABLE choice to have page0/1 be exact aliases */
+    addr &= ~0x10000;
+
+    if (size != 4 && size != 8) {
+        qemu_log_mask(LOG_GUEST_ERROR, "SMMUv3 MMIO read: bad size %u\n", size);
+        return 0;
+    }
+
+    /* Primecell/Corelink ID registers */
+    switch (addr) {
+    case A_CIDR0:
+        val = 0x0D;
+        break;
+    case A_CIDR1:
+        val = 0xF0;
+        break;
+    case A_CIDR2:
+        val = 0x05;
+        break;
+    case A_CIDR3:
+        val = 0xB1;
+        break;
+    case A_PIDR0:
+        val = 0x84; /* Part Number */
+        break;
+    case A_PIDR1:
+        val = 0xB4; /* JEP106 ID code[3:0] for Arm and Part number[11:8] */
+        break;
+    case A_PIDR3:
+        val = 0x10; /* MMU600 p1 */
+        break;
+    case A_PIDR4:
+        val = 0x4; /* 4KB region count, JEP106 continuation code for Arm */
+        break;
+    case 0xFD4 ... 0xFDC: /* SMMU_PIDR 5-7 */
+        val = 0;
+        break;
+    case A_IDR0 ... A_IDR5:
+        val = s->idr[(addr - A_IDR0) / 4];
+        break;
+    case A_IIDR:
+        val = s->iidr;
+        break;
+    case A_CR0:
+        val = s->cr[0];
+        break;
+    case A_CR0ACK:
+        val = s->cr0ack;
+        break;
+    case A_CR1:
+        val = s->cr[1];
+        break;
+    case A_CR2:
+        val = s->cr[2];
+        break;
+    case A_STATUSR:
+        val = s->statusr;
+        break;
+    case A_IRQ_CTRL:
+        val = s->irq_ctrl;
+        break;
+    case A_IRQ_CTRL_ACK:
+        val = s->irq_ctrl_ack;
+        break;
+    case A_GERROR:
+        val = s->gerror;
+        break;
+    case A_GERRORN:
+        val = s->gerrorn;
+        break;
+    case A_GERROR_IRQ_CFG0: /* 64b */
+        val = smmu_read64(s->gerror_irq_cfg0, 0, size);
+        break;
+    case A_GERROR_IRQ_CFG0 + 4:
+        val = smmu_read64(s->gerror_irq_cfg0, 4, size);
+        break;
+    case A_GERROR_IRQ_CFG1:
+        val = s->gerror_irq_cfg1;
+        break;
+    case A_GERROR_IRQ_CFG2:
+        val = s->gerror_irq_cfg2;
+        break;
+    case A_STRTAB_BASE: /* 64b */
+        val = smmu_read64(s->strtab_base, 0, size);
+        break;
+    case A_STRTAB_BASE + 4: /* 64b */
+        val = smmu_read64(s->strtab_base, 4, size);
+        break;
+    case A_STRTAB_BASE_CFG:
+        val = s->strtab_base_cfg;
+        break;
+    case A_CMDQ_BASE: /* 64b */
+        val = smmu_read64(s->cmdq.base, 0, size);
+        break;
+    case A_CMDQ_BASE + 4:
+        val = smmu_read64(s->cmdq.base, 4, size);
+        break;
+    case A_CMDQ_PROD:
+        val = s->cmdq.prod;
+        break;
+    case A_CMDQ_CONS:
+        val = s->cmdq.cons;
+        break;
+    case A_EVENTQ_BASE: /* 64b */
+        val = smmu_read64(s->eventq.base, 0, size);
+        break;
+    case A_EVENTQ_BASE + 4: /* 64b */
+        val = smmu_read64(s->eventq.base, 4, size);
+        break;
+    case A_EVENTQ_PROD:
+        val = s->eventq.prod;
+        break;
+    case A_EVENTQ_CONS:
+        val = s->eventq.cons;
+        break;
+    default:
+        error_report("%s unhandled access at 0x%"PRIx64, __func__, addr);
+        break;
+    }
+
+    trace_smmuv3_read_mmio(addr, val, size);
+    return val;
+}
+
+static const MemoryRegionOps smmu_mem_ops = {
+    .read = smmu_read_mmio,
+    .write = smmu_write_mmio,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
+static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
+        sysbus_init_irq(dev, &s->irq[i]);
+    }
+}
+
+static void smmu_reset(DeviceState *dev)
+{
+    SMMUv3State *s = ARM_SMMUV3(dev);
+    SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
+
+    c->parent_reset(dev);
+
+    smmuv3_init_regs(s);
+}
+
+static void smmu_realize(DeviceState *d, Error **errp)
+{
+    SMMUState *sys = ARM_SMMU(d);
+    SMMUv3State *s = ARM_SMMUV3(sys);
+    SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
+    SysBusDevice *dev = SYS_BUS_DEVICE(d);
+    Error *local_err = NULL;
+
+    c->parent_realize(d, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    memory_region_init_io(&sys->iomem, OBJECT(s),
+                          &smmu_mem_ops, sys, TYPE_ARM_SMMUV3, 0x20000);
+
+    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);
+
+    sysbus_init_mmio(dev, &sys->iomem);
+
+    smmu_init_irq(s, dev);
+}
+
+static const VMStateDescription vmstate_smmuv3 = {
+    .name = "smmuv3",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(features, SMMUv3State),
+        VMSTATE_UINT8(sid_size, SMMUv3State),
+        VMSTATE_UINT8(sid_split, SMMUv3State),
+
+        VMSTATE_UINT32_ARRAY(idr, SMMUv3State, 6),
+        VMSTATE_UINT32(iidr, SMMUv3State),
+        VMSTATE_UINT32_ARRAY(cr, SMMUv3State, 3),
+        VMSTATE_UINT32(cr0ack, SMMUv3State),
+        VMSTATE_UINT32(statusr, SMMUv3State),
+        VMSTATE_UINT32(irq_ctrl, SMMUv3State),
+        VMSTATE_UINT32(irq_ctrl_ack, SMMUv3State),
+        VMSTATE_UINT32(gerror, SMMUv3State),
+        VMSTATE_UINT32(gerrorn, SMMUv3State),
+        VMSTATE_UINT64(gerror_irq_cfg0, SMMUv3State),
+        VMSTATE_UINT32(gerror_irq_cfg1, SMMUv3State),
+        VMSTATE_UINT32(gerror_irq_cfg2, SMMUv3State),
+        VMSTATE_UINT64(strtab_base, SMMUv3State),
+        VMSTATE_UINT32(strtab_base_cfg, SMMUv3State),
+        VMSTATE_UINT64(eventq_irq_cfg0, SMMUv3State),
+        VMSTATE_UINT32(eventq_irq_cfg1, SMMUv3State),
+        VMSTATE_UINT32(eventq_irq_cfg2, SMMUv3State),
+
+        VMSTATE_UINT64(cmdq.base, SMMUv3State),
+        VMSTATE_UINT32(cmdq.prod, SMMUv3State),
+        VMSTATE_UINT32(cmdq.cons, SMMUv3State),
+        VMSTATE_UINT8(cmdq.entry_size, SMMUv3State),
+        VMSTATE_UINT64(eventq.base, SMMUv3State),
+        VMSTATE_UINT32(eventq.prod, SMMUv3State),
+        VMSTATE_UINT32(eventq.cons, SMMUv3State),
+        VMSTATE_UINT8(eventq.entry_size, SMMUv3State),
+
+        VMSTATE_END_OF_LIST(),
+    },
+};
+
+static void smmuv3_instance_init(Object *obj)
+{
+    /* Nothing much to do here as of now */
+}
+
+static void smmuv3_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    SMMUv3Class *c = ARM_SMMUV3_CLASS(klass);
+
+    dc->vmsd    = &vmstate_smmuv3;
+    device_class_set_parent_reset(dc, smmu_reset, &c->parent_reset);
+    c->parent_realize = dc->realize;
+    dc->realize = smmu_realize;
+}
+
+static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
+                                                  void *data)
+{
+}
+
+static const TypeInfo smmuv3_type_info = {
+    .name          = TYPE_ARM_SMMUV3,
+    .parent        = TYPE_ARM_SMMU,
+    .instance_size = sizeof(SMMUv3State),
+    .instance_init = smmuv3_instance_init,
+    .class_size    = sizeof(SMMUv3Class),
+    .class_init    = smmuv3_class_init,
+};
+
+static const TypeInfo smmuv3_iommu_memory_region_info = {
+    .parent = TYPE_IOMMU_MEMORY_REGION,
+    .name = TYPE_SMMUV3_IOMMU_MEMORY_REGION,
+    .class_init = smmuv3_iommu_memory_region_class_init,
+};
+
+static void smmuv3_register_types(void)
+{
+    type_register(&smmuv3_type_info);
+    type_register(&smmuv3_iommu_memory_region_info);
+}
+
+type_init(smmuv3_register_types)
+
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 3584974..64d2b9b 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -12,3 +12,6 @@ smmu_ptw_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr,
 smmu_ptw_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
 smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
 smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
+
+#hw/arm/smmuv3.c
+smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
new file mode 100644
index 0000000..37a5723
--- /dev/null
+++ b/include/hw/arm/smmuv3.h
@@ -0,0 +1,91 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMUV3_H
+#define HW_ARM_SMMUV3_H
+
+#include "hw/arm/smmu-common.h"
+#include "hw/registerfields.h"
+
+#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
+
+#define SMMU_NREGS            0x200
+
+typedef struct SMMUQueue {
+    hwaddr base;
+    uint32_t prod;
+    uint32_t cons;
+    uint8_t entry_size;
+} SMMUQueue;
+
+typedef struct SMMUv3State {
+    SMMUState     smmu_state;
+
+    /* Local cache of most-frequently used registers */
+#define SMMU_FEATURE_2LVL_STE (1 << 0)
+    uint32_t features;
+    uint8_t sid_size;
+    uint8_t sid_split;
+
+    uint32_t idr[6];
+    uint32_t iidr;
+    uint32_t cr[3];
+    uint32_t cr0ack;
+    uint32_t statusr;
+    uint32_t irq_ctrl;
+    uint32_t irq_ctrl_ack;
+    uint32_t gerror;
+    uint32_t gerrorn;
+    uint64_t gerror_irq_cfg0;
+    uint32_t gerror_irq_cfg1;
+    uint32_t gerror_irq_cfg2;
+    uint64_t strtab_base;
+    uint32_t strtab_base_cfg;
+    uint64_t eventq_irq_cfg0;
+    uint32_t eventq_irq_cfg1;
+    uint32_t eventq_irq_cfg2;
+
+    SMMUQueue eventq, cmdq;
+
+    qemu_irq     irq[4];
+} SMMUv3State;
+
+typedef enum {
+    SMMU_IRQ_EVTQ,
+    SMMU_IRQ_PRIQ,
+    SMMU_IRQ_CMD_SYNC,
+    SMMU_IRQ_GERROR,
+} SMMUIrq;
+
+typedef struct {
+    /*< private >*/
+    SMMUBaseClass smmu_base_class;
+    /*< public >*/
+
+    DeviceRealize parent_realize;
+    DeviceReset   parent_reset;
+} SMMUv3Class;
+
+#define TYPE_ARM_SMMUV3   "arm-smmuv3"
+#define ARM_SMMUV3(obj) OBJECT_CHECK(SMMUv3State, (obj), TYPE_ARM_SMMUV3)
+#define ARM_SMMUV3_CLASS(klass)                              \
+    OBJECT_CLASS_CHECK(SMMUv3Class, (klass), TYPE_ARM_SMMUV3)
+#define ARM_SMMUV3_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(SMMUv3Class, (obj), TYPE_ARM_SMMUV3)
+
+#endif
-- 
2.5.5


* [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (3 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-08 17:49   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 06/14] hw/arm/smmuv3: Queue helpers Eric Auger
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

Introduce helpers to handle wired IRQs, in particular the GERROR
interrupt. The SMMU toggles bits in the GERROR register when a global
error occurs, and software acknowledges an error by toggling the
matching bit in GERRORN.

The wired interrupts are edge sensitive, hence the use of
qemu_irq_pulse().
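
The toggle protocol can be modelled outside QEMU as follows. This is only
a sketch of the logic in smmuv3_trigger_irq() and smmuv3_write_gerrorn():
GerrorModel and the gerror_*() functions are hypothetical names, the
IRQ_CTRL enable bit is ignored, and no interrupt line is actually driven.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t gerror;   /* toggled by the SMMU model */
    uint32_t gerrorn;  /* toggled by software to acknowledge */
} GerrorModel;

/* An error is pending while its bit differs between GERROR and GERRORN. */
static inline uint32_t gerror_pending(const GerrorModel *m)
{
    return m->gerror ^ m->gerrorn;
}

/*
 * SMMU side: report the errors in @mask. Only errors that are not
 * already pending get toggled. Returns true if the interrupt would be
 * pulsed, i.e. something new was raised and nothing was pending before.
 */
static inline bool gerror_raise(GerrorModel *m, uint32_t mask)
{
    uint32_t pending = gerror_pending(m);
    uint32_t new_errors = mask & ~pending;

    m->gerror ^= new_errors;
    return new_errors && !pending;
}

/* Software side: write @val to GERRORN; only pending bits may toggle. */
static inline void gerror_ack(GerrorModel *m, uint32_t val)
{
    uint32_t toggled = m->gerrorn ^ val;

    m->gerrorn ^= toggled & gerror_pending(m);
}
```

Because both registers are toggled rather than set, an error bit can be
raised again right after it was acknowledged without any register being
cleared in between.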

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v7 -> v8:
- remove SMMU_PENDING_GERRORS macro
- properly toggle gerror
- properly sanitize gerrorn write
---
 hw/arm/smmuv3-internal.h | 10 ++++++++
 hw/arm/smmuv3.c          | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |  3 +++
 3 files changed, 77 insertions(+)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 5be8303..40b39a1 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -152,4 +152,14 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
     return extract64(r, offset << 3, 32);
 }
 
+/* Interrupts */
+
+#define smmuv3_eventq_irq_enabled(s)                   \
+    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, EVENTQ_IRQEN))
+#define smmuv3_gerror_irq_enabled(s)                  \
+    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))
+
+void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
+void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index dc03c9e..8779d3f 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -30,6 +30,70 @@
 #include "hw/arm/smmuv3.h"
 #include "smmuv3-internal.h"
 
+/**
+ * smmuv3_trigger_irq - pulse @irq if enabled and update
+ * GERROR register in case of GERROR interrupt
+ *
+ * @irq: irq type
+ * @gerror_mask: mask of gerrors to toggle (relevant if @irq is GERROR)
+ */
+void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
+{
+
+    bool pulse = false;
+
+    switch (irq) {
+    case SMMU_IRQ_EVTQ:
+        pulse = smmuv3_eventq_irq_enabled(s);
+        break;
+    case SMMU_IRQ_PRIQ:
+        error_setg(&error_fatal, "PRI not supported");
+        break;
+    case SMMU_IRQ_CMD_SYNC:
+        pulse = true;
+        break;
+    case SMMU_IRQ_GERROR:
+    {
+        uint32_t pending = s->gerror ^ s->gerrorn;
+        uint32_t new_gerrors = ~pending & gerror_mask;
+
+        if (!new_gerrors) {
+            /* only toggle non pending errors */
+            return;
+        }
+        s->gerror ^= new_gerrors;
+        trace_smmuv3_write_gerror(new_gerrors, s->gerror);
+
+        /* pulse the GERROR irq only if all previous gerrors were acked */
+        pulse = smmuv3_gerror_irq_enabled(s) && !pending;
+        break;
+    }
+    }
+    if (pulse) {
+        trace_smmuv3_trigger_irq(irq);
+        qemu_irq_pulse(s->irq[irq]);
+    }
+}
+
+void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
+{
+    uint32_t pending = s->gerror ^ s->gerrorn;
+    uint32_t toggled = s->gerrorn ^ new_gerrorn;
+    uint32_t acked;
+
+    if (toggled & ~pending) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "guest toggles non pending errors = 0x%x\n",
+                      toggled & ~pending);
+    }
+
+    /* Make sure SW does not toggle irqs that are not active */
+    acked = toggled & pending;
+    s->gerrorn ^= acked;
+
+    trace_smmuv3_write_gerrorn(acked, s->gerrorn);
+}
+
 static void smmuv3_init_regs(SMMUv3State *s)
 {
     /**
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 64d2b9b..2ddae40 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -15,3 +15,6 @@ smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "base
 
 #hw/arm/smmuv3.c
 smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
+smmuv3_trigger_irq(int irq) "irq=%d"
+smmuv3_write_gerror(uint32_t toggled, uint32_t gerror) "toggled=0x%x, new gerror=0x%x"
+smmuv3_write_gerrorn(uint32_t acked, uint32_t gerrorn) "acked=0x%x, new gerrorn=0x%x"
-- 
2.5.5

* [Qemu-devel] [PATCH v9 06/14] hw/arm/smmuv3: Queue helpers
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (4 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-08 18:28   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 07/14] hw/arm/smmuv3: Implement MMIO write operations Eric Auger
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

We introduce helpers to read/write into the command and event
circular queues.

smmuv3_write_eventq and smmuv3_cmdq_consume will become static
in subsequent patches.

Invalidation commands are not yet dealt with. We do not cache
data that need to be invalidated. This will change with vhost
integration.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v8 -> v9:
- fix CMD_SSID & CMD_ADDR + some renamings
- do cons increment after the execution of the command
- add Q_INCONSISTENT()

v7 -> v8
- use address_space_rw
- helpers inspired from spec
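
For readers new to the wrap-bit scheme used by the queue helpers, here is
a minimal standalone sketch. It is not the code from this patch:
SketchQueue and the sketch_* functions are hypothetical names, and the
queue size is kept in a plain log2size field instead of being extracted
from the base register as LOG2SIZE() does. It only illustrates how one
extra wrap bit above the index bits distinguishes an empty queue from a
full one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for SMMUQueue: only what the sketch needs. */
typedef struct {
    uint32_t prod;     /* producer index; wrap bit sits just above the index */
    uint32_t cons;     /* consumer index; same layout */
    uint8_t log2size;  /* the queue holds 1 << log2size entries */
} SketchQueue;

static uint32_t sketch_index_mask(const SketchQueue *q)
{
    assert(q->log2size < 31);
    return (UINT32_C(1) << q->log2size) - 1;
}

static uint32_t sketch_wrap_mask(const SketchQueue *q)
{
    assert(q->log2size < 31);
    return UINT32_C(1) << q->log2size;
}

/* Advance an index by one entry, toggling the wrap bit on rollover. */
static uint32_t sketch_index_inc(const SketchQueue *q, uint32_t val)
{
    uint32_t idx = (val + 1) & sketch_index_mask(q);
    uint32_t wrap = val & sketch_wrap_mask(q);

    if (idx == 0) {
        wrap ^= sketch_wrap_mask(q); /* index rolled over to entry 0 */
    }
    return wrap | idx;
}

/* Empty: same index and same wrap bit. */
static bool sketch_empty(const SketchQueue *q)
{
    uint32_t diff = q->prod ^ q->cons;

    return !(diff & (sketch_index_mask(q) | sketch_wrap_mask(q)));
}

/* Full: same index but different wrap bit. */
static bool sketch_full(const SketchQueue *q)
{
    uint32_t diff = q->prod ^ q->cons;

    return !(diff & sketch_index_mask(q)) && (diff & sketch_wrap_mask(q));
}
```

With log2size = 2, four producer increments starting from 0 leave prod
at index 0 with the wrap bit set, which sketch_full() reports as full
while cons is still 0.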
---
 hw/arm/smmuv3-internal.h | 150 +++++++++++++++++++++++++++++++++++++++++++
 hw/arm/smmuv3.c          | 162 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |   4 ++
 3 files changed, 316 insertions(+)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 40b39a1..c0771ce 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -162,4 +162,154 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
 void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
 void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
 
+/* Queue Handling */
+
+#define LOG2SIZE(q)        extract64((q)->base, 0, 5)
+#define BASE(q)            ((q)->base & SMMU_BASE_ADDR_MASK)
+#define WRAP_MASK(q)       (1 << LOG2SIZE(q))
+#define INDEX_MASK(q)      ((1 << LOG2SIZE(q)) - 1)
+#define WRAP_INDEX_MASK(q) ((1 << (LOG2SIZE(q) + 1)) - 1)
+
+#define Q_CONS_ENTRY(q)  (BASE(q) + \
+                          (q)->entry_size * ((q)->cons & INDEX_MASK(q)))
+#define Q_PROD_ENTRY(q)  (BASE(q) + \
+                          (q)->entry_size * ((q)->prod & INDEX_MASK(q)))
+
+#define Q_CONS(q) ((q)->cons & INDEX_MASK(q))
+#define Q_PROD(q) ((q)->prod & INDEX_MASK(q))
+
+#define Q_CONS_WRAP(q) (((q)->cons & WRAP_MASK(q)) >> LOG2SIZE(q))
+#define Q_PROD_WRAP(q) (((q)->prod & WRAP_MASK(q)) >> LOG2SIZE(q))
+
+#define Q_FULL(q) \
+    (((((q)->cons) & INDEX_MASK(q)) == \
+      (((q)->prod) & INDEX_MASK(q))) && \
+     ((((q)->cons) & WRAP_MASK(q)) != \
+      (((q)->prod) & WRAP_MASK(q))))
+
+#define Q_EMPTY(q) \
+    (((((q)->cons) & INDEX_MASK(q)) == \
+      (((q)->prod) & INDEX_MASK(q))) && \
+     ((((q)->cons) & WRAP_MASK(q)) == \
+      (((q)->prod) & WRAP_MASK(q))))
+
+#define Q_INCONSISTENT(q) \
+    ((((((q)->prod) & INDEX_MASK(q)) > (((q)->cons) & INDEX_MASK(q))) && \
+    ((((q)->prod) & WRAP_MASK(q)) != (((q)->cons) & WRAP_MASK(q)))) || \
+    (((((q)->prod) & INDEX_MASK(q)) < (((q)->cons) & INDEX_MASK(q))) && \
+    ((((q)->prod) & WRAP_MASK(q)) == (((q)->cons) & WRAP_MASK(q)))))
+
+#define SMMUV3_CMDQ_ENABLED(s) \
+     (FIELD_EX32(s->cr[0], CR0, CMDQEN))
+
+#define SMMUV3_EVENTQ_ENABLED(s) \
+     (FIELD_EX32(s->cr[0], CR0, EVENTQEN))
+
+static inline void smmu_write_cmdq_err(SMMUv3State *s, uint32_t err_type)
+{
+    s->cmdq.cons = FIELD_DP32(s->cmdq.cons, CMDQ_CONS, ERR, err_type);
+}
+
+void smmuv3_write_eventq(SMMUv3State *s, Evt *evt);
+
+/* Commands */
+
+enum {
+    SMMU_CMD_PREFETCH_CONFIG = 0x01,
+    SMMU_CMD_PREFETCH_ADDR,
+    SMMU_CMD_CFGI_STE,
+    SMMU_CMD_CFGI_STE_RANGE,
+    SMMU_CMD_CFGI_CD,
+    SMMU_CMD_CFGI_CD_ALL,
+    SMMU_CMD_CFGI_ALL,
+    SMMU_CMD_TLBI_NH_ALL     = 0x10,
+    SMMU_CMD_TLBI_NH_ASID,
+    SMMU_CMD_TLBI_NH_VA,
+    SMMU_CMD_TLBI_NH_VAA,
+    SMMU_CMD_TLBI_EL3_ALL    = 0x18,
+    SMMU_CMD_TLBI_EL3_VA     = 0x1a,
+    SMMU_CMD_TLBI_EL2_ALL    = 0x20,
+    SMMU_CMD_TLBI_EL2_ASID,
+    SMMU_CMD_TLBI_EL2_VA,
+    SMMU_CMD_TLBI_EL2_VAA,  /* 0x23 */
+    SMMU_CMD_TLBI_S12_VMALL  = 0x28,
+    SMMU_CMD_TLBI_S2_IPA     = 0x2a,
+    SMMU_CMD_TLBI_NSNH_ALL   = 0x30,
+    SMMU_CMD_ATC_INV         = 0x40,
+    SMMU_CMD_PRI_RESP,
+    SMMU_CMD_RESUME          = 0x44,
+    SMMU_CMD_STALL_TERM,
+    SMMU_CMD_SYNC,          /* 0x46 */
+};
+
+static const char *cmd_stringify[] = {
+    [SMMU_CMD_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
+    [SMMU_CMD_PREFETCH_ADDR]   = "SMMU_CMD_PREFETCH_ADDR",
+    [SMMU_CMD_CFGI_STE]        = "SMMU_CMD_CFGI_STE",
+    [SMMU_CMD_CFGI_STE_RANGE]  = "SMMU_CMD_CFGI_STE_RANGE",
+    [SMMU_CMD_CFGI_CD]         = "SMMU_CMD_CFGI_CD",
+    [SMMU_CMD_CFGI_CD_ALL]     = "SMMU_CMD_CFGI_CD_ALL",
+    [SMMU_CMD_CFGI_ALL]        = "SMMU_CMD_CFGI_ALL",
+    [SMMU_CMD_TLBI_NH_ALL]     = "SMMU_CMD_TLBI_NH_ALL",
+    [SMMU_CMD_TLBI_NH_ASID]    = "SMMU_CMD_TLBI_NH_ASID",
+    [SMMU_CMD_TLBI_NH_VA]      = "SMMU_CMD_TLBI_NH_VA",
+    [SMMU_CMD_TLBI_NH_VAA]     = "SMMU_CMD_TLBI_NH_VAA",
+    [SMMU_CMD_TLBI_EL3_ALL]    = "SMMU_CMD_TLBI_EL3_ALL",
+    [SMMU_CMD_TLBI_EL3_VA]     = "SMMU_CMD_TLBI_EL3_VA",
+    [SMMU_CMD_TLBI_EL2_ALL]    = "SMMU_CMD_TLBI_EL2_ALL",
+    [SMMU_CMD_TLBI_EL2_ASID]   = "SMMU_CMD_TLBI_EL2_ASID",
+    [SMMU_CMD_TLBI_EL2_VA]     = "SMMU_CMD_TLBI_EL2_VA",
+    [SMMU_CMD_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
+    [SMMU_CMD_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
+    [SMMU_CMD_TLBI_S2_IPA]     = "SMMU_CMD_TLBI_S2_IPA",
+    [SMMU_CMD_TLBI_NSNH_ALL]   = "SMMU_CMD_TLBI_NSNH_ALL",
+    [SMMU_CMD_ATC_INV]         = "SMMU_CMD_ATC_INV",
+    [SMMU_CMD_PRI_RESP]        = "SMMU_CMD_PRI_RESP",
+    [SMMU_CMD_RESUME]          = "SMMU_CMD_RESUME",
+    [SMMU_CMD_STALL_TERM]      = "SMMU_CMD_STALL_TERM",
+    [SMMU_CMD_SYNC]            = "SMMU_CMD_SYNC",
+};
+
+#define SMMU_CMD_STRING(type) (                                      \
+(type < ARRAY_SIZE(cmd_stringify)) ? cmd_stringify[type] : "UNKNOWN" \
+)
+
+/* CMDQ fields */
+
+typedef enum {
+    SMMU_CERROR_NONE = 0,
+    SMMU_CERROR_ILL,
+    SMMU_CERROR_ABT,
+    SMMU_CERROR_ATC_INV_SYNC,
+} SMMUCmdError;
+
+enum { /* Command completion notification */
+    CMD_SYNC_SIG_NONE,
+    CMD_SYNC_SIG_IRQ,
+    CMD_SYNC_SIG_SEV,
+};
+
+#define CMD_TYPE(x)         extract32((x)->word[0], 0 , 8)
+#define CMD_SSEC(x)         extract32((x)->word[0], 10, 1)
+#define CMD_SSV(x)          extract32((x)->word[0], 11, 1)
+#define CMD_RESUME_AC(x)    extract32((x)->word[0], 12, 1)
+#define CMD_RESUME_AB(x)    extract32((x)->word[0], 13, 1)
+#define CMD_SYNC_CS(x)      extract32((x)->word[0], 12, 2)
+#define CMD_SSID(x)         extract32((x)->word[0], 12, 20)
+#define CMD_SID(x)          ((x)->word[1])
+#define CMD_VMID(x)         extract32((x)->word[1], 0 , 16)
+#define CMD_ASID(x)         extract32((x)->word[1], 16, 16)
+#define CMD_RESUME_STAG(x)  extract32((x)->word[2], 0 , 16)
+#define CMD_RESP(x)         extract32((x)->word[2], 11, 2)
+#define CMD_LEAF(x)         extract32((x)->word[2], 0 , 1)
+#define CMD_STE_RANGE(x)    extract32((x)->word[2], 0 , 5)
+#define CMD_ADDR(x) ({                                        \
+            uint64_t high = (uint64_t)(x)->word[3];           \
+            uint64_t low = extract32((x)->word[2], 12, 20);    \
+            uint64_t addr = high << 32 | (low << 12);         \
+            addr;                                             \
+        })
+
+int smmuv3_cmdq_consume(SMMUv3State *s);
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 8779d3f..0b57215 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -94,6 +94,72 @@ void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
     trace_smmuv3_write_gerrorn(acked, s->gerrorn);
 }
 
+static uint32_t queue_index_inc(uint32_t val,
+                                uint32_t qidx_mask, uint32_t qwrap_mask)
+{
+    uint32_t i = (val + 1) & qidx_mask;
+
+    if (i <= (val & qidx_mask)) {
+        i = ((val & qwrap_mask) ^ qwrap_mask) | i;
+    } else {
+        i = (val & qwrap_mask) | i;
+    }
+    return i;
+}
+
+static inline void queue_prod_incr(SMMUQueue *q)
+{
+    q->prod = queue_index_inc(q->prod, INDEX_MASK(q), WRAP_MASK(q));
+}
+
+static inline void queue_cons_incr(SMMUQueue *q)
+{
+    q->cons = queue_index_inc(q->cons, INDEX_MASK(q), WRAP_MASK(q));
+}
+
+static inline MemTxResult queue_read(SMMUQueue *q, void *data)
+{
+    dma_addr_t addr = Q_CONS_ENTRY(q);
+
+    return dma_memory_read(&address_space_memory, addr,
+                           (uint8_t *)data, q->entry_size);
+}
+
+static void queue_write(SMMUQueue *q, void *data)
+{
+    dma_addr_t addr = Q_PROD_ENTRY(q);
+    MemTxResult ret;
+
+    ret = dma_memory_write(&address_space_memory, addr,
+                           (uint8_t *)data, q->entry_size);
+    if (ret != MEMTX_OK) {
+        return;
+    }
+
+    queue_prod_incr(q);
+}
+
+void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
+{
+    SMMUQueue *q = &s->eventq;
+    bool q_empty = Q_EMPTY(q);
+    bool q_full = Q_FULL(q);
+
+    if (!SMMUV3_EVENTQ_ENABLED(s)) {
+        return;
+    }
+
+    if (q_full) {
+        return;
+    }
+
+    queue_write(q, evt);
+
+    if (q_empty) {
+        smmuv3_trigger_irq(s, SMMU_IRQ_EVTQ, 0);
+    }
+}
+
 static void smmuv3_init_regs(SMMUv3State *s)
 {
     /**
@@ -133,6 +199,102 @@ static void smmuv3_init_regs(SMMUv3State *s)
     s->sid_split = 0;
 }
 
+int smmuv3_cmdq_consume(SMMUv3State *s)
+{
+    SMMUCmdError cmd_error = SMMU_CERROR_NONE;
+    SMMUQueue *q = &s->cmdq;
+    uint32_t type = 0;
+
+    if (!SMMUV3_CMDQ_ENABLED(s)) {
+        return 0;
+    }
+    /*
+     * some commands depend on register values, as above. In case those
+     * register values change while handling the command, spec says it
+     * is UNPREDICTABLE whether the command is interpreted under the new
+     * or old value.
+     */
+
+    while (!Q_EMPTY(q)) {
+        uint32_t pending = s->gerror ^ s->gerrorn;
+        Cmd cmd;
+
+        trace_smmuv3_cmdq_consume(Q_PROD(q), Q_CONS(q),
+                                  Q_PROD_WRAP(q), Q_CONS_WRAP(q));
+
+        if (FIELD_EX32(pending, GERROR, CMDQ_ERR)) {
+            break;
+        }
+
+        if (queue_read(q, &cmd) != MEMTX_OK) {
+            cmd_error = SMMU_CERROR_ABT;
+            break;
+        }
+
+        type = CMD_TYPE(&cmd);
+
+        trace_smmuv3_cmdq_opcode(SMMU_CMD_STRING(type));
+
+        switch (type) {
+        case SMMU_CMD_SYNC:
+            if (CMD_SYNC_CS(&cmd) & CMD_SYNC_SIG_IRQ) {
+                smmuv3_trigger_irq(s, SMMU_IRQ_CMD_SYNC, 0);
+            }
+            break;
+        case SMMU_CMD_PREFETCH_CONFIG:
+        case SMMU_CMD_PREFETCH_ADDR:
+        case SMMU_CMD_CFGI_STE:
+        case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
+        case SMMU_CMD_CFGI_CD:
+        case SMMU_CMD_CFGI_CD_ALL:
+        case SMMU_CMD_TLBI_NH_ALL:
+        case SMMU_CMD_TLBI_NH_ASID:
+        case SMMU_CMD_TLBI_NH_VA:
+        case SMMU_CMD_TLBI_NH_VAA:
+        case SMMU_CMD_TLBI_EL3_ALL:
+        case SMMU_CMD_TLBI_EL3_VA:
+        case SMMU_CMD_TLBI_EL2_ALL:
+        case SMMU_CMD_TLBI_EL2_ASID:
+        case SMMU_CMD_TLBI_EL2_VA:
+        case SMMU_CMD_TLBI_EL2_VAA:
+        case SMMU_CMD_TLBI_S12_VMALL:
+        case SMMU_CMD_TLBI_S2_IPA:
+        case SMMU_CMD_TLBI_NSNH_ALL:
+        case SMMU_CMD_ATC_INV:
+        case SMMU_CMD_PRI_RESP:
+        case SMMU_CMD_RESUME:
+        case SMMU_CMD_STALL_TERM:
+            trace_smmuv3_unhandled_cmd(type);
+            break;
+        default:
+            cmd_error = SMMU_CERROR_ILL;
+            error_report("Illegal command type: %d", CMD_TYPE(&cmd));
+            break;
+        }
+        if (cmd_error) {
+            break;
+        }
+        /*
+         * We only increment the cons index after the completion of
+         * the command. We do that because the SYNC returns immediately
+         * and does not check the completion of previous commands
+         */
+        queue_cons_incr(q);
+    }
+
+    if (cmd_error) {
+        error_report("Error on %s command execution: %d",
+                     SMMU_CMD_STRING(type), cmd_error);
+        smmu_write_cmdq_err(s, cmd_error);
+        smmuv3_trigger_irq(s, SMMU_IRQ_GERROR, R_GERROR_CMDQ_ERR_MASK);
+    }
+
+    trace_smmuv3_cmdq_consume_out(Q_PROD(q), Q_CONS(q),
+                                  Q_PROD_WRAP(q), Q_CONS_WRAP(q));
+
+    return 0;
+}
+
 static void smmu_write_mmio(void *opaque, hwaddr addr,
                             uint64_t val, unsigned size)
 {
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 2ddae40..1c5105d 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -18,3 +18,7 @@ smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" va
 smmuv3_trigger_irq(int irq) "irq=%d"
 smmuv3_write_gerror(uint32_t toggled, uint32_t gerror) "toggled=0x%x, new gerror=0x%x"
 smmuv3_write_gerrorn(uint32_t acked, uint32_t gerrorn) "acked=0x%x, new gerrorn=0x%x"
+smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
+smmuv3_cmdq_consume(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod=%d cons=%d prod.wrap=%d cons.wrap=%d"
+smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
+smmuv3_cmdq_consume_out(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod:%d, cons:%d, prod_wrap:%d, cons_wrap:%d "
-- 
2.5.5

* [Qemu-devel] [PATCH v9 07/14] hw/arm/smmuv3: Implement MMIO write operations
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (5 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 06/14] hw/arm/smmuv3: Queue helpers Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-08 18:37   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 08/14] hw/arm/smmuv3: Event queue recording helper Eric Auger
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

Now that we have the relevant helpers for queue and IRQ
management, let's implement the MMIO write operations.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v7 -> v8:
- state in the commit message that invalidation commands
  are not yet handled.
- use new queue helpers
- do not decode unhandled commands at this stage
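
Several registers handled here are 64 bits wide but may be written as two
32-bit halves (offsets 0 and 4). A minimal standalone sketch of that
access pattern follows. It is an illustration under assumptions, not the
code from this patch: sketch_deposit64() is a local stand-in for QEMU's
deposit64(), sketch_write64() is a hypothetical name, and unlike
smmu_write64() it returns a bool and rejects every offset/size
combination other than 8 bytes at offset 0 or 4 bytes at offset 0 or 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Replace `length` bits of `reg`, starting at bit `start`, with `field`. */
static uint64_t sketch_deposit64(uint64_t reg, unsigned start,
                                 unsigned length, uint64_t field)
{
    uint64_t ones = (length == 64) ? ~UINT64_C(0)
                                   : (UINT64_C(1) << length) - 1;
    uint64_t mask = ones << start;

    return (reg & ~mask) | ((field << start) & mask);
}

/*
 * Write `value` into a 64-bit register image.
 * `offset` is the byte offset within the register, `size` the access size.
 * Returns false, leaving the register untouched, on a bad combination.
 */
static bool sketch_write64(uint64_t *reg, unsigned offset,
                           unsigned size, uint64_t value)
{
    if (size == 8 && offset == 0) {
        *reg = value;               /* full-width access */
        return true;
    }
    if (size != 4 || (offset != 0 && offset != 4)) {
        return false;               /* misaligned or unsupported access */
    }
    /* 32-bit access: update only the addressed half */
    *reg = sketch_deposit64(*reg, offset * 8, 32, value);
    return true;
}
```

Writing the low half and then the high half yields the same register
value as a single 8-byte write, which is the property the MMIO handler
relies on for the *_BASE and *_IRQ_CFG0 registers.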
---
 hw/arm/smmuv3-internal.h |  24 +++++++---
 hw/arm/smmuv3.c          | 111 +++++++++++++++++++++++++++++++++++++++++++++--
 hw/arm/trace-events      |   6 +++
 3 files changed, 132 insertions(+), 9 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index c0771ce..5af97ae 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -152,6 +152,25 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
     return extract64(r, offset << 3, 32);
 }
 
+static inline void smmu_write64(uint64_t *r, unsigned offset,
+                                unsigned size, uint64_t value)
+{
+    if (size == 8 && !offset) {
+        *r = value;
+        return;
+    }
+
+    /* 32 bit access */
+    if (offset && offset != 4)  {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "SMMUv3 MMIO write: bad offset/size %u/%u\n",
+                      offset, size);
+        return;
+    }
+
+    *r = deposit64(*r, offset << 3, 32, value);
+}
+
 /* Interrupts */
 
 #define smmuv3_eventq_irq_enabled(s)                   \
@@ -159,9 +178,6 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
 #define smmuv3_gerror_irq_enabled(s)                  \
     (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))
 
-void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
-void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
-
 /* Queue Handling */
 
 #define LOG2SIZE(q)        extract64((q)->base, 0, 5)
@@ -310,6 +326,4 @@ enum { /* Command completion notification */
             addr;                                             \
         })
 
-int smmuv3_cmdq_consume(SMMUv3State *s);
-
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 0b57215..fcfdbb0 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -37,7 +37,8 @@
  * @irq: irq type
  * @gerror_mask: mask of gerrors to toggle (relevant if @irq is GERROR)
  */
-void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
+static void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq,
+                               uint32_t gerror_mask)
 {
 
     bool pulse = false;
@@ -75,7 +76,7 @@ void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
     }
 }
 
-void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
+static void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
 {
     uint32_t pending = s->gerror ^ s->gerrorn;
     uint32_t toggled = s->gerrorn ^ new_gerrorn;
@@ -199,7 +200,7 @@ static void smmuv3_init_regs(SMMUv3State *s)
     s->sid_split = 0;
 }
 
-int smmuv3_cmdq_consume(SMMUv3State *s)
+static int smmuv3_cmdq_consume(SMMUv3State *s)
 {
     SMMUCmdError cmd_error = SMMU_CERROR_NONE;
     SMMUQueue *q = &s->cmdq;
@@ -298,7 +299,109 @@ int smmuv3_cmdq_consume(SMMUv3State *s)
 static void smmu_write_mmio(void *opaque, hwaddr addr,
                             uint64_t val, unsigned size)
 {
-    /* not yet implemented */
+    SMMUState *sys = opaque;
+    SMMUv3State *s = ARM_SMMUV3(sys);
+
+    /* CONSTRAINED UNPREDICTABLE choice to have page0/1 be exact aliases */
+    addr &= ~0x10000;
+
+    if (size != 4 && size != 8) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "SMMUv3 MMIO write: bad size %u\n", size);
+    }
+
+    trace_smmuv3_write_mmio(addr, val, size);
+
+    switch (addr) {
+    case A_CR0:
+        s->cr[0] = val;
+        s->cr0ack = val;
+        /* in case the command queue has been enabled */
+        smmuv3_cmdq_consume(s);
+        return;
+    case A_CR1:
+        s->cr[1] = val;
+        return;
+    case A_CR2:
+        s->cr[2] = val;
+        return;
+    case A_IRQ_CTRL:
+        s->irq_ctrl = val;
+        return;
+    case A_GERRORN:
+        smmuv3_write_gerrorn(s, val);
+        /*
+         * By acknowledging the CMDQ_ERR, SW signals that commands can
+         * be processed again
+         */
+        smmuv3_cmdq_consume(s);
+        return;
+    case A_GERROR_IRQ_CFG0: /* 64b */
+        smmu_write64(&s->gerror_irq_cfg0, 0, size, val);
+        return;
+    case A_GERROR_IRQ_CFG0 + 4:
+        smmu_write64(&s->gerror_irq_cfg0, 4, size, val);
+        return;
+    case A_GERROR_IRQ_CFG1:
+        s->gerror_irq_cfg1 = val;
+        return;
+    case A_GERROR_IRQ_CFG2:
+        s->gerror_irq_cfg2 = val;
+        return;
+    case A_STRTAB_BASE: /* 64b */
+        smmu_write64(&s->strtab_base, 0, size, val);
+        return;
+    case A_STRTAB_BASE + 4:
+        smmu_write64(&s->strtab_base, 4, size, val);
+        return;
+    case A_STRTAB_BASE_CFG:
+        s->strtab_base_cfg = val;
+        if (FIELD_EX32(val, STRTAB_BASE_CFG, FMT) == 1) {
+            s->sid_split = FIELD_EX32(val, STRTAB_BASE_CFG, SPLIT);
+            s->features |= SMMU_FEATURE_2LVL_STE;
+        }
+        return;
+    case A_CMDQ_BASE: /* 64b */
+        smmu_write64(&s->cmdq.base, 0, size, val);
+        return;
+    case A_CMDQ_BASE + 4: /* 64b */
+        smmu_write64(&s->cmdq.base, 4, size, val);
+        return;
+    case A_CMDQ_PROD:
+        s->cmdq.prod = val;
+        smmuv3_cmdq_consume(s);
+        return;
+    case A_CMDQ_CONS:
+        s->cmdq.cons = val;
+        return;
+    case A_EVENTQ_BASE: /* 64b */
+        smmu_write64(&s->eventq.base, 0, size, val);
+        return;
+    case A_EVENTQ_BASE + 4:
+        smmu_write64(&s->eventq.base, 4, size, val);
+        return;
+    case A_EVENTQ_PROD:
+        s->eventq.prod = val;
+        return;
+    case A_EVENTQ_CONS:
+        s->eventq.cons = val;
+        return;
+    case A_EVENTQ_IRQ_CFG0: /* 64b */
+        /* MSI doorbell address for event queue interrupts */
+        smmu_write64(&s->eventq_irq_cfg0, 0, size, val);
+        return;
+    case A_EVENTQ_IRQ_CFG0 + 4:
+        smmu_write64(&s->eventq_irq_cfg0, 4, size, val);
+        return;
+    case A_EVENTQ_IRQ_CFG1:
+        s->eventq_irq_cfg1 = val;
+        return;
+    case A_EVENTQ_IRQ_CFG2:
+        s->eventq_irq_cfg2 = val;
+        return;
+    default:
+        error_report("%s unhandled access at 0x%"PRIx64, __func__, addr);
+    }
 }
 
 static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 1c5105d..ed5dce0 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -22,3 +22,9 @@ smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
 smmuv3_cmdq_consume(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod=%d cons=%d prod.wrap=%d cons.wrap=%d"
 smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
 smmuv3_cmdq_consume_out(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod:%d, cons:%d, prod_wrap:%d, cons_wrap:%d "
+smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d c.wrap:%d"
+smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
+smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
+smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx val64:0x%lx"
+smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
-- 
2.5.5

* [Qemu-devel] [PATCH v9 08/14] hw/arm/smmuv3: Event queue recording helper
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (6 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 07/14] hw/arm/smmuv3: Implement MMIO write operations Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-08 18:39   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 09/14] hw/arm/smmuv3: Implement translate callback Eric Auger
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

Introduce a helper function that records an event in the
event queue.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v8 -> v9:
- add SMMU_EVENT_STRING

v7 -> v8:
- use dma_addr_t instead of hwaddr in smmuv3_record_event()
- introduce struct SMMUEventInfo
- add event_stringify + helpers for all fields
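
To make the record layout easier to follow, here is a minimal standalone
sketch of how fields are packed into an event record. It is an
illustration under assumptions, not the code from this patch: SketchEvt
assumes a record of eight 32-bit words, sketch_deposit32() is a local
stand-in for QEMU's deposit32(), and the sketch_evt_set_* names are
hypothetical. The bit positions for TYPE, SSV, SSID, SID and ADDR are the
ones used by the EVT_SET_* macros. Note that the result of the deposit
has to be stored back into the word for the field to take effect.

```c
#include <stdint.h>

/* Assumed record layout: eight 32-bit words. */
typedef struct {
    uint32_t word[8];
} SketchEvt;

/* Replace `length` bits of `reg`, starting at bit `start`, with `field`. */
static uint32_t sketch_deposit32(uint32_t reg, unsigned start,
                                 unsigned length, uint32_t field)
{
    uint32_t ones = (length == 32) ? ~UINT32_C(0)
                                   : (UINT32_C(1) << length) - 1;
    uint32_t mask = ones << start;

    return (reg & ~mask) | ((field << start) & mask);
}

static void sketch_evt_set_type(SketchEvt *e, uint32_t type)
{
    e->word[0] = sketch_deposit32(e->word[0], 0, 8, type);
}

static void sketch_evt_set_ssv(SketchEvt *e, uint32_t ssv)
{
    e->word[0] = sketch_deposit32(e->word[0], 11, 1, ssv);
}

static void sketch_evt_set_ssid(SketchEvt *e, uint32_t ssid)
{
    e->word[0] = sketch_deposit32(e->word[0], 12, 20, ssid);
}

static void sketch_evt_set_sid(SketchEvt *e, uint32_t sid)
{
    e->word[1] = sid;
}

/* The 64-bit input address is split across words 4 (low) and 5 (high). */
static void sketch_evt_set_addr(SketchEvt *e, uint64_t addr)
{
    e->word[4] = (uint32_t)(addr & 0xffffffff);
    e->word[5] = (uint32_t)(addr >> 32);
}
```

The record must start out zeroed (for example "SketchEvt e = {0};") so
that fields which are not set for a given event type read as zero.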
---
 hw/arm/smmuv3-internal.h | 140 ++++++++++++++++++++++++++++++++++++++++++++++-
 hw/arm/smmuv3.c          |  91 +++++++++++++++++++++++++++++-
 hw/arm/trace-events      |   1 +
 3 files changed, 229 insertions(+), 3 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 5af97ae..3929f69 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -226,8 +226,6 @@ static inline void smmu_write_cmdq_err(SMMUv3State *s, uint32_t err_type)
     s->cmdq.cons = FIELD_DP32(s->cmdq.cons, CMDQ_CONS, ERR, err_type);
 }
 
-void smmuv3_write_eventq(SMMUv3State *s, Evt *evt);
-
 /* Commands */
 
 enum {
@@ -326,4 +324,142 @@ enum { /* Command completion notification */
             addr;                                             \
         })
 
+/* Events */
+
+typedef enum SMMUEventType {
+    SMMU_EVT_OK                 = 0x00,
+    SMMU_EVT_F_UUT              = 0x01,
+    SMMU_EVT_C_BAD_STREAMID     = 0x02,
+    SMMU_EVT_F_STE_FETCH        = 0x03,
+    SMMU_EVT_C_BAD_STE          = 0x04,
+    SMMU_EVT_F_BAD_ATS_TREQ     = 0x05,
+    SMMU_EVT_F_STREAM_DISABLED  = 0x06,
+    SMMU_EVT_F_TRANS_FORBIDDEN  = 0x07,
+    SMMU_EVT_C_BAD_SUBSTREAMID  = 0x08,
+    SMMU_EVT_F_CD_FETCH         = 0x09,
+    SMMU_EVT_C_BAD_CD           = 0x0a,
+    SMMU_EVT_F_WALK_EABT        = 0x0b,
+    SMMU_EVT_F_TRANSLATION      = 0x10,
+    SMMU_EVT_F_ADDR_SIZE        = 0x11,
+    SMMU_EVT_F_ACCESS           = 0x12,
+    SMMU_EVT_F_PERMISSION       = 0x13,
+    SMMU_EVT_F_TLB_CONFLICT     = 0x20,
+    SMMU_EVT_F_CFG_CONFLICT     = 0x21,
+    SMMU_EVT_E_PAGE_REQ         = 0x24,
+} SMMUEventType;
+
+static const char *event_stringify[] = {
+    [SMMU_EVT_OK]                       = "SMMU_EVT_OK",
+    [SMMU_EVT_F_UUT]                    = "SMMU_EVT_F_UUT",
+    [SMMU_EVT_C_BAD_STREAMID]           = "SMMU_EVT_C_BAD_STREAMID",
+    [SMMU_EVT_F_STE_FETCH]              = "SMMU_EVT_F_STE_FETCH",
+    [SMMU_EVT_C_BAD_STE]                = "SMMU_EVT_C_BAD_STE",
+    [SMMU_EVT_F_BAD_ATS_TREQ]           = "SMMU_EVT_F_BAD_ATS_TREQ",
+    [SMMU_EVT_F_STREAM_DISABLED]        = "SMMU_EVT_F_STREAM_DISABLED",
+    [SMMU_EVT_F_TRANS_FORBIDDEN]        = "SMMU_EVT_F_TRANS_FORBIDDEN",
+    [SMMU_EVT_C_BAD_SUBSTREAMID]        = "SMMU_EVT_C_BAD_SUBSTREAMID",
+    [SMMU_EVT_F_CD_FETCH]               = "SMMU_EVT_F_CD_FETCH",
+    [SMMU_EVT_C_BAD_CD]                 = "SMMU_EVT_C_BAD_CD",
+    [SMMU_EVT_F_WALK_EABT]              = "SMMU_EVT_F_WALK_EABT",
+    [SMMU_EVT_F_TRANSLATION]            = "SMMU_EVT_F_TRANSLATION",
+    [SMMU_EVT_F_ADDR_SIZE]              = "SMMU_EVT_F_ADDR_SIZE",
+    [SMMU_EVT_F_ACCESS]                 = "SMMU_EVT_F_ACCESS",
+    [SMMU_EVT_F_PERMISSION]             = "SMMU_EVT_F_PERMISSION",
+    [SMMU_EVT_F_TLB_CONFLICT]           = "SMMU_EVT_F_TLB_CONFLICT",
+    [SMMU_EVT_F_CFG_CONFLICT]           = "SMMU_EVT_F_CFG_CONFLICT",
+    [SMMU_EVT_E_PAGE_REQ]               = "SMMU_EVT_E_PAGE_REQ",
+};
+
+#define SMMU_EVENT_STRING(event) (                                         \
+(event < ARRAY_SIZE(event_stringify)) ? event_stringify[event] : "UNKNOWN" \
+)
+
+typedef struct SMMUEventInfo {
+    SMMUEventType type;
+    uint32_t sid;
+    bool recorded;
+    bool record_trans_faults;
+    union {
+        struct {
+            uint32_t ssid;
+            bool ssv;
+            dma_addr_t addr;
+            bool rnw;
+            bool pnu;
+            bool ind;
+       } f_uut;
+       struct ssid_info {
+            uint32_t ssid;
+            bool ssv;
+       } c_bad_streamid;
+       struct ssid_addr_info {
+            uint32_t ssid;
+            bool ssv;
+            dma_addr_t addr;
+       } f_ste_fetch;
+       struct ssid_info c_bad_ste;
+       struct {
+            dma_addr_t addr;
+            bool rnw;
+       } f_transl_forbidden;
+       struct {
+            uint32_t ssid;
+       } c_bad_substream;
+       struct ssid_addr_info f_cd_fetch;
+       struct ssid_info c_bad_cd;
+       struct full_info {
+            bool stall;
+            uint16_t stag;
+            uint32_t ssid;
+            bool ssv;
+            bool s2;
+            dma_addr_t addr;
+            bool rnw;
+            bool pnu;
+            bool ind;
+            uint8_t class;
+            dma_addr_t addr2;
+       } f_walk_eabt;
+       struct full_info f_translation;
+       struct full_info f_addr_size;
+       struct full_info f_access;
+       struct full_info f_permission;
+       struct ssid_info f_cfg_conflict;
+       /**
+        * not supported yet:
+        * F_BAD_ATS_TREQ
+        * F_BAD_ATS_TREQ
+        * F_TLB_CONFLICT
+        * E_PAGE_REQUEST
+        * IMPDEF_EVENTn
+        */
+    } u;
+} SMMUEventInfo;
+
+/* EVTQ fields */
+
+#define EVT_Q_OVERFLOW        (1 << 31)
+
+#define EVT_SET_TYPE(x, v)  ((x)->word[0] = deposit32((x)->word[0], 0 , 8 , v))
+#define EVT_SET_SSV(x, v)   ((x)->word[0] = deposit32((x)->word[0], 11, 1 , v))
+#define EVT_SET_SSID(x, v)  ((x)->word[0] = deposit32((x)->word[0], 12, 20, v))
+#define EVT_SET_SID(x, v)   ((x)->word[1] = v)
+#define EVT_SET_STAG(x, v)  ((x)->word[2] = deposit32((x)->word[2], 0 , 16, v))
+#define EVT_SET_STALL(x, v) ((x)->word[2] = deposit32((x)->word[2], 31, 1 , v))
+#define EVT_SET_PNU(x, v)   ((x)->word[3] = deposit32((x)->word[3], 1 , 1 , v))
+#define EVT_SET_IND(x, v)   ((x)->word[3] = deposit32((x)->word[3], 2 , 1 , v))
+#define EVT_SET_RNW(x, v)   ((x)->word[3] = deposit32((x)->word[3], 3 , 1 , v))
+#define EVT_SET_S2(x, v)    ((x)->word[3] = deposit32((x)->word[3], 7 , 1 , v))
+#define EVT_SET_CLASS(x, v) ((x)->word[3] = deposit32((x)->word[3], 8 , 2 , v))
+#define EVT_SET_ADDR(x, addr) ({                          \
+            (x)->word[5] = (uint32_t)(addr >> 32);        \
+            (x)->word[4] = (uint32_t)(addr & 0xffffffff); \
+        })
+#define EVT_SET_ADDR2(x, addr) ({                                         \
+            (x)->word[7] = deposit32((x)->word[7], 3, 29, addr >> 16);    \
+            (x)->word[7] = deposit32((x)->word[7], 0, 16, addr & 0xffff); \
+        })
+
+void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *event);
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index fcfdbb0..0adfe53 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -140,7 +140,7 @@ static void queue_write(SMMUQueue *q, void *data)
     queue_prod_incr(q);
 }
 
-void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
+static void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
 {
     SMMUQueue *q = &s->eventq;
     bool q_empty = Q_EMPTY(q);
@@ -161,6 +161,95 @@ void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
     }
 }
 
+void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *info)
+{
+    Evt evt = {};
+
+    if (!SMMUV3_EVENTQ_ENABLED(s)) {
+        return;
+    }
+
+    EVT_SET_TYPE(&evt, info->type);
+    EVT_SET_SID(&evt, info->sid);
+
+    switch (info->type) {
+    case SMMU_EVT_OK:
+        return;
+    case SMMU_EVT_F_UUT:
+        EVT_SET_SSID(&evt, info->u.f_uut.ssid);
+        EVT_SET_SSV(&evt,  info->u.f_uut.ssv);
+        EVT_SET_ADDR(&evt, info->u.f_uut.addr);
+        EVT_SET_RNW(&evt,  info->u.f_uut.rnw);
+        EVT_SET_PNU(&evt,  info->u.f_uut.pnu);
+        EVT_SET_IND(&evt,  info->u.f_uut.ind);
+        break;
+    case SMMU_EVT_C_BAD_STREAMID:
+        EVT_SET_SSID(&evt, info->u.c_bad_streamid.ssid);
+        EVT_SET_SSV(&evt,  info->u.c_bad_streamid.ssv);
+        break;
+    case SMMU_EVT_F_STE_FETCH:
+        EVT_SET_SSID(&evt, info->u.f_ste_fetch.ssid);
+        EVT_SET_SSV(&evt,  info->u.f_ste_fetch.ssv);
+        EVT_SET_ADDR(&evt, info->u.f_ste_fetch.addr);
+        break;
+    case SMMU_EVT_C_BAD_STE:
+        EVT_SET_SSID(&evt, info->u.c_bad_ste.ssid);
+        EVT_SET_SSV(&evt,  info->u.c_bad_ste.ssv);
+        break;
+    case SMMU_EVT_F_STREAM_DISABLED:
+        break;
+    case SMMU_EVT_F_TRANS_FORBIDDEN:
+        EVT_SET_ADDR(&evt, info->u.f_transl_forbidden.addr);
+        EVT_SET_RNW(&evt, info->u.f_transl_forbidden.rnw);
+        break;
+    case SMMU_EVT_C_BAD_SUBSTREAMID:
+        EVT_SET_SSID(&evt, info->u.c_bad_substream.ssid);
+        break;
+    case SMMU_EVT_F_CD_FETCH:
+        EVT_SET_SSID(&evt, info->u.f_cd_fetch.ssid);
+        EVT_SET_SSV(&evt,  info->u.f_cd_fetch.ssv);
+        EVT_SET_ADDR(&evt, info->u.f_cd_fetch.addr);
+        break;
+    case SMMU_EVT_C_BAD_CD:
+        EVT_SET_SSID(&evt, info->u.c_bad_cd.ssid);
+        EVT_SET_SSV(&evt,  info->u.c_bad_cd.ssv);
+        break;
+    case SMMU_EVT_F_WALK_EABT:
+    case SMMU_EVT_F_TRANSLATION:
+    case SMMU_EVT_F_ADDR_SIZE:
+    case SMMU_EVT_F_ACCESS:
+    case SMMU_EVT_F_PERMISSION:
+        EVT_SET_STALL(&evt, info->u.f_walk_eabt.stall);
+        EVT_SET_STAG(&evt, info->u.f_walk_eabt.stag);
+        EVT_SET_SSID(&evt, info->u.f_walk_eabt.ssid);
+        EVT_SET_SSV(&evt, info->u.f_walk_eabt.ssv);
+        EVT_SET_S2(&evt, info->u.f_walk_eabt.s2);
+        EVT_SET_ADDR(&evt, info->u.f_walk_eabt.addr);
+        EVT_SET_RNW(&evt, info->u.f_walk_eabt.rnw);
+        EVT_SET_PNU(&evt, info->u.f_walk_eabt.pnu);
+        EVT_SET_IND(&evt, info->u.f_walk_eabt.ind);
+        EVT_SET_CLASS(&evt, info->u.f_walk_eabt.class);
+        EVT_SET_ADDR2(&evt, info->u.f_walk_eabt.addr2);
+        break;
+    case SMMU_EVT_F_CFG_CONFLICT:
+        EVT_SET_SSID(&evt, info->u.f_cfg_conflict.ssid);
+        EVT_SET_SSV(&evt,  info->u.f_cfg_conflict.ssv);
+        break;
+    /* rest is not implemented */
+    case SMMU_EVT_F_BAD_ATS_TREQ:
+    case SMMU_EVT_F_TLB_CONFLICT:
+    case SMMU_EVT_E_PAGE_REQ:
+    default:
+        error_report("%s event %d not supported", __func__,
+                     info->type);
+        return;
+    }
+
+    trace_smmuv3_record_event(SMMU_EVENT_STRING(info->type), info->sid);
+    smmuv3_write_eventq(s, &evt);
+    info->recorded = true;
+}
+
 static void smmuv3_init_regs(SMMUv3State *s)
 {
     /**
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index ed5dce0..c79c15e 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -28,3 +28,4 @@ smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" v
 smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx val64:0x%lx"
 smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
 smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_record_event(const char *type, uint32_t sid) "%s sid=%d"
-- 
2.5.5


* [Qemu-devel] [PATCH v9 09/14] hw/arm/smmuv3: Implement translate callback
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (7 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 08/14] hw/arm/smmuv3: Event queue recording helper Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-09 18:46   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case Eric Auger
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

This patch implements the IOMMU Memory Region translate()
callback. Most of the code relates to the translation
configuration decoding and check (STE, CD).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v8 -> v9:
- use SMMU_EVENT_STRING macro
- get rid of the last error_report calls
- decode asid
- handle config abort before ptw
- add 64-bit single-copy atomic comment
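
To illustrate the "handle config abort before ptw" item: the STE Config
field can be classified before any page table walk is attempted. The
standalone sketch below mirrors the STE_CONFIG and STE_CFG_* macros of this
patch. All sketch_* names and the enum are hypothetical and are not part of
the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's extract32(): 'length' bits from 'start'. */
static inline uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Hypothetical outcome of looking at STE.Config only. */
typedef enum {
    SKETCH_STE_ABORT,        /* Config = 0b0xx: abort the transaction */
    SKETCH_STE_BYPASS,       /* Config = 0b100: no translation */
    SKETCH_STE_STAGE1,       /* Config = 0b101: stage 1 only */
    SKETCH_STE_UNSUPPORTED,  /* Config = 0b11x: stage 2 is involved */
} SketchSteAction;

/* Same layout as the STE structure in the patch: sixteen 32-bit words. */
typedef struct {
    uint32_t word[16];
} SketchSTE;

static SketchSteAction sketch_classify_ste(const SketchSTE *ste)
{
    /* Config occupies bits [3:1] of word 0, as in STE_CONFIG(). */
    uint32_t config = sketch_extract32(ste->word[0], 1, 3);

    if (!(config & 0x4)) {
        return SKETCH_STE_ABORT;
    }
    if (config == 0x4) {
        return SKETCH_STE_BYPASS;
    }
    if (config & 0x2) {
        return SKETCH_STE_UNSUPPORTED;
    }
    return SKETCH_STE_STAGE1;
}
```

Only a SKETCH_STE_STAGE1 result would go on to fetch and decode the CD; the
other three outcomes are decided without touching the page tables.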

v7 -> v8:
- use address_space_rw
- s/Ste/STE, s/Cd/CD
- use dma_memory_read
- remove everything related to stage 2
- collect data for both TTx
- renamings
- pass the event handle all along the config decoding path
- decode tbi, ars
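
To illustrate "collect data for both TTx": the per-table CD fields live in
word 0 at a 16-bit stride, so the same extraction works for TT0 and TT1.
The sketch below uses the bit positions of the CD_TSZ, CD_TG and CD_EPD
macros from this patch. The sketch_* helpers and the SketchTTFields type are
hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the raw per-table fields of CD word 0. */
typedef struct {
    bool disabled;  /* EPDx */
    uint8_t tsz;    /* TxSZ */
    uint8_t tg;     /* raw TGx encoding, not yet converted to a granule */
} SketchTTFields;

/* Extract 'length' (< 32) bits of 'value' starting at bit 'start'. */
static uint32_t sketch_bits(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length < 32 && start + length <= 32);
    return (value >> start) & ((1u << length) - 1);
}

/* Decode the fields for translation table 'sel' (0 or 1). */
static SketchTTFields sketch_decode_tt(uint32_t cd_word0, int sel)
{
    SketchTTFields f;
    int base = 16 * sel;  /* TT1 fields sit 16 bits above the TT0 fields */

    assert(sel == 0 || sel == 1);
    f.tsz = (uint8_t)sketch_bits(cd_word0, base + 0, 6);
    f.tg = (uint8_t)sketch_bits(cd_word0, base + 6, 2);
    f.disabled = sketch_bits(cd_word0, base + 14, 1) != 0;
    return f;
}

/* Range check applied by decode_cd() in this patch. */
static bool sketch_tsz_in_range(uint8_t tsz)
{
    return tsz >= 16 && tsz <= 39;
}
```

A caller would loop over sel = 0 and sel = 1, skip a table whose disabled
flag is set, and reject the CD when sketch_tsz_in_range() fails.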
---
 hw/arm/smmuv3-internal.h | 146 ++++++++++++++++++++
 hw/arm/smmuv3.c          | 341 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |   9 ++
 3 files changed, 496 insertions(+)
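
The 2-level stream table lookup in smmu_find_ste() boils down to splitting
the StreamID into a level-1 index and a level-2 index. The following is a
minimal sketch of that arithmetic, assuming an 8-byte level-1 descriptor and
a 64-byte STE (the sizes of STEDesc and STE in this patch). The sketch_*
names and the two size constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_L1STD_SIZE 8u   /* two 32-bit words, like STEDesc */
#define SKETCH_STE_SIZE   64u  /* sixteen 32-bit words, like STE */

typedef struct {
    uint32_t l1_index;  /* index of the level-1 descriptor */
    uint32_t l2_index;  /* index of the STE inside the level-2 table */
} SketchSidSplit;

/* Split a StreamID at bit 'split' (the sid_split value of the device). */
static SketchSidSplit sketch_split_sid(uint32_t sid, unsigned split)
{
    SketchSidSplit r;

    assert(split < 32);
    r.l1_index = sid >> split;
    r.l2_index = sid & ((1u << split) - 1);
    return r;
}

/* Address of the level-1 descriptor inside the stream table. */
static uint64_t sketch_l1_desc_addr(uint64_t strtab_base, uint32_t l1_index)
{
    return strtab_base + (uint64_t)l1_index * SKETCH_L1STD_SIZE;
}

/* Address of the STE inside the level-2 table pointed to by 'l2ptr'. */
static uint64_t sketch_ste_addr(uint64_t l2ptr, uint32_t l2_index)
{
    return l2ptr + (uint64_t)l2_index * SKETCH_STE_SIZE;
}
```

For example, with a split of 8, StreamID 0x1234 selects level-1 descriptor
0x12 and entry 0x34 of the level-2 table that descriptor points to.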

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 3929f69..b203426 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -462,4 +462,150 @@ typedef struct SMMUEventInfo {
 
 void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *event);
 
+/* Configuration Data */
+
+/* STE Level 1 Descriptor */
+typedef struct STEDesc {
+    uint32_t word[2];
+} STEDesc;
+
+/* CD Level 1 Descriptor */
+typedef struct CDDesc {
+    uint32_t word[2];
+} CDDesc;
+
+/* Stream Table Entry(STE) */
+typedef struct STE {
+    uint32_t word[16];
+} STE;
+
+/* Context Descriptor(CD) */
+typedef struct CD {
+    uint32_t word[16];
+} CD;
+
+/* STE fields */
+
+#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */
+
+#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
+#define STE_CFG_S1_ENABLED(config) (config & 0x1)
+#define STE_CFG_S2_ENABLED(config) (config & 0x2)
+#define STE_CFG_ABORT(config)      (!(config & 0x4))
+#define STE_CFG_BYPASS(config)     (config == 0x4)
+
+#define STE_S1FMT(x)   extract32((x)->word[0], 4 , 2)
+#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
+#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
+#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
+#define STE_S2VMID(x)  extract32((x)->word[4], 0 , 16)
+#define STE_S2T0SZ(x)  extract32((x)->word[5], 0 , 6)
+#define STE_S2SL0(x)   extract32((x)->word[5], 6 , 2)
+#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
+#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
+#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
+#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
+#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
+#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
+#define STE_CTXPTR(x)                                           \
+    ({                                                          \
+        uint64_t addr;                                          \
+        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
+        addr;                                                   \
+    })
+
+#define STE_S2TTB(x)                                            \
+    ({                                                          \
+        uint64_t addr;                                          \
+        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
+        addr;                                                   \
+    })
+
+static inline int oas2bits(int oas_field)
+{
+    switch (oas_field) {
+    case 0b011:
+        return 42;
+    case 0b100:
+        return 44;
+    default:
+        return 32 + (1 << oas_field);
+   }
+}
+
+static inline int pa_range(STE *ste)
+{
+    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
+
+    if (!STE_S2AA64(ste)) {
+        return 40;
+    }
+
+    return oas2bits(oas_field);
+}
+
+#define MAX_PA(ste) ((1ULL << pa_range(ste)) - 1)
+
+/* CD fields */
+
+#define CD_VALID(x)   extract32((x)->word[0], 31, 1)
+#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
+#define CD_TTB(x, sel)                                      \
+    ({                                                      \
+        uint64_t hi, lo;                                    \
+        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \
+        hi <<= 32;                                          \
+        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \
+        hi | lo;                                            \
+    })
+
+#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
+#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
+#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
+#define CD_ENDI(x)       extract32((x)->word[0], 15, 1)
+#define CD_IPS(x)        extract32((x)->word[1], 0 , 3)
+#define CD_TBI(x)        extract32((x)->word[1], 6 , 2)
+#define CD_S(x)          extract32((x)->word[1], 12, 1)
+#define CD_R(x)          extract32((x)->word[1], 13, 1)
+#define CD_A(x)          extract32((x)->word[1], 14, 1)
+#define CD_AARCH64(x)    extract32((x)->word[1], 9 , 1)
+
+#define CDM_VALID(x)    ((x)->word[0] & 0x1)
+
+static inline int is_cd_valid(SMMUv3State *s, STE *ste, CD *cd)
+{
+    return CD_VALID(cd);
+}
+
+/**
+ * tg2granule - Decodes the CD translation granule size field according
+ * to the ttbr in use
+ * @bits: TG0/1 fields
+ * @ttbr: ttbr index in use
+ */
+static inline int tg2granule(int bits, int ttbr)
+{
+    switch (bits) {
+    case 1:
+        return ttbr ? 14 : 16;
+    case 2:
+        return ttbr ? 12 : 14;
+    case 3:
+        return ttbr ? 16 : 12;
+    default:
+        return 12;
+    }
+}
+
+#define L1STD_L2PTR(stm) ({                                 \
+            uint64_t hi, lo;                            \
+            hi = (stm)->word[1];                        \
+            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \
+            hi << 32 | lo;                              \
+        })
+
+#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 4))
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 0adfe53..384393f 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -289,6 +289,344 @@ static void smmuv3_init_regs(SMMUv3State *s)
     s->sid_split = 0;
 }
 
+static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
+                        SMMUEventInfo *event)
+{
+    int ret;
+
+    trace_smmuv3_get_ste(addr);
+    /* TODO: guarantee 64-bit single-copy atomicity */
+    ret = dma_memory_read(&address_space_memory, addr,
+                          (void *)buf, sizeof(*buf));
+    if (ret != MEMTX_OK) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "Cannot fetch STE at address=0x%"PRIx64"\n", addr);
+        event->type = SMMU_EVT_F_STE_FETCH;
+        event->u.f_ste_fetch.addr = addr;
+        return -EINVAL;
+    }
+    return 0;
+
+}
+
+/* @ssid > 0 not supported yet */
+static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid,
+                       CD *buf, SMMUEventInfo *event)
+{
+    dma_addr_t addr = STE_CTXPTR(ste);
+    int ret;
+
+    trace_smmuv3_get_cd(addr);
+    /* TODO: guarantee 64-bit single-copy atomicity */
+    ret = dma_memory_read(&address_space_memory, addr,
+                           (void *)buf, sizeof(*buf));
+    if (ret != MEMTX_OK) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "Cannot fetch CD at address=0x%"PRIx64"\n", addr);
+        event->type = SMMU_EVT_F_CD_FETCH;
+        event->u.f_cd_fetch.addr = addr;
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
+                      STE *ste, SMMUEventInfo *event)
+{
+    uint32_t config = STE_CONFIG(ste);
+    int ret = -EINVAL;
+
+    if (STE_CFG_ABORT(config)) {
+        /* abort but don't record any event */
+        cfg->aborted = true;
+        return ret;
+    }
+
+    if (STE_CFG_BYPASS(config)) {
+        cfg->bypassed = true;
+        return ret;
+    }
+
+    if (!STE_VALID(ste)) {
+        goto bad_ste;
+    }
+
+    if (STE_CFG_S2_ENABLED(config)) {
+        error_setg(&error_fatal, "SMMUv3 does not support stage 2 yet");
+    }
+
+    if (STE_S1CDMAX(ste) != 0) {
+        error_setg(&error_fatal,
+                   "SMMUv3 does not support multiple context descriptors yet");
+        goto bad_ste;
+    }
+    return 0;
+
+bad_ste:
+    event->type = SMMU_EVT_C_BAD_STE;
+    return -EINVAL;
+}
+
+/**
+ * smmu_find_ste - Return the stream table entry associated
+ * with the sid
+ *
+ * @s: smmuv3 handle
+ * @sid: stream ID
+ * @ste: returned stream table entry
+ * @event: handle to an event info
+ *
+ * Supports linear and 2-level stream table
+ * Return 0 on success, -EINVAL otherwise
+ */
+static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
+                         SMMUEventInfo *event)
+{
+    dma_addr_t addr;
+    int ret;
+
+    trace_smmuv3_find_ste(sid, s->features, s->sid_split);
+    /* Check SID range */
+    if (sid >= (1 << SMMU_IDR1_SIDSIZE)) {
+        event->type = SMMU_EVT_C_BAD_STREAMID;
+        return -EINVAL;
+    }
+    if (s->features & SMMU_FEATURE_2LVL_STE) {
+        int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
+        dma_addr_t strtab_base, l1ptr, l2ptr;
+        STEDesc l1std;
+
+        strtab_base = s->strtab_base & SMMU_BASE_ADDR_MASK;
+        l1_ste_offset = sid >> s->sid_split;
+        l2_ste_offset = sid & ((1 << s->sid_split) - 1);
+        l1ptr = (dma_addr_t)(strtab_base + l1_ste_offset * sizeof(l1std));
+        /* TODO: guarantee 64-bit single-copy atomicity */
+        ret = dma_memory_read(&address_space_memory, l1ptr,
+                              (uint8_t *)&l1std, sizeof(l1std));
+        if (ret != MEMTX_OK) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "Could not read L1PTR at 0x%"PRIx64"\n", l1ptr);
+            event->type = SMMU_EVT_F_STE_FETCH;
+            event->u.f_ste_fetch.addr = l1ptr;
+            return -EINVAL;
+        }
+
+        span = L1STD_SPAN(&l1std);
+
+        if (!span) {
+            /* l2ptr is not valid */
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "invalid sid=%d (L1STD span=0)\n", sid);
+            event->type = SMMU_EVT_C_BAD_STREAMID;
+            return -EINVAL;
+        }
+        max_l2_ste = (1 << span) - 1;
+        l2ptr = L1STD_L2PTR(&l1std);
+        trace_smmuv3_find_ste_2lvl(s->strtab_base, l1ptr, l1_ste_offset,
+                                   l2ptr, l2_ste_offset, max_l2_ste);
+        if (l2_ste_offset > max_l2_ste) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "l2_ste_offset=%d > max_l2_ste=%d\n",
+                          l2_ste_offset, max_l2_ste);
+            event->type = SMMU_EVT_C_BAD_STE;
+            return -EINVAL;
+        }
+        addr = L1STD_L2PTR(&l1std) + l2_ste_offset * sizeof(*ste);
+    } else {
+        addr = s->strtab_base + sid * sizeof(*ste);
+    }
+
+    if (smmu_get_ste(s, addr, ste, event)) {
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event)
+{
+    int ret = -EINVAL;
+    int i;
+
+    if (!CD_VALID(cd) || !CD_AARCH64(cd)) {
+        goto error;
+    }
+
+    /* we support only those at the moment */
+    cfg->aa64 = true;
+    cfg->stage = 1;
+
+    cfg->oas = oas2bits(CD_IPS(cd));
+    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
+    cfg->tbi = CD_TBI(cd);
+    cfg->asid = CD_ASID(cd);
+
+    trace_smmuv3_decode_cd(cfg->oas);
+
+    /* decode data dependent on TT */
+    for (i = 0; i <= 1; i++) {
+        int tg, tsz;
+        SMMUTransTableInfo *tt = &cfg->tt[i];
+
+        cfg->tt[i].disabled = CD_EPD(cd, i);
+        if (cfg->tt[i].disabled) {
+            continue;
+        }
+
+        tsz = CD_TSZ(cd, i);
+        if (tsz < 16 || tsz > 39) {
+            goto error;
+        }
+
+        tg = CD_TG(cd, i);
+        tt->granule_sz = tg2granule(tg, i);
+        if ((tt->granule_sz != 12 && tt->granule_sz != 16) || CD_ENDI(cd)) {
+            goto error;
+        }
+
+        tt->tsz = tsz;
+        tt->initial_level = 4 - (64 - tsz - 4) / (tt->granule_sz - 3);
+        tt->ttb = CD_TTB(cd, i);
+        tt->ttb = extract64(tt->ttb, 0, cfg->oas);
+        trace_smmuv3_decode_cd_tt(i, tt->tsz, tt->ttb,
+                                  tt->granule_sz, tt->initial_level);
+    }
+
+    event->record_trans_faults = CD_R(cd);
+
+    return 0;
+
+error:
+    event->type = SMMU_EVT_C_BAD_CD;
+    return ret;
+}
+
+/**
+ * smmuv3_decode_config - Prepare the translation configuration
+ * for the @mr iommu region
+ * @mr: iommu memory region the translation config must be prepared for
+ * @cfg: output translation configuration which is populated through
+ *       the different configuration decoding steps
+ * @event: must be zero'ed by the caller
+ *
+ * return < 0 if the translation needs to be aborted (@event is filled
+ * accordingly). Return 0 otherwise.
+ */
+static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg,
+                                SMMUEventInfo *event)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    uint32_t sid = smmu_get_sid(sdev);
+    SMMUv3State *s = sdev->smmu;
+    int ret = -EINVAL;
+    STE ste;
+    CD cd;
+
+    if (smmu_find_ste(s, sid, &ste, event)) {
+        return ret;
+    }
+
+    if (decode_ste(s, cfg, &ste, event)) {
+        return ret;
+    }
+
+    if (smmu_get_cd(s, &ste, 0 /* ssid */, &cd, event)) {
+        return ret;
+    }
+
+    return decode_cd(cfg, &cd, event);
+}
+
+static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
+                                      IOMMUAccessFlags flag)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    SMMUv3State *s = sdev->smmu;
+    uint32_t sid = smmu_get_sid(sdev);
+    SMMUEventInfo event = {.type = SMMU_EVT_OK, .sid = sid};
+    SMMUPTWEventInfo ptw_info = {};
+    SMMUTransCfg cfg = {};
+    IOMMUTLBEntry entry = {
+        .target_as = &address_space_memory,
+        .iova = addr,
+        .translated_addr = addr,
+        .addr_mask = ~(hwaddr)0,
+        .perm = IOMMU_NONE,
+    };
+    int ret = 0;
+
+    if (!smmu_enabled(s)) {
+        goto out;
+    }
+
+    ret = smmuv3_decode_config(mr, &cfg, &event);
+    if (ret) {
+        goto out;
+    }
+
+    if (cfg.aborted) {
+        goto out;
+    }
+
+    ret = smmu_ptw(&cfg, addr, flag, &entry, &ptw_info);
+    if (ret) {
+        switch (ptw_info.type) {
+        case SMMU_PTW_ERR_WALK_EABT:
+            event.type = SMMU_EVT_F_WALK_EABT;
+            event.u.f_walk_eabt.addr = addr;
+            event.u.f_walk_eabt.rnw = flag & 0x1;
+            event.u.f_walk_eabt.class = 0x1;
+            event.u.f_walk_eabt.addr2 = ptw_info.addr;
+            break;
+        case SMMU_PTW_ERR_TRANSLATION:
+            if (event.record_trans_faults) {
+                event.type = SMMU_EVT_F_TRANSLATION;
+                event.u.f_translation.addr = addr;
+                event.u.f_translation.rnw = flag & 0x1;
+            }
+            break;
+        case SMMU_PTW_ERR_ADDR_SIZE:
+            if (event.record_trans_faults) {
+                event.type = SMMU_EVT_F_ADDR_SIZE;
+                event.u.f_addr_size.addr = addr;
+                event.u.f_addr_size.rnw = flag & 0x1;
+            }
+            break;
+        case SMMU_PTW_ERR_ACCESS:
+            if (event.record_trans_faults) {
+                event.type = SMMU_EVT_F_ACCESS;
+                event.u.f_access.addr = addr;
+                event.u.f_access.rnw = flag & 0x1;
+            }
+            break;
+        case SMMU_PTW_ERR_PERMISSION:
+            if (event.record_trans_faults) {
+                event.type = SMMU_EVT_F_PERMISSION;
+                event.u.f_permission.addr = addr;
+                event.u.f_permission.rnw = flag & 0x1;
+            }
+            break;
+        default:
+            error_setg(&error_fatal, "SMMUV3 BUG");
+        }
+    }
+
+    trace_smmuv3_translate(mr->parent_obj.name, sid, addr,
+                           entry.translated_addr, entry.perm);
+out:
+    if (ret) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s translation failed for iova=0x%"PRIx64" (%s)\n",
+                      mr->parent_obj.name, addr, SMMU_EVENT_STRING(event.type));
+        entry.perm = IOMMU_NONE;
+        smmuv3_record_event(s, &event);
+    } else if (!cfg.aborted) {
+        entry.perm = flag;
+    }
+
+    return entry;
+}
+
 static int smmuv3_cmdq_consume(SMMUv3State *s)
 {
     SMMUCmdError cmd_error = SMMU_CERROR_NONE;
@@ -739,6 +1077,9 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
                                                   void *data)
 {
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
+
+    imrc->translate = smmuv3_translate;
 }
 
 static const TypeInfo smmuv3_type_info = {
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index c79c15e..1102bd4 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -29,3 +29,12 @@ smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx v
 smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
 smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
 smmuv3_record_event(const char *type, uint32_t sid) "%s sid=%d"
+smmuv3_find_ste(uint16_t sid, uint32_t features, uint16_t sid_split) "SID:0x%x features:0x%x, sid_split:0x%x"
+smmuv3_find_ste_2lvl(uint64_t strtab_base, hwaddr l1ptr, int l1_ste_offset, hwaddr l2ptr, int l2_ste_offset, int max_l2_ste) "strtab_base:0x%"PRIx64" l1ptr:0x%"PRIx64" l1_off:0x%x, l2ptr:0x%"PRIx64" l2_off:0x%x max_l2_ste:%d"
+smmuv3_get_ste(hwaddr addr) "STE addr: 0x%"PRIx64
+smmuv3_translate_bypass(const char *n, uint16_t sid, hwaddr addr, bool is_write) "%s sid=%d bypass iova:0x%"PRIx64" is_write=%d"
+smmuv3_translate_in(uint16_t sid, int pci_bus_num, hwaddr strtab_base) "SID:0x%x bus:%d strtab_base:0x%"PRIx64
+smmuv3_get_cd(hwaddr addr) "CD addr: 0x%"PRIx64
+smmuv3_translate(const char *n, uint16_t sid, hwaddr iova, hwaddr translated, int perm) "%s sid=%d iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x"
+smmuv3_decode_cd(uint32_t oas) "oas=%d"
+smmuv3_decode_cd_tt(int i, uint32_t tsz, uint64_t ttb, uint32_t granule_sz, int initial_level) "TT[%d]:tsz:%d ttb:0x%"PRIx64" granule_sz:%d, initial_level = %d"
-- 
2.5.5


* [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (8 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 09/14] hw/arm/smmuv3: Implement translate callback Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-08 19:06   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

At the moment, the SMMUv3 does not support notification on
TLB invalidation. So let's abort as soon as such a notifier gets
enabled.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
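
The check added by this patch fires on the transition from "no notifier" to
"some notifier". A minimal standalone sketch of that condition follows. The
enum is a hypothetical stand-in for QEMU's IOMMUNotifierFlag, and the helper
name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the notifier flag type used by the callback. */
typedef enum {
    SKETCH_NOTIFIER_NONE  = 0,
    SKETCH_NOTIFIER_UNMAP = 1,
    SKETCH_NOTIFIER_MAP   = 2,
} SketchNotifierFlag;

/* True when the first notifier is being registered on the region. */
static bool sketch_first_notifier_added(SketchNotifierFlag old_flags,
                                        SketchNotifierFlag new_flags)
{
    return old_flags == SKETCH_NOTIFIER_NONE &&
           new_flags != SKETCH_NOTIFIER_NONE;
}
```

The patch itself only tests old == IOMMU_NOTIFIER_NONE; the extra test on
the new flags in this sketch is an assumption that makes the helper safe to
call even when nothing changed.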
 hw/arm/smmuv3.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 384393f..5efe933 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1074,12 +1074,23 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
     dc->realize = smmu_realize;
 }
 
+static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
+                                       IOMMUNotifierFlag old,
+                                       IOMMUNotifierFlag new)
+{
+    if (old == IOMMU_NOTIFIER_NONE) {
+        error_setg(&error_fatal,
+                   "SMMUV3: vhost and vfio notifiers not yet supported");
+    }
+}
+
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
                                                   void *data)
 {
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = smmuv3_translate;
+    imrc->notify_flag_changed = smmuv3_notify_flag_changed;
 }
 
 static const TypeInfo smmuv3_type_info = {
-- 
2.5.5


* [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (9 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-12 11:59   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

If the MSI doorbell is translated by an IOMMU, we need to fix up
the MSI route with the translated address.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- use IOMMUMemoryRegionClass API

It is still unclear to me whether we need to register an IOMMUNotifier
to handle any change in the MSI doorbell mapping that would occur behind
the scenes and would not lead to any call to kvm_arch_fixup_msi_route().
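
The fixup described in the commit message amounts to storing the translated
64-bit doorbell address as two 32-bit halves in the MSI route. A minimal
sketch of that split follows. SketchMsiRoute is a hypothetical stand-in for
the address_lo/address_hi pair of the KVM routing entry, and both helpers
are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the MSI part of a KVM IRQ routing entry. */
typedef struct {
    uint32_t address_lo;
    uint32_t address_hi;
} SketchMsiRoute;

/* Store a translated doorbell address into the route. */
static void sketch_set_msi_address(SketchMsiRoute *route, uint64_t translated)
{
    route->address_lo = (uint32_t)(translated & 0xffffffffu);
    route->address_hi = (uint32_t)(translated >> 32);
}

/* Rebuild the 64-bit address, e.g. for tracing or checking. */
static uint64_t sketch_get_msi_address(const SketchMsiRoute *route)
{
    return ((uint64_t)route->address_hi << 32) | route->address_lo;
}
```

If the device is not behind an IOMMU, the route is left untouched, which is
what the early return on address_space_memory does in the patch.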
---
 target/arm/kvm.c        | 27 +++++++++++++++++++++++++++
 target/arm/trace-events |  3 +++
 2 files changed, 30 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 1219d00..9f5976a 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -20,8 +20,13 @@
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
 #include "cpu.h"
+#include "trace.h"
 #include "internals.h"
 #include "hw/arm/arm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/arm/smmu-common.h"
+#include "hw/arm/smmuv3.h"
 #include "exec/memattrs.h"
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
@@ -666,6 +671,28 @@ int kvm_arm_vgic_probe(void)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    IOMMUMemoryRegionClass *imrc;
+    IOMMUTLBEntry entry;
+    SMMUDevice *sdev;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+    sdev = container_of(as, SMMUDevice, as);
+    imrc = IOMMU_MEMORY_REGION_GET_CLASS(&sdev->iommu);
+
+    entry = imrc->translate(&sdev->iommu, address, IOMMU_WO);
+
+    route->u.msi.address_lo = entry.translated_addr;
+    route->u.msi.address_hi = entry.translated_addr >> 32;
+
+    trace_kvm_arm_fixup_msi_route(address, sdev->devfn,
+                                  sdev->iommu.parent_obj.name,
+                                  entry.translated_addr);
+
     return 0;
 }
 
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 9e37131..8b3c220 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -8,3 +8,6 @@ arm_gt_tval_write(int timer, uint64_t value) "gt_tval_write: timer %d value 0x%"
 arm_gt_ctl_write(int timer, uint64_t value) "gt_ctl_write: timer %d value 0x%" PRIx64
 arm_gt_imask_toggle(int timer, int irqstate) "gt_ctl_write: timer %d IMASK toggle, new irqstate %d"
 arm_gt_cntvoff_write(uint64_t value) "gt_cntvoff_write: value 0x%" PRIx64
+
+# target/arm/kvm.c
+kvm_arm_fixup_msi_route(uint64_t iova, uint32_t devid, const char *name, uint64_t gpa) "MSI addr = 0x%"PRIx64" is translated for devfn=%d through %s into 0x%"PRIx64
-- 
2.5.5


* [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (10 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-12 12:46   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

From: Prem Mallappa <prem.mallappa@broadcom.com>

Add code to instantiate an SMMUv3 in the virt machine. A new iommu
integer member is introduced in VirtMachineState to store the type
of the IOMMU in use.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v7 -> v8:
- integer iommu member
- add primary-bus property

v4 -> v5:
- add dma-coherent property

v2 -> v3:
- vbi was removed. Use vms instead
- migrate to new smmu binding format (iommu-map)
- don't use appendprop anymore
- add vms->smmu and use it to guard instantiation
- interrupts type changed to edge

Conflicts:
	hw/arm/smmuv3.c
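
A note on the "interrupt-names" property set by create_smmu(): it is a
device tree string list, i.e. several NUL-terminated strings stored back to
back, and the patch passes sizeof() of the buffer so that the final
terminator is included. The sketch below counts the entries of such a
buffer. The helper and the sketch_irq_names copy are hypothetical and only
serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same literal as irq_names in the patch: four names, NUL separated. */
static const char sketch_irq_names[] = "eventq\0priq\0cmdq-sync\0gerror";

/* Count the strings in a buffer of 'len' bytes ending with a NUL. */
static size_t sketch_count_strings(const char *buf, size_t len)
{
    size_t i, count = 0;

    assert(len > 0 && buf[len - 1] == '\0');
    for (i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            count++;
        }
    }
    return count;
}
```

With the literal above, sizeof(sketch_irq_names) covers the 25 name
characters, the three embedded separators and the implicit terminator, so
the list holds four names, one per SMMU interrupt wired in the loop.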
---
 hw/arm/virt.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 include/hw/arm/virt.h | 10 ++++++++
 2 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index dbb3c80..e9dca0d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -58,6 +58,7 @@
 #include "hw/smbios/smbios.h"
 #include "qapi/visitor.h"
 #include "standard-headers/linux/input.h"
+#include "hw/arm/smmuv3.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -141,6 +142,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_FW_CFG] =             { 0x09020000, 0x00000018 },
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
+    [VIRT_SMMU] =               { 0x09050000, 0x00020000 }, /* 128K, needed */
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -161,6 +163,7 @@ static const int a15irqmap[] = {
     [VIRT_SECURE_UART] = 8,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
+    [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
     [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
 };
 
@@ -941,7 +944,57 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
                            0x7           /* PCI irq */);
 }
 
-static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
+static void create_smmu(const VirtMachineState *vms, qemu_irq *pic,
+                        PCIBus *bus)
+{
+    char *node;
+    const char compat[] = "arm,smmu-v3";
+    int irq =  vms->irqmap[VIRT_SMMU];
+    int i;
+    hwaddr base = vms->memmap[VIRT_SMMU].base;
+    hwaddr size = vms->memmap[VIRT_SMMU].size;
+    const char irq_names[] = "eventq\0priq\0cmdq-sync\0gerror";
+    DeviceState *dev;
+
+    if (vms->iommu != VIRT_IOMMU_SMMUV3 || !vms->iommu_phandle) {
+        return;
+    }
+
+    dev = qdev_create(NULL, "arm-smmuv3");
+
+    object_property_set_link(OBJECT(dev), OBJECT(bus), "primary-bus",
+                             &error_abort);
+    qdev_init_nofail(dev);
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
+    for (i = 0; i < NUM_SMMU_IRQS; i++) {
+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic[irq + i]);
+    }
+
+    node = g_strdup_printf("/smmuv3@%" PRIx64, base);
+    qemu_fdt_add_subnode(vms->fdt, node);
+    qemu_fdt_setprop(vms->fdt, node, "compatible", compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(vms->fdt, node, "reg", 2, base, 2, size);
+
+    qemu_fdt_setprop_cells(vms->fdt, node, "interrupts",
+            GIC_FDT_IRQ_TYPE_SPI, irq    , GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 1, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 2, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 3, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
+
+    qemu_fdt_setprop(vms->fdt, node, "interrupt-names", irq_names,
+                     sizeof(irq_names));
+
+    qemu_fdt_setprop_cell(vms->fdt, node, "clocks", vms->clock_phandle);
+    qemu_fdt_setprop_string(vms->fdt, node, "clock-names", "apb_pclk");
+    qemu_fdt_setprop(vms->fdt, node, "dma-coherent", NULL, 0);
+
+    qemu_fdt_setprop_cell(vms->fdt, node, "#iommu-cells", 1);
+
+    qemu_fdt_setprop_cell(vms->fdt, node, "phandle", vms->iommu_phandle);
+    g_free(node);
+}
+
+static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
     hwaddr size_mmio = vms->memmap[VIRT_PCIE_MMIO].size;
@@ -1054,6 +1107,15 @@ static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
     qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 1);
     create_pcie_irq_map(vms, vms->gic_phandle, irq, nodename);
 
+    if (vms->iommu) {
+        vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+
+        create_smmu(vms, pic, pci->bus);
+
+        qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map",
+                               0x0, vms->iommu_phandle, 0x0, 0x10000);
+    }
+
     g_free(nodename);
 }
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 33b0ff3..13d3724 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -38,6 +38,7 @@
 
 #define NUM_GICV2M_SPIS       64
 #define NUM_VIRTIO_TRANSPORTS 32
+#define NUM_SMMU_IRQS          4
 
 #define ARCH_GICV3_MAINT_IRQ  9
 
@@ -59,6 +60,7 @@ enum {
     VIRT_GIC_V2M,
     VIRT_GIC_ITS,
     VIRT_GIC_REDIST,
+    VIRT_SMMU,
     VIRT_UART,
     VIRT_MMIO,
     VIRT_RTC,
@@ -74,6 +76,12 @@ enum {
     VIRT_SECURE_MEM,
 };
 
+enum {
+    VIRT_IOMMU_NONE,
+    VIRT_IOMMU_SMMUV3,
+    VIRT_IOMMU_VIRTIO,
+};
+
 typedef struct MemMapEntry {
     hwaddr base;
     hwaddr size;
@@ -96,6 +104,7 @@ typedef struct {
     bool its;
     bool virt;
     int32_t gic_version;
+    int32_t iommu;
     struct arm_boot_info bootinfo;
     const MemMapEntry *memmap;
     const int *irqmap;
@@ -105,6 +114,7 @@ typedef struct {
     uint32_t clock_phandle;
     uint32_t gic_phandle;
     uint32_t msi_phandle;
+    uint32_t iommu_phandle;
     int psci_conduit;
 } VirtMachineState;
 
-- 
2.5.5

* [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (11 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-12 12:48   ` Peter Maydell
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 14/14] hw/arm/virt: Handle iommu in 2.12 machine type Eric Auger
  2018-02-27 19:02 ` [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Peter Maydell
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

From: Prem Mallappa <prem.mallappa@broadcom.com>

This patch builds the smmuv3 node in the ACPI IORT table.

The RID space of the root complex, which spans 0x0-0x10000
maps to streamid space 0x0-0x10000 in smmuv3, which in turn
maps to deviceid space 0x0-0x10000 in the ITS group.
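
As an illustration only (not part of the patch), the two-step identity
mapping described above can be sketched in plain C. The IdMapping type
and both function names are hypothetical, host-endian stand-ins; the
sketch assumes the IORT convention that id_count holds the number of
IDs in the range minus one, which is why the patch writes 0xFFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, host-endian stand-in for one IORT ID mapping entry. */
typedef struct {
    uint32_t input_base;   /* first input ID covered by the mapping */
    uint32_t id_count;     /* number of IDs in the range, minus one */
    uint32_t output_base;  /* output ID that input_base maps to */
} IdMapping;

/* Translate 'id' through 'map'; returns false if 'id' is out of range. */
static bool id_map_translate(const IdMapping *map, uint32_t id, uint32_t *out)
{
    assert(map && out);
    if (id < map->input_base || id - map->input_base > map->id_count) {
        return false;
    }
    *out = map->output_base + (id - map->input_base);
    return true;
}

/* RID -> StreamID (RC node), then StreamID -> DeviceID (SMMUv3 node). */
static bool rid_to_deviceid(uint32_t rid, uint32_t *deviceid)
{
    const IdMapping rc_to_smmu = { 0, 0xFFFF, 0 };
    const IdMapping smmu_to_its = { 0, 0xFFFF, 0 };
    uint32_t sid;

    return id_map_translate(&rc_to_smmu, rid, &sid) &&
           id_map_translate(&smmu_to_its, sid, deviceid);
}
```

With these identity mappings any RID up to 0xFFFF comes out unchanged
as a DeviceID, and a larger value is rejected at the first step.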

The guest must feature the IOMMU probe deferral series
(https://lkml.org/lkml/2017/4/10/214) which fixes streamid
multiple lookup. This bug is not related to the SMMU emulation.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- integrate into the existing IORT table made up of ITS, RC nodes
- take into account vms->smmu
- match linux actbl2.h acpi_iort_smmu_v3 field names
---
 hw/arm/virt-acpi-build.c    | 56 +++++++++++++++++++++++++++++++++++++++------
 include/hw/acpi/acpi-defs.h | 15 ++++++++++++
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f7fa795..4b5ad91 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -393,19 +393,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
 }
 
 static void
-build_iort(GArray *table_data, BIOSLinker *linker)
+build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
-    int iort_start = table_data->len;
+    int nb_nodes, iort_start = table_data->len;
     AcpiIortIdMapping *idmap;
     AcpiIortItsGroup *its;
     AcpiIortTable *iort;
-    size_t node_size, iort_length;
+    AcpiIortSmmu3 *smmu;
+    size_t node_size, iort_length, smmu_offset = 0;
     AcpiIortRC *rc;
 
     iort = acpi_data_push(table_data, sizeof(*iort));
 
+    if (vms->iommu) {
+        nb_nodes = 3; /* RC, ITS, SMMUv3 */
+    } else {
+        nb_nodes = 2; /* RC, ITS */
+    }
+
     iort_length = sizeof(*iort);
-    iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
+    iort->node_count = cpu_to_le32(nb_nodes);
     iort->node_offset = cpu_to_le32(sizeof(*iort));
 
     /* ITS group node */
@@ -418,6 +425,35 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     its->its_count = cpu_to_le32(1);
     its->identifiers[0] = 0; /* MADT translation_id */
 
+    if (vms->iommu == VIRT_IOMMU_SMMUV3) {
+        int irq =  vms->irqmap[VIRT_SMMU];
+
+        /* SMMUv3 node */
+        smmu_offset = cpu_to_le32(iort->node_offset + node_size);
+        node_size = sizeof(*smmu) + sizeof(*idmap);
+        iort_length += node_size;
+        smmu = acpi_data_push(table_data, node_size);
+
+
+        smmu->type = ACPI_IORT_NODE_SMMU_V3;
+        smmu->length = cpu_to_le16(node_size);
+        smmu->mapping_count = cpu_to_le32(1);
+        smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
+        smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
+        smmu->event_gsiv = cpu_to_le32(irq);
+        smmu->pri_gsiv = cpu_to_le32(irq + 1);
+        smmu->gerr_gsiv = cpu_to_le32(irq + 2);
+        smmu->sync_gsiv = cpu_to_le32(irq + 3);
+
+        /* Identity RID mapping covering the whole input RID range */
+        idmap = &smmu->id_mapping_array[0];
+        idmap->input_base = 0;
+        idmap->id_count = cpu_to_le32(0xFFFF);
+        idmap->output_base = 0;
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
+
     /* Root Complex Node */
     node_size = sizeof(*rc) + sizeof(*idmap);
     iort_length += node_size;
@@ -438,8 +474,14 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     idmap->input_base = 0;
     idmap->id_count = cpu_to_le32(0xFFFF);
     idmap->output_base = 0;
-    /* output IORT node is the ITS group node (the first node) */
-    idmap->output_reference = cpu_to_le32(iort->node_offset);
+
+    if (vms->iommu) {
+        /* output IORT node is the smmuv3 node */
+        idmap->output_reference = cpu_to_le32(smmu_offset);
+    } else {
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
 
     iort->length = cpu_to_le32(iort_length);
 
@@ -786,7 +828,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (its_class_name() && !vmc->no_its) {
         acpi_add_table(table_offsets, tables_blob);
-        build_iort(tables_blob, tables->linker);
+        build_iort(tables_blob, tables->linker, vms);
     }
 
     /* XSDT is pointed to by RSDP */
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 80c8099..068ce28 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -700,6 +700,21 @@ struct AcpiIortItsGroup {
 } QEMU_PACKED;
 typedef struct AcpiIortItsGroup AcpiIortItsGroup;
 
+struct AcpiIortSmmu3 {
+    ACPI_IORT_NODE_HEADER_DEF
+    uint64_t base_address;
+    uint32_t flags;
+    uint32_t reserved2;
+    uint64_t vatos_address;
+    uint32_t model;
+    uint32_t event_gsiv;
+    uint32_t pri_gsiv;
+    uint32_t gerr_gsiv;
+    uint32_t sync_gsiv;
+    AcpiIortIdMapping id_mapping_array[0];
+} QEMU_PACKED;
+typedef struct AcpiIortSmmu3 AcpiIortSmmu3;
+
 struct AcpiIortRC {
     ACPI_IORT_NODE_HEADER_DEF
     AcpiIortMemoryAccess memory_properties;
-- 
2.5.5

* [Qemu-devel] [PATCH v9 14/14] hw/arm/virt: Handle iommu in 2.12 machine type
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (12 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
@ 2018-02-17 18:46 ` Eric Auger
  2018-03-12 12:56   ` Peter Maydell
  2018-02-27 19:02 ` [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Peter Maydell
  14 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-02-17 18:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: tn, mst, christoffer.dall, bharat.bhushan, jean-philippe.brucker,
	edgar.iglesias, linuc.decode, peterx

The new machine type exposes a new "iommu" virt machine option.
The SMMUv3 IOMMU is instantiated using -machine virt,iommu=smmuv3.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v7 -> v8:
- Revert to machine option, now dubbed "iommu", preparing for virtio
  instantiation.

v5 -> v6: machine 2_11

Another alternative would be to use the -device option as
done on x86. As the smmu is a sysbus device, we would need to
use the platform bus framework.
---
 hw/arm/virt.c         | 45 +++++++++++++++++++++++++++++++++++++++++++++
 include/hw/arm/virt.h |  1 +
 2 files changed, 46 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e9dca0d..607c7e1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1547,6 +1547,34 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
     }
 }
 
+static char *virt_get_iommu(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    switch (vms->iommu) {
+    case VIRT_IOMMU_NONE:
+        return g_strdup("none");
+    case VIRT_IOMMU_SMMUV3:
+        return g_strdup("smmuv3");
+    default:
+        return g_strdup("none");
+    }
+}
+
+static void virt_set_iommu(Object *obj, const char *value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    if (!strcmp(value, "smmuv3")) {
+        vms->iommu = VIRT_IOMMU_SMMUV3;
+    } else if (!strcmp(value, "none")) {
+        vms->iommu = VIRT_IOMMU_NONE;
+    } else {
+        error_setg(errp, "Invalid iommu value");
+        error_append_hint(errp, "Valid values are none, smmuv3\n");
+    }
+}
+
 static CpuInstanceProperties
 virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 {
@@ -1679,6 +1707,19 @@ static void virt_2_12_instance_init(Object *obj)
                                         NULL);
     }
 
+    if (vmc->no_iommu) {
+        vms->iommu = VIRT_IOMMU_NONE;
+    } else {
+        /* Default disallows smmu instantiation */
+        vms->iommu = VIRT_IOMMU_NONE;
+        object_property_add_str(obj, "iommu", virt_get_iommu,
+                                 virt_set_iommu, NULL);
+        object_property_set_description(obj, "iommu",
+                                        "Set the IOMMU model among "
+                                        "none, smmuv3 (default none)",
+                                        NULL);
+    }
+
     vms->memmap = a15memmap;
     vms->irqmap = a15irqmap;
 }
@@ -1698,8 +1739,12 @@ static void virt_2_11_instance_init(Object *obj)
 
 static void virt_machine_2_11_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
     virt_machine_2_12_options(mc);
     SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_11);
+
+    vmc->no_iommu = true;
 }
 DEFINE_VIRT_MACHINE(2, 11)
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 13d3724..3a92fc3 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -92,6 +92,7 @@ typedef struct {
     bool disallow_affinity_adjustment;
     bool no_its;
     bool no_pmu;
+    bool no_iommu;
     bool claim_edge_triggered_timers;
 } VirtMachineClass;
 
-- 
2.5.5

* Re: [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support
  2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
                   ` (13 preceding siblings ...)
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 14/14] hw/arm/virt: Handle iommu in 2.12 machine type Eric Auger
@ 2018-02-27 19:02 ` Peter Maydell
  2018-02-28  8:44   ` Auger Eric
  14 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-02-27 19:02 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> This series implements the emulation code for ARM SMMUv3.
>
> SMMUv3 gets instantiated by adding ",iommu=smmuv3" to the virt
> machine option.
>
> VHOST integration will be handled in a separate series. VFIO
> integration is not targeted at the moment. Only stage 1 and
> AArch64 PTW are supported.
>
> Main changes since v8:
> - fix mingw compilation (qemu/log.h)
> - put gpl v2 license on all files to respect initial license
> - change proto of smmu_ptw* to clarify inputs/outputs and
>   prepare for iotlb emulation
> - fix hash table lookup
> - cleanup access type handling during ptw
> - cleanup reset infra (parent_reset)
> - replace some inline functions by macros
> - fix some CMD fields
> - increment cmdq cons only after cmd execution
> - replace some remaining error_report by qemu_log_mask

What state is this series in now? Is it "seems ready to
go, needs review"? Are you hoping it might get in for 2.12,
or targeting 2.13 ?

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support
  2018-02-27 19:02 ` [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Peter Maydell
@ 2018-02-28  8:44   ` Auger Eric
  2018-03-12 12:58     ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-02-28  8:44 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, eric.auger.pro

Hi Peter,

On 27/02/18 20:02, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> This series implements the emulation code for ARM SMMUv3.
>>
>> SMMUv3 gets instantiated by adding ",iommu=smmuv3" to the virt
>> machine option.
>>
>> VHOST integration will be handled in a separate series. VFIO
>> integration is not targeted at the moment. Only stage 1 and
>> AArch64 PTW are supported.
>>
>> Main changes since v8:
>> - fix mingw compilation (qemu/log.h)
>> - put gpl v2 license on all files to respect initial license
>> - change proto of smmu_ptw* to clarify inputs/outputs and
>>   prepare for iotlb emulation
>> - fix hash table lookup
>> - cleanup access type handling during ptw
>> - cleanup reset infra (parent_reset)
>> - replace some inline functions by macros
>> - fix some CMD fields
>> - increment cmdq cons only after cmd execution
>> - replace some remaining error_report by qemu_log_mask
> 
> What state is this series in now? Is it "seems ready to
> go, needs review"? Are you hoping it might get in for 2.12,
> or targeting 2.13 ?
Yes, I think it is in a decent state and I would be happy to get some new
reviews, with a view to a tentative pull in 2.12. In any case I will
respond promptly to any comments with that goal in mind.

Thanks

Eric
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
@ 2018-03-06 12:09   ` Peter Maydell
  2018-03-06 15:01     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-06 12:09 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> The patch introduces the smmu base device and class for the ARM
> smmu. Devices for specific versions will be derived from this
> base device.
>
> We also introduce some important datatypes.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>

> + * Author: Prem Mallappa <pmallapp@broadcom.com>
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "sysemu/sysemu.h"
> +#include "exec/address-spaces.h"
> +#include "trace.h"
> +#include "exec/target_page.h"
> +#include "qom/cpu.h"
> +#include "hw/qdev-properties.h"
> +#include "qapi/error.h"
> +
> +#include "qemu/error-report.h"
> +#include "hw/arm/smmu-common.h"
> +
> +static void smmu_base_realize(DeviceState *dev, Error **errp)
> +{
> +    SMMUState *s = ARM_SMMU(dev);
> +
> +    s->configs = g_hash_table_new_full(NULL, NULL, NULL, g_free);
> +    s->iotlb = g_hash_table_new_full(NULL, NULL, NULL, g_free);

Shouldn't we also invoke the parent_realize ?

> +}
> +
> +static void smmu_base_reset(DeviceState *dev)
> +{
> +    SMMUState *s = ARM_SMMU(dev);
> +
> +    g_hash_table_remove_all(s->configs);
> +    g_hash_table_remove_all(s->iotlb);
> +}
> +
> +static Property smmu_dev_properties[] = {
> +    DEFINE_PROP_UINT8("bus_num", SMMUState, bus_num, 0),
> +    DEFINE_PROP_LINK("primary-bus", SMMUState, primary_bus, "PCI", PCIBus *),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void smmu_base_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    SMMUBaseClass *sbc = ARM_SMMU_CLASS(klass);
> +
> +    dc->props = smmu_dev_properties;
> +    sbc->parent_realize = dc->realize;
> +    dc->realize = smmu_base_realize;

There's a device_class_set_parent_realize() in the tree now that
we should probably use instead of these 2 lines:
       device_class_set_parent_realize(dc, smmu_base_realize,
                                       &sbc->parent_realize);
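
For illustration, a self-contained sketch of the save-and-chain pattern
behind that helper, which also shows the parent hook being invoked from
the derived realize. All type and function names below are hypothetical
stand-ins, not QEMU's; the real hooks also take an Error **errp, and the
saved pointer would live in SMMUBaseClass rather than in a global.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the device and class types. */
typedef struct Device { int realized_layers; } Device;
typedef void (*RealizeFn)(Device *dev);
typedef struct DeviceClassSketch { RealizeFn realize; } DeviceClassSketch;

/* Save the hook currently installed, then install the new one. */
static void set_parent_realize(DeviceClassSketch *dc, RealizeFn dev_realize,
                               RealizeFn *saved)
{
    assert(dc && saved);
    *saved = dc->realize;
    dc->realize = dev_realize;
}

/* Stand-in for the parent_realize field of the derived class struct. */
static RealizeFn saved_parent_realize;

static void base_realize(Device *dev)
{
    dev->realized_layers++;
}

static void child_realize(Device *dev)
{
    /* Chain to the parent first, if it provided a realize hook. */
    if (saved_parent_realize) {
        saved_parent_realize(dev);
    }
    dev->realized_layers++;
}

/* Returns how many realize layers ran for one device. */
static int realize_demo(void)
{
    DeviceClassSketch dc = { base_realize };
    Device dev = { 0 };

    set_parent_realize(&dc, child_realize, &saved_parent_realize);
    dc.realize(&dev);
    return dev.realized_layers;
}
```

The NULL check matters when the parent class does not install a realize
hook of its own.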

> +    dc->reset = smmu_base_reset;
> +}
> +
> +static const TypeInfo smmu_base_info = {
> +    .name          = TYPE_ARM_SMMU,
> +    .parent        = TYPE_SYS_BUS_DEVICE,
> +    .instance_size = sizeof(SMMUState),
> +    .class_data    = NULL,
> +    .class_size    = sizeof(SMMUBaseClass),
> +    .class_init    = smmu_base_class_init,
> +    .abstract      = true,
> +};
> +
> +static void smmu_base_register_types(void)
> +{
> +    type_register_static(&smmu_base_info);
> +}
> +
> +type_init(smmu_base_register_types)
> +
> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
> new file mode 100644
> index 0000000..8a9d931
> --- /dev/null
> +++ b/include/hw/arm/smmu-common.h
> @@ -0,0 +1,124 @@
> +/*
> + * ARM SMMU Support
> + *
> + * Copyright (C) 2015-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#ifndef HW_ARM_SMMU_COMMON_H
> +#define HW_ARM_SMMU_COMMON_H
> +
> +#include <hw/sysbus.h>

QEMU headers should be included using "", not <>.

> +#include "hw/pci/pci.h"
> +
> +#define SMMU_PCI_BUS_MAX      256
> +#define SMMU_PCI_DEVFN_MAX    256
> +
> +#define SMMU_MAX_VA_BITS      48
> +
> +/*
> + * Page table walk error types
> + */
> +typedef enum {
> +    SMMU_PTW_ERR_NONE,
> +    SMMU_PTW_ERR_WALK_EABT,   /* Translation walk external abort */
> +    SMMU_PTW_ERR_TRANSLATION, /* Translation fault */
> +    SMMU_PTW_ERR_ADDR_SIZE,   /* Address Size fault */
> +    SMMU_PTW_ERR_ACCESS,      /* Access fault */
> +    SMMU_PTW_ERR_PERMISSION,  /* Permission fault */
> +} SMMUPTWEventType;
> +
> +typedef struct SMMUPTWEventInfo {
> +    SMMUPTWEventType type;
> +    dma_addr_t addr; /* fetched address that induced an abort, if any */
> +} SMMUPTWEventInfo;
> +
> +typedef struct SMMUTransTableInfo {
> +    bool disabled;             /* is the translation table disabled? */
> +    uint64_t ttb;              /* TT base address */
> +    uint8_t tsz;               /* input range, ie. 2^(64 -tsz)*/
> +    uint8_t granule_sz;        /* granule page shift */
> +    uint8_t initial_level;     /* initial lookup level */
> +} SMMUTransTableInfo;
> +
> +/*
> + * Generic structure populated by derived SMMU devices
> + * after decoding the configuration information and used as
> + * input to the page table walk
> + */
> +typedef struct SMMUTransCfg {
> +    int      stage;            /* translation stage */
> +    bool     aa64;             /* arch64 or aarch32 translation table */
> +    bool     disabled;         /* smmu is disabled */
> +    bool     bypassed;         /* translation is bypassed */
> +    bool     aborted;          /* translation is aborted */
> +    uint64_t ttb;              /* TT base address */
> +    uint8_t oas;               /* output address width */
> +    uint8_t  tbi;              /* Top Byte Ignore */
> +    uint16_t asid;
> +    SMMUTransTableInfo tt[2];

Can you be consistent about either lining up the field names or not doing so,
please? (I would suggest going for 'not'.)

> +} SMMUTransCfg;

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup Eric Auger
@ 2018-03-06 14:08   ` Peter Maydell
  2018-03-06 14:47     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-06 14:08 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> We enumerate all the PCI devices attached to the SMMU and
> initialize an associated IOMMU memory region and address space.
> This happens on SMMU base instance init.
>
> Those info are stored in SMMUDevice objects. The devices are
> grouped according to the PCIBus they belong to. A hash table
> indexed by the PCIBus poinet is used. Also an array indexed by

"pointer".

> the bus number allows to find the list of SMMUDevices.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v8 -> v9:
> - fix key value for lookup
>
> v7 -> v8:
> - introduce SMMU_MAX_VA_BITS
> - use PCI bus handle as a key
> - do not clear s->smmu_as_by_bus_num
> - use g_new0 instead of g_malloc0
> - use primary_bus field
> ---
>  hw/arm/smmu-common.c         | 59 ++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/arm/smmu-common.h |  6 +++++
>  2 files changed, 65 insertions(+)
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index 86a5aab..d0516dc 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -28,12 +28,71 @@
>  #include "qemu/error-report.h"
>  #include "hw/arm/smmu-common.h"
>
> +SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
> +{
> +    SMMUPciBus *smmu_pci_bus = s->smmu_as_by_bus_num[bus_num];

Variable name suggests this is a table of AddressSpaces indexed by
bus number, but the code says we're getting SMMUPCIBus objects from it?

> +
> +    if (!smmu_pci_bus) {
> +        GHashTableIter iter;
> +
> +        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
> +        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
> +            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
> +                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
> +                return smmu_pci_bus;
> +            }
> +        }

Why do we populate this hashtable lazily rather than when we
put the SMMUPciBus in the smmu_as_by_busptr table? Do we
expect this function not to ordinarily be called?

> +    }
> +    return smmu_pci_bus;
> +}
> +
> +static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
> +{
> +    SMMUState *s = opaque;
> +    SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_as_by_busptr, bus);
> +    SMMUDevice *sdev;
> +
> +    if (!sbus) {
> +        sbus = g_malloc0(sizeof(SMMUPciBus) +
> +                         sizeof(SMMUDevice *) * SMMU_PCI_DEVFN_MAX);
> +        sbus->bus = bus;
> +        g_hash_table_insert(s->smmu_as_by_busptr, bus, sbus);
> +    }
> +
> +    sdev = sbus->pbdev[devfn];
> +    if (!sdev) {
> +        char *name = g_strdup_printf("%s-%d-%d",
> +                                     s->mrtypename,
> +                                     pci_bus_num(bus), devfn);
> +        sdev = sbus->pbdev[devfn] = g_new0(SMMUDevice, 1);
> +
> +        sdev->smmu = s;
> +        sdev->bus = bus;
> +        sdev->devfn = devfn;
> +
> +        memory_region_init_iommu(&sdev->iommu, sizeof(sdev->iommu),
> +                                 s->mrtypename,
> +                                 OBJECT(s), name, 1ULL << SMMU_MAX_VA_BITS);
> +        address_space_init(&sdev->as,
> +                           MEMORY_REGION(&sdev->iommu), name);

This leaks the memory pointed to by name, doesn't it?
(address_space_init() takes a copy of the name string, so you want
to g_free(name) here.)
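
A minimal sketch of the ownership rule behind this comment, written in
plain C rather than against the QEMU API. NamedRegion, region_init(),
region_destroy() and region_setup() are hypothetical stand-ins; the only
assumption carried over from the review is that the callee keeps its own
copy of the name, so the caller's temporary has to be freed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object that keeps its own copy of a name. */
typedef struct { char *name; } NamedRegion;

/* Copies 'name'; the caller keeps ownership of its own buffer. */
static void region_init(NamedRegion *r, const char *name)
{
    size_t len = strlen(name) + 1;

    assert(r);
    r->name = malloc(len);
    if (r->name) {
        memcpy(r->name, name, len);
    }
}

static void region_destroy(NamedRegion *r)
{
    free(r->name);
    r->name = NULL;
}

/* Build "<type>-<bus>-<devfn>", hand it over, then free the temporary. */
static int region_setup(NamedRegion *r, const char *type, int bus, int devfn)
{
    int n = snprintf(NULL, 0, "%s-%d-%d", type, bus, devfn);
    char *name;

    if (n < 0) {
        return -1;
    }
    name = malloc((size_t)n + 1);
    if (!name) {
        return -1;
    }
    snprintf(name, (size_t)n + 1, "%s-%d-%d", type, bus, devfn);
    region_init(r, name);
    free(name); /* the callee copied it, so the temporary is ours to free */
    return r->name ? 0 : -1;
}
```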

> +    }
> +
> +    return &sdev->as;
> +}
> +
>  static void smmu_base_realize(DeviceState *dev, Error **errp)
>  {
>      SMMUState *s = ARM_SMMU(dev);
>
>      s->configs = g_hash_table_new_full(NULL, NULL, NULL, g_free);
>      s->iotlb = g_hash_table_new_full(NULL, NULL, NULL, g_free);
> +    s->smmu_as_by_busptr = g_hash_table_new(NULL, NULL);
> +
> +    if (s->primary_bus) {
> +        pci_setup_iommu(s->primary_bus, smmu_find_add_as, s);
> +    } else {
> +        error_setg(errp, "SMMU is not attached to any PCI bus!");
> +    }
>  }
>
>  static void smmu_base_reset(DeviceState *dev)
> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
> index 8a9d931..aee96c2 100644
> --- a/include/hw/arm/smmu-common.h
> +++ b/include/hw/arm/smmu-common.h
> @@ -121,4 +121,10 @@ typedef struct {
>  #define ARM_SMMU_GET_CLASS(obj)                              \
>      OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_ARM_SMMU)
>
> +SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num);
> +
> +static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
> +{
> +    return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
> +}

Can we have at least brief documentation comments for any
new functions or prototypes added to header files, please?
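
As a sketch of what such a comment could look like, here is a
standalone variant of the helper. The name sid_from_bdf and its
two-argument form are hypothetical (the patch takes an SMMUDevice
pointer); the bit layout is the one used by smmu_get_sid() above.

```c
#include <assert.h>
#include <stdint.h>

/**
 * sid_from_bdf - build a 16-bit StreamID from a PCI bus number and devfn
 * @bus_num: PCI bus number, placed in bits [15:8] of the result
 * @devfn: device/function number, placed in bits [7:0] of the result
 *
 * Returns the StreamID for the device at the given bus number and devfn.
 */
static inline uint16_t sid_from_bdf(uint8_t bus_num, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus_num << 8) | devfn);
}
```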

>  #endif  /* HW_ARM_SMMU_COMMON */
> --

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup
  2018-03-06 14:08   ` Peter Maydell
@ 2018-03-06 14:47     ` Auger Eric
  2018-03-06 14:49       ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-06 14:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, eric.auger.pro

Hi Peter,

On 06/03/18 15:08, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> We enumerate all the PCI devices attached to the SMMU and
>> initialize an associated IOMMU memory region and address space.
>> This happens on SMMU base instance init.
>>
>> Those info are stored in SMMUDevice objects. The devices are
>> grouped according to the PCIBus they belong to. A hash table
>> indexed by the PCIBus poinet is used. Also an array indexed by
> 
> "pointer".
OK
> 
>> the bus number allows to find the list of SMMUDevices.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v8 -> v9:
>> - fix key value for lookup
>>
>> v7 -> v8:
>> - introduce SMMU_MAX_VA_BITS
>> - use PCI bus handle as a key
>> - do not clear s->smmu_as_by_bus_num
>> - use g_new0 instead of g_malloc0
>> - use primary_bus field
>> ---
>>  hw/arm/smmu-common.c         | 59 ++++++++++++++++++++++++++++++++++++++++++++
>>  include/hw/arm/smmu-common.h |  6 +++++
>>  2 files changed, 65 insertions(+)
>>
>> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
>> index 86a5aab..d0516dc 100644
>> --- a/hw/arm/smmu-common.c
>> +++ b/hw/arm/smmu-common.c
>> @@ -28,12 +28,71 @@
>>  #include "qemu/error-report.h"
>>  #include "hw/arm/smmu-common.h"
>>
>> +SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
>> +{
>> +    SMMUPciBus *smmu_pci_bus = s->smmu_as_by_bus_num[bus_num];
> 
> Variable name suggests this is a table of AddressSpaces indexed by
> bus number, but the code says we're getting SMMUPCIBus objects from it?
Yes, I can rename this variable. The name stems from the x86 naming (see
vtd_find_as_from_bus_num in hw/i386/intel_iommu.c). The code here
implements the same functionality with the ARM SMMU datatypes:
SMMUPciBus ~ VTDBus and smmu_as_by_bus_num ~ vtd_as_by_bus_num

The purpose is to find the SMMUPciBus that matches the bus number passed
as a parameter.

> 
>> +
>> +    if (!smmu_pci_bus) {
>> +        GHashTableIter iter;
>> +
>> +        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
>> +        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
>> +            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
>> +                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
>> +                return smmu_pci_bus;
>> +            }
>> +        }
> 
> Why do we populate this hashtable lazily rather than when we
> put the SMMUPciBus in the smmu_as_by_busptr table? Do we
> expect this function not to ordinarily be called?

This function is only used when handling invalidations (vhost/vfio). I
can even remove it from this series at this stage and reintroduce it
later on. From the SID, you retrieve the bus_num, then the SMMUPciBus.
And from the device/function number (devfn) you can then retrieve the
IOMMU mr at smmu_bus->pbdev[devfn].

That's the purpose of this function. But I will remove it.
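
The SID decomposition described above can be sketched as follows. This is
a minimal standalone illustration, not code from the series: the helper
names sid_compose, sid_to_bus_num and sid_to_devfn are hypothetical, and
the 16-bit layout (bus number in bits [15:8], devfn in bits [7:0]) is
taken from smmu_get_sid() in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a StreamID the same way smmu_get_sid() in the patch does. */
static uint16_t sid_compose(uint8_t bus_num, uint8_t devfn)
{
    return (uint16_t)((bus_num << 8) | devfn);
}

/* Recover the PCI bus number from a 16-bit StreamID. */
static uint8_t sid_to_bus_num(uint32_t sid)
{
    assert(sid <= 0xffff);  /* assumption: only 16 SID bits are in use */
    return (sid >> 8) & 0xff;
}

/* Recover the devfn, i.e. the index into pbdev[] for that bus. */
static uint8_t sid_to_devfn(uint32_t sid)
{
    assert(sid <= 0xffff);
    return sid & 0xff;
}
```

With the bus number in hand, the lookup by bus number yields the
SMMUPciBus, and the devfn then indexes its pbdev[] array.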
> 
>> +    }
>> +    return smmu_pci_bus;
>> +}
>> +
>> +static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
>> +{
>> +    SMMUState *s = opaque;
>> +    SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_as_by_busptr, bus);
>> +    SMMUDevice *sdev;
>> +
>> +    if (!sbus) {
>> +        sbus = g_malloc0(sizeof(SMMUPciBus) +
>> +                         sizeof(SMMUDevice *) * SMMU_PCI_DEVFN_MAX);
>> +        sbus->bus = bus;
>> +        g_hash_table_insert(s->smmu_as_by_busptr, bus, sbus);
>> +    }
>> +
>> +    sdev = sbus->pbdev[devfn];
>> +    if (!sdev) {
>> +        char *name = g_strdup_printf("%s-%d-%d",
>> +                                     s->mrtypename,
>> +                                     pci_bus_num(bus), devfn);
>> +        sdev = sbus->pbdev[devfn] = g_new0(SMMUDevice, 1);
>> +
>> +        sdev->smmu = s;
>> +        sdev->bus = bus;
>> +        sdev->devfn = devfn;
>> +
>> +        memory_region_init_iommu(&sdev->iommu, sizeof(sdev->iommu),
>> +                                 s->mrtypename,
>> +                                 OBJECT(s), name, 1ULL << SMMU_MAX_VA_BITS);
>> +        address_space_init(&sdev->as,
>> +                           MEMORY_REGION(&sdev->iommu), name);
> 
> This leaks the memory pointed to by name, doesn't it?
> (address_space_init() takes a copy of the name string, so you want
> to g_free(name) here.)
Hum yes. Will free it.
> 
>> +    }
>> +
>> +    return &sdev->as;
>> +}
>> +
>>  static void smmu_base_realize(DeviceState *dev, Error **errp)
>>  {
>>      SMMUState *s = ARM_SMMU(dev);
>>
>>      s->configs = g_hash_table_new_full(NULL, NULL, NULL, g_free);
>>      s->iotlb = g_hash_table_new_full(NULL, NULL, NULL, g_free);
>> +    s->smmu_as_by_busptr = g_hash_table_new(NULL, NULL);
>> +
>> +    if (s->primary_bus) {
>> +        pci_setup_iommu(s->primary_bus, smmu_find_add_as, s);
>> +    } else {
>> +        error_setg(errp, "SMMU is not attached to any PCI bus!");
>> +    }
>>  }
>>
>>  static void smmu_base_reset(DeviceState *dev)
>> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
>> index 8a9d931..aee96c2 100644
>> --- a/include/hw/arm/smmu-common.h
>> +++ b/include/hw/arm/smmu-common.h
>> @@ -121,4 +121,10 @@ typedef struct {
>>  #define ARM_SMMU_GET_CLASS(obj)                              \
>>      OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_ARM_SMMU)
>>
>> +SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num);
>> +
>> +static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
>> +{
>> +    return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
>> +}
> 
> Can we have at least brief documentation comments for any
> new functions or prototypes added to header files, please?
Sure.

Thanks

Eric
> 
>>  #endif  /* HW_ARM_SMMU_COMMON */
>> --
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup
  2018-03-06 14:47     ` Auger Eric
@ 2018-03-06 14:49       ` Peter Maydell
  0 siblings, 0 replies; 63+ messages in thread
From: Peter Maydell @ 2018-03-06 14:49 UTC (permalink / raw)
  To: Auger Eric
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, eric.auger.pro

On 6 March 2018 at 14:47, Auger Eric <eric.auger@redhat.com> wrote:
> Hi Peter,
>
> On 06/03/18 15:08, Peter Maydell wrote:
>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>> +    if (!smmu_pci_bus) {
>>> +        GHashTableIter iter;
>>> +
>>> +        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
>>> +        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
>>> +            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
>>> +                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
>>> +                return smmu_pci_bus;
>>> +            }
>>> +        }
>>
>> Why do we populate this hashtable lazily rather than when we
>> put the SMMUPciBus in the smmu_as_by_busptr table? Do we
>> expect this function not to ordinarily be called?
>
> This function is only used when handling invalidations (vhost/vfio). I
> can even remove it from this series at this stage and reintroduce it
> later on. From the SID, you retrieve the bus_num, then the SMMUPciBus.
> And from the device/function number (devfn) you can then retrieve the
> IOMMU mr at smmu_bus->pbdev[devfn].
>
> That's the purpose of this function. But I will remove it.

No, it's fine here. You should just have a comment about why
it makes sense to lazily populate the hashtable (it's overall
going to be slower than if we populated it up front), to
document the rationale for the design decision.
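
For reference, the lazy-population pattern under discussion can be
sketched in isolation. This is only an illustration under simplifying
assumptions: SketchBus, SketchState and find_bus_lazy are hypothetical
names, a linked list stands in for the GHashTable iteration, and a plain
bus_num field stands in for pci_bus_num(bus).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUS_MAX 256

typedef struct SketchBus {
    uint8_t bus_num;         /* stand-in for pci_bus_num(bus) */
    struct SketchBus *next;  /* stand-in for hash table iteration */
} SketchBus;

typedef struct {
    SketchBus *all;                         /* every known bus */
    SketchBus *by_bus_num[SKETCH_BUS_MAX];  /* lazily filled cache */
    unsigned slow_lookups;                  /* number of full scans */
} SketchState;

/* Return the bus with the given number, caching the result on first use. */
static SketchBus *find_bus_lazy(SketchState *s, uint8_t bus_num)
{
    SketchBus *b;

    assert(s != NULL);
    b = s->by_bus_num[bus_num];
    if (b) {
        return b;           /* fast path: already cached */
    }
    s->slow_lookups++;      /* slow path: scan every known bus */
    for (b = s->all; b; b = b->next) {
        if (b->bus_num == bus_num) {
            s->by_bus_num[bus_num] = b;
            return b;
        }
    }
    return NULL;
}
```

The first lookup for a given bus number pays for a full scan; later
lookups hit the array directly. A failed lookup is not cached, so it is
scanned again each time.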

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes
  2018-03-06 12:09   ` Peter Maydell
@ 2018-03-06 15:01     ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-06 15:01 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, eric.auger.pro

Hi Peter,

On 06/03/18 13:09, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> The patch introduces the smmu base device and class for the ARM
>> smmu. Devices for specific versions will be derived from this
>> base device.
>>
>> We also introduce some important datatypes.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>>
> 
>> + * Author: Prem Mallappa <pmallapp@broadcom.com>
>> + *
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "sysemu/sysemu.h"
>> +#include "exec/address-spaces.h"
>> +#include "trace.h"
>> +#include "exec/target_page.h"
>> +#include "qom/cpu.h"
>> +#include "hw/qdev-properties.h"
>> +#include "qapi/error.h"
>> +
>> +#include "qemu/error-report.h"
>> +#include "hw/arm/smmu-common.h"
>> +
>> +static void smmu_base_realize(DeviceState *dev, Error **errp)
>> +{
>> +    SMMUState *s = ARM_SMMU(dev);
>> +
>> +    s->configs = g_hash_table_new_full(NULL, NULL, NULL, g_free);
>> +    s->iotlb = g_hash_table_new_full(NULL, NULL, NULL, g_free);
> 
> Shouldn't we also invoke the parent_realize ?
Yes will fix that.

Note that those hash tables are not used yet. However, I kept them in
this series to illustrate the split between the generic and smmuv3 parts:
configs will be used to cache config data, whereas iotlb will be used to
cache IOTLB entries. I can remove them for now and keep the
realize function empty.
> 
>> +}
>> +
>> +static void smmu_base_reset(DeviceState *dev)
>> +{
>> +    SMMUState *s = ARM_SMMU(dev);
>> +
>> +    g_hash_table_remove_all(s->configs);
>> +    g_hash_table_remove_all(s->iotlb);
>> +}
>> +
>> +static Property smmu_dev_properties[] = {
>> +    DEFINE_PROP_UINT8("bus_num", SMMUState, bus_num, 0),
>> +    DEFINE_PROP_LINK("primary-bus", SMMUState, primary_bus, "PCI", PCIBus *),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void smmu_base_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +    SMMUBaseClass *sbc = ARM_SMMU_CLASS(klass);
>> +
>> +    dc->props = smmu_dev_properties;
>> +    sbc->parent_realize = dc->realize;
>> +    dc->realize = smmu_base_realize;
> 
> There's a device_class_set_parent_realize() in the tree now that
> we should probably use instead of these 2 lines:
>        device_class_set_parent_realize(dc, smmu_base_realize,
>                                        &sbc->parent_realize);
OK. I will do that for the smmuv3 class as well. I already used the
corresponding reset setter, but not this one.
> 
>> +    dc->reset = smmu_base_reset;
>> +}
>> +
>> +static const TypeInfo smmu_base_info = {
>> +    .name          = TYPE_ARM_SMMU,
>> +    .parent        = TYPE_SYS_BUS_DEVICE,
>> +    .instance_size = sizeof(SMMUState),
>> +    .class_data    = NULL,
>> +    .class_size    = sizeof(SMMUBaseClass),
>> +    .class_init    = smmu_base_class_init,
>> +    .abstract      = true,
>> +};
>> +
>> +static void smmu_base_register_types(void)
>> +{
>> +    type_register_static(&smmu_base_info);
>> +}
>> +
>> +type_init(smmu_base_register_types)
>> +
>> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
>> new file mode 100644
>> index 0000000..8a9d931
>> --- /dev/null
>> +++ b/include/hw/arm/smmu-common.h
>> @@ -0,0 +1,124 @@
>> +/*
>> + * ARM SMMU Support
>> + *
>> + * Copyright (C) 2015-2016 Broadcom Corporation
>> + * Copyright (c) 2017 Red Hat, Inc.
>> + * Written by Prem Mallappa, Eric Auger
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +
>> +#ifndef HW_ARM_SMMU_COMMON_H
>> +#define HW_ARM_SMMU_COMMON_H
>> +
>> +#include <hw/sysbus.h>
> 
> QEMU headers should be included using "", not <>.
> 
>> +#include "hw/pci/pci.h"
>> +
>> +#define SMMU_PCI_BUS_MAX      256
>> +#define SMMU_PCI_DEVFN_MAX    256
>> +
>> +#define SMMU_MAX_VA_BITS      48
>> +
>> +/*
>> + * Page table walk error types
>> + */
>> +typedef enum {
>> +    SMMU_PTW_ERR_NONE,
>> +    SMMU_PTW_ERR_WALK_EABT,   /* Translation walk external abort */
>> +    SMMU_PTW_ERR_TRANSLATION, /* Translation fault */
>> +    SMMU_PTW_ERR_ADDR_SIZE,   /* Address Size fault */
>> +    SMMU_PTW_ERR_ACCESS,      /* Access fault */
>> +    SMMU_PTW_ERR_PERMISSION,  /* Permission fault */
>> +} SMMUPTWEventType;
>> +
>> +typedef struct SMMUPTWEventInfo {
>> +    SMMUPTWEventType type;
>> +    dma_addr_t addr; /* fetched address that induced an abort, if any */
>> +} SMMUPTWEventInfo;
>> +
>> +typedef struct SMMUTransTableInfo {
>> +    bool disabled;             /* is the translation table disabled? */
>> +    uint64_t ttb;              /* TT base address */
>> +    uint8_t tsz;               /* input range, ie. 2^(64 -tsz)*/
>> +    uint8_t granule_sz;        /* granule page shift */
>> +    uint8_t initial_level;     /* initial lookup level */
>> +} SMMUTransTableInfo;
>> +
>> +/*
>> + * Generic structure populated by derived SMMU devices
>> + * after decoding the configuration information and used as
>> + * input to the page table walk
>> + */
>> +typedef struct SMMUTransCfg {
>> +    int      stage;            /* translation stage */
>> +    bool     aa64;             /* arch64 or aarch32 translation table */
>> +    bool     disabled;         /* smmu is disabled */
>> +    bool     bypassed;         /* translation is bypassed */
>> +    bool     aborted;          /* translation is aborted */
>> +    uint64_t ttb;              /* TT base address */
>> +    uint8_t oas;               /* output address width */
>> +    uint8_t  tbi;              /* Top Byte Ignore */
>> +    uint16_t asid;
>> +    SMMUTransTableInfo tt[2];
> 
> Can you be consistent about either lining up the field names or not doing so,
> please? (I would suggest going for 'not'.)
OK
> 
>> +} SMMUTransCfg;
> 
> Otherwise
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks!

Eric
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk Eric Auger
@ 2018-03-06 19:43   ` Peter Maydell
  2018-03-07 16:23     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-06 19:43 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> This patch implements the page table walk for VMSAv8-64.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v8 -> v9:
> - remove guest error log on PTE fetch fault
> - rename  trace functions
> - fix smmu_page_walk_level_res_invalid_pte last arg
> - fix PTE_ADDRESS
> - turn functions into macros
> - make sure to return the actual pte access permission
>   into tlbe->perm
> - change proto of smmu_ptw*
>
> v7 -> v8:
> - rework get_pte
> - use LOG_LEVEL_ERROR
> - remove error checking in get_block_pte_address
> - page table walk simplified (no VFIO replay anymore)
> - handle PTW error events
> - use dma_memory_read
>
> v6 -> v7:
> - fix wrong error handling in walk_page_table
> - check perm in smmu_translate
>
> v5 -> v6:
> - use IOMMUMemoryRegion
> - remove initial_lookup_level()
> - fix block replay
>
> v4 -> v5:
> - add initial level in translation config
> - implement block pte
> - rename must_translate into nofail
> - introduce call_entry_hook
> - small changes to dynamic traces
> - smmu_page_walk code moved from smmuv3.c to this file
> - remove smmu_translate*
>
> v3 -> v4:
> - reworked page table walk to prepare for VFIO integration
>   (capability to scan a range of IOVA). Same function is used
>   for translate for a single iova. This is largely inspired
>   from intel_iommu.c
> - as the translate function was not straightforward to me,
>   I tried to stick more closely to the VMSA spec.
> - remove support of nested stage (kernel driver does not
>   support it anyway)
> - use error_report and trace events
> - add aa64[] field in SMMUTransCfg
> ---
>  hw/arm/smmu-common.c         | 232 +++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmu-internal.h       |  96 ++++++++++++++++++
>  hw/arm/trace-events          |  10 ++
>  include/hw/arm/smmu-common.h |   6 ++
>  4 files changed, 344 insertions(+)
>  create mode 100644 hw/arm/smmu-internal.h
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index d0516dc..24cc4ba 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -27,6 +27,238 @@
>
>  #include "qemu/error-report.h"
>  #include "hw/arm/smmu-common.h"
> +#include "smmu-internal.h"
> +
> +/* VMSAv8-64 Translation */
> +
> +/**
> + * get_pte - Get the content of a page table entry located at
> + * @base_addr[@index]
> + */
> +static int get_pte(dma_addr_t baseaddr, uint32_t index, uint64_t *pte,
> +                   SMMUPTWEventInfo *info)
> +{
> +    int ret;
> +    dma_addr_t addr = baseaddr + index * sizeof(*pte);
> +
> +    ret = dma_memory_read(&address_space_memory, addr,
> +                          (uint8_t *)pte, sizeof(*pte));

I think last time round I asked that these be done with the
"read a 64-bit quantity" APIs and a comment that they're
supposed to be atomic.

> +
> +    if (ret != MEMTX_OK) {
> +        info->type = SMMU_PTW_ERR_WALK_EABT;
> +        info->addr = addr;
> +        return -EINVAL;
> +    }
> +    trace_smmu_get_pte(baseaddr, index, addr, *pte);
> +    return 0;
> +}
> +
> +/* VMSAv8-64 Translation Table Format Descriptor Decoding */
> +
> +/**
> + * get_page_pte_address - returns the L3 descriptor output address,
> + * ie. the page frame
> + * ARM ARM spec: Figure D4-17 VMSAv8-64 level 3 descriptor format
> + */
> +static inline hwaddr get_page_pte_address(uint64_t pte, int granule_sz)
> +{
> +    return PTE_ADDRESS(pte, granule_sz);
> +}
> +
> +/**
> + * get_table_pte_address - return table descriptor output address,
> + * ie. address of next level table
> + * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
> + */
> +static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
> +{
> +    return PTE_ADDRESS(pte, granule_sz);
> +}
> +
> +/**
> + * get_block_pte_address - return block descriptor output address and block size
> + * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
> + */
> +static hwaddr get_block_pte_address(uint64_t pte, int level, int granule_sz,
> +                                    uint64_t *bsz)
> +{
> +    int n = 0;
> +
> +    switch (granule_sz) {
> +    case 12:
> +        if (level == 1) {
> +            n = 30;
> +        } else if (level == 2) {
> +            n = 21;
> +        }
> +        break;
> +    case 14:
> +        if (level == 2) {
> +            n = 25;
> +        }
> +        break;
> +    case 16:
> +        if (level == 2) {
> +            n = 29;
> +        }
> +        break;
> +    }
> +    if (!n) {
> +        error_setg(&error_fatal,
> +                   "wrong granule/level combination (%d/%d)",
> +                   granule_sz, level);

If this is guest-provokable then it shouldn't be a fatal error.
If it isn't guest provokable then you can just assert.
I think you should be able to sanitize the SMMUTransTableInfo when
you construct it from the CD (and give a C_BAD_CD event), and then
you can trust the granule_sz and level when you're doing table walks.

You can calculate n as
    n = (granule_sz - 3) * (4 - level) + 3;
(compare target/arm/helper.c:get_phys_addr_lpae() calculation
of page_size, and in the pseudocode the line
    addrselectbottom = (3-level)*stride + grainsize;
where stride is grainsize-3 and so comes out to the same thing.)
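
As a quick check, the closed-form expression can be compared against the
values hard-coded in the switch statement. The sketch below is
standalone; block_shift and block_size are hypothetical helper names, and
granule_sz is assumed to be the log2 of the granule size (12, 14 or 16),
already validated before the walk as suggested above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of address bits resolved below a block descriptor at the given
 * level. The inputs are assumed to have been sanitized earlier, so an
 * assert is sufficient here.
 */
static int block_shift(int granule_sz, int level)
{
    assert(granule_sz == 12 || granule_sz == 14 || granule_sz == 16);
    assert(level >= 0 && level <= 3);
    return (granule_sz - 3) * (4 - level) + 3;
}

/* Block size in bytes; 1ULL keeps the shift in 64-bit arithmetic. */
static uint64_t block_size(int granule_sz, int level)
{
    return 1ULL << block_shift(granule_sz, level);
}
```

For the four combinations handled by the switch this gives 30 and 21
(4K granule, levels 1 and 2), 25 (16K, level 2) and 29 (64K, level 2),
matching the hard-coded values.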

> +    }
> +    *bsz = 1 << n;
> +    return PTE_ADDRESS(pte, n);
> +}
> +
> +static inline bool check_perm(int access_attrs, int mem_attrs)
> +{
> +    if (((access_attrs & IOMMU_RO) && !(mem_attrs & IOMMU_RO)) ||
> +        ((access_attrs & IOMMU_WO) && !(mem_attrs & IOMMU_WO))) {
> +        return false;
> +    }
> +    return true;
> +}

This function doesn't seem to ever be used in this patchset?
(clang will complain about that, though gcc won't.)

> +
> +SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova)
> +{
> +    if (!extract64(iova, 64 - cfg->tt[0].tsz, cfg->tt[0].tsz - cfg->tbi)) {
> +        return &cfg->tt[0];
> +    }
> +    return &cfg->tt[1];

I'm confused by your handling of the TBI bits here. In
decode_cd() you take the 2-bit TBI field into cfg->tbi. That's
a pair of bits, one of which is "top byte ignore" for TTB0 and the
other is "top byte ignore" for TTB1. But here you're subtracting
the whole field from cfg->tt[0].tsz. Which of the two tbi bits
you need to use depends on bit 55 of the iova (compare the code
in get_phys_addr_lpae() and the pseudocode function AddrTop()),
and then if that bit is 1 it means that 8 bits of address should
be ignored when determining whether to use TTB0 or TTB1.

You also need to consider the case where the input address is in
neither the TTB0 range nor the TTB1 range (cf fig D4-14 in the
v8A Arm ARM DDI0487C.a, and the code in get_phys_addr_lpae()).
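
A minimal sketch of the selection logic described in the two paragraphs
above, under stated assumptions: SketchCfg and select_tt_sketch are
hypothetical names, bit 0 of tbi is assumed to apply to TTB0 and bit 1 to
TTB1, and tsz is assumed to lie in the range 16..39. It returns 0 or 1
for the selected table, or -1 when the address falls in neither range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t tbi;     /* assumed: bit 0 = TBI for TTB0, bit 1 = TBI for TTB1 */
    uint8_t tsz[2];  /* input range of each table is 2^(64 - tsz) */
} SketchCfg;

static int select_tt_sketch(const SketchCfg *cfg, uint64_t iova)
{
    int sel = (iova >> 55) & 1;        /* bit 55 picks the candidate table */
    bool tbi = (cfg->tbi >> sel) & 1;  /* TBI bit for that candidate */
    int top = tbi ? 56 : 64;           /* ignore bits [63:56] when TBI is set */
    int tsz = cfg->tsz[sel];
    int low, nbits;
    uint64_t mask, bits;

    assert(tsz >= 16 && tsz <= 39);    /* assumed valid range */
    low = 64 - tsz;                    /* first bit above the input range */
    nbits = top - low;                 /* between 8 and 39 here */
    mask = (1ULL << nbits) - 1;
    bits = (iova >> low) & mask;       /* bits [top-1:low] of the address */

    if (sel == 0) {
        return bits == 0 ? 0 : -1;     /* TTB0: upper bits must be all 0 */
    }
    return bits == mask ? 1 : -1;      /* TTB1: upper bits must be all 1 */
}
```

A caller would turn the -1 case into a translation fault rather than
silently falling back to one of the tables.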

> +}
> +
> +/**
> + * smmu_ptw_64 - VMSAv8-64 Walk of the page tables for a given IOVA
> + * @cfg: translation config
> + * @iova: iova to translate
> + * @perm: access type
> + * @tlbe: IOMMUTLBEntry (out)
> + * @info: handle to an error info
> + *
> + * Return 0 on success, < 0 on error. In case of error, @info is filled
> + * and tlbe->perm is set to IOMMU_NONE.
> + * Upon success, @tlbe is filled with translated_addr and entry
> + * permission rights.
> + */
> +static int smmu_ptw_64(SMMUTransCfg *cfg,
> +                       dma_addr_t iova, IOMMUAccessFlags perm,
> +                       IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
> +{
> +    dma_addr_t baseaddr;
> +    int stage = cfg->stage;
> +    SMMUTransTableInfo *tt = select_tt(cfg, iova);
> +    uint8_t level;
> +    uint8_t granule_sz;
> +
> +    if (tt->disabled) {
> +        info->type = SMMU_PTW_ERR_TRANSLATION;
> +        goto error;
> +    }
> +
> +    level = tt->initial_level;
> +    granule_sz = tt->granule_sz;
> +    baseaddr = extract64(tt->ttb, 0, 48);

The spec says that bits specified by the TTB0/TTB1 fields
in a CD that are outside the effective IPS range are ILLEGAL;
you should detect that when you set up tt->ttb and then you
don't need the extract64() here, I think.

> +
> +    tlbe->iova = iova;
> +    tlbe->addr_mask = (1 << tt->granule_sz) - 1;

you could just use "granule_sz" here since you have it in a local.

> +
> +    while (level <= 3) {
> +        uint64_t subpage_size = 1ULL << level_shift(level, granule_sz);
> +        uint64_t mask = subpage_size - 1;
> +        uint32_t offset = iova_level_offset(iova, level, granule_sz);
> +        uint64_t pte;
> +        dma_addr_t pte_addr = baseaddr + offset * sizeof(pte);
> +        uint8_t ap;
> +
> +        if (get_pte(baseaddr, offset, &pte, info)) {
> +                goto error;
> +        }
> +        trace_smmu_ptw_level(level, iova, subpage_size,
> +                             baseaddr, offset, pte);
> +
> +        if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
> +            trace_smmu_ptw_invalid_pte(stage, level, baseaddr,
> +                                       pte_addr, offset, pte);
> +            info->type = SMMU_PTW_ERR_TRANSLATION;
> +            goto error;
> +        }
> +
> +        if (is_page_pte(pte, level)) {
> +            uint64_t gpa = get_page_pte_address(pte, granule_sz);
> +
> +            ap = PTE_AP(pte);
> +            if (is_permission_fault(ap, perm)) {
> +                info->type = SMMU_PTW_ERR_PERMISSION;
> +                goto error;
> +            }
> +
> +            tlbe->translated_addr = gpa + (iova & mask);
> +            tlbe->perm = PTE_AP_TO_PERM(ap);
> +            trace_smmu_ptw_page_pte(stage, level, iova,
> +                                    baseaddr, pte_addr, pte, gpa);
> +            return 0;
> +        }
> +        if (is_block_pte(pte, level)) {
> +            uint64_t block_size;
> +            hwaddr gpa = get_block_pte_address(pte, level, granule_sz,
> +                                               &block_size);
> +
> +            ap = PTE_AP(pte);
> +            if (is_permission_fault(ap, perm)) {
> +                info->type = SMMU_PTW_ERR_PERMISSION;
> +                goto error;
> +            }
> +
> +            trace_smmu_ptw_block_pte(stage, level, baseaddr,
> +                                     pte_addr, pte, iova, gpa,
> +                                    (int)(block_size >> 20));

I don't think you should need this cast, because the argument to the
trace function is an int anyway ?

> +
> +            tlbe->translated_addr = gpa + (iova & mask);
> +            tlbe->perm = PTE_AP_TO_PERM(ap);
> +            return 0;
> +        }
> +
> +        /* table pte */
> +        ap = PTE_APTABLE(pte);
> +
> +        if (is_permission_fault(ap, perm)) {
> +            info->type = SMMU_PTW_ERR_PERMISSION;
> +            goto error;
> +        }
> +        baseaddr = get_table_pte_address(pte, granule_sz);
> +        level++;
> +    }
> +
> +    info->type = SMMU_PTW_ERR_TRANSLATION;
> +
> +error:
> +    tlbe->perm = IOMMU_NONE;
> +    return -EINVAL;
> +}
> +
> +/**
> + * smmu_ptw - Walk the page tables for an IOVA, according to @cfg
> + *
> + * @cfg: translation configuration
> + * @iova: iova to translate
> + * @perm: tentative access type
> + * @tlbe: returned entry
> + * @info: ptw event handle
> + *
> + * return 0 on success
> + */
> +int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
> +             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
> +{
> +    if (!cfg->aa64) {
> +        error_setg(&error_fatal,
> +                   "SMMUv3 model does not support VMSAv8-32 page walk yet");

This sort of guest-provokable error should not be fatal -- log
it with LOG_UNIMP and carry on as best you can (in this case
the spec should say what happens for SMMUv3 implementations which
don't support AArch32 tables).

> +    }
> +
> +    return smmu_ptw_64(cfg, iova, perm, tlbe, info);
> +}
>
>  SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
>  {
> diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
> new file mode 100644
> index 0000000..3ed97ee
> --- /dev/null
> +++ b/hw/arm/smmu-internal.h
> @@ -0,0 +1,96 @@
> +/*
> + * ARM SMMU support - Internal API
> + *
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_ARM_SMMU_INTERNAL_H
> +#define HW_ARM_SMMU_INTERNAL_H
> +
> +#define ARM_LPAE_MAX_ADDR_BITS          48

You define this, but nothing in the series uses it.

Also, from an SMMU perspective, it would be helpful to clarify
exactly what addresses are involved here (input addresses,
intermediate addresses, output addresses?) -- cf the spec section 3.4.

> +#define ARM_LPAE_MAX_LEVELS             4

This doesn't seem to be used either.

> +
> +/* PTE Manipulation */
> +
> +#define ARM_LPAE_PTE_TYPE_SHIFT         0
> +#define ARM_LPAE_PTE_TYPE_MASK          0x3
> +
> +#define ARM_LPAE_PTE_TYPE_BLOCK         1
> +#define ARM_LPAE_PTE_TYPE_TABLE         3
> +
> +#define ARM_LPAE_L3_PTE_TYPE_RESERVED   1
> +#define ARM_LPAE_L3_PTE_TYPE_PAGE       3
> +
> +#define ARM_LPAE_PTE_VALID              (1 << 0)
> +
> +#define PTE_ADDRESS(pte, shift) \
> +    (extract64(pte, shift, 47 - shift + 1) << shift)
> +
> +#define is_invalid_pte(pte) (!(pte & ARM_LPAE_PTE_VALID))
> +
> +#define is_reserved_pte(pte, level)                                      \
> +    ((level == 3) &&                                                     \
> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_L3_PTE_TYPE_RESERVED))
> +
> +#define is_block_pte(pte, level)                                         \
> +    ((level < 3) &&                                                      \
> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_BLOCK))
> +
> +#define is_table_pte(pte, level)                                        \
> +    ((level < 3) &&                                                     \
> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_TABLE))
> +
> +#define is_page_pte(pte, level)                                         \
> +    ((level == 3) &&                                                    \
> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_L3_PTE_TYPE_PAGE))
> +
> +#define PTE_AP(pte) \
> +    (extract64(pte, 6, 2))
> +
> +#define PTE_APTABLE(pte) \
> +    (extract64(pte, 61, 2))
> +
> +#define is_permission_fault(ap, perm) \
> +    (((perm) & IOMMU_WO) && ((ap) & 0x2))

Don't we also need to check AP bit 1 in some cases?
(when the StreamWorld is S or NS EL1 and either (a) the incoming
transaction has its attrs.user = 1 and STE.PRIVCFG is 0b0x, or
(b) STE.PRIVCFG is 0b10).
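
To illustrate, a permission check that also honours the unprivileged
access bit could look like the sketch below. It is only a sketch:
ap_permission_fault is a hypothetical helper, ap is assumed to hold
AP[2:1] as extracted by PTE_AP() (so bit 0 is AP[1] and bit 1 is AP[2]),
and the caller is assumed to have already derived "unprivileged" from the
transaction attributes and STE.PRIVCFG. StreamWorlds in which AP[1] is
not used for this purpose are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * ap bit 1 (AP[2]): 1 = read-only, 0 = read/write.
 * ap bit 0 (AP[1]): 1 = unprivileged access allowed, 0 = privileged only.
 */
static bool ap_permission_fault(unsigned ap, bool is_write, bool unprivileged)
{
    assert(ap <= 3);
    if (is_write && (ap & 0x2)) {
        return true;    /* write to a read-only mapping */
    }
    if (unprivileged && !(ap & 0x1)) {
        return true;    /* unprivileged access to a privileged-only mapping */
    }
    return false;
}
```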

> +
> +#define PTE_AP_TO_PERM(ap) \
> +    (IOMMU_ACCESS_FLAG(true, !((ap) & 0x2)))

Similarly here.

> +
> +/* Level Indexing */
> +
> +static inline int level_shift(int level, int granule_sz)
> +{
> +    return granule_sz + (3 - level) * (granule_sz - 3);
> +}
> +
> +static inline uint64_t level_page_mask(int level, int granule_sz)
> +{
> +    return ~((1ULL << level_shift(level, granule_sz)) - 1);
> +}
> +
> +/**
> + * TODO: handle the case where the level resolves less than
> + * granule_sz -3 IA bits.
> + */
> +static inline
> +uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
> +{
> +    return (iova >> level_shift(level, granule_sz)) &
> +            ((1ULL << (granule_sz - 3)) - 1);

Maybe "... & MAKE_64BIT_MASK(0, granule_sz - 3)" ?
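
A standalone sketch of the index computation with a mask helper, to show
what the suggestion amounts to. The local mask64() function is a
stand-in written for this example and is only assumed to behave like
QEMU's MAKE_64BIT_MASK(shift, length); level_shift and iova_level_offset
follow the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in: 'length' one bits starting at bit 'shift'. */
static uint64_t mask64(unsigned shift, unsigned length)
{
    assert(length >= 1 && length <= 64 && shift + length <= 64);
    return ((~0ULL) >> (64 - length)) << shift;
}

/* Lowest IOVA bit resolved at this level, as in the patch. */
static int level_shift(int level, int granule_sz)
{
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/* Table index for this level: (granule_sz - 3) bits of the IOVA. */
static uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
{
    return (iova >> level_shift(level, granule_sz)) &
           mask64(0, granule_sz - 3);
}
```

With a 4K granule (granule_sz = 12) each level resolves 9 bits, so the
level 0 to level 3 indexes come from IOVA bits [47:39], [38:30], [29:21]
and [20:12].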

> +}
> +
> +#endif
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 193063e..3584974 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -2,3 +2,13 @@
>
>  # hw/arm/virt-acpi-build.c
>  virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
> +
> +# hw/arm/smmu-common.c
> +
> +smmu_page_walk(int stage, uint64_t baseaddr, int first_level, uint64_t start, uint64_t end) "stage=%d, baseaddr=0x%"PRIx64", first level=%d, start=0x%"PRIx64", end=0x%"PRIx64
> +smmu_lookup_table(int level, uint64_t baseaddr, int granule_sz, uint64_t start, uint64_t end, int flags, uint64_t subpage_size) "level=%d baseaddr=0x%"PRIx64" granule=%d, start=0x%"PRIx64" end=0x%"PRIx64" flags=%d subpage_size=0x%lx"
> +smmu_ptw_level(int level, uint64_t iova, size_t subpage_size, uint64_t baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%lx subpage_sz=0x%lx baseaddr=0x%"PRIx64" offset=%d => pte=0x%lx"
> +smmu_ptw_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" offset=%d pte=0x%"PRIx64
> +smmu_ptw_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
> +smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
> +smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64

These have some uses of "%lx" for uint64_t, which should use PRIx64
to avoid compile issues on 32 bit systems.
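
The portability point can be shown with plain C outside the trace-events
format. In this sketch, format_pte is a hypothetical helper; PRIx64 comes
from <inttypes.h> and expands to the right conversion for uint64_t on
both 32-bit and 64-bit hosts, whereas "%lx" expects an unsigned long,
which is only 32 bits wide on many 32-bit systems.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format two 64-bit values portably; returns snprintf()'s result. */
static int format_pte(char *buf, size_t len, uint64_t iova, uint64_t pte)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "iova=0x%" PRIx64 " pte=0x%" PRIx64,
                    iova, pte);
}
```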

> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
> index aee96c2..0fb27f7 100644
> --- a/include/hw/arm/smmu-common.h
> +++ b/include/hw/arm/smmu-common.h
> @@ -127,4 +127,10 @@ static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
>  {
>      return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
>  }
> +
> +int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
> +             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info);
> +
> +SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova);
> +
>  #endif  /* HW_ARM_SMMU_COMMON */
> --
> 2.5.5

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk
  2018-03-06 19:43   ` Peter Maydell
@ 2018-03-07 16:23     ` Auger Eric
  2018-03-07 16:35       ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-07 16:23 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 06/03/18 20:43, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> This patch implements the page table walk for VMSAv8-64.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v8 -> v9:
>> - remove guest error log on PTE fetch fault
>> - rename  trace functions
>> - fix smmu_page_walk_level_res_invalid_pte last arg
>> - fix PTE_ADDRESS
>> - turn functions into macros
>> - make sure to return the actual pte access permission
>>   into tlbe->perm
>> - change proto of smmu_ptw*
>>
>> v7 -> v8:
>> - rework get_pte
>> - use LOG_LEVEL_ERROR
>> - remove error checking in get_block_pte_address
>> - page table walk simplified (no VFIO replay anymore)
>> - handle PTW error events
>> - use dma_memory_read
>>
>> v6 -> v7:
>> - fix wrong error handling in walk_page_table
>> - check perm in smmu_translate
>>
>> v5 -> v6:
>> - use IOMMUMemoryRegion
>> - remove initial_lookup_level()
>> - fix block replay
>>
>> v4 -> v5:
>> - add initial level in translation config
>> - implement block pte
>> - rename must_translate into nofail
>> - introduce call_entry_hook
>> - small changes to dynamic traces
>> - smmu_page_walk code moved from smmuv3.c to this file
>> - remove smmu_translate*
>>
>> v3 -> v4:
>> - reworked page table walk to prepare for VFIO integration
>>   (capability to scan a range of IOVA). Same function is used
>>   for translate for a single iova. This is largely inspired
>>   from intel_iommu.c
>> - as the translate function was not straightforward to me,
>>   I tried to stick more closely to the VMSA spec.
>> - remove support of nested stage (kernel driver does not
>>   support it anyway)
>> - use error_report and trace events
>> - add aa64[] field in SMMUTransCfg
>> ---
>>  hw/arm/smmu-common.c         | 232 +++++++++++++++++++++++++++++++++++++++++++
>>  hw/arm/smmu-internal.h       |  96 ++++++++++++++++++
>>  hw/arm/trace-events          |  10 ++
>>  include/hw/arm/smmu-common.h |   6 ++
>>  4 files changed, 344 insertions(+)
>>  create mode 100644 hw/arm/smmu-internal.h
>>
>> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
>> index d0516dc..24cc4ba 100644
>> --- a/hw/arm/smmu-common.c
>> +++ b/hw/arm/smmu-common.c
>> @@ -27,6 +27,238 @@
>>
>>  #include "qemu/error-report.h"
>>  #include "hw/arm/smmu-common.h"
>> +#include "smmu-internal.h"
>> +
>> +/* VMSAv8-64 Translation */
>> +
>> +/**
>> + * get_pte - Get the content of a page table entry located at
>> + * @base_addr[@index]
>> + */
>> +static int get_pte(dma_addr_t baseaddr, uint32_t index, uint64_t *pte,
>> +                   SMMUPTWEventInfo *info)
>> +{
>> +    int ret;
>> +    dma_addr_t addr = baseaddr + index * sizeof(*pte);
>> +
>> +    ret = dma_memory_read(&address_space_memory, addr,
>> +                          (uint8_t *)pte, sizeof(*pte));
> 
> I think last time round I asked that these be done with the
> "read a 64-bit quantity" APIs and a comment that they're
> supposed to be atomic.

I was unsure whether that also applied to the fetch of the page descriptors
themselves. I added the comment on the fetches of the STE and CD, since
section 3.21.3 deals with configuration structures. I will add the comment
here as well.

> 
>> +
>> +    if (ret != MEMTX_OK) {
>> +        info->type = SMMU_PTW_ERR_WALK_EABT;
>> +        info->addr = addr;
>> +        return -EINVAL;
>> +    }
>> +    trace_smmu_get_pte(baseaddr, index, addr, *pte);
>> +    return 0;
>> +}
>> +
>> +/* VMSAv8-64 Translation Table Format Descriptor Decoding */
>> +
>> +/**
>> + * get_page_pte_address - returns the L3 descriptor output address,
>> + * ie. the page frame
>> + * ARM ARM spec: Figure D4-17 VMSAv8-64 level 3 descriptor format
>> + */
>> +static inline hwaddr get_page_pte_address(uint64_t pte, int granule_sz)
>> +{
>> +    return PTE_ADDRESS(pte, granule_sz);
>> +}
>> +
>> +/**
>> + * get_table_pte_address - return table descriptor output address,
>> + * ie. address of next level table
>> + * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
>> + */
>> +static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
>> +{
>> +    return PTE_ADDRESS(pte, granule_sz);
>> +}
>> +
>> +/**
>> + * get_block_pte_address - return block descriptor output address and block size
>> + * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
>> + */
>> +static hwaddr get_block_pte_address(uint64_t pte, int level, int granule_sz,
>> +                                    uint64_t *bsz)
>> +{
>> +    int n = 0;
>> +
>> +    switch (granule_sz) {
>> +    case 12:
>> +        if (level == 1) {
>> +            n = 30;
>> +        } else if (level == 2) {
>> +            n = 21;
>> +        }
>> +        break;
>> +    case 14:
>> +        if (level == 2) {
>> +            n = 25;
>> +        }
>> +        break;
>> +    case 16:
>> +        if (level == 2) {
>> +            n = 29;
>> +        }
>> +        break;
>> +    }
>> +    if (!n) {
>> +        error_setg(&error_fatal,
>> +                   "wrong granule/level combination (%d/%d)",
>> +                   granule_sz, level);
> 
> If this is guest-provokable then it shouldn't be a fatal error.
> If it isn't guest provokable then you can just assert.
> I think you should be able to sanitize the SMMUTransTableInfo when
> you construct it from the CD (and give a C_BAD_CD event), and then
> you can trust the granule_sz and level when you're doing table walks.
OK.
> 
> You can calculate n as
>     n = (granule_sz - 3) * (4 - level) + 3;
> (compare target/arm/helper.c:get_phys_addr_lpae() calculation
> of page_size, and in the pseudocode the line
>     addrselectbottom = (3-level)*stride + grainsize;
> where stride is grainsize-3 and so comes out to the same thing.)

OK. I will simplify it and check when decoding the CD.
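
As a rough sketch of that simplification (function names are illustrative,
not taken from the patch), the switch can be replaced by the formula quoted
above. It assumes granule_sz and level were already validated when the CD
was decoded.

```c
#include <stdint.h>

/*
 * Number of input-address bits covered by one descriptor at @level for a
 * granule of 2^granule_sz bytes. Each level resolves (granule_sz - 3)
 * bits, so this equals granule_sz + (3 - level) * (granule_sz - 3).
 */
static inline int block_shift(int level, int granule_sz)
{
    return (granule_sz - 3) * (4 - level) + 3;
}

/* Block size in bytes; 1ULL keeps the shift in 64 bits. */
static inline uint64_t block_size(int level, int granule_sz)
{
    return 1ULL << block_shift(level, granule_sz);
}
```

The formula reproduces the values in the switch (30 and 21 for 4K, 25 for
16K, 29 for 64K). Using 1ULL matters because the shift can exceed 31: the
formula yields 42 for level 1 with a 64K granule, and whether that
combination is valid is left to the CD decode.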
> 
>> +    }
>> +    *bsz = 1 << n;
>> +    return PTE_ADDRESS(pte, n);
>> +}
>> +
>> +static inline bool check_perm(int access_attrs, int mem_attrs)
>> +{
>> +    if (((access_attrs & IOMMU_RO) && !(mem_attrs & IOMMU_RO)) ||
>> +        ((access_attrs & IOMMU_WO) && !(mem_attrs & IOMMU_WO))) {
>> +        return false;
>> +    }
>> +    return true;
>> +}
> 
> This function doesn't seem to ever be used in this patchset?
> (clang will complain about that, though gcc won't.)
OK, will remove that.
> 
>> +
>> +SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova)
>> +{
>> +    if (!extract64(iova, 64 - cfg->tt[0].tsz, cfg->tt[0].tsz - cfg->tbi)) {
>> +        return &cfg->tt[0];
>> +    }
>> +    return &cfg->tt[1];
> 
> I'm confused by your handling of the TBI bits here. In
> decode_cd() you take the 2-bit TBI field into cfg->tbi. That's
> a pair of bits, one of which is "top byte ignore" for TTB0 and the
> other is "top byte ignore" for TTB1. But here you're subtracting
> the whole field from cfg->tt[0].tsz. Which of the two tbi bits
> you need to use depends on bit 55 of the iova (compare the code
> in get_phys_addr_lpae() and the pseudocode function AddrTop()),
> and then if that bit is 1 it means that 8 bits of address should
> be ignored when determining whether to use TTB0 or TTB1.
> 
> You also need to consider the case where the input address is in
> neither the TTB0 range nor the TTB1 range (cf fig D4-14 in the
> v8A Arm ARM DDI0487C.a, and the code in get_phys_addr_lpae()).

Yes I totally misunderstood the way to decode that :-(
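
A minimal sketch of the selection logic described above, under stated
assumptions: the structures are simplified stand-ins for SMMUTransCfg and
SMMUTransTableInfo, cfg->tbi is assumed to hold TBI0 in bit 0 and TBI1 in
bit 1, and the function returns an index (or -1) instead of a pointer.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the patch's config structures. */
typedef struct {
    uint8_t tsz;        /* TxSZ: the region covers 2^(64 - tsz) bytes */
} TTInfo;

typedef struct {
    TTInfo tt[2];
    uint8_t tbi;        /* assumed layout: bit 0 = TBI0, bit 1 = TBI1 */
} TransCfg;

/* Extract @len bits of @v starting at bit @start (1 <= len <= 64). */
static uint64_t bits64(uint64_t v, unsigned start, unsigned len)
{
    return (v >> start) & (~0ULL >> (64 - len));
}

/*
 * Return 0 if @iova falls in the TTB0 range, 1 for TTB1, -1 for neither.
 * Assumes 16 <= tsz <= 39 so the tested field is never empty.
 */
static int select_tt_index(const TransCfg *cfg, uint64_t iova)
{
    /* Bit 55 of the address picks which TBI bit applies. */
    unsigned sel = (iova >> 55) & 1;
    unsigned ignored = ((cfg->tbi >> sel) & 1) ? 8 : 0;
    unsigned tsz0 = cfg->tt[0].tsz;
    unsigned tsz1 = cfg->tt[1].tsz;

    /* TTB0: all significant upper bits must be zero. */
    if (!bits64(iova, 64 - tsz0, tsz0 - ignored)) {
        return 0;
    }
    /* TTB1: all significant upper bits must be one. */
    if (!bits64(~iova, 64 - tsz1, tsz1 - ignored)) {
        return 1;
    }
    return -1;
}
```

A caller would turn the -1 case into a translation fault rather than
falling through to TTB1.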

> 
>> +}
>> +
>> +/**
>> + * smmu_ptw_64 - VMSAv8-64 Walk of the page tables for a given IOVA
>> + * @cfg: translation config
>> + * @iova: iova to translate
>> + * @perm: access type
>> + * @tlbe: IOMMUTLBEntry (out)
>> + * @info: handle to an error info
>> + *
>> + * Return 0 on success, < 0 on error. In case of error, @info is filled
>> + * and tlbe->perm is set to IOMMU_NONE.
>> + * Upon success, @tlbe is filled with translated_addr and entry
>> + * permission rights.
>> + */
>> +static int smmu_ptw_64(SMMUTransCfg *cfg,
>> +                       dma_addr_t iova, IOMMUAccessFlags perm,
>> +                       IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
>> +{
>> +    dma_addr_t baseaddr;
>> +    int stage = cfg->stage;
>> +    SMMUTransTableInfo *tt = select_tt(cfg, iova);
>> +    uint8_t level;
>> +    uint8_t granule_sz;
>> +
>> +    if (tt->disabled) {
>> +        info->type = SMMU_PTW_ERR_TRANSLATION;
>> +        goto error;
>> +    }
>> +
>> +    level = tt->initial_level;
>> +    granule_sz = tt->granule_sz;
>> +    baseaddr = extract64(tt->ttb, 0, 48);
> 
> The spec says that bits specified by the TTB0/TTB1 fields
> in a CD that are outside the effective IPS range are ILLEGAL;
> you should detect that when you set up tt->ttb and then you
> don't need the extract64() here, I think.
OK
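
One possible shape for that check, as a sketch only: ttb_in_range and its
addr_bits parameter are hypothetical names, and the caller is assumed to
know the effective address size in bits when it decodes the CD.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if @ttb has no bits set at or above bit @addr_bits.
 * @addr_bits is the effective address size in bits (1..63 here).
 */
static bool ttb_in_range(uint64_t ttb, unsigned addr_bits)
{
    return (ttb >> addr_bits) == 0;
}
```

If the check fails, the CD would be treated as illegal at decode time, and
the walk can then use tt->ttb directly.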
> 
>> +
>> +    tlbe->iova = iova;
>> +    tlbe->addr_mask = (1 << tt->granule_sz) - 1;
> 
> you could just use "granule_sz" here since you have it in a local.
sure
> 
>> +
>> +    while (level <= 3) {
>> +        uint64_t subpage_size = 1ULL << level_shift(level, granule_sz);
>> +        uint64_t mask = subpage_size - 1;
>> +        uint32_t offset = iova_level_offset(iova, level, granule_sz);
>> +        uint64_t pte;
>> +        dma_addr_t pte_addr = baseaddr + offset * sizeof(pte);
>> +        uint8_t ap;
>> +
>> +        if (get_pte(baseaddr, offset, &pte, info)) {
>> +                goto error;
>> +        }
>> +        trace_smmu_ptw_level(level, iova, subpage_size,
>> +                             baseaddr, offset, pte);
>> +
>> +        if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
>> +            trace_smmu_ptw_invalid_pte(stage, level, baseaddr,
>> +                                       pte_addr, offset, pte);
>> +            info->type = SMMU_PTW_ERR_TRANSLATION;
>> +            goto error;
>> +        }
>> +
>> +        if (is_page_pte(pte, level)) {
>> +            uint64_t gpa = get_page_pte_address(pte, granule_sz);
>> +
>> +            ap = PTE_AP(pte);
>> +            if (is_permission_fault(ap, perm)) {
>> +                info->type = SMMU_PTW_ERR_PERMISSION;
>> +                goto error;
>> +            }
>> +
>> +            tlbe->translated_addr = gpa + (iova & mask);
>> +            tlbe->perm = PTE_AP_TO_PERM(ap);
>> +            trace_smmu_ptw_page_pte(stage, level, iova,
>> +                                    baseaddr, pte_addr, pte, gpa);
>> +            return 0;
>> +        }
>> +        if (is_block_pte(pte, level)) {
>> +            uint64_t block_size;
>> +            hwaddr gpa = get_block_pte_address(pte, level, granule_sz,
>> +                                               &block_size);
>> +
>> +            ap = PTE_AP(pte);
>> +            if (is_permission_fault(ap, perm)) {
>> +                info->type = SMMU_PTW_ERR_PERMISSION;
>> +                goto error;
>> +            }
>> +
>> +            trace_smmu_ptw_block_pte(stage, level, baseaddr,
>> +                                     pte_addr, pte, iova, gpa,
>> +                                    (int)(block_size >> 20));
> 
> I don't think you should need this cast, because the argument to the
> trace function is an int anyway ?
indeed
> 
>> +
>> +            tlbe->translated_addr = gpa + (iova & mask);
>> +            tlbe->perm = PTE_AP_TO_PERM(ap);
>> +            return 0;
>> +        }
>> +
>> +        /* table pte */
>> +        ap = PTE_APTABLE(pte);
>> +
>> +        if (is_permission_fault(ap, perm)) {
>> +            info->type = SMMU_PTW_ERR_PERMISSION;
>> +            goto error;
>> +        }
>> +        baseaddr = get_table_pte_address(pte, granule_sz);
>> +        level++;
>> +    }
>> +
>> +    info->type = SMMU_PTW_ERR_TRANSLATION;
>> +
>> +error:
>> +    tlbe->perm = IOMMU_NONE;
>> +    return -EINVAL;
>> +}
>> +
>> +/**
>> + * smmu_ptw - Walk the page tables for an IOVA, according to @cfg
>> + *
>> + * @cfg: translation configuration
>> + * @iova: iova to translate
>> + * @perm: tentative access type
>> + * @tlbe: returned entry
>> + * @info: ptw event handle
>> + *
>> + * return 0 on success
>> + */
>> +int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
>> +             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
>> +{
>> +    if (!cfg->aa64) {
>> +        error_setg(&error_fatal,
>> +                   "SMMUv3 model does not support VMSAv8-32 page walk yet");
> 
> This sort of guest-provokable error should not be fatal -- log
> it with LOG_UNIMP and carry on as best you can (in this case
> the spec should say what happens for SMMUv3 implementations which
> don't support AArch32 tables).
This code should never be entered, as the check is done when decoding
the CD in the smmuv3. However, since it is a base class, I wanted to
emphasize that the ptw only supports AArch64.

> 
>> +    }
>> +
>> +    return smmu_ptw_64(cfg, iova, perm, tlbe, info);
>> +}
>>
>>  SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
>>  {
>> diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
>> new file mode 100644
>> index 0000000..3ed97ee
>> --- /dev/null
>> +++ b/hw/arm/smmu-internal.h
>> @@ -0,0 +1,96 @@
>> +/*
>> + * ARM SMMU support - Internal API
>> + *
>> + * Copyright (c) 2017 Red Hat, Inc.
>> + * Copyright (C) 2014-2016 Broadcom Corporation
>> + * Written by Prem Mallappa, Eric Auger
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef HW_ARM_SMMU_INTERNAL_H
>> +#define HW_ARM_SMMU_INTERNAL_H
>> +
>> +#define ARM_LPAE_MAX_ADDR_BITS          48
> 
> You define this, but nothing in the series uses it.
removed. Actually we don't seem to need that anymore.
> 
> Also, from an SMMU perspective, it would be helpful to clarify
> exactly what addresses are involved here (input addresses,
> intermediate addresses, output addresses?) -- cf the spec section 3.4.
> 
>> +#define ARM_LPAE_MAX_LEVELS             4
> 
> This doesn't seem to be used either.
removed
> 
>> +
>> +/* PTE Manipulation */
>> +
>> +#define ARM_LPAE_PTE_TYPE_SHIFT         0
>> +#define ARM_LPAE_PTE_TYPE_MASK          0x3
>> +
>> +#define ARM_LPAE_PTE_TYPE_BLOCK         1
>> +#define ARM_LPAE_PTE_TYPE_TABLE         3
>> +
>> +#define ARM_LPAE_L3_PTE_TYPE_RESERVED   1
>> +#define ARM_LPAE_L3_PTE_TYPE_PAGE       3
>> +
>> +#define ARM_LPAE_PTE_VALID              (1 << 0)
>> +
>> +#define PTE_ADDRESS(pte, shift) \
>> +    (extract64(pte, shift, 47 - shift + 1) << shift)
>> +
>> +#define is_invalid_pte(pte) (!(pte & ARM_LPAE_PTE_VALID))
>> +
>> +#define is_reserved_pte(pte, level)                                      \
>> +    ((level == 3) &&                                                     \
>> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_L3_PTE_TYPE_RESERVED))
>> +
>> +#define is_block_pte(pte, level)                                         \
>> +    ((level < 3) &&                                                      \
>> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_BLOCK))
>> +
>> +#define is_table_pte(pte, level)                                        \
>> +    ((level < 3) &&                                                     \
>> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_TABLE))
>> +
>> +#define is_page_pte(pte, level)                                         \
>> +    ((level == 3) &&                                                    \
>> +     ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_L3_PTE_TYPE_PAGE))
>> +
>> +#define PTE_AP(pte) \
>> +    (extract64(pte, 6, 2))
>> +
>> +#define PTE_APTABLE(pte) \
>> +    (extract64(pte, 61, 2))
>> +
>> +#define is_permission_fault(ap, perm) \
>> +    (((perm) & IOMMU_WO) && ((ap) & 0x2))
> 
> Don't we also need to check AP bit 1 in some cases?
> (when the StreamWorld is S or NS EL1 and either (a) the incoming
> transaction has its attrs.user = 1 and STE.PRIVCFG is 0b0x, or
> (b) STE.PRIVCFG is 0b10).
I think I don't need to as I don't support this feature at the moment:
spec says:
"When SMMU_IDR1.ATTR_PERMS_OVR=0, this field is RES0 and the incoming
PRIV attribute is used."
But to be honest I was not aware this existed ;()
> 
>> +
>> +#define PTE_AP_TO_PERM(ap) \
>> +    (IOMMU_ACCESS_FLAG(true, !((ap) & 0x2)))
> 
> Similarly here.
?
> 
>> +
>> +/* Level Indexing */
>> +
>> +static inline int level_shift(int level, int granule_sz)
>> +{
>> +    return granule_sz + (3 - level) * (granule_sz - 3);
>> +}
>> +
>> +static inline uint64_t level_page_mask(int level, int granule_sz)
>> +{
>> +    return ~((1ULL << level_shift(level, granule_sz)) - 1);
>> +}
>> +
>> +/**
>> + * TODO: handle the case where the level resolves less than
>> + * granule_sz -3 IA bits.
>> + */
>> +static inline
>> +uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
>> +{
>> +    return (iova >> level_shift(level, granule_sz)) &
>> +            ((1ULL << (granule_sz - 3)) - 1);
> 
> Maybe "... & MAKE_64BIT_MASK(0, granule_sz - 3)" ?
OK
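
For reference, a standalone sketch of the helper with a mask macro. MASK64
is a local stand-in defined here so the example compiles on its own; it is
meant to mirror a (shift, length) style 64-bit mask helper.

```c
#include <stdint.h>

/* Local stand-in for a (shift, length) 64-bit mask; 1 <= length <= 64. */
#define MASK64(shift, length) (((~0ULL) >> (64 - (length))) << (shift))

static inline int level_shift(int level, int granule_sz)
{
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/* Index of the descriptor for @iova within the table at @level. */
static inline uint64_t iova_level_offset(uint64_t iova, int level,
                                         int granule_sz)
{
    return (iova >> level_shift(level, granule_sz)) &
           MASK64(0, granule_sz - 3);
}
```

With a 4K granule each level indexes 9 bits, so the level 3 index of
0x123456789000 is 0x189.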
> 
>> +}
>> +
>> +#endif
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index 193063e..3584974 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -2,3 +2,13 @@
>>
>>  # hw/arm/virt-acpi-build.c
>>  virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
>> +
>> +# hw/arm/smmu-common.c
>> +
>> +smmu_page_walk(int stage, uint64_t baseaddr, int first_level, uint64_t start, uint64_t end) "stage=%d, baseaddr=0x%"PRIx64", first level=%d, start=0x%"PRIx64", end=0x%"PRIx64
>> +smmu_lookup_table(int level, uint64_t baseaddr, int granule_sz, uint64_t start, uint64_t end, int flags, uint64_t subpage_size) "level=%d baseaddr=0x%"PRIx64" granule=%d, start=0x%"PRIx64" end=0x%"PRIx64" flags=%d subpage_size=0x%lx"
>> +smmu_ptw_level(int level, uint64_t iova, size_t subpage_size, uint64_t baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%lx subpage_sz=0x%lx baseaddr=0x%"PRIx64" offset=%d => pte=0x%lx"
>> +smmu_ptw_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" offset=%d pte=0x%"PRIx64
>> +smmu_ptw_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
>> +smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
>> +smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
> 
> These have some uses of "%lx" for uint64_t, which should use PRIx64
> to avoid compile issues on 32 bit systems.
OK

Thanks

Eric
> 
>> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
>> index aee96c2..0fb27f7 100644
>> --- a/include/hw/arm/smmu-common.h
>> +++ b/include/hw/arm/smmu-common.h
>> @@ -127,4 +127,10 @@ static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
>>  {
>>      return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
>>  }
>> +
>> +int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
>> +             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info);
>> +
>> +SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova);
>> +
>>  #endif  /* HW_ARM_SMMU_COMMON */
>> --
>> 2.5.5
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk
  2018-03-07 16:23     ` Auger Eric
@ 2018-03-07 16:35       ` Peter Maydell
  2018-03-08 18:56         ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-07 16:35 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 7 March 2018 at 16:23, Auger Eric <eric.auger@redhat.com> wrote:
> Hi Peter,
>
> On 06/03/18 20:43, Peter Maydell wrote:
>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:

>>> +int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
>>> +             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
>>> +{
>>> +    if (!cfg->aa64) {
>>> +        error_setg(&error_fatal,
>>> +                   "SMMUv3 model does not support VMSAv8-32 page walk yet");
>>
>> This sort of guest-provokable error should not be fatal -- log
>> it with LOG_UNIMP and carry on as best you can (in this case
>> the spec should say what happens for SMMUv3 implementations which
>> don't support AArch32 tables).
> This code should never be entered, as the check is done when decoding
> the CD in the smmuv3. However, since it is a base class, I wanted to
> emphasize that the ptw only supports AArch64.

Ah, right. That should be an assert() with a brief comment
about why the condition will never trigger.
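
A minimal sketch of that shape, with a trimmed-down, hypothetical stand-in
for SMMUTransCfg and a placeholder for the real walk:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for SMMUTransCfg. */
typedef struct {
    bool aa64;
} TransCfg;

/* Placeholder for the real VMSAv8-64 walk. */
static int ptw_64(const TransCfg *cfg, uint64_t iova)
{
    (void)cfg;
    (void)iova;
    return 0;
}

static int ptw(const TransCfg *cfg, uint64_t iova)
{
    /*
     * The CD decode is assumed to have rejected AArch32 tables already,
     * so a non-AArch64 config here is a programming error, not a guest
     * error.
     */
    assert(cfg->aa64);
    return ptw_64(cfg, iova);
}
```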


>>> +#define is_permission_fault(ap, perm) \
>>> +    (((perm) & IOMMU_WO) && ((ap) & 0x2))
>>
>> Don't we also need to check AP bit 1 in some cases?
>> (when the StreamWorld is S or NS EL1 and either (a) the incoming
>> transaction has its attrs.user = 1 and STE.PRIVCFG is 0b0x, or
>> (b) STE.PRIVCFG is 0b10).
> I think I don't need to as I don't support this feature at the moment:
> spec says:
> "When SMMU_IDR1.ATTR_PERMS_OVR=0, this field is RES0 and the incoming
> PRIV attribute is used."
> But to be honest I was not aware this existed ;()

I think you still need to check the incoming transaction
for user vs priv, even if you don't support STE.PRIVCFG.

>>> +
>>> +#define PTE_AP_TO_PERM(ap) \
>>> +    (IOMMU_ACCESS_FLAG(true, !((ap) & 0x2)))
>>
>> Similarly here.
> ?

Can't just ignore AP bit 1 (or 0, if you're counting it that way).
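
As a sketch of what checking both bits could look like for a stage 1
EL1&0-style regime (the function signature and the unpriv flag are
hypothetical; how the incoming transaction's user/privileged attribute
reaches the walk is not shown here):

```c
#include <stdbool.h>

/*
 * @ap is descriptor bits [7:6] as extracted by PTE_AP():
 *   bit 0 = AP[1]: 1 means unprivileged access is allowed
 *   bit 1 = AP[2]: 1 means read-only
 * @unpriv is a hypothetical flag carrying the incoming transaction's
 * user/privileged attribute.
 */
static bool is_permission_fault(unsigned ap, bool want_write, bool unpriv)
{
    if (want_write && (ap & 0x2)) {
        return true;    /* write to a read-only mapping */
    }
    if (unpriv && !(ap & 0x1)) {
        return true;    /* unprivileged access to a privileged-only mapping */
    }
    return false;
}
```

The permissions returned in tlbe->perm would need the same treatment, so
that an entry is not reported as readable for an access class that would
fault.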


thanks
-- PMM

* Re: [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton Eric Auger
@ 2018-03-08 14:27   ` Peter Maydell
  2018-03-09 13:19     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-08 14:27 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> From: Prem Mallappa <prem.mallappa@broadcom.com>
>
> This patch implements a skeleton for the smmuv3 device.
> Datatypes and register definitions are introduced. The MMIO
> region, the interrupts and the queue are initialized.
>
> Only the MMIO read operation is implemented here.
>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>


I just have a few minor nits on this one...


> +static inline int smmu_enabled(SMMUv3State *s)
> +{
> +    return FIELD_EX32(s->cr[0], CR0, SMMU_ENABLE);
> +}
> +
> +typedef struct Cmd {
> +    uint32_t word[4];
> +} Cmd;
> +
> +typedef struct Evt  {
> +    uint32_t word[8];
> +} Evt;

Some one-liner comments noting what these structs are for
would be helpful.

> +
> +static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
> +                                   unsigned size)

A doc comment here would help in describing what the purpose of
this utility function is.

> +{
> +    if (size == 8 && !offset) {
> +        return r;

If you take my advice a bit further down about checking up front
that 8-byte accesses are only made to offsets of 64-bit registers,
you won't need the check on offset.

> +    }
> +
> +    /* 32 bit access */
> +
> +    if (offset && offset != 4)  {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "SMMUv3 MMIO read: bad offset/size %u/%u\n",
> +                      offset, size);

This isn't a guest error, because the function is only ever called
with constant values for the 'offset' parameter. You should just
assert that the offset is 0 or 4.
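
A standalone sketch of the helper with both suggestions applied. The name
reg64_read is illustrative, and the caller is assumed to have already
rejected 8-byte accesses that do not target a 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read all or half of a 64-bit register value @r.
 * @offset selects the low (0) or high (4) 32-bit word for 4-byte reads.
 */
static inline uint64_t reg64_read(uint64_t r, unsigned offset, unsigned size)
{
    /* Callers only pass constant offsets, so anything else is a code bug. */
    assert(offset == 0 || offset == 4);

    if (size == 8) {
        return r;
    }
    return (r >> (offset * 8)) & 0xffffffffULL;
}
```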

> +        return 0;
> +    }
> +
> +    return extract64(r, offset << 3, 32);
> +}
> +
> +#endif

> +static void smmuv3_init_regs(SMMUv3State *s)
> +{
> +    /**
> +     * IDR0: stage1 only, AArch64 only, coherent access, 16b ASID,
> +     *       multi-level stream table
> +     */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S1P, 1); /* stage 1 supported */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTF, 2); /* AArch64 PTW only */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, COHACC, 1); /* IO coherent */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ASID16, 1); /* 16-bit ASID */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTENDIAN, 2); /* little endian */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STALL_MODEL, 1); /* No stall */
> +    /* terminated transaction will always be aborted/error returned */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TERM_MODEL, 1);
> +    /* 2-level stream table supported */
> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, 1);
> +
> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SIDSIZE, SMMU_IDR1_SIDSIZE);
> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, EVENTQS, 19);
> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS,   19);
> +
> +   /* 4K and 64K granule support */
> +    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, 1);
> +    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, 1);
> +    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, OAS, SMMU_IDR5_OAS); /* 44 bits */
> +
> +    s->cmdq.base = deposit64(s->cmdq.base, 0, 5, 19); /* LOG2SIZE = 19 */
> +    s->cmdq.prod = 0;
> +    s->cmdq.cons = 0;
> +    s->cmdq.entry_size = sizeof(struct Cmd);
> +    s->eventq.base = deposit64(s->eventq.base, 0, 5, 19); /* LOG2SIZE = 19 */

Have some #defines for max cmd queue and event queue size, since
we use them here and also above in filling in the IDR fields ?
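
For example, something along these lines, where the macro names and the
queue_base_set_log2size helper are hypothetical and only illustrate sharing
one constant between the IDR1 fields and the queue base registers:

```c
#include <stdint.h>

/* Hypothetical names: log2 of the maximum number of queue entries. */
#define SMMU_CMDQS   19
#define SMMU_EVENTQS 19

/* Deposit @val into the low 5 bits (LOG2SIZE) of a queue base register. */
static inline uint64_t queue_base_set_log2size(uint64_t base, unsigned val)
{
    return (base & ~0x1fULL) | (val & 0x1f);
}
```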

> +    s->eventq.prod = 0;
> +    s->eventq.cons = 0;
> +    s->eventq.entry_size = sizeof(struct Evt);
> +
> +    s->features = 0;
> +    s->sid_split = 0;
> +}
> +
> +static void smmu_write_mmio(void *opaque, hwaddr addr,
> +                            uint64_t val, unsigned size)
> +{
> +    /* not yet implemented */
> +}
> +
> +static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
> +{
> +    SMMUState *sys = opaque;
> +    SMMUv3State *s = ARM_SMMUV3(sys);
> +    uint64_t val;
> +
> +    /* CONSTRAINED UNPREDICTABLE choice to have page0/1 be exact aliases */
> +    addr &= ~0x10000;
> +
> +    if (size != 4 && size != 8) {
> +        qemu_log_mask(LOG_GUEST_ERROR, "SMMUv3 MMIO read: bad size %u\n", size);

Your MemoryRegionOps settings mean this can never happen, so you
don't need to check it at runtime.

> +        return 0;
> +    }

Consider specifically catching 8-byte accesses to non-64-bit registers?
This is CONSTRAINED UNPREDICTABLE (see spec section 6.2), and "one
of the registers is read/written and other half is RAZ/WI" is permitted
behaviour, but it does mean you need to be a little careful about not
letting the top 32 bits of val become non-zero for the 32-bit register
codepaths. Logging bad 64-bit accesses as LOG_GUEST_ERROR and making
them RAZ/WI might be nicer for guest software developers.
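
A sketch of such an up-front check. The function name is illustrative, and
the register offsets are written from memory of the SMMUv3 register map, so
they should be checked against the spec (or replaced by the A_* constants)
before use. Only the 64-bit registers handled by the read path quoted here
are listed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offsets as recalled from the SMMUv3 register map; verify before use. */
#define REG_GERROR_IRQ_CFG0 0x68
#define REG_STRTAB_BASE     0x80
#define REG_CMDQ_BASE       0x90
#define REG_EVENTQ_BASE     0xa0

/* Return true if an access of @size bytes at @addr is acceptable. */
static bool access_size_ok(uint64_t addr, unsigned size)
{
    if (size != 8) {
        return true;
    }
    switch (addr) {
    case REG_GERROR_IRQ_CFG0:
    case REG_STRTAB_BASE:
    case REG_CMDQ_BASE:
    case REG_EVENTQ_BASE:
        return true;
    default:
        /* The device model would log a guest error and treat this as RAZ/WI. */
        return false;
    }
}
```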

> +
> +    /* Primecell/Corelink ID registers */
> +    switch (addr) {
> +    case A_CIDR0:
> +        val = 0x0D;
> +        break;
> +    case A_CIDR1:
> +        val = 0xF0;
> +        break;
> +    case A_CIDR2:
> +        val = 0x05;
> +        break;
> +    case A_CIDR3:
> +        val = 0xB1;
> +        break;
> +    case A_PIDR0:
> +        val = 0x84; /* Part Number */
> +        break;
> +    case A_PIDR1:
> +        val = 0xB4; /* JEP106 ID code[3:0] for Arm and Part number[11:8] */
> +        break;
> +    case A_PIDR3:
> +        val = 0x10; /* MMU600 p1 */
> +        break;
> +    case A_PIDR4:
> +        val = 0x4; /* 4KB region count, JEP106 continuation code for Arm */
> +        break;
> +    case 0xFD4 ... 0xFDC: /* SMMU_PIDR 5-7 */
> +        val = 0;
> +        break;

I usually put all the const CIDR/PIDR values in a const array, since
there are always 12 of them next to each other, but since you already
have this code it's fine.
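
For comparison, a sketch of the array form. It assumes the usual layout of
PIDR4..7 at 0xFD0, PIDR0..3 at 0xFE0 and CIDR0..3 at 0xFF0; the values are
copied from the switch above, and PIDR2 is a placeholder because the switch
has no case for it.

```c
#include <stdint.h>

#define ID_REGS_BASE 0xFD0   /* PIDR4 in the assumed layout */
#define ID_REGS_END  0xFFC   /* CIDR3 */

/* Values copied from the quoted switch; PIDR2 is not given there. */
static const uint8_t smmu_id_regs[12] = {
    0x04, 0x00, 0x00, 0x00,     /* PIDR4..PIDR7 */
    0x84, 0xB4, 0x00, 0x10,     /* PIDR0..PIDR3 (PIDR2: placeholder) */
    0x0D, 0xF0, 0x05, 0xB1,     /* CIDR0..CIDR3 */
};

/* @addr must be word-aligned and within [ID_REGS_BASE, ID_REGS_END]. */
static inline uint32_t id_reg_read(uint64_t addr)
{
    return smmu_id_regs[(addr - ID_REGS_BASE) / 4];
}
```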

> +    case A_IDR0 ... A_IDR5:
> +        val = s->idr[(addr - A_IDR0) / 4];
> +        break;
> +    case A_IIDR:
> +        val = s->iidr;
> +        break;
> +    case A_CR0:
> +        val = s->cr[0];
> +        break;
> +    case A_CR0ACK:
> +        val = s->cr0ack;
> +        break;
> +    case A_CR1:
> +        val = s->cr[1];
> +        break;
> +    case A_CR2:
> +        val = s->cr[2];
> +        break;
> +    case A_STATUSR:
> +        val = s->statusr;
> +        break;
> +    case A_IRQ_CTRL:
> +        val = s->irq_ctrl;
> +        break;
> +    case A_IRQ_CTRL_ACK:
> +        val = s->irq_ctrl_ack;
> +        break;
> +    case A_GERROR:
> +        val = s->gerror;
> +        break;
> +    case A_GERRORN:
> +        val = s->gerrorn;
> +        break;
> +    case A_GERROR_IRQ_CFG0: /* 64b */
> +        val = smmu_read64(s->gerror_irq_cfg0, 0, size);
> +        break;
> +    case A_GERROR_IRQ_CFG0 + 4:
> +        val = smmu_read64(s->gerror_irq_cfg0, 4, size);
> +        break;
> +    case A_GERROR_IRQ_CFG1:
> +        val = s->gerror_irq_cfg1;
> +        break;
> +    case A_GERROR_IRQ_CFG2:
> +        val = s->gerror_irq_cfg2;
> +        break;
> +    case A_STRTAB_BASE: /* 64b */
> +        val = smmu_read64(s->strtab_base, 0, size);
> +        break;
> +    case A_STRTAB_BASE + 4: /* 64b */
> +        val = smmu_read64(s->strtab_base, 4, size);
> +        break;
> +    case A_STRTAB_BASE_CFG:
> +        val = s->strtab_base_cfg;
> +        break;
> +    case A_CMDQ_BASE: /* 64b */
> +        val = smmu_read64(s->cmdq.base, 0, size);
> +        break;
> +    case A_CMDQ_BASE + 4:
> +        val = smmu_read64(s->cmdq.base, 4, size);
> +        break;
> +    case A_CMDQ_PROD:
> +        val = s->cmdq.prod;
> +        break;
> +    case A_CMDQ_CONS:
> +        val = s->cmdq.cons;
> +        break;
> +    case A_EVENTQ_BASE: /* 64b */
> +        val = smmu_read64(s->eventq.base, 0, size);
> +        break;
> +    case A_EVENTQ_BASE + 4: /* 64b */
> +        val = smmu_read64(s->eventq.base, 4, size);
> +        break;
> +    case A_EVENTQ_PROD:
> +        val = s->eventq.prod;
> +        break;
> +    case A_EVENTQ_CONS:
> +        val = s->eventq.cons;
> +        break;
> +    default:
> +        error_report("%s unhandled access at 0x%"PRIx64, __func__, addr);

This should be a LOG_GUEST_ERROR (if there are registers we don't
implement, you can define the A_* constants for them and have those
do a LOG_UNIMP.)

> +        break;
> +    }
> +
> +    trace_smmuv3_read_mmio(addr, val, size);
> +    return val;
> +}
> +

> +static void smmu_realize(DeviceState *d, Error **errp)
> +{
> +    SMMUState *sys = ARM_SMMU(d);
> +    SMMUv3State *s = ARM_SMMUV3(sys);
> +    SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
> +    SysBusDevice *dev = SYS_BUS_DEVICE(d);
> +    Error *local_err = NULL;
> +
> +    c->parent_realize(d, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    memory_region_init_io(&sys->iomem, OBJECT(s),
> +                          &smmu_mem_ops, sys, TYPE_ARM_SMMUV3, 0x20000);
> +
> +    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);

Nothing ever modifies this later, so I don't think you need to
do the g_strdup()? (The declaration of the struct field should
probably have a 'const'.)

> +
> +    sysbus_init_mmio(dev, &sys->iomem);
> +
> +    smmu_init_irq(s, dev);
> +}
> +
> +static const VMStateDescription vmstate_smmuv3 = {
> +    .name = "smmuv3",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT32(features, SMMUv3State),
> +        VMSTATE_UINT8(sid_size, SMMUv3State),
> +        VMSTATE_UINT8(sid_split, SMMUv3State),
> +
> +        VMSTATE_UINT32_ARRAY(idr, SMMUv3State, 6),

These are constant ID registers, right? You don't need to
migrate anything that's a fixed value set when the device
is created and never modified.

> +        VMSTATE_UINT32(iidr, SMMUv3State),
> +        VMSTATE_UINT32_ARRAY(cr, SMMUv3State, 3),
> +        VMSTATE_UINT32(cr0ack, SMMUv3State),
> +        VMSTATE_UINT32(statusr, SMMUv3State),
> +        VMSTATE_UINT32(irq_ctrl, SMMUv3State),
> +        VMSTATE_UINT32(irq_ctrl_ack, SMMUv3State),
> +        VMSTATE_UINT32(gerror, SMMUv3State),
> +        VMSTATE_UINT32(gerrorn, SMMUv3State),
> +        VMSTATE_UINT64(gerror_irq_cfg0, SMMUv3State),
> +        VMSTATE_UINT32(gerror_irq_cfg1, SMMUv3State),
> +        VMSTATE_UINT32(gerror_irq_cfg2, SMMUv3State),
> +        VMSTATE_UINT64(strtab_base, SMMUv3State),
> +        VMSTATE_UINT32(strtab_base_cfg, SMMUv3State),
> +        VMSTATE_UINT64(eventq_irq_cfg0, SMMUv3State),
> +        VMSTATE_UINT32(eventq_irq_cfg1, SMMUv3State),
> +        VMSTATE_UINT32(eventq_irq_cfg2, SMMUv3State),
> +
> +        VMSTATE_UINT64(cmdq.base, SMMUv3State),
> +        VMSTATE_UINT32(cmdq.prod, SMMUv3State),
> +        VMSTATE_UINT32(cmdq.cons, SMMUv3State),
> +        VMSTATE_UINT8(cmdq.entry_size, SMMUv3State),
> +        VMSTATE_UINT64(eventq.base, SMMUv3State),
> +        VMSTATE_UINT32(eventq.prod, SMMUv3State),
> +        VMSTATE_UINT32(eventq.cons, SMMUv3State),
> +        VMSTATE_UINT8(eventq.entry_size, SMMUv3State),

It's a little neater to define a separate VMStateDescription
for the SMMUQueue struct and then just use it twice here with
VMSTATE_STRUCT. (Example in hw/dma/pl330.c for vmstate_pl330_chan.)

Also, isn't the entry_size constant and fixed at device creation?
If so, you don't need to migrate it.

> +
> +        VMSTATE_END_OF_LIST(),
> +    },
> +};
> +
> +static void smmuv3_instance_init(Object *obj)
> +{
> +    /* Nothing much to do here as of now */
> +}
> +
> +static void smmuv3_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    SMMUv3Class *c = ARM_SMMUV3_CLASS(klass);
> +
> +    dc->vmsd    = &vmstate_smmuv3;

It would be nice to go through the patchset and remove these
unnecessary extra spaces in various assignments and declarations.

> +    device_class_set_parent_reset(dc, smmu_reset, &c->parent_reset);
> +    c->parent_realize = dc->realize;
> +    dc->realize = smmu_realize;
> +}


> +type_init(smmuv3_register_types)
> +

Stray blank line at end of file.

> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 3584974..64d2b9b 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -12,3 +12,6 @@ smmu_ptw_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr,
>  smmu_ptw_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
>  smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
>  smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
> +
> +#hw/arm/smmuv3.c
> +smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"

"hwaddr" isn't valid as a type for trace event parameters. This should
be uint64_t; otherwise trace backends like 'ust' won't build. (There's a
patch on list to tracetool that will make this mistake a compile error
for all backends, so it's easier to catch.)


> +#ifndef HW_ARM_SMMUV3_H
> +#define HW_ARM_SMMUV3_H
> +
> +#include "hw/arm/smmu-common.h"
> +#include "hw/registerfields.h"
> +
> +#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
> +
> +#define SMMU_NREGS            0x200
> +
> +typedef struct SMMUQueue {
> +     hwaddr base;
> +     uint32_t prod;
> +     uint32_t cons;
> +     uint8_t entry_size;
> +} SMMUQueue;
> +
> +typedef struct SMMUv3State {
> +    SMMUState     smmu_state;
> +
> +    /* Local cache of most-frequently used registers */
> +#define SMMU_FEATURE_2LVL_STE (1 << 0)

Minor thing, I think it would be better for this define to
not be in the middle of the struct definition.

> +    uint32_t features;
> +    uint8_t sid_size;
> +    uint8_t sid_split;
> +
> +    uint32_t idr[6];
> +    uint32_t iidr;
> +    uint32_t cr[3];
> +    uint32_t cr0ack;
> +    uint32_t statusr;
> +    uint32_t irq_ctrl;
> +    uint32_t irq_ctrl_ack;
> +    uint32_t gerror;
> +    uint32_t gerrorn;
> +    uint64_t gerror_irq_cfg0;
> +    uint32_t gerror_irq_cfg1;
> +    uint32_t gerror_irq_cfg2;
> +    uint64_t strtab_base;
> +    uint32_t strtab_base_cfg;
> +    uint64_t eventq_irq_cfg0;
> +    uint32_t eventq_irq_cfg1;
> +    uint32_t eventq_irq_cfg2;
> +
> +    SMMUQueue eventq, cmdq;
> +
> +    qemu_irq     irq[4];
> +} SMMUv3State;
> +
> +typedef enum {
> +    SMMU_IRQ_EVTQ,
> +    SMMU_IRQ_PRIQ,
> +    SMMU_IRQ_CMD_SYNC,
> +    SMMU_IRQ_GERROR,
> +} SMMUIrq;
> +
> +typedef struct {
> +    /*< private >*/
> +    SMMUBaseClass smmu_base_class;
> +    /*< public >*/
> +
> +    DeviceRealize parent_realize;
> +    DeviceReset   parent_reset;
> +} SMMUv3Class;
> +
> +#define TYPE_ARM_SMMUV3   "arm-smmuv3"
> +#define ARM_SMMUV3(obj) OBJECT_CHECK(SMMUv3State, (obj), TYPE_ARM_SMMUV3)
> +#define ARM_SMMUV3_CLASS(klass)                              \
> +    OBJECT_CLASS_CHECK(SMMUv3Class, (klass), TYPE_ARM_SMMUV3)
> +#define ARM_SMMUV3_GET_CLASS(obj) \
> +     OBJECT_GET_CLASS(SMMUv3Class, (obj), TYPE_ARM_SMMUV3)
> +
> +#endif

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers Eric Auger
@ 2018-03-08 17:49   ` Peter Maydell
  2018-03-09 14:03     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-08 17:49 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> We introduce some helpers to handle wired IRQs and especially
> GERROR interrupt. SMMU writes GERROR register on GERROR event
> and SW acks GERROR interrupts by setting GERRORn.
>
> The Wired interrupts are edge sensitive hence the pulse usage.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v7 -> v8:
> - remove SMMU_PENDING_GERRORS macro
> - properly toggle gerror
> - properly sanitize gerrorn write
> ---
>  hw/arm/smmuv3-internal.h | 10 ++++++++
>  hw/arm/smmuv3.c          | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |  3 +++
>  3 files changed, 77 insertions(+)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 5be8303..40b39a1 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -152,4 +152,14 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>      return extract64(r, offset << 3, 32);
>  }
>
> +/* Interrupts */
> +
> +#define smmuv3_eventq_irq_enabled(s)                   \
> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, EVENTQ_IRQEN))
> +#define smmuv3_gerror_irq_enabled(s)                  \
> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))

These are only ever used in smmuv3.c, so you can just move them
to there (and make them inline functions, ideally).

> +
> +void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
> +void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);

I guess these are global to avoid the compiler complaining about
unused static functions at this point? If so, add a comment
saying so, and flip them back to being static functions when
their callers get added. (Or just add the callers here, if they're
not too complicated.)

> +
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index dc03c9e..8779d3f 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -30,6 +30,70 @@
>  #include "hw/arm/smmuv3.h"
>  #include "smmuv3-internal.h"
>
> +/**
> + * smmuv3_trigger_irq - pulse @irq if enabled and update
> + * GERROR register in case of GERROR interrupt
> + *
> + * @irq: irq type
> + * @gerror_mask: mask of gerrors to toggle (relevant if @irq is GERROR)
> + */
> +void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
> +{
> +
> +    bool pulse = false;
> +
> +    switch (irq) {
> +    case SMMU_IRQ_EVTQ:
> +        pulse = smmuv3_eventq_irq_enabled(s);
> +        break;
> +    case SMMU_IRQ_PRIQ:
> +        error_setg(&error_fatal, "PRI not supported");

This should either assert() if it would be a bug in the rest
of the smmu code, or LOG_UNIMP if the guest can trigger it.

> +        break;
> +    case SMMU_IRQ_CMD_SYNC:
> +        pulse = true;
> +        break;
> +    case SMMU_IRQ_GERROR:
> +    {
> +        uint32_t pending = s->gerror ^ s->gerrorn;
> +        uint32_t new_gerrors = ~pending & gerror_mask;
> +
> +        if (!new_gerrors) {
> +            /* only toggle non pending errors */
> +            return;
> +        }
> +        s->gerror ^= new_gerrors;
> +        trace_smmuv3_write_gerror(new_gerrors, s->gerror);
> +
> +        /* pulse the GERROR irq only if all previous gerrors were acked */

It's not entirely clear to me that this is correct; should
we generate only one pulse if the implementation raises error A,
and then later raises error B before software acknowledges A ?
There's some language in 3.18 about the SMMU implementation
being able to coalesce events and identical interrupts, but
I think that would mean that we could skip raising the first
pulse for error A if error B arrived sufficiently quickly after it.
(Not something we're going to care about for a s/w model.)

I think the right behaviour is probably that we should pulse
the interrupt if there are any new gerrors, which is to
say to drop this !pending test:

> +        pulse = smmuv3_gerror_irq_enabled(s) && !pending;
> +        break;
> +    }
> +    }
> +    if (pulse) {
> +            trace_smmuv3_trigger_irq(irq);
> +            qemu_irq_pulse(s->irq[irq]);
> +    }
> +}
> +
> +void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
> +{
> +    uint32_t pending = s->gerror ^ s->gerrorn;
> +    uint32_t toggled = s->gerrorn ^ new_gerrorn;
> +    uint32_t acked;
> +
> +    if (toggled & ~pending) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "guest toggles non pending errors = 0x%x\n",
> +                      toggled & ~pending);
> +    }
> +
> +    /* Make sure SW does not toggle irqs that are not active */
> +    acked = toggled & pending;
> +    s->gerrorn ^= acked;
> +

I don't think this behaviour is correct. From the hardware's
perspective, we should just take the value the user writes
to SMMU_GERRORN and put it in the register (and update the
status of the irq accordingly).

It is CONSTRAINED UNPREDICTABLE whether we actually raise an
interrupt if the guest toggles a field that corresponds to an
inactive error, so we should just do whatever is easiest.
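As a rough illustration of the two points above (pulse on any new error, and store GERRORN writes as-is), here is a standalone sketch. It is not code from the patch: the GerrorRegs struct and the gerror_* helper names are hypothetical, and the IRQ enable check is deliberately left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two registers; not the patch's SMMUv3State. */
typedef struct {
    uint32_t gerror;   /* toggled by the emulated hardware */
    uint32_t gerrorn;  /* written by the guest to acknowledge */
} GerrorRegs;

/* An error is active while its bit differs between GERROR and GERRORN. */
static inline uint32_t gerror_pending(const GerrorRegs *r)
{
    return r->gerror ^ r->gerrorn;
}

/*
 * Raise the errors in @mask. Bits that are already pending are left alone.
 * Returns true if at least one new error became active, which is when an
 * interrupt pulse would be due (subject to the IRQ enable bit, not
 * modelled here).
 */
static inline bool gerror_raise(GerrorRegs *r, uint32_t mask)
{
    uint32_t new_errors = mask & ~gerror_pending(r);

    r->gerror ^= new_errors;
    return new_errors != 0;
}

/* Guest write to GERRORN: store the written value unmodified. */
static inline void gerrorn_write(GerrorRegs *r, uint32_t value)
{
    r->gerrorn = value;
}
```

With this shape, raising error B while error A is still unacknowledged still reports a new error, and the guest acknowledges everything by writing the current GERROR value back to GERRORN.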

> +    trace_smmuv3_write_gerrorn(acked, s->gerrorn);
> +}
> +
>  static void smmuv3_init_regs(SMMUv3State *s)
>  {
>      /**
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 64d2b9b..2ddae40 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -15,3 +15,6 @@ smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "base
>
>  #hw/arm/smmuv3.c
>  smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
> +smmuv3_trigger_irq(int irq) "irq=%d"
> +smmuv3_write_gerror(uint32_t toggled, uint32_t gerror) "toggled=0x%x, new gerror=0x%x"
> +smmuv3_write_gerrorn(uint32_t acked, uint32_t gerrorn) "acked=0x%x, new gerrorn=0x%x"

Capitalizing names of registers like GERROR in trace messages would
make them match the convention in the SMMUv3 spec.

> --
> 2.5.5
>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 06/14] hw/arm/smmuv3: Queue helpers
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 06/14] hw/arm/smmuv3: Queue helpers Eric Auger
@ 2018-03-08 18:28   ` Peter Maydell
  2018-03-09 16:43     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-08 18:28 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> We introduce helpers to read/write into the command and event
> circular queues.
>
> smmuv3_write_eventq and smmuv3_cmq_consume will become static
> in subsequent patches.
>
> Invalidation commands are not yet dealt with. We do not cache
> data that need to be invalidated. This will change with vhost
> integration.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v8 -> v9:
> - fix CMD_SSID & CMD_ADDR + some renamings
> - do cons increment after the execution of the command
> - add Q_INCONSISTENT()
>
> v7 -> v8
> - use address_space_rw
> - helpers inspired from spec
> ---
>  hw/arm/smmuv3-internal.h | 150 +++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmuv3.c          | 162 +++++++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |   4 ++
>  3 files changed, 316 insertions(+)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 40b39a1..c0771ce 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -162,4 +162,154 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>  void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
>  void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
>
> +/* Queue Handling */
> +
> +#define LOG2SIZE(q)        extract64((q)->base, 0, 5)
> +#define BASE(q)            ((q)->base & SMMU_BASE_ADDR_MASK)

These are both very generic names for things in header files.

Looking at the required behaviour of the *_BASE registers,
if the LOG2SIZE field is written to a value larger than the maximum,
it is supposed to read back as the value written but must behave
as if it was set to the maximum. That suggests to me that your
SMMUQueue struct should have a "log2size" field which is set when
the guest writes to *_BASE (which is where you can cap it to the
max value).
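A minimal sketch of that idea, assuming a hypothetical QueueSketch struct and a max_log2size argument that the caller would take from the relevant ID register field (both names are made up for illustration, not taken from the patch):

```c
#include <stdint.h>

/* Hypothetical queue state; the real SMMUQueue has more fields. */
typedef struct {
    uint64_t base;      /* raw register value, reads back as written */
    uint8_t  log2size;  /* effective size actually used by the model */
} QueueSketch;

/*
 * Handle a guest write to a *_BASE register. The raw value is kept so
 * that reads return what was written, while the effective log2size is
 * capped to the implementation maximum.
 */
static inline void queue_base_write(QueueSketch *q, uint64_t value,
                                    uint8_t max_log2size)
{
    uint8_t requested = value & 0x1f;   /* LOG2SIZE is bits [4:0] */

    q->base = value;
    q->log2size = requested > max_log2size ? max_log2size : requested;
}
```

The index and wrap masks can then be derived from q->log2size rather than re-extracting the field from the base register on every access.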

> +#define WRAP_MASK(q)       (1 << LOG2SIZE(q))
> +#define INDEX_MASK(q)      ((1 << LOG2SIZE(q)) - 1)
> +#define WRAP_INDEX_MASK(q) ((1 << (LOG2SIZE(q) + 1)) - 1)

WRAP_INDEX_MASK is unused (but see below for a possible use)

> +
> +#define Q_CONS_ENTRY(q)  (BASE(q) + \
> +                          (q)->entry_size * ((q)->cons & INDEX_MASK(q)))
> +#define Q_PROD_ENTRY(q)  (BASE(q) + \
> +                          (q)->entry_size * ((q)->prod & INDEX_MASK(q)))
> +
> +#define Q_CONS(q) ((q)->cons & INDEX_MASK(q))
> +#define Q_PROD(q) ((q)->prod & INDEX_MASK(q))

If you put these a bit earlier you can use them in the definitions
of Q_CONS_ENTRY and Q_PROD_ENTRY.

> +
> +#define Q_CONS_WRAP(q) (((q)->cons & WRAP_MASK(q)) >> LOG2SIZE(q))
> +#define Q_PROD_WRAP(q) (((q)->prod & WRAP_MASK(q)) >> LOG2SIZE(q))
> +
> +#define Q_FULL(q) \
> +    (((((q)->cons) & INDEX_MASK(q)) == \
> +      (((q)->prod) & INDEX_MASK(q))) && \
> +     ((((q)->cons) & WRAP_MASK(q)) != \
> +      (((q)->prod) & WRAP_MASK(q))))

You could write this as
   ((cons ^ prod) & WRAP_INDEX_MASK) == WRAP_MASK

> +
> +#define Q_EMPTY(q) \
> +    (((((q)->cons) & INDEX_MASK(q)) == \
> +      (((q)->prod) & INDEX_MASK(q))) && \
> +     ((((q)->cons) & WRAP_MASK(q)) == \
> +      (((q)->prod) & WRAP_MASK(q))))

and this as
   (cons & WRAP_INDEX_MASK) == (prod & WRAP_INDEX_MASK)

(or as ((cons ^ prod) & WRAP_INDEX_MASK) == 0, but that's unnecessarily
obscure I think.)


This is all a bit macro-heavy. Do these really all need to be macros
rather than functions?
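For illustration, the full and empty tests written as inline functions might look roughly like this. This is a standalone sketch with a hypothetical RingSketch struct, and it assumes log2size is small enough (below 31) that the shifts are well defined:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue state; only the fields the checks need. */
typedef struct {
    uint32_t prod;
    uint32_t cons;
    uint8_t  log2size;  /* assumed < 31 so the shifts below are defined */
} RingSketch;

/* The single wrap bit, just above the index bits. */
static inline uint32_t ring_wrap_mask(const RingSketch *q)
{
    return 1u << q->log2size;
}

/* Index bits plus the wrap bit. */
static inline uint32_t ring_wrap_index_mask(const RingSketch *q)
{
    return (1u << (q->log2size + 1)) - 1;
}

/* Full: same index, different wrap bit. */
static inline bool ring_full(const RingSketch *q)
{
    return ((q->cons ^ q->prod) & ring_wrap_index_mask(q)) ==
           ring_wrap_mask(q);
}

/* Empty: same index and same wrap bit. */
static inline bool ring_empty(const RingSketch *q)
{
    return ((q->cons ^ q->prod) & ring_wrap_index_mask(q)) == 0;
}
```

Functions also give the compiler a chance to type-check the argument, which the macro versions do not.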

> +
> +#define Q_INCONSISTENT(q) \
> +((((((q)->prod) & INDEX_MASK(q)) > (((q)->cons) & INDEX_MASK(q))) && \
> +((((q)->prod) & WRAP_MASK(q)) != (((q)->cons) & WRAP_MASK(q)))) || \
> +(((((q)->prod) & INDEX_MASK(q)) < (((q)->cons) & INDEX_MASK(q))) && \
> +((((q)->prod) & WRAP_MASK(q)) == (((q)->cons) & WRAP_MASK(q))))) \
> +

This never seems to be used. (Also it has a stray trailing '\',
and isn't indented very clearly.)

> +#define SMMUV3_CMDQ_ENABLED(s) \
> +     (FIELD_EX32(s->cr[0], CR0, CMDQEN))
> +
> +#define SMMUV3_EVENTQ_ENABLED(s) \
> +     (FIELD_EX32(s->cr[0], CR0, EVENTQEN))
> +
> +static inline void smmu_write_cmdq_err(SMMUv3State *s, uint32_t err_type)
> +{
> +    s->cmdq.cons = FIELD_DP32(s->cmdq.cons, CMDQ_CONS, ERR, err_type);
> +}
> +
> +void smmuv3_write_eventq(SMMUv3State *s, Evt *evt);
> +
> +/* Commands */
> +
> +enum {
> +    SMMU_CMD_PREFETCH_CONFIG = 0x01,
> +    SMMU_CMD_PREFETCH_ADDR,
> +    SMMU_CMD_CFGI_STE,
> +    SMMU_CMD_CFGI_STE_RANGE,
> +    SMMU_CMD_CFGI_CD,
> +    SMMU_CMD_CFGI_CD_ALL,
> +    SMMU_CMD_CFGI_ALL,
> +    SMMU_CMD_TLBI_NH_ALL     = 0x10,
> +    SMMU_CMD_TLBI_NH_ASID,
> +    SMMU_CMD_TLBI_NH_VA,
> +    SMMU_CMD_TLBI_NH_VAA,
> +    SMMU_CMD_TLBI_EL3_ALL    = 0x18,
> +    SMMU_CMD_TLBI_EL3_VA     = 0x1a,
> +    SMMU_CMD_TLBI_EL2_ALL    = 0x20,
> +    SMMU_CMD_TLBI_EL2_ASID,
> +    SMMU_CMD_TLBI_EL2_VA,
> +    SMMU_CMD_TLBI_EL2_VAA,  /* 0x23 */
> +    SMMU_CMD_TLBI_S12_VMALL  = 0x28,
> +    SMMU_CMD_TLBI_S2_IPA     = 0x2a,
> +    SMMU_CMD_TLBI_NSNH_ALL   = 0x30,
> +    SMMU_CMD_ATC_INV         = 0x40,
> +    SMMU_CMD_PRI_RESP,
> +    SMMU_CMD_RESUME          = 0x44,
> +    SMMU_CMD_STALL_TERM,
> +    SMMU_CMD_SYNC,          /* 0x46 */
> +};
> +
> +static const char *cmd_stringify[] = {
> +    [SMMU_CMD_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
> +    [SMMU_CMD_PREFETCH_ADDR]   = "SMMU_CMD_PREFETCH_ADDR",
> +    [SMMU_CMD_CFGI_STE]        = "SMMU_CMD_CFGI_STE",
> +    [SMMU_CMD_CFGI_STE_RANGE]  = "SMMU_CMD_CFGI_STE_RANGE",
> +    [SMMU_CMD_CFGI_CD]         = "SMMU_CMD_CFGI_CD",
> +    [SMMU_CMD_CFGI_CD_ALL]     = "SMMU_CMD_CFGI_CD_ALL",
> +    [SMMU_CMD_CFGI_ALL]        = "SMMU_CMD_CFGI_ALL",
> +    [SMMU_CMD_TLBI_NH_ALL]     = "SMMU_CMD_TLBI_NH_ALL",
> +    [SMMU_CMD_TLBI_NH_ASID]    = "SMMU_CMD_TLBI_NH_ASID",
> +    [SMMU_CMD_TLBI_NH_VA]      = "SMMU_CMD_TLBI_NH_VA",
> +    [SMMU_CMD_TLBI_NH_VAA]     = "SMMU_CMD_TLBI_NH_VAA",
> +    [SMMU_CMD_TLBI_EL3_ALL]    = "SMMU_CMD_TLBI_EL3_ALL",
> +    [SMMU_CMD_TLBI_EL3_VA]     = "SMMU_CMD_TLBI_EL3_VA",
> +    [SMMU_CMD_TLBI_EL2_ALL]    = "SMMU_CMD_TLBI_EL2_ALL",
> +    [SMMU_CMD_TLBI_EL2_ASID]   = "SMMU_CMD_TLBI_EL2_ASID",
> +    [SMMU_CMD_TLBI_EL2_VA]     = "SMMU_CMD_TLBI_EL2_VA",
> +    [SMMU_CMD_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
> +    [SMMU_CMD_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
> +    [SMMU_CMD_TLBI_S2_IPA]     = "SMMU_CMD_TLBI_S2_IPA",
> +    [SMMU_CMD_TLBI_NSNH_ALL]   = "SMMU_CMD_TLBI_NSNH_ALL",
> +    [SMMU_CMD_ATC_INV]         = "SMMU_CMD_ATC_INV",
> +    [SMMU_CMD_PRI_RESP]        = "SMMU_CMD_PRI_RESP",
> +    [SMMU_CMD_RESUME]          = "SMMU_CMD_RESUME",
> +    [SMMU_CMD_STALL_TERM]      = "SMMU_CMD_STALL_TERM",
> +    [SMMU_CMD_SYNC]            = "SMMU_CMD_SYNC",
> +};
> +
> +#define SMMU_CMD_STRING(type) (                                      \
> +(type < ARRAY_SIZE(cmd_stringify)) ? cmd_stringify[type] : "UNKNOWN" \
> +)

If this was a function you'd know what the type of 'type' is
and so whether it needed to have a >= 0 check on it. Also it
will hand you a NULL pointer for a value that's inside the
array size but not initialized, like 0x24.
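Something along these lines would cover both cases. It is only a sketch: the table is a shortened excerpt and the *_sketch names are invented so that it stands alone, though the opcode values match the enum above.

```c
#include <stddef.h>
#include <stdint.h>

/* Shortened, hypothetical excerpt of the command table. */
enum {
    CMD_SKETCH_PREFETCH_CONFIG = 0x01,
    CMD_SKETCH_TLBI_EL2_VAA    = 0x23,
    CMD_SKETCH_TLBI_S12_VMALL  = 0x28,
    CMD_SKETCH_SYNC            = 0x46,
};

static const char *const cmd_names_sketch[] = {
    [CMD_SKETCH_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
    [CMD_SKETCH_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
    [CMD_SKETCH_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
    [CMD_SKETCH_SYNC]            = "SMMU_CMD_SYNC",
};

/*
 * An unsigned parameter means no ">= 0" check is needed. Holes in the
 * designated-initializer table are NULL, so they are mapped to
 * "UNKNOWN" instead of being handed to the caller.
 */
static inline const char *cmd_name_sketch(uint32_t type)
{
    size_t n = sizeof(cmd_names_sketch) / sizeof(cmd_names_sketch[0]);

    if (type >= n || cmd_names_sketch[type] == NULL) {
        return "UNKNOWN";
    }
    return cmd_names_sketch[type];
}
```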

> +
> +/* CMDQ fields */
> +
> +typedef enum {
> +    SMMU_CERROR_NONE = 0,
> +    SMMU_CERROR_ILL,
> +    SMMU_CERROR_ABT,
> +    SMMU_CERROR_ATC_INV_SYNC,
> +} SMMUCmdError;
> +
> +enum { /* Command completion notification */
> +    CMD_SYNC_SIG_NONE,
> +    CMD_SYNC_SIG_IRQ,
> +    CMD_SYNC_SIG_SEV,
> +};
> +
> +#define CMD_TYPE(x)         extract32((x)->word[0], 0 , 8)
> +#define CMD_SSEC(x)         extract32((x)->word[0], 10, 1)
> +#define CMD_SSV(x)          extract32((x)->word[0], 11, 1)
> +#define CMD_RESUME_AC(x)    extract32((x)->word[0], 12, 1)
> +#define CMD_RESUME_AB(x)    extract32((x)->word[0], 13, 1)
> +#define CMD_SYNC_CS(x)      extract32((x)->word[0], 12, 2)
> +#define CMD_SSID(x)         extract32((x)->word[0], 12, 20)
> +#define CMD_SID(x)          ((x)->word[1])
> +#define CMD_VMID(x)         extract32((x)->word[1], 0 , 16)
> +#define CMD_ASID(x)         extract32((x)->word[1], 16, 16)
> +#define CMD_RESUME_STAG(x)  extract32((x)->word[2], 0 , 16)
> +#define CMD_RESP(x)         extract32((x)->word[2], 11, 2)
> +#define CMD_LEAF(x)         extract32((x)->word[2], 0 , 1)
> +#define CMD_STE_RANGE(x)    extract32((x)->word[2], 0 , 5)
> +#define CMD_ADDR(x) ({                                        \
> +            uint64_t high = (uint64_t)(x)->word[3];           \
> +            uint64_t low = extract32((x)->word[2], 12, 20);    \
> +            uint64_t addr = high << 32 | (low << 12);         \
> +            addr;                                             \
> +        })
> +
> +int smmuv3_cmdq_consume(SMMUv3State *s);
> +
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 8779d3f..0b57215 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -94,6 +94,72 @@ void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
>      trace_smmuv3_write_gerrorn(acked, s->gerrorn);
>  }
>
> +static uint32_t queue_index_inc(uint32_t val,
> +                                uint32_t qidx_mask, uint32_t qwrap_mask)
> +{
> +    uint32_t i = (val + 1) & qidx_mask;
> +
> +    if (i <= (val & qidx_mask)) {
> +        i = ((val & qwrap_mask) ^ qwrap_mask) | i;
> +    } else {
> +        i = (val & qwrap_mask) | i;
> +    }
> +    return i;

This is unnecessarily complicated -- an index increment is just
   val = (val + 1) & WRAP_INDEX_MASK;
which will automatically flip the wrap bit as required.
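A small standalone sketch of that, with a made-up function name and log2size passed in directly (assumed to be below 31 so the shift is defined):

```c
#include <stdint.h>

/*
 * Increment a producer/consumer value that holds log2size index bits
 * with one wrap bit directly above them. Adding 1 carries out of the
 * index into the wrap bit, and the mask discards any carry beyond it.
 */
static inline uint32_t ring_index_inc(uint32_t val, uint8_t log2size)
{
    uint32_t wrap_index_mask = (1u << (log2size + 1)) - 1;

    return (val + 1) & wrap_index_mask;
}
```

For a 4-entry queue (log2size = 2), incrementing 3 gives 4 (index 0, wrap bit set), and incrementing 7 gives 0 (index 0, wrap bit clear).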

> +}
> +
> +static inline void queue_prod_incr(SMMUQueue *q)
> +{
> +    q->prod = queue_index_inc(q->prod, INDEX_MASK(q), WRAP_MASK(q));

Doesn't this trash the ERR code in bits [30:24], or are you
keeping that somewhere else for efficiency?
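If the ERR field does end up living in the same 32-bit value as the index, one way to leave it alone is to touch only the index and wrap bits. The sketch below is an assumption about how that could look, with a hypothetical function name; it is not what the patch does.

```c
#include <stdint.h>

/*
 * Increment the index/wrap portion of @val and keep every bit above
 * the wrap bit (for example an error code in bits [30:24]) unchanged.
 * log2size is assumed to be below 31 so the shift is defined.
 */
static inline uint32_t ring_index_inc_keep_upper(uint32_t val,
                                                 uint8_t log2size)
{
    uint32_t wrap_index_mask = (1u << (log2size + 1)) - 1;

    return (val & ~wrap_index_mask) | ((val + 1) & wrap_index_mask);
}
```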

> +}
> +
> +static inline void queue_cons_incr(SMMUQueue *q)
> +{
> +    q->cons = queue_index_inc(q->cons, INDEX_MASK(q), WRAP_MASK(q));
> +}
> +
> +static inline MemTxResult queue_read(SMMUQueue *q, void *data)
> +{
> +    dma_addr_t addr = Q_CONS_ENTRY(q);
> +
> +    return dma_memory_read(&address_space_memory, addr,
> +                           (uint8_t *)data, q->entry_size);

Does the compiler complain if you don't provide this cast?

> +}
> +
> +static void queue_write(SMMUQueue *q, void *data)
> +{
> +    dma_addr_t addr = Q_PROD_ENTRY(q);
> +    MemTxResult ret;
> +
> +    ret = dma_memory_write(&address_space_memory, addr,
> +                           (uint8_t *)data, q->entry_size);
> +    if (ret != MEMTX_OK) {
> +        return;

Shouldn't we record or return this error to the caller,
like queue_read() does, rather than throwing it away?
I think that for the event queue (which is the only user
here) this should cause an EVENTQ_ABT_ERR.
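To show the shape of a write helper that reports the failure, here is a standalone sketch. Everything in it is hypothetical: the memory access is abstracted behind a callback instead of dma_memory_write() so the example can run on its own, and the two backend_* functions exist only to exercise both paths.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    WRITE_SKETCH_OK = 0,
    WRITE_SKETCH_ERROR,
} WriteResultSketch;

/* Stand-in for the real memory write; returns a status. */
typedef WriteResultSketch (*WriteFnSketch)(uint64_t addr, const void *buf,
                                           size_t len);

typedef struct {
    uint64_t base;
    uint32_t prod;
    uint8_t  log2size;   /* assumed < 31 */
    uint8_t  entry_size;
} EventRingSketch;

/*
 * Write one entry at the producer index. The producer index only
 * advances on success, and the status is returned so the caller can
 * raise the appropriate error (for the event queue, an abort error).
 */
static inline WriteResultSketch ring_write_sketch(EventRingSketch *q,
                                                  const void *data,
                                                  WriteFnSketch wr)
{
    uint32_t index_mask = (1u << q->log2size) - 1;
    uint32_t wrap_index_mask = (1u << (q->log2size + 1)) - 1;
    uint64_t addr = q->base +
                    (uint64_t)q->entry_size * (q->prod & index_mask);
    WriteResultSketch ret = wr(addr, data, q->entry_size);

    if (ret != WRITE_SKETCH_OK) {
        return ret;
    }
    q->prod = (q->prod + 1) & wrap_index_mask;
    return WRITE_SKETCH_OK;
}

/* Demo backends: one always succeeds, one always fails. */
static inline WriteResultSketch backend_ok(uint64_t addr, const void *buf,
                                           size_t len)
{
    (void)addr; (void)buf; (void)len;
    return WRITE_SKETCH_OK;
}

static inline WriteResultSketch backend_fail(uint64_t addr, const void *buf,
                                             size_t len)
{
    (void)addr; (void)buf; (void)len;
    return WRITE_SKETCH_ERROR;
}
```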

> +    }
> +
> +    queue_prod_incr(q);
> +}
> +
> +void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
> +{
> +    SMMUQueue *q = &s->eventq;
> +    bool q_empty = Q_EMPTY(q);
> +    bool q_full = Q_FULL(q);

You only use these once each, and they're not very complicated
expressions, so you might as well just have the uses be
"if (Q_FULL(q)) { ..." &c.

> +
> +    if (!SMMUV3_EVENTQ_ENABLED(s)) {
> +        return;
> +    }
> +
> +    if (q_full) {
> +        return;
> +    }
> +
> +    queue_write(q, evt);
> +
> +    if (q_empty) {
> +        smmuv3_trigger_irq(s, SMMU_IRQ_EVTQ, 0);
> +    }
> +}
> +
>  static void smmuv3_init_regs(SMMUv3State *s)
>  {
>      /**
> @@ -133,6 +199,102 @@ static void smmuv3_init_regs(SMMUv3State *s)
>      s->sid_split = 0;
>  }
>
> +int smmuv3_cmdq_consume(SMMUv3State *s)
> +{
> +    SMMUCmdError cmd_error = SMMU_CERROR_NONE;
> +    SMMUQueue *q = &s->cmdq;
> +    uint32_t type = 0;
> +
> +    if (!SMMUV3_CMDQ_ENABLED(s)) {
> +        return 0;
> +    }
> +    /*
> +     * some commands depend on register values, as above. In case those

What is "as above" referring to?

> +     * register values change while handling the command, spec says it
> +     * is UNPREDICTABLE whether the command is interpreted under the new
> +     * or old value.
> +     */
> +
> +    while (!Q_EMPTY(q)) {
> +        uint32_t pending = s->gerror ^ s->gerrorn;
> +        Cmd cmd;
> +
> +        trace_smmuv3_cmdq_consume(Q_PROD(q), Q_CONS(q),
> +                                  Q_PROD_WRAP(q), Q_CONS_WRAP(q));
> +
> +        if (FIELD_EX32(pending, GERROR, CMDQ_ERR)) {
> +            break;
> +        }
> +
> +        if (queue_read(q, &cmd) != MEMTX_OK) {
> +            cmd_error = SMMU_CERROR_ABT;
> +            break;
> +        }
> +
> +        type = CMD_TYPE(&cmd);
> +
> +        trace_smmuv3_cmdq_opcode(SMMU_CMD_STRING(type));
> +
> +        switch (type) {
> +        case SMMU_CMD_SYNC:
> +            if (CMD_SYNC_CS(&cmd) & CMD_SYNC_SIG_IRQ) {
> +                smmuv3_trigger_irq(s, SMMU_IRQ_CMD_SYNC, 0);
> +            }
> +            break;
> +        case SMMU_CMD_PREFETCH_CONFIG:
> +        case SMMU_CMD_PREFETCH_ADDR:
> +        case SMMU_CMD_CFGI_STE:
> +        case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
> +        case SMMU_CMD_CFGI_CD:
> +        case SMMU_CMD_CFGI_CD_ALL:
> +        case SMMU_CMD_TLBI_NH_ALL:
> +        case SMMU_CMD_TLBI_NH_ASID:
> +        case SMMU_CMD_TLBI_NH_VA:
> +        case SMMU_CMD_TLBI_NH_VAA:
> +        case SMMU_CMD_TLBI_EL3_ALL:
> +        case SMMU_CMD_TLBI_EL3_VA:
> +        case SMMU_CMD_TLBI_EL2_ALL:
> +        case SMMU_CMD_TLBI_EL2_ASID:
> +        case SMMU_CMD_TLBI_EL2_VA:
> +        case SMMU_CMD_TLBI_EL2_VAA:
> +        case SMMU_CMD_TLBI_S12_VMALL:
> +        case SMMU_CMD_TLBI_S2_IPA:
> +        case SMMU_CMD_TLBI_NSNH_ALL:
> +        case SMMU_CMD_ATC_INV:
> +        case SMMU_CMD_PRI_RESP:
> +        case SMMU_CMD_RESUME:
> +        case SMMU_CMD_STALL_TERM:
> +            trace_smmuv3_unhandled_cmd(type);
> +            break;
> +        default:
> +            cmd_error = SMMU_CERROR_ILL;
> +            error_report("Illegal command type: %d", CMD_TYPE(&cmd));

This isn't what error_report() is for. You can log it as a GUEST_ERROR.

> +            break;
> +        }
> +        if (cmd_error) {
> +            break;
> +        }
> +        /*
> +         * We only increment the cons index after the completion of
> +         * the command. We do that because the SYNC returns immediatly

"immediately"

> +         * and do not check the completion of previous commands

"does not"

> +         */
> +        queue_cons_incr(q);
> +    }
> +
> +    if (cmd_error) {
> +        error_report("Error on %s command execution: %d",
> +                     SMMU_CMD_STRING(type), cmd_error);

Again, not error_report(). Probably a good location for a trace_ point.

> +        smmu_write_cmdq_err(s, cmd_error);
> +        smmuv3_trigger_irq(s, SMMU_IRQ_GERROR, R_GERROR_CMDQ_ERR_MASK);
> +    }
> +
> +    trace_smmuv3_cmdq_consume_out(Q_PROD(q), Q_CONS(q),
> +                                  Q_PROD_WRAP(q), Q_CONS_WRAP(q));
> +
> +    return 0;
> +}
> +
>  static void smmu_write_mmio(void *opaque, hwaddr addr,
>                              uint64_t val, unsigned size)
>  {
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 2ddae40..1c5105d 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -18,3 +18,7 @@ smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" va
>  smmuv3_trigger_irq(int irq) "irq=%d"
>  smmuv3_write_gerror(uint32_t toggled, uint32_t gerror) "toggled=0x%x, new gerror=0x%x"
>  smmuv3_write_gerrorn(uint32_t acked, uint32_t gerrorn) "acked=0x%x, new gerrorn=0x%x"
> +smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
> +smmuv3_cmdq_consume(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod=%d cons=%d prod.wrap=%d cons.wrap=%d"
> +smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
> +smmuv3_cmdq_consume_out(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod:%d, cons:%d, prod_wrap:%d, cons_wrap:%d "
> --
> 2.5.5
>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 07/14] hw/arm/smmuv3: Implement MMIO write operations
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 07/14] hw/arm/smmuv3: Implement MMIO write operations Eric Auger
@ 2018-03-08 18:37   ` Peter Maydell
  2018-03-09 16:42     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-08 18:37 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> Now we have relevant helpers for queue and irq
> management, let's implement MMIO write operations.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v7 -> v8:
> - precise in the commit message invalidation commands
>   are not yet treated.
> - use new queue helpers
> - do not decode unhandled commands at this stage
> ---
>  hw/arm/smmuv3-internal.h |  24 +++++++---
>  hw/arm/smmuv3.c          | 111 +++++++++++++++++++++++++++++++++++++++++++++--
>  hw/arm/trace-events      |   6 +++
>  3 files changed, 132 insertions(+), 9 deletions(-)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index c0771ce..5af97ae 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -152,6 +152,25 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>      return extract64(r, offset << 3, 32);
>  }
>
> +static inline void smmu_write64(uint64_t *r, unsigned offset,
> +                                unsigned size, uint64_t value)
> +{
> +    if (size == 8 && !offset) {
> +        *r  = value;
> +    }
> +
> +    /* 32 bit access */
> +
> +    if (offset && offset != 4)  {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "SMMUv3 MMIO write: bad offset/size %u/%u\n",
> +                      offset, size);
> +        return ;
> +    }
> +
> +    *r = deposit64(*r, offset << 3, 32, value);

Similar remarks apply to this helper as to smmu_read64 in the earlier patch.

> +}
> +
>  /* Interrupts */
>
>  #define smmuv3_eventq_irq_enabled(s)                   \
> @@ -159,9 +178,6 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>  #define smmuv3_gerror_irq_enabled(s)                  \
>      (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))
>
> -void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
> -void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
> -
>  /* Queue Handling */
>
>  #define LOG2SIZE(q)        extract64((q)->base, 0, 5)
> @@ -310,6 +326,4 @@ enum { /* Command completion notification */
>              addr;                                             \
>          })
>
> -int smmuv3_cmdq_consume(SMMUv3State *s);
> -
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 0b57215..fcfdbb0 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -37,7 +37,8 @@
>   * @irq: irq type
>   * @gerror_mask: mask of gerrors to toggle (relevant if @irq is GERROR)
>   */
> -void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
> +static void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq,
> +                               uint32_t gerror_mask)
>  {
>
>      bool pulse = false;
> @@ -75,7 +76,7 @@ void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
>      }
>  }
>
> -void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
> +static void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
>  {
>      uint32_t pending = s->gerror ^ s->gerrorn;
>      uint32_t toggled = s->gerrorn ^ new_gerrorn;
> @@ -199,7 +200,7 @@ static void smmuv3_init_regs(SMMUv3State *s)
>      s->sid_split = 0;
>  }
>
> -int smmuv3_cmdq_consume(SMMUv3State *s)
> +static int smmuv3_cmdq_consume(SMMUv3State *s)
>  {
>      SMMUCmdError cmd_error = SMMU_CERROR_NONE;
>      SMMUQueue *q = &s->cmdq;
> @@ -298,7 +299,109 @@ int smmuv3_cmdq_consume(SMMUv3State *s)
>  static void smmu_write_mmio(void *opaque, hwaddr addr,
>                              uint64_t val, unsigned size)
>  {
> -    /* not yet implemented */
> +    SMMUState *sys = opaque;
> +    SMMUv3State *s = ARM_SMMUV3(sys);
> +
> +    /* CONSTRAINED UNPREDICTABLE choice to have page0/1 be exact aliases */
> +    addr &= ~0x10000;
> +
> +    if (size != 4 && size != 8) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "SMMUv3 MMIO write: bad size %u\n", size);
> +    }

As with read, this can never happen so you don't need to check for it.

As with read, probably better to explicitly whitelist the 64-bit
accessible offsets, and LOG_GUEST_ERROR and write-ignore the others.
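
A rough standalone sketch of the whitelisting idea, under stated
assumptions: the TOY_* offsets below stand in for the A_* register
constants and should be checked against the spec, and toy_write_allowed is
a hypothetical name. A real implementation would emit LOG_GUEST_ERROR
where this sketch simply returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed offsets of the 64-bit registers (stand-ins for A_* constants). */
enum {
    TOY_GERROR_IRQ_CFG0 = 0x68,
    TOY_STRTAB_BASE     = 0x80,
    TOY_CMDQ_BASE       = 0x90,
    TOY_EVENTQ_BASE     = 0xa0,
    TOY_EVENTQ_IRQ_CFG0 = 0xb0,
};

/* Return true if @addr is the base of a register that accepts 8-byte access. */
static bool toy_is_64bit_reg(uint64_t addr)
{
    switch (addr) {
    case TOY_GERROR_IRQ_CFG0:
    case TOY_STRTAB_BASE:
    case TOY_CMDQ_BASE:
    case TOY_EVENTQ_BASE:
    case TOY_EVENTQ_IRQ_CFG0:
        return true;
    default:
        return false;
    }
}

/*
 * Return true if the write should be performed, false if it should be
 * logged and ignored. Sizes other than 4 and 8 are assumed to be filtered
 * out before the handler runs, hence the assertion.
 */
static bool toy_write_allowed(uint64_t addr, unsigned size)
{
    assert(size == 4 || size == 8);

    if (size == 8) {
        return toy_is_64bit_reg(addr);
    }
    return (addr & 3) == 0;     /* 32-bit accesses must be word aligned */
}
```

The write handler would call this once up front and return early on
false, so the switch only ever sees permitted combinations.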

> +
> +    trace_smmuv3_write_mmio(addr, val, size);
> +
> +    switch (addr) {
> +    case A_CR0:
> +        s->cr[0] = val;
> +        s->cr0ack = val;

Spec says "reserved fields in SMMU_CR0 are not reflected in SMMU_CR0ACK",
so you probably need to mask those out.
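
Something along these lines, as a sketch only. TOY_CR0_RESERVED is an
assumed placeholder for the reserved bits of SMMU_CR0 and has to be
checked against the spec; toy_cr0_to_cr0ack is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed reserved-bit mask for SMMU_CR0; verify against the spec. */
#define TOY_CR0_RESERVED 0xFFFFFC20u

/* Compute the value to reflect in CR0ACK for a given CR0 write. */
static uint32_t toy_cr0_to_cr0ack(uint32_t cr0)
{
    uint32_t ack = cr0 & ~TOY_CR0_RESERVED;

    /* Reserved bits must never show up in the acknowledge register. */
    assert((ack & TOY_CR0_RESERVED) == 0);
    return ack;
}
```

In the handler this would replace the plain "s->cr0ack = val"
assignment.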

> +        /* in case the command queue has been enabled */
> +        smmuv3_cmdq_consume(s);
> +        return;
> +    case A_CR1:
> +        s->cr[1] = val;
> +        return;
> +    case A_CR2:
> +        s->cr[2] = val;
> +        return;
> +    case A_IRQ_CTRL:
> +        s->irq_ctrl = val;
> +        return;
> +    case A_GERRORN:
> +        smmuv3_write_gerrorn(s, val);
> +        /*
> +         * By acknowledging the CMDQ_ERR, SW may notify cmds can
> +         * be processed again
> +         */
> +        smmuv3_cmdq_consume(s);
> +        return;
> +    case A_GERROR_IRQ_CFG0: /* 64b */
> +        smmu_write64(&s->gerror_irq_cfg0, 0, size, val);
> +        return;
> +    case A_GERROR_IRQ_CFG0 + 4:
> +        smmu_write64(&s->gerror_irq_cfg0, 4, size, val);
> +        return;
> +    case A_GERROR_IRQ_CFG1:
> +        s->gerror_irq_cfg1 = val;
> +        return;
> +    case A_GERROR_IRQ_CFG2:
> +        s->gerror_irq_cfg2 = val;
> +        return;
> +    case A_STRTAB_BASE: /* 64b */
> +        smmu_write64(&s->strtab_base, 0, size, val);
> +        return;
> +    case A_STRTAB_BASE + 4:
> +        smmu_write64(&s->strtab_base, 4, size, val);
> +        return;
> +    case A_STRTAB_BASE_CFG:
> +        s->strtab_base_cfg = val;
> +        if (FIELD_EX32(val, STRTAB_BASE_CFG, FMT) == 1) {
> +            s->sid_split = FIELD_EX32(val, STRTAB_BASE_CFG, SPLIT);
> +            s->features |= SMMU_FEATURE_2LVL_STE;
> +        }
> +        return;
> +    case A_CMDQ_BASE: /* 64b */
> +        smmu_write64(&s->cmdq.base, 0, size, val);
> +        return;
> +    case A_CMDQ_BASE + 4: /* 64b */
> +        smmu_write64(&s->cmdq.base, 4, size, val);
> +        return;
> +    case A_CMDQ_PROD:
> +        s->cmdq.prod = val;
> +        smmuv3_cmdq_consume(s);
> +        return;
> +    case A_CMDQ_CONS:
> +        s->cmdq.cons = val;
> +        return;
> +    case A_EVENTQ_BASE: /* 64b */
> +        smmu_write64(&s->eventq.base, 0, size, val);
> +        return;
> +    case A_EVENTQ_BASE + 4:
> +        smmu_write64(&s->eventq.base, 4, size, val);
> +        return;
> +    case A_EVENTQ_PROD:
> +        s->eventq.prod = val;
> +        return;
> +    case A_EVENTQ_CONS:
> +        s->eventq.cons = val;
> +        return;
> +    case A_EVENTQ_IRQ_CFG0: /* 64b */
> +        s->eventq.prod = val;
> +        smmu_write64(&s->eventq_irq_cfg0, 0, size, val);
> +        return;
> +    case A_EVENTQ_IRQ_CFG0 + 4:
> +        smmu_write64(&s->eventq_irq_cfg0, 4, size, val);
> +        return;
> +    case A_EVENTQ_IRQ_CFG1:
> +        s->eventq_irq_cfg1 = val;
> +        return;
> +    case A_EVENTQ_IRQ_CFG2:
> +        s->eventq_irq_cfg2 = val;
> +        return;
> +    default:
> +        error_report("%s unhandled access at 0x%"PRIx64, __func__, addr);

Tracepoint or LOG_GUEST_ERROR, not error_report(), please.

> +    }
>  }
>
>  static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 1c5105d..ed5dce0 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -22,3 +22,9 @@ smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
>  smmuv3_cmdq_consume(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod=%d cons=%d prod.wrap=%d cons.wrap=%d"
>  smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
>  smmuv3_cmdq_consume_out(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod:%d, cons:%d, prod_wrap:%d, cons_wrap:%d "
> +smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
> +smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
> +smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
> +smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx val64:0x%lx"
> +smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
> +smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"


thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 08/14] hw/arm/smmuv3: Event queue recording helper
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 08/14] hw/arm/smmuv3: Event queue recording helper Eric Auger
@ 2018-03-08 18:39   ` Peter Maydell
  2018-03-09 17:16     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-08 18:39 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> Let's introduce a helper function aiming at recording an
> event in the event queue.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v8 -> v9:
> - add SMMU_EVENT_STRING
>
> v7 -> v8:
> - use dma_addr_t instead of hwaddr in smmuv3_record_event()
> - introduce struct SMMUEventInfo
> - add event_stringify + helpers for all fields
> ---
>  hw/arm/smmuv3-internal.h | 140 ++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/arm/smmuv3.c          |  91 +++++++++++++++++++++++++++++-
>  hw/arm/trace-events      |   1 +
>  3 files changed, 229 insertions(+), 3 deletions(-)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 5af97ae..3929f69 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -226,8 +226,6 @@ static inline void smmu_write_cmdq_err(SMMUv3State *s, uint32_t err_type)
>      s->cmdq.cons = FIELD_DP32(s->cmdq.cons, CMDQ_CONS, ERR, err_type);
>  }
>
> -void smmuv3_write_eventq(SMMUv3State *s, Evt *evt);
> -
>  /* Commands */
>
>  enum {
> @@ -326,4 +324,142 @@ enum { /* Command completion notification */
>              addr;                                             \
>          })
>
> +/* Events */
> +
> +typedef enum SMMUEventType {
> +    SMMU_EVT_OK                 = 0x00,
> +    SMMU_EVT_F_UUT              = 0x01,
> +    SMMU_EVT_C_BAD_STREAMID     = 0x02,
> +    SMMU_EVT_F_STE_FETCH        = 0x03,
> +    SMMU_EVT_C_BAD_STE          = 0x04,
> +    SMMU_EVT_F_BAD_ATS_TREQ     = 0x05,
> +    SMMU_EVT_F_STREAM_DISABLED  = 0x06,
> +    SMMU_EVT_F_TRANS_FORBIDDEN  = 0x07,
> +    SMMU_EVT_C_BAD_SUBSTREAMID  = 0x08,
> +    SMMU_EVT_F_CD_FETCH         = 0x09,
> +    SMMU_EVT_C_BAD_CD           = 0x0a,
> +    SMMU_EVT_F_WALK_EABT        = 0x0b,
> +    SMMU_EVT_F_TRANSLATION      = 0x10,
> +    SMMU_EVT_F_ADDR_SIZE        = 0x11,
> +    SMMU_EVT_F_ACCESS           = 0x12,
> +    SMMU_EVT_F_PERMISSION       = 0x13,
> +    SMMU_EVT_F_TLB_CONFLICT     = 0x20,
> +    SMMU_EVT_F_CFG_CONFLICT     = 0x21,
> +    SMMU_EVT_E_PAGE_REQ         = 0x24,
> +} SMMUEventType;
> +
> +static const char *event_stringify[] = {
> +    [SMMU_EVT_OK]                       = "SMMU_EVT_OK",
> +    [SMMU_EVT_F_UUT]                    = "SMMU_EVT_F_UUT",
> +    [SMMU_EVT_C_BAD_STREAMID]           = "SMMU_EVT_C_BAD_STREAMID",
> +    [SMMU_EVT_F_STE_FETCH]              = "SMMU_EVT_F_STE_FETCH",
> +    [SMMU_EVT_C_BAD_STE]                = "SMMU_EVT_C_BAD_STE",
> +    [SMMU_EVT_F_BAD_ATS_TREQ]           = "SMMU_EVT_F_BAD_ATS_TREQ",
> +    [SMMU_EVT_F_STREAM_DISABLED]        = "SMMU_EVT_F_STREAM_DISABLED",
> +    [SMMU_EVT_F_TRANS_FORBIDDEN]        = "SMMU_EVT_F_TRANS_FORBIDDEN",
> +    [SMMU_EVT_C_BAD_SUBSTREAMID]        = "SMMU_EVT_C_BAD_SUBSTREAMID",
> +    [SMMU_EVT_F_CD_FETCH]               = "SMMU_EVT_F_CD_FETCH",
> +    [SMMU_EVT_C_BAD_CD]                 = "SMMU_EVT_C_BAD_CD",
> +    [SMMU_EVT_F_WALK_EABT]              = "SMMU_EVT_F_WALK_EABT",
> +    [SMMU_EVT_F_TRANSLATION]            = "SMMU_EVT_F_TRANSLATION",
> +    [SMMU_EVT_F_ADDR_SIZE]              = "SMMU_EVT_F_ADDR_SIZE",
> +    [SMMU_EVT_F_ACCESS]                 = "SMMU_EVT_F_ACCESS",
> +    [SMMU_EVT_F_PERMISSION]             = "SMMU_EVT_F_PERMISSION",
> +    [SMMU_EVT_F_TLB_CONFLICT]           = "SMMU_EVT_F_TLB_CONFLICT",
> +    [SMMU_EVT_F_CFG_CONFLICT]           = "SMMU_EVT_F_CFG_CONFLICT",
> +    [SMMU_EVT_E_PAGE_REQ]               = "SMMU_EVT_E_PAGE_REQ",
> +};
> +
> +#define SMMU_EVENT_STRING(event) (                                         \
> +(event < ARRAY_SIZE(event_stringify)) ? event_stringify[event] : "UNKNOWN" \
> +)

Same remarks as for the other value-to-string helper.

> +
> +typedef struct SMMUEventInfo {

This struct could use a comment summarizing what it's for.

> +    SMMUEventType type;
> +    uint32_t sid;
> +    bool recorded;
> +    bool record_trans_faults;
> +    union {
> +        struct {
> +            uint32_t ssid;
> +            bool ssv;
> +            dma_addr_t addr;
> +            bool rnw;
> +            bool pnu;
> +            bool ind;
> +       } f_uut;
> +       struct ssid_info {
> +            uint32_t ssid;
> +            bool ssv;
> +       } c_bad_streamid;
> +       struct ssid_addr_info {
> +            uint32_t ssid;
> +            bool ssv;
> +            dma_addr_t addr;
> +       } f_ste_fetch;
> +       struct ssid_info c_bad_ste;
> +       struct {
> +            dma_addr_t addr;
> +            bool rnw;
> +       } f_transl_forbidden;
> +       struct {
> +            uint32_t ssid;
> +       } c_bad_substream;
> +       struct ssid_addr_info f_cd_fetch;
> +       struct ssid_info c_bad_cd;
> +       struct full_info {
> +            bool stall;
> +            uint16_t stag;
> +            uint32_t ssid;
> +            bool ssv;
> +            bool s2;
> +            dma_addr_t addr;
> +            bool rnw;
> +            bool pnu;
> +            bool ind;
> +            uint8_t class;
> +            dma_addr_t addr2;
> +       } f_walk_eabt;
> +       struct full_info f_translation;
> +       struct full_info f_addr_size;
> +       struct full_info f_access;
> +       struct full_info f_permission;
> +       struct ssid_info f_cfg_conflict;
> +       /**
> +        * not supported yet:
> +        * F_BAD_ATS_TREQ
> +        * F_BAD_ATS_TREQ
> +        * F_TLB_CONFLICT
> +        * E_PAGE_REQUEST
> +        * IMPDEF_EVENTn
> +        */
> +    } u;
> +} SMMUEventInfo;
> +
> +/* EVTQ fields */
> +
> +#define EVT_Q_OVERFLOW        (1 << 31)
> +
> +#define EVT_SET_TYPE(x, v)              deposit32((x)->word[0], 0 , 8 ,  v)
> +#define EVT_SET_SSV(x, v)               deposit32((x)->word[0], 11, 1 ,  v)
> +#define EVT_SET_SSID(x, v)              deposit32((x)->word[0], 12, 20, v)
> +#define EVT_SET_SID(x, v)               ((x)->word[1] =  v)
> +#define EVT_SET_STAG(x, v)              deposit32((x)->word[2], 0 , 16, v)
> +#define EVT_SET_STALL(x, v)             deposit32((x)->word[2], 31, 1 , v)
> +#define EVT_SET_PNU(x, v)               deposit32((x)->word[3], 1 , 1 , v)
> +#define EVT_SET_IND(x, v)               deposit32((x)->word[3], 2 , 1 , v)
> +#define EVT_SET_RNW(x, v)               deposit32((x)->word[3], 3 , 1 , v)
> +#define EVT_SET_S2(x, v)                deposit32((x)->word[3], 7 , 1 , v)
> +#define EVT_SET_CLASS(x, v)             deposit32((x)->word[3], 8 , 2 , v)
> +#define EVT_SET_ADDR(x, addr) ({                    \
> +            (x)->word[5] = (uint32_t)(addr >> 32);        \
> +            (x)->word[4] = (uint32_t)(addr & 0xffffffff); \
> +        })
> +#define EVT_SET_ADDR2(x, addr) ({                    \
> +            deposit32((x)->word[7], 3, 29, addr >> 16);        \
> +            deposit32((x)->word[7], 0, 16, addr & 0xffff); \
> +        })
> +
> +void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *event);
> +
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index fcfdbb0..0adfe53 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -140,7 +140,7 @@ static void queue_write(SMMUQueue *q, void *data)
>      queue_prod_incr(q);
>  }
>
> -void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
> +static void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
>  {
>      SMMUQueue *q = &s->eventq;
>      bool q_empty = Q_EMPTY(q);
> @@ -161,6 +161,95 @@ void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
>      }
>  }
>
> +void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *info)
> +{
> +    Evt evt;
> +
> +    if (!SMMUV3_EVENTQ_ENABLED(s)) {
> +        return;
> +    }
> +
> +    EVT_SET_TYPE(&evt, info->type);
> +    EVT_SET_SID(&evt, info->sid);
> +
> +    switch (info->type) {
> +    case SMMU_EVT_OK:
> +        return;
> +    case SMMU_EVT_F_UUT:
> +        EVT_SET_SSID(&evt, info->u.f_uut.ssid);
> +        EVT_SET_SSV(&evt,  info->u.f_uut.ssv);
> +        EVT_SET_ADDR(&evt, info->u.f_uut.addr);
> +        EVT_SET_RNW(&evt,  info->u.f_uut.rnw);
> +        EVT_SET_PNU(&evt,  info->u.f_uut.pnu);
> +        EVT_SET_IND(&evt,  info->u.f_uut.ind);
> +        break;
> +    case SMMU_EVT_C_BAD_STREAMID:
> +        EVT_SET_SSID(&evt, info->u.c_bad_streamid.ssid);
> +        EVT_SET_SSV(&evt,  info->u.c_bad_streamid.ssv);
> +        break;
> +    case SMMU_EVT_F_STE_FETCH:
> +        EVT_SET_SSID(&evt, info->u.f_ste_fetch.ssid);
> +        EVT_SET_SSV(&evt,  info->u.f_ste_fetch.ssv);
> +        EVT_SET_ADDR(&evt, info->u.f_ste_fetch.addr);
> +        break;
> +    case SMMU_EVT_C_BAD_STE:
> +        EVT_SET_SSID(&evt, info->u.c_bad_ste.ssid);
> +        EVT_SET_SSV(&evt,  info->u.c_bad_ste.ssv);
> +        break;
> +    case SMMU_EVT_F_STREAM_DISABLED:
> +        break;
> +    case SMMU_EVT_F_TRANS_FORBIDDEN:
> +        EVT_SET_ADDR(&evt, info->u.f_transl_forbidden.addr);
> +        EVT_SET_RNW(&evt, info->u.f_transl_forbidden.rnw);
> +        break;
> +    case SMMU_EVT_C_BAD_SUBSTREAMID:
> +        EVT_SET_SSID(&evt, info->u.c_bad_substream.ssid);
> +        break;
> +    case SMMU_EVT_F_CD_FETCH:
> +        EVT_SET_SSID(&evt, info->u.f_cd_fetch.ssid);
> +        EVT_SET_SSV(&evt,  info->u.f_cd_fetch.ssv);
> +        EVT_SET_ADDR(&evt, info->u.f_cd_fetch.addr);
> +        break;
> +    case SMMU_EVT_C_BAD_CD:
> +        EVT_SET_SSID(&evt, info->u.c_bad_cd.ssid);
> +        EVT_SET_SSV(&evt,  info->u.c_bad_cd.ssv);
> +        break;
> +    case SMMU_EVT_F_WALK_EABT:
> +    case SMMU_EVT_F_TRANSLATION:
> +    case SMMU_EVT_F_ADDR_SIZE:
> +    case SMMU_EVT_F_ACCESS:
> +    case SMMU_EVT_F_PERMISSION:
> +        EVT_SET_STALL(&evt, info->u.f_walk_eabt.stall);
> +        EVT_SET_STAG(&evt, info->u.f_walk_eabt.stag);
> +        EVT_SET_SSID(&evt, info->u.f_walk_eabt.ssid);
> +        EVT_SET_SSV(&evt, info->u.f_walk_eabt.ssv);
> +        EVT_SET_S2(&evt, info->u.f_walk_eabt.s2);
> +        EVT_SET_ADDR(&evt, info->u.f_walk_eabt.addr);
> +        EVT_SET_RNW(&evt, info->u.f_walk_eabt.rnw);
> +        EVT_SET_PNU(&evt, info->u.f_walk_eabt.pnu);
> +        EVT_SET_IND(&evt, info->u.f_walk_eabt.ind);
> +        EVT_SET_CLASS(&evt, info->u.f_walk_eabt.class);
> +        EVT_SET_ADDR2(&evt, info->u.f_walk_eabt.addr2);
> +        break;
> +    case SMMU_EVT_F_CFG_CONFLICT:
> +        EVT_SET_SSID(&evt, info->u.f_cfg_conflict.ssid);
> +        EVT_SET_SSV(&evt,  info->u.f_cfg_conflict.ssv);
> +        break;
> +    /* rest is not implemented */
> +    case SMMU_EVT_F_BAD_ATS_TREQ:
> +    case SMMU_EVT_F_TLB_CONFLICT:
> +    case SMMU_EVT_E_PAGE_REQ:
> +    default:
> +        error_report("%s event %d not supported", __func__,
> +                     info->type);

Not error_report, please.

> +        return;
> +    }
> +
> +    trace_smmuv3_record_event(SMMU_EVENT_STRING(info->type), info->sid);
> +    smmuv3_write_eventq(s, &evt);

This should be handling the "oops, the write to memory failed" case.
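
To make the shape of that concrete, here is a toy standalone model, not
the real code: the queue lives in a small array, ToyWriteResult stands in
for whatever result the DMA write returns, and the eventq_abort_error
flag stands in for raising the corresponding global error. All names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef enum { TOY_WRITE_OK, TOY_WRITE_ERROR } ToyWriteResult;

typedef struct {
    uint32_t word[8];
} ToyEvt;

typedef struct {
    ToyEvt slots[4];
    unsigned prod;              /* number of events written so far */
    bool backing_store_faulty;  /* simulates a failing memory write */
    bool eventq_abort_error;    /* stand-in for the global error flag */
} ToyEventQueue;

/* Write one event; on failure the producer index is left untouched. */
static ToyWriteResult toy_queue_write(ToyEventQueue *q, const ToyEvt *evt)
{
    if (q->backing_store_faulty) {
        return TOY_WRITE_ERROR;
    }
    memcpy(&q->slots[q->prod % 4], evt, sizeof(*evt));
    q->prod++;
    return TOY_WRITE_OK;
}

/* Record an event; return true only if it actually reached the queue. */
static bool toy_record_event(ToyEventQueue *q, const ToyEvt *evt)
{
    assert(q != NULL && evt != NULL);

    if (toy_queue_write(q, evt) != TOY_WRITE_OK) {
        q->eventq_abort_error = true;   /* report instead of ignoring */
        return false;
    }
    return true;
}
```

The point is only that the write result is propagated and acted on;
whether the event still counts as "recorded" in the failure case is a
separate decision.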

> +    info->recorded = true;
> +}
> +
>  static void smmuv3_init_regs(SMMUv3State *s)
>  {
>      /**
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index ed5dce0..c79c15e 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -28,3 +28,4 @@ smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" v
>  smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx val64:0x%lx"
>  smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>  smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
> +smmuv3_record_event(const char *type, uint32_t sid) "%s sid=%d"
> --

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk
  2018-03-07 16:35       ` Peter Maydell
@ 2018-03-08 18:56         ` Auger Eric
  2018-03-08 19:01           ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-08 18:56 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,
On 07/03/18 17:35, Peter Maydell wrote:
> On 7 March 2018 at 16:23, Auger Eric <eric.auger@redhat.com> wrote:
>> Hi Peter,
>>
>> On 06/03/18 20:43, Peter Maydell wrote:
>>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> 
>>>> +int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
>>>> +             IOMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
>>>> +{
>>>> +    if (!cfg->aa64) {
>>>> +        error_setg(&error_fatal,
>>>> +                   "SMMUv3 model does not support VMSAv8-32 page walk yet");
>>>
>>> This sort of guest-provokable error should not be fatal -- log
>>> it with LOG_UNIMP and carry on as best you can (in this case
>>> the spec should say what happens for SMMUv3 implementations which
>>> don't support AArch32 tables).
>> This code should never been entered as the check is done when decoding
>> the CD in the smmuv3. However since it is a base class I wanted to
>> enphase the ptw was only supporting AArch32.
> 
> Ah, right. That should be an assert() with a brief comment
> about why the condition will never trigger.
> 
> 
>>>> +#define is_permission_fault(ap, perm) \
>>>> +    (((perm) & IOMMU_WO) && ((ap) & 0x2))
>>>
>>> Don't we also need to check AP bit 1 in some cases?
>>> (when the StreamWorld is S or NS EL1 and either (a) the incoming
>>> transaction has its attrs.user = 1 and STE.PRIVCFG is 0b0x, or
>>> (b) STE.PRIVCFG is 0b10).
>> I think I don't need to as I don't support this feature at the moment:
>> spec says:
>> "When SMMU_IDR1.ATTR_PERMS_OVR=0, this field is RES0 and the incoming
>> PRIV attribute is used."
>> But to be honest I was not aware this existed ;()
> 
> I think you still need to check the incoming transaction
> for user vs priv, even if you don't support STE.PRIVCFG.

On the CPU side, you have MemTxAttrs as input from get_phys_addr_lpae().

On IOMMU side, the current input callback for translation is

static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr
addr, IOMMUAccessFlags flag)

where IOMMUAccessFlags is just the R/W access flag.

So I am not sure I have access to those user/priv attributes.
> 
>>>> +
>>>> +#define PTE_AP_TO_PERM(ap) \
>>>> +    (IOMMU_ACCESS_FLAG(true, !((ap) & 0x2)))
>>>
>>> Similarly here.
>> ?
> 
> Can't just ignore AP bit 1 (or 0, if you're counting it that way).

with AP, the LSB is associated to EL0 rights. Similarly, given the
translation callback in use, is it something relevant as of now?

with APTable, the LSB bit is associated to EL0 as well
APTable[1] = 1, access at EL0 not permitted

In my case, can't I consider all accesses to be privileged?

Thanks

Eric
> 
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk
  2018-03-08 18:56         ` Auger Eric
@ 2018-03-08 19:01           ` Peter Maydell
  0 siblings, 0 replies; 63+ messages in thread
From: Peter Maydell @ 2018-03-08 19:01 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 8 March 2018 at 18:56, Auger Eric <eric.auger@redhat.com> wrote:
> Hi Peter,
> On 07/03/18 17:35, Peter Maydell wrote:
>> On 7 March 2018 at 16:23, Auger Eric <eric.auger@redhat.com> wrote:
>>> Hi Peter,
>>>
>>> On 06/03/18 20:43, Peter Maydell wrote:
>>>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>>> +#define is_permission_fault(ap, perm) \
>>>>> +    (((perm) & IOMMU_WO) && ((ap) & 0x2))
>>>>
>>>> Don't we also need to check AP bit 1 in some cases?
>>>> (when the StreamWorld is S or NS EL1 and either (a) the incoming
>>>> transaction has its attrs.user = 1 and STE.PRIVCFG is 0b0x, or
>>>> (b) STE.PRIVCFG is 0b10).
>>> I think I don't need to as I don't support this feature at the moment:
>>> spec says:
>>> "When SMMU_IDR1.ATTR_PERMS_OVR=0, this field is RES0 and the incoming
>>> PRIV attribute is used."
>>> But to be honest I was not aware this existed ;()
>>
>> I think you still need to check the incoming transaction
>> for user vs priv, even if you don't support STE.PRIVCFG.
>
> On the CPU side, you have MemTxAttrs as input from get_phys_addr_lpae().
>
> On IOMMU side, the current input callback for translation is
>
> static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr
> addr, IOMMUAccessFlags flag)
>
> where IOMMUAccessFlags is just the R/W access flag.
>
> So I am not sure I have access to those user/priv attributes.

Hmm, yes. This looks like a deficiency in our IOMMU framework.
For the moment put a TODO note that we treat all transactions
as privileged because QEMU's IOMMU code doesn't pass transaction
attributes around correctly.

(This will also be an issue for secure/nonsecure eventually.)
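
As a standalone sketch of what the check plus TODO could look like: the
TOY_IOMMU_* values stand in for the IOMMUAccessFlags bits and are
assumptions, and toy_is_permission_fault is a hypothetical function form
of the quoted macro with the same logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-ins for the read/write bits of IOMMUAccessFlags. */
enum {
    TOY_IOMMU_RO = 1,
    TOY_IOMMU_WO = 2,
};

/*
 * @ap is the 2-bit AP field of the descriptor, @perm the requested access.
 *
 * TODO: all transactions are treated as privileged, because the translate
 * callback only receives the R/W flag and no transaction attributes. The
 * low bit of @ap (EL0 access) is therefore ignored for now.
 */
static bool toy_is_permission_fault(unsigned ap, unsigned perm)
{
    assert(ap <= 0x3);

    /* High bit of @ap set means read-only: only writes can fault. */
    return (perm & TOY_IOMMU_WO) && (ap & 0x2);
}
```

Once attributes are passed through, the user/priv distinction would be
added here rather than at the call sites.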

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case Eric Auger
@ 2018-03-08 19:06   ` Peter Maydell
  2018-03-09 17:53     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-08 19:06 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> At the moment, the SMMUv3 does not support notification on
> TLB invalidation. So let's abort as soon as such notifier gets
> enabled.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/smmuv3.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 384393f..5efe933 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -1074,12 +1074,23 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
>      dc->realize = smmu_realize;
>  }
>
> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
> +                                       IOMMUNotifierFlag old,
> +                                       IOMMUNotifierFlag new)
> +{
> +    if (old == IOMMU_NOTIFIER_NONE) {
> +        error_setg(&error_fatal,
> +                   "SMMUV3: vhost and vfio notifiers not yet supported");
> +    }
> +}

Is this triggerable by the guest, or by the user on the command
line, or only by a bug in the board or other QEMU code?

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton
  2018-03-08 14:27   ` Peter Maydell
@ 2018-03-09 13:19     ` Auger Eric
  2018-03-09 13:37       ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-09 13:19 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 08/03/18 15:27, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> From: Prem Mallappa <prem.mallappa@broadcom.com>
>>
>> This patch implements a skeleton for the smmuv3 device.
>> Datatypes and register definitions are introduced. The MMIO
>> region, the interrupts and the queue are initialized.
>>
>> Only the MMIO read operation is implemented here.
>>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> 
> I just have a few minor nits on this one...
> 
> 
>> +static inline int smmu_enabled(SMMUv3State *s)
>> +{
>> +    return FIELD_EX32(s->cr[0], CR0, SMMU_ENABLE);
>> +}
>> +
>> +typedef struct Cmd {
>> +    uint32_t word[4];
>> +} Cmd;
>> +
>> +typedef struct Evt  {
>> +    uint32_t word[8];
>> +} Evt;
> 
> Some one-liner comments noting what these structs are for
> would be helpful.
sure
> 
>> +
>> +static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>> +                                   unsigned size)
> 
> A doc comment here would help in describing what the purpose of
> this utility function is.
done
> 
>> +{
>> +    if (size == 8 && !offset) {
>> +        return r;
> 
> If you take my advice a bit further down about just checking
> up front that 8-byte accesses are to definitely permitted 64
> bit register offsets, you won't need the check on offset.
OK. I created readl and readll helpers to check this, as is done in the
GIC, so I eventually removed smmu_read64().
> 
>> +    }
>> +
>> +    /* 32 bit access */
>> +
>> +    if (offset && offset != 4)  {
>> +        qemu_log_mask(LOG_GUEST_ERROR,
>> +                      "SMMUv3 MMIO read: bad offset/size %u/%u\n",
>> +                      offset, size);
> 
> This isn't a guest error, because the function is only ever called
> with constant values for the 'offset' parameter. You should just
> assert that the offset is 0 or 4.
indeed
> 
>> +        return 0;
>> +    }
>> +
>> +    return extract64(r, offset << 3, 32);
>> +}
>> +
>> +#endif
> 
>> +static void smmuv3_init_regs(SMMUv3State *s)
>> +{
>> +    /**
>> +     * IDR0: stage1 only, AArch64 only, coherent access, 16b ASID,
>> +     *       multi-level stream table
>> +     */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S1P, 1); /* stage 1 supported */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTF, 2); /* AArch64 PTW only */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, COHACC, 1); /* IO coherent */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ASID16, 1); /* 16-bit ASID */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTENDIAN, 2); /* little endian */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STALL_MODEL, 1); /* No stall */
>> +    /* terminated transaction will always be aborted/error returned */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TERM_MODEL, 1);
>> +    /* 2-level stream table supported */
>> +    s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, 1);
>> +
>> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SIDSIZE, SMMU_IDR1_SIDSIZE);
>> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, EVENTQS, 19);
>> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS,   19);
>> +
>> +   /* 4K and 64K granule support */
>> +    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, 1);
>> +    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, 1);
>> +    s->idr[5] = FIELD_DP32(s->idr[5], IDR5, OAS, SMMU_IDR5_OAS); /* 44 bits */
>> +
>> +    s->cmdq.base = deposit64(s->cmdq.base, 0, 5, 19); /* LOG2SIZE = 19 */
>> +    s->cmdq.prod = 0;
>> +    s->cmdq.cons = 0;
>> +    s->cmdq.entry_size = sizeof(struct Cmd);
>> +    s->eventq.base = deposit64(s->eventq.base, 0, 5, 19); /* LOG2SIZE = 19 */
> 
> Have some #defines for max cmd queue and event queue size, since
> we use them here and also above in filling in the IDR fields ?
done
> 
>> +    s->eventq.prod = 0;
>> +    s->eventq.cons = 0;
>> +    s->eventq.entry_size = sizeof(struct Evt);
>> +
>> +    s->features = 0;
>> +    s->sid_split = 0;
>> +}
>> +
>> +static void smmu_write_mmio(void *opaque, hwaddr addr,
>> +                            uint64_t val, unsigned size)
>> +{
>> +    /* not yet implemented */
>> +}
>> +
>> +static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
>> +{
>> +    SMMUState *sys = opaque;
>> +    SMMUv3State *s = ARM_SMMUV3(sys);
>> +    uint64_t val;
>> +
>> +    /* CONSTRAINED UNPREDICTABLE choice to have page0/1 be exact aliases */
>> +    addr &= ~0x10000;
>> +
>> +    if (size != 4 && size != 8) {
>> +        qemu_log_mask(LOG_GUEST_ERROR, "SMMUv3 MMIO read: bad size %u\n", size);
> 
> Your MemoryRegionOps settings mean this can never happen, so you
> don't need to check it at runtime.
OK
> 
>> +        return 0;
>> +    }
> 
> Consider specifically catching 8-byte accesses to non-64-bit registers?
> This is CONSTRAINED UNPREDICTABLE (see spec section 6.2), and "one
> of the registers is read/written and other half is RAZ/WI" is permitted
> behaviour, but it does mean you need to be a little careful about not
> letting the top 32 bits of val become non-zero for the 32-bit register
> codepaths. Logging bad 64-bit accesses as LOG_GUEST_ERROR and making
> them RAZ/WI might be nicer for guest software developers.
I moved to ops with attrs and if a 64-bit access is attempted on
something not a 64b reg base, I return an error + log a guest error.
> 
>> +
>> +    /* Primecell/Corelink ID registers */
>> +    switch (addr) {
>> +    case A_CIDR0:
>> +        val = 0x0D;
>> +        break;
>> +    case A_CIDR1:
>> +        val = 0xF0;
>> +        break;
>> +    case A_CIDR2:
>> +        val = 0x05;
>> +        break;
>> +    case A_CIDR3:
>> +        val = 0xB1;
>> +        break;
>> +    case A_PIDR0:
>> +        val = 0x84; /* Part Number */
>> +        break;
>> +    case A_PIDR1:
>> +        val = 0xB4; /* JEP106 ID code[3:0] for Arm and Part number[11:8] */
>> +        break;
>> +    case A_PIDR3:
>> +        val = 0x10; /* MMU600 p1 */
>> +        break;
>> +    case A_PIDR4:
>> +        val = 0x4; /* 4KB region count, JEP106 continuation code for Arm */
>> +        break;
>> +    case 0xFD4 ... 0xFDC: /* SMMU_PIDR 5-7 */
>> +        val = 0;
>> +        break;
> 
> I usually put all the const CIDR/PIDR values in a const array, since
> there are always 12 of them next to each other, but since you already
> have this code it's fine.
Switched to the array as suggested.
> 
>> +    case A_IDR0 ... A_IDR5:
>> +        val = s->idr[(addr - A_IDR0) / 4];
>> +        break;
>> +    case A_IIDR:
>> +        val = s->iidr;
>> +        break;
>> +    case A_CR0:
>> +        val = s->cr[0];
>> +        break;
>> +    case A_CR0ACK:
>> +        val = s->cr0ack;
>> +        break;
>> +    case A_CR1:
>> +        val = s->cr[1];
>> +        break;
>> +    case A_CR2:
>> +        val = s->cr[2];
>> +        break;
>> +    case A_STATUSR:
>> +        val = s->statusr;
>> +        break;
>> +    case A_IRQ_CTRL:
>> +        val = s->irq_ctrl;
>> +        break;
>> +    case A_IRQ_CTRL_ACK:
>> +        val = s->irq_ctrl_ack;
>> +        break;
>> +    case A_GERROR:
>> +        val = s->gerror;
>> +        break;
>> +    case A_GERRORN:
>> +        val = s->gerrorn;
>> +        break;
>> +    case A_GERROR_IRQ_CFG0: /* 64b */
>> +        val = smmu_read64(s->gerror_irq_cfg0, 0, size);
>> +        break;
>> +    case A_GERROR_IRQ_CFG0 + 4:
>> +        val = smmu_read64(s->gerror_irq_cfg0, 4, size);
>> +        break;
>> +    case A_GERROR_IRQ_CFG1:
>> +        val = s->gerror_irq_cfg1;
>> +        break;
>> +    case A_GERROR_IRQ_CFG2:
>> +        val = s->gerror_irq_cfg2;
>> +        break;
>> +    case A_STRTAB_BASE: /* 64b */
>> +        val = smmu_read64(s->strtab_base, 0, size);
>> +        break;
>> +    case A_STRTAB_BASE + 4: /* 64b */
>> +        val = smmu_read64(s->strtab_base, 4, size);
>> +        break;
>> +    case A_STRTAB_BASE_CFG:
>> +        val = s->strtab_base_cfg;
>> +        break;
>> +    case A_CMDQ_BASE: /* 64b */
>> +        val = smmu_read64(s->cmdq.base, 0, size);
>> +        break;
>> +    case A_CMDQ_BASE + 4:
>> +        val = smmu_read64(s->cmdq.base, 4, size);
>> +        break;
>> +    case A_CMDQ_PROD:
>> +        val = s->cmdq.prod;
>> +        break;
>> +    case A_CMDQ_CONS:
>> +        val = s->cmdq.cons;
>> +        break;
>> +    case A_EVENTQ_BASE: /* 64b */
>> +        val = smmu_read64(s->eventq.base, 0, size);
>> +        break;
>> +    case A_EVENTQ_BASE + 4: /* 64b */
>> +        val = smmu_read64(s->eventq.base, 4, size);
>> +        break;
>> +    case A_EVENTQ_PROD:
>> +        val = s->eventq.prod;
>> +        break;
>> +    case A_EVENTQ_CONS:
>> +        val = s->eventq.cons;
>> +        break;
>> +    default:
>> +        error_report("%s unhandled access at 0x%"PRIx64, __func__, addr);
> 
> This should be a LOG_GUEST_ERROR (if there are registers we don't
> implement, you can define the A_* constants for them and have those
> do a LOG_UNIMP.)
changed to LOG_GUEST_ERROR
> 
>> +        break;
>> +    }
>> +
>> +    trace_smmuv3_read_mmio(addr, val, size);
>> +    return val;
>> +}
>> +
> 
>> +static void smmu_realize(DeviceState *d, Error **errp)
>> +{
>> +    SMMUState *sys = ARM_SMMU(d);
>> +    SMMUv3State *s = ARM_SMMUV3(sys);
>> +    SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
>> +    SysBusDevice *dev = SYS_BUS_DEVICE(d);
>> +    Error *local_err = NULL;
>> +
>> +    c->parent_realize(d, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +
>> +    memory_region_init_io(&sys->iomem, OBJECT(s),
>> +                          &smmu_mem_ops, sys, TYPE_ARM_SMMUV3, 0x20000);
>> +
>> +    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);
> 
> Nothing ever modifies this later, so I don't think you need to
> do the g_strdup() ?  (The declaration of the struct field should
> probably have a 'const'.)
done
> 
>> +
>> +    sysbus_init_mmio(dev, &sys->iomem);
>> +
>> +    smmu_init_irq(s, dev);
>> +}
>> +
>> +static const VMStateDescription vmstate_smmuv3 = {
>> +    .name = "smmuv3",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_UINT32(features, SMMUv3State),
>> +        VMSTATE_UINT8(sid_size, SMMUv3State),
>> +        VMSTATE_UINT8(sid_split, SMMUv3State),
>> +
>> +        VMSTATE_UINT32_ARRAY(idr, SMMUv3State, 6),
> 
> These are constant ID registers, right? You don't need to
> migrate anything that's a fixed value set when the device
> is created and never modified.
yes correct, removed idr and iidr
> 
>> +        VMSTATE_UINT32(iidr, SMMUv3State),
>> +        VMSTATE_UINT32_ARRAY(cr, SMMUv3State, 3),
>> +        VMSTATE_UINT32(cr0ack, SMMUv3State),
>> +        VMSTATE_UINT32(statusr, SMMUv3State),
>> +        VMSTATE_UINT32(irq_ctrl, SMMUv3State),
>> +        VMSTATE_UINT32(irq_ctrl_ack, SMMUv3State),
>> +        VMSTATE_UINT32(gerror, SMMUv3State),
>> +        VMSTATE_UINT32(gerrorn, SMMUv3State),
>> +        VMSTATE_UINT64(gerror_irq_cfg0, SMMUv3State),
>> +        VMSTATE_UINT32(gerror_irq_cfg1, SMMUv3State),
>> +        VMSTATE_UINT32(gerror_irq_cfg2, SMMUv3State),
>> +        VMSTATE_UINT64(strtab_base, SMMUv3State),
>> +        VMSTATE_UINT32(strtab_base_cfg, SMMUv3State),
>> +        VMSTATE_UINT64(eventq_irq_cfg0, SMMUv3State),
>> +        VMSTATE_UINT32(eventq_irq_cfg1, SMMUv3State),
>> +        VMSTATE_UINT32(eventq_irq_cfg2, SMMUv3State),
>> +
>> +        VMSTATE_UINT64(cmdq.base, SMMUv3State),
>> +        VMSTATE_UINT32(cmdq.prod, SMMUv3State),
>> +        VMSTATE_UINT32(cmdq.cons, SMMUv3State),
>> +        VMSTATE_UINT8(cmdq.entry_size, SMMUv3State),
>> +        VMSTATE_UINT64(eventq.base, SMMUv3State),
>> +        VMSTATE_UINT32(eventq.prod, SMMUv3State),
>> +        VMSTATE_UINT32(eventq.cons, SMMUv3State),
>> +        VMSTATE_UINT8(eventq.entry_size, SMMUv3State),
> 
> It's a little neater to define a separate VMStateDescription
> for the SMMUQueue struct and then just use it twice here with
> VMSTATE_STRUCT. (Example in hw/dma/pl330.c for vmstate_pl330_chan.)
done.
> 
> Also, isn't the entry_size constant and fixed at device creation?
> If so, you don't need to migrate it.
yes, removed
> 
>> +
>> +        VMSTATE_END_OF_LIST(),
>> +    },
>> +};
>> +
>> +static void smmuv3_instance_init(Object *obj)
>> +{
>> +    /* Nothing much to do here as of now */
>> +}
>> +
>> +static void smmuv3_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +    SMMUv3Class *c = ARM_SMMUV3_CLASS(klass);
>> +
>> +    dc->vmsd    = &vmstate_smmuv3;
> 
> It would be nice to go through the patchset and remove these
> unnecessary extra spaces in various assignments and declarations.
attempted ;-)
> 
>> +    device_class_set_parent_reset(dc, smmu_reset, &c->parent_reset);
>> +    c->parent_realize = dc->realize;
>> +    dc->realize = smmu_realize;
>> +}
> 
> 
>> +type_init(smmuv3_register_types)
>> +
> 
> Stray blank line at end of file.
> 
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index 3584974..64d2b9b 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -12,3 +12,6 @@ smmu_ptw_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr,
>>  smmu_ptw_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
>>  smmu_ptw_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
>>  smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
>> +
>> +#hw/arm/smmuv3.c
>> +smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
> 
> "hwaddr" isn't valid as a type for trace event parameters. This should
> be uint64_t; otherwise trace backends like 'ust' won't build. (There's a
> patch on list to tracetool that will make this mistake a compile error
> for all backends, so it's easier to catch.)
Yes changed all of those to uint64_t.
> 
> 
>> +#ifndef HW_ARM_SMMUV3_H
>> +#define HW_ARM_SMMUV3_H
>> +
>> +#include "hw/arm/smmu-common.h"
>> +#include "hw/registerfields.h"
>> +
>> +#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
>> +
>> +#define SMMU_NREGS            0x200
>> +
>> +typedef struct SMMUQueue {
>> +     hwaddr base;
>> +     uint32_t prod;
>> +     uint32_t cons;
>> +     uint8_t entry_size;
>> +} SMMUQueue;
>> +
>> +typedef struct SMMUv3State {
>> +    SMMUState     smmu_state;
>> +
>> +    /* Local cache of most-frequently used registers */
>> +#define SMMU_FEATURE_2LVL_STE (1 << 0)
> 
> Minor thing, I think it would be better for this define to
> not be in the middle of the struct definition.
OK moved to smmuv3-internal in patch adding write ops.

Thanks

Eric
> 
>> +    uint32_t features;
>> +    uint8_t sid_size;
>> +    uint8_t sid_split;
>> +
>> +    uint32_t idr[6];
>> +    uint32_t iidr;
>> +    uint32_t cr[3];
>> +    uint32_t cr0ack;
>> +    uint32_t statusr;
>> +    uint32_t irq_ctrl;
>> +    uint32_t irq_ctrl_ack;
>> +    uint32_t gerror;
>> +    uint32_t gerrorn;
>> +    uint64_t gerror_irq_cfg0;
>> +    uint32_t gerror_irq_cfg1;
>> +    uint32_t gerror_irq_cfg2;
>> +    uint64_t strtab_base;
>> +    uint32_t strtab_base_cfg;
>> +    uint64_t eventq_irq_cfg0;
>> +    uint32_t eventq_irq_cfg1;
>> +    uint32_t eventq_irq_cfg2;
>> +
>> +    SMMUQueue eventq, cmdq;
>> +
>> +    qemu_irq     irq[4];
>> +} SMMUv3State;
>> +
>> +typedef enum {
>> +    SMMU_IRQ_EVTQ,
>> +    SMMU_IRQ_PRIQ,
>> +    SMMU_IRQ_CMD_SYNC,
>> +    SMMU_IRQ_GERROR,
>> +} SMMUIrq;
>> +
>> +typedef struct {
>> +    /*< private >*/
>> +    SMMUBaseClass smmu_base_class;
>> +    /*< public >*/
>> +
>> +    DeviceRealize parent_realize;
>> +    DeviceReset   parent_reset;
>> +} SMMUv3Class;
>> +
>> +#define TYPE_ARM_SMMUV3   "arm-smmuv3"
>> +#define ARM_SMMUV3(obj) OBJECT_CHECK(SMMUv3State, (obj), TYPE_ARM_SMMUV3)
>> +#define ARM_SMMUV3_CLASS(klass)                              \
>> +    OBJECT_CLASS_CHECK(SMMUv3Class, (klass), TYPE_ARM_SMMUV3)
>> +#define ARM_SMMUV3_GET_CLASS(obj) \
>> +     OBJECT_GET_CLASS(SMMUv3Class, (obj), TYPE_ARM_SMMUV3)
>> +
>> +#endif
> 
> thanks
> -- PMM
> 
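As an illustration of the const-array approach discussed above for the CIDR/PIDR registers, here is a minimal standalone sketch in plain C (not the actual QEMU code). The helper name id_reg_read and the ID_REGS_* macros are made up for the example. The values are taken from the quoted switch, except PIDR2, which the switch does not list and which is left as a 0 placeholder. The offsets assume the usual layout with PIDR4 at 0xFD0 and CIDR3 at 0xFFC, which is consistent with the 0xFD4 ... 0xFDC range used for PIDR5-7 in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: PIDR4..7, PIDR0..3, CIDR0..3, one 32-bit register each */
#define ID_REGS_BASE 0xFD0u
#define ID_REGS_END  0xFFCu

/*
 * Values copied from the quoted switch statement.
 * PIDR2 is not listed there, so 0 is only a placeholder (assumption).
 */
static const uint8_t id_regs[12] = {
    0x04, 0x00, 0x00, 0x00,   /* PIDR4, PIDR5, PIDR6, PIDR7 */
    0x84, 0xB4, 0x00, 0x10,   /* PIDR0, PIDR1, PIDR2 (placeholder), PIDR3 */
    0x0D, 0xF0, 0x05, 0xB1,   /* CIDR0, CIDR1, CIDR2, CIDR3 */
};

/* Hypothetical helper: addr must be 4-byte aligned and inside the ID block */
static uint32_t id_reg_read(uint32_t addr)
{
    assert(addr >= ID_REGS_BASE && addr <= ID_REGS_END && (addr & 3) == 0);
    return id_regs[(addr - ID_REGS_BASE) / 4];
}
```

In the read handler this would replace the individual case labels with a single range case that calls the helper.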

* Re: [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton
  2018-03-09 13:19     ` Auger Eric
@ 2018-03-09 13:37       ` Peter Maydell
  2018-03-09 13:49         ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-09 13:37 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 9 March 2018 at 13:19, Auger Eric <eric.auger@redhat.com> wrote:
> On 08/03/18 15:27, Peter Maydell wrote:
>> Consider specifically catching 8-byte accesses to non-64-bit registers?
>> This is CONSTRAINED UNPREDICTABLE (see spec section 6.2), and "one
>> of the registers is read/written and other half is RAZ/WI" is permitted
>> behaviour, but it does mean you need to be a little careful about not
>> letting the top 32 bits of val become non-zero for the 32-bit register
>> codepaths. Logging bad 64-bit accesses as LOG_GUEST_ERROR and making
>> them RAZ/WI might be nicer for guest software developers.
>
> I moved to ops with attrs and if a 64-bit access is attempted on
> something not a 64b reg base, I return an error + log a guest error.

Ah, you probably don't want to return MEMTX_ERROR, because that
becomes a guest CPU external-abort exception. An abort is listed
as one of the permitted constrained-unpredictable behaviours for
bad 64-bit accesses, but there is a note that "strongly recommends"
not to abort for cases where the registers might be used by software
associated with lower exception levels. Rather than trying to decide
which registers do or don't get to return MEMTX_ERROR, it's probably
easier just to RAZ/WI and return MEMTX_OK.

(We had to fix a bug like this in the gicv3 in commits f1945632b43e3
and 0cf09852015e when we started making MEMTX_ERROR generate aborts,
though in that case the spec is more definite that abort is not a
permitted behaviour.)

thanks
-- PMM
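A minimal standalone sketch of the RAZ/WI behaviour described above, assuming a toy register map with one 32-bit and one 64-bit register. TxResult, ToyRegs, toy_read and toy_write are stand-in names for this example only; they are not QEMU's MemTxResult or the real MemoryRegionOps callbacks. The guest_errors counter stands in for a LOG_GUEST_ERROR message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the transaction result codes (names are assumptions) */
typedef enum { TX_OK = 0, TX_ERROR = 1 } TxResult;

/* Toy register map: one 32-bit register and one 64-bit register */
#define REG_CR0         0x20u
#define REG_STRTAB_BASE 0x80u

typedef struct {
    uint32_t cr0;
    uint64_t strtab_base;
    unsigned guest_errors;   /* counts what would be LOG_GUEST_ERROR */
} ToyRegs;

/* Whitelist of offsets that may be accessed with a 64-bit transaction */
static bool is_64bit_reg(uint32_t addr)
{
    return addr == REG_STRTAB_BASE;
}

static TxResult toy_read(ToyRegs *s, uint32_t addr, unsigned size,
                         uint64_t *data)
{
    if (size == 8 && !is_64bit_reg(addr)) {
        s->guest_errors++;
        *data = 0;           /* read-as-zero */
        return TX_OK;        /* never abort the guest */
    }
    switch (addr) {
    case REG_CR0:
        *data = s->cr0;
        break;
    case REG_STRTAB_BASE:
        *data = size == 8 ? s->strtab_base : (uint32_t)s->strtab_base;
        break;
    case REG_STRTAB_BASE + 4:
        *data = s->strtab_base >> 32;
        break;
    default:
        s->guest_errors++;
        *data = 0;
        break;
    }
    return TX_OK;
}

static TxResult toy_write(ToyRegs *s, uint32_t addr, unsigned size,
                          uint64_t data)
{
    if (size == 8 && !is_64bit_reg(addr)) {
        s->guest_errors++;
        return TX_OK;        /* write-ignored, no abort */
    }
    switch (addr) {
    case REG_CR0:
        s->cr0 = (uint32_t)data;
        break;
    case REG_STRTAB_BASE:
        if (size == 8) {
            s->strtab_base = data;
        } else {
            s->strtab_base = (s->strtab_base & ~0xFFFFFFFFull) |
                             (uint32_t)data;
        }
        break;
    case REG_STRTAB_BASE + 4:
        s->strtab_base = (s->strtab_base & 0xFFFFFFFFull) |
                         ((uint64_t)(uint32_t)data << 32);
        break;
    default:
        s->guest_errors++;
        break;
    }
    return TX_OK;
}
```

Checking the 64-bit whitelist once at the top keeps the 32-bit code paths free of any risk that the upper half of the value becomes non-zero.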

* Re: [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton
  2018-03-09 13:37       ` Peter Maydell
@ 2018-03-09 13:49         ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-09 13:49 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 09/03/18 14:37, Peter Maydell wrote:
> On 9 March 2018 at 13:19, Auger Eric <eric.auger@redhat.com> wrote:
>> On 08/03/18 15:27, Peter Maydell wrote:
>>> Consider specifically catching 8-byte accesses to non-64-bit registers?
>>> This is CONSTRAINED UNPREDICTABLE (see spec section 6.2), and "one
>>> of the registers is read/written and other half is RAZ/WI" is permitted
>>> behaviour, but it does mean you need to be a little careful about not
>>> letting the top 32 bits of val become non-zero for the 32-bit register
>>> codepaths. Logging bad 64-bit accesses as LOG_GUEST_ERROR and making
>>> them RAZ/WI might be nicer for guest software developers.
>>
>> I moved to ops with attrs and if a 64-bit access is attempted on
>> something not a 64b reg base, I return an error + log a guest error.
> 
> Ah, you probably don't want to return MEMTX_ERROR, because that
> becomes a guest CPU external-abort exception. An abort is listed
> as one of the permitted constrained-unpredictable behaviours for
> bad 64-bit accesses, but there is a note that "strongly recommends"
> not to abort for cases where the registers might be used by software
> associated with lower exception levels. Rather than trying to decide
> which registers do or don't get to return MEMTX_ERROR, it's probably
> easier just to RAZ/WI and return MEMTX_OK.
> 
> (We had to fix a bug like this in the gicv3 in commits f1945632b43e3
> and 0cf09852015e when we started making MEMTX_ERROR generate aborts,
> though in that case the spec is more definite that abort is not a
> permitted behaviour.)

Yes, I saw those changes in the GIC code. I will check and fix this.

Thanks

Eric
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers
  2018-03-08 17:49   ` Peter Maydell
@ 2018-03-09 14:03     ` Auger Eric
  2018-03-09 14:18       ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-09 14:03 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 08/03/18 18:49, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> We introduce some helpers to handle wired IRQs and especially
>> GERROR interrupt. SMMU writes GERROR register on GERROR event
>> and SW acks GERROR interrupts by setting GERRORn.
>>
>> The Wired interrupts are edge sensitive hence the pulse usage.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v7 -> v8:
>> - remove SMMU_PENDING_GERRORS macro
>> - properly toggle gerror
>> - properly sanitize gerrorn write
>> ---
>>  hw/arm/smmuv3-internal.h | 10 ++++++++
>>  hw/arm/smmuv3.c          | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/arm/trace-events      |  3 +++
>>  3 files changed, 77 insertions(+)
>>
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> index 5be8303..40b39a1 100644
>> --- a/hw/arm/smmuv3-internal.h
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -152,4 +152,14 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>>      return extract64(r, offset << 3, 32);
>>  }
>>
>> +/* Interrupts */
>> +
>> +#define smmuv3_eventq_irq_enabled(s)                   \
>> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, EVENTQ_IRQEN))
>> +#define smmuv3_gerror_irq_enabled(s)                  \
>> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))
> 
> These are only ever used in smmuv3.c, so you can just move them
> to there (and make them inline functions, ideally).
smmuv3-internal.h contains helpers and macros that are only used by
smmuv3.c. I thought this would avoid putting too much code in smmuv3.c
and would help readability.

What is the exact benefit of turning those into inline functions
instead of macros? I am not objecting to the change; I am asking in
order to improve my coding style ;-)
> 
>> +
>> +void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
>> +void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
> 
> I guess these are global to avoid the compiler complaining about
> unused static functions at this point? If so, add a comment
> saying so, and flip them back to being static functions when
> their callers get added. (Or just add the callers here, if they're
> not too complicated.)

Yes, they become static in "hw/arm/smmuv3: Implement MMIO write
operations". I added a comment.
> 
>> +
>>  #endif
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index dc03c9e..8779d3f 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -30,6 +30,70 @@
>>  #include "hw/arm/smmuv3.h"
>>  #include "smmuv3-internal.h"
>>
>> +/**
>> + * smmuv3_trigger_irq - pulse @irq if enabled and update
>> + * GERROR register in case of GERROR interrupt
>> + *
>> + * @irq: irq type
>> + * @gerror_mask: mask of gerrors to toggle (relevant if @irq is GERROR)
>> + */
>> +void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
>> +{
>> +
>> +    bool pulse = false;
>> +
>> +    switch (irq) {
>> +    case SMMU_IRQ_EVTQ:
>> +        pulse = smmuv3_eventq_irq_enabled(s);
>> +        break;
>> +    case SMMU_IRQ_PRIQ:
>> +        error_setg(&error_fatal, "PRI not supported");
> 
> This should either assert() if it would be a bug in the rest
> of the smmu code, or LOG_UNIMP if the guest can trigger it.
replaced by LOG_UNIMP
> 
>> +        break;
>> +    case SMMU_IRQ_CMD_SYNC:
>> +        pulse = true;
>> +        break;
>> +    case SMMU_IRQ_GERROR:
>> +    {
>> +        uint32_t pending = s->gerror ^ s->gerrorn;
>> +        uint32_t new_gerrors = ~pending & gerror_mask;
>> +
>> +        if (!new_gerrors) {
>> +            /* only toggle non pending errors */
>> +            return;
>> +        }
>> +        s->gerror ^= new_gerrors;
>> +        trace_smmuv3_write_gerror(new_gerrors, s->gerror);
>> +
>> +        /* pulse the GERROR irq only if all previous gerrors were acked */
> 
> It's not entirely clear to me that this is correct; should
> we generate only one pulse if the implementation raises error A,
> and then later raises error B before software acknowledges A ?
> There's some language in 3.18 about the SMMU implementation
> being able to coalesce events and identical interrupts, but
> I think that would mean that we could skip raising the first
> pulse for error A if error B arrived sufficiently quickly after it.
> (Not something we're going to care about for a s/w model.)
I don't implement event merging at the moment.
> 
> I think the right behaviour is probably that we should pulse
> the interrupt if there are any new gerrors, which is to
> say to drop this !pending test:
I agree with your interpretation.


> 
>> +        pulse = smmuv3_gerror_irq_enabled(s) && !pending;
>> +        break;
>> +    }
>> +    }
>> +    if (pulse) {
>> +            trace_smmuv3_trigger_irq(irq);
>> +            qemu_irq_pulse(s->irq[irq]);
>> +    }
>> +}
>> +
>> +void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
>> +{
>> +    uint32_t pending = s->gerror ^ s->gerrorn;
>> +    uint32_t toggled = s->gerrorn ^ new_gerrorn;
>> +    uint32_t acked;
>> +
>> +    if (toggled & ~pending) {
>> +        qemu_log_mask(LOG_GUEST_ERROR,
>> +                      "guest toggles non pending errors = 0x%x\n",
>> +                      toggled & ~pending);
>> +    }
>> +
>> +    /* Make sure SW does not toggle irqs that are not active */
>> +    acked = toggled & pending;
>> +    s->gerrorn ^= acked;
>> +
> 
> I don't think this behaviour is correct. From the hardware's
> perspective, we should just take the value the user writes
> to SMMU_GERRORN and put it in the register (and update the
> status of the irq accordingly).

OK
> 
> It is CONSTRAINED UNPREDICTABLE whether we actually raise an
> interrupt if the guest toggles a field that corresponds to an
> inactive error, so we should just do whatever is easiest.
OK: I will do nothing except report a LOG_GUEST_ERROR ;-)

Thanks

Eric
> 
>> +    trace_smmuv3_write_gerrorn(acked, s->gerrorn);
>> +}
>> +
>>  static void smmuv3_init_regs(SMMUv3State *s)
>>  {
>>      /**
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index 64d2b9b..2ddae40 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -15,3 +15,6 @@ smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "base
>>
>>  #hw/arm/smmuv3.c
>>  smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
>> +smmuv3_trigger_irq(int irq) "irq=%d"
>> +smmuv3_write_gerror(uint32_t toggled, uint32_t gerror) "toggled=0x%x, new gerror=0x%x"
>> +smmuv3_write_gerrorn(uint32_t acked, uint32_t gerrorn) "acked=0x%x, new gerrorn=0x%x"
> 
> Capitalizing names of registers like GERROR in trace messages would
> make them match the convention in the SMMUv3 spec.
done

Thanks

Eric
> 
>> --
>> 2.5.5
>>
> 
> thanks
> -- PMM
> 
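To summarise the behaviour agreed in this exchange, here is a minimal standalone model of the GERROR/GERRORN toggle protocol. It is a sketch under assumptions, not the patch itself: ToyGerror and the toy_* functions are made-up names, the pulses counter stands in for qemu_irq_pulse(), and guest_errors stands in for a LOG_GUEST_ERROR message. An error is pending while its bit differs between GERROR and GERRORN; every newly raised error pulses the interrupt, and a GERRORN write is stored as written.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t gerror;        /* toggled by the (modelled) SMMU */
    uint32_t gerrorn;       /* written by software to acknowledge */
    bool irq_enabled;       /* stands in for IRQ_CTRL.GERROR_IRQEN */
    unsigned pulses;        /* stands in for qemu_irq_pulse() calls */
    unsigned guest_errors;  /* stands in for LOG_GUEST_ERROR messages */
} ToyGerror;

/* An error is pending while its GERROR and GERRORN bits differ */
static uint32_t toy_pending(const ToyGerror *s)
{
    return s->gerror ^ s->gerrorn;
}

/* Raise the errors in mask; errors that are already pending are skipped */
static void toy_raise_gerror(ToyGerror *s, uint32_t mask)
{
    uint32_t new_gerrors = ~toy_pending(s) & mask;

    if (!new_gerrors) {
        return;
    }
    s->gerror ^= new_gerrors;
    /* pulse for any new error, even if older ones are still unacked */
    if (s->irq_enabled) {
        s->pulses++;
    }
}

/* Software write to GERRORN: store the value, only report odd toggles */
static void toy_write_gerrorn(ToyGerror *s, uint32_t new_gerrorn)
{
    uint32_t toggled = s->gerrorn ^ new_gerrorn;

    if (toggled & ~toy_pending(s)) {
        s->guest_errors++;  /* guest toggled a bit with no active error */
    }
    s->gerrorn = new_gerrorn;
}
```

Storing the written value unchanged is the simplest choice among the permitted CONSTRAINED UNPREDICTABLE behaviours; the model does not try to correct a guest that toggles an inactive bit.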

* Re: [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers
  2018-03-09 14:03     ` Auger Eric
@ 2018-03-09 14:18       ` Peter Maydell
  2018-03-09 14:50         ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-09 14:18 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 9 March 2018 at 14:03, Auger Eric <eric.auger@redhat.com> wrote:
> On 08/03/18 18:49, Peter Maydell wrote:
>>> +#define smmuv3_eventq_irq_enabled(s)                   \
>>> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, EVENTQ_IRQEN))
>>> +#define smmuv3_gerror_irq_enabled(s)                  \
>>> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))
>>
>> These are only ever used in smmuv3.c, so you can just move them
>> to there (and make them inline functions, ideally).
> smmuv3-internal.h contains helpers and macros that are only used by
> smmuv3.c. I thought this would avoid putting too much code in smmuv3.c
> and would help readability.
>
> What is the exact benefit of turning those into inline functions
> instead of macros? I am not objecting to the change; I am asking in
> order to improve my coding style ;-)

You get the benefit of type checking (and it self-documents that
the macros want to be passed an SMMUv3State*). You don't have to
worry about trying to write your macro to not evaluate arguments multiple
times. These are one-liners so they're fairly easy to read,
but when you get to 3 or 4 lines you end up with a lot of '\'
lines and the inline function is I think more clearly better.
I prefer to save macros for cases where you really need a macro
and can't get the same effect with a function.

thanks
-- PMM
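A small standalone sketch of the two points above, type checking and multiple evaluation. All names here (ToyState, TOY_MAX, toy_max, bump) are invented for the illustration and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t irq_ctrl;
} ToyState;

#define TOY_GERROR_IRQEN (1u << 0)

/*
 * Inline function: the prototype documents that a ToyState pointer is
 * expected, and the compiler diagnoses a mismatched argument type.
 */
static inline bool toy_gerror_irq_enabled(const ToyState *s)
{
    return (s->irq_ctrl & TOY_GERROR_IRQEN) != 0;
}

/* Macro version of max: the selected argument is evaluated twice */
#define TOY_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Function version of max: each argument is evaluated exactly once */
static inline int toy_max(int a, int b)
{
    return a > b ? a : b;
}

/* Helper with a side effect, to make the double evaluation visible */
static int calls;

static int bump(void)
{
    return ++calls;
}
```

With calls starting at 0, TOY_MAX(bump(), 0) calls bump() twice and yields 2, while toy_max(bump(), 0) calls it once and yields 1.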

* Re: [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers
  2018-03-09 14:18       ` Peter Maydell
@ 2018-03-09 14:50         ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-09 14:50 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, eric.auger.pro

Hi Peter,

On 09/03/18 15:18, Peter Maydell wrote:
> On 9 March 2018 at 14:03, Auger Eric <eric.auger@redhat.com> wrote:
>> On 08/03/18 18:49, Peter Maydell wrote:
>>>> +#define smmuv3_eventq_irq_enabled(s)                   \
>>>> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, EVENTQ_IRQEN))
>>>> +#define smmuv3_gerror_irq_enabled(s)                  \
>>>> +    (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))
>>>
>>> These are only ever used in smmuv3.c, so you can just move them
>>> to there (and make them inline functions, ideally).
>> smmuv3-internal.h contains helpers and macros that are only used by
>> smmuv3.c. I thought this would avoid putting too much code in smmuv3.c
>> and would help readability.
>>
>> What is the exact benefit of turning those into inline functions
>> instead of macros? I am not objecting to the change; I am asking in
>> order to improve my coding style ;-)
> 
> You get the benefit of type checking (and it self-documents that
> the macros want to be passed an SMMUv3State*). You don't have to
> worry about trying to write your macro to not evaluate arguments multiple
> times. These are one-liners so they're fairly easy to read,
> but when you get to 3 or 4 lines you end up with a lot of '\'
> lines and the inline function is I think more clearly better.
> I prefer to save macros for cases where you really need a macro
> and can't get the same effect with a function.

OK. Thank you for the explanation

Eric
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 07/14] hw/arm/smmuv3: Implement MMIO write operations
  2018-03-08 18:37   ` Peter Maydell
@ 2018-03-09 16:42     ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-09 16:42 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 08/03/18 19:37, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> Now we have relevant helpers for queue and irq
>> management, let's implement MMIO write operations.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v7 -> v8:
>> - state in the commit message that invalidation commands
>>   are not yet handled.
>> - use new queue helpers
>> - do not decode unhandled commands at this stage
>> ---
>>  hw/arm/smmuv3-internal.h |  24 +++++++---
>>  hw/arm/smmuv3.c          | 111 +++++++++++++++++++++++++++++++++++++++++++++--
>>  hw/arm/trace-events      |   6 +++
>>  3 files changed, 132 insertions(+), 9 deletions(-)
>>
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> index c0771ce..5af97ae 100644
>> --- a/hw/arm/smmuv3-internal.h
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -152,6 +152,25 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>>      return extract64(r, offset << 3, 32);
>>  }
>>
>> +static inline void smmu_write64(uint64_t *r, unsigned offset,
>> +                                unsigned size, uint64_t value)
>> +{
>> +    if (size == 8 && !offset) {
>> +        *r  = value;
>> +    }
>> +
>> +    /* 32 bit access */
>> +
>> +    if (offset && offset != 4)  {
>> +        qemu_log_mask(LOG_GUEST_ERROR,
>> +                      "SMMUv3 MMIO write: bad offset/size %u/%u\n",
>> +                      offset, size);
>> +        return ;
>> +    }
>> +
>> +    *r = deposit64(*r, offset << 3, 32, value);
> 
> Similar remarks apply to this helper as to smmu_read64 in the earlier patch.
removed
> 
>> +}
>> +
>>  /* Interrupts */
>>
>>  #define smmuv3_eventq_irq_enabled(s)                   \
>> @@ -159,9 +178,6 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>>  #define smmuv3_gerror_irq_enabled(s)                  \
>>      (FIELD_EX32(s->irq_ctrl, IRQ_CTRL, GERROR_IRQEN))
>>
>> -void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
>> -void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
>> -
>>  /* Queue Handling */
>>
>>  #define LOG2SIZE(q)        extract64((q)->base, 0, 5)
>> @@ -310,6 +326,4 @@ enum { /* Command completion notification */
>>              addr;                                             \
>>          })
>>
>> -int smmuv3_cmdq_consume(SMMUv3State *s);
>> -
>>  #endif
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 0b57215..fcfdbb0 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -37,7 +37,8 @@
>>   * @irq: irq type
>>   * @gerror_mask: mask of gerrors to toggle (relevant if @irq is GERROR)
>>   */
>> -void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
>> +static void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq,
>> +                               uint32_t gerror_mask)
>>  {
>>
>>      bool pulse = false;
>> @@ -75,7 +76,7 @@ void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask)
>>      }
>>  }
>>
>> -void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
>> +static void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
>>  {
>>      uint32_t pending = s->gerror ^ s->gerrorn;
>>      uint32_t toggled = s->gerrorn ^ new_gerrorn;
>> @@ -199,7 +200,7 @@ static void smmuv3_init_regs(SMMUv3State *s)
>>      s->sid_split = 0;
>>  }
>>
>> -int smmuv3_cmdq_consume(SMMUv3State *s)
>> +static int smmuv3_cmdq_consume(SMMUv3State *s)
>>  {
>>      SMMUCmdError cmd_error = SMMU_CERROR_NONE;
>>      SMMUQueue *q = &s->cmdq;
>> @@ -298,7 +299,109 @@ int smmuv3_cmdq_consume(SMMUv3State *s)
>>  static void smmu_write_mmio(void *opaque, hwaddr addr,
>>                              uint64_t val, unsigned size)
>>  {
>> -    /* not yet implemented */
>> +    SMMUState *sys = opaque;
>> +    SMMUv3State *s = ARM_SMMUV3(sys);
>> +
>> +    /* CONSTRAINED UNPREDICTABLE choice to have page0/1 be exact aliases */
>> +    addr &= ~0x10000;
>> +
>> +    if (size != 4 && size != 8) {
>> +        qemu_log_mask(LOG_GUEST_ERROR,
>> +                      "SMMUv3 MMIO write: bad size %u\n", size);
>> +    }
> 
> As with read, this can never happen so you don't need to check for it.
> 
> As with read, probably better to explicitly whitelist the 64-bit
> accessible offsets, and LOG_GUEST_ERROR and write-ignore the others.
done
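
For illustration, a minimal sketch of the explicit whitelist suggested
above. The ex_* names and the offsets are assumptions made for this
example (check the offsets against the SMMUv3 register map); this is
not the code from the next revision:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative offsets only, to be checked against the register map. */
enum {
    EX_GERROR_IRQ_CFG0 = 0x68,
    EX_STRTAB_BASE     = 0x80,
    EX_CMDQ_BASE       = 0x90,
    EX_EVENTQ_BASE     = 0xa0,
    EX_EVENTQ_IRQ_CFG0 = 0xb0,
};

/* True if @offset is the start of a register that accepts 64-bit access. */
static bool ex_is_64bit_reg(uint64_t offset)
{
    switch (offset) {
    case EX_GERROR_IRQ_CFG0:
    case EX_STRTAB_BASE:
    case EX_CMDQ_BASE:
    case EX_EVENTQ_BASE:
    case EX_EVENTQ_IRQ_CFG0:
        return true;
    default:
        return false;
    }
}

/* True if the write should be applied, false if it is write-ignored. */
static bool ex_write_allowed(uint64_t offset, unsigned size)
{
    /* Per the review, other sizes cannot reach the handler. */
    assert(size == 4 || size == 8);

    if (size == 8) {
        return ex_is_64bit_reg(offset);
    }
    return true;
}
```

A caller would log a LOG_GUEST_ERROR and return early whenever
ex_write_allowed() is false, before reaching the register switch.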
> 
>> +
>> +    trace_smmuv3_write_mmio(addr, val, size);
>> +
>> +    switch (addr) {
>> +    case A_CR0:
>> +        s->cr[0] = val;
>> +        s->cr0ack = val;
> 
> Spec says "reserved fields in SMMU_CR0 are not reflected in SMMU_CR0ACK",
> so you probably need to mask those out.
OK
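
A minimal sketch of the masking, assuming a reserved-bit mask for CR0.
The mask value and the ex_* names are assumptions for this example and
must be checked against the spec:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed reserved bits of CR0; verify against the specification. */
#define EX_CR0_RESERVED 0xfffffc20u

typedef struct ExRegs {
    uint32_t cr0;
    uint32_t cr0ack;
} ExRegs;

/* Store the written value, but only reflect non-reserved bits in CR0ACK. */
static void ex_write_cr0(ExRegs *r, uint32_t val)
{
    assert(r != NULL);
    r->cr0 = val;
    r->cr0ack = val & ~EX_CR0_RESERVED;
}
```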
> 
>> +        /* in case the command queue has been enabled */
>> +        smmuv3_cmdq_consume(s);
>> +        return;
>> +    case A_CR1:
>> +        s->cr[1] = val;
>> +        return;
>> +    case A_CR2:
>> +        s->cr[2] = val;
>> +        return;
>> +    case A_IRQ_CTRL:
>> +        s->irq_ctrl = val;
>> +        return;
>> +    case A_GERRORN:
>> +        smmuv3_write_gerrorn(s, val);
>> +        /*
>> +         * By acknowledging the CMDQ_ERR, SW may notify cmds can
>> +         * be processed again
>> +         */
>> +        smmuv3_cmdq_consume(s);
>> +        return;
>> +    case A_GERROR_IRQ_CFG0: /* 64b */
>> +        smmu_write64(&s->gerror_irq_cfg0, 0, size, val);
>> +        return;
>> +    case A_GERROR_IRQ_CFG0 + 4:
>> +        smmu_write64(&s->gerror_irq_cfg0, 4, size, val);
>> +        return;
>> +    case A_GERROR_IRQ_CFG1:
>> +        s->gerror_irq_cfg1 = val;
>> +        return;
>> +    case A_GERROR_IRQ_CFG2:
>> +        s->gerror_irq_cfg2 = val;
>> +        return;
>> +    case A_STRTAB_BASE: /* 64b */
>> +        smmu_write64(&s->strtab_base, 0, size, val);
>> +        return;
>> +    case A_STRTAB_BASE + 4:
>> +        smmu_write64(&s->strtab_base, 4, size, val);
>> +        return;
>> +    case A_STRTAB_BASE_CFG:
>> +        s->strtab_base_cfg = val;
>> +        if (FIELD_EX32(val, STRTAB_BASE_CFG, FMT) == 1) {
>> +            s->sid_split = FIELD_EX32(val, STRTAB_BASE_CFG, SPLIT);
>> +            s->features |= SMMU_FEATURE_2LVL_STE;
>> +        }
>> +        return;
>> +    case A_CMDQ_BASE: /* 64b */
>> +        smmu_write64(&s->cmdq.base, 0, size, val);
>> +        return;
>> +    case A_CMDQ_BASE + 4: /* 64b */
>> +        smmu_write64(&s->cmdq.base, 4, size, val);
>> +        return;
>> +    case A_CMDQ_PROD:
>> +        s->cmdq.prod = val;
>> +        smmuv3_cmdq_consume(s);
>> +        return;
>> +    case A_CMDQ_CONS:
>> +        s->cmdq.cons = val;
>> +        return;
>> +    case A_EVENTQ_BASE: /* 64b */
>> +        smmu_write64(&s->eventq.base, 0, size, val);
>> +        return;
>> +    case A_EVENTQ_BASE + 4:
>> +        smmu_write64(&s->eventq.base, 4, size, val);
>> +        return;
>> +    case A_EVENTQ_PROD:
>> +        s->eventq.prod = val;
>> +        return;
>> +    case A_EVENTQ_CONS:
>> +        s->eventq.cons = val;
>> +        return;
>> +    case A_EVENTQ_IRQ_CFG0: /* 64b */
>> +        s->eventq.prod = val;
>> +        smmu_write64(&s->eventq_irq_cfg0, 0, size, val);
>> +        return;
>> +    case A_EVENTQ_IRQ_CFG0 + 4:
>> +        smmu_write64(&s->eventq_irq_cfg0, 4, size, val);
>> +        return;
>> +    case A_EVENTQ_IRQ_CFG1:
>> +        s->eventq_irq_cfg1 = val;
>> +        return;
>> +    case A_EVENTQ_IRQ_CFG2:
>> +        s->eventq_irq_cfg2 = val;
>> +        return;
>> +    default:
>> +        error_report("%s unhandled access at 0x%"PRIx64, __func__, addr);
> 
> Tracepoint or LOG_GUEST_ERROR, not error_report(), please.
OK

Thanks

Eric
> 
>> +    }
>>  }
>>
>>  static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index 1c5105d..ed5dce0 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -22,3 +22,9 @@ smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
>>  smmuv3_cmdq_consume(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod=%d cons=%d prod.wrap=%d cons.wrap=%d"
>>  smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
>>  smmuv3_cmdq_consume_out(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod:%d, cons:%d, prod_wrap:%d, cons_wrap:%d "
>> +smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
>> +smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
>> +smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
>> +smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx val64:0x%lx"
>> +smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>> +smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
> 
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 06/14] hw/arm/smmuv3: Queue helpers
  2018-03-08 18:28   ` Peter Maydell
@ 2018-03-09 16:43     ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-09 16:43 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 08/03/18 19:28, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> We introduce helpers to read/write into the command and event
>> circular queues.
>>
>> smmuv3_write_eventq and smmuv3_cmdq_consume will become static
>> in subsequent patches.
>>
>> Invalidation commands are not yet dealt with. We do not cache
>> data that need to be invalidated. This will change with vhost
>> integration.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v8 -> v9:
>> - fix CMD_SSID & CMD_ADDR + some renamings
>> - do cons increment after the execution of the command
>> - add Q_INCONSISTENT()
>>
>> v7 -> v8
>> - use address_space_rw
>> - helpers inspired from spec
>> ---
>>  hw/arm/smmuv3-internal.h | 150 +++++++++++++++++++++++++++++++++++++++++++
>>  hw/arm/smmuv3.c          | 162 +++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/arm/trace-events      |   4 ++
>>  3 files changed, 316 insertions(+)
>>
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> index 40b39a1..c0771ce 100644
>> --- a/hw/arm/smmuv3-internal.h
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -162,4 +162,154 @@ static inline uint64_t smmu_read64(uint64_t r, unsigned offset,
>>  void smmuv3_trigger_irq(SMMUv3State *s, SMMUIrq irq, uint32_t gerror_mask);
>>  void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t gerrorn);
>>
>> +/* Queue Handling */
>> +
>> +#define LOG2SIZE(q)        extract64((q)->base, 0, 5)
>> +#define BASE(q)            ((q)->base & SMMU_BASE_ADDR_MASK)
renamed to Q_LOG2SIZE and Q_BASE respectively
> 
> These are both very generic names for things in header files.
> 
> Looking at the required behaviour of the *_BASE registers,
> if the LOG2SIZE field is written to a value larger than the maximum,
> it is supposed to read back as the value written but must behave
> as if it was set to the maximum. That suggests to me that your
> SMMUQueue struct should have a "log2size" field which is set when
> the guest writes to *_BASE (which is where you can cap it to the
> max value).
done
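
A minimal sketch of that idea: keep the raw register value for
read-back and cache a capped log2size next to it. The struct layout,
the ex_* names and the maximum of 19 are assumptions made for this
example:

```c
#include <assert.h>
#include <stdint.h>

#define EX_QUEUE_MAX_LOG2SIZE 19u   /* hypothetical implementation maximum */

typedef struct ExQueue {
    uint64_t base;      /* raw register value, read back as written */
    uint8_t  log2size;  /* effective size, capped at the maximum */
} ExQueue;

/* Handle a guest write to a *_BASE register. */
static void ex_write_base(ExQueue *q, uint64_t val)
{
    uint8_t requested = val & 0x1f;   /* LOG2SIZE is bits [4:0] of base */

    q->base = val;
    q->log2size = requested > EX_QUEUE_MAX_LOG2SIZE
                  ? EX_QUEUE_MAX_LOG2SIZE : requested;
    assert(q->log2size <= EX_QUEUE_MAX_LOG2SIZE);
}
```

The index and wrap masks can then be derived from q->log2size rather
than being re-extracted from the base register on every use.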
> 
>> +#define WRAP_MASK(q)       (1 << LOG2SIZE(q))
>> +#define INDEX_MASK(q)      ((1 << LOG2SIZE(q)) - 1)
>> +#define WRAP_INDEX_MASK(q) ((1 << (LOG2SIZE(q) + 1)) - 1)
> 
> WRAP_INDEX_MASK is unused (but see below for a possible use)
kept and adopted your suggestions below
> 
>> +
>> +#define Q_CONS_ENTRY(q)  (BASE(q) + \
>> +                          (q)->entry_size * ((q)->cons & INDEX_MASK(q)))
>> +#define Q_PROD_ENTRY(q)  (BASE(q) + \
>> +                          (q)->entry_size * ((q)->prod & INDEX_MASK(q)))
>> +
>> +#define Q_CONS(q) ((q)->cons & INDEX_MASK(q))
>> +#define Q_PROD(q) ((q)->prod & INDEX_MASK(q))
> 
> If you put these a bit earlier you can use them in the definitions
> of Q_CONS_ENTRY and Q_PROD_ENTRY.
done
> 
>> +
>> +#define Q_CONS_WRAP(q) (((q)->cons & WRAP_MASK(q)) >> LOG2SIZE(q))
>> +#define Q_PROD_WRAP(q) (((q)->prod & WRAP_MASK(q)) >> LOG2SIZE(q))
>> +
>> +#define Q_FULL(q) \
>> +    (((((q)->cons) & INDEX_MASK(q)) == \
>> +      (((q)->prod) & INDEX_MASK(q))) && \
>> +     ((((q)->cons) & WRAP_MASK(q)) != \
>> +      (((q)->prod) & WRAP_MASK(q))))
> 
> You could write this as
>    ((cons ^ prod) & WRAP_INDEX_MASK) == WRAP_MASK
done
> 
>> +
>> +#define Q_EMPTY(q) \
>> +    (((((q)->cons) & INDEX_MASK(q)) == \
>> +      (((q)->prod) & INDEX_MASK(q))) && \
>> +     ((((q)->cons) & WRAP_MASK(q)) == \
>> +      (((q)->prod) & WRAP_MASK(q))))
> 
> and this as
>    (cons & WRAP_INDEX_MASK) == (prod & WRAP_INDEX_MASK)
done
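
For reference, the two simplified tests written as inline functions
rather than macros. This is only a sketch: the ExQueue type with a
cached log2size and the ex_* names are assumptions for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ExQueue {
    uint32_t prod;
    uint32_t cons;
    uint8_t  log2size;
} ExQueue;

/* The wrap bit sits just above the index bits. */
static inline uint32_t ex_wrap_mask(const ExQueue *q)
{
    assert(q->log2size < 31);
    return 1u << q->log2size;
}

/* Wrap bit plus index bits. */
static inline uint32_t ex_wrap_index_mask(const ExQueue *q)
{
    assert(q->log2size < 31);
    return (1u << (q->log2size + 1)) - 1;
}

/* Full: same index, different wrap bit. */
static inline bool ex_queue_full(const ExQueue *q)
{
    return ((q->cons ^ q->prod) & ex_wrap_index_mask(q)) == ex_wrap_mask(q);
}

/* Empty: same index, same wrap bit. */
static inline bool ex_queue_empty(const ExQueue *q)
{
    uint32_t mask = ex_wrap_index_mask(q);

    return (q->cons & mask) == (q->prod & mask);
}
```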
> 
> (or as ((cons ^ prod) & WRAP_INDEX_MASK) == 0, but that's unnecessarily
> obscure I think.)
> 
> 
> This is all a bit macro-heavy. Do these really all need to be macros
> rather than functions?
> 
>> +
>> +#define Q_INCONSISTENT(q) \
>> +((((((q)->prod) & INDEX_MASK(q)) > (((q)->cons) & INDEX_MASK(q))) && \
>> +((((q)->prod) & WRAP_MASK(q)) != (((q)->cons) & WRAP_MASK(q)))) || \
>> +(((((q)->prod) & INDEX_MASK(q)) < (((q)->cons) & INDEX_MASK(q))) && \
>> +((((q)->prod) & WRAP_MASK(q)) == (((q)->cons) & WRAP_MASK(q))))) \
>> +
> 
> This never seems to be used. (Also it has a stray trailing '\',
> and isn't indented very clearly.)
removed
> 
>> +#define SMMUV3_CMDQ_ENABLED(s) \
>> +     (FIELD_EX32(s->cr[0], CR0, CMDQEN))
>> +
>> +#define SMMUV3_EVENTQ_ENABLED(s) \
>> +     (FIELD_EX32(s->cr[0], CR0, EVENTQEN))

Those were moved to static inline functions
>> +
>> +static inline void smmu_write_cmdq_err(SMMUv3State *s, uint32_t err_type)
>> +{
>> +    s->cmdq.cons = FIELD_DP32(s->cmdq.cons, CMDQ_CONS, ERR, err_type);
>> +}
>> +
>> +void smmuv3_write_eventq(SMMUv3State *s, Evt *evt);
>> +
>> +/* Commands */
>> +
>> +enum {
>> +    SMMU_CMD_PREFETCH_CONFIG = 0x01,
>> +    SMMU_CMD_PREFETCH_ADDR,
>> +    SMMU_CMD_CFGI_STE,
>> +    SMMU_CMD_CFGI_STE_RANGE,
>> +    SMMU_CMD_CFGI_CD,
>> +    SMMU_CMD_CFGI_CD_ALL,
>> +    SMMU_CMD_CFGI_ALL,
>> +    SMMU_CMD_TLBI_NH_ALL     = 0x10,
>> +    SMMU_CMD_TLBI_NH_ASID,
>> +    SMMU_CMD_TLBI_NH_VA,
>> +    SMMU_CMD_TLBI_NH_VAA,
>> +    SMMU_CMD_TLBI_EL3_ALL    = 0x18,
>> +    SMMU_CMD_TLBI_EL3_VA     = 0x1a,
>> +    SMMU_CMD_TLBI_EL2_ALL    = 0x20,
>> +    SMMU_CMD_TLBI_EL2_ASID,
>> +    SMMU_CMD_TLBI_EL2_VA,
>> +    SMMU_CMD_TLBI_EL2_VAA,  /* 0x23 */
>> +    SMMU_CMD_TLBI_S12_VMALL  = 0x28,
>> +    SMMU_CMD_TLBI_S2_IPA     = 0x2a,
>> +    SMMU_CMD_TLBI_NSNH_ALL   = 0x30,
>> +    SMMU_CMD_ATC_INV         = 0x40,
>> +    SMMU_CMD_PRI_RESP,
>> +    SMMU_CMD_RESUME          = 0x44,
>> +    SMMU_CMD_STALL_TERM,
>> +    SMMU_CMD_SYNC,          /* 0x46 */
>> +};
>> +
>> +static const char *cmd_stringify[] = {
>> +    [SMMU_CMD_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
>> +    [SMMU_CMD_PREFETCH_ADDR]   = "SMMU_CMD_PREFETCH_ADDR",
>> +    [SMMU_CMD_CFGI_STE]        = "SMMU_CMD_CFGI_STE",
>> +    [SMMU_CMD_CFGI_STE_RANGE]  = "SMMU_CMD_CFGI_STE_RANGE",
>> +    [SMMU_CMD_CFGI_CD]         = "SMMU_CMD_CFGI_CD",
>> +    [SMMU_CMD_CFGI_CD_ALL]     = "SMMU_CMD_CFGI_CD_ALL",
>> +    [SMMU_CMD_CFGI_ALL]        = "SMMU_CMD_CFGI_ALL",
>> +    [SMMU_CMD_TLBI_NH_ALL]     = "SMMU_CMD_TLBI_NH_ALL",
>> +    [SMMU_CMD_TLBI_NH_ASID]    = "SMMU_CMD_TLBI_NH_ASID",
>> +    [SMMU_CMD_TLBI_NH_VA]      = "SMMU_CMD_TLBI_NH_VA",
>> +    [SMMU_CMD_TLBI_NH_VAA]     = "SMMU_CMD_TLBI_NH_VAA",
>> +    [SMMU_CMD_TLBI_EL3_ALL]    = "SMMU_CMD_TLBI_EL3_ALL",
>> +    [SMMU_CMD_TLBI_EL3_VA]     = "SMMU_CMD_TLBI_EL3_VA",
>> +    [SMMU_CMD_TLBI_EL2_ALL]    = "SMMU_CMD_TLBI_EL2_ALL",
>> +    [SMMU_CMD_TLBI_EL2_ASID]   = "SMMU_CMD_TLBI_EL2_ASID",
>> +    [SMMU_CMD_TLBI_EL2_VA]     = "SMMU_CMD_TLBI_EL2_VA",
>> +    [SMMU_CMD_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
>> +    [SMMU_CMD_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
>> +    [SMMU_CMD_TLBI_S2_IPA]     = "SMMU_CMD_TLBI_S2_IPA",
>> +    [SMMU_CMD_TLBI_NSNH_ALL]   = "SMMU_CMD_TLBI_NSNH_ALL",
>> +    [SMMU_CMD_ATC_INV]         = "SMMU_CMD_ATC_INV",
>> +    [SMMU_CMD_PRI_RESP]        = "SMMU_CMD_PRI_RESP",
>> +    [SMMU_CMD_RESUME]          = "SMMU_CMD_RESUME",
>> +    [SMMU_CMD_STALL_TERM]      = "SMMU_CMD_STALL_TERM",
>> +    [SMMU_CMD_SYNC]            = "SMMU_CMD_SYNC",
>> +};
>> +
>> +#define SMMU_CMD_STRING(type) (                                      \
>> +(type < ARRAY_SIZE(cmd_stringify)) ? cmd_stringify[type] : "UNKNOWN" \
>> +)
> 
> If this was a function you'd know what the type of 'type' is
> and so whether it needed to have a >= 0 check on it. Also it
> will hand you a NULL pointer for a value that's inside the
> array size but not initialized, like 0x24.
OK
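
A sketch of the lookup as a function, so that the argument type is
explicit and holes in the table fall back to "UNKNOWN". Only three
table entries are reproduced here, and the ex_* names are assumptions
for the example:

```c
#include <stddef.h>
#include <stdint.h>

enum {
    EX_CMD_PREFETCH_CONFIG = 0x01,
    EX_CMD_TLBI_EL2_VAA    = 0x23,
    EX_CMD_TLBI_S12_VMALL  = 0x28,
};

/* Sparse table: unlisted indexes below the last entry are NULL. */
static const char *const ex_cmd_names[] = {
    [EX_CMD_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
    [EX_CMD_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
    [EX_CMD_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
};

/* Never returns NULL: out-of-range and unlisted types map to "UNKNOWN". */
static const char *ex_cmd_string(uint32_t type)
{
    const size_t n = sizeof(ex_cmd_names) / sizeof(ex_cmd_names[0]);
    const char *name = type < n ? ex_cmd_names[type] : NULL;

    return name ? name : "UNKNOWN";
}
```

Because the parameter is unsigned, no ">= 0" check is needed; 0x24 is
inside the array bounds but unlisted, so it takes the NULL fallback.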
> 
>> +
>> +/* CMDQ fields */
>> +
>> +typedef enum {
>> +    SMMU_CERROR_NONE = 0,
>> +    SMMU_CERROR_ILL,
>> +    SMMU_CERROR_ABT,
>> +    SMMU_CERROR_ATC_INV_SYNC,
>> +} SMMUCmdError;
>> +
>> +enum { /* Command completion notification */
>> +    CMD_SYNC_SIG_NONE,
>> +    CMD_SYNC_SIG_IRQ,
>> +    CMD_SYNC_SIG_SEV,
>> +};
>> +
>> +#define CMD_TYPE(x)         extract32((x)->word[0], 0 , 8)
>> +#define CMD_SSEC(x)         extract32((x)->word[0], 10, 1)
>> +#define CMD_SSV(x)          extract32((x)->word[0], 11, 1)
>> +#define CMD_RESUME_AC(x)    extract32((x)->word[0], 12, 1)
>> +#define CMD_RESUME_AB(x)    extract32((x)->word[0], 13, 1)
>> +#define CMD_SYNC_CS(x)      extract32((x)->word[0], 12, 2)
>> +#define CMD_SSID(x)         extract32((x)->word[0], 12, 20)
>> +#define CMD_SID(x)          ((x)->word[1])
>> +#define CMD_VMID(x)         extract32((x)->word[1], 0 , 16)
>> +#define CMD_ASID(x)         extract32((x)->word[1], 16, 16)
>> +#define CMD_RESUME_STAG(x)  extract32((x)->word[2], 0 , 16)
>> +#define CMD_RESP(x)         extract32((x)->word[2], 11, 2)
>> +#define CMD_LEAF(x)         extract32((x)->word[2], 0 , 1)
>> +#define CMD_STE_RANGE(x)    extract32((x)->word[2], 0 , 5)
>> +#define CMD_ADDR(x) ({                                        \
>> +            uint64_t high = (uint64_t)(x)->word[3];           \
>> +            uint64_t low = extract32((x)->word[2], 12, 20);    \
>> +            uint64_t addr = high << 32 | (low << 12);         \
>> +            addr;                                             \
>> +        })
>> +
>> +int smmuv3_cmdq_consume(SMMUv3State *s);
>> +
>>  #endif
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 8779d3f..0b57215 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -94,6 +94,72 @@ void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
>>      trace_smmuv3_write_gerrorn(acked, s->gerrorn);
>>  }
>>
>> +static uint32_t queue_index_inc(uint32_t val,
>> +                                uint32_t qidx_mask, uint32_t qwrap_mask)
>> +{
>> +    uint32_t i = (val + 1) & qidx_mask;
>> +
>> +    if (i <= (val & qidx_mask)) {
>> +        i = ((val & qwrap_mask) ^ qwrap_mask) | i;
>> +    } else {
>> +        i = (val & qwrap_mask) | i;
>> +    }
>> +    return i;
> 
> This is unnecessarily complicated -- an index increment is just
>    val = (val + 1) & INDEX_WRAP_MASK;
> which will automatically flip the wrap bit as required.
> 
OK
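
A minimal sketch of the simplified increment. Preserving the bits above
the wrap bit goes beyond the one-liner suggested above and is an
assumption of this example, as is the ex_ name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Increment the index held in the low bits of @val. The carry out of
 * the index bits flips the wrap bit; higher bits are left untouched.
 */
static uint32_t ex_queue_index_inc(uint32_t val, unsigned log2size)
{
    uint32_t mask;

    assert(log2size < 31);
    mask = (1u << (log2size + 1)) - 1;   /* wrap bit + index bits */

    return (val & ~mask) | ((val + 1) & mask);
}
```

With log2size = 3, index 7 with the wrap bit clear becomes index 0 with
the wrap bit set, and the next wrap-around clears the wrap bit again.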
>> +}
>> +
>> +static inline void queue_prod_incr(SMMUQueue *q)
>> +{
>> +    q->prod = queue_index_inc(q->prod, INDEX_MASK(q), WRAP_MASK(q));
> 
> Doesn't this trash the ERR code in bits [30:24], or are you
> keeping that somewhere else for efficiency?
In case there is an error we don't increment. But I am switching to
deposit32 anyway, as it looks saner.
> 
>> +}
>> +
>> +static inline void queue_cons_incr(SMMUQueue *q)
>> +{
>> +    q->cons = queue_index_inc(q->cons, INDEX_MASK(q), WRAP_MASK(q));
>> +}
>> +
>> +static inline MemTxResult queue_read(SMMUQueue *q, void *data)
>> +{
>> +    dma_addr_t addr = Q_CONS_ENTRY(q);
>> +
>> +    return dma_memory_read(&address_space_memory, addr,
>> +                           (uint8_t *)data, q->entry_size);
> 
> Does the compiler complain if you don't provide this cast?
no
> 
>> +}
>> +
>> +static void queue_write(SMMUQueue *q, void *data)
>> +{
>> +    dma_addr_t addr = Q_PROD_ENTRY(q);
>> +    MemTxResult ret;
>> +
>> +    ret = dma_memory_write(&address_space_memory, addr,
>> +                           (uint8_t *)data, q->entry_size);
>> +    if (ret != MEMTX_OK) {
>> +        return;
> 
> Shouldn't we record or return this error to the caller,
> like queue_read() does, rather than throwing it away?
> I think that for the event queue (which is the only user
> here ) this should cause an EVENTQ_ABT_ERR.
done
> 
>> +    }
>> +
>> +    queue_prod_incr(q);
>> +}
>> +
>> +void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
>> +{
>> +    SMMUQueue *q = &s->eventq;
>> +    bool q_empty = Q_EMPTY(q);
>> +    bool q_full = Q_FULL(q);
> 
> You only use these once each, and they're not very complicated
> expressions, so you might as well just have the uses be
> "if (Q_FULL(q)) { ..." &c.
done
> 
>> +
>> +    if (!SMMUV3_EVENTQ_ENABLED(s)) {
>> +        return;
>> +    }
>> +
>> +    if (q_full) {
>> +        return;
>> +    }
>> +
>> +    queue_write(q, evt);
>> +
>> +    if (q_empty) {
>> +        smmuv3_trigger_irq(s, SMMU_IRQ_EVTQ, 0);
>> +    }
>> +}
>> +
>>  static void smmuv3_init_regs(SMMUv3State *s)
>>  {
>>      /**
>> @@ -133,6 +199,102 @@ static void smmuv3_init_regs(SMMUv3State *s)
>>      s->sid_split = 0;
>>  }
>>
>> +int smmuv3_cmdq_consume(SMMUv3State *s)
>> +{
>> +    SMMUCmdError cmd_error = SMMU_CERROR_NONE;
>> +    SMMUQueue *q = &s->cmdq;
>> +    uint32_t type = 0;
>> +
>> +    if (!SMMUV3_CMDQ_ENABLED(s)) {
>> +        return 0;
>> +    }
>> +    /*
>> +     * some commands depend on register values, as above. In case those
> 
> Where is "as above" referring to ?
I meant CMDQ enabled (CR0).
> 
>> +     * register values change while handling the command, spec says it
>> +     * is UNPREDICTABLE whether the command is interpreted under the new
>> +     * or old value.
>> +     */
>> +
>> +    while (!Q_EMPTY(q)) {
>> +        uint32_t pending = s->gerror ^ s->gerrorn;
>> +        Cmd cmd;
>> +
>> +        trace_smmuv3_cmdq_consume(Q_PROD(q), Q_CONS(q),
>> +                                  Q_PROD_WRAP(q), Q_CONS_WRAP(q));
>> +
>> +        if (FIELD_EX32(pending, GERROR, CMDQ_ERR)) {
>> +            break;
>> +        }
>> +
>> +        if (queue_read(q, &cmd) != MEMTX_OK) {
>> +            cmd_error = SMMU_CERROR_ABT;
>> +            break;
>> +        }
>> +
>> +        type = CMD_TYPE(&cmd);
>> +
>> +        trace_smmuv3_cmdq_opcode(SMMU_CMD_STRING(type));
>> +
>> +        switch (type) {
>> +        case SMMU_CMD_SYNC:
>> +            if (CMD_SYNC_CS(&cmd) & CMD_SYNC_SIG_IRQ) {
>> +                smmuv3_trigger_irq(s, SMMU_IRQ_CMD_SYNC, 0);
>> +            }
>> +            break;
>> +        case SMMU_CMD_PREFETCH_CONFIG:
>> +        case SMMU_CMD_PREFETCH_ADDR:
>> +        case SMMU_CMD_CFGI_STE:
>> +        case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
>> +        case SMMU_CMD_CFGI_CD:
>> +        case SMMU_CMD_CFGI_CD_ALL:
>> +        case SMMU_CMD_TLBI_NH_ALL:
>> +        case SMMU_CMD_TLBI_NH_ASID:
>> +        case SMMU_CMD_TLBI_NH_VA:
>> +        case SMMU_CMD_TLBI_NH_VAA:
>> +        case SMMU_CMD_TLBI_EL3_ALL:
>> +        case SMMU_CMD_TLBI_EL3_VA:
>> +        case SMMU_CMD_TLBI_EL2_ALL:
>> +        case SMMU_CMD_TLBI_EL2_ASID:
>> +        case SMMU_CMD_TLBI_EL2_VA:
>> +        case SMMU_CMD_TLBI_EL2_VAA:
>> +        case SMMU_CMD_TLBI_S12_VMALL:
>> +        case SMMU_CMD_TLBI_S2_IPA:
>> +        case SMMU_CMD_TLBI_NSNH_ALL:
>> +        case SMMU_CMD_ATC_INV:
>> +        case SMMU_CMD_PRI_RESP:
>> +        case SMMU_CMD_RESUME:
>> +        case SMMU_CMD_STALL_TERM:
>> +            trace_smmuv3_unhandled_cmd(type);
>> +            break;
>> +        default:
>> +            cmd_error = SMMU_CERROR_ILL;
>> +            error_report("Illegal command type: %d", CMD_TYPE(&cmd));
> 
> This isn't what error_report() is for. You can log it as a GUEST_ERROR.
OK
> 
>> +            break;
>> +        }
>> +        if (cmd_error) {
>> +            break;
>> +        }
>> +        /*
>> +         * We only increment the cons index after the completion of
>> +         * the command. We do that because the SYNC returns immediatly
> 
> "immediately"
> 
>> +         * and do not check the completion of previous commands
> 
> "does not"
> 
>> +         */
>> +        queue_cons_incr(q);
>> +    }
>> +
>> +    if (cmd_error) {
>> +        error_report("Error on %s command execution: %d",
>> +                     SMMU_CMD_STRING(type), cmd_error);
> 
> Again, not error_report(). Probably a good location for a trace_ point.
OK
> 
>> +        smmu_write_cmdq_err(s, cmd_error);
>> +        smmuv3_trigger_irq(s, SMMU_IRQ_GERROR, R_GERROR_CMDQ_ERR_MASK);
>> +    }
>> +
>> +    trace_smmuv3_cmdq_consume_out(Q_PROD(q), Q_CONS(q),
>> +                                  Q_PROD_WRAP(q), Q_CONS_WRAP(q));
>> +
>> +    return 0;
>> +}
>> +
>>  static void smmu_write_mmio(void *opaque, hwaddr addr,
>>                              uint64_t val, unsigned size)
>>  {
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index 2ddae40..1c5105d 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -18,3 +18,7 @@ smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" va
>>  smmuv3_trigger_irq(int irq) "irq=%d"
>>  smmuv3_write_gerror(uint32_t toggled, uint32_t gerror) "toggled=0x%x, new gerror=0x%x"
>>  smmuv3_write_gerrorn(uint32_t acked, uint32_t gerrorn) "acked=0x%x, new gerrorn=0x%x"
>> +smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
>> +smmuv3_cmdq_consume(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod=%d cons=%d prod.wrap=%d cons.wrap=%d"
>> +smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
>> +smmuv3_cmdq_consume_out(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "prod:%d, cons:%d, prod_wrap:%d, cons_wrap:%d "
>> --
>> 2.5.5
>>
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v9 08/14] hw/arm/smmuv3: Event queue recording helper
  2018-03-08 18:39   ` Peter Maydell
@ 2018-03-09 17:16     ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-09 17:16 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 08/03/18 19:39, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> Let's introduce a helper function aiming at recording an
>> event in the event queue.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v8 -> v9:
>> - add SMMU_EVENT_STRING
>>
>> v7 -> v8:
>> - use dma_addr_t instead of hwaddr in smmuv3_record_event()
>> - introduce struct SMMUEventInfo
>> - add event_stringify + helpers for all fields
>> ---
>>  hw/arm/smmuv3-internal.h | 140 ++++++++++++++++++++++++++++++++++++++++++++++-
>>  hw/arm/smmuv3.c          |  91 +++++++++++++++++++++++++++++-
>>  hw/arm/trace-events      |   1 +
>>  3 files changed, 229 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> index 5af97ae..3929f69 100644
>> --- a/hw/arm/smmuv3-internal.h
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -226,8 +226,6 @@ static inline void smmu_write_cmdq_err(SMMUv3State *s, uint32_t err_type)
>>      s->cmdq.cons = FIELD_DP32(s->cmdq.cons, CMDQ_CONS, ERR, err_type);
>>  }
>>
>> -void smmuv3_write_eventq(SMMUv3State *s, Evt *evt);
>> -
>>  /* Commands */
>>
>>  enum {
>> @@ -326,4 +324,142 @@ enum { /* Command completion notification */
>>              addr;                                             \
>>          })
>>
>> +/* Events */
>> +
>> +typedef enum SMMUEventType {
>> +    SMMU_EVT_OK                 = 0x00,
>> +    SMMU_EVT_F_UUT              = 0x01,
>> +    SMMU_EVT_C_BAD_STREAMID     = 0x02,
>> +    SMMU_EVT_F_STE_FETCH        = 0x03,
>> +    SMMU_EVT_C_BAD_STE          = 0x04,
>> +    SMMU_EVT_F_BAD_ATS_TREQ     = 0x05,
>> +    SMMU_EVT_F_STREAM_DISABLED  = 0x06,
>> +    SMMU_EVT_F_TRANS_FORBIDDEN  = 0x07,
>> +    SMMU_EVT_C_BAD_SUBSTREAMID  = 0x08,
>> +    SMMU_EVT_F_CD_FETCH         = 0x09,
>> +    SMMU_EVT_C_BAD_CD           = 0x0a,
>> +    SMMU_EVT_F_WALK_EABT        = 0x0b,
>> +    SMMU_EVT_F_TRANSLATION      = 0x10,
>> +    SMMU_EVT_F_ADDR_SIZE        = 0x11,
>> +    SMMU_EVT_F_ACCESS           = 0x12,
>> +    SMMU_EVT_F_PERMISSION       = 0x13,
>> +    SMMU_EVT_F_TLB_CONFLICT     = 0x20,
>> +    SMMU_EVT_F_CFG_CONFLICT     = 0x21,
>> +    SMMU_EVT_E_PAGE_REQ         = 0x24,
>> +} SMMUEventType;
>> +
>> +static const char *event_stringify[] = {
>> +    [SMMU_EVT_OK]                       = "SMMU_EVT_OK",
>> +    [SMMU_EVT_F_UUT]                    = "SMMU_EVT_F_UUT",
>> +    [SMMU_EVT_C_BAD_STREAMID]           = "SMMU_EVT_C_BAD_STREAMID",
>> +    [SMMU_EVT_F_STE_FETCH]              = "SMMU_EVT_F_STE_FETCH",
>> +    [SMMU_EVT_C_BAD_STE]                = "SMMU_EVT_C_BAD_STE",
>> +    [SMMU_EVT_F_BAD_ATS_TREQ]           = "SMMU_EVT_F_BAD_ATS_TREQ",
>> +    [SMMU_EVT_F_STREAM_DISABLED]        = "SMMU_EVT_F_STREAM_DISABLED",
>> +    [SMMU_EVT_F_TRANS_FORBIDDEN]        = "SMMU_EVT_F_TRANS_FORBIDDEN",
>> +    [SMMU_EVT_C_BAD_SUBSTREAMID]        = "SMMU_EVT_C_BAD_SUBSTREAMID",
>> +    [SMMU_EVT_F_CD_FETCH]               = "SMMU_EVT_F_CD_FETCH",
>> +    [SMMU_EVT_C_BAD_CD]                 = "SMMU_EVT_C_BAD_CD",
>> +    [SMMU_EVT_F_WALK_EABT]              = "SMMU_EVT_F_WALK_EABT",
>> +    [SMMU_EVT_F_TRANSLATION]            = "SMMU_EVT_F_TRANSLATION",
>> +    [SMMU_EVT_F_ADDR_SIZE]              = "SMMU_EVT_F_ADDR_SIZE",
>> +    [SMMU_EVT_F_ACCESS]                 = "SMMU_EVT_F_ACCESS",
>> +    [SMMU_EVT_F_PERMISSION]             = "SMMU_EVT_F_PERMISSION",
>> +    [SMMU_EVT_F_TLB_CONFLICT]           = "SMMU_EVT_F_TLB_CONFLICT",
>> +    [SMMU_EVT_F_CFG_CONFLICT]           = "SMMU_EVT_F_CFG_CONFLICT",
>> +    [SMMU_EVT_E_PAGE_REQ]               = "SMMU_EVT_E_PAGE_REQ",
>> +};
>> +
>> +#define SMMU_EVENT_STRING(event) (                                         \
>> +(event < ARRAY_SIZE(event_stringify)) ? event_stringify[event] : "UNKNOWN" \
>> +)
> 
> Same remarks as for the other value-to-string helper.
OK
> 
>> +
>> +typedef struct SMMUEventInfo {
> 
> This struct could use a comment summarizing what it's for.
OK
> 
>> +    SMMUEventType type;
>> +    uint32_t sid;
>> +    bool recorded;
>> +    bool record_trans_faults;
>> +    union {
>> +        struct {
>> +            uint32_t ssid;
>> +            bool ssv;
>> +            dma_addr_t addr;
>> +            bool rnw;
>> +            bool pnu;
>> +            bool ind;
>> +       } f_uut;
>> +       struct ssid_info {
>> +            uint32_t ssid;
>> +            bool ssv;
>> +       } c_bad_streamid;
>> +       struct ssid_addr_info {
>> +            uint32_t ssid;
>> +            bool ssv;
>> +            dma_addr_t addr;
>> +       } f_ste_fetch;
>> +       struct ssid_info c_bad_ste;
>> +       struct {
>> +            dma_addr_t addr;
>> +            bool rnw;
>> +       } f_transl_forbidden;
>> +       struct {
>> +            uint32_t ssid;
>> +       } c_bad_substream;
>> +       struct ssid_addr_info f_cd_fetch;
>> +       struct ssid_info c_bad_cd;
>> +       struct full_info {
>> +            bool stall;
>> +            uint16_t stag;
>> +            uint32_t ssid;
>> +            bool ssv;
>> +            bool s2;
>> +            dma_addr_t addr;
>> +            bool rnw;
>> +            bool pnu;
>> +            bool ind;
>> +            uint8_t class;
>> +            dma_addr_t addr2;
>> +       } f_walk_eabt;
>> +       struct full_info f_translation;
>> +       struct full_info f_addr_size;
>> +       struct full_info f_access;
>> +       struct full_info f_permission;
>> +       struct ssid_info f_cfg_conflict;
>> +       /**
>> +        * not supported yet:
>> +        * F_BAD_ATS_TREQ
>> +        * F_BAD_ATS_TREQ
>> +        * F_TLB_CONFLICT
>> +        * E_PAGE_REQUEST
>> +        * IMPDEF_EVENTn
>> +        */
>> +    } u;
>> +} SMMUEventInfo;
>> +
>> +/* EVTQ fields */
>> +
>> +#define EVT_Q_OVERFLOW        (1 << 31)
>> +
>> +#define EVT_SET_TYPE(x, v)              deposit32((x)->word[0], 0 , 8 ,  v)
>> +#define EVT_SET_SSV(x, v)               deposit32((x)->word[0], 11, 1 ,  v)
>> +#define EVT_SET_SSID(x, v)              deposit32((x)->word[0], 12, 20, v)
>> +#define EVT_SET_SID(x, v)               ((x)->word[1] =  v)
>> +#define EVT_SET_STAG(x, v)              deposit32((x)->word[2], 0 , 16, v)
>> +#define EVT_SET_STALL(x, v)             deposit32((x)->word[2], 31, 1 , v)
>> +#define EVT_SET_PNU(x, v)               deposit32((x)->word[3], 1 , 1 , v)
>> +#define EVT_SET_IND(x, v)               deposit32((x)->word[3], 2 , 1 , v)
>> +#define EVT_SET_RNW(x, v)               deposit32((x)->word[3], 3 , 1 , v)
>> +#define EVT_SET_S2(x, v)                deposit32((x)->word[3], 7 , 1 , v)
>> +#define EVT_SET_CLASS(x, v)             deposit32((x)->word[3], 8 , 2 , v)
>> +#define EVT_SET_ADDR(x, addr) ({                    \
>> +            (x)->word[5] = (uint32_t)(addr >> 32);        \
>> +            (x)->word[4] = (uint32_t)(addr & 0xffffffff); \
>> +        })
>> +#define EVT_SET_ADDR2(x, addr) ({                    \
>> +            deposit32((x)->word[7], 3, 29, addr >> 16);        \
>> +            deposit32((x)->word[7], 0, 16, addr & 0xffff); \
>> +        })
>> +
>> +void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *event);
>> +
>>  #endif
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index fcfdbb0..0adfe53 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -140,7 +140,7 @@ static void queue_write(SMMUQueue *q, void *data)
>>      queue_prod_incr(q);
>>  }
>>
>> -void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
>> +static void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
>>  {
>>      SMMUQueue *q = &s->eventq;
>>      bool q_empty = Q_EMPTY(q);
>> @@ -161,6 +161,95 @@ void smmuv3_write_eventq(SMMUv3State *s, Evt *evt)
>>      }
>>  }
>>
>> +void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *info)
>> +{
>> +    Evt evt;
>> +
>> +    if (!SMMUV3_EVENTQ_ENABLED(s)) {
>> +        return;
>> +    }
>> +
>> +    EVT_SET_TYPE(&evt, info->type);
>> +    EVT_SET_SID(&evt, info->sid);
>> +
>> +    switch (info->type) {
>> +    case SMMU_EVT_OK:
>> +        return;
>> +    case SMMU_EVT_F_UUT:
>> +        EVT_SET_SSID(&evt, info->u.f_uut.ssid);
>> +        EVT_SET_SSV(&evt,  info->u.f_uut.ssv);
>> +        EVT_SET_ADDR(&evt, info->u.f_uut.addr);
>> +        EVT_SET_RNW(&evt,  info->u.f_uut.rnw);
>> +        EVT_SET_PNU(&evt,  info->u.f_uut.pnu);
>> +        EVT_SET_IND(&evt,  info->u.f_uut.ind);
>> +        break;
>> +    case SMMU_EVT_C_BAD_STREAMID:
>> +        EVT_SET_SSID(&evt, info->u.c_bad_streamid.ssid);
>> +        EVT_SET_SSV(&evt,  info->u.c_bad_streamid.ssv);
>> +        break;
>> +    case SMMU_EVT_F_STE_FETCH:
>> +        EVT_SET_SSID(&evt, info->u.f_ste_fetch.ssid);
>> +        EVT_SET_SSV(&evt,  info->u.f_ste_fetch.ssv);
>> +        EVT_SET_ADDR(&evt, info->u.f_ste_fetch.addr);
>> +        break;
>> +    case SMMU_EVT_C_BAD_STE:
>> +        EVT_SET_SSID(&evt, info->u.c_bad_ste.ssid);
>> +        EVT_SET_SSV(&evt,  info->u.c_bad_ste.ssv);
>> +        break;
>> +    case SMMU_EVT_F_STREAM_DISABLED:
>> +        break;
>> +    case SMMU_EVT_F_TRANS_FORBIDDEN:
>> +        EVT_SET_ADDR(&evt, info->u.f_transl_forbidden.addr);
>> +        EVT_SET_RNW(&evt, info->u.f_transl_forbidden.rnw);
>> +        break;
>> +    case SMMU_EVT_C_BAD_SUBSTREAMID:
>> +        EVT_SET_SSID(&evt, info->u.c_bad_substream.ssid);
>> +        break;
>> +    case SMMU_EVT_F_CD_FETCH:
>> +        EVT_SET_SSID(&evt, info->u.f_cd_fetch.ssid);
>> +        EVT_SET_SSV(&evt,  info->u.f_cd_fetch.ssv);
>> +        EVT_SET_ADDR(&evt, info->u.f_cd_fetch.addr);
>> +        break;
>> +    case SMMU_EVT_C_BAD_CD:
>> +        EVT_SET_SSID(&evt, info->u.c_bad_cd.ssid);
>> +        EVT_SET_SSV(&evt,  info->u.c_bad_cd.ssv);
>> +        break;
>> +    case SMMU_EVT_F_WALK_EABT:
>> +    case SMMU_EVT_F_TRANSLATION:
>> +    case SMMU_EVT_F_ADDR_SIZE:
>> +    case SMMU_EVT_F_ACCESS:
>> +    case SMMU_EVT_F_PERMISSION:
>> +        EVT_SET_STALL(&evt, info->u.f_walk_eabt.stall);
>> +        EVT_SET_STAG(&evt, info->u.f_walk_eabt.stag);
>> +        EVT_SET_SSID(&evt, info->u.f_walk_eabt.ssid);
>> +        EVT_SET_SSV(&evt, info->u.f_walk_eabt.ssv);
>> +        EVT_SET_S2(&evt, info->u.f_walk_eabt.s2);
>> +        EVT_SET_ADDR(&evt, info->u.f_walk_eabt.addr);
>> +        EVT_SET_RNW(&evt, info->u.f_walk_eabt.rnw);
>> +        EVT_SET_PNU(&evt, info->u.f_walk_eabt.pnu);
>> +        EVT_SET_IND(&evt, info->u.f_walk_eabt.ind);
>> +        EVT_SET_CLASS(&evt, info->u.f_walk_eabt.class);
>> +        EVT_SET_ADDR2(&evt, info->u.f_walk_eabt.addr2);
>> +        break;
>> +    case SMMU_EVT_F_CFG_CONFLICT:
>> +        EVT_SET_SSID(&evt, info->u.f_cfg_conflict.ssid);
>> +        EVT_SET_SSV(&evt,  info->u.f_cfg_conflict.ssv);
>> +        break;
>> +    /* rest is not implemented */
>> +    case SMMU_EVT_F_BAD_ATS_TREQ:
>> +    case SMMU_EVT_F_TLB_CONFLICT:
>> +    case SMMU_EVT_E_PAGE_REQ:
>> +    default:
>> +        error_report("%s event %d not supported", __func__,
>> +                     info->type);
> 
> Not error_report, please.
replaced by g_assert_not_reached();
> 
>> +        return;
>> +    }
>> +
>> +    trace_smmuv3_record_event(SMMU_EVENT_STRING(info->type), info->sid);
>> +    smmuv3_write_eventq(s, &evt);
> 
> This should be handling the "oops, the write to memory failed" case.
OK

Thanks

Eric
> 
>> +    info->recorded = true;
>> +}
>> +
>>  static void smmuv3_init_regs(SMMUv3State *s)
>>  {
>>      /**
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index ed5dce0..c79c15e 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -28,3 +28,4 @@ smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" v
>>  smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx val64:0x%lx"
>>  smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>>  smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>> +smmuv3_record_event(const char *type, uint32_t sid) "%s sid=%d"
>> --
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case
  2018-03-08 19:06   ` Peter Maydell
@ 2018-03-09 17:53     ` Auger Eric
  2018-03-09 17:59       ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-09 17:53 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,
On 08/03/18 20:06, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> At the moment, the SMMUv3 does not support notification on
>> TLB invalidation. So let's abort as soon as such notifier gets
>> enabled.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  hw/arm/smmuv3.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 384393f..5efe933 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -1074,12 +1074,23 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
>>      dc->realize = smmu_realize;
>>  }
>>
>> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
>> +                                       IOMMUNotifierFlag old,
>> +                                       IOMMUNotifierFlag new)
>> +{
>> +    if (old == IOMMU_NOTIFIER_NONE) {
>> +        error_setg(&error_fatal,
>> +                   "SMMUV3: vhost and vfio notifiers not yet supported");
>> +    }
>> +}
> 
> Is this triggerable by the guest, or by the user on the command
> line, or only by a bug in the board or other QEMU code?
by the user on the command line.

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case
  2018-03-09 17:53     ` Auger Eric
@ 2018-03-09 17:59       ` Peter Maydell
  2018-03-12 10:53         ` Eric Auger
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-09 17:59 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 9 March 2018 at 17:53, Auger Eric <eric.auger@redhat.com> wrote:
> Hi Peter,
> On 08/03/18 20:06, Peter Maydell wrote:
>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>> At the moment, the SMMUv3 does not support notification on
>>> TLB invalidation. So let's abort as soon as such notifier gets
>>> enabled.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>>  hw/arm/smmuv3.c | 11 +++++++++++
>>>  1 file changed, 11 insertions(+)
>>>
>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>> index 384393f..5efe933 100644
>>> --- a/hw/arm/smmuv3.c
>>> +++ b/hw/arm/smmuv3.c
>>> @@ -1074,12 +1074,23 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
>>>      dc->realize = smmu_realize;
>>>  }
>>>
>>> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
>>> +                                       IOMMUNotifierFlag old,
>>> +                                       IOMMUNotifierFlag new)
>>> +{
>>> +    if (old == IOMMU_NOTIFIER_NONE) {
>>> +        error_setg(&error_fatal,
>>> +                   "SMMUV3: vhost and vfio notifiers not yet supported");
>>> +    }
>>> +}
>>
>> Is this triggerable by the guest, or by the user on the command
>> line, or only by a bug in the board or other QEMU code?
> by the user on the command line.

OK. Do they get this error immediately on startup, or only later
in execution? (If the latter, is it possible to make the error
happen earlier?)

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 09/14] hw/arm/smmuv3: Implement translate callback
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 09/14] hw/arm/smmuv3: Implement translate callback Eric Auger
@ 2018-03-09 18:46   ` Peter Maydell
  2018-03-12 10:38     ` Eric Auger
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-09 18:46 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> This patch implements the IOMMU Memory Region translate()
> callback. Most of the code relates to the translation
> configuration decoding and check (STE, CD).
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v8 -> v9:
> - use SMMU_EVENT_STRING macro
> - get rid of last erro_report's
> - decode asid
> - handle config abort before ptw
> - add 64-bit single-copy atomic comment
>
> v7 -> v8:
> - use address_space_rw
> - s/Ste/STE, s/Cd/CD
> - use dma_memory_read
> - remove everything related to stage 2
> - collect data for both TTx
> - renamings
> - pass the event handle all along the config decoding path
> - decode tbi, ars
> ---
>  hw/arm/smmuv3-internal.h | 146 ++++++++++++++++++++
>  hw/arm/smmuv3.c          | 341 +++++++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |   9 ++
>  3 files changed, 496 insertions(+)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 3929f69..b203426 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -462,4 +462,150 @@ typedef struct SMMUEventInfo {
>
>  void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *event);
>
> +/* Configuration Data */
> +
> +/* STE Level 1 Descriptor */
> +typedef struct STEDesc {
> +    uint32_t word[2];
> +} STEDesc;
> +
> +/* CD Level 1 Descriptor */
> +typedef struct CDDesc {
> +    uint32_t word[2];
> +} CDDesc;
> +
> +/* Stream Table Entry(STE) */
> +typedef struct STE {
> +    uint32_t word[16];
> +} STE;
> +
> +/* Context Descriptor(CD) */
> +typedef struct CD {
> +    uint32_t word[16];
> +} CD;
> +
> +/* STE fields */
> +
> +#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */

I'm not sure what the comment is supposed to be telling me?

> +
> +#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
> +#define STE_CFG_S1_ENABLED(config) (config & 0x1)
> +#define STE_CFG_S2_ENABLED(config) (config & 0x2)
> +#define STE_CFG_ABORT(config)      (!(config & 0x4))
> +#define STE_CFG_BYPASS(config)     (config == 0x4)
> +
> +#define STE_S1FMT(x)   extract32((x)->word[0], 4 , 2)
> +#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
> +#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
> +#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
> +#define STE_S2VMID(x)  extract32((x)->word[4], 0 , 16)
> +#define STE_S2T0SZ(x)  extract32((x)->word[5], 0 , 6)
> +#define STE_S2SL0(x)   extract32((x)->word[5], 6 , 2)
> +#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
> +#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
> +#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
> +#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
> +#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
> +#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
> +#define STE_CTXPTR(x)                                           \
> +    ({                                                          \
> +        unsigned long addr;                                     \
> +        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
> +        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
> +        addr;                                                   \
> +    })
> +
> +#define STE_S2TTB(x)                                            \
> +    ({                                                          \
> +        unsigned long addr;                                     \
> +        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
> +        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
> +        addr;                                                   \
> +    })
> +
> +static inline int oas2bits(int oas_field)
> +{
> +    switch (oas_field) {
> +    case 0b011:
> +        return 42;
> +    case 0b100:
> +        return 44;
> +    default:
> +        return 32 + (1 << oas_field);

Is this right? For instance, OAS == 0b001 should be 36,
but 32 + (1 << 1) == 34. Likewise 0b110 should be 52, but
32 + (1 << 6) == 96.

I'm not sure there's a neat relationship between the field
values and the address sizes here, so maybe we should just
have a switch with a case for each value?
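
A per-value switch along those lines could look roughly like the sketch
below. The function name oas_to_bits and the -1 return for a reserved
encoding are illustrative assumptions, not part of the patch; the size
values follow the OAS encoding in the SMMUv3 spec (0b110 == 52 bits only
applies to implementations that support it).

```c
#include <stdint.h>

/*
 * Hypothetical replacement for oas2bits(): map the 3-bit OAS/IPS
 * encoding to an address size in bits, one case per defined value.
 * Returns -1 for a reserved encoding so the caller can report an error.
 */
static inline int oas_to_bits(unsigned oas_field)
{
    switch (oas_field) {
    case 0: /* 0b000 */
        return 32;
    case 1: /* 0b001 */
        return 36;
    case 2: /* 0b010 */
        return 40;
    case 3: /* 0b011 */
        return 42;
    case 4: /* 0b100 */
        return 44;
    case 5: /* 0b101 */
        return 48;
    case 6: /* 0b110 */
        return 52;
    default: /* reserved encoding */
        return -1;
    }
}
```

With this shape the two problem cases above come out as 36 and 52, and
the caller decides what to do with a reserved value.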

> +   }
> +}
> +
> +static inline int pa_range(STE *ste)
> +{
> +    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
> +
> +    if (!STE_S2AA64(ste)) {
> +        return 40;
> +    }
> +
> +    return oas2bits(oas_field);
> +}
> +
> +#define MAX_PA(ste) ((1 << pa_range(ste)) - 1)
> +
> +/* CD fields */
> +
> +#define CD_VALID(x)   extract32((x)->word[0], 30, 1)
> +#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
> +#define CD_TTB(x, sel)                                      \
> +    ({                                                      \
> +        uint64_t hi, lo;                                    \
> +        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \

Spec says TTB0/1 fields go up to bit 19. You want the whole lot
even if we don't support address sizes that wide because out
of range high bits should cause an illegal CD error, not be ignored.

> +        hi <<= 32;                                          \
> +        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \

Does this ~0xf end up correctly sign extending up into the top 32 bits?
ULL suffix would make it certain.
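
For reference, whether such a mask reaches the top 32 bits depends on
the width of the other operand and the signedness of the constant. With
a 32-bit left operand the AND is done in 32 bits and the result is
zero-extended on assignment; the difference only shows up once the
operand is 64 bits wide. A small sketch (helper names are made up for
illustration, and a 32-bit int is assumed):

```c
#include <stdint.h>

/* ~0xf is the int value -16; converted to uint64_t it is sign-extended,
 * giving 0xfffffffffffffff0, so the high bits of v are kept. */
static inline uint64_t mask_low4_int(uint64_t v)
{
    return v & ~0xf;
}

/* ~0xfU is the unsigned int value 0xfffffff0; converted to uint64_t it
 * is zero-extended, so the top 32 bits of v are cleared by accident. */
static inline uint64_t mask_low4_uint(uint64_t v)
{
    return v & ~0xfU;
}

/* ~0xfULL is already 64 bits wide, so no conversion rule is involved. */
static inline uint64_t mask_low4_ull(uint64_t v)
{
    return v & ~0xfULL;
}
```

The ULL form is the only one whose result does not rely on the reader
remembering the integer conversion rules.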

> +        hi | lo;                                            \
> +    })
> +
> +#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
> +#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
> +#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
> +#define CD_ENDI(x)       extract32((x)->word[0], 15, 1)
> +#define CD_IPS(x)        extract32((x)->word[1], 0 , 3)
> +#define CD_TBI(x)        extract32((x)->word[1], 6 , 2)
> +#define CD_S(x)          extract32((x)->word[1], 12, 1)
> +#define CD_R(x)          extract32((x)->word[1], 13, 1)
> +#define CD_A(x)          extract32((x)->word[1], 14, 1)
> +#define CD_AARCH64(x)    extract32((x)->word[1], 9 , 1)
> +
> +#define CDM_VALID(x)    ((x)->word[0] & 0x1)
> +
> +static inline int is_cd_valid(SMMUv3State *s, STE *ste, CD *cd)
> +{
> +    return CD_VALID(cd);
> +}
> +
> +/**
> + * tg2granule - Decodes the CD translation granule size field according
> + * to the ttbr in use
> + * @bits: TG0/1 fields
> + * @ttbr: ttbr index in use
> + */
> +static inline int tg2granule(int bits, int ttbr)
> +{
> +    switch (bits) {
> +    case 1:
> +        return ttbr ? 14 : 16;
> +    case 2:
> +        return ttbr ? 12 : 14;
> +    case 3:
> +        return ttbr ? 16 : 12;
> +    default:
> +        return 12;

TG0 == 0b11 and TG1 == 0b00 are reserved and should cause
an illegal CD error if the config means they would be used,
not be treated as if they were one of the other values.

The spec has a definition of a CD_ILLEGAL expression which
specifies what conditions should cause an illegal CD error.
It might be useful to have something in the patchset that looks
like that, and then later you can just assert that you're
not dealing with illegal values, or ignore them.
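
As a rough sketch of that idea, the granule decode could report the
reserved encodings instead of mapping them to a default. The name
tg_to_granule_shift and the -1 return value are assumptions for
illustration only; the caller would be expected to turn -1 into a
C_BAD_CD event when the corresponding table is actually in use.

```c
#include <stdint.h>

/*
 * Hypothetical variant of tg2granule(): returns log2 of the granule
 * size for the TG0 (ttbr == 0) or TG1 (ttbr != 0) field, or -1 when
 * the encoding is reserved.
 */
static inline int tg_to_granule_shift(unsigned tg, int ttbr)
{
    if (ttbr == 0) {
        switch (tg) {
        case 0:
            return 12; /* 4KB */
        case 1:
            return 16; /* 64KB */
        case 2:
            return 14; /* 16KB */
        default:
            return -1; /* TG0 == 0b11 is reserved */
        }
    }

    switch (tg) {
    case 1:
        return 14; /* 16KB */
    case 2:
        return 12; /* 4KB */
    case 3:
        return 16; /* 64KB */
    default:
        return -1; /* TG1 == 0b00 is reserved */
    }
}
```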

> +    }
> +}
> +
> +#define L1STD_L2PTR(stm) ({                                 \
> +            uint64_t hi, lo;                            \
> +            hi = (stm)->word[1];                        \
> +            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \

I see here you have a cast to force the mask to 64-bits wide.
ULL suffix would be shorter, but in any case do the same thing
in all places.

> +            hi << 32 | lo;                              \
> +        })

This would definitely be better as an inline function rather than
a multiline macro that has to use the GCC extension to define
macro-local variables and return a value.
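
Something like the following would do, as a sketch. The STEDesc
definition is repeated from the patch so the fragment stands alone, and
the lower-case function name is just a suggestion:

```c
#include <stdint.h>

/* Level 1 stream table descriptor, as defined earlier in the patch. */
typedef struct STEDesc {
    uint32_t word[2];
} STEDesc;

/*
 * Inline-function form of L1STD_L2PTR: combine the two 32-bit words
 * into a 64-bit pointer, masking off the low 5 bits of the low word
 * exactly as the macro does.
 */
static inline uint64_t l1std_l2ptr(const STEDesc *desc)
{
    uint64_t hi = desc->word[1];
    uint64_t lo = desc->word[0] & ~0x1fULL;

    return hi << 32 | lo;
}
```

This also gets type checking on the argument and avoids the statement
expression.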

> +
> +#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 4))
> +
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 0adfe53..384393f 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -289,6 +289,344 @@ static void smmuv3_init_regs(SMMUv3State *s)
>      s->sid_split = 0;
>  }
>
> +static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
> +                        SMMUEventInfo *event)
> +{
> +    int ret;
> +
> +    trace_smmuv3_get_ste(addr);
> +    /* TODO: guarantee 64-bit single-copy atomicity */
> +    ret = dma_memory_read(&address_space_memory, addr,
> +                          (void *)buf, sizeof(*buf));
> +    if (ret != MEMTX_OK) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "Cannot fetch pte at address=0x%"PRIx64"\n", addr);
> +        event->type = SMMU_EVT_F_STE_FETCH;
> +        event->u.f_ste_fetch.addr = addr;
> +        return -EINVAL;
> +    }
> +    return 0;
> +
> +}
> +
> +/* @ssid > 0 not supported yet */
> +static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid,
> +                       CD *buf, SMMUEventInfo *event)
> +{
> +    dma_addr_t addr = STE_CTXPTR(ste);
> +    int ret;
> +
> +    trace_smmuv3_get_cd(addr);
> +    /* TODO: guarantee 64-bit single-copy atomicity */
> +    ret = dma_memory_read(&address_space_memory, addr,
> +                           (void *)buf, sizeof(*buf));
> +    if (ret != MEMTX_OK) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "Cannot fetch pte at address=0x%"PRIx64"\n", addr);
> +        event->type = SMMU_EVT_F_CD_FETCH;
> +        event->u.f_ste_fetch.addr = addr;
> +        return -EINVAL;
> +    }
> +    return 0;
> +}
> +
> +static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
> +                      STE *ste, SMMUEventInfo *event)
> +{
> +    uint32_t config = STE_CONFIG(ste);
> +    int ret = -EINVAL;
> +
> +    if (STE_CFG_ABORT(config)) {
> +        /* abort but don't record any event */
> +        cfg->aborted = true;
> +        return ret;
> +    }
> +
> +    if (STE_CFG_BYPASS(config)) {
> +        cfg->bypassed = true;
> +        return ret;
> +    }
> +
> +    if (!STE_VALID(ste)) {
> +        goto bad_ste;
> +    }
> +
> +    if (STE_CFG_S2_ENABLED(config)) {
> +        error_setg(&error_fatal, "SMMUv3 does not support stage 2 yet");

Usual remarks about not using error_setg() for this kind of thing.

> +    }
> +
> +    if (STE_S1CDMAX(ste) != 0) {
> +        error_setg(&error_fatal,
> +                   "SMMUv3 does not support multiple context descriptors yet");
> +        goto bad_ste;
> +    }
> +    return 0;
> +
> +bad_ste:
> +    event->type = SMMU_EVT_C_BAD_STE;
> +    return -EINVAL;
> +}
> +
> +/**
> + * smmu_find_ste - Return the stream table entry associated
> + * to the sid
> + *
> + * @s: smmuv3 handle
> + * @sid: stream ID
> + * @ste: returned stream table entry
> + * @event: handle to an event info
> + *
> + * Supports linear and 2-level stream table
> + * Return 0 on success, -EINVAL otherwise
> + */
> +static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
> +                         SMMUEventInfo *event)
> +{
> +    dma_addr_t addr;
> +    int ret;
> +
> +    trace_smmuv3_find_ste(sid, s->features, s->sid_split);
> +    /* Check SID range */
> +    if (sid > (1 << SMMU_IDR1_SIDSIZE)) {
> +        event->type = SMMU_EVT_C_BAD_STREAMID;
> +        return -EINVAL;
> +    }
> +    if (s->features & SMMU_FEATURE_2LVL_STE) {
> +        int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
> +        dma_addr_t strtab_base, l1ptr, l2ptr;
> +        STEDesc l1std;
> +
> +        strtab_base = s->strtab_base & SMMU_BASE_ADDR_MASK;
> +        l1_ste_offset = sid >> s->sid_split;
> +        l2_ste_offset = sid & ((1 << s->sid_split) - 1);
> +        l1ptr = (dma_addr_t)(strtab_base + l1_ste_offset * sizeof(l1std));
> +        /* TODO: guarantee 64-bit single-copy atomicity */
> +        ret = dma_memory_read(&address_space_memory, l1ptr,
> +                              (uint8_t *)&l1std, sizeof(l1std));
> +        if (ret != MEMTX_OK) {
> +            qemu_log_mask(LOG_GUEST_ERROR,
> +                          "Could not read L1PTR at 0X%"PRIx64"\n", l1ptr);
> +            event->type = SMMU_EVT_F_STE_FETCH;
> +            event->u.f_ste_fetch.addr = l1ptr;
> +            return -EINVAL;
> +        }
> +
> +        span = L1STD_SPAN(&l1std);
> +
> +        if (!span) {
> +            /* l2ptr is not valid */
> +            qemu_log_mask(LOG_GUEST_ERROR,
> +                          "invalid sid=%d (L1STD span=0)\n", sid);
> +            event->type = SMMU_EVT_C_BAD_STREAMID;
> +            return -EINVAL;
> +        }
> +        max_l2_ste = (1 << span) - 1;
> +        l2ptr = L1STD_L2PTR(&l1std);
> +        trace_smmuv3_find_ste_2lvl(s->strtab_base, l1ptr, l1_ste_offset,
> +                                   l2ptr, l2_ste_offset, max_l2_ste);
> +        if (l2_ste_offset > max_l2_ste) {
> +            qemu_log_mask(LOG_GUEST_ERROR,
> +                          "l2_ste_offset=%d > max_l2_ste=%d\n",
> +                          l2_ste_offset, max_l2_ste);
> +            event->type = SMMU_EVT_C_BAD_STE;
> +            return -EINVAL;
> +        }
> +        addr = L1STD_L2PTR(&l1std) + l2_ste_offset * sizeof(*ste);
> +    } else {
> +        addr = s->strtab_base + sid * sizeof(*ste);
> +    }
> +
> +    if (smmu_get_ste(s, addr, ste, event)) {
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event)
> +{
> +    int ret = -EINVAL;
> +    int i;
> +
> +    if (!CD_VALID(cd) || !CD_AARCH64(cd)) {
> +        goto error;
> +    }
> +
> +    /* we support only those at the moment */
> +    cfg->aa64 = true;
> +    cfg->stage = 1;
> +
> +    cfg->oas = oas2bits(CD_IPS(cd));
> +    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
> +    cfg->tbi = CD_TBI(cd);
> +    cfg->asid = CD_ASID(cd);
> +
> +    trace_smmuv3_decode_cd(cfg->oas);
> +
> +    /* decode data dependent on TT */
> +    for (i = 0; i <= 1; i++) {
> +        int tg, tsz;
> +        SMMUTransTableInfo *tt = &cfg->tt[i];
> +
> +        cfg->tt[i].disabled = CD_EPD(cd, i);
> +        if (cfg->tt[i].disabled) {
> +            continue;
> +        }
> +
> +        tsz = CD_TSZ(cd, i);
> +        if (tsz < 16 || tsz > 39) {
> +            goto error;
> +        }
> +
> +        tg = CD_TG(cd, i);
> +        tt->granule_sz = tg2granule(tg, i);
> +        if ((tt->granule_sz != 12 && tt->granule_sz != 16) || CD_ENDI(cd)) {
> +            goto error;
> +        }
> +
> +        tt->tsz = tsz;
> +        tt->initial_level = 4 - (64 - tsz - 4) / (tt->granule_sz - 3);
> +        tt->ttb = CD_TTB(cd, i);
> +        tt->ttb = extract64(tt->ttb, 0, cfg->oas);
> +        trace_smmuv3_decode_cd_tt(i, tt->tsz, tt->ttb,
> +                                  tt->granule_sz, tt->initial_level);
> +    }
> +
> +    event->record_trans_faults = CD_R(cd);
> +
> +    return 0;
> +
> +error:
> +    event->type = SMMU_EVT_C_BAD_CD;
> +    return ret;
> +}
> +
> +/**
> + * smmuv3_decode_config - Prepare the translation configuration
> + * for the @mr iommu region
> + * @mr: iommu memory region the translation config must be prepared for
> + * @cfg: output translation configuration which is populated through
> + *       the different configuration decodng steps

"decoding"

> + * @event: must be zero'ed by the caller
> + *
> + * return < 0 if the translation needs to be aborted (@event is filled
> + * accordingly). Return 0 otherwise.
> + */
> +static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg,
> +                                SMMUEventInfo *event)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    uint32_t sid = smmu_get_sid(sdev);
> +    SMMUv3State *s = sdev->smmu;
> +    int ret = -EINVAL;
> +    STE ste;
> +    CD cd;
> +
> +    if (smmu_find_ste(s, sid, &ste, event)) {
> +        return ret;
> +    }
> +
> +    if (decode_ste(s, cfg, &ste, event)) {
> +        return ret;
> +    }
> +
> +    if (smmu_get_cd(s, &ste, 0 /* ssid */, &cd, event)) {
> +        return ret;
> +    }
> +
> +    return decode_cd(cfg, &cd, event);
> +}
> +
> +static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
> +                                      IOMMUAccessFlags flag)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    SMMUv3State *s = sdev->smmu;
> +    uint32_t sid = smmu_get_sid(sdev);
> +    SMMUEventInfo event = {.type = SMMU_EVT_OK, .sid = sid};
> +    SMMUPTWEventInfo ptw_info = {};
> +    SMMUTransCfg cfg = {};
> +    IOMMUTLBEntry entry = {
> +        .target_as = &address_space_memory,
> +        .iova = addr,
> +        .translated_addr = addr,
> +        .addr_mask = ~(hwaddr)0,
> +        .perm = IOMMU_NONE,
> +    };
> +    int ret = 0;
> +
> +    if (!smmu_enabled(s)) {
> +        goto out;
> +    }
> +
> +    ret = smmuv3_decode_config(mr, &cfg, &event);
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    if (cfg.aborted) {
> +        goto out;
> +    }
> +
> +    ret = smmu_ptw(&cfg, addr, flag, &entry, &ptw_info);
> +    if (ret) {
> +        switch (ptw_info.type) {
> +        case SMMU_PTW_ERR_WALK_EABT:
> +            event.type = SMMU_EVT_F_WALK_EABT;
> +            event.u.f_walk_eabt.addr = addr;
> +            event.u.f_walk_eabt.rnw = flag & 0x1;
> +            event.u.f_walk_eabt.class = 0x1;
> +            event.u.f_walk_eabt.addr2 = ptw_info.addr;
> +            break;
> +        case SMMU_PTW_ERR_TRANSLATION:
> +            if (event.record_trans_faults) {
> +                event.type = SMMU_EVT_F_TRANSLATION;
> +                event.u.f_translation.addr = addr;
> +                event.u.f_translation.rnw = flag & 0x1;
> +            }
> +            break;
> +        case SMMU_PTW_ERR_ADDR_SIZE:
> +            if (event.record_trans_faults) {
> +                event.type = SMMU_EVT_F_ADDR_SIZE;
> +                event.u.f_addr_size.addr = addr;
> +                event.u.f_addr_size.rnw = flag & 0x1;
> +            }
> +            break;
> +        case SMMU_PTW_ERR_ACCESS:
> +            if (event.record_trans_faults) {
> +                event.type = SMMU_EVT_F_ACCESS;
> +                event.u.f_access.addr = addr;
> +                event.u.f_access.rnw = flag & 0x1;
> +            }
> +            break;
> +        case SMMU_PTW_ERR_PERMISSION:
> +            if (event.record_trans_faults) {
> +                event.type = SMMU_EVT_F_PERMISSION;
> +                event.u.f_permission.addr = addr;
> +                event.u.f_permission.rnw = flag & 0x1;
> +            }
> +            break;
> +        default:
> +            error_setg(&error_fatal, "SMMUV3 BUG");

g_assert_not_reached(), I guess ?

> +        }
> +    }
> +
> +    trace_smmuv3_translate(mr->parent_obj.name, sid, addr,
> +                           entry.translated_addr, entry.perm);
> +out:
> +    if (ret) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "%s translation failed for iova=0x%"PRIx64" (%s)\n",
> +                      mr->parent_obj.name, addr, SMMU_EVENT_STRING(event.type));
> +        entry.perm = IOMMU_NONE;
> +        smmuv3_record_event(s, &event);
> +    } else if (!cfg.aborted) {
> +        entry.perm = flag;
> +    }
> +
> +    return entry;
> +}
> +
>  static int smmuv3_cmdq_consume(SMMUv3State *s)
>  {
>      SMMUCmdError cmd_error = SMMU_CERROR_NONE;
> @@ -739,6 +1077,9 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
>  static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>                                                    void *data)
>  {
> +    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
> +
> +    imrc->translate = smmuv3_translate;
>  }
>
>  static const TypeInfo smmuv3_type_info = {
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index c79c15e..1102bd4 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -29,3 +29,12 @@ smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx v
>  smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>  smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>  smmuv3_record_event(const char *type, uint32_t sid) "%s sid=%d"
> +smmuv3_find_ste(uint16_t sid, uint32_t features, uint16_t sid_split) "SID:0x%x features:0x%x, sid_split:0x%x"
> +smmuv3_find_ste_2lvl(uint64_t strtab_base, hwaddr l1ptr, int l1_ste_offset, hwaddr l2ptr, int l2_ste_offset, int max_l2_ste) "strtab_base:0x%lx l1ptr:0x%"PRIx64" l1_off:0x%x, l2ptr:0x%"PRIx64" l2_off:0x%x max_l2_ste:%d"
> +smmuv3_get_ste(hwaddr addr) "STE addr: 0x%"PRIx64
> +smmuv3_translate_bypass(const char *n, uint16_t sid, hwaddr addr, bool is_write) "%s sid=%d bypass iova:0x%"PRIx64" is_write=%d"
> +smmuv3_translate_in(uint16_t sid, int pci_bus_num, hwaddr strtab_base) "SID:0x%x bus:%d strtab_base:0x%"PRIx64
> +smmuv3_get_cd(hwaddr addr) "CD addr: 0x%"PRIx64
> +smmuv3_translate(const char *n, uint16_t sid, hwaddr iova, hwaddr translated, int perm) "%s sid=%d iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x"
> +smmuv3_decode_cd(uint32_t oas) "oas=%d"
> +smmuv3_decode_cd_tt(int i, uint32_t tsz, uint64_t ttb, uint32_t granule_sz, int initial_level) "TT[%d]:tsz:%d ttb:0x%"PRIx64" granule_sz:%d, initial_level = %d"
> --

More hwaddrs in here that should be uint64_t.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 09/14] hw/arm/smmuv3: Implement translate callback
  2018-03-09 18:46   ` Peter Maydell
@ 2018-03-12 10:38     ` Eric Auger
  0 siblings, 0 replies; 63+ messages in thread
From: Eric Auger @ 2018-03-12 10:38 UTC (permalink / raw)
  To: Peter Maydell, Eric Auger
  Cc: qemu-arm, QEMU Developers, Prem Mallappa, Alex Williamson,
	Tomasz Nowicki, Michael S. Tsirkin, Christoffer Dall,
	Bharat Bhushan, Jean-Philippe Brucker, Edgar E. Iglesias,
	linuc.decode, Peter Xu

Hi Peter,

On 09/03/18 19:46, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> This patch implements the IOMMU Memory Region translate()
>> callback. Most of the code relates to the translation
>> configuration decoding and check (STE, CD).
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v8 -> v9:
>> - use SMMU_EVENT_STRING macro
>> - get rid of last erro_report's
>> - decode asid
>> - handle config abort before ptw
>> - add 64-bit single-copy atomic comment
>>
>> v7 -> v8:
>> - use address_space_rw
>> - s/Ste/STE, s/Cd/CD
>> - use dma_memory_read
>> - remove everything related to stage 2
>> - collect data for both TTx
>> - renamings
>> - pass the event handle all along the config decoding path
>> - decode tbi, ars
>> ---
>>  hw/arm/smmuv3-internal.h | 146 ++++++++++++++++++++
>>  hw/arm/smmuv3.c          | 341 +++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/arm/trace-events      |   9 ++
>>  3 files changed, 496 insertions(+)
>>
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> index 3929f69..b203426 100644
>> --- a/hw/arm/smmuv3-internal.h
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -462,4 +462,150 @@ typedef struct SMMUEventInfo {
>>
>>  void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *event);
>>
>> +/* Configuration Data */
>> +
>> +/* STE Level 1 Descriptor */
>> +typedef struct STEDesc {
>> +    uint32_t word[2];
>> +} STEDesc;
>> +
>> +/* CD Level 1 Descriptor */
>> +typedef struct CDDesc {
>> +    uint32_t word[2];
>> +} CDDesc;
>> +
>> +/* Stream Table Entry(STE) */
>> +typedef struct STE {
>> +    uint32_t word[16];
>> +} STE;
>> +
>> +/* Context Descriptor(CD) */
>> +typedef struct CD {
>> +    uint32_t word[16];
>> +} CD;
>> +
>> +/* STE fields */
>> +
>> +#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */
> 
> I'm not sure what the comment is supposed to be telling me ?
removed
> 
>> +
>> +#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
>> +#define STE_CFG_S1_ENABLED(config) (config & 0x1)
>> +#define STE_CFG_S2_ENABLED(config) (config & 0x2)
>> +#define STE_CFG_ABORT(config)      (!(config & 0x4))
>> +#define STE_CFG_BYPASS(config)     (config == 0x4)
>> +
>> +#define STE_S1FMT(x)   extract32((x)->word[0], 4 , 2)
>> +#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
>> +#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
>> +#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
>> +#define STE_S2VMID(x)  extract32((x)->word[4], 0 , 16)
>> +#define STE_S2T0SZ(x)  extract32((x)->word[5], 0 , 6)
>> +#define STE_S2SL0(x)   extract32((x)->word[5], 6 , 2)
>> +#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
>> +#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
>> +#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
>> +#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
>> +#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
>> +#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
>> +#define STE_CTXPTR(x)                                           \
>> +    ({                                                          \
>> +        unsigned long addr;                                     \
>> +        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
>> +        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
>> +        addr;                                                   \
>> +    })
>> +
>> +#define STE_S2TTB(x)                                            \
>> +    ({                                                          \
>> +        unsigned long addr;                                     \
>> +        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
>> +        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
>> +        addr;                                                   \
>> +    })
>> +
>> +static inline int oas2bits(int oas_field)
>> +{
>> +    switch (oas_field) {
>> +    case 0b011:
>> +        return 42;
>> +    case 0b100:
>> +        return 44;
>> +    default:
>> +        return 32 + (1 << oas_field);
> 
> Is this right? For instance OAS == 0b001 should be 36,
> but 32 + (1 << 1) == 34 ?  0b110 should be 52, but
> 32 + (1 << 6) == 96 ?
> 
> I'm not sure there's a neat relationship between the field
> values and the address sizes here, so maybe we should just
> have a switch with a case for each value?
indeed, done
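
For reference, a minimal sketch of the explicit switch suggested above.
This is not the code from the respin: the name oas_to_bits and the -1
return for reserved encodings are assumptions made for illustration. The
widths follow the encodings quoted in the review (0b001 is 36 bits,
0b110 is 52 bits).

```c
#include <assert.h>

/*
 * Hypothetical helper: map an OAS/IPS field encoding to an address
 * width in bits, with one case per architected value.
 */
static inline int oas_to_bits(int oas_field)
{
    switch (oas_field) {
    case 0: /* 0b000 */
        return 32;
    case 1: /* 0b001 */
        return 36;
    case 2: /* 0b010 */
        return 40;
    case 3: /* 0b011 */
        return 42;
    case 4: /* 0b100 */
        return 44;
    case 5: /* 0b101 */
        return 48;
    case 6: /* 0b110 */
        return 52;
    default:
        /* Assumption: reserved encodings are reported to the caller. */
        return -1;
    }
}
```

With one case per value there is no arithmetic relationship to get
wrong, and a reserved encoding cannot silently turn into a width.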
> 
>> +   }
>> +}
>> +
>> +static inline int pa_range(STE *ste)
>> +{
>> +    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
>> +
>> +    if (!STE_S2AA64(ste)) {
>> +        return 40;
>> +    }
>> +
>> +    return oas2bits(oas_field);
>> +}
>> +
>> +#define MAX_PA(ste) ((1 << pa_range(ste)) - 1)
>> +
>> +/* CD fields */
>> +
>> +#define CD_VALID(x)   extract32((x)->word[0], 30, 1)
>> +#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
>> +#define CD_TTB(x, sel)                                      \
>> +    ({                                                      \
>> +        uint64_t hi, lo;                                    \
>> +        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \
> 
> Spec says TTB0/1 fields go up to bit 19. You want the whole lot
> even if we don't support address sizes that wide because out
> of range high bits should cause an illegal CD error, not be ignored.
Hmm, I used an older version of the spec where it stopped at bit 15.
> 
>> +        hi <<= 32;                                          \
>> +        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \
> 
> Does this ~0xf end up correctly sign extending up into the top 32 bits?
> ULL suffix would make it certain.
OK
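
A small sketch of why the suffix matters, assuming a 32-bit int. The
helper names are hypothetical. In CD_TTB the masked operand is a 32-bit
word, so the result there is unaffected; the hazard appears once the
operand is 64 bits wide and the mask is an unsigned 32-bit constant,
which is zero-extended rather than sign-extended.

```c
#include <assert.h>
#include <stdint.h>

/* ~0xf is the int -16; converting it to uint64_t sign-extends. */
static inline uint64_t mask_int(uint64_t v)
{
    return v & ~0xf;
}

/* ~0xfU is 0xfffffff0 as unsigned int; it zero-extends and clears
 * the top 32 bits of the operand. */
static inline uint64_t mask_uint(uint64_t v)
{
    return v & ~0xfU;
}

/* ~0xfULL is already 64 bits wide, so no conversion is involved. */
static inline uint64_t mask_ull(uint64_t v)
{
    return v & ~0xfULL;
}
```

Only the ULL form gives the intended mask without relying on the
integer conversion rules, which is the argument for using it
consistently.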
> 
>> +        hi | lo;                                            \
>> +    })
>> +
>> +#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
>> +#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
>> +#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
>> +#define CD_ENDI(x)       extract32((x)->word[0], 15, 1)
>> +#define CD_IPS(x)        extract32((x)->word[1], 0 , 3)
>> +#define CD_TBI(x)        extract32((x)->word[1], 6 , 2)
>> +#define CD_S(x)          extract32((x)->word[1], 12, 1)
>> +#define CD_R(x)          extract32((x)->word[1], 13, 1)
>> +#define CD_A(x)          extract32((x)->word[1], 14, 1)
>> +#define CD_AARCH64(x)    extract32((x)->word[1], 9 , 1)
>> +
>> +#define CDM_VALID(x)    ((x)->word[0] & 0x1)
>> +
>> +static inline int is_cd_valid(SMMUv3State *s, STE *ste, CD *cd)
>> +{
>> +    return CD_VALID(cd);
>> +}
>> +
>> +/**
>> + * tg2granule - Decodes the CD translation granule size field according
>> + * to the ttbr in use
>> + * @bits: TG0/1 fields
>> + * @ttbr: ttbr index in use
>> + */
>> +static inline int tg2granule(int bits, int ttbr)
>> +{
>> +    switch (bits) {
>> +    case 1:
>> +        return ttbr ? 14 : 16;
>> +    case 2:
>> +        return ttbr ? 12 : 14;
>> +    case 3:
>> +        return ttbr ? 16 : 12;
>> +    default:
>> +        return 12;
> 
> TG0 == 0b11 and TG1 == 0b00 are reserved and should cause
> an illegal CD error if the config means they would be used,
> not be treated as if they were one of the other values.
OK modified
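
As an illustration of the point above, a sketch of a granule decode that
reports the reserved encodings instead of mapping them to a valid size.
The name tg_to_granule and the -1 return convention are assumptions, not
the code from the respin; the caller would be expected to raise the
illegal CD error when it sees -1.

```c
#include <assert.h>

/*
 * Hypothetical helper: decode a TG0/TG1 field into a granule shift.
 * @bits: TG0 or TG1 field value
 * @ttbr: 0 when decoding TG0, 1 when decoding TG1
 * Returns 12, 14 or 16, or -1 for a reserved encoding.
 */
static inline int tg_to_granule(int bits, int ttbr)
{
    switch (bits) {
    case 0:
        return ttbr ? -1 : 12; /* TG1 == 0b00 is reserved */
    case 1:
        return ttbr ? 14 : 16;
    case 2:
        return ttbr ? 12 : 14;
    case 3:
        return ttbr ? 16 : -1; /* TG0 == 0b11 is reserved */
    default:
        return -1;
    }
}
```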
> 
> The spec has a definition of a CD_ILLEGAL expression which
> specifies what conditions should cause an illegal CD error.
> It might be useful to have something in the patchset that looks
> like that, and then later you can just assert that you're
> not dealing with illegal values, or ignore them.
At the moment I added some additional checks on S1STALLD, CD_A, CD_S,
CD_HA, CD_HD to match the formulae. My concern is that I am filling the
config struct while checking the data, so it was easier for me to split
this into several checks.

> 
>> +    }
>> +}
>> +
>> +#define L1STD_L2PTR(stm) ({                                 \
>> +            uint64_t hi, lo;                            \
>> +            hi = (stm)->word[1];                        \
>> +            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \
> 
> I see here you have a cast to force the mask to 64-bits wide.
> ULL suffix would be shorter, but in any case do the same thing
> in all places.
OK
> 
>> +            hi << 32 | lo;                              \
>> +        })
> 
> This would definitely be better as an inline function rather than
> a multiline macro that has to use the GCC extension to define
> macro-local variables and return a value.
done
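
For illustration, the macro could look roughly like this as an inline
function. This is a sketch only: the name l1std_l2ptr is an assumption,
STEDesc is redeclared here so the fragment stands alone, and the masking
simply mirrors the quoted macro.

```c
#include <assert.h>
#include <stdint.h>

/* Redeclared from the patch so that the sketch is self-contained. */
typedef struct STEDesc {
    uint32_t word[2];
} STEDesc;

/* Hypothetical inline replacement for the L1STD_L2PTR macro. */
static inline uint64_t l1std_l2ptr(const STEDesc *desc)
{
    uint64_t hi = desc->word[1];
    uint64_t lo = desc->word[0] & ~0x1fULL; /* drop the low 5 bits */

    return hi << 32 | lo;
}
```

The function form type-checks its argument and needs no statement
expression, so it does not depend on the GCC extension.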
> 
>> +
>> +#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 4))
>> +
>>  #endif
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 0adfe53..384393f 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -289,6 +289,344 @@ static void smmuv3_init_regs(SMMUv3State *s)
>>      s->sid_split = 0;
>>  }
>>
>> +static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
>> +                        SMMUEventInfo *event)
>> +{
>> +    int ret;
>> +
>> +    trace_smmuv3_get_ste(addr);
>> +    /* TODO: guarantee 64-bit single-copy atomicity */
>> +    ret = dma_memory_read(&address_space_memory, addr,
>> +                          (void *)buf, sizeof(*buf));
>> +    if (ret != MEMTX_OK) {
>> +        qemu_log_mask(LOG_GUEST_ERROR,
>> +                      "Cannot fetch pte at address=0x%"PRIx64"\n", addr);
>> +        event->type = SMMU_EVT_F_STE_FETCH;
>> +        event->u.f_ste_fetch.addr = addr;
>> +        return -EINVAL;
>> +    }
>> +    return 0;
>> +
>> +}
>> +
>> +/* @ssid > 0 not supported yet */
>> +static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid,
>> +                       CD *buf, SMMUEventInfo *event)
>> +{
>> +    dma_addr_t addr = STE_CTXPTR(ste);
>> +    int ret;
>> +
>> +    trace_smmuv3_get_cd(addr);
>> +    /* TODO: guarantee 64-bit single-copy atomicity */
>> +    ret = dma_memory_read(&address_space_memory, addr,
>> +                           (void *)buf, sizeof(*buf));
>> +    if (ret != MEMTX_OK) {
>> +        qemu_log_mask(LOG_GUEST_ERROR,
>> +                      "Cannot fetch pte at address=0x%"PRIx64"\n", addr);
>> +        event->type = SMMU_EVT_F_CD_FETCH;
>> +        event->u.f_ste_fetch.addr = addr;
>> +        return -EINVAL;
>> +    }
>> +    return 0;
>> +}
>> +
>> +static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
>> +                      STE *ste, SMMUEventInfo *event)
>> +{
>> +    uint32_t config = STE_CONFIG(ste);
>> +    int ret = -EINVAL;
>> +
>> +    if (STE_CFG_ABORT(config)) {
>> +        /* abort but don't record any event */
>> +        cfg->aborted = true;
>> +        return ret;
>> +    }
>> +
>> +    if (STE_CFG_BYPASS(config)) {
>> +        cfg->bypassed = true;
>> +        return ret;
>> +    }
>> +
>> +    if (!STE_VALID(ste)) {
>> +        goto bad_ste;
>> +    }
>> +
>> +    if (STE_CFG_S2_ENABLED(config)) {
>> +        error_setg(&error_fatal, "SMMUv3 does not support stage 2 yet");
> 
> Usual remarks about not using error_setg() for this kind of thing.
done
> 
>> +    }
>> +
>> +    if (STE_S1CDMAX(ste) != 0) {
>> +        error_setg(&error_fatal,
>> +                   "SMMUv3 does not support multiple context descriptors yet");
>> +        goto bad_ste;
>> +    }
>> +    return 0;
>> +
>> +bad_ste:
>> +    event->type = SMMU_EVT_C_BAD_STE;
>> +    return -EINVAL;
>> +}
>> +
>> +/**
>> + * smmu_find_ste - Return the stream table entry associated
>> + * to the sid
>> + *
>> + * @s: smmuv3 handle
>> + * @sid: stream ID
>> + * @ste: returned stream table entry
>> + * @event: handle to an event info
>> + *
>> + * Supports linear and 2-level stream table
>> + * Return 0 on success, -EINVAL otherwise
>> + */
>> +static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
>> +                         SMMUEventInfo *event)
>> +{
>> +    dma_addr_t addr;
>> +    int ret;
>> +
>> +    trace_smmuv3_find_ste(sid, s->features, s->sid_split);
>> +    /* Check SID range */
>> +    if (sid > (1 << SMMU_IDR1_SIDSIZE)) {
>> +        event->type = SMMU_EVT_C_BAD_STREAMID;
>> +        return -EINVAL;
>> +    }
>> +    if (s->features & SMMU_FEATURE_2LVL_STE) {
>> +        int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
>> +        dma_addr_t strtab_base, l1ptr, l2ptr;
>> +        STEDesc l1std;
>> +
>> +        strtab_base = s->strtab_base & SMMU_BASE_ADDR_MASK;
>> +        l1_ste_offset = sid >> s->sid_split;
>> +        l2_ste_offset = sid & ((1 << s->sid_split) - 1);
>> +        l1ptr = (dma_addr_t)(strtab_base + l1_ste_offset * sizeof(l1std));
>> +        /* TODO: guarantee 64-bit single-copy atomicity */
>> +        ret = dma_memory_read(&address_space_memory, l1ptr,
>> +                              (uint8_t *)&l1std, sizeof(l1std));
>> +        if (ret != MEMTX_OK) {
>> +            qemu_log_mask(LOG_GUEST_ERROR,
>> +                          "Could not read L1PTR at 0X%"PRIx64"\n", l1ptr);
>> +            event->type = SMMU_EVT_F_STE_FETCH;
>> +            event->u.f_ste_fetch.addr = l1ptr;
>> +            return -EINVAL;
>> +        }
>> +
>> +        span = L1STD_SPAN(&l1std);
>> +
>> +        if (!span) {
>> +            /* l2ptr is not valid */
>> +            qemu_log_mask(LOG_GUEST_ERROR,
>> +                          "invalid sid=%d (L1STD span=0)\n", sid);
>> +            event->type = SMMU_EVT_C_BAD_STREAMID;
>> +            return -EINVAL;
>> +        }
>> +        max_l2_ste = (1 << span) - 1;
>> +        l2ptr = L1STD_L2PTR(&l1std);
>> +        trace_smmuv3_find_ste_2lvl(s->strtab_base, l1ptr, l1_ste_offset,
>> +                                   l2ptr, l2_ste_offset, max_l2_ste);
>> +        if (l2_ste_offset > max_l2_ste) {
>> +            qemu_log_mask(LOG_GUEST_ERROR,
>> +                          "l2_ste_offset=%d > max_l2_ste=%d\n",
>> +                          l2_ste_offset, max_l2_ste);
>> +            event->type = SMMU_EVT_C_BAD_STE;
>> +            return -EINVAL;
>> +        }
>> +        addr = L1STD_L2PTR(&l1std) + l2_ste_offset * sizeof(*ste);
>> +    } else {
>> +        addr = s->strtab_base + sid * sizeof(*ste);
>> +    }
>> +
>> +    if (smmu_get_ste(s, addr, ste, event)) {
>> +        return -EINVAL;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int decode_cd(SMMUTransCfg *cfg, CD *cd, SMMUEventInfo *event)
>> +{
>> +    int ret = -EINVAL;
>> +    int i;
>> +
>> +    if (!CD_VALID(cd) || !CD_AARCH64(cd)) {
>> +        goto error;
>> +    }
>> +
>> +    /* we support only those at the moment */
>> +    cfg->aa64 = true;
>> +    cfg->stage = 1;
>> +
>> +    cfg->oas = oas2bits(CD_IPS(cd));
>> +    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
>> +    cfg->tbi = CD_TBI(cd);
>> +    cfg->asid = CD_ASID(cd);
>> +
>> +    trace_smmuv3_decode_cd(cfg->oas);
>> +
>> +    /* decode data dependent on TT */
>> +    for (i = 0; i <= 1; i++) {
>> +        int tg, tsz;
>> +        SMMUTransTableInfo *tt = &cfg->tt[i];
>> +
>> +        cfg->tt[i].disabled = CD_EPD(cd, i);
>> +        if (cfg->tt[i].disabled) {
>> +            continue;
>> +        }
>> +
>> +        tsz = CD_TSZ(cd, i);
>> +        if (tsz < 16 || tsz > 39) {
>> +            goto error;
>> +        }
>> +
>> +        tg = CD_TG(cd, i);
>> +        tt->granule_sz = tg2granule(tg, i);
>> +        if ((tt->granule_sz != 12 && tt->granule_sz != 16) || CD_ENDI(cd)) {
>> +            goto error;
>> +        }
>> +
>> +        tt->tsz = tsz;
>> +        tt->initial_level = 4 - (64 - tsz - 4) / (tt->granule_sz - 3);
>> +        tt->ttb = CD_TTB(cd, i);
>> +        tt->ttb = extract64(tt->ttb, 0, cfg->oas);
>> +        trace_smmuv3_decode_cd_tt(i, tt->tsz, tt->ttb,
>> +                                  tt->granule_sz, tt->initial_level);
>> +    }
>> +
>> +    event->record_trans_faults = CD_R(cd);
>> +
>> +    return 0;
>> +
>> +error:
>> +    event->type = SMMU_EVT_C_BAD_CD;
>> +    return ret;
>> +}
>> +
>> +/**
>> + * smmuv3_decode_config - Prepare the translation configuration
>> + * for the @mr iommu region
>> + * @mr: iommu memory region the translation config must be prepared for
>> + * @cfg: output translation configuration which is populated through
>> + *       the different configuration decodng steps
> 
> "decoding"
ok
> 
>> + * @event: must be zero'ed by the callerj
>> + *
>> + * return < 0 if the translation needs to be aborted (@event is filled
>> + * accordingly). Return 0 otherwise.
>> + */
>> +static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg,
>> +                                SMMUEventInfo *event)
>> +{
>> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
>> +    uint32_t sid = smmu_get_sid(sdev);
>> +    SMMUv3State *s = sdev->smmu;
>> +    int ret = -EINVAL;
>> +    STE ste;
>> +    CD cd;
>> +
>> +    if (smmu_find_ste(s, sid, &ste, event)) {
>> +        return ret;
>> +    }
>> +
>> +    if (decode_ste(s, cfg, &ste, event)) {
>> +        return ret;
>> +    }
>> +
>> +    if (smmu_get_cd(s, &ste, 0 /* ssid */, &cd, event)) {
>> +        return ret;
>> +    }
>> +
>> +    return decode_cd(cfg, &cd, event);
>> +}
>> +
>> +static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>> +                                      IOMMUAccessFlags flag)
>> +{
>> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
>> +    SMMUv3State *s = sdev->smmu;
>> +    uint32_t sid = smmu_get_sid(sdev);
>> +    SMMUEventInfo event = {.type = SMMU_EVT_OK, .sid = sid};
>> +    SMMUPTWEventInfo ptw_info = {};
>> +    SMMUTransCfg cfg = {};
>> +    IOMMUTLBEntry entry = {
>> +        .target_as = &address_space_memory,
>> +        .iova = addr,
>> +        .translated_addr = addr,
>> +        .addr_mask = ~(hwaddr)0,
>> +        .perm = IOMMU_NONE,
>> +    };
>> +    int ret = 0;
>> +
>> +    if (!smmu_enabled(s)) {
>> +        goto out;
>> +    }
>> +
>> +    ret = smmuv3_decode_config(mr, &cfg, &event);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    if (cfg.aborted) {
>> +        goto out;
>> +    }
>> +
>> +    ret = smmu_ptw(&cfg, addr, flag, &entry, &ptw_info);
>> +    if (ret) {
>> +        switch (ptw_info.type) {
>> +        case SMMU_PTW_ERR_WALK_EABT:
>> +            event.type = SMMU_EVT_F_WALK_EABT;
>> +            event.u.f_walk_eabt.addr = addr;
>> +            event.u.f_walk_eabt.rnw = flag & 0x1;
>> +            event.u.f_walk_eabt.class = 0x1;
>> +            event.u.f_walk_eabt.addr2 = ptw_info.addr;
>> +            break;
>> +        case SMMU_PTW_ERR_TRANSLATION:
>> +            if (event.record_trans_faults) {
>> +                event.type = SMMU_EVT_F_TRANSLATION;
>> +                event.u.f_translation.addr = addr;
>> +                event.u.f_translation.rnw = flag & 0x1;
>> +            }
>> +            break;
>> +        case SMMU_PTW_ERR_ADDR_SIZE:
>> +            if (event.record_trans_faults) {
>> +                event.type = SMMU_EVT_F_ADDR_SIZE;
>> +                event.u.f_addr_size.addr = addr;
>> +                event.u.f_addr_size.rnw = flag & 0x1;
>> +            }
>> +            break;
>> +        case SMMU_PTW_ERR_ACCESS:
>> +            if (event.record_trans_faults) {
>> +                event.type = SMMU_EVT_F_ACCESS;
>> +                event.u.f_access.addr = addr;
>> +                event.u.f_access.rnw = flag & 0x1;
>> +            }
>> +            break;
>> +        case SMMU_PTW_ERR_PERMISSION:
>> +            if (event.record_trans_faults) {
>> +                event.type = SMMU_EVT_F_PERMISSION;
>> +                event.u.f_permission.addr = addr;
>> +                event.u.f_permission.rnw = flag & 0x1;
>> +            }
>> +            break;
>> +        default:
>> +            error_setg(&error_fatal, "SMMUV3 BUG");
> 
> g_assert_not_reached(), I guess ?
yes
> 
>> +        }
>> +    }
>> +
>> +    trace_smmuv3_translate(mr->parent_obj.name, sid, addr,
>> +                           entry.translated_addr, entry.perm);
>> +out:
>> +    if (ret) {
>> +        qemu_log_mask(LOG_GUEST_ERROR,
>> +                      "%s translation failed for iova=0x%"PRIx64" (%s)\n",
>> +                      mr->parent_obj.name, addr, SMMU_EVENT_STRING(event.type));
>> +        entry.perm = IOMMU_NONE;
>> +        smmuv3_record_event(s, &event);
>> +    } else if (!cfg.aborted) {
>> +        entry.perm = flag;
>> +    }
>> +
>> +    return entry;
>> +}
>> +
>>  static int smmuv3_cmdq_consume(SMMUv3State *s)
>>  {
>>      SMMUCmdError cmd_error = SMMU_CERROR_NONE;
>> @@ -739,6 +1077,9 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
>>  static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>>                                                    void *data)
>>  {
>> +    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>> +
>> +    imrc->translate = smmuv3_translate;
>>  }
>>
>>  static const TypeInfo smmuv3_type_info = {
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index c79c15e..1102bd4 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -29,3 +29,12 @@ smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx v
>>  smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>>  smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
>>  smmuv3_record_event(const char *type, uint32_t sid) "%s sid=%d"
>> +smmuv3_find_ste(uint16_t sid, uint32_t features, uint16_t sid_split) "SID:0x%x features:0x%x, sid_split:0x%x"
>> +smmuv3_find_ste_2lvl(uint64_t strtab_base, hwaddr l1ptr, int l1_ste_offset, hwaddr l2ptr, int l2_ste_offset, int max_l2_ste) "strtab_base:0x%lx l1ptr:0x%"PRIx64" l1_off:0x%x, l2ptr:0x%"PRIx64" l2_off:0x%x max_l2_ste:%d"
>> +smmuv3_get_ste(hwaddr addr) "STE addr: 0x%"PRIx64
>> +smmuv3_translate_bypass(const char *n, uint16_t sid, hwaddr addr, bool is_write) "%s sid=%d bypass iova:0x%"PRIx64" is_write=%d"
>> +smmuv3_translate_in(uint16_t sid, int pci_bus_num, hwaddr strtab_base) "SID:0x%x bus:%d strtab_base:0x%"PRIx64
>> +smmuv3_get_cd(hwaddr addr) "CD addr: 0x%"PRIx64
>> +smmuv3_translate(const char *n, uint16_t sid, hwaddr iova, hwaddr translated, int perm) "%s sid=%d iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x"
>> +smmuv3_decode_cd(uint32_t oas) "oas=%d"
>> +smmuv3_decode_cd_tt(int i, uint32_t tsz, uint64_t ttb, uint32_t granule_sz, int initial_level) "TT[%d]:tsz:%d ttb:0x%"PRIx64" granule_sz:%d, initial_level = %d"
>> --
> 
> More hwaddrs in here that should be uint64_t.
fixed

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case
  2018-03-09 17:59       ` Peter Maydell
@ 2018-03-12 10:53         ` Eric Auger
  2018-03-12 11:10           ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Eric Auger @ 2018-03-12 10:53 UTC (permalink / raw)
  To: Peter Maydell, Auger Eric
  Cc: qemu-arm, QEMU Developers, Prem Mallappa, Alex Williamson,
	Tomasz Nowicki, Michael S. Tsirkin, Christoffer Dall,
	Bharat Bhushan, Jean-Philippe Brucker, Edgar E. Iglesias,
	linuc.decode, Peter Xu

Hi Peter,

On 09/03/18 18:59, Peter Maydell wrote:
> On 9 March 2018 at 17:53, Auger Eric <eric.auger@redhat.com> wrote:
>> Hi Peter,
>> On 08/03/18 20:06, Peter Maydell wrote:
>>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>> At the moment, the SMMUv3 does not support notification on
>>>> TLB invalidation. So let's abort as soon as such notifier gets
>>>> enabled.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> ---
>>>>  hw/arm/smmuv3.c | 11 +++++++++++
>>>>  1 file changed, 11 insertions(+)
>>>>
>>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>>> index 384393f..5efe933 100644
>>>> --- a/hw/arm/smmuv3.c
>>>> +++ b/hw/arm/smmuv3.c
>>>> @@ -1074,12 +1074,23 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
>>>>      dc->realize = smmu_realize;
>>>>  }
>>>>
>>>> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
>>>> +                                       IOMMUNotifierFlag old,
>>>> +                                       IOMMUNotifierFlag new)
>>>> +{
>>>> +    if (old == IOMMU_NOTIFIER_NONE) {
>>>> +        error_setg(&error_fatal,
>>>> +                   "SMMUV3: vhost and vfio notifiers not yet supported");
>>>> +    }
>>>> +}
>>>
>>> Is this triggerable by the guest, or by the user on the command
>>> line, or only by a bug in the board or other QEMU code?
>> by the user on the command line.
> 
> OK. Do they get this error immediately on startup, or only later
> in execution? (If the latter, is it possible to make the error
> happen earlier?)
Later in execution. We also have to handle the case where such a device
is hot-plugged. At best it could be done in smmu_find_add_as() by
checking the type of the attached device, but this wouldn't happen much
earlier. By the way, we will soon support vhost, and we will just rule
out vfio integration by detecting map notifiers.

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case
  2018-03-12 10:53         ` Eric Auger
@ 2018-03-12 11:10           ` Peter Maydell
  2018-03-12 15:01             ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-12 11:10 UTC (permalink / raw)
  To: Eric Auger
  Cc: Auger Eric, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 12 March 2018 at 10:53, Eric Auger <eric.auger.pro@gmail.com> wrote:
> Hi Peter,
>
> On 09/03/18 18:59, Peter Maydell wrote:
>> On 9 March 2018 at 17:53, Auger Eric <eric.auger@redhat.com> wrote:
>>> Hi Peter,
>>> On 08/03/18 20:06, Peter Maydell wrote:
>>>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>>> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
>>>>> +                                       IOMMUNotifierFlag old,
>>>>> +                                       IOMMUNotifierFlag new)
>>>>> +{
>>>>> +    if (old == IOMMU_NOTIFIER_NONE) {
>>>>> +        error_setg(&error_fatal,
>>>>> +                   "SMMUV3: vhost and vfio notifiers not yet supported");
>>>>> +    }
>>>>> +}
>>>>
>>>> Is this triggerable by the guest, or by the user on the command
>>>> line, or only by a bug in the board or other QEMU code?
>>> by the user on the command line.
>>
>> OK. Do they get this error immediately on startup, or only later
>> in execution? (If the latter, is it possible to make the error
>> happen earlier?)

> Later in execution. We also have to handle the case where such a device
> is hot-plugged. At best it could be done in smmu_find_add_as() by
> checking the type of the attached device, but this wouldn't happen much
> earlier. By the way, we will soon support vhost, and we will just rule
> out vfio integration by detecting map notifiers.

Hmm. error_fatal is a bit unfortunate for a hotplug event -- ideally
you would want to cause the hotplug to cleanly fail without aborting
the running QEMU session.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
@ 2018-03-12 11:59   ` Peter Maydell
  2018-03-12 15:16     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-12 11:59 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu, Paolo Bonzini

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> In case the MSI is translated by an IOMMU we need to fixup the
> MSI route with the translated address.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v5 -> v6:
> - use IOMMUMemoryRegionClass API
>
> It is still unclear to me if we need to register an IOMMUNotifier
> to handle any change in the MSI doorbell which would occur behind
> the scene and would not lead to any call to kvm_arch_fixup_msi_route().

Paolo, do you know the answer to this question ?

> ---
>  target/arm/kvm.c        | 27 +++++++++++++++++++++++++++
>  target/arm/trace-events |  3 +++
>  2 files changed, 30 insertions(+)
>
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 1219d00..9f5976a 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -20,8 +20,13 @@
>  #include "sysemu/kvm.h"
>  #include "kvm_arm.h"
>  #include "cpu.h"
> +#include "trace.h"
>  #include "internals.h"
>  #include "hw/arm/arm.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci/msi.h"
> +#include "hw/arm/smmu-common.h"
> +#include "hw/arm/smmuv3.h"
>  #include "exec/memattrs.h"
>  #include "exec/address-spaces.h"
>  #include "hw/boards.h"
> @@ -666,6 +671,28 @@ int kvm_arm_vgic_probe(void)
>  int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>                               uint64_t address, uint32_t data, PCIDevice *dev)
>  {
> +    AddressSpace *as = pci_device_iommu_address_space(dev);
> +    IOMMUMemoryRegionClass *imrc;
> +    IOMMUTLBEntry entry;
> +    SMMUDevice *sdev;
> +
> +    if (as == &address_space_memory) {
> +        return 0;
> +    }
> +
> +    /* MSI doorbell address is translated by an IOMMU */
> +    sdev = container_of(as, SMMUDevice, as);
> +    imrc = IOMMU_MEMORY_REGION_GET_CLASS(&sdev->iommu);
> +
> +    entry = imrc->translate(&sdev->iommu, address, IOMMU_WO);
> +
> +    route->u.msi.address_lo = entry.translated_addr;
> +    route->u.msi.address_hi = entry.translated_addr >> 32;
> +
> +    trace_kvm_arm_fixup_msi_route(address, sdev->devfn,
> +                                  sdev->iommu.parent_obj.name,
> +                                  entry.translated_addr);
> +
>      return 0;
>  }

It seems a bit odd that:
 * the code for arm for "PCI devices behind IOMMU need to have
   the MSI doorbell writes go through the IOMMU" looks rather
   different from the code for x86 for the same thing
 * the code here needs to know specifically that this is an SMMU
   and not some other kind of IOMMU

I would have expected this to be more generic-to-all-IOMMU
APIs and maybe even implemented in the generic KVM support code...

The x86 code seems to be similarly x86-specific though, so
this is more of a "perhaps there is a cleanup opportunity here"
observation I guess.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
@ 2018-03-12 12:46   ` Peter Maydell
  2018-03-12 15:01     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-12 12:46 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> From: Prem Mallappa <prem.mallappa@broadcom.com>
>
> Add code to instantiate an smmuv3 in virt machine. A new iommu
> integer member is introduced in VirtMachineState to store the type
> of the iommu in use.
>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v7 -> v8:
> - integer iommu member
> - add primary-bus property
>
> v4 -> v5:
> - add dma-coherent property
>
> v2 -> v3:
> - vbi was removed. Use vms instead
> - migrate to new smmu binding format (iommu-map)
> - don't use appendprop anymore
> - add vms->smmu and guard instantiation with this latter
> - interrupts type changed to edge
>
> Conflicts:
>         hw/arm/smmuv3.c
> ---
>  hw/arm/virt.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  include/hw/arm/virt.h | 10 ++++++++
>  2 files changed, 73 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index dbb3c80..e9dca0d 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -58,6 +58,7 @@
>  #include "hw/smbios/smbios.h"
>  #include "qapi/visitor.h"
>  #include "standard-headers/linux/input.h"
> +#include "hw/arm/smmuv3.h"
>
>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> @@ -141,6 +142,7 @@ static const MemMapEntry a15memmap[] = {
>      [VIRT_FW_CFG] =             { 0x09020000, 0x00000018 },
>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
> +    [VIRT_SMMU] =               { 0x09050000, 0x00020000 }, /* 128K, needed */

What does the "needed" comment mean here?

>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> @@ -161,6 +163,7 @@ static const int a15irqmap[] = {
>      [VIRT_SECURE_UART] = 8,
>      [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
>      [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
> +    [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
>      [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
>  };
>
> @@ -941,7 +944,57 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
>                             0x7           /* PCI irq */);
>  }
>
> -static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
> +static void create_smmu(const VirtMachineState *vms, qemu_irq *pic,
> +                        PCIBus *bus)

Side suggestion: if you add "algorithm = histogram" to the [diff] section
of your .gitconfig, you will probably find that it doesn't generate patches
with this sort of silly patch hunk any more. I only discovered this
config setting recently but I think it's better than the default.

> +{
> +    char *node;
> +    const char compat[] = "arm,smmu-v3";
> +    int irq =  vms->irqmap[VIRT_SMMU];
> +    int i;
> +    hwaddr base = vms->memmap[VIRT_SMMU].base;
> +    hwaddr size = vms->memmap[VIRT_SMMU].size;
> +    const char irq_names[] = "eventq\0priq\0cmdq-sync\0gerror";
> +    DeviceState *dev;
> +
> +    if (vms->iommu != VIRT_IOMMU_SMMUV3 || !vms->iommu_phandle) {
> +        return;
> +    }
> +
> +    dev = qdev_create(NULL, "arm-smmuv3");
> +
> +    object_property_set_link(OBJECT(dev), OBJECT(bus), "primary-bus",
> +                             &error_abort);
> +    qdev_init_nofail(dev);
> +    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
> +    for (i = 0; i < NUM_SMMU_IRQS; i++) {
> +        sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic[irq + i]);
> +    }
> +
> +    node = g_strdup_printf("/smmuv3@%" PRIx64, base);
> +    qemu_fdt_add_subnode(vms->fdt, node);
> +    qemu_fdt_setprop(vms->fdt, node, "compatible", compat, sizeof(compat));
> +    qemu_fdt_setprop_sized_cells(vms->fdt, node, "reg", 2, base, 2, size);
> +
> +    qemu_fdt_setprop_cells(vms->fdt, node, "interrupts",
> +            GIC_FDT_IRQ_TYPE_SPI, irq    , GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
> +            GIC_FDT_IRQ_TYPE_SPI, irq + 1, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
> +            GIC_FDT_IRQ_TYPE_SPI, irq + 2, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
> +            GIC_FDT_IRQ_TYPE_SPI, irq + 3, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
> +
> +    qemu_fdt_setprop(vms->fdt, node, "interrupt-names", irq_names,
> +                     sizeof(irq_names));
> +
> +    qemu_fdt_setprop_cell(vms->fdt, node, "clocks", vms->clock_phandle);
> +    qemu_fdt_setprop_string(vms->fdt, node, "clock-names", "apb_pclk");
> +    qemu_fdt_setprop(vms->fdt, node, "dma-coherent", NULL, 0);

Is this definitely the right setting for dma-coherent? My (possibly
naive) thought is that for an emulated SMMU our page-table walks are
going to be coherent with the CPU's view.

> +
> +    qemu_fdt_setprop_cell(vms->fdt, node, "#iommu-cells", 1);
> +
> +    qemu_fdt_setprop_cell(vms->fdt, node, "phandle", vms->iommu_phandle);
> +    g_free(node);
> +}

> +enum {
> +    VIRT_IOMMU_NONE,
> +    VIRT_IOMMU_SMMUV3,
> +    VIRT_IOMMU_VIRTIO,
> +};
> +
>  typedef struct MemMapEntry {
>      hwaddr base;
>      hwaddr size;
> @@ -96,6 +104,7 @@ typedef struct {
>      bool its;
>      bool virt;
>      int32_t gic_version;
> +    int32_t iommu;

This is always one of the VIRT_IOMMU_* values, right?
I think we should give that enum a typedef name and then
use the type here.
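
As a minimal sketch of that suggestion (the typedef name VirtIOMMUType
and the trimmed struct are assumptions for illustration, not taken from
the patch):

```c
#include <stdint.h>

/* Hypothetical typedef name; the enumerators are the ones from the patch. */
typedef enum VirtIOMMUType {
    VIRT_IOMMU_NONE,
    VIRT_IOMMU_SMMUV3,
    VIRT_IOMMU_VIRTIO,
} VirtIOMMUType;

/* Trimmed stand-in for VirtMachineState, keeping only nearby fields. */
typedef struct VirtMachineStateSketch {
    int32_t gic_version;
    VirtIOMMUType iommu;   /* was: int32_t iommu */
} VirtMachineStateSketch;

/* Example helper: the typed field documents which values are legal. */
static int virt_has_smmuv3(const VirtMachineStateSketch *vms)
{
    return vms->iommu == VIRT_IOMMU_SMMUV3;
}
```

With the typedef in place, the field declaration itself tells the reader
that only VIRT_IOMMU_* values belong there.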

>      struct arm_boot_info bootinfo;
>      const MemMapEntry *memmap;
>      const int *irqmap;
> @@ -105,6 +114,7 @@ typedef struct {
>      uint32_t clock_phandle;
>      uint32_t gic_phandle;
>      uint32_t msi_phandle;
> +    uint32_t iommu_phandle;
>      int psci_conduit;
>  } VirtMachineState;
>

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
@ 2018-03-12 12:48   ` Peter Maydell
  2018-03-19 14:32     ` Shannon Zhao
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-12 12:48 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu, Shannon Zhao,
	Shannon Zhao

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> From: Prem Mallappa <prem.mallappa@broadcom.com>
>
> This patch builds the smmuv3 node in the ACPI IORT table.
>
> The RID space of the root complex, which spans 0x0-0x10000
> maps to streamid space 0x0-0x10000 in smmuv3, which in turn
> maps to deviceid space 0x0-0x10000 in the ITS group.
>
> The guest must feature the IOMMU probe deferral series
> (https://lkml.org/lkml/2017/4/10/214) which fixes streamid
> multiple lookup. This bug is not related to the SMMU emulation.
>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> v2 -> v3:
> - integrate into the existing IORT table made up of ITS, RC nodes
> - take into account vms->smmu
> - match linux actbl2.h acpi_iort_smmu_v3 field names
> ---
>  hw/arm/virt-acpi-build.c    | 56 +++++++++++++++++++++++++++++++++++++++------
>  include/hw/acpi/acpi-defs.h | 15 ++++++++++++
>  2 files changed, 64 insertions(+), 7 deletions(-)

Since ACPI is definitely not my area of expertise, I've cc'd
Shannon on this patch. Shannon, could you take a look at it, please?

> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index f7fa795..4b5ad91 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -393,19 +393,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
>  }
>
>  static void
> -build_iort(GArray *table_data, BIOSLinker *linker)
> +build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>  {
> -    int iort_start = table_data->len;
> +    int nb_nodes, iort_start = table_data->len;
>      AcpiIortIdMapping *idmap;
>      AcpiIortItsGroup *its;
>      AcpiIortTable *iort;
> -    size_t node_size, iort_length;
> +    AcpiIortSmmu3 *smmu;
> +    size_t node_size, iort_length, smmu_offset = 0;
>      AcpiIortRC *rc;
>
>      iort = acpi_data_push(table_data, sizeof(*iort));
>
> +    if (vms->iommu) {
> +        nb_nodes = 3; /* RC, ITS, SMMUv3 */
> +    } else {
> +        nb_nodes = 2; /* RC, ITS */
> +    }
> +
>      iort_length = sizeof(*iort);
> -    iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
> +    iort->node_count = cpu_to_le32(nb_nodes);
>      iort->node_offset = cpu_to_le32(sizeof(*iort));
>
>      /* ITS group node */
> @@ -418,6 +425,35 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>      its->its_count = cpu_to_le32(1);
>      its->identifiers[0] = 0; /* MADT translation_id */
>
> +    if (vms->iommu == VIRT_IOMMU_SMMUV3) {
> +        int irq =  vms->irqmap[VIRT_SMMU];
> +
> +        /* SMMUv3 node */
> +        smmu_offset = cpu_to_le32(iort->node_offset + node_size);
> +        node_size = sizeof(*smmu) + sizeof(*idmap);
> +        iort_length += node_size;
> +        smmu = acpi_data_push(table_data, node_size);
> +
> +
> +        smmu->type = ACPI_IORT_NODE_SMMU_V3;
> +        smmu->length = cpu_to_le16(node_size);
> +        smmu->mapping_count = cpu_to_le32(1);
> +        smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
> +        smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
> +        smmu->event_gsiv = cpu_to_le32(irq);
> +        smmu->pri_gsiv = cpu_to_le32(irq + 1);
> +        smmu->gerr_gsiv = cpu_to_le32(irq + 2);
> +        smmu->sync_gsiv = cpu_to_le32(irq + 3);
> +
> +        /* Identity RID mapping covering the whole input RID range */
> +        idmap = &smmu->id_mapping_array[0];
> +        idmap->input_base = 0;
> +        idmap->id_count = cpu_to_le32(0xFFFF);
> +        idmap->output_base = 0;
> +        /* output IORT node is the ITS group node (the first node) */
> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
> +    }
> +
>      /* Root Complex Node */
>      node_size = sizeof(*rc) + sizeof(*idmap);
>      iort_length += node_size;
> @@ -438,8 +474,14 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>      idmap->input_base = 0;
>      idmap->id_count = cpu_to_le32(0xFFFF);
>      idmap->output_base = 0;
> -    /* output IORT node is the ITS group node (the first node) */
> -    idmap->output_reference = cpu_to_le32(iort->node_offset);
> +
> +    if (vms->iommu) {
> +        /* output IORT node is the smmuv3 node */
> +        idmap->output_reference = cpu_to_le32(smmu_offset);
> +    } else {
> +        /* output IORT node is the ITS group node (the first node) */
> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
> +    }
>
>      iort->length = cpu_to_le32(iort_length);
>
> @@ -786,7 +828,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>
>      if (its_class_name() && !vmc->no_its) {
>          acpi_add_table(table_offsets, tables_blob);
> -        build_iort(tables_blob, tables->linker);
> +        build_iort(tables_blob, tables->linker, vms);
>      }
>
>      /* XSDT is pointed to by RSDP */
> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
> index 80c8099..068ce28 100644
> --- a/include/hw/acpi/acpi-defs.h
> +++ b/include/hw/acpi/acpi-defs.h
> @@ -700,6 +700,21 @@ struct AcpiIortItsGroup {
>  } QEMU_PACKED;
>  typedef struct AcpiIortItsGroup AcpiIortItsGroup;
>
> +struct AcpiIortSmmu3 {
> +    ACPI_IORT_NODE_HEADER_DEF
> +    uint64_t base_address;
> +    uint32_t flags;
> +    uint32_t reserved2;
> +    uint64_t vatos_address;
> +    uint32_t model;
> +    uint32_t event_gsiv;
> +    uint32_t pri_gsiv;
> +    uint32_t gerr_gsiv;
> +    uint32_t sync_gsiv;
> +    AcpiIortIdMapping id_mapping_array[0];
> +} QEMU_PACKED;
> +typedef struct AcpiIortSmmu3 AcpiIortSmmu3;
> +
>  struct AcpiIortRC {
>      ACPI_IORT_NODE_HEADER_DEF
>      AcpiIortMemoryAccess memory_properties;
> --
> 2.5.5
>
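
For readers following the ID mappings in the patch above, here is a
minimal sketch of how a single mapping entry translates an input ID.
It assumes (my reading of the IORT layout, not something stated in the
patch) that the count field holds the number of IDs in the range minus
one. The struct is a host-endian stand-in, not AcpiIortIdMapping itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-endian stand-in for the AcpiIortIdMapping fields used in the patch. */
typedef struct IdMappingSketch {
    uint32_t input_base;
    uint32_t id_count;      /* assumed: number of IDs in the range minus one */
    uint32_t output_base;
} IdMappingSketch;

/* Returns true and stores the translated ID if 'id' is inside the mapping. */
static bool id_mapping_translate(const IdMappingSketch *map, uint32_t id,
                                 uint32_t *out)
{
    if (id < map->input_base || id - map->input_base > map->id_count) {
        return false;
    }
    *out = map->output_base + (id - map->input_base);
    return true;
}
```

Under that assumption, the 0xFFFF count used for both the RC and SMMUv3
nodes covers input IDs 0x0 through 0xFFFF as an identity mapping.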

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 14/14] hw/arm/virt: Handle iommu in 2.12 machine type
  2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 14/14] hw/arm/virt: Handle iommu in 2.12 machine type Eric Auger
@ 2018-03-12 12:56   ` Peter Maydell
  2018-03-12 15:01     ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-12 12:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
> The new machine type exposes a new "iommu" virt machine option.
> The SMMUv3 IOMMU is instantiated using -machine virt,iommu=smmuv3.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v7 -> v8:
> - Revert to machine option, now dubbed "iommu", preparing for virtio
>   instantiation.
>
> v5 -> v6: machine 2_11
>
> Another alternative would be to use the -device option as
> done on x86. As the smmu is a sysbus device, we would need to
> use the platform bus framework.
> ---
>  hw/arm/virt.c         | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/arm/virt.h |  1 +
>  2 files changed, 46 insertions(+)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index e9dca0d..607c7e1 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1547,6 +1547,34 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
>      }
>  }
>
> +static char *virt_get_iommu(Object *obj, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(obj);
> +
> +    switch (vms->iommu) {
> +    case VIRT_IOMMU_NONE:
> +        return g_strdup("none");
> +    case VIRT_IOMMU_SMMUV3:
> +        return g_strdup("smmuv3");
> +    default:
> +        return g_strdup("none");

Isn't this a "can't happen" case? If so, g_assert_not_reached().
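
Roughly what I have in mind, as a self-contained sketch: abort() stands
in for g_assert_not_reached(), a static string stands in for the
g_strdup() result, and the type and function names are made up for the
example.

```c
#include <stdlib.h>

typedef enum {
    VIRT_IOMMU_NONE,
    VIRT_IOMMU_SMMUV3,
    VIRT_IOMMU_VIRTIO,
} IommuKindSketch;

/* Returns a static string; the real getter would g_strdup() it. */
static const char *iommu_kind_name(IommuKindSketch kind)
{
    switch (kind) {
    case VIRT_IOMMU_NONE:
        return "none";
    case VIRT_IOMMU_SMMUV3:
        return "smmuv3";
    default:
        /* Stand-in for g_assert_not_reached(): fail loudly rather
         * than silently reporting "none" for an unexpected value. */
        abort();
    }
}
```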

> +    }
> +}
> +
> +static void virt_set_iommu(Object *obj, const char *value, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(obj);
> +
> +    if (!strcmp(value, "smmuv3")) {
> +        vms->iommu = VIRT_IOMMU_SMMUV3;
> +    } else if (!strcmp(value, "none")) {
> +        vms->iommu = VIRT_IOMMU_NONE;
> +    } else {
> +        error_setg(errp, "Invalid iommu value");
> +        error_append_hint(errp, "Valid value are none, smmuv3\n");

"Valid values are", and add trailing "." to the hint string (matches
virt_set_gic_version() phrasing.)
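
A self-contained sketch of the setter with the reworded hint. ErrorSketch
is a hypothetical stand-in for QEMU's Error object, and parse_iommu is a
made-up name; only the two strings reflect what is being asked for.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    VIRT_IOMMU_NONE,
    VIRT_IOMMU_SMMUV3,
} IommuKindSketch;

/* Hypothetical error record standing in for QEMU's Error object. */
typedef struct {
    const char *msg;
    const char *hint;
} ErrorSketch;

/* Returns 0 on success; on an unknown value returns -1, fills *err and
 * leaves *out untouched. */
static int parse_iommu(const char *value, IommuKindSketch *out,
                       ErrorSketch *err)
{
    if (!strcmp(value, "smmuv3")) {
        *out = VIRT_IOMMU_SMMUV3;
    } else if (!strcmp(value, "none")) {
        *out = VIRT_IOMMU_NONE;
    } else {
        err->msg = "Invalid iommu value";
        err->hint = "Valid values are none, smmuv3.\n";
        return -1;
    }
    return 0;
}
```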

> +    }
> +}
> +
>  static CpuInstanceProperties
>  virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>  {
> @@ -1679,6 +1707,19 @@ static void virt_2_12_instance_init(Object *obj)
>                                          NULL);
>      }
>
> +    if (vmc->no_iommu) {
> +        vms->iommu = VIRT_IOMMU_NONE;
> +    } else {
> +        /* Default disallows smmu instantiation */
> +        vms->iommu = VIRT_IOMMU_NONE;

If the default is "none" then you don't need the no_iommu field
in the VirtMachineClass. You can just have the property exist
on all virt machines, the way we do for "secure" and "virtualization".
It's only if we want to have the default for newer virt-n.nn versions
be something other than what the older machines had that we need to
have the VirtMachineClass field that lets us distinguish the older
and newer machine types.
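
To illustrate the distinction, a simplified sketch with made-up types.
The no_its flag is used here only as an example of a default that
differs between machine versions; this is not the actual virt.c code.

```c
#include <stdbool.h>

typedef enum {
    VIRT_IOMMU_NONE,
    VIRT_IOMMU_SMMUV3,
} IommuKindSketch;

/* Hypothetical per-version class data: only defaults that differ
 * between machine versions need a field here. */
typedef struct {
    bool no_its;
} MachineClassSketch;

typedef struct {
    bool its;
    IommuKindSketch iommu;
} MachineStateSketch;

static void machine_instance_init(MachineStateSketch *ms,
                                  const MachineClassSketch *mc)
{
    /* Default changed across versions, so it is keyed off a class flag. */
    ms->its = !mc->no_its;
    /* Same default on every version: no class flag is needed. */
    ms->iommu = VIRT_IOMMU_NONE;
}
```

Since "none" is the default everywhere, old and new machine types end up
in the same state unless the user explicitly asks for an IOMMU.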

> +        object_property_add_str(obj, "iommu", virt_get_iommu,
> +                                 virt_set_iommu, NULL);
> +        object_property_set_description(obj, "iommu",
> +                                        "Set the IOMMU model among "
> +                                        "none, smmuv3 (default none)",
> +                                        NULL);
> +    }
> +
>      vms->memmap = a15memmap;
>      vms->irqmap = a15irqmap;
>  }
> @@ -1698,8 +1739,12 @@ static void virt_2_11_instance_init(Object *obj)
>
>  static void virt_machine_2_11_options(MachineClass *mc)
>  {
> +    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
> +
>      virt_machine_2_12_options(mc);
>      SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_11);
> +
> +    vmc->no_iommu = true;
>  }
>  DEFINE_VIRT_MACHINE(2, 11)
>
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 13d3724..3a92fc3 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -92,6 +92,7 @@ typedef struct {
>      bool disallow_affinity_adjustment;
>      bool no_its;
>      bool no_pmu;
> +    bool no_iommu;
>      bool claim_edge_triggered_timers;
>  } VirtMachineClass;
>
> --
> 2.5.5
>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support
  2018-02-28  8:44   ` Auger Eric
@ 2018-03-12 12:58     ` Peter Maydell
  2018-03-12 15:22       ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2018-03-12 12:58 UTC (permalink / raw)
  To: Auger Eric
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, Eric Auger

On 28 February 2018 at 08:44, Auger Eric <eric.auger@redhat.com> wrote:
> On 27/02/18 20:02, Peter Maydell wrote:
>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>> This series implements the emulation code for ARM SMMUv3.

>> What state is this series in now? Is it "seems ready to
>> go, needs review"? Are you hoping it might get in for 2.12,
>> or targeting 2.13 ?

> Yes, I think it is in a decent state and I would be happy to get some new
> reviews, with a view to a possible pull in 2.12. In any case I will respond
> quickly to any comments to that end.

Hi, Eric. I've now gone through the patchset and reviewed it. Sorry
I didn't manage to get to this in time to be able to put it into 2.12,
but there weren't any major or structural issues, so we should be
able to get this into the tree early in the 2.13 dev cycle.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case
  2018-03-12 11:10           ` Peter Maydell
@ 2018-03-12 15:01             ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-12 15:01 UTC (permalink / raw)
  To: Peter Maydell, Eric Auger
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa

Hi Peter,

On 12/03/18 12:10, Peter Maydell wrote:
> On 12 March 2018 at 10:53, Eric Auger <eric.auger.pro@gmail.com> wrote:
>> Hi Peter,
>>
>> On 09/03/18 18:59, Peter Maydell wrote:
>>> On 9 March 2018 at 17:53, Auger Eric <eric.auger@redhat.com> wrote:
>>>> Hi Peter,
>>>> On 08/03/18 20:06, Peter Maydell wrote:
>>>>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>>>> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
>>>>>> +                                       IOMMUNotifierFlag old,
>>>>>> +                                       IOMMUNotifierFlag new)
>>>>>> +{
>>>>>> +    if (old == IOMMU_NOTIFIER_NONE) {
>>>>>> +        error_setg(&error_fatal,
>>>>>> +                   "SMMUV3: vhost and vfio notifiers not yet supported");
>>>>>> +    }
>>>>>> +}
>>>>>
>>>>> Is this triggerable by the guest, or by the user on the command
>>>>> line, or only by a bug in the board or other QEMU code?
>>>> by the user on the command line.
>>>
>>> OK. Do they get this error immediately on startup, or only later
>>> in execution? (If the latter, is it possible to make the error
>>> happen earlier?)
> 
>> later in execution. We also have to handle the case where such a device is
>> hot-plugged. At best it could be done in smmu_find_add_as() by checking
>> the type of the attached device, but this wouldn't happen much earlier. By
>> the way, we will soon support vhost and we will just rule out vfio
>> integration by detecting map notifiers.
> 
> Hmm. error_fatal is a bit unfortunate for a hotplug event -- ideally
> you would want to cause the hotplug to cleanly fail without aborting
> the running QEMU session.

For now I suggest replacing the error_fatal with a warn_report saying
that vhost/vfio devices will not function properly.

In the short term the restriction should apply only to VFIO devices.

Having something more elegant would imply some modifications in the pci
subsystem I think:

pci_init_bus_master
	-> pci_device_iommu_address_space
		->iommu_fn = smmu_find_add_as

pci_init_bus_master is called in pcibus_machine_done and
pci_qdev_realize/do_pci_register_device, but it returns void at the moment.

In smmu_find_add_as I could check whether the device is a VFIO device.

In case of a VFIO device smmu_find_add_as could return NULL; test this
in pci_init_bus_master and propagate the error upstream?
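
A rough sketch of that idea, with made-up types and names standing in
for the real PCI and SMMU structures (nothing here is existing QEMU
API; the error string is also only an example):

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { bool is_vfio; } PciDeviceSketch;
typedef struct { const char *name; } AddressSpaceSketch;

static AddressSpaceSketch smmu_as_sketch = { "smmu" };

/* Stand-in for smmu_find_add_as(): refuse devices the SMMU cannot
 * serve yet by returning NULL. */
static AddressSpaceSketch *find_add_as_sketch(const PciDeviceSketch *dev)
{
    return dev->is_vfio ? NULL : &smmu_as_sketch;
}

/* Stand-in for a non-void pci_init_bus_master(): returns 0 on success,
 * or -1 with *errmsg set so the caller can fail the (hot)plug cleanly. */
static int init_bus_master_sketch(const PciDeviceSketch *dev,
                                  const char **errmsg)
{
    if (!find_add_as_sketch(dev)) {
        *errmsg = "SMMUv3: VFIO devices not yet supported";
        return -1;
    }
    return 0;
}
```

The point is only that the failure becomes a return value the realize
path can propagate, instead of a fatal error in a running session.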


Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 14/14] hw/arm/virt: Handle iommu in 2.12 machine type
  2018-03-12 12:56   ` Peter Maydell
@ 2018-03-12 15:01     ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-12 15:01 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Peter,

On 12/03/18 13:56, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> The new machine type exposes a new "iommu" virt machine option.
>> The SMMUv3 IOMMU is instantiated using -machine virt,iommu=smmuv3.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v7 -> v8:
>> - Revert to machine option, now dubbed "iommu", preparing for virtio
>>   instantiation.
>>
>> v5 -> v6: machine 2_11
>>
>> Another alternative would be to use the -device option as
>> done on x86. As the smmu is a sysbus device, we would need to
>> use the platform bus framework.
>> ---
>>  hw/arm/virt.c         | 45 +++++++++++++++++++++++++++++++++++++++++++++
>>  include/hw/arm/virt.h |  1 +
>>  2 files changed, 46 insertions(+)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index e9dca0d..607c7e1 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -1547,6 +1547,34 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
>>      }
>>  }
>>
>> +static char *virt_get_iommu(Object *obj, Error **errp)
>> +{
>> +    VirtMachineState *vms = VIRT_MACHINE(obj);
>> +
>> +    switch (vms->iommu) {
>> +    case VIRT_IOMMU_NONE:
>> +        return g_strdup("none");
>> +    case VIRT_IOMMU_SMMUV3:
>> +        return g_strdup("smmuv3");
>> +    default:
>> +        return g_strdup("none");
> 
> Isn't this a "can't happen" case? If so, g_assert_not_reached().
yes, done
> 
>> +    }
>> +}
>> +
>> +static void virt_set_iommu(Object *obj, const char *value, Error **errp)
>> +{
>> +    VirtMachineState *vms = VIRT_MACHINE(obj);
>> +
>> +    if (!strcmp(value, "smmuv3")) {
>> +        vms->iommu = VIRT_IOMMU_SMMUV3;
>> +    } else if (!strcmp(value, "none")) {
>> +        vms->iommu = VIRT_IOMMU_NONE;
>> +    } else {
>> +        error_setg(errp, "Invalid iommu value");
>> +        error_append_hint(errp, "Valid value are none, smmuv3\n");
> 
> "Valid values are", and add trailing "." to the hint string (matches
> virt_set_gic_version() phrasing.)
OK
> 
>> +    }
>> +}
>> +
>>  static CpuInstanceProperties
>>  virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>>  {
>> @@ -1679,6 +1707,19 @@ static void virt_2_12_instance_init(Object *obj)
>>                                          NULL);
>>      }
>>
>> +    if (vmc->no_iommu) {
>> +        vms->iommu = VIRT_IOMMU_NONE;
>> +    } else {
>> +        /* Default disallows smmu instantiation */
>> +        vms->iommu = VIRT_IOMMU_NONE;
> 
> If the default is "none" then you don't need the no_iommu field
> in the VirtMachineClass. You can just have the property exist
> on all virt machines, the way we do for "secure" and "virtualization".
> It's only if we want to have the default for newer virt-n.nn versions
> be something other than what the older machines had that we need to
> have the VirtMachineClass field that lets us distinguish the older
> and newer machine types.
OK

Thanks

Eric
> 
>> +        object_property_add_str(obj, "iommu", virt_get_iommu,
>> +                                 virt_set_iommu, NULL);
>> +        object_property_set_description(obj, "iommu",
>> +                                        "Set the IOMMU model among "
>> +                                        "none, smmuv3 (default none)",
>> +                                        NULL);
>> +    }
>> +
>>      vms->memmap = a15memmap;
>>      vms->irqmap = a15irqmap;
>>  }
>> @@ -1698,8 +1739,12 @@ static void virt_2_11_instance_init(Object *obj)
>>
>>  static void virt_machine_2_11_options(MachineClass *mc)
>>  {
>> +    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
>> +
>>      virt_machine_2_12_options(mc);
>>      SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_11);
>> +
>> +    vmc->no_iommu = true;
>>  }
>>  DEFINE_VIRT_MACHINE(2, 11)
>>
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index 13d3724..3a92fc3 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -92,6 +92,7 @@ typedef struct {
>>      bool disallow_affinity_adjustment;
>>      bool no_its;
>>      bool no_pmu;
>> +    bool no_iommu;
>>      bool claim_edge_triggered_timers;
>>  } VirtMachineClass;
>>
>> --
>> 2.5.5
>>
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board
  2018-03-12 12:46   ` Peter Maydell
@ 2018-03-12 15:01     ` Auger Eric
  2018-03-12 15:05       ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-12 15:01 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, Eric Auger

Hi Peter,

On 12/03/18 13:46, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> From: Prem Mallappa <prem.mallappa@broadcom.com>
>>
>> Add code to instantiate an smmuv3 in virt machine. A new iommu
>> integer member is introduced in VirtMachineState to store the type
>> of the iommu in use.
>>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v7 -> v8:
>> - integer iommu member
>> - add primary-bus property
>>
>> v4 -> v5:
>> - add dma-coherent property
>>
>> v2 -> v3:
>> - vbi was removed. Use vms instead
>> - migrate to new smmu binding format (iommu-map)
>> - don't use appendprop anymore
>> - add vms->smmu and guard instantiation with this latter
>> - interrupts type changed to edge
>>
>> Conflicts:
>>         hw/arm/smmuv3.c
>> ---
>>  hw/arm/virt.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  include/hw/arm/virt.h | 10 ++++++++
>>  2 files changed, 73 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index dbb3c80..e9dca0d 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -58,6 +58,7 @@
>>  #include "hw/smbios/smbios.h"
>>  #include "qapi/visitor.h"
>>  #include "standard-headers/linux/input.h"
>> +#include "hw/arm/smmuv3.h"
>>
>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>> @@ -141,6 +142,7 @@ static const MemMapEntry a15memmap[] = {
>>      [VIRT_FW_CFG] =             { 0x09020000, 0x00000018 },
>>      [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
>>      [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
>> +    [VIRT_SMMU] =               { 0x09050000, 0x00020000 }, /* 128K, needed */
> 
> What does the "needed" comment mean here?
removed
> 
>>      [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>>      [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
>> @@ -161,6 +163,7 @@ static const int a15irqmap[] = {
>>      [VIRT_SECURE_UART] = 8,
>>      [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
>>      [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
>> +    [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
>>      [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
>>  };
>>
>> @@ -941,7 +944,57 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
>>                             0x7           /* PCI irq */);
>>  }
>>
>> -static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
>> +static void create_smmu(const VirtMachineState *vms, qemu_irq *pic,
>> +                        PCIBus *bus)
> 
> Side suggestion: if you add "algorithm = histogram" to the [diff] section
> of your .gitconfig, you will probably find that it doesn't generate patches
> with this sort of silly patch hunk any more. I only discovered this
> config setting recently but I think it's better than the default.
ah OK thanks for the advice.
> 
>> +{
>> +    char *node;
>> +    const char compat[] = "arm,smmu-v3";
>> +    int irq =  vms->irqmap[VIRT_SMMU];
>> +    int i;
>> +    hwaddr base = vms->memmap[VIRT_SMMU].base;
>> +    hwaddr size = vms->memmap[VIRT_SMMU].size;
>> +    const char irq_names[] = "eventq\0priq\0cmdq-sync\0gerror";
>> +    DeviceState *dev;
>> +
>> +    if (vms->iommu != VIRT_IOMMU_SMMUV3 || !vms->iommu_phandle) {
>> +        return;
>> +    }
>> +
>> +    dev = qdev_create(NULL, "arm-smmuv3");
>> +
>> +    object_property_set_link(OBJECT(dev), OBJECT(bus), "primary-bus",
>> +                             &error_abort);
>> +    qdev_init_nofail(dev);
>> +    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, base);
>> +    for (i = 0; i < NUM_SMMU_IRQS; i++) {
>> +        sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic[irq + i]);
>> +    }
>> +
>> +    node = g_strdup_printf("/smmuv3@%" PRIx64, base);
>> +    qemu_fdt_add_subnode(vms->fdt, node);
>> +    qemu_fdt_setprop(vms->fdt, node, "compatible", compat, sizeof(compat));
>> +    qemu_fdt_setprop_sized_cells(vms->fdt, node, "reg", 2, base, 2, size);
>> +
>> +    qemu_fdt_setprop_cells(vms->fdt, node, "interrupts",
>> +            GIC_FDT_IRQ_TYPE_SPI, irq    , GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
>> +            GIC_FDT_IRQ_TYPE_SPI, irq + 1, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
>> +            GIC_FDT_IRQ_TYPE_SPI, irq + 2, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
>> +            GIC_FDT_IRQ_TYPE_SPI, irq + 3, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
>> +
>> +    qemu_fdt_setprop(vms->fdt, node, "interrupt-names", irq_names,
>> +                     sizeof(irq_names));
>> +
>> +    qemu_fdt_setprop_cell(vms->fdt, node, "clocks", vms->clock_phandle);
>> +    qemu_fdt_setprop_string(vms->fdt, node, "clock-names", "apb_pclk");
>> +    qemu_fdt_setprop(vms->fdt, node, "dma-coherent", NULL, 0);
> 
> Is this definitely the right setting for dma-coherent? My (possibly
> naive) thought is that for an emulated SMMU our page-table walks are
> going to be coherent with the CPU's view.

Indeed, we add the dma-coherent property to make the DMA ops of the SMMU
cache coherent with the CPU. We pass NULL/0 since the DT property is just
a bare dma-coherent with no value. What am I missing?
> 
>> +
>> +    qemu_fdt_setprop_cell(vms->fdt, node, "#iommu-cells", 1);
>> +
>> +    qemu_fdt_setprop_cell(vms->fdt, node, "phandle", vms->iommu_phandle);
>> +    g_free(node);
>> +}
> 
>> +enum {
>> +    VIRT_IOMMU_NONE,
>> +    VIRT_IOMMU_SMMUV3,
>> +    VIRT_IOMMU_VIRTIO,
>> +};
>> +
>>  typedef struct MemMapEntry {
>>      hwaddr base;
>>      hwaddr size;
>> @@ -96,6 +104,7 @@ typedef struct {
>>      bool its;
>>      bool virt;
>>      int32_t gic_version;
>> +    int32_t iommu;
> 
> This is always one of the VIRT_IOMMU_* values, right?
> I think we should give that enum a typedef name and then
> use the type here.
yes, done.
> 
>>      struct arm_boot_info bootinfo;
>>      const MemMapEntry *memmap;
>>      const int *irqmap;
>> @@ -105,6 +114,7 @@ typedef struct {
>>      uint32_t clock_phandle;
>>      uint32_t gic_phandle;
>>      uint32_t msi_phandle;
>> +    uint32_t iommu_phandle;
>>      int psci_conduit;
>>  } VirtMachineState;
>>
> 
> Otherwise
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Thanks!

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board
  2018-03-12 15:01     ` Auger Eric
@ 2018-03-12 15:05       ` Peter Maydell
  0 siblings, 0 replies; 63+ messages in thread
From: Peter Maydell @ 2018-03-12 15:05 UTC (permalink / raw)
  To: Auger Eric
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, Eric Auger

On 12 March 2018 at 15:01, Auger Eric <eric.auger@redhat.com> wrote:
> Hi Peter,
>
> On 12/03/18 13:46, Peter Maydell wrote:
>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>> +    qemu_fdt_setprop(vms->fdt, node, "dma-coherent", NULL, 0);
>>
>> Is this definitely the right setting for dma-coherent? My (possibly
>> naive) thought is that for an emulated SMMU our page-table walks are
>> going to be coherent with the CPU's view.
>
> Indeed, we add the dma-coherent property to make the DMA ops of the SMMU
> cache coherent with the CPU. We pass NULL/0 since the DT property is just
> a bare dma-coherent with no value. What am I missing?

My mistake, I thought this was the equivalent of "dma-coherent = 0;".
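
For the record, a small sketch of the distinction, using a made-up
in-memory property list rather than the real FDT API: a boolean property
such as dma-coherent carries no value and is true simply by being
present.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened view of a node's properties. */
typedef struct {
    const char *name;
    const void *data;
    size_t len;
} PropSketch;

/* A boolean property is true when present, even with a zero-length value. */
static bool prop_present(const PropSketch *props, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(props[i].name, name)) {
            return true;
        }
    }
    return false;
}
```

So the NULL/0 arguments in the patch describe an empty property whose
presence is what marks the SMMU as coherent.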

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2018-03-12 11:59   ` Peter Maydell
@ 2018-03-12 15:16     ` Auger Eric
  2018-03-13 13:37       ` Paolo Bonzini
  0 siblings, 1 reply; 63+ messages in thread
From: Auger Eric @ 2018-03-12 15:16 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu, Paolo Bonzini

Hi Peter,
On 12/03/18 12:59, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> In case the MSI is translated by an IOMMU we need to fixup the
>> MSI route with the translated address.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v5 -> v6:
>> - use IOMMUMemoryRegionClass API
>>
>> It is still unclear to me if we need to register an IOMMUNotifier
>> to handle any change in the MSI doorbell which would occur behind
>> the scene and would not lead to any call to kvm_arch_fixup_msi_route().
> 
> Paolo, do you know the answer to this question ?
> 
>> ---
>>  target/arm/kvm.c        | 27 +++++++++++++++++++++++++++
>>  target/arm/trace-events |  3 +++
>>  2 files changed, 30 insertions(+)
>>
>> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
>> index 1219d00..9f5976a 100644
>> --- a/target/arm/kvm.c
>> +++ b/target/arm/kvm.c
>> @@ -20,8 +20,13 @@
>>  #include "sysemu/kvm.h"
>>  #include "kvm_arm.h"
>>  #include "cpu.h"
>> +#include "trace.h"
>>  #include "internals.h"
>>  #include "hw/arm/arm.h"
>> +#include "hw/pci/pci.h"
>> +#include "hw/pci/msi.h"
>> +#include "hw/arm/smmu-common.h"
>> +#include "hw/arm/smmuv3.h"
>>  #include "exec/memattrs.h"
>>  #include "exec/address-spaces.h"
>>  #include "hw/boards.h"
>> @@ -666,6 +671,28 @@ int kvm_arm_vgic_probe(void)
>>  int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>>                               uint64_t address, uint32_t data, PCIDevice *dev)
>>  {
>> +    AddressSpace *as = pci_device_iommu_address_space(dev);
>> +    IOMMUMemoryRegionClass *imrc;
>> +    IOMMUTLBEntry entry;
>> +    SMMUDevice *sdev;
>> +
>> +    if (as == &address_space_memory) {
>> +        return 0;
>> +    }
>> +
>> +    /* MSI doorbell address is translated by an IOMMU */
>> +    sdev = container_of(as, SMMUDevice, as);
>> +    imrc = IOMMU_MEMORY_REGION_GET_CLASS(&sdev->iommu);
>> +
>> +    entry = imrc->translate(&sdev->iommu, address, IOMMU_WO);
>> +
>> +    route->u.msi.address_lo = entry.translated_addr;
>> +    route->u.msi.address_hi = entry.translated_addr >> 32;
>> +
>> +    trace_kvm_arm_fixup_msi_route(address, sdev->devfn,
>> +                                  sdev->iommu.parent_obj.name,
>> +                                  entry.translated_addr);
>> +
>>      return 0;
>>  }
> 
> It seems a bit odd that:
>  * the code for arm for "PCI devices behind IOMMU need to have
>    the MSI doorbell writes go through the IOMMU" looks rather
>    different from the code for x86 for the same thing
The ARM SMMU translates MSIs, whereas the Intel/AMD IOMMUs do not.
Hence this implementation.
>  * the code here needs to know specifically that this is an SMMU
>    and not some other kind of IOMMU
Yes, when introducing virtio-iommu we will need to get this fixed. We
need a way to retrieve the iommu mr from the as. I will work on this
concurrently.
> 
> I would have expected this to be more generic-to-all-IOMMU
> APIs and maybe even implemented in the generic KVM support code...
> 
> The x86 code seems to be similarly x86-specific though, so
> this is more of a "perhaps there is a cleanup opportunity here"
> observation I guess.

OK

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support
  2018-03-12 12:58     ` Peter Maydell
@ 2018-03-12 15:22       ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-12 15:22 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Alex Williamson, qemu-arm,
	Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, Eric Auger

Hi Peter,

On 12/03/18 13:58, Peter Maydell wrote:
> On 28 February 2018 at 08:44, Auger Eric <eric.auger@redhat.com> wrote:
>> On 27/02/18 20:02, Peter Maydell wrote:
>>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>> This series implements the emulation code for ARM SMMUv3.
> 
>>> What state is this series in now? Is it "seems ready to
>>> go, needs review"? Are you hoping it might get in for 2.12,
>>> or targeting 2.13 ?
> 
>> Yes I think it is in a decent state and I would be happy to get some new
>> reviews, for a tentative pull in 2.12. In any case I will be reactive on
>> any comment in that prospect.
> 
> Hi, Eric. I've now gone through the patchset and reviewed it. Sorry
> I didn't manage to get to this in time to be able to put it into 2.12,
> but there weren't any major or structural issues, so we should be
> able to get this into the tree early in the 2.13 dev cycle.

No worries. I am already really grateful to you for the time you spent
on this comprehensive review.

I have taken into account most of your comments already. I will run some
more tests and post v10 this week then.

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2018-03-12 15:16     ` Auger Eric
@ 2018-03-13 13:37       ` Paolo Bonzini
  2018-03-15  9:45         ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Paolo Bonzini @ 2018-03-13 13:37 UTC (permalink / raw)
  To: Auger Eric, Peter Maydell
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

On 12/03/2018 16:16, Auger Eric wrote:
>> It is still unclear to me if we need to register an IOMMUNotifier
>> to handle any change in the MSI doorbell which would occur behind
>> the scene and would not lead to any call to kvm_arch_fixup_msi_route().
> 
> Paolo, do you know the answer to this question ?

Yes, x86 is wrong in this respect (it wouldn't use an IOMMUNotifier, but
it still should process Interrupt Entry Cache invalidations).

>> It seems a bit odd that:
>>  * the code for arm for "PCI devices behind IOMMU need to have
>>    the MSI doorbell writes go through the IOMMU" looks rather
>>    different from the code for x86 for the same thing
>
> ARM SMMU translates MSIs whereas Intel/AMD IOMMU do not translate them.
> Hence this implementation

More precisely, Intel IOMMU implements interrupt remapping through an
MMIO region instead of an IOMMU region, because on x86 interrupt
remapping can also change the MSI value and not just the address.
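
A minimal sketch of that difference, assuming simplified types: msi_msg
and msi_route below are hypothetical stand-ins, loosely modelled on the
address_lo/address_hi split used in the quoted patch, and are not the
KVM or QEMU structures themselves. In the SMMU-style fixup only the
doorbell address is replaced; in the remapping-style fixup both address
and data may come back changed.

```c
#include <stdint.h>

/* Hypothetical MSI message as written by the device. */
struct msi_msg {
    uint64_t address;
    uint32_t data;
};

/* Hypothetical routing entry, with the address split in two halves. */
struct msi_route {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/* SMMU-style: the doorbell address is translated, the data is kept. */
static void fixup_addr_only(struct msi_route *r, struct msi_msg in,
                            uint64_t translated_addr)
{
    r->address_lo = (uint32_t)translated_addr;
    r->address_hi = (uint32_t)(translated_addr >> 32);
    r->data = in.data;
}

/* Remapping-style: the whole message is replaced, data included. */
static void fixup_remapped(struct msi_route *r, struct msi_msg remapped)
{
    r->address_lo = (uint32_t)remapped.address;
    r->address_hi = (uint32_t)(remapped.address >> 32);
    r->data = remapped.data;
}
```

An address-to-address translation interface can express the first
function but not the second, since it has no way to return new data.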

>>  * the code here needs to know specifically that this is an SMMU
>>    and not some other kind of IOMMU
> Yes when introducing virtio-iommu we will need to get this fixed. We
> need a way to retrieve the iommu mr from the as. I will work on this
> concurrently.

Probably something like address_space_translate can be used?

Paolo


* Re: [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2018-03-13 13:37       ` Paolo Bonzini
@ 2018-03-15  9:45         ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-15  9:45 UTC (permalink / raw)
  To: Paolo Bonzini, Peter Maydell
  Cc: Eric Auger, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Tomasz Nowicki, Michael S. Tsirkin,
	Christoffer Dall, Bharat Bhushan, Jean-Philippe Brucker,
	Edgar E. Iglesias, linuc.decode, Peter Xu

Hi Paolo,

On 13/03/18 14:37, Paolo Bonzini wrote:
> On 12/03/2018 16:16, Auger Eric wrote:
>>> It is still unclear to me if we need to register an IOMMUNotifier
>>> to handle any change in the MSI doorbell which would occur behind
>>> the scene and would not lead to any call to kvm_arch_fixup_msi_route().
>>
>> Paolo, do you know the answer to this question ?
> 
> Yes, x86 is wrong in this respect (it wouldn't use an IOMMUNotifier, but
> it still should process Interrupt Entry Cache invalidations).

I am no longer sure which issue we want to fix. The concern I had in
mind was: what if the MSI doorbell gets unmapped? We would need to be
notified in some way. As the doorbell lies in MMIO space, no
IOMMUNotifier is attached to it at the moment. On the other hand, if the
doorbell gets unmapped, can't we expect some kind of teardown of the
route? Paolo, is this the use case you are talking about here?
> 
>>> It seems a bit odd that:
>>>  * the code for arm for "PCI devices behind IOMMU need to have
>>>    the MSI doorbell writes go through the IOMMU" looks rather
>>>    different from the code for x86 for the same thing
>>
>> ARM SMMU translates MSIs whereas Intel/AMD IOMMU do not translate them.
>> Hence this implementation
> 
> More precisely, Intel IOMMU implements interrupt remapping through an
> MMIO region instead of an IOMMU region, because on x86 interrupt
> remapping can also change the MSI value and not just the address.
Thank you for the clarification. Yes, I focused too much on IOMMU DMA
translation and forgot the x86 interrupt remapping, which corresponds
to what is done here.
> 
>>>  * the code here needs to know specifically that this is an SMMU
>>>    and not some other kind of IOMMU
>> Yes when introducing virtio-iommu we will need to get this fixed. We
>> need a way to retrieve the iommu mr from the as. I will work on this
>> concurrently.
> 
> Probably something like address_space_translate can be used?
Yes, this is achieved with address_space_translate + memory_region_find.

Thanks

Eric

> 
> Paolo
> 


* Re: [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
  2018-03-12 12:48   ` Peter Maydell
@ 2018-03-19 14:32     ` Shannon Zhao
  2018-03-19 20:50       ` Auger Eric
  0 siblings, 1 reply; 63+ messages in thread
From: Shannon Zhao @ 2018-03-19 14:32 UTC (permalink / raw)
  To: Peter Maydell, Eric Auger
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Shannon Zhao, Alex Williamson,
	qemu-arm, Christoffer Dall, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Prem Mallappa, Eric Auger



On 2018/3/12 20:48, Peter Maydell wrote:
> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>> From: Prem Mallappa <prem.mallappa@broadcom.com>
>>
>> This patch builds the smmuv3 node in the ACPI IORT table.
>>
>> The RID space of the root complex, which spans 0x0-0x10000
>> maps to streamid space 0x0-0x10000 in smmuv3, which in turn
>> maps to deviceid space 0x0-0x10000 in the ITS group.
>>
>> The guest must feature the IOMMU probe deferral series
>> (https://lkml.org/lkml/2017/4/10/214) which fixes streamid
>> multiple lookup. This bug is not related to the SMMU emulation.
>>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v2 -> v3:
>> - integrate into the existing IORT table made up of ITS, RC nodes
>> - take into account vms->smmu
>> - match linux actbl2.h acpi_iort_smmu_v3 field names
>> ---
>>  hw/arm/virt-acpi-build.c    | 56 +++++++++++++++++++++++++++++++++++++++------
>>  include/hw/acpi/acpi-defs.h | 15 ++++++++++++
>>  2 files changed, 64 insertions(+), 7 deletions(-)
> 
> Since ACPI is definitely not my area of expertise, I've cc'd
> Shannon on this patch. Shannon, could you take a look at it, please?
> 
Sure.

>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index f7fa795..4b5ad91 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -393,19 +393,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
>>  }
>>
>>  static void
>> -build_iort(GArray *table_data, BIOSLinker *linker)
>> +build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>  {
>> -    int iort_start = table_data->len;
>> +    int nb_nodes, iort_start = table_data->len;
>>      AcpiIortIdMapping *idmap;
>>      AcpiIortItsGroup *its;
>>      AcpiIortTable *iort;
>> -    size_t node_size, iort_length;
>> +    AcpiIortSmmu3 *smmu;
>> +    size_t node_size, iort_length, smmu_offset = 0;
>>      AcpiIortRC *rc;
>>
>>      iort = acpi_data_push(table_data, sizeof(*iort));
>>
>> +    if (vms->iommu) {
Should this be if (vms->iommu == VIRT_IOMMU_SMMUV3), in case we support
other types of SMMU?
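
A minimal sketch of the concern, assuming an enum shaped like the one
discussed for patch 12. Only VIRT_IOMMU_SMMUV3 appears in the patch;
VIRT_IOMMU_NONE and VIRT_IOMMU_OTHER are assumed names used here for
illustration, and both helper functions are hypothetical.

```c
/* Assumed shape of the IOMMU type enum, given a typedef name. */
typedef enum VirtIOMMUType {
    VIRT_IOMMU_NONE,     /* assumed: no IOMMU instantiated */
    VIRT_IOMMU_SMMUV3,
    VIRT_IOMMU_OTHER,    /* hypothetical future IOMMU type */
} VirtIOMMUType;

/* Truthiness test: any non-zero type is treated as an SMMUv3. */
static int iort_node_count_truthy(VirtIOMMUType iommu)
{
    return iommu ? 3 : 2;
}

/* Explicit test: only SMMUv3 adds the third IORT node. */
static int iort_node_count_explicit(VirtIOMMUType iommu)
{
    return iommu == VIRT_IOMMU_SMMUV3 ? 3 : 2;
}
```

Both helpers agree for the two values that exist today. They diverge as
soon as a third value is added: the truthy variant would count an SMMUv3
node that the later, explicit check never emits.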

>> +        nb_nodes = 3; /* RC, ITS, SMMUv3 */
>> +    } else {
>> +        nb_nodes = 2; /* RC, ITS */
>> +    }
>> +
>>      iort_length = sizeof(*iort);
>> -    iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
>> +    iort->node_count = cpu_to_le32(nb_nodes);
>>      iort->node_offset = cpu_to_le32(sizeof(*iort));
>>
>>      /* ITS group node */
>> @@ -418,6 +425,35 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>>      its->its_count = cpu_to_le32(1);
>>      its->identifiers[0] = 0; /* MADT translation_id */
>>
>> +    if (vms->iommu == VIRT_IOMMU_SMMUV3) {
>> +        int irq =  vms->irqmap[VIRT_SMMU];
>> +
>> +        /* SMMUv3 node */
>> +        smmu_offset = cpu_to_le32(iort->node_offset + node_size);
No need for cpu_to_le32 here.
Otherwise: Reviewed-by: Shannon Zhao <zhaoshenglong@huawei.com>
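
A minimal sketch of why the extra conversion matters: smmu_offset is
converted once when it is computed and again when it is stored in
output_reference, and on a big-endian host the two byte swaps cancel
out. The helpers below are hypothetical; be_host_cpu_to_le32() only
models what cpu_to_le32() would do on a big-endian host, and the node
sizes in the usage note are arbitrary illustration values.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v & 0x00ff0000u) >> 8)  |
           ((v & 0xff000000u) >> 24);
}

/* Model of cpu_to_le32() on a big-endian host: a plain byte swap. */
static uint32_t be_host_cpu_to_le32(uint32_t v)
{
    return bswap32(v);
}

/* As in the patch: converted when computed, then again when stored. */
static uint32_t stored_offset_double(uint32_t node_offset, uint32_t node_size)
{
    uint32_t smmu_offset = be_host_cpu_to_le32(node_offset + node_size);
    return be_host_cpu_to_le32(smmu_offset);
}

/* As suggested: keep host order, convert once at the store. */
static uint32_t stored_offset_single(uint32_t node_offset, uint32_t node_size)
{
    uint32_t smmu_offset = node_offset + node_size;
    return be_host_cpu_to_le32(smmu_offset);
}
```

With a node offset of 48 and a node size of 24, the double conversion
stores 72 in host byte order, while the single conversion stores the
byte-swapped value the table needs. On a little-endian host both paths
are no-ops, which is presumably why the problem is easy to miss.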

>> +        node_size = sizeof(*smmu) + sizeof(*idmap);
>> +        iort_length += node_size;
>> +        smmu = acpi_data_push(table_data, node_size);
>> +
>> +
>> +        smmu->type = ACPI_IORT_NODE_SMMU_V3;
>> +        smmu->length = cpu_to_le16(node_size);
>> +        smmu->mapping_count = cpu_to_le32(1);
>> +        smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
>> +        smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
>> +        smmu->event_gsiv = cpu_to_le32(irq);
>> +        smmu->pri_gsiv = cpu_to_le32(irq + 1);
>> +        smmu->gerr_gsiv = cpu_to_le32(irq + 2);
>> +        smmu->sync_gsiv = cpu_to_le32(irq + 3);
>> +
>> +        /* Identity RID mapping covering the whole input RID range */
>> +        idmap = &smmu->id_mapping_array[0];
>> +        idmap->input_base = 0;
>> +        idmap->id_count = cpu_to_le32(0xFFFF);
>> +        idmap->output_base = 0;
>> +        /* output IORT node is the ITS group node (the first node) */
>> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
>> +    }
>> +
>>      /* Root Complex Node */
>>      node_size = sizeof(*rc) + sizeof(*idmap);
>>      iort_length += node_size;
>> @@ -438,8 +474,14 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>>      idmap->input_base = 0;
>>      idmap->id_count = cpu_to_le32(0xFFFF);
>>      idmap->output_base = 0;
>> -    /* output IORT node is the ITS group node (the first node) */
>> -    idmap->output_reference = cpu_to_le32(iort->node_offset);
>> +
>> +    if (vms->iommu) {
>> +        /* output IORT node is the smmuv3 node */
>> +        idmap->output_reference = cpu_to_le32(smmu_offset);
>> +    } else {
>> +        /* output IORT node is the ITS group node (the first node) */
>> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
>> +    }
>>
>>      iort->length = cpu_to_le32(iort_length);
>>
>> @@ -786,7 +828,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>
>>      if (its_class_name() && !vmc->no_its) {
>>          acpi_add_table(table_offsets, tables_blob);
>> -        build_iort(tables_blob, tables->linker);
>> +        build_iort(tables_blob, tables->linker, vms);
>>      }
>>
>>      /* XSDT is pointed to by RSDP */
>> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
>> index 80c8099..068ce28 100644
>> --- a/include/hw/acpi/acpi-defs.h
>> +++ b/include/hw/acpi/acpi-defs.h
>> @@ -700,6 +700,21 @@ struct AcpiIortItsGroup {
>>  } QEMU_PACKED;
>>  typedef struct AcpiIortItsGroup AcpiIortItsGroup;
>>
>> +struct AcpiIortSmmu3 {
>> +    ACPI_IORT_NODE_HEADER_DEF
>> +    uint64_t base_address;
>> +    uint32_t flags;
>> +    uint32_t reserved2;
>> +    uint64_t vatos_address;
>> +    uint32_t model;
>> +    uint32_t event_gsiv;
>> +    uint32_t pri_gsiv;
>> +    uint32_t gerr_gsiv;
>> +    uint32_t sync_gsiv;
>> +    AcpiIortIdMapping id_mapping_array[0];
>> +} QEMU_PACKED;
>> +typedef struct AcpiIortSmmu3 AcpiIortSmmu3;
>> +
>>  struct AcpiIortRC {
>>      ACPI_IORT_NODE_HEADER_DEF
>>      AcpiIortMemoryAccess memory_properties;
>> --
>> 2.5.5
>>
> 
> thanks
> -- PMM
> 
> 
> .
> 

-- 
Shannon


* Re: [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
  2018-03-19 14:32     ` Shannon Zhao
@ 2018-03-19 20:50       ` Auger Eric
  0 siblings, 0 replies; 63+ messages in thread
From: Auger Eric @ 2018-03-19 20:50 UTC (permalink / raw)
  To: Shannon Zhao, Peter Maydell
  Cc: Michael S. Tsirkin, Jean-Philippe Brucker, Tomasz Nowicki,
	QEMU Developers, Peter Xu, Shannon Zhao, Alex Williamson,
	qemu-arm, Prem Mallappa, Edgar E. Iglesias, linuc.decode,
	Bharat Bhushan, Christoffer Dall, Eric Auger

Hi Shannon,

On 19/03/18 15:32, Shannon Zhao wrote:
> 
> 
> On 2018/3/12 20:48, Peter Maydell wrote:
>> On 17 February 2018 at 18:46, Eric Auger <eric.auger@redhat.com> wrote:
>>> From: Prem Mallappa <prem.mallappa@broadcom.com>
>>>
>>> This patch builds the smmuv3 node in the ACPI IORT table.
>>>
>>> The RID space of the root complex, which spans 0x0-0x10000
>>> maps to streamid space 0x0-0x10000 in smmuv3, which in turn
>>> maps to deviceid space 0x0-0x10000 in the ITS group.
>>>
>>> The guest must feature the IOMMU probe deferral series
>>> (https://lkml.org/lkml/2017/4/10/214) which fixes streamid
>>> multiple lookup. This bug is not related to the SMMU emulation.
>>>
>>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>
>>> ---
>>>
>>> v2 -> v3:
>>> - integrate into the existing IORT table made up of ITS, RC nodes
>>> - take into account vms->smmu
>>> - match linux actbl2.h acpi_iort_smmu_v3 field names
>>> ---
>>>  hw/arm/virt-acpi-build.c    | 56 +++++++++++++++++++++++++++++++++++++++------
>>>  include/hw/acpi/acpi-defs.h | 15 ++++++++++++
>>>  2 files changed, 64 insertions(+), 7 deletions(-)
>>
>> Since ACPI is definitely not my area of expertise, I've cc'd
>> Shannon on this patch. Shannon, could you take a look at it, please?
>>
> Sure.
> 
>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>> index f7fa795..4b5ad91 100644
>>> --- a/hw/arm/virt-acpi-build.c
>>> +++ b/hw/arm/virt-acpi-build.c
>>> @@ -393,19 +393,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
>>>  }
>>>
>>>  static void
>>> -build_iort(GArray *table_data, BIOSLinker *linker)
>>> +build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>>  {
>>> -    int iort_start = table_data->len;
>>> +    int nb_nodes, iort_start = table_data->len;
>>>      AcpiIortIdMapping *idmap;
>>>      AcpiIortItsGroup *its;
>>>      AcpiIortTable *iort;
>>> -    size_t node_size, iort_length;
>>> +    AcpiIortSmmu3 *smmu;
>>> +    size_t node_size, iort_length, smmu_offset = 0;
>>>      AcpiIortRC *rc;
>>>
>>>      iort = acpi_data_push(table_data, sizeof(*iort));
>>>
>>> +    if (vms->iommu) {
> use if (vms->iommu == VIRT_IOMMU_SMMUV3) ? in case we support other
> types of SMMU.
OK
> 
>>> +        nb_nodes = 3; /* RC, ITS, SMMUv3 */
>>> +    } else {
>>> +        nb_nodes = 2; /* RC, ITS */
>>> +    }
>>> +
>>>      iort_length = sizeof(*iort);
>>> -    iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
>>> +    iort->node_count = cpu_to_le32(nb_nodes);
>>>      iort->node_offset = cpu_to_le32(sizeof(*iort));
>>>
>>>      /* ITS group node */
>>> @@ -418,6 +425,35 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>>>      its->its_count = cpu_to_le32(1);
>>>      its->identifiers[0] = 0; /* MADT translation_id */
>>>
>>> +    if (vms->iommu == VIRT_IOMMU_SMMUV3) {
>>> +        int irq =  vms->irqmap[VIRT_SMMU];
>>> +
>>> +        /* SMMUv3 node */
>>> +        smmu_offset = cpu_to_le32(iort->node_offset + node_size);
> no need cpu_to_le32 here.
> Otherwise: Reviewed-by: Shannon Zhao <zhaoshenglong@huawei.com>

OK. Thank you for the review!

Thanks

Eric
> 
>>> +        node_size = sizeof(*smmu) + sizeof(*idmap);
>>> +        iort_length += node_size;
>>> +        smmu = acpi_data_push(table_data, node_size);
>>> +
>>> +
>>> +        smmu->type = ACPI_IORT_NODE_SMMU_V3;
>>> +        smmu->length = cpu_to_le16(node_size);
>>> +        smmu->mapping_count = cpu_to_le32(1);
>>> +        smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
>>> +        smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
>>> +        smmu->event_gsiv = cpu_to_le32(irq);
>>> +        smmu->pri_gsiv = cpu_to_le32(irq + 1);
>>> +        smmu->gerr_gsiv = cpu_to_le32(irq + 2);
>>> +        smmu->sync_gsiv = cpu_to_le32(irq + 3);
>>> +
>>> +        /* Identity RID mapping covering the whole input RID range */
>>> +        idmap = &smmu->id_mapping_array[0];
>>> +        idmap->input_base = 0;
>>> +        idmap->id_count = cpu_to_le32(0xFFFF);
>>> +        idmap->output_base = 0;
>>> +        /* output IORT node is the ITS group node (the first node) */
>>> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
>>> +    }
>>> +
>>>      /* Root Complex Node */
>>>      node_size = sizeof(*rc) + sizeof(*idmap);
>>>      iort_length += node_size;
>>> @@ -438,8 +474,14 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>>>      idmap->input_base = 0;
>>>      idmap->id_count = cpu_to_le32(0xFFFF);
>>>      idmap->output_base = 0;
>>> -    /* output IORT node is the ITS group node (the first node) */
>>> -    idmap->output_reference = cpu_to_le32(iort->node_offset);
>>> +
>>> +    if (vms->iommu) {
>>> +        /* output IORT node is the smmuv3 node */
>>> +        idmap->output_reference = cpu_to_le32(smmu_offset);
>>> +    } else {
>>> +        /* output IORT node is the ITS group node (the first node) */
>>> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
>>> +    }
>>>
>>>      iort->length = cpu_to_le32(iort_length);
>>>
>>> @@ -786,7 +828,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>>
>>>      if (its_class_name() && !vmc->no_its) {
>>>          acpi_add_table(table_offsets, tables_blob);
>>> -        build_iort(tables_blob, tables->linker);
>>> +        build_iort(tables_blob, tables->linker, vms);
>>>      }
>>>
>>>      /* XSDT is pointed to by RSDP */
>>> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
>>> index 80c8099..068ce28 100644
>>> --- a/include/hw/acpi/acpi-defs.h
>>> +++ b/include/hw/acpi/acpi-defs.h
>>> @@ -700,6 +700,21 @@ struct AcpiIortItsGroup {
>>>  } QEMU_PACKED;
>>>  typedef struct AcpiIortItsGroup AcpiIortItsGroup;
>>>
>>> +struct AcpiIortSmmu3 {
>>> +    ACPI_IORT_NODE_HEADER_DEF
>>> +    uint64_t base_address;
>>> +    uint32_t flags;
>>> +    uint32_t reserved2;
>>> +    uint64_t vatos_address;
>>> +    uint32_t model;
>>> +    uint32_t event_gsiv;
>>> +    uint32_t pri_gsiv;
>>> +    uint32_t gerr_gsiv;
>>> +    uint32_t sync_gsiv;
>>> +    AcpiIortIdMapping id_mapping_array[0];
>>> +} QEMU_PACKED;
>>> +typedef struct AcpiIortSmmu3 AcpiIortSmmu3;
>>> +
>>>  struct AcpiIortRC {
>>>      ACPI_IORT_NODE_HEADER_DEF
>>>      AcpiIortMemoryAccess memory_properties;
>>> --
>>> 2.5.5
>>>
>>
>> thanks
>> -- PMM
>>
>>
>> .
>>
> 


end of thread, other threads:[~2018-03-19 20:50 UTC | newest]

Thread overview: 63+ messages
2018-02-17 18:46 [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Eric Auger
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 01/14] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
2018-03-06 12:09   ` Peter Maydell
2018-03-06 15:01     ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 02/14] hw/arm/smmu-common: IOMMU memory region and address space setup Eric Auger
2018-03-06 14:08   ` Peter Maydell
2018-03-06 14:47     ` Auger Eric
2018-03-06 14:49       ` Peter Maydell
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 03/14] hw/arm/smmu-common: VMSAv8-64 page table walk Eric Auger
2018-03-06 19:43   ` Peter Maydell
2018-03-07 16:23     ` Auger Eric
2018-03-07 16:35       ` Peter Maydell
2018-03-08 18:56         ` Auger Eric
2018-03-08 19:01           ` Peter Maydell
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 04/14] hw/arm/smmuv3: Skeleton Eric Auger
2018-03-08 14:27   ` Peter Maydell
2018-03-09 13:19     ` Auger Eric
2018-03-09 13:37       ` Peter Maydell
2018-03-09 13:49         ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 05/14] hw/arm/smmuv3: Wired IRQ and GERROR helpers Eric Auger
2018-03-08 17:49   ` Peter Maydell
2018-03-09 14:03     ` Auger Eric
2018-03-09 14:18       ` Peter Maydell
2018-03-09 14:50         ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 06/14] hw/arm/smmuv3: Queue helpers Eric Auger
2018-03-08 18:28   ` Peter Maydell
2018-03-09 16:43     ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 07/14] hw/arm/smmuv3: Implement MMIO write operations Eric Auger
2018-03-08 18:37   ` Peter Maydell
2018-03-09 16:42     ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 08/14] hw/arm/smmuv3: Event queue recording helper Eric Auger
2018-03-08 18:39   ` Peter Maydell
2018-03-09 17:16     ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 09/14] hw/arm/smmuv3: Implement translate callback Eric Auger
2018-03-09 18:46   ` Peter Maydell
2018-03-12 10:38     ` Eric Auger
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 10/14] hw/arm/smmuv3: Abort on vfio or vhost case Eric Auger
2018-03-08 19:06   ` Peter Maydell
2018-03-09 17:53     ` Auger Eric
2018-03-09 17:59       ` Peter Maydell
2018-03-12 10:53         ` Eric Auger
2018-03-12 11:10           ` Peter Maydell
2018-03-12 15:01             ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 11/14] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
2018-03-12 11:59   ` Peter Maydell
2018-03-12 15:16     ` Auger Eric
2018-03-13 13:37       ` Paolo Bonzini
2018-03-15  9:45         ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 12/14] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
2018-03-12 12:46   ` Peter Maydell
2018-03-12 15:01     ` Auger Eric
2018-03-12 15:05       ` Peter Maydell
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 13/14] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
2018-03-12 12:48   ` Peter Maydell
2018-03-19 14:32     ` Shannon Zhao
2018-03-19 20:50       ` Auger Eric
2018-02-17 18:46 ` [Qemu-devel] [PATCH v9 14/14] hw/arm/virt: Handle iommu in 2.12 machine type Eric Auger
2018-03-12 12:56   ` Peter Maydell
2018-03-12 15:01     ` Auger Eric
2018-02-27 19:02 ` [Qemu-devel] [PATCH v9 00/14] ARM SMMUv3 Emulation Support Peter Maydell
2018-02-28  8:44   ` Auger Eric
2018-03-12 12:58     ` Peter Maydell
2018-03-12 15:22       ` Auger Eric
