All of lore.kernel.org
 help / color / mirror / Atom feed
* [PULL 00/10] virtio,pc,pci: fixes,cleanups,features
@ 2022-06-16 16:57 Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 01/10] pci-bridge/cxl_upstream: Add a CXL switch upstream port Michael S. Tsirkin
                   ` (10 more replies)
  0 siblings, 11 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell

The following changes since commit def6fd6c9ce9e00a30cdd0066e0fde206b3f3d2f:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2022-06-16 07:13:04 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 8c97e4deeca9ad791ab369d3879ebfb0267b24ca:

  acpi/erst: fix fallthrough code upon validation failure (2022-06-16 12:54:58 -0400)

----------------------------------------------------------------
virtio,pc,pci: fixes,cleanups,features

more CXL patches
RSA support for crypto
fixes, cleanups all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

----------------------------------------------------------------
Ani Sinha (1):
      acpi/erst: fix fallthrough code upon validation failure

Jonathan Cameron (3):
      pci-bridge/cxl_upstream: Add a CXL switch upstream port
      pci-bridge/cxl_downstream: Add a CXL switch downstream port
      docs/cxl: Add switch documentation

Ni Xun (1):
      vhost: also check queue state in the vhost_dev_set_log error routine

Yajun Wu (1):
      virtio/vhost-user: Fix wrong vhost notifier GPtrArray size

Zhenwei Pi (1):
      crypto: Introduce RSA algorithm

Zhenzhong Duan (3):
      virtio-iommu: Add bypass mode support to assigned device
      virtio-iommu: Use recursive lock to avoid deadlock
      virtio-iommu: Add an assert check in translate routine

 include/hw/cxl/cxl.h              |   5 +
 include/hw/virtio/virtio-crypto.h |   5 +-
 include/hw/virtio/virtio-iommu.h  |   4 +-
 include/sysemu/cryptodev.h        |  83 ++++++++--
 backends/cryptodev-builtin.c      | 276 ++++++++++++++++++++++++++++-----
 backends/cryptodev-vhost-user.c   |  34 +++-
 backends/cryptodev.c              |  32 ++--
 hw/acpi/erst.c                    |   3 +
 hw/cxl/cxl-host.c                 |  43 ++++-
 hw/pci-bridge/cxl_downstream.c    | 249 +++++++++++++++++++++++++++++
 hw/pci-bridge/cxl_upstream.c      | 216 ++++++++++++++++++++++++++
 hw/virtio/vhost-user.c            |   2 +-
 hw/virtio/vhost.c                 |   4 +
 hw/virtio/virtio-crypto.c         | 319 ++++++++++++++++++++++++++++++--------
 hw/virtio/virtio-iommu.c          | 135 ++++++++++++++--
 docs/system/devices/cxl.rst       |  88 ++++++++++-
 hw/pci-bridge/meson.build         |   2 +-
 hw/virtio/trace-events            |   1 +
 18 files changed, 1343 insertions(+), 158 deletions(-)
 create mode 100644 hw/pci-bridge/cxl_downstream.c
 create mode 100644 hw/pci-bridge/cxl_upstream.c



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PULL 01/10] pci-bridge/cxl_upstream: Add a CXL switch upstream port
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port Michael S. Tsirkin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Jonathan Cameron, Marcel Apfelbaum, Ben Widawsky

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

An initial simple upstream port emulation to allow the creation
of CXL switches. The Device ID has been allocated for this use.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20220616145126.8002-2-Jonathan.Cameron@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/hw/cxl/cxl.h         |   5 +
 hw/pci-bridge/cxl_upstream.c | 216 +++++++++++++++++++++++++++++++++++
 hw/pci-bridge/meson.build    |   2 +-
 3 files changed, 222 insertions(+), 1 deletion(-)
 create mode 100644 hw/pci-bridge/cxl_upstream.c

diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 134b295b40..38e0e271d5 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -53,4 +53,9 @@ struct CXLHost {
 #define TYPE_PXB_CXL_HOST "pxb-cxl-host"
 OBJECT_DECLARE_SIMPLE_TYPE(CXLHost, PXB_CXL_HOST)
 
+#define TYPE_CXL_USP "cxl-upstream"
+
+typedef struct CXLUpstreamPort CXLUpstreamPort;
+DECLARE_INSTANCE_CHECKER(CXLUpstreamPort, CXL_USP, TYPE_CXL_USP)
+CXLComponentState *cxl_usp_to_cstate(CXLUpstreamPort *usp);
 #endif
diff --git a/hw/pci-bridge/cxl_upstream.c b/hw/pci-bridge/cxl_upstream.c
new file mode 100644
index 0000000000..a83a3e81e4
--- /dev/null
+++ b/hw/pci-bridge/cxl_upstream.c
@@ -0,0 +1,216 @@
+/*
+ * Emulated CXL Switch Upstream Port
+ *
+ * Copyright (c) 2022 Huawei Technologies.
+ *
+ * Based on xio3130_upstream.c
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/pcie.h"
+#include "hw/pci/pcie_port.h"
+
+#define CXL_UPSTREAM_PORT_MSI_NR_VECTOR 1
+
+#define CXL_UPSTREAM_PORT_MSI_OFFSET 0x70
+#define CXL_UPSTREAM_PORT_PCIE_CAP_OFFSET 0x90
+#define CXL_UPSTREAM_PORT_AER_OFFSET 0x100
+#define CXL_UPSTREAM_PORT_DVSEC_OFFSET \
+    (CXL_UPSTREAM_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
+
+typedef struct CXLUpstreamPort {
+    /*< private >*/
+    PCIEPort parent_obj;
+
+    /*< public >*/
+    CXLComponentState cxl_cstate;
+} CXLUpstreamPort;
+
+CXLComponentState *cxl_usp_to_cstate(CXLUpstreamPort *usp)
+{
+    return &usp->cxl_cstate;
+}
+
+static void cxl_usp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
+                                       uint32_t val, int len)
+{
+    CXLUpstreamPort *usp = CXL_USP(dev);
+
+    if (range_contains(&usp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC], addr)) {
+        uint8_t *reg = &dev->config[addr];
+        addr -= usp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC].lob;
+        if (addr == PORT_CONTROL_OFFSET) {
+            if (pci_get_word(reg) & PORT_CONTROL_UNMASK_SBR) {
+                /* unmask SBR */
+                qemu_log_mask(LOG_UNIMP, "SBR mask control is not supported\n");
+            }
+            if (pci_get_word(reg) & PORT_CONTROL_ALT_MEMID_EN) {
+                /* Alt Memory & ID Space Enable */
+                qemu_log_mask(LOG_UNIMP,
+                              "Alt Memory & ID space is not supported\n");
+            }
+        }
+    }
+}
+
+static void cxl_usp_write_config(PCIDevice *d, uint32_t address,
+                                 uint32_t val, int len)
+{
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    pcie_aer_write_config(d, address, val, len);
+
+    cxl_usp_dvsec_write_config(d, address, val, len);
+}
+
+static void latch_registers(CXLUpstreamPort *usp)
+{
+    uint32_t *reg_state = usp->cxl_cstate.crb.cache_mem_registers;
+    uint32_t *write_msk = usp->cxl_cstate.crb.cache_mem_regs_write_mask;
+
+    cxl_component_register_init_common(reg_state, write_msk,
+                                       CXL2_UPSTREAM_PORT);
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 8);
+}
+
+static void cxl_usp_reset(DeviceState *qdev)
+{
+    PCIDevice *d = PCI_DEVICE(qdev);
+    CXLUpstreamPort *usp = CXL_USP(qdev);
+
+    pci_bridge_reset(qdev);
+    pcie_cap_deverr_reset(d);
+    latch_registers(usp);
+}
+
+static void build_dvsecs(CXLComponentState *cxl)
+{
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(CXLDVSECPortExtensions){
+        .status = 0x1, /* Port Power Management Init Complete */
+    };
+    cxl_component_create_dvsec(cxl, CXL2_UPSTREAM_PORT,
+                               EXTENSIONS_PORT_DVSEC_LENGTH,
+                               EXTENSIONS_PORT_DVSEC,
+                               EXTENSIONS_PORT_DVSEC_REVID, dvsec);
+    dvsec = (uint8_t *)&(CXLDVSECPortFlexBus){
+        .cap                     = 0x27, /* Cache, IO, Mem, non-MLD */
+        .ctrl                    = 0x27, /* Cache, IO, Mem */
+        .status                  = 0x26, /* same */
+        .rcvd_mod_ts_data_phase1 = 0xef, /* WTF? */
+    };
+    cxl_component_create_dvsec(cxl, CXL2_UPSTREAM_PORT,
+                               PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0,
+                               PCIE_FLEXBUS_PORT_DVSEC,
+                               PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0, dvsec);
+
+    dvsec = (uint8_t *)&(CXLDVSECRegisterLocator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | CXL_COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl, CXL2_UPSTREAM_PORT,
+                               REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void cxl_usp_realize(PCIDevice *d, Error **errp)
+{
+    PCIEPort *p = PCIE_PORT(d);
+    CXLUpstreamPort *usp = CXL_USP(d);
+    CXLComponentState *cxl_cstate = &usp->cxl_cstate;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    MemoryRegion *component_bar = &cregs->component_registers;
+    int rc;
+
+    pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    pcie_port_init_reg(d);
+
+    rc = msi_init(d, CXL_UPSTREAM_PORT_MSI_OFFSET,
+                  CXL_UPSTREAM_PORT_MSI_NR_VECTOR, true, true, errp);
+    if (rc) {
+        assert(rc == -ENOTSUP);
+        goto err_bridge;
+    }
+
+    rc = pcie_cap_init(d, CXL_UPSTREAM_PORT_PCIE_CAP_OFFSET,
+                       PCI_EXP_TYPE_UPSTREAM, p->port, errp);
+    if (rc < 0) {
+        goto err_msi;
+    }
+
+    pcie_cap_flr_init(d);
+    pcie_cap_deverr_init(d);
+    rc = pcie_aer_init(d, PCI_ERR_VER, CXL_UPSTREAM_PORT_AER_OFFSET,
+                       PCI_ERR_SIZEOF, errp);
+    if (rc) {
+        goto err_cap;
+    }
+
+    cxl_cstate->dvsec_offset = CXL_UPSTREAM_PORT_DVSEC_OFFSET;
+    cxl_cstate->pdev = d;
+    build_dvsecs(cxl_cstate);
+    cxl_component_register_block_init(OBJECT(d), cxl_cstate, TYPE_CXL_USP);
+    pci_register_bar(d, CXL_COMPONENT_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                     PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     component_bar);
+
+    return;
+
+err_cap:
+    pcie_cap_exit(d);
+err_msi:
+    msi_uninit(d);
+err_bridge:
+    pci_bridge_exitfn(d);
+}
+
+static void cxl_usp_exitfn(PCIDevice *d)
+{
+    pcie_aer_exit(d);
+    pcie_cap_exit(d);
+    msi_uninit(d);
+    pci_bridge_exitfn(d);
+}
+
+static void cxl_upstream_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(oc);
+
+    k->is_bridge = true;
+    k->config_write = cxl_usp_write_config;
+    k->realize = cxl_usp_realize;
+    k->exit = cxl_usp_exitfn;
+    k->vendor_id = 0x19e5; /* Huawei */
+    k->device_id = 0xa128; /* Emulated CXL Switch Upstream Port */
+    k->revision = 0;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->desc = "CXL Switch Upstream Port";
+    dc->reset = cxl_usp_reset;
+}
+
+static const TypeInfo cxl_usp_info = {
+    .name = TYPE_CXL_USP,
+    .parent = TYPE_PCIE_PORT,
+    .instance_size = sizeof(CXLUpstreamPort),
+    .class_init = cxl_upstream_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_PCIE_DEVICE },
+        { INTERFACE_CXL_DEVICE },
+        { }
+    },
+};
+
+static void cxl_usp_register_type(void)
+{
+    type_register_static(&cxl_usp_info);
+}
+
+type_init(cxl_usp_register_type);
diff --git a/hw/pci-bridge/meson.build b/hw/pci-bridge/meson.build
index fdbe2e07c5..6828f0e08d 100644
--- a/hw/pci-bridge/meson.build
+++ b/hw/pci-bridge/meson.build
@@ -6,7 +6,7 @@ pci_ss.add(when: 'CONFIG_PCIE_PORT', if_true: files('pcie_root_port.c', 'gen_pci
 pci_ss.add(when: 'CONFIG_PXB', if_true: files('pci_expander_bridge.c'),
                                if_false: files('pci_expander_bridge_stubs.c'))
 pci_ss.add(when: 'CONFIG_XIO3130', if_true: files('xio3130_upstream.c', 'xio3130_downstream.c'))
-pci_ss.add(when: 'CONFIG_CXL', if_true: files('cxl_root_port.c'))
+pci_ss.add(when: 'CONFIG_CXL', if_true: files('cxl_root_port.c', 'cxl_upstream.c'))
 
 # NewWorld PowerMac
 pci_ss.add(when: 'CONFIG_DEC_PCI', if_true: files('dec.c'))
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 01/10] pci-bridge/cxl_upstream: Add a CXL switch upstream port Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-11-04  6:47   ` Thomas Huth
  2022-06-16 16:57 ` [PULL 03/10] docs/cxl: Add switch documentation Michael S. Tsirkin
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Jonathan Cameron, Ben Widawsky, Marcel Apfelbaum

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Emulation of a simple CXL Switch downstream port.
The Device ID has been allocated for this use.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/cxl/cxl-host.c              |  43 +++++-
 hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
 hw/pci-bridge/meson.build      |   2 +-
 3 files changed, 291 insertions(+), 3 deletions(-)
 create mode 100644 hw/pci-bridge/cxl_downstream.c

diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c
index efa14908d8..483d8eb13f 100644
--- a/hw/cxl/cxl-host.c
+++ b/hw/cxl/cxl-host.c
@@ -129,8 +129,9 @@ static bool cxl_hdm_find_target(uint32_t *cache_mem, hwaddr addr,
 
 static PCIDevice *cxl_cfmws_find_device(CXLFixedWindow *fw, hwaddr addr)
 {
-    CXLComponentState *hb_cstate;
+    CXLComponentState *hb_cstate, *usp_cstate;
     PCIHostState *hb;
+    CXLUpstreamPort *usp;
     int rb_index;
     uint32_t *cache_mem;
     uint8_t target;
@@ -164,8 +165,46 @@ static PCIDevice *cxl_cfmws_find_device(CXLFixedWindow *fw, hwaddr addr)
     }
 
     d = pci_bridge_get_sec_bus(PCI_BRIDGE(rp))->devices[0];
+    if (!d) {
+        return NULL;
+    }
 
-    if (!d || !object_dynamic_cast(OBJECT(d), TYPE_CXL_TYPE3)) {
+    if (object_dynamic_cast(OBJECT(d), TYPE_CXL_TYPE3)) {
+        return d;
+    }
+
+    /*
+     * Could also be a switch.  Note only one level of switching currently
+     * supported.
+     */
+    if (!object_dynamic_cast(OBJECT(d), TYPE_CXL_USP)) {
+        return NULL;
+    }
+    usp = CXL_USP(d);
+
+    usp_cstate = cxl_usp_to_cstate(usp);
+    if (!usp_cstate) {
+        return NULL;
+    }
+
+    cache_mem = usp_cstate->crb.cache_mem_registers;
+
+    target_found = cxl_hdm_find_target(cache_mem, addr, &target);
+    if (!target_found) {
+        return NULL;
+    }
+
+    d = pcie_find_port_by_pn(&PCI_BRIDGE(d)->sec_bus, target);
+    if (!d) {
+        return NULL;
+    }
+
+    d = pci_bridge_get_sec_bus(PCI_BRIDGE(d))->devices[0];
+    if (!d) {
+        return NULL;
+    }
+
+    if (!object_dynamic_cast(OBJECT(d), TYPE_CXL_TYPE3)) {
         return NULL;
     }
 
diff --git a/hw/pci-bridge/cxl_downstream.c b/hw/pci-bridge/cxl_downstream.c
new file mode 100644
index 0000000000..a361e519d0
--- /dev/null
+++ b/hw/pci-bridge/cxl_downstream.c
@@ -0,0 +1,249 @@
+/*
+ * Emulated CXL Switch Downstream Port
+ *
+ * Copyright (c) 2022 Huawei Technologies.
+ *
+ * Based on xio3130_downstream.c
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/pcie.h"
+#include "hw/pci/pcie_port.h"
+#include "qapi/error.h"
+
+typedef struct CXLDownStreamPort {
+    /*< private >*/
+    PCIESlot parent_obj;
+
+    /*< public >*/
+    CXLComponentState cxl_cstate;
+} CXLDownstreamPort;
+
+#define TYPE_CXL_DSP "cxl-downstream"
+DECLARE_INSTANCE_CHECKER(CXLDownstreamPort, CXL_DSP, TYPE_CXL_DSP)
+
+#define CXL_DOWNSTREAM_PORT_MSI_OFFSET 0x70
+#define CXL_DOWNSTREAM_PORT_MSI_NR_VECTOR 1
+#define CXL_DOWNSTREAM_PORT_EXP_OFFSET 0x90
+#define CXL_DOWNSTREAM_PORT_AER_OFFSET 0x100
+#define CXL_DOWNSTREAM_PORT_DVSEC_OFFSET        \
+    (CXL_DOWNSTREAM_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
+
+static void latch_registers(CXLDownstreamPort *dsp)
+{
+    uint32_t *reg_state = dsp->cxl_cstate.crb.cache_mem_registers;
+    uint32_t *write_msk = dsp->cxl_cstate.crb.cache_mem_regs_write_mask;
+
+    cxl_component_register_init_common(reg_state, write_msk,
+                                       CXL2_DOWNSTREAM_PORT);
+}
+
+/* TODO: Look at sharing this code acorss all CXL port types */
+static void cxl_dsp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
+                                      uint32_t val, int len)
+{
+    CXLDownstreamPort *dsp = CXL_DSP(dev);
+    CXLComponentState *cxl_cstate = &dsp->cxl_cstate;
+
+    if (range_contains(&cxl_cstate->dvsecs[EXTENSIONS_PORT_DVSEC], addr)) {
+        uint8_t *reg = &dev->config[addr];
+        addr -= cxl_cstate->dvsecs[EXTENSIONS_PORT_DVSEC].lob;
+        if (addr == PORT_CONTROL_OFFSET) {
+            if (pci_get_word(reg) & PORT_CONTROL_UNMASK_SBR) {
+                /* unmask SBR */
+                qemu_log_mask(LOG_UNIMP, "SBR mask control is not supported\n");
+            }
+            if (pci_get_word(reg) & PORT_CONTROL_ALT_MEMID_EN) {
+                /* Alt Memory & ID Space Enable */
+                qemu_log_mask(LOG_UNIMP,
+                              "Alt Memory & ID space is not supported\n");
+
+            }
+        }
+    }
+}
+
+static void cxl_dsp_config_write(PCIDevice *d, uint32_t address,
+                                 uint32_t val, int len)
+{
+    uint16_t slt_ctl, slt_sta;
+
+    pcie_cap_slot_get(d, &slt_ctl, &slt_sta);
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    pcie_cap_slot_write_config(d, slt_ctl, slt_sta, address, val, len);
+    pcie_aer_write_config(d, address, val, len);
+
+    cxl_dsp_dvsec_write_config(d, address, val, len);
+}
+
+static void cxl_dsp_reset(DeviceState *qdev)
+{
+    PCIDevice *d = PCI_DEVICE(qdev);
+    CXLDownstreamPort *dsp = CXL_DSP(qdev);
+
+    pcie_cap_deverr_reset(d);
+    pcie_cap_slot_reset(d);
+    pcie_cap_arifwd_reset(d);
+    pci_bridge_reset(qdev);
+
+    latch_registers(dsp);
+}
+
+static void build_dvsecs(CXLComponentState *cxl)
+{
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(CXLDVSECPortExtensions){ 0 };
+    cxl_component_create_dvsec(cxl, CXL2_DOWNSTREAM_PORT,
+                               EXTENSIONS_PORT_DVSEC_LENGTH,
+                               EXTENSIONS_PORT_DVSEC,
+                               EXTENSIONS_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(CXLDVSECPortFlexBus){
+        .cap                     = 0x27, /* Cache, IO, Mem, non-MLD */
+        .ctrl                    = 0x02, /* IO always enabled */
+        .status                  = 0x26, /* same */
+        .rcvd_mod_ts_data_phase1 = 0xef, /* WTF? */
+    };
+    cxl_component_create_dvsec(cxl, CXL2_DOWNSTREAM_PORT,
+                               PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0,
+                               PCIE_FLEXBUS_PORT_DVSEC,
+                               PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0, dvsec);
+
+    dvsec = (uint8_t *)&(CXLDVSECPortGPF){
+        .rsvd        = 0,
+        .phase1_ctrl = 1, /* 1μs timeout */
+        .phase2_ctrl = 1, /* 1μs timeout */
+    };
+    cxl_component_create_dvsec(cxl, CXL2_DOWNSTREAM_PORT,
+                               GPF_PORT_DVSEC_LENGTH, GPF_PORT_DVSEC,
+                               GPF_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(CXLDVSECRegisterLocator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | CXL_COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl, CXL2_DOWNSTREAM_PORT,
+                               REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void cxl_dsp_realize(PCIDevice *d, Error **errp)
+{
+    PCIEPort *p = PCIE_PORT(d);
+    PCIESlot *s = PCIE_SLOT(d);
+    CXLDownstreamPort *dsp = CXL_DSP(d);
+    CXLComponentState *cxl_cstate = &dsp->cxl_cstate;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    MemoryRegion *component_bar = &cregs->component_registers;
+    int rc;
+
+    pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    pcie_port_init_reg(d);
+
+    rc = msi_init(d, CXL_DOWNSTREAM_PORT_MSI_OFFSET,
+                  CXL_DOWNSTREAM_PORT_MSI_NR_VECTOR,
+                  true, true, errp);
+    if (rc) {
+        assert(rc == -ENOTSUP);
+        goto err_bridge;
+    }
+
+    rc = pcie_cap_init(d, CXL_DOWNSTREAM_PORT_EXP_OFFSET,
+                       PCI_EXP_TYPE_DOWNSTREAM, p->port,
+                       errp);
+    if (rc < 0) {
+        goto err_msi;
+    }
+
+    pcie_cap_flr_init(d);
+    pcie_cap_deverr_init(d);
+    pcie_cap_slot_init(d, s);
+    pcie_cap_arifwd_init(d);
+
+    pcie_chassis_create(s->chassis);
+    rc = pcie_chassis_add_slot(s);
+    if (rc < 0) {
+        error_setg(errp, "Can't add chassis slot, error %d", rc);
+        goto err_pcie_cap;
+    }
+
+    rc = pcie_aer_init(d, PCI_ERR_VER, CXL_DOWNSTREAM_PORT_AER_OFFSET,
+                       PCI_ERR_SIZEOF, errp);
+    if (rc < 0) {
+        goto err_chassis;
+    }
+
+    cxl_cstate->dvsec_offset = CXL_DOWNSTREAM_PORT_DVSEC_OFFSET;
+    cxl_cstate->pdev = d;
+    build_dvsecs(cxl_cstate);
+    cxl_component_register_block_init(OBJECT(d), cxl_cstate, TYPE_CXL_DSP);
+    pci_register_bar(d, CXL_COMPONENT_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     component_bar);
+
+    return;
+
+ err_chassis:
+    pcie_chassis_del_slot(s);
+ err_pcie_cap:
+    pcie_cap_exit(d);
+ err_msi:
+    msi_uninit(d);
+ err_bridge:
+    pci_bridge_exitfn(d);
+}
+
+static void cxl_dsp_exitfn(PCIDevice *d)
+{
+    PCIESlot *s = PCIE_SLOT(d);
+
+    pcie_aer_exit(d);
+    pcie_chassis_del_slot(s);
+    pcie_cap_exit(d);
+    msi_uninit(d);
+    pci_bridge_exitfn(d);
+}
+
+static void cxl_dsp_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(oc);
+
+    k->is_bridge = true;
+    k->config_write = cxl_dsp_config_write;
+    k->realize = cxl_dsp_realize;
+    k->exit = cxl_dsp_exitfn;
+    k->vendor_id = 0x19e5; /* Huawei */
+    k->device_id = 0xa129; /* Emulated CXL Switch Downstream Port */
+    k->revision = 0;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->desc = "CXL Switch Downstream Port";
+    dc->reset = cxl_dsp_reset;
+}
+
+static const TypeInfo cxl_dsp_info = {
+    .name = TYPE_CXL_DSP,
+    .instance_size = sizeof(CXLDownstreamPort),
+    .parent = TYPE_PCIE_SLOT,
+    .class_init = cxl_dsp_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_PCIE_DEVICE },
+        { INTERFACE_CXL_DEVICE },
+        { }
+    },
+};
+
+static void cxl_dsp_register_type(void)
+{
+    type_register_static(&cxl_dsp_info);
+}
+
+type_init(cxl_dsp_register_type);
diff --git a/hw/pci-bridge/meson.build b/hw/pci-bridge/meson.build
index 6828f0e08d..243ceeda50 100644
--- a/hw/pci-bridge/meson.build
+++ b/hw/pci-bridge/meson.build
@@ -6,7 +6,7 @@ pci_ss.add(when: 'CONFIG_PCIE_PORT', if_true: files('pcie_root_port.c', 'gen_pci
 pci_ss.add(when: 'CONFIG_PXB', if_true: files('pci_expander_bridge.c'),
                                if_false: files('pci_expander_bridge_stubs.c'))
 pci_ss.add(when: 'CONFIG_XIO3130', if_true: files('xio3130_upstream.c', 'xio3130_downstream.c'))
-pci_ss.add(when: 'CONFIG_CXL', if_true: files('cxl_root_port.c', 'cxl_upstream.c'))
+pci_ss.add(when: 'CONFIG_CXL', if_true: files('cxl_root_port.c', 'cxl_upstream.c', 'cxl_downstream.c'))
 
 # NewWorld PowerMac
 pci_ss.add(when: 'CONFIG_DEC_PCI', if_true: files('dec.c'))
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 03/10] docs/cxl: Add switch documentation
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 01/10] pci-bridge/cxl_upstream: Add a CXL switch upstream port Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 04/10] virtio/vhost-user: Fix wrong vhost notifier GPtrArray size Michael S. Tsirkin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Jonathan Cameron, Davidlohr Bueso, Ben Widawsky

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Switches were already introduced, but now we support them update
the documentation to provide an example in diagram and
qemu command line parameter forms.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20220616145126.8002-4-Jonathan.Cameron@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 docs/system/devices/cxl.rst | 88 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 86 insertions(+), 2 deletions(-)

diff --git a/docs/system/devices/cxl.rst b/docs/system/devices/cxl.rst
index bcbfe8c490..a57e4c4e5c 100644
--- a/docs/system/devices/cxl.rst
+++ b/docs/system/devices/cxl.rst
@@ -118,8 +118,6 @@ and associated component register access via PCI bars.
 
 CXL Switch
 ~~~~~~~~~~
-Not yet implemented in QEMU.
-
 Here we consider a simple CXL switch with only a single
 virtual hierarchy. Whilst more complex devices exist, their
 visibility to a particular host is generally the same as for
@@ -137,6 +135,10 @@ BARs.  The Upstream Port has the configuration interfaces for
 the HDM decoders which route incoming memory accesses to the
 appropriate downstream port.
 
+A CXL switch is created in a similar fashion to PCI switches
+by creating an upstream port (cxl-upstream) and a number of
+downstream ports on the internal switch bus (cxl-downstream).
+
 CXL Memory Devices - Type 3
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 CXL type 3 devices use a PCI class code and are intended to be supported
@@ -240,6 +242,62 @@ Notes:
     they will take the Host Physical Addresses of accesses and map
     them to their own local Device Physical Address Space (DPA).
 
+Example topology involving a switch::
+
+  |<------------------SYSTEM PHYSICAL ADDRESS MAP (1)----------------->|
+  |    __________   __________________________________   __________    |
+  |   |          | |                                  | |          |   |
+  |   | CFMW 0   | |  CXL Fixed Memory Window 1       | | CFMW 1   |   |
+  |   | HB0 only | |  Configured to interleave memory | | HB1 only |   |
+  |   |          | |  memory accesses across HB0/HB1  | |          |   |
+  |   |____x_____| |__________________________________| |__________|   |
+           |             |                     |             |
+           |             |                     |             |
+           |             |                     |
+  Interleave Decoder     |                     |             |
+   Matches this HB       |                     |             |
+           \_____________|                     |_____________/
+               __________|__________      _____|_______________
+              |                     |    |                     |
+              | CXL HB 0            |    | CXL HB 1            |
+              | HB IntLv Decoders   |    | HB IntLv Decoders   |
+              | PCI/CXL Root Bus 0c |    | PCI/CXL Root Bus 0d |
+              |                     |    |                     |
+              |___x_________________|    |_____________________|
+                  |              |          |               |
+                  |
+       A HB 0 HDM Decoder
+       matches this Port
+       ___________|___
+      |  Root Port 0  |
+      |  Appears in   |
+      |  PCI topology |
+      |  As 0c:00.0   |
+      |___________x___|
+                  |
+                  |
+                  \_____________________
+                                        |
+                                        |
+            ---------------------------------------------------
+           |    Switch 0  USP as PCI 0d:00.0                   |
+           |    USP has HDM decoder which direct traffic to    |
+           |    appropiate downstream port                     |
+           |    Switch BUS appears as 0e                       |
+           |x__________________________________________________|
+            |                  |               |              |
+            |                  |               |              |
+       _____|_________   ______|______   ______|_____   ______|_______
+   (4)|     x         | |             | |            | |              |
+      | CXL Type3 0   | | CXL Type3 1 | | CXL type3 2| | CLX Type 3 3 |
+      |               | |             | |            | |              |
+      | PMEM0(Vol LSA)| | PMEM1 (...) | | PMEM2 (...)| | PMEM3 (...)  |
+      | Decoder to go | |             | |            | |              |
+      | from host PA  | | PCI 10:00.0 | | PCI 11:00.0| | PCI 12:00.0  |
+      | to device PA  | |             | |            | |              |
+      | PCI as 0f:00.0| |             | |            | |              |
+      |_______________| |_____________| |____________| |______________|
+
 Example command lines
 ---------------------
 A very simple setup with just one directly attached CXL Type 3 device::
@@ -279,6 +337,32 @@ the CXL Type3 device directly attached (no switches).::
   -device cxl-type3,bus=root_port16,memdev=cxl-mem4,lsa=cxl-lsa4,id=cxl-pmem3 \
   -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.targets.1=cxl.2,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k
 
+An example of 4 devices below a switch suitable for 1, 2 or 4 way interleave::
+
+  qemu-system-aarch64 -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 -cpu max \
+  ...
+  -object memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M \
+  -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M \
+  -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M \
+  -object memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M \
+  -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M \
+  -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M \
+  -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M \
+  -object memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M \
+  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
+  -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
+  -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
+  -device cxl-upstream,bus=root_port0,id=us0 \
+  -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
+  -device cxl-type3,bus=swport0,memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0,size=256M \
+  -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
+  -device cxl-type3,bus=swport1,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1,size=256M \
+  -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
+  -device cxl-type3,bus=swport2,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2,size=256M \
+  -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
+  -device cxl-type3,bus=swport3,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3,size=256M \
+  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k
+
 Kernel Configuration Options
 ----------------------------
 
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 04/10] virtio/vhost-user: Fix wrong vhost notifier GPtrArray size
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 03/10] docs/cxl: Add switch documentation Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 05/10] virtio-iommu: Add bypass mode support to assigned device Michael S. Tsirkin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Yajun Wu, Parav Pandit, Alex Bennée, Eddie Dong

From: Yajun Wu <yajunw@nvidia.com>

In fetch_or_create_notifier, idx begins with 0. So the GPtrArray size
should be idx + 1 and g_ptr_array_set_size should be called with idx + 1.

This wrong GPtrArray size causes fetch_or_create_notifier return an invalid
address. Passing this invalid pointer to vhost_user_host_notifier_remove
causes assert fail:

    qemu/include/qemu/int128.h:27: int128_get64: Assertion `r == a' failed.
	shutting down, reason=crashed

Backends like dpdk-vdpa which sends out vhost notifier requests almost always
hit qemu crash.

Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Parav Pandit <parav@nvidia.com>
Change-Id: I87e0f7591ca9a59d210879b260704a2d9e9d6bcd
Message-Id: <20220526034851.683258-1-yajunw@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
---
 hw/virtio/vhost-user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 0594178224..4b9be26e84 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1525,7 +1525,7 @@ static VhostUserHostNotifier *fetch_or_create_notifier(VhostUserState *u,
 {
     VhostUserHostNotifier *n = NULL;
     if (idx >= u->notifiers->len) {
-        g_ptr_array_set_size(u->notifiers, idx);
+        g_ptr_array_set_size(u->notifiers, idx + 1);
     }
 
     n = g_ptr_array_index(u->notifiers, idx);
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 05/10] virtio-iommu: Add bypass mode support to assigned device
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 04/10] virtio/vhost-user: Fix wrong vhost notifier GPtrArray size Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 06/10] virtio-iommu: Use recursive lock to avoid deadlock Michael S. Tsirkin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Zhenzhong Duan, Eric Auger

From: Zhenzhong Duan <zhenzhong.duan@intel.com>

Currently assigned devices can not work in virtio-iommu bypass mode.
Guest driver fails to probe the device due to DMA failure. And the
reason is because of lacking GPA -> HPA mappings when VM is created.

Add a root container memory region to hold both bypass memory region
and iommu memory region, so the switch between them is supported
just like the implementation in virtual VT-d.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20220613061010.2674054-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/hw/virtio/virtio-iommu.h |   2 +
 hw/virtio/virtio-iommu.c         | 115 ++++++++++++++++++++++++++++++-
 hw/virtio/trace-events           |   1 +
 3 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 84391f8448..102eeefa73 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -37,6 +37,8 @@ typedef struct IOMMUDevice {
     int           devfn;
     IOMMUMemoryRegion  iommu_mr;
     AddressSpace  as;
+    MemoryRegion root;          /* The root container of the device */
+    MemoryRegion bypass_mr;     /* The alias of shared memory MR */
 } IOMMUDevice;
 
 typedef struct IOMMUPciBus {
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 2597e166f9..ff718107ee 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -69,6 +69,77 @@ static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
 }
 
+static bool virtio_iommu_device_bypassed(IOMMUDevice *sdev)
+{
+    uint32_t sid;
+    bool bypassed;
+    VirtIOIOMMU *s = sdev->viommu;
+    VirtIOIOMMUEndpoint *ep;
+
+    sid = virtio_iommu_get_bdf(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+    /* need to check bypass before system reset */
+    if (!s->endpoints) {
+        bypassed = s->config.bypass;
+        goto unlock;
+    }
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        bypassed = s->config.bypass;
+    } else {
+        bypassed = ep->domain->bypass;
+    }
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+    return bypassed;
+}
+
+/* Return whether the device is using IOMMU translation. */
+static bool virtio_iommu_switch_address_space(IOMMUDevice *sdev)
+{
+    bool use_remapping;
+
+    assert(sdev);
+
+    use_remapping = !virtio_iommu_device_bypassed(sdev);
+
+    trace_virtio_iommu_switch_address_space(pci_bus_num(sdev->bus),
+                                            PCI_SLOT(sdev->devfn),
+                                            PCI_FUNC(sdev->devfn),
+                                            use_remapping);
+
+    /* Turn off first then on the other */
+    if (use_remapping) {
+        memory_region_set_enabled(&sdev->bypass_mr, false);
+        memory_region_set_enabled(MEMORY_REGION(&sdev->iommu_mr), true);
+    } else {
+        memory_region_set_enabled(MEMORY_REGION(&sdev->iommu_mr), false);
+        memory_region_set_enabled(&sdev->bypass_mr, true);
+    }
+
+    return use_remapping;
+}
+
+static void virtio_iommu_switch_address_space_all(VirtIOIOMMU *s)
+{
+    GHashTableIter iter;
+    IOMMUPciBus *iommu_pci_bus;
+    int i;
+
+    g_hash_table_iter_init(&iter, s->as_by_busptr);
+    while (g_hash_table_iter_next(&iter, NULL, (void **)&iommu_pci_bus)) {
+        for (i = 0; i < PCI_DEVFN_MAX; i++) {
+            if (!iommu_pci_bus->pbdev[i]) {
+                continue;
+            }
+            virtio_iommu_switch_address_space(iommu_pci_bus->pbdev[i]);
+        }
+    }
+}
+
 /**
  * The bus number is used for lookup when SID based operations occur.
  * In that case we lazily populate the IOMMUPciBus array from the bus hash
@@ -213,6 +284,7 @@ static gboolean virtio_iommu_notify_map_cb(gpointer key, gpointer value,
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
     VirtIOIOMMUDomain *domain = ep->domain;
+    IOMMUDevice *sdev = container_of(ep->iommu_mr, IOMMUDevice, iommu_mr);
 
     if (!ep->domain) {
         return;
@@ -221,6 +293,7 @@ static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
                    ep->iommu_mr);
     QLIST_REMOVE(ep, next);
     ep->domain = NULL;
+    virtio_iommu_switch_address_space(sdev);
 }
 
 static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
@@ -323,12 +396,39 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
 
         trace_virtio_iommu_init_iommu_mr(name);
 
+        memory_region_init(&sdev->root, OBJECT(s), name, UINT64_MAX);
+        address_space_init(&sdev->as, &sdev->root, TYPE_VIRTIO_IOMMU);
+
+        /*
+         * Build the IOMMU disabled container with aliases to the
+         * shared MRs.  Note that aliasing to a shared memory region
+         * could help the memory API to detect same FlatViews so we
+         * can have devices to share the same FlatView when in bypass
+         * mode. (either by not configuring virtio-iommu driver or with
+         * "iommu=pt").  It will greatly reduce the total number of
+         * FlatViews of the system hence VM runs faster.
+         */
+        memory_region_init_alias(&sdev->bypass_mr, OBJECT(s),
+                                 "system", get_system_memory(), 0,
+                                 memory_region_size(get_system_memory()));
+
         memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev->iommu_mr),
                                  TYPE_VIRTIO_IOMMU_MEMORY_REGION,
                                  OBJECT(s), name,
                                  UINT64_MAX);
-        address_space_init(&sdev->as,
-                           MEMORY_REGION(&sdev->iommu_mr), TYPE_VIRTIO_IOMMU);
+
+        /*
+         * Hook both the containers under the root container, we
+         * switch between iommu & bypass MRs by enable/disable
+         * corresponding sub-containers
+         */
+        memory_region_add_subregion_overlap(&sdev->root, 0,
+                                            MEMORY_REGION(&sdev->iommu_mr),
+                                            0);
+        memory_region_add_subregion_overlap(&sdev->root, 0,
+                                            &sdev->bypass_mr, 0);
+
+        virtio_iommu_switch_address_space(sdev);
         g_free(name);
     }
     return &sdev->as;
@@ -342,6 +442,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     uint32_t flags = le32_to_cpu(req->flags);
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
 
     trace_virtio_iommu_attach(domain_id, ep_id);
 
@@ -375,6 +476,8 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     QLIST_INSERT_HEAD(&domain->endpoint_list, ep, next);
 
     ep->domain = domain;
+    sdev = container_of(ep->iommu_mr, IOMMUDevice, iommu_mr);
+    virtio_iommu_switch_address_space(sdev);
 
     /* Replay domain mappings on the associated memory region */
     g_tree_foreach(domain->mappings, virtio_iommu_notify_map_cb,
@@ -887,6 +990,7 @@ static void virtio_iommu_set_config(VirtIODevice *vdev,
             return;
         }
         dev_config->bypass = in_config->bypass;
+        virtio_iommu_switch_address_space_all(dev);
     }
 
     trace_virtio_iommu_set_config(in_config->bypass);
@@ -1026,6 +1130,8 @@ static void virtio_iommu_system_reset(void *opaque)
      * system reset
      */
     s->config.bypass = s->boot_bypass;
+    virtio_iommu_switch_address_space_all(s);
+
 }
 
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
@@ -1041,6 +1147,11 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
                              virtio_iommu_handle_command);
     s->event_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE, NULL);
 
+    /*
+     * config.bypass is needed to get initial address space early, such as
+     * in vfio realize
+     */
+    s->config.bypass = s->boot_bypass;
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = UINT64_MAX;
     s->config.domain_range.end = UINT32_MAX;
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index ab8e095b73..20af2e7ebd 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -124,6 +124,7 @@ virtio_iommu_remap(const char *name, uint64_t virt_start, uint64_t virt_end, uin
 virtio_iommu_set_page_size_mask(const char *name, uint64_t old, uint64_t new) "mr=%s old_mask=0x%"PRIx64" new_mask=0x%"PRIx64
 virtio_iommu_notify_flag_add(const char *name) "add notifier to mr %s"
 virtio_iommu_notify_flag_del(const char *name) "del notifier from mr %s"
+virtio_iommu_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
 
 # virtio-mem.c
 virtio_mem_send_response(uint16_t type) "type=%" PRIu16
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 06/10] virtio-iommu: Use recursive lock to avoid deadlock
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (4 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 05/10] virtio-iommu: Add bypass mode support to assigned device Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 07/10] virtio-iommu: Add an assert check in translate routine Michael S. Tsirkin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Zhenzhong Duan, Eric Auger

From: Zhenzhong Duan <zhenzhong.duan@intel.com>

When switching address space with mutex lock hold, mapping will be
replayed for assigned device. This will trigger relock deadlock.

Also release the mutex resource in unrealize routine.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20220613061010.2674054-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/hw/virtio/virtio-iommu.h |  2 +-
 hw/virtio/virtio-iommu.c         | 20 +++++++++++---------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 102eeefa73..2ad5ee320b 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -58,7 +58,7 @@ struct VirtIOIOMMU {
     ReservedRegion *reserved_regions;
     uint32_t nb_reserved_regions;
     GTree *domains;
-    QemuMutex mutex;
+    QemuRecMutex mutex;
     GTree *endpoints;
     bool boot_bypass;
 };
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index ff718107ee..73d5bde9d1 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -78,7 +78,7 @@ static bool virtio_iommu_device_bypassed(IOMMUDevice *sdev)
 
     sid = virtio_iommu_get_bdf(sdev);
 
-    qemu_mutex_lock(&s->mutex);
+    qemu_rec_mutex_lock(&s->mutex);
     /* need to check bypass before system reset */
     if (!s->endpoints) {
         bypassed = s->config.bypass;
@@ -93,7 +93,7 @@ static bool virtio_iommu_device_bypassed(IOMMUDevice *sdev)
     }
 
 unlock:
-    qemu_mutex_unlock(&s->mutex);
+    qemu_rec_mutex_unlock(&s->mutex);
     return bypassed;
 }
 
@@ -745,7 +745,7 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
             tail.status = VIRTIO_IOMMU_S_DEVERR;
             goto out;
         }
-        qemu_mutex_lock(&s->mutex);
+        qemu_rec_mutex_lock(&s->mutex);
         switch (head.type) {
         case VIRTIO_IOMMU_T_ATTACH:
             tail.status = virtio_iommu_handle_attach(s, iov, iov_cnt);
@@ -774,7 +774,7 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
         default:
             tail.status = VIRTIO_IOMMU_S_UNSUPP;
         }
-        qemu_mutex_unlock(&s->mutex);
+        qemu_rec_mutex_unlock(&s->mutex);
 
 out:
         sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
@@ -862,7 +862,7 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     sid = virtio_iommu_get_bdf(sdev);
 
     trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
-    qemu_mutex_lock(&s->mutex);
+    qemu_rec_mutex_lock(&s->mutex);
 
     ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
     if (!ep) {
@@ -946,7 +946,7 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
 
 unlock:
-    qemu_mutex_unlock(&s->mutex);
+    qemu_rec_mutex_unlock(&s->mutex);
     return entry;
 }
 
@@ -1035,7 +1035,7 @@ static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
 
     sid = virtio_iommu_get_bdf(sdev);
 
-    qemu_mutex_lock(&s->mutex);
+    qemu_rec_mutex_lock(&s->mutex);
 
     if (!s->endpoints) {
         goto unlock;
@@ -1049,7 +1049,7 @@ static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
     g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
 
 unlock:
-    qemu_mutex_unlock(&s->mutex);
+    qemu_rec_mutex_unlock(&s->mutex);
 }
 
 static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
@@ -1167,7 +1167,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
     virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS_CONFIG);
 
-    qemu_mutex_init(&s->mutex);
+    qemu_rec_mutex_init(&s->mutex);
 
     s->as_by_busptr = g_hash_table_new_full(NULL, NULL, NULL, g_free);
 
@@ -1195,6 +1195,8 @@ static void virtio_iommu_device_unrealize(DeviceState *dev)
         g_tree_destroy(s->endpoints);
     }
 
+    qemu_rec_mutex_destroy(&s->mutex);
+
     virtio_delete_queue(s->req_vq);
     virtio_delete_queue(s->event_vq);
     virtio_cleanup(vdev);
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 07/10] virtio-iommu: Add an assert check in translate routine
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (5 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 06/10] virtio-iommu: Use recursive lock to avoid deadlock Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 08/10] crypto: Introduce RSA algorithm Michael S. Tsirkin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Zhenzhong Duan, Eric Auger

From: Zhenzhong Duan <zhenzhong.duan@intel.com>

With address space switch supported, dma access translation only
happen after endpoint is attached to a non-bypass domain.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20220613061010.2674054-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-iommu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 73d5bde9d1..7c122ab957 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -865,6 +865,10 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     qemu_rec_mutex_lock(&s->mutex);
 
     ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+
+    if (bypass_allowed)
+        assert(ep && ep->domain && !ep->domain->bypass);
+
     if (!ep) {
         if (!bypass_allowed) {
             error_report_once("%s sid=%d is not known!!", __func__, sid);
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 08/10] crypto: Introduce RSA algorithm
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (6 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 07/10] virtio-iommu: Add an assert check in translate routine Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 09/10] vhost: also check queue state in the vhost_dev_set_log error routine Michael S. Tsirkin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, zhenwei pi, Gonglei

From: zhenwei pi <pizhenwei@bytedance.com>

There are two parts in this patch:
1, support akcipher service by cryptodev-builtin driver
2, virtio-crypto driver supports akcipher service

In principle, we should separate this into two patches, to avoid
compiling error, merge them into one.

Then virtio-crypto gets request from guest side, and forwards the
request to builtin driver to handle it.

Test with a guest linux:
1, The self-test framework of crypto layer works fine in guest kernel
2, Test with Linux guest(with asym support), the following script
test(note that pkey_XXX is supported only in a newer version of keyutils):
  - both public key & private key
  - create/close session
  - encrypt/decrypt/sign/verify basic driver operation
  - also test with kernel crypto layer(pkey add/query)

All the cases work fine.

Run script in guest:
rm -rf *.der *.pem *.pfx
modprobe pkcs8_key_parser # if CONFIG_PKCS8_PRIVATE_KEY_PARSER=m
rm -rf /tmp/data
dd if=/dev/random of=/tmp/data count=1 bs=20

openssl req -nodes -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -subj "/C=CN/ST=BJ/L=HD/O=qemu/OU=dev/CN=qemu/emailAddress=qemu@qemu.org"
openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER -out key.der
openssl x509 -in cert.pem -inform PEM -outform DER -out cert.der

PRIV_KEY_ID=`cat key.der | keyctl padd asymmetric test_priv_key @s`
echo "priv key id = "$PRIV_KEY_ID
PUB_KEY_ID=`cat cert.der | keyctl padd asymmetric test_pub_key @s`
echo "pub key id = "$PUB_KEY_ID

keyctl pkey_query $PRIV_KEY_ID 0
keyctl pkey_query $PUB_KEY_ID 0

echo "Enc with priv key..."
keyctl pkey_encrypt $PRIV_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.priv
echo "Dec with pub key..."
keyctl pkey_decrypt $PRIV_KEY_ID 0 /tmp/enc.priv enc=pkcs1 >/tmp/dec
cmp /tmp/data /tmp/dec

echo "Sign with priv key..."
keyctl pkey_sign $PRIV_KEY_ID 0 /tmp/data enc=pkcs1 hash=sha1 > /tmp/sig
echo "Verify with pub key..."
keyctl pkey_verify $PRIV_KEY_ID 0 /tmp/data /tmp/sig enc=pkcs1 hash=sha1

echo "Enc with pub key..."
keyctl pkey_encrypt $PUB_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.pub
echo "Dec with priv key..."
keyctl pkey_decrypt $PRIV_KEY_ID 0 /tmp/enc.pub enc=pkcs1 >/tmp/dec
cmp /tmp/data /tmp/dec

echo "Verify with pub key..."
keyctl pkey_verify $PUB_KEY_ID 0 /tmp/data /tmp/sig enc=pkcs1 hash=sha1

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: lei he <helei.sig11@bytedance.com
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220611064243.24535-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/hw/virtio/virtio-crypto.h |   5 +-
 include/sysemu/cryptodev.h        |  83 ++++++--
 backends/cryptodev-builtin.c      | 276 ++++++++++++++++++++++----
 backends/cryptodev-vhost-user.c   |  34 +++-
 backends/cryptodev.c              |  32 ++-
 hw/virtio/virtio-crypto.c         | 319 ++++++++++++++++++++++++------
 6 files changed, 607 insertions(+), 142 deletions(-)

diff --git a/include/hw/virtio/virtio-crypto.h b/include/hw/virtio/virtio-crypto.h
index a2228d7b2e..348749f5d5 100644
--- a/include/hw/virtio/virtio-crypto.h
+++ b/include/hw/virtio/virtio-crypto.h
@@ -50,6 +50,7 @@ typedef struct VirtIOCryptoConf {
     uint32_t mac_algo_l;
     uint32_t mac_algo_h;
     uint32_t aead_algo;
+    uint32_t akcipher_algo;
 
     /* Maximum length of cipher key */
     uint32_t max_cipher_key_len;
@@ -71,9 +72,7 @@ typedef struct VirtIOCryptoReq {
     size_t in_len;
     VirtQueue *vq;
     struct VirtIOCrypto *vcrypto;
-    union {
-        CryptoDevBackendSymOpInfo *sym_op_info;
-    } u;
+    CryptoDevBackendOpInfo op_info;
 } VirtIOCryptoReq;
 
 typedef struct VirtIOCryptoQueue {
diff --git a/include/sysemu/cryptodev.h b/include/sysemu/cryptodev.h
index f4d4057d4d..37c3a360fd 100644
--- a/include/sysemu/cryptodev.h
+++ b/include/sysemu/cryptodev.h
@@ -50,13 +50,13 @@ typedef struct CryptoDevBackendClient
 
 enum CryptoDevBackendAlgType {
     CRYPTODEV_BACKEND_ALG_SYM,
+    CRYPTODEV_BACKEND_ALG_ASYM,
     CRYPTODEV_BACKEND_ALG__MAX,
 };
 
 /**
  * CryptoDevBackendSymSessionInfo:
  *
- * @op_code: operation code (refer to virtio_crypto.h)
  * @cipher_alg: algorithm type of CIPHER
  * @key_len: byte length of cipher key
  * @hash_alg: algorithm type of HASH/MAC
@@ -74,7 +74,6 @@ enum CryptoDevBackendAlgType {
  */
 typedef struct CryptoDevBackendSymSessionInfo {
     /* corresponding with virtio crypto spec */
-    uint32_t op_code;
     uint32_t cipher_alg;
     uint32_t key_len;
     uint32_t hash_alg;
@@ -89,11 +88,36 @@ typedef struct CryptoDevBackendSymSessionInfo {
     uint8_t *auth_key;
 } CryptoDevBackendSymSessionInfo;
 
+/**
+ * CryptoDevBackendAsymSessionInfo:
+ */
+typedef struct CryptoDevBackendRsaPara {
+    uint32_t padding_algo;
+    uint32_t hash_algo;
+} CryptoDevBackendRsaPara;
+
+typedef struct CryptoDevBackendAsymSessionInfo {
+    /* corresponding with virtio crypto spec */
+    uint32_t algo;
+    uint32_t keytype;
+    uint32_t keylen;
+    uint8_t *key;
+    union {
+        CryptoDevBackendRsaPara rsa;
+    } u;
+} CryptoDevBackendAsymSessionInfo;
+
+typedef struct CryptoDevBackendSessionInfo {
+    uint32_t op_code;
+    union {
+        CryptoDevBackendSymSessionInfo sym_sess_info;
+        CryptoDevBackendAsymSessionInfo asym_sess_info;
+    } u;
+} CryptoDevBackendSessionInfo;
+
 /**
  * CryptoDevBackendSymOpInfo:
  *
- * @session_id: session index which was previously
- *              created by cryptodev_backend_sym_create_session()
  * @aad_len: byte length of additional authenticated data
  * @iv_len: byte length of initialization vector or counter
  * @src_len: byte length of source data
@@ -119,7 +143,6 @@ typedef struct CryptoDevBackendSymSessionInfo {
  *
  */
 typedef struct CryptoDevBackendSymOpInfo {
-    uint64_t session_id;
     uint32_t aad_len;
     uint32_t iv_len;
     uint32_t src_len;
@@ -138,6 +161,33 @@ typedef struct CryptoDevBackendSymOpInfo {
     uint8_t data[];
 } CryptoDevBackendSymOpInfo;
 
+
+/**
+ * CryptoDevBackendAsymOpInfo:
+ *
+ * @src_len: byte length of source data
+ * @dst_len: byte length of destination data
+ * @src: point to the source data
+ * @dst: point to the destination data
+ *
+ */
+typedef struct CryptoDevBackendAsymOpInfo {
+    uint32_t src_len;
+    uint32_t dst_len;
+    uint8_t *src;
+    uint8_t *dst;
+} CryptoDevBackendAsymOpInfo;
+
+typedef struct CryptoDevBackendOpInfo {
+    enum CryptoDevBackendAlgType algtype;
+    uint32_t op_code;
+    uint64_t session_id;
+    union {
+        CryptoDevBackendSymOpInfo *sym_op_info;
+        CryptoDevBackendAsymOpInfo *asym_op_info;
+    } u;
+} CryptoDevBackendOpInfo;
+
 struct CryptoDevBackendClass {
     ObjectClass parent_class;
 
@@ -145,13 +195,13 @@ struct CryptoDevBackendClass {
     void (*cleanup)(CryptoDevBackend *backend, Error **errp);
 
     int64_t (*create_session)(CryptoDevBackend *backend,
-                       CryptoDevBackendSymSessionInfo *sess_info,
+                       CryptoDevBackendSessionInfo *sess_info,
                        uint32_t queue_index, Error **errp);
     int (*close_session)(CryptoDevBackend *backend,
                            uint64_t session_id,
                            uint32_t queue_index, Error **errp);
-    int (*do_sym_op)(CryptoDevBackend *backend,
-                     CryptoDevBackendSymOpInfo *op_info,
+    int (*do_op)(CryptoDevBackend *backend,
+                     CryptoDevBackendOpInfo *op_info,
                      uint32_t queue_index, Error **errp);
 };
 
@@ -190,6 +240,7 @@ struct CryptoDevBackendConf {
     uint32_t mac_algo_l;
     uint32_t mac_algo_h;
     uint32_t aead_algo;
+    uint32_t akcipher_algo;
     /* Maximum length of cipher key */
     uint32_t max_cipher_key_len;
     /* Maximum length of authenticated key */
@@ -247,34 +298,34 @@ void cryptodev_backend_cleanup(
            Error **errp);
 
 /**
- * cryptodev_backend_sym_create_session:
+ * cryptodev_backend_create_session:
  * @backend: the cryptodev backend object
  * @sess_info: parameters needed by session creating
  * @queue_index: queue index of cryptodev backend client
  * @errp: pointer to a NULL-initialized error object
  *
- * Create a session for symmetric algorithms
+ * Create a session for symmetric/symmetric algorithms
  *
  * Returns: session id on success, or -1 on error
  */
-int64_t cryptodev_backend_sym_create_session(
+int64_t cryptodev_backend_create_session(
            CryptoDevBackend *backend,
-           CryptoDevBackendSymSessionInfo *sess_info,
+           CryptoDevBackendSessionInfo *sess_info,
            uint32_t queue_index, Error **errp);
 
 /**
- * cryptodev_backend_sym_close_session:
+ * cryptodev_backend_close_session:
  * @backend: the cryptodev backend object
  * @session_id: the session id
  * @queue_index: queue index of cryptodev backend client
  * @errp: pointer to a NULL-initialized error object
  *
- * Close a session for symmetric algorithms which was previously
- * created by cryptodev_backend_sym_create_session()
+ * Close a session for which was previously
+ * created by cryptodev_backend_create_session()
  *
  * Returns: 0 on success, or Negative on error
  */
-int cryptodev_backend_sym_close_session(
+int cryptodev_backend_close_session(
            CryptoDevBackend *backend,
            uint64_t session_id,
            uint32_t queue_index, Error **errp);
diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c
index 0671bf9f3e..125cbad1d3 100644
--- a/backends/cryptodev-builtin.c
+++ b/backends/cryptodev-builtin.c
@@ -26,6 +26,7 @@
 #include "qapi/error.h"
 #include "standard-headers/linux/virtio_crypto.h"
 #include "crypto/cipher.h"
+#include "crypto/akcipher.h"
 #include "qom/object.h"
 
 
@@ -42,10 +43,11 @@ typedef struct CryptoDevBackendBuiltinSession {
     QCryptoCipher *cipher;
     uint8_t direction; /* encryption or decryption */
     uint8_t type; /* cipher? hash? aead? */
+    QCryptoAkCipher *akcipher;
     QTAILQ_ENTRY(CryptoDevBackendBuiltinSession) next;
 } CryptoDevBackendBuiltinSession;
 
-/* Max number of symmetric sessions */
+/* Max number of symmetric/asymmetric sessions */
 #define MAX_NUM_SESSIONS 256
 
 #define CRYPTODEV_BUITLIN_MAX_AUTH_KEY_LEN    512
@@ -80,15 +82,17 @@ static void cryptodev_builtin_init(
     backend->conf.crypto_services =
                          1u << VIRTIO_CRYPTO_SERVICE_CIPHER |
                          1u << VIRTIO_CRYPTO_SERVICE_HASH |
-                         1u << VIRTIO_CRYPTO_SERVICE_MAC;
+                         1u << VIRTIO_CRYPTO_SERVICE_MAC |
+                         1u << VIRTIO_CRYPTO_SERVICE_AKCIPHER;
     backend->conf.cipher_algo_l = 1u << VIRTIO_CRYPTO_CIPHER_AES_CBC;
     backend->conf.hash_algo = 1u << VIRTIO_CRYPTO_HASH_SHA1;
+    backend->conf.akcipher_algo = 1u << VIRTIO_CRYPTO_AKCIPHER_RSA;
     /*
      * Set the Maximum length of crypto request.
      * Why this value? Just avoid to overflow when
      * memory allocation for each crypto request.
      */
-    backend->conf.max_size = LONG_MAX - sizeof(CryptoDevBackendSymOpInfo);
+    backend->conf.max_size = LONG_MAX - sizeof(CryptoDevBackendOpInfo);
     backend->conf.max_cipher_key_len = CRYPTODEV_BUITLIN_MAX_CIPHER_KEY_LEN;
     backend->conf.max_auth_key_len = CRYPTODEV_BUITLIN_MAX_AUTH_KEY_LEN;
 
@@ -148,6 +152,55 @@ err:
    return -1;
 }
 
+static int cryptodev_builtin_get_rsa_hash_algo(
+    int virtio_rsa_hash, Error **errp)
+{
+    switch (virtio_rsa_hash) {
+    case VIRTIO_CRYPTO_RSA_MD5:
+        return QCRYPTO_HASH_ALG_MD5;
+
+    case VIRTIO_CRYPTO_RSA_SHA1:
+        return QCRYPTO_HASH_ALG_SHA1;
+
+    case VIRTIO_CRYPTO_RSA_SHA256:
+        return QCRYPTO_HASH_ALG_SHA256;
+
+    case VIRTIO_CRYPTO_RSA_SHA512:
+        return QCRYPTO_HASH_ALG_SHA512;
+
+    default:
+        error_setg(errp, "Unsupported rsa hash algo: %d", virtio_rsa_hash);
+        return -1;
+    }
+}
+
+static int cryptodev_builtin_set_rsa_options(
+                    int virtio_padding_algo,
+                    int virtio_hash_algo,
+                    QCryptoAkCipherOptionsRSA *opt,
+                    Error **errp)
+{
+    if (virtio_padding_algo == VIRTIO_CRYPTO_RSA_PKCS1_PADDING) {
+        int hash_alg;
+
+        hash_alg = cryptodev_builtin_get_rsa_hash_algo(virtio_hash_algo, errp);
+        if (hash_alg < 0) {
+            return -1;
+        }
+        opt->hash_alg = hash_alg;
+        opt->padding_alg = QCRYPTO_RSA_PADDING_ALG_PKCS1;
+        return 0;
+    }
+
+    if (virtio_padding_algo == VIRTIO_CRYPTO_RSA_RAW_PADDING) {
+        opt->padding_alg = QCRYPTO_RSA_PADDING_ALG_RAW;
+        return 0;
+    }
+
+    error_setg(errp, "Unsupported rsa padding algo: %d", virtio_padding_algo);
+    return -1;
+}
+
 static int cryptodev_builtin_create_cipher_session(
                     CryptoDevBackendBuiltin *builtin,
                     CryptoDevBackendSymSessionInfo *sess_info,
@@ -240,26 +293,89 @@ static int cryptodev_builtin_create_cipher_session(
     return index;
 }
 
-static int64_t cryptodev_builtin_sym_create_session(
+static int cryptodev_builtin_create_akcipher_session(
+                    CryptoDevBackendBuiltin *builtin,
+                    CryptoDevBackendAsymSessionInfo *sess_info,
+                    Error **errp)
+{
+    CryptoDevBackendBuiltinSession *sess;
+    QCryptoAkCipher *akcipher;
+    int index;
+    QCryptoAkCipherKeyType type;
+    QCryptoAkCipherOptions opts;
+
+    switch (sess_info->algo) {
+    case VIRTIO_CRYPTO_AKCIPHER_RSA:
+        opts.alg = QCRYPTO_AKCIPHER_ALG_RSA;
+        if (cryptodev_builtin_set_rsa_options(sess_info->u.rsa.padding_algo,
+            sess_info->u.rsa.hash_algo, &opts.u.rsa, errp) != 0) {
+            return -1;
+        }
+        break;
+
+    /* TODO support DSA&ECDSA until qemu crypto framework support these */
+
+    default:
+        error_setg(errp, "Unsupported akcipher alg %u", sess_info->algo);
+        return -1;
+    }
+
+    switch (sess_info->keytype) {
+    case VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PUBLIC:
+        type = QCRYPTO_AKCIPHER_KEY_TYPE_PUBLIC;
+        break;
+
+    case VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PRIVATE:
+        type = QCRYPTO_AKCIPHER_KEY_TYPE_PRIVATE;
+        break;
+
+    default:
+        error_setg(errp, "Unsupported akcipher keytype %u", sess_info->keytype);
+        return -1;
+    }
+
+    index = cryptodev_builtin_get_unused_session_index(builtin);
+    if (index < 0) {
+        error_setg(errp, "Total number of sessions created exceeds %u",
+                   MAX_NUM_SESSIONS);
+        return -1;
+    }
+
+    akcipher = qcrypto_akcipher_new(&opts, type, sess_info->key,
+                                    sess_info->keylen, errp);
+    if (!akcipher) {
+        return -1;
+    }
+
+    sess = g_new0(CryptoDevBackendBuiltinSession, 1);
+    sess->akcipher = akcipher;
+
+    builtin->sessions[index] = sess;
+
+    return index;
+}
+
+static int64_t cryptodev_builtin_create_session(
            CryptoDevBackend *backend,
-           CryptoDevBackendSymSessionInfo *sess_info,
+           CryptoDevBackendSessionInfo *sess_info,
            uint32_t queue_index, Error **errp)
 {
     CryptoDevBackendBuiltin *builtin =
                       CRYPTODEV_BACKEND_BUILTIN(backend);
-    int64_t session_id = -1;
-    int ret;
+    CryptoDevBackendSymSessionInfo *sym_sess_info;
+    CryptoDevBackendAsymSessionInfo *asym_sess_info;
 
     switch (sess_info->op_code) {
     case VIRTIO_CRYPTO_CIPHER_CREATE_SESSION:
-        ret = cryptodev_builtin_create_cipher_session(
-                           builtin, sess_info, errp);
-        if (ret < 0) {
-            return ret;
-        } else {
-            session_id = ret;
-        }
-        break;
+        sym_sess_info = &sess_info->u.sym_sess_info;
+        return cryptodev_builtin_create_cipher_session(
+                           builtin, sym_sess_info, errp);
+
+    case VIRTIO_CRYPTO_AKCIPHER_CREATE_SESSION:
+        asym_sess_info = &sess_info->u.asym_sess_info;
+        return cryptodev_builtin_create_akcipher_session(
+                           builtin, asym_sess_info, errp);
+
     case VIRTIO_CRYPTO_HASH_CREATE_SESSION:
     case VIRTIO_CRYPTO_MAC_CREATE_SESSION:
     default:
@@ -268,50 +384,44 @@ static int64_t cryptodev_builtin_sym_create_session(
         return -1;
     }
 
-    return session_id;
+    return -1;
 }
 
-static int cryptodev_builtin_sym_close_session(
+static int cryptodev_builtin_close_session(
            CryptoDevBackend *backend,
            uint64_t session_id,
            uint32_t queue_index, Error **errp)
 {
     CryptoDevBackendBuiltin *builtin =
                       CRYPTODEV_BACKEND_BUILTIN(backend);
+    CryptoDevBackendBuiltinSession *session;
 
     assert(session_id < MAX_NUM_SESSIONS && builtin->sessions[session_id]);
 
-    qcrypto_cipher_free(builtin->sessions[session_id]->cipher);
-    g_free(builtin->sessions[session_id]);
+    session = builtin->sessions[session_id];
+    if (session->cipher) {
+        qcrypto_cipher_free(session->cipher);
+    } else if (session->akcipher) {
+        qcrypto_akcipher_free(session->akcipher);
+    }
+
+    g_free(session);
     builtin->sessions[session_id] = NULL;
     return 0;
 }
 
 static int cryptodev_builtin_sym_operation(
-                 CryptoDevBackend *backend,
-                 CryptoDevBackendSymOpInfo *op_info,
-                 uint32_t queue_index, Error **errp)
+                 CryptoDevBackendBuiltinSession *sess,
+                 CryptoDevBackendSymOpInfo *op_info, Error **errp)
 {
-    CryptoDevBackendBuiltin *builtin =
-                      CRYPTODEV_BACKEND_BUILTIN(backend);
-    CryptoDevBackendBuiltinSession *sess;
     int ret;
 
-    if (op_info->session_id >= MAX_NUM_SESSIONS ||
-              builtin->sessions[op_info->session_id] == NULL) {
-        error_setg(errp, "Cannot find a valid session id: %" PRIu64 "",
-                   op_info->session_id);
-        return -VIRTIO_CRYPTO_INVSESS;
-    }
-
     if (op_info->op_type == VIRTIO_CRYPTO_SYM_OP_ALGORITHM_CHAINING) {
         error_setg(errp,
                "Algorithm chain is unsupported for cryptdoev-builtin");
         return -VIRTIO_CRYPTO_NOTSUPP;
     }
 
-    sess = builtin->sessions[op_info->session_id];
-
     if (op_info->iv_len > 0) {
         ret = qcrypto_cipher_setiv(sess->cipher, op_info->iv,
                                    op_info->iv_len, errp);
@@ -333,9 +443,99 @@ static int cryptodev_builtin_sym_operation(
             return -VIRTIO_CRYPTO_ERR;
         }
     }
+
     return VIRTIO_CRYPTO_OK;
 }
 
+static int cryptodev_builtin_asym_operation(
+                 CryptoDevBackendBuiltinSession *sess, uint32_t op_code,
+                 CryptoDevBackendAsymOpInfo *op_info, Error **errp)
+{
+    int ret;
+
+    switch (op_code) {
+    case VIRTIO_CRYPTO_AKCIPHER_ENCRYPT:
+        ret = qcrypto_akcipher_encrypt(sess->akcipher,
+                                       op_info->src, op_info->src_len,
+                                       op_info->dst, op_info->dst_len, errp);
+        break;
+
+    case VIRTIO_CRYPTO_AKCIPHER_DECRYPT:
+        ret = qcrypto_akcipher_decrypt(sess->akcipher,
+                                       op_info->src, op_info->src_len,
+                                       op_info->dst, op_info->dst_len, errp);
+        break;
+
+    case VIRTIO_CRYPTO_AKCIPHER_SIGN:
+        ret = qcrypto_akcipher_sign(sess->akcipher,
+                                    op_info->src, op_info->src_len,
+                                    op_info->dst, op_info->dst_len, errp);
+        break;
+
+    case VIRTIO_CRYPTO_AKCIPHER_VERIFY:
+        ret = qcrypto_akcipher_verify(sess->akcipher,
+                                      op_info->src, op_info->src_len,
+                                      op_info->dst, op_info->dst_len, errp);
+        break;
+
+    default:
+        return -VIRTIO_CRYPTO_ERR;
+    }
+
+    if (ret < 0) {
+        if (op_code == VIRTIO_CRYPTO_AKCIPHER_VERIFY) {
+            return -VIRTIO_CRYPTO_KEY_REJECTED;
+        }
+        return -VIRTIO_CRYPTO_ERR;
+    }
+
+    /* Buffer is too short, typically the driver should handle this case */
+    if (unlikely(ret > op_info->dst_len)) {
+        if (errp && !*errp) {
+            error_setg(errp, "dst buffer too short");
+        }
+
+        return -VIRTIO_CRYPTO_ERR;
+    }
+
+    op_info->dst_len = ret;
+
+    return VIRTIO_CRYPTO_OK;
+}
+
+static int cryptodev_builtin_operation(
+                 CryptoDevBackend *backend,
+                 CryptoDevBackendOpInfo *op_info,
+                 uint32_t queue_index, Error **errp)
+{
+    CryptoDevBackendBuiltin *builtin =
+                      CRYPTODEV_BACKEND_BUILTIN(backend);
+    CryptoDevBackendBuiltinSession *sess;
+    CryptoDevBackendSymOpInfo *sym_op_info;
+    CryptoDevBackendAsymOpInfo *asym_op_info;
+    enum CryptoDevBackendAlgType algtype = op_info->algtype;
+    int ret = -VIRTIO_CRYPTO_ERR;
+
+    if (op_info->session_id >= MAX_NUM_SESSIONS ||
+              builtin->sessions[op_info->session_id] == NULL) {
+        error_setg(errp, "Cannot find a valid session id: %" PRIu64 "",
+                   op_info->session_id);
+        return -VIRTIO_CRYPTO_INVSESS;
+    }
+
+    sess = builtin->sessions[op_info->session_id];
+    if (algtype == CRYPTODEV_BACKEND_ALG_SYM) {
+        sym_op_info = op_info->u.sym_op_info;
+        ret = cryptodev_builtin_sym_operation(sess, sym_op_info, errp);
+    } else if (algtype == CRYPTODEV_BACKEND_ALG_ASYM) {
+        asym_op_info = op_info->u.asym_op_info;
+        ret = cryptodev_builtin_asym_operation(sess, op_info->op_code,
+                                               asym_op_info, errp);
+    }
+
+    return ret;
+}
+
 static void cryptodev_builtin_cleanup(
              CryptoDevBackend *backend,
              Error **errp)
@@ -348,7 +548,7 @@ static void cryptodev_builtin_cleanup(
 
     for (i = 0; i < MAX_NUM_SESSIONS; i++) {
         if (builtin->sessions[i] != NULL) {
-            cryptodev_builtin_sym_close_session(backend, i, 0, &error_abort);
+            cryptodev_builtin_close_session(backend, i, 0, &error_abort);
         }
     }
 
@@ -370,9 +570,9 @@ cryptodev_builtin_class_init(ObjectClass *oc, void *data)
 
     bc->init = cryptodev_builtin_init;
     bc->cleanup = cryptodev_builtin_cleanup;
-    bc->create_session = cryptodev_builtin_sym_create_session;
-    bc->close_session = cryptodev_builtin_sym_close_session;
-    bc->do_sym_op = cryptodev_builtin_sym_operation;
+    bc->create_session = cryptodev_builtin_create_session;
+    bc->close_session = cryptodev_builtin_close_session;
+    bc->do_op = cryptodev_builtin_operation;
 }
 
 static const TypeInfo cryptodev_builtin_info = {
diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
index bedb452474..5443a59153 100644
--- a/backends/cryptodev-vhost-user.c
+++ b/backends/cryptodev-vhost-user.c
@@ -259,7 +259,33 @@ static int64_t cryptodev_vhost_user_sym_create_session(
     return -1;
 }
 
-static int cryptodev_vhost_user_sym_close_session(
+static int64_t cryptodev_vhost_user_create_session(
+           CryptoDevBackend *backend,
+           CryptoDevBackendSessionInfo *sess_info,
+           uint32_t queue_index, Error **errp)
+{
+    uint32_t op_code = sess_info->op_code;
+    CryptoDevBackendSymSessionInfo *sym_sess_info;
+
+    switch (op_code) {
+    case VIRTIO_CRYPTO_CIPHER_CREATE_SESSION:
+    case VIRTIO_CRYPTO_HASH_CREATE_SESSION:
+    case VIRTIO_CRYPTO_MAC_CREATE_SESSION:
+    case VIRTIO_CRYPTO_AEAD_CREATE_SESSION:
+        sym_sess_info = &sess_info->u.sym_sess_info;
+        return cryptodev_vhost_user_sym_create_session(backend, sym_sess_info,
+                   queue_index, errp);
+    default:
+        error_setg(errp, "Unsupported opcode :%" PRIu32 "",
+                   sess_info->op_code);
+        return -1;
+
+    }
+
+    return -1;
+}
+
+static int cryptodev_vhost_user_close_session(
            CryptoDevBackend *backend,
            uint64_t session_id,
            uint32_t queue_index, Error **errp)
@@ -351,9 +377,9 @@ cryptodev_vhost_user_class_init(ObjectClass *oc, void *data)
 
     bc->init = cryptodev_vhost_user_init;
     bc->cleanup = cryptodev_vhost_user_cleanup;
-    bc->create_session = cryptodev_vhost_user_sym_create_session;
-    bc->close_session = cryptodev_vhost_user_sym_close_session;
-    bc->do_sym_op = NULL;
+    bc->create_session = cryptodev_vhost_user_create_session;
+    bc->close_session = cryptodev_vhost_user_close_session;
+    bc->do_op = NULL;
 
     object_class_property_add_str(oc, "chardev",
                                   cryptodev_vhost_user_get_chardev,
diff --git a/backends/cryptodev.c b/backends/cryptodev.c
index 2b105e433c..33eb4e1a70 100644
--- a/backends/cryptodev.c
+++ b/backends/cryptodev.c
@@ -72,9 +72,9 @@ void cryptodev_backend_cleanup(
     }
 }
 
-int64_t cryptodev_backend_sym_create_session(
+int64_t cryptodev_backend_create_session(
            CryptoDevBackend *backend,
-           CryptoDevBackendSymSessionInfo *sess_info,
+           CryptoDevBackendSessionInfo *sess_info,
            uint32_t queue_index, Error **errp)
 {
     CryptoDevBackendClass *bc =
@@ -87,7 +87,7 @@ int64_t cryptodev_backend_sym_create_session(
     return -1;
 }
 
-int cryptodev_backend_sym_close_session(
+int cryptodev_backend_close_session(
            CryptoDevBackend *backend,
            uint64_t session_id,
            uint32_t queue_index, Error **errp)
@@ -102,16 +102,16 @@ int cryptodev_backend_sym_close_session(
     return -1;
 }
 
-static int cryptodev_backend_sym_operation(
+static int cryptodev_backend_operation(
                  CryptoDevBackend *backend,
-                 CryptoDevBackendSymOpInfo *op_info,
+                 CryptoDevBackendOpInfo *op_info,
                  uint32_t queue_index, Error **errp)
 {
     CryptoDevBackendClass *bc =
                       CRYPTODEV_BACKEND_GET_CLASS(backend);
 
-    if (bc->do_sym_op) {
-        return bc->do_sym_op(backend, op_info, queue_index, errp);
+    if (bc->do_op) {
+        return bc->do_op(backend, op_info, queue_index, errp);
     }
 
     return -VIRTIO_CRYPTO_ERR;
@@ -123,20 +123,18 @@ int cryptodev_backend_crypto_operation(
                  uint32_t queue_index, Error **errp)
 {
     VirtIOCryptoReq *req = opaque;
+    CryptoDevBackendOpInfo *op_info = &req->op_info;
+    enum CryptoDevBackendAlgType algtype = req->flags;
 
-    if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
-        CryptoDevBackendSymOpInfo *op_info;
-        op_info = req->u.sym_op_info;
-
-        return cryptodev_backend_sym_operation(backend,
-                         op_info, queue_index, errp);
-    } else {
+    if ((algtype != CRYPTODEV_BACKEND_ALG_SYM)
+        && (algtype != CRYPTODEV_BACKEND_ALG_ASYM)) {
         error_setg(errp, "Unsupported cryptodev alg type: %" PRIu32 "",
-                   req->flags);
-       return -VIRTIO_CRYPTO_NOTSUPP;
+                   algtype);
+
+        return -VIRTIO_CRYPTO_NOTSUPP;
     }
 
-    return -VIRTIO_CRYPTO_ERR;
+    return cryptodev_backend_operation(backend, op_info, queue_index, errp);
 }
 
 static void
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index c3829e7498..c1243c3f93 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -83,7 +83,8 @@ virtio_crypto_create_sym_session(VirtIOCrypto *vcrypto,
                struct iovec *iov, unsigned int out_num)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(vcrypto);
-    CryptoDevBackendSymSessionInfo info;
+    CryptoDevBackendSessionInfo info;
+    CryptoDevBackendSymSessionInfo *sym_info;
     int64_t session_id;
     int queue_index;
     uint32_t op_type;
@@ -92,11 +93,13 @@ virtio_crypto_create_sym_session(VirtIOCrypto *vcrypto,
 
     memset(&info, 0, sizeof(info));
     op_type = ldl_le_p(&sess_req->op_type);
-    info.op_type = op_type;
     info.op_code = opcode;
 
+    sym_info = &info.u.sym_sess_info;
+    sym_info->op_type = op_type;
+
     if (op_type == VIRTIO_CRYPTO_SYM_OP_CIPHER) {
-        ret = virtio_crypto_cipher_session_helper(vdev, &info,
+        ret = virtio_crypto_cipher_session_helper(vdev, sym_info,
                            &sess_req->u.cipher.para,
                            &iov, &out_num);
         if (ret < 0) {
@@ -105,47 +108,47 @@ virtio_crypto_create_sym_session(VirtIOCrypto *vcrypto,
     } else if (op_type == VIRTIO_CRYPTO_SYM_OP_ALGORITHM_CHAINING) {
         size_t s;
         /* cipher part */
-        ret = virtio_crypto_cipher_session_helper(vdev, &info,
+        ret = virtio_crypto_cipher_session_helper(vdev, sym_info,
                            &sess_req->u.chain.para.cipher_param,
                            &iov, &out_num);
         if (ret < 0) {
             goto err;
         }
         /* hash part */
-        info.alg_chain_order = ldl_le_p(
+        sym_info->alg_chain_order = ldl_le_p(
                                      &sess_req->u.chain.para.alg_chain_order);
-        info.add_len = ldl_le_p(&sess_req->u.chain.para.aad_len);
-        info.hash_mode = ldl_le_p(&sess_req->u.chain.para.hash_mode);
-        if (info.hash_mode == VIRTIO_CRYPTO_SYM_HASH_MODE_AUTH) {
-            info.hash_alg = ldl_le_p(&sess_req->u.chain.para.u.mac_param.algo);
-            info.auth_key_len = ldl_le_p(
+        sym_info->add_len = ldl_le_p(&sess_req->u.chain.para.aad_len);
+        sym_info->hash_mode = ldl_le_p(&sess_req->u.chain.para.hash_mode);
+        if (sym_info->hash_mode == VIRTIO_CRYPTO_SYM_HASH_MODE_AUTH) {
+            sym_info->hash_alg =
+                ldl_le_p(&sess_req->u.chain.para.u.mac_param.algo);
+            sym_info->auth_key_len = ldl_le_p(
                              &sess_req->u.chain.para.u.mac_param.auth_key_len);
-            info.hash_result_len = ldl_le_p(
+            sym_info->hash_result_len = ldl_le_p(
                            &sess_req->u.chain.para.u.mac_param.hash_result_len);
-            if (info.auth_key_len > vcrypto->conf.max_auth_key_len) {
+            if (sym_info->auth_key_len > vcrypto->conf.max_auth_key_len) {
                 error_report("virtio-crypto length of auth key is too big: %u",
-                             info.auth_key_len);
+                             sym_info->auth_key_len);
                 ret = -VIRTIO_CRYPTO_ERR;
                 goto err;
             }
             /* get auth key */
-            if (info.auth_key_len > 0) {
-                DPRINTF("auth_keylen=%" PRIu32 "\n", info.auth_key_len);
-                info.auth_key = g_malloc(info.auth_key_len);
-                s = iov_to_buf(iov, out_num, 0, info.auth_key,
-                               info.auth_key_len);
-                if (unlikely(s != info.auth_key_len)) {
+            if (sym_info->auth_key_len > 0) {
+                sym_info->auth_key = g_malloc(sym_info->auth_key_len);
+                s = iov_to_buf(iov, out_num, 0, sym_info->auth_key,
+                               sym_info->auth_key_len);
+                if (unlikely(s != sym_info->auth_key_len)) {
                     virtio_error(vdev,
                           "virtio-crypto authenticated key incorrect");
                     ret = -EFAULT;
                     goto err;
                 }
-                iov_discard_front(&iov, &out_num, info.auth_key_len);
+                iov_discard_front(&iov, &out_num, sym_info->auth_key_len);
             }
-        } else if (info.hash_mode == VIRTIO_CRYPTO_SYM_HASH_MODE_PLAIN) {
-            info.hash_alg = ldl_le_p(
+        } else if (sym_info->hash_mode == VIRTIO_CRYPTO_SYM_HASH_MODE_PLAIN) {
+            sym_info->hash_alg = ldl_le_p(
                              &sess_req->u.chain.para.u.hash_param.algo);
-            info.hash_result_len = ldl_le_p(
+            sym_info->hash_result_len = ldl_le_p(
                         &sess_req->u.chain.para.u.hash_param.hash_result_len);
         } else {
             /* VIRTIO_CRYPTO_SYM_HASH_MODE_NESTED */
@@ -161,13 +164,10 @@ virtio_crypto_create_sym_session(VirtIOCrypto *vcrypto,
     }
 
     queue_index = virtio_crypto_vq2q(queue_id);
-    session_id = cryptodev_backend_sym_create_session(
+    session_id = cryptodev_backend_create_session(
                                      vcrypto->cryptodev,
                                      &info, queue_index, &local_err);
     if (session_id >= 0) {
-        DPRINTF("create session_id=%" PRIu64 " successfully\n",
-                session_id);
-
         ret = session_id;
     } else {
         if (local_err) {
@@ -177,11 +177,78 @@ virtio_crypto_create_sym_session(VirtIOCrypto *vcrypto,
     }
 
 err:
-    g_free(info.cipher_key);
-    g_free(info.auth_key);
+    g_free(sym_info->cipher_key);
+    g_free(sym_info->auth_key);
     return ret;
 }
 
+static int64_t
+virtio_crypto_create_asym_session(VirtIOCrypto *vcrypto,
+               struct virtio_crypto_akcipher_create_session_req *sess_req,
+               uint32_t queue_id, uint32_t opcode,
+               struct iovec *iov, unsigned int out_num)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(vcrypto);
+    CryptoDevBackendSessionInfo info = {0};
+    CryptoDevBackendAsymSessionInfo *asym_info;
+    int64_t session_id;
+    int queue_index;
+    uint32_t algo, keytype, keylen;
+    g_autofree uint8_t *key = NULL;
+    Error *local_err = NULL;
+
+    algo = ldl_le_p(&sess_req->para.algo);
+    keytype = ldl_le_p(&sess_req->para.keytype);
+    keylen = ldl_le_p(&sess_req->para.keylen);
+
+    if ((keytype != VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PUBLIC)
+         && (keytype != VIRTIO_CRYPTO_AKCIPHER_KEY_TYPE_PRIVATE)) {
+        error_report("unsupported asym keytype: %d", keytype);
+        return -VIRTIO_CRYPTO_NOTSUPP;
+    }
+
+    if (keylen) {
+        key = g_malloc(keylen);
+        if (iov_to_buf(iov, out_num, 0, key, keylen) != keylen) {
+            virtio_error(vdev, "virtio-crypto asym key incorrect");
+            return -EFAULT;
+        }
+        iov_discard_front(&iov, &out_num, keylen);
+    }
+
+    info.op_code = opcode;
+    asym_info = &info.u.asym_sess_info;
+    asym_info->algo = algo;
+    asym_info->keytype = keytype;
+    asym_info->keylen = keylen;
+    asym_info->key = key;
+    switch (asym_info->algo) {
+    case VIRTIO_CRYPTO_AKCIPHER_RSA:
+        asym_info->u.rsa.padding_algo =
+            ldl_le_p(&sess_req->para.u.rsa.padding_algo);
+        asym_info->u.rsa.hash_algo =
+            ldl_le_p(&sess_req->para.u.rsa.hash_algo);
+        break;
+
+    /* TODO DSA&ECDSA handling */
+
+    default:
+        return -VIRTIO_CRYPTO_ERR;
+    }
+
+    queue_index = virtio_crypto_vq2q(queue_id);
+    session_id = cryptodev_backend_create_session(vcrypto->cryptodev, &info,
+                     queue_index, &local_err);
+    if (session_id < 0) {
+        if (local_err) {
+            error_report_err(local_err);
+        }
+        return -VIRTIO_CRYPTO_ERR;
+    }
+
+    return session_id;
+}
+
 static uint8_t
 virtio_crypto_handle_close_session(VirtIOCrypto *vcrypto,
          struct virtio_crypto_destroy_session_req *close_sess_req,
@@ -195,7 +262,7 @@ virtio_crypto_handle_close_session(VirtIOCrypto *vcrypto,
     session_id = ldq_le_p(&close_sess_req->session_id);
     DPRINTF("close session, id=%" PRIu64 "\n", session_id);
 
-    ret = cryptodev_backend_sym_close_session(
+    ret = cryptodev_backend_close_session(
               vcrypto->cryptodev, session_id, queue_id, &local_err);
     if (ret == 0) {
         status = VIRTIO_CRYPTO_OK;
@@ -260,13 +327,22 @@ static void virtio_crypto_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
         opcode = ldl_le_p(&ctrl.header.opcode);
         queue_id = ldl_le_p(&ctrl.header.queue_id);
 
+        memset(&input, 0, sizeof(input));
         switch (opcode) {
         case VIRTIO_CRYPTO_CIPHER_CREATE_SESSION:
-            memset(&input, 0, sizeof(input));
             session_id = virtio_crypto_create_sym_session(vcrypto,
                              &ctrl.u.sym_create_session,
                              queue_id, opcode,
                              out_iov, out_num);
+            goto check_session;
+
+        case VIRTIO_CRYPTO_AKCIPHER_CREATE_SESSION:
+            session_id = virtio_crypto_create_asym_session(vcrypto,
+                             &ctrl.u.akcipher_create_session,
+                             queue_id, opcode,
+                             out_iov, out_num);
+
+check_session:
             /* Serious errors, need to reset virtio crypto device */
             if (session_id == -EFAULT) {
                 virtqueue_detach_element(vq, elem, 0);
@@ -290,10 +366,12 @@ static void virtio_crypto_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
             virtqueue_push(vq, elem, sizeof(input));
             virtio_notify(vdev, vq);
             break;
+
         case VIRTIO_CRYPTO_CIPHER_DESTROY_SESSION:
         case VIRTIO_CRYPTO_HASH_DESTROY_SESSION:
         case VIRTIO_CRYPTO_MAC_DESTROY_SESSION:
         case VIRTIO_CRYPTO_AEAD_DESTROY_SESSION:
+        case VIRTIO_CRYPTO_AKCIPHER_DESTROY_SESSION:
             status = virtio_crypto_handle_close_session(vcrypto,
                    &ctrl.u.destroy_session, queue_id);
             /* The status only occupy one byte, we can directly use it */
@@ -311,7 +389,6 @@ static void virtio_crypto_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
         case VIRTIO_CRYPTO_AEAD_CREATE_SESSION:
         default:
             error_report("virtio-crypto unsupported ctrl opcode: %d", opcode);
-            memset(&input, 0, sizeof(input));
             stl_le_p(&input.status, VIRTIO_CRYPTO_NOTSUPP);
             s = iov_from_buf(in_iov, in_num, 0, &input, sizeof(input));
             if (unlikely(s != sizeof(input))) {
@@ -339,28 +416,39 @@ static void virtio_crypto_init_request(VirtIOCrypto *vcrypto, VirtQueue *vq,
     req->in_num = 0;
     req->in_len = 0;
     req->flags = CRYPTODEV_BACKEND_ALG__MAX;
-    req->u.sym_op_info = NULL;
+    memset(&req->op_info, 0x00, sizeof(req->op_info));
 }
 
 static void virtio_crypto_free_request(VirtIOCryptoReq *req)
 {
-    if (req) {
-        if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
-            size_t max_len;
-            CryptoDevBackendSymOpInfo *op_info = req->u.sym_op_info;
+    if (!req) {
+        return;
+    }
 
-            max_len = op_info->iv_len +
-                      op_info->aad_len +
-                      op_info->src_len +
-                      op_info->dst_len +
-                      op_info->digest_result_len;
+    if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
+        size_t max_len;
+        CryptoDevBackendSymOpInfo *op_info = req->op_info.u.sym_op_info;
 
-            /* Zeroize and free request data structure */
-            memset(op_info, 0, sizeof(*op_info) + max_len);
+        max_len = op_info->iv_len +
+                  op_info->aad_len +
+                  op_info->src_len +
+                  op_info->dst_len +
+                  op_info->digest_result_len;
+
+        /* Zeroize and free request data structure */
+        memset(op_info, 0, sizeof(*op_info) + max_len);
+        g_free(op_info);
+    } else if (req->flags == CRYPTODEV_BACKEND_ALG_ASYM) {
+        CryptoDevBackendAsymOpInfo *op_info = req->op_info.u.asym_op_info;
+        if (op_info) {
+            g_free(op_info->src);
+            g_free(op_info->dst);
+            memset(op_info, 0, sizeof(*op_info));
             g_free(op_info);
         }
-        g_free(req);
     }
+
+    g_free(req);
 }
 
 static void
@@ -397,6 +485,35 @@ virtio_crypto_sym_input_data_helper(VirtIODevice *vdev,
     }
 }
 
+static void
+virtio_crypto_akcipher_input_data_helper(VirtIODevice *vdev,
+        VirtIOCryptoReq *req, int32_t status,
+        CryptoDevBackendAsymOpInfo *asym_op_info)
+{
+    size_t s, len;
+
+    if (status != VIRTIO_CRYPTO_OK) {
+        return;
+    }
+
+    len = asym_op_info->dst_len;
+    if (!len) {
+        return;
+    }
+
+    s = iov_from_buf(req->in_iov, req->in_num, 0, asym_op_info->dst, len);
+    if (s != len) {
+        virtio_error(vdev, "virtio-crypto asym dest data incorrect");
+        return;
+    }
+
+    iov_discard_front(&req->in_iov, &req->in_num, len);
+
+    /* For akcipher, dst_len may be changed after operation */
+    req->in_len = sizeof(struct virtio_crypto_inhdr) + asym_op_info->dst_len;
+}
+
+
 static void virtio_crypto_req_complete(VirtIOCryptoReq *req, uint8_t status)
 {
     VirtIOCrypto *vcrypto = req->vcrypto;
@@ -404,7 +521,10 @@ static void virtio_crypto_req_complete(VirtIOCryptoReq *req, uint8_t status)
 
     if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
         virtio_crypto_sym_input_data_helper(vdev, req, status,
-                                            req->u.sym_op_info);
+                                            req->op_info.u.sym_op_info);
+    } else if (req->flags == CRYPTODEV_BACKEND_ALG_ASYM) {
+        virtio_crypto_akcipher_input_data_helper(vdev, req, status,
+                                             req->op_info.u.asym_op_info);
     }
     stb_p(&req->in->status, status);
     virtqueue_push(req->vq, &req->elem, req->in_len);
@@ -543,41 +663,100 @@ err:
 static int
 virtio_crypto_handle_sym_req(VirtIOCrypto *vcrypto,
                struct virtio_crypto_sym_data_req *req,
-               CryptoDevBackendSymOpInfo **sym_op_info,
+               CryptoDevBackendOpInfo *op_info,
                struct iovec *iov, unsigned int out_num)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(vcrypto);
+    CryptoDevBackendSymOpInfo *sym_op_info;
     uint32_t op_type;
-    CryptoDevBackendSymOpInfo *op_info;
 
     op_type = ldl_le_p(&req->op_type);
-
     if (op_type == VIRTIO_CRYPTO_SYM_OP_CIPHER) {
-        op_info = virtio_crypto_sym_op_helper(vdev, &req->u.cipher.para,
+        sym_op_info = virtio_crypto_sym_op_helper(vdev, &req->u.cipher.para,
                                               NULL, iov, out_num);
-        if (!op_info) {
+        if (!sym_op_info) {
             return -EFAULT;
         }
-        op_info->op_type = op_type;
     } else if (op_type == VIRTIO_CRYPTO_SYM_OP_ALGORITHM_CHAINING) {
-        op_info = virtio_crypto_sym_op_helper(vdev, NULL,
+        sym_op_info = virtio_crypto_sym_op_helper(vdev, NULL,
                                               &req->u.chain.para,
                                               iov, out_num);
-        if (!op_info) {
+        if (!sym_op_info) {
             return -EFAULT;
         }
-        op_info->op_type = op_type;
     } else {
         /* VIRTIO_CRYPTO_SYM_OP_NONE */
         error_report("virtio-crypto unsupported cipher type");
         return -VIRTIO_CRYPTO_NOTSUPP;
     }
 
-    *sym_op_info = op_info;
+    sym_op_info->op_type = op_type;
+    op_info->u.sym_op_info = sym_op_info;
 
     return 0;
 }
 
+static int
+virtio_crypto_handle_asym_req(VirtIOCrypto *vcrypto,
+               struct virtio_crypto_akcipher_data_req *req,
+               CryptoDevBackendOpInfo *op_info,
+               struct iovec *iov, unsigned int out_num)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(vcrypto);
+    CryptoDevBackendAsymOpInfo *asym_op_info;
+    uint32_t src_len;
+    uint32_t dst_len;
+    uint32_t len;
+    uint8_t *src = NULL;
+    uint8_t *dst = NULL;
+
+    asym_op_info = g_malloc0(sizeof(CryptoDevBackendAsymOpInfo));
+    src_len = ldl_le_p(&req->para.src_data_len);
+    dst_len = ldl_le_p(&req->para.dst_data_len);
+
+    if (src_len > 0) {
+        src = g_malloc0(src_len);
+        len = iov_to_buf(iov, out_num, 0, src, src_len);
+        if (unlikely(len != src_len)) {
+            virtio_error(vdev, "virtio-crypto asym src data incorrect"
+                         "expected %u, actual %u", src_len, len);
+            goto err;
+        }
+
+        iov_discard_front(&iov, &out_num, src_len);
+    }
+
+    if (dst_len > 0) {
+        dst = g_malloc0(dst_len);
+
+        if (op_info->op_code == VIRTIO_CRYPTO_AKCIPHER_VERIFY) {
+            len = iov_to_buf(iov, out_num, 0, dst, dst_len);
+            if (unlikely(len != dst_len)) {
+                virtio_error(vdev, "virtio-crypto asym dst data incorrect"
+                             "expected %u, actual %u", dst_len, len);
+                goto err;
+            }
+
+            iov_discard_front(&iov, &out_num, dst_len);
+        }
+    }
+
+    asym_op_info->src_len = src_len;
+    asym_op_info->dst_len = dst_len;
+    asym_op_info->src = src;
+    asym_op_info->dst = dst;
+    op_info->u.asym_op_info = asym_op_info;
+
+    return 0;
+
+ err:
+    g_free(asym_op_info);
+    g_free(src);
+    g_free(dst);
+
+    return -EFAULT;
+}
+
 static int
 virtio_crypto_handle_request(VirtIOCryptoReq *request)
 {
@@ -595,8 +774,7 @@ virtio_crypto_handle_request(VirtIOCryptoReq *request)
     unsigned out_num;
     uint32_t opcode;
     uint8_t status = VIRTIO_CRYPTO_ERR;
-    uint64_t session_id;
-    CryptoDevBackendSymOpInfo *sym_op_info = NULL;
+    CryptoDevBackendOpInfo *op_info = &request->op_info;
     Error *local_err = NULL;
 
     if (elem->out_num < 1 || elem->in_num < 1) {
@@ -639,15 +817,28 @@ virtio_crypto_handle_request(VirtIOCryptoReq *request)
     request->in_iov = in_iov;
 
     opcode = ldl_le_p(&req.header.opcode);
-    session_id = ldq_le_p(&req.header.session_id);
+    op_info->session_id = ldq_le_p(&req.header.session_id);
+    op_info->op_code = opcode;
 
     switch (opcode) {
     case VIRTIO_CRYPTO_CIPHER_ENCRYPT:
     case VIRTIO_CRYPTO_CIPHER_DECRYPT:
+        op_info->algtype = request->flags = CRYPTODEV_BACKEND_ALG_SYM;
         ret = virtio_crypto_handle_sym_req(vcrypto,
-                         &req.u.sym_req,
-                         &sym_op_info,
+                         &req.u.sym_req, op_info,
                          out_iov, out_num);
+        goto check_result;
+
+    case VIRTIO_CRYPTO_AKCIPHER_ENCRYPT:
+    case VIRTIO_CRYPTO_AKCIPHER_DECRYPT:
+    case VIRTIO_CRYPTO_AKCIPHER_SIGN:
+    case VIRTIO_CRYPTO_AKCIPHER_VERIFY:
+        op_info->algtype = request->flags = CRYPTODEV_BACKEND_ALG_ASYM;
+        ret = virtio_crypto_handle_asym_req(vcrypto,
+                         &req.u.akcipher_req, op_info,
+                         out_iov, out_num);
+
+check_result:
         /* Serious errors, need to reset virtio crypto device */
         if (ret == -EFAULT) {
             return -1;
@@ -655,11 +846,8 @@ virtio_crypto_handle_request(VirtIOCryptoReq *request)
             virtio_crypto_req_complete(request, VIRTIO_CRYPTO_NOTSUPP);
             virtio_crypto_free_request(request);
         } else {
-            sym_op_info->session_id = session_id;
 
             /* Set request's parameter */
-            request->flags = CRYPTODEV_BACKEND_ALG_SYM;
-            request->u.sym_op_info = sym_op_info;
             ret = cryptodev_backend_crypto_operation(vcrypto->cryptodev,
                                     request, queue_index, &local_err);
             if (ret < 0) {
@@ -674,6 +862,7 @@ virtio_crypto_handle_request(VirtIOCryptoReq *request)
             virtio_crypto_free_request(request);
         }
         break;
+
     case VIRTIO_CRYPTO_HASH:
     case VIRTIO_CRYPTO_MAC:
     case VIRTIO_CRYPTO_AEAD_ENCRYPT:
@@ -779,6 +968,7 @@ static void virtio_crypto_init_config(VirtIODevice *vdev)
     vcrypto->conf.mac_algo_l = vcrypto->conf.cryptodev->conf.mac_algo_l;
     vcrypto->conf.mac_algo_h = vcrypto->conf.cryptodev->conf.mac_algo_h;
     vcrypto->conf.aead_algo = vcrypto->conf.cryptodev->conf.aead_algo;
+    vcrypto->conf.akcipher_algo = vcrypto->conf.cryptodev->conf.akcipher_algo;
     vcrypto->conf.max_cipher_key_len =
                   vcrypto->conf.cryptodev->conf.max_cipher_key_len;
     vcrypto->conf.max_auth_key_len =
@@ -891,6 +1081,7 @@ static void virtio_crypto_get_config(VirtIODevice *vdev, uint8_t *config)
     stl_le_p(&crypto_cfg.max_cipher_key_len, c->conf.max_cipher_key_len);
     stl_le_p(&crypto_cfg.max_auth_key_len, c->conf.max_auth_key_len);
     stq_le_p(&crypto_cfg.max_size, c->conf.max_size);
+    stl_le_p(&crypto_cfg.akcipher_algo, c->conf.akcipher_algo);
 
     memcpy(config, &crypto_cfg, c->config_size);
 }
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 09/10] vhost: also check queue state in the vhost_dev_set_log error routine
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (7 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 08/10] crypto: Introduce RSA algorithm Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 16:57 ` [PULL 10/10] acpi/erst: fix fallthrough code upon validation failure Michael S. Tsirkin
  2022-06-16 20:46 ` [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Richard Henderson
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Ni Xun, Zhigang Lu

From: Ni Xun <richardni@tencent.com>

When check queue state in the vhost_dev_set_log routine, it miss the error
routine check, this patch also check queue state in error case.

Fixes: 1e5a050f5798 ("check queue state in the vhost_dev_set_log routine")
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Zhigang Lu <tonnylu@tencent.com>
Message-Id: <OS0PR01MB57139163F3F3955960675B52EAA79@OS0PR01MB5713.jpnprd01.prod.outlook.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/vhost.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index dd3263df56..6c41fa13e3 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -886,6 +886,10 @@ static int vhost_dev_set_log(struct vhost_dev *dev, bool enable_log)
 err_vq:
     for (; i >= 0; --i) {
         idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + i);
+        addr = virtio_queue_get_desc_addr(dev->vdev, idx);
+        if (!addr) {
+            continue;
+        }
         vhost_virtqueue_set_addr(dev, dev->vqs + i, idx,
                                  dev->log_enabled);
     }
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PULL 10/10] acpi/erst: fix fallthrough code upon validation failure
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (8 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 09/10] vhost: also check queue state in the vhost_dev_set_log error routine Michael S. Tsirkin
@ 2022-06-16 16:57 ` Michael S. Tsirkin
  2022-06-16 20:46 ` [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Richard Henderson
  10 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-06-16 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Ani Sinha, Eric DeVolder, Igor Mammedov

From: Ani Sinha <ani@anisinha.ca>

At any step when any validation fail in check_erst_backend_storage(), there is
no need to continue further through other validation checks. Further, by
continuing even when record_size is 0, we run the risk of triggering a divide
by zero error if we continued with other validation checks. Hence, we should
simply return from this function upon validation failure.

CC: Peter Maydell <peter.maydell@linaro.org>
CC: Eric DeVolder <eric.devolder@oracle.com>
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20220513141005.1929422-1-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric DeVolder <eric.devolder@oracle.com>
---
 hw/acpi/erst.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
index de509c2b48..df856b2669 100644
--- a/hw/acpi/erst.c
+++ b/hw/acpi/erst.c
@@ -440,6 +440,7 @@ static void check_erst_backend_storage(ERSTDeviceState *s, Error **errp)
         (record_size >= 4096) /* PAGE_SIZE */
         )) {
         error_setg(errp, "ERST record_size %u is invalid", record_size);
+        return;
     }
 
     /* Validity check header */
@@ -450,6 +451,7 @@ static void check_erst_backend_storage(ERSTDeviceState *s, Error **errp)
         (le16_to_cpu(header->reserved) == 0)
         )) {
         error_setg(errp, "ERST backend storage header is invalid");
+        return;
     }
 
     /* Check storage_size against record_size */
@@ -457,6 +459,7 @@ static void check_erst_backend_storage(ERSTDeviceState *s, Error **errp)
          (record_size > s->storage_size)) {
         error_setg(errp, "ACPI ERST requires storage size be multiple of "
             "record size (%uKiB)", record_size);
+        return;
     }
 
     /* Compute offset of first and last record storage slot */
-- 
MST



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PULL 00/10] virtio,pc,pci: fixes,cleanups,features
  2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
                   ` (9 preceding siblings ...)
  2022-06-16 16:57 ` [PULL 10/10] acpi/erst: fix fallthrough code upon validation failure Michael S. Tsirkin
@ 2022-06-16 20:46 ` Richard Henderson
  10 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2022-06-16 20:46 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel; +Cc: Peter Maydell

On 6/16/22 09:57, Michael S. Tsirkin wrote:
> The following changes since commit def6fd6c9ce9e00a30cdd0066e0fde206b3f3d2f:
> 
>    Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2022-06-16 07:13:04 -0700)
> 
> are available in the Git repository at:
> 
>    git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
> 
> for you to fetch changes up to 8c97e4deeca9ad791ab369d3879ebfb0267b24ca:
> 
>    acpi/erst: fix fallthrough code upon validation failure (2022-06-16 12:54:58 -0400)
> 
> ----------------------------------------------------------------
> virtio,pc,pci: fixes,cleanups,features
> 
> more CXL patches
> RSA support for crypto
> fixes, cleanups all over the place

Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/7.1 as appropriate.


r~


> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ----------------------------------------------------------------
> Ani Sinha (1):
>        acpi/erst: fix fallthrough code upon validation failure
> 
> Jonathan Cameron (3):
>        pci-bridge/cxl_upstream: Add a CXL switch upstream port
>        pci-bridge/cxl_downstream: Add a CXL switch downstream port
>        docs/cxl: Add switch documentation
> 
> Ni Xun (1):
>        vhost: also check queue state in the vhost_dev_set_log error routine
> 
> Yajun Wu (1):
>        virtio/vhost-user: Fix wrong vhost notifier GPtrArray size
> 
> Zhenwei Pi (1):
>        crypto: Introduce RSA algorithm
> 
> Zhenzhong Duan (3):
>        virtio-iommu: Add bypass mode support to assigned device
>        virtio-iommu: Use recursive lock to avoid deadlock
>        virtio-iommu: Add an assert check in translate routine
> 
>   include/hw/cxl/cxl.h              |   5 +
>   include/hw/virtio/virtio-crypto.h |   5 +-
>   include/hw/virtio/virtio-iommu.h  |   4 +-
>   include/sysemu/cryptodev.h        |  83 ++++++++--
>   backends/cryptodev-builtin.c      | 276 ++++++++++++++++++++++++++++-----
>   backends/cryptodev-vhost-user.c   |  34 +++-
>   backends/cryptodev.c              |  32 ++--
>   hw/acpi/erst.c                    |   3 +
>   hw/cxl/cxl-host.c                 |  43 ++++-
>   hw/pci-bridge/cxl_downstream.c    | 249 +++++++++++++++++++++++++++++
>   hw/pci-bridge/cxl_upstream.c      | 216 ++++++++++++++++++++++++++
>   hw/virtio/vhost-user.c            |   2 +-
>   hw/virtio/vhost.c                 |   4 +
>   hw/virtio/virtio-crypto.c         | 319 ++++++++++++++++++++++++++++++--------
>   hw/virtio/virtio-iommu.c          | 135 ++++++++++++++--
>   docs/system/devices/cxl.rst       |  88 ++++++++++-
>   hw/pci-bridge/meson.build         |   2 +-
>   hw/virtio/trace-events            |   1 +
>   18 files changed, 1343 insertions(+), 158 deletions(-)
>   create mode 100644 hw/pci-bridge/cxl_downstream.c
>   create mode 100644 hw/pci-bridge/cxl_upstream.c
> 
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-06-16 16:57 ` [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port Michael S. Tsirkin
@ 2022-11-04  6:47   ` Thomas Huth
  2022-12-04  7:23     ` Thomas Huth
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2022-11-04  6:47 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel, Jonathan Cameron
  Cc: Peter Maydell, Ben Widawsky, Marcel Apfelbaum

On 16/06/2022 18.57, Michael S. Tsirkin wrote:
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> Emulation of a simple CXL Switch downstream port.
> The Device ID has been allocated for this use.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   hw/cxl/cxl-host.c              |  43 +++++-
>   hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
>   hw/pci-bridge/meson.build      |   2 +-
>   3 files changed, 291 insertions(+), 3 deletions(-)
>   create mode 100644 hw/pci-bridge/cxl_downstream.c

  Hi!

There is a memory problem somewhere in this new device. I can make QEMU 
crash by running something like this:

$ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
     -display none -monitor stdio
QEMU 7.1.50 monitor - type 'help' for more information
(qemu) device_add cxl-downstream
./qemu/qom/object.c:1188:5: runtime error: member access within misaligned 
address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 byte 
alignment
0x3b3b3b3b3b3b3b3b: note: pointer points here
<memory cannot be printed>
Bus error (core dumped)

Could you have a look if you've got some spare minutes?

  Thomas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-11-04  6:47   ` Thomas Huth
@ 2022-12-04  7:23     ` Thomas Huth
  2022-12-05 10:54       ` Jonathan Cameron via
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2022-12-04  7:23 UTC (permalink / raw)
  To: Michael S. Tsirkin, qemu-devel, Jonathan Cameron
  Cc: Peter Maydell, Ben Widawsky, Marcel Apfelbaum

On 04/11/2022 07.47, Thomas Huth wrote:
> On 16/06/2022 18.57, Michael S. Tsirkin wrote:
>> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>
>> Emulation of a simple CXL Switch downstream port.
>> The Device ID has been allocated for this use.
>>
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> ---
>>   hw/cxl/cxl-host.c              |  43 +++++-
>>   hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
>>   hw/pci-bridge/meson.build      |   2 +-
>>   3 files changed, 291 insertions(+), 3 deletions(-)
>>   create mode 100644 hw/pci-bridge/cxl_downstream.c
> 
>   Hi!
> 
> There is a memory problem somewhere in this new device. I can make QEMU 
> crash by running something like this:
> 
> $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
>      -display none -monitor stdio
> QEMU 7.1.50 monitor - type 'help' for more information
> (qemu) device_add cxl-downstream
> ./qemu/qom/object.c:1188:5: runtime error: member access within misaligned 
> address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 byte 
> alignment
> 0x3b3b3b3b3b3b3b3b: note: pointer points here
> <memory cannot be printed>
> Bus error (core dumped)
> 
> Could you have a look if you've got some spare minutes?

Ping! Jonathan, Michael, any news on this bug?

(this breaks one of my local tests, that's why it's annoying for me)

  Thomas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-12-04  7:23     ` Thomas Huth
@ 2022-12-05 10:54       ` Jonathan Cameron via
  2022-12-05 12:45         ` Jonathan Cameron via
  0 siblings, 1 reply; 21+ messages in thread
From: Jonathan Cameron via @ 2022-12-05 10:54 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Michael S. Tsirkin, qemu-devel, Peter Maydell, Ben Widawsky,
	Marcel Apfelbaum

On Sun, 4 Dec 2022 08:23:55 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 04/11/2022 07.47, Thomas Huth wrote:
> > On 16/06/2022 18.57, Michael S. Tsirkin wrote:  
> >> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>
> >> Emulation of a simple CXL Switch downstream port.
> >> The Device ID has been allocated for this use.
> >>
> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
> >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> ---
> >>   hw/cxl/cxl-host.c              |  43 +++++-
> >>   hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
> >>   hw/pci-bridge/meson.build      |   2 +-
> >>   3 files changed, 291 insertions(+), 3 deletions(-)
> >>   create mode 100644 hw/pci-bridge/cxl_downstream.c  
> > 
> >   Hi!
> > 
> > There is a memory problem somewhere in this new device. I can make QEMU 
> > crash by running something like this:
> > 
> > $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
> >      -display none -monitor stdio
> > QEMU 7.1.50 monitor - type 'help' for more information
> > (qemu) device_add cxl-downstream
> > ./qemu/qom/object.c:1188:5: runtime error: member access within misaligned 
> > address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 byte 
> > alignment
> > 0x3b3b3b3b3b3b3b3b: note: pointer points here
> > <memory cannot be printed>
> > Bus error (core dumped)
> > 
> > Could you have a look if you've got some spare minutes?  
> 
> Ping! Jonathan, Michael, any news on this bug?
> 
> (this breaks one of my local tests, that's why it's annoying for me)
Sorry, my email filters ate your earlier message.

Looking into this now. I'll note that it also happens on
device_add xio3130-downstream so not specific to this new device.

So far all I've managed to do is track it to something rcu related
as failing in a call to drain_call_rcu() in qmp_device_add()

Will continue digging.

Jonathan


> 
>   Thomas
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-12-05 10:54       ` Jonathan Cameron via
@ 2022-12-05 12:45         ` Jonathan Cameron via
  2022-12-05 14:59           ` Alex Bennée
  0 siblings, 1 reply; 21+ messages in thread
From: Jonathan Cameron via @ 2022-12-05 12:45 UTC (permalink / raw)
  To: Jonathan Cameron via
  Cc: Jonathan Cameron, Thomas Huth, Michael S. Tsirkin, Peter Maydell,
	Ben Widawsky, Marcel Apfelbaum

On Mon, 5 Dec 2022 10:54:03 +0000
Jonathan Cameron via <qemu-devel@nongnu.org> wrote:

> On Sun, 4 Dec 2022 08:23:55 +0100
> Thomas Huth <thuth@redhat.com> wrote:
> 
> > On 04/11/2022 07.47, Thomas Huth wrote:  
> > > On 16/06/2022 18.57, Michael S. Tsirkin wrote:    
> > >> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > >>
> > >> Emulation of a simple CXL Switch downstream port.
> > >> The Device ID has been allocated for this use.
> > >>
> > >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > >> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
> > >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > >> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > >> ---
> > >>   hw/cxl/cxl-host.c              |  43 +++++-
> > >>   hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
> > >>   hw/pci-bridge/meson.build      |   2 +-
> > >>   3 files changed, 291 insertions(+), 3 deletions(-)
> > >>   create mode 100644 hw/pci-bridge/cxl_downstream.c    
> > > 
> > >   Hi!
> > > 
> > > There is a memory problem somewhere in this new device. I can make QEMU 
> > > crash by running something like this:
> > > 
> > > $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
> > >      -display none -monitor stdio
> > > QEMU 7.1.50 monitor - type 'help' for more information
> > > (qemu) device_add cxl-downstream
> > > ./qemu/qom/object.c:1188:5: runtime error: member access within misaligned 
> > > address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 byte 
> > > alignment
> > > 0x3b3b3b3b3b3b3b3b: note: pointer points here
> > > <memory cannot be printed>
> > > Bus error (core dumped)
> > > 
> > > Could you have a look if you've got some spare minutes?    
> > 
> > Ping! Jonathan, Michael, any news on this bug?
> > 
> > (this breaks one of my local tests, that's why it's annoying for me)  
> Sorry, my email filters ate your earlier message.
> 
> Looking into this now. I'll note that it also happens on
> device_add xio3130-downstream so not specific to this new device.
> 
> So far all I've managed to do is track it to something rcu related
> as failing in a call to drain_call_rcu() in qmp_device_add()
> 
> Will continue digging.

Assuming I'm seeing the same thing...

Problem is g_free() on the PCIBridge windows: 
https://elixir.bootlin.com/qemu/latest/source/hw/pci/pci_bridge.c#L235

Is called before we get an rcu_call() to flatview_destroy() as a
result of the final call of flatview_unref() in address_space_set_flatview()
so we get a use after free.

As to what the fix is...  Suggestions welcome!

> 
> Jonathan
> 
> 
> > 
> >   Thomas
> >   
> 
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-12-05 12:45         ` Jonathan Cameron via
@ 2022-12-05 14:59           ` Alex Bennée
  2022-12-07 13:21             ` Jonathan Cameron via
  0 siblings, 1 reply; 21+ messages in thread
From: Alex Bennée @ 2022-12-05 14:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Thomas Huth, Michael S. Tsirkin, Peter Maydell, Ben Widawsky,
	Marcel Apfelbaum, qemu-devel, Paolo Bonzini


Jonathan Cameron via <qemu-devel@nongnu.org> writes:

> On Mon, 5 Dec 2022 10:54:03 +0000
> Jonathan Cameron via <qemu-devel@nongnu.org> wrote:
>
>> On Sun, 4 Dec 2022 08:23:55 +0100
>> Thomas Huth <thuth@redhat.com> wrote:
>> 
>> > On 04/11/2022 07.47, Thomas Huth wrote:  
>> > > On 16/06/2022 18.57, Michael S. Tsirkin wrote:    
>> > >> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> > >>
>> > >> Emulation of a simple CXL Switch downstream port.
>> > >> The Device ID has been allocated for this use.
>> > >>
>> > >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> > >> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
>> > >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> > >> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>> > >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> > >> ---
>> > >>   hw/cxl/cxl-host.c              |  43 +++++-
>> > >>   hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
>> > >>   hw/pci-bridge/meson.build      |   2 +-
>> > >>   3 files changed, 291 insertions(+), 3 deletions(-)
>> > >>   create mode 100644 hw/pci-bridge/cxl_downstream.c    
>> > > 
>> > >   Hi!
>> > > 
>> > > There is a memory problem somewhere in this new device. I can make QEMU 
>> > > crash by running something like this:
>> > > 
>> > > $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
>> > >      -display none -monitor stdio
>> > > QEMU 7.1.50 monitor - type 'help' for more information
>> > > (qemu) device_add cxl-downstream
>> > > ./qemu/qom/object.c:1188:5: runtime error: member access within misaligned 
>> > > address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 byte 
>> > > alignment
>> > > 0x3b3b3b3b3b3b3b3b: note: pointer points here
>> > > <memory cannot be printed>
>> > > Bus error (core dumped)
>> > > 
>> > > Could you have a look if you've got some spare minutes?    
>> > 
>> > Ping! Jonathan, Michael, any news on this bug?
>> > 
>> > (this breaks one of my local tests, that's why it's annoying for me)  
>> Sorry, my email filters ate your earlier message.
>> 
>> Looking into this now. I'll note that it also happens on
>> device_add xio3130-downstream so not specific to this new device.
>> 
>> So far all I've managed to do is track it to something rcu related
>> as failing in a call to drain_call_rcu() in qmp_device_add()
>> 
>> Will continue digging.
>
> Assuming I'm seeing the same thing...
>
> Problem is g_free() on the PCIBridge windows: 
> https://elixir.bootlin.com/qemu/latest/source/hw/pci/pci_bridge.c#L235
>
> Is called before we get an rcu_call() to flatview_destroy() as a
> result of the final call of flatview_unref() in address_space_set_flatview()
> so we get a use after free.
>
> As to what the fix is...  Suggestions welcome!

It sounds like this is the wrong place to free the value then. I guess
the PCI aliases into &w->alias_io() don't get dealt with until RCU
clean-up time.

I *think* using g_free_rcu() should be enough to ensure the free occurs
after the rest of the RCU cleanups but maybe you should only be cleaning
up the windows at device unrealize time? Is this a dynamic piece of
memory which gets updated during the lifetime of the device?

If the memory is being cleared with RCU then the access to the base
pointer should be done with the appropriate qatomic_rcu_[set|read]
functions.

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-12-05 14:59           ` Alex Bennée
@ 2022-12-07 13:21             ` Jonathan Cameron via
  2022-12-07 13:26               ` Thomas Huth
  0 siblings, 1 reply; 21+ messages in thread
From: Jonathan Cameron via @ 2022-12-07 13:21 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Thomas Huth, Michael S. Tsirkin, Peter Maydell, Ben Widawsky,
	Marcel Apfelbaum, qemu-devel, Paolo Bonzini

On Mon, 05 Dec 2022 14:59:39 +0000
Alex Bennée <alex.bennee@linaro.org> wrote:

> Jonathan Cameron via <qemu-devel@nongnu.org> writes:
> 
> > On Mon, 5 Dec 2022 10:54:03 +0000
> > Jonathan Cameron via <qemu-devel@nongnu.org> wrote:
> >  
> >> On Sun, 4 Dec 2022 08:23:55 +0100
> >> Thomas Huth <thuth@redhat.com> wrote:
> >>   
> >> > On 04/11/2022 07.47, Thomas Huth wrote:    
> >> > > On 16/06/2022 18.57, Michael S. Tsirkin wrote:      
> >> > >> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> > >>
> >> > >> Emulation of a simple CXL Switch downstream port.
> >> > >> The Device ID has been allocated for this use.
> >> > >>
> >> > >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> > >> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
> >> > >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> > >> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >> > >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> > >> ---
> >> > >>   hw/cxl/cxl-host.c              |  43 +++++-
> >> > >>   hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
> >> > >>   hw/pci-bridge/meson.build      |   2 +-
> >> > >>   3 files changed, 291 insertions(+), 3 deletions(-)
> >> > >>   create mode 100644 hw/pci-bridge/cxl_downstream.c      
> >> > > 
> >> > >   Hi!
> >> > > 
> >> > > There is a memory problem somewhere in this new device. I can make QEMU 
> >> > > crash by running something like this:
> >> > > 
> >> > > $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
> >> > >      -display none -monitor stdio
> >> > > QEMU 7.1.50 monitor - type 'help' for more information
> >> > > (qemu) device_add cxl-downstream
> >> > > ./qemu/qom/object.c:1188:5: runtime error: member access within misaligned 
> >> > > address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 byte 
> >> > > alignment
> >> > > 0x3b3b3b3b3b3b3b3b: note: pointer points here
> >> > > <memory cannot be printed>
> >> > > Bus error (core dumped)
> >> > > 
> >> > > Could you have a look if you've got some spare minutes?      
> >> > 
> >> > Ping! Jonathan, Michael, any news on this bug?
> >> > 
> >> > (this breaks one of my local tests, that's why it's annoying for me)    
> >> Sorry, my email filters ate your earlier message.
> >> 
> >> Looking into this now. I'll note that it also happens on
> >> device_add xio3130-downstream so not specific to this new device.
> >> 
> >> So far all I've managed to do is track it to something rcu related
> >> as failing in a call to drain_call_rcu() in qmp_device_add()
> >> 
> >> Will continue digging.  
> >
> > Assuming I'm seeing the same thing...
> >
> > Problem is g_free() on the PCIBridge windows: 
> > https://elixir.bootlin.com/qemu/latest/source/hw/pci/pci_bridge.c#L235
> >
> > Is called before we get an rcu_call() to flatview_destroy() as a
> > result of the final call of flatview_unref() in address_space_set_flatview()
> > so we get a use after free.
> >
> > As to what the fix is...  Suggestions welcome!  
> 
> It sounds like this is the wrong place to free the value then. I guess
> the PCI aliases into &w->alias_io() don't get dealt with until RCU
> clean-up time.
> 
> I *think* using g_free_rcu() should be enough to ensure the free occurs
> after the rest of the RCU cleanups but maybe you should only be cleaning
> up the windows at device unrealize time? Is this a dynamic piece of
> memory which gets updated during the lifetime of the device?

There is unfortunately code that swaps it for an updated structure
in pci_bridge_update_mappings()

> 
> If the memory is being cleared with RCU then the access to the base
> pointer should be done with the appropriate qatomic_rcu_[set|read]
> functions.
>

I'm annoyingly snowed under this week with other things, but hopefully
can get to in a few days.  Note we are looking at an old problem
here, just one that's happening for an additional device, not sure
if that really affects urgency of fix though.

Jonathan
 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-12-07 13:21             ` Jonathan Cameron via
@ 2022-12-07 13:26               ` Thomas Huth
  2023-03-24 10:17                 ` Thomas Huth
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2022-12-07 13:26 UTC (permalink / raw)
  To: Jonathan Cameron, Alex Bennée
  Cc: Michael S. Tsirkin, Peter Maydell, Ben Widawsky,
	Marcel Apfelbaum, qemu-devel, Paolo Bonzini

On 07/12/2022 14.21, Jonathan Cameron wrote:
> On Mon, 05 Dec 2022 14:59:39 +0000
> Alex Bennée <alex.bennee@linaro.org> wrote:
> 
>> Jonathan Cameron via <qemu-devel@nongnu.org> writes:
>>
>>> On Mon, 5 Dec 2022 10:54:03 +0000
>>> Jonathan Cameron via <qemu-devel@nongnu.org> wrote:
>>>   
>>>> On Sun, 4 Dec 2022 08:23:55 +0100
>>>> Thomas Huth <thuth@redhat.com> wrote:
>>>>    
>>>>> On 04/11/2022 07.47, Thomas Huth wrote:
>>>>>> On 16/06/2022 18.57, Michael S. Tsirkin wrote:
>>>>>>> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>>>>>
>>>>>>> Emulation of a simple CXL Switch downstream port.
>>>>>>> The Device ID has been allocated for this use.
>>>>>>>
>>>>>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>>>>> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> ---
>>>>>>>    hw/cxl/cxl-host.c              |  43 +++++-
>>>>>>>    hw/pci-bridge/cxl_downstream.c | 249 +++++++++++++++++++++++++++++++++
>>>>>>>    hw/pci-bridge/meson.build      |   2 +-
>>>>>>>    3 files changed, 291 insertions(+), 3 deletions(-)
>>>>>>>    create mode 100644 hw/pci-bridge/cxl_downstream.c
>>>>>>
>>>>>>    Hi!
>>>>>>
>>>>>> There is a memory problem somewhere in this new device. I can make QEMU
>>>>>> crash by running something like this:
>>>>>>
>>>>>> $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
>>>>>>       -display none -monitor stdio
>>>>>> QEMU 7.1.50 monitor - type 'help' for more information
>>>>>> (qemu) device_add cxl-downstream
>>>>>> ./qemu/qom/object.c:1188:5: runtime error: member access within misaligned
>>>>>> address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 byte
>>>>>> alignment
>>>>>> 0x3b3b3b3b3b3b3b3b: note: pointer points here
>>>>>> <memory cannot be printed>
>>>>>> Bus error (core dumped)
>>>>>>
>>>>>> Could you have a look if you've got some spare minutes?
>>>>>
>>>>> Ping! Jonathan, Michael, any news on this bug?
>>>>>
>>>>> (this breaks one of my local tests, that's why it's annoying for me)
>>>> Sorry, my email filters ate your earlier message.
>>>>
>>>> Looking into this now. I'll note that it also happens on
>>>> device_add xio3130-downstream so not specific to this new device.
>>>>
>>>> So far all I've managed to do is track it to something rcu related
>>>> as failing in a call to drain_call_rcu() in qmp_device_add()
>>>>
>>>> Will continue digging.
>>>
>>> Assuming I'm seeing the same thing...
>>>
>>> Problem is g_free() on the PCIBridge windows:
>>> https://elixir.bootlin.com/qemu/latest/source/hw/pci/pci_bridge.c#L235
>>>
>>> Is called before we get an rcu_call() to flatview_destroy() as a
>>> result of the final call of flatview_unref() in address_space_set_flatview()
>>> so we get a use after free.
>>>
>>> As to what the fix is...  Suggestions welcome!
>>
>> It sounds like this is the wrong place to free the value then. I guess
>> the PCI aliases into &w->alias_io() don't get dealt with until RCU
>> clean-up time.
>>
>> I *think* using g_free_rcu() should be enough to ensure the free occurs
>> after the rest of the RCU cleanups but maybe you should only be cleaning
>> up the windows at device unrealize time? Is this a dynamic piece of
>> memory which gets updated during the lifetime of the device?
> 
> There is unfortunately code that swaps it for an updated structure
> in pci_bridge_update_mappings()
> 
>>
>> If the memory is being cleared with RCU then the access to the base
>> pointer should be done with the appropriate qatomic_rcu_[set|read]
>> functions.
>>
> 
> I'm annoyingly snowed under this week with other things, but hopefully
> can get to in a few days.  Note we are looking at an old problem
> here, just one that's happening for an additional device, not sure
> if that really affects urgency of fix though.

It's too late now for QEMU 7.2 anyway, so there is no hurry, I think.

  Thomas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2022-12-07 13:26               ` Thomas Huth
@ 2023-03-24 10:17                 ` Thomas Huth
  2023-04-03 16:12                   ` Jonathan Cameron via
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2023-03-24 10:17 UTC (permalink / raw)
  To: Jonathan Cameron, Alex Bennée
  Cc: Michael S. Tsirkin, Peter Maydell, Ben Widawsky,
	Marcel Apfelbaum, qemu-devel, Paolo Bonzini

On 07/12/2022 14.26, Thomas Huth wrote:
> On 07/12/2022 14.21, Jonathan Cameron wrote:
>> On Mon, 05 Dec 2022 14:59:39 +0000
>> Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>>> Jonathan Cameron via <qemu-devel@nongnu.org> writes:
>>>
>>>> On Mon, 5 Dec 2022 10:54:03 +0000
>>>> Jonathan Cameron via <qemu-devel@nongnu.org> wrote:
>>>>> On Sun, 4 Dec 2022 08:23:55 +0100
>>>>> Thomas Huth <thuth@redhat.com> wrote:
>>>>>> On 04/11/2022 07.47, Thomas Huth wrote:
>>>>>>> On 16/06/2022 18.57, Michael S. Tsirkin wrote:
>>>>>>>> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>>>>>>
>>>>>>>> Emulation of a simple CXL Switch downstream port.
>>>>>>>> The Device ID has been allocated for this use.
>>>>>>>>
>>>>>>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>>>>>> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
>>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>>> ---
>>>>>>>>    hw/cxl/cxl-host.c              |  43 +++++-
>>>>>>>>    hw/pci-bridge/cxl_downstream.c | 249 
>>>>>>>> +++++++++++++++++++++++++++++++++
>>>>>>>>    hw/pci-bridge/meson.build      |   2 +-
>>>>>>>>    3 files changed, 291 insertions(+), 3 deletions(-)
>>>>>>>>    create mode 100644 hw/pci-bridge/cxl_downstream.c
>>>>>>>
>>>>>>>    Hi!
>>>>>>>
>>>>>>> There is a memory problem somewhere in this new device. I can make QEMU
>>>>>>> crash by running something like this:
>>>>>>>
>>>>>>> $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
>>>>>>>       -display none -monitor stdio
>>>>>>> QEMU 7.1.50 monitor - type 'help' for more information
>>>>>>> (qemu) device_add cxl-downstream
>>>>>>> ./qemu/qom/object.c:1188:5: runtime error: member access within 
>>>>>>> misaligned
>>>>>>> address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 
>>>>>>> byte
>>>>>>> alignment
>>>>>>> 0x3b3b3b3b3b3b3b3b: note: pointer points here
>>>>>>> <memory cannot be printed>
>>>>>>> Bus error (core dumped)
>>>>>>>
>>>>>>> Could you have a look if you've got some spare minutes?
>>>>>>
>>>>>> Ping! Jonathan, Michael, any news on this bug?
>>>>>>
>>>>>> (this breaks one of my local tests, that's why it's annoying for me)
>>>>> Sorry, my email filters ate your earlier message.
>>>>>
>>>>> Looking into this now. I'll note that it also happens on
>>>>> device_add xio3130-downstream so not specific to this new device.
>>>>>
>>>>> So far all I've managed to do is track it to something rcu related
>>>>> as failing in a call to drain_call_rcu() in qmp_device_add()
>>>>>
>>>>> Will continue digging.
>>>>
>>>> Assuming I'm seeing the same thing...
>>>>
>>>> Problem is g_free() on the PCIBridge windows:
>>>> https://elixir.bootlin.com/qemu/latest/source/hw/pci/pci_bridge.c#L235
>>>>
>>>> Is called before we get an rcu_call() to flatview_destroy() as a
>>>> result of the final call of flatview_unref() in 
>>>> address_space_set_flatview()
>>>> so we get a use after free.
>>>>
>>>> As to what the fix is...  Suggestions welcome!
>>>
>>> It sounds like this is the wrong place to free the value then. I guess
>>> the PCI aliases into &w->alias_io() don't get dealt with until RCU
>>> clean-up time.
>>>
>>> I *think* using g_free_rcu() should be enough to ensure the free occurs
>>> after the rest of the RCU cleanups but maybe you should only be cleaning
>>> up the windows at device unrealize time? Is this a dynamic piece of
>>> memory which gets updated during the lifetime of the device?
>>
>> There is unfortunately code that swaps it for an updated structure
>> in pci_bridge_update_mappings()
>>
>>>
>>> If the memory is being cleared with RCU then the access to the base
>>> pointer should be done with the appropriate qatomic_rcu_[set|read]
>>> functions.
>>>
>>
>> I'm annoyingly snowed under this week with other things, but hopefully
>> can get to in a few days.  Note we are looking at an old problem
>> here, just one that's happening for an additional device, not sure
>> if that really affects urgency of fix though.
> 
> It's too late now for QEMU 7.2 anyway, so there is no hurry, I think.

I'm still seeing problems with this device, I guess it's still the
same issue:

$ valgrind -q ./qemu-system-x86_64 -M x-remote -monitor stdio
...
QEMU 7.2.91 monitor - type 'help' for more information
(qemu) device_add cxl-downstream,id=c1
==46154== Thread 2:
==46154== Invalid read of size 8
==46154==    at 0x7A0A0B: memory_region_unref (memory.c:1798)
==46154==    by 0x7A0A0B: flatview_destroy (memory.c:298)
==46154==    by 0x9A5E32: call_rcu_thread (rcu.c:284)
==46154==    by 0x99CC19: qemu_thread_start (qemu-thread-posix.c:541)
==46154==    by 0xB6A31C9: start_thread (in /usr/lib64/libpthread-2.28.so)
==46154==    by 0xB8F4E72: clone (in /usr/lib64/libc-2.28.so)
==46154==  Address 0x1a2a95c0 is 64 bytes inside a block of size 1,632 free'd
==46154==    at 0x4C39A93: free (vg_replace_malloc.c:872)
==46154==    by 0x79B62B1: g_free (in /usr/lib64/libglib-2.0.so.0.5600.4)
==46154==    by 0x55E06F: cxl_dsp_realize (cxl_downstream.c:201)
==46154==    by 0x5523AC: pci_qdev_realize (pci.c:2098)
==46154==    by 0x833A1E: device_set_realized (qdev.c:510)
==46154==    by 0x836DC5: property_set_bool (object.c:2285)
==46154==    by 0x838FA2: object_property_set (object.c:1420)
==46154==    by 0x83C01E: object_property_set_qobject (qom-qobject.c:28)
==46154==    by 0x839213: object_property_set_bool (object.c:1489)
==46154==    by 0x5F9787: qdev_device_add_from_qdict (qdev-monitor.c:714)
==46154==    by 0x5F98F0: qdev_device_add (qdev-monitor.c:733)
==46154==    by 0x5F99E1: qmp_device_add (qdev-monitor.c:855)
==46154==  Block was alloc'd at
==46154==    at 0x4C37135: malloc (vg_replace_malloc.c:381)
==46154==    by 0x79B61A5: g_malloc (in /usr/lib64/libglib-2.0.so.0.5600.4)
==46154==    by 0x553072: pci_bridge_region_init (pci_bridge.c:191)
==46154==    by 0x553575: pci_bridge_initfn (pci_bridge.c:388)
==46154==    by 0x55E032: cxl_dsp_realize (cxl_downstream.c:147)
==46154==    by 0x5523AC: pci_qdev_realize (pci.c:2098)
==46154==    by 0x833A1E: device_set_realized (qdev.c:510)
==46154==    by 0x836DC5: property_set_bool (object.c:2285)
==46154==    by 0x838FA2: object_property_set (object.c:1420)
==46154==    by 0x83C01E: object_property_set_qobject (qom-qobject.c:28)
==46154==    by 0x839213: object_property_set_bool (object.c:1489)
==46154==    by 0x5F9787: qdev_device_add_from_qdict (qdev-monitor.c:714)

Could we get a fix for QEMU 8.0 ?

  Thomas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port
  2023-03-24 10:17                 ` Thomas Huth
@ 2023-04-03 16:12                   ` Jonathan Cameron via
  0 siblings, 0 replies; 21+ messages in thread
From: Jonathan Cameron via @ 2023-04-03 16:12 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Alex Bennée, Michael S. Tsirkin, Peter Maydell,
	Ben Widawsky, Marcel Apfelbaum, qemu-devel, Paolo Bonzini

On Fri, 24 Mar 2023 11:17:50 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 07/12/2022 14.26, Thomas Huth wrote:
> > On 07/12/2022 14.21, Jonathan Cameron wrote:  
> >> On Mon, 05 Dec 2022 14:59:39 +0000
> >> Alex Bennée <alex.bennee@linaro.org> wrote:
> >>  
> >>> Jonathan Cameron via <qemu-devel@nongnu.org> writes:
> >>>  
> >>>> On Mon, 5 Dec 2022 10:54:03 +0000
> >>>> Jonathan Cameron via <qemu-devel@nongnu.org> wrote:  
> >>>>> On Sun, 4 Dec 2022 08:23:55 +0100
> >>>>> Thomas Huth <thuth@redhat.com> wrote:  
> >>>>>> On 04/11/2022 07.47, Thomas Huth wrote:  
> >>>>>>> On 16/06/2022 18.57, Michael S. Tsirkin wrote:  
> >>>>>>>> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>>>>>>>
> >>>>>>>> Emulation of a simple CXL Switch downstream port.
> >>>>>>>> The Device ID has been allocated for this use.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>>>>>>> Message-Id: <20220616145126.8002-3-Jonathan.Cameron@huawei.com>
> >>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>>> ---
> >>>>>>>>    hw/cxl/cxl-host.c              |  43 +++++-
> >>>>>>>>    hw/pci-bridge/cxl_downstream.c | 249 
> >>>>>>>> +++++++++++++++++++++++++++++++++
> >>>>>>>>    hw/pci-bridge/meson.build      |   2 +-
> >>>>>>>>    3 files changed, 291 insertions(+), 3 deletions(-)
> >>>>>>>>    create mode 100644 hw/pci-bridge/cxl_downstream.c  
> >>>>>>>
> >>>>>>>    Hi!
> >>>>>>>
> >>>>>>> There is a memory problem somewhere in this new device. I can make QEMU
> >>>>>>> crash by running something like this:
> >>>>>>>
> >>>>>>> $ MALLOC_PERTURB_=59 ./qemu-system-x86_64 -M x-remote \
> >>>>>>>       -display none -monitor stdio
> >>>>>>> QEMU 7.1.50 monitor - type 'help' for more information
> >>>>>>> (qemu) device_add cxl-downstream
> >>>>>>> ./qemu/qom/object.c:1188:5: runtime error: member access within 
> >>>>>>> misaligned
> >>>>>>> address 0x3b3b3b3b3b3b3b3b for type 'struct Object', which requires 8 
> >>>>>>> byte
> >>>>>>> alignment
> >>>>>>> 0x3b3b3b3b3b3b3b3b: note: pointer points here
> >>>>>>> <memory cannot be printed>
> >>>>>>> Bus error (core dumped)
> >>>>>>>
> >>>>>>> Could you have a look if you've got some spare minutes?  
> >>>>>>
> >>>>>> Ping! Jonathan, Michael, any news on this bug?
> >>>>>>
> >>>>>> (this breaks one of my local tests, that's why it's annoying for me)  
> >>>>> Sorry, my email filters ate your earlier message.
> >>>>>
> >>>>> Looking into this now. I'll note that it also happens on
> >>>>> device_add xio3130-downstream so not specific to this new device.
> >>>>>
> >>>>> So far all I've managed to do is track it to something rcu related
> >>>>> as failing in a call to drain_call_rcu() in qmp_device_add()
> >>>>>
> >>>>> Will continue digging.  
> >>>>
> >>>> Assuming I'm seeing the same thing...
> >>>>
> >>>> Problem is g_free() on the PCIBridge windows:
> >>>> https://elixir.bootlin.com/qemu/latest/source/hw/pci/pci_bridge.c#L235
> >>>>
> >>>> Is called before we get an rcu_call() to flatview_destroy() as a
> >>>> result of the final call of flatview_unref() in 
> >>>> address_space_set_flatview()
> >>>> so we get a use after free.
> >>>>
> >>>> As to what the fix is...  Suggestions welcome!  
> >>>
> >>> It sounds like this is the wrong place to free the value then. I guess
> >>> the PCI aliases into &w->alias_io() don't get dealt with until RCU
> >>> clean-up time.
> >>>
> >>> I *think* using g_free_rcu() should be enough to ensure the free occurs
> >>> after the rest of the RCU cleanups but maybe you should only be cleaning
> >>> up the windows at device unrealize time? Is this a dynamic piece of
> >>> memory which gets updated during the lifetime of the device?  
> >>
> >> There is unfortunately code that swaps it for an updated structure
> >> in pci_bridge_update_mappings()
> >>  
> >>>
> >>> If the memory is being cleared with RCU then the access to the base
> >>> pointer should be done with the appropriate qatomic_rcu_[set|read]
> >>> functions.
> >>>  
> >>
> >> I'm annoyingly snowed under this week with other things, but hopefully
> >> can get to in a few days.  Note we are looking at an old problem
> >> here, just one that's happening for an additional device, not sure
> >> if that really affects urgency of fix though.  
> > 
> > It's too late now for QEMU 7.2 anyway, so there is no hurry, I think.  
> 
> I'm still seeing problems with this device, I guess it's still the
> same issue:
> 
> $ valgrind -q ./qemu-system-x86_64 -M x-remote -monitor stdio
> ...
> QEMU 7.2.91 monitor - type 'help' for more information
> (qemu) device_add cxl-downstream,id=c1
> ==46154== Thread 2:
> ==46154== Invalid read of size 8
> ==46154==    at 0x7A0A0B: memory_region_unref (memory.c:1798)
> ==46154==    by 0x7A0A0B: flatview_destroy (memory.c:298)
> ==46154==    by 0x9A5E32: call_rcu_thread (rcu.c:284)
> ==46154==    by 0x99CC19: qemu_thread_start (qemu-thread-posix.c:541)
> ==46154==    by 0xB6A31C9: start_thread (in /usr/lib64/libpthread-2.28.so)
> ==46154==    by 0xB8F4E72: clone (in /usr/lib64/libc-2.28.so)
> ==46154==  Address 0x1a2a95c0 is 64 bytes inside a block of size 1,632 free'd
> ==46154==    at 0x4C39A93: free (vg_replace_malloc.c:872)
> ==46154==    by 0x79B62B1: g_free (in /usr/lib64/libglib-2.0.so.0.5600.4)
> ==46154==    by 0x55E06F: cxl_dsp_realize (cxl_downstream.c:201)
> ==46154==    by 0x5523AC: pci_qdev_realize (pci.c:2098)
> ==46154==    by 0x833A1E: device_set_realized (qdev.c:510)
> ==46154==    by 0x836DC5: property_set_bool (object.c:2285)
> ==46154==    by 0x838FA2: object_property_set (object.c:1420)
> ==46154==    by 0x83C01E: object_property_set_qobject (qom-qobject.c:28)
> ==46154==    by 0x839213: object_property_set_bool (object.c:1489)
> ==46154==    by 0x5F9787: qdev_device_add_from_qdict (qdev-monitor.c:714)
> ==46154==    by 0x5F98F0: qdev_device_add (qdev-monitor.c:733)
> ==46154==    by 0x5F99E1: qmp_device_add (qdev-monitor.c:855)
> ==46154==  Block was alloc'd at
> ==46154==    at 0x4C37135: malloc (vg_replace_malloc.c:381)
> ==46154==    by 0x79B61A5: g_malloc (in /usr/lib64/libglib-2.0.so.0.5600.4)
> ==46154==    by 0x553072: pci_bridge_region_init (pci_bridge.c:191)
> ==46154==    by 0x553575: pci_bridge_initfn (pci_bridge.c:388)
> ==46154==    by 0x55E032: cxl_dsp_realize (cxl_downstream.c:147)
> ==46154==    by 0x5523AC: pci_qdev_realize (pci.c:2098)
> ==46154==    by 0x833A1E: device_set_realized (qdev.c:510)
> ==46154==    by 0x836DC5: property_set_bool (object.c:2285)
> ==46154==    by 0x838FA2: object_property_set (object.c:1420)
> ==46154==    by 0x83C01E: object_property_set_qobject (qom-qobject.c:28)
> ==46154==    by 0x839213: object_property_set_bool (object.c:1489)
> ==46154==    by 0x5F9787: qdev_device_add_from_qdict (qdev-monitor.c:714)
> 
> Could we get a fix for QEMU 8.0 ?

The original option of moving over to g_free_rcu() and need to then protect
all accesses to br->windows was make a mess of things and as far as I can
immediately spot seems to be unnecessary.

Unfortunately I don't understand why the window handling is complex in the
first place.  The following patch just embeds the structure
directly in the PCI Bridge and seems to avoid the issue you have reported.

As the code to update it on the fly is protected anyway by
memory_region_transaction_begin() I don't think we care about ensuring the
exposed windows are consistent whilst we update them.

If someone else could sanity check my logic that would be great.

Thanks,

Jonathan


diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index dd5af508f9..698fd01ae6 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -184,11 +184,11 @@ static void pci_bridge_init_vga_aliases(PCIBridge *br, PCIBus *parent,
     }
 }
 
-static PCIBridgeWindows *pci_bridge_region_init(PCIBridge *br)
+static void pci_bridge_region_init(PCIBridge *br)
 {
     PCIDevice *pd = PCI_DEVICE(br);
     PCIBus *parent = pci_get_bus(pd);
-    PCIBridgeWindows *w = g_new(PCIBridgeWindows, 1);
+    PCIBridgeWindows *w = &br->windows;
     uint16_t cmd = pci_get_word(pd->config + PCI_COMMAND);
 
     pci_bridge_init_alias(br, &w->alias_pref_mem,
@@ -211,8 +211,6 @@ static PCIBridgeWindows *pci_bridge_region_init(PCIBridge *br)
                           cmd & PCI_COMMAND_IO);
 
     pci_bridge_init_vga_aliases(br, parent, w->alias_vga);
-
-    return w;
 }
 
 static void pci_bridge_region_del(PCIBridge *br, PCIBridgeWindows *w)
@@ -234,19 +232,17 @@ static void pci_bridge_region_cleanup(PCIBridge *br, PCIBridgeWindows *w)
     object_unparent(OBJECT(&w->alias_vga[QEMU_PCI_VGA_IO_LO]));
     object_unparent(OBJECT(&w->alias_vga[QEMU_PCI_VGA_IO_HI]));
     object_unparent(OBJECT(&w->alias_vga[QEMU_PCI_VGA_MEM]));
-    g_free(w);
 }
 
 void pci_bridge_update_mappings(PCIBridge *br)
 {
-    PCIBridgeWindows *w = br->windows;
-
+    PCIBridgeWindows *w = &br->windows;
     /* Make updates atomic to: handle the case of one VCPU updating the bridge
      * while another accesses an unaffected region. */
     memory_region_transaction_begin();
-    pci_bridge_region_del(br, br->windows);
+    pci_bridge_region_del(br, w);
     pci_bridge_region_cleanup(br, w);
-    br->windows = pci_bridge_region_init(br);
+    pci_bridge_region_init(br);
     memory_region_transaction_commit();
 }
 
@@ -385,7 +381,7 @@ void pci_bridge_initfn(PCIDevice *dev, const char *typename)
     sec_bus->address_space_io = &br->address_space_io;
     memory_region_init(&br->address_space_io, OBJECT(br), "pci_bridge_io",
                        4 * GiB);
-    br->windows = pci_bridge_region_init(br);
+    pci_bridge_region_init(br);
     QLIST_INIT(&sec_bus->child);
     QLIST_INSERT_HEAD(&parent->child, sec_bus, sibling);
 }
@@ -396,8 +392,8 @@ void pci_bridge_exitfn(PCIDevice *pci_dev)
     PCIBridge *s = PCI_BRIDGE(pci_dev);
     assert(QLIST_EMPTY(&s->sec_bus.child));
     QLIST_REMOVE(&s->sec_bus, sibling);
-    pci_bridge_region_del(s, s->windows);
-    pci_bridge_region_cleanup(s, s->windows);
+    pci_bridge_region_del(s, &s->windows);
+    pci_bridge_region_cleanup(s, &s->windows);
     /* object_unparent() is called automatically during device deletion */
 }
 
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index 81a058bb2c..b650748a39 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -30,6 +30,7 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/cxl/cxl.h"
 #include "qom/object.h"
+#include "qemu/rcu.h"
 
 typedef struct PCIBridgeWindows PCIBridgeWindows;
 
@@ -39,8 +40,11 @@ typedef struct PCIBridgeWindows PCIBridgeWindows;
  * as subregions.
  */
 struct PCIBridgeWindows {
+//    struct rcu_head rcu;
     MemoryRegion alias_pref_mem;
+    //   uint8_t pad;
     MemoryRegion alias_mem;
+    // uint8_t pad2;
     MemoryRegion alias_io;
     /*
      * When bridge control VGA forwarding is enabled, bridges will
@@ -73,7 +77,7 @@ struct PCIBridge {
     MemoryRegion address_space_mem;
     MemoryRegion address_space_io;
 
-    PCIBridgeWindows *windows;
+    PCIBridgeWindows windows;
 
     pci_map_irq_fn map_irq;
     const char *bus_name;

> 
>   Thomas
> 



^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-04-03 16:13 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-16 16:57 [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 01/10] pci-bridge/cxl_upstream: Add a CXL switch upstream port Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 02/10] pci-bridge/cxl_downstream: Add a CXL switch downstream port Michael S. Tsirkin
2022-11-04  6:47   ` Thomas Huth
2022-12-04  7:23     ` Thomas Huth
2022-12-05 10:54       ` Jonathan Cameron via
2022-12-05 12:45         ` Jonathan Cameron via
2022-12-05 14:59           ` Alex Bennée
2022-12-07 13:21             ` Jonathan Cameron via
2022-12-07 13:26               ` Thomas Huth
2023-03-24 10:17                 ` Thomas Huth
2023-04-03 16:12                   ` Jonathan Cameron via
2022-06-16 16:57 ` [PULL 03/10] docs/cxl: Add switch documentation Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 04/10] virtio/vhost-user: Fix wrong vhost notifier GPtrArray size Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 05/10] virtio-iommu: Add bypass mode support to assigned device Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 06/10] virtio-iommu: Use recursive lock to avoid deadlock Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 07/10] virtio-iommu: Add an assert check in translate routine Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 08/10] crypto: Introduce RSA algorithm Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 09/10] vhost: also check queue state in the vhost_dev_set_log error routine Michael S. Tsirkin
2022-06-16 16:57 ` [PULL 10/10] acpi/erst: fix fallthrough code upon validation failure Michael S. Tsirkin
2022-06-16 20:46 ` [PULL 00/10] virtio,pc,pci: fixes,cleanups,features Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.