All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/3] Introduce a new Qemu machine for RISC-V
@ 2022-04-12  2:10 ` Atish Patra
  0 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, Michael S. Tsirkin, Bin Meng, Atish Patra,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt


The RISC-V virt machine has helped RISC-V software eco system to evolve at a
rapid pace even in absense of the real hardware. It is definitely commendable.
However, the number of devices & commandline options keeps growing as a result
of that as well. That adds flexibility but will also become bit difficult
to manage in the future as more extension support will be added. As it is the
most commonly used qemu machine, it needs to support all kinds of device and
interrupts as well. Moreover, virt machine has limitations on the maximum
number of harts it can support because of all the MMIO devices it has to support.

The RISC-V IMSIC specification allows to develop machines completely relying
on MSI and don't care about the wired interrupts at all. It just requires
all the devices to be present behind a PCI bus or present themselves as platform
MSI device. The former is a more common scenario in x86 world where most
of the devices are behind PCI bus. As there is very limited MMIO device
support, it can also scale to very large number of harts.

That's why, this patch series introduces a minimalistic yet very extensible
forward looking machine called as "RISC-V Mini Computer" or "minic". The
idea is to build PC or server like systems with this machine. The machine can
work with or without virtio framework. The current implementation only
supports RV64. I am not sure if building a RV32 machine would be of interest
for such machines. The only mmio device it requires is clint to emulate
the mtimecmp.

"Naming is hard". I am not too attached with the name "minic". 
I just chose least bad one out of the few on my mind :). I am definitely
open to any other name as well. 

The other alternative to provide MSI only option to aia in the 
existing virt machine to build MSI only machines. This is certainly doable
and here is the patch that supports that kind of setup.

https://github.com/atishp04/qemu/tree/virt_imsic_only

However, it even complicates the virt machine even further with additional
command line option, branches in the code. I believe virt machine will become
very complex if we continue this path. I am interested to learn what everyone
else think.

It is needless to say that the current version of minic machine
is inspired from virt machine and tries to reuse as much as code possible.
The first patch in this series adds MSI support for serial-pci device so
console can work on such a machine. The 2nd patch moves some common functions
between minic and the virt machine to a helper file. The PATCH3 actually
implements the new minic machine.

I have not added the fw-cfg/flash support. We probably should add those
but I just wanted to start small and get the feedback first.
This is a work in progress and have few more TODO items before becoming the
new world order :)

1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
for now.
2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
3. Add MSI-X support for serial-pci device.

This series can boot Linux distros with the minic machine with or without virtio
devices with out-of-tree Linux kernel patches[1]. Here is an example commandline 

Without virtio devices (nvme, serial-pci & e1000e):
=====================================================
/scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
-display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
-kernel /scratch/workspace/linux/arch/riscv/boot/Image \
-chardev stdio,mux=on,signal=off,id=charconsole0 \
-mon chardev=charconsole0,mode=readline \
-device pci-serial,msi=true,chardev=charconsole0 \
-drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
-device nvme,serial=deadbeef,drive=disk3 \
-netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
-append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s

With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
==================================================================
/scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
-display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
-kernel /scratch/workspace/linux/arch/riscv/boot/Image \
-chardev stdio,mux=on,signal=off,id=charconsole0 \
-mon chardev=charconsole0,mode=readline \
-device pci-serial,msi=true,chardev=charconsole0 \
-drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
-device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
-netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
-append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'

The objective of this series is to engage the community to solve this problem.
Please suggest if you have another alternatve solution.

[1] https://github.com/atishp04/linux/tree/msi_only_console 

Atish Patra (3):
serial: Enable MSI capablity and option
hw/riscv: virt: Move common functions to a separate helper file
hw/riscv: Create a new qemu machine for RISC-V

configs/devices/riscv64-softmmu/default.mak |   1 +
hw/char/serial-pci.c                        |  36 +-
hw/riscv/Kconfig                            |  11 +
hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
hw/riscv/meson.build                        |   2 +
hw/riscv/minic.c                            | 438 ++++++++++++++++++++
hw/riscv/virt.c                             | 403 ++----------------
include/hw/riscv/machine_helper.h           |  87 ++++
include/hw/riscv/minic.h                    |  65 +++
include/hw/riscv/virt.h                     |  13 -
10 files changed, 1090 insertions(+), 383 deletions(-)
create mode 100644 hw/riscv/machine_helper.c
create mode 100644 hw/riscv/minic.c
create mode 100644 include/hw/riscv/machine_helper.h
create mode 100644 include/hw/riscv/minic.h

--
2.25.1



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC 0/3] Introduce a new Qemu machine for RISC-V
@ 2022-04-12  2:10 ` Atish Patra
  0 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Atish Patra, Alistair Francis, Bin Meng, Michael S. Tsirkin,
	Palmer Dabbelt, Paolo Bonzini, qemu-riscv


The RISC-V virt machine has helped RISC-V software eco system to evolve at a
rapid pace even in absense of the real hardware. It is definitely commendable.
However, the number of devices & commandline options keeps growing as a result
of that as well. That adds flexibility but will also become bit difficult
to manage in the future as more extension support will be added. As it is the
most commonly used qemu machine, it needs to support all kinds of device and
interrupts as well. Moreover, virt machine has limitations on the maximum
number of harts it can support because of all the MMIO devices it has to support.

The RISC-V IMSIC specification allows to develop machines completely relying
on MSI and don't care about the wired interrupts at all. It just requires
all the devices to be present behind a PCI bus or present themselves as platform
MSI device. The former is a more common scenario in x86 world where most
of the devices are behind PCI bus. As there is very limited MMIO device
support, it can also scale to very large number of harts.

That's why, this patch series introduces a minimalistic yet very extensible
forward looking machine called as "RISC-V Mini Computer" or "minic". The
idea is to build PC or server like systems with this machine. The machine can
work with or without virtio framework. The current implementation only
supports RV64. I am not sure if building a RV32 machine would be of interest
for such machines. The only mmio device it requires is clint to emulate
the mtimecmp.

"Naming is hard". I am not too attached with the name "minic". 
I just chose least bad one out of the few on my mind :). I am definitely
open to any other name as well. 

The other alternative to provide MSI only option to aia in the 
existing virt machine to build MSI only machines. This is certainly doable
and here is the patch that supports that kind of setup.

https://github.com/atishp04/qemu/tree/virt_imsic_only

However, it even complicates the virt machine even further with additional
command line option, branches in the code. I believe virt machine will become
very complex if we continue this path. I am interested to learn what everyone
else think.

It is needless to say that the current version of minic machine
is inspired from virt machine and tries to reuse as much as code possible.
The first patch in this series adds MSI support for serial-pci device so
console can work on such a machine. The 2nd patch moves some common functions
between minic and the virt machine to a helper file. The PATCH3 actually
implements the new minic machine.

I have not added the fw-cfg/flash support. We probably should add those
but I just wanted to start small and get the feedback first.
This is a work in progress and have few more TODO items before becoming the
new world order :)

1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
for now.
2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
3. Add MSI-X support for serial-pci device.

This series can boot Linux distros with the minic machine with or without virtio
devices with out-of-tree Linux kernel patches[1]. Here is an example commandline 

Without virtio devices (nvme, serial-pci & e1000e):
=====================================================
/scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
-display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
-kernel /scratch/workspace/linux/arch/riscv/boot/Image \
-chardev stdio,mux=on,signal=off,id=charconsole0 \
-mon chardev=charconsole0,mode=readline \
-device pci-serial,msi=true,chardev=charconsole0 \
-drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
-device nvme,serial=deadbeef,drive=disk3 \
-netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
-append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s

With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
==================================================================
/scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
-display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
-kernel /scratch/workspace/linux/arch/riscv/boot/Image \
-chardev stdio,mux=on,signal=off,id=charconsole0 \
-mon chardev=charconsole0,mode=readline \
-device pci-serial,msi=true,chardev=charconsole0 \
-drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
-device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
-netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
-append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'

The objective of this series is to engage the community to solve this problem.
Please suggest if you have another alternatve solution.

[1] https://github.com/atishp04/linux/tree/msi_only_console 

Atish Patra (3):
serial: Enable MSI capablity and option
hw/riscv: virt: Move common functions to a separate helper file
hw/riscv: Create a new qemu machine for RISC-V

configs/devices/riscv64-softmmu/default.mak |   1 +
hw/char/serial-pci.c                        |  36 +-
hw/riscv/Kconfig                            |  11 +
hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
hw/riscv/meson.build                        |   2 +
hw/riscv/minic.c                            | 438 ++++++++++++++++++++
hw/riscv/virt.c                             | 403 ++----------------
include/hw/riscv/machine_helper.h           |  87 ++++
include/hw/riscv/minic.h                    |  65 +++
include/hw/riscv/virt.h                     |  13 -
10 files changed, 1090 insertions(+), 383 deletions(-)
create mode 100644 hw/riscv/machine_helper.c
create mode 100644 hw/riscv/minic.c
create mode 100644 include/hw/riscv/machine_helper.h
create mode 100644 include/hw/riscv/minic.h

--
2.25.1



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC 1/3] serial: Enable MSI capablity and option
  2022-04-12  2:10 ` Atish Patra
@ 2022-04-12  2:10   ` Atish Patra
  -1 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, Michael S. Tsirkin, Bin Meng, Atish Patra,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

The seria-pci device doesn't support MSI. Enable the device to provide
MSI so that any platform with MSI support only can also use
this serial device. MSI can be enabled by enabling the newly introduced
device property. This will be disabled by default preserving the current
behavior of the seria-pci device.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 hw/char/serial-pci.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
index 93d6f9924425..ca93c2ce2776 100644
--- a/hw/char/serial-pci.c
+++ b/hw/char/serial-pci.c
@@ -31,6 +31,7 @@
 #include "hw/char/serial.h"
 #include "hw/irq.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
 #include "qom/object.h"
@@ -39,26 +40,54 @@ struct PCISerialState {
     PCIDevice dev;
     SerialState state;
     uint8_t prog_if;
+    bool msi_enabled;
 };
 
 #define TYPE_PCI_SERIAL "pci-serial"
 OBJECT_DECLARE_SIMPLE_TYPE(PCISerialState, PCI_SERIAL)
 
+
+static void msi_irq_handler(void *opaque, int irq_num, int level)
+{
+    PCIDevice *pci_dev = opaque;
+
+    assert(level == 0 || level == 1);
+
+    if (msi_enabled(pci_dev)) {
+        msi_notify(pci_dev, 0);
+    }
+}
+
 static void serial_pci_realize(PCIDevice *dev, Error **errp)
 {
     PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
     SerialState *s = &pci->state;
+    Error *err = NULL;
+    int ret;
 
     if (!qdev_realize(DEVICE(s), NULL, errp)) {
         return;
     }
 
     pci->dev.config[PCI_CLASS_PROG] = pci->prog_if;
-    pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
-    s->irq = pci_allocate_irq(&pci->dev);
-
+    if (pci->msi_enabled) {
+        pci->dev.config[PCI_INTERRUPT_PIN] = 0x00;
+        s->irq = qemu_allocate_irq(msi_irq_handler, &pci->dev, 100);
+    } else {
+        pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
+        s->irq = pci_allocate_irq(&pci->dev);
+    }
     memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s, "serial", 8);
     pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
+
+    if (!pci->msi_enabled) {
+        return;
+    }
+
+    ret = msi_init(&pci->dev, 0, 1, true, false, &err);
+    if (ret == -ENOTSUP) {
+        fprintf(stdout, "MSIX INIT FAILED\n");
+    }
 }
 
 static void serial_pci_exit(PCIDevice *dev)
@@ -83,6 +112,7 @@ static const VMStateDescription vmstate_pci_serial = {
 
 static Property serial_pci_properties[] = {
     DEFINE_PROP_UINT8("prog_if",  PCISerialState, prog_if, 0x02),
+    DEFINE_PROP_BOOL("msi",  PCISerialState, msi_enabled, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 1/3] serial: Enable MSI capablity and option
@ 2022-04-12  2:10   ` Atish Patra
  0 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Atish Patra, Alistair Francis, Bin Meng, Michael S. Tsirkin,
	Palmer Dabbelt, Paolo Bonzini, qemu-riscv

The seria-pci device doesn't support MSI. Enable the device to provide
MSI so that any platform with MSI support only can also use
this serial device. MSI can be enabled by enabling the newly introduced
device property. This will be disabled by default preserving the current
behavior of the seria-pci device.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 hw/char/serial-pci.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
index 93d6f9924425..ca93c2ce2776 100644
--- a/hw/char/serial-pci.c
+++ b/hw/char/serial-pci.c
@@ -31,6 +31,7 @@
 #include "hw/char/serial.h"
 #include "hw/irq.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
 #include "qom/object.h"
@@ -39,26 +40,54 @@ struct PCISerialState {
     PCIDevice dev;
     SerialState state;
     uint8_t prog_if;
+    bool msi_enabled;
 };
 
 #define TYPE_PCI_SERIAL "pci-serial"
 OBJECT_DECLARE_SIMPLE_TYPE(PCISerialState, PCI_SERIAL)
 
+
+static void msi_irq_handler(void *opaque, int irq_num, int level)
+{
+    PCIDevice *pci_dev = opaque;
+
+    assert(level == 0 || level == 1);
+
+    if (msi_enabled(pci_dev)) {
+        msi_notify(pci_dev, 0);
+    }
+}
+
 static void serial_pci_realize(PCIDevice *dev, Error **errp)
 {
     PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
     SerialState *s = &pci->state;
+    Error *err = NULL;
+    int ret;
 
     if (!qdev_realize(DEVICE(s), NULL, errp)) {
         return;
     }
 
     pci->dev.config[PCI_CLASS_PROG] = pci->prog_if;
-    pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
-    s->irq = pci_allocate_irq(&pci->dev);
-
+    if (pci->msi_enabled) {
+        pci->dev.config[PCI_INTERRUPT_PIN] = 0x00;
+        s->irq = qemu_allocate_irq(msi_irq_handler, &pci->dev, 100);
+    } else {
+        pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
+        s->irq = pci_allocate_irq(&pci->dev);
+    }
     memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s, "serial", 8);
     pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
+
+    if (!pci->msi_enabled) {
+        return;
+    }
+
+    ret = msi_init(&pci->dev, 0, 1, true, false, &err);
+    if (ret == -ENOTSUP) {
+        fprintf(stdout, "MSIX INIT FAILED\n");
+    }
 }
 
 static void serial_pci_exit(PCIDevice *dev)
@@ -83,6 +112,7 @@ static const VMStateDescription vmstate_pci_serial = {
 
 static Property serial_pci_properties[] = {
     DEFINE_PROP_UINT8("prog_if",  PCISerialState, prog_if, 0x02),
+    DEFINE_PROP_BOOL("msi",  PCISerialState, msi_enabled, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 2/3] hw/riscv: virt: Move common functions to a separate helper file
  2022-04-12  2:10 ` Atish Patra
@ 2022-04-12  2:10   ` Atish Patra
  -1 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, Michael S. Tsirkin, Bin Meng, Atish Patra,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

The virt machine has many generic functions that can be used by
other machines. Move these functions to a helper file so that other
machines can use it in the future.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 hw/riscv/machine_helper.c         | 417 ++++++++++++++++++++++++++++++
 hw/riscv/meson.build              |   1 +
 hw/riscv/virt.c                   | 403 +++--------------------------
 include/hw/riscv/machine_helper.h |  87 +++++++
 include/hw/riscv/virt.h           |  13 -
 5 files changed, 541 insertions(+), 380 deletions(-)
 create mode 100644 hw/riscv/machine_helper.c
 create mode 100644 include/hw/riscv/machine_helper.h

diff --git a/hw/riscv/machine_helper.c b/hw/riscv/machine_helper.c
new file mode 100644
index 000000000000..d8e6b87f1a48
--- /dev/null
+++ b/hw/riscv/machine_helper.c
@@ -0,0 +1,417 @@
+/*
+ * QEMU machine helper
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/boards.h"
+#include "hw/loader.h"
+#include "hw/sysbus.h"
+#include "hw/qdev-properties.h"
+#include "target/riscv/cpu.h"
+#include "hw/riscv/riscv_hart.h"
+#include "hw/riscv/virt.h"
+#include "hw/riscv/boot.h"
+#include "hw/riscv/numa.h"
+#include "hw/riscv/machine_helper.h"
+#include "hw/intc/riscv_imsic.h"
+#include "chardev/char.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "hw/pci/pci.h"
+#include "hw/pci-host/gpex.h"
+#include "hw/display/ramfb.h"
+
+static inline DeviceState *__gpex_pcie_common(MemoryRegion *sys_mem,
+                                              PcieInitData *data)
+{
+    DeviceState *dev;
+    MemoryRegion *ecam_alias, *ecam_reg;
+    MemoryRegion *mmio_alias, *high_mmio_alias, *mmio_reg;
+
+    dev = qdev_new(TYPE_GPEX_HOST);
+
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+
+    ecam_alias = g_new0(MemoryRegion, 1);
+    ecam_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+    memory_region_init_alias(ecam_alias, OBJECT(dev), "pcie-ecam",
+                             ecam_reg, 0, data->pcie_ecam.size);
+    memory_region_add_subregion(get_system_memory(), data->pcie_ecam.base,
+                                ecam_alias);
+
+    mmio_alias = g_new0(MemoryRegion, 1);
+    mmio_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 1);
+    memory_region_init_alias(mmio_alias, OBJECT(dev), "pcie-mmio",
+                             mmio_reg, data->pcie_mmio.base,
+                             data->pcie_mmio.size);
+    memory_region_add_subregion(get_system_memory(), data->pcie_mmio.base,
+                                mmio_alias);
+
+    /* Map high MMIO space */
+    high_mmio_alias = g_new0(MemoryRegion, 1);
+    memory_region_init_alias(high_mmio_alias, OBJECT(dev), "pcie-mmio-high",
+                             mmio_reg, data->pcie_high_mmio.base,
+                             data->pcie_high_mmio.size);
+    memory_region_add_subregion(get_system_memory(), data->pcie_high_mmio.base,
+                                high_mmio_alias);
+
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 2, data->pcie_pio.base);
+
+    return dev;
+}
+
+DeviceState *riscv_gpex_pcie_msi_init(MemoryRegion *sys_mem,
+                                      PcieInitData *data)
+{
+    return __gpex_pcie_common(sys_mem, data);
+}
+
+DeviceState *riscv_gpex_pcie_intx_init(MemoryRegion *sys_mem,
+                                       PcieInitData *data, DeviceState *irqchip)
+{
+    qemu_irq irq;
+    int i;
+    DeviceState *dev;
+
+    dev = __gpex_pcie_common(sys_mem, data);
+    for (i = 0; i < GPEX_NUM_IRQS; i++) {
+        irq = qdev_get_gpio_in(irqchip, PCIE_IRQ + i);
+
+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, irq);
+        gpex_set_irq_num(GPEX_HOST(dev), i, PCIE_IRQ + i);
+    }
+
+    return dev;
+}
+
+uint32_t riscv_imsic_num_bits(uint32_t count)
+{
+    uint32_t ret = 0;
+
+    while (BIT(ret) < count) {
+        ret++;
+    }
+
+    return ret;
+}
+
+void riscv_create_fdt_imsic(MachineState *mc, RISCVHartArrayState *soc,
+                            uint32_t *phandle, uint32_t *intc_phandles,
+                            uint32_t *msi_m_phandle, uint32_t *msi_s_phandle,
+                            ImsicInitData *data)
+{
+    int cpu, socket;
+    char *imsic_name;
+    uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
+    uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
+
+    *msi_m_phandle = (*phandle)++;
+    *msi_s_phandle = (*phandle)++;
+    imsic_cells = g_new0(uint32_t, mc->smp.cpus * 2);
+    imsic_regs = g_new0(uint32_t, riscv_socket_count(mc) * 4);
+
+    /* M-level IMSIC node */
+    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
+        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
+        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
+    }
+    imsic_max_hart_per_socket = 0;
+    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
+        imsic_addr = data->imsic_m.base + socket * data->group_max_size;
+        imsic_size = IMSIC_HART_SIZE(0) * soc[socket].num_harts;
+        imsic_regs[socket * 4 + 0] = 0;
+        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
+        imsic_regs[socket * 4 + 2] = 0;
+        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
+        if (imsic_max_hart_per_socket < soc[socket].num_harts) {
+            imsic_max_hart_per_socket = soc[socket].num_harts;
+        }
+    }
+    imsic_name = g_strdup_printf("/soc/imsics@%lx",
+        (unsigned long)data->imsic_m.base);
+    qemu_fdt_add_subnode(mc->fdt, imsic_name);
+    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible",
+        "riscv,imsics");
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
+        FDT_IMSIC_INT_CELLS);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller",
+        NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller",
+        NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
+        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
+        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
+        data->num_msi);
+    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id",
+        data->ipi_msi);
+    if (riscv_socket_count(mc) > 1) {
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
+            riscv_imsic_num_bits(imsic_max_hart_per_socket));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
+            riscv_imsic_num_bits(riscv_socket_count(mc)));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
+            IMSIC_MMIO_GROUP_MIN_SHIFT);
+    }
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_m_phandle);
+    g_free(imsic_name);
+
+    /* S-level IMSIC node */
+    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
+        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
+        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
+    }
+    imsic_guest_bits = riscv_imsic_num_bits(data->num_guests + 1);
+    imsic_max_hart_per_socket = 0;
+    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
+        imsic_addr = data->imsic_s.base + socket * data->group_max_size;
+        imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
+                     soc[socket].num_harts;
+        imsic_regs[socket * 4 + 0] = 0;
+        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
+        imsic_regs[socket * 4 + 2] = 0;
+        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
+        if (imsic_max_hart_per_socket < soc[socket].num_harts) {
+            imsic_max_hart_per_socket = soc[socket].num_harts;
+        }
+    }
+    imsic_name = g_strdup_printf("/soc/imsics@%lx",
+        (unsigned long)data->imsic_s.base);
+    qemu_fdt_add_subnode(mc->fdt, imsic_name);
+    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible", "riscv,imsics");
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
+                          FDT_IMSIC_INT_CELLS);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller", NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller", NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
+        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
+        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids", data->num_msi);
+    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id", data->ipi_msi);
+    if (imsic_guest_bits) {
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,guest-index-bits",
+            imsic_guest_bits);
+    }
+    if (riscv_socket_count(mc) > 1) {
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
+            riscv_imsic_num_bits(imsic_max_hart_per_socket));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
+            riscv_imsic_num_bits(riscv_socket_count(mc)));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
+            IMSIC_MMIO_GROUP_MIN_SHIFT);
+    }
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_s_phandle);
+    g_free(imsic_name);
+
+    g_free(imsic_regs);
+    g_free(imsic_cells);
+}
+
+static void create_pcie_irq_map(void *fdt, char *nodename,
+                                uint32_t irqchip_phandle,
+                                RISCV_IRQ_TYPE irq_type)
+{
+    int pin, dev;
+    uint32_t irq_map_stride = 0;
+    uint32_t full_irq_map[GPEX_NUM_IRQS * GPEX_NUM_IRQS *
+                          FDT_MAX_INT_MAP_WIDTH] = {};
+    uint32_t *irq_map = full_irq_map;
+
+    /* This code creates a standard swizzle of interrupts such that
+     * each device's first interrupt is based on it's PCI_SLOT number.
+     * (See pci_swizzle_map_irq_fn())
+     *
+     * We only need one entry per interrupt in the table (not one per
+     * possible slot) seeing the interrupt-map-mask will allow the table
+     * to wrap to any number of devices.
+     */
+    for (dev = 0; dev < GPEX_NUM_IRQS; dev++) {
+        int devfn = dev * 0x8;
+
+        for (pin = 0; pin < GPEX_NUM_IRQS; pin++) {
+            int irq_nr = PCIE_IRQ + ((pin + PCI_SLOT(devfn)) % GPEX_NUM_IRQS);
+            int i = 0;
+
+            /* Fill PCI address cells */
+            irq_map[i] = cpu_to_be32(devfn << 8);
+            i += FDT_PCI_ADDR_CELLS;
+
+            /* Fill PCI Interrupt cells */
+            irq_map[i] = cpu_to_be32(pin + 1);
+            i += FDT_PCI_INT_CELLS;
+
+            /* Fill interrupt controller phandle and cells */
+            irq_map[i++] = cpu_to_be32(irqchip_phandle);
+            irq_map[i++] = cpu_to_be32(irq_nr);
+            if (irq_type != RISCV_IRQ_WIRED_PLIC) {
+                irq_map[i++] = cpu_to_be32(0x4);
+            }
+
+            if (!irq_map_stride) {
+                irq_map_stride = i;
+            }
+            irq_map += irq_map_stride;
+        }
+    }
+
+    qemu_fdt_setprop(fdt, nodename, "interrupt-map", full_irq_map,
+                     GPEX_NUM_IRQS * GPEX_NUM_IRQS *
+                     irq_map_stride * sizeof(uint32_t));
+
+    qemu_fdt_setprop_cells(fdt, nodename, "interrupt-map-mask",
+                           0x1800, 0, 0, 0x7);
+}
+
+RISCV_IRQ_TYPE riscv_get_irq_type(RISCVVirtAIAType virt_aia_type)
+{
+    int irq_type = RISCV_IRQ_INVALID;
+
+    switch (virt_aia_type) {
+    case VIRT_AIA_TYPE_NONE:
+        irq_type = RISCV_IRQ_WIRED_PLIC;
+        break;
+    case VIRT_AIA_TYPE_APLIC:
+        irq_type = RISCV_IRQ_WIRED_APLIC;
+        break;
+    case VIRT_AIA_TYPE_APLIC_IMSIC:
+        irq_type = RISCV_IRQ_WIRED_MSI;
+        break;
+    }
+
+    return irq_type;
+}
+
+void riscv_create_fdt_pcie(MachineState *mc, const PcieInitData *data,
+                           uint32_t irq_pcie_phandle, uint32_t msi_pcie_phandle)
+{
+    char *name;
+    RISCV_IRQ_TYPE irq_type = data->irq_type;
+
+    name = g_strdup_printf("/soc/pci@%lx",
+        (long) data->pcie_ecam.base);
+    qemu_fdt_add_subnode(mc->fdt, name);
+    qemu_fdt_setprop_cell(mc->fdt, name, "#address-cells",
+        FDT_PCI_ADDR_CELLS);
+    qemu_fdt_setprop_cell(mc->fdt, name, "#interrupt-cells",
+        FDT_PCI_INT_CELLS);
+    qemu_fdt_setprop_cell(mc->fdt, name, "#size-cells", 0x2);
+    qemu_fdt_setprop_string(mc->fdt, name, "compatible",
+        "pci-host-ecam-generic");
+    qemu_fdt_setprop_string(mc->fdt, name, "device_type", "pci");
+    qemu_fdt_setprop_cell(mc->fdt, name, "linux,pci-domain", 0);
+    qemu_fdt_setprop_cells(mc->fdt, name, "bus-range", 0,
+        data->pcie_ecam.size / PCIE_MMCFG_SIZE_MIN - 1);
+    qemu_fdt_setprop(mc->fdt, name, "dma-coherent", NULL, 0);
+    if (irq_type == RISCV_IRQ_MSI_ONLY || irq_type == RISCV_IRQ_WIRED_MSI) {
+        qemu_fdt_setprop_cell(mc->fdt, name, "msi-parent", msi_pcie_phandle);
+    }
+    qemu_fdt_setprop_cells(mc->fdt, name, "reg", 0,
+        data->pcie_ecam.base, 0, data->pcie_ecam.size);
+    qemu_fdt_setprop_sized_cells(mc->fdt, name, "ranges",
+        1, FDT_PCI_RANGE_IOPORT, 2, 0,
+        2, data->pcie_pio.base, 2, data->pcie_pio.size,
+        1, FDT_PCI_RANGE_MMIO,
+        2, data->pcie_mmio.base,
+        2, data->pcie_mmio.base, 2, data->pcie_mmio.size,
+        1, FDT_PCI_RANGE_MMIO_64BIT,
+        2, data->pcie_high_mmio.base,
+        2, data->pcie_high_mmio.base, 2, data->pcie_high_mmio.size);
+
+    if (irq_type != RISCV_IRQ_MSI_ONLY) {
+        create_pcie_irq_map(mc->fdt, name, irq_pcie_phandle, irq_type);
+    }
+    g_free(name);
+}
+
+void riscv_create_fdt_socket_cpus(MachineState *mc, RISCVHartArrayState *soc,
+                                  int socket, char *clust_name,
+                                  uint32_t *phandle, bool is_32_bit,
+                                  uint32_t *intc_phandles)
+{
+    int cpu;
+    uint32_t cpu_phandle;
+    char *name, *cpu_name, *core_name, *intc_name;
+
+    for (cpu = soc[socket].num_harts - 1; cpu >= 0; cpu--) {
+        cpu_phandle = (*phandle)++;
+
+        cpu_name = g_strdup_printf("/cpus/cpu@%d",
+            soc[socket].hartid_base + cpu);
+        qemu_fdt_add_subnode(mc->fdt, cpu_name);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type",
+            (is_32_bit) ? "riscv,sv32" : "riscv,sv48");
+        name = riscv_isa_string(&soc[socket].harts[cpu]);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "riscv,isa", name);
+        g_free(name);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "compatible", "riscv");
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "status", "okay");
+        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "reg",
+            soc[socket].hartid_base + cpu);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "device_type", "cpu");
+        riscv_socket_fdt_write_id(mc, mc->fdt, cpu_name, socket);
+        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "phandle", cpu_phandle);
+
+        intc_phandles[cpu] = (*phandle)++;
+
+        intc_name = g_strdup_printf("%s/interrupt-controller", cpu_name);
+        qemu_fdt_add_subnode(mc->fdt, intc_name);
+        qemu_fdt_setprop_cell(mc->fdt, intc_name, "phandle",
+            intc_phandles[cpu]);
+        if (riscv_feature(&soc[socket].harts[cpu].env, RISCV_FEATURE_AIA)) {
+            static const char * const compat[2] = {
+                "riscv,cpu-intc-aia", "riscv,cpu-intc"
+            };
+            qemu_fdt_setprop_string_array(mc->fdt, intc_name, "compatible",
+                                      (char **)&compat, ARRAY_SIZE(compat));
+        } else {
+            qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
+                "riscv,cpu-intc");
+        }
+        qemu_fdt_setprop(mc->fdt, intc_name, "interrupt-controller", NULL, 0);
+        qemu_fdt_setprop_cell(mc->fdt, intc_name, "#interrupt-cells", 1);
+
+        core_name = g_strdup_printf("%s/core%d", clust_name, cpu);
+        qemu_fdt_add_subnode(mc->fdt, core_name);
+        qemu_fdt_setprop_cell(mc->fdt, core_name, "cpu", cpu_phandle);
+
+        g_free(core_name);
+        g_free(intc_name);
+        g_free(cpu_name);
+    }
+}
+
+void riscv_create_fdt_socket_memory(MachineState *mc, hwaddr dram_base,
+                                    int socket)
+{
+    char *mem_name;
+    uint64_t addr, size;
+
+    addr = dram_base + riscv_socket_mem_offset(mc, socket);
+    size = riscv_socket_mem_size(mc, socket);
+    mem_name = g_strdup_printf("/memory@%lx", (long)addr);
+    qemu_fdt_add_subnode(mc->fdt, mem_name);
+   qemu_fdt_setprop_cells(mc->fdt, mem_name, "reg",
+        addr >> 32, addr, size >> 32, size);
+    qemu_fdt_setprop_string(mc->fdt, mem_name, "device_type", "memory");
+    riscv_socket_fdt_write_id(mc, mc->fdt, mem_name, socket);
+    g_free(mem_name);
+}
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index ab6cae57eae5..b3ae84ac0539 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -2,6 +2,7 @@ riscv_ss = ss.source_set()
 riscv_ss.add(files('boot.c'), fdt)
 riscv_ss.add(when: 'CONFIG_RISCV_NUMA', if_true: files('numa.c'))
 riscv_ss.add(files('riscv_hart.c'))
+riscv_ss.add(files('machine_helper.c'))
 riscv_ss.add(when: 'CONFIG_OPENTITAN', if_true: files('opentitan.c'))
 riscv_ss.add(when: 'CONFIG_RISCV_VIRT', if_true: files('virt.c'))
 riscv_ss.add(when: 'CONFIG_SHAKTI_C', if_true: files('shakti_c.c'))
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index da50cbed43ec..5999395e5bf9 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -30,6 +30,7 @@
 #include "target/riscv/cpu.h"
 #include "hw/riscv/riscv_hart.h"
 #include "hw/riscv/virt.h"
+#include "hw/riscv/machine_helper.h"
 #include "hw/riscv/boot.h"
 #include "hw/riscv/numa.h"
 #include "hw/intc/riscv_aclint.h"
@@ -161,136 +162,6 @@ static void virt_flash_map(RISCVVirtState *s,
                     sysmem);
 }
 
-static void create_pcie_irq_map(RISCVVirtState *s, void *fdt, char *nodename,
-                                uint32_t irqchip_phandle)
-{
-    int pin, dev;
-    uint32_t irq_map_stride = 0;
-    uint32_t full_irq_map[GPEX_NUM_IRQS * GPEX_NUM_IRQS *
-                          FDT_MAX_INT_MAP_WIDTH] = {};
-    uint32_t *irq_map = full_irq_map;
-
-    /* This code creates a standard swizzle of interrupts such that
-     * each device's first interrupt is based on it's PCI_SLOT number.
-     * (See pci_swizzle_map_irq_fn())
-     *
-     * We only need one entry per interrupt in the table (not one per
-     * possible slot) seeing the interrupt-map-mask will allow the table
-     * to wrap to any number of devices.
-     */
-    for (dev = 0; dev < GPEX_NUM_IRQS; dev++) {
-        int devfn = dev * 0x8;
-
-        for (pin = 0; pin < GPEX_NUM_IRQS; pin++) {
-            int irq_nr = PCIE_IRQ + ((pin + PCI_SLOT(devfn)) % GPEX_NUM_IRQS);
-            int i = 0;
-
-            /* Fill PCI address cells */
-            irq_map[i] = cpu_to_be32(devfn << 8);
-            i += FDT_PCI_ADDR_CELLS;
-
-            /* Fill PCI Interrupt cells */
-            irq_map[i] = cpu_to_be32(pin + 1);
-            i += FDT_PCI_INT_CELLS;
-
-            /* Fill interrupt controller phandle and cells */
-            irq_map[i++] = cpu_to_be32(irqchip_phandle);
-            irq_map[i++] = cpu_to_be32(irq_nr);
-            if (s->aia_type != VIRT_AIA_TYPE_NONE) {
-                irq_map[i++] = cpu_to_be32(0x4);
-            }
-
-            if (!irq_map_stride) {
-                irq_map_stride = i;
-            }
-            irq_map += irq_map_stride;
-        }
-    }
-
-    qemu_fdt_setprop(fdt, nodename, "interrupt-map", full_irq_map,
-                     GPEX_NUM_IRQS * GPEX_NUM_IRQS *
-                     irq_map_stride * sizeof(uint32_t));
-
-    qemu_fdt_setprop_cells(fdt, nodename, "interrupt-map-mask",
-                           0x1800, 0, 0, 0x7);
-}
-
-static void create_fdt_socket_cpus(RISCVVirtState *s, int socket,
-                                   char *clust_name, uint32_t *phandle,
-                                   bool is_32_bit, uint32_t *intc_phandles)
-{
-    int cpu;
-    uint32_t cpu_phandle;
-    MachineState *mc = MACHINE(s);
-    char *name, *cpu_name, *core_name, *intc_name;
-
-    for (cpu = s->soc[socket].num_harts - 1; cpu >= 0; cpu--) {
-        cpu_phandle = (*phandle)++;
-
-        cpu_name = g_strdup_printf("/cpus/cpu@%d",
-            s->soc[socket].hartid_base + cpu);
-        qemu_fdt_add_subnode(mc->fdt, cpu_name);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type",
-            (is_32_bit) ? "riscv,sv32" : "riscv,sv48");
-        name = riscv_isa_string(&s->soc[socket].harts[cpu]);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "riscv,isa", name);
-        g_free(name);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "compatible", "riscv");
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "status", "okay");
-        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "reg",
-            s->soc[socket].hartid_base + cpu);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "device_type", "cpu");
-        riscv_socket_fdt_write_id(mc, mc->fdt, cpu_name, socket);
-        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "phandle", cpu_phandle);
-
-        intc_phandles[cpu] = (*phandle)++;
-
-        intc_name = g_strdup_printf("%s/interrupt-controller", cpu_name);
-        qemu_fdt_add_subnode(mc->fdt, intc_name);
-        qemu_fdt_setprop_cell(mc->fdt, intc_name, "phandle",
-            intc_phandles[cpu]);
-        if (riscv_feature(&s->soc[socket].harts[cpu].env,
-                          RISCV_FEATURE_AIA)) {
-            static const char * const compat[2] = {
-                "riscv,cpu-intc-aia", "riscv,cpu-intc"
-            };
-            qemu_fdt_setprop_string_array(mc->fdt, intc_name, "compatible",
-                                      (char **)&compat, ARRAY_SIZE(compat));
-        } else {
-            qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
-                "riscv,cpu-intc");
-        }
-        qemu_fdt_setprop(mc->fdt, intc_name, "interrupt-controller", NULL, 0);
-        qemu_fdt_setprop_cell(mc->fdt, intc_name, "#interrupt-cells", 1);
-
-        core_name = g_strdup_printf("%s/core%d", clust_name, cpu);
-        qemu_fdt_add_subnode(mc->fdt, core_name);
-        qemu_fdt_setprop_cell(mc->fdt, core_name, "cpu", cpu_phandle);
-
-        g_free(core_name);
-        g_free(intc_name);
-        g_free(cpu_name);
-    }
-}
-
-static void create_fdt_socket_memory(RISCVVirtState *s,
-                                     const MemMapEntry *memmap, int socket)
-{
-    char *mem_name;
-    uint64_t addr, size;
-    MachineState *mc = MACHINE(s);
-
-    addr = memmap[VIRT_DRAM].base + riscv_socket_mem_offset(mc, socket);
-    size = riscv_socket_mem_size(mc, socket);
-    mem_name = g_strdup_printf("/memory@%lx", (long)addr);
-    qemu_fdt_add_subnode(mc->fdt, mem_name);
-    qemu_fdt_setprop_cells(mc->fdt, mem_name, "reg",
-        addr >> 32, addr, size >> 32, size);
-    qemu_fdt_setprop_string(mc->fdt, mem_name, "device_type", "memory");
-    riscv_socket_fdt_write_id(mc, mc->fdt, mem_name, socket);
-    g_free(mem_name);
-}
-
 static void create_fdt_socket_clint(RISCVVirtState *s,
                                     const MemMapEntry *memmap, int socket,
                                     uint32_t *intc_phandles)
@@ -472,138 +343,6 @@ static void create_fdt_socket_plic(RISCVVirtState *s,
     g_free(plic_cells);
 }
 
-static uint32_t imsic_num_bits(uint32_t count)
-{
-    uint32_t ret = 0;
-
-    while (BIT(ret) < count) {
-        ret++;
-    }
-
-    return ret;
-}
-
-static void create_fdt_imsic(RISCVVirtState *s, const MemMapEntry *memmap,
-                             uint32_t *phandle, uint32_t *intc_phandles,
-                             uint32_t *msi_m_phandle, uint32_t *msi_s_phandle)
-{
-    int cpu, socket;
-    char *imsic_name;
-    MachineState *mc = MACHINE(s);
-    uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
-    uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
-
-    *msi_m_phandle = (*phandle)++;
-    *msi_s_phandle = (*phandle)++;
-    imsic_cells = g_new0(uint32_t, mc->smp.cpus * 2);
-    imsic_regs = g_new0(uint32_t, riscv_socket_count(mc) * 4);
-
-    /* M-level IMSIC node */
-    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
-        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
-    }
-    imsic_max_hart_per_socket = 0;
-    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
-        imsic_addr = memmap[VIRT_IMSIC_M].base +
-                     socket * VIRT_IMSIC_GROUP_MAX_SIZE;
-        imsic_size = IMSIC_HART_SIZE(0) * s->soc[socket].num_harts;
-        imsic_regs[socket * 4 + 0] = 0;
-        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
-        imsic_regs[socket * 4 + 2] = 0;
-        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
-        if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
-            imsic_max_hart_per_socket = s->soc[socket].num_harts;
-        }
-    }
-    imsic_name = g_strdup_printf("/soc/imsics@%lx",
-        (unsigned long)memmap[VIRT_IMSIC_M].base);
-    qemu_fdt_add_subnode(mc->fdt, imsic_name);
-    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible",
-        "riscv,imsics");
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
-        FDT_IMSIC_INT_CELLS);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
-        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
-        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
-        VIRT_IRQCHIP_NUM_MSIS);
-    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id",
-        VIRT_IRQCHIP_IPI_MSI);
-    if (riscv_socket_count(mc) > 1) {
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
-            imsic_num_bits(imsic_max_hart_per_socket));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
-            imsic_num_bits(riscv_socket_count(mc)));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
-            IMSIC_MMIO_GROUP_MIN_SHIFT);
-    }
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_m_phandle);
-    g_free(imsic_name);
-
-    /* S-level IMSIC node */
-    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
-        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
-    }
-    imsic_guest_bits = imsic_num_bits(s->aia_guests + 1);
-    imsic_max_hart_per_socket = 0;
-    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
-        imsic_addr = memmap[VIRT_IMSIC_S].base +
-                     socket * VIRT_IMSIC_GROUP_MAX_SIZE;
-        imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
-                     s->soc[socket].num_harts;
-        imsic_regs[socket * 4 + 0] = 0;
-        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
-        imsic_regs[socket * 4 + 2] = 0;
-        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
-        if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
-            imsic_max_hart_per_socket = s->soc[socket].num_harts;
-        }
-    }
-    imsic_name = g_strdup_printf("/soc/imsics@%lx",
-        (unsigned long)memmap[VIRT_IMSIC_S].base);
-    qemu_fdt_add_subnode(mc->fdt, imsic_name);
-    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible",
-        "riscv,imsics");
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
-        FDT_IMSIC_INT_CELLS);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
-        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
-        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
-        VIRT_IRQCHIP_NUM_MSIS);
-    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id",
-        VIRT_IRQCHIP_IPI_MSI);
-    if (imsic_guest_bits) {
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,guest-index-bits",
-            imsic_guest_bits);
-    }
-    if (riscv_socket_count(mc) > 1) {
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
-            imsic_num_bits(imsic_max_hart_per_socket));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
-            imsic_num_bits(riscv_socket_count(mc)));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
-            IMSIC_MMIO_GROUP_MIN_SHIFT);
-    }
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_s_phandle);
-    g_free(imsic_name);
-
-    g_free(imsic_regs);
-    g_free(imsic_cells);
-}
-
 static void create_fdt_socket_aplic(RISCVVirtState *s,
                                     const MemMapEntry *memmap, int socket,
                                     uint32_t msi_m_phandle,
@@ -699,6 +438,7 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,
     MachineState *mc = MACHINE(s);
     uint32_t msi_m_phandle = 0, msi_s_phandle = 0;
     uint32_t *intc_phandles, xplic_phandles[MAX_NODES];
+    ImsicInitData idata;
 
     qemu_fdt_add_subnode(mc->fdt, "/cpus");
     qemu_fdt_setprop_cell(mc->fdt, "/cpus", "timebase-frequency",
@@ -716,10 +456,10 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,
         clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
         qemu_fdt_add_subnode(mc->fdt, clust_name);
 
-        create_fdt_socket_cpus(s, socket, clust_name, phandle,
+        riscv_create_fdt_socket_cpus(mc, s->soc, socket, clust_name, phandle,
             is_32_bit, &intc_phandles[phandle_pos]);
 
-        create_fdt_socket_memory(s, memmap, socket);
+        riscv_create_fdt_socket_memory(mc, memmap[VIRT_DRAM].base, socket);
 
         g_free(clust_name);
 
@@ -735,8 +475,17 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,
     }
 
     if (s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC) {
-        create_fdt_imsic(s, memmap, phandle, intc_phandles,
-            &msi_m_phandle, &msi_s_phandle);
+        idata.imsic_m.base = memmap[VIRT_IMSIC_M].base;
+        idata.imsic_m.size = memmap[VIRT_IMSIC_M].size;
+        idata.imsic_s.base = memmap[VIRT_IMSIC_S].base;
+        idata.imsic_s.size = memmap[VIRT_IMSIC_S].size;
+        idata.group_max_size = VIRT_IMSIC_GROUP_MAX_SIZE;
+        idata.num_msi = VIRT_IRQCHIP_NUM_MSIS;
+        idata.ipi_msi = VIRT_IRQCHIP_IPI_MSI;
+        idata.num_guests = s->aia_guests;
+
+        riscv_create_fdt_imsic(mc, s->soc, phandle, intc_phandles,
+            &msi_m_phandle, &msi_s_phandle, &idata);
         *msi_pcie_phandle = msi_s_phandle;
     }
 
@@ -802,47 +551,6 @@ static void create_fdt_virtio(RISCVVirtState *s, const MemMapEntry *memmap,
     }
 }
 
-static void create_fdt_pcie(RISCVVirtState *s, const MemMapEntry *memmap,
-                            uint32_t irq_pcie_phandle,
-                            uint32_t msi_pcie_phandle)
-{
-    char *name;
-    MachineState *mc = MACHINE(s);
-
-    name = g_strdup_printf("/soc/pci@%lx",
-        (long) memmap[VIRT_PCIE_ECAM].base);
-    qemu_fdt_add_subnode(mc->fdt, name);
-    qemu_fdt_setprop_cell(mc->fdt, name, "#address-cells",
-        FDT_PCI_ADDR_CELLS);
-    qemu_fdt_setprop_cell(mc->fdt, name, "#interrupt-cells",
-        FDT_PCI_INT_CELLS);
-    qemu_fdt_setprop_cell(mc->fdt, name, "#size-cells", 0x2);
-    qemu_fdt_setprop_string(mc->fdt, name, "compatible",
-        "pci-host-ecam-generic");
-    qemu_fdt_setprop_string(mc->fdt, name, "device_type", "pci");
-    qemu_fdt_setprop_cell(mc->fdt, name, "linux,pci-domain", 0);
-    qemu_fdt_setprop_cells(mc->fdt, name, "bus-range", 0,
-        memmap[VIRT_PCIE_ECAM].size / PCIE_MMCFG_SIZE_MIN - 1);
-    qemu_fdt_setprop(mc->fdt, name, "dma-coherent", NULL, 0);
-    if (s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC) {
-        qemu_fdt_setprop_cell(mc->fdt, name, "msi-parent", msi_pcie_phandle);
-    }
-    qemu_fdt_setprop_cells(mc->fdt, name, "reg", 0,
-        memmap[VIRT_PCIE_ECAM].base, 0, memmap[VIRT_PCIE_ECAM].size);
-    qemu_fdt_setprop_sized_cells(mc->fdt, name, "ranges",
-        1, FDT_PCI_RANGE_IOPORT, 2, 0,
-        2, memmap[VIRT_PCIE_PIO].base, 2, memmap[VIRT_PCIE_PIO].size,
-        1, FDT_PCI_RANGE_MMIO,
-        2, memmap[VIRT_PCIE_MMIO].base,
-        2, memmap[VIRT_PCIE_MMIO].base, 2, memmap[VIRT_PCIE_MMIO].size,
-        1, FDT_PCI_RANGE_MMIO_64BIT,
-        2, virt_high_pcie_memmap.base,
-        2, virt_high_pcie_memmap.base, 2, virt_high_pcie_memmap.size);
-
-    create_pcie_irq_map(s, mc->fdt, name, irq_pcie_phandle);
-    g_free(name);
-}
-
 static void create_fdt_reset(RISCVVirtState *s, const MemMapEntry *memmap,
                              uint32_t *phandle)
 {
@@ -948,12 +656,26 @@ static void create_fdt_flash(RISCVVirtState *s, const MemMapEntry *memmap)
     g_free(name);
 }
 
+static void copy_memmap_to_pciedata(const MemMapEntry *memmap,
+                                    PcieInitData *pdata)
+{
+    pdata->pcie_ecam.base =  memmap[VIRT_PCIE_ECAM].base;
+    pdata->pcie_ecam.size =  memmap[VIRT_PCIE_ECAM].size;
+    pdata->pcie_pio.base =  memmap[VIRT_PCIE_PIO].base;
+    pdata->pcie_pio.size =  memmap[VIRT_PCIE_PIO].size;
+    pdata->pcie_mmio.base =  memmap[VIRT_PCIE_MMIO].base;
+    pdata->pcie_mmio.size =  memmap[VIRT_PCIE_MMIO].size;
+    pdata->pcie_high_mmio.base = virt_high_pcie_memmap.base;
+    pdata->pcie_high_mmio.size = virt_high_pcie_memmap.size;
+}
+
 static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
                        uint64_t mem_size, const char *cmdline, bool is_32_bit)
 {
     MachineState *mc = MACHINE(s);
     uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
     uint32_t irq_pcie_phandle = 1, irq_virtio_phandle = 1;
+    PcieInitData pdata;
 
     if (mc->dtb) {
         mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
@@ -987,7 +709,9 @@ static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
 
     create_fdt_virtio(s, memmap, irq_virtio_phandle);
 
-    create_fdt_pcie(s, memmap, irq_pcie_phandle, msi_pcie_phandle);
+    pdata.irq_type = riscv_get_irq_type(s->aia_type);
+    copy_memmap_to_pciedata(memmap, &pdata);
+    riscv_create_fdt_pcie(mc, &pdata, irq_pcie_phandle, msi_pcie_phandle);
 
     create_fdt_reset(s, memmap, &phandle);
 
@@ -1003,55 +727,6 @@ update_bootargs:
     }
 }
 
-static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,
-                                          hwaddr ecam_base, hwaddr ecam_size,
-                                          hwaddr mmio_base, hwaddr mmio_size,
-                                          hwaddr high_mmio_base,
-                                          hwaddr high_mmio_size,
-                                          hwaddr pio_base,
-                                          DeviceState *irqchip)
-{
-    DeviceState *dev;
-    MemoryRegion *ecam_alias, *ecam_reg;
-    MemoryRegion *mmio_alias, *high_mmio_alias, *mmio_reg;
-    qemu_irq irq;
-    int i;
-
-    dev = qdev_new(TYPE_GPEX_HOST);
-
-    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
-
-    ecam_alias = g_new0(MemoryRegion, 1);
-    ecam_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
-    memory_region_init_alias(ecam_alias, OBJECT(dev), "pcie-ecam",
-                             ecam_reg, 0, ecam_size);
-    memory_region_add_subregion(get_system_memory(), ecam_base, ecam_alias);
-
-    mmio_alias = g_new0(MemoryRegion, 1);
-    mmio_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 1);
-    memory_region_init_alias(mmio_alias, OBJECT(dev), "pcie-mmio",
-                             mmio_reg, mmio_base, mmio_size);
-    memory_region_add_subregion(get_system_memory(), mmio_base, mmio_alias);
-
-    /* Map high MMIO space */
-    high_mmio_alias = g_new0(MemoryRegion, 1);
-    memory_region_init_alias(high_mmio_alias, OBJECT(dev), "pcie-mmio-high",
-                             mmio_reg, high_mmio_base, high_mmio_size);
-    memory_region_add_subregion(get_system_memory(), high_mmio_base,
-                                high_mmio_alias);
-
-    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 2, pio_base);
-
-    for (i = 0; i < GPEX_NUM_IRQS; i++) {
-        irq = qdev_get_gpio_in(irqchip, PCIE_IRQ + i);
-
-        sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, irq);
-        gpex_set_irq_num(GPEX_HOST(dev), i, PCIE_IRQ + i);
-    }
-
-    return dev;
-}
-
 static FWCfgState *create_fw_cfg(const MachineState *mc)
 {
     hwaddr base = virt_memmap[VIRT_FW_CFG].base;
@@ -1122,7 +797,7 @@ static DeviceState *virt_create_aia(RISCVVirtAIAType aia_type, int aia_guests,
         }
 
         /* Per-socket S-level IMSICs */
-        guest_bits = imsic_num_bits(aia_guests + 1);
+        guest_bits = riscv_imsic_num_bits(aia_guests + 1);
         addr = memmap[VIRT_IMSIC_S].base + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
         for (i = 0; i < hart_count; i++) {
             riscv_imsic_create(addr + i * IMSIC_HART_SIZE(guest_bits),
@@ -1169,6 +844,7 @@ static void virt_machine_init(MachineState *machine)
     uint64_t kernel_entry;
     DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
     int i, base_hartid, hart_count;
+    PcieInitData pdata;
 
     /* Check socket count limit */
     if (VIRT_SOCKETS_MAX < riscv_socket_count(machine)) {
@@ -1392,15 +1068,8 @@ static void virt_machine_init(MachineState *machine)
             qdev_get_gpio_in(DEVICE(virtio_irqchip), VIRTIO_IRQ + i));
     }
 
-    gpex_pcie_init(system_memory,
-                   memmap[VIRT_PCIE_ECAM].base,
-                   memmap[VIRT_PCIE_ECAM].size,
-                   memmap[VIRT_PCIE_MMIO].base,
-                   memmap[VIRT_PCIE_MMIO].size,
-                   virt_high_pcie_memmap.base,
-                   virt_high_pcie_memmap.size,
-                   memmap[VIRT_PCIE_PIO].base,
-                   DEVICE(pcie_irqchip));
+    copy_memmap_to_pciedata(memmap, &pdata);
+    riscv_gpex_pcie_intx_init(system_memory, &pdata, DEVICE(pcie_irqchip));
 
     serial_mm_init(system_memory, memmap[VIRT_UART0].base,
         0, qdev_get_gpio_in(DEVICE(mmio_irqchip), UART0_IRQ), 399193,
diff --git a/include/hw/riscv/machine_helper.h b/include/hw/riscv/machine_helper.h
new file mode 100644
index 000000000000..9029adec941b
--- /dev/null
+++ b/include/hw/riscv/machine_helper.h
@@ -0,0 +1,87 @@
+/*
+ * QEMU RISC-V Machine common helper functions
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_RISCV_MACHINE_HELPER_H
+#define HW_RISCV_MACHINE_HELPER_H
+
+#include "hw/riscv/riscv_hart.h"
+#include "hw/riscv/virt.h"
+#include "hw/sysbus.h"
+#include "qom/object.h"
+#include "exec/memory.h"
+
+#define FDT_PCI_ADDR_CELLS    3
+#define FDT_PCI_INT_CELLS     1
+#define FDT_PLIC_INT_CELLS    1
+#define FDT_APLIC_INT_CELLS   2
+#define FDT_IMSIC_INT_CELLS   0
+#define FDT_MAX_INT_CELLS     2
+#define FDT_MAX_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
+                                 1 + FDT_MAX_INT_CELLS)
+#define FDT_PLIC_INT_MAP_WIDTH  (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
+                                 1 + FDT_PLIC_INT_CELLS)
+#define FDT_APLIC_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
+                                 1 + FDT_APLIC_INT_CELLS)
+
+typedef enum RISCV_IRQ_TYPE {
+    RISCV_IRQ_WIRED_PLIC = 0,
+    RISCV_IRQ_WIRED_APLIC,
+    RISCV_IRQ_WIRED_MSI,
+    RISCV_IRQ_MSI_ONLY,
+    RISCV_IRQ_INVALID
+} RISCV_IRQ_TYPE;
+
+typedef struct ImsicInitData {
+    MemMapEntry imsic_m;
+    MemMapEntry imsic_s;
+    uint32_t group_max_size;
+    uint32_t num_msi;
+    uint32_t ipi_msi;
+    uint32_t num_guests;
+} ImsicInitData;
+
+typedef struct PcieInitData {
+    MemMapEntry pcie_ecam;
+    MemMapEntry pcie_pio;
+    MemMapEntry pcie_mmio;
+    MemMapEntry pcie_high_mmio;
+    RISCV_IRQ_TYPE irq_type;
+} PcieInitData;
+
+uint32_t riscv_imsic_num_bits(uint32_t count);
+void riscv_create_fdt_imsic(MachineState *mc, RISCVHartArrayState *soc,
+                            uint32_t *phandle, uint32_t *intc_phandles,
+                            uint32_t *msi_m_phandle, uint32_t *msi_s_phandle,
+                            ImsicInitData *data);
+void riscv_create_fdt_pcie(MachineState *mc, const PcieInitData *data,
+                           uint32_t irq_pcie_phandle,
+                           uint32_t msi_pcie_phandle);
+DeviceState *riscv_gpex_pcie_intx_init(MemoryRegion *sys_mem,
+                                       PcieInitData *data,
+                                       DeviceState *irqchip);
+DeviceState *riscv_gpex_pcie_msi_init(MemoryRegion *sys_mem,
+                                      PcieInitData *data);
+void riscv_create_fdt_socket_cpus(MachineState *mc, RISCVHartArrayState *soc,
+                                  int socket, char *clust_name,
+                                  uint32_t *phandle, bool is_32_bit,
+                                  uint32_t *intc_phandles);
+void riscv_create_fdt_socket_memory(MachineState *mc, hwaddr dram_base,
+                                    int socket);
+RISCV_IRQ_TYPE riscv_get_irq_type(RISCVVirtAIAType virt_aia_type);
+
+#endif
diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
index 78b058ec8683..2f62e2475653 100644
--- a/include/hw/riscv/virt.h
+++ b/include/hw/riscv/virt.h
@@ -103,17 +103,4 @@ enum {
 #define VIRT_PLIC_SIZE(__num_context) \
     (VIRT_PLIC_CONTEXT_BASE + (__num_context) * VIRT_PLIC_CONTEXT_STRIDE)
 
-#define FDT_PCI_ADDR_CELLS    3
-#define FDT_PCI_INT_CELLS     1
-#define FDT_PLIC_INT_CELLS    1
-#define FDT_APLIC_INT_CELLS   2
-#define FDT_IMSIC_INT_CELLS   0
-#define FDT_MAX_INT_CELLS     2
-#define FDT_MAX_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
-                                 1 + FDT_MAX_INT_CELLS)
-#define FDT_PLIC_INT_MAP_WIDTH  (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
-                                 1 + FDT_PLIC_INT_CELLS)
-#define FDT_APLIC_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
-                                 1 + FDT_APLIC_INT_CELLS)
-
 #endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 2/3] hw/riscv: virt: Move common functions to a separate helper file
@ 2022-04-12  2:10   ` Atish Patra
  0 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Atish Patra, Alistair Francis, Bin Meng, Michael S. Tsirkin,
	Palmer Dabbelt, Paolo Bonzini, qemu-riscv

The virt machine has many generic functions that can be used by
other machines. Move these functions to a helper file so that other
machines can use it in the future.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 hw/riscv/machine_helper.c         | 417 ++++++++++++++++++++++++++++++
 hw/riscv/meson.build              |   1 +
 hw/riscv/virt.c                   | 403 +++--------------------------
 include/hw/riscv/machine_helper.h |  87 +++++++
 include/hw/riscv/virt.h           |  13 -
 5 files changed, 541 insertions(+), 380 deletions(-)
 create mode 100644 hw/riscv/machine_helper.c
 create mode 100644 include/hw/riscv/machine_helper.h

diff --git a/hw/riscv/machine_helper.c b/hw/riscv/machine_helper.c
new file mode 100644
index 000000000000..d8e6b87f1a48
--- /dev/null
+++ b/hw/riscv/machine_helper.c
@@ -0,0 +1,417 @@
+/*
+ * QEMU machine helper
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/boards.h"
+#include "hw/loader.h"
+#include "hw/sysbus.h"
+#include "hw/qdev-properties.h"
+#include "target/riscv/cpu.h"
+#include "hw/riscv/riscv_hart.h"
+#include "hw/riscv/virt.h"
+#include "hw/riscv/boot.h"
+#include "hw/riscv/numa.h"
+#include "hw/riscv/machine_helper.h"
+#include "hw/intc/riscv_imsic.h"
+#include "chardev/char.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "hw/pci/pci.h"
+#include "hw/pci-host/gpex.h"
+#include "hw/display/ramfb.h"
+
+static inline DeviceState *__gpex_pcie_common(MemoryRegion *sys_mem,
+                                              PcieInitData *data)
+{
+    DeviceState *dev;
+    MemoryRegion *ecam_alias, *ecam_reg;
+    MemoryRegion *mmio_alias, *high_mmio_alias, *mmio_reg;
+
+    dev = qdev_new(TYPE_GPEX_HOST);
+
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+
+    ecam_alias = g_new0(MemoryRegion, 1);
+    ecam_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+    memory_region_init_alias(ecam_alias, OBJECT(dev), "pcie-ecam",
+                             ecam_reg, 0, data->pcie_ecam.size);
+    memory_region_add_subregion(get_system_memory(), data->pcie_ecam.base,
+                                ecam_alias);
+
+    mmio_alias = g_new0(MemoryRegion, 1);
+    mmio_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 1);
+    memory_region_init_alias(mmio_alias, OBJECT(dev), "pcie-mmio",
+                             mmio_reg, data->pcie_mmio.base,
+                             data->pcie_mmio.size);
+    memory_region_add_subregion(get_system_memory(), data->pcie_mmio.base,
+                                mmio_alias);
+
+    /* Map high MMIO space */
+    high_mmio_alias = g_new0(MemoryRegion, 1);
+    memory_region_init_alias(high_mmio_alias, OBJECT(dev), "pcie-mmio-high",
+                             mmio_reg, data->pcie_high_mmio.base,
+                             data->pcie_high_mmio.size);
+    memory_region_add_subregion(get_system_memory(), data->pcie_high_mmio.base,
+                                high_mmio_alias);
+
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 2, data->pcie_pio.base);
+
+    return dev;
+}
+
+DeviceState *riscv_gpex_pcie_msi_init(MemoryRegion *sys_mem,
+                                      PcieInitData *data)
+{
+    return __gpex_pcie_common(sys_mem, data);
+}
+
+DeviceState *riscv_gpex_pcie_intx_init(MemoryRegion *sys_mem,
+                                       PcieInitData *data, DeviceState *irqchip)
+{
+    qemu_irq irq;
+    int i;
+    DeviceState *dev;
+
+    dev = __gpex_pcie_common(sys_mem, data);
+    for (i = 0; i < GPEX_NUM_IRQS; i++) {
+        irq = qdev_get_gpio_in(irqchip, PCIE_IRQ + i);
+
+        sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, irq);
+        gpex_set_irq_num(GPEX_HOST(dev), i, PCIE_IRQ + i);
+    }
+
+    return dev;
+}
+
+uint32_t riscv_imsic_num_bits(uint32_t count)
+{
+    uint32_t ret = 0;
+
+    while (BIT(ret) < count) {
+        ret++;
+    }
+
+    return ret;
+}
+
+void riscv_create_fdt_imsic(MachineState *mc, RISCVHartArrayState *soc,
+                            uint32_t *phandle, uint32_t *intc_phandles,
+                            uint32_t *msi_m_phandle, uint32_t *msi_s_phandle,
+                            ImsicInitData *data)
+{
+    int cpu, socket;
+    char *imsic_name;
+    uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
+    uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
+
+    *msi_m_phandle = (*phandle)++;
+    *msi_s_phandle = (*phandle)++;
+    imsic_cells = g_new0(uint32_t, mc->smp.cpus * 2);
+    imsic_regs = g_new0(uint32_t, riscv_socket_count(mc) * 4);
+
+    /* M-level IMSIC node */
+    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
+        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
+        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
+    }
+    imsic_max_hart_per_socket = 0;
+    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
+        imsic_addr = data->imsic_m.base + socket * data->group_max_size;
+        imsic_size = IMSIC_HART_SIZE(0) * soc[socket].num_harts;
+        imsic_regs[socket * 4 + 0] = 0;
+        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
+        imsic_regs[socket * 4 + 2] = 0;
+        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
+        if (imsic_max_hart_per_socket < soc[socket].num_harts) {
+            imsic_max_hart_per_socket = soc[socket].num_harts;
+        }
+    }
+    imsic_name = g_strdup_printf("/soc/imsics@%lx",
+        (unsigned long)data->imsic_m.base);
+    qemu_fdt_add_subnode(mc->fdt, imsic_name);
+    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible",
+        "riscv,imsics");
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
+        FDT_IMSIC_INT_CELLS);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller",
+        NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller",
+        NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
+        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
+        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
+        data->num_msi);
+    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id",
+        data->ipi_msi);
+    if (riscv_socket_count(mc) > 1) {
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
+            riscv_imsic_num_bits(imsic_max_hart_per_socket));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
+            riscv_imsic_num_bits(riscv_socket_count(mc)));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
+            IMSIC_MMIO_GROUP_MIN_SHIFT);
+    }
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_m_phandle);
+    g_free(imsic_name);
+
+    /* S-level IMSIC node */
+    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
+        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
+        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
+    }
+    imsic_guest_bits = riscv_imsic_num_bits(data->num_guests + 1);
+    imsic_max_hart_per_socket = 0;
+    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
+        imsic_addr = data->imsic_s.base + socket * data->group_max_size;
+        imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
+                     soc[socket].num_harts;
+        imsic_regs[socket * 4 + 0] = 0;
+        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
+        imsic_regs[socket * 4 + 2] = 0;
+        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
+        if (imsic_max_hart_per_socket < soc[socket].num_harts) {
+            imsic_max_hart_per_socket = soc[socket].num_harts;
+        }
+    }
+    imsic_name = g_strdup_printf("/soc/imsics@%lx",
+        (unsigned long)data->imsic_s.base);
+    qemu_fdt_add_subnode(mc->fdt, imsic_name);
+    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible", "riscv,imsics");
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
+                          FDT_IMSIC_INT_CELLS);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller", NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller", NULL, 0);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
+        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
+    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
+        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids", data->num_msi);
+    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id", data->ipi_msi);
+    if (imsic_guest_bits) {
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,guest-index-bits",
+            imsic_guest_bits);
+    }
+    if (riscv_socket_count(mc) > 1) {
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
+            riscv_imsic_num_bits(imsic_max_hart_per_socket));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
+            riscv_imsic_num_bits(riscv_socket_count(mc)));
+        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
+            IMSIC_MMIO_GROUP_MIN_SHIFT);
+    }
+    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_s_phandle);
+    g_free(imsic_name);
+
+    g_free(imsic_regs);
+    g_free(imsic_cells);
+}
+
+static void create_pcie_irq_map(void *fdt, char *nodename,
+                                uint32_t irqchip_phandle,
+                                RISCV_IRQ_TYPE irq_type)
+{
+    int pin, dev;
+    uint32_t irq_map_stride = 0;
+    uint32_t full_irq_map[GPEX_NUM_IRQS * GPEX_NUM_IRQS *
+                          FDT_MAX_INT_MAP_WIDTH] = {};
+    uint32_t *irq_map = full_irq_map;
+
+    /* This code creates a standard swizzle of interrupts such that
+     * each device's first interrupt is based on it's PCI_SLOT number.
+     * (See pci_swizzle_map_irq_fn())
+     *
+     * We only need one entry per interrupt in the table (not one per
+     * possible slot) seeing the interrupt-map-mask will allow the table
+     * to wrap to any number of devices.
+     */
+    for (dev = 0; dev < GPEX_NUM_IRQS; dev++) {
+        int devfn = dev * 0x8;
+
+        for (pin = 0; pin < GPEX_NUM_IRQS; pin++) {
+            int irq_nr = PCIE_IRQ + ((pin + PCI_SLOT(devfn)) % GPEX_NUM_IRQS);
+            int i = 0;
+
+            /* Fill PCI address cells */
+            irq_map[i] = cpu_to_be32(devfn << 8);
+            i += FDT_PCI_ADDR_CELLS;
+
+            /* Fill PCI Interrupt cells */
+            irq_map[i] = cpu_to_be32(pin + 1);
+            i += FDT_PCI_INT_CELLS;
+
+            /* Fill interrupt controller phandle and cells */
+            irq_map[i++] = cpu_to_be32(irqchip_phandle);
+            irq_map[i++] = cpu_to_be32(irq_nr);
+            if (irq_type != RISCV_IRQ_WIRED_PLIC) {
+                irq_map[i++] = cpu_to_be32(0x4);
+            }
+
+            if (!irq_map_stride) {
+                irq_map_stride = i;
+            }
+            irq_map += irq_map_stride;
+        }
+    }
+
+    qemu_fdt_setprop(fdt, nodename, "interrupt-map", full_irq_map,
+                     GPEX_NUM_IRQS * GPEX_NUM_IRQS *
+                     irq_map_stride * sizeof(uint32_t));
+
+    qemu_fdt_setprop_cells(fdt, nodename, "interrupt-map-mask",
+                           0x1800, 0, 0, 0x7);
+}
+
+RISCV_IRQ_TYPE riscv_get_irq_type(RISCVVirtAIAType virt_aia_type)
+{
+    int irq_type = RISCV_IRQ_INVALID;
+
+    switch (virt_aia_type) {
+    case VIRT_AIA_TYPE_NONE:
+        irq_type = RISCV_IRQ_WIRED_PLIC;
+        break;
+    case VIRT_AIA_TYPE_APLIC:
+        irq_type = RISCV_IRQ_WIRED_APLIC;
+        break;
+    case VIRT_AIA_TYPE_APLIC_IMSIC:
+        irq_type = RISCV_IRQ_WIRED_MSI;
+        break;
+    }
+
+    return irq_type;
+}
+
+void riscv_create_fdt_pcie(MachineState *mc, const PcieInitData *data,
+                           uint32_t irq_pcie_phandle, uint32_t msi_pcie_phandle)
+{
+    char *name;
+    RISCV_IRQ_TYPE irq_type = data->irq_type;
+
+    name = g_strdup_printf("/soc/pci@%lx",
+        (long) data->pcie_ecam.base);
+    qemu_fdt_add_subnode(mc->fdt, name);
+    qemu_fdt_setprop_cell(mc->fdt, name, "#address-cells",
+        FDT_PCI_ADDR_CELLS);
+    qemu_fdt_setprop_cell(mc->fdt, name, "#interrupt-cells",
+        FDT_PCI_INT_CELLS);
+    qemu_fdt_setprop_cell(mc->fdt, name, "#size-cells", 0x2);
+    qemu_fdt_setprop_string(mc->fdt, name, "compatible",
+        "pci-host-ecam-generic");
+    qemu_fdt_setprop_string(mc->fdt, name, "device_type", "pci");
+    qemu_fdt_setprop_cell(mc->fdt, name, "linux,pci-domain", 0);
+    qemu_fdt_setprop_cells(mc->fdt, name, "bus-range", 0,
+        data->pcie_ecam.size / PCIE_MMCFG_SIZE_MIN - 1);
+    qemu_fdt_setprop(mc->fdt, name, "dma-coherent", NULL, 0);
+    if (irq_type == RISCV_IRQ_MSI_ONLY || irq_type == RISCV_IRQ_WIRED_MSI) {
+        qemu_fdt_setprop_cell(mc->fdt, name, "msi-parent", msi_pcie_phandle);
+    }
+    qemu_fdt_setprop_cells(mc->fdt, name, "reg", 0,
+        data->pcie_ecam.base, 0, data->pcie_ecam.size);
+    qemu_fdt_setprop_sized_cells(mc->fdt, name, "ranges",
+        1, FDT_PCI_RANGE_IOPORT, 2, 0,
+        2, data->pcie_pio.base, 2, data->pcie_pio.size,
+        1, FDT_PCI_RANGE_MMIO,
+        2, data->pcie_mmio.base,
+        2, data->pcie_mmio.base, 2, data->pcie_mmio.size,
+        1, FDT_PCI_RANGE_MMIO_64BIT,
+        2, data->pcie_high_mmio.base,
+        2, data->pcie_high_mmio.base, 2, data->pcie_high_mmio.size);
+
+    if (irq_type != RISCV_IRQ_MSI_ONLY) {
+        create_pcie_irq_map(mc->fdt, name, irq_pcie_phandle, irq_type);
+    }
+    g_free(name);
+}
+
+void riscv_create_fdt_socket_cpus(MachineState *mc, RISCVHartArrayState *soc,
+                                  int socket, char *clust_name,
+                                  uint32_t *phandle, bool is_32_bit,
+                                  uint32_t *intc_phandles)
+{
+    int cpu;
+    uint32_t cpu_phandle;
+    char *name, *cpu_name, *core_name, *intc_name;
+
+    for (cpu = soc[socket].num_harts - 1; cpu >= 0; cpu--) {
+        cpu_phandle = (*phandle)++;
+
+        cpu_name = g_strdup_printf("/cpus/cpu@%d",
+            soc[socket].hartid_base + cpu);
+        qemu_fdt_add_subnode(mc->fdt, cpu_name);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type",
+            (is_32_bit) ? "riscv,sv32" : "riscv,sv48");
+        name = riscv_isa_string(&soc[socket].harts[cpu]);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "riscv,isa", name);
+        g_free(name);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "compatible", "riscv");
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "status", "okay");
+        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "reg",
+            soc[socket].hartid_base + cpu);
+        qemu_fdt_setprop_string(mc->fdt, cpu_name, "device_type", "cpu");
+        riscv_socket_fdt_write_id(mc, mc->fdt, cpu_name, socket);
+        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "phandle", cpu_phandle);
+
+        intc_phandles[cpu] = (*phandle)++;
+
+        intc_name = g_strdup_printf("%s/interrupt-controller", cpu_name);
+        qemu_fdt_add_subnode(mc->fdt, intc_name);
+        qemu_fdt_setprop_cell(mc->fdt, intc_name, "phandle",
+            intc_phandles[cpu]);
+        if (riscv_feature(&soc[socket].harts[cpu].env, RISCV_FEATURE_AIA)) {
+            static const char * const compat[2] = {
+                "riscv,cpu-intc-aia", "riscv,cpu-intc"
+            };
+            qemu_fdt_setprop_string_array(mc->fdt, intc_name, "compatible",
+                                      (char **)&compat, ARRAY_SIZE(compat));
+        } else {
+            qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
+                "riscv,cpu-intc");
+        }
+        qemu_fdt_setprop(mc->fdt, intc_name, "interrupt-controller", NULL, 0);
+        qemu_fdt_setprop_cell(mc->fdt, intc_name, "#interrupt-cells", 1);
+
+        core_name = g_strdup_printf("%s/core%d", clust_name, cpu);
+        qemu_fdt_add_subnode(mc->fdt, core_name);
+        qemu_fdt_setprop_cell(mc->fdt, core_name, "cpu", cpu_phandle);
+
+        g_free(core_name);
+        g_free(intc_name);
+        g_free(cpu_name);
+    }
+}
+
+void riscv_create_fdt_socket_memory(MachineState *mc, hwaddr dram_base,
+                                    int socket)
+{
+    char *mem_name;
+    uint64_t addr, size;
+
+    addr = dram_base + riscv_socket_mem_offset(mc, socket);
+    size = riscv_socket_mem_size(mc, socket);
+    mem_name = g_strdup_printf("/memory@%lx", (long)addr);
+    qemu_fdt_add_subnode(mc->fdt, mem_name);
+   qemu_fdt_setprop_cells(mc->fdt, mem_name, "reg",
+        addr >> 32, addr, size >> 32, size);
+    qemu_fdt_setprop_string(mc->fdt, mem_name, "device_type", "memory");
+    riscv_socket_fdt_write_id(mc, mc->fdt, mem_name, socket);
+    g_free(mem_name);
+}
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index ab6cae57eae5..b3ae84ac0539 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -2,6 +2,7 @@ riscv_ss = ss.source_set()
 riscv_ss.add(files('boot.c'), fdt)
 riscv_ss.add(when: 'CONFIG_RISCV_NUMA', if_true: files('numa.c'))
 riscv_ss.add(files('riscv_hart.c'))
+riscv_ss.add(files('machine_helper.c'))
 riscv_ss.add(when: 'CONFIG_OPENTITAN', if_true: files('opentitan.c'))
 riscv_ss.add(when: 'CONFIG_RISCV_VIRT', if_true: files('virt.c'))
 riscv_ss.add(when: 'CONFIG_SHAKTI_C', if_true: files('shakti_c.c'))
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index da50cbed43ec..5999395e5bf9 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -30,6 +30,7 @@
 #include "target/riscv/cpu.h"
 #include "hw/riscv/riscv_hart.h"
 #include "hw/riscv/virt.h"
+#include "hw/riscv/machine_helper.h"
 #include "hw/riscv/boot.h"
 #include "hw/riscv/numa.h"
 #include "hw/intc/riscv_aclint.h"
@@ -161,136 +162,6 @@ static void virt_flash_map(RISCVVirtState *s,
                     sysmem);
 }
 
-static void create_pcie_irq_map(RISCVVirtState *s, void *fdt, char *nodename,
-                                uint32_t irqchip_phandle)
-{
-    int pin, dev;
-    uint32_t irq_map_stride = 0;
-    uint32_t full_irq_map[GPEX_NUM_IRQS * GPEX_NUM_IRQS *
-                          FDT_MAX_INT_MAP_WIDTH] = {};
-    uint32_t *irq_map = full_irq_map;
-
-    /* This code creates a standard swizzle of interrupts such that
-     * each device's first interrupt is based on it's PCI_SLOT number.
-     * (See pci_swizzle_map_irq_fn())
-     *
-     * We only need one entry per interrupt in the table (not one per
-     * possible slot) seeing the interrupt-map-mask will allow the table
-     * to wrap to any number of devices.
-     */
-    for (dev = 0; dev < GPEX_NUM_IRQS; dev++) {
-        int devfn = dev * 0x8;
-
-        for (pin = 0; pin < GPEX_NUM_IRQS; pin++) {
-            int irq_nr = PCIE_IRQ + ((pin + PCI_SLOT(devfn)) % GPEX_NUM_IRQS);
-            int i = 0;
-
-            /* Fill PCI address cells */
-            irq_map[i] = cpu_to_be32(devfn << 8);
-            i += FDT_PCI_ADDR_CELLS;
-
-            /* Fill PCI Interrupt cells */
-            irq_map[i] = cpu_to_be32(pin + 1);
-            i += FDT_PCI_INT_CELLS;
-
-            /* Fill interrupt controller phandle and cells */
-            irq_map[i++] = cpu_to_be32(irqchip_phandle);
-            irq_map[i++] = cpu_to_be32(irq_nr);
-            if (s->aia_type != VIRT_AIA_TYPE_NONE) {
-                irq_map[i++] = cpu_to_be32(0x4);
-            }
-
-            if (!irq_map_stride) {
-                irq_map_stride = i;
-            }
-            irq_map += irq_map_stride;
-        }
-    }
-
-    qemu_fdt_setprop(fdt, nodename, "interrupt-map", full_irq_map,
-                     GPEX_NUM_IRQS * GPEX_NUM_IRQS *
-                     irq_map_stride * sizeof(uint32_t));
-
-    qemu_fdt_setprop_cells(fdt, nodename, "interrupt-map-mask",
-                           0x1800, 0, 0, 0x7);
-}
-
-static void create_fdt_socket_cpus(RISCVVirtState *s, int socket,
-                                   char *clust_name, uint32_t *phandle,
-                                   bool is_32_bit, uint32_t *intc_phandles)
-{
-    int cpu;
-    uint32_t cpu_phandle;
-    MachineState *mc = MACHINE(s);
-    char *name, *cpu_name, *core_name, *intc_name;
-
-    for (cpu = s->soc[socket].num_harts - 1; cpu >= 0; cpu--) {
-        cpu_phandle = (*phandle)++;
-
-        cpu_name = g_strdup_printf("/cpus/cpu@%d",
-            s->soc[socket].hartid_base + cpu);
-        qemu_fdt_add_subnode(mc->fdt, cpu_name);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type",
-            (is_32_bit) ? "riscv,sv32" : "riscv,sv48");
-        name = riscv_isa_string(&s->soc[socket].harts[cpu]);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "riscv,isa", name);
-        g_free(name);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "compatible", "riscv");
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "status", "okay");
-        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "reg",
-            s->soc[socket].hartid_base + cpu);
-        qemu_fdt_setprop_string(mc->fdt, cpu_name, "device_type", "cpu");
-        riscv_socket_fdt_write_id(mc, mc->fdt, cpu_name, socket);
-        qemu_fdt_setprop_cell(mc->fdt, cpu_name, "phandle", cpu_phandle);
-
-        intc_phandles[cpu] = (*phandle)++;
-
-        intc_name = g_strdup_printf("%s/interrupt-controller", cpu_name);
-        qemu_fdt_add_subnode(mc->fdt, intc_name);
-        qemu_fdt_setprop_cell(mc->fdt, intc_name, "phandle",
-            intc_phandles[cpu]);
-        if (riscv_feature(&s->soc[socket].harts[cpu].env,
-                          RISCV_FEATURE_AIA)) {
-            static const char * const compat[2] = {
-                "riscv,cpu-intc-aia", "riscv,cpu-intc"
-            };
-            qemu_fdt_setprop_string_array(mc->fdt, intc_name, "compatible",
-                                      (char **)&compat, ARRAY_SIZE(compat));
-        } else {
-            qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
-                "riscv,cpu-intc");
-        }
-        qemu_fdt_setprop(mc->fdt, intc_name, "interrupt-controller", NULL, 0);
-        qemu_fdt_setprop_cell(mc->fdt, intc_name, "#interrupt-cells", 1);
-
-        core_name = g_strdup_printf("%s/core%d", clust_name, cpu);
-        qemu_fdt_add_subnode(mc->fdt, core_name);
-        qemu_fdt_setprop_cell(mc->fdt, core_name, "cpu", cpu_phandle);
-
-        g_free(core_name);
-        g_free(intc_name);
-        g_free(cpu_name);
-    }
-}
-
-static void create_fdt_socket_memory(RISCVVirtState *s,
-                                     const MemMapEntry *memmap, int socket)
-{
-    char *mem_name;
-    uint64_t addr, size;
-    MachineState *mc = MACHINE(s);
-
-    addr = memmap[VIRT_DRAM].base + riscv_socket_mem_offset(mc, socket);
-    size = riscv_socket_mem_size(mc, socket);
-    mem_name = g_strdup_printf("/memory@%lx", (long)addr);
-    qemu_fdt_add_subnode(mc->fdt, mem_name);
-    qemu_fdt_setprop_cells(mc->fdt, mem_name, "reg",
-        addr >> 32, addr, size >> 32, size);
-    qemu_fdt_setprop_string(mc->fdt, mem_name, "device_type", "memory");
-    riscv_socket_fdt_write_id(mc, mc->fdt, mem_name, socket);
-    g_free(mem_name);
-}
-
 static void create_fdt_socket_clint(RISCVVirtState *s,
                                     const MemMapEntry *memmap, int socket,
                                     uint32_t *intc_phandles)
@@ -472,138 +343,6 @@ static void create_fdt_socket_plic(RISCVVirtState *s,
     g_free(plic_cells);
 }
 
-static uint32_t imsic_num_bits(uint32_t count)
-{
-    uint32_t ret = 0;
-
-    while (BIT(ret) < count) {
-        ret++;
-    }
-
-    return ret;
-}
-
-static void create_fdt_imsic(RISCVVirtState *s, const MemMapEntry *memmap,
-                             uint32_t *phandle, uint32_t *intc_phandles,
-                             uint32_t *msi_m_phandle, uint32_t *msi_s_phandle)
-{
-    int cpu, socket;
-    char *imsic_name;
-    MachineState *mc = MACHINE(s);
-    uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
-    uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
-
-    *msi_m_phandle = (*phandle)++;
-    *msi_s_phandle = (*phandle)++;
-    imsic_cells = g_new0(uint32_t, mc->smp.cpus * 2);
-    imsic_regs = g_new0(uint32_t, riscv_socket_count(mc) * 4);
-
-    /* M-level IMSIC node */
-    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
-        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
-    }
-    imsic_max_hart_per_socket = 0;
-    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
-        imsic_addr = memmap[VIRT_IMSIC_M].base +
-                     socket * VIRT_IMSIC_GROUP_MAX_SIZE;
-        imsic_size = IMSIC_HART_SIZE(0) * s->soc[socket].num_harts;
-        imsic_regs[socket * 4 + 0] = 0;
-        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
-        imsic_regs[socket * 4 + 2] = 0;
-        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
-        if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
-            imsic_max_hart_per_socket = s->soc[socket].num_harts;
-        }
-    }
-    imsic_name = g_strdup_printf("/soc/imsics@%lx",
-        (unsigned long)memmap[VIRT_IMSIC_M].base);
-    qemu_fdt_add_subnode(mc->fdt, imsic_name);
-    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible",
-        "riscv,imsics");
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
-        FDT_IMSIC_INT_CELLS);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
-        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
-        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
-        VIRT_IRQCHIP_NUM_MSIS);
-    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id",
-        VIRT_IRQCHIP_IPI_MSI);
-    if (riscv_socket_count(mc) > 1) {
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
-            imsic_num_bits(imsic_max_hart_per_socket));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
-            imsic_num_bits(riscv_socket_count(mc)));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
-            IMSIC_MMIO_GROUP_MIN_SHIFT);
-    }
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_m_phandle);
-    g_free(imsic_name);
-
-    /* S-level IMSIC node */
-    for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
-        imsic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-        imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
-    }
-    imsic_guest_bits = imsic_num_bits(s->aia_guests + 1);
-    imsic_max_hart_per_socket = 0;
-    for (socket = 0; socket < riscv_socket_count(mc); socket++) {
-        imsic_addr = memmap[VIRT_IMSIC_S].base +
-                     socket * VIRT_IMSIC_GROUP_MAX_SIZE;
-        imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
-                     s->soc[socket].num_harts;
-        imsic_regs[socket * 4 + 0] = 0;
-        imsic_regs[socket * 4 + 1] = cpu_to_be32(imsic_addr);
-        imsic_regs[socket * 4 + 2] = 0;
-        imsic_regs[socket * 4 + 3] = cpu_to_be32(imsic_size);
-        if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
-            imsic_max_hart_per_socket = s->soc[socket].num_harts;
-        }
-    }
-    imsic_name = g_strdup_printf("/soc/imsics@%lx",
-        (unsigned long)memmap[VIRT_IMSIC_S].base);
-    qemu_fdt_add_subnode(mc->fdt, imsic_name);
-    qemu_fdt_setprop_string(mc->fdt, imsic_name, "compatible",
-        "riscv,imsics");
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "#interrupt-cells",
-        FDT_IMSIC_INT_CELLS);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupt-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "msi-controller",
-        NULL, 0);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
-        imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
-    qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
-        riscv_socket_count(mc) * sizeof(uint32_t) * 4);
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
-        VIRT_IRQCHIP_NUM_MSIS);
-    qemu_fdt_setprop_cells(mc->fdt, imsic_name, "riscv,ipi-id",
-        VIRT_IRQCHIP_IPI_MSI);
-    if (imsic_guest_bits) {
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,guest-index-bits",
-            imsic_guest_bits);
-    }
-    if (riscv_socket_count(mc) > 1) {
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
-            imsic_num_bits(imsic_max_hart_per_socket));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
-            imsic_num_bits(riscv_socket_count(mc)));
-        qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
-            IMSIC_MMIO_GROUP_MIN_SHIFT);
-    }
-    qemu_fdt_setprop_cell(mc->fdt, imsic_name, "phandle", *msi_s_phandle);
-    g_free(imsic_name);
-
-    g_free(imsic_regs);
-    g_free(imsic_cells);
-}
-
 static void create_fdt_socket_aplic(RISCVVirtState *s,
                                     const MemMapEntry *memmap, int socket,
                                     uint32_t msi_m_phandle,
@@ -699,6 +438,7 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,
     MachineState *mc = MACHINE(s);
     uint32_t msi_m_phandle = 0, msi_s_phandle = 0;
     uint32_t *intc_phandles, xplic_phandles[MAX_NODES];
+    ImsicInitData idata;
 
     qemu_fdt_add_subnode(mc->fdt, "/cpus");
     qemu_fdt_setprop_cell(mc->fdt, "/cpus", "timebase-frequency",
@@ -716,10 +456,10 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,
         clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
         qemu_fdt_add_subnode(mc->fdt, clust_name);
 
-        create_fdt_socket_cpus(s, socket, clust_name, phandle,
+        riscv_create_fdt_socket_cpus(mc, s->soc, socket, clust_name, phandle,
             is_32_bit, &intc_phandles[phandle_pos]);
 
-        create_fdt_socket_memory(s, memmap, socket);
+        riscv_create_fdt_socket_memory(mc, memmap[VIRT_DRAM].base, socket);
 
         g_free(clust_name);
 
@@ -735,8 +475,17 @@ static void create_fdt_sockets(RISCVVirtState *s, const MemMapEntry *memmap,
     }
 
     if (s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC) {
-        create_fdt_imsic(s, memmap, phandle, intc_phandles,
-            &msi_m_phandle, &msi_s_phandle);
+        idata.imsic_m.base = memmap[VIRT_IMSIC_M].base;
+        idata.imsic_m.size = memmap[VIRT_IMSIC_M].size;
+        idata.imsic_s.base = memmap[VIRT_IMSIC_S].base;
+        idata.imsic_s.size = memmap[VIRT_IMSIC_S].size;
+        idata.group_max_size = VIRT_IMSIC_GROUP_MAX_SIZE;
+        idata.num_msi = VIRT_IRQCHIP_NUM_MSIS;
+        idata.ipi_msi = VIRT_IRQCHIP_IPI_MSI;
+        idata.num_guests = s->aia_guests;
+
+        riscv_create_fdt_imsic(mc, s->soc, phandle, intc_phandles,
+            &msi_m_phandle, &msi_s_phandle, &idata);
         *msi_pcie_phandle = msi_s_phandle;
     }
 
@@ -802,47 +551,6 @@ static void create_fdt_virtio(RISCVVirtState *s, const MemMapEntry *memmap,
     }
 }
 
-static void create_fdt_pcie(RISCVVirtState *s, const MemMapEntry *memmap,
-                            uint32_t irq_pcie_phandle,
-                            uint32_t msi_pcie_phandle)
-{
-    char *name;
-    MachineState *mc = MACHINE(s);
-
-    name = g_strdup_printf("/soc/pci@%lx",
-        (long) memmap[VIRT_PCIE_ECAM].base);
-    qemu_fdt_add_subnode(mc->fdt, name);
-    qemu_fdt_setprop_cell(mc->fdt, name, "#address-cells",
-        FDT_PCI_ADDR_CELLS);
-    qemu_fdt_setprop_cell(mc->fdt, name, "#interrupt-cells",
-        FDT_PCI_INT_CELLS);
-    qemu_fdt_setprop_cell(mc->fdt, name, "#size-cells", 0x2);
-    qemu_fdt_setprop_string(mc->fdt, name, "compatible",
-        "pci-host-ecam-generic");
-    qemu_fdt_setprop_string(mc->fdt, name, "device_type", "pci");
-    qemu_fdt_setprop_cell(mc->fdt, name, "linux,pci-domain", 0);
-    qemu_fdt_setprop_cells(mc->fdt, name, "bus-range", 0,
-        memmap[VIRT_PCIE_ECAM].size / PCIE_MMCFG_SIZE_MIN - 1);
-    qemu_fdt_setprop(mc->fdt, name, "dma-coherent", NULL, 0);
-    if (s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC) {
-        qemu_fdt_setprop_cell(mc->fdt, name, "msi-parent", msi_pcie_phandle);
-    }
-    qemu_fdt_setprop_cells(mc->fdt, name, "reg", 0,
-        memmap[VIRT_PCIE_ECAM].base, 0, memmap[VIRT_PCIE_ECAM].size);
-    qemu_fdt_setprop_sized_cells(mc->fdt, name, "ranges",
-        1, FDT_PCI_RANGE_IOPORT, 2, 0,
-        2, memmap[VIRT_PCIE_PIO].base, 2, memmap[VIRT_PCIE_PIO].size,
-        1, FDT_PCI_RANGE_MMIO,
-        2, memmap[VIRT_PCIE_MMIO].base,
-        2, memmap[VIRT_PCIE_MMIO].base, 2, memmap[VIRT_PCIE_MMIO].size,
-        1, FDT_PCI_RANGE_MMIO_64BIT,
-        2, virt_high_pcie_memmap.base,
-        2, virt_high_pcie_memmap.base, 2, virt_high_pcie_memmap.size);
-
-    create_pcie_irq_map(s, mc->fdt, name, irq_pcie_phandle);
-    g_free(name);
-}
-
 static void create_fdt_reset(RISCVVirtState *s, const MemMapEntry *memmap,
                              uint32_t *phandle)
 {
@@ -948,12 +656,26 @@ static void create_fdt_flash(RISCVVirtState *s, const MemMapEntry *memmap)
     g_free(name);
 }
 
+static void copy_memmap_to_pciedata(const MemMapEntry *memmap,
+                                    PcieInitData *pdata)
+{
+    pdata->pcie_ecam.base =  memmap[VIRT_PCIE_ECAM].base;
+    pdata->pcie_ecam.size =  memmap[VIRT_PCIE_ECAM].size;
+    pdata->pcie_pio.base =  memmap[VIRT_PCIE_PIO].base;
+    pdata->pcie_pio.size =  memmap[VIRT_PCIE_PIO].size;
+    pdata->pcie_mmio.base =  memmap[VIRT_PCIE_MMIO].base;
+    pdata->pcie_mmio.size =  memmap[VIRT_PCIE_MMIO].size;
+    pdata->pcie_high_mmio.base = virt_high_pcie_memmap.base;
+    pdata->pcie_high_mmio.size = virt_high_pcie_memmap.size;
+}
+
 static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
                        uint64_t mem_size, const char *cmdline, bool is_32_bit)
 {
     MachineState *mc = MACHINE(s);
     uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
     uint32_t irq_pcie_phandle = 1, irq_virtio_phandle = 1;
+    PcieInitData pdata;
 
     if (mc->dtb) {
         mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
@@ -987,7 +709,9 @@ static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
 
     create_fdt_virtio(s, memmap, irq_virtio_phandle);
 
-    create_fdt_pcie(s, memmap, irq_pcie_phandle, msi_pcie_phandle);
+    pdata.irq_type = riscv_get_irq_type(s->aia_type);
+    copy_memmap_to_pciedata(memmap, &pdata);
+    riscv_create_fdt_pcie(mc, &pdata, irq_pcie_phandle, msi_pcie_phandle);
 
     create_fdt_reset(s, memmap, &phandle);
 
@@ -1003,55 +727,6 @@ update_bootargs:
     }
 }
 
-static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,
-                                          hwaddr ecam_base, hwaddr ecam_size,
-                                          hwaddr mmio_base, hwaddr mmio_size,
-                                          hwaddr high_mmio_base,
-                                          hwaddr high_mmio_size,
-                                          hwaddr pio_base,
-                                          DeviceState *irqchip)
-{
-    DeviceState *dev;
-    MemoryRegion *ecam_alias, *ecam_reg;
-    MemoryRegion *mmio_alias, *high_mmio_alias, *mmio_reg;
-    qemu_irq irq;
-    int i;
-
-    dev = qdev_new(TYPE_GPEX_HOST);
-
-    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
-
-    ecam_alias = g_new0(MemoryRegion, 1);
-    ecam_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
-    memory_region_init_alias(ecam_alias, OBJECT(dev), "pcie-ecam",
-                             ecam_reg, 0, ecam_size);
-    memory_region_add_subregion(get_system_memory(), ecam_base, ecam_alias);
-
-    mmio_alias = g_new0(MemoryRegion, 1);
-    mmio_reg = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 1);
-    memory_region_init_alias(mmio_alias, OBJECT(dev), "pcie-mmio",
-                             mmio_reg, mmio_base, mmio_size);
-    memory_region_add_subregion(get_system_memory(), mmio_base, mmio_alias);
-
-    /* Map high MMIO space */
-    high_mmio_alias = g_new0(MemoryRegion, 1);
-    memory_region_init_alias(high_mmio_alias, OBJECT(dev), "pcie-mmio-high",
-                             mmio_reg, high_mmio_base, high_mmio_size);
-    memory_region_add_subregion(get_system_memory(), high_mmio_base,
-                                high_mmio_alias);
-
-    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 2, pio_base);
-
-    for (i = 0; i < GPEX_NUM_IRQS; i++) {
-        irq = qdev_get_gpio_in(irqchip, PCIE_IRQ + i);
-
-        sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, irq);
-        gpex_set_irq_num(GPEX_HOST(dev), i, PCIE_IRQ + i);
-    }
-
-    return dev;
-}
-
 static FWCfgState *create_fw_cfg(const MachineState *mc)
 {
     hwaddr base = virt_memmap[VIRT_FW_CFG].base;
@@ -1122,7 +797,7 @@ static DeviceState *virt_create_aia(RISCVVirtAIAType aia_type, int aia_guests,
         }
 
         /* Per-socket S-level IMSICs */
-        guest_bits = imsic_num_bits(aia_guests + 1);
+        guest_bits = riscv_imsic_num_bits(aia_guests + 1);
         addr = memmap[VIRT_IMSIC_S].base + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
         for (i = 0; i < hart_count; i++) {
             riscv_imsic_create(addr + i * IMSIC_HART_SIZE(guest_bits),
@@ -1169,6 +844,7 @@ static void virt_machine_init(MachineState *machine)
     uint64_t kernel_entry;
     DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
     int i, base_hartid, hart_count;
+    PcieInitData pdata;
 
     /* Check socket count limit */
     if (VIRT_SOCKETS_MAX < riscv_socket_count(machine)) {
@@ -1392,15 +1068,8 @@ static void virt_machine_init(MachineState *machine)
             qdev_get_gpio_in(DEVICE(virtio_irqchip), VIRTIO_IRQ + i));
     }
 
-    gpex_pcie_init(system_memory,
-                   memmap[VIRT_PCIE_ECAM].base,
-                   memmap[VIRT_PCIE_ECAM].size,
-                   memmap[VIRT_PCIE_MMIO].base,
-                   memmap[VIRT_PCIE_MMIO].size,
-                   virt_high_pcie_memmap.base,
-                   virt_high_pcie_memmap.size,
-                   memmap[VIRT_PCIE_PIO].base,
-                   DEVICE(pcie_irqchip));
+    copy_memmap_to_pciedata(memmap, &pdata);
+    riscv_gpex_pcie_intx_init(system_memory, &pdata, DEVICE(pcie_irqchip));
 
     serial_mm_init(system_memory, memmap[VIRT_UART0].base,
         0, qdev_get_gpio_in(DEVICE(mmio_irqchip), UART0_IRQ), 399193,
diff --git a/include/hw/riscv/machine_helper.h b/include/hw/riscv/machine_helper.h
new file mode 100644
index 000000000000..9029adec941b
--- /dev/null
+++ b/include/hw/riscv/machine_helper.h
@@ -0,0 +1,87 @@
+/*
+ * QEMU RISC-V Machine common helper functions
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_RISCV_MACHINE_HELPER_H
+#define HW_RISCV_MACHINE_HELPER_H
+
+#include "hw/riscv/riscv_hart.h"
+#include "hw/riscv/virt.h"
+#include "hw/sysbus.h"
+#include "qom/object.h"
+#include "exec/memory.h"
+
+#define FDT_PCI_ADDR_CELLS    3
+#define FDT_PCI_INT_CELLS     1
+#define FDT_PLIC_INT_CELLS    1
+#define FDT_APLIC_INT_CELLS   2
+#define FDT_IMSIC_INT_CELLS   0
+#define FDT_MAX_INT_CELLS     2
+#define FDT_MAX_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
+                                 1 + FDT_MAX_INT_CELLS)
+#define FDT_PLIC_INT_MAP_WIDTH  (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
+                                 1 + FDT_PLIC_INT_CELLS)
+#define FDT_APLIC_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
+                                 1 + FDT_APLIC_INT_CELLS)
+
+typedef enum RISCV_IRQ_TYPE {
+    RISCV_IRQ_WIRED_PLIC = 0,
+    RISCV_IRQ_WIRED_APLIC,
+    RISCV_IRQ_WIRED_MSI,
+    RISCV_IRQ_MSI_ONLY,
+    RISCV_IRQ_INVALID
+} RISCV_IRQ_TYPE;
+
+typedef struct ImsicInitData {
+    MemMapEntry imsic_m;
+    MemMapEntry imsic_s;
+    uint32_t group_max_size;
+    uint32_t num_msi;
+    uint32_t ipi_msi;
+    uint32_t num_guests;
+} ImsicInitData;
+
+typedef struct PcieInitData {
+    MemMapEntry pcie_ecam;
+    MemMapEntry pcie_pio;
+    MemMapEntry pcie_mmio;
+    MemMapEntry pcie_high_mmio;
+    RISCV_IRQ_TYPE irq_type;
+} PcieInitData;
+
+uint32_t riscv_imsic_num_bits(uint32_t count);
+void riscv_create_fdt_imsic(MachineState *mc, RISCVHartArrayState *soc,
+                            uint32_t *phandle, uint32_t *intc_phandles,
+                            uint32_t *msi_m_phandle, uint32_t *msi_s_phandle,
+                            ImsicInitData *data);
+void riscv_create_fdt_pcie(MachineState *mc, const PcieInitData *data,
+                           uint32_t irq_pcie_phandle,
+                           uint32_t msi_pcie_phandle);
+DeviceState *riscv_gpex_pcie_intx_init(MemoryRegion *sys_mem,
+                                       PcieInitData *data,
+                                       DeviceState *irqchip);
+DeviceState *riscv_gpex_pcie_msi_init(MemoryRegion *sys_mem,
+                                      PcieInitData *data);
+void riscv_create_fdt_socket_cpus(MachineState *mc, RISCVHartArrayState *soc,
+                                  int socket, char *clust_name,
+                                  uint32_t *phandle, bool is_32_bit,
+                                  uint32_t *intc_phandles);
+void riscv_create_fdt_socket_memory(MachineState *mc, hwaddr dram_base,
+                                    int socket);
+RISCV_IRQ_TYPE riscv_get_irq_type(RISCVVirtAIAType virt_aia_type);
+
+#endif
diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
index 78b058ec8683..2f62e2475653 100644
--- a/include/hw/riscv/virt.h
+++ b/include/hw/riscv/virt.h
@@ -103,17 +103,4 @@ enum {
 #define VIRT_PLIC_SIZE(__num_context) \
     (VIRT_PLIC_CONTEXT_BASE + (__num_context) * VIRT_PLIC_CONTEXT_STRIDE)
 
-#define FDT_PCI_ADDR_CELLS    3
-#define FDT_PCI_INT_CELLS     1
-#define FDT_PLIC_INT_CELLS    1
-#define FDT_APLIC_INT_CELLS   2
-#define FDT_IMSIC_INT_CELLS   0
-#define FDT_MAX_INT_CELLS     2
-#define FDT_MAX_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
-                                 1 + FDT_MAX_INT_CELLS)
-#define FDT_PLIC_INT_MAP_WIDTH  (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
-                                 1 + FDT_PLIC_INT_CELLS)
-#define FDT_APLIC_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
-                                 1 + FDT_APLIC_INT_CELLS)
-
 #endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 3/3] hw/riscv: Create a new qemu machine for RISC-V
  2022-04-12  2:10 ` Atish Patra
@ 2022-04-12  2:10   ` Atish Patra
  -1 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, Michael S. Tsirkin, Bin Meng, Atish Patra,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

The RISC-V virt machine has been growing with many different commmandline
options. It has its limitations in terms of maximum number of harts that
it can support. The commandline options slowly will become bit difficult
to manage. Moreover, it always depends on the virtio framework and lot
of mmio devices.

The new MSI based interrupt controller (IMSIC) allows us to build a
minimalistic yet extensible machines without any wired interrupts. All
the devices can be behind PCI with MSI/MSI-X are only source of external
interrupts. This approach is highly scalable in terms of number of harts
and also mimics modern day PC/server machines more closely. As every device
must be behind PCI, we won't require a lot of addition to the machine.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 configs/devices/riscv64-softmmu/default.mak |   1 +
 hw/riscv/Kconfig                            |  11 +
 hw/riscv/meson.build                        |   1 +
 hw/riscv/minic.c                            | 438 ++++++++++++++++++++
 include/hw/riscv/minic.h                    |  65 +++
 5 files changed, 516 insertions(+)
 create mode 100644 hw/riscv/minic.c
 create mode 100644 include/hw/riscv/minic.h

diff --git a/configs/devices/riscv64-softmmu/default.mak b/configs/devices/riscv64-softmmu/default.mak
index bc69301fa4a6..1407c4a9fe2f 100644
--- a/configs/devices/riscv64-softmmu/default.mak
+++ b/configs/devices/riscv64-softmmu/default.mak
@@ -14,3 +14,4 @@ CONFIG_SIFIVE_U=y
 CONFIG_RISCV_VIRT=y
 CONFIG_MICROCHIP_PFSOC=y
 CONFIG_SHAKTI_C=y
+CONFIG_MINIC=y
diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index 91bb9d21c471..9eca1a6efa25 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -83,3 +83,14 @@ config SPIKE
     select MSI_NONBROKEN
     select RISCV_ACLINT
     select SIFIVE_PLIC
+
+config MINIC
+    bool
+    imply PCI_DEVICES
+    select RISCV_NUMA
+    select MSI_NONBROKEN
+    select PCI
+    select PCI_EXPRESS_GENERIC_BRIDGE
+    select SERIAL
+    select RISCV_ACLINT
+    select RISCV_IMSIC
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index b3ae84ac0539..7b1c49466e62 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -10,5 +10,6 @@ riscv_ss.add(when: 'CONFIG_SIFIVE_E', if_true: files('sifive_e.c'))
 riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
 riscv_ss.add(when: 'CONFIG_SPIKE', if_true: files('spike.c'))
 riscv_ss.add(when: 'CONFIG_MICROCHIP_PFSOC', if_true: files('microchip_pfsoc.c'))
+riscv_ss.add(when: 'CONFIG_MINIC', if_true: files('minic.c'))
 
 hw_arch += {'riscv': riscv_ss}
diff --git a/hw/riscv/minic.c b/hw/riscv/minic.c
new file mode 100644
index 000000000000..4ca707da1023
--- /dev/null
+++ b/hw/riscv/minic.c
@@ -0,0 +1,438 @@
+/*
+ * QEMU RISC-V Mini Computer
+ *
+ * Based on the minic machine implementation
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/boards.h"
+#include "hw/loader.h"
+#include "hw/sysbus.h"
+#include "hw/qdev-properties.h"
+#include "target/riscv/cpu.h"
+#include "hw/riscv/riscv_hart.h"
+#include "hw/riscv/minic.h"
+#include "hw/riscv/machine_helper.h"
+#include "hw/intc/riscv_aclint.h"
+#include "hw/intc/riscv_imsic.h"
+#include "hw/riscv/boot.h"
+#include "hw/riscv/numa.h"
+#include "chardev/char.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci-host/gpex.h"
+#include "hw/display/ramfb.h"
+
+#define MINIC_IMSIC_GROUP_MAX_SIZE      (1U << IMSIC_MMIO_GROUP_MIN_SHIFT)
+#if MINIC_IMSIC_GROUP_MAX_SIZE < \
+    IMSIC_GROUP_SIZE(MINIC_CPUS_MAX_BITS, MINIC_IRQCHIP_MAX_GUESTS_BITS)
+#error "Can't accomodate single IMSIC group in address space"
+#endif
+
+#define MINIC_IMSIC_MAX_SIZE            (MINIC_SOCKETS_MAX * \
+                                        MINIC_IMSIC_GROUP_MAX_SIZE)
+#if 0x4000000 < MINIC_IMSIC_MAX_SIZE
+#error "Can't accomodate all IMSIC groups in address space"
+#endif
+
+static const MemMapEntry minic_memmap[] = {
+    [MINIC_MROM] =        {     0x1000,        0xf000 },
+    [MINIC_CLINT] =       {  0x2000000,       0x10000 },
+    [MINIC_PCIE_PIO] =    {  0x3000000,       0x10000 },
+    [MINIC_IMSIC_M] =     { 0x24000000, MINIC_IMSIC_MAX_SIZE },
+    [MINIC_IMSIC_S] =     { 0x28000000, MINIC_IMSIC_MAX_SIZE },
+    [MINIC_PCIE_ECAM] =   { 0x30000000,    0x10000000 },
+    [MINIC_PCIE_MMIO] =   { 0x40000000,    0x40000000 },
+    [MINIC_DRAM] =        { 0x80000000,           0x0 },
+};
+
+static PcieInitData pdata;
+/* PCIe high mmio for RV64, size is fixed but base depends on top of RAM */
+#define MINIC64_HIGH_PCIE_MMIO_SIZE  (16 * GiB)
+
+static void minic_create_fdt_socket_clint(RISCVMinicState *s,
+                                    const MemMapEntry *memmap, int socket,
+                                    uint32_t *intc_phandles)
+{
+    int cpu;
+    char *clint_name;
+    uint32_t *clint_cells;
+    unsigned long clint_addr;
+    MachineState *mc = MACHINE(s);
+    static const char * const clint_compat[2] = {
+        "sifive,clint0", "riscv,clint0"
+    };
+
+    clint_cells = g_new0(uint32_t, s->soc[socket].num_harts * 4);
+
+    for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
+        clint_cells[cpu * 4 + 0] = cpu_to_be32(intc_phandles[cpu]);
+        clint_cells[cpu * 4 + 1] = cpu_to_be32(IRQ_M_SOFT);
+        clint_cells[cpu * 4 + 2] = cpu_to_be32(intc_phandles[cpu]);
+        clint_cells[cpu * 4 + 3] = cpu_to_be32(IRQ_M_TIMER);
+    }
+
+    clint_addr = memmap[MINIC_CLINT].base + (memmap[MINIC_CLINT].size * socket);
+    clint_name = g_strdup_printf("/soc/clint@%lx", clint_addr);
+    qemu_fdt_add_subnode(mc->fdt, clint_name);
+    qemu_fdt_setprop_string_array(mc->fdt, clint_name, "compatible",
+                                  (char **)&clint_compat,
+                                  ARRAY_SIZE(clint_compat));
+    qemu_fdt_setprop_cells(mc->fdt, clint_name, "reg",
+        0x0, clint_addr, 0x0, memmap[MINIC_CLINT].size);
+    qemu_fdt_setprop(mc->fdt, clint_name, "interrupts-extended",
+        clint_cells, s->soc[socket].num_harts * sizeof(uint32_t) * 4);
+    riscv_socket_fdt_write_id(mc, mc->fdt, clint_name, socket);
+    g_free(clint_name);
+
+    g_free(clint_cells);
+}
+
+static void minic_create_fdt_sockets(RISCVMinicState *s,
+                                     const MemMapEntry *memmap,
+                                     uint32_t *phandle,
+                                     uint32_t *msi_pcie_phandle)
+{
+    char *clust_name;
+    int socket, phandle_pos;
+    MachineState *mc = MACHINE(s);
+    uint32_t msi_m_phandle = 0, msi_s_phandle = 0;
+    uint32_t *intc_phandles;
+    ImsicInitData idata;
+
+    qemu_fdt_add_subnode(mc->fdt, "/cpus");
+    qemu_fdt_setprop_cell(mc->fdt, "/cpus", "timebase-frequency",
+                          RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ);
+    qemu_fdt_setprop_cell(mc->fdt, "/cpus", "#size-cells", 0x0);
+    qemu_fdt_setprop_cell(mc->fdt, "/cpus", "#address-cells", 0x1);
+    qemu_fdt_add_subnode(mc->fdt, "/cpus/cpu-map");
+
+    intc_phandles = g_new0(uint32_t, mc->smp.cpus);
+
+    phandle_pos = mc->smp.cpus;
+    for (socket = (riscv_socket_count(mc) - 1); socket >= 0; socket--) {
+        phandle_pos -= s->soc[socket].num_harts;
+
+        clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
+        qemu_fdt_add_subnode(mc->fdt, clust_name);
+
+        riscv_create_fdt_socket_cpus(mc, s->soc, socket, clust_name, phandle,
+                               false, &intc_phandles[phandle_pos]);
+
+        riscv_create_fdt_socket_memory(mc, memmap[MINIC_DRAM].base, socket);
+        minic_create_fdt_socket_clint(s, memmap, socket,
+                                      &intc_phandles[phandle_pos]);
+        g_free(clust_name);
+
+    }
+
+    idata.imsic_m.base = memmap[MINIC_IMSIC_M].base;
+    idata.imsic_m.size = memmap[MINIC_IMSIC_M].size;
+    idata.imsic_s.base = memmap[MINIC_IMSIC_S].base;
+    idata.imsic_s.size = memmap[MINIC_IMSIC_S].size;
+    idata.group_max_size = MINIC_IMSIC_GROUP_MAX_SIZE;
+    idata.num_msi = MINIC_IRQCHIP_NUM_MSIS;
+    idata.ipi_msi = MINIC_IRQCHIP_IPI_MSI;
+    idata.num_guests = s->aia_guests;
+
+    riscv_create_fdt_imsic(mc, s->soc, phandle, intc_phandles,
+                           &msi_m_phandle, &msi_s_phandle, &idata);
+    *msi_pcie_phandle = msi_s_phandle;
+
+    riscv_socket_fdt_write_distance_matrix(mc, mc->fdt);
+    g_free(intc_phandles);
+}
+
+static void copy_memmap_to_pciedata(const MemMapEntry *memmap,
+                                    PcieInitData *pdata, uint64_t ram_size)
+{
+    pdata->pcie_ecam.base =  memmap[MINIC_PCIE_ECAM].base;
+    pdata->pcie_ecam.size =  memmap[MINIC_PCIE_ECAM].size;
+    pdata->pcie_pio.base =  memmap[MINIC_PCIE_PIO].base;
+    pdata->pcie_pio.size =  memmap[MINIC_PCIE_PIO].size;
+    pdata->pcie_mmio.base =  memmap[MINIC_PCIE_MMIO].base;
+    pdata->pcie_mmio.size =  memmap[MINIC_PCIE_MMIO].size;
+    pdata->pcie_high_mmio.size  = MINIC64_HIGH_PCIE_MMIO_SIZE;
+    pdata->pcie_high_mmio.base = memmap[MINIC_DRAM].base + ram_size;
+    pdata->pcie_high_mmio.base = ROUND_UP(pdata->pcie_high_mmio.base,
+                                          pdata->pcie_high_mmio.size);
+}
+
+static void minic_create_fdt(RISCVMinicState *s, const MemMapEntry *memmap,
+                       uint64_t mem_size, const char *cmdline)
+{
+    MachineState *mc = MACHINE(s);
+    uint32_t phandle = 1, msi_pcie_phandle = 1;
+
+    if (mc->dtb) {
+        mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
+        if (!mc->fdt) {
+            error_report("load_device_tree() failed");
+            exit(1);
+        }
+        goto update_bootargs;
+    } else {
+        mc->fdt = create_device_tree(&s->fdt_size);
+        if (!mc->fdt) {
+            error_report("create_device_tree() failed");
+            exit(1);
+        }
+    }
+
+    qemu_fdt_setprop_string(mc->fdt, "/", "model", "riscv-minic,qemu");
+    qemu_fdt_setprop_string(mc->fdt, "/", "compatible", "riscv-minic");
+    qemu_fdt_setprop_cell(mc->fdt, "/", "#size-cells", 0x2);
+    qemu_fdt_setprop_cell(mc->fdt, "/", "#address-cells", 0x2);
+
+    qemu_fdt_add_subnode(mc->fdt, "/soc");
+    qemu_fdt_setprop(mc->fdt, "/soc", "ranges", NULL, 0);
+    qemu_fdt_setprop_string(mc->fdt, "/soc", "compatible", "simple-bus");
+    qemu_fdt_setprop_cell(mc->fdt, "/soc", "#size-cells", 0x2);
+    qemu_fdt_setprop_cell(mc->fdt, "/soc", "#address-cells", 0x2);
+
+    minic_create_fdt_sockets(s, memmap, &phandle, &msi_pcie_phandle);
+    qemu_fdt_add_subnode(mc->fdt, "/chosen");
+
+    copy_memmap_to_pciedata(memmap, &pdata, mc->ram_size);
+    pdata.irq_type = RISCV_IRQ_MSI_ONLY;
+    riscv_create_fdt_pcie(mc, &pdata, 0, msi_pcie_phandle);
+
+update_bootargs:
+    if (cmdline) {
+        qemu_fdt_setprop_string(mc->fdt, "/chosen", "bootargs", cmdline);
+    }
+}
+
+static void minic_create_imsic(int aia_guests,
+                               const MemMapEntry *memmap, int socket,
+                               int base_hartid, int hart_count)
+{
+    int i;
+    hwaddr addr;
+    uint32_t guest_bits;
+
+    /* Per-socket M-level IMSICs */
+    addr = memmap[MINIC_IMSIC_M].base + socket * MINIC_IMSIC_GROUP_MAX_SIZE;
+    for (i = 0; i < hart_count; i++) {
+        riscv_imsic_create(addr + i * IMSIC_HART_SIZE(0),
+                           base_hartid + i, true, 1,
+                           MINIC_IRQCHIP_NUM_MSIS);
+    }
+
+    /* Per-socket S-level IMSICs */
+    guest_bits = riscv_imsic_num_bits(aia_guests + 1);
+    addr = memmap[MINIC_IMSIC_S].base + socket * MINIC_IMSIC_GROUP_MAX_SIZE;
+    for (i = 0; i < hart_count; i++) {
+        riscv_imsic_create(addr + i * IMSIC_HART_SIZE(guest_bits),
+                           base_hartid + i, false, 1 + aia_guests,
+                           MINIC_IRQCHIP_NUM_MSIS);
+    }
+}
+
+static void minic_machine_init(MachineState *machine)
+{
+    const MemMapEntry *memmap = minic_memmap;
+    RISCVMinicState *s = RISCV_MINIC_MACHINE(machine);
+    MemoryRegion *system_memory = get_system_memory();
+    MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
+    char *soc_name;
+    target_ulong start_addr = memmap[MINIC_DRAM].base;
+    target_ulong firmware_end_addr, kernel_start_addr;
+    uint32_t fdt_load_addr;
+    uint64_t kernel_entry;
+    int i, base_hartid, hart_count;
+
+    /* Check socket count limit */
+    if (MINIC_SOCKETS_MAX < riscv_socket_count(machine)) {
+        error_report("number of sockets/nodes should be less than %d",
+            MINIC_SOCKETS_MAX);
+        exit(1);
+    }
+
+    /* Initialize sockets */
+    for (i = 0; i < riscv_socket_count(machine); i++) {
+        if (!riscv_socket_check_hartids(machine, i)) {
+            error_report("discontinuous hartids in socket%d", i);
+            exit(1);
+        }
+
+        base_hartid = riscv_socket_first_hartid(machine, i);
+        if (base_hartid < 0) {
+            error_report("can't find hartid base for socket%d", i);
+            exit(1);
+        }
+
+        hart_count = riscv_socket_hart_count(machine, i);
+        if (hart_count < 0) {
+            error_report("can't find hart count for socket%d", i);
+            exit(1);
+        }
+
+        soc_name = g_strdup_printf("soc%d", i);
+        object_initialize_child(OBJECT(machine), soc_name, &s->soc[i],
+                                TYPE_RISCV_HART_ARRAY);
+        g_free(soc_name);
+        object_property_set_str(OBJECT(&s->soc[i]), "cpu-type",
+                                machine->cpu_type, &error_abort);
+        object_property_set_int(OBJECT(&s->soc[i]), "hartid-base",
+                                base_hartid, &error_abort);
+        object_property_set_int(OBJECT(&s->soc[i]), "num-harts",
+                                hart_count, &error_abort);
+        sysbus_realize(SYS_BUS_DEVICE(&s->soc[i]), &error_abort);
+
+        /*
+         * Minic machine doesn't need M-mode software interrupt IPI device
+         * However, clint doesn't provide modularity and the existing software
+         * stack expect this address to be present with clint.
+         */
+        riscv_aclint_swi_create(
+                    memmap[MINIC_CLINT].base + i * memmap[MINIC_CLINT].size,
+                    base_hartid, hart_count, false);
+
+        /* Per-socket ACLINT MTIMER */
+        riscv_aclint_mtimer_create(memmap[MINIC_CLINT].base +
+                        i * memmap[MINIC_CLINT].size + RISCV_ACLINT_SWI_SIZE,
+                    RISCV_ACLINT_DEFAULT_MTIMER_SIZE, base_hartid, hart_count,
+                    RISCV_ACLINT_DEFAULT_MTIMECMP, RISCV_ACLINT_DEFAULT_MTIME,
+                    RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ, true);
+
+        minic_create_imsic(s->aia_guests, memmap, i, base_hartid, hart_count);
+    }
+
+    /* register system main memory (actual RAM) */
+    memory_region_add_subregion(system_memory, memmap[MINIC_DRAM].base,
+        machine->ram);
+
+    /* create device tree */
+    minic_create_fdt(s, memmap, machine->ram_size, machine->kernel_cmdline);
+
+    /* boot rom */
+    memory_region_init_rom(mask_rom, NULL, "riscv_minic_board.mrom",
+                           memmap[MINIC_MROM].size, &error_fatal);
+    memory_region_add_subregion(system_memory, memmap[MINIC_MROM].base,
+                                mask_rom);
+
+    firmware_end_addr = riscv_find_and_load_firmware(machine,
+                                    RISCV64_BIOS_BIN, start_addr, NULL);
+
+    if (machine->kernel_filename) {
+        kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc[0],
+                                                         firmware_end_addr);
+
+        kernel_entry = riscv_load_kernel(machine->kernel_filename,
+                                         kernel_start_addr, NULL);
+
+        if (machine->initrd_filename) {
+            hwaddr start;
+            hwaddr end = riscv_load_initrd(machine->initrd_filename,
+                                           machine->ram_size, kernel_entry,
+                                           &start);
+            qemu_fdt_setprop_cell(machine->fdt, "/chosen",
+                                  "linux,initrd-start", start);
+            qemu_fdt_setprop_cell(machine->fdt, "/chosen", "linux,initrd-end",
+                                  end);
+        }
+    } else {
+       /*
+        * If dynamic firmware is used, it doesn't know where is the next mode
+        * if kernel argument is not set.
+        */
+        kernel_entry = 0;
+    }
+
+    /* Compute the fdt load address in dram */
+    fdt_load_addr = riscv_load_fdt(memmap[MINIC_DRAM].base,
+                                   machine->ram_size, machine->fdt);
+    /* load the reset vector */
+    riscv_setup_rom_reset_vec(machine, &s->soc[0], start_addr,
+                              minic_memmap[MINIC_MROM].base,
+                              minic_memmap[MINIC_MROM].size, kernel_entry,
+                              fdt_load_addr, machine->fdt);
+
+    riscv_gpex_pcie_msi_init(system_memory, &pdata);
+}
+
+static void minic_machine_instance_init(Object *obj)
+{
+}
+
+static char *minic_get_aia_guests(Object *obj, Error **errp)
+{
+    RISCVMinicState *s = RISCV_MINIC_MACHINE(obj);
+    char val[32];
+
+    sprintf(val, "%d", s->aia_guests);
+    return g_strdup(val);
+}
+
+static void minic_set_aia_guests(Object *obj, const char *val, Error **errp)
+{
+    RISCVMinicState *s = RISCV_MINIC_MACHINE(obj);
+
+    s->aia_guests = atoi(val);
+    if (s->aia_guests < 0 || s->aia_guests > MINIC_IRQCHIP_MAX_GUESTS) {
+        error_setg(errp, "Invalid number of AIA IMSIC guests");
+        error_append_hint(errp, "Valid values be between 0 and %d.\n",
+                          MINIC_IRQCHIP_MAX_GUESTS);
+    }
+}
+
+static void minic_machine_class_init(ObjectClass *oc, void *data)
+{
+    char str[128];
+    MachineClass *mc = MACHINE_CLASS(oc);
+
+    mc->desc = "RISC-V Mini Computer";
+    mc->init = minic_machine_init;
+    mc->max_cpus = MINIC_CPUS_MAX;
+    mc->default_cpu_type = TYPE_RISCV_CPU_BASE64;
+    mc->pci_allow_0_address = true;
+    mc->possible_cpu_arch_ids = riscv_numa_possible_cpu_arch_ids;
+    mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
+    mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
+    mc->numa_mem_supported = true;
+    mc->default_ram_id = "riscv_minic.ram";
+
+    machine_class_allow_dynamic_sysbus_dev(mc, TYPE_RAMFB_DEVICE);
+
+    object_class_property_add_str(oc, "aia-guests",
+                                  minic_get_aia_guests,
+                                  minic_set_aia_guests);
+    sprintf(str, "Set number of guest MMIO pages for AIA IMSIC. Valid value "
+                 "should be between 0 and %d.", MINIC_IRQCHIP_MAX_GUESTS);
+    object_class_property_set_description(oc, "aia-guests", str);
+}
+
+static const TypeInfo minic_machine_typeinfo = {
+    .name       = MACHINE_TYPE_NAME("minic"),
+    .parent     = TYPE_MACHINE,
+    .class_init = minic_machine_class_init,
+    .instance_init = minic_machine_instance_init,
+    .instance_size = sizeof(RISCVMinicState),
+};
+
+static void minic_machine_init_register_types(void)
+{
+    type_register_static(&minic_machine_typeinfo);
+}
+
+type_init(minic_machine_init_register_types)
diff --git a/include/hw/riscv/minic.h b/include/hw/riscv/minic.h
new file mode 100644
index 000000000000..950911abc2b9
--- /dev/null
+++ b/include/hw/riscv/minic.h
@@ -0,0 +1,65 @@
+/*
+ * QEMU RISC-V Mini Computer machine interface
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_RISCV_MINIC_H
+#define HW_RISCV_MINIC_H
+
+#include "hw/riscv/riscv_hart.h"
+#include "hw/sysbus.h"
+#include "hw/block/flash.h"
+#include "qom/object.h"
+
+#define MINIC_CPUS_MAX_BITS             9
+#define MINIC_CPUS_MAX                  (1 << MINIC_CPUS_MAX_BITS)
+#define MINIC_SOCKETS_MAX_BITS          2
+#define MINIC_SOCKETS_MAX               (1 << MINIC_SOCKETS_MAX_BITS)
+
+#define MINIC_IRQCHIP_IPI_MSI 1
+#define MINIC_IRQCHIP_NUM_MSIS 255
+#define MINIC_IRQCHIP_NUM_PRIO_BITS 3
+#define MINIC_IRQCHIP_MAX_GUESTS_BITS 3
+#define MINIC_IRQCHIP_MAX_GUESTS ((1U << MINIC_IRQCHIP_MAX_GUESTS_BITS) - 1U)
+
+#define TYPE_RISCV_MINIC_MACHINE MACHINE_TYPE_NAME("minic")
+
+typedef struct RISCVMinicState RISCVMinicState;
+DECLARE_INSTANCE_CHECKER(RISCVMinicState, RISCV_MINIC_MACHINE,
+                         TYPE_RISCV_MINIC_MACHINE)
+
+struct RISCVMinicState {
+    /*< private >*/
+    MachineState parent;
+
+    /*< public >*/
+    RISCVHartArrayState soc[MINIC_SOCKETS_MAX];
+    int fdt_size;
+    int aia_guests;
+};
+
+enum {
+    MINIC_MROM = 0,
+    MINIC_CLINT,
+    MINIC_IMSIC_M,
+    MINIC_IMSIC_S,
+    MINIC_DRAM,
+    MINIC_PCIE_MMIO,
+    MINIC_PCIE_PIO,
+    MINIC_PCIE_ECAM
+};
+
+#endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 3/3] hw/riscv: Create a new qemu machine for RISC-V
@ 2022-04-12  2:10   ` Atish Patra
  0 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-12  2:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Atish Patra, Alistair Francis, Bin Meng, Michael S. Tsirkin,
	Palmer Dabbelt, Paolo Bonzini, qemu-riscv

The RISC-V virt machine has been growing with many different commmandline
options. It has its limitations in terms of maximum number of harts that
it can support. The commandline options slowly will become bit difficult
to manage. Moreover, it always depends on the virtio framework and lot
of mmio devices.

The new MSI based interrupt controller (IMSIC) allows us to build a
minimalistic yet extensible machines without any wired interrupts. All
the devices can be behind PCI with MSI/MSI-X are only source of external
interrupts. This approach is highly scalable in terms of number of harts
and also mimics modern day PC/server machines more closely. As every device
must be behind PCI, we won't require a lot of addition to the machine.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 configs/devices/riscv64-softmmu/default.mak |   1 +
 hw/riscv/Kconfig                            |  11 +
 hw/riscv/meson.build                        |   1 +
 hw/riscv/minic.c                            | 438 ++++++++++++++++++++
 include/hw/riscv/minic.h                    |  65 +++
 5 files changed, 516 insertions(+)
 create mode 100644 hw/riscv/minic.c
 create mode 100644 include/hw/riscv/minic.h

diff --git a/configs/devices/riscv64-softmmu/default.mak b/configs/devices/riscv64-softmmu/default.mak
index bc69301fa4a6..1407c4a9fe2f 100644
--- a/configs/devices/riscv64-softmmu/default.mak
+++ b/configs/devices/riscv64-softmmu/default.mak
@@ -14,3 +14,4 @@ CONFIG_SIFIVE_U=y
 CONFIG_RISCV_VIRT=y
 CONFIG_MICROCHIP_PFSOC=y
 CONFIG_SHAKTI_C=y
+CONFIG_MINIC=y
diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index 91bb9d21c471..9eca1a6efa25 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -83,3 +83,14 @@ config SPIKE
     select MSI_NONBROKEN
     select RISCV_ACLINT
     select SIFIVE_PLIC
+
+config MINIC
+    bool
+    imply PCI_DEVICES
+    select RISCV_NUMA
+    select MSI_NONBROKEN
+    select PCI
+    select PCI_EXPRESS_GENERIC_BRIDGE
+    select SERIAL
+    select RISCV_ACLINT
+    select RISCV_IMSIC
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index b3ae84ac0539..7b1c49466e62 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -10,5 +10,6 @@ riscv_ss.add(when: 'CONFIG_SIFIVE_E', if_true: files('sifive_e.c'))
 riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
 riscv_ss.add(when: 'CONFIG_SPIKE', if_true: files('spike.c'))
 riscv_ss.add(when: 'CONFIG_MICROCHIP_PFSOC', if_true: files('microchip_pfsoc.c'))
+riscv_ss.add(when: 'CONFIG_MINIC', if_true: files('minic.c'))
 
 hw_arch += {'riscv': riscv_ss}
diff --git a/hw/riscv/minic.c b/hw/riscv/minic.c
new file mode 100644
index 000000000000..4ca707da1023
--- /dev/null
+++ b/hw/riscv/minic.c
@@ -0,0 +1,438 @@
+/*
+ * QEMU RISC-V Mini Computer
+ *
+ * Based on the minic machine implementation
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/boards.h"
+#include "hw/loader.h"
+#include "hw/sysbus.h"
+#include "hw/qdev-properties.h"
+#include "target/riscv/cpu.h"
+#include "hw/riscv/riscv_hart.h"
+#include "hw/riscv/minic.h"
+#include "hw/riscv/machine_helper.h"
+#include "hw/intc/riscv_aclint.h"
+#include "hw/intc/riscv_imsic.h"
+#include "hw/riscv/boot.h"
+#include "hw/riscv/numa.h"
+#include "chardev/char.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci-host/gpex.h"
+#include "hw/display/ramfb.h"
+
+#define MINIC_IMSIC_GROUP_MAX_SIZE      (1U << IMSIC_MMIO_GROUP_MIN_SHIFT)
+#if MINIC_IMSIC_GROUP_MAX_SIZE < \
+    IMSIC_GROUP_SIZE(MINIC_CPUS_MAX_BITS, MINIC_IRQCHIP_MAX_GUESTS_BITS)
+#error "Can't accomodate single IMSIC group in address space"
+#endif
+
+#define MINIC_IMSIC_MAX_SIZE            (MINIC_SOCKETS_MAX * \
+                                        MINIC_IMSIC_GROUP_MAX_SIZE)
+#if 0x4000000 < MINIC_IMSIC_MAX_SIZE
+#error "Can't accomodate all IMSIC groups in address space"
+#endif
+
+static const MemMapEntry minic_memmap[] = {
+    [MINIC_MROM] =        {     0x1000,        0xf000 },
+    [MINIC_CLINT] =       {  0x2000000,       0x10000 },
+    [MINIC_PCIE_PIO] =    {  0x3000000,       0x10000 },
+    [MINIC_IMSIC_M] =     { 0x24000000, MINIC_IMSIC_MAX_SIZE },
+    [MINIC_IMSIC_S] =     { 0x28000000, MINIC_IMSIC_MAX_SIZE },
+    [MINIC_PCIE_ECAM] =   { 0x30000000,    0x10000000 },
+    [MINIC_PCIE_MMIO] =   { 0x40000000,    0x40000000 },
+    [MINIC_DRAM] =        { 0x80000000,           0x0 },
+};
+
+static PcieInitData pdata;
+/* PCIe high mmio for RV64, size is fixed but base depends on top of RAM */
+#define MINIC64_HIGH_PCIE_MMIO_SIZE  (16 * GiB)
+
+static void minic_create_fdt_socket_clint(RISCVMinicState *s,
+                                    const MemMapEntry *memmap, int socket,
+                                    uint32_t *intc_phandles)
+{
+    int cpu;
+    char *clint_name;
+    uint32_t *clint_cells;
+    unsigned long clint_addr;
+    MachineState *mc = MACHINE(s);
+    static const char * const clint_compat[2] = {
+        "sifive,clint0", "riscv,clint0"
+    };
+
+    clint_cells = g_new0(uint32_t, s->soc[socket].num_harts * 4);
+
+    for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
+        clint_cells[cpu * 4 + 0] = cpu_to_be32(intc_phandles[cpu]);
+        clint_cells[cpu * 4 + 1] = cpu_to_be32(IRQ_M_SOFT);
+        clint_cells[cpu * 4 + 2] = cpu_to_be32(intc_phandles[cpu]);
+        clint_cells[cpu * 4 + 3] = cpu_to_be32(IRQ_M_TIMER);
+    }
+
+    clint_addr = memmap[MINIC_CLINT].base + (memmap[MINIC_CLINT].size * socket);
+    clint_name = g_strdup_printf("/soc/clint@%lx", clint_addr);
+    qemu_fdt_add_subnode(mc->fdt, clint_name);
+    qemu_fdt_setprop_string_array(mc->fdt, clint_name, "compatible",
+                                  (char **)&clint_compat,
+                                  ARRAY_SIZE(clint_compat));
+    qemu_fdt_setprop_cells(mc->fdt, clint_name, "reg",
+        0x0, clint_addr, 0x0, memmap[MINIC_CLINT].size);
+    qemu_fdt_setprop(mc->fdt, clint_name, "interrupts-extended",
+        clint_cells, s->soc[socket].num_harts * sizeof(uint32_t) * 4);
+    riscv_socket_fdt_write_id(mc, mc->fdt, clint_name, socket);
+    g_free(clint_name);
+
+    g_free(clint_cells);
+}
+
+static void minic_create_fdt_sockets(RISCVMinicState *s,
+                                     const MemMapEntry *memmap,
+                                     uint32_t *phandle,
+                                     uint32_t *msi_pcie_phandle)
+{
+    char *clust_name;
+    int socket, phandle_pos;
+    MachineState *mc = MACHINE(s);
+    uint32_t msi_m_phandle = 0, msi_s_phandle = 0;
+    uint32_t *intc_phandles;
+    ImsicInitData idata;
+
+    qemu_fdt_add_subnode(mc->fdt, "/cpus");
+    qemu_fdt_setprop_cell(mc->fdt, "/cpus", "timebase-frequency",
+                          RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ);
+    qemu_fdt_setprop_cell(mc->fdt, "/cpus", "#size-cells", 0x0);
+    qemu_fdt_setprop_cell(mc->fdt, "/cpus", "#address-cells", 0x1);
+    qemu_fdt_add_subnode(mc->fdt, "/cpus/cpu-map");
+
+    intc_phandles = g_new0(uint32_t, mc->smp.cpus);
+
+    phandle_pos = mc->smp.cpus;
+    for (socket = (riscv_socket_count(mc) - 1); socket >= 0; socket--) {
+        phandle_pos -= s->soc[socket].num_harts;
+
+        clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
+        qemu_fdt_add_subnode(mc->fdt, clust_name);
+
+        riscv_create_fdt_socket_cpus(mc, s->soc, socket, clust_name, phandle,
+                               false, &intc_phandles[phandle_pos]);
+
+        riscv_create_fdt_socket_memory(mc, memmap[MINIC_DRAM].base, socket);
+        minic_create_fdt_socket_clint(s, memmap, socket,
+                                      &intc_phandles[phandle_pos]);
+        g_free(clust_name);
+
+    }
+
+    idata.imsic_m.base = memmap[MINIC_IMSIC_M].base;
+    idata.imsic_m.size = memmap[MINIC_IMSIC_M].size;
+    idata.imsic_s.base = memmap[MINIC_IMSIC_S].base;
+    idata.imsic_s.size = memmap[MINIC_IMSIC_S].size;
+    idata.group_max_size = MINIC_IMSIC_GROUP_MAX_SIZE;
+    idata.num_msi = MINIC_IRQCHIP_NUM_MSIS;
+    idata.ipi_msi = MINIC_IRQCHIP_IPI_MSI;
+    idata.num_guests = s->aia_guests;
+
+    riscv_create_fdt_imsic(mc, s->soc, phandle, intc_phandles,
+                           &msi_m_phandle, &msi_s_phandle, &idata);
+    *msi_pcie_phandle = msi_s_phandle;
+
+    riscv_socket_fdt_write_distance_matrix(mc, mc->fdt);
+    g_free(intc_phandles);
+}
+
+static void copy_memmap_to_pciedata(const MemMapEntry *memmap,
+                                    PcieInitData *pdata, uint64_t ram_size)
+{
+    pdata->pcie_ecam.base =  memmap[MINIC_PCIE_ECAM].base;
+    pdata->pcie_ecam.size =  memmap[MINIC_PCIE_ECAM].size;
+    pdata->pcie_pio.base =  memmap[MINIC_PCIE_PIO].base;
+    pdata->pcie_pio.size =  memmap[MINIC_PCIE_PIO].size;
+    pdata->pcie_mmio.base =  memmap[MINIC_PCIE_MMIO].base;
+    pdata->pcie_mmio.size =  memmap[MINIC_PCIE_MMIO].size;
+    pdata->pcie_high_mmio.size  = MINIC64_HIGH_PCIE_MMIO_SIZE;
+    pdata->pcie_high_mmio.base = memmap[MINIC_DRAM].base + ram_size;
+    pdata->pcie_high_mmio.base = ROUND_UP(pdata->pcie_high_mmio.base,
+                                          pdata->pcie_high_mmio.size);
+}
+
+static void minic_create_fdt(RISCVMinicState *s, const MemMapEntry *memmap,
+                       uint64_t mem_size, const char *cmdline)
+{
+    MachineState *mc = MACHINE(s);
+    uint32_t phandle = 1, msi_pcie_phandle = 1;
+
+    if (mc->dtb) {
+        mc->fdt = load_device_tree(mc->dtb, &s->fdt_size);
+        if (!mc->fdt) {
+            error_report("load_device_tree() failed");
+            exit(1);
+        }
+        goto update_bootargs;
+    } else {
+        mc->fdt = create_device_tree(&s->fdt_size);
+        if (!mc->fdt) {
+            error_report("create_device_tree() failed");
+            exit(1);
+        }
+    }
+
+    qemu_fdt_setprop_string(mc->fdt, "/", "model", "riscv-minic,qemu");
+    qemu_fdt_setprop_string(mc->fdt, "/", "compatible", "riscv-minic");
+    qemu_fdt_setprop_cell(mc->fdt, "/", "#size-cells", 0x2);
+    qemu_fdt_setprop_cell(mc->fdt, "/", "#address-cells", 0x2);
+
+    qemu_fdt_add_subnode(mc->fdt, "/soc");
+    qemu_fdt_setprop(mc->fdt, "/soc", "ranges", NULL, 0);
+    qemu_fdt_setprop_string(mc->fdt, "/soc", "compatible", "simple-bus");
+    qemu_fdt_setprop_cell(mc->fdt, "/soc", "#size-cells", 0x2);
+    qemu_fdt_setprop_cell(mc->fdt, "/soc", "#address-cells", 0x2);
+
+    minic_create_fdt_sockets(s, memmap, &phandle, &msi_pcie_phandle);
+    qemu_fdt_add_subnode(mc->fdt, "/chosen");
+
+    copy_memmap_to_pciedata(memmap, &pdata, mc->ram_size);
+    pdata.irq_type = RISCV_IRQ_MSI_ONLY;
+    riscv_create_fdt_pcie(mc, &pdata, 0, msi_pcie_phandle);
+
+update_bootargs:
+    if (cmdline) {
+        qemu_fdt_setprop_string(mc->fdt, "/chosen", "bootargs", cmdline);
+    }
+}
+
+static void minic_create_imsic(int aia_guests,
+                               const MemMapEntry *memmap, int socket,
+                               int base_hartid, int hart_count)
+{
+    int i;
+    hwaddr addr;
+    uint32_t guest_bits;
+
+    /* Per-socket M-level IMSICs */
+    addr = memmap[MINIC_IMSIC_M].base + socket * MINIC_IMSIC_GROUP_MAX_SIZE;
+    for (i = 0; i < hart_count; i++) {
+        riscv_imsic_create(addr + i * IMSIC_HART_SIZE(0),
+                           base_hartid + i, true, 1,
+                           MINIC_IRQCHIP_NUM_MSIS);
+    }
+
+    /* Per-socket S-level IMSICs */
+    guest_bits = riscv_imsic_num_bits(aia_guests + 1);
+    addr = memmap[MINIC_IMSIC_S].base + socket * MINIC_IMSIC_GROUP_MAX_SIZE;
+    for (i = 0; i < hart_count; i++) {
+        riscv_imsic_create(addr + i * IMSIC_HART_SIZE(guest_bits),
+                           base_hartid + i, false, 1 + aia_guests,
+                           MINIC_IRQCHIP_NUM_MSIS);
+    }
+}
+
+static void minic_machine_init(MachineState *machine)
+{
+    const MemMapEntry *memmap = minic_memmap;
+    RISCVMinicState *s = RISCV_MINIC_MACHINE(machine);
+    MemoryRegion *system_memory = get_system_memory();
+    MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
+    char *soc_name;
+    target_ulong start_addr = memmap[MINIC_DRAM].base;
+    target_ulong firmware_end_addr, kernel_start_addr;
+    uint32_t fdt_load_addr;
+    uint64_t kernel_entry;
+    int i, base_hartid, hart_count;
+
+    /* Check socket count limit */
+    if (MINIC_SOCKETS_MAX < riscv_socket_count(machine)) {
+        error_report("number of sockets/nodes should be less than %d",
+            MINIC_SOCKETS_MAX);
+        exit(1);
+    }
+
+    /* Initialize sockets */
+    for (i = 0; i < riscv_socket_count(machine); i++) {
+        if (!riscv_socket_check_hartids(machine, i)) {
+            error_report("discontinuous hartids in socket%d", i);
+            exit(1);
+        }
+
+        base_hartid = riscv_socket_first_hartid(machine, i);
+        if (base_hartid < 0) {
+            error_report("can't find hartid base for socket%d", i);
+            exit(1);
+        }
+
+        hart_count = riscv_socket_hart_count(machine, i);
+        if (hart_count < 0) {
+            error_report("can't find hart count for socket%d", i);
+            exit(1);
+        }
+
+        soc_name = g_strdup_printf("soc%d", i);
+        object_initialize_child(OBJECT(machine), soc_name, &s->soc[i],
+                                TYPE_RISCV_HART_ARRAY);
+        g_free(soc_name);
+        object_property_set_str(OBJECT(&s->soc[i]), "cpu-type",
+                                machine->cpu_type, &error_abort);
+        object_property_set_int(OBJECT(&s->soc[i]), "hartid-base",
+                                base_hartid, &error_abort);
+        object_property_set_int(OBJECT(&s->soc[i]), "num-harts",
+                                hart_count, &error_abort);
+        sysbus_realize(SYS_BUS_DEVICE(&s->soc[i]), &error_abort);
+
+        /*
+         * Minic machine doesn't need M-mode software interrupt IPI device
+         * However, clint doesn't provide modularity and the existing software
+         * stack expect this address to be present with clint.
+         */
+        riscv_aclint_swi_create(
+                    memmap[MINIC_CLINT].base + i * memmap[MINIC_CLINT].size,
+                    base_hartid, hart_count, false);
+
+        /* Per-socket ACLINT MTIMER */
+        riscv_aclint_mtimer_create(memmap[MINIC_CLINT].base +
+                        i * memmap[MINIC_CLINT].size + RISCV_ACLINT_SWI_SIZE,
+                    RISCV_ACLINT_DEFAULT_MTIMER_SIZE, base_hartid, hart_count,
+                    RISCV_ACLINT_DEFAULT_MTIMECMP, RISCV_ACLINT_DEFAULT_MTIME,
+                    RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ, true);
+
+        minic_create_imsic(s->aia_guests, memmap, i, base_hartid, hart_count);
+    }
+
+    /* register system main memory (actual RAM) */
+    memory_region_add_subregion(system_memory, memmap[MINIC_DRAM].base,
+        machine->ram);
+
+    /* create device tree */
+    minic_create_fdt(s, memmap, machine->ram_size, machine->kernel_cmdline);
+
+    /* boot rom */
+    memory_region_init_rom(mask_rom, NULL, "riscv_minic_board.mrom",
+                           memmap[MINIC_MROM].size, &error_fatal);
+    memory_region_add_subregion(system_memory, memmap[MINIC_MROM].base,
+                                mask_rom);
+
+    firmware_end_addr = riscv_find_and_load_firmware(machine,
+                                    RISCV64_BIOS_BIN, start_addr, NULL);
+
+    if (machine->kernel_filename) {
+        kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc[0],
+                                                         firmware_end_addr);
+
+        kernel_entry = riscv_load_kernel(machine->kernel_filename,
+                                         kernel_start_addr, NULL);
+
+        if (machine->initrd_filename) {
+            hwaddr start;
+            hwaddr end = riscv_load_initrd(machine->initrd_filename,
+                                           machine->ram_size, kernel_entry,
+                                           &start);
+            qemu_fdt_setprop_cell(machine->fdt, "/chosen",
+                                  "linux,initrd-start", start);
+            qemu_fdt_setprop_cell(machine->fdt, "/chosen", "linux,initrd-end",
+                                  end);
+        }
+    } else {
+       /*
+        * If dynamic firmware is used, it doesn't know where is the next mode
+        * if kernel argument is not set.
+        */
+        kernel_entry = 0;
+    }
+
+    /* Compute the fdt load address in dram */
+    fdt_load_addr = riscv_load_fdt(memmap[MINIC_DRAM].base,
+                                   machine->ram_size, machine->fdt);
+    /* load the reset vector */
+    riscv_setup_rom_reset_vec(machine, &s->soc[0], start_addr,
+                              minic_memmap[MINIC_MROM].base,
+                              minic_memmap[MINIC_MROM].size, kernel_entry,
+                              fdt_load_addr, machine->fdt);
+
+    riscv_gpex_pcie_msi_init(system_memory, &pdata);
+}
+
+static void minic_machine_instance_init(Object *obj)
+{
+}
+
+static char *minic_get_aia_guests(Object *obj, Error **errp)
+{
+    RISCVMinicState *s = RISCV_MINIC_MACHINE(obj);
+    char val[32];
+
+    sprintf(val, "%d", s->aia_guests);
+    return g_strdup(val);
+}
+
+static void minic_set_aia_guests(Object *obj, const char *val, Error **errp)
+{
+    RISCVMinicState *s = RISCV_MINIC_MACHINE(obj);
+
+    s->aia_guests = atoi(val);
+    if (s->aia_guests < 0 || s->aia_guests > MINIC_IRQCHIP_MAX_GUESTS) {
+        error_setg(errp, "Invalid number of AIA IMSIC guests");
+        error_append_hint(errp, "Valid values be between 0 and %d.\n",
+                          MINIC_IRQCHIP_MAX_GUESTS);
+    }
+}
+
+static void minic_machine_class_init(ObjectClass *oc, void *data)
+{
+    char str[128];
+    MachineClass *mc = MACHINE_CLASS(oc);
+
+    mc->desc = "RISC-V Mini Computer";
+    mc->init = minic_machine_init;
+    mc->max_cpus = MINIC_CPUS_MAX;
+    mc->default_cpu_type = TYPE_RISCV_CPU_BASE64;
+    mc->pci_allow_0_address = true;
+    mc->possible_cpu_arch_ids = riscv_numa_possible_cpu_arch_ids;
+    mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
+    mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
+    mc->numa_mem_supported = true;
+    mc->default_ram_id = "riscv_minic.ram";
+
+    machine_class_allow_dynamic_sysbus_dev(mc, TYPE_RAMFB_DEVICE);
+
+    object_class_property_add_str(oc, "aia-guests",
+                                  minic_get_aia_guests,
+                                  minic_set_aia_guests);
+    sprintf(str, "Set number of guest MMIO pages for AIA IMSIC. Valid value "
+                 "should be between 0 and %d.", MINIC_IRQCHIP_MAX_GUESTS);
+    object_class_property_set_description(oc, "aia-guests", str);
+}
+
+static const TypeInfo minic_machine_typeinfo = {
+    .name       = MACHINE_TYPE_NAME("minic"),
+    .parent     = TYPE_MACHINE,
+    .class_init = minic_machine_class_init,
+    .instance_init = minic_machine_instance_init,
+    .instance_size = sizeof(RISCVMinicState),
+};
+
+static void minic_machine_init_register_types(void)
+{
+    type_register_static(&minic_machine_typeinfo);
+}
+
+type_init(minic_machine_init_register_types)
diff --git a/include/hw/riscv/minic.h b/include/hw/riscv/minic.h
new file mode 100644
index 000000000000..950911abc2b9
--- /dev/null
+++ b/include/hw/riscv/minic.h
@@ -0,0 +1,65 @@
+/*
+ * QEMU RISC-V Mini Computer machine interface
+ *
+ * Copyright (c) 2022 Rivos, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_RISCV_MINIC_H
+#define HW_RISCV_MINIC_H
+
+#include "hw/riscv/riscv_hart.h"
+#include "hw/sysbus.h"
+#include "hw/block/flash.h"
+#include "qom/object.h"
+
+#define MINIC_CPUS_MAX_BITS             9
+#define MINIC_CPUS_MAX                  (1 << MINIC_CPUS_MAX_BITS)
+#define MINIC_SOCKETS_MAX_BITS          2
+#define MINIC_SOCKETS_MAX               (1 << MINIC_SOCKETS_MAX_BITS)
+
+#define MINIC_IRQCHIP_IPI_MSI 1
+#define MINIC_IRQCHIP_NUM_MSIS 255
+#define MINIC_IRQCHIP_NUM_PRIO_BITS 3
+#define MINIC_IRQCHIP_MAX_GUESTS_BITS 3
+#define MINIC_IRQCHIP_MAX_GUESTS ((1U << MINIC_IRQCHIP_MAX_GUESTS_BITS) - 1U)
+
+#define TYPE_RISCV_MINIC_MACHINE MACHINE_TYPE_NAME("minic")
+
+typedef struct RISCVMinicState RISCVMinicState;
+DECLARE_INSTANCE_CHECKER(RISCVMinicState, RISCV_MINIC_MACHINE,
+                         TYPE_RISCV_MINIC_MACHINE)
+
+struct RISCVMinicState {
+    /*< private >*/
+    MachineState parent;
+
+    /*< public >*/
+    RISCVHartArrayState soc[MINIC_SOCKETS_MAX];
+    int fdt_size;
+    int aia_guests;
+};
+
+enum {
+    MINIC_MROM = 0,
+    MINIC_CLINT,
+    MINIC_IMSIC_M,
+    MINIC_IMSIC_S,
+    MINIC_DRAM,
+    MINIC_PCIE_MMIO,
+    MINIC_PCIE_PIO,
+    MINIC_PCIE_ECAM
+};
+
+#endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [RFC 1/3] serial: Enable MSI capablity and option
  2022-04-12  2:10   ` Atish Patra
@ 2022-04-12 15:59     ` Marc Zyngier
  -1 siblings, 0 replies; 35+ messages in thread
From: Marc Zyngier @ 2022-04-12 15:59 UTC (permalink / raw)
  To: Atish Patra
  Cc: qemu-riscv, Michael S. Tsirkin, Bin Meng, qemu-devel,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On 2022-04-12 03:10, Atish Patra wrote:
> The seria-pci device doesn't support MSI. Enable the device to provide
> MSI so that any platform with MSI support only can also use
> this serial device. MSI can be enabled by enabling the newly introduced
> device property. This will be disabled by default preserving the 
> current
> behavior of the seria-pci device.

This seems really odd. Switching to MSI implies that you now have
an edge signalling. This means that the guest will not be interrupted
again if it acks the MSI and doesn't service the device, as you'd
expect for a level interrupt (which is what the device generates today).

 From what I understand of the patch, you signal a MSI on each
transition of the device state, which is equally odd (you get
an interrupt even where the device goes idle?).

While this may work for some guests, this completely changes the
semantics of the device. You may want to at least document the new
behaviour.

Thanks,

         M.

> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  hw/char/serial-pci.c | 36 +++++++++++++++++++++++++++++++++---
>  1 file changed, 33 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
> index 93d6f9924425..ca93c2ce2776 100644
> --- a/hw/char/serial-pci.c
> +++ b/hw/char/serial-pci.c
> @@ -31,6 +31,7 @@
>  #include "hw/char/serial.h"
>  #include "hw/irq.h"
>  #include "hw/pci/pci.h"
> +#include "hw/pci/msi.h"
>  #include "hw/qdev-properties.h"
>  #include "migration/vmstate.h"
>  #include "qom/object.h"
> @@ -39,26 +40,54 @@ struct PCISerialState {
>      PCIDevice dev;
>      SerialState state;
>      uint8_t prog_if;
> +    bool msi_enabled;
>  };
> 
>  #define TYPE_PCI_SERIAL "pci-serial"
>  OBJECT_DECLARE_SIMPLE_TYPE(PCISerialState, PCI_SERIAL)
> 
> +
> +static void msi_irq_handler(void *opaque, int irq_num, int level)
> +{
> +    PCIDevice *pci_dev = opaque;
> +
> +    assert(level == 0 || level == 1);
> +
> +    if (msi_enabled(pci_dev)) {
> +        msi_notify(pci_dev, 0);
> +    }
> +}
> +
>  static void serial_pci_realize(PCIDevice *dev, Error **errp)
>  {
>      PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
>      SerialState *s = &pci->state;
> +    Error *err = NULL;
> +    int ret;
> 
>      if (!qdev_realize(DEVICE(s), NULL, errp)) {
>          return;
>      }
> 
>      pci->dev.config[PCI_CLASS_PROG] = pci->prog_if;
> -    pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
> -    s->irq = pci_allocate_irq(&pci->dev);
> -
> +    if (pci->msi_enabled) {
> +        pci->dev.config[PCI_INTERRUPT_PIN] = 0x00;
> +        s->irq = qemu_allocate_irq(msi_irq_handler, &pci->dev, 100);
> +    } else {
> +        pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
> +        s->irq = pci_allocate_irq(&pci->dev);
> +    }
>      memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s, 
> "serial", 8);
>      pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
> +
> +    if (!pci->msi_enabled) {
> +        return;
> +    }
> +
> +    ret = msi_init(&pci->dev, 0, 1, true, false, &err);
> +    if (ret == -ENOTSUP) {
> +        fprintf(stdout, "MSIX INIT FAILED\n");
> +    }
>  }
> 
>  static void serial_pci_exit(PCIDevice *dev)
> @@ -83,6 +112,7 @@ static const VMStateDescription vmstate_pci_serial = 
> {
> 
>  static Property serial_pci_properties[] = {
>      DEFINE_PROP_UINT8("prog_if",  PCISerialState, prog_if, 0x02),
> +    DEFINE_PROP_BOOL("msi",  PCISerialState, msi_enabled, false),
>      DEFINE_PROP_END_OF_LIST(),
>  };

-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 1/3] serial: Enable MSI capablity and option
@ 2022-04-12 15:59     ` Marc Zyngier
  0 siblings, 0 replies; 35+ messages in thread
From: Marc Zyngier @ 2022-04-12 15:59 UTC (permalink / raw)
  To: Atish Patra
  Cc: qemu-devel, qemu-riscv, Michael S. Tsirkin, Bin Meng,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On 2022-04-12 03:10, Atish Patra wrote:
> The seria-pci device doesn't support MSI. Enable the device to provide
> MSI so that any platform with MSI support only can also use
> this serial device. MSI can be enabled by enabling the newly introduced
> device property. This will be disabled by default preserving the 
> current
> behavior of the seria-pci device.

This seems really odd. Switching to MSI implies that you now have
an edge signalling. This means that the guest will not be interrupted
again if it acks the MSI and doesn't service the device, as you'd
expect for a level interrupt (which is what the device generates today).

 From what I understand of the patch, you signal a MSI on each
transition of the device state, which is equally odd (you get
an interrupt even where the device goes idle?).

While this may work for some guests, this completely changes the
semantics of the device. You may want to at least document the new
behaviour.

Thanks,

         M.

> 
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  hw/char/serial-pci.c | 36 +++++++++++++++++++++++++++++++++---
>  1 file changed, 33 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
> index 93d6f9924425..ca93c2ce2776 100644
> --- a/hw/char/serial-pci.c
> +++ b/hw/char/serial-pci.c
> @@ -31,6 +31,7 @@
>  #include "hw/char/serial.h"
>  #include "hw/irq.h"
>  #include "hw/pci/pci.h"
> +#include "hw/pci/msi.h"
>  #include "hw/qdev-properties.h"
>  #include "migration/vmstate.h"
>  #include "qom/object.h"
> @@ -39,26 +40,54 @@ struct PCISerialState {
>      PCIDevice dev;
>      SerialState state;
>      uint8_t prog_if;
> +    bool msi_enabled;
>  };
> 
>  #define TYPE_PCI_SERIAL "pci-serial"
>  OBJECT_DECLARE_SIMPLE_TYPE(PCISerialState, PCI_SERIAL)
> 
> +
> +static void msi_irq_handler(void *opaque, int irq_num, int level)
> +{
> +    PCIDevice *pci_dev = opaque;
> +
> +    assert(level == 0 || level == 1);
> +
> +    if (msi_enabled(pci_dev)) {
> +        msi_notify(pci_dev, 0);
> +    }
> +}
> +
>  static void serial_pci_realize(PCIDevice *dev, Error **errp)
>  {
>      PCISerialState *pci = DO_UPCAST(PCISerialState, dev, dev);
>      SerialState *s = &pci->state;
> +    Error *err = NULL;
> +    int ret;
> 
>      if (!qdev_realize(DEVICE(s), NULL, errp)) {
>          return;
>      }
> 
>      pci->dev.config[PCI_CLASS_PROG] = pci->prog_if;
> -    pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
> -    s->irq = pci_allocate_irq(&pci->dev);
> -
> +    if (pci->msi_enabled) {
> +        pci->dev.config[PCI_INTERRUPT_PIN] = 0x00;
> +        s->irq = qemu_allocate_irq(msi_irq_handler, &pci->dev, 100);
> +    } else {
> +        pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
> +        s->irq = pci_allocate_irq(&pci->dev);
> +    }
>      memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s, 
> "serial", 8);
>      pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
> +
> +    if (!pci->msi_enabled) {
> +        return;
> +    }
> +
> +    ret = msi_init(&pci->dev, 0, 1, true, false, &err);
> +    if (ret == -ENOTSUP) {
> +        fprintf(stdout, "MSIX INIT FAILED\n");
> +    }
>  }
> 
>  static void serial_pci_exit(PCIDevice *dev)
> @@ -83,6 +112,7 @@ static const VMStateDescription vmstate_pci_serial = 
> {
> 
>  static Property serial_pci_properties[] = {
>      DEFINE_PROP_UINT8("prog_if",  PCISerialState, prog_if, 0x02),
> +    DEFINE_PROP_BOOL("msi",  PCISerialState, msi_enabled, false),
>      DEFINE_PROP_END_OF_LIST(),
>  };

-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-04-12  2:10 ` Atish Patra
@ 2022-04-19 16:51   ` Daniel P. Berrangé
  -1 siblings, 0 replies; 35+ messages in thread
From: Daniel P. Berrangé @ 2022-04-19 16:51 UTC (permalink / raw)
  To: Atish Patra
  Cc: qemu-riscv, Michael S. Tsirkin, Bin Meng, qemu-devel,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> 
> The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> rapid pace even in absense of the real hardware. It is definitely commendable.
> However, the number of devices & commandline options keeps growing as a result
> of that as well. That adds flexibility but will also become bit difficult
> to manage in the future as more extension support will be added. As it is the
> most commonly used qemu machine, it needs to support all kinds of device and
> interrupts as well. Moreover, virt machine has limitations on the maximum
> number of harts it can support because of all the MMIO devices it has to support.
> 
> The RISC-V IMSIC specification allows to develop machines completely relying
> on MSI and don't care about the wired interrupts at all. It just requires
> all the devices to be present behind a PCI bus or present themselves as platform
> MSI device. The former is a more common scenario in x86 world where most
> of the devices are behind PCI bus. As there is very limited MMIO device
> support, it can also scale to very large number of harts.
> 
> That's why, this patch series introduces a minimalistic yet very extensible
> forward looking machine called as "RISC-V Mini Computer" or "minic". The
> idea is to build PC or server like systems with this machine. The machine can
> work with or without virtio framework. The current implementation only
> supports RV64. I am not sure if building a RV32 machine would be of interest
> for such machines. The only mmio device it requires is clint to emulate
> the mtimecmp.

I would ask what you see as the long term future usage for 'virt' vs
'minic' machine types ? Would you expect all existing users of 'virt'
to ultimately switch to 'minic', or are there distinct non-overlapping
use cases for 'virt' vs 'minic' such that both end up widely used ?

Is 'minic' intended to be able to mimic real physical hardware at all,
or is it still intended as a purely virtual machine, like a 'virt-ng' ?

Essentially 'virt' was positioned as the standard machine to use if
you want to run a virtual machine, without any particular match to
physical hardware. It feels like 'minic' is creating a second machine
type to fill the same purpose, so how do users decide which to use ? 

> "Naming is hard". I am not too attached with the name "minic". 
> I just chose least bad one out of the few on my mind :). I am definitely
> open to any other name as well. 
> 
> The other alternative to provide MSI only option to aia in the 
> existing virt machine to build MSI only machines. This is certainly doable
> and here is the patch that supports that kind of setup.
> 
> https://github.com/atishp04/qemu/tree/virt_imsic_only
> 
> However, it even complicates the virt machine even further with additional
> command line option, branches in the code. I believe virt machine will become
> very complex if we continue this path. I am interested to learn what everyone
> else think.
> 
> It is needless to say that the current version of minic machine
> is inspired from virt machine and tries to reuse as much as code possible.
> The first patch in this series adds MSI support for serial-pci device so
> console can work on such a machine. The 2nd patch moves some common functions
> between minic and the virt machine to a helper file. The PATCH3 actually
> implements the new minic machine.
> 
> I have not added the fw-cfg/flash support. We probably should add those
> but I just wanted to start small and get the feedback first.
> This is a work in progress and have few more TODO items before becoming the
> new world order :)
> 
> 1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
> for now.
> 2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
> 3. Add MSI-X support for serial-pci device.
> 
> This series can boot Linux distros with the minic machine with or without virtio
> devices with out-of-tree Linux kernel patches[1]. Here is an example commandline 
> 
> Without virtio devices (nvme, serial-pci & e1000e):
> =====================================================
> /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> -chardev stdio,mux=on,signal=off,id=charconsole0 \
> -mon chardev=charconsole0,mode=readline \
> -device pci-serial,msi=true,chardev=charconsole0 \
> -drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
> -device nvme,serial=deadbeef,drive=disk3 \
> -netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
> -append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s
> 
> With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
> ==================================================================
> /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> -chardev stdio,mux=on,signal=off,id=charconsole0 \
> -mon chardev=charconsole0,mode=readline \
> -device pci-serial,msi=true,chardev=charconsole0 \
> -drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
> -device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
> -netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
> -append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'
> 
> The objective of this series is to engage the community to solve this problem.
> Please suggest if you have another alternatve solution.
> 
> [1] https://github.com/atishp04/linux/tree/msi_only_console 
> 
> Atish Patra (3):
> serial: Enable MSI capablity and option
> hw/riscv: virt: Move common functions to a separate helper file
> hw/riscv: Create a new qemu machine for RISC-V
> 
> configs/devices/riscv64-softmmu/default.mak |   1 +
> hw/char/serial-pci.c                        |  36 +-
> hw/riscv/Kconfig                            |  11 +
> hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
> hw/riscv/meson.build                        |   2 +
> hw/riscv/minic.c                            | 438 ++++++++++++++++++++
> hw/riscv/virt.c                             | 403 ++----------------
> include/hw/riscv/machine_helper.h           |  87 ++++
> include/hw/riscv/minic.h                    |  65 +++
> include/hw/riscv/virt.h                     |  13 -
> 10 files changed, 1090 insertions(+), 383 deletions(-)
> create mode 100644 hw/riscv/machine_helper.c
> create mode 100644 hw/riscv/minic.c
> create mode 100644 include/hw/riscv/machine_helper.h
> create mode 100644 include/hw/riscv/minic.h
> 
> --
> 2.25.1
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
@ 2022-04-19 16:51   ` Daniel P. Berrangé
  0 siblings, 0 replies; 35+ messages in thread
From: Daniel P. Berrangé @ 2022-04-19 16:51 UTC (permalink / raw)
  To: Atish Patra
  Cc: qemu-devel, qemu-riscv, Michael S. Tsirkin, Bin Meng,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> 
> The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> rapid pace even in absense of the real hardware. It is definitely commendable.
> However, the number of devices & commandline options keeps growing as a result
> of that as well. That adds flexibility but will also become bit difficult
> to manage in the future as more extension support will be added. As it is the
> most commonly used qemu machine, it needs to support all kinds of device and
> interrupts as well. Moreover, virt machine has limitations on the maximum
> number of harts it can support because of all the MMIO devices it has to support.
> 
> The RISC-V IMSIC specification allows to develop machines completely relying
> on MSI and don't care about the wired interrupts at all. It just requires
> all the devices to be present behind a PCI bus or present themselves as platform
> MSI device. The former is a more common scenario in x86 world where most
> of the devices are behind PCI bus. As there is very limited MMIO device
> support, it can also scale to very large number of harts.
> 
> That's why, this patch series introduces a minimalistic yet very extensible
> forward looking machine called as "RISC-V Mini Computer" or "minic". The
> idea is to build PC or server like systems with this machine. The machine can
> work with or without virtio framework. The current implementation only
> supports RV64. I am not sure if building a RV32 machine would be of interest
> for such machines. The only mmio device it requires is clint to emulate
> the mtimecmp.

I would ask what you see as the long term future usage for 'virt' vs
'minic' machine types ? Would you expect all existing users of 'virt'
to ultimately switch to 'minic', or are there distinct non-overlapping
use cases for 'virt' vs 'minic' such that both end up widely used ?

Is 'minic' intended to be able to mimic real physical hardware at all,
or is it still intended as a purely virtual machine, like a 'virt-ng' ?

Essentially 'virt' was positioned as the standard machine to use if
you want to run a virtual machine, without any particular match to
physical hardware. It feels like 'minic' is creating a second machine
type to fill the same purpose, so how do users decide which to use ? 

> "Naming is hard". I am not too attached with the name "minic". 
> I just chose least bad one out of the few on my mind :). I am definitely
> open to any other name as well. 
> 
> The other alternative to provide MSI only option to aia in the 
> existing virt machine to build MSI only machines. This is certainly doable
> and here is the patch that supports that kind of setup.
> 
> https://github.com/atishp04/qemu/tree/virt_imsic_only
> 
> However, it even complicates the virt machine even further with additional
> command line option, branches in the code. I believe virt machine will become
> very complex if we continue this path. I am interested to learn what everyone
> else think.
> 
> It is needless to say that the current version of minic machine
> is inspired from virt machine and tries to reuse as much as code possible.
> The first patch in this series adds MSI support for serial-pci device so
> console can work on such a machine. The 2nd patch moves some common functions
> between minic and the virt machine to a helper file. The PATCH3 actually
> implements the new minic machine.
> 
> I have not added the fw-cfg/flash support. We probably should add those
> but I just wanted to start small and get the feedback first.
> This is a work in progress and have few more TODO items before becoming the
> new world order :)
> 
> 1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
> for now.
> 2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
> 3. Add MSI-X support for serial-pci device.
> 
> This series can boot Linux distros with the minic machine with or without virtio
> devices with out-of-tree Linux kernel patches[1]. Here is an example commandline 
> 
> Without virtio devices (nvme, serial-pci & e1000e):
> =====================================================
> /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> -chardev stdio,mux=on,signal=off,id=charconsole0 \
> -mon chardev=charconsole0,mode=readline \
> -device pci-serial,msi=true,chardev=charconsole0 \
> -drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
> -device nvme,serial=deadbeef,drive=disk3 \
> -netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
> -append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s
> 
> With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
> ==================================================================
> /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> -chardev stdio,mux=on,signal=off,id=charconsole0 \
> -mon chardev=charconsole0,mode=readline \
> -device pci-serial,msi=true,chardev=charconsole0 \
> -drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
> -device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
> -netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
> -append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'
> 
> The objective of this series is to engage the community to solve this problem.
> Please suggest if you have another alternatve solution.
> 
> [1] https://github.com/atishp04/linux/tree/msi_only_console 
> 
> Atish Patra (3):
> serial: Enable MSI capablity and option
> hw/riscv: virt: Move common functions to a separate helper file
> hw/riscv: Create a new qemu machine for RISC-V
> 
> configs/devices/riscv64-softmmu/default.mak |   1 +
> hw/char/serial-pci.c                        |  36 +-
> hw/riscv/Kconfig                            |  11 +
> hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
> hw/riscv/meson.build                        |   2 +
> hw/riscv/minic.c                            | 438 ++++++++++++++++++++
> hw/riscv/virt.c                             | 403 ++----------------
> include/hw/riscv/machine_helper.h           |  87 ++++
> include/hw/riscv/minic.h                    |  65 +++
> include/hw/riscv/virt.h                     |  13 -
> 10 files changed, 1090 insertions(+), 383 deletions(-)
> create mode 100644 hw/riscv/machine_helper.c
> create mode 100644 hw/riscv/minic.c
> create mode 100644 include/hw/riscv/machine_helper.h
> create mode 100644 include/hw/riscv/minic.h
> 
> --
> 2.25.1
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-04-19 16:51   ` Daniel P. Berrangé
@ 2022-04-20  0:26     ` Atish Patra
  -1 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-20  0:26 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: open list:RISC-V, Michael S. Tsirkin, Bin Meng, Atish Patra,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> >
> > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > rapid pace even in absense of the real hardware. It is definitely commendable.
> > However, the number of devices & commandline options keeps growing as a result
> > of that as well. That adds flexibility but will also become bit difficult
> > to manage in the future as more extension support will be added. As it is the
> > most commonly used qemu machine, it needs to support all kinds of device and
> > interrupts as well. Moreover, virt machine has limitations on the maximum
> > number of harts it can support because of all the MMIO devices it has to support.
> >
> > The RISC-V IMSIC specification allows to develop machines completely relying
> > on MSI and don't care about the wired interrupts at all. It just requires
> > all the devices to be present behind a PCI bus or present themselves as platform
> > MSI device. The former is a more common scenario in x86 world where most
> > of the devices are behind PCI bus. As there is very limited MMIO device
> > support, it can also scale to very large number of harts.
> >
> > That's why, this patch series introduces a minimalistic yet very extensible
> > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > idea is to build PC or server like systems with this machine. The machine can
> > work with or without virtio framework. The current implementation only
> > supports RV64. I am not sure if building a RV32 machine would be of interest
> > for such machines. The only mmio device it requires is clint to emulate
> > the mtimecmp.
>
> I would ask what you see as the long term future usage for 'virt' vs
> 'minic' machine types ? Would you expect all existing users of 'virt'
> to ultimately switch to 'minic', or are there distinct non-overlapping
> use cases for 'virt' vs 'minic' such that both end up widely used ?
>

Nope. I don't expect existing 'virt' users to switch to 'minic' as
they aim to cater to different users.

Here are the major differences
1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
2. virt machine doesn't support the MSI only option yet (can be added
though[1]). Minic does.
3. Number of cpu supported by virt machine are limited because of the
MMIO devices. Minic can scale to very
large numbers of cpu.
4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
mandatory requirement for 'minic' while
it is optional for 'virt'.

'Minic' aims towards the users who want to create virtual machines
that are MSI based and don't care about
a million options that virt machines provide.  Virt machine is more
complex so that it can be flexible in terms of
what it supports. Minic is a minimalistic machine which doesn't need
to be expanded a lot in the future given that
most of the devices can be behind PCI.

[1] https://github.com/atishp04/qemu/tree/virt_imsic_only

> Is 'minic' intended to be able to mimic real physical hardware at all,
> or is it still intended as a purely virtual machine, like a 'virt-ng' ?
>

Any future hardware that relies only on PCI-MSI based devices, they
can be created on top of minic.
At that point, minic will provide a useful abstract for all those
machines as well. minic doesn't need a virtio framework.
Thus, it can closely emulate such hardware as well.

> Essentially 'virt' was positioned as the standard machine to use if
> you want to run a virtual machine, without any particular match to
> physical hardware. It feels like 'minic' is creating a second machine
> type to fill the same purpose, so how do users decide which to use ?
>

I envision 'minic' to be a standard machine for a specific set of user
requirements (x86 style PCI based
machines). Virt machine will continue to be a standard machine for
more generic use cases with MMIO devices.

> > "Naming is hard". I am not too attached with the name "minic".
> > I just chose least bad one out of the few on my mind :). I am definitely
> > open to any other name as well.
> >
> > The other alternative to provide MSI only option to aia in the
> > existing virt machine to build MSI only machines. This is certainly doable
> > and here is the patch that supports that kind of setup.
> >
> > https://github.com/atishp04/qemu/tree/virt_imsic_only
> >
> > However, it even complicates the virt machine even further with additional
> > command line option, branches in the code. I believe virt machine will become
> > very complex if we continue this path. I am interested to learn what everyone
> > else think.
> >
> > It is needless to say that the current version of minic machine
> > is inspired from virt machine and tries to reuse as much as code possible.
> > The first patch in this series adds MSI support for serial-pci device so
> > console can work on such a machine. The 2nd patch moves some common functions
> > between minic and the virt machine to a helper file. The PATCH3 actually
> > implements the new minic machine.
> >
> > I have not added the fw-cfg/flash support. We probably should add those
> > but I just wanted to start small and get the feedback first.
> > This is a work in progress and have few more TODO items before becoming the
> > new world order :)
> >
> > 1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
> > for now.
> > 2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
> > 3. Add MSI-X support for serial-pci device.
> >
> > This series can boot Linux distros with the minic machine with or without virtio
> > devices with out-of-tree Linux kernel patches[1]. Here is an example commandline
> >
> > Without virtio devices (nvme, serial-pci & e1000e):
> > =====================================================
> > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > -mon chardev=charconsole0,mode=readline \
> > -device pci-serial,msi=true,chardev=charconsole0 \
> > -drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
> > -device nvme,serial=deadbeef,drive=disk3 \
> > -netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
> > -append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s
> >
> > With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
> > ==================================================================
> > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > -mon chardev=charconsole0,mode=readline \
> > -device pci-serial,msi=true,chardev=charconsole0 \
> > -drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
> > -device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
> > -netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
> > -append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'
> >
> > The objective of this series is to engage the community to solve this problem.
> > Please suggest if you have another alternatve solution.
> >
> > [1] https://github.com/atishp04/linux/tree/msi_only_console
> >
> > Atish Patra (3):
> > serial: Enable MSI capablity and option
> > hw/riscv: virt: Move common functions to a separate helper file
> > hw/riscv: Create a new qemu machine for RISC-V
> >
> > configs/devices/riscv64-softmmu/default.mak |   1 +
> > hw/char/serial-pci.c                        |  36 +-
> > hw/riscv/Kconfig                            |  11 +
> > hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
> > hw/riscv/meson.build                        |   2 +
> > hw/riscv/minic.c                            | 438 ++++++++++++++++++++
> > hw/riscv/virt.c                             | 403 ++----------------
> > include/hw/riscv/machine_helper.h           |  87 ++++
> > include/hw/riscv/minic.h                    |  65 +++
> > include/hw/riscv/virt.h                     |  13 -
> > 10 files changed, 1090 insertions(+), 383 deletions(-)
> > create mode 100644 hw/riscv/machine_helper.c
> > create mode 100644 hw/riscv/minic.c
> > create mode 100644 include/hw/riscv/machine_helper.h
> > create mode 100644 include/hw/riscv/minic.h
> >
> > --
> > 2.25.1
> >
> >
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
>


-- 
Regards,
Atish


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
@ 2022-04-20  0:26     ` Atish Patra
  0 siblings, 0 replies; 35+ messages in thread
From: Atish Patra @ 2022-04-20  0:26 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Atish Patra, open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> >
> > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > rapid pace even in absense of the real hardware. It is definitely commendable.
> > However, the number of devices & commandline options keeps growing as a result
> > of that as well. That adds flexibility but will also become bit difficult
> > to manage in the future as more extension support will be added. As it is the
> > most commonly used qemu machine, it needs to support all kinds of device and
> > interrupts as well. Moreover, virt machine has limitations on the maximum
> > number of harts it can support because of all the MMIO devices it has to support.
> >
> > The RISC-V IMSIC specification allows to develop machines completely relying
> > on MSI and don't care about the wired interrupts at all. It just requires
> > all the devices to be present behind a PCI bus or present themselves as platform
> > MSI device. The former is a more common scenario in x86 world where most
> > of the devices are behind PCI bus. As there is very limited MMIO device
> > support, it can also scale to very large number of harts.
> >
> > That's why, this patch series introduces a minimalistic yet very extensible
> > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > idea is to build PC or server like systems with this machine. The machine can
> > work with or without virtio framework. The current implementation only
> > supports RV64. I am not sure if building a RV32 machine would be of interest
> > for such machines. The only mmio device it requires is clint to emulate
> > the mtimecmp.
>
> I would ask what you see as the long term future usage for 'virt' vs
> 'minic' machine types ? Would you expect all existing users of 'virt'
> to ultimately switch to 'minic', or are there distinct non-overlapping
> use cases for 'virt' vs 'minic' such that both end up widely used ?
>

Nope. I don't expect existing 'virt' users to switch to 'minic' as
they aim to cater to different users.

Here are the major differences
1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
2. virt machine doesn't support the MSI only option yet (can be added
though[1]). Minic does.
3. Number of cpu supported by virt machine are limited because of the
MMIO devices. Minic can scale to very
large numbers of cpu.
4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
mandatory requirement for 'minic' while
it is optional for 'virt'.

'Minic' aims towards the users who want to create virtual machines
that are MSI based and don't care about
a million options that virt machines provide.  Virt machine is more
complex so that it can be flexible in terms of
what it supports. Minic is a minimalistic machine which doesn't need
to be expanded a lot in the future given that
most of the devices can be behind PCI.

[1] https://github.com/atishp04/qemu/tree/virt_imsic_only

> Is 'minic' intended to be able to mimic real physical hardware at all,
> or is it still intended as a purely virtual machine, like a 'virt-ng' ?
>

Any future hardware that relies only on PCI-MSI based devices, they
can be created on top of minic.
At that point, minic will provide a useful abstract for all those
machines as well. minic doesn't need a virtio framework.
Thus, it can closely emulate such hardware as well.

> Essentially 'virt' was positioned as the standard machine to use if
> you want to run a virtual machine, without any particular match to
> physical hardware. It feels like 'minic' is creating a second machine
> type to fill the same purpose, so how do users decide which to use ?
>

I envision 'minic' to be a standard machine for a specific set of user
requirements (x86 style PCI based
machines). Virt machine will continue to be a standard machine for
more generic use cases with MMIO devices.

> > "Naming is hard". I am not too attached with the name "minic".
> > I just chose least bad one out of the few on my mind :). I am definitely
> > open to any other name as well.
> >
> > The other alternative to provide MSI only option to aia in the
> > existing virt machine to build MSI only machines. This is certainly doable
> > and here is the patch that supports that kind of setup.
> >
> > https://github.com/atishp04/qemu/tree/virt_imsic_only
> >
> > However, it even complicates the virt machine even further with additional
> > command line option, branches in the code. I believe virt machine will become
> > very complex if we continue this path. I am interested to learn what everyone
> > else think.
> >
> > It is needless to say that the current version of minic machine
> > is inspired from virt machine and tries to reuse as much as code possible.
> > The first patch in this series adds MSI support for serial-pci device so
> > console can work on such a machine. The 2nd patch moves some common functions
> > between minic and the virt machine to a helper file. The PATCH3 actually
> > implements the new minic machine.
> >
> > I have not added the fw-cfg/flash support. We probably should add those
> > but I just wanted to start small and get the feedback first.
> > This is a work in progress and have few more TODO items before becoming the
> > new world order :)
> >
> > 1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
> > for now.
> > 2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
> > 3. Add MSI-X support for serial-pci device.
> >
> > This series can boot Linux distros with the minic machine with or without virtio
> > devices with out-of-tree Linux kernel patches[1]. Here is an example commandline
> >
> > Without virtio devices (nvme, serial-pci & e1000e):
> > =====================================================
> > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > -mon chardev=charconsole0,mode=readline \
> > -device pci-serial,msi=true,chardev=charconsole0 \
> > -drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
> > -device nvme,serial=deadbeef,drive=disk3 \
> > -netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
> > -append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s
> >
> > With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
> > ==================================================================
> > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > -mon chardev=charconsole0,mode=readline \
> > -device pci-serial,msi=true,chardev=charconsole0 \
> > -drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
> > -device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
> > -netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
> > -append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'
> >
> > The objective of this series is to engage the community to solve this problem.
> > Please suggest if you have another alternatve solution.
> >
> > [1] https://github.com/atishp04/linux/tree/msi_only_console
> >
> > Atish Patra (3):
> > serial: Enable MSI capablity and option
> > hw/riscv: virt: Move common functions to a separate helper file
> > hw/riscv: Create a new qemu machine for RISC-V
> >
> > configs/devices/riscv64-softmmu/default.mak |   1 +
> > hw/char/serial-pci.c                        |  36 +-
> > hw/riscv/Kconfig                            |  11 +
> > hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
> > hw/riscv/meson.build                        |   2 +
> > hw/riscv/minic.c                            | 438 ++++++++++++++++++++
> > hw/riscv/virt.c                             | 403 ++----------------
> > include/hw/riscv/machine_helper.h           |  87 ++++
> > include/hw/riscv/minic.h                    |  65 +++
> > include/hw/riscv/virt.h                     |  13 -
> > 10 files changed, 1090 insertions(+), 383 deletions(-)
> > create mode 100644 hw/riscv/machine_helper.c
> > create mode 100644 hw/riscv/minic.c
> > create mode 100644 include/hw/riscv/machine_helper.h
> > create mode 100644 include/hw/riscv/minic.h
> >
> > --
> > 2.25.1
> >
> >
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
>


-- 
Regards,
Atish


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-04-20  0:26     ` Atish Patra
  (?)
@ 2022-05-03  7:22     ` Atish Patra
  2022-05-05  9:36       ` Alistair Francis
  -1 siblings, 1 reply; 35+ messages in thread
From: Atish Patra @ 2022-05-03  7:22 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Atish Patra, open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > >
> > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > However, the number of devices & commandline options keeps growing as a result
> > > of that as well. That adds flexibility but will also become bit difficult
> > > to manage in the future as more extension support will be added. As it is the
> > > most commonly used qemu machine, it needs to support all kinds of device and
> > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > number of harts it can support because of all the MMIO devices it has to support.
> > >
> > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > on MSI and don't care about the wired interrupts at all. It just requires
> > > all the devices to be present behind a PCI bus or present themselves as platform
> > > MSI device. The former is a more common scenario in x86 world where most
> > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > support, it can also scale to very large number of harts.
> > >
> > > That's why, this patch series introduces a minimalistic yet very extensible
> > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > idea is to build PC or server like systems with this machine. The machine can
> > > work with or without virtio framework. The current implementation only
> > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > for such machines. The only mmio device it requires is clint to emulate
> > > the mtimecmp.
> >

Any other thoughts ?

> > I would ask what you see as the long term future usage for 'virt' vs
> > 'minic' machine types ? Would you expect all existing users of 'virt'
> > to ultimately switch to 'minic', or are there distinct non-overlapping
> > use cases for 'virt' vs 'minic' such that both end up widely used ?
> >
>
> Nope. I don't expect existing 'virt' users to switch to 'minic' as
> they aim to cater to different users.
>
> Here are the major differences
> 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
> 2. virt machine doesn't support the MSI only option yet (can be added
> though[1]). Minic does.
> 3. Number of cpu supported by virt machine are limited because of the
> MMIO devices. Minic can scale to very
> large numbers of cpu.
> 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> mandatory requirement for 'minic' while
> it is optional for 'virt'.
>
> 'Minic' aims towards the users who want to create virtual machines
> that are MSI based and don't care about
> a million options that virt machines provide.  Virt machine is more
> complex so that it can be flexible in terms of
> what it supports. Minic is a minimalistic machine which doesn't need
> to be expanded a lot in the future given that
> most of the devices can be behind PCI.
>
> [1] https://github.com/atishp04/qemu/tree/virt_imsic_only
>
> > Is 'minic' intended to be able to mimic real physical hardware at all,
> > or is it still intended as a purely virtual machine, like a 'virt-ng' ?
> >
>
> Any future hardware that relies only on PCI-MSI based devices, they
> can be created on top of minic.
> At that point, minic will provide a useful abstract for all those
> machines as well. minic doesn't need a virtio framework.
> Thus, it can closely emulate such hardware as well.
>
> > Essentially 'virt' was positioned as the standard machine to use if
> > you want to run a virtual machine, without any particular match to
> > physical hardware. It feels like 'minic' is creating a second machine
> > type to fill the same purpose, so how do users decide which to use ?
> >
>
> I envision 'minic' to be a standard machine for a specific set of user
> requirements (x86 style PCI based
> machines). Virt machine will continue to be a standard machine for
> more generic use cases with MMIO devices.
>
> > > "Naming is hard". I am not too attached with the name "minic".
> > > I just chose least bad one out of the few on my mind :). I am definitely
> > > open to any other name as well.
> > >
> > > The other alternative to provide MSI only option to aia in the
> > > existing virt machine to build MSI only machines. This is certainly doable
> > > and here is the patch that supports that kind of setup.
> > >
> > > https://github.com/atishp04/qemu/tree/virt_imsic_only
> > >
> > > However, it even complicates the virt machine even further with additional
> > > command line option, branches in the code. I believe virt machine will become
> > > very complex if we continue this path. I am interested to learn what everyone
> > > else think.
> > >
> > > It is needless to say that the current version of minic machine
> > > is inspired from virt machine and tries to reuse as much as code possible.
> > > The first patch in this series adds MSI support for serial-pci device so
> > > console can work on such a machine. The 2nd patch moves some common functions
> > > between minic and the virt machine to a helper file. The PATCH3 actually
> > > implements the new minic machine.
> > >
> > > I have not added the fw-cfg/flash support. We probably should add those
> > > but I just wanted to start small and get the feedback first.
> > > This is a work in progress and have few more TODO items before becoming the
> > > new world order :)
> > >
> > > 1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
> > > for now.
> > > 2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
> > > 3. Add MSI-X support for serial-pci device.
> > >
> > > This series can boot Linux distros with the minic machine with or without virtio
> > > devices with out-of-tree Linux kernel patches[1]. Here is an example commandline
> > >
> > > Without virtio devices (nvme, serial-pci & e1000e):
> > > =====================================================
> > > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > > -mon chardev=charconsole0,mode=readline \
> > > -device pci-serial,msi=true,chardev=charconsole0 \
> > > -drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
> > > -device nvme,serial=deadbeef,drive=disk3 \
> > > -netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
> > > -append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s
> > >
> > > With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
> > > ==================================================================
> > > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > > -mon chardev=charconsole0,mode=readline \
> > > -device pci-serial,msi=true,chardev=charconsole0 \
> > > -drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
> > > -device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
> > > -netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
> > > -append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'
> > >
> > > The objective of this series is to engage the community to solve this problem.
> > > Please suggest if you have another alternatve solution.
> > >
> > > [1] https://github.com/atishp04/linux/tree/msi_only_console
> > >
> > > Atish Patra (3):
> > > serial: Enable MSI capablity and option
> > > hw/riscv: virt: Move common functions to a separate helper file
> > > hw/riscv: Create a new qemu machine for RISC-V
> > >
> > > configs/devices/riscv64-softmmu/default.mak |   1 +
> > > hw/char/serial-pci.c                        |  36 +-
> > > hw/riscv/Kconfig                            |  11 +
> > > hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
> > > hw/riscv/meson.build                        |   2 +
> > > hw/riscv/minic.c                            | 438 ++++++++++++++++++++
> > > hw/riscv/virt.c                             | 403 ++----------------
> > > include/hw/riscv/machine_helper.h           |  87 ++++
> > > include/hw/riscv/minic.h                    |  65 +++
> > > include/hw/riscv/virt.h                     |  13 -
> > > 10 files changed, 1090 insertions(+), 383 deletions(-)
> > > create mode 100644 hw/riscv/machine_helper.c
> > > create mode 100644 hw/riscv/minic.c
> > > create mode 100644 include/hw/riscv/machine_helper.h
> > > create mode 100644 include/hw/riscv/minic.h
> > >
> > > --
> > > 2.25.1
> > >
> > >
> >
> > With regards,
> > Daniel
> > --
> > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> >
> >
>
>
> --
> Regards,
> Atish



-- 
Regards,
Atish


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-03  7:22     ` Atish Patra
@ 2022-05-05  9:36       ` Alistair Francis
  2022-05-05 10:04         ` Daniel P. Berrangé
  2022-05-05 21:29         ` Atish Kumar Patra
  0 siblings, 2 replies; 35+ messages in thread
From: Alistair Francis @ 2022-05-05  9:36 UTC (permalink / raw)
  To: Atish Patra
  Cc: Daniel P. Berrangé,
	Atish Patra, open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Tue, May 3, 2022 at 5:57 PM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
> >
> > On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > > >
> > > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > > However, the number of devices & commandline options keeps growing as a result
> > > > of that as well. That adds flexibility but will also become bit difficult
> > > > to manage in the future as more extension support will be added. As it is the
> > > > most commonly used qemu machine, it needs to support all kinds of device and
> > > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > > number of harts it can support because of all the MMIO devices it has to support.
> > > >
> > > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > > on MSI and don't care about the wired interrupts at all. It just requires
> > > > all the devices to be present behind a PCI bus or present themselves as platform
> > > > MSI device. The former is a more common scenario in x86 world where most
> > > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > > support, it can also scale to very large number of harts.
> > > >
> > > > That's why, this patch series introduces a minimalistic yet very extensible
> > > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > > idea is to build PC or server like systems with this machine. The machine can
> > > > work with or without virtio framework. The current implementation only
> > > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > > for such machines. The only mmio device it requires is clint to emulate
> > > > the mtimecmp.
> > >
>
> Any other thoughts ?

I don't *love* this idea. I think the virt machine is useful, but I'm
not convinced we need a second one.

This feels a little bit more like a "none" machine, as it contains
just the bare minimum to work.

>
> > > I would ask what you see as the long term future usage for 'virt' vs
> > > 'minic' machine types ? Would you expect all existing users of 'virt'
> > > to ultimately switch to 'minic', or are there distinct non-overlapping
> > > use cases for 'virt' vs 'minic' such that both end up widely used ?
> > >
> >
> > Nope. I don't expect existing 'virt' users to switch to 'minic' as
> > they aim to cater to different users.
> >
> > Here are the major differences
> > 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't

This seems like the main difference

> > 2. virt machine doesn't support the MSI only option yet (can be added
> > though[1]). Minic does.

This could be fixed

> > 3. Number of cpu supported by virt machine are limited because of the
> > MMIO devices. Minic can scale to very
> > large numbers of cpu.

Similar to 1

> > 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> > mandatory requirement for 'minic' while
> > it is optional for 'virt'.

I'm not fully convinced we need this, but it also doesn't seem to cost
us a lot in terms of maintenance. It would be beneficial if we could
share a bit more of the code. Can we share the socket creation code as
well?

I don't like the name minic though. What about something like
`virt-hpc`, `virt-pcie-minimal` or something like that? That way we
indicate it's still a virt board

Alistair

> >
> > 'Minic' aims towards the users who want to create virtual machines
> > that are MSI based and don't care about
> > a million options that virt machines provide.  Virt machine is more
> > complex so that it can be flexible in terms of
> > what it supports. Minic is a minimalistic machine which doesn't need
> > to be expanded a lot in the future given that
> > most of the devices can be behind PCI.
> >
> > [1] https://github.com/atishp04/qemu/tree/virt_imsic_only
> >
> > > Is 'minic' intended to be able to mimic real physical hardware at all,
> > > or is it still intended as a purely virtual machine, like a 'virt-ng' ?
> > >
> >
> > Any future hardware that relies only on PCI-MSI based devices, they
> > can be created on top of minic.
> > At that point, minic will provide a useful abstract for all those
> > machines as well. minic doesn't need a virtio framework.
> > Thus, it can closely emulate such hardware as well.
> >
> > > Essentially 'virt' was positioned as the standard machine to use if
> > > you want to run a virtual machine, without any particular match to
> > > physical hardware. It feels like 'minic' is creating a second machine
> > > type to fill the same purpose, so how do users decide which to use ?
> > >
> >
> > I envision 'minic' to be a standard machine for a specific set of user
> > requirements (x86 style PCI based
> > machines). Virt machine will continue to be a standard machine for
> > more generic use cases with MMIO devices.
> >
> > > > "Naming is hard". I am not too attached with the name "minic".
> > > > I just chose least bad one out of the few on my mind :). I am definitely
> > > > open to any other name as well.
> > > >
> > > > The other alternative to provide MSI only option to aia in the
> > > > existing virt machine to build MSI only machines. This is certainly doable
> > > > and here is the patch that supports that kind of setup.
> > > >
> > > > https://github.com/atishp04/qemu/tree/virt_imsic_only
> > > >
> > > > However, it even complicates the virt machine even further with additional
> > > > command line option, branches in the code. I believe virt machine will become
> > > > very complex if we continue this path. I am interested to learn what everyone
> > > > else think.
> > > >
> > > > It is needless to say that the current version of minic machine
> > > > is inspired from virt machine and tries to reuse as much as code possible.
> > > > The first patch in this series adds MSI support for serial-pci device so
> > > > console can work on such a machine. The 2nd patch moves some common functions
> > > > between minic and the virt machine to a helper file. The PATCH3 actually
> > > > implements the new minic machine.
> > > >
> > > > I have not added the fw-cfg/flash support. We probably should add those
> > > > but I just wanted to start small and get the feedback first.
> > > > This is a work in progress and have few more TODO items before becoming the
> > > > new world order :)
> > > >
> > > > 1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
> > > > for now.
> > > > 2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
> > > > 3. Add MSI-X support for serial-pci device.
> > > >
> > > > This series can boot Linux distros with the minic machine with or without virtio
> > > > devices with out-of-tree Linux kernel patches[1]. Here is an example commandline
> > > >
> > > > Without virtio devices (nvme, serial-pci & e1000e):
> > > > =====================================================
> > > > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > > > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > > > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > > > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > > > -mon chardev=charconsole0,mode=readline \
> > > > -device pci-serial,msi=true,chardev=charconsole0 \
> > > > -drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
> > > > -device nvme,serial=deadbeef,drive=disk3 \
> > > > -netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
> > > > -append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s
> > > >
> > > > With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
> > > > ==================================================================
> > > > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > > > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > > > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > > > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > > > -mon chardev=charconsole0,mode=readline \
> > > > -device pci-serial,msi=true,chardev=charconsole0 \
> > > > -drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
> > > > -device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
> > > > -netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
> > > > -append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'
> > > >
> > > > The objective of this series is to engage the community to solve this problem.
> > > > Please suggest if you have another alternatve solution.
> > > >
> > > > [1] https://github.com/atishp04/linux/tree/msi_only_console
> > > >
> > > > Atish Patra (3):
> > > > serial: Enable MSI capablity and option
> > > > hw/riscv: virt: Move common functions to a separate helper file
> > > > hw/riscv: Create a new qemu machine for RISC-V
> > > >
> > > > configs/devices/riscv64-softmmu/default.mak |   1 +
> > > > hw/char/serial-pci.c                        |  36 +-
> > > > hw/riscv/Kconfig                            |  11 +
> > > > hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
> > > > hw/riscv/meson.build                        |   2 +
> > > > hw/riscv/minic.c                            | 438 ++++++++++++++++++++
> > > > hw/riscv/virt.c                             | 403 ++----------------
> > > > include/hw/riscv/machine_helper.h           |  87 ++++
> > > > include/hw/riscv/minic.h                    |  65 +++
> > > > include/hw/riscv/virt.h                     |  13 -
> > > > 10 files changed, 1090 insertions(+), 383 deletions(-)
> > > > create mode 100644 hw/riscv/machine_helper.c
> > > > create mode 100644 hw/riscv/minic.c
> > > > create mode 100644 include/hw/riscv/machine_helper.h
> > > > create mode 100644 include/hw/riscv/minic.h
> > > >
> > > > --
> > > > 2.25.1
> > > >
> > > >
> > >
> > > With regards,
> > > Daniel
> > > --
> > > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> > >
> > >
> >
> >
> > --
> > Regards,
> > Atish
>
>
>
> --
> Regards,
> Atish
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-05  9:36       ` Alistair Francis
@ 2022-05-05 10:04         ` Daniel P. Berrangé
  2022-05-05 20:34           ` Alistair Francis
  2022-05-06  4:01           ` Anup Patel
  2022-05-05 21:29         ` Atish Kumar Patra
  1 sibling, 2 replies; 35+ messages in thread
From: Daniel P. Berrangé @ 2022-05-05 10:04 UTC (permalink / raw)
  To: Alistair Francis
  Cc: Atish Patra, Atish Patra, open list:RISC-V, Michael S. Tsirkin,
	Bin Meng, qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Thu, May 05, 2022 at 07:36:51PM +1000, Alistair Francis wrote:
> On Tue, May 3, 2022 at 5:57 PM Atish Patra <atishp@atishpatra.org> wrote:
> >
> > On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
> > >
> > > On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > > > >
> > > > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > > > However, the number of devices & commandline options keeps growing as a result
> > > > > of that as well. That adds flexibility but will also become bit difficult
> > > > > to manage in the future as more extension support will be added. As it is the
> > > > > most commonly used qemu machine, it needs to support all kinds of device and
> > > > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > > > number of harts it can support because of all the MMIO devices it has to support.
> > > > >
> > > > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > > > on MSI and don't care about the wired interrupts at all. It just requires
> > > > > all the devices to be present behind a PCI bus or present themselves as platform
> > > > > MSI device. The former is a more common scenario in x86 world where most
> > > > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > > > support, it can also scale to very large number of harts.
> > > > >
> > > > > That's why, this patch series introduces a minimalistic yet very extensible
> > > > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > > > idea is to build PC or server like systems with this machine. The machine can
> > > > > work with or without virtio framework. The current implementation only
> > > > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > > > for such machines. The only mmio device it requires is clint to emulate
> > > > > the mtimecmp.
> > > >
> >
> > Any other thoughts ?
> 
> I don't *love* this idea. I think the virt machine is useful, but I'm
> not convinced we need a second one.
> 
> This feels a little bit more like a "none" machine, as it contains
> just the bare minimum to work.
> 
> >
> > > > I would ask what you see as the long term future usage for 'virt' vs
> > > > 'minic' machine types ? Would you expect all existing users of 'virt'
> > > > to ultimately switch to 'minic', or are there distinct non-overlapping
> > > > use cases for 'virt' vs 'minic' such that both end up widely used ?
> > > >
> > >
> > > Nope. I don't expect existing 'virt' users to switch to 'minic' as
> > > they aim to cater to different users.
> > >
> > > Here are the major differences
> > > 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
> 
> This seems like the main difference
> 
> > > 2. virt machine doesn't support the MSI only option yet (can be added
> > > though[1]). Minic does.
> 
> This could be fixed
> 
> > > 3. Number of cpu supported by virt machine are limited because of the
> > > MMIO devices. Minic can scale to very
> > > large numbers of cpu.
> 
> Similar to 1
> 
> > > 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> > > mandatory requirement for 'minic' while
> > > it is optional for 'virt'.
> 
> I'm not fully convinced we need this, but it also doesn't seem to cost
> us a lot in terms of maintenance. It would be beneficial if we could
> share a bit more of the code. Can we share the socket creation code as
> well?
> 
> I don't like the name minic though. What about something like
> `virt-hpc`, `virt-pcie-minimal` or something like that? That way we
> indicate it's still a virt board

We're not versioning the 'virt' machine type right so. IOW, we've not
made any promises about its long term featureset. 

If the virt machine type isn't the perfect match right now, should
we change it, in a potentially incompatible way, to give us the right
solution long term, rather than introducing a brand new machine type
with significant overlap.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-05 10:04         ` Daniel P. Berrangé
@ 2022-05-05 20:34           ` Alistair Francis
  2022-05-05 22:19             ` Atish Kumar Patra
  2022-05-06  8:16             ` Daniel P. Berrangé
  2022-05-06  4:01           ` Anup Patel
  1 sibling, 2 replies; 35+ messages in thread
From: Alistair Francis @ 2022-05-05 20:34 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Atish Patra, Atish Patra, open list:RISC-V, Michael S. Tsirkin,
	Bin Meng, qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Thu, May 5, 2022 at 8:04 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, May 05, 2022 at 07:36:51PM +1000, Alistair Francis wrote:
> > On Tue, May 3, 2022 at 5:57 PM Atish Patra <atishp@atishpatra.org> wrote:
> > >
> > > On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > >
> > > > On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > > > > >
> > > > > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > > > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > > > > However, the number of devices & commandline options keeps growing as a result
> > > > > > of that as well. That adds flexibility but will also become bit difficult
> > > > > > to manage in the future as more extension support will be added. As it is the
> > > > > > most commonly used qemu machine, it needs to support all kinds of device and
> > > > > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > > > > number of harts it can support because of all the MMIO devices it has to support.
> > > > > >
> > > > > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > > > > on MSI and don't care about the wired interrupts at all. It just requires
> > > > > > all the devices to be present behind a PCI bus or present themselves as platform
> > > > > > MSI device. The former is a more common scenario in x86 world where most
> > > > > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > > > > support, it can also scale to very large number of harts.
> > > > > >
> > > > > > That's why, this patch series introduces a minimalistic yet very extensible
> > > > > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > > > > idea is to build PC or server like systems with this machine. The machine can
> > > > > > work with or without virtio framework. The current implementation only
> > > > > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > > > > for such machines. The only mmio device it requires is clint to emulate
> > > > > > the mtimecmp.
> > > > >
> > >
> > > Any other thoughts ?
> >
> > I don't *love* this idea. I think the virt machine is useful, but I'm
> > not convinced we need a second one.
> >
> > This feels a little bit more like a "none" machine, as it contains
> > just the bare minimum to work.
> >
> > >
> > > > > I would ask what you see as the long term future usage for 'virt' vs
> > > > > 'minic' machine types ? Would you expect all existing users of 'virt'
> > > > > to ultimately switch to 'minic', or are there distinct non-overlapping
> > > > > use cases for 'virt' vs 'minic' such that both end up widely used ?
> > > > >
> > > >
> > > > Nope. I don't expect existing 'virt' users to switch to 'minic' as
> > > > they aim to cater to different users.
> > > >
> > > > Here are the major differences
> > > > 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
> >
> > This seems like the main difference
> >
> > > > 2. virt machine doesn't support the MSI only option yet (can be added
> > > > though[1]). Minic does.
> >
> > This could be fixed
> >
> > > > 3. Number of cpu supported by virt machine are limited because of the
> > > > MMIO devices. Minic can scale to very
> > > > large numbers of cpu.
> >
> > Similar to 1
> >
> > > > 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> > > > mandatory requirement for 'minic' while
> > > > it is optional for 'virt'.
> >
> > I'm not fully convinced we need this, but it also doesn't seem to cost
> > us a lot in terms of maintenance. It would be beneficial if we could
> > share a bit more of the code. Can we share the socket creation code as
> > well?
> >
> > I don't like the name minic though. What about something like
> > `virt-hpc`, `virt-pcie-minimal` or something like that? That way we
> > indicate it's still a virt board
>
> We're not versioning the 'virt' machine type right so. IOW, we've not
> made any promises about its long term featureset.
>
> If the virt machine type isn't the perfect match right now, should
> we change it, in a potentially incompatible way, to give us the right
> solution long term, rather than introducing a brand new machine type
> with significant overlap.

Even if we didn't worry about backwards compatibility the current virt
machine would still be what most users want. It's just a small number
of users who don't want MMIO devices and instead want to use PCIe for
everything. Realistically it's only HPC users who would want this type
of machine, at least that's my understanding.

Alistair

>
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-05  9:36       ` Alistair Francis
  2022-05-05 10:04         ` Daniel P. Berrangé
@ 2022-05-05 21:29         ` Atish Kumar Patra
  1 sibling, 0 replies; 35+ messages in thread
From: Atish Kumar Patra @ 2022-05-05 21:29 UTC (permalink / raw)
  To: Alistair Francis
  Cc: Atish Patra, Daniel P. Berrangé,
	open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Thu, May 5, 2022 at 2:37 AM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Tue, May 3, 2022 at 5:57 PM Atish Patra <atishp@atishpatra.org> wrote:
> >
> > On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
> > >
> > > On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > > > >
> > > > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > > > However, the number of devices & commandline options keeps growing as a result
> > > > > of that as well. That adds flexibility but will also become bit difficult
> > > > > to manage in the future as more extension support will be added. As it is the
> > > > > most commonly used qemu machine, it needs to support all kinds of device and
> > > > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > > > number of harts it can support because of all the MMIO devices it has to support.
> > > > >
> > > > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > > > on MSI and don't care about the wired interrupts at all. It just requires
> > > > > all the devices to be present behind a PCI bus or present themselves as platform
> > > > > MSI device. The former is a more common scenario in x86 world where most
> > > > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > > > support, it can also scale to very large number of harts.
> > > > >
> > > > > That's why, this patch series introduces a minimalistic yet very extensible
> > > > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > > > idea is to build PC or server like systems with this machine. The machine can
> > > > > work with or without virtio framework. The current implementation only
> > > > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > > > for such machines. The only mmio device it requires is clint to emulate
> > > > > the mtimecmp.
> > > >
> >
> > Any other thoughts ?
>
> I don't *love* this idea. I think the virt machine is useful, but I'm
> not convinced we need a second one.
>
> This feels a little bit more like a "none" machine, as it contains
> just the bare minimum to work.
>

Ha ha :). That was the whole point. To create a minimal machine that
is easy to maintain and emulate platforms in the real world.

> >
> > > > I would ask what you see as the long term future usage for 'virt' vs
> > > > 'minic' machine types ? Would you expect all existing users of 'virt'
> > > > to ultimately switch to 'minic', or are there distinct non-overlapping
> > > > use cases for 'virt' vs 'minic' such that both end up widely used ?
> > > >
> > >
> > > Nope. I don't expect existing 'virt' users to switch to 'minic' as
> > > they aim to cater to different users.
> > >
> > > Here are the major differences
> > > 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
>
> This seems like the main difference
>
> > > 2. virt machine doesn't support the MSI only option yet (can be added
> > > though[1]). Minic does.
>
> This could be fixed
>
> > > 3. Number of cpu supported by virt machine are limited because of the
> > > MMIO devices. Minic can scale to very
> > > large numbers of cpu.
>
> Similar to 1
>

Yes. I already had the patch to fix that in the virt machine in the
cover letter.
I did not send it to the mailing list to avoid confusion.

> > > 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> > > mandatory requirement for 'minic' while
> > > it is optional for 'virt'.
>
> I'm not fully convinced we need this, but it also doesn't seem to cost
> us a lot in terms of maintenance. It would be beneficial if we could
> share a bit more of the code. Can we share the socket creation code as
> well?
>

Yeah. We can move the socket creation to the common code as well.
There are few others small ones (virt_set/get_aia_guests) can be moved
to common code.

In the future, real world hpc machines may just build their machine
on top of this machine instead of developing from scratch if we allow
some configurability
(e.g memory map, machine name, max cpus etc.)

> I don't like the name minic though. What about something like
> `virt-hpc`, `virt-pcie-minimal` or something like that? That way we
> indicate it's still a virt board
>

Fair enough. I can rename it to virt-hpc or something similar.

> Alistair
>
> > >
> > > 'Minic' aims towards the users who want to create virtual machines
> > > that are MSI based and don't care about
> > > a million options that virt machines provide.  Virt machine is more
> > > complex so that it can be flexible in terms of
> > > what it supports. Minic is a minimalistic machine which doesn't need
> > > to be expanded a lot in the future given that
> > > most of the devices can be behind PCI.
> > >
> > > [1] https://github.com/atishp04/qemu/tree/virt_imsic_only
> > >
> > > > Is 'minic' intended to be able to mimic real physical hardware at all,
> > > > or is it still intended as a purely virtual machine, like a 'virt-ng' ?
> > > >
> > >
> > > Any future hardware that relies only on PCI-MSI based devices, they
> > > can be created on top of minic.
> > > At that point, minic will provide a useful abstract for all those
> > > machines as well. minic doesn't need a virtio framework.
> > > Thus, it can closely emulate such hardware as well.
> > >
> > > > Essentially 'virt' was positioned as the standard machine to use if
> > > > you want to run a virtual machine, without any particular match to
> > > > physical hardware. It feels like 'minic' is creating a second machine
> > > > type to fill the same purpose, so how do users decide which to use ?
> > > >
> > >
> > > I envision 'minic' to be a standard machine for a specific set of user
> > > requirements (x86 style PCI based
> > > machines). Virt machine will continue to be a standard machine for
> > > more generic use cases with MMIO devices.
> > >
> > > > > "Naming is hard". I am not too attached with the name "minic".
> > > > > I just chose least bad one out of the few on my mind :). I am definitely
> > > > > open to any other name as well.
> > > > >
> > > > > The other alternative to provide MSI only option to aia in the
> > > > > existing virt machine to build MSI only machines. This is certainly doable
> > > > > and here is the patch that supports that kind of setup.
> > > > >
> > > > > https://github.com/atishp04/qemu/tree/virt_imsic_only
> > > > >
> > > > > However, it even complicates the virt machine even further with additional
> > > > > command line option, branches in the code. I believe virt machine will become
> > > > > very complex if we continue this path. I am interested to learn what everyone
> > > > > else think.
> > > > >
> > > > > It is needless to say that the current version of minic machine
> > > > > is inspired from virt machine and tries to reuse as much as code possible.
> > > > > The first patch in this series adds MSI support for serial-pci device so
> > > > > console can work on such a machine. The 2nd patch moves some common functions
> > > > > between minic and the virt machine to a helper file. The PATCH3 actually
> > > > > implements the new minic machine.
> > > > >
> > > > > I have not added the fw-cfg/flash support. We probably should add those
> > > > > but I just wanted to start small and get the feedback first.
> > > > > This is a work in progress and have few more TODO items before becoming the
> > > > > new world order :)
> > > > >
> > > > > 1. OpenSBI doesn't have PCI support. Thus, no console support for OpenSBI
> > > > > for now.
> > > > > 2. The ns16550 driver in OpenSBI also need to support MSI/MSI-X.
> > > > > 3. Add MSI-X support for serial-pci device.
> > > > >
> > > > > This series can boot Linux distros with the minic machine with or without virtio
> > > > > devices with out-of-tree Linux kernel patches[1]. Here is an example commandline
> > > > >
> > > > > Without virtio devices (nvme, serial-pci & e1000e):
> > > > > =====================================================
> > > > > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > > > > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > > > > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > > > > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > > > > -mon chardev=charconsole0,mode=readline \
> > > > > -device pci-serial,msi=true,chardev=charconsole0 \
> > > > > -drive id=disk3,file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none,format=raw \
> > > > > -device nvme,serial=deadbeef,drive=disk3 \
> > > > > -netdev user,id=usernet,hostfwd=tcp::10000-:22 -device e1000e,netdev=usernet,bus=pcie.0 \
> > > > > -append 'root=/dev/nvme0n1p2 rw loglevel=8 memblock=debug console=ttyS0 earlycon' -d in_asm -D log.txt -s
> > > > >
> > > > > With virtio devices (virtio-scsi-pci, serial-pci & virtio-net-pci)
> > > > > ==================================================================
> > > > > /scratch/workspace/qemu/build/qemu-system-riscv64 -cpu rv64 -M minic -m 1G -smp 4 -nographic -nodefaults \
> > > > > -display none -bios /scratch/workspace/opensbi/build/platform/generic/firmware/fw_dynamic.elf \
> > > > > -kernel /scratch/workspace/linux/arch/riscv/boot/Image \
> > > > > -chardev stdio,mux=on,signal=off,id=charconsole0 \
> > > > > -mon chardev=charconsole0,mode=readline \
> > > > > -device pci-serial,msi=true,chardev=charconsole0 \
> > > > > -drive file=/scratch/workspace/rootfs_images//fedora/Fedora-Developer-Rawhide-20211110.n.0-sda.raw,format=raw,if=none,id=drive-system-disk,cache=none \
> > > > > -device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
> > > > > -netdev user,id=n1,hostfwd=tcp::10000-:22 -device virtio-net-pci,netdev=n1 \
> > > > > -append 'root=/dev/sda2 rw loglevel=8 memblock=debug console=ttyS0 earlycon'
> > > > >
> > > > > The objective of this series is to engage the community to solve this problem.
> > > > > Please suggest if you have another alternatve solution.
> > > > >
> > > > > [1] https://github.com/atishp04/linux/tree/msi_only_console
> > > > >
> > > > > Atish Patra (3):
> > > > > serial: Enable MSI capablity and option
> > > > > hw/riscv: virt: Move common functions to a separate helper file
> > > > > hw/riscv: Create a new qemu machine for RISC-V
> > > > >
> > > > > configs/devices/riscv64-softmmu/default.mak |   1 +
> > > > > hw/char/serial-pci.c                        |  36 +-
> > > > > hw/riscv/Kconfig                            |  11 +
> > > > > hw/riscv/machine_helper.c                   | 417 +++++++++++++++++++
> > > > > hw/riscv/meson.build                        |   2 +
> > > > > hw/riscv/minic.c                            | 438 ++++++++++++++++++++
> > > > > hw/riscv/virt.c                             | 403 ++----------------
> > > > > include/hw/riscv/machine_helper.h           |  87 ++++
> > > > > include/hw/riscv/minic.h                    |  65 +++
> > > > > include/hw/riscv/virt.h                     |  13 -
> > > > > 10 files changed, 1090 insertions(+), 383 deletions(-)
> > > > > create mode 100644 hw/riscv/machine_helper.c
> > > > > create mode 100644 hw/riscv/minic.c
> > > > > create mode 100644 include/hw/riscv/machine_helper.h
> > > > > create mode 100644 include/hw/riscv/minic.h
> > > > >
> > > > > --
> > > > > 2.25.1
> > > > >
> > > > >
> > > >
> > > > With regards,
> > > > Daniel
> > > > --
> > > > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > > > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > > > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> > > >
> > > >
> > >
> > >
> > > --
> > > Regards,
> > > Atish
> >
> >
> >
> > --
> > Regards,
> > Atish
> >


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-05 20:34           ` Alistair Francis
@ 2022-05-05 22:19             ` Atish Kumar Patra
  2022-05-06  8:16             ` Daniel P. Berrangé
  1 sibling, 0 replies; 35+ messages in thread
From: Atish Kumar Patra @ 2022-05-05 22:19 UTC (permalink / raw)
  To: Alistair Francis
  Cc: Daniel P. Berrangé,
	Atish Patra, open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Thu, May 5, 2022 at 1:35 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Thu, May 5, 2022 at 8:04 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Thu, May 05, 2022 at 07:36:51PM +1000, Alistair Francis wrote:
> > > On Tue, May 3, 2022 at 5:57 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > >
> > > > On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > >
> > > > > On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > > > > > >
> > > > > > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > > > > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > > > > > However, the number of devices & commandline options keeps growing as a result
> > > > > > > of that as well. That adds flexibility but will also become bit difficult
> > > > > > > to manage in the future as more extension support will be added. As it is the
> > > > > > > most commonly used qemu machine, it needs to support all kinds of device and
> > > > > > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > > > > > number of harts it can support because of all the MMIO devices it has to support.
> > > > > > >
> > > > > > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > > > > > on MSI and don't care about the wired interrupts at all. It just requires
> > > > > > > all the devices to be present behind a PCI bus or present themselves as platform
> > > > > > > MSI device. The former is a more common scenario in x86 world where most
> > > > > > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > > > > > support, it can also scale to very large number of harts.
> > > > > > >
> > > > > > > That's why, this patch series introduces a minimalistic yet very extensible
> > > > > > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > > > > > idea is to build PC or server like systems with this machine. The machine can
> > > > > > > work with or without virtio framework. The current implementation only
> > > > > > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > > > > > for such machines. The only mmio device it requires is clint to emulate
> > > > > > > the mtimecmp.
> > > > > >
> > > >
> > > > Any other thoughts ?
> > >
> > > I don't *love* this idea. I think the virt machine is useful, but I'm
> > > not convinced we need a second one.
> > >
> > > This feels a little bit more like a "none" machine, as it contains
> > > just the bare minimum to work.
> > >
> > > >
> > > > > > I would ask what you see as the long term future usage for 'virt' vs
> > > > > > 'minic' machine types ? Would you expect all existing users of 'virt'
> > > > > > to ultimately switch to 'minic', or are there distinct non-overlapping
> > > > > > use cases for 'virt' vs 'minic' such that both end up widely used ?
> > > > > >
> > > > >
> > > > > Nope. I don't expect existing 'virt' users to switch to 'minic' as
> > > > > they aim to cater to different users.
> > > > >
> > > > > Here are the major differences
> > > > > 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
> > >
> > > This seems like the main difference
> > >
> > > > > 2. virt machine doesn't support the MSI only option yet (can be added
> > > > > though[1]). Minic does.
> > >
> > > This could be fixed
> > >
> > > > > 3. Number of cpu supported by virt machine are limited because of the
> > > > > MMIO devices. Minic can scale to very
> > > > > large numbers of cpu.
> > >
> > > Similar to 1
> > >
> > > > > 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> > > > > mandatory requirement for 'minic' while
> > > > > it is optional for 'virt'.
> > >
> > > I'm not fully convinced we need this, but it also doesn't seem to cost
> > > us a lot in terms of maintenance. It would be beneficial if we could
> > > share a bit more of the code. Can we share the socket creation code as
> > > well?
> > >
> > > I don't like the name minic though. What about something like
> > > `virt-hpc`, `virt-pcie-minimal` or something like that? That way we
> > > indicate it's still a virt board
> >
> > We're not versioning the 'virt' machine type right so. IOW, we've not
> > made any promises about its long term featureset.
> >
> > If the virt machine type isn't the perfect match right now, should
> > we change it, in a potentially incompatible way, to give us the right
> > solution long term, rather than introducing a brand new machine type
> > with significant overlap.
>
> Even if we didn't worry about backwards compatibility the current virt
> machine would still be what most users want. It's just a small number
> of users who don't want MMIO devices and instead want to use PCIe for
> everything. Realistically it's only HPC users who would want this type
> of machine, at least that's my understanding.
>

Yes. You are correct. Virt machine will continue to be useful for
platforms with MMIO or wired irq devices.
The new machine is more suitable for platforms where everything is
behind PCI. HPC platforms certainly
is one market segment. However, future RISC-V PCs (similar to x86 PCs)
may also fall in this category.

> Alistair
>
> >
> >
> > With regards,
> > Daniel
> > --
> > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> >


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-05 10:04         ` Daniel P. Berrangé
  2022-05-05 20:34           ` Alistair Francis
@ 2022-05-06  4:01           ` Anup Patel
  1 sibling, 0 replies; 35+ messages in thread
From: Anup Patel @ 2022-05-06  4:01 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Alistair Francis, Atish Patra, Atish Patra, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Thu, May 5, 2022 at 4:24 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, May 05, 2022 at 07:36:51PM +1000, Alistair Francis wrote:
> > On Tue, May 3, 2022 at 5:57 PM Atish Patra <atishp@atishpatra.org> wrote:
> > >
> > > On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > >
> > > > On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > > > > >
> > > > > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > > > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > > > > However, the number of devices & commandline options keeps growing as a result
> > > > > > of that as well. That adds flexibility but will also become bit difficult
> > > > > > to manage in the future as more extension support will be added. As it is the
> > > > > > most commonly used qemu machine, it needs to support all kinds of device and
> > > > > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > > > > number of harts it can support because of all the MMIO devices it has to support.
> > > > > >
> > > > > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > > > > on MSI and don't care about the wired interrupts at all. It just requires
> > > > > > all the devices to be present behind a PCI bus or present themselves as platform
> > > > > > MSI device. The former is a more common scenario in x86 world where most
> > > > > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > > > > support, it can also scale to very large number of harts.
> > > > > >
> > > > > > That's why, this patch series introduces a minimalistic yet very extensible
> > > > > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > > > > idea is to build PC or server like systems with this machine. The machine can
> > > > > > work with or without virtio framework. The current implementation only
> > > > > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > > > > for such machines. The only mmio device it requires is clint to emulate
> > > > > > the mtimecmp.
> > > > >
> > >
> > > Any other thoughts ?
> >
> > I don't *love* this idea. I think the virt machine is useful, but I'm
> > not convinced we need a second one.
> >
> > This feels a little bit more like a "none" machine, as it contains
> > just the bare minimum to work.
> >
> > >
> > > > > I would ask what you see as the long term future usage for 'virt' vs
> > > > > 'minic' machine types ? Would you expect all existing users of 'virt'
> > > > > to ultimately switch to 'minic', or are there distinct non-overlapping
> > > > > use cases for 'virt' vs 'minic' such that both end up widely used ?
> > > > >
> > > >
> > > > Nope. I don't expect existing 'virt' users to switch to 'minic' as
> > > > they aim to cater to different users.
> > > >
> > > > Here are the major differences
> > > > 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
> >
> > This seems like the main difference
> >
> > > > 2. virt machine doesn't support the MSI only option yet (can be added
> > > > though[1]). Minic does.
> >
> > This could be fixed
> >
> > > > 3. Number of cpu supported by virt machine are limited because of the
> > > > MMIO devices. Minic can scale to very
> > > > large numbers of cpu.
> >
> > Similar to 1
> >
> > > > 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> > > > mandatory requirement for 'minic' while
> > > > it is optional for 'virt'.
> >
> > I'm not fully convinced we need this, but it also doesn't seem to cost
> > us a lot in terms of maintenance. It would be beneficial if we could
> > share a bit more of the code. Can we share the socket creation code as
> > well?
> >
> > I don't like the name minic though. What about something like
> > `virt-hpc`, `virt-pcie-minimal` or something like that? That way we
> > indicate it's still a virt board
>
> We're not versioning the 'virt' machine type right so. IOW, we've not
> made any promises about its long term featureset.
>
> If the virt machine type isn't the perfect match right now, should
> we change it, in a potentially incompatible way, to give us the right
> solution long term, rather than introducing a brand new machine type
> with significant overlap.

Versioning of "virt" machine has been a long pending item for enterprise
class Guest/VM migration.

Since "virt" machine is QEMU RISC-V specific, we can do the following:
1) Detailed documentation of "virt" machine layout along with versioning
    in the QEMU documentation
2) Re-structure "virt" machine implementation to support future changes
    "virt" machine.

Regards,
Anup

>
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-05 20:34           ` Alistair Francis
  2022-05-05 22:19             ` Atish Kumar Patra
@ 2022-05-06  8:16             ` Daniel P. Berrangé
  2022-05-06 10:59               ` Peter Maydell
  1 sibling, 1 reply; 35+ messages in thread
From: Daniel P. Berrangé @ 2022-05-06  8:16 UTC (permalink / raw)
  To: Alistair Francis
  Cc: Atish Patra, Atish Patra, open list:RISC-V, Michael S. Tsirkin,
	Bin Meng, qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> On Thu, May 5, 2022 at 8:04 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Thu, May 05, 2022 at 07:36:51PM +1000, Alistair Francis wrote:
> > > On Tue, May 3, 2022 at 5:57 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > >
> > > > On Tue, Apr 19, 2022 at 5:26 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > >
> > > > > On Tue, Apr 19, 2022 at 9:51 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 11, 2022 at 07:10:06PM -0700, Atish Patra wrote:
> > > > > > >
> > > > > > > The RISC-V virt machine has helped RISC-V software eco system to evolve at a
> > > > > > > rapid pace even in absense of the real hardware. It is definitely commendable.
> > > > > > > However, the number of devices & commandline options keeps growing as a result
> > > > > > > of that as well. That adds flexibility but will also become bit difficult
> > > > > > > to manage in the future as more extension support will be added. As it is the
> > > > > > > most commonly used qemu machine, it needs to support all kinds of device and
> > > > > > > interrupts as well. Moreover, virt machine has limitations on the maximum
> > > > > > > number of harts it can support because of all the MMIO devices it has to support.
> > > > > > >
> > > > > > > The RISC-V IMSIC specification allows to develop machines completely relying
> > > > > > > on MSI and don't care about the wired interrupts at all. It just requires
> > > > > > > all the devices to be present behind a PCI bus or present themselves as platform
> > > > > > > MSI device. The former is a more common scenario in x86 world where most
> > > > > > > of the devices are behind PCI bus. As there is very limited MMIO device
> > > > > > > support, it can also scale to very large number of harts.
> > > > > > >
> > > > > > > That's why, this patch series introduces a minimalistic yet very extensible
> > > > > > > forward looking machine called as "RISC-V Mini Computer" or "minic". The
> > > > > > > idea is to build PC or server like systems with this machine. The machine can
> > > > > > > work with or without virtio framework. The current implementation only
> > > > > > > supports RV64. I am not sure if building a RV32 machine would be of interest
> > > > > > > for such machines. The only mmio device it requires is clint to emulate
> > > > > > > the mtimecmp.
> > > > > >
> > > >
> > > > Any other thoughts ?
> > >
> > > I don't *love* this idea. I think the virt machine is useful, but I'm
> > > not convinced we need a second one.
> > >
> > > This feels a little bit more like a "none" machine, as it contains
> > > just the bare minimum to work.
> > >
> > > >
> > > > > > I would ask what you see as the long term future usage for 'virt' vs
> > > > > > 'minic' machine types ? Would you expect all existing users of 'virt'
> > > > > > to ultimately switch to 'minic', or are there distinct non-overlapping
> > > > > > use cases for 'virt' vs 'minic' such that both end up widely used ?
> > > > > >
> > > > >
> > > > > Nope. I don't expect existing 'virt' users to switch to 'minic' as
> > > > > they aim to cater to different users.
> > > > >
> > > > > Here are the major differences
> > > > > 1. virt machine supports MMIO devices & wired interrupts. Minic doesn't
> > >
> > > This seems like the main difference
> > >
> > > > > 2. virt machine doesn't support the MSI only option yet (can be added
> > > > > though[1]). Minic does.
> > >
> > > This could be fixed
> > >
> > > > > 3. Number of cpu supported by virt machine are limited because of the
> > > > > MMIO devices. Minic can scale to very
> > > > > large numbers of cpu.
> > >
> > > Similar to 1
> > >
> > > > > 4. 'Minic' only supports PCI based MSI capable devices. Thus, MSI is a
> > > > > mandatory requirement for 'minic' while
> > > > > it is optional for 'virt'.
> > >
> > > I'm not fully convinced we need this, but it also doesn't seem to cost
> > > us a lot in terms of maintenance. It would be beneficial if we could
> > > share a bit more of the code. Can we share the socket creation code as
> > > well?
> > >
> > > I don't like the name minic though. What about something like
> > > `virt-hpc`, `virt-pcie-minimal` or something like that? That way we
> > > indicate it's still a virt board
> >
> > We're not versioning the 'virt' machine type right so. IOW, we've not
> > made any promises about its long term featureset.
> >
> > If the virt machine type isn't the perfect match right now, should
> > we change it, in a potentially incompatible way, to give us the right
> > solution long term, rather than introducing a brand new machine type
> > with significant overlap.
> 
> Even if we didn't worry about backwards compatibility the current virt
> machine would still be what most users want. It's just a small number
> of users who don't want MMIO devices and instead want to use PCIe for
> everything. Realistically it's only HPC users who would want this type
> of machine, at least that's my understanding.

I'm not so sure about that. Every other architecture has ended up
standardizing on PCI for general purpose virtual machines. IIRC,
aarch64 started off with MMIO, but switched to PCI as it matured. 

In terms of having VM mgmt tools "just work" for risc-v, I think
it will be very compelling for the general 'virt' machine to be
PCI based, otherwise all the assumptions about PCI in mgmt apps
are going to break requiring never ending riscv fixes.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-06  8:16             ` Daniel P. Berrangé
@ 2022-05-06 10:59               ` Peter Maydell
  2022-05-06 20:30                 ` Atish Kumar Patra
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Maydell @ 2022-05-06 10:59 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Alistair Francis, Atish Patra, Atish Patra, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > Even if we didn't worry about backwards compatibility the current virt
> > machine would still be what most users want. It's just a small number
> > of users who don't want MMIO devices and instead want to use PCIe for
> > everything. Realistically it's only HPC users who would want this type
> > of machine, at least that's my understanding.
>
> I'm not so sure about that. Every other architecture has ended up
> standardizing on PCI for general purpose virtual machines. IIRC,
> aarch64 started off with MMIO, but switched to PCI as it matured.
>
> In terms of having VM mgmt tools "just work" for risc-v, I think
> it will be very compelling for the general 'virt' machine to be
> PCI based, otherwise all the assumptions about PCI in mgmt apps
> are going to break requiring never ending riscv fixes.

Mmm, my experience with aarch64 virt is that PCI is much nicer
as a general preference. aarch64 virt has some MMIO devices
for historical reasons and some because you can't reasonably
do the necessary things with PCI, but I'm actively trying to
push people who submit new MMIO device features for virt to
try to use a PCI-based solution instead if they possibly can.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-06 10:59               ` Peter Maydell
@ 2022-05-06 20:30                 ` Atish Kumar Patra
  2022-05-17  5:03                   ` Alistair Francis
  0 siblings, 1 reply; 35+ messages in thread
From: Atish Kumar Patra @ 2022-05-06 20:30 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Daniel P. Berrangé,
	Alistair Francis, Atish Patra, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > Even if we didn't worry about backwards compatibility the current virt
> > > machine would still be what most users want. It's just a small number
> > > of users who don't want MMIO devices and instead want to use PCIe for
> > > everything. Realistically it's only HPC users who would want this type
> > > of machine, at least that's my understanding.
> >
> > I'm not so sure about that. Every other architecture has ended up
> > standardizing on PCI for general purpose virtual machines. IIRC,
> > aarch64 started off with MMIO, but switched to PCI as it matured.
> >
> > In terms of having VM mgmt tools "just work" for risc-v, I think
> > it will be very compelling for the general 'virt' machine to be
> > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > are going to break requiring never ending riscv fixes.
>
> Mmm, my experience with aarch64 virt is that PCI is much nicer
> as a general preference. aarch64 virt has some MMIO devices
> for historical reasons and some because you can't reasonably
> do the necessary things with PCI, but I'm actively trying to
> push people who submit new MMIO device features for virt to
> try to use a PCI-based solution instead if they possibly can.
>

Yeah. That was one of the primary goals of this series. If we have an
alternate PCI only machine,
folks will be more motivated to add only PCI based solutions in the
future. In that case, there will be minimal
or no change required to the machine code itself.

Just for clarification: We can achieve the same with the current virt
machine. But it is already bloated with a bunch of MMIO devices
and will probably continue to do so because of its flexibility. In
addition to that, any real PCI based platform emulation can not reuse
the virt machine code which will result in more vendor specific
implementations in the future..

> thanks
> -- PMM


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-06 20:30                 ` Atish Kumar Patra
@ 2022-05-17  5:03                   ` Alistair Francis
  2022-05-17  8:52                     ` Daniel P. Berrangé
  0 siblings, 1 reply; 35+ messages in thread
From: Alistair Francis @ 2022-05-17  5:03 UTC (permalink / raw)
  To: Atish Kumar Patra
  Cc: Peter Maydell, Daniel P. Berrangé,
	Atish Patra, open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Sat, May 7, 2022 at 6:30 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:
>
> On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > > Even if we didn't worry about backwards compatibility the current virt
> > > > machine would still be what most users want. It's just a small number
> > > > of users who don't want MMIO devices and instead want to use PCIe for
> > > > everything. Realistically it's only HPC users who would want this type
> > > > of machine, at least that's my understanding.
> > >
> > > I'm not so sure about that. Every other architecture has ended up
> > > standardizing on PCI for general purpose virtual machines. IIRC,
> > > aarch64 started off with MMIO, but switched to PCI as it matured.
> > >
> > > In terms of having VM mgmt tools "just work" for risc-v, I think
> > > it will be very compelling for the general 'virt' machine to be
> > > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > > are going to break requiring never ending riscv fixes.
> >
> > Mmm, my experience with aarch64 virt is that PCI is much nicer
> > as a general preference. aarch64 virt has some MMIO devices
> > for historical reasons and some because you can't reasonably
> > do the necessary things with PCI, but I'm actively trying to
> > push people who submit new MMIO device features for virt to
> > try to use a PCI-based solution instead if they possibly can.

Interesting...

Ok, maybe calling this "virt-pcie" might be a good start, with the
expectation to eventually replace the current virt with the new
virt-pcie at some point.

The other option would be to try and gradually change from the current
virt machine to this new virt machine

Alistair

> >
>
> Yeah. That was one of the primary goals of this series. If we have an
> alternate PCI only machine,
> folks will be more motivated to add only PCI based solutions in the
> future. In that case, there will be minimal
> or no change required to the machine code itself.
>
> Just for clarification: We can achieve the same with the current virt
> machine. But it is already bloated with a bunch of MMIO devices
> and will probably continue to do so because of its flexibility. In
> addition to that, any real PCI based platform emulation can not reuse
> the virt machine code which will result in more vendor specific
> implementations in the future..
>
> > thanks
> > -- PMM


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-17  5:03                   ` Alistair Francis
@ 2022-05-17  8:52                     ` Daniel P. Berrangé
  2022-05-17 20:53                       ` Alistair Francis
  0 siblings, 1 reply; 35+ messages in thread
From: Daniel P. Berrangé @ 2022-05-17  8:52 UTC (permalink / raw)
  To: Alistair Francis
  Cc: Atish Kumar Patra, Peter Maydell, Atish Patra, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Tue, May 17, 2022 at 03:03:38PM +1000, Alistair Francis wrote:
> On Sat, May 7, 2022 at 6:30 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:
> >
> > On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> > >
> > > On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > > > Even if we didn't worry about backwards compatibility the current virt
> > > > > machine would still be what most users want. It's just a small number
> > > > > of users who don't want MMIO devices and instead want to use PCIe for
> > > > > everything. Realistically it's only HPC users who would want this type
> > > > > of machine, at least that's my understanding.
> > > >
> > > > I'm not so sure about that. Every other architecture has ended up
> > > > standardizing on PCI for general purpose virtual machines. IIRC,
> > > > aarch64 started off with MMIO, but switched to PCI as it matured.
> > > >
> > > > In terms of having VM mgmt tools "just work" for risc-v, I think
> > > > it will be very compelling for the general 'virt' machine to be
> > > > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > > > are going to break requiring never ending riscv fixes.
> > >
> > > Mmm, my experience with aarch64 virt is that PCI is much nicer
> > > as a general preference. aarch64 virt has some MMIO devices
> > > for historical reasons and some because you can't reasonably
> > > do the necessary things with PCI, but I'm actively trying to
> > > push people who submit new MMIO device features for virt to
> > > try to use a PCI-based solution instead if they possibly can.
> 
> Interesting...
> 
> Ok, maybe calling this "virt-pcie" might be a good start, with the
> expectation to eventually replace the current virt with the new
> virt-pcie at some point.

Delaying the inevitable by leaving PCIE support in a separate
machine type initially is going to be more painful long term.

> The other option would be to try and gradually change from the current
> virt machine to this new virt machine

Yes, I really think the 'virt' machine type needs to aim for PCIE
support sooner rather than later, if RISC-V wants to get on part
with other architectures. The best time to have added PCIE support
to 'virt' was when it was first created, the next best time is now.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-17  8:52                     ` Daniel P. Berrangé
@ 2022-05-17 20:53                       ` Alistair Francis
  2022-05-18  6:38                         ` Atish Patra
  0 siblings, 1 reply; 35+ messages in thread
From: Alistair Francis @ 2022-05-17 20:53 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Atish Kumar Patra, Peter Maydell, Atish Patra, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Tue, May 17, 2022 at 6:52 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Tue, May 17, 2022 at 03:03:38PM +1000, Alistair Francis wrote:
> > On Sat, May 7, 2022 at 6:30 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:
> > >
> > > On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> > > >
> > > > On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > > > > Even if we didn't worry about backwards compatibility the current virt
> > > > > > machine would still be what most users want. It's just a small number
> > > > > > of users who don't want MMIO devices and instead want to use PCIe for
> > > > > > everything. Realistically it's only HPC users who would want this type
> > > > > > of machine, at least that's my understanding.
> > > > >
> > > > > I'm not so sure about that. Every other architecture has ended up
> > > > > standardizing on PCI for general purpose virtual machines. IIRC,
> > > > > aarch64 started off with MMIO, but switched to PCI as it matured.
> > > > >
> > > > > In terms of having VM mgmt tools "just work" for risc-v, I think
> > > > > it will be very compelling for the general 'virt' machine to be
> > > > > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > > > > are going to break requiring never ending riscv fixes.
> > > >
> > > > Mmm, my experience with aarch64 virt is that PCI is much nicer
> > > > as a general preference. aarch64 virt has some MMIO devices
> > > > for historical reasons and some because you can't reasonably
> > > > do the necessary things with PCI, but I'm actively trying to
> > > > push people who submit new MMIO device features for virt to
> > > > try to use a PCI-based solution instead if they possibly can.
> >
> > Interesting...
> >
> > Ok, maybe calling this "virt-pcie" might be a good start, with the
> > expectation to eventually replace the current virt with the new
> > virt-pcie at some point.
>
> Delaying the inevitable by leaving PCIE support in a separate
> machine type initially is going to be more painful long term.
>
> > The other option would be to try and gradually change from the current
> > virt machine to this new virt machine
>
> Yes, I really think the 'virt' machine type needs to aim for PCIE
> support sooner rather than later, if RISC-V wants to get on part
> with other architectures. The best time to have added PCIE support
> to 'virt' was when it was first created, the next best time is now.

So maybe instead we lock in the current virt machine as the 7.1 virt
machine for QEMU 7.1, then work on migrating to a PCIe only machine
with versions (similar to the other archs)

Alistair

>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-17 20:53                       ` Alistair Francis
@ 2022-05-18  6:38                         ` Atish Patra
  2022-05-18  8:25                           ` Daniel P. Berrangé
  2022-05-23  5:59                           ` Alistair Francis
  0 siblings, 2 replies; 35+ messages in thread
From: Atish Patra @ 2022-05-18  6:38 UTC (permalink / raw)
  To: Alistair Francis
  Cc: Daniel P. Berrangé,
	Atish Kumar Patra, Peter Maydell, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Tue, May 17, 2022 at 1:54 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Tue, May 17, 2022 at 6:52 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Tue, May 17, 2022 at 03:03:38PM +1000, Alistair Francis wrote:
> > > On Sat, May 7, 2022 at 6:30 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:
> > > >
> > > > On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> > > > >
> > > > > On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > > > > > Even if we didn't worry about backwards compatibility the current virt
> > > > > > > machine would still be what most users want. It's just a small number
> > > > > > > of users who don't want MMIO devices and instead want to use PCIe for
> > > > > > > everything. Realistically it's only HPC users who would want this type
> > > > > > > of machine, at least that's my understanding.
> > > > > >
> > > > > > I'm not so sure about that. Every other architecture has ended up
> > > > > > standardizing on PCI for general purpose virtual machines. IIRC,
> > > > > > aarch64 started off with MMIO, but switched to PCI as it matured.
> > > > > >
> > > > > > In terms of having VM mgmt tools "just work" for risc-v, I think
> > > > > > it will be very compelling for the general 'virt' machine to be
> > > > > > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > > > > > are going to break requiring never ending riscv fixes.
> > > > >
> > > > > Mmm, my experience with aarch64 virt is that PCI is much nicer
> > > > > as a general preference. aarch64 virt has some MMIO devices
> > > > > for historical reasons and some because you can't reasonably
> > > > > do the necessary things with PCI, but I'm actively trying to
> > > > > push people who submit new MMIO device features for virt to
> > > > > try to use a PCI-based solution instead if they possibly can.
> > >
> > > Interesting...
> > >
> > > Ok, maybe calling this "virt-pcie" might be a good start, with the
> > > expectation to eventually replace the current virt with the new
> > > virt-pcie at some point.
> >
> > Delaying the inevitable by leaving PCIE support in a separate
> > machine type initially is going to be more painful long term.
> >
> > > The other option would be to try and gradually change from the current
> > > virt machine to this new virt machine
> >
> > Yes, I really think the 'virt' machine type needs to aim for PCIE
> > support sooner rather than later, if RISC-V wants to get on part
> > with other architectures. The best time to have added PCIE support
> > to 'virt' was when it was first created, the next best time is now.
>
> So maybe instead we lock in the current virt machine as the 7.1 virt
> machine for QEMU 7.1, then work on migrating to a PCIe only machine
> with versions (similar to the other archs)
>

I am not quite sure what exactly you mean here. Do you mean to modify
the current virt
machine to be PCIE only after QEMU 7.1 or the new PCIE only machine
(with the versioning)
which will be the default machine in the future

If you intend to say the former, few issues with that approach.

1. virt machine is not well documented and already bloated. There is
no specification for virt machine as such.
Putting restrictions after a certain release will lead to confusion.
2. Do we support existing MMIO devices after that specific version or not ?
3. The user has to be aware of which version of virt machine it is
running in order to test specific features (i.e. flash, reset, wired
interrupts)
4. Based on the version of the virt machine, the command line options
will change. This may also be confusing.

On the other hand, the second approach will be much cleaner without
any baggage. The RISC-V eco-system is still maturing and this is the
right time
to start something fresh. Let's call the new machine virt-pcie for
now. Here are a few things that can be implemented for this machine.

1. It must support versioning and any changes to the machine code must
modify the version of the machine.
2. It must only support MSI based PCIe devices. No support for wired interrupts.
    The only allowed MMIO devices are
           -- mtimer/mtimecmp (there is no other option provided in ISA)
           -- reset/rtc device (If there is a way we can emulate these
two over PCIe, that would be great)
           -- flash
Beyond this, adding any other MMIO device must be strongly discouraged.
3. The virt-pcie machine must be well documented similar to a hardware
specification (including address range, restrictions and
implemented/allowed devices)
4. No dependence on virtio framework but must work with virtio-pcie backend.
5. Migration must be supported.
6. No command line option is required.
7. Any other ?

Once we have these policies in place, this can be the preferred virt
machine and the current virt machine can be phased out or continue to
co-exist
as a more flexible experimental platform.

Thoughts ?

> Alistair
>
> >
> > With regards,
> > Daniel
> > --
> > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> >



-- 
Regards,
Atish


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-18  6:38                         ` Atish Patra
@ 2022-05-18  8:25                           ` Daniel P. Berrangé
  2022-05-18 10:46                             ` Peter Maydell
  2022-05-23  5:59                           ` Alistair Francis
  1 sibling, 1 reply; 35+ messages in thread
From: Daniel P. Berrangé @ 2022-05-18  8:25 UTC (permalink / raw)
  To: Atish Patra
  Cc: Alistair Francis, Atish Kumar Patra, Peter Maydell,
	open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Tue, May 17, 2022 at 11:38:39PM -0700, Atish Patra wrote:
> On Tue, May 17, 2022 at 1:54 PM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Tue, May 17, 2022 at 6:52 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Tue, May 17, 2022 at 03:03:38PM +1000, Alistair Francis wrote:
> > > > On Sat, May 7, 2022 at 6:30 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:
> > > > >
> > > > > On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> > > > > >
> > > > > > On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > >
> > > > > > > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > > > > > > Even if we didn't worry about backwards compatibility the current virt
> > > > > > > > machine would still be what most users want. It's just a small number
> > > > > > > > of users who don't want MMIO devices and instead want to use PCIe for
> > > > > > > > everything. Realistically it's only HPC users who would want this type
> > > > > > > > of machine, at least that's my understanding.
> > > > > > >
> > > > > > > I'm not so sure about that. Every other architecture has ended up
> > > > > > > standardizing on PCI for general purpose virtual machines. IIRC,
> > > > > > > aarch64 started off with MMIO, but switched to PCI as it matured.
> > > > > > >
> > > > > > > In terms of having VM mgmt tools "just work" for risc-v, I think
> > > > > > > it will be very compelling for the general 'virt' machine to be
> > > > > > > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > > > > > > are going to break requiring never ending riscv fixes.
> > > > > >
> > > > > > Mmm, my experience with aarch64 virt is that PCI is much nicer
> > > > > > as a general preference. aarch64 virt has some MMIO devices
> > > > > > for historical reasons and some because you can't reasonably
> > > > > > do the necessary things with PCI, but I'm actively trying to
> > > > > > push people who submit new MMIO device features for virt to
> > > > > > try to use a PCI-based solution instead if they possibly can.
> > > >
> > > > Interesting...
> > > >
> > > > Ok, maybe calling this "virt-pcie" might be a good start, with the
> > > > expectation to eventually replace the current virt with the new
> > > > virt-pcie at some point.
> > >
> > > Delaying the inevitable by leaving PCIE support in a separate
> > > machine type initially is going to be more painful long term.
> > >
> > > > The other option would be to try and gradually change from the current
> > > > virt machine to this new virt machine
> > >
> > > Yes, I really think the 'virt' machine type needs to aim for PCIE
> > > support sooner rather than later, if RISC-V wants to get on part
> > > with other architectures. The best time to have added PCIE support
> > > to 'virt' was when it was first created, the next best time is now.
> >
> > So maybe instead we lock in the current virt machine as the 7.1 virt
> > machine for QEMU 7.1, then work on migrating to a PCIe only machine
> > with versions (similar to the other archs)
> >
> 
> I am not quite sure what exactly you mean here. Do you mean to modify
> the current virt
> machine to be PCIE only after QEMU 7.1 or the new PCIE only machine
> (with the versioning)
> which will be the default machine in the future
> 
> If you intend to say the former, few issues with that approach.
> 
> 1. virt machine is not well documented and already bloated. There is
> no specification for virt machine as such.
> Putting restrictions after a certain release will lead to confusion.
> 2. Do we support existing MMIO devices after that specific version or not ?
> 3. The user has to be aware of which version of virt machine it is
> running in order to test specific features (i.e. flash, reset, wired
> interrupts)
> 4. Based on the version of the virt machine, the command line options
> will change. This may also be confusing.
> 
> On the other hand, the second approach will be much cleaner without
> any baggage. The RISC-V eco-system is still maturing and this is the
> right time
> to start something fresh. Let's call the new machine virt-pcie for
> now. Here are a few things that can be implemented for this machine.

The fact that RISC-V ecosystem is so young & has relatively few
users, and even fewer expecting  long term stability, is precisely
why we should just modify the existing 'virt' machine now rather
than introducing a new 'virt-pcie'. We can afford to have the
limited incompatibility in the short term given the small userbase.
We went through this same exercise with aarch64 virt machine and
despite the short term disruption, it was a good success IMHO to
get it switched from MMIO to PCI, instead of having two machines
in parallel long term.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-18  8:25                           ` Daniel P. Berrangé
@ 2022-05-18 10:46                             ` Peter Maydell
  2022-05-19 18:16                               ` Atish Kumar Patra
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Maydell @ 2022-05-18 10:46 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Atish Patra, Alistair Francis, Atish Kumar Patra,
	open list:RISC-V, Michael S. Tsirkin, Bin Meng,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Paolo Bonzini, Palmer Dabbelt

On Wed, 18 May 2022 at 09:25, Daniel P. Berrangé <berrange@redhat.com> wrote:
> The fact that RISC-V ecosystem is so young & has relatively few
> users, and even fewer expecting  long term stability, is precisely
> why we should just modify the existing 'virt' machine now rather
> than introducing a new 'virt-pcie'. We can afford to have the
> limited incompatibility in the short term given the small userbase.
> We went through this same exercise with aarch64 virt machine and
> despite the short term disruption, it was a good success IMHO to
> get it switched from MMIO to PCI, instead of having two machines
> in parallel long term.

The aarch64 virt board does still carry around the mmio devices,
though...it's just that we have pci as well now.

Personally I don't think that switching to a new machine type
is likely to help escape from the "bloat" problem, which arises
from two conflicting desires:

 (1) people want this kind of board to be nice and small and
     simple, with a minimal set of devices
 (2) everybody has their own "but this specific one device is
     really important and it should be in the minimal set"
     (watchdog? acpi? ability to power the machine on and off?
     second UART? i2c? etc etc etc)

So either your 'minimal' board is only serving a small subset
of the users who want a minimal board; or else it's not as
minimal as any of them would like; or else it acquires a
growing set of -machine options to turn various devices on
and off...

-- PMM


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-18 10:46                             ` Peter Maydell
@ 2022-05-19 18:16                               ` Atish Kumar Patra
  0 siblings, 0 replies; 35+ messages in thread
From: Atish Kumar Patra @ 2022-05-19 18:16 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Daniel P. Berrangé,
	Atish Patra, Alistair Francis, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Wed, May 18, 2022 at 3:46 AM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Wed, 18 May 2022 at 09:25, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > The fact that RISC-V ecosystem is so young & has relatively few
> > users, and even fewer expecting  long term stability, is precisely
> > why we should just modify the existing 'virt' machine now rather
> > than introducing a new 'virt-pcie'. We can afford to have the
> > limited incompatibility in the short term given the small userbase.
> > We went through this same exercise with aarch64 virt machine and
> > despite the short term disruption, it was a good success IMHO to
> > get it switched from MMIO to PCI, instead of having two machines
> > in parallel long term.
>
> The aarch64 virt board does still carry around the mmio devices,
> though...it's just that we have pci as well now.
>
> Personally I don't think that switching to a new machine type
> is likely to help escape from the "bloat" problem, which arises
> from two conflicting desires:
>
>  (1) people want this kind of board to be nice and small and
>      simple, with a minimal set of devices
>  (2) everybody has their own "but this specific one device is
>      really important and it should be in the minimal set"
>      (watchdog? acpi? ability to power the machine on and off?
>      second UART? i2c? etc etc etc)
>

Both acpi/device tree support should be there anyways.
MMIO based reset will probably needed as well (I listed that earlier
with mandatory MMIO devices)

AFAIK everything else can be PCIe based which the new board will mandate.
It must strictly enforce the rules about what can be added to it. The
bar to allow
new MMIO devices must be very high and must have a wide range of usage.
This will make life easier for the entire ecosystem as well. AFAIK,
libvirt uses PCIe devices only to build VMs.

I understand that is probably a big ask but if odd mmio devices sneak
into this platform, then that defeats the purpose.
On other hand, having a flag day for virt machines creates a lot of
incompatibility for the users until everyone transitions.
The transition also has to happen based on Qemu version as virt
machine doesn't have any versioning right now.

Do we make users' life difficult by having a flag date based on the
Qemu version or take additional responsibility of maintaining another
board ?
I hope the new board will continue to be small so the maintenance
burden is not too much. Personally, I feel the latter approach will
have minimum inconvenience for everybody
but I am okay with whatever is decided by the community.



> So either your 'minimal' board is only serving a small subset
> of the users who want a minimal board; or else it's not as
> minimal as any of them would like; or else it acquires a
> growing set of -machine options to turn various devices on
> and off...
>
> -- PMM


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-18  6:38                         ` Atish Patra
  2022-05-18  8:25                           ` Daniel P. Berrangé
@ 2022-05-23  5:59                           ` Alistair Francis
  2022-05-24  3:16                             ` Atish Patra
  1 sibling, 1 reply; 35+ messages in thread
From: Alistair Francis @ 2022-05-23  5:59 UTC (permalink / raw)
  To: Atish Patra
  Cc: Daniel P. Berrangé,
	Atish Kumar Patra, Peter Maydell, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Wed, May 18, 2022 at 4:38 PM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Tue, May 17, 2022 at 1:54 PM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Tue, May 17, 2022 at 6:52 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Tue, May 17, 2022 at 03:03:38PM +1000, Alistair Francis wrote:
> > > > On Sat, May 7, 2022 at 6:30 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:
> > > > >
> > > > > On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> > > > > >
> > > > > > On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > >
> > > > > > > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > > > > > > Even if we didn't worry about backwards compatibility the current virt
> > > > > > > > machine would still be what most users want. It's just a small number
> > > > > > > > of users who don't want MMIO devices and instead want to use PCIe for
> > > > > > > > everything. Realistically it's only HPC users who would want this type
> > > > > > > > of machine, at least that's my understanding.
> > > > > > >
> > > > > > > I'm not so sure about that. Every other architecture has ended up
> > > > > > > standardizing on PCI for general purpose virtual machines. IIRC,
> > > > > > > aarch64 started off with MMIO, but switched to PCI as it matured.
> > > > > > >
> > > > > > > In terms of having VM mgmt tools "just work" for risc-v, I think
> > > > > > > it will be very compelling for the general 'virt' machine to be
> > > > > > > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > > > > > > are going to break requiring never ending riscv fixes.
> > > > > >
> > > > > > Mmm, my experience with aarch64 virt is that PCI is much nicer
> > > > > > as a general preference. aarch64 virt has some MMIO devices
> > > > > > for historical reasons and some because you can't reasonably
> > > > > > do the necessary things with PCI, but I'm actively trying to
> > > > > > push people who submit new MMIO device features for virt to
> > > > > > try to use a PCI-based solution instead if they possibly can.
> > > >
> > > > Interesting...
> > > >
> > > > Ok, maybe calling this "virt-pcie" might be a good start, with the
> > > > expectation to eventually replace the current virt with the new
> > > > virt-pcie at some point.
> > >
> > > Delaying the inevitable by leaving PCIE support in a separate
> > > machine type initially is going to be more painful long term.
> > >
> > > > The other option would be to try and gradually change from the current
> > > > virt machine to this new virt machine
> > >
> > > Yes, I really think the 'virt' machine type needs to aim for PCIE
> > > support sooner rather than later, if RISC-V wants to get on part
> > > with other architectures. The best time to have added PCIE support
> > > to 'virt' was when it was first created, the next best time is now.
> >
> > So maybe instead we lock in the current virt machine as the 7.1 virt
> > machine for QEMU 7.1, then work on migrating to a PCIe only machine
> > with versions (similar to the other archs)
> >
>
> I am not quite sure what exactly you mean here. Do you mean to modify
> the current virt
> machine to be PCIE only after QEMU 7.1 or the new PCIE only machine
> (with the versioning)
> which will be the default machine in the future

I mean that we call the current virt machine the virt machine for QEMU
7.1. Then for future releases we can make breaking changes, where we
have the old 7.1 virt machine for backwards compatibility.

>
> If you intend to say the former, few issues with that approach.
>
> 1. virt machine is not well documented and already bloated. There is
> no specification for virt machine as such.
> Putting restrictions after a certain release will lead to confusion.
> 2. Do we support existing MMIO devices after that specific version or not ?

Yeah, so I guess this doesn't achieve the same outcome you want. I
would say we would still include some MMIO devices, like UART for
example.

But we could simplify things a bit. So for example maybe we could use
AIA by default and then remove the PLIC code. That would help cleanup
the board file. Then we could add a `msi-only` option that would be
similar to https://github.com/atishp04/qemu/commit/d7fc1c6aa7855b414b3484672a076140166a2dcd.
But without the PLIC it should hopefully be cleaner

We would need AIA ratified before we could remove the PLIC though.

> 3. The user has to be aware of which version of virt machine it is
> running in order to test specific features (i.e. flash, reset, wired
> interrupts)

That's true, but I think we are going to have this issue someday
anyway. We don't want to use the SiFive CLINT and PLIC forever,
eventually we will want to use the newer hardware.

> 4. Based on the version of the virt machine, the command line options
> will change. This may also be confusing.
>
> On the other hand, the second approach will be much cleaner without
> any baggage. The RISC-V eco-system is still maturing and this is the
> right time
> to start something fresh. Let's call the new machine virt-pcie for
> now. Here are a few things that can be implemented for this machine.
>
> 1. It must support versioning and any changes to the machine code must
> modify the version of the machine.
> 2. It must only support MSI based PCIe devices. No support for wired interrupts.
>     The only allowed MMIO devices are
>            -- mtimer/mtimecmp (there is no other option provided in ISA)
>            -- reset/rtc device (If there is a way we can emulate these
> two over PCIe, that would be great)
>            -- flash
> Beyond this, adding any other MMIO device must be strongly discouraged.
> 3. The virt-pcie machine must be well documented similar to a hardware
> specification (including address range, restrictions and
> implemented/allowed devices)
> 4. No dependence on virtio framework but must work with virtio-pcie backend.
> 5. Migration must be supported.

I'm on board with these! They would all be great to have.

I'm open to a virt-pcie, but as others have pointed out it might be
easier to just make the switch now to the general board.

> 6. No command line option is required.

I don't see this being achievable unfortunately

> 7. Any other ?
>
> Once we have these policies in place, this can be the preferred virt
> machine and the current virt machine can be phased out or continue to
> co-exist
> as a more flexible experimental platform.
>
> Thoughts ?
>
> > Alistair
> >
> > >
> > > With regards,
> > > Daniel
> > > --
> > > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> > >
>
>
>
> --
> Regards,
> Atish


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-23  5:59                           ` Alistair Francis
@ 2022-05-24  3:16                             ` Atish Patra
  2022-05-24 15:56                               ` Andrea Bolognani
  0 siblings, 1 reply; 35+ messages in thread
From: Atish Patra @ 2022-05-24  3:16 UTC (permalink / raw)
  To: Alistair Francis
  Cc: Daniel P. Berrangé,
	Atish Kumar Patra, Peter Maydell, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Sun, May 22, 2022 at 10:59 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Wed, May 18, 2022 at 4:38 PM Atish Patra <atishp@atishpatra.org> wrote:
> >
> > On Tue, May 17, 2022 at 1:54 PM Alistair Francis <alistair23@gmail.com> wrote:
> > >
> > > On Tue, May 17, 2022 at 6:52 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > On Tue, May 17, 2022 at 03:03:38PM +1000, Alistair Francis wrote:
> > > > > On Sat, May 7, 2022 at 6:30 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:
> > > > > >
> > > > > > On Fri, May 6, 2022 at 4:00 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> > > > > > >
> > > > > > > On Fri, 6 May 2022 at 09:18, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, May 06, 2022 at 06:34:47AM +1000, Alistair Francis wrote:
> > > > > > > > > Even if we didn't worry about backwards compatibility the current virt
> > > > > > > > > machine would still be what most users want. It's just a small number
> > > > > > > > > of users who don't want MMIO devices and instead want to use PCIe for
> > > > > > > > > everything. Realistically it's only HPC users who would want this type
> > > > > > > > > of machine, at least that's my understanding.
> > > > > > > >
> > > > > > > > I'm not so sure about that. Every other architecture has ended up
> > > > > > > > standardizing on PCI for general purpose virtual machines. IIRC,
> > > > > > > > aarch64 started off with MMIO, but switched to PCI as it matured.
> > > > > > > >
> > > > > > > > In terms of having VM mgmt tools "just work" for risc-v, I think
> > > > > > > > it will be very compelling for the general 'virt' machine to be
> > > > > > > > PCI based, otherwise all the assumptions about PCI in mgmt apps
> > > > > > > > are going to break requiring never ending riscv fixes.
> > > > > > >
> > > > > > > Mmm, my experience with aarch64 virt is that PCI is much nicer
> > > > > > > as a general preference. aarch64 virt has some MMIO devices
> > > > > > > for historical reasons and some because you can't reasonably
> > > > > > > do the necessary things with PCI, but I'm actively trying to
> > > > > > > push people who submit new MMIO device features for virt to
> > > > > > > try to use a PCI-based solution instead if they possibly can.
> > > > >
> > > > > Interesting...
> > > > >
> > > > > Ok, maybe calling this "virt-pcie" might be a good start, with the
> > > > > expectation to eventually replace the current virt with the new
> > > > > virt-pcie at some point.
> > > >
> > > > Delaying the inevitable by leaving PCIE support in a separate
> > > > machine type initially is going to be more painful long term.
> > > >
> > > > > The other option would be to try and gradually change from the current
> > > > > virt machine to this new virt machine
> > > >
> > > > Yes, I really think the 'virt' machine type needs to aim for PCIE
> > > > support sooner rather than later, if RISC-V wants to get on part
> > > > with other architectures. The best time to have added PCIE support
> > > > to 'virt' was when it was first created, the next best time is now.
> > >
> > > So maybe instead we lock in the current virt machine as the 7.1 virt
> > > machine for QEMU 7.1, then work on migrating to a PCIe only machine
> > > with versions (similar to the other archs)
> > >
> >
> > I am not quite sure what exactly you mean here. Do you mean to modify
> > the current virt
> > machine to be PCIE only after QEMU 7.1 or the new PCIE only machine
> > (with the versioning)
> > which will be the default machine in the future
>
> I mean that we call the current virt machine the virt machine for QEMU
> 7.1. Then for future releases we can make breaking changes, where we
> have the old 7.1 virt machine for backwards compatibility.
>
> >
> > If you intend to say the former, few issues with that approach.
> >
> > 1. virt machine is not well documented and already bloated. There is
> > no specification for virt machine as such.
> > Putting restrictions after a certain release will lead to confusion.
> > 2. Do we support existing MMIO devices after that specific version or not ?
>
> Yeah, so I guess this doesn't achieve the same outcome you want. I
> would say we would still include some MMIO devices, like UART for
> example.
>

Why ? We can just rely on the pcie based uart (virtio-serial-pci or
serial-pci) should be enough.
The only MMIO devices that should be allowed are the ones that can't
be behind pcie.

> But we could simplify things a bit. So for example maybe we could use
> AIA by default and then remove the PLIC code. That would help cleanup
> the board file. Then we could add a `msi-only` option that would be
> similar to https://github.com/atishp04/qemu/commit/d7fc1c6aa7855b414b3484672a076140166a2dcd.
> But without the PLIC it should hopefully be cleaner
>
> We would need AIA ratified before we could remove the PLIC though.
>

And AIA patches available in the upstream Linux kernel.
Even after that, can we just remove the PLIC ?
That would mean everybody has to use the latest kernel with AIA
support after that.

> > 3. The user has to be aware of which version of virt machine it is
> > running in order to test specific features (i.e. flash, reset, wired
> > interrupts)
>
> That's true, but I think we are going to have this issue someday
> anyway. We don't want to use the SiFive CLINT and PLIC forever,
> eventually we will want to use the newer hardware.
>

Agreed. But It should be disabed by default at first. In the future it
can be removed.
Otherwise, we end up breaking a bunch of user configurations.

> > 4. Based on the version of the virt machine, the command line options
> > will change. This may also be confusing.
> >
> > On the other hand, the second approach will be much cleaner without
> > any baggage. The RISC-V eco-system is still maturing and this is the
> > right time
> > to start something fresh. Let's call the new machine virt-pcie for
> > now. Here are a few things that can be implemented for this machine.
> >
> > 1. It must support versioning and any changes to the machine code must
> > modify the version of the machine.
> > 2. It must only support MSI based PCIe devices. No support for wired interrupts.
> >     The only allowed MMIO devices are
> >            -- mtimer/mtimecmp (there is no other option provided in ISA)
> >            -- reset/rtc device (If there is a way we can emulate these
> > two over PCIe, that would be great)
> >            -- flash
> > Beyond this, adding any other MMIO device must be strongly discouraged.
> > 3. The virt-pcie machine must be well documented similar to a hardware
> > specification (including address range, restrictions and
> > implemented/allowed devices)
> > 4. No dependence on virtio framework but must work with virtio-pcie backend.
> > 5. Migration must be supported.
>
> I'm on board with these! They would all be great to have.
>
> I'm open to a virt-pcie, but as others have pointed out it might be
> easier to just make the switch now to the general board.
>
> > 6. No command line option is required.
>
> I don't see this being achievable unfortunately
>

I meant no command line configurability options for the machine
itself. We probably should
allow using different versions of the machine via command line.

> > 7. Any other ?
> >
> > Once we have these policies in place, this can be the preferred virt
> > machine and the current virt machine can be phased out or continue to
> > co-exist
> > as a more flexible experimental platform.
> >
> > Thoughts ?
> >
> > > Alistair
> > >
> > > >
> > > > With regards,
> > > > Daniel
> > > > --
> > > > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > > > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > > > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> > > >
> >
> >
> >
> > --
> > Regards,
> > Atish



-- 
Regards,
Atish


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-24  3:16                             ` Atish Patra
@ 2022-05-24 15:56                               ` Andrea Bolognani
  2022-06-01  2:01                                 ` Alistair Francis
  0 siblings, 1 reply; 35+ messages in thread
From: Andrea Bolognani @ 2022-05-24 15:56 UTC (permalink / raw)
  To: Atish Patra
  Cc: Alistair Francis, Daniel P. Berrangé,
	Atish Kumar Patra, Peter Maydell, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Mon, May 23, 2022 at 08:16:40PM -0700, Atish Patra wrote:
> On Sun, May 22, 2022 at 10:59 PM Alistair Francis <alistair23@gmail.com> wrote:
> > On Wed, May 18, 2022 at 4:38 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > 1. virt machine is not well documented and already bloated. There is
> > > no specification for virt machine as such.
> > > Putting restrictions after a certain release will lead to confusion.
> > > 2. Do we support existing MMIO devices after that specific version or not ?
> >
> > Yeah, so I guess this doesn't achieve the same outcome you want. I
> > would say we would still include some MMIO devices, like UART for
> > example.
>
> Why ? We can just rely on the pcie based uart (virtio-serial-pci or
> serial-pci) should be enough.
> The only MMIO devices that should be allowed are the ones that can't
> be behind pcie.

IIRC virtio-serial is initialized too late to catch messages produced
very early by the firmware (and possibly the kernel), which means
it's okay for regular usage but not when trying to debug an entire
class of boot issues.

Either way, it looks like you wouldn't be able to completely get rid
of MMIO even if you introduced a new virt-pcie machine type. That's
the same for the aarch64 virt machine. I agree with Dan that we
should follow the example set by that architecture - it has worked
out pretty well.

If there is a desire to reduce the complexity of the "standard"
machine type, can we just take the current virt machine type and
rename it to something else? And have your simpler code take over the
virt name? Sure, it will cause some pain in the short term, but the
RISC-V ecosystem is still young enough for it to not be a deal
breaker.

-- 
Andrea Bolognani / Red Hat / Virtualization



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] Introduce a new Qemu machine for RISC-V
  2022-05-24 15:56                               ` Andrea Bolognani
@ 2022-06-01  2:01                                 ` Alistair Francis
  0 siblings, 0 replies; 35+ messages in thread
From: Alistair Francis @ 2022-06-01  2:01 UTC (permalink / raw)
  To: Andrea Bolognani
  Cc: Atish Patra, Daniel P. Berrangé,
	Atish Kumar Patra, Peter Maydell, open list:RISC-V,
	Michael S. Tsirkin, Bin Meng, qemu-devel@nongnu.org Developers,
	Alistair Francis, Paolo Bonzini, Palmer Dabbelt

On Wed, May 25, 2022 at 1:56 AM Andrea Bolognani <abologna@redhat.com> wrote:
>
> On Mon, May 23, 2022 at 08:16:40PM -0700, Atish Patra wrote:
> > On Sun, May 22, 2022 at 10:59 PM Alistair Francis <alistair23@gmail.com> wrote:
> > > On Wed, May 18, 2022 at 4:38 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > 1. virt machine is not well documented and already bloated. There is
> > > > no specification for virt machine as such.
> > > > Putting restrictions after a certain release will lead to confusion.
> > > > 2. Do we support existing MMIO devices after that specific version or not ?
> > >
> > > Yeah, so I guess this doesn't achieve the same outcome you want. I
> > > would say we would still include some MMIO devices, like UART for
> > > example.
> >
> > Why ? We can just rely on the pcie based uart (virtio-serial-pci or
> > serial-pci) should be enough.
> > The only MMIO devices that should be allowed are the ones that can't
> > be behind pcie.
>
> IIRC virtio-serial is initialized too late to catch messages produced
> very early by the firmware (and possibly the kernel), which means
> it's okay for regular usage but not when trying to debug an entire
> class of boot issues.

Agreed. OpenSBI doesn't even support PCIe so we need an MMIO UART for
OpenSBI to be able to print messages

Alistair

>
> Either way, it looks like you wouldn't be able to completely get rid
> of MMIO even if you introduced a new virt-pcie machine type. That's
> the same for the aarch64 virt machine. I agree with Dan that we
> should follow the example set by that architecture - it has worked
> out pretty well.
>
> If there is a desire to reduce the complexity of the "standard"
> machine type, can we just take the current virt machine type and
> rename it to something else? And have your simpler code take over the
> virt name? Sure, it will cause some pain in the short term, but the
> RISC-V ecosystem is still young enough for it to not be a deal
> breaker.
>
> --
> Andrea Bolognani / Red Hat / Virtualization
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-06-01  2:03 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-12  2:10 [RFC 0/3] Introduce a new Qemu machine for RISC-V Atish Patra
2022-04-12  2:10 ` Atish Patra
2022-04-12  2:10 ` [RFC 1/3] serial: Enable MSI capablity and option Atish Patra
2022-04-12  2:10   ` Atish Patra
2022-04-12 15:59   ` Marc Zyngier
2022-04-12 15:59     ` Marc Zyngier
2022-04-12  2:10 ` [RFC 2/3] hw/riscv: virt: Move common functions to a separate helper file Atish Patra
2022-04-12  2:10   ` Atish Patra
2022-04-12  2:10 ` [RFC 3/3] hw/riscv: Create a new qemu machine for RISC-V Atish Patra
2022-04-12  2:10   ` Atish Patra
2022-04-19 16:51 ` [RFC 0/3] Introduce a new Qemu " Daniel P. Berrangé
2022-04-19 16:51   ` Daniel P. Berrangé
2022-04-20  0:26   ` Atish Patra
2022-04-20  0:26     ` Atish Patra
2022-05-03  7:22     ` Atish Patra
2022-05-05  9:36       ` Alistair Francis
2022-05-05 10:04         ` Daniel P. Berrangé
2022-05-05 20:34           ` Alistair Francis
2022-05-05 22:19             ` Atish Kumar Patra
2022-05-06  8:16             ` Daniel P. Berrangé
2022-05-06 10:59               ` Peter Maydell
2022-05-06 20:30                 ` Atish Kumar Patra
2022-05-17  5:03                   ` Alistair Francis
2022-05-17  8:52                     ` Daniel P. Berrangé
2022-05-17 20:53                       ` Alistair Francis
2022-05-18  6:38                         ` Atish Patra
2022-05-18  8:25                           ` Daniel P. Berrangé
2022-05-18 10:46                             ` Peter Maydell
2022-05-19 18:16                               ` Atish Kumar Patra
2022-05-23  5:59                           ` Alistair Francis
2022-05-24  3:16                             ` Atish Patra
2022-05-24 15:56                               ` Andrea Bolognani
2022-06-01  2:01                                 ` Alistair Francis
2022-05-06  4:01           ` Anup Patel
2022-05-05 21:29         ` Atish Kumar Patra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.