All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console
@ 2023-10-25 14:50 David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 01/28] i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel David Woodhouse
                   ` (27 more replies)
  0 siblings, 28 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

Round up a handful of outstanding fixes, add console support and fix up
per-vCPU upcall vector support (which was previously untested), and that
allows us to boot PV guests using the Xen "PV shim".

Having fixed up the per-vCPU upcall vector support, pull in slightly 
newer Xen headers just for the definition of the CPUID bit that lets
us tell the Linux (6.0+) kernel to use it. That'll help with testing.

v2 of the series added network support for emulated Xen guests by 
converting the xen_nic driver to the "new" XenDevice model. That had 
been on the TODO list for a while, and having done the console it made
sense to do network too while I still remembered it all.

But then there was a bit more breakage to deal with. Even before the 
conversion to XenDevice, the xen_nic support was already hosed for 
actual Xen PV guests (-m xenfv) because it never actually managed to 
connect the Xen network device to a netdev. And net_cleanup() was 
freeing the NICs from underneath the device models which own them, which 
upsets the devices that actually do their own cleanup.

Simplify the user experience for "-device file=IMAGE,if=xen" because
that had been offending me for a while, and now I knew how to do that
too.

Update the documentation, and take the opportunity to fix up that bit 
about unplug not working on Q35, because I *also* worked out how to do 
that when heckling Joel's attempt to do so.

v3 of the series includes more cleanups to the -nic option handling,
allowing me to ditch some of the ugly special cases for registering
the Xen network device for xenfv vs. pc platforms, and replace it with
a simple "register all NICs on this bus" which PCI can use too. That
was seen in a 45-patch series all of its own, but now I've pulled in
just the basic part that's needed for Xen support, and the remaining
bombing run on all the platforms can wait; I won't spam the list with
the rest of that again just yet.

David Woodhouse (28):
      i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel
      i386/xen: fix per-vCPU upcall vector for Xen emulation
      hw/xen: select kernel mode for per-vCPU event channel upcall vector
      hw/xen: don't clear map_track[] in xen_gnttab_reset()
      hw/xen: fix XenStore watch delivery to guest
      hw/xen: take iothread mutex in xen_evtchn_reset_op()
      hw/xen: use correct default protocol for xen-block on x86
      i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer()
      hw/xen: Clean up event channel 'type_val' handling to use union
      include: update Xen public headers to Xen 4.17.2 release
      i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID
      hw/xen: populate store frontend nodes with XenStore PFN/port
      hw/xen: automatically assign device index to block devices
      hw/xen: add get_frontend_path() method to XenDeviceClass
      hw/xen: do not repeatedly try to create a failing backend device
      hw/xen: update Xen console to XenDevice model
      hw/xen: add support for Xen primary console in emulated mode
      hw/xen: only remove peers of PCI NICs on unplug
      hw/xen: update Xen PV NIC to XenDevice model
      net: do not delete nics in net_cleanup()
      xen-platform: unplug AHCI disks
      net: add qemu_{configure,create}_nic_device(), qemu_find_nic_info()
      net: report list of available models according to platform
      net: add qemu_create_nic_bus_devices()
      hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot()
      hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices()
      hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs
      docs: update Xen-on-KVM documentation

 MAINTAINERS                                    |   2 +-
 blockdev.c                                     |  15 +-
 docs/system/i386/xen.rst                       | 100 +++--
 hw/block/xen-block.c                           |  44 +-
 hw/char/trace-events                           |   8 +
 hw/char/xen_console.c                          | 572 +++++++++++++++++++------
 hw/i386/kvm/meson.build                        |   1 +
 hw/i386/kvm/trace-events                       |   2 +
 hw/i386/kvm/xen-stubs.c                        |   8 +
 hw/i386/kvm/xen_evtchn.c                       | 158 ++++---
 hw/i386/kvm/xen_gnttab.c                       |   7 +-
 hw/i386/kvm/xen_primary_console.c              | 193 +++++++++
 hw/i386/kvm/xen_primary_console.h              |  23 +
 hw/i386/kvm/xen_xenstore.c                     |  31 +-
 hw/i386/pc.c                                   |  21 +-
 hw/i386/xen/xen_platform.c                     |  77 ++--
 hw/net/meson.build                             |   2 +-
 hw/net/trace-events                            |  11 +
 hw/net/xen_nic.c                               | 484 ++++++++++++++++-----
 hw/pci/pci.c                                   |  45 ++
 hw/xen/xen-backend.c                           |  27 +-
 hw/xen/xen-bus.c                               |  23 +-
 hw/xen/xen-legacy-backend.c                    |   1 -
 hw/xen/xen_devconfig.c                         |  53 ---
 hw/xenpv/xen_machine_pv.c                      |  19 -
 include/hw/net/ne2000-isa.h                    |   2 -
 include/hw/pci/pci.h                           |   4 +-
 include/hw/xen/interface/arch-arm.h            |  37 +-
 include/hw/xen/interface/arch-x86/cpuid.h      |  31 +-
 include/hw/xen/interface/arch-x86/xen-x86_32.h |  19 +-
 include/hw/xen/interface/arch-x86/xen-x86_64.h |  19 +-
 include/hw/xen/interface/arch-x86/xen.h        |  26 +-
 include/hw/xen/interface/event_channel.h       |  19 +-
 include/hw/xen/interface/features.h            |  19 +-
 include/hw/xen/interface/grant_table.h         |  19 +-
 include/hw/xen/interface/hvm/hvm_op.h          |  19 +-
 include/hw/xen/interface/hvm/params.h          |  19 +-
 include/hw/xen/interface/io/blkif.h            |  27 +-
 include/hw/xen/interface/io/console.h          |  19 +-
 include/hw/xen/interface/io/fbif.h             |  19 +-
 include/hw/xen/interface/io/kbdif.h            |  19 +-
 include/hw/xen/interface/io/netif.h            |  25 +-
 include/hw/xen/interface/io/protocols.h        |  19 +-
 include/hw/xen/interface/io/ring.h             |  49 +--
 include/hw/xen/interface/io/usbif.h            |  19 +-
 include/hw/xen/interface/io/xenbus.h           |  19 +-
 include/hw/xen/interface/io/xs_wire.h          |  36 +-
 include/hw/xen/interface/memory.h              |  30 +-
 include/hw/xen/interface/physdev.h             |  23 +-
 include/hw/xen/interface/sched.h               |  19 +-
 include/hw/xen/interface/trace.h               |  19 +-
 include/hw/xen/interface/vcpu.h                |  19 +-
 include/hw/xen/interface/version.h             |  19 +-
 include/hw/xen/interface/xen-compat.h          |  19 +-
 include/hw/xen/interface/xen.h                 |  19 +-
 include/hw/xen/xen-backend.h                   |   1 +
 include/hw/xen/xen-bus.h                       |   3 +
 include/hw/xen/xen-legacy-backend.h            |   2 -
 include/net/net.h                              |  10 +-
 include/sysemu/kvm_xen.h                       |   1 +
 net/net.c                                      | 226 +++++++++-
 target/i386/kvm/kvm.c                          |   4 +
 target/i386/kvm/xen-emu.c                      |  59 ++-
 63 files changed, 1851 insertions(+), 1033 deletions(-)



^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v3 01/28] i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 02/28] i386/xen: fix per-vCPU upcall vector for Xen emulation David Woodhouse
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

This confuses lscpu into thinking it's running in PVH mode.

Fixes: bedcc139248 ("i386/xen: implement HYPERVISOR_xen_version")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 target/i386/kvm/xen-emu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 76348f9d5d..0055441b2e 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -267,7 +267,6 @@ static bool kvm_xen_hcall_xen_version(struct kvm_xen_exit *exit, X86CPU *cpu,
             fi.submap |= 1 << XENFEAT_writable_page_tables |
                          1 << XENFEAT_writable_descriptor_tables |
                          1 << XENFEAT_auto_translated_physmap |
-                         1 << XENFEAT_supervisor_mode_kernel |
                          1 << XENFEAT_hvm_callback_vector |
                          1 << XENFEAT_hvm_safe_pvclock |
                          1 << XENFEAT_hvm_pirqs;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 02/28] i386/xen: fix per-vCPU upcall vector for Xen emulation
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 01/28] i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 03/28] hw/xen: select kernel mode for per-vCPU event channel upcall vector David Woodhouse
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

The per-vCPU upcall vector support had three problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT when
the guest tried to set it up. Secondly it was using the wrong ioctl() to
pass the vector to the kernel and thus the *kernel* would always return
-EINVAL. Finally, even when delivering the event directly from userspace
with an MSI, it put the destination CPU ID into the wrong bits of the
MSI address.

Linux doesn't (yet) use this mode so it went without decent testing
for a while.

Fixes: 105b47fdf2d0 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 target/i386/kvm/xen-emu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 0055441b2e..7c504d9fa4 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -306,7 +306,7 @@ static int kvm_xen_set_vcpu_callback_vector(CPUState *cs)
 
     trace_kvm_xen_set_vcpu_callback(cs->cpu_index, vector);
 
-    return kvm_vcpu_ioctl(cs, KVM_XEN_HVM_SET_ATTR, &xva);
+    return kvm_vcpu_ioctl(cs, KVM_XEN_VCPU_SET_ATTR, &xva);
 }
 
 static void do_set_vcpu_callback_vector(CPUState *cs, run_on_cpu_data data)
@@ -440,7 +440,8 @@ void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type)
          * deliver it as an MSI.
          */
         MSIMessage msg = {
-            .address = APIC_DEFAULT_ADDRESS | X86_CPU(cs)->apic_id,
+            .address = APIC_DEFAULT_ADDRESS |
+                       (X86_CPU(cs)->apic_id << MSI_ADDR_DEST_ID_SHIFT),
             .data = vector | (1UL << MSI_DATA_LEVEL_SHIFT),
         };
         kvm_irqchip_send_msi(kvm_state, msg);
@@ -849,8 +850,7 @@ static bool kvm_xen_hcall_hvm_op(struct kvm_xen_exit *exit, X86CPU *cpu,
     int ret = -ENOSYS;
     switch (cmd) {
     case HVMOP_set_evtchn_upcall_vector:
-        ret = kvm_xen_hcall_evtchn_upcall_vector(exit, cpu,
-                                                 exit->u.hcall.params[0]);
+        ret = kvm_xen_hcall_evtchn_upcall_vector(exit, cpu, arg);
         break;
 
     case HVMOP_pagetable_dying:
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 03/28] hw/xen: select kernel mode for per-vCPU event channel upcall vector
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 01/28] i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 02/28] i386/xen: fix per-vCPU upcall vector for Xen emulation David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 04/28] hw/xen: don't clear map_track[] in xen_gnttab_reset() David Woodhouse
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.

For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:

       /* Trick toolstack to think we are enlightened. */
       if (!cpu)
               rc = xen_set_callback_via(1);

That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.

Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.

Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)

Fixes: 91cce756179 ("hw/xen: Add xen_evtchn device for event channel emulation")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 hw/i386/kvm/xen_evtchn.c  | 6 ++++++
 include/sysemu/kvm_xen.h  | 1 +
 target/i386/kvm/xen-emu.c | 7 +++++++
 3 files changed, 14 insertions(+)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index a731738411..3d6f4b4a0a 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -490,6 +490,12 @@ int xen_evtchn_set_callback_param(uint64_t param)
         break;
     }
 
+    /* If the guest has set a per-vCPU callback vector, prefer that. */
+    if (gsi && kvm_xen_has_vcpu_callback_vector()) {
+        in_kernel = kvm_xen_has_cap(EVTCHN_SEND);
+        gsi = 0;
+    }
+
     if (!ret) {
         /* If vector delivery was turned *off* then tell the kernel */
         if ((s->callback_param >> CALLBACK_VIA_TYPE_SHIFT) ==
diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h
index 595abfbe40..961c702c4e 100644
--- a/include/sysemu/kvm_xen.h
+++ b/include/sysemu/kvm_xen.h
@@ -22,6 +22,7 @@
 int kvm_xen_soft_reset(void);
 uint32_t kvm_xen_get_caps(void);
 void *kvm_xen_get_vcpu_info_hva(uint32_t vcpu_id);
+bool kvm_xen_has_vcpu_callback_vector(void);
 void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type);
 void kvm_xen_set_callback_asserted(void);
 int kvm_xen_set_vcpu_virq(uint32_t vcpu_id, uint16_t virq, uint16_t port);
diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 7c504d9fa4..75b2c557b9 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -424,6 +424,13 @@ void kvm_xen_set_callback_asserted(void)
     }
 }
 
+bool kvm_xen_has_vcpu_callback_vector(void)
+{
+    CPUState *cs = qemu_get_cpu(0);
+
+    return cs && !!X86_CPU(cs)->env.xen_vcpu_callback_vector;
+}
+
 void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type)
 {
     CPUState *cs = qemu_get_cpu(vcpu_id);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 04/28] hw/xen: don't clear map_track[] in xen_gnttab_reset()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (2 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 03/28] hw/xen: select kernel mode for per-vCPU event channel upcall vector David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 05/28] hw/xen: fix XenStore watch delivery to guest David Woodhouse
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

The refcounts actually correspond to 'active_ref' structures stored in a
GHashTable per "user" on the backend side (mostly, per XenDevice).

If we zero map_track[] on reset, then when the backend drivers get torn
down and release their mapping we hit the assert(s->map_track[ref] != 0)
in gnt_unref().

So leave them in place. Each backend driver will disconnect and reconnect
as the guest comes back up again and reconnects, and it all works out OK
in the end as the old refs get dropped.

Fixes: de26b2619789 ("hw/xen: Implement soft reset for emulated gnttab")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 hw/i386/kvm/xen_gnttab.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c
index 21c30e3659..839ec920a1 100644
--- a/hw/i386/kvm/xen_gnttab.c
+++ b/hw/i386/kvm/xen_gnttab.c
@@ -541,7 +541,5 @@ int xen_gnttab_reset(void)
     s->entries.v1[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
     s->entries.v1[GNTTAB_RESERVED_XENSTORE].frame = XEN_SPECIAL_PFN(XENSTORE);
 
-    memset(s->map_track, 0, s->max_frames * ENTRIES_PER_FRAME_V1);
-
     return 0;
 }
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 05/28] hw/xen: fix XenStore watch delivery to guest
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (3 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 04/28] hw/xen: don't clear map_track[] in xen_gnttab_reset() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  7:15   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 06/28] hw/xen: take iothread mutex in xen_evtchn_reset_op() David Woodhouse
                   ` (22 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

When fire_watch_cb() found the response buffer empty, it would call
deliver_watch() to generate the XS_WATCH_EVENT message in the response
buffer and send an event channel notification to the guest… without
actually *copying* the response buffer into the ring. So there was
nothing for the guest to see. The pending response didn't actually get
processed into the ring until the guest next triggered some activity
from its side.

Add the missing call to put_rsp().

It might have been slightly nicer to call xen_xenstore_event() here,
which would *almost* have worked. Except for the fact that it calls
xen_be_evtchn_pending() to check that it really does have an event
pending (and clear the eventfd for next time). And under Xen it's
defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
so the emu implementation follows suit.

This fixes Xen device hot-unplug.

Fixes: 0254c4d19df ("hw/xen: Add xenstore wire implementation and implementation stubs")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/i386/kvm/xen_xenstore.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 660d0b72f9..8e716a7009 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -1357,10 +1357,12 @@ static void fire_watch_cb(void *opaque, const char *path, const char *token)
     } else {
         deliver_watch(s, path, token);
         /*
-         * If the message was queued because there was already ring activity,
-         * no need to wake the guest. But if not, we need to send the evtchn.
+         * Attempt to queue the message into the actual ring, and send
+         * the event channel notification if any bytes are copied.
          */
-        xen_be_evtchn_notify(s->eh, s->be_port);
+        if (s->rsp_pending && put_rsp(s) > 0) {
+            xen_be_evtchn_notify(s->eh, s->be_port);
+        }
     }
 }
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 06/28] hw/xen: take iothread mutex in xen_evtchn_reset_op()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (4 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 05/28] hw/xen: fix XenStore watch delivery to guest David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  7:20   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 07/28] hw/xen: use correct default protocol for xen-block on x86 David Woodhouse
                   ` (21 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

The xen_evtchn_soft_reset() function requires the iothread mutex, but is
also called for the EVTCHNOP_reset hypercall. Ensure the mutex is taken
in that case.

Fixes: a15b10978fe6 ("hw/xen: Implement EVTCHNOP_reset")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/i386/kvm/xen_evtchn.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 3d6f4b4a0a..b2b4be9983 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1135,6 +1135,7 @@ int xen_evtchn_reset_op(struct evtchn_reset *reset)
         return -ESRCH;
     }
 
+    QEMU_IOTHREAD_LOCK_GUARD();
     return xen_evtchn_soft_reset();
 }
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 07/28] hw/xen: use correct default protocol for xen-block on x86
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (5 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 06/28] hw/xen: take iothread mutex in xen_evtchn_reset_op() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  7:22   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 08/28] i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer() David Woodhouse
                   ` (20 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

Even on x86_64 the default protocol is the x86-32 one if the guest doesn't
specifically ask for x86-64.

Fixes: b6af8926fb85 ("xen: add implementations of xen-block connect and disconnect functions...")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/block/xen-block.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index a07cd7eb5d..64470fc579 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -117,7 +117,11 @@ static void xen_block_connect(XenDevice *xendev, Error **errp)
 
     if (xen_device_frontend_scanf(xendev, "protocol", "%ms",
                                   &str) != 1) {
-        protocol = BLKIF_PROTOCOL_NATIVE;
+        /* x86 defaults to the 32-bit protocol even for 64-bit guests. */
+        if (object_dynamic_cast(OBJECT(qdev_get_machine()), "x86-machine"))
+            protocol = BLKIF_PROTOCOL_X86_32;
+        else
+            protocol = BLKIF_PROTOCOL_NATIVE;
     } else {
         if (strcmp(str, XEN_IO_PROTO_ABI_X86_32) == 0) {
             protocol = BLKIF_PROTOCOL_X86_32;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 08/28] i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (6 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 07/28] hw/xen: use correct default protocol for xen-block on x86 David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  7:23   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 09/28] hw/xen: Clean up event channel 'type_val' handling to use union David Woodhouse
                   ` (19 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

Upstream Xen now ignores this flag¹, since the only guest kernel ever to
use it was buggy.

¹ https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=19c6cbd909

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 target/i386/kvm/xen-emu.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 75b2c557b9..1dc9ab0d91 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -1077,17 +1077,13 @@ static int vcpuop_stop_periodic_timer(CPUState *target)
  * Must always be called with xen_timers_lock held.
  */
 static int do_set_singleshot_timer(CPUState *cs, uint64_t timeout_abs,
-                                   bool future, bool linux_wa)
+                                   bool linux_wa)
 {
     CPUX86State *env = &X86_CPU(cs)->env;
     int64_t now = kvm_get_current_ns();
     int64_t qemu_now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
     int64_t delta = timeout_abs - now;
 
-    if (future && timeout_abs < now) {
-        return -ETIME;
-    }
-
     if (linux_wa && unlikely((int64_t)timeout_abs < 0 ||
                              (delta > 0 && (uint32_t)(delta >> 50) != 0))) {
         /*
@@ -1129,9 +1125,13 @@ static int vcpuop_set_singleshot_timer(CPUState *cs, uint64_t arg)
     }
 
     QEMU_LOCK_GUARD(&X86_CPU(cs)->env.xen_timers_lock);
-    return do_set_singleshot_timer(cs, sst.timeout_abs_ns,
-                                   !!(sst.flags & VCPU_SSHOTTMR_future),
-                                   false);
+
+    /*
+     * We ignore the VCPU_SSHOTTMR_future flag, just as Xen now does.
+     * The only guest that ever used it, got it wrong.
+     * https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=19c6cbd909
+     */
+    return do_set_singleshot_timer(cs, sst.timeout_abs_ns, false);
 }
 
 static int vcpuop_stop_singleshot_timer(CPUState *cs)
@@ -1156,7 +1156,7 @@ static bool kvm_xen_hcall_set_timer_op(struct kvm_xen_exit *exit, X86CPU *cpu,
         err = vcpuop_stop_singleshot_timer(CPU(cpu));
     } else {
         QEMU_LOCK_GUARD(&X86_CPU(cpu)->env.xen_timers_lock);
-        err = do_set_singleshot_timer(CPU(cpu), timeout, false, true);
+        err = do_set_singleshot_timer(CPU(cpu), timeout, true);
     }
     exit->u.hcall.result = err;
     return true;
@@ -1844,7 +1844,7 @@ int kvm_put_xen_state(CPUState *cs)
         QEMU_LOCK_GUARD(&env->xen_timers_lock);
         if (env->xen_singleshot_timer_ns) {
             ret = do_set_singleshot_timer(cs, env->xen_singleshot_timer_ns,
-                                    false, false);
+                                          false);
             if (ret < 0) {
                 return ret;
             }
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 09/28] hw/xen: Clean up event channel 'type_val' handling to use union
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (7 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 08/28] i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 10/28] include: update Xen public headers to Xen 4.17.2 release David Woodhouse
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

A previous implementation of this stuff used a 64-bit field for all of
the port information (vcpu/type/type_val) and did atomic exchanges on
them. When I implemented that in Qemu I regretted my life choices and
just kept it simple with locking instead.

So there's no need for the XenEvtchnPort to be so simplistic. We can
use a union for the pirq/virq/interdomain information, which lets us
keep a separate bit for the 'remote domain' in interdomain ports. A
single bit is enough since the only possible targets are loopback or
qemu itself.

So now we can ditch PORT_INFO_TYPEVAL_REMOTE_QEMU and the horrid
manual masking, although the in-memory representation is identical
so there's no change in the saved state ABI.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 hw/i386/kvm/xen_evtchn.c | 151 ++++++++++++++++++---------------------
 1 file changed, 70 insertions(+), 81 deletions(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index b2b4be9983..02b8cbf8df 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -58,7 +58,15 @@ OBJECT_DECLARE_SIMPLE_TYPE(XenEvtchnState, XEN_EVTCHN)
 typedef struct XenEvtchnPort {
     uint32_t vcpu;      /* Xen/ACPI vcpu_id */
     uint16_t type;      /* EVTCHNSTAT_xxxx */
-    uint16_t type_val;  /* pirq# / virq# / remote port according to type */
+    union {
+        uint16_t val;  /* raw value for serialization etc. */
+        uint16_t pirq;
+        uint16_t virq;
+        struct {
+            uint16_t port:15;
+            uint16_t to_qemu:1; /* Only two targets; qemu or loopback */
+        } interdomain;
+    } u;
 } XenEvtchnPort;
 
 /* 32-bit compatibility definitions, also used natively in 32-bit build */
@@ -105,14 +113,6 @@ struct xenevtchn_handle {
     int fd;
 };
 
-/*
- * For unbound/interdomain ports there are only two possible remote
- * domains; self and QEMU. Use a single high bit in type_val for that,
- * and the low bits for the remote port number (or 0 for unbound).
- */
-#define PORT_INFO_TYPEVAL_REMOTE_QEMU           0x8000
-#define PORT_INFO_TYPEVAL_REMOTE_PORT_MASK      0x7FFF
-
 /*
  * These 'emuirq' values are used by Xen in the LM stream... and yes, I am
  * insane enough to think about guest-transparent live migration from actual
@@ -210,16 +210,16 @@ static int xen_evtchn_post_load(void *opaque, int version_id)
         XenEvtchnPort *p = &s->port_table[i];
 
         if (p->type == EVTCHNSTAT_pirq) {
-            assert(p->type_val);
-            assert(p->type_val < s->nr_pirqs);
+            assert(p->u.pirq);
+            assert(p->u.pirq < s->nr_pirqs);
 
             /*
              * Set the gsi to IRQ_UNBOUND; it may be changed to an actual
              * GSI# below, or to IRQ_MSI_EMU when the MSI table snooping
              * catches up with it.
              */
-            s->pirq[p->type_val].gsi = IRQ_UNBOUND;
-            s->pirq[p->type_val].port = i;
+            s->pirq[p->u.pirq].gsi = IRQ_UNBOUND;
+            s->pirq[p->u.pirq].port = i;
         }
     }
     /* Rebuild s->pirq[].gsi mapping */
@@ -243,7 +243,7 @@ static const VMStateDescription xen_evtchn_port_vmstate = {
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(vcpu, XenEvtchnPort),
         VMSTATE_UINT16(type, XenEvtchnPort),
-        VMSTATE_UINT16(type_val, XenEvtchnPort),
+        VMSTATE_UINT16(u.val, XenEvtchnPort),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -605,14 +605,13 @@ static void unbind_backend_ports(XenEvtchnState *s)
 
     for (i = 1; i < s->nr_ports; i++) {
         p = &s->port_table[i];
-        if (p->type == EVTCHNSTAT_interdomain &&
-            (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU)) {
-            evtchn_port_t be_port = p->type_val & PORT_INFO_TYPEVAL_REMOTE_PORT_MASK;
+        if (p->type == EVTCHNSTAT_interdomain && p->u.interdomain.to_qemu) {
+            evtchn_port_t be_port = p->u.interdomain.port;
 
             if (s->be_handles[be_port]) {
                 /* This part will be overwritten on the load anyway. */
                 p->type = EVTCHNSTAT_unbound;
-                p->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU;
+                p->u.interdomain.port = 0;
 
                 /* Leave the backend port open and unbound too. */
                 if (kvm_xen_has_cap(EVTCHN_SEND)) {
@@ -650,30 +649,22 @@ int xen_evtchn_status_op(struct evtchn_status *status)
 
     switch (p->type) {
     case EVTCHNSTAT_unbound:
-        if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) {
-            status->u.unbound.dom = DOMID_QEMU;
-        } else {
-            status->u.unbound.dom = xen_domid;
-        }
+        status->u.unbound.dom = p->u.interdomain.to_qemu ? DOMID_QEMU
+                                                         : xen_domid;
         break;
 
     case EVTCHNSTAT_interdomain:
-        if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) {
-            status->u.interdomain.dom = DOMID_QEMU;
-        } else {
-            status->u.interdomain.dom = xen_domid;
-        }
-
-        status->u.interdomain.port = p->type_val &
-            PORT_INFO_TYPEVAL_REMOTE_PORT_MASK;
+        status->u.interdomain.dom = p->u.interdomain.to_qemu ? DOMID_QEMU
+                                                             : xen_domid;
+        status->u.interdomain.port = p->u.interdomain.port;
         break;
 
     case EVTCHNSTAT_pirq:
-        status->u.pirq = p->type_val;
+        status->u.pirq = p->u.pirq;
         break;
 
     case EVTCHNSTAT_virq:
-        status->u.virq = p->type_val;
+        status->u.virq = p->u.virq;
         break;
     }
 
@@ -989,7 +980,7 @@ static int clear_port_pending(XenEvtchnState *s, evtchn_port_t port)
 static void free_port(XenEvtchnState *s, evtchn_port_t port)
 {
     s->port_table[port].type = EVTCHNSTAT_closed;
-    s->port_table[port].type_val = 0;
+    s->port_table[port].u.val = 0;
     s->port_table[port].vcpu = 0;
 
     if (s->nr_ports == port + 1) {
@@ -1012,7 +1003,7 @@ static int allocate_port(XenEvtchnState *s, uint32_t vcpu, uint16_t type,
         if (s->port_table[p].type == EVTCHNSTAT_closed) {
             s->port_table[p].vcpu = vcpu;
             s->port_table[p].type = type;
-            s->port_table[p].type_val = val;
+            s->port_table[p].u.val = val;
 
             *port = p;
 
@@ -1053,15 +1044,15 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port,
         return -ENOENT;
 
     case EVTCHNSTAT_pirq:
-        s->pirq[p->type_val].port = 0;
-        if (s->pirq[p->type_val].is_translated) {
+        s->pirq[p->u.pirq].port = 0;
+        if (s->pirq[p->u.pirq].is_translated) {
             *flush_kvm_routes = true;
         }
         break;
 
     case EVTCHNSTAT_virq:
-        kvm_xen_set_vcpu_virq(virq_is_global(p->type_val) ? 0 : p->vcpu,
-                              p->type_val, 0);
+        kvm_xen_set_vcpu_virq(virq_is_global(p->u.virq) ? 0 : p->vcpu,
+                              p->u.virq, 0);
         break;
 
     case EVTCHNSTAT_ipi:
@@ -1071,8 +1062,8 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port,
         break;
 
     case EVTCHNSTAT_interdomain:
-        if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) {
-            uint16_t be_port = p->type_val & ~PORT_INFO_TYPEVAL_REMOTE_QEMU;
+        if (p->u.interdomain.to_qemu) {
+            uint16_t be_port = p->u.interdomain.port;
             struct xenevtchn_handle *xc = s->be_handles[be_port];
             if (xc) {
                 if (kvm_xen_has_cap(EVTCHN_SEND)) {
@@ -1082,14 +1073,15 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port,
             }
         } else {
             /* Loopback interdomain */
-            XenEvtchnPort *rp = &s->port_table[p->type_val];
-            if (!valid_port(p->type_val) || rp->type_val != port ||
+            XenEvtchnPort *rp = &s->port_table[p->u.interdomain.port];
+            if (!valid_port(p->u.interdomain.port) ||
+                rp->u.interdomain.port != port ||
                 rp->type != EVTCHNSTAT_interdomain) {
                 error_report("Inconsistent state for interdomain unbind");
             } else {
                 /* Set the other end back to unbound */
                 rp->type = EVTCHNSTAT_unbound;
-                rp->type_val = 0;
+                rp->u.interdomain.port = 0;
             }
         }
         break;
@@ -1214,7 +1206,7 @@ int xen_evtchn_bind_vcpu_op(struct evtchn_bind_vcpu *vcpu)
     if (p->type == EVTCHNSTAT_interdomain ||
         p->type == EVTCHNSTAT_unbound ||
         p->type == EVTCHNSTAT_pirq ||
-        (p->type == EVTCHNSTAT_virq && virq_is_global(p->type_val))) {
+        (p->type == EVTCHNSTAT_virq && virq_is_global(p->u.virq))) {
         /*
          * unmask_port() with do_unmask==false will just raise the event
          * on the new vCPU if the port was already pending.
@@ -1359,19 +1351,15 @@ int xen_evtchn_bind_ipi_op(struct evtchn_bind_ipi *ipi)
 int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain)
 {
     XenEvtchnState *s = xen_evtchn_singleton;
-    uint16_t type_val;
     int ret;
 
     if (!s) {
         return -ENOTSUP;
     }
 
-    if (interdomain->remote_dom == DOMID_QEMU) {
-        type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU;
-    } else if (interdomain->remote_dom == DOMID_SELF ||
-               interdomain->remote_dom == xen_domid) {
-        type_val = 0;
-    } else {
+    if (interdomain->remote_dom != DOMID_QEMU &&
+        interdomain->remote_dom != DOMID_SELF &&
+        interdomain->remote_dom != xen_domid) {
         return -ESRCH;
     }
 
@@ -1382,8 +1370,8 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain)
     qemu_mutex_lock(&s->port_lock);
 
     /* The newly allocated port starts out as unbound */
-    ret = allocate_port(s, 0, EVTCHNSTAT_unbound, type_val,
-                        &interdomain->local_port);
+    ret = allocate_port(s, 0, EVTCHNSTAT_unbound, 0, &interdomain->local_port);
+
     if (ret) {
         goto out;
     }
@@ -1408,7 +1396,8 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain)
             assign_kernel_eventfd(lp->type, xc->guest_port, xc->fd);
         }
         lp->type = EVTCHNSTAT_interdomain;
-        lp->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU | interdomain->remote_port;
+        lp->u.interdomain.to_qemu = 1;
+        lp->u.interdomain.port = interdomain->remote_port;
         ret = 0;
     } else {
         /* Loopback */
@@ -1416,19 +1405,18 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain)
         XenEvtchnPort *lp = &s->port_table[interdomain->local_port];
 
         /*
-         * The 'remote' port for loopback must be an unbound port allocated for
-         * communication with the local domain (as indicated by rp->type_val
-         * being zero, not PORT_INFO_TYPEVAL_REMOTE_QEMU), and must *not* be
-         * the port that was just allocated for the local end.
+         * The 'remote' port for loopback must be an unbound port allocated
+         * for communication with the local domain, and must *not* be the
+         * port that was just allocated for the local end.
          */
         if (interdomain->local_port != interdomain->remote_port &&
-            rp->type == EVTCHNSTAT_unbound && rp->type_val == 0) {
+            rp->type == EVTCHNSTAT_unbound && !rp->u.interdomain.to_qemu) {
 
             rp->type = EVTCHNSTAT_interdomain;
-            rp->type_val = interdomain->local_port;
+            rp->u.interdomain.port = interdomain->local_port;
 
             lp->type = EVTCHNSTAT_interdomain;
-            lp->type_val = interdomain->remote_port;
+            lp->u.interdomain.port = interdomain->remote_port;
         } else {
             ret = -EINVAL;
         }
@@ -1447,7 +1435,6 @@ int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain *interdomain)
 int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc)
 {
     XenEvtchnState *s = xen_evtchn_singleton;
-    uint16_t type_val;
     int ret;
 
     if (!s) {
@@ -1458,18 +1445,20 @@ int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc)
         return -ESRCH;
     }
 
-    if (alloc->remote_dom == DOMID_QEMU) {
-        type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU;
-    } else if (alloc->remote_dom == DOMID_SELF ||
-               alloc->remote_dom == xen_domid) {
-        type_val = 0;
-    } else {
+    if (alloc->remote_dom != DOMID_QEMU &&
+        alloc->remote_dom != DOMID_SELF &&
+        alloc->remote_dom != xen_domid) {
         return -EPERM;
     }
 
     qemu_mutex_lock(&s->port_lock);
 
-    ret = allocate_port(s, 0, EVTCHNSTAT_unbound, type_val, &alloc->port);
+    ret = allocate_port(s, 0, EVTCHNSTAT_unbound, 0, &alloc->port);
+
+    if (!ret && alloc->remote_dom == DOMID_QEMU) {
+        XenEvtchnPort *p = &s->port_table[alloc->port];
+        p->u.interdomain.to_qemu = 1;
+    }
 
     qemu_mutex_unlock(&s->port_lock);
 
@@ -1496,12 +1485,12 @@ int xen_evtchn_send_op(struct evtchn_send *send)
 
     switch (p->type) {
     case EVTCHNSTAT_interdomain:
-        if (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) {
+        if (p->u.interdomain.to_qemu) {
             /*
              * This is an event from the guest to qemu itself, which is
              * serving as the driver domain.
              */
-            uint16_t be_port = p->type_val & ~PORT_INFO_TYPEVAL_REMOTE_QEMU;
+            uint16_t be_port = p->u.interdomain.port;
             struct xenevtchn_handle *xc = s->be_handles[be_port];
             if (xc) {
                 eventfd_write(xc->fd, 1);
@@ -1511,7 +1500,7 @@ int xen_evtchn_send_op(struct evtchn_send *send)
             }
         } else {
             /* Loopback interdomain ports; just a complex IPI */
-            set_port_pending(s, p->type_val);
+            set_port_pending(s, p->u.interdomain.port);
         }
         break;
 
@@ -1553,8 +1542,7 @@ int xen_evtchn_set_port(uint16_t port)
 
     /* QEMU has no business sending to anything but these */
     if (p->type == EVTCHNSTAT_virq ||
-        (p->type == EVTCHNSTAT_interdomain &&
-         (p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU))) {
+        (p->type == EVTCHNSTAT_interdomain && p->u.interdomain.to_qemu)) {
         set_port_pending(s, port);
         ret = 0;
     }
@@ -2064,7 +2052,7 @@ int xen_be_evtchn_bind_interdomain(struct xenevtchn_handle *xc, uint32_t domid,
     switch (gp->type) {
     case EVTCHNSTAT_interdomain:
         /* Allow rebinding after migration, preserve port # if possible */
-        be_port = gp->type_val & ~PORT_INFO_TYPEVAL_REMOTE_QEMU;
+        be_port = gp->u.interdomain.port;
         assert(be_port != 0);
         if (!s->be_handles[be_port]) {
             s->be_handles[be_port] = xc;
@@ -2085,7 +2073,8 @@ int xen_be_evtchn_bind_interdomain(struct xenevtchn_handle *xc, uint32_t domid,
         }
 
         gp->type = EVTCHNSTAT_interdomain;
-        gp->type_val = be_port | PORT_INFO_TYPEVAL_REMOTE_QEMU;
+        gp->u.interdomain.to_qemu = 1;
+        gp->u.interdomain.port = be_port;
         xc->guest_port = guest_port;
         if (kvm_xen_has_cap(EVTCHN_SEND)) {
             assign_kernel_eventfd(gp->type, guest_port, xc->fd);
@@ -2130,7 +2119,7 @@ int xen_be_evtchn_unbind(struct xenevtchn_handle *xc, evtchn_port_t port)
         /* This should never *not* be true */
         if (gp->type == EVTCHNSTAT_interdomain) {
             gp->type = EVTCHNSTAT_unbound;
-            gp->type_val = PORT_INFO_TYPEVAL_REMOTE_QEMU;
+            gp->u.interdomain.port = 0;
         }
 
         if (kvm_xen_has_cap(EVTCHN_SEND)) {
@@ -2284,11 +2273,11 @@ EvtchnInfoList *qmp_xen_event_list(Error **errp)
 
         info->type = p->type;
         if (p->type == EVTCHNSTAT_interdomain) {
-            info->remote_domain = g_strdup((p->type_val & PORT_INFO_TYPEVAL_REMOTE_QEMU) ?
+            info->remote_domain = g_strdup(p->u.interdomain.to_qemu ?
                                            "qemu" : "loopback");
-            info->target = p->type_val & PORT_INFO_TYPEVAL_REMOTE_PORT_MASK;
+            info->target = p->u.interdomain.port;
         } else {
-            info->target = p->type_val;
+            info->target = p->u.val; /* pirq# or virq# */
         }
         info->vcpu = p->vcpu;
         info->pending = test_bit(i, pending);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 10/28] include: update Xen public headers to Xen 4.17.2 release
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (8 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 09/28] hw/xen: Clean up event channel 'type_val' handling to use union David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 11/28] i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID David Woodhouse
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

... in order to advertise the XEN_HVM_CPUID_UPCALL_VECTOR feature,
which will come in a subsequent commit.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Paul Durrant <paul@xen.org>
---
 hw/i386/kvm/xen_xenstore.c                    |  2 +-
 include/hw/xen/interface/arch-arm.h           | 37 +++++++-------
 include/hw/xen/interface/arch-x86/cpuid.h     | 31 +++++-------
 .../hw/xen/interface/arch-x86/xen-x86_32.h    | 19 +------
 .../hw/xen/interface/arch-x86/xen-x86_64.h    | 19 +------
 include/hw/xen/interface/arch-x86/xen.h       | 26 ++--------
 include/hw/xen/interface/event_channel.h      | 19 +------
 include/hw/xen/interface/features.h           | 19 +------
 include/hw/xen/interface/grant_table.h        | 19 +------
 include/hw/xen/interface/hvm/hvm_op.h         | 19 +------
 include/hw/xen/interface/hvm/params.h         | 19 +------
 include/hw/xen/interface/io/blkif.h           | 27 ++++------
 include/hw/xen/interface/io/console.h         | 19 +------
 include/hw/xen/interface/io/fbif.h            | 19 +------
 include/hw/xen/interface/io/kbdif.h           | 19 +------
 include/hw/xen/interface/io/netif.h           | 25 +++-------
 include/hw/xen/interface/io/protocols.h       | 19 +------
 include/hw/xen/interface/io/ring.h            | 49 ++++++++++---------
 include/hw/xen/interface/io/usbif.h           | 19 +------
 include/hw/xen/interface/io/xenbus.h          | 19 +------
 include/hw/xen/interface/io/xs_wire.h         | 36 ++++++--------
 include/hw/xen/interface/memory.h             | 30 +++++-------
 include/hw/xen/interface/physdev.h            | 23 ++-------
 include/hw/xen/interface/sched.h              | 19 +------
 include/hw/xen/interface/trace.h              | 19 +------
 include/hw/xen/interface/vcpu.h               | 19 +------
 include/hw/xen/interface/version.h            | 19 +------
 include/hw/xen/interface/xen-compat.h         | 19 +------
 include/hw/xen/interface/xen.h                | 19 +------
 29 files changed, 124 insertions(+), 523 deletions(-)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 8e716a7009..831da535fc 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -331,7 +331,7 @@ static void xs_error(XenXenstoreState *s, unsigned int id,
     const char *errstr = NULL;
 
     for (unsigned int i = 0; i < ARRAY_SIZE(xsd_errors); i++) {
-        struct xsd_errors *xsd_error = &xsd_errors[i];
+        const struct xsd_errors *xsd_error = &xsd_errors[i];
 
         if (xsd_error->errnum == errnum) {
             errstr = xsd_error->errstring;
diff --git a/include/hw/xen/interface/arch-arm.h b/include/hw/xen/interface/arch-arm.h
index 94b31511dd..1528ced509 100644
--- a/include/hw/xen/interface/arch-arm.h
+++ b/include/hw/xen/interface/arch-arm.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * arch-arm.h
  *
  * Guest OS interface to ARM Xen.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright 2011 (C) Citrix Systems
  */
 
@@ -361,6 +344,7 @@ typedef uint64_t xen_callback_t;
 #define PSR_DBG_MASK    (1<<9)        /* arm64: Debug Exception mask */
 #define PSR_IT_MASK     (0x0600fc00)  /* Thumb If-Then Mask */
 #define PSR_JAZELLE     (1<<24)       /* Jazelle Mode */
+#define PSR_Z           (1<<30)       /* Zero condition flag */
 
 /* 32 bit modes */
 #define PSR_MODE_USR 0x10
@@ -383,7 +367,15 @@ typedef uint64_t xen_callback_t;
 #define PSR_MODE_EL1t 0x04
 #define PSR_MODE_EL0t 0x00
 
-#define PSR_GUEST32_INIT  (PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC)
+/*
+ * We set PSR_Z to be able to boot Linux kernel versions with an invalid
+ * encoding of the first 8 NOP instructions. See commit a92882a4d270 in
+ * Linux.
+ *
+ * Note that PSR_Z is also set by U-Boot and QEMU -kernel when loading
+ * zImage kernels on aarch32.
+ */
+#define PSR_GUEST32_INIT (PSR_Z|PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC)
 #define PSR_GUEST64_INIT (PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_EL1h)
 
 #define SCTLR_GUEST_INIT    xen_mk_ullong(0x00c50078)
@@ -398,6 +390,10 @@ typedef uint64_t xen_callback_t;
 
 /* Physical Address Space */
 
+/* Virtio MMIO mappings */
+#define GUEST_VIRTIO_MMIO_BASE   xen_mk_ullong(0x02000000)
+#define GUEST_VIRTIO_MMIO_SIZE   xen_mk_ullong(0x00100000)
+
 /*
  * vGIC mappings: Only one set of mapping is used by the guest.
  * Therefore they can overlap.
@@ -484,6 +480,9 @@ typedef uint64_t xen_callback_t;
 
 #define GUEST_VPL011_SPI        32
 
+#define GUEST_VIRTIO_MMIO_SPI_FIRST   33
+#define GUEST_VIRTIO_MMIO_SPI_LAST    43
+
 /* PSCI functions */
 #define PSCI_cpu_suspend 0
 #define PSCI_cpu_off     1
diff --git a/include/hw/xen/interface/arch-x86/cpuid.h b/include/hw/xen/interface/arch-x86/cpuid.h
index ce46305bee..7ecd16ae05 100644
--- a/include/hw/xen/interface/arch-x86/cpuid.h
+++ b/include/hw/xen/interface/arch-x86/cpuid.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * arch-x86/cpuid.h
  *
  * CPUID interface to Xen.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2007 Citrix Systems, Inc.
  *
  * Authors:
@@ -102,6 +85,18 @@
 #define XEN_HVM_CPUID_IOMMU_MAPPINGS   (1u << 2)
 #define XEN_HVM_CPUID_VCPU_ID_PRESENT  (1u << 3) /* vcpu id is present in EBX */
 #define XEN_HVM_CPUID_DOMID_PRESENT    (1u << 4) /* domid is present in ECX */
+/*
+ * With interrupt format set to 0 (non-remappable) bits 55:49 from the
+ * IO-APIC RTE and bits 11:5 from the MSI address can be used to store
+ * high bits for the Destination ID. This expands the Destination ID
+ * field from 8 to 15 bits, allowing to target APIC IDs up 32768.
+ */
+#define XEN_HVM_CPUID_EXT_DEST_ID      (1u << 5)
+/*
+ * Per-vCPU event channel upcalls work correctly with physical IRQs
+ * bound to event channels.
+ */
+#define XEN_HVM_CPUID_UPCALL_VECTOR    (1u << 6)
 
 /*
  * Leaf 6 (0x40000x05)
diff --git a/include/hw/xen/interface/arch-x86/xen-x86_32.h b/include/hw/xen/interface/arch-x86/xen-x86_32.h
index 19d7388633..139438e835 100644
--- a/include/hw/xen/interface/arch-x86/xen-x86_32.h
+++ b/include/hw/xen/interface/arch-x86/xen-x86_32.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * xen-x86_32.h
  *
  * Guest OS interface to x86 32-bit Xen.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2004-2007, K A Fraser
  */
 
diff --git a/include/hw/xen/interface/arch-x86/xen-x86_64.h b/include/hw/xen/interface/arch-x86/xen-x86_64.h
index 40aed14366..5d9035ed22 100644
--- a/include/hw/xen/interface/arch-x86/xen-x86_64.h
+++ b/include/hw/xen/interface/arch-x86/xen-x86_64.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * xen-x86_64.h
  *
  * Guest OS interface to x86 64-bit Xen.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2004-2006, K A Fraser
  */
 
diff --git a/include/hw/xen/interface/arch-x86/xen.h b/include/hw/xen/interface/arch-x86/xen.h
index 7acd94c8eb..c0f4551247 100644
--- a/include/hw/xen/interface/arch-x86/xen.h
+++ b/include/hw/xen/interface/arch-x86/xen.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * arch-x86/xen.h
  *
  * Guest OS interface to x86 Xen.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2004-2006, K A Fraser
  */
 
@@ -320,12 +303,9 @@ struct xen_arch_domainconfig {
     uint32_t misc_flags;
 };
 
-/* Location of online VCPU bitmap. */
-#define XEN_ACPI_CPU_MAP             0xaf00
-#define XEN_ACPI_CPU_MAP_LEN         ((HVM_MAX_VCPUS + 7) / 8)
+/* Max  XEN_X86_* constant. Used for ABI checking. */
+#define XEN_X86_MISC_FLAGS_MAX XEN_X86_MSR_RELAXED
 
-/* GPE0 bit set during CPU hotplug */
-#define XEN_ACPI_GPE0_CPUHP_BIT      2
 #endif
 
 /*
diff --git a/include/hw/xen/interface/event_channel.h b/include/hw/xen/interface/event_channel.h
index 73c9f38ce1..0d91a1c4af 100644
--- a/include/hw/xen/interface/event_channel.h
+++ b/include/hw/xen/interface/event_channel.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * event_channel.h
  *
  * Event channels between domains.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2003-2004, K A Fraser.
  */
 
diff --git a/include/hw/xen/interface/features.h b/include/hw/xen/interface/features.h
index 9ee2f760ef..d2a9175aae 100644
--- a/include/hw/xen/interface/features.h
+++ b/include/hw/xen/interface/features.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * features.h
  *
  * Feature flags, reported by XENVER_get_features.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2006, Keir Fraser <keir@xensource.com>
  */
 
diff --git a/include/hw/xen/interface/grant_table.h b/include/hw/xen/interface/grant_table.h
index 7934d7b718..1dfa17a6d0 100644
--- a/include/hw/xen/interface/grant_table.h
+++ b/include/hw/xen/interface/grant_table.h
@@ -1,27 +1,10 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * grant_table.h
  *
  * Interface for granting foreign access to page frames, and receiving
  * page-ownership transfers.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2004, K A Fraser
  */
 
diff --git a/include/hw/xen/interface/hvm/hvm_op.h b/include/hw/xen/interface/hvm/hvm_op.h
index 870ec52060..e22adf0319 100644
--- a/include/hw/xen/interface/hvm/hvm_op.h
+++ b/include/hw/xen/interface/hvm/hvm_op.h
@@ -1,22 +1,5 @@
+/* SPDX-License-Identifier: MIT */
 /*
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2007, Keir Fraser
  */
 
diff --git a/include/hw/xen/interface/hvm/params.h b/include/hw/xen/interface/hvm/params.h
index c9d6e70d7b..a22b4ed45d 100644
--- a/include/hw/xen/interface/hvm/params.h
+++ b/include/hw/xen/interface/hvm/params.h
@@ -1,22 +1,5 @@
+/* SPDX-License-Identifier: MIT */
 /*
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2007, Keir Fraser
  */
 
diff --git a/include/hw/xen/interface/io/blkif.h b/include/hw/xen/interface/io/blkif.h
index 4cdba79aba..22f1eef0c0 100644
--- a/include/hw/xen/interface/io/blkif.h
+++ b/include/hw/xen/interface/io/blkif.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * blkif.h
  *
  * Unified block-device I/O interface for Xen guest OSes.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2003-2004, Keir Fraser
  * Copyright (c) 2012, Spectra Logic Corporation
  */
@@ -363,6 +346,14 @@
  *      that the frontend requires that the logical block size is 512 as it
  *      is hardcoded (which is the case in some frontend implementations).
  *
+ * trusted
+ *      Values:         0/1 (boolean)
+ *      Default value:  1
+ *
+ *      A value of "0" indicates that the frontend should not trust the
+ *      backend, and should deploy whatever measures available to protect from
+ *      a malicious backend on the other end.
+ *
  *------------------------- Virtual Device Properties -------------------------
  *
  * device-type
diff --git a/include/hw/xen/interface/io/console.h b/include/hw/xen/interface/io/console.h
index 4811f47220..4509b4b689 100644
--- a/include/hw/xen/interface/io/console.h
+++ b/include/hw/xen/interface/io/console.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * console.h
  *
  * Console I/O interface for Xen guest OSes.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2005, Keir Fraser
  */
 
diff --git a/include/hw/xen/interface/io/fbif.h b/include/hw/xen/interface/io/fbif.h
index cc25aab32e..93c73195d8 100644
--- a/include/hw/xen/interface/io/fbif.h
+++ b/include/hw/xen/interface/io/fbif.h
@@ -1,24 +1,7 @@
+/* SPDX-License-Identifier: MIT */
 /*
  * fbif.h -- Xen virtual frame buffer device
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (C) 2005 Anthony Liguori <aliguori@us.ibm.com>
  * Copyright (C) 2006 Red Hat, Inc., Markus Armbruster <armbru@redhat.com>
  */
diff --git a/include/hw/xen/interface/io/kbdif.h b/include/hw/xen/interface/io/kbdif.h
index a6b01c52c7..4bde6b3821 100644
--- a/include/hw/xen/interface/io/kbdif.h
+++ b/include/hw/xen/interface/io/kbdif.h
@@ -1,24 +1,7 @@
+/* SPDX-License-Identifier: MIT */
 /*
  * kbdif.h -- Xen virtual keyboard/mouse
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (C) 2005 Anthony Liguori <aliguori@us.ibm.com>
  * Copyright (C) 2006 Red Hat, Inc., Markus Armbruster <armbru@redhat.com>
  */
diff --git a/include/hw/xen/interface/io/netif.h b/include/hw/xen/interface/io/netif.h
index 00dd258712..c13b85061d 100644
--- a/include/hw/xen/interface/io/netif.h
+++ b/include/hw/xen/interface/io/netif.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * netif.h
  *
  * Unified network-device I/O interface for Xen guest OSes.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2003-2004, Keir Fraser
  */
 
@@ -160,6 +143,12 @@
  * be applied if it is set.
  */
 
+/*
+ * The setting of "trusted" node to "0" in the frontend path signals that the
+ * frontend should not trust the backend, and should deploy whatever measures
+ * available to protect from a malicious backend on the other end.
+ */
+
 /*
  * Control ring
  * ============
diff --git a/include/hw/xen/interface/io/protocols.h b/include/hw/xen/interface/io/protocols.h
index 52b4de0f81..7815e1ff0f 100644
--- a/include/hw/xen/interface/io/protocols.h
+++ b/include/hw/xen/interface/io/protocols.h
@@ -1,24 +1,7 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * protocols.h
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2008, Keir Fraser
  */
 
diff --git a/include/hw/xen/interface/io/ring.h b/include/hw/xen/interface/io/ring.h
index c486c457e0..025939278b 100644
--- a/include/hw/xen/interface/io/ring.h
+++ b/include/hw/xen/interface/io/ring.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * ring.h
  *
  * Shared producer-consumer ring macros.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Tim Deegan and Andrew Warfield November 2004.
  */
 
@@ -95,9 +78,8 @@ typedef unsigned int RING_IDX;
  * of the shared memory area (PAGE_SIZE, for instance). To initialise
  * the front half:
  *
- *     mytag_front_ring_t front_ring;
- *     SHARED_RING_INIT((mytag_sring_t *)shared_page);
- *     FRONT_RING_INIT(&front_ring, (mytag_sring_t *)shared_page, PAGE_SIZE);
+ *     mytag_front_ring_t ring;
+ *     XEN_FRONT_RING_INIT(&ring, (mytag_sring_t *)shared_page, PAGE_SIZE);
  *
  * Initializing the back follows similarly (note that only the front
  * initializes the shared ring):
@@ -184,6 +166,11 @@ typedef struct __name##_back_ring __name##_back_ring_t
 
 #define FRONT_RING_INIT(_r, _s, __size) FRONT_RING_ATTACH(_r, _s, 0, __size)
 
+#define XEN_FRONT_RING_INIT(r, s, size) do {                            \
+    SHARED_RING_INIT(s);                                                \
+    FRONT_RING_INIT(r, s, size);                                        \
+} while (0)
+
 #define BACK_RING_ATTACH(_r, _s, _i, __size) do {                       \
     (_r)->rsp_prod_pvt = (_i);                                          \
     (_r)->req_cons = (_i);                                              \
@@ -208,11 +195,11 @@ typedef struct __name##_back_ring __name##_back_ring_t
     (RING_FREE_REQUESTS(_r) == 0)
 
 /* Test if there are outstanding messages to be processed on a ring. */
-#define RING_HAS_UNCONSUMED_RESPONSES(_r)                               \
+#define XEN_RING_NR_UNCONSUMED_RESPONSES(_r)                            \
     ((_r)->sring->rsp_prod - (_r)->rsp_cons)
 
 #ifdef __GNUC__
-#define RING_HAS_UNCONSUMED_REQUESTS(_r) ({                             \
+#define XEN_RING_NR_UNCONSUMED_REQUESTS(_r) ({                          \
     unsigned int req = (_r)->sring->req_prod - (_r)->req_cons;          \
     unsigned int rsp = RING_SIZE(_r) -                                  \
         ((_r)->req_cons - (_r)->rsp_prod_pvt);                          \
@@ -220,13 +207,27 @@ typedef struct __name##_back_ring __name##_back_ring_t
 })
 #else
 /* Same as above, but without the nice GCC ({ ... }) syntax. */
-#define RING_HAS_UNCONSUMED_REQUESTS(_r)                                \
+#define XEN_RING_NR_UNCONSUMED_REQUESTS(_r)                             \
     ((((_r)->sring->req_prod - (_r)->req_cons) <                        \
       (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt))) ?        \
      ((_r)->sring->req_prod - (_r)->req_cons) :                         \
      (RING_SIZE(_r) - ((_r)->req_cons - (_r)->rsp_prod_pvt)))
 #endif
 
+#ifdef XEN_RING_HAS_UNCONSUMED_IS_BOOL
+/*
+ * These variants should only be used in case no caller is abusing them for
+ * obtaining the number of unconsumed responses/requests.
+ */
+#define RING_HAS_UNCONSUMED_RESPONSES(_r) \
+    (!!XEN_RING_NR_UNCONSUMED_RESPONSES(_r))
+#define RING_HAS_UNCONSUMED_REQUESTS(_r)  \
+    (!!XEN_RING_NR_UNCONSUMED_REQUESTS(_r))
+#else
+#define RING_HAS_UNCONSUMED_RESPONSES(_r) XEN_RING_NR_UNCONSUMED_RESPONSES(_r)
+#define RING_HAS_UNCONSUMED_REQUESTS(_r)  XEN_RING_NR_UNCONSUMED_REQUESTS(_r)
+#endif
+
 /* Direct access to individual ring elements, by index. */
 #define RING_GET_REQUEST(_r, _idx)                                      \
     (&((_r)->sring->ring[((_idx) & (RING_SIZE(_r) - 1))].req))
diff --git a/include/hw/xen/interface/io/usbif.h b/include/hw/xen/interface/io/usbif.h
index c0a552e195..875af0dc7c 100644
--- a/include/hw/xen/interface/io/usbif.h
+++ b/include/hw/xen/interface/io/usbif.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: MIT */
 /*
  * usbif.h
  *
@@ -5,24 +6,6 @@
  *
  * Copyright (C) 2009, FUJITSU LABORATORIES LTD.
  * Author: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com>
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
  */
 
 #ifndef __XEN_PUBLIC_IO_USBIF_H__
diff --git a/include/hw/xen/interface/io/xenbus.h b/include/hw/xen/interface/io/xenbus.h
index 927f9db552..9cd0cd7c67 100644
--- a/include/hw/xen/interface/io/xenbus.h
+++ b/include/hw/xen/interface/io/xenbus.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /*****************************************************************************
  * xenbus.h
  *
  * Xenbus protocol details.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (C) 2005 XenSource Ltd.
  */
 
diff --git a/include/hw/xen/interface/io/xs_wire.h b/include/hw/xen/interface/io/xs_wire.h
index 4dd6632669..04e6849feb 100644
--- a/include/hw/xen/interface/io/xs_wire.h
+++ b/include/hw/xen/interface/io/xs_wire.h
@@ -1,25 +1,8 @@
+/* SPDX-License-Identifier: MIT */
 /*
  * Details of the "wire" protocol between Xen Store Daemon and client
  * library or guest kernel.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (C) 2005 Rusty Russell IBM Corporation
  */
 
@@ -71,11 +54,12 @@ struct xsd_errors
 #ifdef EINVAL
 #define XSD_ERROR(x) { x, #x }
 /* LINTED: static unused */
-static struct xsd_errors xsd_errors[]
+static const struct xsd_errors xsd_errors[]
 #if defined(__GNUC__)
 __attribute__((unused))
 #endif
     = {
+    /* /!\ New errors should be added at the end of the array. */
     XSD_ERROR(EINVAL),
     XSD_ERROR(EACCES),
     XSD_ERROR(EEXIST),
@@ -90,7 +74,8 @@ __attribute__((unused))
     XSD_ERROR(EBUSY),
     XSD_ERROR(EAGAIN),
     XSD_ERROR(EISCONN),
-    XSD_ERROR(E2BIG)
+    XSD_ERROR(E2BIG),
+    XSD_ERROR(EPERM),
 };
 #endif
 
@@ -124,6 +109,7 @@ struct xenstore_domain_interface {
     XENSTORE_RING_IDX rsp_cons, rsp_prod;
     uint32_t server_features; /* Bitmap of features supported by the server */
     uint32_t connection;
+    uint32_t error;
 };
 
 /* Violating this is very bad.  See docs/misc/xenstore.txt. */
@@ -135,10 +121,18 @@ struct xenstore_domain_interface {
 
 /* The ability to reconnect a ring */
 #define XENSTORE_SERVER_FEATURE_RECONNECTION 1
+/* The presence of the "error" field in the ring page */
+#define XENSTORE_SERVER_FEATURE_ERROR        2
 
 /* Valid values for the connection field */
 #define XENSTORE_CONNECTED 0 /* the steady-state */
-#define XENSTORE_RECONNECT 1 /* guest has initiated a reconnect */
+#define XENSTORE_RECONNECT 1 /* reconnect in progress */
+
+/* Valid values for the error field */
+#define XENSTORE_ERROR_NONE    0 /* No error */
+#define XENSTORE_ERROR_COMM    1 /* Communication problem */
+#define XENSTORE_ERROR_RINGIDX 2 /* Invalid ring index */
+#define XENSTORE_ERROR_PROTO   3 /* Protocol violation (payload too long) */
 
 #endif /* _XS_WIRE_H */
 
diff --git a/include/hw/xen/interface/memory.h b/include/hw/xen/interface/memory.h
index 383a9468c3..29cf5c8239 100644
--- a/include/hw/xen/interface/memory.h
+++ b/include/hw/xen/interface/memory.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * memory.h
  *
  * Memory reservation and information.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
  */
 
@@ -541,12 +524,14 @@ struct xen_mem_sharing_op {
                 uint32_t gref;     /* IN: gref to debug         */
             } u;
         } debug;
-        struct mem_sharing_op_fork {      /* OP_FORK */
+        struct mem_sharing_op_fork {      /* OP_FORK{,_RESET} */
             domid_t parent_domain;        /* IN: parent's domain id */
 /* Only makes sense for short-lived forks */
 #define XENMEM_FORK_WITH_IOMMU_ALLOWED (1u << 0)
 /* Only makes sense for short-lived forks */
 #define XENMEM_FORK_BLOCK_INTERRUPTS   (1u << 1)
+#define XENMEM_FORK_RESET_STATE        (1u << 2)
+#define XENMEM_FORK_RESET_MEMORY       (1u << 3)
             uint16_t flags;               /* IN: optional settings */
             uint32_t pad;                 /* Must be set to 0 */
         } fork;
@@ -662,6 +647,13 @@ struct xen_mem_acquire_resource {
      * two calls.
      */
     uint32_t nr_frames;
+    /*
+     * Padding field, must be zero on input.
+     * In a previous version this was an output field with the lowest bit
+     * named XENMEM_rsrc_acq_caller_owned. Future versions of this interface
+     * will not reuse this bit as an output with the field being zero on
+     * input.
+     */
     uint32_t pad;
     /*
      * IN - the index of the initial frame to be mapped. This parameter
diff --git a/include/hw/xen/interface/physdev.h b/include/hw/xen/interface/physdev.h
index d271766ad0..f0c0d4727c 100644
--- a/include/hw/xen/interface/physdev.h
+++ b/include/hw/xen/interface/physdev.h
@@ -1,22 +1,5 @@
+/* SPDX-License-Identifier: MIT */
 /*
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2006, Keir Fraser
  */
 
@@ -211,8 +194,8 @@ struct physdev_manage_pci_ext {
     /* IN */
     uint8_t bus;
     uint8_t devfn;
-    unsigned is_extfn;
-    unsigned is_virtfn;
+    uint32_t is_extfn;
+    uint32_t is_virtfn;
     struct {
         uint8_t bus;
         uint8_t devfn;
diff --git a/include/hw/xen/interface/sched.h b/include/hw/xen/interface/sched.h
index 811bd87c82..b4362c6a1d 100644
--- a/include/hw/xen/interface/sched.h
+++ b/include/hw/xen/interface/sched.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * sched.h
  *
  * Scheduler state interactions
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
  */
 
diff --git a/include/hw/xen/interface/trace.h b/include/hw/xen/interface/trace.h
index d5fa4aea8d..62a179971d 100644
--- a/include/hw/xen/interface/trace.h
+++ b/include/hw/xen/interface/trace.h
@@ -1,24 +1,7 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * include/public/trace.h
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Mark Williamson, (C) 2004 Intel Research Cambridge
  * Copyright (C) 2005 Bin Ren
  */
diff --git a/include/hw/xen/interface/vcpu.h b/include/hw/xen/interface/vcpu.h
index 3623af932f..81a3b3a743 100644
--- a/include/hw/xen/interface/vcpu.h
+++ b/include/hw/xen/interface/vcpu.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * vcpu.h
  *
  * VCPU initialisation, query, and hotplug.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
  */
 
diff --git a/include/hw/xen/interface/version.h b/include/hw/xen/interface/version.h
index 17a81e23cd..9c78b4f3b6 100644
--- a/include/hw/xen/interface/version.h
+++ b/include/hw/xen/interface/version.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * version.h
  *
  * Xen version, type, and compile information.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2005, Nguyen Anh Quynh <aquynh@gmail.com>
  * Copyright (c) 2005, Keir Fraser <keir@xensource.com>
  */
diff --git a/include/hw/xen/interface/xen-compat.h b/include/hw/xen/interface/xen-compat.h
index e1c027a95c..97fe698498 100644
--- a/include/hw/xen/interface/xen-compat.h
+++ b/include/hw/xen/interface/xen-compat.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * xen-compat.h
  *
  * Guest OS interface to Xen.  Compatibility layer.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2006, Christian Limpach
  */
 
diff --git a/include/hw/xen/interface/xen.h b/include/hw/xen/interface/xen.h
index e373592c33..920567e006 100644
--- a/include/hw/xen/interface/xen.h
+++ b/include/hw/xen/interface/xen.h
@@ -1,26 +1,9 @@
+/* SPDX-License-Identifier: MIT */
 /******************************************************************************
  * xen.h
  *
  * Guest OS interface to Xen.
  *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
  * Copyright (c) 2004, K A Fraser
  */
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 11/28] i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (9 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 10/28] include: update Xen public headers to Xen 4.17.2 release David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 12/28] hw/xen: populate store frontend nodes with XenStore PFN/port David Woodhouse
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

This will allow Linux guests (since v6.0) to use the per-vCPU upcall
vector delivered as MSI through the local APIC.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 target/i386/kvm/kvm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e7c054cc16..8f93a0b5fc 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1889,6 +1889,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
                 c->eax |= XEN_HVM_CPUID_VCPU_ID_PRESENT;
                 c->ebx = cs->cpu_index;
             }
+
+            if (cs->kvm_state->xen_version >= XEN_VERSION(4, 17)) {
+                c->eax |= XEN_HVM_CPUID_UPCALL_VECTOR;
+            }
         }
 
         r = kvm_xen_init_vcpu(cs);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 12/28] hw/xen: populate store frontend nodes with XenStore PFN/port
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (10 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 11/28] i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 13/28] hw/xen: automatically assign device index to block devices David Woodhouse
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

This is kind of redundant since without being able to get these through
some other method (HVMOP_get_param) the guest wouldn't be able to access
XenStore in order to find them.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 hw/i386/kvm/xen_xenstore.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 831da535fc..b7c0407765 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -1434,6 +1434,7 @@ static void alloc_guest_port(XenXenstoreState *s)
 int xen_xenstore_reset(void)
 {
     XenXenstoreState *s = xen_xenstore_singleton;
+    GList *perms;
     int err;
 
     if (!s) {
@@ -1461,6 +1462,16 @@ int xen_xenstore_reset(void)
     }
     s->be_port = err;
 
+    /* Create frontend store nodes */
+    perms = g_list_append(NULL, xs_perm_as_string(XS_PERM_NONE, DOMID_QEMU));
+    perms = g_list_append(perms, xs_perm_as_string(XS_PERM_READ, xen_domid));
+
+    relpath_printf(s, perms, "store/port", "%u", s->guest_port);
+    relpath_printf(s, perms, "store/ring-ref", "%lu",
+                   XEN_SPECIAL_PFN(XENSTORE));
+
+    g_list_free_full(perms, g_free);
+
     /*
      * We don't actually access the guest's page through the grant, because
      * this isn't real Xen, and we can just use the page we gave it in the
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 13/28] hw/xen: automatically assign device index to block devices
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (11 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 12/28] hw/xen: populate store frontend nodes with XenStore PFN/port David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  7:30   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 14/28] hw/xen: add get_frontend_path() method to XenDeviceClass David Woodhouse
                   ` (14 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

There's no need to force the user to assign a vdev. We can automatically
assign one, starting at xvda and searching until we find the first disk
name that's unused.

This means we can now allow '-drive if=xen,file=xxx' to work without an
explicit separate -driver argument, just like if=virtio.

Rip out the legacy handling from the xenpv machine, which was scribbling
over any disks configured by the toolstack, and didn't work with anything
but raw images.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c                          | 15 +++++++++---
 hw/block/xen-block.c                | 38 +++++++++++++++++++++++++++++
 hw/xen/xen_devconfig.c              | 28 ---------------------
 hw/xenpv/xen_machine_pv.c           |  9 -------
 include/hw/xen/xen-legacy-backend.h |  1 -
 5 files changed, 50 insertions(+), 41 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a01c62596b..5d9f2e5395 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -255,13 +255,13 @@ void drive_check_orphaned(void)
          * Ignore default drives, because we create certain default
          * drives unconditionally, then leave them unclaimed.  Not the
          * users fault.
-         * Ignore IF_VIRTIO, because it gets desugared into -device,
-         * so we can leave failing to -device.
+         * Ignore IF_VIRTIO or IF_XEN, because it gets desugared into
+         * -device, so we can leave failing to -device.
          * Ignore IF_NONE, because leaving unclaimed IF_NONE remains
          * available for device_add is a feature.
          */
         if (dinfo->is_default || dinfo->type == IF_VIRTIO
-            || dinfo->type == IF_NONE) {
+            || dinfo->type == IF_XEN || dinfo->type == IF_NONE) {
             continue;
         }
         if (!blk_get_attached_dev(blk)) {
@@ -977,6 +977,15 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type,
         qemu_opt_set(devopts, "driver", "virtio-blk", &error_abort);
         qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"),
                      &error_abort);
+    } else if (type == IF_XEN) {
+        QemuOpts *devopts;
+        devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0,
+                                   &error_abort);
+        qemu_opt_set(devopts, "driver",
+                     (media == MEDIA_CDROM) ? "xen-cdrom" : "xen-disk",
+                     &error_abort);
+        qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"),
+                     &error_abort);
     }
 
     filename = qemu_opt_get(legacy_opts, "file");
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 64470fc579..5011fe9430 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -27,6 +27,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/iothread.h"
 #include "dataplane/xen-block.h"
+#include "hw/xen/interface/io/xs_wire.h"
 #include "trace.h"
 
 static char *xen_block_get_name(XenDevice *xendev, Error **errp)
@@ -34,6 +35,43 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp)
     XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
     XenBlockVdev *vdev = &blockdev->props.vdev;
 
+    if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) {
+        XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+        char fe_path[XENSTORE_ABS_PATH_MAX + 1];
+        char *value;
+        int disk = 0;
+        unsigned long idx;
+
+        /* Find an unoccupied device name */
+        while (disk < (1 << 20)) {
+            if (disk < (1 << 4)) {
+                idx = (202 << 8) | (disk << 4);
+            } else {
+                idx = (1 << 28) | (disk << 8);
+            }
+            snprintf(fe_path, sizeof(fe_path),
+                     "/local/domain/%u/device/vbd/%lu",
+                     xendev->frontend_id, idx);
+            value = qemu_xen_xs_read(xenbus->xsh, XBT_NULL, fe_path, NULL);
+            if (!value) {
+                if (errno == ENOENT) {
+                    vdev->type = XEN_BLOCK_VDEV_TYPE_XVD;
+                    vdev->partition = 0;
+                    vdev->disk = disk;
+                    vdev->number = idx;
+                    goto found;
+                }
+                error_setg(errp, "cannot read %s: %s", fe_path,
+                           strerror(errno));
+                return NULL;
+            }
+            free(value);
+            disk++;
+        }
+        error_setg(errp, "cannot find device vdev for block device");
+        return NULL;
+    }
+ found:
     return g_strdup_printf("%lu", vdev->number);
 }
 
diff --git a/hw/xen/xen_devconfig.c b/hw/xen/xen_devconfig.c
index 9b7304e544..3f77c675c6 100644
--- a/hw/xen/xen_devconfig.c
+++ b/hw/xen/xen_devconfig.c
@@ -46,34 +46,6 @@ static int xen_config_dev_all(char *fe, char *be)
 
 /* ------------------------------------------------------------- */
 
-int xen_config_dev_blk(DriveInfo *disk)
-{
-    char fe[256], be[256], device_name[32];
-    int vdev = 202 * 256 + 16 * disk->unit;
-    int cdrom = disk->media_cd;
-    const char *devtype = cdrom ? "cdrom" : "disk";
-    const char *mode    = cdrom ? "r"     : "w";
-    const char *filename = qemu_opt_get(disk->opts, "file");
-
-    snprintf(device_name, sizeof(device_name), "xvd%c", 'a' + disk->unit);
-    xen_pv_printf(NULL, 1, "config disk %d [%s]: %s\n",
-                  disk->unit, device_name, filename);
-    xen_config_dev_dirs("vbd", "qdisk", vdev, fe, be, sizeof(fe));
-
-    /* frontend */
-    xenstore_write_int(fe, "virtual-device",  vdev);
-    xenstore_write_str(fe, "device-type",     devtype);
-
-    /* backend */
-    xenstore_write_str(be, "dev",             device_name);
-    xenstore_write_str(be, "type",            "file");
-    xenstore_write_str(be, "params",          filename);
-    xenstore_write_str(be, "mode",            mode);
-
-    /* common stuff */
-    return xen_config_dev_all(fe, be);
-}
-
 int xen_config_dev_nic(NICInfo *nic)
 {
     char fe[256], be[256];
diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 17cda5ec13..1533f5dfb4 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -32,7 +32,6 @@
 
 static void xen_init_pv(MachineState *machine)
 {
-    DriveInfo *dinfo;
     int i;
 
     setup_xen_backend_ops();
@@ -64,14 +63,6 @@ static void xen_init_pv(MachineState *machine)
         vga_interface_created = true;
     }
 
-    /* configure disks */
-    for (i = 0; i < 16; i++) {
-        dinfo = drive_get(IF_XEN, 0, i);
-        if (!dinfo)
-            continue;
-        xen_config_dev_blk(dinfo);
-    }
-
     /* configure nics */
     for (i = 0; i < nb_nics; i++) {
         if (!nd_table[i].model || 0 != strcmp(nd_table[i].model, "xen"))
diff --git a/include/hw/xen/xen-legacy-backend.h b/include/hw/xen/xen-legacy-backend.h
index 6c307c5f2c..fc42146bc2 100644
--- a/include/hw/xen/xen-legacy-backend.h
+++ b/include/hw/xen/xen-legacy-backend.h
@@ -81,7 +81,6 @@ extern struct XenDevOps xen_usb_ops;          /* xen-usb.c         */
 
 /* configuration (aka xenbus setup) */
 void xen_config_cleanup(void);
-int xen_config_dev_blk(DriveInfo *disk);
 int xen_config_dev_nic(NICInfo *nic);
 int xen_config_dev_vfb(int vdev, const char *type);
 int xen_config_dev_vkbd(int vdev);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 14/28] hw/xen: add get_frontend_path() method to XenDeviceClass
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (12 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 13/28] hw/xen: automatically assign device index to block devices David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  7:31   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 15/28] hw/xen: do not repeatedly try to create a failing backend device David Woodhouse
                   ` (13 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

The primary Xen console is special. The guest's side is set up for it by
the toolstack automatically and not by the standard PV init sequence.

Accordingly, its *frontend* doesn't appear in …/device/console/0 either;
instead it appears under …/console in the guest's XenStore node.

To allow the Xen console driver to override the frontend path for the
primary console, add a method to the XenDeviceClass which can be used
instead of the standard xen_device_get_frontend_path()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/xen/xen-bus.c         | 11 ++++++++++-
 include/hw/xen/xen-bus.h |  2 ++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index ece8ec40cd..12ff782005 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -711,8 +711,17 @@ static void xen_device_frontend_create(XenDevice *xendev, Error **errp)
 {
     ERRP_GUARD();
     XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+    XenDeviceClass *xendev_class = XEN_DEVICE_GET_CLASS(xendev);
 
-    xendev->frontend_path = xen_device_get_frontend_path(xendev);
+    if (xendev_class->get_frontend_path) {
+        xendev->frontend_path = xendev_class->get_frontend_path(xendev, errp);
+        if (!xendev->frontend_path) {
+            error_prepend(errp, "failed to create frontend: ");
+            return;
+        }
+    } else {
+        xendev->frontend_path = xen_device_get_frontend_path(xendev);
+    }
 
     /*
      * The frontend area may have already been created by a legacy
diff --git a/include/hw/xen/xen-bus.h b/include/hw/xen/xen-bus.h
index f435898164..eb440880b5 100644
--- a/include/hw/xen/xen-bus.h
+++ b/include/hw/xen/xen-bus.h
@@ -33,6 +33,7 @@ struct XenDevice {
 };
 typedef struct XenDevice XenDevice;
 
+typedef char *(*XenDeviceGetFrontendPath)(XenDevice *xendev, Error **errp);
 typedef char *(*XenDeviceGetName)(XenDevice *xendev, Error **errp);
 typedef void (*XenDeviceRealize)(XenDevice *xendev, Error **errp);
 typedef void (*XenDeviceFrontendChanged)(XenDevice *xendev,
@@ -46,6 +47,7 @@ struct XenDeviceClass {
     /*< public >*/
     const char *backend;
     const char *device;
+    XenDeviceGetFrontendPath get_frontend_path;
     XenDeviceGetName get_name;
     XenDeviceRealize realize;
     XenDeviceFrontendChanged frontend_changed;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 15/28] hw/xen: do not repeatedly try to create a failing backend device
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (13 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 14/28] hw/xen: add get_frontend_path() method to XenDeviceClass David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 16/28] hw/xen: update Xen console to XenDevice model David Woodhouse
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

If xen_backend_device_create() fails to instantiate a device, the XenBus
code will just keep trying over and over again each time the bus is
re-enumerated, as long as the backend appears online and in
XenbusStateInitialising.

The only thing which prevents the XenBus code from recreating duplicates
of devices which already exist, is the fact that xen_device_realize()
sets the backend state to XenbusStateInitWait. If the attempt to create
the device doesn't get *that* far, that's when it will keep getting
retried.

My first thought was to handle errors by setting the backend state to
XenbusStateClosed, but that doesn't work for XenConsole which wants to
*ignore* any device of type != "ioemu" completely.

So, make xen_backend_device_create() *keep* the XenBackendInstance for a
failed device, and provide a new xen_backend_exists() function to allow
xen_bus_type_enumerate() to check whether one already exists before
creating a new one.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 hw/xen/xen-backend.c         | 27 +++++++++++++++++++++------
 hw/xen/xen-bus.c             |  3 ++-
 include/hw/xen/xen-backend.h |  1 +
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/hw/xen/xen-backend.c b/hw/xen/xen-backend.c
index 5b0fb76eae..b9bf70a9f5 100644
--- a/hw/xen/xen-backend.c
+++ b/hw/xen/xen-backend.c
@@ -101,6 +101,24 @@ static XenBackendInstance *xen_backend_list_find(XenDevice *xendev)
     return NULL;
 }
 
+bool xen_backend_exists(const char *type, const char *name)
+{
+    const XenBackendImpl *impl = xen_backend_table_lookup(type);
+    XenBackendInstance *backend;
+
+    if (!impl) {
+        return false;
+    }
+
+    QLIST_FOREACH(backend, &backend_list, entry) {
+        if (backend->impl == impl && !strcmp(backend->name, name)) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
 static void xen_backend_list_remove(XenBackendInstance *backend)
 {
     QLIST_REMOVE(backend, entry);
@@ -122,11 +140,6 @@ void xen_backend_device_create(XenBus *xenbus, const char *type,
     backend->name = g_strdup(name);
 
     impl->create(backend, opts, errp);
-    if (*errp) {
-        g_free(backend->name);
-        g_free(backend);
-        return;
-    }
 
     backend->impl = impl;
     xen_backend_list_add(backend);
@@ -165,7 +178,9 @@ bool xen_backend_try_device_destroy(XenDevice *xendev, Error **errp)
     }
 
     impl = backend->impl;
-    impl->destroy(backend, errp);
+    if (backend->xendev) {
+        impl->destroy(backend, errp);
+    }
 
     xen_backend_list_remove(backend);
     g_free(backend->name);
diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index 12ff782005..3ffd1a5333 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -209,7 +209,8 @@ static void xen_bus_type_enumerate(XenBus *xenbus, const char *type)
                           NULL, "%u", &online) != 1)
             online = 0;
 
-        if (online && state == XenbusStateInitialising) {
+        if (online && state == XenbusStateInitialising &&
+            !xen_backend_exists(type, backend[i])) {
             Error *local_err = NULL;
 
             xen_bus_backend_create(xenbus, type, backend[i], backend_path,
diff --git a/include/hw/xen/xen-backend.h b/include/hw/xen/xen-backend.h
index aac2fd454d..0f01631ae7 100644
--- a/include/hw/xen/xen-backend.h
+++ b/include/hw/xen/xen-backend.h
@@ -33,6 +33,7 @@ XenDevice *xen_backend_get_device(XenBackendInstance *backend);
 void xen_backend_register(const XenBackendInfo *info);
 const char **xen_backend_get_types(unsigned int *nr);
 
+bool xen_backend_exists(const char *type, const char *name);
 void xen_backend_device_create(XenBus *xenbus, const char *type,
                                const char *name, QDict *opts, Error **errp);
 bool xen_backend_try_device_destroy(XenDevice *xendev, Error **errp);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 16/28] hw/xen: update Xen console to XenDevice model
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (14 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 15/28] hw/xen: do not repeatedly try to create a failing backend device David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 17/28] hw/xen: add support for Xen primary console in emulated mode David Woodhouse
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

This allows (non-primary) console devices to be created on the command
line and hotplugged.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 hw/char/trace-events        |   8 +
 hw/char/xen_console.c       | 532 +++++++++++++++++++++++++++---------
 hw/xen/xen-legacy-backend.c |   1 -
 3 files changed, 411 insertions(+), 130 deletions(-)

diff --git a/hw/char/trace-events b/hw/char/trace-events
index babf4d35ea..7a398c82a5 100644
--- a/hw/char/trace-events
+++ b/hw/char/trace-events
@@ -105,3 +105,11 @@ cadence_uart_baudrate(unsigned baudrate) "baudrate %u"
 # sh_serial.c
 sh_serial_read(char *id, unsigned size, uint64_t offs, uint64_t val) " %s size %d offs 0x%02" PRIx64 " -> 0x%02" PRIx64
 sh_serial_write(char *id, unsigned size, uint64_t offs, uint64_t val) "%s size %d offs 0x%02" PRIx64 " <- 0x%02" PRIx64
+
+# xen_console.c
+xen_console_connect(unsigned int idx, unsigned int ring_ref, unsigned int port, unsigned int limit) "idx %u ring_ref %u port %u limit %u"
+xen_console_disconnect(unsigned int idx) "idx %u"
+xen_console_unrealize(unsigned int idx) "idx %u"
+xen_console_realize(unsigned int idx, const char *chrdev) "idx %u chrdev %s"
+xen_console_device_create(unsigned int idx) "idx %u"
+xen_console_device_destroy(unsigned int idx) "idx %u"
diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index 810dae3f44..4a419dc287 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -20,15 +20,20 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/cutils.h"
 #include <sys/select.h>
 #include <termios.h>
 
 #include "qapi/error.h"
 #include "sysemu/sysemu.h"
 #include "chardev/char-fe.h"
-#include "hw/xen/xen-legacy-backend.h"
-
+#include "hw/xen/xen-backend.h"
+#include "hw/xen/xen-bus-helper.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
 #include "hw/xen/interface/io/console.h"
+#include "hw/xen/interface/io/xs_wire.h"
+#include "trace.h"
 
 struct buffer {
     uint8_t *data;
@@ -39,16 +44,22 @@ struct buffer {
 };
 
 struct XenConsole {
-    struct XenLegacyDevice  xendev;  /* must be first */
+    struct XenDevice  xendev;  /* must be first */
+    XenEventChannel   *event_channel;
+    int               dev;
     struct buffer     buffer;
-    char              console[XEN_BUFSIZE];
-    int               ring_ref;
+    char              *fe_path;
+    unsigned int      ring_ref;
     void              *sring;
     CharBackend       chr;
     int               backlog;
 };
+typedef struct XenConsole XenConsole;
+
+#define TYPE_XEN_CONSOLE_DEVICE "xen-console"
+OBJECT_DECLARE_SIMPLE_TYPE(XenConsole, XEN_CONSOLE_DEVICE)
 
-static void buffer_append(struct XenConsole *con)
+static bool buffer_append(XenConsole *con)
 {
     struct buffer *buffer = &con->buffer;
     XENCONS_RING_IDX cons, prod, size;
@@ -60,7 +71,7 @@ static void buffer_append(struct XenConsole *con)
 
     size = prod - cons;
     if ((size == 0) || (size > sizeof(intf->out)))
-        return;
+        return false;
 
     if ((buffer->capacity - buffer->size) < size) {
         buffer->capacity += (size + 1024);
@@ -73,7 +84,7 @@ static void buffer_append(struct XenConsole *con)
 
     xen_mb();
     intf->out_cons = cons;
-    xen_pv_send_notify(&con->xendev);
+    xen_device_notify_event_channel(XEN_DEVICE(con), con->event_channel, NULL);
 
     if (buffer->max_capacity &&
         buffer->size > buffer->max_capacity) {
@@ -89,6 +100,7 @@ static void buffer_append(struct XenConsole *con)
         if (buffer->consumed > buffer->max_capacity - over)
             buffer->consumed = buffer->max_capacity - over;
     }
+    return true;
 }
 
 static void buffer_advance(struct buffer *buffer, size_t len)
@@ -100,7 +112,7 @@ static void buffer_advance(struct buffer *buffer, size_t len)
     }
 }
 
-static int ring_free_bytes(struct XenConsole *con)
+static int ring_free_bytes(XenConsole *con)
 {
     struct xencons_interface *intf = con->sring;
     XENCONS_RING_IDX cons, prod, space;
@@ -118,13 +130,13 @@ static int ring_free_bytes(struct XenConsole *con)
 
 static int xencons_can_receive(void *opaque)
 {
-    struct XenConsole *con = opaque;
+    XenConsole *con = opaque;
     return ring_free_bytes(con);
 }
 
 static void xencons_receive(void *opaque, const uint8_t *buf, int len)
 {
-    struct XenConsole *con = opaque;
+    XenConsole *con = opaque;
     struct xencons_interface *intf = con->sring;
     XENCONS_RING_IDX prod;
     int i, max;
@@ -141,10 +153,10 @@ static void xencons_receive(void *opaque, const uint8_t *buf, int len)
     }
     xen_wmb();
     intf->in_prod = prod;
-    xen_pv_send_notify(&con->xendev);
+    xen_device_notify_event_channel(XEN_DEVICE(con), con->event_channel, NULL);
 }
 
-static void xencons_send(struct XenConsole *con)
+static bool xencons_send(XenConsole *con)
 {
     ssize_t len, size;
 
@@ -159,174 +171,436 @@ static void xencons_send(struct XenConsole *con)
     if (len < 1) {
         if (!con->backlog) {
             con->backlog = 1;
-            xen_pv_printf(&con->xendev, 1,
-                          "backlog piling up, nobody listening?\n");
         }
     } else {
         buffer_advance(&con->buffer, len);
         if (con->backlog && len == size) {
             con->backlog = 0;
-            xen_pv_printf(&con->xendev, 1, "backlog is gone\n");
         }
     }
+    return len > 0;
 }
 
 /* -------------------------------------------------------------------- */
 
-static int store_con_info(struct XenConsole *con)
+static bool con_event(void *_xendev)
 {
-    Chardev *cs = qemu_chr_fe_get_driver(&con->chr);
-    char *pts = NULL;
-    char *dom_path;
-    g_autoptr(GString) path = NULL;
+    XenConsole *con = XEN_CONSOLE_DEVICE(_xendev);
+    bool done_something;
+
+    if (xen_device_backend_get_state(&con->xendev) != XenbusStateConnected) {
+        return false;
+    }
+
+    done_something = buffer_append(con);
+
+    if (con->buffer.size - con->buffer.consumed) {
+        done_something |= xencons_send(con);
+    }
+    return done_something;
+}
+
+/* -------------------------------------------------------------------- */
+
+static bool xen_console_connect(XenDevice *xendev, Error **errp)
+{
+    XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
+    unsigned int port, limit;
+
+    if (xen_device_frontend_scanf(xendev, "ring-ref", "%u",
+                                  &con->ring_ref) != 1) {
+        error_setg(errp, "failed to read ring-ref");
+        return false;
+    }
 
-    /* Only continue if we're talking to a pty. */
-    if (!CHARDEV_IS_PTY(cs)) {
-        return 0;
+    if (xen_device_frontend_scanf(xendev, "port", "%u", &port) != 1) {
+        error_setg(errp, "failed to read remote port");
+        return false;
     }
-    pts = cs->filename + 4;
 
-    dom_path = qemu_xen_xs_get_domain_path(xenstore, xen_domid);
-    if (!dom_path) {
-        return 0;
+    if (xen_device_frontend_scanf(xendev, "limit", "%u", &limit) == 1) {
+        con->buffer.max_capacity = limit;
     }
 
-    path = g_string_new(dom_path);
-    free(dom_path);
+    con->event_channel = xen_device_bind_event_channel(xendev, port,
+                                                       con_event,
+                                                       con,
+                                                       errp);
+    if (!con->event_channel) {
+        return false;
+    }
 
-    if (con->xendev.dev) {
-        g_string_append_printf(path, "/device/console/%d", con->xendev.dev);
+    if (!con->dev) {
+        xen_pfn_t mfn = (xen_pfn_t)con->ring_ref;
+        con->sring = qemu_xen_foreignmem_map(xendev->frontend_id, NULL,
+                                             PROT_READ | PROT_WRITE,
+                                             1, &mfn, NULL);
+        if (!con->sring) {
+            error_setg(errp, "failed to map console page");
+            return false;
+        }
     } else {
-        g_string_append(path, "/console");
+        con->sring = xen_device_map_grant_refs(xendev,
+                                               &con->ring_ref, 1,
+                                               PROT_READ | PROT_WRITE,
+                                               errp);
+        if (!con->sring) {
+            error_prepend(errp, "failed to map grant ref: ");
+            return false;
+        }
     }
-    g_string_append(path, "/tty");
 
-    if (xenstore_write_str(con->console, path->str, pts)) {
-        fprintf(stderr, "xenstore_write_str for '%s' fail", path->str);
-        return -1;
+    trace_xen_console_connect(con->dev, con->ring_ref, port,
+                              con->buffer.max_capacity);
+
+    qemu_chr_fe_set_handlers(&con->chr, xencons_can_receive,
+                             xencons_receive, NULL, NULL, con, NULL,
+                             true);
+    return true;
+}
+
+static void xen_console_disconnect(XenDevice *xendev, Error **errp)
+{
+    XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
+
+    trace_xen_console_disconnect(con->dev);
+
+    qemu_chr_fe_set_handlers(&con->chr, NULL, NULL, NULL, NULL,
+                             con, NULL, true);
+
+    if (con->event_channel) {
+        xen_device_unbind_event_channel(xendev, con->event_channel,
+                                        errp);
+        con->event_channel = NULL;
+    }
+
+    if (con->sring) {
+        if (!con->dev) {
+            qemu_xen_foreignmem_unmap(con->sring, 1);
+        } else {
+            xen_device_unmap_grant_refs(xendev, con->sring,
+                                        &con->ring_ref, 1, errp);
+        }
+        con->sring = NULL;
     }
-    return 0;
 }
 
-static int con_init(struct XenLegacyDevice *xendev)
+static void xen_console_frontend_changed(XenDevice *xendev,
+                                         enum xenbus_state frontend_state,
+                                         Error **errp)
 {
-    struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
-    char *type, *dom, label[32];
-    int ret = 0;
-    const char *output;
-
-    /* setup */
-    dom = qemu_xen_xs_get_domain_path(xenstore, con->xendev.dom);
-    if (!xendev->dev) {
-        snprintf(con->console, sizeof(con->console), "%s/console", dom);
-    } else {
-        snprintf(con->console, sizeof(con->console), "%s/device/console/%d", dom, xendev->dev);
+    ERRP_GUARD();
+    enum xenbus_state backend_state = xen_device_backend_get_state(xendev);
+
+    switch (frontend_state) {
+    case XenbusStateInitialised:
+    case XenbusStateConnected:
+        if (backend_state == XenbusStateConnected) {
+            break;
+        }
+
+        xen_console_disconnect(xendev, errp);
+        if (*errp) {
+            break;
+        }
+
+        if (!xen_console_connect(xendev, errp)) {
+            xen_device_backend_set_state(xendev, XenbusStateClosing);
+            break;
+        }
+
+        xen_device_backend_set_state(xendev, XenbusStateConnected);
+        break;
+
+    case XenbusStateClosing:
+        xen_device_backend_set_state(xendev, XenbusStateClosing);
+        break;
+
+    case XenbusStateClosed:
+    case XenbusStateUnknown:
+        xen_console_disconnect(xendev, errp);
+        if (*errp) {
+            break;
+        }
+
+        xen_device_backend_set_state(xendev, XenbusStateClosed);
+        break;
+
+    default:
+        break;
     }
-    free(dom);
+}
 
-    type = xenstore_read_str(con->console, "type");
-    if (!type || strcmp(type, "ioemu") != 0) {
-        xen_pv_printf(xendev, 1, "not for me (type=%s)\n", type);
-        ret = -1;
-        goto out;
+static char *xen_console_get_name(XenDevice *xendev, Error **errp)
+{
+    XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
+
+    if (con->dev == -1) {
+        XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+        char fe_path[XENSTORE_ABS_PATH_MAX + 1];
+        char *value;
+        int idx = 1;
+
+        /* Theoretically we could go up to INT_MAX here but that's overkill */
+        while (idx < 100) {
+            snprintf(fe_path, sizeof(fe_path),
+                     "/local/domain/%u/device/console/%u",
+                     xendev->frontend_id, idx);
+            value = qemu_xen_xs_read(xenbus->xsh, XBT_NULL, fe_path, NULL);
+            if (!value) {
+                if (errno == ENOENT) {
+                    con->dev = idx;
+                    goto found;
+                }
+                error_setg(errp, "cannot read %s: %s", fe_path,
+                           strerror(errno));
+                return NULL;
+            }
+            free(value);
+            idx++;
+        }
+        error_setg(errp, "cannot find device index for console device");
+        return NULL;
     }
+ found:
+    return g_strdup_printf("%u", con->dev);
+}
+
+static void xen_console_unrealize(XenDevice *xendev)
+{
+    XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
+
+    trace_xen_console_unrealize(con->dev);
 
-    output = xenstore_read_str(con->console, "output");
+    /* Disconnect from the frontend in case this has not already happened */
+    xen_console_disconnect(xendev, NULL);
 
-    /* no Xen override, use qemu output device */
-    if (output == NULL) {
-        if (con->xendev.dev) {
-            qemu_chr_fe_init(&con->chr, serial_hd(con->xendev.dev),
-                             &error_abort);
+    qemu_chr_fe_deinit(&con->chr, false);
+}
+
+static void xen_console_realize(XenDevice *xendev, Error **errp)
+{
+    ERRP_GUARD();
+    XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
+    Chardev *cs = qemu_chr_fe_get_driver(&con->chr);
+    unsigned int u;
+
+    if (!cs) {
+        error_setg(errp, "no backing character device");
+        return;
+    }
+
+    if (con->dev == -1) {
+        error_setg(errp, "no device index provided");
+        return;
+    }
+
+    /*
+     * The Xen primary console is special. The ring-ref is actually a GFN to
+     * be mapped directly as foreignmem (not a grant ref), and the guest port
+     * was allocated *for* the guest by the toolstack. The guest gets these
+     * through HVMOP_get_param and can use the console long before it's got
+     * XenStore up and running. We cannot create those for a Xen guest.
+     */
+    if (!con->dev) {
+        if (xen_device_frontend_scanf(xendev, "ring-ref", "%u", &u) != 1 ||
+            xen_device_frontend_scanf(xendev, "port", "%u", &u) != 1) {
+            error_setg(errp, "cannot create primary Xen console");
+            return;
         }
+    }
+
+    trace_xen_console_realize(con->dev, object_get_typename(OBJECT(cs)));
+
+    if (CHARDEV_IS_PTY(cs)) {
+        /* Strip the leading 'pty:' */
+        xen_device_frontend_printf(xendev, "tty", "%s", cs->filename + 4);
+    }
+
+    /* No normal PV driver initialization for the primary console */
+    if (!con->dev) {
+        xen_console_connect(xendev, errp);
+    }
+}
+
+static char *console_frontend_path(struct qemu_xs_handle *xenstore,
+                                   unsigned int dom_id, unsigned int dev)
+{
+    if (!dev) {
+        return g_strdup_printf("/local/domain/%u/console", dom_id);
     } else {
-        snprintf(label, sizeof(label), "xencons%d", con->xendev.dev);
-        qemu_chr_fe_init(&con->chr,
-                         /*
-                          * FIXME: sure we want to support implicit
-                          * muxed monitors here?
-                          */
-                         qemu_chr_new_mux_mon(label, output, NULL),
-                         &error_abort);
+        return g_strdup_printf("/local/domain/%u/device/console/%u", dom_id,
+                               dev);
     }
+}
 
-    store_con_info(con);
+static char *xen_console_get_frontend_path(XenDevice *xendev, Error **errp)
+{
+    XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
+    XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+    char *ret = console_frontend_path(xenbus->xsh, xendev->frontend_id,
+                                      con->dev);
 
-out:
-    g_free(type);
+    if (!ret) {
+        error_setg(errp, "failed to create frontend path");
+    }
     return ret;
 }
 
-static int con_initialise(struct XenLegacyDevice *xendev)
+
+static Property xen_console_properties[] = {
+    DEFINE_PROP_CHR("chardev", XenConsole, chr),
+    DEFINE_PROP_INT32("idx", XenConsole, dev, -1),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void xen_console_class_init(ObjectClass *class, void *data)
 {
-    struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
-    int limit;
-
-    if (xenstore_read_int(con->console, "ring-ref", &con->ring_ref) == -1)
-        return -1;
-    if (xenstore_read_int(con->console, "port", &con->xendev.remote_port) == -1)
-        return -1;
-    if (xenstore_read_int(con->console, "limit", &limit) == 0)
-        con->buffer.max_capacity = limit;
+    DeviceClass *dev_class = DEVICE_CLASS(class);
+    XenDeviceClass *xendev_class = XEN_DEVICE_CLASS(class);
+
+    xendev_class->backend = "console";
+    xendev_class->device = "console";
+    xendev_class->get_name = xen_console_get_name;
+    xendev_class->realize = xen_console_realize;
+    xendev_class->frontend_changed = xen_console_frontend_changed;
+    xendev_class->unrealize = xen_console_unrealize;
+    xendev_class->get_frontend_path = xen_console_get_frontend_path;
+
+    device_class_set_props(dev_class, xen_console_properties);
+}
 
-    if (!xendev->dev) {
-        xen_pfn_t mfn = con->ring_ref;
-        con->sring = qemu_xen_foreignmem_map(con->xendev.dom, NULL,
-                                             PROT_READ | PROT_WRITE,
-                                             1, &mfn, NULL);
-    } else {
-        con->sring = xen_be_map_grant_ref(xendev, con->ring_ref,
-                                          PROT_READ | PROT_WRITE);
-    }
-    if (!con->sring)
-        return -1;
+static const TypeInfo xen_console_type_info = {
+    .name = TYPE_XEN_CONSOLE_DEVICE,
+    .parent = TYPE_XEN_DEVICE,
+    .instance_size = sizeof(XenConsole),
+    .class_init = xen_console_class_init,
+};
 
-    xen_be_bind_evtchn(&con->xendev);
-    qemu_chr_fe_set_handlers(&con->chr, xencons_can_receive,
-                             xencons_receive, NULL, NULL, con, NULL, true);
-
-    xen_pv_printf(xendev, 1,
-                  "ring mfn %d, remote port %d, local port %d, limit %zd\n",
-                  con->ring_ref,
-                  con->xendev.remote_port,
-                  con->xendev.local_port,
-                  con->buffer.max_capacity);
-    return 0;
+static void xen_console_register_types(void)
+{
+    type_register_static(&xen_console_type_info);
 }
 
-static void con_disconnect(struct XenLegacyDevice *xendev)
+type_init(xen_console_register_types)
+
+/* Called to instantiate a XenConsole when the backend is detected. */
+static void xen_console_device_create(XenBackendInstance *backend,
+                                      QDict *opts, Error **errp)
 {
-    struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
+    ERRP_GUARD();
+    XenBus *xenbus = xen_backend_get_bus(backend);
+    const char *name = xen_backend_get_name(backend);
+    unsigned long number;
+    char *fe = NULL, *type = NULL, *output = NULL;
+    char label[32];
+    XenDevice *xendev = NULL;
+    XenConsole *con;
+    Chardev *cd = NULL;
+    struct qemu_xs_handle *xsh = xenbus->xsh;
+
+    if (qemu_strtoul(name, NULL, 10, &number) || number > INT_MAX) {
+        error_setg(errp, "failed to parse name '%s'", name);
+        goto fail;
+    }
 
-    qemu_chr_fe_deinit(&con->chr, false);
-    xen_pv_unbind_evtchn(&con->xendev);
+    trace_xen_console_device_create(number);
 
-    if (con->sring) {
-        if (!xendev->dev) {
-            qemu_xen_foreignmem_unmap(con->sring, 1);
-        } else {
-            xen_be_unmap_grant_ref(xendev, con->sring, con->ring_ref);
+    fe = console_frontend_path(xsh, xen_domid, number);
+    if (fe == NULL) {
+        error_setg(errp, "failed to generate frontend path");
+        goto fail;
+    }
+
+    if (xs_node_scanf(xsh, XBT_NULL, fe, "type", errp, "%ms", &type) != 1) {
+        error_prepend(errp, "failed to read console device type: ");
+        goto fail;
+    }
+
+    if (strcmp(type, "ioemu")) {
+        error_setg(errp, "declining to handle console type '%s'",
+                   type);
+        goto fail;
+    }
+
+    xendev = XEN_DEVICE(qdev_new(TYPE_XEN_CONSOLE_DEVICE));
+    con = XEN_CONSOLE_DEVICE(xendev);
+
+    con->dev = number;
+
+    snprintf(label, sizeof(label), "xencons%ld", number);
+
+    if (xs_node_scanf(xsh, XBT_NULL, fe, "output", NULL, "%ms", &output) == 1) {
+        /*
+         * FIXME: sure we want to support implicit
+         * muxed monitors here?
+         */
+        cd = qemu_chr_new_mux_mon(label, output, NULL);
+        if (!cd) {
+            error_setg(errp, "console: No valid chardev found at '%s': ",
+                       output);
+            goto fail;
         }
-        con->sring = NULL;
+    } else if (number) {
+        cd = serial_hd(number);
+        if (!cd) {
+            error_prepend(errp, "console: No serial device #%ld found: ",
+                          number);
+            goto fail;
+        }
+    } else {
+        /* No 'output' node on primary console: use null. */
+        cd = qemu_chr_new(label, "null", NULL);
+        if (!cd) {
+            error_setg(errp, "console: failed to create null device");
+            goto fail;
+        }
+    }
+
+    if (!qemu_chr_fe_init(&con->chr, cd, errp)) {
+        error_prepend(errp, "console: failed to initialize backing chardev: ");
+        goto fail;
+    }
+
+    if (qdev_realize_and_unref(DEVICE(xendev), BUS(xenbus), errp)) {
+        xen_backend_set_device(backend, xendev);
+        goto done;
+    }
+
+    error_prepend(errp, "realization of console device %lu failed: ",
+                  number);
+
+ fail:
+    if (xendev) {
+        object_unparent(OBJECT(xendev));
     }
+ done:
+    g_free(fe);
+    free(type);
+    free(output);
 }
 
-static void con_event(struct XenLegacyDevice *xendev)
+static void xen_console_device_destroy(XenBackendInstance *backend,
+                                       Error **errp)
 {
-    struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
+    ERRP_GUARD();
+    XenDevice *xendev = xen_backend_get_device(backend);
+    XenConsole *con = XEN_CONSOLE_DEVICE(xendev);
 
-    buffer_append(con);
-    if (con->buffer.size - con->buffer.consumed)
-        xencons_send(con);
-}
+    trace_xen_console_device_destroy(con->dev);
 
-/* -------------------------------------------------------------------- */
+    object_unparent(OBJECT(xendev));
+}
 
-struct XenDevOps xen_console_ops = {
-    .size       = sizeof(struct XenConsole),
-    .flags      = DEVOPS_FLAG_IGNORE_STATE|DEVOPS_FLAG_NEED_GNTDEV,
-    .init       = con_init,
-    .initialise = con_initialise,
-    .event      = con_event,
-    .disconnect = con_disconnect,
+static const XenBackendInfo xen_console_backend_info  = {
+    .type = "console",
+    .create = xen_console_device_create,
+    .destroy = xen_console_device_destroy,
 };
+
+static void xen_console_register_backend(void)
+{
+    xen_backend_register(&xen_console_backend_info);
+}
+
+xen_backend_init(xen_console_register_backend);
diff --git a/hw/xen/xen-legacy-backend.c b/hw/xen/xen-legacy-backend.c
index 4ded3cec23..124dd5f3d6 100644
--- a/hw/xen/xen-legacy-backend.c
+++ b/hw/xen/xen-legacy-backend.c
@@ -623,7 +623,6 @@ void xen_be_init(void)
 
     xen_set_dynamic_sysbus();
 
-    xen_be_register("console", &xen_console_ops);
     xen_be_register("vkbd", &xen_kbdmouse_ops);
 #ifdef CONFIG_VIRTFS
     xen_be_register("9pfs", &xen_9pfs_ops);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 17/28] hw/xen: add support for Xen primary console in emulated mode
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (15 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 16/28] hw/xen: update Xen console to XenDevice model David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  7:44   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 18/28] hw/xen: only remove peers of PCI NICs on unplug David Woodhouse
                   ` (10 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

The primary console is special because the toolstack maps a page into
the guest for its ring, and also allocates the guest-side event channel.
The guest's grant table is even primed to export that page using a known
grant ref#. Add support for all that in emulated mode, so that we can
have a primary console.

For reasons unclear, the backends running under real Xen don't just use
a mapping of the well-known GNTTAB_RESERVED_CONSOLE grant ref (which
would also be in the ring-ref node in XenStore). Instead, the toolstack
sets the ring-ref node of the primary console to the GFN of the guest
page. The backend is expected to handle that special case and map it
with foreignmem operations instead.

We don't have an implementation of foreignmem ops for emulated Xen mode,
so just make it map GNTTAB_RESERVED_CONSOLE instead. This would probably
work for real Xen too, but we can't work out how to make real Xen create
a primary console of type "ioemu" to make QEMU drive it, so we can't
test that; might as well leave it as it is for now under Xen.

Now at last we can boot the Xen PV shim and run PV kernels in QEMU.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/char/xen_console.c             |  78 ++++++++----
 hw/i386/kvm/meson.build           |   1 +
 hw/i386/kvm/trace-events          |   2 +
 hw/i386/kvm/xen-stubs.c           |   8 ++
 hw/i386/kvm/xen_gnttab.c          |   7 +-
 hw/i386/kvm/xen_primary_console.c | 193 ++++++++++++++++++++++++++++++
 hw/i386/kvm/xen_primary_console.h |  23 ++++
 hw/i386/kvm/xen_xenstore.c        |  10 ++
 hw/xen/xen-bus.c                  |   5 +
 include/hw/xen/xen-bus.h          |   1 +
 target/i386/kvm/xen-emu.c         |  23 +++-
 11 files changed, 328 insertions(+), 23 deletions(-)
 create mode 100644 hw/i386/kvm/xen_primary_console.c
 create mode 100644 hw/i386/kvm/xen_primary_console.h

diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index 4a419dc287..5cbee2f184 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -33,6 +33,8 @@
 #include "hw/qdev-properties-system.h"
 #include "hw/xen/interface/io/console.h"
 #include "hw/xen/interface/io/xs_wire.h"
+#include "hw/xen/interface/grant_table.h"
+#include "hw/i386/kvm/xen_primary_console.h"
 #include "trace.h"
 
 struct buffer {
@@ -230,24 +232,47 @@ static bool xen_console_connect(XenDevice *xendev, Error **errp)
         return false;
     }
 
-    if (!con->dev) {
-        xen_pfn_t mfn = (xen_pfn_t)con->ring_ref;
-        con->sring = qemu_xen_foreignmem_map(xendev->frontend_id, NULL,
-                                             PROT_READ | PROT_WRITE,
-                                             1, &mfn, NULL);
-        if (!con->sring) {
-            error_setg(errp, "failed to map console page");
-            return false;
+    switch (con->dev) {
+    case 0:
+        /*
+         * The primary console is special. For real Xen the ring-ref is
+         * actually a GFN which needs to be mapped as foreignmem.
+         */
+        if (xen_mode != XEN_EMULATE) {
+            xen_pfn_t mfn = (xen_pfn_t)con->ring_ref;
+            con->sring = qemu_xen_foreignmem_map(xendev->frontend_id, NULL,
+                                                 PROT_READ | PROT_WRITE,
+                                                 1, &mfn, NULL);
+            if (!con->sring) {
+                error_setg(errp, "failed to map console page");
+                return false;
+            }
+            break;
         }
-    } else {
+
+        /*
+         * For Xen emulation, we still follow the convention of ring-ref
+         * holding the GFN, but we map the fixed GNTTAB_RESERVED_CONSOLE
+         * grant ref because there is no implementation of foreignmem
+         * operations for emulated mode. The emulation code which handles
+         * the guest-side page and event channel also needs to be informed
+         * of the backend event channel port, in order to reconnect to it
+         * after a soft reset.
+         */
+        xen_primary_console_set_be_port(
+            xen_event_channel_get_local_port(con->event_channel));
+        con->ring_ref = GNTTAB_RESERVED_CONSOLE;
+        /* fallthrough */
+    default:
         con->sring = xen_device_map_grant_refs(xendev,
                                                &con->ring_ref, 1,
                                                PROT_READ | PROT_WRITE,
                                                errp);
         if (!con->sring) {
-            error_prepend(errp, "failed to map grant ref: ");
+            error_prepend(errp, "failed to map console grant ref: ");
             return false;
         }
+        break;
     }
 
     trace_xen_console_connect(con->dev, con->ring_ref, port,
@@ -272,10 +297,14 @@ static void xen_console_disconnect(XenDevice *xendev, Error **errp)
         xen_device_unbind_event_channel(xendev, con->event_channel,
                                         errp);
         con->event_channel = NULL;
+
+        if (xen_mode == XEN_EMULATE && !con->dev) {
+            xen_primary_console_set_be_port(0);
+        }
     }
 
     if (con->sring) {
-        if (!con->dev) {
+        if (!con->dev && xen_mode != XEN_EMULATE) {
             qemu_xen_foreignmem_unmap(con->sring, 1);
         } else {
             xen_device_unmap_grant_refs(xendev, con->sring,
@@ -338,14 +367,19 @@ static char *xen_console_get_name(XenDevice *xendev, Error **errp)
     if (con->dev == -1) {
         XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
         char fe_path[XENSTORE_ABS_PATH_MAX + 1];
+        int idx = (xen_mode == XEN_EMULATE) ? 0 : 1;
         char *value;
-        int idx = 1;
 
         /* Theoretically we could go up to INT_MAX here but that's overkill */
         while (idx < 100) {
-            snprintf(fe_path, sizeof(fe_path),
-                     "/local/domain/%u/device/console/%u",
-                     xendev->frontend_id, idx);
+            if (!idx) {
+                snprintf(fe_path, sizeof(fe_path),
+                         "/local/domain/%u/console", xendev->frontend_id);
+            } else {
+                snprintf(fe_path, sizeof(fe_path),
+                         "/local/domain/%u/device/console/%u",
+                         xendev->frontend_id, idx);
+            }
             value = qemu_xen_xs_read(xenbus->xsh, XBT_NULL, fe_path, NULL);
             if (!value) {
                 if (errno == ENOENT) {
@@ -400,11 +434,15 @@ static void xen_console_realize(XenDevice *xendev, Error **errp)
      * be mapped directly as foreignmem (not a grant ref), and the guest port
      * was allocated *for* the guest by the toolstack. The guest gets these
      * through HVMOP_get_param and can use the console long before it's got
-     * XenStore up and running. We cannot create those for a Xen guest.
+     * XenStore up and running. We cannot create those for a true Xen guest,
+     * but we can for Xen emulation.
      */
     if (!con->dev) {
-        if (xen_device_frontend_scanf(xendev, "ring-ref", "%u", &u) != 1 ||
-            xen_device_frontend_scanf(xendev, "port", "%u", &u) != 1) {
+        if (xen_mode == XEN_EMULATE) {
+            xen_primary_console_create();
+        } else if (xen_device_frontend_scanf(xendev, "ring-ref", "%u", &u)
+                   != 1 ||
+                   xen_device_frontend_scanf(xendev, "port", "%u", &u) != 1) {
             error_setg(errp, "cannot create primary Xen console");
             return;
         }
@@ -417,8 +455,8 @@ static void xen_console_realize(XenDevice *xendev, Error **errp)
         xen_device_frontend_printf(xendev, "tty", "%s", cs->filename + 4);
     }
 
-    /* No normal PV driver initialization for the primary console */
-    if (!con->dev) {
+    /* No normal PV driver initialization for the primary console under Xen */
+    if (!con->dev && xen_mode != XEN_EMULATE) {
         xen_console_connect(xendev, errp);
     }
 }
diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build
index ab143d6474..a4a2e23c06 100644
--- a/hw/i386/kvm/meson.build
+++ b/hw/i386/kvm/meson.build
@@ -9,6 +9,7 @@ i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files(
   'xen_evtchn.c',
   'xen_gnttab.c',
   'xen_xenstore.c',
+  'xen_primary_console.c',
   'xenstore_impl.c',
   ))
 
diff --git a/hw/i386/kvm/trace-events b/hw/i386/kvm/trace-events
index e4c82de6f3..67bf7f174e 100644
--- a/hw/i386/kvm/trace-events
+++ b/hw/i386/kvm/trace-events
@@ -18,3 +18,5 @@ xenstore_watch(const char *path, const char *token) "path %s token %s"
 xenstore_unwatch(const char *path, const char *token) "path %s token %s"
 xenstore_reset_watches(void) ""
 xenstore_watch_event(const char *path, const char *token) "path %s token %s"
+xen_primary_console_create(void) ""
+xen_primary_console_reset(int port) "port %u"
diff --git a/hw/i386/kvm/xen-stubs.c b/hw/i386/kvm/xen-stubs.c
index ae406e0b02..d03131e686 100644
--- a/hw/i386/kvm/xen-stubs.c
+++ b/hw/i386/kvm/xen-stubs.c
@@ -15,6 +15,7 @@
 #include "qapi/qapi-commands-misc-target.h"
 
 #include "xen_evtchn.h"
+#include "xen_primary_console.h"
 
 void xen_evtchn_snoop_msi(PCIDevice *dev, bool is_msix, unsigned int vector,
                           uint64_t addr, uint32_t data, bool is_masked)
@@ -30,6 +31,13 @@ bool xen_evtchn_deliver_pirq_msi(uint64_t address, uint32_t data)
     return false;
 }
 
+void xen_primary_console_create(void)
+{
+}
+
+void xen_primary_console_set_be_port(uint16_t port)
+{
+}
 #ifdef TARGET_I386
 EvtchnInfoList *qmp_xen_event_list(Error **errp)
 {
diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c
index 839ec920a1..0a24f53f20 100644
--- a/hw/i386/kvm/xen_gnttab.c
+++ b/hw/i386/kvm/xen_gnttab.c
@@ -25,6 +25,7 @@
 #include "hw/xen/xen_backend_ops.h"
 #include "xen_overlay.h"
 #include "xen_gnttab.h"
+#include "xen_primary_console.h"
 
 #include "sysemu/kvm.h"
 #include "sysemu/kvm_xen.h"
@@ -537,9 +538,13 @@ int xen_gnttab_reset(void)
     s->nr_frames = 0;
 
     memset(s->entries.v1, 0, XEN_PAGE_SIZE * s->max_frames);
-
     s->entries.v1[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
     s->entries.v1[GNTTAB_RESERVED_XENSTORE].frame = XEN_SPECIAL_PFN(XENSTORE);
 
+    if (xen_primary_console_get_pfn()) {
+        s->entries.v1[GNTTAB_RESERVED_CONSOLE].flags = GTF_permit_access;
+        s->entries.v1[GNTTAB_RESERVED_CONSOLE].frame = XEN_SPECIAL_PFN(CONSOLE);
+    }
+
     return 0;
 }
diff --git a/hw/i386/kvm/xen_primary_console.c b/hw/i386/kvm/xen_primary_console.c
new file mode 100644
index 0000000000..abe79f565b
--- /dev/null
+++ b/hw/i386/kvm/xen_primary_console.c
@@ -0,0 +1,193 @@
+/*
+ * QEMU Xen emulation: Primary console support
+ *
+ * Copyright © 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * Authors: David Woodhouse <dwmw2@infradead.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+
+#include "hw/sysbus.h"
+#include "hw/xen/xen.h"
+#include "hw/xen/xen_backend_ops.h"
+#include "xen_evtchn.h"
+#include "xen_overlay.h"
+#include "xen_primary_console.h"
+
+#include "sysemu/kvm.h"
+#include "sysemu/kvm_xen.h"
+
+#include "trace.h"
+
+#include "hw/xen/interface/event_channel.h"
+#include "hw/xen/interface/grant_table.h"
+
+#define TYPE_XEN_PRIMARY_CONSOLE "xen-primary-console"
+OBJECT_DECLARE_SIMPLE_TYPE(XenPrimaryConsoleState, XEN_PRIMARY_CONSOLE)
+
+struct XenPrimaryConsoleState {
+    /*< private >*/
+    SysBusDevice busdev;
+    /*< public >*/
+
+    MemoryRegion console_page;
+    void *cp;
+
+    evtchn_port_t guest_port;
+    evtchn_port_t be_port;
+
+    struct xengntdev_handle *gt;
+    void *granted_xs;
+};
+
+struct XenPrimaryConsoleState *xen_primary_console_singleton;
+
+static void xen_primary_console_realize(DeviceState *dev, Error **errp)
+{
+    XenPrimaryConsoleState *s = XEN_PRIMARY_CONSOLE(dev);
+
+    if (xen_mode != XEN_EMULATE) {
+        error_setg(errp, "Xen primary console support is for Xen emulation");
+        return;
+    }
+
+    memory_region_init_ram(&s->console_page, OBJECT(dev), "xen:console_page",
+                           XEN_PAGE_SIZE, &error_abort);
+    memory_region_set_enabled(&s->console_page, true);
+    s->cp = memory_region_get_ram_ptr(&s->console_page);
+    memset(s->cp, 0, XEN_PAGE_SIZE);
+
+    /* We can't map it this early as KVM isn't ready */
+    xen_primary_console_singleton = s;
+}
+
+static void xen_primary_console_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->realize = xen_primary_console_realize;
+}
+
+static const TypeInfo xen_primary_console_info = {
+    .name          = TYPE_XEN_PRIMARY_CONSOLE,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XenPrimaryConsoleState),
+    .class_init    = xen_primary_console_class_init,
+};
+
+
+void xen_primary_console_create(void)
+{
+    DeviceState *dev = sysbus_create_simple(TYPE_XEN_PRIMARY_CONSOLE, -1, NULL);
+
+    trace_xen_primary_console_create();
+
+    xen_primary_console_singleton = XEN_PRIMARY_CONSOLE(dev);
+
+    /*
+     * Defer the init (xen_primary_console_reset()) until KVM is set up and the
+     * overlay page can be mapped.
+     */
+}
+
+static void xen_primary_console_register_types(void)
+{
+    type_register_static(&xen_primary_console_info);
+}
+
+type_init(xen_primary_console_register_types)
+
+uint16_t xen_primary_console_get_port(void)
+{
+    XenPrimaryConsoleState *s = xen_primary_console_singleton;
+    if (!s) {
+        return 0;
+    }
+    return s->guest_port;
+}
+
+void xen_primary_console_set_be_port(uint16_t port)
+{
+    XenPrimaryConsoleState *s = xen_primary_console_singleton;
+    if (s) {
+        s->be_port = port;
+    }
+}
+
+uint64_t xen_primary_console_get_pfn(void)
+{
+    XenPrimaryConsoleState *s = xen_primary_console_singleton;
+    if (!s) {
+        return 0;
+    }
+    return XEN_SPECIAL_PFN(CONSOLE);
+}
+
+void *xen_primary_console_get_map(void)
+{
+    XenPrimaryConsoleState *s = xen_primary_console_singleton;
+    if (!s) {
+        return 0;
+    }
+    return s->cp;
+}
+
+static void alloc_guest_port(XenPrimaryConsoleState *s)
+{
+    struct evtchn_alloc_unbound alloc = {
+        .dom = DOMID_SELF,
+        .remote_dom = DOMID_QEMU,
+    };
+
+    if (!xen_evtchn_alloc_unbound_op(&alloc)) {
+        s->guest_port = alloc.port;
+    }
+}
+
+static void rebind_guest_port(XenPrimaryConsoleState *s)
+{
+    struct evtchn_bind_interdomain inter = {
+        .remote_dom = DOMID_QEMU,
+        .remote_port = s->be_port,
+    };
+
+    if (!xen_evtchn_bind_interdomain_op(&inter)) {
+        s->guest_port = inter.local_port;
+    }
+
+    s->be_port = 0;
+}
+
+int xen_primary_console_reset(void)
+{
+    XenPrimaryConsoleState *s = xen_primary_console_singleton;
+    if (!s) {
+        return 0;
+    }
+
+    if (!memory_region_is_mapped(&s->console_page)) {
+        uint64_t gpa = XEN_SPECIAL_PFN(CONSOLE) << TARGET_PAGE_BITS;
+        xen_overlay_do_map_page(&s->console_page, gpa);
+    }
+
+    if (s->be_port) {
+        rebind_guest_port(s);
+    } else {
+        alloc_guest_port(s);
+    }
+
+    trace_xen_primary_console_reset(s->guest_port);
+
+    s->gt = qemu_xen_gnttab_open();
+    uint32_t xs_gntref = GNTTAB_RESERVED_CONSOLE;
+    s->granted_xs = qemu_xen_gnttab_map_refs(s->gt, 1, xen_domid, &xs_gntref,
+                                             PROT_READ | PROT_WRITE);
+
+    return 0;
+}
diff --git a/hw/i386/kvm/xen_primary_console.h b/hw/i386/kvm/xen_primary_console.h
new file mode 100644
index 0000000000..7e2989ea0d
--- /dev/null
+++ b/hw/i386/kvm/xen_primary_console.h
@@ -0,0 +1,23 @@
+/*
+ * QEMU Xen emulation: Primary console support
+ *
+ * Copyright © 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ *
+ * Authors: David Woodhouse <dwmw2@infradead.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_XEN_PRIMARY_CONSOLE_H
+#define QEMU_XEN_PRIMARY_CONSOLE_H
+
+void xen_primary_console_create(void);
+int xen_primary_console_reset(void);
+
+uint16_t xen_primary_console_get_port(void);
+void xen_primary_console_set_be_port(uint16_t port);
+uint64_t xen_primary_console_get_pfn(void);
+void *xen_primary_console_get_map(void);
+
+#endif /* QEMU_XEN_PRIMARY_CONSOLE_H */
diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index b7c0407765..6e651960b3 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -25,6 +25,7 @@
 #include "hw/xen/xen_backend_ops.h"
 #include "xen_overlay.h"
 #include "xen_evtchn.h"
+#include "xen_primary_console.h"
 #include "xen_xenstore.h"
 
 #include "sysemu/kvm.h"
@@ -1434,6 +1435,7 @@ static void alloc_guest_port(XenXenstoreState *s)
 int xen_xenstore_reset(void)
 {
     XenXenstoreState *s = xen_xenstore_singleton;
+    int console_port;
     GList *perms;
     int err;
 
@@ -1470,6 +1472,14 @@ int xen_xenstore_reset(void)
     relpath_printf(s, perms, "store/ring-ref", "%lu",
                    XEN_SPECIAL_PFN(XENSTORE));
 
+    console_port = xen_primary_console_get_port();
+    if (console_port) {
+        relpath_printf(s, perms, "console/ring-ref", "%lu",
+                       XEN_SPECIAL_PFN(CONSOLE));
+        relpath_printf(s, perms, "console/port", "%u", console_port);
+        relpath_printf(s, perms, "console/state", "%u", XenbusStateInitialised);
+    }
+
     g_list_free_full(perms, g_free);
 
     /*
diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index 3ffd1a5333..cc6f1b362f 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -922,6 +922,11 @@ void xen_device_notify_event_channel(XenDevice *xendev,
     }
 }
 
+unsigned int xen_event_channel_get_local_port(XenEventChannel *channel)
+{
+    return channel->local_port;
+}
+
 void xen_device_unbind_event_channel(XenDevice *xendev,
                                      XenEventChannel *channel,
                                      Error **errp)
diff --git a/include/hw/xen/xen-bus.h b/include/hw/xen/xen-bus.h
index eb440880b5..38d40afa37 100644
--- a/include/hw/xen/xen-bus.h
+++ b/include/hw/xen/xen-bus.h
@@ -131,5 +131,6 @@ void xen_device_notify_event_channel(XenDevice *xendev,
 void xen_device_unbind_event_channel(XenDevice *xendev,
                                      XenEventChannel *channel,
                                      Error **errp);
+unsigned int xen_event_channel_get_local_port(XenEventChannel *channel);
 
 #endif /* HW_XEN_BUS_H */
diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 1dc9ab0d91..c0631f9cf4 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -28,6 +28,7 @@
 #include "hw/i386/kvm/xen_overlay.h"
 #include "hw/i386/kvm/xen_evtchn.h"
 #include "hw/i386/kvm/xen_gnttab.h"
+#include "hw/i386/kvm/xen_primary_console.h"
 #include "hw/i386/kvm/xen_xenstore.h"
 
 #include "hw/xen/interface/version.h"
@@ -182,7 +183,8 @@ int kvm_xen_init(KVMState *s, uint32_t hypercall_msr)
         return ret;
     }
 
-    /* The page couldn't be overlaid until KVM was initialized */
+    /* The pages couldn't be overlaid until KVM was initialized */
+    xen_primary_console_reset();
     xen_xenstore_reset();
 
     return 0;
@@ -812,11 +814,23 @@ static bool handle_get_param(struct kvm_xen_exit *exit, X86CPU *cpu,
     case HVM_PARAM_STORE_EVTCHN:
         hp.value = xen_xenstore_get_port();
         break;
+    case HVM_PARAM_CONSOLE_PFN:
+        hp.value = xen_primary_console_get_pfn();
+        if (!hp.value) {
+            err = -EINVAL;
+        }
+        break;
+    case HVM_PARAM_CONSOLE_EVTCHN:
+        hp.value = xen_primary_console_get_port();
+        if (!hp.value) {
+            err = -EINVAL;
+        }
+        break;
     default:
         return false;
     }
 
-    if (kvm_copy_to_gva(cs, arg, &hp, sizeof(hp))) {
+    if (!err && kvm_copy_to_gva(cs, arg, &hp, sizeof(hp))) {
         err = -EFAULT;
     }
 out:
@@ -1427,6 +1441,11 @@ int kvm_xen_soft_reset(void)
         return err;
     }
 
+    err = xen_primary_console_reset();
+    if (err) {
+        return err;
+    }
+
     err = xen_xenstore_reset();
     if (err) {
         return err;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 18/28] hw/xen: only remove peers of PCI NICs on unplug
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (16 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 17/28] hw/xen: add support for Xen primary console in emulated mode David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  8:29   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 19/28] hw/xen: update Xen PV NIC to XenDevice model David Woodhouse
                   ` (9 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

When the Xen guest asks to unplug *emulated* NICs, it's kind of unhelpful
also to unplug the peer of the *Xen* PV NIC.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/i386/xen/xen_platform.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c
index 17457ff3de..e2dd1b536a 100644
--- a/hw/i386/xen/xen_platform.c
+++ b/hw/i386/xen/xen_platform.c
@@ -140,9 +140,14 @@ static void unplug_nic(PCIBus *b, PCIDevice *d, void *o)
 /* Remove the peer of the NIC device. Normally, this would be a tap device. */
 static void del_nic_peer(NICState *nic, void *opaque)
 {
-    NetClientState *nc;
+    NetClientState *nc = qemu_get_queue(nic);
+    ObjectClass *klass = module_object_class_by_name(nc->model);
+
+    /* Only delete peers of PCI NICs that we're about to delete */
+    if (!klass || !object_class_dynamic_cast(klass, TYPE_PCI_DEVICE)) {
+        return;
+    }
 
-    nc = qemu_get_queue(nic);
     if (nc->peer)
         qemu_del_net_client(nc->peer);
 }
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 19/28] hw/xen: update Xen PV NIC to XenDevice model
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (17 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 18/28] hw/xen: only remove peers of PCI NICs on unplug David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  8:42   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 20/28] net: do not delete nics in net_cleanup() David Woodhouse
                   ` (8 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

This allows us to use Xen PV networking with emulated Xen guests, and to
add them on the command line or hotplug.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/net/meson.build        |   2 +-
 hw/net/trace-events       |  11 +
 hw/net/xen_nic.c          | 484 +++++++++++++++++++++++++++++---------
 hw/xenpv/xen_machine_pv.c |   1 -
 4 files changed, 381 insertions(+), 117 deletions(-)

diff --git a/hw/net/meson.build b/hw/net/meson.build
index 2632634df3..f64651c467 100644
--- a/hw/net/meson.build
+++ b/hw/net/meson.build
@@ -1,5 +1,5 @@
 system_ss.add(when: 'CONFIG_DP8393X', if_true: files('dp8393x.c'))
-system_ss.add(when: 'CONFIG_XEN', if_true: files('xen_nic.c'))
+system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen_nic.c'))
 system_ss.add(when: 'CONFIG_NE2000_COMMON', if_true: files('ne2000.c'))
 
 # PCI network cards
diff --git a/hw/net/trace-events b/hw/net/trace-events
index 3abfd65e5b..3097742cc0 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -482,3 +482,14 @@ dp8393x_receive_oversize(int size) "oversize packet, pkt_size is %d"
 dp8393x_receive_not_netcard(void) "packet not for netcard"
 dp8393x_receive_packet(int crba) "Receive packet at 0x%"PRIx32
 dp8393x_receive_write_status(int crba) "Write status at 0x%"PRIx32
+
+# xen_nic.c
+xen_netdev_realize(int dev, const char *info, const char *peer) "vif%u info '%s' peer '%s'"
+xen_netdev_unrealize(int dev) "vif%u"
+xen_netdev_create(int dev) "vif%u"
+xen_netdev_destroy(int dev) "vif%u"
+xen_netdev_disconnect(int dev) "vif%u"
+xen_netdev_connect(int dev, unsigned int tx, unsigned int rx, int port) "vif%u tx %u rx %u port %u"
+xen_netdev_frontend_changed(const char *dev, int state) "vif%s state %d"
+xen_netdev_tx(int dev, int ref, int off, int len, unsigned int flags, const char *c, const char *d, const char *m, const char *e) "vif%u ref %u off %u len %u flags 0x%x%s%s%s%s"
+xen_netdev_rx(int dev, int idx, int status, int flags) "vif%u idx %d status %d flags 0x%x"
diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
index 9bbf6599fc..af4ba3f1e6 100644
--- a/hw/net/xen_nic.c
+++ b/hw/net/xen_nic.c
@@ -20,6 +20,13 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "qemu/cutils.h"
+#include "qemu/log.h"
+#include "qemu/qemu-print.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/error.h"
+
 #include <sys/socket.h>
 #include <sys/ioctl.h>
 #include <sys/wait.h>
@@ -27,18 +34,26 @@
 #include "net/net.h"
 #include "net/checksum.h"
 #include "net/util.h"
-#include "hw/xen/xen-legacy-backend.h"
+
+#include "hw/xen/xen-backend.h"
+#include "hw/xen/xen-bus-helper.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
 
 #include "hw/xen/interface/io/netif.h"
+#include "hw/xen/interface/io/xs_wire.h"
+
+#include "trace.h"
 
 /* ------------------------------------------------------------- */
 
 struct XenNetDev {
-    struct XenLegacyDevice      xendev;  /* must be first */
-    char                  *mac;
+    struct XenDevice      xendev;  /* must be first */
+    XenEventChannel       *event_channel;
+    int                   dev;
     int                   tx_work;
-    int                   tx_ring_ref;
-    int                   rx_ring_ref;
+    unsigned int          tx_ring_ref;
+    unsigned int          rx_ring_ref;
     struct netif_tx_sring *txs;
     struct netif_rx_sring *rxs;
     netif_tx_back_ring_t  tx_ring;
@@ -47,6 +62,11 @@ struct XenNetDev {
     NICState              *nic;
 };
 
+typedef struct XenNetDev XenNetDev;
+
+#define TYPE_XEN_NET_DEVICE "xen-net-device"
+OBJECT_DECLARE_SIMPLE_TYPE(XenNetDev, XEN_NET_DEVICE)
+
 /* ------------------------------------------------------------- */
 
 static void net_tx_response(struct XenNetDev *netdev, netif_tx_request_t *txp, int8_t st)
@@ -68,7 +88,8 @@ static void net_tx_response(struct XenNetDev *netdev, netif_tx_request_t *txp, i
     netdev->tx_ring.rsp_prod_pvt = ++i;
     RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->tx_ring, notify);
     if (notify) {
-        xen_pv_send_notify(&netdev->xendev);
+        xen_device_notify_event_channel(XEN_DEVICE(netdev),
+                                        netdev->event_channel, NULL);
     }
 
     if (i == netdev->tx_ring.req_cons) {
@@ -104,13 +125,16 @@ static void net_tx_error(struct XenNetDev *netdev, netif_tx_request_t *txp, RING
 #endif
 }
 
-static void net_tx_packets(struct XenNetDev *netdev)
+static bool net_tx_packets(struct XenNetDev *netdev)
 {
+    bool done_something = false;
     netif_tx_request_t txreq;
     RING_IDX rc, rp;
     void *page;
     void *tmpbuf = NULL;
 
+    assert(qemu_mutex_iothread_locked());
+
     for (;;) {
         rc = netdev->tx_ring.req_cons;
         rp = netdev->tx_ring.sring->req_prod;
@@ -122,49 +146,52 @@ static void net_tx_packets(struct XenNetDev *netdev)
             }
             memcpy(&txreq, RING_GET_REQUEST(&netdev->tx_ring, rc), sizeof(txreq));
             netdev->tx_ring.req_cons = ++rc;
+            done_something = true;
 
 #if 1
             /* should not happen in theory, we don't announce the *
              * feature-{sg,gso,whatelse} flags in xenstore (yet?) */
             if (txreq.flags & NETTXF_extra_info) {
-                xen_pv_printf(&netdev->xendev, 0, "FIXME: extra info flag\n");
+                qemu_log_mask(LOG_UNIMP, "vif%u: FIXME: extra info flag\n",
+                              netdev->dev);
                 net_tx_error(netdev, &txreq, rc);
                 continue;
             }
             if (txreq.flags & NETTXF_more_data) {
-                xen_pv_printf(&netdev->xendev, 0, "FIXME: more data flag\n");
+                qemu_log_mask(LOG_UNIMP, "vif%u: FIXME: more data flag\n",
+                              netdev->dev);
                 net_tx_error(netdev, &txreq, rc);
                 continue;
             }
 #endif
 
             if (txreq.size < 14) {
-                xen_pv_printf(&netdev->xendev, 0, "bad packet size: %d\n",
-                              txreq.size);
+                qemu_log_mask(LOG_GUEST_ERROR, "vif%u: bad packet size: %d\n",
+                              netdev->dev, txreq.size);
                 net_tx_error(netdev, &txreq, rc);
                 continue;
             }
 
             if ((txreq.offset + txreq.size) > XEN_PAGE_SIZE) {
-                xen_pv_printf(&netdev->xendev, 0, "error: page crossing\n");
+                qemu_log_mask(LOG_GUEST_ERROR, "vif%u: error: page crossing\n",
+                              netdev->dev);
                 net_tx_error(netdev, &txreq, rc);
                 continue;
             }
 
-            xen_pv_printf(&netdev->xendev, 3,
-                          "tx packet ref %d, off %d, len %d, flags 0x%x%s%s%s%s\n",
-                          txreq.gref, txreq.offset, txreq.size, txreq.flags,
-                          (txreq.flags & NETTXF_csum_blank)     ? " csum_blank"     : "",
-                          (txreq.flags & NETTXF_data_validated) ? " data_validated" : "",
-                          (txreq.flags & NETTXF_more_data)      ? " more_data"      : "",
-                          (txreq.flags & NETTXF_extra_info)     ? " extra_info"     : "");
+            trace_xen_netdev_tx(netdev->dev, txreq.gref, txreq.offset,
+                                txreq.size, txreq.flags,
+                                (txreq.flags & NETTXF_csum_blank)     ? " csum_blank"     : "",
+                                (txreq.flags & NETTXF_data_validated) ? " data_validated" : "",
+                                (txreq.flags & NETTXF_more_data)      ? " more_data"      : "",
+                                (txreq.flags & NETTXF_extra_info)     ? " extra_info"     : "");
 
-            page = xen_be_map_grant_ref(&netdev->xendev, txreq.gref,
-                                        PROT_READ);
+            page = xen_device_map_grant_refs(&netdev->xendev, &txreq.gref, 1,
+                                             PROT_READ, NULL);
             if (page == NULL) {
-                xen_pv_printf(&netdev->xendev, 0,
-                              "error: tx gref dereference failed (%d)\n",
-                             txreq.gref);
+                qemu_log_mask(LOG_GUEST_ERROR,
+                              "vif%u: tx gref dereference failed (%d)\n",
+                              netdev->dev, txreq.gref);
                 net_tx_error(netdev, &txreq, rc);
                 continue;
             }
@@ -181,7 +208,8 @@ static void net_tx_packets(struct XenNetDev *netdev)
                 qemu_send_packet(qemu_get_queue(netdev->nic),
                                  page + txreq.offset, txreq.size);
             }
-            xen_be_unmap_grant_ref(&netdev->xendev, page, txreq.gref);
+            xen_device_unmap_grant_refs(&netdev->xendev, page, &txreq.gref, 1,
+                                        NULL);
             net_tx_response(netdev, &txreq, NETIF_RSP_OKAY);
         }
         if (!netdev->tx_work) {
@@ -190,6 +218,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
         netdev->tx_work = 0;
     }
     g_free(tmpbuf);
+    return done_something;
 }
 
 /* ------------------------------------------------------------- */
@@ -212,14 +241,13 @@ static void net_rx_response(struct XenNetDev *netdev,
         resp->status = (int16_t)st;
     }
 
-    xen_pv_printf(&netdev->xendev, 3,
-                  "rx response: idx %d, status %d, flags 0x%x\n",
-                  i, resp->status, resp->flags);
+    trace_xen_netdev_rx(netdev->dev, i, resp->status, resp->flags);
 
     netdev->rx_ring.rsp_prod_pvt = ++i;
     RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->rx_ring, notify);
     if (notify) {
-        xen_pv_send_notify(&netdev->xendev);
+        xen_device_notify_event_channel(XEN_DEVICE(netdev),
+                                        netdev->event_channel, NULL);
     }
 }
 
@@ -232,7 +260,9 @@ static ssize_t net_rx_packet(NetClientState *nc, const uint8_t *buf, size_t size
     RING_IDX rc, rp;
     void *page;
 
-    if (netdev->xendev.be_state != XenbusStateConnected) {
+    assert(qemu_mutex_iothread_locked());
+
+    if (xen_device_backend_get_state(&netdev->xendev) != XenbusStateConnected) {
         return -1;
     }
 
@@ -244,24 +274,26 @@ static ssize_t net_rx_packet(NetClientState *nc, const uint8_t *buf, size_t size
         return 0;
     }
     if (size > XEN_PAGE_SIZE - NET_IP_ALIGN) {
-        xen_pv_printf(&netdev->xendev, 0, "packet too big (%lu > %ld)",
-                      (unsigned long)size, XEN_PAGE_SIZE - NET_IP_ALIGN);
+        qemu_log_mask(LOG_GUEST_ERROR, "vif%u: packet too big (%lu > %ld)",
+                      netdev->dev, (unsigned long)size,
+                      XEN_PAGE_SIZE - NET_IP_ALIGN);
         return -1;
     }
 
     memcpy(&rxreq, RING_GET_REQUEST(&netdev->rx_ring, rc), sizeof(rxreq));
     netdev->rx_ring.req_cons = ++rc;
 
-    page = xen_be_map_grant_ref(&netdev->xendev, rxreq.gref, PROT_WRITE);
+    page = xen_device_map_grant_refs(&netdev->xendev, &rxreq.gref, 1,
+                                     PROT_WRITE, NULL);
     if (page == NULL) {
-        xen_pv_printf(&netdev->xendev, 0,
-                      "error: rx gref dereference failed (%d)\n",
-                      rxreq.gref);
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "vif%u: rx gref dereference failed (%d)\n",
+                      netdev->dev, rxreq.gref);
         net_rx_response(netdev, &rxreq, NETIF_RSP_ERROR, 0, 0, 0);
         return -1;
     }
     memcpy(page + NET_IP_ALIGN, buf, size);
-    xen_be_unmap_grant_ref(&netdev->xendev, page, rxreq.gref);
+    xen_device_unmap_grant_refs(&netdev->xendev, page, &rxreq.gref, 1, NULL);
     net_rx_response(netdev, &rxreq, NETIF_RSP_OKAY, NET_IP_ALIGN, size, 0);
 
     return size;
@@ -275,139 +307,361 @@ static NetClientInfo net_xen_info = {
     .receive = net_rx_packet,
 };
 
-static int net_init(struct XenLegacyDevice *xendev)
+static void xen_netdev_realize(XenDevice *xendev, Error **errp)
 {
-    struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
+    ERRP_GUARD();
+    XenNetDev *netdev = XEN_NET_DEVICE(xendev);
+    NetClientState *nc;
 
-    /* read xenstore entries */
-    if (netdev->mac == NULL) {
-        netdev->mac = xenstore_read_be_str(&netdev->xendev, "mac");
-    }
-
-    /* do we have all we need? */
-    if (netdev->mac == NULL) {
-        return -1;
-    }
+    qemu_macaddr_default_if_unset(&netdev->conf.macaddr);
 
-    if (net_parse_macaddr(netdev->conf.macaddr.a, netdev->mac) < 0) {
-        return -1;
-    }
+    xen_device_frontend_printf(xendev, "mac", "%02x:%02x:%02x:%02x:%02x:%02x",
+                               netdev->conf.macaddr.a[0],
+                               netdev->conf.macaddr.a[1],
+                               netdev->conf.macaddr.a[2],
+                               netdev->conf.macaddr.a[3],
+                               netdev->conf.macaddr.a[4],
+                               netdev->conf.macaddr.a[5]);
 
     netdev->nic = qemu_new_nic(&net_xen_info, &netdev->conf,
-                               "xen", NULL, netdev);
+                               object_get_typename(OBJECT(xendev)),
+                               DEVICE(xendev)->id, netdev);
 
-    qemu_set_info_str(qemu_get_queue(netdev->nic),
-                      "nic: xenbus vif macaddr=%s", netdev->mac);
+    nc = qemu_get_queue(netdev->nic);
+    qemu_format_nic_info_str(nc, netdev->conf.macaddr.a);
 
     /* fill info */
-    xenstore_write_be_int(&netdev->xendev, "feature-rx-copy", 1);
-    xenstore_write_be_int(&netdev->xendev, "feature-rx-flip", 0);
+    xen_device_backend_printf(xendev, "feature-rx-copy", "%u", 1);
+    xen_device_backend_printf(xendev, "feature-rx-flip", "%u", 0);
 
-    return 0;
+    trace_xen_netdev_realize(netdev->dev, nc->info_str, nc->peer ?
+                             nc->peer->name : "(none)");
 }
 
-static int net_connect(struct XenLegacyDevice *xendev)
+static bool net_event(void *_xendev)
 {
-    struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
-    int rx_copy;
+    XenNetDev *netdev = XEN_NET_DEVICE(_xendev);
+    bool done_something;
 
-    if (xenstore_read_fe_int(&netdev->xendev, "tx-ring-ref",
-                             &netdev->tx_ring_ref) == -1) {
-        return -1;
+    done_something = net_tx_packets(netdev);
+    qemu_flush_queued_packets(qemu_get_queue(netdev->nic));
+    return done_something;
+}
+
+static bool xen_netdev_connect(XenDevice *xendev, Error **errp)
+{
+    XenNetDev *netdev = XEN_NET_DEVICE(xendev);
+    unsigned int port, rx_copy;
+
+    assert(qemu_mutex_iothread_locked());
+
+    if (xen_device_frontend_scanf(xendev, "tx-ring-ref", "%u",
+                                  &netdev->tx_ring_ref) != 1) {
+        error_setg(errp, "failed to read tx-ring-ref");
+        return false;
     }
-    if (xenstore_read_fe_int(&netdev->xendev, "rx-ring-ref",
-                             &netdev->rx_ring_ref) == -1) {
-        return 1;
+
+    if (xen_device_frontend_scanf(xendev, "rx-ring-ref", "%u",
+                                  &netdev->rx_ring_ref) != 1) {
+        error_setg(errp, "failed to read rx-ring-ref");
+        return false;
     }
-    if (xenstore_read_fe_int(&netdev->xendev, "event-channel",
-                             &netdev->xendev.remote_port) == -1) {
-        return -1;
+
+    if (xen_device_frontend_scanf(xendev, "event-channel", "%u",
+                                  &port) != 1) {
+        error_setg(errp, "failed to read event-channel");
+        return false;
     }
 
-    if (xenstore_read_fe_int(&netdev->xendev, "request-rx-copy", &rx_copy) == -1) {
+    if (xen_device_frontend_scanf(xendev, "request-rx-copy", "%u",
+                                  &rx_copy) != 1) {
         rx_copy = 0;
     }
     if (rx_copy == 0) {
-        xen_pv_printf(&netdev->xendev, 0,
-                      "frontend doesn't support rx-copy.\n");
-        return -1;
+        error_setg(errp, "frontend doesn't support rx-copy");
+        return false;
     }
 
-    netdev->txs = xen_be_map_grant_ref(&netdev->xendev,
-                                       netdev->tx_ring_ref,
-                                       PROT_READ | PROT_WRITE);
+    netdev->txs = xen_device_map_grant_refs(xendev,
+                                            &netdev->tx_ring_ref, 1,
+                                            PROT_READ | PROT_WRITE,
+                                            errp);
     if (!netdev->txs) {
-        return -1;
+        error_prepend(errp, "failed to map tx grant ref: ");
+        return false;
     }
-    netdev->rxs = xen_be_map_grant_ref(&netdev->xendev,
-                                       netdev->rx_ring_ref,
-                                       PROT_READ | PROT_WRITE);
+
+    netdev->rxs = xen_device_map_grant_refs(xendev,
+                                            &netdev->rx_ring_ref, 1,
+                                            PROT_READ | PROT_WRITE,
+                                            errp);
     if (!netdev->rxs) {
-        xen_be_unmap_grant_ref(&netdev->xendev, netdev->txs,
-                               netdev->tx_ring_ref);
-        netdev->txs = NULL;
-        return -1;
+        error_prepend(errp, "failed to map rx grant ref: ");
+        return false;
     }
+
     BACK_RING_INIT(&netdev->tx_ring, netdev->txs, XEN_PAGE_SIZE);
     BACK_RING_INIT(&netdev->rx_ring, netdev->rxs, XEN_PAGE_SIZE);
 
-    xen_be_bind_evtchn(&netdev->xendev);
+    netdev->event_channel = xen_device_bind_event_channel(xendev, port,
+                                                          net_event,
+                                                          netdev,
+                                                          errp);
+    if (!netdev->event_channel) {
+        return false;
+    }
 
-    xen_pv_printf(&netdev->xendev, 1, "ok: tx-ring-ref %d, rx-ring-ref %d, "
-                  "remote port %d, local port %d\n",
-                  netdev->tx_ring_ref, netdev->rx_ring_ref,
-                  netdev->xendev.remote_port, netdev->xendev.local_port);
+    trace_xen_netdev_connect(netdev->dev, netdev->tx_ring_ref,
+                             netdev->rx_ring_ref, port);
 
     net_tx_packets(netdev);
-    return 0;
+    return true;
 }
 
-static void net_disconnect(struct XenLegacyDevice *xendev)
+static void xen_netdev_disconnect(XenDevice *xendev, Error **errp)
 {
-    struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
+    XenNetDev *netdev = XEN_NET_DEVICE(xendev);
+
+    trace_xen_netdev_disconnect(netdev->dev);
+
+    assert(qemu_mutex_iothread_locked());
 
-    xen_pv_unbind_evtchn(&netdev->xendev);
+    netdev->tx_ring.sring = NULL;
+    netdev->rx_ring.sring = NULL;
 
+    if (netdev->event_channel) {
+        xen_device_unbind_event_channel(xendev, netdev->event_channel,
+                                        errp);
+        netdev->event_channel = NULL;
+    }
     if (netdev->txs) {
-        xen_be_unmap_grant_ref(&netdev->xendev, netdev->txs,
-                               netdev->tx_ring_ref);
+        xen_device_unmap_grant_refs(xendev, netdev->txs,
+                                    &netdev->tx_ring_ref, 1, errp);
         netdev->txs = NULL;
     }
     if (netdev->rxs) {
-        xen_be_unmap_grant_ref(&netdev->xendev, netdev->rxs,
-                               netdev->rx_ring_ref);
+        xen_device_unmap_grant_refs(xendev, netdev->rxs,
+                                    &netdev->rx_ring_ref, 1, errp);
         netdev->rxs = NULL;
     }
 }
 
-static void net_event(struct XenLegacyDevice *xendev)
+/* -------------------------------------------------------------------- */
+
+
+static void xen_netdev_frontend_changed(XenDevice *xendev,
+                                       enum xenbus_state frontend_state,
+                                       Error **errp)
 {
-    struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
-    net_tx_packets(netdev);
-    qemu_flush_queued_packets(qemu_get_queue(netdev->nic));
+    ERRP_GUARD();
+    enum xenbus_state backend_state = xen_device_backend_get_state(xendev);
+
+    trace_xen_netdev_frontend_changed(xendev->name, frontend_state);
+
+    switch (frontend_state) {
+    case XenbusStateConnected:
+        if (backend_state == XenbusStateConnected) {
+            break;
+        }
+
+        xen_netdev_disconnect(xendev, errp);
+        if (*errp) {
+            break;
+        }
+
+        if (!xen_netdev_connect(xendev, errp)) {
+            xen_netdev_disconnect(xendev, NULL);
+            xen_device_backend_set_state(xendev, XenbusStateClosing);
+            break;
+        }
+
+        xen_device_backend_set_state(xendev, XenbusStateConnected);
+        break;
+
+    case XenbusStateClosing:
+        xen_device_backend_set_state(xendev, XenbusStateClosing);
+        break;
+
+    case XenbusStateClosed:
+    case XenbusStateUnknown:
+        xen_netdev_disconnect(xendev, errp);
+        if (*errp) {
+            break;
+        }
+
+        xen_device_backend_set_state(xendev, XenbusStateClosed);
+        break;
+
+    case XenbusStateInitialised:
+        /*
+         * Linux netback does nothing on the frontend going (back) to
+         * XenbusStateInitialised, so do the same here.
+         */
+    default:
+        break;
+    }
 }
 
-static int net_free(struct XenLegacyDevice *xendev)
+static char *xen_netdev_get_name(XenDevice *xendev, Error **errp)
 {
-    struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
+    XenNetDev *netdev = XEN_NET_DEVICE(xendev);
+
+    if (netdev->dev == -1) {
+        XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
+        char fe_path[XENSTORE_ABS_PATH_MAX + 1];
+        int idx = (xen_mode == XEN_EMULATE) ? 0 : 1;
+        char *value;
+
+        /* Theoretically we could go up to INT_MAX here but that's overkill */
+        while (idx < 100) {
+            snprintf(fe_path, sizeof(fe_path),
+                     "/local/domain/%u/device/vif/%u",
+                     xendev->frontend_id, idx);
+            value = qemu_xen_xs_read(xenbus->xsh, XBT_NULL, fe_path, NULL);
+            if (!value) {
+                if (errno == ENOENT) {
+                    netdev->dev = idx;
+                    goto found;
+                }
+                error_setg(errp, "cannot read %s: %s", fe_path,
+                           strerror(errno));
+                return NULL;
+            }
+            free(value);
+            idx++;
+        }
+        error_setg(errp, "cannot find device index for netdev device");
+        return NULL;
+    }
+ found:
+    return g_strdup_printf("%u", netdev->dev);
+}
+
+static void xen_netdev_unrealize(XenDevice *xendev)
+{
+    XenNetDev *netdev = XEN_NET_DEVICE(xendev);
+
+    trace_xen_netdev_unrealize(netdev->dev);
+
+    /* Disconnect from the frontend in case this has not already happened */
+    xen_netdev_disconnect(xendev, NULL);
 
     if (netdev->nic) {
         qemu_del_nic(netdev->nic);
-        netdev->nic = NULL;
     }
-    g_free(netdev->mac);
-    netdev->mac = NULL;
-    return 0;
 }
 
 /* ------------------------------------------------------------- */
 
-struct XenDevOps xen_netdev_ops = {
-    .size       = sizeof(struct XenNetDev),
-    .flags      = DEVOPS_FLAG_NEED_GNTDEV,
-    .init       = net_init,
-    .initialise    = net_connect,
-    .event      = net_event,
-    .disconnect = net_disconnect,
-    .free       = net_free,
+static Property xen_netdev_properties[] = {
+    DEFINE_NIC_PROPERTIES(XenNetDev, conf),
+    DEFINE_PROP_INT32("idx", XenNetDev, dev, -1),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void xen_netdev_class_init(ObjectClass *class, void *data)
+{
+    DeviceClass *dev_class = DEVICE_CLASS(class);
+    XenDeviceClass *xendev_class = XEN_DEVICE_CLASS(class);
+
+    xendev_class->backend = "qnic";
+    xendev_class->device = "vif";
+    xendev_class->get_name = xen_netdev_get_name;
+    xendev_class->realize = xen_netdev_realize;
+    xendev_class->frontend_changed = xen_netdev_frontend_changed;
+    xendev_class->unrealize = xen_netdev_unrealize;
+    set_bit(DEVICE_CATEGORY_NETWORK, dev_class->categories);
+    dev_class->user_creatable = true;
+
+    device_class_set_props(dev_class, xen_netdev_properties);
+}
+
+static const TypeInfo xen_net_type_info = {
+    .name = TYPE_XEN_NET_DEVICE,
+    .parent = TYPE_XEN_DEVICE,
+    .instance_size = sizeof(XenNetDev),
+    .class_init = xen_netdev_class_init,
+};
+
+static void xen_net_register_types(void)
+{
+    type_register_static(&xen_net_type_info);
+}
+
+type_init(xen_net_register_types)
+
+/* Called to instantiate a XenNetDev when the backend is detected. */
+static void xen_net_device_create(XenBackendInstance *backend,
+                                  QDict *opts, Error **errp)
+{
+    ERRP_GUARD();
+    XenBus *xenbus = xen_backend_get_bus(backend);
+    const char *name = xen_backend_get_name(backend);
+    XenDevice *xendev = NULL;
+    unsigned long number;
+    const char *macstr;
+    XenNetDev *net;
+    MACAddr mac;
+
+    if (qemu_strtoul(name, NULL, 10, &number) || number >= INT_MAX) {
+        error_setg(errp, "failed to parse name '%s'", name);
+        goto fail;
+    }
+
+    trace_xen_netdev_create(number);
+
+    macstr = qdict_get_try_str(opts, "mac");
+    if (macstr == NULL) {
+        error_setg(errp, "no MAC address found");
+        goto fail;
+    }
+
+    if (net_parse_macaddr(mac.a, macstr) < 0) {
+        error_setg(errp, "failed to parse MAC address");
+        goto fail;
+    }
+
+    xendev = XEN_DEVICE(qdev_new(TYPE_XEN_NET_DEVICE));
+    net = XEN_NET_DEVICE(xendev);
+
+    net->dev = number;
+    memcpy(&net->conf.macaddr, &mac, sizeof(mac));
+
+    if (qdev_realize_and_unref(DEVICE(xendev), BUS(xenbus), errp)) {
+        xen_backend_set_device(backend, xendev);
+        return;
+    }
+
+    error_prepend(errp, "realization of net device %lu failed: ",
+                  number);
+
+ fail:
+    if (xendev) {
+        object_unparent(OBJECT(xendev));
+    }
+}
+
+static void xen_net_device_destroy(XenBackendInstance *backend,
+                                   Error **errp)
+{
+    ERRP_GUARD();
+    XenDevice *xendev = xen_backend_get_device(backend);
+    XenNetDev *netdev = XEN_NET_DEVICE(xendev);
+
+    trace_xen_netdev_destroy(netdev->dev);
+
+    object_unparent(OBJECT(xendev));
+}
+
+static const XenBackendInfo xen_net_backend_info  = {
+    .type = "qnic",
+    .create = xen_net_device_create,
+    .destroy = xen_net_device_destroy,
 };
+
+static void xen_net_register_backend(void)
+{
+    xen_backend_register(&xen_net_backend_info);
+}
+
+xen_backend_init(xen_net_register_backend);
diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 1533f5dfb4..9f9f137f99 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -54,7 +54,6 @@ static void xen_init_pv(MachineState *machine)
     }
 
     xen_be_register("vfb", &xen_framebuffer_ops);
-    xen_be_register("qnic", &xen_netdev_ops);
 
     /* configure framebuffer */
     if (vga_interface_type == VGA_XENFB) {
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 20/28] net: do not delete nics in net_cleanup()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (18 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 19/28] hw/xen: update Xen PV NIC to XenDevice model David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  8:44   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 21/28] xen-platform: unplug AHCI disks David Woodhouse
                   ` (7 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

In net_cleanup() we only need to delete the netdevs, as those may have
state which outlives Qemu when it exits, and thus may actually need to
be cleaned up on exit.

The nics, on the other hand, are owned by the device which created them.
Most devices don't bother to clean up on exit because they don't have
any state which will outlive Qemu... but XenBus devices do need to clean
up their nodes in XenStore, and do have an exit handler to delete them.

When the XenBus exit handler destroys the xen-net-device, it attempts
to delete its nic after net_cleanup() had already done so. And crashes.

Fix this by only deleting netdevs as we walk the list. As the comment
notes, we can't use QTAILQ_FOREACH_SAFE() as each deletion may remove
*multiple* entries, including the "safely" saved 'next' pointer. But
we can store the *previous* entry, since nics are safe.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 net/net.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/net/net.c b/net/net.c
index c0c0cbe99e..bbe33da176 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1499,18 +1499,34 @@ static void net_vm_change_state_handler(void *opaque, bool running,
 
 void net_cleanup(void)
 {
-    NetClientState *nc;
+    NetClientState *nc, **p = &QTAILQ_FIRST(&net_clients);
 
     /*cleanup colo compare module for COLO*/
     colo_compare_cleanup();
 
-    /* We may del multiple entries during qemu_del_net_client(),
-     * so QTAILQ_FOREACH_SAFE() is also not safe here.
+    /*
+     * Walk the net_clients list and remove the netdevs but *not* any
+     * NET_CLIENT_DRIVER_NIC entries. The latter are owned by the device
+     * model which created them, and in some cases (e.g. xen-net-device)
+     * the device itself may do cleanup at exit and will be upset if we
+     * just delete its NIC from underneath it.
+     *
+     * Since qemu_del_net_client() may delete multiple entries, using
+     * QTAILQ_FOREACH_SAFE() is not safe here. The only safe pointer
+     * to keep as a bookmark is a NET_CLIENT_DRIVER_NIC entry, so keep
+     * 'p' pointing to either the head of the list, or the 'next' field
+     * of the latest NET_CLIENT_DRIVER_NIC, and operate on *p as we walk
+     * the list.
+     *
+     * The 'nc' variable isn't part of the list traversal; it's purely
+     * for convenience as too much '(*p)->' has a tendency to make the
+     * readers' eyes bleed.
      */
-    while (!QTAILQ_EMPTY(&net_clients)) {
-        nc = QTAILQ_FIRST(&net_clients);
+    while (*p) {
+        nc = *p;
         if (nc->info->type == NET_CLIENT_DRIVER_NIC) {
-            qemu_del_nic(qemu_get_nic(nc));
+            /* Skip NET_CLIENT_DRIVER_NIC entries */
+            p = &QTAILQ_NEXT(nc, next);
         } else {
             qemu_del_net_client(nc);
         }
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 21/28] xen-platform: unplug AHCI disks
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (19 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 20/28] net: do not delete nics in net_cleanup() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  9:08   ` Durrant, Paul
  2023-10-25 14:50   ` [PATCH v3 22/28] net: add qemu_{configure, create}_nic_device(), qemu_find_nic_info() David Woodhouse
                   ` (6 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

To support Xen guests using the Q35 chipset, the unplug protocol needs
to also remove AHCI disks.

Make pci_xen_ide_unplug() more generic, iterating over the children
of the PCI device and destroying the "ide-hd" devices. That works the
same for both AHCI and IDE, as does the detection of the primary disk
as unit 0 on the bus named "ide.0".

Then pci_xen_ide_unplug() can be used for both AHCI and IDE devices.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/i386/xen/xen_platform.c | 68 +++++++++++++++++++++++++-------------
 1 file changed, 45 insertions(+), 23 deletions(-)

diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c
index e2dd1b536a..ef7d3fc05f 100644
--- a/hw/i386/xen/xen_platform.c
+++ b/hw/i386/xen/xen_platform.c
@@ -169,39 +169,60 @@ static void pci_unplug_nics(PCIBus *bus)
  *
  * [1] https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/hvm-emulated-unplug.pandoc
  */
-static void pci_xen_ide_unplug(PCIDevice *d, bool aux)
+struct ide_unplug_state {
+    bool aux;
+    int nr_unplugged;
+};
+
+static int ide_dev_unplug(DeviceState *dev, void *_st)
 {
-    DeviceState *dev = DEVICE(d);
-    PCIIDEState *pci_ide;
-    int i;
+    struct ide_unplug_state *st = _st;
     IDEDevice *idedev;
     IDEBus *idebus;
     BlockBackend *blk;
+    int unit;
+
+    idedev = IDE_DEVICE(object_dynamic_cast(OBJECT(dev), "ide-hd"));
+    if (!idedev) {
+        return 0;
+    }
 
-    pci_ide = PCI_IDE(dev);
+    idebus = IDE_BUS(qdev_get_parent_bus(dev));
 
-    for (i = aux ? 1 : 0; i < 4; i++) {
-        idebus = &pci_ide->bus[i / 2];
-        blk = idebus->ifs[i % 2].blk;
+    unit = (idedev == idebus->slave);
+    assert(unit || idedev == idebus->master);
 
-        if (blk && idebus->ifs[i % 2].drive_kind != IDE_CD) {
-            if (!(i % 2)) {
-                idedev = idebus->master;
-            } else {
-                idedev = idebus->slave;
-            }
+    if (st->aux && !unit && !strcmp(BUS(idebus)->name, "ide.0")) {
+        return 0;
+    }
 
-            blk_drain(blk);
-            blk_flush(blk);
+    blk = idebus->ifs[unit].blk;
+    if (blk) {
+        blk_drain(blk);
+        blk_flush(blk);
 
-            blk_detach_dev(blk, DEVICE(idedev));
-            idebus->ifs[i % 2].blk = NULL;
-            idedev->conf.blk = NULL;
-            monitor_remove_blk(blk);
-            blk_unref(blk);
-        }
+        blk_detach_dev(blk, DEVICE(idedev));
+        idebus->ifs[unit].blk = NULL;
+        idedev->conf.blk = NULL;
+        monitor_remove_blk(blk);
+        blk_unref(blk);
+    }
+
+    object_unparent(OBJECT(dev));
+    st->nr_unplugged++;
+
+    return 0;
+}
+
+static void pci_xen_ide_unplug(PCIDevice *d, bool aux)
+{
+    struct ide_unplug_state st = { aux, 0 };
+    DeviceState *dev = DEVICE(d);
+
+    qdev_walk_children(dev, NULL, NULL, ide_dev_unplug, NULL, &st);
+    if (st.nr_unplugged) {
+        pci_device_reset(d);
     }
-    pci_device_reset(d);
 }
 
 static void unplug_disks(PCIBus *b, PCIDevice *d, void *opaque)
@@ -216,6 +237,7 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void *opaque)
 
     switch (pci_get_word(d->config + PCI_CLASS_DEVICE)) {
     case PCI_CLASS_STORAGE_IDE:
+    case PCI_CLASS_STORAGE_SATA:
         pci_xen_ide_unplug(d, aux);
         break;
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 22/28] net: add qemu_{configure,create}_nic_device(), qemu_find_nic_info()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
@ 2023-10-25 14:50   ` David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 02/28] i386/xen: fix per-vCPU upcall vector for Xen emulation David Woodhouse
                     ` (26 subsequent siblings)
  27 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

Most code which directly accesses nd_table[] and nb_nics uses them for
one of two things. Either "I have created a NIC device and I'd like a
configuration for it", or "I will create a NIC device *if* there is a
configuration for it".  With some variants on the theme around whether
they actually *check* if the model specified in the configuration is
the right one.

Provide functions which perform both of those, allowing platforms to
be a little more consistent and as a step towards making nd_table[]
and nb_nics private to the net code.

Also export the qemu_find_nic_info() helper, as some platforms have
special cases they need to handle.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 include/net/net.h |  7 ++++++-
 net/net.c         | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/include/net/net.h b/include/net/net.h
index 2fb1c9181c..56be694c75 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -205,7 +205,12 @@ int qemu_show_nic_models(const char *arg, const char *const *models);
 void qemu_check_nic_model(NICInfo *nd, const char *model);
 int qemu_find_nic_model(NICInfo *nd, const char * const *models,
                         const char *default_model);
-
+NICInfo *qemu_find_nic_info(const char *typename, bool match_default,
+                            const char *alias);
+bool qemu_configure_nic_device(DeviceState *dev, bool match_default,
+                               const char *alias);
+DeviceState *qemu_create_nic_device(const char *typename, bool match_default,
+                                    const char *alias);
 void print_net_client(Monitor *mon, NetClientState *nc);
 void net_socket_rs_init(SocketReadState *rs,
                         SocketReadStateFinalize *finalize,
diff --git a/net/net.c b/net/net.c
index bbe33da176..f8b4973a1e 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1072,6 +1072,57 @@ static int net_init_nic(const Netdev *netdev, const char *name,
     return idx;
 }
 
+NICInfo *qemu_find_nic_info(const char *typename, bool match_default,
+                            const char *alias)
+{
+    NICInfo *nd;
+    int i;
+
+    for (i = 0; i < nb_nics; i++) {
+        nd = &nd_table[i];
+
+        if (!nd->used || nd->instantiated) {
+            continue;
+        }
+
+        if ((match_default && !nd->model) || !g_strcmp0(nd->model, typename)
+            || (alias && !g_strcmp0(nd->model, alias))) {
+            return nd;
+        }
+    }
+    return NULL;
+}
+
+
+/* "I have created a device. Please configure it if you can" */
+bool qemu_configure_nic_device(DeviceState *dev, bool match_default,
+                               const char *alias)
+{
+    NICInfo *nd = qemu_find_nic_info(object_get_typename(OBJECT(dev)),
+                                     match_default, alias);
+
+    if (nd) {
+        qdev_set_nic_properties(dev, nd);
+        return true;
+    }
+    return false;
+}
+
+/* "Please create a device, if you have a configuration for it" */
+DeviceState *qemu_create_nic_device(const char *typename, bool match_default,
+                                    const char *alias)
+{
+    NICInfo *nd = qemu_find_nic_info(typename, match_default, alias);
+    DeviceState *dev;
+
+    if (!nd) {
+        return NULL;
+    }
+
+    dev = qdev_new(typename);
+    qdev_set_nic_properties(dev, nd);
+    return dev;
+}
 
 static int (* const net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
     const Netdev *netdev,
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 22/28] net: add qemu_{configure, create}_nic_device(), qemu_find_nic_info()
@ 2023-10-25 14:50   ` David Woodhouse
  0 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

Most code which directly accesses nd_table[] and nb_nics uses them for
one of two things. Either "I have created a NIC device and I'd like a
configuration for it", or "I will create a NIC device *if* there is a
configuration for it".  With some variants on the theme around whether
they actually *check* if the model specified in the configuration is
the right one.

Provide functions which perform both of those, allowing platforms to
be a little more consistent and as a step towards making nd_table[]
and nb_nics private to the net code.

Also export the qemu_find_nic_info() helper, as some platforms have
special cases they need to handle.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 include/net/net.h |  7 ++++++-
 net/net.c         | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/include/net/net.h b/include/net/net.h
index 2fb1c9181c..56be694c75 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -205,7 +205,12 @@ int qemu_show_nic_models(const char *arg, const char *const *models);
 void qemu_check_nic_model(NICInfo *nd, const char *model);
 int qemu_find_nic_model(NICInfo *nd, const char * const *models,
                         const char *default_model);
-
+NICInfo *qemu_find_nic_info(const char *typename, bool match_default,
+                            const char *alias);
+bool qemu_configure_nic_device(DeviceState *dev, bool match_default,
+                               const char *alias);
+DeviceState *qemu_create_nic_device(const char *typename, bool match_default,
+                                    const char *alias);
 void print_net_client(Monitor *mon, NetClientState *nc);
 void net_socket_rs_init(SocketReadState *rs,
                         SocketReadStateFinalize *finalize,
diff --git a/net/net.c b/net/net.c
index bbe33da176..f8b4973a1e 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1072,6 +1072,57 @@ static int net_init_nic(const Netdev *netdev, const char *name,
     return idx;
 }
 
+NICInfo *qemu_find_nic_info(const char *typename, bool match_default,
+                            const char *alias)
+{
+    NICInfo *nd;
+    int i;
+
+    for (i = 0; i < nb_nics; i++) {
+        nd = &nd_table[i];
+
+        if (!nd->used || nd->instantiated) {
+            continue;
+        }
+
+        if ((match_default && !nd->model) || !g_strcmp0(nd->model, typename)
+            || (alias && !g_strcmp0(nd->model, alias))) {
+            return nd;
+        }
+    }
+    return NULL;
+}
+
+
+/* "I have created a device. Please configure it if you can" */
+bool qemu_configure_nic_device(DeviceState *dev, bool match_default,
+                               const char *alias)
+{
+    NICInfo *nd = qemu_find_nic_info(object_get_typename(OBJECT(dev)),
+                                     match_default, alias);
+
+    if (nd) {
+        qdev_set_nic_properties(dev, nd);
+        return true;
+    }
+    return false;
+}
+
+/* "Please create a device, if you have a configuration for it" */
+DeviceState *qemu_create_nic_device(const char *typename, bool match_default,
+                                    const char *alias)
+{
+    NICInfo *nd = qemu_find_nic_info(typename, match_default, alias);
+    DeviceState *dev;
+
+    if (!nd) {
+        return NULL;
+    }
+
+    dev = qdev_new(typename);
+    qdev_set_nic_properties(dev, nd);
+    return dev;
+}
 
 static int (* const net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
     const Netdev *netdev,
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 23/28] net: report list of available models according to platform
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (21 preceding siblings ...)
  2023-10-25 14:50   ` [PATCH v3 22/28] net: add qemu_{configure, create}_nic_device(), qemu_find_nic_info() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  9:31   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 24/28] net: add qemu_create_nic_bus_devices() David Woodhouse
                   ` (4 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

By noting the models for which a configuration was requested, we can give
the user an accurate list of which NIC models were actually available on
the platform/configuration that was otherwise chosen.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 net/net.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/net/net.c b/net/net.c
index f8b4973a1e..807220e630 100644
--- a/net/net.c
+++ b/net/net.c
@@ -75,6 +75,8 @@ typedef QSIMPLEQ_HEAD(, NetdevQueueEntry) NetdevQueue;
 
 static NetdevQueue nd_queue = QSIMPLEQ_HEAD_INITIALIZER(nd_queue);
 
+static GHashTable *nic_model_help;
+
 /***********************************************************/
 /* network device redirectors */
 
@@ -1072,12 +1074,94 @@ static int net_init_nic(const Netdev *netdev, const char *name,
     return idx;
 }
 
+static gboolean add_nic_result(gpointer key, gpointer value, gpointer user_data)
+{
+    GPtrArray *results = user_data;
+    GPtrArray *alias_list = value;
+    const char *model = key;
+    char *result;
+
+    if (!alias_list) {
+        result = g_strdup(model);
+    } else {
+        GString *result_str = g_string_new(model);
+        int i;
+
+        g_string_append(result_str, " (aka ");
+        for (i = 0; i < alias_list->len; i++) {
+            if (i) {
+                g_string_append(result_str, ", ");
+            }
+            g_string_append(result_str, alias_list->pdata[i]);
+        }
+        g_string_append(result_str, ")");
+        result = result_str->str;
+        g_string_free(result_str, false);
+        g_ptr_array_unref(alias_list);
+    }
+    g_ptr_array_add(results, result);
+    return true;
+}
+
+static int model_cmp(char **a, char **b)
+{
+    return strcmp(*a, *b);
+}
+
+static void show_nic_models(void)
+{
+    GPtrArray *results = g_ptr_array_new();
+    int i;
+
+    g_hash_table_foreach_remove(nic_model_help, add_nic_result, results);
+    g_ptr_array_sort(results, (GCompareFunc)model_cmp);
+
+    printf("Available NIC models for this configuration:\n");
+    for (i = 0 ; i < results->len; i++) {
+        printf("%s\n", (char *)results->pdata[i]);
+    }
+    g_hash_table_unref(nic_model_help);
+    nic_model_help = NULL;
+}
+
+static void add_nic_model_help(const char *model, const char *alias)
+{
+    GPtrArray *alias_list = NULL;
+
+    if (g_hash_table_lookup_extended(nic_model_help, model, NULL,
+                                     (gpointer *)&alias_list)) {
+        /* Already exists, no alias to add: return */
+        if (!alias) {
+            return;
+        }
+        if (alias_list) {
+            /* Check if this alias is already in the list. Add if not. */
+            if (!g_ptr_array_find_with_equal_func(alias_list, alias,
+                                                  g_str_equal, NULL)) {
+                g_ptr_array_add(alias_list, g_strdup(alias));
+            }
+            return;
+        }
+    }
+    /* Either this model wasn't in the list already, or a first alias added */
+    if (alias) {
+        alias_list = g_ptr_array_new();
+        g_ptr_array_set_free_func(alias_list, g_free);
+        g_ptr_array_add(alias_list, g_strdup(alias));
+    }
+    g_hash_table_replace(nic_model_help, g_strdup(model), alias_list);
+}
+
 NICInfo *qemu_find_nic_info(const char *typename, bool match_default,
                             const char *alias)
 {
     NICInfo *nd;
     int i;
 
+    if (nic_model_help) {
+        add_nic_model_help(typename, alias);
+    }
+
     for (i = 0; i < nb_nics; i++) {
         nd = &nd_table[i];
 
@@ -1591,6 +1675,10 @@ void net_check_clients(void)
     NetClientState *nc;
     int i;
 
+    if (nic_model_help) {
+        show_nic_models();
+        exit(0);
+    }
     net_hub_check_clients();
 
     QTAILQ_FOREACH(nc, &net_clients, next) {
@@ -1670,6 +1758,12 @@ static int net_param_nic(void *dummy, QemuOpts *opts, Error **errp)
     memset(ni, 0, sizeof(*ni));
     ni->model = qemu_opt_get_del(opts, "model");
 
+    if (!nic_model_help && !g_strcmp0(ni->model, "help")) {
+        nic_model_help = g_hash_table_new_full(g_str_hash, g_str_equal,
+                                               g_free, NULL);
+        return 0;
+    }
+
     /* Create an ID if the user did not specify one */
     nd_id = g_strdup(qemu_opts_id(opts));
     if (!nd_id) {
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 24/28] net: add qemu_create_nic_bus_devices()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (22 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 23/28] net: report list of available models according to platform David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  9:42   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 25/28] hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot() David Woodhouse
                   ` (3 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

This will instantiate any NICs which live on a given bus type. Each bus
is allowed *one* substitution (for PCI it's virtio → virtio-net-pci, for
Xen it's xen → xen-net-device; no point in overengineering it unless we
actually want more).

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 include/net/net.h |  3 +++
 net/net.c         | 53 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 56be694c75..ce830a47d0 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -211,6 +211,9 @@ bool qemu_configure_nic_device(DeviceState *dev, bool match_default,
                                const char *alias);
 DeviceState *qemu_create_nic_device(const char *typename, bool match_default,
                                     const char *alias);
+void qemu_create_nic_bus_devices(BusState *bus, const char *parent_type,
+                                 const char *default_model,
+                                 const char *alias, const char *alias_target);
 void print_net_client(Monitor *mon, NetClientState *nc);
 void net_socket_rs_init(SocketReadState *rs,
                         SocketReadStateFinalize *finalize,
diff --git a/net/net.c b/net/net.c
index 807220e630..73621795cb 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1208,6 +1208,59 @@ DeviceState *qemu_create_nic_device(const char *typename, bool match_default,
     return dev;
 }
 
+void qemu_create_nic_bus_devices(BusState *bus, const char *parent_type,
+                                 const char *default_model,
+                                 const char *alias, const char *alias_target)
+{
+    GPtrArray *nic_models = qemu_get_nic_models(parent_type);
+    const char *model;
+    DeviceState *dev;
+    NICInfo *nd;
+    int i;
+
+    if (nic_model_help) {
+        if (alias_target) {
+            add_nic_model_help(alias_target, alias);
+        }
+        for (i = 0; i < nic_models->len - 1; i++) {
+            add_nic_model_help(nic_models->pdata[i], NULL);
+        }
+    }
+
+    /* Drop the NULL terminator which would make g_str_equal() unhappy */
+    nic_models->len--;
+
+    for (i = 0; i < nb_nics; i++) {
+        nd = &nd_table[i];
+
+        if (!nd->used || nd->instantiated) {
+            continue;
+        }
+
+        model = nd->model ? nd->model : default_model;
+        if (!model) {
+            continue;
+        }
+
+        /* Each bus type is allowed *one* substitution */
+        if (g_str_equal(model, alias)) {
+            model = alias_target;
+        }
+
+        if (!g_ptr_array_find_with_equal_func(nic_models, model,
+                                              g_str_equal, NULL)) {
+            /* This NIC does not live on this bus. */
+            continue;
+        }
+
+        dev = qdev_new(model);
+        qdev_set_nic_properties(dev, nd);
+        qdev_realize_and_unref(dev, bus, &error_fatal);
+    }
+
+    g_ptr_array_free(nic_models, true);
+}
+
 static int (* const net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
     const Netdev *netdev,
     const char *name,
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 25/28] hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (23 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 24/28] net: add qemu_create_nic_bus_devices() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  9:46   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 26/28] hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices() David Woodhouse
                   ` (2 subsequent siblings)
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

The loop over nd_table[] to add PCI NICs is repeated in quite a few
places. Add a helper function to do it.

Some platforms also try to instantiate a specific model in a specific
slot, to match the real hardware. Add pci_init_nic_in_slot() for that
purpose.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/pci/pci.c         | 45 ++++++++++++++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h |  4 +++-
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 7d09e1a39d..51b0be3972 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1925,6 +1925,51 @@ PCIDevice *pci_nic_init_nofail(NICInfo *nd, PCIBus *rootbus,
     return pci_dev;
 }
 
+void pci_init_nic_devices(PCIBus *bus, const char *default_model)
+{
+    qemu_create_nic_bus_devices(&bus->qbus, TYPE_PCI_DEVICE, default_model,
+                                "virtio", "virtio-net-pci");
+}
+
+bool pci_init_nic_in_slot(PCIBus *rootbus, const char *model,
+                          const char *alias, const char *devaddr)
+{
+    NICInfo *nd = qemu_find_nic_info(model, true, alias);
+    int dom, busnr, devfn;
+    PCIDevice *pci_dev;
+    unsigned slot;
+    PCIBus *bus;
+
+    if (!nd) {
+        return false;
+    }
+
+    if (!devaddr || pci_parse_devaddr(devaddr, &dom, &busnr, &slot, NULL) < 0) {
+        error_report("Invalid PCI device address %s for device %s",
+                     devaddr, model);
+        exit(1);
+    }
+
+    if (dom != 0) {
+        error_report("No support for non-zero PCI domains");
+        exit(1);
+    }
+
+    devfn = PCI_DEVFN(slot, 0);
+
+    bus = pci_find_bus_nr(rootbus, busnr);
+    if (!bus) {
+        error_report("Invalid PCI device address %s for device %s",
+                     devaddr, model);
+        exit(1);
+    }
+
+    pci_dev = pci_new(devfn, model);
+    qdev_set_nic_properties(&pci_dev->qdev, nd);
+    pci_realize_and_unref(pci_dev, bus, &error_fatal);
+    return true;
+}
+
 PCIDevice *pci_vga_init(PCIBus *bus)
 {
     vga_interface_created = true;
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index ea5aff118b..684d49bdcd 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -317,7 +317,9 @@ void pci_device_reset(PCIDevice *dev);
 PCIDevice *pci_nic_init_nofail(NICInfo *nd, PCIBus *rootbus,
                                const char *default_model,
                                const char *default_devaddr);
-
+void pci_init_nic_devices(PCIBus *bus, const char *default_model);
+bool pci_init_nic_in_slot(PCIBus *rootbus, const char *default_model,
+                          const char *alias, const char *devaddr);
 PCIDevice *pci_vga_init(PCIBus *bus);
 
 static inline PCIBus *pci_get_bus(const PCIDevice *dev)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 26/28] hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices()
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (24 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 25/28] hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  9:48   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 27/28] hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs David Woodhouse
  2023-10-25 14:50 ` [PATCH v3 28/28] docs: update Xen-on-KVM documentation David Woodhouse
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

Eliminate direct access to nd_table[] and nb_nics by processing the the
ISA NICs first and then calling pci_init_nic_devices() for the test.

It's important to do this *before* the subsequent patch which registers
the Xen PV network devices, because the code being remove here didn't
check whether nd->instantiated was already set before using each entry.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/i386/pc.c                | 21 +++++++++++----------
 include/hw/net/ne2000-isa.h |  2 --
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 11fed78d17..7d2ae2b3d7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -652,8 +652,11 @@ static void pc_init_ne2k_isa(ISABus *bus, NICInfo *nd)
 {
     static int nb_ne2k = 0;
 
-    if (nb_ne2k == NE2000_NB_MAX)
+    if (nb_ne2k == NE2000_NB_MAX) {
+        error_setg(&error_fatal,
+                   "maximum number of ISA NE2000 devices exceeded");
         return;
+    }
     isa_ne2000_init(bus, ne2000_io[nb_ne2k],
                     ne2000_irq[nb_ne2k], nd);
     nb_ne2k++;
@@ -1294,19 +1297,17 @@ void pc_basic_device_init(struct PCMachineState *pcms,
 void pc_nic_init(PCMachineClass *pcmc, ISABus *isa_bus, PCIBus *pci_bus)
 {
     MachineClass *mc = MACHINE_CLASS(pcmc);
-    int i;
+    bool default_is_ne2k = g_str_equal(mc->default_nic, TYPE_ISA_NE2000);
+    NICInfo *nd;
 
     rom_set_order_override(FW_CFG_ORDER_OVERRIDE_NIC);
-    for (i = 0; i < nb_nics; i++) {
-        NICInfo *nd = &nd_table[i];
-        const char *model = nd->model ? nd->model : mc->default_nic;
 
-        if (g_str_equal(model, "ne2k_isa")) {
-            pc_init_ne2k_isa(isa_bus, nd);
-        } else {
-            pci_nic_init_nofail(nd, pci_bus, model, NULL);
-        }
+    while ((nd = qemu_find_nic_info(TYPE_ISA_NE2000, default_is_ne2k, NULL))) {
+        pc_init_ne2k_isa(isa_bus, nd);
     }
+
+    pci_init_nic_devices(pci_bus, mc->default_nic);
+
     rom_reset_order_override();
 }
 
diff --git a/include/hw/net/ne2000-isa.h b/include/hw/net/ne2000-isa.h
index af59ee0b02..73bae10ad1 100644
--- a/include/hw/net/ne2000-isa.h
+++ b/include/hw/net/ne2000-isa.h
@@ -22,8 +22,6 @@ static inline ISADevice *isa_ne2000_init(ISABus *bus, int base, int irq,
 {
     ISADevice *d;
 
-    qemu_check_nic_model(nd, "ne2k_isa");
-
     d = isa_try_new(TYPE_ISA_NE2000);
     if (d) {
         DeviceState *dev = DEVICE(d);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 27/28] hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (25 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 26/28] hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices() David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-27  9:52   ` Durrant, Paul
  2023-10-25 14:50 ` [PATCH v3 28/28] docs: update Xen-on-KVM documentation David Woodhouse
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

When instantiating XenBus itself, for each NIC which is configured with
either the model unspecified, or set to to "xen" or "xen-net-device",
create a corresponding xen-net-device for it.

Now we can launch emulated Xen guests with '-nic user', and this fixes
the setup for Xen PV guests, which was previously broken in various
ways and never actually managed to peer with the netdev.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/xen/xen-bus.c                    |  4 ++++
 hw/xen/xen_devconfig.c              | 25 -------------------------
 hw/xenpv/xen_machine_pv.c           |  9 ---------
 include/hw/xen/xen-legacy-backend.h |  1 -
 4 files changed, 4 insertions(+), 35 deletions(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index cc6f1b362f..fb82cc33e4 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -19,6 +19,7 @@
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
 #include "sysemu/sysemu.h"
+#include "net/net.h"
 #include "trace.h"
 
 static char *xen_device_get_backend_path(XenDevice *xendev)
@@ -1140,4 +1141,7 @@ void xen_bus_init(void)
 
     sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
     qbus_set_bus_hotplug_handler(bus);
+
+    qemu_create_nic_bus_devices(bus, TYPE_XEN_DEVICE, "xen-net-device",
+                                "xen", "xen-net-device");
 }
diff --git a/hw/xen/xen_devconfig.c b/hw/xen/xen_devconfig.c
index 3f77c675c6..2150869f60 100644
--- a/hw/xen/xen_devconfig.c
+++ b/hw/xen/xen_devconfig.c
@@ -46,31 +46,6 @@ static int xen_config_dev_all(char *fe, char *be)
 
 /* ------------------------------------------------------------- */
 
-int xen_config_dev_nic(NICInfo *nic)
-{
-    char fe[256], be[256];
-    char mac[20];
-    int vlan_id = -1;
-
-    net_hub_id_for_client(nic->netdev, &vlan_id);
-    snprintf(mac, sizeof(mac), "%02x:%02x:%02x:%02x:%02x:%02x",
-             nic->macaddr.a[0], nic->macaddr.a[1], nic->macaddr.a[2],
-             nic->macaddr.a[3], nic->macaddr.a[4], nic->macaddr.a[5]);
-    xen_pv_printf(NULL, 1, "config nic %d: mac=\"%s\"\n", vlan_id, mac);
-    xen_config_dev_dirs("vif", "qnic", vlan_id, fe, be, sizeof(fe));
-
-    /* frontend */
-    xenstore_write_int(fe, "handle",     vlan_id);
-    xenstore_write_str(fe, "mac",        mac);
-
-    /* backend */
-    xenstore_write_int(be, "handle",     vlan_id);
-    xenstore_write_str(be, "mac",        mac);
-
-    /* common stuff */
-    return xen_config_dev_all(fe, be);
-}
-
 int xen_config_dev_vfb(int vdev, const char *type)
 {
     char fe[256], be[256];
diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 9f9f137f99..1130d1a147 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -32,8 +32,6 @@
 
 static void xen_init_pv(MachineState *machine)
 {
-    int i;
-
     setup_xen_backend_ops();
 
     /* Initialize backend core & drivers */
@@ -62,13 +60,6 @@ static void xen_init_pv(MachineState *machine)
         vga_interface_created = true;
     }
 
-    /* configure nics */
-    for (i = 0; i < nb_nics; i++) {
-        if (!nd_table[i].model || 0 != strcmp(nd_table[i].model, "xen"))
-            continue;
-        xen_config_dev_nic(nd_table + i);
-    }
-
     xen_bus_init();
 
     /* config cleanup hook */
diff --git a/include/hw/xen/xen-legacy-backend.h b/include/hw/xen/xen-legacy-backend.h
index fc42146bc2..2cca174778 100644
--- a/include/hw/xen/xen-legacy-backend.h
+++ b/include/hw/xen/xen-legacy-backend.h
@@ -81,7 +81,6 @@ extern struct XenDevOps xen_usb_ops;          /* xen-usb.c         */
 
 /* configuration (aka xenbus setup) */
 void xen_config_cleanup(void);
-int xen_config_dev_nic(NICInfo *nic);
 int xen_config_dev_vfb(int vdev, const char *type);
 int xen_config_dev_vkbd(int vdev);
 int xen_config_dev_console(int vdev);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
                   ` (26 preceding siblings ...)
  2023-10-25 14:50 ` [PATCH v3 27/28] hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs David Woodhouse
@ 2023-10-25 14:50 ` David Woodhouse
  2023-10-25 18:20   ` Eric Blake
  27 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 14:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

From: David Woodhouse <dwmw@amazon.co.uk>

Add notes about console and network support, and how to launch PV guests.
Clean up the disk configuration examples now that that's simpler, and
remove the comment about IDE unplug on q35/AHCI now that it's fixed.

Also update stale avocado test filename in MAINTAINERS.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 MAINTAINERS              |   2 +-
 docs/system/i386/xen.rst | 100 ++++++++++++++++++++++++++++-----------
 2 files changed, 73 insertions(+), 29 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d36aa44661..0fcc454ccd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -490,7 +490,7 @@ S: Supported
 F: include/sysemu/kvm_xen.h
 F: target/i386/kvm/xen*
 F: hw/i386/kvm/xen*
-F: tests/avocado/xen_guest.py
+F: tests/avocado/kvm_xen_guest.py
 
 Guest CPU Cores (other accelerators)
 ------------------------------------
diff --git a/docs/system/i386/xen.rst b/docs/system/i386/xen.rst
index f06765e88c..6214c4571e 100644
--- a/docs/system/i386/xen.rst
+++ b/docs/system/i386/xen.rst
@@ -15,46 +15,24 @@ Setup
 -----
 
 Xen mode is enabled by setting the ``xen-version`` property of the KVM
-accelerator, for example for Xen 4.10:
+accelerator, for example for Xen 4.17:
 
 .. parsed-literal::
 
-  |qemu_system| --accel kvm,xen-version=0x4000a,kernel-irqchip=split
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split
 
 Additionally, virtual APIC support can be advertised to the guest through the
 ``xen-vapic`` CPU flag:
 
 .. parsed-literal::
 
-  |qemu_system| --accel kvm,xen-version=0x4000a,kernel-irqchip=split --cpu host,+xen_vapic
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split --cpu host,+xen-vapic
 
 When Xen support is enabled, QEMU changes hypervisor identification (CPUID
 0x40000000..0x4000000A) to Xen. The KVM identification and features are not
 advertised to a Xen guest. If Hyper-V is also enabled, the Xen identification
 moves to leaves 0x40000100..0x4000010A.
 
-The Xen platform device is enabled automatically for a Xen guest. This allows
-a guest to unplug all emulated devices, in order to use Xen PV block and network
-drivers instead. Under Xen, the boot disk is typically available both via IDE
-emulation, and as a PV block device. Guest bootloaders typically use IDE to load
-the guest kernel, which then unplugs the IDE and continues with the Xen PV block
-device.
-
-This configuration can be achieved as follows
-
-.. parsed-literal::
-
-  |qemu_system| -M pc --accel kvm,xen-version=0x4000a,kernel-irqchip=split \\
-       -drive file=${GUEST_IMAGE},if=none,id=disk,file.locking=off -device xen-disk,drive=disk,vdev=xvda \\
-       -drive file=${GUEST_IMAGE},index=2,media=disk,file.locking=off,if=ide
-
-It is necessary to use the pc machine type, as the q35 machine uses AHCI instead
-of legacy IDE, and AHCI disks are not unplugged through the Xen PV unplug
-mechanism.
-
-VirtIO devices can also be used; Linux guests may need to be dissuaded from
-umplugging them by adding 'xen_emul_unplug=never' on their command line.
-
 Properties
 ----------
 
@@ -63,7 +41,10 @@ The following properties exist on the KVM accelerator object:
 ``xen-version``
   This property contains the Xen version in ``XENVER_version`` form, with the
   major version in the top 16 bits and the minor version in the low 16 bits.
-  Setting this property enables the Xen guest support.
+  Setting this property enables the Xen guest support. If Xen version 4.5 or
+  greater is specified, the HVM leaf in Xen CPUID is populated. Xen version
+  4.6 enables the vCPU ID in CPUID, and version 4.17 advertises vCPU upcall
+  vector support to the guest.
 
 ``xen-evtchn-max-pirq``
   Xen PIRQs represent an emulated physical interrupt, either GSI or MSI, which
@@ -83,8 +64,71 @@ The following properties exist on the KVM accelerator object:
   through simultaneous grants. For guests with large numbers of PV devices and
   high throughput, it may be desirable to increase this value.
 
-OS requirements
----------------
+Xen paravirtual devices
+-----------------------
+
+The Xen PCI platform device is enabled automatically for a Xen guest. This
+allows a guest to unplug all emulated devices, in order to use paravirtual
+block and network drivers instead.
+
+Those paravirtual Xen block, network (and console) devices can be created
+through the command line, and/or hot-plugged.
+
+To provide a Xen console device, define a character device and then a device
+of type ``xen-console`` to connect to it. For the Xen console equivalent of
+the handy ``-serial mon:stdio`` option, for example:
+
+.. parsed-literal::
+   -chardev -chardev stdio,mux=on,id=char0,signal=off -mon char0 \\
+   -device xen-console,chardev=char0
+
+The Xen network device is ``xen-net-device``, which becomes the default NIC
+model for emulated Xen guests, meaning that just the default ``-nic user``
+should automatically work and present a Xen network device to the guest.
+
+Disks can be configured with '``-drive file=${GUEST_IMAGE},if=xen``' and will
+appear to the guest as ``xvda`` onwards.
+
+Under Xen, the boot disk is typically available both via IDE emulation, and
+as a PV block device. Guest bootloaders typically use IDE to load the guest
+kernel, which then unplugs the IDE and continues with the Xen PV block device.
+
+This configuration can be achieved as follows:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
+       -drive file=${GUEST_IMAGE},if=xen \\
+       -drive file=${GUEST_IMAGE},file.locking=off,if=ide
+
+VirtIO devices can also be used; Linux guests may need to be dissuaded from
+umplugging them by adding '``xen_emul_unplug=never``' on their command line.
+
+Booting Xen PV guests
+---------------------
+
+Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
+itself, designed to run inside a Xen HVM guest and provide memory management
+services for one guest alone).
+
+The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
+PV Grub image) as the ``-initrd`` image, which actually just means the first
+multiboot "module". For example:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
+       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
+       -display none  -m 1G  -kernel xen -initrd bzImage \\
+       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
+       -drive file=${GUEST_IMAGE},if=xen
+
+The Xen image must be built with the ``CONFIG_XEN_GUEST`` and ``CONFIG_PV_SHIM``
+options, and as of Xen 4.17, Xen's PV shim mode does not support using a serial
+port; it must have a Xen console or it will panic.
+
+Host OS requirements
+--------------------
 
 The minimal Xen support in the KVM accelerator requires the host to be running
 Linux v5.12 or newer. Later versions add optimisations: Linux v5.17 added
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-25 14:50 ` [PATCH v3 28/28] docs: update Xen-on-KVM documentation David Woodhouse
@ 2023-10-25 18:20   ` Eric Blake
  2023-10-25 18:26     ` David Woodhouse
  0 siblings, 1 reply; 59+ messages in thread
From: Eric Blake @ 2023-10-25 18:20 UTC (permalink / raw)
  To: David Woodhouse
  Cc: qemu-devel, Kevin Wolf, Hanna Reitz, Stefano Stabellini,
	Anthony Perard, Paul Durrant, Marc-André Lureau,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Richard Henderson, Eduardo Habkost, Jason Wang, Marcelo Tosatti,
	qemu-block, xen-devel, kvm, Bernhard Beschow, Joel Upham

On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Add notes about console and network support, and how to launch PV guests.
> Clean up the disk configuration examples now that that's simpler, and
> remove the comment about IDE unplug on q35/AHCI now that it's fixed.
> 
> Also update stale avocado test filename in MAINTAINERS.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> +Xen paravirtual devices
> +-----------------------
> +
> +The Xen PCI platform device is enabled automatically for a Xen guest. This
> +allows a guest to unplug all emulated devices, in order to use paravirtual
> +block and network drivers instead.
> +
> +Those paravirtual Xen block, network (and console) devices can be created
> +through the command line, and/or hot-plugged.
> +
> +To provide a Xen console device, define a character device and then a device
> +of type ``xen-console`` to connect to it. For the Xen console equivalent of
> +the handy ``-serial mon:stdio`` option, for example:
> +
> +.. parsed-literal::
> +   -chardev -chardev stdio,mux=on,id=char0,signal=off -mon char0 \\
> +   -device xen-console,chardev=char0

Is -chardev supposed to appear twice here?

...
> +
> +Booting Xen PV guests
> +---------------------
> +
> +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> +itself, designed to run inside a Xen HVM guest and provide memory management
> +services for one guest alone).
> +
> +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> +PV Grub image) as the ``-initrd`` image, which actually just means the first
> +multiboot "module". For example:
> +
> +.. parsed-literal::
> +
> +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> +       -drive file=${GUEST_IMAGE},if=xen

Is the space between -- and console= intentionsl?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-25 18:20   ` Eric Blake
@ 2023-10-25 18:26     ` David Woodhouse
  2023-10-25 18:56       ` Andrew Cooper
  0 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 18:26 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, Kevin Wolf, Hanna Reitz, Stefano Stabellini,
	Anthony Perard, Paul Durrant, Marc-André Lureau,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Richard Henderson, Eduardo Habkost, Jason Wang, Marcelo Tosatti,
	qemu-block, xen-devel, kvm, Bernhard Beschow, Joel Upham

[-- Attachment #1: Type: text/plain, Size: 2528 bytes --]

On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
> On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > Add notes about console and network support, and how to launch PV guests.
> > Clean up the disk configuration examples now that that's simpler, and
> > remove the comment about IDE unplug on q35/AHCI now that it's fixed.
> > 
> > Also update stale avocado test filename in MAINTAINERS.
> > 
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > ---
> > +Xen paravirtual devices
> > +-----------------------
> > +
> > +The Xen PCI platform device is enabled automatically for a Xen guest. This
> > +allows a guest to unplug all emulated devices, in order to use paravirtual
> > +block and network drivers instead.
> > +
> > +Those paravirtual Xen block, network (and console) devices can be created
> > +through the command line, and/or hot-plugged.
> > +
> > +To provide a Xen console device, define a character device and then a device
> > +of type ``xen-console`` to connect to it. For the Xen console equivalent of
> > +the handy ``-serial mon:stdio`` option, for example:
> > +
> > +.. parsed-literal::
> > +   -chardev -chardev stdio,mux=on,id=char0,signal=off -mon char0 \\
> > +   -device xen-console,chardev=char0
> 
> Is -chardev supposed to appear twice here?

It is not. Will fix; thanks.

> ...
> > +
> > +Booting Xen PV guests
> > +---------------------
> > +
> > +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> > +itself, designed to run inside a Xen HVM guest and provide memory management
> > +services for one guest alone).
> > +
> > +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> > +PV Grub image) as the ``-initrd`` image, which actually just means the first
> > +multiboot "module". For example:
> > +
> > +.. parsed-literal::
> > +
> > +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> > +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> > +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> > +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> > +       -drive file=${GUEST_IMAGE},if=xen
> 
> Is the space between -- and console= intentionsl?

Yes, that one is correct. The -- is how you separate Xen's command line
(on the left) from the guest kernel command line (on the right).


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-25 18:26     ` David Woodhouse
@ 2023-10-25 18:56       ` Andrew Cooper
  2023-10-25 19:02         ` David Woodhouse
  2023-10-26  8:26         ` Kevin Wolf
  0 siblings, 2 replies; 59+ messages in thread
From: Andrew Cooper @ 2023-10-25 18:56 UTC (permalink / raw)
  To: David Woodhouse, Eric Blake
  Cc: qemu-devel, Kevin Wolf, Hanna Reitz, Stefano Stabellini,
	Anthony Perard, Paul Durrant, Marc-André Lureau,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Richard Henderson, Eduardo Habkost, Jason Wang, Marcelo Tosatti,
	qemu-block, xen-devel, kvm, Bernhard Beschow, Joel Upham

[-- Attachment #1: Type: text/plain, Size: 1594 bytes --]

On 25/10/2023 7:26 pm, David Woodhouse wrote:
> On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
>> On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
>>> +
>>> +Booting Xen PV guests
>>> +---------------------
>>> +
>>> +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
>>> +itself, designed to run inside a Xen HVM guest and provide memory management
>>> +services for one guest alone).
>>> +
>>> +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
>>> +PV Grub image) as the ``-initrd`` image, which actually just means the first
>>> +multiboot "module". For example:
>>> +
>>> +.. parsed-literal::
>>> +
>>> +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
>>> +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
>>> +       -display none  -m 1G  -kernel xen -initrd bzImage \\
>>> +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
>>> +       -drive file=${GUEST_IMAGE},if=xen
>> Is the space between -- and console= intentionsl?
> Yes, that one is correct. The -- is how you separate Xen's command line
> (on the left) from the guest kernel command line (on the right).

To expand on this a bit.

Multiboot1 supports multiple modules but only a single command line.  As
one of the modules passed to Xen is the dom0 kernel, we need some way to
pass it's command line, hence the " -- ".

Multiboot2 and PVH support a command line per module, which is the
preferred way to pass the commandlines, given a choice.

~Andrew

[-- Attachment #2: Type: text/html, Size: 2354 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-25 18:56       ` Andrew Cooper
@ 2023-10-25 19:02         ` David Woodhouse
  2023-10-26  8:26         ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-25 19:02 UTC (permalink / raw)
  To: Andrew Cooper, Eric Blake
  Cc: qemu-devel, Kevin Wolf, Hanna Reitz, Stefano Stabellini,
	Anthony Perard, Paul Durrant, Marc-André Lureau,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Richard Henderson, Eduardo Habkost, Jason Wang, Marcelo Tosatti,
	qemu-block, xen-devel, kvm, Bernhard Beschow, Joel Upham

[-- Attachment #1: Type: text/plain, Size: 2116 bytes --]

On Wed, 2023-10-25 at 19:56 +0100, Andrew Cooper wrote:
> On 25/10/2023 7:26 pm, David Woodhouse wrote:
>  
> > On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
> >  
> > > On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> >  
> > >  
> > > > +
> > > > +Booting Xen PV guests
> > > > +---------------------
> > > > +
> > > > +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> > > > +itself, designed to run inside a Xen HVM guest and provide memory management
> > > > +services for one guest alone).
> > > > +
> > > > +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> > > > +PV Grub image) as the ``-initrd`` image, which actually just means the first
> > > > +multiboot "module". For example:
> > > > +
> > > > +.. parsed-literal::
> > > > +
> > > > +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> > > > +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> > > > +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> > > > +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> > > > +       -drive file=${GUEST_IMAGE},if=xen
> > > Is the space between -- and console= intentionsl?
> > Yes, that one is correct. The -- is how you separate Xen's command line
> > (on the left) from the guest kernel command line (on the right).
>  
>  To expand on this a bit.
>  
>  Multiboot1 supports multiple modules but only a single command
> line.  As one of the modules passed to Xen is the dom0 kernel, we
> need some way to pass it's command line, hence the " -- ".
>  
>  Multiboot2 and PVH support a command line per module, which is the
> preferred way to pass the commandlines, given a choice.
>  

Thanks.

Indeed, I had *originally* thought I was going to need to implement
Multiboot2 in qemu in order to boot Shim + PV guest, but it turns out
we can make it work with just Multiboot1 support.

As long as the guest kernel doesn't want an *actual* initrd, that is ;)


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-25 18:56       ` Andrew Cooper
  2023-10-25 19:02         ` David Woodhouse
@ 2023-10-26  8:26         ` Kevin Wolf
  2023-10-26  9:25           ` David Woodhouse
  1 sibling, 1 reply; 59+ messages in thread
From: Kevin Wolf @ 2023-10-26  8:26 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: David Woodhouse, Eric Blake, qemu-devel, Hanna Reitz,
	Stefano Stabellini, Anthony Perard, Paul Durrant,
	Marc-André Lureau, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Richard Henderson, Eduardo Habkost, Jason Wang,
	Marcelo Tosatti, qemu-block, xen-devel, kvm, Bernhard Beschow,
	Joel Upham

Am 25.10.2023 um 20:56 hat Andrew Cooper geschrieben:
> On 25/10/2023 7:26 pm, David Woodhouse wrote:
> > On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
> >> On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> >>> +
> >>> +Booting Xen PV guests
> >>> +---------------------
> >>> +
> >>> +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> >>> +itself, designed to run inside a Xen HVM guest and provide memory management
> >>> +services for one guest alone).
> >>> +
> >>> +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> >>> +PV Grub image) as the ``-initrd`` image, which actually just means the first
> >>> +multiboot "module". For example:
> >>> +
> >>> +.. parsed-literal::
> >>> +
> >>> +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> >>> +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> >>> +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> >>> +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> >>> +       -drive file=${GUEST_IMAGE},if=xen
> >> Is the space between -- and console= intentionsl?
> > Yes, that one is correct. The -- is how you separate Xen's command line
> > (on the left) from the guest kernel command line (on the right).
> 
> To expand on this a bit.
> 
> Multiboot1 supports multiple modules but only a single command line.  As
> one of the modules passed to Xen is the dom0 kernel, we need some way to
> pass it's command line, hence the " -- ".

That's not right, even Multiboot 1 contains a 'string' field in the
module structure that is defined to typically hold a command line. The
exact meaning is OS dependent, so Xen could use it however it wants.

In QEMU (and I believe this is the same behaviour as in GRUB),
everything before the space in an -initrd argument is treated as a
filename to load, everything after it is just passed as the command
line.

So it would have been entirely possible to use -initrd 'bzImage
console=hvc0 root=/dev/xvda1' if Xen worked like that.

> Multiboot2 and PVH support a command line per module, which is the
> preferred way to pass the commandlines, given a choice.

Multiboot 2 seems to integrate the string in a variable length module
structure instead of just having a pointer in a fixed length one, but
the model behind it is essentially the same as before.

Kevin


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-26  8:26         ` Kevin Wolf
@ 2023-10-26  9:25           ` David Woodhouse
  2023-10-26 16:25             ` David Woodhouse
  0 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-26  9:25 UTC (permalink / raw)
  To: Kevin Wolf, Andrew Cooper
  Cc: Eric Blake, qemu-devel, Hanna Reitz, Stefano Stabellini,
	Anthony Perard, Paul Durrant, Marc-André Lureau,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Richard Henderson, Eduardo Habkost, Jason Wang, Marcelo Tosatti,
	qemu-block, xen-devel, kvm, Bernhard Beschow, Joel Upham

[-- Attachment #1: Type: text/plain, Size: 2021 bytes --]

On Thu, 2023-10-26 at 10:26 +0200, Kevin Wolf wrote:
> 
> > > > > +.. parsed-literal::
> > > > > +
> > > > > +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> > > > > +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> > > > > +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> > > > > +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> > > > > +       -drive file=${GUEST_IMAGE},if=xen
> > > > Is the space between -- and console= intentionsl?
> > > Yes, that one is correct. The -- is how you separate Xen's command line
> > > (on the left) from the guest kernel command line (on the right).
> > 
> > To expand on this a bit.
> > 
> > Multiboot1 supports multiple modules but only a single command line.  As
> > one of the modules passed to Xen is the dom0 kernel, we need some way to
> > pass it's command line, hence the " -- ".
> 
> That's not right, even Multiboot 1 contains a 'string' field in the
> module structure that is defined to typically hold a command line. The
> exact meaning is OS dependent, so Xen could use it however it wants.
> 
> In QEMU (and I believe this is the same behaviour as in GRUB),
> everything before the space in an -initrd argument is treated as a
> filename to load, everything after it is just passed as the command
> line.
> 
> So it would have been entirely possible to use -initrd 'bzImage
> console=hvc0 root=/dev/xvda1' if Xen worked like that.

Xen does allow that too. I didn't realise our multiboot loader did though.

So yes, you *can* use  -initrd 'bzImage root=/dev/xvda1'. 

And you can even load more than one module, it seems. Separate them by
commas, so -initrd 'bzImage,initrd.img' should work.

You can even do both at the same time. If you have commas on the kernel
command line, *double* them:

 -initrd 'bzImage root=/dev/xvda earlyprintk=xen,,keep,initrd.img'

I'll update the documentation accordingly.

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 28/28] docs: update Xen-on-KVM documentation
  2023-10-26  9:25           ` David Woodhouse
@ 2023-10-26 16:25             ` David Woodhouse
  0 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-26 16:25 UTC (permalink / raw)
  To: Kevin Wolf, Andrew Cooper
  Cc: Eric Blake, qemu-devel, Hanna Reitz, Stefano Stabellini,
	Anthony Perard, Paul Durrant, Marc-André Lureau,
	Paolo Bonzini, Michael S. Tsirkin, Marcel Apfelbaum,
	Richard Henderson, Eduardo Habkost, Jason Wang, Marcelo Tosatti,
	qemu-block, xen-devel, kvm, Bernhard Beschow, Joel Upham

[-- Attachment #1: Type: text/plain, Size: 2943 bytes --]

On Thu, 2023-10-26 at 10:25 +0100, David Woodhouse wrote:
> 
> > So it would have been entirely possible to use -initrd 'bzImage
> > console=hvc0 root=/dev/xvda1' if Xen worked like that.
> 
> Xen does allow that too. I didn't realise our multiboot loader did though.
> 
> So yes, you *can* use  -initrd 'bzImage root=/dev/xvda1'. 
> 
> And you can even load more than one module, it seems. Separate them by
> commas, so -initrd 'bzImage,initrd.img' should work.
> 
> You can even do both at the same time. If you have commas on the kernel
> command line, *double* them:
> 
>  -initrd 'bzImage root=/dev/xvda earlyprintk=xen,,keep,initrd.img'
> 
> I'll update the documentation accordingly.

https://git.infradead.org/users/dwmw2/qemu.git/commitdiff/0b13c0ae39b


+Booting Xen PV guests
+---------------------
+
+Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
+itself, designed to run inside a Xen HVM guest and provide memory management
+services for one guest alone).
+
+The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
+PV Grub image) as the ``-initrd`` image, which actually just means the first
+multiboot "module". For example:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
+       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
+       -display none  -m 1G  -kernel xen -initrd bzImage \\
+       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
+       -drive file=${GUEST_IMAGE},if=xen
+
+The Xen image must be built with the ``CONFIG_XEN_GUEST`` and ``CONFIG_PV_SHIM``
+options, and as of Xen 4.17, Xen's PV shim mode does not support using a serial
+port; it must have a Xen console or it will panic.
+
+The example above provides the guest kernel command line after a separator
+(" ``--`` ") on the Xen command line, and does not provide the guest kernel
+with an actual initramfs, which would need to listed as a second multiboot
+module. For more complicated alternatives, see the
+:ref:`documentation <initrd-reference-label>` for the ``-initrd`` option.
+


I also fixed up the -initrd documentation so that it actually mentions
how to quote commas, using a Xen PV launch as an example:

 ``-initrd "file1 arg=foo,file2"``
     This syntax is only available with multiboot.
 
-    Use file1 and file2 as modules and pass arg=foo as parameter to the
-    first module.
+    Use file1 and file2 as modules and pass ``arg=foo`` as parameter to the
+    first module. Commas can be provided in module parameters by doubling
+    them on the command line to escape them:
+
+``-initrd "bzImage earlyprintk=xen,,keep root=/dev/xvda1,initrd.img"``
+    Multiboot only. Use bzImage as the first module with
+    "``earlyprintk=xen,keep root=/dev/xvda1``" as its command line,
+    and initrd.img as the second module.

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 05/28] hw/xen: fix XenStore watch delivery to guest
  2023-10-25 14:50 ` [PATCH v3 05/28] hw/xen: fix XenStore watch delivery to guest David Woodhouse
@ 2023-10-27  7:15   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  7:15 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> When fire_watch_cb() found the response buffer empty, it would call
> deliver_watch() to generate the XS_WATCH_EVENT message in the response
> buffer and send an event channel notification to the guest… without
> actually *copying* the response buffer into the ring. So there was
> nothing for the guest to see. The pending response didn't actually get
> processed into the ring until the guest next triggered some activity
> from its side.
> 
> Add the missing call to put_rsp().
> 
> It might have been slightly nicer to call xen_xenstore_event() here,
> which would *almost* have worked. Except for the fact that it calls
> xen_be_evtchn_pending() to check that it really does have an event
> pending (and clear the eventfd for next time). And under Xen it's
> defined that setting that fd to O_NONBLOCK isn't guaranteed to work,
> so the emu implementation follows suit.
> 
> This fixes Xen device hot-unplug.
> 
> Fixes: 0254c4d19df ("hw/xen: Add xenstore wire implementation and implementation stubs")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/i386/kvm/xen_xenstore.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 06/28] hw/xen: take iothread mutex in xen_evtchn_reset_op()
  2023-10-25 14:50 ` [PATCH v3 06/28] hw/xen: take iothread mutex in xen_evtchn_reset_op() David Woodhouse
@ 2023-10-27  7:20   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  7:20 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The xen_evtchn_soft_reset() function requires the iothread mutex, but is
> also called for the EVTCHNOP_reset hypercall. Ensure the mutex is taken
> in that case.
> 
> Fixes: a15b10978fe6 ("hw/xen: Implement EVTCHNOP_reset")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/i386/kvm/xen_evtchn.c | 1 +
>   1 file changed, 1 insertion(+)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 07/28] hw/xen: use correct default protocol for xen-block on x86
  2023-10-25 14:50 ` [PATCH v3 07/28] hw/xen: use correct default protocol for xen-block on x86 David Woodhouse
@ 2023-10-27  7:22   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  7:22 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Even on x86_64 the default protocol is the x86-32 one if the guest doesn't
> specifically ask for x86-64.
> 
> Fixes: b6af8926fb85 ("xen: add implementations of xen-block connect and disconnect functions...")
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/block/xen-block.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 08/28] i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer()
  2023-10-25 14:50 ` [PATCH v3 08/28] i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer() David Woodhouse
@ 2023-10-27  7:23   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  7:23 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Upstream Xen now ignores this flag¹, since the only guest kernel ever to
> use it was buggy.
> 
> ¹ https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=19c6cbd909
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> Reviewed-by: Paul Durrant <paul@xen.org>
> ---
>   target/i386/kvm/xen-emu.c | 20 ++++++++++----------
>   1 file changed, 10 insertions(+), 10 deletions(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/28] hw/xen: automatically assign device index to block devices
  2023-10-25 14:50 ` [PATCH v3 13/28] hw/xen: automatically assign device index to block devices David Woodhouse
@ 2023-10-27  7:30   ` Durrant, Paul
  2023-10-27  8:45     ` David Woodhouse
  0 siblings, 1 reply; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  7:30 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> There's no need to force the user to assign a vdev. We can automatically
> assign one, starting at xvda and searching until we find the first disk
> name that's unused.
> 
> This means we can now allow '-drive if=xen,file=xxx' to work without an
> explicit separate -driver argument, just like if=virtio.
> 
> Rip out the legacy handling from the xenpv machine, which was scribbling
> over any disks configured by the toolstack, and didn't work with anything
> but raw images.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> Acked-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   blockdev.c                          | 15 +++++++++---
>   hw/block/xen-block.c                | 38 +++++++++++++++++++++++++++++
>   hw/xen/xen_devconfig.c              | 28 ---------------------
>   hw/xenpv/xen_machine_pv.c           |  9 -------
>   include/hw/xen/xen-legacy-backend.h |  1 -
>   5 files changed, 50 insertions(+), 41 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index a01c62596b..5d9f2e5395 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -255,13 +255,13 @@ void drive_check_orphaned(void)
>            * Ignore default drives, because we create certain default
>            * drives unconditionally, then leave them unclaimed.  Not the
>            * users fault.
> -         * Ignore IF_VIRTIO, because it gets desugared into -device,
> -         * so we can leave failing to -device.
> +         * Ignore IF_VIRTIO or IF_XEN, because it gets desugared into
> +         * -device, so we can leave failing to -device.
>            * Ignore IF_NONE, because leaving unclaimed IF_NONE remains
>            * available for device_add is a feature.
>            */
>           if (dinfo->is_default || dinfo->type == IF_VIRTIO
> -            || dinfo->type == IF_NONE) {
> +            || dinfo->type == IF_XEN || dinfo->type == IF_NONE) {
>               continue;
>           }
>           if (!blk_get_attached_dev(blk)) {
> @@ -977,6 +977,15 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type,
>           qemu_opt_set(devopts, "driver", "virtio-blk", &error_abort);
>           qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"),
>                        &error_abort);
> +    } else if (type == IF_XEN) {
> +        QemuOpts *devopts;
> +        devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0,
> +                                   &error_abort);
> +        qemu_opt_set(devopts, "driver",
> +                     (media == MEDIA_CDROM) ? "xen-cdrom" : "xen-disk",
> +                     &error_abort);
> +        qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"),
> +                     &error_abort);
>       }
>   
>       filename = qemu_opt_get(legacy_opts, "file");
> diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
> index 64470fc579..5011fe9430 100644
> --- a/hw/block/xen-block.c
> +++ b/hw/block/xen-block.c
> @@ -27,6 +27,7 @@
>   #include "sysemu/block-backend.h"
>   #include "sysemu/iothread.h"
>   #include "dataplane/xen-block.h"
> +#include "hw/xen/interface/io/xs_wire.h"
>   #include "trace.h"
>   
>   static char *xen_block_get_name(XenDevice *xendev, Error **errp)
> @@ -34,6 +35,43 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp)
>       XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
>       XenBlockVdev *vdev = &blockdev->props.vdev;
>   
> +    if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) {
> +        XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
> +        char fe_path[XENSTORE_ABS_PATH_MAX + 1];
> +        char *value;
> +        int disk = 0;
> +        unsigned long idx;
> +
> +        /* Find an unoccupied device name */

Not sure this is going to work is it? What happens if 'hda' or 'sda', or 
'd0' exists? I think you need to use the core of the code in 
xen_block_set_vdev() to generate names and search all possible encodings 
for each disk.

   Paul

> +        while (disk < (1 << 20)) {
> +            if (disk < (1 << 4)) {
> +                idx = (202 << 8) | (disk << 4);
> +            } else {
> +                idx = (1 << 28) | (disk << 8);
> +            }
> +            snprintf(fe_path, sizeof(fe_path),
> +                     "/local/domain/%u/device/vbd/%lu",
> +                     xendev->frontend_id, idx);
> +            value = qemu_xen_xs_read(xenbus->xsh, XBT_NULL, fe_path, NULL);
> +            if (!value) {
> +                if (errno == ENOENT) {
> +                    vdev->type = XEN_BLOCK_VDEV_TYPE_XVD;
> +                    vdev->partition = 0;
> +                    vdev->disk = disk;
> +                    vdev->number = idx;
> +                    goto found;
> +                }
> +                error_setg(errp, "cannot read %s: %s", fe_path,
> +                           strerror(errno));
> +                return NULL;
> +            }
> +            free(value);
> +            disk++;
> +        }
> +        error_setg(errp, "cannot find device vdev for block device");
> +        return NULL;
> +    }
> + found:
>       return g_strdup_printf("%lu", vdev->number);
>   }
>   
> diff --git a/hw/xen/xen_devconfig.c b/hw/xen/xen_devconfig.c
> index 9b7304e544..3f77c675c6 100644
> --- a/hw/xen/xen_devconfig.c
> +++ b/hw/xen/xen_devconfig.c
> @@ -46,34 +46,6 @@ static int xen_config_dev_all(char *fe, char *be)
>   
>   /* ------------------------------------------------------------- */
>   
> -int xen_config_dev_blk(DriveInfo *disk)
> -{
> -    char fe[256], be[256], device_name[32];
> -    int vdev = 202 * 256 + 16 * disk->unit;
> -    int cdrom = disk->media_cd;
> -    const char *devtype = cdrom ? "cdrom" : "disk";
> -    const char *mode    = cdrom ? "r"     : "w";
> -    const char *filename = qemu_opt_get(disk->opts, "file");
> -
> -    snprintf(device_name, sizeof(device_name), "xvd%c", 'a' + disk->unit);
> -    xen_pv_printf(NULL, 1, "config disk %d [%s]: %s\n",
> -                  disk->unit, device_name, filename);
> -    xen_config_dev_dirs("vbd", "qdisk", vdev, fe, be, sizeof(fe));
> -
> -    /* frontend */
> -    xenstore_write_int(fe, "virtual-device",  vdev);
> -    xenstore_write_str(fe, "device-type",     devtype);
> -
> -    /* backend */
> -    xenstore_write_str(be, "dev",             device_name);
> -    xenstore_write_str(be, "type",            "file");
> -    xenstore_write_str(be, "params",          filename);
> -    xenstore_write_str(be, "mode",            mode);
> -
> -    /* common stuff */
> -    return xen_config_dev_all(fe, be);
> -}
> -
>   int xen_config_dev_nic(NICInfo *nic)
>   {
>       char fe[256], be[256];
> diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
> index 17cda5ec13..1533f5dfb4 100644
> --- a/hw/xenpv/xen_machine_pv.c
> +++ b/hw/xenpv/xen_machine_pv.c
> @@ -32,7 +32,6 @@
>   
>   static void xen_init_pv(MachineState *machine)
>   {
> -    DriveInfo *dinfo;
>       int i;
>   
>       setup_xen_backend_ops();
> @@ -64,14 +63,6 @@ static void xen_init_pv(MachineState *machine)
>           vga_interface_created = true;
>       }
>   
> -    /* configure disks */
> -    for (i = 0; i < 16; i++) {
> -        dinfo = drive_get(IF_XEN, 0, i);
> -        if (!dinfo)
> -            continue;
> -        xen_config_dev_blk(dinfo);
> -    }
> -
>       /* configure nics */
>       for (i = 0; i < nb_nics; i++) {
>           if (!nd_table[i].model || 0 != strcmp(nd_table[i].model, "xen"))
> diff --git a/include/hw/xen/xen-legacy-backend.h b/include/hw/xen/xen-legacy-backend.h
> index 6c307c5f2c..fc42146bc2 100644
> --- a/include/hw/xen/xen-legacy-backend.h
> +++ b/include/hw/xen/xen-legacy-backend.h
> @@ -81,7 +81,6 @@ extern struct XenDevOps xen_usb_ops;          /* xen-usb.c         */
>   
>   /* configuration (aka xenbus setup) */
>   void xen_config_cleanup(void);
> -int xen_config_dev_blk(DriveInfo *disk);
>   int xen_config_dev_nic(NICInfo *nic);
>   int xen_config_dev_vfb(int vdev, const char *type);
>   int xen_config_dev_vkbd(int vdev);


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 14/28] hw/xen: add get_frontend_path() method to XenDeviceClass
  2023-10-25 14:50 ` [PATCH v3 14/28] hw/xen: add get_frontend_path() method to XenDeviceClass David Woodhouse
@ 2023-10-27  7:31   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  7:31 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The primary Xen console is special. The guest's side is set up for it by
> the toolstack automatically and not by the standard PV init sequence.
> 
> Accordingly, its *frontend* doesn't appear in …/device/console/0 either;
> instead it appears under …/console in the guest's XenStore node.
> 
> To allow the Xen console driver to override the frontend path for the
> primary console, add a method to the XenDeviceClass which can be used
> instead of the standard xen_device_get_frontend_path()
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/xen/xen-bus.c         | 11 ++++++++++-
>   include/hw/xen/xen-bus.h |  2 ++
>   2 files changed, 12 insertions(+), 1 deletion(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 17/28] hw/xen: add support for Xen primary console in emulated mode
  2023-10-25 14:50 ` [PATCH v3 17/28] hw/xen: add support for Xen primary console in emulated mode David Woodhouse
@ 2023-10-27  7:44   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  7:44 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The primary console is special because the toolstack maps a page into
> the guest for its ring, and also allocates the guest-side event channel.
> The guest's grant table is even primed to export that page using a known
> grant ref#. Add support for all that in emulated mode, so that we can
> have a primary console.
> 
> For reasons unclear, the backends running under real Xen don't just use
> a mapping of the well-known GNTTAB_RESERVED_CONSOLE grant ref (which
> would also be in the ring-ref node in XenStore). Instead, the toolstack
> sets the ring-ref node of the primary console to the GFN of the guest
> page. The backend is expected to handle that special case and map it
> with foreignmem operations instead.
> 
> We don't have an implementation of foreignmem ops for emulated Xen mode,
> so just make it map GNTTAB_RESERVED_CONSOLE instead. This would probably
> work for real Xen too, but we can't work out how to make real Xen create
> a primary console of type "ioemu" to make QEMU drive it, so we can't
> test that; might as well leave it as it is for now under Xen.
> 
> Now at last we can boot the Xen PV shim and run PV kernels in QEMU.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/char/xen_console.c             |  78 ++++++++----
>   hw/i386/kvm/meson.build           |   1 +
>   hw/i386/kvm/trace-events          |   2 +
>   hw/i386/kvm/xen-stubs.c           |   8 ++
>   hw/i386/kvm/xen_gnttab.c          |   7 +-
>   hw/i386/kvm/xen_primary_console.c | 193 ++++++++++++++++++++++++++++++
>   hw/i386/kvm/xen_primary_console.h |  23 ++++
>   hw/i386/kvm/xen_xenstore.c        |  10 ++
>   hw/xen/xen-bus.c                  |   5 +
>   include/hw/xen/xen-bus.h          |   1 +
>   target/i386/kvm/xen-emu.c         |  23 +++-
>   11 files changed, 328 insertions(+), 23 deletions(-)
>   create mode 100644 hw/i386/kvm/xen_primary_console.c
>   create mode 100644 hw/i386/kvm/xen_primary_console.h
> 

Reviewed-by: Paul Durrant <paul@xen.org>



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 18/28] hw/xen: only remove peers of PCI NICs on unplug
  2023-10-25 14:50 ` [PATCH v3 18/28] hw/xen: only remove peers of PCI NICs on unplug David Woodhouse
@ 2023-10-27  8:29   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  8:29 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> When the Xen guest asks to unplug *emulated* NICs, it's kind of unhelpful
> also to unplug the peer of the *Xen* PV NIC.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/i386/xen/xen_platform.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 19/28] hw/xen: update Xen PV NIC to XenDevice model
  2023-10-25 14:50 ` [PATCH v3 19/28] hw/xen: update Xen PV NIC to XenDevice model David Woodhouse
@ 2023-10-27  8:42   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  8:42 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> This allows us to use Xen PV networking with emulated Xen guests, and to
> add them on the command line or hotplug.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/net/meson.build        |   2 +-
>   hw/net/trace-events       |  11 +
>   hw/net/xen_nic.c          | 484 +++++++++++++++++++++++++++++---------
>   hw/xenpv/xen_machine_pv.c |   1 -
>   4 files changed, 381 insertions(+), 117 deletions(-)
> 
> diff --git a/hw/net/meson.build b/hw/net/meson.build
> index 2632634df3..f64651c467 100644
> --- a/hw/net/meson.build
> +++ b/hw/net/meson.build
> @@ -1,5 +1,5 @@
>   system_ss.add(when: 'CONFIG_DP8393X', if_true: files('dp8393x.c'))
> -system_ss.add(when: 'CONFIG_XEN', if_true: files('xen_nic.c'))
> +system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen_nic.c'))
>   system_ss.add(when: 'CONFIG_NE2000_COMMON', if_true: files('ne2000.c'))
>   
>   # PCI network cards
> diff --git a/hw/net/trace-events b/hw/net/trace-events
> index 3abfd65e5b..3097742cc0 100644
> --- a/hw/net/trace-events
> +++ b/hw/net/trace-events
> @@ -482,3 +482,14 @@ dp8393x_receive_oversize(int size) "oversize packet, pkt_size is %d"
>   dp8393x_receive_not_netcard(void) "packet not for netcard"
>   dp8393x_receive_packet(int crba) "Receive packet at 0x%"PRIx32
>   dp8393x_receive_write_status(int crba) "Write status at 0x%"PRIx32
> +
> +# xen_nic.c
> +xen_netdev_realize(int dev, const char *info, const char *peer) "vif%u info '%s' peer '%s'"
> +xen_netdev_unrealize(int dev) "vif%u"
> +xen_netdev_create(int dev) "vif%u"
> +xen_netdev_destroy(int dev) "vif%u"
> +xen_netdev_disconnect(int dev) "vif%u"
> +xen_netdev_connect(int dev, unsigned int tx, unsigned int rx, int port) "vif%u tx %u rx %u port %u"
> +xen_netdev_frontend_changed(const char *dev, int state) "vif%s state %d"
> +xen_netdev_tx(int dev, int ref, int off, int len, unsigned int flags, const char *c, const char *d, const char *m, const char *e) "vif%u ref %u off %u len %u flags 0x%x%s%s%s%s"
> +xen_netdev_rx(int dev, int idx, int status, int flags) "vif%u idx %d status %d flags 0x%x"
> diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
> index 9bbf6599fc..af4ba3f1e6 100644
> --- a/hw/net/xen_nic.c
> +++ b/hw/net/xen_nic.c
> @@ -20,6 +20,13 @@
>    */
>   
>   #include "qemu/osdep.h"
> +#include "qemu/main-loop.h"
> +#include "qemu/cutils.h"
> +#include "qemu/log.h"
> +#include "qemu/qemu-print.h"
> +#include "qapi/qmp/qdict.h"
> +#include "qapi/error.h"
> +
>   #include <sys/socket.h>
>   #include <sys/ioctl.h>
>   #include <sys/wait.h>
> @@ -27,18 +34,26 @@
>   #include "net/net.h"
>   #include "net/checksum.h"
>   #include "net/util.h"
> -#include "hw/xen/xen-legacy-backend.h"
> +
> +#include "hw/xen/xen-backend.h"
> +#include "hw/xen/xen-bus-helper.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/qdev-properties-system.h"
>   
>   #include "hw/xen/interface/io/netif.h"
> +#include "hw/xen/interface/io/xs_wire.h"
> +
> +#include "trace.h"
>   
>   /* ------------------------------------------------------------- */
>   
>   struct XenNetDev {
> -    struct XenLegacyDevice      xendev;  /* must be first */
> -    char                  *mac;
> +    struct XenDevice      xendev;  /* must be first */
> +    XenEventChannel       *event_channel;
> +    int                   dev;
>       int                   tx_work;
> -    int                   tx_ring_ref;
> -    int                   rx_ring_ref;
> +    unsigned int          tx_ring_ref;
> +    unsigned int          rx_ring_ref;
>       struct netif_tx_sring *txs;
>       struct netif_rx_sring *rxs;
>       netif_tx_back_ring_t  tx_ring;
> @@ -47,6 +62,11 @@ struct XenNetDev {
>       NICState              *nic;
>   };
>   
> +typedef struct XenNetDev XenNetDev;
> +
> +#define TYPE_XEN_NET_DEVICE "xen-net-device"
> +OBJECT_DECLARE_SIMPLE_TYPE(XenNetDev, XEN_NET_DEVICE)
> +
>   /* ------------------------------------------------------------- */
>   
>   static void net_tx_response(struct XenNetDev *netdev, netif_tx_request_t *txp, int8_t st)
> @@ -68,7 +88,8 @@ static void net_tx_response(struct XenNetDev *netdev, netif_tx_request_t *txp, i
>       netdev->tx_ring.rsp_prod_pvt = ++i;
>       RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->tx_ring, notify);
>       if (notify) {
> -        xen_pv_send_notify(&netdev->xendev);
> +        xen_device_notify_event_channel(XEN_DEVICE(netdev),
> +                                        netdev->event_channel, NULL);
>       }
>   
>       if (i == netdev->tx_ring.req_cons) {
> @@ -104,13 +125,16 @@ static void net_tx_error(struct XenNetDev *netdev, netif_tx_request_t *txp, RING
>   #endif
>   }
>   
> -static void net_tx_packets(struct XenNetDev *netdev)
> +static bool net_tx_packets(struct XenNetDev *netdev)
>   {
> +    bool done_something = false;
>       netif_tx_request_t txreq;
>       RING_IDX rc, rp;
>       void *page;
>       void *tmpbuf = NULL;
>   
> +    assert(qemu_mutex_iothread_locked());
> +
>       for (;;) {
>           rc = netdev->tx_ring.req_cons;
>           rp = netdev->tx_ring.sring->req_prod;
> @@ -122,49 +146,52 @@ static void net_tx_packets(struct XenNetDev *netdev)
>               }
>               memcpy(&txreq, RING_GET_REQUEST(&netdev->tx_ring, rc), sizeof(txreq));
>               netdev->tx_ring.req_cons = ++rc;
> +            done_something = true;
>   
>   #if 1
>               /* should not happen in theory, we don't announce the *
>                * feature-{sg,gso,whatelse} flags in xenstore (yet?) */
>               if (txreq.flags & NETTXF_extra_info) {
> -                xen_pv_printf(&netdev->xendev, 0, "FIXME: extra info flag\n");
> +                qemu_log_mask(LOG_UNIMP, "vif%u: FIXME: extra info flag\n",
> +                              netdev->dev);
>                   net_tx_error(netdev, &txreq, rc);
>                   continue;
>               }
>               if (txreq.flags & NETTXF_more_data) {
> -                xen_pv_printf(&netdev->xendev, 0, "FIXME: more data flag\n");
> +                qemu_log_mask(LOG_UNIMP, "vif%u: FIXME: more data flag\n",
> +                              netdev->dev);
>                   net_tx_error(netdev, &txreq, rc);
>                   continue;
>               }
>   #endif

I know that this is just translation but the fact the above code is 
there indicates that you're likely to see problems here. Not supporting 
extra_info is likely ok as long as RSS, TSO or multicast filtering is 
not advertized (as the comment says). But lack of support for more_data 
basically means your frontend needs to send all packets in a single frag 
(e.g. no header split) so that might bite pretty quickly. I guess it 
goes to show that no-one has used this code in many many years.

The translation looks ok to me though so...

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 20/28] net: do not delete nics in net_cleanup()
  2023-10-25 14:50 ` [PATCH v3 20/28] net: do not delete nics in net_cleanup() David Woodhouse
@ 2023-10-27  8:44   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  8:44 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> In net_cleanup() we only need to delete the netdevs, as those may have
> state which outlives Qemu when it exits, and thus may actually need to
> be cleaned up on exit.
> 
> The nics, on the other hand, are owned by the device which created them.
> Most devices don't bother to clean up on exit because they don't have
> any state which will outlive Qemu... but XenBus devices do need to clean
> up their nodes in XenStore, and do have an exit handler to delete them.
> 
> When the XenBus exit handler destroys the xen-net-device, it attempts
> to delete its nic after net_cleanup() had already done so. And crashes.
> 
> Fix this by only deleting netdevs as we walk the list. As the comment
> notes, we can't use QTAILQ_FOREACH_SAFE() as each deletion may remove
> *multiple* entries, including the "safely" saved 'next' pointer. But
> we can store the *previous* entry, since nics are safe.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   net/net.c | 28 ++++++++++++++++++++++------
>   1 file changed, 22 insertions(+), 6 deletions(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/28] hw/xen: automatically assign device index to block devices
  2023-10-27  7:30   ` Durrant, Paul
@ 2023-10-27  8:45     ` David Woodhouse
  2023-10-27  9:01       ` Durrant, Paul
  0 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-27  8:45 UTC (permalink / raw)
  To: paul, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Marc-André Lureau, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Richard Henderson, Eduardo Habkost, Jason Wang,
	Marcelo Tosatti, qemu-block, xen-devel, kvm, Bernhard Beschow,
	Joel Upham

[-- Attachment #1: Type: text/plain, Size: 1193 bytes --]

On Fri, 2023-10-27 at 08:30 +0100, Durrant, Paul wrote:
> 
> > +    if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) {
> > +        XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
> > +        char fe_path[XENSTORE_ABS_PATH_MAX + 1];
> > +        char *value;
> > +        int disk = 0;
> > +        unsigned long idx;
> > +
> > +        /* Find an unoccupied device name */
> 
> Not sure this is going to work is it? What happens if 'hda' or 'sda', or 
> 'd0' exists? I think you need to use the core of the code in 
> xen_block_set_vdev() to generate names and search all possible encodings 
> for each disk.

Do we care? You're allowed to have *all* of "hda", "sda" and "xvda" at
the same time. If a user explicitly provides "sda" and then provides
another disk without giving it a name, we're allowed to use "xvda".

Hell, you can also have *separate* backing stores provided as "hda1",
"sda1" and "xvda1". I *might* have tolerated a heckle that this
function should check for at least the latter of those, but when I was
first coding it up I was more inclined to argue "Don't Do That Then".

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/28] hw/xen: automatically assign device index to block devices
  2023-10-27  8:45     ` David Woodhouse
@ 2023-10-27  9:01       ` Durrant, Paul
  2023-10-27 10:25         ` David Woodhouse
  0 siblings, 1 reply; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:01 UTC (permalink / raw)
  To: David Woodhouse, paul, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Marc-André Lureau, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Richard Henderson, Eduardo Habkost, Jason Wang,
	Marcelo Tosatti, qemu-block, xen-devel, kvm, Bernhard Beschow,
	Joel Upham

On 27/10/2023 09:45, David Woodhouse wrote:
> On Fri, 2023-10-27 at 08:30 +0100, Durrant, Paul wrote:
>>
>>> +    if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) {
>>> +        XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
>>> +        char fe_path[XENSTORE_ABS_PATH_MAX + 1];
>>> +        char *value;
>>> +        int disk = 0;
>>> +        unsigned long idx;
>>> +
>>> +        /* Find an unoccupied device name */
>>
>> Not sure this is going to work is it? What happens if 'hda' or 'sda', or
>> 'd0' exists? I think you need to use the core of the code in
>> xen_block_set_vdev() to generate names and search all possible encodings
>> for each disk.
> 
> Do we care? You're allowed to have *all* of "hda", "sda" and "xvda" at
> the same time. If a user explicitly provides "sda" and then provides
> another disk without giving it a name, we're allowed to use "xvda".
> 

Maybe sda and xvda can co-exist, but

https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/man/xen-vbd-interface.7.pandoc;h=ba0d159dfa7eaf359922583ccd6d2b413acddb13;hb=HEAD#l125

says that you'll likely run into trouble if hda exists and you happen to 
create xvda.

> Hell, you can also have *separate* backing stores provided as "hda1",
> "sda1" and "xvda1". I *might* have tolerated a heckle that this
> function should check for at least the latter of those, but when I was
> first coding it up I was more inclined to argue "Don't Do That Then".

This code is allocating a name automatically so I think the onus is on 
it not create a needless clash which is likely to have unpredictable 
results depending on what the guest is. Just avoid any aliasing in the 
first place and things will be fine.

   Paul


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 21/28] xen-platform: unplug AHCI disks
  2023-10-25 14:50 ` [PATCH v3 21/28] xen-platform: unplug AHCI disks David Woodhouse
@ 2023-10-27  9:08   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:08 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> To support Xen guests using the Q35 chipset, the unplug protocol needs
> to also remove AHCI disks.
> 
> Make pci_xen_ide_unplug() more generic, iterating over the children
> of the PCI device and destroying the "ide-hd" devices. That works the
> same for both AHCI and IDE, as does the detection of the primary disk
> as unit 0 on the bus named "ide.0".
> 
> Then pci_xen_ide_unplug() can be used for both AHCI and IDE devices.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/i386/xen/xen_platform.c | 68 +++++++++++++++++++++++++-------------
>   1 file changed, 45 insertions(+), 23 deletions(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 22/28] net: add qemu_{configure,create}_nic_device(), qemu_find_nic_info()
  2023-10-25 14:50   ` [PATCH v3 22/28] net: add qemu_{configure, create}_nic_device(), qemu_find_nic_info() David Woodhouse
  (?)
@ 2023-10-27  9:25   ` Durrant, Paul
  -1 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:25 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Most code which directly accesses nd_table[] and nb_nics uses them for
> one of two things. Either "I have created a NIC device and I'd like a
> configuration for it", or "I will create a NIC device *if* there is a
> configuration for it".  With some variants on the theme around whether
> they actually *check* if the model specified in the configuration is
> the right one.
> 
> Provide functions which perform both of those, allowing platforms to
> be a little more consistent and as a step towards making nd_table[]
> and nb_nics private to the net code.
> 
> Also export the qemu_find_nic_info() helper, as some platforms have
> special cases they need to handle.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   include/net/net.h |  7 ++++++-
>   net/net.c         | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 57 insertions(+), 1 deletion(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 23/28] net: report list of available models according to platform
  2023-10-25 14:50 ` [PATCH v3 23/28] net: report list of available models according to platform David Woodhouse
@ 2023-10-27  9:31   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:31 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> By noting the models for which a configuration was requested, we can give
> the user an accurate list of which NIC models were actually available on
> the platform/configuration that was otherwise chosen.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   net/net.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 94 insertions(+)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 24/28] net: add qemu_create_nic_bus_devices()
  2023-10-25 14:50 ` [PATCH v3 24/28] net: add qemu_create_nic_bus_devices() David Woodhouse
@ 2023-10-27  9:42   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:42 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> This will instantiate any NICs which live on a given bus type. Each bus
> is allowed *one* substitution (for PCI it's virtio → virtio-net-pci, for
> Xen it's xen → xen-net-device; no point in overengineering it unless we
> actually want more).
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   include/net/net.h |  3 +++
>   net/net.c         | 53 +++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 56 insertions(+)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 25/28] hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot()
  2023-10-25 14:50 ` [PATCH v3 25/28] hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot() David Woodhouse
@ 2023-10-27  9:46   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:46 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The loop over nd_table[] to add PCI NICs is repeated in quite a few
> places. Add a helper function to do it.
> 
> Some platforms also try to instantiate a specific model in a specific
> slot, to match the real hardware. Add pci_init_nic_in_slot() for that
> purpose.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/pci/pci.c         | 45 ++++++++++++++++++++++++++++++++++++++++++++
>   include/hw/pci/pci.h |  4 +++-
>   2 files changed, 48 insertions(+), 1 deletion(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 26/28] hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices()
  2023-10-25 14:50 ` [PATCH v3 26/28] hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices() David Woodhouse
@ 2023-10-27  9:48   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:48 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Eliminate direct access to nd_table[] and nb_nics by processing the the
> ISA NICs first and then calling pci_init_nic_devices() for the test.
> 
> It's important to do this *before* the subsequent patch which registers
> the Xen PV network devices, because the code being remove here didn't
> check whether nd->instantiated was already set before using each entry.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/i386/pc.c                | 21 +++++++++++----------
>   include/hw/net/ne2000-isa.h |  2 --
>   2 files changed, 11 insertions(+), 12 deletions(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 27/28] hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs
  2023-10-25 14:50 ` [PATCH v3 27/28] hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs David Woodhouse
@ 2023-10-27  9:52   ` Durrant, Paul
  0 siblings, 0 replies; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27  9:52 UTC (permalink / raw)
  To: David Woodhouse, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Paul Durrant, Marc-André Lureau, Paolo Bonzini,
	Michael S. Tsirkin, Marcel Apfelbaum, Richard Henderson,
	Eduardo Habkost, Jason Wang, Marcelo Tosatti, qemu-block,
	xen-devel, kvm, Bernhard Beschow, Joel Upham

On 25/10/2023 15:50, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> When instantiating XenBus itself, for each NIC which is configured with
> either the model unspecified, or set to to "xen" or "xen-net-device",
> create a corresponding xen-net-device for it.
> 
> Now we can launch emulated Xen guests with '-nic user', and this fixes
> the setup for Xen PV guests, which was previously broken in various
> ways and never actually managed to peer with the netdev.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/xen/xen-bus.c                    |  4 ++++
>   hw/xen/xen_devconfig.c              | 25 -------------------------
>   hw/xenpv/xen_machine_pv.c           |  9 ---------
>   include/hw/xen/xen-legacy-backend.h |  1 -
>   4 files changed, 4 insertions(+), 35 deletions(-)
> 

Yay! I've been wanting this for years but ETIME.

Reviewed-by: Paul Durrant <paul@xen.org>



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/28] hw/xen: automatically assign device index to block devices
  2023-10-27  9:01       ` Durrant, Paul
@ 2023-10-27 10:25         ` David Woodhouse
  2023-10-27 10:32           ` Durrant, Paul
  0 siblings, 1 reply; 59+ messages in thread
From: David Woodhouse @ 2023-10-27 10:25 UTC (permalink / raw)
  To: paul, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Marc-André Lureau, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Richard Henderson, Eduardo Habkost, Jason Wang,
	Marcelo Tosatti, qemu-block, xen-devel, kvm, Bernhard Beschow,
	Joel Upham

[-- Attachment #1: Type: text/plain, Size: 636 bytes --]

On Fri, 2023-10-27 at 10:01 +0100, Durrant, Paul wrote:
> 
> This code is allocating a name automatically so I think the onus is on 
> it not create a needless clash which is likely to have unpredictable 
> results depending on what the guest is. Just avoid any aliasing in the 
> first place and things will be fine.


Yeah, fair enough. In which case I'll probably switch to using
xs_directory() and then processing those results to find a free slot,
instead of going out to XenStore for every existence check.

This isn't exactly fast path and I'm prepared to tolerate a little bit
of O(n²), but only within reason :)

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/28] hw/xen: automatically assign device index to block devices
  2023-10-27 10:25         ` David Woodhouse
@ 2023-10-27 10:32           ` Durrant, Paul
  2023-10-27 12:02             ` David Woodhouse
  0 siblings, 1 reply; 59+ messages in thread
From: Durrant, Paul @ 2023-10-27 10:32 UTC (permalink / raw)
  To: David Woodhouse, paul, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Marc-André Lureau, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Richard Henderson, Eduardo Habkost, Jason Wang,
	Marcelo Tosatti, qemu-block, xen-devel, kvm, Bernhard Beschow,
	Joel Upham

On 27/10/2023 11:25, David Woodhouse wrote:
> On Fri, 2023-10-27 at 10:01 +0100, Durrant, Paul wrote:
>>
>> This code is allocating a name automatically so I think the onus is on
>> it not create a needless clash which is likely to have unpredictable
>> results depending on what the guest is. Just avoid any aliasing in the
>> first place and things will be fine.
> 
> 
> Yeah, fair enough. In which case I'll probably switch to using
> xs_directory() and then processing those results to find a free slot,
> instead of going out to XenStore for every existence check.
> 
> This isn't exactly fast path and I'm prepared to tolerate a little bit
> of O(n²), but only within reason :)

Yes, doing an xs_directory() and then using the code 
xen_block_get_vdev() to populate a list of existent disks will be neater.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 13/28] hw/xen: automatically assign device index to block devices
  2023-10-27 10:32           ` Durrant, Paul
@ 2023-10-27 12:02             ` David Woodhouse
  0 siblings, 0 replies; 59+ messages in thread
From: David Woodhouse @ 2023-10-27 12:02 UTC (permalink / raw)
  To: paul, qemu-devel
  Cc: Kevin Wolf, Hanna Reitz, Stefano Stabellini, Anthony Perard,
	Marc-André Lureau, Paolo Bonzini, Michael S. Tsirkin,
	Marcel Apfelbaum, Richard Henderson, Eduardo Habkost, Jason Wang,
	Marcelo Tosatti, qemu-block, xen-devel, kvm, Bernhard Beschow,
	Joel Upham

[-- Attachment #1: Type: text/plain, Size: 7459 bytes --]

On Fri, 2023-10-27 at 11:32 +0100, Durrant, Paul wrote:
> On 27/10/2023 11:25, David Woodhouse wrote:
> > On Fri, 2023-10-27 at 10:01 +0100, Durrant, Paul wrote:
> > > 
> > > This code is allocating a name automatically so I think the onus is on
> > > it not create a needless clash which is likely to have unpredictable
> > > results depending on what the guest is. Just avoid any aliasing in the
> > > first place and things will be fine.
> > 
> > 
> > Yeah, fair enough. In which case I'll probably switch to using
> > xs_directory() and then processing those results to find a free slot,
> > instead of going out to XenStore for every existence check.
> > 
> > This isn't exactly fast path and I'm prepared to tolerate a little bit
> > of O(n²), but only within reason :)
> 
> Yes, doing an xs_directory() and then using the code 
> xen_block_get_vdev() to populate a list of existent disks will be neater.

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 5011fe9430..9b7d7ef7e1 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -30,48 +30,105 @@
 #include "hw/xen/interface/io/xs_wire.h"
 #include "trace.h"
 
-static char *xen_block_get_name(XenDevice *xendev, Error **errp)
+#define XVDA_MAJOR 202
+#define XVDQ_MAJOR (1<<20)
+#define XVDBGQCV_MAJOR ((1<<21) - 1)
+#define HDA_MAJOR 3
+#define HDC_MAJOR 22
+#define SDA_MAJOR 8
+
+/*
+ * This is fairly arbitrary just to avoid a stupidly sized bitmap, but Linux
+ * as of v6.6 only supports up to /dev/xvdfan (disk 4095) anyway.
+ */
+#define MAX_AUTO_VDEV 4096
+
+static int vdev_to_diskno(unsigned int vdev_nr)
 {
-    XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
+    switch (vdev_nr >> 8) {
+    case XVDA_MAJOR:
+    case SDA_MAJOR:
+        return (vdev_nr >> 4) & 0x15;
+
+    case HDA_MAJOR:
+        return (vdev_nr >> 6) & 1;
+
+    case HDC_MAJOR:
+        return ((vdev_nr >> 6) & 1) + 2;
+
+    case XVDQ_MAJOR ... XVDBGQCV_MAJOR:
+        return (vdev_nr >> 8) & 0xfffff;
+
+    default:
+        return -1;
+    }
+}
+
+static bool xen_block_find_free_vdev(XenBlockDevice *blockdev, Error **errp)
+{
+    XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(blockdev)));
+    unsigned long used_devs[BITS_TO_LONGS(MAX_AUTO_VDEV)];
     XenBlockVdev *vdev = &blockdev->props.vdev;
+    char fe_path[XENSTORE_ABS_PATH_MAX + 1];
+    char **existing_frontends;
+    unsigned int nr_existing = 0;
+    unsigned int vdev_nr;
+    int i, disk = 0;
+
+    snprintf(fe_path, sizeof(fe_path), "/local/domain/%u/device/vbd",
+             blockdev->xendev.frontend_id);
+
+    existing_frontends = qemu_xen_xs_directory(xenbus->xsh, XBT_NULL, fe_path,
+                                               &nr_existing);
+    if (!existing_frontends && errno != ENOENT) {
+        error_setg_errno(errp, errno, "cannot read %s", fe_path);
+        return false;
+    }
 
-    if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) {
-        XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
-        char fe_path[XENSTORE_ABS_PATH_MAX + 1];
-        char *value;
-        int disk = 0;
-        unsigned long idx;
-
-        /* Find an unoccupied device name */
-        while (disk < (1 << 20)) {
-            if (disk < (1 << 4)) {
-                idx = (202 << 8) | (disk << 4);
-            } else {
-                idx = (1 << 28) | (disk << 8);
-            }
-            snprintf(fe_path, sizeof(fe_path),
-                     "/local/domain/%u/device/vbd/%lu",
-                     xendev->frontend_id, idx);
-            value = qemu_xen_xs_read(xenbus->xsh, XBT_NULL, fe_path, NULL);
-            if (!value) {
-                if (errno == ENOENT) {
-                    vdev->type = XEN_BLOCK_VDEV_TYPE_XVD;
-                    vdev->partition = 0;
-                    vdev->disk = disk;
-                    vdev->number = idx;
-                    goto found;
-                }
-                error_setg(errp, "cannot read %s: %s", fe_path,
-                           strerror(errno));
-                return NULL;
-            }
-            free(value);
-            disk++;
+    memset(used_devs, 0, sizeof(used_devs));
+    for (i = 0; i < nr_existing; i++) {
+        if (qemu_strtoui(existing_frontends[i], NULL, 10, &vdev_nr)) {
+            free(existing_frontends[i]);
+            continue;
         }
+
+        free(existing_frontends[i]);
+
+        disk = vdev_to_diskno(vdev_nr);
+        if (disk < 0 || disk >= MAX_AUTO_VDEV) {
+            continue;
+        }
+
+        set_bit(disk, used_devs);
+    }
+    free(existing_frontends);
+
+    disk = find_first_zero_bit(used_devs, MAX_AUTO_VDEV);
+    if (disk == MAX_AUTO_VDEV) {
         error_setg(errp, "cannot find device vdev for block device");
+        return false;
+    }
+
+    vdev->type = XEN_BLOCK_VDEV_TYPE_XVD;
+    vdev->partition = 0;
+    vdev->disk = disk;
+    if (disk < (1 << 4)) {
+        vdev->number = (XVDA_MAJOR << 8) | (disk << 4);
+    } else {
+        vdev->number = (XVDQ_MAJOR << 8) | (disk << 8);
+    }
+    return true;
+}
+
+static char *xen_block_get_name(XenDevice *xendev, Error **errp)
+{
+    XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
+    XenBlockVdev *vdev = &blockdev->props.vdev;
+
+    if (vdev->type == XEN_BLOCK_VDEV_TYPE_INVALID &&
+        !xen_block_find_free_vdev(blockdev, errp)) {
         return NULL;
     }
- found:
     return g_strdup_printf("%lu", vdev->number);
 }
 
@@ -520,10 +577,10 @@ static void xen_block_set_vdev(Object *obj, Visitor *v, const char *name,
     case XEN_BLOCK_VDEV_TYPE_DP:
     case XEN_BLOCK_VDEV_TYPE_XVD:
         if (vdev->disk < (1 << 4) && vdev->partition < (1 << 4)) {
-            vdev->number = (202 << 8) | (vdev->disk << 4) |
+            vdev->number = (XVDA_MAJOR << 8) | (vdev->disk << 4) |
                 vdev->partition;
         } else if (vdev->disk < (1 << 20) && vdev->partition < (1 << 8)) {
-            vdev->number = (1 << 28) | (vdev->disk << 8) |
+            vdev->number = (XVDQ_MAJOR << 8) | (vdev->disk << 8) |
                 vdev->partition;
         } else {
             goto invalid;
@@ -533,10 +590,10 @@ static void xen_block_set_vdev(Object *obj, Visitor *v, const char *name,
     case XEN_BLOCK_VDEV_TYPE_HD:
         if ((vdev->disk == 0 || vdev->disk == 1) &&
             vdev->partition < (1 << 6)) {
-            vdev->number = (3 << 8) | (vdev->disk << 6) | vdev->partition;
+            vdev->number = (HDA_MAJOR << 8) | (vdev->disk << 6) | vdev->partition;
         } else if ((vdev->disk == 2 || vdev->disk == 3) &&
                    vdev->partition < (1 << 6)) {
-            vdev->number = (22 << 8) | ((vdev->disk - 2) << 6) |
+            vdev->number = (HDC_MAJOR << 8) | ((vdev->disk - 2) << 6) |
                 vdev->partition;
         } else {
             goto invalid;
@@ -545,7 +602,7 @@ static void xen_block_set_vdev(Object *obj, Visitor *v, const char *name,
 
     case XEN_BLOCK_VDEV_TYPE_SD:
         if (vdev->disk < (1 << 4) && vdev->partition < (1 << 4)) {
-            vdev->number = (8 << 8) | (vdev->disk << 4) | vdev->partition;
+            vdev->number = (SDA_MAJOR << 8) | (vdev->disk << 4) | vdev->partition;
         } else {
             goto invalid;
         }


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply related	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2023-10-27 12:03 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-25 14:50 [PATCH v3 00/28] Get Xen PV shim running in QEMU, add net & console David Woodhouse
2023-10-25 14:50 ` [PATCH v3 01/28] i386/xen: Don't advertise XENFEAT_supervisor_mode_kernel David Woodhouse
2023-10-25 14:50 ` [PATCH v3 02/28] i386/xen: fix per-vCPU upcall vector for Xen emulation David Woodhouse
2023-10-25 14:50 ` [PATCH v3 03/28] hw/xen: select kernel mode for per-vCPU event channel upcall vector David Woodhouse
2023-10-25 14:50 ` [PATCH v3 04/28] hw/xen: don't clear map_track[] in xen_gnttab_reset() David Woodhouse
2023-10-25 14:50 ` [PATCH v3 05/28] hw/xen: fix XenStore watch delivery to guest David Woodhouse
2023-10-27  7:15   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 06/28] hw/xen: take iothread mutex in xen_evtchn_reset_op() David Woodhouse
2023-10-27  7:20   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 07/28] hw/xen: use correct default protocol for xen-block on x86 David Woodhouse
2023-10-27  7:22   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 08/28] i386/xen: Ignore VCPU_SSHOTTMR_future flag in set_singleshot_timer() David Woodhouse
2023-10-27  7:23   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 09/28] hw/xen: Clean up event channel 'type_val' handling to use union David Woodhouse
2023-10-25 14:50 ` [PATCH v3 10/28] include: update Xen public headers to Xen 4.17.2 release David Woodhouse
2023-10-25 14:50 ` [PATCH v3 11/28] i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID David Woodhouse
2023-10-25 14:50 ` [PATCH v3 12/28] hw/xen: populate store frontend nodes with XenStore PFN/port David Woodhouse
2023-10-25 14:50 ` [PATCH v3 13/28] hw/xen: automatically assign device index to block devices David Woodhouse
2023-10-27  7:30   ` Durrant, Paul
2023-10-27  8:45     ` David Woodhouse
2023-10-27  9:01       ` Durrant, Paul
2023-10-27 10:25         ` David Woodhouse
2023-10-27 10:32           ` Durrant, Paul
2023-10-27 12:02             ` David Woodhouse
2023-10-25 14:50 ` [PATCH v3 14/28] hw/xen: add get_frontend_path() method to XenDeviceClass David Woodhouse
2023-10-27  7:31   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 15/28] hw/xen: do not repeatedly try to create a failing backend device David Woodhouse
2023-10-25 14:50 ` [PATCH v3 16/28] hw/xen: update Xen console to XenDevice model David Woodhouse
2023-10-25 14:50 ` [PATCH v3 17/28] hw/xen: add support for Xen primary console in emulated mode David Woodhouse
2023-10-27  7:44   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 18/28] hw/xen: only remove peers of PCI NICs on unplug David Woodhouse
2023-10-27  8:29   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 19/28] hw/xen: update Xen PV NIC to XenDevice model David Woodhouse
2023-10-27  8:42   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 20/28] net: do not delete nics in net_cleanup() David Woodhouse
2023-10-27  8:44   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 21/28] xen-platform: unplug AHCI disks David Woodhouse
2023-10-27  9:08   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 22/28] net: add qemu_{configure,create}_nic_device(), qemu_find_nic_info() David Woodhouse
2023-10-25 14:50   ` [PATCH v3 22/28] net: add qemu_{configure, create}_nic_device(), qemu_find_nic_info() David Woodhouse
2023-10-27  9:25   ` [PATCH v3 22/28] net: add qemu_{configure,create}_nic_device(), qemu_find_nic_info() Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 23/28] net: report list of available models according to platform David Woodhouse
2023-10-27  9:31   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 24/28] net: add qemu_create_nic_bus_devices() David Woodhouse
2023-10-27  9:42   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 25/28] hw/pci: add pci_init_nic_devices(), pci_init_nic_in_slot() David Woodhouse
2023-10-27  9:46   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 26/28] hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices() David Woodhouse
2023-10-27  9:48   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 27/28] hw/xen: use qemu_create_nic_bus_devices() to instantiate Xen NICs David Woodhouse
2023-10-27  9:52   ` Durrant, Paul
2023-10-25 14:50 ` [PATCH v3 28/28] docs: update Xen-on-KVM documentation David Woodhouse
2023-10-25 18:20   ` Eric Blake
2023-10-25 18:26     ` David Woodhouse
2023-10-25 18:56       ` Andrew Cooper
2023-10-25 19:02         ` David Woodhouse
2023-10-26  8:26         ` Kevin Wolf
2023-10-26  9:25           ` David Woodhouse
2023-10-26 16:25             ` David Woodhouse

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.