qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH qemu RFC 0/4] spapr: Kexec style boot
@ 2019-07-20  1:28 Alexey Kardashevskiy
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address Alexey Kardashevskiy
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-20  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexey Kardashevskiy, Michael Ellerman, Paul Mackerras,
	Paul Mackerras, qemu-ppc, David Gibson

This is an attempts to boot a pseries guest without a firmware.

The idea is to boot a VM with petitboot as a more powerful boot loader
and eliminate scanning phase of the SLOF booting process.

This provides environment without SLOF but with the device tree
and few modifications to already existing H_CAS and H_RTAS hypeprcalls.

This is made on top of these 2 patches:
spapr: Stop providing RTAS blob
spapr_pci: Advertise BAR reallocation capability

This requires a modified pseries kernel, posted separately
as "powerpc/pseries: Kexec style boot".

This also requires patched kexec-lite from the openpower build as
currently it requires linux,rtas-base and rtas-size properties in
the DT which we won't have in such environment.

1/1 is not necessary but having vmlinux at 0 helps debugging via
the qemu gdb stub easier.

The example command line is:
/home/aik/pbuild/qemu-killslof-localhost-ppc64/ppc64-softmmu/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline -nographic -vga none \
-enable-kvm \
-device nec-usb-xhci,id=nec-usb-xhci0 -m 4G \
-kernel /home/aik/pbuild/kernel-guest-nv2-le/vmlinux \
-initrd op-build/output/images/rootfs.cpio \
-machine pseries,bios=no,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken \
-snapshot \
-netdev "tap,id=TAP0,helper=/home/aik/qemu-bridge-helper --br=br0" \
-device \
"virtio-net-pci,id=vnet0,bus=pci.0,addr=1.0,mac=C0:41:49:4b:00:02,netdev=TAP0" \
-device virtio-scsi-pci,id=vscsi0 \
-drive \
id=DRIVE0,if=none,file=pbuild/__img/u1804-64le-killslof.qcow2,format=qcow2 \
-device scsi-disk,id=scsi-disk0,drive=DRIVE0 \
-smp 8,threads=8 \
-L /home/aik/t/qemu-ppc64-bios/ \
-trace events=qemu_trace_events -d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.mac02 \
-mon chardev=SOCKET0,mode=control



This is based on sha1
216965b20b04 Joel Stanley "ppc/pnv: update skiboot to v6.4".

Please comment. Thanks.



Alexey Kardashevskiy (4):
  spapr: Allow changing kernel loading address
  spapr: Allow bios-less configuration
  spapr: Advertise H_RTAS to the guest
  spapr: Implement SLOF-less client_architecture_support

 include/hw/ppc/spapr.h |   7 +++
 hw/ppc/spapr.c         | 118 ++++++++++++++++++++++++++++++++---------
 hw/ppc/spapr_hcall.c   |  49 +++++++++++++++--
 3 files changed, 146 insertions(+), 28 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address
  2019-07-20  1:28 [Qemu-devel] [PATCH qemu RFC 0/4] spapr: Kexec style boot Alexey Kardashevskiy
@ 2019-07-20  1:28 ` Alexey Kardashevskiy
  2019-07-23  3:49   ` David Gibson
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration Alexey Kardashevskiy
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-20  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, Alexey Kardashevskiy,
	David Gibson

Useful for the debugging purposes.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 include/hw/ppc/spapr.h |  1 +
 hw/ppc/spapr.c         | 33 +++++++++++++++++++++++++++------
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 74e427b588fc..ff82bb8554e1 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -159,6 +159,7 @@ struct SpaprMachineState {
     void *fdt_blob;
     long kernel_size;
     bool kernel_le;
+    uint64_t kernel_addr;
     uint32_t initrd_base;
     long initrd_size;
     uint64_t rtc_offset; /* Now used only during incoming migration */
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7fad42350538..6d13d65d8996 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1179,7 +1179,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
                           spapr->initrd_base + spapr->initrd_size));
 
     if (spapr->kernel_size) {
-        uint64_t kprop[2] = { cpu_to_be64(KERNEL_LOAD_ADDR),
+        uint64_t kprop[2] = { cpu_to_be64(spapr->kernel_addr),
                               cpu_to_be64(spapr->kernel_size) };
 
         _FDT(fdt_setprop(fdt, chosen, "qemu,boot-kernel",
@@ -1365,7 +1365,7 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
 
     /* Build memory reserve map */
     if (spapr->kernel_size) {
-        _FDT((fdt_add_mem_rsv(fdt, KERNEL_LOAD_ADDR, spapr->kernel_size)));
+        _FDT((fdt_add_mem_rsv(fdt, spapr->kernel_addr, spapr->kernel_size)));
     }
     if (spapr->initrd_size) {
         _FDT((fdt_add_mem_rsv(fdt, spapr->initrd_base, spapr->initrd_size)));
@@ -1391,7 +1391,8 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
 
 static uint64_t translate_kernel_address(void *opaque, uint64_t addr)
 {
-    return (addr & 0x0fffffff) + KERNEL_LOAD_ADDR;
+    SpaprMachineState *spapr = opaque;
+    return (addr & 0x0fffffff) + spapr->kernel_addr;
 }
 
 static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
@@ -2995,12 +2996,12 @@ static void spapr_machine_init(MachineState *machine)
         uint64_t lowaddr = 0;
 
         spapr->kernel_size = load_elf(kernel_filename, NULL,
-                                      translate_kernel_address, NULL,
+                                      translate_kernel_address, spapr,
                                       NULL, &lowaddr, NULL, 1,
                                       PPC_ELF_MACHINE, 0, 0);
         if (spapr->kernel_size == ELF_LOAD_WRONG_ENDIAN) {
             spapr->kernel_size = load_elf(kernel_filename, NULL,
-                                          translate_kernel_address, NULL, NULL,
+                                          translate_kernel_address, spapr, NULL,
                                           &lowaddr, NULL, 0, PPC_ELF_MACHINE,
                                           0, 0);
             spapr->kernel_le = spapr->kernel_size > 0;
@@ -3016,7 +3017,7 @@ static void spapr_machine_init(MachineState *machine)
             /* Try to locate the initrd in the gap between the kernel
              * and the firmware. Add a bit of space just in case
              */
-            spapr->initrd_base = (KERNEL_LOAD_ADDR + spapr->kernel_size
+            spapr->initrd_base = (spapr->kernel_addr + spapr->kernel_size
                                   + 0x1ffff) & ~0xffff;
             spapr->initrd_size = load_image_targphys(initrd_filename,
                                                      spapr->initrd_base,
@@ -3253,6 +3254,18 @@ static void spapr_set_vsmt(Object *obj, Visitor *v, const char *name,
     visit_type_uint32(v, name, (uint32_t *)opaque, errp);
 }
 
+static void spapr_get_kernel_addr(Object *obj, Visitor *v, const char *name,
+                                  void *opaque, Error **errp)
+{
+    visit_type_uint64(v, name, (uint64_t *)opaque, errp);
+}
+
+static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
+                                  void *opaque, Error **errp)
+{
+    visit_type_uint64(v, name, (uint64_t *)opaque, errp);
+}
+
 static char *spapr_get_ic_mode(Object *obj, Error **errp)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(obj);
@@ -3358,6 +3371,14 @@ static void spapr_instance_init(Object *obj)
     object_property_add_bool(obj, "vfio-no-msix-emulation",
                              spapr_get_msix_emulation, NULL, NULL);
 
+    object_property_add(obj, "kernel-addr", "uint64", spapr_get_kernel_addr,
+                        spapr_set_kernel_addr, NULL, &spapr->kernel_addr,
+                        &error_abort);
+    object_property_set_description(obj, "kernel-addr",
+                                    stringify(KERNEL_LOAD_ADDR)
+                                    " for -kernel is the default",
+                                    NULL);
+    spapr->kernel_addr = KERNEL_LOAD_ADDR;
     /* The machine class defines the default interrupt controller mode */
     spapr->irq = smc->irq;
     object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration
  2019-07-20  1:28 [Qemu-devel] [PATCH qemu RFC 0/4] spapr: Kexec style boot Alexey Kardashevskiy
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address Alexey Kardashevskiy
@ 2019-07-20  1:28 ` Alexey Kardashevskiy
  2019-07-23  3:52   ` David Gibson
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest Alexey Kardashevskiy
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support Alexey Kardashevskiy
  3 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-20  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, Alexey Kardashevskiy,
	David Gibson

The pseries kernel can do either usual prom-init boot or kexec style boot.
We always did the prom-init which relies on the completeness of
the device tree (for example, PCI BARs have to be assigned beforehand) and
the client interface; the system firmware (SLOF) implements this.

However we can use the kexec style boot as well. To do that, we can skip
loading SLOF and jump straight to the kernel. GPR5==0 (the client
interface entry point, SLOF passes a valid pointer there) tells Linux to
do the kexec boot rather than prom_init so it can proceed to the initramfs.
With few PCI fixes in the guest kernel, it can boot from PCI (via
petitboot, for example).

This adds a "bios" machine option which controls whether QEMU loads SLOF
or jumps directly to the kernel. When bios==off, this does not copy SLOF
and RTAS into the guest RAM and sets RTAS properties to 0 to bypass
the kexec user space tool which checks for their presence (not for
the values though).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 include/hw/ppc/spapr.h |  1 +
 hw/ppc/spapr.c         | 58 ++++++++++++++++++++++++++++++++----------
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ff82bb8554e1..7f5d7a70d27e 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -160,6 +160,7 @@ struct SpaprMachineState {
     long kernel_size;
     bool kernel_le;
     uint64_t kernel_addr;
+    bool bios_enabled;
     uint32_t initrd_base;
     long initrd_size;
     uint64_t rtc_offset; /* Now used only during incoming migration */
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6d13d65d8996..81ad6a6f28de 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1116,6 +1116,10 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
     _FDT(fdt_setprop(fdt, rtas, "ibm,lrdr-capacity",
                      lrdr_capacity, sizeof(lrdr_capacity)));
 
+    /* These are to make kexec-lite happy */
+    _FDT(fdt_setprop_cell(fdt, rtas, "linux,rtas-base", 0));
+    _FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", 0));
+
     spapr_dt_rtas_tokens(fdt, rtas);
 }
 
@@ -1814,7 +1818,11 @@ static void spapr_machine_reset(MachineState *machine)
     spapr->fdt_blob = fdt;
 
     /* Set up the entry state */
-    spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
+    if (!spapr->bios_enabled) {
+        spapr_cpu_set_entry_state(first_ppc_cpu, spapr->kernel_addr, fdt_addr);
+    } else {
+        spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
+    }
     first_ppc_cpu->env.gpr[5] = 0;
 
     spapr->cas_reboot = false;
@@ -3031,20 +3039,22 @@ static void spapr_machine_init(MachineState *machine)
         }
     }
 
-    if (bios_name == NULL) {
-        bios_name = FW_FILE_NAME;
+    if (spapr->bios_enabled) {
+        if (bios_name == NULL) {
+            bios_name = FW_FILE_NAME;
+        }
+        filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
+        if (!filename) {
+            error_report("Could not find LPAR firmware '%s'", bios_name);
+            exit(1);
+        }
+        fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
+        if (fw_size <= 0) {
+            error_report("Could not load LPAR firmware '%s'", filename);
+            exit(1);
+        }
+        g_free(filename);
     }
-    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
-    if (!filename) {
-        error_report("Could not find LPAR firmware '%s'", bios_name);
-        exit(1);
-    }
-    fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
-    if (fw_size <= 0) {
-        error_report("Could not load LPAR firmware '%s'", filename);
-        exit(1);
-    }
-    g_free(filename);
 
     /* FIXME: Should register things through the MachineState's qdev
      * interface, this is a legacy from the sPAPREnvironment structure
@@ -3266,6 +3276,20 @@ static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
     visit_type_uint64(v, name, (uint64_t *)opaque, errp);
 }
 
+static bool spapr_get_bios_enabled(Object *obj, Error **errp)
+{
+    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+    return spapr->bios_enabled;
+}
+
+static void spapr_set_bios_enabled(Object *obj, bool value, Error **errp)
+{
+    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+    spapr->bios_enabled = value;
+}
+
 static char *spapr_get_ic_mode(Object *obj, Error **errp)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(obj);
@@ -3379,6 +3403,12 @@ static void spapr_instance_init(Object *obj)
                                     " for -kernel is the default",
                                     NULL);
     spapr->kernel_addr = KERNEL_LOAD_ADDR;
+    object_property_add_bool(obj, "bios", spapr_get_bios_enabled,
+                            spapr_set_bios_enabled, NULL);
+    object_property_set_description(obj, "bios", "Conrols whether to load bios",
+                                    NULL);
+    spapr->bios_enabled = true;
+
     /* The machine class defines the default interrupt controller mode */
     spapr->irq = smc->irq;
     object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest
  2019-07-20  1:28 [Qemu-devel] [PATCH qemu RFC 0/4] spapr: Kexec style boot Alexey Kardashevskiy
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address Alexey Kardashevskiy
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration Alexey Kardashevskiy
@ 2019-07-20  1:28 ` Alexey Kardashevskiy
  2019-07-23  3:53   ` David Gibson
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support Alexey Kardashevskiy
  3 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-20  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, Alexey Kardashevskiy,
	David Gibson

Since day 1 QEMU implemented RTAS as a custom hypercall wrapped into
a small 20 bytes blob which guest would call to enter RTAS. Although
it works fine, it is still a separate binary image which requires signing
at no additional benefit.

This adds a flag into /chosen to tell a modified guest that if the flag
is there, it can call H_RTAS directly and avoid calling into the RTAS
blob.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 hw/ppc/spapr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 81ad6a6f28de..b097a99951f1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1230,6 +1230,9 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
         _FDT(fdt_setprop_cell(fdt, chosen, "linux,pci-probe-only", 0));
     }
 
+    /* We always implemented RTAS as hcall, tell guests to call it directly */
+    _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
+
     spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 
     g_free(stdout_path);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support
  2019-07-20  1:28 [Qemu-devel] [PATCH qemu RFC 0/4] spapr: Kexec style boot Alexey Kardashevskiy
                   ` (2 preceding siblings ...)
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest Alexey Kardashevskiy
@ 2019-07-20  1:28 ` Alexey Kardashevskiy
  2019-07-28  6:09   ` David Gibson
  3 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-20  1:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, Alexey Kardashevskiy,
	David Gibson

QEMU already implements H_CAS called by SLOF. The existing handler
prepares a diff FDT and SLOF applies it on top of its current tree.
In SLOF-less setup when the user explicitly selected "bios=no",
this updates the FDT from the OS, updates it and writes back to the OS.
The new behavior is advertised to the OS via "/chosen/qemu,h_cas".

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 include/hw/ppc/spapr.h |  5 +++++
 hw/ppc/spapr.c         | 24 ++++++++++++++++-----
 hw/ppc/spapr_hcall.c   | 49 +++++++++++++++++++++++++++++++++++++++---
 3 files changed, 70 insertions(+), 8 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 7f5d7a70d27e..73cd9cf25b83 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -766,9 +766,14 @@ struct SpaprEventLogEntry {
 
 void spapr_events_init(SpaprMachineState *sm);
 void spapr_dt_events(SpaprMachineState *sm, void *fdt);
+int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
+                                    SpaprOptionVector *ov5_updates);
 int spapr_h_cas_compose_response(SpaprMachineState *sm,
                                  target_ulong addr, target_ulong size,
                                  SpaprOptionVector *ov5_updates);
+#define FDT_MAX_SIZE 0x100000
+void *spapr_build_fdt(SpaprMachineState *spapr);
+
 void close_htab_fd(SpaprMachineState *spapr);
 void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr);
 void spapr_free_hpt(SpaprMachineState *spapr);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b097a99951f1..f84895f4a8b4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -978,6 +978,19 @@ static bool spapr_hotplugged_dev_before_cas(void)
     return false;
 }
 
+int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
+                                    SpaprOptionVector *ov5_updates)
+{
+    /* Fixup cpu nodes */
+    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
+
+    if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
+        return -1;
+    }
+
+    return 0;
+}
+
 int spapr_h_cas_compose_response(SpaprMachineState *spapr,
                                  target_ulong addr, target_ulong size,
                                  SpaprOptionVector *ov5_updates)
@@ -1009,10 +1022,7 @@ int spapr_h_cas_compose_response(SpaprMachineState *spapr,
     _FDT((fdt_open_into(fdt_skel, fdt, size)));
     g_free(fdt_skel);
 
-    /* Fixup cpu nodes */
-    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
-
-    if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
+    if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
         return -1;
     }
 
@@ -1232,6 +1242,10 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
 
     /* We always implemented RTAS as hcall, tell guests to call it directly */
     _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
+    /* Tell the guest that H_CAS will return the entire FDT now, not the diff */
+    if (!spapr->bios_enabled) {
+        _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_cas", 1));
+    }
 
     spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 
@@ -1262,7 +1276,7 @@ static void spapr_dt_hypervisor(SpaprMachineState *spapr, void *fdt)
     }
 }
 
-static void *spapr_build_fdt(SpaprMachineState *spapr)
+void *spapr_build_fdt(SpaprMachineState *spapr)
 {
     MachineState *machine = MACHINE(spapr);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index b964d94f330b..c5cb06c9d507 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -17,6 +17,7 @@
 #include "hw/ppc/spapr_ovec.h"
 #include "mmu-book3s-v3.h"
 #include "hw/mem/memory-device.h"
+#include "sysemu/device_tree.h"
 
 static bool has_spr(PowerPCCPU *cpu, int spr)
 {
@@ -1774,9 +1775,51 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
             /* legacy hash or new hash: */
             spapr_setup_hpt_and_vrma(spapr);
         }
-        spapr->cas_reboot =
-            (spapr_h_cas_compose_response(spapr, args[1], args[2],
-                                          ov5_updates) != 0);
+
+        if (spapr->bios_enabled) {
+            spapr->cas_reboot =
+                (spapr_h_cas_compose_response(spapr, args[1], args[2],
+                                              ov5_updates) != 0);
+        } else {
+            int size;
+            void *fdt, *fdt_skel;
+            struct fdt_header hdr = { 0 };
+
+            cpu_physical_memory_read(args[1], &hdr, sizeof(hdr));
+            size = fdt32_to_cpu(hdr.totalsize);
+            if (size > FDT_MAX_SIZE) {
+                return H_NOT_AVAILABLE;
+            }
+
+            fdt_skel = g_malloc0(size);
+            cpu_physical_memory_read(args[1], fdt_skel, size);
+
+            fdt = g_malloc0(FDT_MAX_SIZE);
+            fdt_open_into(fdt_skel, fdt, FDT_MAX_SIZE);
+            g_free(fdt_skel);
+
+            if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
+                g_free(fdt);
+                return H_NOT_AVAILABLE;
+            }
+            fdt_pack(fdt);
+            if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
+                error_report("FDT too big ! 0x%x bytes (max is 0x%x)",
+                             fdt_totalsize(fdt), FDT_MAX_SIZE);
+                g_free(fdt);
+                return H_NOT_AVAILABLE;
+            }
+
+            /* Load the fdt */
+            cpu_physical_memory_write(args[1], fdt, fdt_totalsize(fdt));
+
+            g_free(spapr->fdt_blob);
+            spapr->fdt_size = fdt_totalsize(fdt);
+            spapr->fdt_initial_size = spapr->fdt_size;
+            spapr->fdt_blob = fdt;
+
+            qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
+        }
     }
 
     /*
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address Alexey Kardashevskiy
@ 2019-07-23  3:49   ` David Gibson
  2019-07-23  5:36     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 15+ messages in thread
From: David Gibson @ 2019-07-23  3:49 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5764 bytes --]

On Sat, Jul 20, 2019 at 11:28:47AM +1000, Alexey Kardashevskiy wrote:
> Useful for the debugging purposes.

Am I correct in understanding this isn't actually necessary for the
rest of the series to work, just useful for debugging?

> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  include/hw/ppc/spapr.h |  1 +
>  hw/ppc/spapr.c         | 33 +++++++++++++++++++++++++++------
>  2 files changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 74e427b588fc..ff82bb8554e1 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -159,6 +159,7 @@ struct SpaprMachineState {
>      void *fdt_blob;
>      long kernel_size;
>      bool kernel_le;
> +    uint64_t kernel_addr;
>      uint32_t initrd_base;
>      long initrd_size;
>      uint64_t rtc_offset; /* Now used only during incoming migration */
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 7fad42350538..6d13d65d8996 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1179,7 +1179,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
>                            spapr->initrd_base + spapr->initrd_size));
>  
>      if (spapr->kernel_size) {
> -        uint64_t kprop[2] = { cpu_to_be64(KERNEL_LOAD_ADDR),
> +        uint64_t kprop[2] = { cpu_to_be64(spapr->kernel_addr),
>                                cpu_to_be64(spapr->kernel_size) };
>  
>          _FDT(fdt_setprop(fdt, chosen, "qemu,boot-kernel",
> @@ -1365,7 +1365,7 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
>  
>      /* Build memory reserve map */
>      if (spapr->kernel_size) {
> -        _FDT((fdt_add_mem_rsv(fdt, KERNEL_LOAD_ADDR, spapr->kernel_size)));
> +        _FDT((fdt_add_mem_rsv(fdt, spapr->kernel_addr, spapr->kernel_size)));
>      }
>      if (spapr->initrd_size) {
>          _FDT((fdt_add_mem_rsv(fdt, spapr->initrd_base, spapr->initrd_size)));
> @@ -1391,7 +1391,8 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
>  
>  static uint64_t translate_kernel_address(void *opaque, uint64_t addr)
>  {
> -    return (addr & 0x0fffffff) + KERNEL_LOAD_ADDR;
> +    SpaprMachineState *spapr = opaque;
> +    return (addr & 0x0fffffff) + spapr->kernel_addr;
>  }
>  
>  static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
> @@ -2995,12 +2996,12 @@ static void spapr_machine_init(MachineState *machine)
>          uint64_t lowaddr = 0;
>  
>          spapr->kernel_size = load_elf(kernel_filename, NULL,
> -                                      translate_kernel_address, NULL,
> +                                      translate_kernel_address, spapr,
>                                        NULL, &lowaddr, NULL, 1,
>                                        PPC_ELF_MACHINE, 0, 0);
>          if (spapr->kernel_size == ELF_LOAD_WRONG_ENDIAN) {
>              spapr->kernel_size = load_elf(kernel_filename, NULL,
> -                                          translate_kernel_address, NULL, NULL,
> +                                          translate_kernel_address, spapr, NULL,
>                                            &lowaddr, NULL, 0, PPC_ELF_MACHINE,
>                                            0, 0);
>              spapr->kernel_le = spapr->kernel_size > 0;
> @@ -3016,7 +3017,7 @@ static void spapr_machine_init(MachineState *machine)
>              /* Try to locate the initrd in the gap between the kernel
>               * and the firmware. Add a bit of space just in case
>               */
> -            spapr->initrd_base = (KERNEL_LOAD_ADDR + spapr->kernel_size
> +            spapr->initrd_base = (spapr->kernel_addr + spapr->kernel_size
>                                    + 0x1ffff) & ~0xffff;
>              spapr->initrd_size = load_image_targphys(initrd_filename,
>                                                       spapr->initrd_base,
> @@ -3253,6 +3254,18 @@ static void spapr_set_vsmt(Object *obj, Visitor *v, const char *name,
>      visit_type_uint32(v, name, (uint32_t *)opaque, errp);
>  }
>  
> +static void spapr_get_kernel_addr(Object *obj, Visitor *v, const char *name,
> +                                  void *opaque, Error **errp)
> +{
> +    visit_type_uint64(v, name, (uint64_t *)opaque, errp);
> +}
> +
> +static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
> +                                  void *opaque, Error **errp)
> +{
> +    visit_type_uint64(v, name, (uint64_t *)opaque, errp);
> +}
> +
>  static char *spapr_get_ic_mode(Object *obj, Error **errp)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(obj);
> @@ -3358,6 +3371,14 @@ static void spapr_instance_init(Object *obj)
>      object_property_add_bool(obj, "vfio-no-msix-emulation",
>                               spapr_get_msix_emulation, NULL, NULL);
>  
> +    object_property_add(obj, "kernel-addr", "uint64", spapr_get_kernel_addr,
> +                        spapr_set_kernel_addr, NULL, &spapr->kernel_addr,
> +                        &error_abort);
> +    object_property_set_description(obj, "kernel-addr",
> +                                    stringify(KERNEL_LOAD_ADDR)
> +                                    " for -kernel is the default",
> +                                    NULL);
> +    spapr->kernel_addr = KERNEL_LOAD_ADDR;
>      /* The machine class defines the default interrupt controller mode */
>      spapr->irq = smc->irq;
>      object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration Alexey Kardashevskiy
@ 2019-07-23  3:52   ` David Gibson
  2019-07-23  5:43     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 15+ messages in thread
From: David Gibson @ 2019-07-23  3:52 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 6122 bytes --]

On Sat, Jul 20, 2019 at 11:28:48AM +1000, Alexey Kardashevskiy wrote:
> The pseries kernel can do either usual prom-init boot or kexec style boot.
> We always did the prom-init which relies on the completeness of
> the device tree (for example, PCI BARs have to be assigned beforehand) and
> the client interface; the system firmware (SLOF) implements this.
> 
> However we can use the kexec style boot as well. To do that, we can skip
> loading SLOF and jump straight to the kernel. GPR5==0 (the client
> interface entry point, SLOF passes a valid pointer there) tells Linux to
> do the kexec boot rather than prom_init so it can proceed to the initramfs.
> With few PCI fixes in the guest kernel, it can boot from PCI (via
> petitboot, for example).
> 
> This adds a "bios" machine option which controls whether QEMU loads SLOF
> or jumps directly to the kernel. When bios==off, this does not copy SLOF
> and RTAS into the guest RAM and sets RTAS properties to 0 to bypass
> the kexec user space tool which checks for their presence (not for
> the values though).

BIOS is sometimes used to refer to any machine's firmware, but it's
also used to refer specifically to PC style BIOS.  I think it would be
clearer to be explicit here and call the options "slof" rather than
"bios".

> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  include/hw/ppc/spapr.h |  1 +
>  hw/ppc/spapr.c         | 58 ++++++++++++++++++++++++++++++++----------
>  2 files changed, 45 insertions(+), 14 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index ff82bb8554e1..7f5d7a70d27e 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -160,6 +160,7 @@ struct SpaprMachineState {
>      long kernel_size;
>      bool kernel_le;
>      uint64_t kernel_addr;
> +    bool bios_enabled;
>      uint32_t initrd_base;
>      long initrd_size;
>      uint64_t rtc_offset; /* Now used only during incoming migration */
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 6d13d65d8996..81ad6a6f28de 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1116,6 +1116,10 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
>      _FDT(fdt_setprop(fdt, rtas, "ibm,lrdr-capacity",
>                       lrdr_capacity, sizeof(lrdr_capacity)));
>  
> +    /* These are to make kexec-lite happy */
> +    _FDT(fdt_setprop_cell(fdt, rtas, "linux,rtas-base", 0));
> +    _FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", 0));

What exactly is kexec-lite and what does it need here?

>      spapr_dt_rtas_tokens(fdt, rtas);
>  }
>  
> @@ -1814,7 +1818,11 @@ static void spapr_machine_reset(MachineState *machine)
>      spapr->fdt_blob = fdt;
>  
>      /* Set up the entry state */
> -    spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
> +    if (!spapr->bios_enabled) {
> +        spapr_cpu_set_entry_state(first_ppc_cpu, spapr->kernel_addr, fdt_addr);
> +    } else {
> +        spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
> +    }
>      first_ppc_cpu->env.gpr[5] = 0;
>  
>      spapr->cas_reboot = false;
> @@ -3031,20 +3039,22 @@ static void spapr_machine_init(MachineState *machine)
>          }
>      }
>  
> -    if (bios_name == NULL) {
> -        bios_name = FW_FILE_NAME;
> +    if (spapr->bios_enabled) {
> +        if (bios_name == NULL) {
> +            bios_name = FW_FILE_NAME;
> +        }
> +        filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> +        if (!filename) {
> +            error_report("Could not find LPAR firmware '%s'", bios_name);
> +            exit(1);
> +        }
> +        fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
> +        if (fw_size <= 0) {
> +            error_report("Could not load LPAR firmware '%s'", filename);
> +            exit(1);
> +        }
> +        g_free(filename);
>      }
> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> -    if (!filename) {
> -        error_report("Could not find LPAR firmware '%s'", bios_name);
> -        exit(1);
> -    }
> -    fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
> -    if (fw_size <= 0) {
> -        error_report("Could not load LPAR firmware '%s'", filename);
> -        exit(1);
> -    }
> -    g_free(filename);
>  
>      /* FIXME: Should register things through the MachineState's qdev
>       * interface, this is a legacy from the sPAPREnvironment structure
> @@ -3266,6 +3276,20 @@ static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
>      visit_type_uint64(v, name, (uint64_t *)opaque, errp);
>  }
>  
> +static bool spapr_get_bios_enabled(Object *obj, Error **errp)
> +{
> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    return spapr->bios_enabled;
> +}
> +
> +static void spapr_set_bios_enabled(Object *obj, bool value, Error **errp)
> +{
> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    spapr->bios_enabled = value;
> +}
> +
>  static char *spapr_get_ic_mode(Object *obj, Error **errp)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(obj);
> @@ -3379,6 +3403,12 @@ static void spapr_instance_init(Object *obj)
>                                      " for -kernel is the default",
>                                      NULL);
>      spapr->kernel_addr = KERNEL_LOAD_ADDR;
> +    object_property_add_bool(obj, "bios", spapr_get_bios_enabled,
> +                            spapr_set_bios_enabled, NULL);
> +    object_property_set_description(obj, "bios", "Conrols whether to load bios",
> +                                    NULL);
> +    spapr->bios_enabled = true;
> +
>      /* The machine class defines the default interrupt controller mode */
>      spapr->irq = smc->irq;
>      object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest Alexey Kardashevskiy
@ 2019-07-23  3:53   ` David Gibson
  2019-07-23  5:48     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 15+ messages in thread
From: David Gibson @ 2019-07-23  3:53 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1533 bytes --]

On Sat, Jul 20, 2019 at 11:28:49AM +1000, Alexey Kardashevskiy wrote:
> Since day 1 QEMU implemented RTAS as a custom hypercall wrapped into
> a small 20 bytes blob which guest would call to enter RTAS. Although
> it works fine, it is still a separate binary image which requires signing
> at no additional benefit.
> 
> This adds a flag into /chosen to tell a modified guest that if the flag
> is there, it can call H_RTAS directly and avoid calling into the RTAS
> blob.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  hw/ppc/spapr.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 81ad6a6f28de..b097a99951f1 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1230,6 +1230,9 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
>          _FDT(fdt_setprop_cell(fdt, chosen, "linux,pci-probe-only", 0));
>      }
>  
> +    /* We always implemented RTAS as hcall, tell guests to call it directly */
> +    _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));

Rather than creating a new property for this we could use the
qemu,hypertas-functions property which is already used to advertise
some KVM specific hcalls.

>      spapr_dt_ov5_platform_support(spapr, fdt, chosen);
>  
>      g_free(stdout_path);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address
  2019-07-23  3:49   ` David Gibson
@ 2019-07-23  5:36     ` Alexey Kardashevskiy
  0 siblings, 0 replies; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-23  5:36 UTC (permalink / raw)
  To: David Gibson; +Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel



On 23/07/2019 13:49, David Gibson wrote:
> On Sat, Jul 20, 2019 at 11:28:47AM +1000, Alexey Kardashevskiy wrote:
>> Useful for the debugging purposes.
> 
> Am I correct in understanding this isn't actually necessary for the
> rest of the series to work, just useful for debugging?


Correct. When I just started and used stepping in gdb stub, it helped to 
have the kernel at specific location. And since there is another 
property added in this patchset - "bios" - I did not really want to 
rebase (I will have to anyway if we decide to proceed). Thanks,


> 
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>   include/hw/ppc/spapr.h |  1 +
>>   hw/ppc/spapr.c         | 33 +++++++++++++++++++++++++++------
>>   2 files changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 74e427b588fc..ff82bb8554e1 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -159,6 +159,7 @@ struct SpaprMachineState {
>>       void *fdt_blob;
>>       long kernel_size;
>>       bool kernel_le;
>> +    uint64_t kernel_addr;
>>       uint32_t initrd_base;
>>       long initrd_size;
>>       uint64_t rtc_offset; /* Now used only during incoming migration */
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 7fad42350538..6d13d65d8996 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1179,7 +1179,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
>>                             spapr->initrd_base + spapr->initrd_size));
>>   
>>       if (spapr->kernel_size) {
>> -        uint64_t kprop[2] = { cpu_to_be64(KERNEL_LOAD_ADDR),
>> +        uint64_t kprop[2] = { cpu_to_be64(spapr->kernel_addr),
>>                                 cpu_to_be64(spapr->kernel_size) };
>>   
>>           _FDT(fdt_setprop(fdt, chosen, "qemu,boot-kernel",
>> @@ -1365,7 +1365,7 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
>>   
>>       /* Build memory reserve map */
>>       if (spapr->kernel_size) {
>> -        _FDT((fdt_add_mem_rsv(fdt, KERNEL_LOAD_ADDR, spapr->kernel_size)));
>> +        _FDT((fdt_add_mem_rsv(fdt, spapr->kernel_addr, spapr->kernel_size)));
>>       }
>>       if (spapr->initrd_size) {
>>           _FDT((fdt_add_mem_rsv(fdt, spapr->initrd_base, spapr->initrd_size)));
>> @@ -1391,7 +1391,8 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
>>   
>>   static uint64_t translate_kernel_address(void *opaque, uint64_t addr)
>>   {
>> -    return (addr & 0x0fffffff) + KERNEL_LOAD_ADDR;
>> +    SpaprMachineState *spapr = opaque;
>> +    return (addr & 0x0fffffff) + spapr->kernel_addr;
>>   }
>>   
>>   static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
>> @@ -2995,12 +2996,12 @@ static void spapr_machine_init(MachineState *machine)
>>           uint64_t lowaddr = 0;
>>   
>>           spapr->kernel_size = load_elf(kernel_filename, NULL,
>> -                                      translate_kernel_address, NULL,
>> +                                      translate_kernel_address, spapr,
>>                                         NULL, &lowaddr, NULL, 1,
>>                                         PPC_ELF_MACHINE, 0, 0);
>>           if (spapr->kernel_size == ELF_LOAD_WRONG_ENDIAN) {
>>               spapr->kernel_size = load_elf(kernel_filename, NULL,
>> -                                          translate_kernel_address, NULL, NULL,
>> +                                          translate_kernel_address, spapr, NULL,
>>                                             &lowaddr, NULL, 0, PPC_ELF_MACHINE,
>>                                             0, 0);
>>               spapr->kernel_le = spapr->kernel_size > 0;
>> @@ -3016,7 +3017,7 @@ static void spapr_machine_init(MachineState *machine)
>>               /* Try to locate the initrd in the gap between the kernel
>>                * and the firmware. Add a bit of space just in case
>>                */
>> -            spapr->initrd_base = (KERNEL_LOAD_ADDR + spapr->kernel_size
>> +            spapr->initrd_base = (spapr->kernel_addr + spapr->kernel_size
>>                                     + 0x1ffff) & ~0xffff;
>>               spapr->initrd_size = load_image_targphys(initrd_filename,
>>                                                        spapr->initrd_base,
>> @@ -3253,6 +3254,18 @@ static void spapr_set_vsmt(Object *obj, Visitor *v, const char *name,
>>       visit_type_uint32(v, name, (uint32_t *)opaque, errp);
>>   }
>>   
>> +static void spapr_get_kernel_addr(Object *obj, Visitor *v, const char *name,
>> +                                  void *opaque, Error **errp)
>> +{
>> +    visit_type_uint64(v, name, (uint64_t *)opaque, errp);
>> +}
>> +
>> +static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
>> +                                  void *opaque, Error **errp)
>> +{
>> +    visit_type_uint64(v, name, (uint64_t *)opaque, errp);
>> +}
>> +
>>   static char *spapr_get_ic_mode(Object *obj, Error **errp)
>>   {
>>       SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>> @@ -3358,6 +3371,14 @@ static void spapr_instance_init(Object *obj)
>>       object_property_add_bool(obj, "vfio-no-msix-emulation",
>>                                spapr_get_msix_emulation, NULL, NULL);
>>   
>> +    object_property_add(obj, "kernel-addr", "uint64", spapr_get_kernel_addr,
>> +                        spapr_set_kernel_addr, NULL, &spapr->kernel_addr,
>> +                        &error_abort);
>> +    object_property_set_description(obj, "kernel-addr",
>> +                                    stringify(KERNEL_LOAD_ADDR)
>> +                                    " for -kernel is the default",
>> +                                    NULL);
>> +    spapr->kernel_addr = KERNEL_LOAD_ADDR;
>>       /* The machine class defines the default interrupt controller mode */
>>       spapr->irq = smc->irq;
>>       object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
> 

-- 
Alexey


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration
  2019-07-23  3:52   ` David Gibson
@ 2019-07-23  5:43     ` Alexey Kardashevskiy
  0 siblings, 0 replies; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-23  5:43 UTC (permalink / raw)
  To: David Gibson
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel, Anton Blanchard



On 23/07/2019 13:52, David Gibson wrote:
> On Sat, Jul 20, 2019 at 11:28:48AM +1000, Alexey Kardashevskiy wrote:
>> The pseries kernel can do either usual prom-init boot or kexec style boot.
>> We always did the prom-init which relies on the completeness of
>> the device tree (for example, PCI BARs have to be assigned beforehand) and
>> the client interface; the system firmware (SLOF) implements this.
>>
>> However we can use the kexec style boot as well. To do that, we can skip
>> loading SLOF and jump straight to the kernel. GPR5==0 (the client
>> interface entry point, SLOF passes a valid pointer there) tells Linux to
>> do the kexec boot rather than prom_init so it can proceed to the initramfs.
>> With few PCI fixes in the guest kernel, it can boot from PCI (via
>> petitboot, for example).
>>
>> This adds a "bios" machine option which controls whether QEMU loads SLOF
>> or jumps directly to the kernel. When bios==off, this does not copy SLOF
>> and RTAS into the guest RAM and sets RTAS properties to 0 to bypass
>> the kexec user space tool which checks for their presence (not for
>> the values though).
> 
> BIOS is sometimes used to refer to any machine's firmware, but it's
> also used to refer specifically to PC style BIOS.  I think it would be
> clearer to be explicit here and call the options "slof" rather than
> "bios".


This is a machine option of the "pseries" machine so it did not sound 
like PC bios to me, and slof itself lives in pc-bios so it seemed 
aligned name for a property like this.



> 
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>   include/hw/ppc/spapr.h |  1 +
>>   hw/ppc/spapr.c         | 58 ++++++++++++++++++++++++++++++++----------
>>   2 files changed, 45 insertions(+), 14 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index ff82bb8554e1..7f5d7a70d27e 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -160,6 +160,7 @@ struct SpaprMachineState {
>>       long kernel_size;
>>       bool kernel_le;
>>       uint64_t kernel_addr;
>> +    bool bios_enabled;
>>       uint32_t initrd_base;
>>       long initrd_size;
>>       uint64_t rtc_offset; /* Now used only during incoming migration */
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 6d13d65d8996..81ad6a6f28de 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1116,6 +1116,10 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
>>       _FDT(fdt_setprop(fdt, rtas, "ibm,lrdr-capacity",
>>                        lrdr_capacity, sizeof(lrdr_capacity)));
>>   
>> +    /* These are to make kexec-lite happy */
>> +    _FDT(fdt_setprop_cell(fdt, rtas, "linux,rtas-base", 0));
>> +    _FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", 0));
> 
> What exactly is kexec-lite and what does it need here?

This is a leftover which did not help (I think, need to double check).

It is a small kexec used in openpower builds, maintained by Anton:
https://github.com/open-power/kexec-lite

This is a part of the petitboot initramdisk used on bare metal powernv 
machines and if the tool detects /rtas in the DT, it insists on these 2 
properties, otherwise it does not proceed to reboot("kexec"):
https://github.com/open-power/kexec-lite/blob/master/kexec_memory_map.c#L272

I ended up patching it (hi Anton, please review my "[PATCH kexec] 
memory_map: Allow RTAS-less setup") and rebuilding the initramdisk.


> 
>>       spapr_dt_rtas_tokens(fdt, rtas);
>>   }
>>   
>> @@ -1814,7 +1818,11 @@ static void spapr_machine_reset(MachineState *machine)
>>       spapr->fdt_blob = fdt;
>>   
>>       /* Set up the entry state */
>> -    spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
>> +    if (!spapr->bios_enabled) {
>> +        spapr_cpu_set_entry_state(first_ppc_cpu, spapr->kernel_addr, fdt_addr);
>> +    } else {
>> +        spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
>> +    }
>>       first_ppc_cpu->env.gpr[5] = 0;
>>   
>>       spapr->cas_reboot = false;
>> @@ -3031,20 +3039,22 @@ static void spapr_machine_init(MachineState *machine)
>>           }
>>       }
>>   
>> -    if (bios_name == NULL) {
>> -        bios_name = FW_FILE_NAME;
>> +    if (spapr->bios_enabled) {
>> +        if (bios_name == NULL) {
>> +            bios_name = FW_FILE_NAME;
>> +        }
>> +        filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
>> +        if (!filename) {
>> +            error_report("Could not find LPAR firmware '%s'", bios_name);
>> +            exit(1);
>> +        }
>> +        fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
>> +        if (fw_size <= 0) {
>> +            error_report("Could not load LPAR firmware '%s'", filename);
>> +            exit(1);
>> +        }
>> +        g_free(filename);
>>       }
>> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
>> -    if (!filename) {
>> -        error_report("Could not find LPAR firmware '%s'", bios_name);
>> -        exit(1);
>> -    }
>> -    fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
>> -    if (fw_size <= 0) {
>> -        error_report("Could not load LPAR firmware '%s'", filename);
>> -        exit(1);
>> -    }
>> -    g_free(filename);
>>   
>>       /* FIXME: Should register things through the MachineState's qdev
>>        * interface, this is a legacy from the sPAPREnvironment structure
>> @@ -3266,6 +3276,20 @@ static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
>>       visit_type_uint64(v, name, (uint64_t *)opaque, errp);
>>   }
>>   
>> +static bool spapr_get_bios_enabled(Object *obj, Error **errp)
>> +{
>> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>> +
>> +    return spapr->bios_enabled;
>> +}
>> +
>> +static void spapr_set_bios_enabled(Object *obj, bool value, Error **errp)
>> +{
>> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>> +
>> +    spapr->bios_enabled = value;
>> +}
>> +
>>   static char *spapr_get_ic_mode(Object *obj, Error **errp)
>>   {
>>       SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>> @@ -3379,6 +3403,12 @@ static void spapr_instance_init(Object *obj)
>>                                       " for -kernel is the default",
>>                                       NULL);
>>       spapr->kernel_addr = KERNEL_LOAD_ADDR;
>> +    object_property_add_bool(obj, "bios", spapr_get_bios_enabled,
>> +                            spapr_set_bios_enabled, NULL);
>> +    object_property_set_description(obj, "bios", "Conrols whether to load bios",
>> +                                    NULL);
>> +    spapr->bios_enabled = true;
>> +
>>       /* The machine class defines the default interrupt controller mode */
>>       spapr->irq = smc->irq;
>>       object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
> 

-- 
Alexey


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest
  2019-07-23  3:53   ` David Gibson
@ 2019-07-23  5:48     ` Alexey Kardashevskiy
  2019-07-23  6:14       ` David Gibson
  0 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-23  5:48 UTC (permalink / raw)
  To: David Gibson; +Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel



On 23/07/2019 13:53, David Gibson wrote:
> On Sat, Jul 20, 2019 at 11:28:49AM +1000, Alexey Kardashevskiy wrote:
>> Since day 1 QEMU implemented RTAS as a custom hypercall wrapped into
>> a small 20 bytes blob which guest would call to enter RTAS. Although
>> it works fine, it is still a separate binary image which requires signing
>> at no additional benefit.
>>
>> This adds a flag into /chosen to tell a modified guest that if the flag
>> is there, it can call H_RTAS directly and avoid calling into the RTAS
>> blob.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>   hw/ppc/spapr.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 81ad6a6f28de..b097a99951f1 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1230,6 +1230,9 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
>>           _FDT(fdt_setprop_cell(fdt, chosen, "linux,pci-probe-only", 0));
>>       }
>>   
>> +    /* We always implemented RTAS as hcall, tell guests to call it directly */
>> +    _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
> 
> Rather than creating a new property for this we could use the
> qemu,hypertas-functions property which is already used to advertise
> some KVM specific hcalls.

True, this is another way of doing it. We will also need a proper number 
for it rather that custom 0xf000, Paul suggested we could tell the guest 
this number via the "qemu,h_rtas" property.


> 
>>       spapr_dt_ov5_platform_support(spapr, fdt, chosen);
>>   
>>       g_free(stdout_path);
> 

-- 
Alexey


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest
  2019-07-23  5:48     ` Alexey Kardashevskiy
@ 2019-07-23  6:14       ` David Gibson
  2019-07-23  7:41         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 15+ messages in thread
From: David Gibson @ 2019-07-23  6:14 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2503 bytes --]

On Tue, Jul 23, 2019 at 03:48:34PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 23/07/2019 13:53, David Gibson wrote:
> > On Sat, Jul 20, 2019 at 11:28:49AM +1000, Alexey Kardashevskiy wrote:
> > > Since day 1 QEMU implemented RTAS as a custom hypercall wrapped into
> > > a small 20 bytes blob which guest would call to enter RTAS. Although
> > > it works fine, it is still a separate binary image which requires signing
> > > at no additional benefit.
> > > 
> > > This adds a flag into /chosen to tell a modified guest that if the flag
> > > is there, it can call H_RTAS directly and avoid calling into the RTAS
> > > blob.
> > > 
> > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > ---
> > >   hw/ppc/spapr.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 81ad6a6f28de..b097a99951f1 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -1230,6 +1230,9 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
> > >           _FDT(fdt_setprop_cell(fdt, chosen, "linux,pci-probe-only", 0));
> > >       }
> > > +    /* We always implemented RTAS as hcall, tell guests to call it directly */
> > > +    _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
> > 
> > Rather than creating a new property for this we could use the
> > qemu,hypertas-functions property which is already used to advertise
> > some KVM specific hcalls.
> 
> True, this is another way of doing it. We will also need a proper number for
> it rather that custom 0xf000,

So, I take from this you're considering making this a supported method
of operation, maybe even incorporating it into PAPR?

> Paul suggested we could tell the guest this
> number via the "qemu,h_rtas" property.

Hm, we could, but introducing dynamic hypercall numbers for this seems
an odd thing to do; both PAPR standardized and qemu specific
hypercalls we currently operate on the guest just knowing the numbers.

It seems to me that the obvious way to handle this is to have a tag in
qemu,hypertas-functions which indicates the presence of the existing
0xf00 H_RTAS.  If/when we have a PAPR (or pseudo-PAPR) standardized
number, that would get a tag in ibm,hypertas-functions to advertise
it.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest
  2019-07-23  6:14       ` David Gibson
@ 2019-07-23  7:41         ` Alexey Kardashevskiy
  0 siblings, 0 replies; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-23  7:41 UTC (permalink / raw)
  To: David Gibson; +Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel



On 23/07/2019 16:14, David Gibson wrote:
> On Tue, Jul 23, 2019 at 03:48:34PM +1000, Alexey Kardashevskiy wrote:
>>
>>
>> On 23/07/2019 13:53, David Gibson wrote:
>>> On Sat, Jul 20, 2019 at 11:28:49AM +1000, Alexey Kardashevskiy wrote:
>>>> Since day 1 QEMU implemented RTAS as a custom hypercall wrapped into
>>>> a small 20 bytes blob which guest would call to enter RTAS. Although
>>>> it works fine, it is still a separate binary image which requires signing
>>>> at no additional benefit.
>>>>
>>>> This adds a flag into /chosen to tell a modified guest that if the flag
>>>> is there, it can call H_RTAS directly and avoid calling into the RTAS
>>>> blob.
>>>>
>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>> ---
>>>>    hw/ppc/spapr.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index 81ad6a6f28de..b097a99951f1 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -1230,6 +1230,9 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
>>>>            _FDT(fdt_setprop_cell(fdt, chosen, "linux,pci-probe-only", 0));
>>>>        }
>>>> +    /* We always implemented RTAS as hcall, tell guests to call it directly */
>>>> +    _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
>>>
>>> Rather than creating a new property for this we could use the
>>> qemu,hypertas-functions property which is already used to advertise
>>> some KVM specific hcalls.
>>
>> True, this is another way of doing it. We will also need a proper number for
>> it rather that custom 0xf000,
> 
> So, I take from this you're considering making this a supported method
> of operation, maybe even incorporating it into PAPR?


This was discussed as well since we are adding things in this area 
anyway (there is instantiate-rtas-64 coming), we could add this H_RTAS too.


>> Paul suggested we could tell the guest this
>> number via the "qemu,h_rtas" property.
> 
> Hm, we could, but introducing dynamic hypercall numbers for this seems
> an odd thing to do; both PAPR standardized and qemu specific
> hypercalls we currently operate on the guest just knowing the numbers.
>
> It seems to me that the obvious way to handle this is to have a tag in
> qemu,hypertas-functions which indicates the presence of the existing
> 0xf00 H_RTAS.  If/when we have a PAPR (or pseudo-PAPR) standardized
> number, that would get a tag in ibm,hypertas-functions to advertise
> it.

Makes sense to me, will update.


-- 
Alexey


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support
  2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support Alexey Kardashevskiy
@ 2019-07-28  6:09   ` David Gibson
  2019-07-29  5:03     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 15+ messages in thread
From: David Gibson @ 2019-07-28  6:09 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 7328 bytes --]

On Sat, Jul 20, 2019 at 11:28:50AM +1000, Alexey Kardashevskiy wrote:
> QEMU already implements H_CAS called by SLOF. The existing handler
> prepares a diff FDT and SLOF applies it on top of its current tree.
> In SLOF-less setup when the user explicitly selected "bios=no",
> this updates the FDT from the OS, updates it and writes back to the OS.
> The new behavior is advertised to the OS via "/chosen/qemu,h_cas".

I don't love having two different paths here, I'm wondering if we can
unify things at all.

I have wondered at some points if there's anything preventing us just
giving a whole new device tree at CAS, rather than selected updates -
that could simplify several things.

> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  include/hw/ppc/spapr.h |  5 +++++
>  hw/ppc/spapr.c         | 24 ++++++++++++++++-----
>  hw/ppc/spapr_hcall.c   | 49 +++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 70 insertions(+), 8 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 7f5d7a70d27e..73cd9cf25b83 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -766,9 +766,14 @@ struct SpaprEventLogEntry {
>  
>  void spapr_events_init(SpaprMachineState *sm);
>  void spapr_dt_events(SpaprMachineState *sm, void *fdt);
> +int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
> +                                    SpaprOptionVector *ov5_updates);
>  int spapr_h_cas_compose_response(SpaprMachineState *sm,
>                                   target_ulong addr, target_ulong size,
>                                   SpaprOptionVector *ov5_updates);
> +#define FDT_MAX_SIZE 0x100000
> +void *spapr_build_fdt(SpaprMachineState *spapr);
> +
>  void close_htab_fd(SpaprMachineState *spapr);
>  void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr);
>  void spapr_free_hpt(SpaprMachineState *spapr);
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b097a99951f1..f84895f4a8b4 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -978,6 +978,19 @@ static bool spapr_hotplugged_dev_before_cas(void)
>      return false;
>  }
>  
> +int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
> +                                    SpaprOptionVector *ov5_updates)

Not a great function name.

> +{
> +    /* Fixup cpu nodes */
> +    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
> +
> +    if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
>  int spapr_h_cas_compose_response(SpaprMachineState *spapr,
>                                   target_ulong addr, target_ulong size,
>                                   SpaprOptionVector *ov5_updates)
> @@ -1009,10 +1022,7 @@ int spapr_h_cas_compose_response(SpaprMachineState *spapr,
>      _FDT((fdt_open_into(fdt_skel, fdt, size)));
>      g_free(fdt_skel);
>  
> -    /* Fixup cpu nodes */
> -    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
> -
> -    if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
> +    if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
>          return -1;
>      }
>  
> @@ -1232,6 +1242,10 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
>  
>      /* We always implemented RTAS as hcall, tell guests to call it directly */
>      _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
> +    /* Tell the guest that H_CAS will return the entire FDT now, not the diff */
> +    if (!spapr->bios_enabled) {
> +        _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_cas", 1));

As with H_RTAS< using qemu,hypertas-functions would be more
appropriate for this.

> +    }
>  
>      spapr_dt_ov5_platform_support(spapr, fdt, chosen);
>  
> @@ -1262,7 +1276,7 @@ static void spapr_dt_hypervisor(SpaprMachineState *spapr, void *fdt)
>      }
>  }
>  
> -static void *spapr_build_fdt(SpaprMachineState *spapr)
> +void *spapr_build_fdt(SpaprMachineState *spapr)
>  {
>      MachineState *machine = MACHINE(spapr);
>      MachineClass *mc = MACHINE_GET_CLASS(machine);
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index b964d94f330b..c5cb06c9d507 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -17,6 +17,7 @@
>  #include "hw/ppc/spapr_ovec.h"
>  #include "mmu-book3s-v3.h"
>  #include "hw/mem/memory-device.h"
> +#include "sysemu/device_tree.h"
>  
>  static bool has_spr(PowerPCCPU *cpu, int spr)
>  {
> @@ -1774,9 +1775,51 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>              /* legacy hash or new hash: */
>              spapr_setup_hpt_and_vrma(spapr);
>          }
> -        spapr->cas_reboot =
> -            (spapr_h_cas_compose_response(spapr, args[1], args[2],
> -                                          ov5_updates) != 0);
> +
> +        if (spapr->bios_enabled) {
> +            spapr->cas_reboot =
> +                (spapr_h_cas_compose_response(spapr, args[1], args[2],
> +                                              ov5_updates) != 0);
> +        } else {
> +            int size;
> +            void *fdt, *fdt_skel;
> +            struct fdt_header hdr = { 0 };
> +
> +            cpu_physical_memory_read(args[1], &hdr, sizeof(hdr));
> +            size = fdt32_to_cpu(hdr.totalsize);
> +            if (size > FDT_MAX_SIZE) {
> +                return H_NOT_AVAILABLE;
> +            }
> +
> +            fdt_skel = g_malloc0(size);
> +            cpu_physical_memory_read(args[1], fdt_skel, size);
> +
> +            fdt = g_malloc0(FDT_MAX_SIZE);
> +            fdt_open_into(fdt_skel, fdt, FDT_MAX_SIZE);
> +            g_free(fdt_skel);

fdt_open_into() explicitly permits using the same buffer for both
arguments, so you don't need two allocations here - you can just
allocate FDT_MAX_SIZE, read it in, then fdt_open_into() in place.

You probably should check for errors from fdt_open_into(), though.

> +
> +            if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
> +                g_free(fdt);
> +                return H_NOT_AVAILABLE;
> +            }
> +            fdt_pack(fdt);
> +            if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {

This can't actually happen - you only ever allocated a buffer of size
FDT_MAX_SIZE, so if the DT was going to exceed that you would have hit
FDT_ERR_NOSPACE in an earlier step.

> +                error_report("FDT too big ! 0x%x bytes (max is 0x%x)",
> +                             fdt_totalsize(fdt), FDT_MAX_SIZE);
> +                g_free(fdt);
> +                return H_NOT_AVAILABLE;
> +            }
> +
> +            /* Load the fdt */
> +            cpu_physical_memory_write(args[1], fdt, fdt_totalsize(fdt));
> +
> +            g_free(spapr->fdt_blob);
> +            spapr->fdt_size = fdt_totalsize(fdt);
> +            spapr->fdt_initial_size = spapr->fdt_size;
> +            spapr->fdt_blob = fdt;
> +
> +            qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
> +        }
>      }
>  
>      /*

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support
  2019-07-28  6:09   ` David Gibson
@ 2019-07-29  5:03     ` Alexey Kardashevskiy
  0 siblings, 0 replies; 15+ messages in thread
From: Alexey Kardashevskiy @ 2019-07-29  5:03 UTC (permalink / raw)
  To: David Gibson; +Cc: Paul Mackerras, Michael Ellerman, qemu-ppc, qemu-devel



On 28/07/2019 16:09, David Gibson wrote:
> On Sat, Jul 20, 2019 at 11:28:50AM +1000, Alexey Kardashevskiy wrote:
>> QEMU already implements H_CAS called by SLOF. The existing handler
>> prepares a diff FDT and SLOF applies it on top of its current tree.
>> In SLOF-less setup when the user explicitly selected "bios=no",
>> this updates the FDT from the OS, updates it and writes back to the OS.
>> The new behavior is advertised to the OS via "/chosen/qemu,h_cas".
> 
> I don't love having two different paths here, I'm wondering if we can
> unify things at all.
> 
> I have wondered at some points if there's anything preventing us just
> giving a whole new device tree at CAS, rather than selected updates -
> that could simplify several things.


An update has a header, and this patch just copies the fdt over, if I
add a header to this new path, this will require a few more changes in
the guest which I would rather avoid. Thanks,


> 
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  include/hw/ppc/spapr.h |  5 +++++
>>  hw/ppc/spapr.c         | 24 ++++++++++++++++-----
>>  hw/ppc/spapr_hcall.c   | 49 +++++++++++++++++++++++++++++++++++++++---
>>  3 files changed, 70 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 7f5d7a70d27e..73cd9cf25b83 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -766,9 +766,14 @@ struct SpaprEventLogEntry {
>>  
>>  void spapr_events_init(SpaprMachineState *sm);
>>  void spapr_dt_events(SpaprMachineState *sm, void *fdt);
>> +int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
>> +                                    SpaprOptionVector *ov5_updates);
>>  int spapr_h_cas_compose_response(SpaprMachineState *sm,
>>                                   target_ulong addr, target_ulong size,
>>                                   SpaprOptionVector *ov5_updates);
>> +#define FDT_MAX_SIZE 0x100000
>> +void *spapr_build_fdt(SpaprMachineState *spapr);
>> +
>>  void close_htab_fd(SpaprMachineState *spapr);
>>  void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr);
>>  void spapr_free_hpt(SpaprMachineState *spapr);
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index b097a99951f1..f84895f4a8b4 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -978,6 +978,19 @@ static bool spapr_hotplugged_dev_before_cas(void)
>>      return false;
>>  }
>>  
>> +int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
>> +                                    SpaprOptionVector *ov5_updates)
> 
> Not a great function name.
> 
>> +{
>> +    /* Fixup cpu nodes */
>> +    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
>> +
>> +    if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
>> +        return -1;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>  int spapr_h_cas_compose_response(SpaprMachineState *spapr,
>>                                   target_ulong addr, target_ulong size,
>>                                   SpaprOptionVector *ov5_updates)
>> @@ -1009,10 +1022,7 @@ int spapr_h_cas_compose_response(SpaprMachineState *spapr,
>>      _FDT((fdt_open_into(fdt_skel, fdt, size)));
>>      g_free(fdt_skel);
>>  
>> -    /* Fixup cpu nodes */
>> -    _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
>> -
>> -    if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
>> +    if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
>>          return -1;
>>      }
>>  
>> @@ -1232,6 +1242,10 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt)
>>  
>>      /* We always implemented RTAS as hcall, tell guests to call it directly */
>>      _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
>> +    /* Tell the guest that H_CAS will return the entire FDT now, not the diff */
>> +    if (!spapr->bios_enabled) {
>> +        _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_cas", 1));
> 
> As with H_RTAS< using qemu,hypertas-functions would be more
> appropriate for this.
> 
>> +    }
>>  
>>      spapr_dt_ov5_platform_support(spapr, fdt, chosen);
>>  
>> @@ -1262,7 +1276,7 @@ static void spapr_dt_hypervisor(SpaprMachineState *spapr, void *fdt)
>>      }
>>  }
>>  
>> -static void *spapr_build_fdt(SpaprMachineState *spapr)
>> +void *spapr_build_fdt(SpaprMachineState *spapr)
>>  {
>>      MachineState *machine = MACHINE(spapr);
>>      MachineClass *mc = MACHINE_GET_CLASS(machine);
>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
>> index b964d94f330b..c5cb06c9d507 100644
>> --- a/hw/ppc/spapr_hcall.c
>> +++ b/hw/ppc/spapr_hcall.c
>> @@ -17,6 +17,7 @@
>>  #include "hw/ppc/spapr_ovec.h"
>>  #include "mmu-book3s-v3.h"
>>  #include "hw/mem/memory-device.h"
>> +#include "sysemu/device_tree.h"
>>  
>>  static bool has_spr(PowerPCCPU *cpu, int spr)
>>  {
>> @@ -1774,9 +1775,51 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
>>              /* legacy hash or new hash: */
>>              spapr_setup_hpt_and_vrma(spapr);
>>          }
>> -        spapr->cas_reboot =
>> -            (spapr_h_cas_compose_response(spapr, args[1], args[2],
>> -                                          ov5_updates) != 0);
>> +
>> +        if (spapr->bios_enabled) {
>> +            spapr->cas_reboot =
>> +                (spapr_h_cas_compose_response(spapr, args[1], args[2],
>> +                                              ov5_updates) != 0);
>> +        } else {
>> +            int size;
>> +            void *fdt, *fdt_skel;
>> +            struct fdt_header hdr = { 0 };
>> +
>> +            cpu_physical_memory_read(args[1], &hdr, sizeof(hdr));
>> +            size = fdt32_to_cpu(hdr.totalsize);
>> +            if (size > FDT_MAX_SIZE) {
>> +                return H_NOT_AVAILABLE;
>> +            }
>> +
>> +            fdt_skel = g_malloc0(size);
>> +            cpu_physical_memory_read(args[1], fdt_skel, size);
>> +
>> +            fdt = g_malloc0(FDT_MAX_SIZE);
>> +            fdt_open_into(fdt_skel, fdt, FDT_MAX_SIZE);
>> +            g_free(fdt_skel);
> 
> fdt_open_into() explicitly permits using the same buffer for both
> arguments, so you don't need two allocations here - you can just
> allocate FDT_MAX_SIZE, read it in, then fdt_open_into() in place.
> 
> You probably should check for errors from fdt_open_into(), though.
> 
>> +
>> +            if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
>> +                g_free(fdt);
>> +                return H_NOT_AVAILABLE;
>> +            }
>> +            fdt_pack(fdt);
>> +            if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> 
> This can't actually happen - you only ever allocated a buffer of size
> FDT_MAX_SIZE, so if the DT was going to exceed that you would have hit
> FDT_ERR_NOSPACE in an earlier step.
> 
>> +                error_report("FDT too big ! 0x%x bytes (max is 0x%x)",
>> +                             fdt_totalsize(fdt), FDT_MAX_SIZE);
>> +                g_free(fdt);
>> +                return H_NOT_AVAILABLE;
>> +            }
>> +
>> +            /* Load the fdt */
>> +            cpu_physical_memory_write(args[1], fdt, fdt_totalsize(fdt));
>> +
>> +            g_free(spapr->fdt_blob);
>> +            spapr->fdt_size = fdt_totalsize(fdt);
>> +            spapr->fdt_initial_size = spapr->fdt_size;
>> +            spapr->fdt_blob = fdt;
>> +
>> +            qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
>> +        }
>>      }
>>  
>>      /*
> 

-- 
Alexey


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-07-29  5:03 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-20  1:28 [Qemu-devel] [PATCH qemu RFC 0/4] spapr: Kexec style boot Alexey Kardashevskiy
2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address Alexey Kardashevskiy
2019-07-23  3:49   ` David Gibson
2019-07-23  5:36     ` Alexey Kardashevskiy
2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration Alexey Kardashevskiy
2019-07-23  3:52   ` David Gibson
2019-07-23  5:43     ` Alexey Kardashevskiy
2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest Alexey Kardashevskiy
2019-07-23  3:53   ` David Gibson
2019-07-23  5:48     ` Alexey Kardashevskiy
2019-07-23  6:14       ` David Gibson
2019-07-23  7:41         ` Alexey Kardashevskiy
2019-07-20  1:28 ` [Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support Alexey Kardashevskiy
2019-07-28  6:09   ` David Gibson
2019-07-29  5:03     ` Alexey Kardashevskiy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).