From: BALATON Zoltan <balaton@eik.bme.hu>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface
Date: Sat, 22 May 2021 15:01:55 +0200 (CEST)
Message-ID: <babe39af-fd34-8c5-de99-a0f485bfbce@eik.bme.hu>
In-Reply-To: <8527c8d2-c1e7-b3f8-0bda-529ba3864701@ozlabs.ru>

On Sat, 22 May 2021, Alexey Kardashevskiy wrote:
> On 21/05/2021 19:05, BALATON Zoltan wrote:
>> On Fri, 21 May 2021, Alexey Kardashevskiy wrote:
>>> On 21/05/2021 07:59, BALATON Zoltan wrote:
>>>> On Thu, 20 May 2021, Alexey Kardashevskiy wrote:
>>>>> The PAPR platform describes an OS environment that's presented by
>>>>> a combination of a hypervisor and firmware. The features it specifies
>>>>> require collaboration between the firmware and the hypervisor.
>>>>> 
>>>>> Since the beginning, the runtime component of the firmware (RTAS) has
>>>>> been implemented as a 20 byte shim which simply forwards calls to
>>>>> a hypercall implemented in qemu. The boot time firmware component is
>>>>> SLOF - but a build that's specific to qemu, and has always needed to be
>>>>> updated in sync with it. Even though we've managed to limit the amount
>>>>> of runtime communication we need between qemu and SLOF, there's some,
>>>>> and it has become increasingly awkward to handle as we've implemented
>>>>> new features.
>>>>> 
>>>>> This implements a boot time OF client interface (CI) which is
>>>>> enabled by a new "x-vof" pseries machine option (stands for "Virtual
>>>>> Open Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
>>>>> which implements Open Firmware Client Interface (OF CI). This allows
>>>>> using a smaller stateless firmware which does not have to manage
>>>>> the device tree.
>>>>> 
>>>>> The new "vof.bin" firmware image is included with source code under
>>>>> pc-bios/. It also includes RTAS blob.
>>>>> 
>>>>> This implements a handful of CI methods just to get -kernel/-initrd
>>>>> working. In particular, this implements the device tree fetching and
>>>>> simple memory allocator - "claim" (an OF CI memory allocator) and updates
>>>>> "/memory@0/available" to tell the client about available memory.
>>>>> 
>>>>> This implements changing some device tree properties which we know how
>>>>> to deal with, the rest is ignored. To allow changes, this skips
>>>>> fdt_pack() when x-vof=on as not packing the blob leaves some room for
>>>>> appending.
>>>>> 
>>>>> In absence of SLOF, this assigns phandles to device tree nodes to make
>>>>> device tree traversing work.
>>>>> 
>>>>> When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
>>>>> 
>>>>> This adds basic instance support; instances are managed by a hash map
>>>>> ihandle -> [phandle].
>>>>> 
>>>>> Before the guest starts, the memory in use is:
>>>>> 0..e60 - the initial firmware
>>>>> 8000..10000 - stack
>>>>> 400000.. - kernel
>>>>> 3ea0000.. - initramdisk
>>>>> 
>>>>> This OF CI does not implement "interpret".
>>>>> 
>>>>> Unlike SLOF, this does not format uninitialized nvram. Instead, this
>>>>> includes a disk image with pre-formatted nvram.
>>>>> 
>>>>> With this basic support, this can only boot into kernel directly.
>>>>> However this is just enough for the petitboot kernel and initramdisk to
>>>>> boot from any possible source. Note this requires a reasonably recent
>>>>> guest kernel with:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735
>>>>> The immediate benefit is much faster booting time which is especially
>>>>> crucial with fully emulated early CPU bring up environments. Also this
>>>>> may come in handy when/if GRUB-in-the-userspace sees the light of day.
>>>>> 
>>>>> This separates VOF and sPAPR in the hope that VOF bits may be reused by
>>>>> other POWERPC boards which do not support pSeries.
>>>>> 
>>>>> This is coded on the assumption that later on we might be adding support for
>>>>> booting from QEMU backends (blockdev is the first candidate) without
>>>>> devices/drivers in between as OF1275 does not require that and
>>>>> it is quite easy to do so.
>>>>> 
>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>> ---
>>>>> 
>>>>> The example command line is:
>>>>> 
>>>>> /home/aik/pbuild/qemu-killslof-localhost-ppc64/qemu-system-ppc64 \
>>>>> -nodefaults \
>>>>> -chardev stdio,id=STDIO0,signal=off,mux=on \
>>>>> -device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
>>>>> -mon id=MON0,chardev=STDIO0,mode=readline \
>>>>> -nographic \
>>>>> -vga none \
>>>>> -enable-kvm \
>>>>> -m 8G \
>>>>> -machine pseries,x-vof=on,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off \
>>>>> -kernel pbuild/kernel-le-guest/vmlinux \
>>>>> -initrd pb/rootfs.cpio.xz \
>>>>> -drive id=DRIVE0,if=none,file=./p/qemu-killslof/pc-bios/vof-nvram.bin,format=raw \
>>>>> -global spapr-nvram.drive=DRIVE0 \
>>>>> -snapshot \
>>>>> -smp 8,threads=8 \
>>>>> -L /home/aik/t/qemu-ppc64-bios/ \
>>>>> -trace events=qemu_trace_events \
>>>>> -d guest_errors \
>>>>> -chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.tmux26 \
>>>>> -mon chardev=SOCKET0,mode=control
>>>>> 
>>>>> ---
>>>>> Changes:
>>>>> v20:
>>>>> * compile vof.bin with -mcpu=power4 for better compatibility
>>>>> * s/std/stw/ in entry.S to make it work on ppc32
>>>>> * fixed dt_available property to support both 32 and 64bit
>>>>> * shuffled prom_args handling code
>>>>> * do not enforce 32bit in MSR (again, to support 32bit platforms)
>>>>> 
>>>> 
>>>> [...]
>>>> 
>>>>> diff --git a/default-configs/devices/ppc64-softmmu.mak b/default-configs/devices/ppc64-softmmu.mak
>>>>> index ae0841fa3a18..9fb201dfacfa 100644
>>>>> --- a/default-configs/devices/ppc64-softmmu.mak
>>>>> +++ b/default-configs/devices/ppc64-softmmu.mak
>>>>> @@ -9,3 +9,4 @@ CONFIG_POWERNV=y
>>>>>  # For pSeries
>>>>>  CONFIG_PSERIES=y
>>>>>  CONFIG_NVDIMM=y
>>>>> +CONFIG_VOF=y
>>>>> diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
>>>>> index e51e0e5e5ac6..964510dfc73d 100644
>>>>> --- a/hw/ppc/Kconfig
>>>>> +++ b/hw/ppc/Kconfig
>>>>> @@ -143,3 +143,6 @@ config FW_CFG_PPC
>>>>> 
>>>>>  config FDT_PPC
>>>>>      bool
>>>>> +
>>>>> +config VOF
>>>>> +    bool
>>>> 
>>>> I think you should just add "select VOF" to config PSERIES section in 
>>>> Kconfig instead of adding it to 
>>>> default-configs/devices/ppc64-softmmu.mak. 
>>> 
>>> oh well, can do that too.
>> 
>> I think most config options should be selected by Kconfig and the default 
>> config should only include machines, otherwise VOF would be added also when 
>> you don't compile PSERIES or PEGASOS2. With select in Kconfig it will be 
>> added when needed. That's why it's better to use select in this case.
>> 
>>>>  That should do it, it works in my updated pegasos2 patch:
>>>> 
>>>> https://osdn.net/projects/qmiga/scm/git/qemu/commits/3c1fad08469b4d3c04def22044e52b2d27774a61 
>>>> [...]
>>>>> diff --git a/pc-bios/vof/entry.S b/pc-bios/vof/entry.S
>>>>> new file mode 100644
>>>>> index 000000000000..569688714c91
>>>>> --- /dev/null
>>>>> +++ b/pc-bios/vof/entry.S
>>>>> @@ -0,0 +1,51 @@
>>>>> +#define LOAD32(rn, name)    \
>>>>> +    lis     rn,name##@h;    \
>>>>> +    ori     rn,rn,name##@l
>>>>> +
>>>>> +#define ENTRY(func_name)    \
>>>>> +    .text;                  \
>>>>> +    .align  2;              \
>>>>> +    .globl  .func_name;     \
>>>>> +    .func_name:             \
>>>>> +    .globl  func_name;      \
>>>>> +    func_name:
>>>>> +
>>>>> +#define KVMPPC_HCALL_BASE       0xf000
>>>>> +#define KVMPPC_H_RTAS           (KVMPPC_HCALL_BASE + 0x0)
>>>>> +#define KVMPPC_H_VOF_CLIENT     (KVMPPC_HCALL_BASE + 0x5)
>>>>> +
>>>>> +    . = 0x100 /* Do exactly as SLOF does */
>>>>> +
>>>>> +ENTRY(_start)
>>>>> +#    LOAD32(%r31, 0) /* Go 32bit mode */
>>>>> +#    mtmsrd %r31,0
>>>>> +    LOAD32(2, __toc_start)
>>>>> +    b entry_c
>>>>> +
>>>>> +ENTRY(_prom_entry)
>>>>> +    LOAD32(2, __toc_start)
>>>>> +    stwu    %r1,-112(%r1)
>>>>> +    stw     %r31,104(%r1)
>>>>> +    mflr    %r31
>>>>> +    bl prom_entry
>>>>> +    nop
>>>>> +    mtlr    %r31
>>>>> +    ld      %r31,104(%r1)
>>>> 
>>>> It's getting there, now I see the first client call from the guest boot 
>>>> code but then it crashes on this ld opcode which apparently is 64 bit 
>>>> only:
>>> 
>>> Oh right.
>>> 
>>> 
>>>> Hopefully this is the last such opcode left before I can really test 
>>>> this.
>>> 
>>> Make it lwz, and test it?
>> 
>> Yes, figured that out too after sending this message. Replacing with lwz 
>> works, but I wonder: now that you have stwu/stw/lwz, do the stack offsets 
>> need adjusting too, or do you just waste 4 bytes now?
>
> Well, this assumes the 64bit client and that ABI. I think ideally the 
> firmware is supposed to use its own stack but I did not bother here. I do not 
> know the 32bit ABI at all so I cannot say whether the existing code should 
> just work or not :-/

It seems to work so that's OK. I just thought that if the firmware is 32 bit 
it does not need 64 bit values on the stack, but if that's also potentially 
used by a 64 bit kernel then it may be better to keep it that way to avoid 
confusion. With the 64 bit opcodes replaced it seems to work on pegasos2, 
and the guest can call CI functions and get a reply, so maybe it's just a 
few wasted bytes, which is not a big deal.
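
To make the "wasted bytes" question concrete, here is a minimal sketch in 
plain C that models the 112 byte frame from _prom_entry as a byte array. 
The helper names (frame_stw, frame_lwz, frame_ld) are made up for 
illustration and are not part of the patch; they only mimic big-endian 
stw/lwz/ld at a given frame offset. The crash I saw is simply because ld 
is a 64 bit only opcode; this only shows what happens to the slot at 
offset 104.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE 112  /* matches stwu %r1,-112(%r1) in entry.S */
#define R31_SLOT   104  /* matches stw %r31,104(%r1) */

/* Hypothetical helper: big-endian 32 bit store, like stw. */
static void frame_stw(uint8_t *frame, size_t off, uint32_t val)
{
    assert(off + 4 <= FRAME_SIZE);
    frame[off + 0] = (uint8_t)(val >> 24);
    frame[off + 1] = (uint8_t)(val >> 16);
    frame[off + 2] = (uint8_t)(val >> 8);
    frame[off + 3] = (uint8_t)val;
}

/* Hypothetical helper: big-endian 32 bit load, like lwz. */
static uint32_t frame_lwz(const uint8_t *frame, size_t off)
{
    assert(off + 4 <= FRAME_SIZE);
    return ((uint32_t)frame[off + 0] << 24) |
           ((uint32_t)frame[off + 1] << 16) |
           ((uint32_t)frame[off + 2] << 8) |
            (uint32_t)frame[off + 3];
}

/* Hypothetical helper: big-endian 64 bit load, like ld. */
static uint64_t frame_ld(const uint8_t *frame, size_t off)
{
    assert(off + 8 <= FRAME_SIZE);
    return ((uint64_t)frame_lwz(frame, off) << 32) | frame_lwz(frame, off + 4);
}
```

With stw/lwz the saved register round-trips through bytes 104..107 and 
bytes 108..111 of the frame are never touched, so the only cost of 
keeping the 64 bit layout is those 4 unused bytes. A 64 bit load from the 
same slot would combine the saved value with whatever stale data sits in 
the upper half of the slot.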

>> With lwz here I found no further 64 bit opcodes and the guest boot code 
>> could walk the device tree. It failed later but I think that's because I'll 
>> need to fill more info about the machine in the device tree. I'll 
>> experiment with that but it looks like it could work at least for MorphOS. 
>> I'll have to try Linux too.
>
> There are plenty of tracepoints, enable them all.

I'm running with -trace enable="vof*" but it does not give me much info, 
as a lot of calls (such as peer, child, etc.) don't log anything other 
than that there was a hypercall, so I only get info about opening paths 
and querying some props. The MorphOS boot.img just walks the device tree 
gathering some data about the machine, then calls quiesce and boots into 
the OS, which later tries to use the gathered info, at which point it 
crashes without any logs if some info is not as expected. This does not 
make it easy to debug, but I think once I fill the device tree with all 
the needed info it should work. Currently I'm missing info about PCI 
devices that it may need.

>>>> Do you have some info on how the stdout works in VOF? I think I'll need 
>>>> that to test with Linux and get output but I'm not sure what's needed on 
>>>> the machine side.
>>> 
>>> VOF opens stdout and stores the ihandle (in fdt) which the client 
>>> (==kernel) uses for writing. To make it work properly, you need to hook up 
>>> that instance to a device backend similar to what I have for spapr-vty:
>>> 
>>> https://github.com/aik/qemu/commit/a381a5b50c23c74013e2bd39cc5dad5b6385965d 
>>> 
>>> This is not a part of this patch as I'm trying to keep things simpler and 
>>> accessing backends from VOF is still unsettled. But there is a workaround 
>>> which is trace_vof_write, I use this. Thanks,
>> 
>> The above patch is about stdin but stdout seems to be added by the current 
>> vof patch. What is spapr-vty?
>
> It is pseries' paravirtual serial device, pegasos does not have it.
>
>> I don't think I have something similar in pegasos2 where I just have a 
>> normal serial port created by ISASuperIO in the vt8231 model.
>
> Correct.
>
>> Can I use that backend somehow or have to create some other serial device 
>> to connect to stdout?
>> Does trace_vof_write work for stuff output by the guest?
>> I guess that's only for things printed by VOF itself
>
> VOF itself does not print anything in this patch.

However it seems to be needed for Linux, as the first thing it does seems 
to be getting /chosen/stdout, and it calls exit if that returns nothing. So 
I'll need this at least for Linux. (I think MorphOS may also query it to 
print a banner or some messages but I'm not sure it needs it; at least it 
does not abort right away if it is not found.)

>> but to see Linux output do I need a stdout in VOF or it will just open the 
>> serial with its own driver and use that?
>> So I'm not sure what the stdout parts in the current vof patch do and 
>> if I need them for anything. I'll try to experiment with it some more but 
>> fixing the ld and Kconfig seems to be enough to get it to work for me.
>
> So for the client to print something, /chosen/stdout needs to have a valid 
> ihandle.
> The only way to get a valid ihandle is having a valid phandle which 
> vof_client_open() can open.
> A valid phandle is a phandle of any node in the device tree. On spapr we pick 
> some spapr-vty, open it and store in /chosen/stdout.
>
> From this point output from the client can be seen via a tracepoint.
>
> Now if we want proper output without tracepoints - we need to hook it up with 
> some chardev backend (not a device such as vt8231 or spapr-vty but a backend).

I don't know much about it, but devices are also connected to some backend, 
so is it possible to use the same backend for VOF as used for the normal 
serial port? I need a way to find that and connect it to VOF, and I'm 
not sure how to do that yet. Or do I need to create a separate serial 
backend and connect that to VOF? I'll try to look at spapr-vty to see what 
it does.

> https://github.com/aik/qemu/commit/a381a5b50c23c74013e2bd3 does this:
> 1. when a phandle is open, QEMU will search for DeviceState* for the specific 
> FDT node and get a chardev from the device.
> 2. when write() is called, QEMU calls qemu_chr_fe_write_all() on chardev from 
> 1.
>
> From this point you do not need a tracepoint and the output will appear in 
> the console you set up for stdout.
>
> Now if you want input from this console, things get tricky. First, on 
> powernv/pseries we only need this for grub as otherwise the kernel has all 
> the drivers needed and will not use the client interface. For grub, we 
> need to provide a valid ihandle for /chosen/stdin, which is easy, but 
> implementing read() on this is not, as there is no simple 
> device-type-independent way of reading from a chardev. I hacked it for 
> spapr-vty but other serial devices will need special handling, or we'll have 
> to introduce some VOF_SERIAL_READ interface for those, which will face 
> opposition :)
>
> Makes sense?

It explains things a bit, but it is still not entirely clear how I can get 
something to add as stdout. The pegasos2 firmware normally puts the serial 
device there, which it inits and opens. Without that firmware we have to 
do that from QEMU somehow: find the serial backend used by the serial 
device within the vt8231 model (or use a different backend just for 
this?), then open it and put it in the device tree. Whether that's 
correct, and how to do it, is not clear yet.
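
To check my understanding of the flow you describe (open on a phandle 
yields an ihandle bound to a backend, write on that ihandle is forwarded 
to the backend), here is a toy model in plain C. Everything in it is 
hypothetical: the toy_* names, the fixed-size table and the callback type 
are made up for illustration and are not the VOF or QEMU chardev API. In 
QEMU the forwarding step would presumably be the qemu_chr_fe_write_all() 
call you mention, which I have not tried yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_INSTANCES 8

/* Hypothetical backend write callback; returns bytes written. */
typedef long (*toy_backend_write_fn)(void *opaque, const void *buf, size_t len);

struct toy_instance {
    uint32_t phandle;            /* node the instance was opened on, 0 = free */
    toy_backend_write_fn write;  /* backend bound at open time, may be NULL */
    void *opaque;                /* backend state passed to the callback */
};

struct toy_vof {
    struct toy_instance inst[TOY_MAX_INSTANCES];
};

/* "open": bind a phandle to a backend; returns an ihandle, 0 on failure. */
static uint32_t toy_open(struct toy_vof *vof, uint32_t phandle,
                         toy_backend_write_fn write, void *opaque)
{
    assert(vof != NULL);
    if (phandle == 0) {
        return 0; /* not a valid device tree node */
    }
    for (uint32_t i = 0; i < TOY_MAX_INSTANCES; i++) {
        if (vof->inst[i].phandle == 0) {
            vof->inst[i].phandle = phandle;
            vof->inst[i].write = write;
            vof->inst[i].opaque = opaque;
            return i + 1; /* ihandle 0 is reserved for "invalid" */
        }
    }
    return 0; /* table full */
}

/* "write": forward client output to the backend behind the ihandle. */
static long toy_write(struct toy_vof *vof, uint32_t ihandle,
                      const void *buf, size_t len)
{
    assert(vof != NULL);
    if (ihandle == 0 || ihandle > TOY_MAX_INSTANCES) {
        return -1;
    }
    struct toy_instance *in = &vof->inst[ihandle - 1];
    if (in->phandle == 0 || in->write == NULL) {
        return -1; /* not open, or no backend hooked up */
    }
    return in->write(in->opaque, buf, len);
}

/* Example backend: collects output in memory instead of a real chardev. */
struct toy_membuf {
    char data[64];
    size_t used;
};

static long toy_membuf_write(void *opaque, const void *buf, size_t len)
{
    struct toy_membuf *mb = opaque;
    size_t room = sizeof(mb->data) - mb->used;
    if (len > room) {
        len = room; /* truncate rather than overflow */
    }
    memcpy(mb->data + mb->used, buf, len);
    mb->used += len;
    return (long)len;
}
```

In this model the machine code would call toy_open() once on the phandle 
of the serial node and store the returned ihandle in /chosen/stdout; a 
client that finds 0 there has nothing to write to, which would match 
Linux calling exit early. What I still need to find out for pegasos2 is 
the real equivalent of the opaque pointer, i.e. how to get at the backend 
behind the vt8231 serial port.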

Regards.
BALATON Zoltan
