* Metadata and signalling channels for Zephyr virtio-backends on Xen
@ 2022-02-07 10:20 Alex Bennée
  2022-02-08  0:16 ` Stefano Stabellini
  2022-02-15 14:09 ` Vincent Guittot
  0 siblings, 2 replies; 10+ messages in thread
From: Alex Bennée @ 2022-02-07 10:20 UTC (permalink / raw)
  To: Stefano Stabellini, Vincent Guittot
  Cc: stratos-dev, xen-devel, AKASHI Takahiro, Arnd Bergmann,
	Christopher Clark, Dmytro Firsov, Julien Grall,
	Volodymyr Babchuk


Hi Stefano,

Vincent gave an update on his virtio-scmi work at the last Stratos sync
call and the discussion moved on to next steps. Currently the demo setup
is intermediated by a double-ended vhost-user daemon running on the
devbox, acting as a go-between for a number of QEMU instances
representing the front- and back-ends. You can view the architecture in
Vincent's diagram here:

  https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing

The key virtq handling is done over special carve-outs of shared
memory between the front-end and the guest. However, the signalling is
currently over a virtio device on the backend. This is useful for the
PoC, but obviously in a real system we don't have a hidden POSIX system
acting as a go-between, not to mention the additional latency it causes
with all those context switches.

I was hoping we could get some more of the Xen experts to the next
Stratos sync (17th Feb) to go over approaches for a solution properly
hosted on Xen. From my recollection of last week (Vincent, please
correct me if I'm wrong), the issues that need solving are:

 * How to handle configuration steps as FE guests come up

The SCMI server will be a long-running persistent backend because it is
managing real HW resources. However, the guests may be ephemeral (or
just restarted), so we can't just hard-code everything in a DTB. While
the virtio negotiation in the config space covers most things, we still
need information like where in the guest's address space the shared
memory lives and at what offset into that the queues are created. As far
as I'm aware, the canonical source of domain information is XenStore
(https://wiki.xenproject.org/wiki/XenStore), but this relies on a
Dom0-type approach. Is there an alternative for dom0less systems, or do
we need a dom0-light approach, for example using STR-21 (Ensure Zephyr
can run cleanly as a Dom0 guest) to provide just enough services for FEs
to register metadata and BEs to read it?
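
As a rough sketch of the kind of metadata exchange in question, the
backend side of a XenStore-based handshake could look something like the
snippet below (using libxenstore on a Linux BE; the path and key are
invented for illustration rather than any agreed protocol, and a Zephyr
BE would need an equivalent xenstore client):

  #include <stdio.h>
  #include <stdlib.h>
  #include <xenstore.h>

  int main(void)
  {
      struct xs_handle *xs = xs_open(0);
      unsigned int len;
      char *val;

      if (!xs)
          return 1;

      /* The FE (domid 2 in this sketch) would have written this node
       * under its own /local/domain/<domid>/data tree, with permissions
       * that let the BE domain read it. */
      val = xs_read(xs, XBT_NULL,
                    "/local/domain/2/data/virtio/scmi/shm-base", &len);
      if (val) {
          printf("FE shared memory base: %s\n", val);
          free(val);
      }

      xs_close(xs);
      return 0;
  }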

 * How to handle mapping of memory

AIUI the Xen model is the FE guest explicitly makes grant table requests
to expose portions of it's memory to other domains. Can the BE query the
hypervisor itself to discover the available grants or does it require
coordination with Dom0/XenStore for that information to be available to
the BE domain?
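
For reference, the BE-side mapping itself is a single libxengnttab call
once the grant reference is known; the sketch below assumes the
reference arrives out of band (XenStore, the ring, ...), since Xen has
no "list all grants" query:

  #include <stdint.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <xengnttab.h>

  void *map_fe_page(uint32_t fe_domid, uint32_t gref)
  {
      xengnttab_handle *xgt = xengnttab_open(NULL, 0);
      void *page;

      if (!xgt)
          return NULL;

      /* Map one page the FE domain has granted us, read/write. */
      page = xengnttab_map_grant_ref(xgt, fe_domid, gref,
                                     PROT_READ | PROT_WRITE);
      if (!page)
          fprintf(stderr, "grant map failed\n");

      /* Keep xgt open while the mapping is in use;
       * xengnttab_unmap(xgt, page, 1) releases it. */
      return page;
  }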

 * How to handle signalling

I guess this requires a minimal implementation of the IOREQ calls for
Zephyr so we can register the handler in the backend? Does the IOREQ API
allow for IPI-style notifications using the global GIC IRQs?
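
For comparison, the sketch below shows the handful of calls a minimal
IOREQ backend makes on Linux via libxendevicemodel; a Zephyr port would
have to issue the equivalent dm_op hypercalls itself (the MMIO window
below is a placeholder):

  #include <xendevicemodel.h>

  int register_mmio_handler(domid_t fe_domid)
  {
      xendevicemodel_handle *dmod = xendevicemodel_open(NULL, 0);
      ioservid_t srvid;

      if (!dmod)
          return -1;

      /* Create an IOREQ server for the FE domain (0 = no buffered ioreqs). */
      if (xendevicemodel_create_ioreq_server(dmod, fe_domid, 0, &srvid))
          return -1;

      /* Claim a virtio-mmio register window so guest accesses trap to us. */
      xendevicemodel_map_io_range_to_ioreq_server(dmod, fe_domid, srvid,
                                                  1 /* MMIO */,
                                                  0x02000000, 0x020001ff);

      /* Start serving; requests are then signalled over event channels. */
      return xendevicemodel_set_ioreq_server_state(dmod, fe_domid, srvid, 1);
  }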

Forgive the incomplete notes from the Stratos sync; I was trying to type
while participating in the discussion, so hopefully this email captures
anything the notes missed:

  https://linaro.atlassian.net/wiki/spaces/STR/pages/28682518685/2022-02-03+Project+Stratos+Sync+Meeting+Notes

Vincent, anything to add?

-- 
Alex Bennée



* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-07 10:20 Metadata and signalling channels for Zephyr virtio-backends on Xen Alex Bennée
@ 2022-02-08  0:16 ` Stefano Stabellini
  2022-02-11 18:20   ` Alex Bennée
  2022-02-15 14:32   ` Vincent Guittot
  2022-02-15 14:09 ` Vincent Guittot
  1 sibling, 2 replies; 10+ messages in thread
From: Stefano Stabellini @ 2022-02-08  0:16 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Stefano Stabellini, Vincent Guittot, stratos-dev, xen-devel,
	AKASHI Takahiro, Arnd Bergmann, Christopher Clark, Dmytro Firsov,
	Julien Grall, Volodymyr Babchuk, Oleksandr_Tyshchenko,
	Artem_Mygaiev, bertrand.marquis, Wei.Chen, Ed.Doxat,
	Oleksii_Moisieiev


On Mon, 7 Feb 2022, Alex Bennée wrote:
> Hi Stefano,
> 
> Vincent gave an update on his virtio-scmi work at the last Stratos sync
> call and the discussion moved onto next steps.

Hi Alex,

I don't know the specifics of virtio-scmi, but if it is about power,
clocks, reset, etc. like the original SCMI protocol, then virtio-scmi is
likely going to be very different from all the other virtio frontends
and backends. That's because SCMI requires a full view of the system,
which is different from something like virtio-net that is limited to the
emulation of 1 device. For this reason, it is likely that the
virtio-scmi backend would be a better fit in Xen itself, rather than run
in userspace inside a VM.

FYI, a good and promising approach to handle both SCMI and SCPI is the
series recently submitted by EPAM to mediate SCMI and SCPI requests in
Xen: https://marc.info/?l=xen-devel&m=163947444032590

(Another "special" virtio backend is virtio-iommu for similar reasons:
the guest p2m address mappings and also the IOMMU drivers are in Xen.
It is not immediately clear whether a virtio-iommu backend would need to
be in Xen or run as a process in dom0/domU.)

On the other hand, for all the other "normal" protocols (e.g.
virtio-net, virtio-block, etc.) the backend would naturally run as a
process in dom0 or domU (e.g. QEMU in Dom0) as one would expect.


> Currently the demo setup
> is intermediated by a double-ended vhost-user daemon running on the
> devbox acting as a go between a number of QEMU instances representing
> the front and back-ends. You can view the architecture with Vincents
> diagram here:
> 
>   https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
> 
> The key virtq handling is done over the special carve outs of shared
> memory between the front end and guest. However the signalling is
> currently over a virtio device on the backend. This is useful for the
> PoC but obviously in a real system we don't have a hidden POSIX system
> acting as a go between not to mention the additional latency it causes
> with all those context switches.
> 
> I was hoping we could get some more of the Xen experts to the next
> Stratos sync (17th Feb) to go over approaches for a properly hosted on
> Xen approach. From my recollection (Vincent please correct me if I'm
> wrong) of last week the issues that need solving are:

Unfortunately I have a regular conflict which prevents me from being
able to join the Stratos calls. However, I can certainly make myself
available for one call (unless something unexpected comes up).


>  * How to handle configuration steps as FE guests come up
> 
> The SCMI server will be a long running persistent backend because it is
> managing real HW resources. However the guests may be ephemeral (or just
> restarted) so we can't just hard-code everything in a DTB. While the
> virtio-negotiation in the config space covers most things we still need
> information like where in the guests address space the shared memory
> lives and at what offset into that the queues are created. As far as I'm
> aware the canonical source of domain information is XenStore
> (https://wiki.xenproject.org/wiki/XenStore) but this relies on a Dom0
> type approach. Is there an alternative for dom0less systems or do we
> need a dom0-light approach, for example using STR-21 (Ensure Zephyr can
> run cleanly as a Dom0 guest) providing just enough services for FE's to
> register metadata and BE's to read it?

I'll try to answer the question for a generic virtio frontend and
backend instead (not SCMI because SCMI is unique due to the reasons
above.)

Yes, xenstore is the easiest way to exchange configuration information
between domains. I think EPAM used xenstore to exchange the
configuration information in their virtio-block demo. There is a way to
use xenstore even between dom0less VMs:
https://marc.info/?l=xen-devel&m=164340547602391 Not just xenstore but
full PV drivers too. However, in the dom0less case xenstore is going to
become available some time after boot, not immediately at startup time.
That's because you need to wait until xenstored is up and running.

There are other ways to send data from one VM to another which are
available immediately at boot, such as Argo and static shared memory.

But dom0less is all about static partitioning, so it makes sense to
exploit the build-time tools to the fullest. In the dom0less case, we
already know what is going to run on the target before it is even turned
on. As an example, we might have already prepared an environment with 3
VMs using Yocto and ImageBuilder. We could also generate all the
configuration needed and place it inside each VM using Yocto's
standard tools and ImageBuilder. So for dom0less, I recommend going
a different route and pre-generating the configuration directly where
needed instead of doing dynamic discovery.
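
As a purely illustrative example of such pre-generated configuration, a
fragment baked into one FE guest's DTB at build time could describe both
the virtio-mmio window and the shared-memory carve-out directly (all
addresses, sizes and the IRQ number below are made up):

  reserved-memory {
          #address-cells = <2>;
          #size-cells = <2>;
          ranges;

          /* carve-out for the virtqueues / payload buffers */
          virtio_shm: virtio-shm@47000000 {
                  reg = <0x0 0x47000000 0x0 0x100000>;
                  no-map;
          };
  };

  virtio@2000000 {
          compatible = "virtio,mmio";
          reg = <0x0 0x02000000 0x0 0x200>;
          interrupts = <0 16 1>;  /* SPI 16, edge-rising */
  };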


>  * How to handle mapping of memory
> 
> AIUI the Xen model is the FE guest explicitly makes grant table requests
> to expose portions of it's memory to other domains. Can the BE query the
> hypervisor itself to discover the available grants or does it require
> coordination with Dom0/XenStore for that information to be available to
> the BE domain?

Typically the frontend passes grant table references to the backend
(i.e. instead of plain guest physical addresses on the virtio ring.)
Then, the backend maps the grants; Xen checks that the mapping is
allowed.

We might be able to use the same model with virtio devices. A special
pseudo-IOMMU driver in Linux would return a grant table reference and an
offset as "DMA address". The "DMA address" is passed to the virtio
backend over the virtio ring. The backend would map the grant table
reference using the regular grant table hypercalls.
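
One possible (purely illustrative) encoding of such a grant-based "DMA
address" is sketched below; the marker bit and layout are invented here
and do not describe any existing driver:

  #include <stdint.h>

  #define GRANT_DMA_MARKER  (1ULL << 63)  /* flags this as a grant address */
  #define PAGE_SHIFT_4K     12
  #define PAGE_MASK_4K      ((1u << PAGE_SHIFT_4K) - 1)

  /* pseudo-IOMMU "map": grant reference + offset -> DMA address */
  static inline uint64_t grant_to_dma(uint32_t gref, uint32_t offset)
  {
      return GRANT_DMA_MARKER |
             ((uint64_t)gref << PAGE_SHIFT_4K) |
             (offset & PAGE_MASK_4K);
  }

  /* backend side: DMA address -> grant reference + offset */
  static inline uint32_t dma_to_gref(uint64_t dma)
  {
      return (uint32_t)((dma & ~GRANT_DMA_MARKER) >> PAGE_SHIFT_4K);
  }

  static inline uint32_t dma_to_offset(uint64_t dma)
  {
      return (uint32_t)(dma & PAGE_MASK_4K);
  }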


>  * How to handle signalling
> 
> I guess this requires a minimal implementation of the IOREQ calls for
> Zephyr so we can register the handler in the backend? Does the IOREQ API
> allow for a IPI style notifications using the global GIC IRQs?
> 
> Forgive the incomplete notes from the Stratos sync, I was trying to type
> while participating in the discussion so hopefully this email captures
> what was missed:
> 
>   https://linaro.atlassian.net/wiki/spaces/STR/pages/28682518685/2022-02-03+Project+Stratos+Sync+Meeting+Notes

Yes, any emulation backend (including virtio backends) would require an
IOREQ implementation, which includes notifications via event channels.
Event channels are delivered as a GIC PPI interrupt to the Linux kernel.
Then, the kernel sends the notification to userspace via a file
descriptor.
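
In userspace that file-descriptor path typically ends up looking like
the sketch below (libxenevtchn; the port numbers are placeholders):

  #include <poll.h>
  #include <stdint.h>
  #include <xenevtchn.h>

  int wait_for_notifications(uint32_t fe_domid, uint32_t remote_port)
  {
      xenevtchn_handle *xce = xenevtchn_open(NULL, 0);
      xenevtchn_port_or_error_t local, port;
      struct pollfd pfd;

      if (!xce)
          return -1;

      /* Bind to the FE's event channel: this is the signalling path. */
      local = xenevtchn_bind_interdomain(xce, fe_domid, remote_port);
      if (local < 0)
          return -1;

      pfd.fd = xenevtchn_fd(xce);
      pfd.events = POLLIN;

      for (;;) {
          poll(&pfd, 1, -1);
          port = xenevtchn_pending(xce);
          if (port >= 0) {
              /* ... process the pending ioreq / virtqueue here ... */
              xenevtchn_unmask(xce, port);
              xenevtchn_notify(xce, local);  /* kick the FE back */
          }
      }
  }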


* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-08  0:16 ` Stefano Stabellini
@ 2022-02-11 18:20   ` Alex Bennée
  2022-02-11 23:34     ` Stefano Stabellini
  2022-02-15 14:32   ` Vincent Guittot
  1 sibling, 1 reply; 10+ messages in thread
From: Alex Bennée @ 2022-02-11 18:20 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vincent Guittot, stratos-dev, xen-devel, AKASHI Takahiro,
	Arnd Bergmann, Christopher Clark, Dmytro Firsov, Julien Grall,
	Volodymyr Babchuk, Oleksandr_Tyshchenko, Artem_Mygaiev,
	bertrand.marquis, Wei.Chen, Ed.Doxat, Oleksii_Moisieiev


Stefano Stabellini <stefano.stabellini@xilinx.com> writes:

> On Mon, 7 Feb 2022, Alex Bennée wrote:
>> Hi Stefano,
>> 
>> Vincent gave an update on his virtio-scmi work at the last Stratos sync
>> call and the discussion moved onto next steps.
>
> Hi Alex,
>
> I don't know the specifics of virtio-scmi, but if it is about power,
> clocks, reset, etc. like the original SCMI protocol, then virtio-scmi is
> likely going to be very different from all the other virtio frontends
> and backends. That's because SCMI requires a full view of the system,
> which is different from something like virtio-net that is limited to the
> emulation of 1 device. For this reason, it is likely that the
> virtio-scmi backend would be a better fit in Xen itself, rather than run
> in userspace inside a VM.

That may be a good solution for Xen but I still think it's worthwhile
being able to package SCMI in a VM for other hypervisors. We just
happen to be using Xen as a nice type-1 example.

Vincent's SCMI server code is portable anyway and can reside in a Zephyr
app, a firmware blob or a userspace vhost-user client.

> FYI, a good and promising approach to handle both SCMI and SCPI is the
> series recently submitted by EPAM to mediate SCMI and SCPI requests in
> Xen: https://marc.info/?l=xen-devel&m=163947444032590
>
> (Another "special" virtio backend is virtio-iommu for similar reasons:
> the guest p2m address mappings and also the IOMMU drivers are in Xen.
> It is not immediately clear whether a virtio-iommu backend would need to
> be in Xen or run as a process in dom0/domU.)
>
> On the other hand, for all the other "normal" protocols (e.g.
> virtio-net, virtio-block, etc.) the backend would naturally run as a
> process in dom0 or domU (e.g. QEMU in Dom0) as one would expect.

Can domUs not be given particular access to HW they might want to
tweak? I assume at some point a block device backend needs to actually
talk to real HW to store the blocks (even if in most cases it would be a
kernel doing the HW access on its behalf).

>> Currently the demo setup
>> is intermediated by a double-ended vhost-user daemon running on the
>> devbox acting as a go between a number of QEMU instances representing
>> the front and back-ends. You can view the architecture with Vincents
>> diagram here:
>> 
>>   https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
>> 
>> The key virtq handling is done over the special carve outs of shared
>> memory between the front end and guest. However the signalling is
>> currently over a virtio device on the backend. This is useful for the
>> PoC but obviously in a real system we don't have a hidden POSIX system
>> acting as a go between not to mention the additional latency it causes
>> with all those context switches.
>> 
>> I was hoping we could get some more of the Xen experts to the next
>> Stratos sync (17th Feb) to go over approaches for a properly hosted on
>> Xen approach. From my recollection (Vincent please correct me if I'm
>> wrong) of last week the issues that need solving are:
>
> Unfortunately I have a regular conflict which prevents me from being
> able to join the Stratos calls. However, I can certainly make myself
> available for one call (unless something unexpected comes up).
>
>
>>  * How to handle configuration steps as FE guests come up
>> 
>> The SCMI server will be a long running persistent backend because it is
>> managing real HW resources. However the guests may be ephemeral (or just
>> restarted) so we can't just hard-code everything in a DTB. While the
>> virtio-negotiation in the config space covers most things we still need
>> information like where in the guests address space the shared memory
>> lives and at what offset into that the queues are created. As far as I'm
>> aware the canonical source of domain information is XenStore
>> (https://wiki.xenproject.org/wiki/XenStore) but this relies on a Dom0
>> type approach. Is there an alternative for dom0less systems or do we
>> need a dom0-light approach, for example using STR-21 (Ensure Zephyr can
>> run cleanly as a Dom0 guest) providing just enough services for FE's to
>> register metadata and BE's to read it?
>
> I'll try to answer the question for a generic virtio frontend and
> backend instead (not SCMI because SCMI is unique due to the reasons
> above.)
>
> Yes, xenstore is the easiest way to exchange configuration information
> between domains. I think EPAM used xenstore to exchange the
> configuration information in their virtio-block demo. There is a way to
> use xenstore even between dom0less VMs:
> https://marc.info/?l=xen-devel&m=164340547602391 Not just xenstore but
> full PV drivers too. However, in the dom0less case xenstore is going to
> become available some time after boot, not immediately at startup time.
> That's because you need to wait until xenstored is up and running.
>
> There are other ways to send data from one VM to another which are
> available immediately at boot, such as Argo and static shared memory.
>
> But dom0less is all about static partitioning, so it makes sense to
> exploit the build-time tools to the fullest. In the dom0less case, we
> already know what is going to run on the target before it is even turned
> on. As an example, we might have already prepared an environment with 3
> VMs using Yocto and ImageBuilder. We could also generate all
> configurations needed and place them inside each VMs using Yocto's
> standard tools and ImageBuilder. So for dom0less, I recommend to go via
> a different route and pre-generate the configuration directly where
> needed instead of doing dynamic discovery.

Even in a full dom0less setup, you still need to manage lifetimes
somehow if a guest reboots.

>
>
>>  * How to handle mapping of memory
>> 
>> AIUI the Xen model is the FE guest explicitly makes grant table requests
>> to expose portions of it's memory to other domains. Can the BE query the
>> hypervisor itself to discover the available grants or does it require
>> coordination with Dom0/XenStore for that information to be available to
>> the BE domain?
>
> Typically the frontend passes grant table references to the backend
> (i.e. instead of plain guest physical addresses on the virtio ring.)
> Then, the backend maps the grants; Xen checks that the mapping is
> allowed.
>
> We might be able to use the same model with virtio devices. A special
> pseudo-IOMMU driver in Linux would return a grant table reference and an
> offset as "DMA address". The "DMA address" is passed to the virtio
> backend over the virtio ring. The backend would map the grant table
> reference using the regular grant table hypercalls.
>
>
>>  * How to handle signalling
>> 
>> I guess this requires a minimal implementation of the IOREQ calls for
>> Zephyr so we can register the handler in the backend? Does the IOREQ API
>> allow for a IPI style notifications using the global GIC IRQs?
>> 
>> Forgive the incomplete notes from the Stratos sync, I was trying to type
>> while participating in the discussion so hopefully this email captures
>> what was missed:
>> 
>>   https://linaro.atlassian.net/wiki/spaces/STR/pages/28682518685/2022-02-03+Project+Stratos+Sync+Meeting+Notes
>
> Yes, any emulation backend (including virtio backends) would require an
> IOREQ implementation, which includes notifications via event channels.
> Event channels are delivered as a GIC PPI interrupt to the Linux kernel.
> Then, the kernel sends the notification to userspace via a file
> descriptor.

Thanks.

-- 
Alex Bennée



* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-11 18:20   ` Alex Bennée
@ 2022-02-11 23:34     ` Stefano Stabellini
  2022-02-15 14:47       ` Vincent Guittot
  0 siblings, 1 reply; 10+ messages in thread
From: Stefano Stabellini @ 2022-02-11 23:34 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Stefano Stabellini, Vincent Guittot, stratos-dev, xen-devel,
	AKASHI Takahiro, Arnd Bergmann, Christopher Clark, Dmytro Firsov,
	Julien Grall, Volodymyr Babchuk, Oleksandr_Tyshchenko,
	Artem_Mygaiev, bertrand.marquis, Wei.Chen, Ed.Doxat,
	Oleksii_Moisieiev


On Fri, 11 Feb 2022, Alex Bennée wrote:
> > FYI, a good and promising approach to handle both SCMI and SCPI is the
> > series recently submitted by EPAM to mediate SCMI and SCPI requests in
> > Xen: https://marc.info/?l=xen-devel&m=163947444032590
> >
> > (Another "special" virtio backend is virtio-iommu for similar reasons:
> > the guest p2m address mappings and also the IOMMU drivers are in Xen.
> > It is not immediately clear whether a virtio-iommu backend would need to
> > be in Xen or run as a process in dom0/domU.)
> >
> > On the other hand, for all the other "normal" protocols (e.g.
> > virtio-net, virtio-block, etc.) the backend would naturally run as a
> > process in dom0 or domU (e.g. QEMU in Dom0) as one would expect.
> 
> Can domU's not be given particular access to HW they might want to
> tweak? I assume at some point a block device backend needs to actually
> talk to real HW to store the blocks (even if in most cases it would be a
> kernel doing the HW access on it's behalf).

Yes, it would. Block and network are subsystems with limited
visibility, limited access, and limited ability to do harm (assuming an
IOMMU).

If the block device goes down or is misused, block might not work, but
everything else is expected to work. Block only requires visibility of
the block device for it to work. The same is true for network, GPU, USB,
etc.

SCMI is different. If SCMI is misused, the whole platform is affected.
SCMI implies visibility of everything in the system. It is not so much
about emulating SCMI as about mediating SCMI calls.

In other words, SCMI is not a device, it is a core interface. In a Xen
model, Xen virtualizes CPU and memory and other core features/interfaces
(timers, interrupt controller, IOMMU, etc). The PCI root complex is
handled by Xen too. Individual (PCI and non-PCI) devices are assigned to
guests.

These are the reasons why I think the best way to enable SCMI in
upstream Xen is with a mediator in the hypervisor, as is currently in
development. Any chance you could combine your efforts with EPAM's
outstanding series? You might be able to spot gaps, if any, and might
even already have code to fill those gaps. It would be fantastic to have
your reviews and/or contributions on xen-devel.

Otherwise, if you have to run the virtio-scmi backend in userspace, why
not try to get it to work on Xen :-) It might not be the ideal solution,
but it could be a good learning experience and pave the way for the
other virtio backends, which will definitely be in userspace
(virtio-block, virtio-gpu, etc).


> >> Currently the demo setup
> >> is intermediated by a double-ended vhost-user daemon running on the
> >> devbox acting as a go between a number of QEMU instances representing
> >> the front and back-ends. You can view the architecture with Vincents
> >> diagram here:
> >> 
> >>   https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
> >> 
> >> The key virtq handling is done over the special carve outs of shared
> >> memory between the front end and guest. However the signalling is
> >> currently over a virtio device on the backend. This is useful for the
> >> PoC but obviously in a real system we don't have a hidden POSIX system
> >> acting as a go between not to mention the additional latency it causes
> >> with all those context switches.
> >> 
> >> I was hoping we could get some more of the Xen experts to the next
> >> Stratos sync (17th Feb) to go over approaches for a properly hosted on
> >> Xen approach. From my recollection (Vincent please correct me if I'm
> >> wrong) of last week the issues that need solving are:
> >
> > Unfortunately I have a regular conflict which prevents me from being
> > able to join the Stratos calls. However, I can certainly make myself
> > available for one call (unless something unexpected comes up).
> >
> >
> >>  * How to handle configuration steps as FE guests come up
> >> 
> >> The SCMI server will be a long running persistent backend because it is
> >> managing real HW resources. However the guests may be ephemeral (or just
> >> restarted) so we can't just hard-code everything in a DTB. While the
> >> virtio-negotiation in the config space covers most things we still need
> >> information like where in the guests address space the shared memory
> >> lives and at what offset into that the queues are created. As far as I'm
> >> aware the canonical source of domain information is XenStore
> >> (https://wiki.xenproject.org/wiki/XenStore) but this relies on a Dom0
> >> type approach. Is there an alternative for dom0less systems or do we
> >> need a dom0-light approach, for example using STR-21 (Ensure Zephyr can
> >> run cleanly as a Dom0 guest) providing just enough services for FE's to
> >> register metadata and BE's to read it?
> >
> > I'll try to answer the question for a generic virtio frontend and
> > backend instead (not SCMI because SCMI is unique due to the reasons
> > above.)
> >
> > Yes, xenstore is the easiest way to exchange configuration information
> > between domains. I think EPAM used xenstore to exchange the
> > configuration information in their virtio-block demo. There is a way to
> > use xenstore even between dom0less VMs:
> > https://marc.info/?l=xen-devel&m=164340547602391 Not just xenstore but
> > full PV drivers too. However, in the dom0less case xenstore is going to
> > become available some time after boot, not immediately at startup time.
> > That's because you need to wait until xenstored is up and running.
> >
> > There are other ways to send data from one VM to another which are
> > available immediately at boot, such as Argo and static shared memory.
> >
> > But dom0less is all about static partitioning, so it makes sense to
> > exploit the build-time tools to the fullest. In the dom0less case, we
> > already know what is going to run on the target before it is even turned
> > on. As an example, we might have already prepared an environment with 3
> > VMs using Yocto and ImageBuilder. We could also generate all
> > configurations needed and place them inside each VMs using Yocto's
> > standard tools and ImageBuilder. So for dom0less, I recommend to go via
> > a different route and pre-generate the configuration directly where
> > needed instead of doing dynamic discovery.
> 
> Even in a full dom0less setup you still need to manage lifetimes somehow
> if a guest reboots.

Sure, but that's not a problem: all the info and configuration related
to rebooting the guest can also be pre-generated in Yocto or
ImageBuilder.

As an example, it is already possible (although rudimentary) in
ImageBuilder to generate the dom0less configuration and also the domU xl
config file for the same domU with passthrough devices.
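
For context, the kind of domU xl config such tooling can generate for a
passthrough setup looks roughly like the sketch below (names, paths,
addresses and IRQ numbers are placeholders; xl.cfg(5) is the
authoritative reference):

  name        = "domu-be"
  kernel      = "/boot/zephyr.bin"
  memory      = 256
  vcpus       = 1
  # partial device tree describing the passed-through node to the guest
  device_tree = "/boot/passthrough.dtb"
  # host device-tree node, MMIO pages and IRQ handed over to the guest
  dtdev       = [ "/soc/i2c@ff030000" ]
  iomem       = [ "ff030,1" ]
  irqs        = [ 122 ]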


* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-07 10:20 Metadata and signalling channels for Zephyr virtio-backends on Xen Alex Bennée
  2022-02-08  0:16 ` Stefano Stabellini
@ 2022-02-15 14:09 ` Vincent Guittot
  1 sibling, 0 replies; 10+ messages in thread
From: Vincent Guittot @ 2022-02-15 14:09 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Stefano Stabellini, stratos-dev, xen-devel, AKASHI Takahiro,
	Arnd Bergmann, Christopher Clark, Dmytro Firsov, Julien Grall,
	Volodymyr Babchuk

Hi All,

Sorry for the late reply, but I was off last week. I will go through
the thread and try to answer the open points.

On Mon, 7 Feb 2022 at 11:56, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Hi Stefano,
>
> Vincent gave an update on his virtio-scmi work at the last Stratos sync
> call and the discussion moved onto next steps. Currently the demo setup
> is intermediated by a double-ended vhost-user daemon running on the
> devbox acting as a go between a number of QEMU instances representing
> the front and back-ends. You can view the architecture with Vincents
> diagram here:
>
>   https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
>
> The key virtq handling is done over the special carve outs of shared
> memory between the front end and guest. However the signalling is
> currently over a virtio device on the backend. This is useful for the
> PoC but obviously in a real system we don't have a hidden POSIX system
> acting as a go between not to mention the additional latency it causes
> with all those context switches.
>
> I was hoping we could get some more of the Xen experts to the next
> Stratos sync (17th Feb) to go over approaches for a properly hosted on
> Xen approach. From my recollection (Vincent please correct me if I'm
> wrong) of last week the issues that need solving are:
>
>  * How to handle configuration steps as FE guests come up
>
> The SCMI server will be a long running persistent backend because it is
> managing real HW resources. However the guests may be ephemeral (or just
> restarted) so we can't just hard-code everything in a DTB. While the
> virtio-negotiation in the config space covers most things we still need
> information like where in the guests address space the shared memory
> lives and at what offset into that the queues are created. As far as I'm
> aware the canonical source of domain information is XenStore
> (https://wiki.xenproject.org/wiki/XenStore) but this relies on a Dom0
> type approach. Is there an alternative for dom0less systems or do we
> need a dom0-light approach, for example using STR-21 (Ensure Zephyr can
> run cleanly as a Dom0 guest) providing just enough services for FE's to
> register metadata and BE's to read it?
>
>  * How to handle mapping of memory
>
> AIUI the Xen model is the FE guest explicitly makes grant table requests
> to expose portions of it's memory to other domains. Can the BE query the
> hypervisor itself to discover the available grants or does it require
> coordination with Dom0/XenStore for that information to be available to
> the BE domain?

I have noticed that it is possible to share memory between VMs in the
VM config file, which seems quite similar to what is done with QEMU to
share a memory object between VMs.
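
(For reference, the xl mechanism being alluded to is presumably the
static shared memory feature; a sketch of the syntax is below, with
invented names and addresses -- check xl.cfg(5) on the Xen version in
use for the exact keys.)

  # in the config of the domain that owns the region
  static_shm = [ "id=virtio-shm0, begin=0x47000000, size=0x100000, role=master" ]

  # in the config of the peer domain mapping the same region
  static_shm = [ "id=virtio-shm0, begin=0x47000000, size=0x100000, role=slave" ]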
>
>  * How to handle signalling
>
> I guess this requires a minimal implementation of the IOREQ calls for
> Zephyr so we can register the handler in the backend? Does the IOREQ API
> allow for a IPI style notifications using the global GIC IRQs?
>
> Forgive the incomplete notes from the Stratos sync, I was trying to type
> while participating in the discussion so hopefully this email captures
> what was missed:
>
>   https://linaro.atlassian.net/wiki/spaces/STR/pages/28682518685/2022-02-03+Project+Stratos+Sync+Meeting+Notes
>
> Vincent, anything to add?

I want to use an interface that is not tied to a hypervisor; that's
why I have reused virtio_mmio to emulate the device side, where the
backend can get the virtqueue description.
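
Roughly, the device-side bookkeeping that gives the backend the
virtqueue description looks like the sketch below (register offsets per
the virtio-mmio spec; the trap mechanism and error handling are
omitted):

  #include <stdint.h>

  #define VIRTIO_MMIO_QUEUE_SEL        0x030
  #define VIRTIO_MMIO_QUEUE_NUM        0x038
  #define VIRTIO_MMIO_QUEUE_READY      0x044
  #define VIRTIO_MMIO_QUEUE_DESC_LOW   0x080
  #define VIRTIO_MMIO_QUEUE_DESC_HIGH  0x084

  struct vq_info {
      uint32_t num;     /* queue size programmed by the FE */
      uint64_t desc;    /* guest-physical address of the descriptor table */
      /* avail (0x090/0x094) and used (0x0a0/0x0a4) are captured the same way */
      int ready;
  };

  static struct vq_info vqs[2];
  static uint32_t sel;

  /* Called from whatever traps the FE's MMIO writes (IOREQ, vhost-user...). */
  void virtio_mmio_write(uint64_t offset, uint32_t val)
  {
      switch (offset) {
      case VIRTIO_MMIO_QUEUE_SEL:
          sel = val % 2;   /* two queues in this toy example */
          break;
      case VIRTIO_MMIO_QUEUE_NUM:
          vqs[sel].num = val;
          break;
      case VIRTIO_MMIO_QUEUE_DESC_LOW:
          vqs[sel].desc = (vqs[sel].desc & ~0xffffffffULL) | val;
          break;
      case VIRTIO_MMIO_QUEUE_DESC_HIGH:
          vqs[sel].desc = (vqs[sel].desc & 0xffffffffULL) | ((uint64_t)val << 32);
          break;
      case VIRTIO_MMIO_QUEUE_READY:
          vqs[sel].ready = val;   /* the queue is now fully described */
          break;
      }
  }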

>
> --
> Alex Bennée



* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-08  0:16 ` Stefano Stabellini
  2022-02-11 18:20   ` Alex Bennée
@ 2022-02-15 14:32   ` Vincent Guittot
  2022-02-16 21:45     ` Stefano Stabellini
  1 sibling, 1 reply; 10+ messages in thread
From: Vincent Guittot @ 2022-02-15 14:32 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Alex Bennée, stratos-dev, xen-devel, AKASHI Takahiro,
	Arnd Bergmann, Christopher Clark, Dmytro Firsov, Julien Grall,
	Volodymyr Babchuk, Oleksandr_Tyshchenko, Artem_Mygaiev,
	bertrand.marquis, Wei.Chen, Ed.Doxat, Oleksii_Moisieiev

Hi Stefano,

On Tue, 8 Feb 2022 at 01:16, Stefano Stabellini
<stefano.stabellini@xilinx.com> wrote:
>
> On Mon, 7 Feb 2022, Alex Bennée wrote:
> > Hi Stefano,
> >
> > Vincent gave an update on his virtio-scmi work at the last Stratos sync
> > call and the discussion moved onto next steps.
>
> Hi Alex,
>
> I don't know the specifics of virtio-scmi, but if it is about power,
> clocks, reset, etc. like the original SCMI protocol, then virtio-scmi is

virtio-scmi is one transport channel that supports the SCMI protocol

> likely going to be very different from all the other virtio frontends

The virtio-scmi front-end is merged in mainline

> and backends. That's because SCMI requires a full view of the system,
> which is different from something like virtio-net that is limited to the
> emulation of 1 device. For this reason, it is likely that the
> virtio-scmi backend would be a better fit in Xen itself, rather than run
> in userspace inside a VM.

Not sure what you mean when you say that SCMI requires a full view of
the system. If you are referring to the system-wide resources which
reset or power up/down the whole SoC, this is not really what we are
targeting here. Those system-wide resources should already be handled
by a dedicated power coprocessor. In our case, the IPs of the SoC will
be handled by different VMs, but those IPs usually share common
resources like a parent PLL, a power domain or a clock-gating register,
to give a few examples. Because all those VMs can't directly set these
resources without taking the others into account, and because the power
coprocessor doesn't have an unlimited number of channels, we add an
SCMI backend that will gather and proxy the VMs' requests before
accessing the register that gates an IP's clocks, for example, or
before powering down an external regulator shared between the camera
and another device. This SCMI backend will most probably also send
requests with OSPM-level permissions to the power coprocessor once it
has aggregated all the VMs' requests.
We are using the virtio-scmi protocol because it has the main advantage
of not being tied to a hypervisor.

In our PoC, the SCMI backend runs on Zephyr and reuses the same
software that can run in the power coprocessor, which helps split what
is critical and must be handled by the power coprocessor from what is
not critical for the system (typically what is managed by Linux
directly when no hypervisor is involved).
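
As a toy illustration of that gather-and-proxy idea (the vote scheme,
names and the way requests are forwarded are all simplified and
invented):

  #include <stdbool.h>
  #include <stdint.h>

  #define MAX_VMS 8

  struct shared_clk {
      uint32_t enable_votes[MAX_VMS];   /* per-VM enable count */
  };

  static bool any_vote(const struct shared_clk *c)
  {
      for (int i = 0; i < MAX_VMS; i++)
          if (c->enable_votes[i])
              return true;
      return false;
  }

  /* Called when a clock enable/disable request arrives on VM 'vm's channel. */
  void scmi_clock_request(struct shared_clk *c, unsigned int vm, bool enable)
  {
      bool was_needed = any_vote(c);

      if (enable)
          c->enable_votes[vm]++;
      else if (c->enable_votes[vm])
          c->enable_votes[vm]--;

      if (!was_needed && any_vote(c)) {
          /* first user: ungate the clock / forward to the power coprocessor */
      } else if (was_needed && !any_vote(c)) {
          /* last user gone: now it is safe to gate the clock */
      }
  }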

>
> FYI, a good and promising approach to handle both SCMI and SCPI is the
> series recently submitted by EPAM to mediate SCMI and SCPI requests in
> Xen: https://marc.info/?l=xen-devel&m=163947444032590
>
> (Another "special" virtio backend is virtio-iommu for similar reasons:
> the guest p2m address mappings and also the IOMMU drivers are in Xen.
> It is not immediately clear whether a virtio-iommu backend would need to
> be in Xen or run as a process in dom0/domU.)
>
> On the other hand, for all the other "normal" protocols (e.g.
> virtio-net, virtio-block, etc.) the backend would naturally run as a
> process in dom0 or domU (e.g. QEMU in Dom0) as one would expect.
>
>
> > Currently the demo setup
> > is intermediated by a double-ended vhost-user daemon running on the
> > devbox acting as a go between a number of QEMU instances representing
> > the front and back-ends. You can view the architecture with Vincents
> > diagram here:
> >
> >   https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
> >
> > The key virtq handling is done over the special carve outs of shared
> > memory between the front end and guest. However the signalling is
> > currently over a virtio device on the backend. This is useful for the
> > PoC but obviously in a real system we don't have a hidden POSIX system
> > acting as a go between not to mention the additional latency it causes
> > with all those context switches.
> >
> > I was hoping we could get some more of the Xen experts to the next
> > Stratos sync (17th Feb) to go over approaches for a properly hosted on
> > Xen approach. From my recollection (Vincent please correct me if I'm
> > wrong) of last week the issues that need solving are:
>
> Unfortunately I have a regular conflict which prevents me from being
> able to join the Stratos calls. However, I can certainly make myself
> available for one call (unless something unexpected comes up).
>
>
> >  * How to handle configuration steps as FE guests come up
> >
> > The SCMI server will be a long running persistent backend because it is
> > managing real HW resources. However the guests may be ephemeral (or just
> > restarted) so we can't just hard-code everything in a DTB. While the
> > virtio-negotiation in the config space covers most things we still need
> > information like where in the guests address space the shared memory
> > lives and at what offset into that the queues are created. As far as I'm
> > aware the canonical source of domain information is XenStore
> > (https://wiki.xenproject.org/wiki/XenStore) but this relies on a Dom0
> > type approach. Is there an alternative for dom0less systems or do we
> > need a dom0-light approach, for example using STR-21 (Ensure Zephyr can
> > run cleanly as a Dom0 guest) providing just enough services for FE's to
> > register metadata and BE's to read it?
>
> I'll try to answer the question for a generic virtio frontend and
> backend instead (not SCMI because SCMI is unique due to the reasons
> above.)
>
> Yes, xenstore is the easiest way to exchange configuration information
> between domains. I think EPAM used xenstore to exchange the
> configuration information in their virtio-block demo. There is a way to
> use xenstore even between dom0less VMs:
> https://marc.info/?l=xen-devel&m=164340547602391 Not just xenstore but
> full PV drivers too. However, in the dom0less case xenstore is going to
> become available some time after boot, not immediately at startup time.
> That's because you need to wait until xenstored is up and running.
>
> There are other ways to send data from one VM to another which are
> available immediately at boot, such as Argo and static shared memory.
>
> But dom0less is all about static partitioning, so it makes sense to
> exploit the build-time tools to the fullest. In the dom0less case, we
> already know what is going to run on the target before it is even turned
> on. As an example, we might have already prepared an environment with 3
> VMs using Yocto and ImageBuilder. We could also generate all
> configurations needed and place them inside each VMs using Yocto's
> standard tools and ImageBuilder. So for dom0less, I recommend to go via
> a different route and pre-generate the configuration directly where
> needed instead of doing dynamic discovery.
>
>
> >  * How to handle mapping of memory
> >
> > AIUI the Xen model is the FE guest explicitly makes grant table requests
> > to expose portions of it's memory to other domains. Can the BE query the
> > hypervisor itself to discover the available grants or does it require
> > coordination with Dom0/XenStore for that information to be available to
> > the BE domain?
>
> Typically the frontend passes grant table references to the backend
> (i.e. instead of plain guest physical addresses on the virtio ring.)
> Then, the backend maps the grants; Xen checks that the mapping is
> allowed.
>
> We might be able to use the same model with virtio devices. A special
> pseudo-IOMMU driver in Linux would return a grant table reference and an
> offset as "DMA address". The "DMA address" is passed to the virtio
> backend over the virtio ring. The backend would map the grant table
> reference using the regular grant table hypercalls.
>
>
> >  * How to handle signalling
> >
> > I guess this requires a minimal implementation of the IOREQ calls for
> > Zephyr so we can register the handler in the backend? Does the IOREQ API
> > allow for a IPI style notifications using the global GIC IRQs?
> >
> > Forgive the incomplete notes from the Stratos sync, I was trying to type
> > while participating in the discussion so hopefully this email captures
> > what was missed:
> >
> >   https://linaro.atlassian.net/wiki/spaces/STR/pages/28682518685/2022-02-03+Project+Stratos+Sync+Meeting+Notes
>
> Yes, any emulation backend (including virtio backends) would require an
> IOREQ implementation, which includes notifications via event channels.
> Event channels are delivered as a GIC PPI interrupt to the Linux kernel.
> Then, the kernel sends the notification to userspace via a file
> descriptor.



* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-11 23:34     ` Stefano Stabellini
@ 2022-02-15 14:47       ` Vincent Guittot
  0 siblings, 0 replies; 10+ messages in thread
From: Vincent Guittot @ 2022-02-15 14:47 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Alex Bennée, stratos-dev, xen-devel, AKASHI Takahiro,
	Arnd Bergmann, Christopher Clark, Dmytro Firsov, Julien Grall,
	Volodymyr Babchuk, Oleksandr_Tyshchenko, Artem_Mygaiev,
	bertrand.marquis, Wei.Chen, Ed.Doxat, Oleksii_Moisieiev

On Sat, 12 Feb 2022 at 00:34, Stefano Stabellini
<stefano.stabellini@xilinx.com> wrote:
>
> On Fri, 11 Feb 2022, Alex Bennée wrote:
> > > FYI, a good and promising approach to handle both SCMI and SCPI is the
> > > series recently submitted by EPAM to mediate SCMI and SCPI requests in
> > > Xen: https://marc.info/?l=xen-devel&m=163947444032590
> > >
> > > (Another "special" virtio backend is virtio-iommu for similar reasons:
> > > the guest p2m address mappings and also the IOMMU drivers are in Xen.
> > > It is not immediately clear whether a virtio-iommu backend would need to
> > > be in Xen or run as a process in dom0/domU.)
> > >
> > > On the other hand, for all the other "normal" protocols (e.g.
> > > virtio-net, virtio-block, etc.) the backend would naturally run as a
> > > process in dom0 or domU (e.g. QEMU in Dom0) as one would expect.
> >
> > Can domU's not be given particular access to HW they might want to
> > tweak? I assume at some point a block device backend needs to actually
> > talk to real HW to store the blocks (even if in most cases it would be a
> > kernel doing the HW access on it's behalf).
>
> Yes, it would. Block and network are subsystems with limited visibility,
> access, and harmful capabilities (assuming IOMMU).
>
> If the block device goes down or is misused, block might not work but
> everything else is expected to work. Block only requires visibility of
> the block device for it to work. The same is true for network, GPU, USB,
> etc.
>
> SCMI is different. If SCMI is misused the whole platform is affected.
> SCMI implies visibility of everything in the system. It is not much
> about emulating SCMI but more about mediating SCMI calls.
>
> In other words, SCMI is not a device, it is a core interface. In a Xen
> model, Xen virtualizes CPU and memory and other core features/interfaces
> (timers, interrupt controller, IOMMU, etc). The PCI root complex is
> handled by Xen too. Individual (PCI and non-PCI) devices are assigned to
> guests.
>
> These are the reasons why I think the best way to enable SCMI in
> upstream Xen is with a mediator in the hypervisor as it is currently in
> development. Any chances you could combine your efforts with EPAM's
> outstanding series? You might be able to spot gaps if any, and might
> even have already code to fill those gaps. It would be fantastic to have
> your reviews and/or contributions on xen-devel.
>
> Otherwise, if you have to run the virtio-scmi backend in userspace, why

Just to clarify, the goal is not to run the SCMI backend as a Linux
userspace app but to run a virtual power coprocessor that will handle
everything which is not system-critical and which will change from one
product to another, which makes it quite hard to maintain in the
hypervisor.

I have only looked at the cover letter, which mentions the use of SMC
calls that will be trapped by Xen before being modified and forwarded
to ATF. AFAICT, the ATF execution context is quite simple and
synchronous with the request. In our case, we want to be able to manage
an I2C device, for example, or to notify VMs with asynchronous events
like sensor or performance changes, which virtio-scmi supports.


> not try to get it to work on Xen :-) It might not be the ideal solution,
> but it could be a good learning experience and pave the way for the
> other virtio backends which definitely will be in userspace
> (virtio-block, virtio-gpu, etc).
>
>
> > >> Currently the demo setup
> > >> is intermediated by a double-ended vhost-user daemon running on the
> > >> devbox acting as a go between a number of QEMU instances representing
> > >> the front and back-ends. You can view the architecture with Vincents
> > >> diagram here:
> > >>
> > >>   https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
> > >>
> > >> The key virtq handling is done over the special carve outs of shared
> > >> memory between the front end and guest. However the signalling is
> > >> currently over a virtio device on the backend. This is useful for the
> > >> PoC but obviously in a real system we don't have a hidden POSIX system
> > >> acting as a go between not to mention the additional latency it causes
> > >> with all those context switches.
> > >>
> > >> I was hoping we could get some more of the Xen experts to the next
> > >> Stratos sync (17th Feb) to go over approaches for a properly hosted on
> > >> Xen approach. From my recollection (Vincent please correct me if I'm
> > >> wrong) of last week the issues that need solving are:
> > >
> > > Unfortunately I have a regular conflict which prevents me from being
> > > able to join the Stratos calls. However, I can certainly make myself
> > > available for one call (unless something unexpected comes up).
> > >
> > >
> > >>  * How to handle configuration steps as FE guests come up
> > >>
> > >> The SCMI server will be a long running persistent backend because it is
> > >> managing real HW resources. However the guests may be ephemeral (or just
> > >> restarted) so we can't just hard-code everything in a DTB. While the
> > >> virtio-negotiation in the config space covers most things we still need
> > >> information like where in the guests address space the shared memory
> > >> lives and at what offset into that the queues are created. As far as I'm
> > >> aware the canonical source of domain information is XenStore
> > >> (https://wiki.xenproject.org/wiki/XenStore) but this relies on a Dom0
> > >> type approach. Is there an alternative for dom0less systems or do we
> > >> need a dom0-light approach, for example using STR-21 (Ensure Zephyr can
> > >> run cleanly as a Dom0 guest) providing just enough services for FE's to
> > >> register metadata and BE's to read it?
> > >
> > > I'll try to answer the question for a generic virtio frontend and
> > > backend instead (not SCMI because SCMI is unique due to the reasons
> > > above.)
> > >
> > > Yes, xenstore is the easiest way to exchange configuration information
> > > between domains. I think EPAM used xenstore to exchange the
> > > configuration information in their virtio-block demo. There is a way to
> > > use xenstore even between dom0less VMs:
> > > https://marc.info/?l=xen-devel&m=164340547602391 Not just xenstore but
> > > full PV drivers too. However, in the dom0less case xenstore is going to
> > > become available some time after boot, not immediately at startup time.
> > > That's because you need to wait until xenstored is up and running.
> > >
> > > There are other ways to send data from one VM to another which are
> > > available immediately at boot, such as Argo and static shared memory.
> > >
> > > But dom0less is all about static partitioning, so it makes sense to
> > > exploit the build-time tools to the fullest. In the dom0less case, we
> > > already know what is going to run on the target before it is even turned
> > > on. As an example, we might have already prepared an environment with 3
> > > VMs using Yocto and ImageBuilder. We could also generate all
> > > configurations needed and place them inside each VMs using Yocto's
> > > standard tools and ImageBuilder. So for dom0less, I recommend to go via
> > > a different route and pre-generate the configuration directly where
> > > needed instead of doing dynamic discovery.
> >
> > Even in a full dom0less setup you still need to manage lifetimes somehow
> > if a guest reboots.
>
> Sure but that's not a problem: all the info and configuration related to
> rebooting the guest can also be pre-generated in Yocto or ImageBuilder.
>
> As an example, it is already possible (although rudimental) in
> ImageBuilder to generate the dom0less configuration and also the domU xl
> config file for the same domU with passthrough devices.



* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-15 14:32   ` Vincent Guittot
@ 2022-02-16 21:45     ` Stefano Stabellini
  2022-02-17 13:48       ` Vincent Guittot
  0 siblings, 1 reply; 10+ messages in thread
From: Stefano Stabellini @ 2022-02-16 21:45 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Stefano Stabellini, Alex Bennée, stratos-dev, xen-devel,
	AKASHI Takahiro, Arnd Bergmann, Christopher Clark, Dmytro Firsov,
	Julien Grall, Volodymyr Babchuk, Oleksandr_Tyshchenko,
	Artem_Mygaiev, bertrand.marquis, Wei.Chen, Ed.Doxat,
	Oleksii_Moisieiev


On Tue, 15 Feb 2022, Vincent Guittot wrote:
> On Tue, 8 Feb 2022 at 01:16, Stefano Stabellini
> <stefano.stabellini@xilinx.com> wrote:
> >
> > On Mon, 7 Feb 2022, Alex Bennée wrote:
> > > Hi Stefano,
> > >
> > > Vincent gave an update on his virtio-scmi work at the last Stratos sync
> > > call and the discussion moved onto next steps.
> >
> > Hi Alex,
> >
> > I don't know the specifics of virtio-scmi, but if it is about power,
> > clocks, reset, etc. like the original SCMI protocol, then virtio-scmi is
> 
> virtio-scmi is one transport channel that support SCMI protocol
> 
> > likely going to be very different from all the other virtio frontends
> 
> The virtio-scmi front-end is merged mainline
> 
> > and backends. That's because SCMI requires a full view of the system,
> > which is different from something like virtio-net that is limited to the
> > emulation of 1 device. For this reason, it is likely that the
> > virtio-scmi backend would be a better fit in Xen itself, rather than run
> > in userspace inside a VM.
> 
> Not sure what you mean when you say that SCMI requires a full view of
> the system.

SCMI can be used to read the status of resources in the system and
typically leads to a full view of the system's resources.

If I assign the USB controller to a VM, I expect that VM to only "see"
the USB controller and any attached USB peripherals, in addition to the
other regular virtual resources that a VM commonly has.

If I assign SCMI to a VM, I expect the VM to "see" everything in the
system thanks to the SCMI probing functions. Unless we only assign a
single SCMI channel with limited capabilities to the VM, like EPAM's
patch series on xen-devel is doing.


> If you are referring to the system wide resources which
> reset or power up/down the whole SoC, this is not really what we are
> targeting here. Those system wide resources should already be handled
> by a dedicated power coprocessor. In our case, the IPs of the SoC will
> be handled by different VMs but those IPs are usually sharing common
> resources like a parent PLL , a power domain or a clock gating reg as
> few examples. Because all those VMs can't directly set these resources
> without taking into account others and because the power coprocessor
> doesn't have an unlimited number of channels, we add an SCMI backend
> that will gather and proxy the VM request before accessing the
> register that gates some clocks IP as an example or before powering
> down an external regulator shared between the camera and another
> device.

Do you know what would be the expected number of SCMI channels available
in a "normal" deployment?

My expectation was that there would be enough SCMI channels to give one
for each VM in a common embedded scenario, where the number of VMs is
typically not very high. If we have enough channels so that we can
assign each channel to a different VM, maybe we can get away without a
proxy?


> This SCMI backend will most probably also send request with
> OSPM permission access to the power coprocessor once aggregating all
> the VMs ' request

Please correct me if I am wrong, but I would have expected the SCMI
firmware to be able to do reference counting on the hardware resources
and therefore be able to handle the case where:

- we have 2 VMs
- each VM has its own SCMI channel
- a VM requests power-off on 1 resource also used by the other VM

My understanding of the SCMI protocol is that the SCMI firmware
implementation should detect that the resource in question is also in
use by another VM/channel and thus refuse the power-off operation. (For
your information, that is also how the Xilinx EEMI protocol works.)

Reference counting is a key requirement for a good multi-channel
implementation. If SCMI doesn't support it today, then we have a
problem with SCMI multi-channel, regardless of virtualization.
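
A sketch of that expected behaviour (the structures and return codes
below are invented for illustration):

  #include <stdint.h>

  #define SCMI_DENIED  -3   /* placeholder error code */
  #define SCMI_SUCCESS  0

  struct power_domain {
      uint32_t users;   /* bitmask: one bit per channel/VM using this domain */
  };

  int power_domain_off(struct power_domain *d, unsigned int channel)
  {
      uint32_t me = 1u << channel;

      /* Refuse while any other channel still counts on this domain. */
      if (d->users & ~me)
          return SCMI_DENIED;

      d->users &= ~me;
      /* ... actually power the domain off here ... */
      return SCMI_SUCCESS;
  }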


> We are using virtio-cmi protocol because it has the main advantage of
> not being tied to an hypervisor

That is a valuable goal, which is a bit different from the goal of
finding the best SCMI architecture for Xen, and that's OK. Let's see if
we can find any common ground and synergies we can exploit to improve
both goals. I'll join the Stratos meeting tomorrow.


> In our PoC, the SCMI backend is running with zehyr and reuse the same
> software that can run in the power coprocessor which helps splitting
> what is critical and must be handled by power coprocessor and what is
> not critical for the system (what is usually managed by linux directly
> when their no hypervisor involved typically)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-16 21:45     ` Stefano Stabellini
@ 2022-02-17 13:48       ` Vincent Guittot
  2022-02-17 21:38         ` Stefano Stabellini
  0 siblings, 1 reply; 10+ messages in thread
From: Vincent Guittot @ 2022-02-17 13:48 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Alex Bennée, stratos-dev, xen-devel, AKASHI Takahiro,
	Arnd Bergmann, Christopher Clark, Dmytro Firsov, Julien Grall,
	Volodymyr Babchuk, Oleksandr_Tyshchenko, Artem_Mygaiev,
	bertrand.marquis, Wei.Chen, Ed.Doxat, Oleksii_Moisieiev

On Wed, 16 Feb 2022 at 22:45, Stefano Stabellini
<stefano.stabellini@xilinx.com> wrote:
>
> On Tue, 15 Feb 2022, Vincent Guittot wrote:
> > On Tue, 8 Feb 2022 at 01:16, Stefano Stabellini
> > <stefano.stabellini@xilinx.com> wrote:
> > >
> > > On Mon, 7 Feb 2022, Alex Bennée wrote:
> > > > Hi Stefano,
> > > >
> > > > Vincent gave an update on his virtio-scmi work at the last Stratos sync
> > > > call and the discussion moved onto next steps.
> > >
> > > Hi Alex,
> > >
> > > I don't know the specifics of virtio-scmi, but if it is about power,
> > > clocks, reset, etc. like the original SCMI protocol, then virtio-scmi is
> >
> > virtio-scmi is one transport channel that supports the SCMI protocol
> >
> > > likely going to be very different from all the other virtio frontends
> >
> > The virtio-scmi front-end is merged mainline
> >
> > > and backends. That's because SCMI requires a full view of the system,
> > > which is different from something like virtio-net that is limited to the
> > > emulation of 1 device. For this reason, it is likely that the
> > > virtio-scmi backend would be a better fit in Xen itself, rather than run
> > > in userspace inside a VM.
> >
> > Not sure what you mean when you say that SCMI requires a full view of
> > the system.
>
> SCMI can be used to read the status of resources in the system and
> typically leads to a full view of the system's resources.
>
> If I assign the USB controller to a VM, I expect that VM to only "see"
> the USB controller and any attached USB peripherals, in addition to the
> other regular virtual resources that a VM commonly has.
>
> If I assign SCMI to a VM, I expect the VM to "see" everything in the
> system thanks to the SCMI probing functions. Unless we only assign a
> single SCMI channel with limited capabilities to the VM, like EPAM's
> patch series on xen-devel is doing.


>
>
> > If you are referring to the system-wide resources which
> > reset or power up/down the whole SoC, this is not really what we are
> > targeting here. Those system-wide resources should already be handled
> > by a dedicated power coprocessor. In our case, the IPs of the SoC will
> > be handled by different VMs, but those IPs usually share common
> > resources like a parent PLL, a power domain or a clock gating register,
> > to give a few examples. Because all those VMs can't directly set these
> > resources without taking the others into account, and because the power
> > coprocessor doesn't have an unlimited number of channels, we add an
> > SCMI backend that will gather and proxy the VM requests before
> > accessing, say, the register that gates some clock IPs or before
> > powering down an external regulator shared between the camera and
> > another device.
>
> Do you know what would be the expected number of SCMI channels available
> in a "normal" deployment?

I don't have a fixed value, but it can range from a few to hundreds
depending on the product.

>
> My expectation was that there would be enough SCMI channels to give one
> for each VM in a common embedded scenario, where the number of VMs is
> typically not very high. If we have enough channels so that we can
> assign each channel to a different VM maybe we can get away without a
> proxy?

If you have enough HW channels to give one to each VM, then you don't
need anything, including what EPAM is proposing. But part of my
requirement is that we don't have enough HW resources, and we don't want
to update the secure power coprocessor for each and every product.

In SCMI, each channel/agent can have its own view of the available
resources; typically only the ATF/PSCI channel can power off the whole
system, not the OSPM agents.

>
>
> > This SCMI backend will most probably also send requests with
> > OSPM permission access to the power coprocessor once it has aggregated
> > all the VMs' requests
>
> Please correct me if I am wrong, but I would have expected the SCMI
> firmware to be able to do reference counting on the hardware resources
> and therefore be able to handle the case where:

The example that I have in mind is:

Current case: Linux takes care of a clock divider that provides clocks
for, let's say, the video decoder and the SD controller, but its parent
is a system clock shared with other subsystems and managed by the power
coprocessor. Linux will first send a request to the coprocessor to
enable the system clock, then set the divider and probably ungate the
clock at the HW IP level.

Now we move the SD card into VMA and the video decoder into VMB. The
SCMI server VM takes care of the clock divider and the clock gating
register. VMA sends a request to the SCMI backend, which
aggregates/refcounts it with VMB's request. When the backend needs to
enable the clock divider, it first sends a request to the coprocessor
for the system clock.

The coprocessor stays unchanged and the SCMI backend can be adjusted
per product.
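
To make the aggregation concrete, here is a minimal sketch of the kind
of refcounting the SCMI backend VM would do (plain C; the clock IDs and
the coproc_clk_enable()/gate_reg_write() helpers are made-up stand-ins,
not the actual Zephyr code):

  #include <stdbool.h>
  #include <stdio.h>

  /* made-up IDs: SYS_CLK is the parent owned by the power coprocessor,
     DIV_CLK is the divider/gate handled by the SCMI backend VM */
  enum { SYS_CLK, DIV_CLK, NR_CLKS };

  static int refcount[NR_CLKS];

  /* stand-ins for the real coprocessor channel and the gating register */
  static void coproc_clk_enable(int id, bool on) { printf("coproc clk%d %d\n", id, on); }
  static void gate_reg_write(int id, bool on)    { printf("gate   clk%d %d\n", id, on); }

  /* one CLOCK_CONFIG_SET request arriving on a VM's SCMI channel */
  static void scmi_backend_clk_set(int vm, int clk, bool enable)
  {
      (void)vm;                   /* would drive per-channel permission checks */
      if (enable) {
          if (refcount[clk]++ == 0) {
              coproc_clk_enable(SYS_CLK, true);   /* first user: ask for parent */
              gate_reg_write(clk, true);
          }
      } else if (refcount[clk] > 0 && --refcount[clk] == 0) {
          gate_reg_write(clk, false);
          coproc_clk_enable(SYS_CLK, false);      /* last user: release parent */
      }
  }

  int main(void)
  {
      scmi_backend_clk_set(1, DIV_CLK, true);     /* VMA: parent + gate enabled */
      scmi_backend_clk_set(2, DIV_CLK, true);     /* VMB: refcount only         */
      scmi_backend_clk_set(1, DIV_CLK, false);    /* still held by VMB          */
      scmi_backend_clk_set(2, DIV_CLK, false);    /* gate + parent released     */
      return 0;
  }

The point is that only the divider/gate handling and the refcounting
live in the backend VM; the coprocessor still sees a single aggregated
client.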

>
> - we have 2 VMs
> - each VM has its own SCMI channel
> - a VM requests power-off on 1 resource also used by the other VM

Yes, it has refcounting, but you assume that you have enough HW channels.
It also implies that the coprocessor firmware is different for each and
every end product, because you will have to adjust the SCMI permissions
of each channel.

>
> My understanding of the SCMI protocol is that the SCMI firmware
> implementation should detect that the resource in question is also
> in-use by another VM/channel and thus it would refuse the power-off
> operation. (For your information, that is also how the Xilinx EEMI
> protocol works.)
>
> Reference counting is a key requirement for a good multi-channel
> implementation. If SCMI doesn't support it today, then we have a
> problem with SCMI multi-channel, regardless of virtualization.
>
>
> > We are using the virtio-scmi protocol because it has the main advantage
> > of not being tied to a hypervisor
>
> That is a valuable goal, which is a bit different from the goal of
> finding the best SCMI architecture for Xen, and that's OK. Let's see if
> we can find any common ground and synergies we can exploit to improve
> both goals. I'll join the Stratos meeting tomorrow.

Great


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Metadata and signalling channels for Zephyr virtio-backends on Xen
  2022-02-17 13:48       ` Vincent Guittot
@ 2022-02-17 21:38         ` Stefano Stabellini
  0 siblings, 0 replies; 10+ messages in thread
From: Stefano Stabellini @ 2022-02-17 21:38 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Stefano Stabellini, Alex Bennée, stratos-dev, xen-devel,
	AKASHI Takahiro, Arnd Bergmann, Christopher Clark, Dmytro Firsov,
	Julien Grall, Volodymyr Babchuk, Oleksandr_Tyshchenko,
	Artem_Mygaiev, bertrand.marquis, Wei.Chen, Ed.Doxat,
	Oleksii_Moisieiev

[-- Attachment #1: Type: text/plain, Size: 322 bytes --]

Hi Vincent,

I am replying to this thread to follow up on this morning's discussion.

I am attaching the simple patch that I mentioned during the call to add
event channel support to guest kernels; see xen.h.

I am also attaching a toy example application that makes use of it, just
to give you an idea.

Cheers,

Stefano

[-- Attachment #2: Type: text/x-chdr; name=xen.h, Size: 6042 bytes --]

/* SPDX-License-Identifier: (BSD-3-Clause) */
/*
 * Xen definitions, hypercalls, and functions used to setup event
 * channels and send and receive event notifications.
 */

#ifndef XEN_H
#define XEN_H

#include <stdint.h>
#include <stddef.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define GUEST_EVTCHN_PPI        31
#define DOMID_SELF              0x7FF0U

struct vcpu_time_info {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    uint8_t  flags;
    uint8_t  pad1[2];
} __attribute__((__packed__)); /* 32 bytes */

struct pvclock_wall_clock {
    uint32_t version;
    uint32_t sec;
    uint32_t nsec;
    uint32_t sec_hi;
} __attribute__((__packed__));

struct arch_vcpu_info { };
struct arch_shared_info { };

struct vcpu_info {
    uint8_t evtchn_upcall_pending;
    uint8_t evtchn_upcall_mask;
    uint64_t evtchn_pending_sel;
    struct arch_vcpu_info arch;
    struct vcpu_time_info time;
};

struct shared_info {
    struct vcpu_info vcpu_info[1];
    uint64_t evtchn_pending[sizeof(uint64_t) * 8];
    uint64_t evtchn_mask[sizeof(uint64_t) * 8];

    struct pvclock_wall_clock wc;
    uint32_t wc_sec_hi;
    struct arch_shared_info arch;
};

#define active_evtchns(cpu,sh,idx)              \
    ((sh)->evtchn_pending[idx] &                \
     ~(sh)->evtchn_mask[idx])

#define HYPERVISOR_memory_op            12
#define HYPERVISOR_xen_version          17
#define HYPERVISOR_console_io           18
#define HYPERVISOR_grant_table_op       20
#define HYPERVISOR_vcpu_op              24
#define HYPERVISOR_xsm_op               27
#define HYPERVISOR_sched_op             29
#define HYPERVISOR_callback_op          30
#define HYPERVISOR_event_channel_op     32
#define HYPERVISOR_physdev_op           33
#define HYPERVISOR_hvm_op               34
#define HYPERVISOR_sysctl               35
#define HYPERVISOR_domctl               36
#define HYPERVISOR_argo_op              39
#define HYPERVISOR_dm_op                41
#define HYPERVISOR_hypfs_op             42


/* hypercalls */
static inline int64_t xen_hypercall(unsigned long arg0, unsigned long arg1,
                                    unsigned long arg2, unsigned long arg3,
                                    unsigned long hypercall)
{
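    /* Xen/Arm64 hypercall ABI: this helper passes up to four arguments in
       x0..x3, the hypercall number goes in x16, and 0xEA1 is the
       Xen-specific hvc immediate (hypercall tag). */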
    register uintptr_t a0 asm("x0") = arg0;
    register uintptr_t a1 asm("x1") = arg1;
    register uintptr_t a2 asm("x2") = arg2;
    register uintptr_t a3 asm("x3") = arg3;
    register uintptr_t nr asm("x16") = hypercall;
    asm volatile("hvc 0xea1\n"
                     : "=r" (a0), "=r"(a1), "=r" (a2), "=r" (a3), "=r" (nr)
                     : "0" (a0),
                       "r" (a1),
                       "r" (a2),
                       "r" (a3),
                       "r" (nr));
    return a0;
}


/* console_io */
#define CONSOLEIO_write 0


/* memory_op */
#define XENMAPSPACE_shared_info  0 /* shared info page */
#define XENMAPSPACE_grant_table  1 /* grant table page */

#define XENMEM_add_to_physmap      7

struct xen_add_to_physmap {
    /* Which domain to change the mapping for. */
    uint16_t domid;

    /* Number of pages to go through for gmfn_range */
    uint16_t    size;

    /* Source mapping space. */
    unsigned int space;

    /* Index into source mapping space. */
    uint64_t idx;

    /* GPFN where the source mapping page should appear. */
    uint64_t gpfn;
};

static inline int xen_register_shared_info(struct shared_info *shared_info)
{
    int rc;
    struct xen_add_to_physmap xatp;
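
    /*
     * XENMEM_add_to_physmap with XENMAPSPACE_shared_info asks Xen to map
     * the hypervisor-maintained shared_info page at the guest frame given
     * in gpfn, replacing whatever memory backed that address before.
     */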

    xatp.domid = DOMID_SELF;
    xatp.idx = 0;
    xatp.space = XENMAPSPACE_shared_info;
    xatp.gpfn = ((unsigned long)shared_info) >> 12;
    rc = xen_hypercall(XENMEM_add_to_physmap, (unsigned long)&xatp, 0, 0,
                       HYPERVISOR_memory_op);
    return rc;
}


/* event_channel_op */
#define EVTCHNOP_bind_interdomain 0
#define EVTCHNOP_close            3
#define EVTCHNOP_send             4
#define EVTCHNOP_status           5
#define EVTCHNOP_alloc_unbound    6
#define EVTCHNOP_unmask           9

struct evtchn_bind_interdomain {
    /* IN parameters. */
    uint16_t remote_dom;
    uint32_t remote_port;
    /* OUT parameters. */
    uint32_t local_port;
};

struct evtchn_alloc_unbound {
    /* IN parameters */
    uint16_t dom, remote_dom;
    /* OUT parameters */
    uint32_t port;
};

struct evtchn_send {
    /* IN parameters. */
    uint32_t port;
};


/* printf */
static inline void xen_console_write(const char *str)
{
    ssize_t len = strlen(str);

    xen_hypercall(CONSOLEIO_write, len, (unsigned long)str, 0,
                  HYPERVISOR_console_io);
}

static inline void xen_printf(const char *fmt, ...)
{
    char buf[128];
    va_list ap;
    char *str = &buf[0];
    memset(buf, 0x0, 128);

    va_start(ap, fmt);
    vsprintf(str, fmt, ap);
    va_end(ap);

    xen_console_write(buf);
}


/* 
 * utility functions, not xen specific, but needed by the function
 * below
 */
#define xchg(ptr,v) __atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)

static __inline__ unsigned long __ffs(unsigned long word)
{
        return __builtin_ctzl(word);
}


/* event handling */
static inline void handle_event_irq(struct shared_info *s,
                                    void (*do_event)(unsigned int event))
{
    uint64_t  l1, l2, l1i, l2i;
    unsigned int   port;
    int            cpu = 0;
    struct vcpu_info   *vcpu_info = &s->vcpu_info[cpu];

    vcpu_info->evtchn_upcall_pending = 0;
    mb();
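
    /*
     * Two-level scan: evtchn_pending_sel is a bitmap of which 64-bit words
     * of shared_info->evtchn_pending hold pending ports, and each of those
     * words covers 64 consecutive port numbers.
     */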

    l1 = xchg(&vcpu_info->evtchn_pending_sel, 0);
    while ( l1 != 0 )
    {
        l1i = __ffs(l1);
        l1 &= ~(1UL << l1i);
        l2 = xchg(&s->evtchn_pending[l1i], 0);

        while ( l2 != 0 )
        {
            l2i = __ffs(l2);
            l2 &= ~(1UL << l2i);

            port = (l1i * sizeof(uint64_t) * 8) + l2i; /* 64 ports per word */

            do_event(port);
        }
    }
}

#endif /* XEN_H */

[-- Attachment #3: Type: text/x-csrc; name=example.c, Size: 4132 bytes --]

/* SPDX-License-Identifier: (BSD-3-Clause) */
/*
 * TBM application to send and receive Xen event channels.
 *
 * Written by Stefano Stabellini
 */
#define _MINIC_SOURCE

#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include "sys.h"

#include "drivers/arm/gic.h"

#include "xen.h"

static uint16_t domid = 0;
static struct shared_info *shared_info = 0;
/* statically configured shared memory at address 0x7fe00000 */
static char* shared_mem = (char *)0x7fe00000;

static void print_event(unsigned int event)
{
    xen_printf("handle_event domid=%u event=%u\n", domid, event);
}

static void irq_handler(struct excp_frame *f)
{
    uint32_t irq;

    irq = gic_ack_irq(GIC_CPU_BASE);

    handle_event_irq(shared_info, print_event);

    gic_end_of_irq(GIC_CPU_BASE, irq);
    gic_deactivate_irq(GIC_CPU_BASE, irq);
    local_cpu_ei();
}

static void gic_init(int irq)
{
    assert(irq < 32);

    /* Disable interrupts while we configure the GIC.  */
    local_cpu_di();

    /* Setup the GIC.  */
    gicd_set_irq_group(GIC_DIST_BASE, irq, 0);
    gicd_set_irq_target(GIC_DIST_BASE, irq, 0);
    gicd_enable_irq(GIC_DIST_BASE, irq);
    gicd_set_irq_group(GIC_DIST_BASE, 5, 0);
    gicd_set_irq_target(GIC_DIST_BASE, 5, 0);
    gicd_enable_irq(GIC_DIST_BASE, 5);

    writel(GIC_DIST_BASE + GICD_CTRL, 3);
    writel(GIC_CPU_BASE + GICC_CTRL, 3);
    writel(GIC_CPU_BASE + GICC_PMR, 0xff);
    mb();
    local_cpu_ei();
}

void debug_get_domid()
{
    register uintptr_t a0 asm("x0");
    __asm__ __volatile__("hvc 0xfffd\n" 
            : "=r" (a0)
            : "0" (a0));
    domid = a0;
}

void app_run(void)
{
    int ret = 0;
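
    /*
     * Flow: both domains register a shared_info page and take the event
     * channel PPI. The domain with domid 1 allocates an unbound port for
     * domid 2 and publishes it through the statically shared page; domid 2
     * waits for the "go" marker, binds to that port and sends a
     * notification, which shows up in domid 1's irq_handler.
     */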

    /* Setup GIC and interrupt handler for Xen events */
    gic_init(GUEST_EVTCHN_PPI);
    aarch64_set_irq_h(irq_handler);

    /* Register shared_info page */
    shared_info = aligned_alloc(4096, 4096);
    memset(shared_info, 0x0, 4096);
    xen_register_shared_info(shared_info);

    /* Get our domid with debug hypercall */
    debug_get_domid();
    xen_printf("DEBUG domid=%d\n", domid);

    /* If domid == 1 allocate an unbound event to receive notifications */
    if (domid == 1) {
        uint16_t remote_domid = 2;
        struct evtchn_alloc_unbound alloc;

        alloc.dom = DOMID_SELF;
        alloc.remote_dom = remote_domid;
        alloc.port = 0;

        ret = xen_hypercall(EVTCHNOP_alloc_unbound, (unsigned long)&alloc,
                            0, 0, HYPERVISOR_event_channel_op);
        mb();

        xen_printf("DEBUG domid=%d alloc_unbound ret=%d port=%u\n", domid, ret, alloc.port);

        /* first message to signal readiness */
        memcpy(shared_mem, "go", sizeof("go"));
        mb();
        /* send port number to other domain */
        memcpy(shared_mem + 4, &alloc.port, sizeof(alloc.port));

    /* if domid == 2 bind to foreign event channel and send event notifications */
    } else {
        uint16_t remote_domid = 1;
        uint16_t remote_port;
        struct evtchn_bind_interdomain bind;
        struct evtchn_send send;

        /* wait for readiness signal */
        while (1) {
            if (strcmp(shared_mem, "go") == 0)
                break;
            mb();
        }
        mb();
        /* read port number of the other domain */
        memcpy(&remote_port, shared_mem + 4, sizeof(remote_port));

        xen_printf("DEBUG domid=%d remote_port=%u\n", domid, remote_port);

        bind.remote_dom = remote_domid;
        bind.remote_port = remote_port;
        bind.local_port = 0;
        ret = xen_hypercall(EVTCHNOP_bind_interdomain, (unsigned long)&bind,
                            0, 0, HYPERVISOR_event_channel_op);

        xen_printf("DEBUG domid=%d bind_interdomain ret=%d local_port=%u\n", domid, ret, bind.local_port);

        send.port = bind.local_port;
        xen_hypercall(EVTCHNOP_send, (unsigned long)&send,
                      0, 0, HYPERVISOR_event_channel_op);
    }

    while (1)
        ;
}

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-02-17 21:38 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-07 10:20 Metadata and signalling channels for Zephyr virtio-backends on Xen Alex Bennée
2022-02-08  0:16 ` Stefano Stabellini
2022-02-11 18:20   ` Alex Bennée
2022-02-11 23:34     ` Stefano Stabellini
2022-02-15 14:47       ` Vincent Guittot
2022-02-15 14:32   ` Vincent Guittot
2022-02-16 21:45     ` Stefano Stabellini
2022-02-17 13:48       ` Vincent Guittot
2022-02-17 21:38         ` Stefano Stabellini
2022-02-15 14:09 ` Vincent Guittot
