* ARM/PCI passthrough: libxl_pci, sysfs and pciback questions
From: Oleksandr Andrushchenko @ 2020-10-27  9:59 UTC
  To: Bertrand Marquis, andrew.cooper3, george.dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, wl,
	Roger Pau Monné,
	paul, Artem Mygaiev, Oleksandr Tyshchenko, xen-devel,
	Rahul Singh

Hello, all!

While working on PCI passthrough on ARM (a partial RFC was published by ARM
earlier this year) I tried to implement some related changes in the toolstack.
One of the obstacles for ARM is the presence of PCI backend related code: ARM
is going to fully emulate an ECAM host bridge in Xen, so no PCI
backend/frontend pair is going to be used.

If my understanding is correct, the functionality implemented by pciback and
the toolstack which is relevant/needed for ARM is the following:

  1. pciback is used as a database for assignable PCI devices, e.g. xl
     pci-assignable-{add|remove|list} manipulates that list. So, whenever the
     toolstack needs to know which PCI devices can be passed through, it reads
     that from the relevant sysfs entries of pciback (see the sketch after
     this list).

  2. pciback is used to hold the unbound PCI devices, e.g. when passing through a
     PCI device it needs to be unbound from the relevant device driver and bound
     to pciback (strictly speaking it is not required that the device is bound to
     pciback, but pciback is again used as a database of the passed-through PCI
     devices, so we can re-bind the devices to their original drivers when the
     guest domain shuts down)

  3. the toolstack depends on Domain-0 for discovering PCI device resources
     which are then permitted for the guest domain, e.g. MMIO ranges and IRQs
     are read from sysfs

  4. the toolstack is responsible for resetting PCI devices being passed
     through, via the sysfs reset entry of Domain-0’s PCI bus subsystem

  5. the toolstack is responsible for ensuring that devices are passed through
     with all relevant functions, e.g. for multifunction devices all the
     functions are passed to a domain and no partial passthrough is done

  6. toolstack cares about SR-IOV devices (am I correct here?)
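
A minimal sketch of item 1, roughly modelled on what the toolstack does on
Linux today (paths and parsing simplified; this is an illustration only, not
the actual libxl code):

#include <dirent.h>
#include <stdio.h>

#define PCIBACK_DRIVER_PATH "/sys/bus/pci/drivers/pciback"

/*
 * Enumerate assignable devices: every PCI device currently bound to
 * pciback shows up as a "DDDD:BB:DD.F" named entry in the driver's
 * sysfs directory.
 */
static void list_assignable(void)
{
    DIR *dir = opendir(PCIBACK_DRIVER_PATH);
    struct dirent *de;
    unsigned int dom, bus, dev, func;

    if (!dir)
        return; /* pciback not loaded: nothing is assignable */

    while ((de = readdir(dir)) != NULL) {
        if (sscanf(de->d_name, "%x:%x:%x.%x", &dom, &bus, &dev, &func) == 4)
            printf("%04x:%02x:%02x.%x\n", dom, bus, dev, func);
    }
    closedir(dir);
}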


I have implemented a really dirty POC for that which I would need to clean up
before showing, but before that I would like to get some feedback and advice on
how to proceed with the above. I suggest we:

  1. Move all pciback related code (which seems to become x86 code only) into a
     dedicated file, something like tools/libxl/libxl_pci_x86.c

  2. Make the functionality now provided by pciback architecture dependent, so
     tools/libxl/libxl_pci.c delegates actual assignable device list handling to
     that arch code and uses some sort of “ops”, e.g.
     arch->ops.get_all_assignable, arch->ops.add_assignable etc. (This could also
     be done with “#ifdef CONFIG_PCIBACK”, but that seems less clean.) Introduce
     tools/libxl/libxl_pci_arm.c to provide the ARM implementation; see the
     sketch after this list.

  3. ARM only: As we do not have pciback on ARM we need to have some storage for
     assignable device list: move that into Xen by extending struct pci_dev with
     “bool assigned” and providing sysctls for manipulating that, e.g.
     XEN_SYSCTL_pci_device_{set|get}_assigned,
     XEN_SYSCTL_pci_device_enum_assigned (to enumerate/get the list of
     assigned/not-assigned PCI devices). Can this also be interesting for x86? At
     the moment it seems that x86 does rely on pciback presence, so probably this
     change might not be interesting for x86 world, but may allow stripping
     pciback functionality a bit and making the code common to both ARM and x86.

  4. ARM only: It is not clear how to handle re-binding of the PCI driver on
     guest shutdown: we need to store the sysfs path of the original driver the
     device was bound to. Do we also want to store that in struct pci_dev?

  5. An alternative route for 3-4 could be to store that data in XenStore, e.g.
     MMIOs, IRQ, bind sysfs path etc. This would require more code on Xen side to
     access XenStore and won’t work if MMIOs/IRQs are passed via device tree/ACPI
     tables by the bootloaders.
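
To make item 2 a bit more concrete, the indirection I have in mind looks
roughly like the following (all names are tentative, none of this exists in
libxl today):

/* Per-OS/arch backend for assignable-device handling. */
typedef struct libxl__pci_backend_ops {
    int (*list_assignable)(libxl__gc *gc, libxl_device_pci **list, int *num);
    int (*add_assignable)(libxl__gc *gc, libxl_device_pci *pci);
    int (*remove_assignable)(libxl__gc *gc, libxl_device_pci *pci);
    int (*reset)(libxl__gc *gc, libxl_device_pci *pci);
} libxl__pci_backend_ops;

/* Instantiated by the current pciback/sysfs based code... */
extern const libxl__pci_backend_ops libxl__pci_ops_linux_x86;
/* ...and, per item 3, by an implementation backed by the new sysctls. */
extern const libxl__pci_backend_ops libxl__pci_ops_arm;

libxl_pci.c would then call ops->list_assignable() and friends instead of
touching pciback's sysfs directly.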


Another big question is with respect to Domain-0 and PCI bus sysfs use. The
existing code for querying PCI device resources/IRQs and resetting those via
sysfs of Domain-0 is more than OK if Domain-0 is present and owns PCI HW. But,
there are at least two cases when this is not going to work on ARM: Dom0less
setups and when there is a hardware domain owning PCI devices.

In our case we have a dedicated guest which is a sort of hardware domain (driver
domain DomD) which owns all the hardware of the platform, so we are interested
in implementing something that fits our design as well: with DomD as the
hardware domain it is not possible to access the relevant PCI bus sysfs entries
from Domain-0, as those live in DomD/hwdom. This is also true for Dom0less
setups, as there is no entity that can provide the same.

For that reason, in my POC I have extended struct pci_dev to hold an array of
the PCI device’s MMIO ranges and its IRQ, and introduced the following (a
sketch of the layout follows the list):

  1. Provide internal API for accessing the array of MMIO ranges and IRQ. This
     can be used in both Dom0less and Domain-0 setups to manipulate the relevant
     data. The actual data can be read from a device tree/ACPI tables if
     enumeration is done by bootloaders.

  2. For Domain-0/DomD setup add PHYSDEVOP_pci_device_set_resources so Domain-0
     can set the relevant resources in Xen while enumerating PCI devices. This
     requires a change to the Linux kernel driver to work (I can provide more
     details if needed).

  3. For resetting devices we may want to implement that functionality on the
     Xen side as well, by introducing PHYSDEVOP_pci_device_reset.
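
For reference, the POC data layout in rough sketch form (the exact field
names, array sizes and hypercall structure below are illustrative and of
course up for discussion):

/* Additions to Xen's struct pci_dev (existing fields omitted). */
#define PCI_DEV_NUM_RESOURCES 7        /* 6 BARs + expansion ROM */

struct pci_dev_resource {
    uint64_t addr;
    uint64_t size;
    unsigned int flags;                /* memory vs. I/O, prefetchable, ... */
};

struct pci_dev {
    /* ... existing fields ... */
    bool assigned;                     /* manipulated by the proposed sysctls */
    struct pci_dev_resource resources[PCI_DEV_NUM_RESOURCES];
    unsigned int irq;
};

/*
 * PHYSDEVOP_pci_device_set_resources: Domain-0/DomD reports what it has
 * enumerated, so the data is available even when the toolstack cannot
 * reach the hardware domain's sysfs.
 */
struct physdev_pci_device_set_resources {
    /* IN */
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
    uint32_t irq;
    struct pci_dev_resource resources[PCI_DEV_NUM_RESOURCES];
};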


I can probably implement an RFC series with all the above if we agree on the
approach. Comments are more than welcome.

Thank you,
Oleksandr


* Re: ARM/PCI passthrough: libxl_pci, sysfs and pciback questions
From: Roger Pau Monné @ 2020-10-27 12:55 UTC
  To: Oleksandr Andrushchenko
  Cc: Bertrand Marquis, andrew.cooper3, george.dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, wl, paul,
	Artem Mygaiev, Oleksandr Tyshchenko, xen-devel, Rahul Singh

On Tue, Oct 27, 2020 at 09:59:05AM +0000, Oleksandr Andrushchenko wrote:
> Hello, all!
> 
> While working on PCI passthrough on ARM (a partial RFC was published by ARM
> earlier this year) I tried to implement some related changes in the toolstack.
> One of the obstacles for ARM is the presence of PCI backend related code: ARM
> is going to fully emulate an ECAM host bridge in Xen, so no PCI
> backend/frontend pair is going to be used.
> 
> If my understanding is correct, the functionality implemented by pciback and
> the toolstack which is relevant/needed for ARM is the following:
> 
>   1. pciback is used as a database for assignable PCI devices, e.g. xl
>      pci-assignable-{add|remove|list} manipulates that list. So, whenever the
>      toolstack needs to know which PCI devices can be passed through it reads
>      that from the relevant sysfs entries of the pciback.
> 
>   2. pciback is used to hold the unbound PCI devices, e.g. when passing through a
>      PCI device it needs to be unbound from the relevant device driver and bound
>      to pciback (strictly speaking it is not required that the device is bound to
>      pciback, but pciback is again used as a database of the passed-through PCI
>      devices, so we can re-bind the devices to their original drivers when the
>      guest domain shuts down)
> 
>   3. the toolstack depends on Domain-0 for discovering PCI device resources
>      which are then permitted for the guest domain, e.g. MMIO ranges and IRQs
>      are read from sysfs
> 
>   4. the toolstack is responsible for resetting PCI devices being passed
>      through, via the sysfs reset entry of Domain-0’s PCI bus subsystem
> 
>   5. the toolstack is responsible for ensuring that devices are passed through
>      with all relevant functions, e.g. for multifunction devices all the
>      functions are passed to a domain and no partial passthrough is done
> 
>   6. toolstack cares about SR-IOV devices (am I correct here?)

I'm not sure I fully understand what this means. Toolstack cares about
SR-IOV as it cares about other PCI devices, but the SR-IOV
functionality is managed by the (dom0) kernel.

> 
> 
> I have implemented a really dirty POC for that which I would need to clean up
> before showing, but before that I would like to get some feedback and advice on
> how to proceed with the above. I suggest we:
> 
>   1. Move all pciback related code (which seems to become x86 code only) into a
>      dedicated file, something like tools/libxl/libxl_pci_x86.c
> 
>   2. Make the functionality now provided by pciback architecture dependent, so
>      tools/libxl/libxl_pci.c delegates actual assignable device list handling to
>      that arch code and uses some sort of “ops”, e.g.
>      arch->ops.get_all_assignable, arch->ops.add_assignable etc. (This could also
>      be done with “#ifdef CONFIG_PCIBACK”, but that seems less clean.) Introduce
>      tools/libxl/libxl_pci_arm.c to provide the ARM implementation.

To be fair this is arch and OS dependent, since it's currently based
on sysfs which is Linux specific. So it should really be
libxl_pci_linux_x86.c or similar.

> 
>   3. ARM only: As we do not have pciback on ARM we need to have some storage for
>      assignable device list: move that into Xen by extending struct pci_dev with
>      “bool assigned” and providing sysctls for manipulating that, e.g.
>      XEN_SYSCTL_pci_device_{set|get}_assigned,
>      XEN_SYSCTL_pci_device_enum_assigned (to enumerate/get the list of
>      assigned/not-assigned PCI devices). Can this also be interesting for x86? At
>      the moment it seems that x86 does rely on pciback presence, so probably this
>      change might not be interesting for x86 world, but may allow stripping
>      pciback functionality a bit and making the code common to both ARM and x86.

How are you going to perform the device reset then? Will you assign
the device to dom0 after removing it from the guest so that dom0 can
perform the reset? You will need to use logic currently present in
pciback to do so IIRC.

It doesn't seem like a bad approach, but there are more consequences
than just how assignable devices are listed.

Also Xen doesn't currently know about IOMMU groups, so Xen would have
to gain this knowledge in order to know the minimal set of PCI devices
that can be assigned to a guest.

> 
>   4. ARM only: It is not clear how to handle re-binding of the PCI driver on
>      guest shutdown: we need to store the sysfs path of the original driver the
>      device was bound to. Do we also want to store that in struct pci_dev?

I'm not sure I follow you here. On shutdown the device would be
handed back to Xen?

Most certainly we don't want to store a sysfs path (Linux private
information) inside a Xen specific struct (pci_dev).

>   5. An alternative route for 3-4 could be to store that data in XenStore, e.g.
>      MMIOs, IRQ, bind sysfs path etc. This would require more code on Xen side to
>      access XenStore and won’t work if MMIOs/IRQs are passed via device tree/ACPI
>      tables by the bootloaders.

As above, I think I need more context to understand what and why you
need to save such information.

> 
> Another big question is with respect to Domain-0 and PCI bus sysfs use. The
> existing code for querying PCI device resources/IRQs and resetting those via
> sysfs of Domain-0 is more than OK if Domain-0 is present and owns PCI HW. But,
> there are at least two cases when this is not going to work on ARM: Dom0less
> setups and when there is a hardware domain owning PCI devices.
> 
> In our case we have a dedicated guest which is a sort of hardware domain (driver
> domain DomD) which owns all the hardware of the platform, so we are interested
> in implementing something that fits our design as well: with DomD as the
> hardware domain it is not possible to access the relevant PCI bus sysfs entries
> from Domain-0, as those live in DomD/hwdom. This is also true for Dom0less
> setups, as there is no entity that can provide the same.

You need some kind of channel to transfer this information from the
hardware domain to the toolstack domain. Some kind of protocol over
libvchan might be an option.
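
Just to illustrate the idea, such a protocol could carry something like the
following (entirely hypothetical, all names invented here):

/* Toolstack-domain -> hardware-domain query, sent over a vchan. */
struct pci_info_request {
    uint32_t cmd;              /* e.g. QUERY_RESOURCES or RESET_DEVICE */
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* Reply carrying what the hardware domain reads from its local sysfs. */
struct pci_info_response {
    int32_t rc;
    uint32_t irq;
    struct {
        uint64_t addr;
        uint64_t size;
        uint32_t flags;
    } bars[6];
};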

> For that reason, in my POC I have extended struct pci_dev to hold an array
> of the PCI device’s MMIO ranges and its IRQ, and introduced the following:
> 
>   1. Provide internal API for accessing the array of MMIO ranges and IRQ. This
>      can be used in both Dom0less and Domain-0 setups to manipulate the relevant
>      data. The actual data can be read from a device tree/ACPI tables if
>      enumeration is done by bootloaders.

I would be against storing this data inside of Xen if Xen doesn't have
to make any use of it. Does Xen need to know the MMIO ranges and IRQs
to perform its task?

If not, then there's no reason to store those in Xen. The hypervisor
is not the right place to implement a database like mechanism for PCI
devices.

Roger.



* Re: ARM/PCI passthrough: libxl_pci, sysfs and pciback questions
From: Oleksandr Andrushchenko @ 2020-10-27 15:52 UTC
  To: Roger Pau Monné
  Cc: Bertrand Marquis, andrew.cooper3, george.dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, wl, paul,
	Artem Mygaiev, Oleksandr Tyshchenko, xen-devel, Rahul Singh

Hello, Roger!

On 10/27/20 2:55 PM, Roger Pau Monné wrote:
> On Tue, Oct 27, 2020 at 09:59:05AM +0000, Oleksandr Andrushchenko wrote:
>> Hello, all!
>>
>> While working on PCI passthrough on ARM (a partial RFC was published by ARM
>> earlier this year) I tried to implement some related changes in the toolstack.
>> One of the obstacles for ARM is the presence of PCI backend related code: ARM
>> is going to fully emulate an ECAM host bridge in Xen, so no PCI
>> backend/frontend pair is going to be used.
>>
>> If my understanding is correct, the functionality implemented by pciback and
>> the toolstack which is relevant/needed for ARM is the following:
>>
>>    1. pciback is used as a database for assignable PCI devices, e.g. xl
>>       pci-assignable-{add|remove|list} manipulates that list. So, whenever the
>>       toolstack needs to know which PCI devices can be passed through it reads
>>       that from the relevant sysfs entries of the pciback.
>>
>>    2. pciback is used to hold the unbound PCI devices, e.g. when passing through a
>>       PCI device it needs to be unbound from the relevant device driver and bound
>>       to pciback (strictly speaking it is not required that the device is bound to
>>       pciback, but pciback is again used as a database of the passed-through PCI
>>       devices, so we can re-bind the devices to their original drivers when the
>>       guest domain shuts down)
>>
>>    3. the toolstack depends on Domain-0 for discovering PCI device resources
>>       which are then permitted for the guest domain, e.g. MMIO ranges and IRQs
>>       are read from sysfs
>>
>>    4. the toolstack is responsible for resetting PCI devices being passed
>>       through, via the sysfs reset entry of Domain-0’s PCI bus subsystem
>>
>>    5. the toolstack is responsible for ensuring that devices are passed
>>       through with all relevant functions, e.g. for multifunction devices all
>>       the functions are passed to a domain and no partial passthrough is done
>>
>>    6. toolstack cares about SR-IOV devices (am I correct here?)
> I'm not sure I fully understand what this means. Toolstack cares about
> SR-IOV as it cares about other PCI devices, but the SR-IOV
> functionality is managed by the (dom0) kernel.
Yes, you are right. Please ignore #6.
>
>>
>> I have implemented a really dirty POC for that which I would need to clean up
>> before showing, but before that I would like to get some feedback and advice on
>> how to proceed with the above. I suggest we:
>>
>>    1. Move all pciback related code (which seems to become x86 code only) into a
>>       dedicated file, something like tools/libxl/libxl_pci_x86.c
>>
>>    2. Make the functionality now provided by pciback architecture dependent, so
>>       tools/libxl/libxl_pci.c delegates actual assignable device list handling to
>>       that arch code and uses some sort of “ops”, e.g.
>>       arch->ops.get_all_assignable, arch->ops.add_assignable etc. (This could also
>>       be done with “#ifdef CONFIG_PCIBACK”, but that seems less clean.) Introduce
>>       tools/libxl/libxl_pci_arm.c to provide the ARM implementation.
> To be fair this is arch and OS dependent, since it's currently based
> on sysfs which is Linux specific. So it should really be
> libxl_pci_linux_x86.c or similar.
This is true, but do we really have any other implementation yet?
>
>>    3. ARM only: As we do not have pciback on ARM we need to have some storage for
>>       assignable device list: move that into Xen by extending struct pci_dev with
>>       “bool assigned” and providing sysctls for manipulating that, e.g.
>>       XEN_SYSCTL_pci_device_{set|get}_assigned,
>>       XEN_SYSCTL_pci_device_enum_assigned (to enumerate/get the list of
>>       assigned/not-assigned PCI devices). Can this also be interesting for x86? At
>>       the moment it seems that x86 does rely on pciback presence, so probably this
>>       change might not be interesting for x86 world, but may allow stripping
>>       pciback functionality a bit and making the code common to both ARM and x86.
> How are you going to perform the device reset then? Will you assign
> the device to dom0 after removing it from the guest so that dom0 can
> perform the reset? You will need to use logic currently present in
> pciback to do so IIRC.
>
> It doesn't seem like a bad approach, but there are more consequences
> than just how assignable devices are listed.
>
> Also Xen doesn't currently know about IOMMU groups, so Xen would have
> to gain this knowledge in order to know the minimal set of PCI devices
> that can be assigned to a guest.
Good point, I'll check the relevant reset code. Thanks
>
>>    4. ARM only: It is not clear how to handle re-binding of the PCI driver on
>>       guest shutdown: we need to store the sysfs path of the original driver the
>>       device was bound to. Do we also want to store that in struct pci_dev?
> I'm not sure I follow you here. On shutdown the device would be
> handed back to Xen?

Currently it is bound back to the driver which we seized the device from (if
any). So, probably the same logic should remain?

>
> Most certainly we don't want to store a sysfs path (Linux private
> information) inside a Xen specific struct (pci_dev).
Yeap, this is something I don't like either
>
>>    5. An alternative route for 3-4 could be to store that data in XenStore, e.g.
>>       MMIOs, IRQ, bind sysfs path etc. This would require more code on Xen side to
>>       access XenStore and won’t work if MMIOs/IRQs are passed via device tree/ACPI
>>       tables by the bootloaders.
> As above, I think I need more context to understand what and why you
> need to save such information.

Well, with pciback absence we lose a "database" which holds all the knowledge
about which devices are assigned, bound etc. So, XenStore *could* be used as
such a database for us. But this does not look elegant.

>
>> Another big question is with respect to Domain-0 and PCI bus sysfs use. The
>> existing code for querying PCI device resources/IRQs and resetting those via
>> sysfs of Domain-0 is more than OK if Domain-0 is present and owns PCI HW. But,
>> there are at least two cases when this is not going to work on ARM: Dom0less
>> setups and when there is a hardware domain owning PCI devices.
>>
>> In our case we have a dedicated guest which is a sort of hardware domain (driver
>> domain DomD) which owns all the hardware of the platform, so we are interested
>> in implementing something that fits our design as well: with DomD as the
>> hardware domain it is not possible to access the relevant PCI bus sysfs entries
>> from Domain-0, as those live in DomD/hwdom. This is also true for Dom0less
>> setups, as there is no entity that can provide the same.
> You need some kind of channel to transfer this information from the
> hardware domain to the toolstack domain. Some kind of protocol over
> libvchan might be an option.
Yes, this way it will all be handled without workarounds
>
>> For that reason, in my POC I have extended struct pci_dev to hold an array
>> of the PCI device’s MMIO ranges and its IRQ, and introduced the following:
>>
>>    1. Provide internal API for accessing the array of MMIO ranges and IRQ. This
>>       can be used in both Dom0less and Domain-0 setups to manipulate the relevant
>>       data. The actual data can be read from a device tree/ACPI tables if
>>       enumeration is done by bootloaders.
> I would be against storing this data inside of Xen if Xen doesn't have
> to make any use of it. Does Xen need to know the MMIO ranges and IRQs
> to perform its task?
>
> If not, then there's no reason to store those in Xen. The hypervisor
> is not the right place to implement a database like mechanism for PCI
> devices.

We have discussed all the above with Roger on IRC (thank you, Roger), so I'll
prepare an RFC for ARM PCI passthrough configuration and send it ASAP.

>
> Roger.

Thank you,
Oleksandr


* Re: ARM/PCI passthrough: libxl_pci, sysfs and pciback questions
From: Jan Beulich @ 2020-10-27 17:18 UTC
  To: Oleksandr Andrushchenko
  Cc: Bertrand Marquis, andrew.cooper3, george.dunlap, Ian Jackson,
	Julien Grall, Stefano Stabellini, wl, paul, Artem Mygaiev,
	Oleksandr Tyshchenko, xen-devel, Rahul Singh,
	Roger Pau Monné

On 27.10.2020 16:52, Oleksandr Andrushchenko wrote:
> On 10/27/20 2:55 PM, Roger Pau Monné wrote:
>> On Tue, Oct 27, 2020 at 09:59:05AM +0000, Oleksandr Andrushchenko wrote:
>>>    5. An alternative route for 3-4 could be to store that data in XenStore, e.g.
>>>       MMIOs, IRQ, bind sysfs path etc. This would require more code on Xen side to
>>>       access XenStore and won’t work if MMIOs/IRQs are passed via device tree/ACPI
>>>       tables by the bootloaders.
>> As above, I think I need more context to understand what and why you
>> need to save such information.
> 
> Well, with pciback absence we lose a "database" which holds all the knowledge
> about which devices are assigned, bound etc.

What hasn't become clear to me (sorry if I've overlooked it) is
why some form of pciback is not an option on Arm. Where it would
need to run in your split hardware-domain / Dom0 setup (if I got
that right in the first place) would be a secondary question.

Jan



* Re: ARM/PCI passthrough: libxl_pci, sysfs and pciback questions
From: Oleksandr Andrushchenko @ 2020-10-27 17:45 UTC
  To: Jan Beulich
  Cc: Bertrand Marquis, andrew.cooper3, george.dunlap, Ian Jackson,
	Julien Grall, Stefano Stabellini, wl, paul, Artem Mygaiev,
	Oleksandr Tyshchenko, xen-devel, Rahul Singh,
	Roger Pau Monné

On 10/27/20 7:18 PM, Jan Beulich wrote:
> On 27.10.2020 16:52, Oleksandr Andrushchenko wrote:
>> On 10/27/20 2:55 PM, Roger Pau Monné wrote:
>>> On Tue, Oct 27, 2020 at 09:59:05AM +0000, Oleksandr Andrushchenko wrote:
>>>>     5. An alternative route for 3-4 could be to store that data in XenStore, e.g.
>>>>        MMIOs, IRQ, bind sysfs path etc. This would require more code on Xen side to
>>>>        access XenStore and won’t work if MMIOs/IRQs are passed via device tree/ACPI
>>>>        tables by the bootloaders.
>>> As above, I think I need more context to understand what and why you
>>> need to save such information.
>> Well, with pciback absence we lose a "database" which holds all the knowledge
>> about which devices are assigned, bound etc.
> What hasn't become clear to me (sorry if I've overlooked it) is
> why some form of pciback is not an option on Arm.
Yes, it is probably possible to run pciback even without running pcifront
instances in guests and only use that functionality which is needed for the
toolstack. We can even have it as is, without modifications, given that
pcifront won't run and the parts of pciback related to PCI config space, MSI
etc. simply won't be used, but will still be present in the pciback driver. We
can try that (pciback is x86-only in the kernel).

> Where it would
> need to run in your split hardware-domain / Dom0 setup (if I got
> that right in the first place) would be a secondary question.

This actually becomes a problem if we think about hwdom != Dom0: Dom0/toolstack
wants to read the PCI bus sysfs and it also wants to access pciback's sysfs
entries. So, for Dom0's toolstack to read sysfs in this scenario we need a
bridge between Dom0 and that hwdom to access both the PCI subsystem and
pciback's sysfs: this could be implemented as a back-front pair with a ring and
event channel, as PV drivers do. This approach will of course require the
toolstack to work in two modes: local sysfs/pciback and remote.

In the remote access model the toolstack will need to create a connection to
the hwdom each time it runs and requires sysfs data, which should be
acceptable.

It can also be possible to have the toolstack always use the remote model, even
if it runs locally, which would make the toolstack's code support a single
model for all the use-cases.

(I never thought about whether it is possible to run both backend and frontend
in the same VM, though.)

> Jan

Thank you,
Oleksandr


* Re: ARM/PCI passthrough: libxl_pci, sysfs and pciback questions
From: Jan Beulich @ 2020-10-28  7:54 UTC
  To: Oleksandr Andrushchenko
  Cc: Bertrand Marquis, andrew.cooper3, george.dunlap, Ian Jackson,
	Julien Grall, Stefano Stabellini, wl, paul, Artem Mygaiev,
	Oleksandr Tyshchenko, xen-devel, Rahul Singh,
	Roger Pau Monné

On 27.10.2020 18:45, Oleksandr Andrushchenko wrote:
> On 10/27/20 7:18 PM, Jan Beulich wrote:
>> On 27.10.2020 16:52, Oleksandr Andrushchenko wrote:
>>> On 10/27/20 2:55 PM, Roger Pau Monné wrote:
>>>> On Tue, Oct 27, 2020 at 09:59:05AM +0000, Oleksandr Andrushchenko wrote:
>>>>>     5. An alternative route for 3-4 could be to store that data in XenStore, e.g.
>>>>>        MMIOs, IRQ, bind sysfs path etc. This would require more code on Xen side to
>>>>>        access XenStore and won’t work if MMIOs/IRQs are passed via device tree/ACPI
>>>>>        tables by the bootloaders.
>>>> As above, I think I need more context to understand what and why you
>>>> need to save such information.
>>> Well, with pciback absence we lose a "database" which holds all the knowledge
>>> about which devices are assigned, bound etc.
>> What hasn't become clear to me (sorry if I've overlooked it) is
>> why some form of pciback is not an option on Arm.
> Yes, it is probably possible to run pciback even without running pcifront
> instances in guests and only use that functionality which is needed for the
> toolstack. We can even have it as is, without modifications, given that
> pcifront won't run and the parts of pciback related to PCI config space, MSI
> etc. simply won't be used, but will still be present in the pciback driver.
> We can try that (pciback is x86-only in the kernel).
> 
>> Where it would
>> need to run in your split hardware-domain / Dom0 setup (if I got
>> that right in the first place) would be a secondary question.
> 
> This actually becomes a problem if we think about hwdom != Dom0:
> Dom0/toolstack wants to read the PCI bus sysfs and it also wants to access
> pciback's sysfs entries. So, for Dom0's toolstack to read sysfs in this
> scenario we need a bridge between Dom0 and that hwdom to access both the PCI
> subsystem and pciback's sysfs: this could be implemented as a back-front pair
> with a ring and event channel, as PV drivers do. This approach will of course
> require the toolstack to work in two modes: local sysfs/pciback and remote.
>
> In the remote access model the toolstack will need to create a connection to
> the hwdom each time it runs and requires sysfs data, which should be
> acceptable.

That's the price to pay for disaggregation, I think. So yes to the
outline in general, but I'd like such an abstraction to not talk in
terms of "sysfs" or in fact anything that's OS specific on either
side. Whether it indeed needs a full new pair of front/back drivers
is a different question.

> It can also be possible to have the toolstack always use the remote model,
> even if it runs locally, which would make the toolstack's code support a
> single model for all the use-cases.

That's certainly one possible way of doing the necessary abstraction,
I agree.

> (I never thought about whether it is possible to run both backend and
> frontend in the same VM, though.)

Why would it not be? Other back/front pairs certainly can.

Jan

