* [ANNOUNCE] Xen 4.15 release schedule and feature tracking
@ 2021-01-07 14:35     ` Ian Jackson
  2021-01-07 15:45       ` Oleksandr
  2021-01-14 16:06       ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
  0 siblings, 2 replies; 144+ messages in thread
From: Ian Jackson @ 2021-01-07 14:35 UTC (permalink / raw)
  To: xen-devel; +Cc: committers

Hi.  As the Release Manager for Xen 4.15 I am going to start tracking
the status of features which people are looking to get into Xen 4.15.

NB that the Last Posting Date is just over a week from now.

If you are working on a feature you want in 4.15 please let me know
about it.  Ideally I'd like a little stanza like this:

S: feature name
O: feature owner (proponent) name
E: feature owner (proponent) email address
P: your current estimate of the probability of it making 4.15, as a percentage

But free-form text is OK too.  Please reply to this mail.
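
For example (an entirely made-up, illustrative entry):

S: frobnicator support for guests
O: Jane Example
E: jane@example.org
P: 70%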

NB the primary responsibility for driving a feature's progress to meet
the release schedule lies with the feature's proponent(s).

As a reminder, here is the release schedule:

 Friday 15th January    Last posting date

     Patches adding new features should be posted to the mailing list
     by this date, although perhaps not in their final version.

 Friday 29th January    Feature freeze
  
     Patches adding new features should be committed by this date.
     Straightforward bugfixes may continue to be accepted by
     maintainers.

 Friday 12th February **tentative**   Code freeze

     Bugfixes only, all changes to be approved by the Release Manager.

 Week of 12th March **tentative**    Release
     (probably Tuesday or Wednesday)

Any patches containing substantial refactoring are to be treated as
new features, even if their intent is to fix bugs.

Freeze exceptions will not be routine, but may be granted in
exceptional cases for small changes on the basis of risk assessment.
Large series will not get exceptions.  Contributors *must not* rely on
getting, or expect, a freeze exception.

The code freeze and release dates are very much provisional and will be
adjusted in the light of apparent code quality etc.

If as a feature proponent you feel your feature is at risk and there
is something the Xen Project could do to help, please consult me or
the Community Manager.  In such situations please reach out earlier
rather than later.

Ian.



* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-07 14:35     ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
@ 2021-01-07 15:45       ` Oleksandr
  2021-01-14 16:11         ` [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Ian Jackson
  2021-01-14 16:06       ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
  1 sibling, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-07 15:45 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, committers


On 07.01.21 16:35, Ian Jackson wrote:

Hi Ian

> Hi.  As the Release Manager for Xen 4.15 I am going to start tracking
> the status of features which people are looking to get into Xen 4.15.
>
> NB that the Last Posting Date is just over a week from now.
>
> If you are working on a feature you want in 4.15 please let me know
> about it.

I work on virtio-mmio on Arm, which involves x86's IOREQ/DM features.
Currently I am working on making the said features common, implementing
missing bits, code cleanup and hardening, etc.
I don't think virtio-mmio is 4.15 material, but it would be great to
have at least the "common" IOREQ/DM in 4.15.


> Ideally I'd like a little stanza like this:
>
> S: feature name


The corresponding thread is named:
IOREQ feature (+ virtio-mmio) on Arm
https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg87002.html

> O: feature owner (proponent) name
> E: feature owner (proponent) email address

Julien as the initiator of this activity, and me as the person who
continues it and tries to get it upstreamed:
Julien Grall <julien@xen.org>
Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>


> P: your current estimate of the probability of it making 4.15, as a percentage

Difficult to say, it depends ...
The RFC was posted Aug 3, 2020; the last posted version is V3. Currently
I am in the middle of preparing V4, and still need to find common ground
on a few bits.


>
> But free-form text is OK too.  Please reply to this mail.
>
> NB the primary responsibility for driving a feature's progress to meet
> the release schedule lies with the feature's proponent(s).
>
> As a reminder, here is the release schedule:
>
>   Friday 15th January    Last posting date
>
>       Patches adding new features should be posted to the mailing list
>       by this date, although perhaps not in their final version.
>
>   Friday 29th January    Feature freeze
>    
>       Patches adding new features should be committed by this date.
>       Straightforward bugfixes may continue to be accepted by
>       maintainers.
>
>   Friday 12th February **tentative**   Code freeze
>
>       Bugfixes only, all changes to be approved by the Release Manager.
>
>   Week of 12th March **tentative**    Release
>       (probably Tuesday or Wednesday)
>
> Any patches containing substantial refactoring are to be treated as
> new features, even if their intent is to fix bugs.
>
> Freeze exceptions will not be routine, but may be granted in
> exceptional cases for small changes on the basis of risk assessment.
> Large series will not get exceptions.  Contributors *must not* rely on
> getting, or expect, a freeze exception.
>
> The code freeze and release dates are very much provisional and will be
> adjusted in the light of apparent code quality etc.
>
> If as a feature proponent you feel your feature is at risk and there
> is something the Xen Project could do to help, please consult me or
> the Community Manager.  In such situations please reach out earlier
> rather than later.
>
> Ian.
>
-- 
Regards,

Oleksandr Tyshchenko




* [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
@ 2021-01-12 21:52 Oleksandr Tyshchenko
  2021-01-12 21:52 ` [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common Oleksandr Tyshchenko
                   ` (24 more replies)
  0 siblings, 25 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Jun Nakajima, Kevin Tian, Tim Deegan,
	Daniel De Graaf, Volodymyr Babchuk, Anthony PERARD,
	Bertrand Marquis, Wei Chen, Kaly Xin, Artem Mygaiev,
	Alex Bennée

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Hello all.

The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
You can find the initial discussion at [1] and the RFC-V3 series at [2]-[5].
Xen on Arm requires a mechanism to forward guest MMIO accesses to a device
model in order to implement a virtio-mmio backend or even a mediator outside
of the hypervisor. As Xen on x86 already contains the required support, this
series tries to make it common and introduces Arm-specific bits plus some new
functionality. The patch series is based on Julien's PoC "xen/arm: Add support
for Guest IO forwarding to a device emulator".
Besides splitting the existing IOREQ/DM support and introducing the Arm side,
the series also includes virtio-mmio related changes (the last 2 patches, for
the toolstack) so that reviewers can see how the whole picture could look
and give it a try.
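
For context, each forwarded access travels between Xen and the device model
in an ioreq_t structure living in a shared page. Below is a trimmed,
illustrative C sketch of it (a field subset only; see
xen/include/public/hvm/ioreq.h for the authoritative layout):

    #include <stdint.h>

    struct ioreq {
        uint64_t addr;  /* physical address of the access */
        uint64_t data;  /* data to write, or where the read result lands */
        uint32_t size;  /* size of the access in bytes */
        uint8_t state;  /* STATE_IOREQ_NONE -> STATE_IOREQ_READY ->
                           STATE_IOREQ_INPROCESS -> STATE_IORESP_READY */
        uint8_t dir;    /* IOREQ_READ (1) or IOREQ_WRITE (0) */
        uint8_t type;   /* e.g. IOREQ_TYPE_COPY (MMIO) or IOREQ_TYPE_PIO */
        /* ... remaining fields elided ... */
    };

Xen fills the structure in, notifies the device model via an event channel,
and the model writes its response back through the same structure.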

According to the initial/subsequent discussions there are a few open
questions/concerns regarding security and performance in the VirtIO solution:
1. virtio-mmio vs virtio-pci, SPI vs MSI, or even a composition of virtio-mmio + MSI;
   different use-cases require different transports...
2. the virtio backend is able to access all guest memory, so some kind of protection
   is needed: 'virtio-iommu in Xen' vs 'pre-shared-memory & memcpys in guest', etc.
   (for the first two Alex has provided valuable input at [6])
3. the interface between the toolstack and the 'out-of-qemu' virtio backend; avoid using
   Xenstore in the virtio backend if possible. Also, there is a desire to make the VirtIO
   backend hypervisor-agnostic.
4. a lot of 'foreign mapping' could lead to memory exhaustion on the host side,
   as we are stealing a page from host memory in order to map the guest page.
   Julien has some ideas regarding that.
5. Julien also has some ideas on how to optimize the IOREQ code:
   5.1 vcpu_ioreq_handle_completion (former handle_hvm_io_completion), which is called on
       a hot path on Arm (every time we re-enter the guest):
       Ideally, vcpu_ioreq_handle_completion should be a NOP (at most a few instructions)
       if there is nothing to do (if we don't have I/O forwarded to an IOREQ server).
       Maybe we want to introduce a per-vCPU flag indicating whether an I/O has been
       forwarded to an IOREQ server. This would allow us to bypass most of the function
       if there is nothing to do (see the rough sketch after this list).
   5.2 The current way to handle MMIO is the following:
       - Pause the vCPU
       - Forward the access to the backend domain
       - Schedule the backend domain
       - Wait for the access to be handled
       - Unpause the vCPU
       The sequence is going to be fairly expensive on Xen.
       It might be possible to optimize the ACK and avoid waiting for the backend
       to handle the access.
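
A minimal, self-contained sketch of the per-vCPU flag idea from 5.1 (the
flag name and placement are assumptions made purely for illustration, not
part of this series):

    #include <stdbool.h>

    struct vcpu {
        bool io_forwarded;  /* hypothetical flag: set when an I/O has
                               been forwarded to an IOREQ server */
    };

    /* Called on every return to the guest -- a hot path on Arm. */
    static bool vcpu_ioreq_handle_completion(struct vcpu *v)
    {
        /* Fast path: nothing was forwarded, so this is (almost) a NOP. */
        if ( !v->io_forwarded )
            return true;

        /* Slow path: wait for the IOREQ server and complete the access
           (details elided), then clear the flag. */
        v->io_forwarded = false;
        return true;
    }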

All of them look valid and worth considering, but the first thing
we need on Arm is a mechanism to forward guest I/O to a device emulator,
so let's focus on that in the first place.

***

There are a lot of changes since the RFC series: almost all TODOs were resolved on Arm,
the Arm code was improved and hardened, the common IOREQ/DM code became really arch-agnostic
(without HVM-isms), the "legacy" mechanism of mapping magic pages for the IOREQ servers
was left x86-specific, etc. Also, the patch that makes the DM code common was reworked to have
the top-level dm-op handling arch-specific and call into ioreq_server_dm_op()
for otherwise unhandled ops.
But one TODO still remains, which is "PIO handling" on Arm.
The "PIO handling" TODO is expected to be left unaddressed in the current series.
It is not a big issue for now while Xen doesn't have support for vPCI on Arm.
On Arm64 PIOs are only used for PCI I/O BARs, and we would probably want to expose
them to the emulator as PIO accesses to make a DM completely arch-agnostic. So "PIO handling"
should be implemented when we add support for vPCI.

I left the interface untouched in the following patch,
"xen/dm: Introduce xendevicemodel_set_irq_level DM op",
since there is still an open discussion about what interface to use/what
information to pass to the hypervisor.

There are patches under review that this series depends on:
https://patchwork.kernel.org/patch/11816689
https://patchwork.kernel.org/patch/11803383

Please note that the IOREQ feature is disabled by default on Arm within the current series.

***

Patch series [7] was rebased on the recent staging branch
(7ba2ab4 "x86/p2m: Fix paging_gva_to_gfn() for nested virt") and tested on a
Renesas Salvator-X board + H3 ES3.0 SoC (Arm64) with a virtio-mmio disk backend [8]
running in a driver domain and an unmodified Linux guest running on the existing
virtio-blk driver (frontend). No issues were observed. Guest domain 'reboot/destroy'
use-cases work properly. The patch series was only build-tested on x86.

Please note, the build test passed for the following modes:
1. x86: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y (default)
2. x86: #CONFIG_HVM is not set / #CONFIG_IOREQ_SERVER is not set
3. Arm64: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
4. Arm64: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)
5. Arm32: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
6. Arm32: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)

***

Any feedback/help would be highly appreciated.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00825.html
[2] https://lists.xenproject.org/archives/html/xen-devel/2020-08/msg00071.html
[3] https://lists.xenproject.org/archives/html/xen-devel/2020-09/msg00732.html
[4] https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01077.html
[5] https://lists.xenproject.org/archives/html/xen-devel/2020-11/msg02188.html
[6] https://lists.xenproject.org/archives/html/xen-devel/2020-11/msg02212.html
[7] https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml5
[8] https://github.com/xen-troops/virtio-disk/commits/ioreq_ml1

Julien Grall (5):
  xen/ioreq: Make x86's IOREQ related dm-op handling common
  xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
  arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  xen/dm: Introduce xendevicemodel_set_irq_level DM op
  libxl: Introduce basic virtio-mmio support on Arm

Oleksandr Tyshchenko (19):
  x86/ioreq: Prepare IOREQ feature for making it common
  x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
  x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
  xen/ioreq: Make x86's IOREQ feature common
  xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
  xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
  xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
  xen/ioreq: Move x86's ioreq_server to struct domain
  xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
  xen/ioreq: Remove "hvm" prefixes from involved function names
  xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
  xen/arm: Stick around in leave_hypervisor_to_guest until I/O has
    completed
  xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
  xen/ioreq: Introduce domain_has_ioreq_server()
  xen/arm: io: Abstract sign-extension
  xen/arm: io: Harden sign extension check
  xen/ioreq: Make x86's send_invalidate_req() common
  xen/arm: Add mapcache invalidation handling
  [RFC] libxl: Add support for virtio-disk configuration

 MAINTAINERS                                  |    8 +-
 tools/include/xendevicemodel.h               |    4 +
 tools/libs/devicemodel/core.c                |   18 +
 tools/libs/devicemodel/libxendevicemodel.map |    1 +
 tools/libs/light/Makefile                    |    1 +
 tools/libs/light/libxl_arm.c                 |   94 +-
 tools/libs/light/libxl_create.c              |    1 +
 tools/libs/light/libxl_internal.h            |    1 +
 tools/libs/light/libxl_types.idl             |   16 +
 tools/libs/light/libxl_types_internal.idl    |    1 +
 tools/libs/light/libxl_virtio_disk.c         |  109 ++
 tools/xl/Makefile                            |    2 +-
 tools/xl/xl.h                                |    3 +
 tools/xl/xl_cmdtable.c                       |   15 +
 tools/xl/xl_parse.c                          |  116 +++
 tools/xl/xl_virtio_disk.c                    |   46 +
 xen/arch/arm/Makefile                        |    2 +
 xen/arch/arm/dm.c                            |  174 ++++
 xen/arch/arm/domain.c                        |    9 +
 xen/arch/arm/io.c                            |   30 +-
 xen/arch/arm/ioreq.c                         |  198 ++++
 xen/arch/arm/p2m.c                           |   51 +-
 xen/arch/arm/traps.c                         |   72 +-
 xen/arch/x86/Kconfig                         |    1 +
 xen/arch/x86/hvm/dm.c                        |  107 +-
 xen/arch/x86/hvm/emulate.c                   |  220 ++--
 xen/arch/x86/hvm/hvm.c                       |   14 +-
 xen/arch/x86/hvm/hypercall.c                 |    9 +-
 xen/arch/x86/hvm/intercept.c                 |    5 +-
 xen/arch/x86/hvm/io.c                        |   52 +-
 xen/arch/x86/hvm/ioreq.c                     | 1375 ++-----------------------
 xen/arch/x86/hvm/stdvga.c                    |   12 +-
 xen/arch/x86/hvm/svm/nestedsvm.c             |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c              |    8 +-
 xen/arch/x86/hvm/vmx/vvmx.c                  |    5 +-
 xen/arch/x86/mm.c                            |   46 +-
 xen/arch/x86/mm/p2m.c                        |   17 +-
 xen/arch/x86/mm/shadow/common.c              |    2 +-
 xen/common/Kconfig                           |    3 +
 xen/common/Makefile                          |    1 +
 xen/common/ioreq.c                           | 1426 ++++++++++++++++++++++++++
 xen/common/memory.c                          |   72 +-
 xen/include/asm-arm/domain.h                 |    3 +
 xen/include/asm-arm/hvm/ioreq.h              |   72 ++
 xen/include/asm-arm/mm.h                     |    8 -
 xen/include/asm-arm/mmio.h                   |    1 +
 xen/include/asm-arm/p2m.h                    |   19 +-
 xen/include/asm-arm/traps.h                  |   25 +
 xen/include/asm-x86/hvm/domain.h             |   43 -
 xen/include/asm-x86/hvm/emulate.h            |    2 +-
 xen/include/asm-x86/hvm/io.h                 |   17 -
 xen/include/asm-x86/hvm/ioreq.h              |   39 +-
 xen/include/asm-x86/hvm/vcpu.h               |   18 -
 xen/include/asm-x86/mm.h                     |    4 -
 xen/include/asm-x86/p2m.h                    |   27 +-
 xen/include/public/arch-arm.h                |    5 +
 xen/include/public/hvm/dm_op.h               |   16 +
 xen/include/xen/dm.h                         |   39 +
 xen/include/xen/ioreq.h                      |  140 +++
 xen/include/xen/p2m-common.h                 |    4 +
 xen/include/xen/sched.h                      |   34 +
 xen/include/xsm/dummy.h                      |    4 +-
 xen/include/xsm/xsm.h                        |    6 +-
 xen/xsm/dummy.c                              |    2 +-
 xen/xsm/flask/hooks.c                        |    5 +-
 65 files changed, 3073 insertions(+), 1809 deletions(-)
 create mode 100644 tools/libs/light/libxl_virtio_disk.c
 create mode 100644 tools/xl/xl_virtio_disk.c
 create mode 100644 xen/arch/arm/dm.c
 create mode 100644 xen/arch/arm/ioreq.c
 create mode 100644 xen/common/ioreq.c
 create mode 100644 xen/include/asm-arm/hvm/ioreq.h
 create mode 100644 xen/include/xen/dm.h
 create mode 100644 xen/include/xen/ioreq.h

-- 
2.7.4




* [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 15:16   ` Julien Grall
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving Oleksandr Tyshchenko
                   ` (23 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

As a lot of x86 code can be re-used on Arm later on, this
patch makes some preparations to x86/hvm/ioreq.c before moving
it to the common code. This way we will get a verbatim copy
for the code movement in a subsequent patch.

This patch mostly introduces specific hooks to abstract arch-specific
material, taking into account the requirement to leave
the "legacy" mechanism of mapping magic pages for the IOREQ servers
x86-specific and not expose it to the common code.

These hooks are named according to the more consistent new naming
scheme right away (including dropping the "hvm" prefixes and infixes):
- IOREQ server functions should start with "ioreq_server_"
- IOREQ functions should start with "ioreq_"
other functions will be renamed in subsequent patches.
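
For illustration (drawn from the diff below), the hooks introduced here
already follow that scheme:
- hvm_ioreq_server_map_pages()   becomes arch_ioreq_server_map_pages()
- hvm_ioreq_server_unmap_pages() becomes arch_ioreq_server_unmap_pages()
- the 0xcf8 port handler registration is wrapped by arch_ioreq_domain_init()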

Also re-order #include-s alphabetically.

This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch, was split from:
     "[RFC PATCH V1 01/12] hvm/ioreq: Make x86's IOREQ feature common"
   - fold the check of p->type into hvm_get_ioreq_server_range_type()
     and make it return success/failure
   - remove relocate_portio_handler() call from arch_hvm_ioreq_destroy()
     in arch/x86/hvm/ioreq.c
   - introduce arch_hvm_destroy_ioreq_server()/arch_handle_hvm_io_completion()

Changes V1 -> V2:
   - update patch description
   - make arch functions inline and put them into arch header
     to achieve a true rename in the subsequent patch
   - return void in arch_hvm_destroy_ioreq_server()
   - return bool in arch_hvm_ioreq_destroy()
   - bring relocate_portio_handler() back to arch_hvm_ioreq_destroy()
   - rename IOREQ_IO* to IOREQ_STATUS*
   - remove *handle* from arch_handle_hvm_io_completion()
   - re-order #include-s alphabetically
   - rename hvm_get_ioreq_server_range_type() to hvm_ioreq_server_get_type_addr()
     and add "const" to several arguments

Changes V2 -> V3:
   - update patch description
   - name new arch hooks according to the new naming scheme
   - don't make arch hooks inline, move them to ioreq.c
   - make get_ioreq_server() local again
   - rework the whole patch taking into account that the "legacy" interface
     should remain x86 specific (additional arch hooks, etc)
   - update the code to be able to use hvm_map_mem_type_to_ioreq_server()
     in the common code (an extra arch hook, etc)
   - don’t include <asm/hvm/emulate.h> from arch header
   - add "arch" prefix to hvm_ioreq_server_get_type_addr()
   - move IOREQ_STATUS_* #define-s introduction to the separate patch
   - move HANDLE_BUFIOREQ to the arch header
   - just return relocate_portio_handler() from arch_ioreq_server_destroy_all()
   - misc adjustments proposed by Jan (adding const, unsigned int instead of uint32_t)

Changes V3 -> V4:
   - add Alex's R-b
   - update patch description
   - make arch_ioreq_server_get_type_addr return bool
   - drop #include <xen/ctype.h>
   - use two arch hooks in hvm_map_mem_type_to_ioreq_server()
     to avoid calling p2m_change_entry_type_global() with lock held
---
 xen/arch/x86/hvm/ioreq.c        | 179 ++++++++++++++++++++++++++--------------
 xen/include/asm-x86/hvm/ioreq.h |  22 +++++
 2 files changed, 141 insertions(+), 60 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 1cc27df..468fe84 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -16,16 +16,15 @@
  * this program; If not, see <http://www.gnu.org/licenses/>.
  */
 
-#include <xen/ctype.h>
+#include <xen/domain.h>
+#include <xen/event.h>
 #include <xen/init.h>
+#include <xen/irq.h>
 #include <xen/lib.h>
-#include <xen/trace.h>
+#include <xen/paging.h>
 #include <xen/sched.h>
-#include <xen/irq.h>
 #include <xen/softirq.h>
-#include <xen/domain.h>
-#include <xen/event.h>
-#include <xen/paging.h>
+#include <xen/trace.h>
 #include <xen/vpci.h>
 
 #include <asm/hvm/emulate.h>
@@ -170,6 +169,29 @@ static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
     return true;
 }
 
+bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion)
+{
+    switch ( io_completion )
+    {
+    case HVMIO_realmode_completion:
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
+        vmx_realmode_emulate_one(&ctxt);
+        hvm_emulate_writeback(&ctxt);
+
+        break;
+    }
+
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
+
+    return true;
+}
+
 bool handle_hvm_io_completion(struct vcpu *v)
 {
     struct domain *d = v->domain;
@@ -209,19 +231,8 @@ bool handle_hvm_io_completion(struct vcpu *v)
         return handle_pio(vio->io_req.addr, vio->io_req.size,
                           vio->io_req.dir);
 
-    case HVMIO_realmode_completion:
-    {
-        struct hvm_emulate_ctxt ctxt;
-
-        hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
-        vmx_realmode_emulate_one(&ctxt);
-        hvm_emulate_writeback(&ctxt);
-
-        break;
-    }
     default:
-        ASSERT_UNREACHABLE();
-        break;
+        return arch_vcpu_ioreq_completion(io_completion);
     }
 
     return true;
@@ -477,9 +488,6 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
     }
 }
 
-#define HANDLE_BUFIOREQ(s) \
-    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
-
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
                                      struct vcpu *v)
 {
@@ -586,7 +594,7 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
+int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s)
 {
     int rc;
 
@@ -601,7 +609,7 @@ static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
     return rc;
 }
 
-static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
+void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
 {
     hvm_unmap_ioreq_gfn(s, true);
     hvm_unmap_ioreq_gfn(s, false);
@@ -674,6 +682,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
     return rc;
 }
 
+void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
+{
+    hvm_remove_ioreq_gfn(s, false);
+    hvm_remove_ioreq_gfn(s, true);
+}
+
 static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
 {
     struct hvm_ioreq_vcpu *sv;
@@ -683,8 +697,7 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
     if ( s->enabled )
         goto done;
 
-    hvm_remove_ioreq_gfn(s, false);
-    hvm_remove_ioreq_gfn(s, true);
+    arch_ioreq_server_enable(s);
 
     s->enabled = true;
 
@@ -697,6 +710,12 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
+void arch_ioreq_server_disable(struct hvm_ioreq_server *s)
+{
+    hvm_add_ioreq_gfn(s, true);
+    hvm_add_ioreq_gfn(s, false);
+}
+
 static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
 {
     spin_lock(&s->lock);
@@ -704,8 +723,7 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
     if ( !s->enabled )
         goto done;
 
-    hvm_add_ioreq_gfn(s, true);
-    hvm_add_ioreq_gfn(s, false);
+    arch_ioreq_server_disable(s);
 
     s->enabled = false;
 
@@ -750,7 +768,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 
  fail_add:
     hvm_ioreq_server_remove_all_vcpus(s);
-    hvm_ioreq_server_unmap_pages(s);
+    arch_ioreq_server_unmap_pages(s);
 
     hvm_ioreq_server_free_rangesets(s);
 
@@ -764,7 +782,7 @@ static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
     hvm_ioreq_server_remove_all_vcpus(s);
 
     /*
-     * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
+     * NOTE: It is safe to call both arch_ioreq_server_unmap_pages() and
      *       hvm_ioreq_server_free_pages() in that order.
      *       This is because the former will do nothing if the pages
      *       are not mapped, leaving the page to be freed by the latter.
@@ -772,7 +790,7 @@ static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
      *       the page_info pointer to NULL, meaning the latter will do
      *       nothing.
      */
-    hvm_ioreq_server_unmap_pages(s);
+    arch_ioreq_server_unmap_pages(s);
     hvm_ioreq_server_free_pages(s);
 
     hvm_ioreq_server_free_rangesets(s);
@@ -836,6 +854,12 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
     return rc;
 }
 
+/* Called when target domain is paused */
+void arch_ioreq_server_destroy(struct hvm_ioreq_server *s)
+{
+    p2m_set_ioreq_server(s->target, 0, s);
+}
+
 int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 {
     struct hvm_ioreq_server *s;
@@ -855,7 +879,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
     domain_pause(d);
 
-    p2m_set_ioreq_server(d, 0, s);
+    arch_ioreq_server_destroy(s);
 
     hvm_ioreq_server_disable(s);
 
@@ -900,7 +924,7 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
 
     if ( ioreq_gfn || bufioreq_gfn )
     {
-        rc = hvm_ioreq_server_map_pages(s);
+        rc = arch_ioreq_server_map_pages(s);
         if ( rc )
             goto out;
     }
@@ -1080,6 +1104,27 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
+/* Called with ioreq_server lock held */
+int arch_ioreq_server_map_mem_type(struct domain *d,
+                                   struct hvm_ioreq_server *s,
+                                   uint32_t flags)
+{
+    return p2m_set_ioreq_server(d, flags, s);
+}
+
+void arch_ioreq_server_map_mem_type_completed(struct domain *d,
+                                              struct hvm_ioreq_server *s,
+                                              uint32_t flags)
+{
+    if ( flags == 0 )
+    {
+        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        if ( read_atomic(&p2m->ioreq.entry_count) )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
+    }
+}
+
 /*
  * Map or unmap an ioreq server to specific memory type. For now, only
  * HVMMEM_ioreq_server is supported, and in the future new types can be
@@ -1112,18 +1157,13 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     if ( s->emulator != current->domain )
         goto out;
 
-    rc = p2m_set_ioreq_server(d, flags, s);
+    rc = arch_ioreq_server_map_mem_type(d, s, flags);
 
  out:
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 
-    if ( rc == 0 && flags == 0 )
-    {
-        struct p2m_domain *p2m = p2m_get_hostp2m(d);
-
-        if ( read_atomic(&p2m->ioreq.entry_count) )
-            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
-    }
+    if ( rc == 0 )
+        arch_ioreq_server_map_mem_type_completed(d, s, flags);
 
     return rc;
 }
@@ -1210,12 +1250,17 @@ void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 }
 
+bool arch_ioreq_server_destroy_all(struct domain *d)
+{
+    return relocate_portio_handler(d, 0xcf8, 0xcf8, 4);
+}
+
 void hvm_destroy_all_ioreq_servers(struct domain *d)
 {
     struct hvm_ioreq_server *s;
     unsigned int id;
 
-    if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) )
+    if ( !arch_ioreq_server_destroy_all(d) )
         return;
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -1239,33 +1284,28 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 }
 
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p)
+bool arch_ioreq_server_get_type_addr(const struct domain *d,
+                                     const ioreq_t *p,
+                                     uint8_t *type,
+                                     uint64_t *addr)
 {
-    struct hvm_ioreq_server *s;
-    uint32_t cf8;
-    uint8_t type;
-    uint64_t addr;
-    unsigned int id;
+    unsigned int cf8 = d->arch.hvm.pci_cf8;
 
     if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
-        return NULL;
-
-    cf8 = d->arch.hvm.pci_cf8;
+        return false;
 
     if ( p->type == IOREQ_TYPE_PIO &&
          (p->addr & ~3) == 0xcfc &&
          CF8_ENABLED(cf8) )
     {
-        uint32_t x86_fam;
+        unsigned int x86_fam, reg;
         pci_sbdf_t sbdf;
-        unsigned int reg;
 
         reg = hvm_pci_decode_addr(cf8, p->addr, &sbdf);
 
         /* PCI config data cycle */
-        type = XEN_DMOP_IO_RANGE_PCI;
-        addr = ((uint64_t)sbdf.sbdf << 32) | reg;
+        *type = XEN_DMOP_IO_RANGE_PCI;
+        *addr = ((uint64_t)sbdf.sbdf << 32) | reg;
         /* AMD extended configuration space access? */
         if ( CF8_ADDR_HI(cf8) &&
              d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
@@ -1277,16 +1317,30 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
 
             if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
                  (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
-                addr |= CF8_ADDR_HI(cf8);
+                *addr |= CF8_ADDR_HI(cf8);
         }
     }
     else
     {
-        type = (p->type == IOREQ_TYPE_PIO) ?
-                XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
-        addr = p->addr;
+        *type = (p->type == IOREQ_TYPE_PIO) ?
+                 XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
+        *addr = p->addr;
     }
 
+    return true;
+}
+
+struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
+                                                 ioreq_t *p)
+{
+    struct hvm_ioreq_server *s;
+    uint8_t type;
+    uint64_t addr;
+    unsigned int id;
+
+    if ( !arch_ioreq_server_get_type_addr(d, p, &type, &addr) )
+        return NULL;
+
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct rangeset *r;
@@ -1515,11 +1569,16 @@ static int hvm_access_cf8(
     return X86EMUL_UNHANDLEABLE;
 }
 
+void arch_ioreq_domain_init(struct domain *d)
+{
+    register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
+}
+
 void hvm_ioreq_init(struct domain *d)
 {
     spin_lock_init(&d->arch.hvm.ioreq_server.lock);
 
-    register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
+    arch_ioreq_domain_init(d);
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index e2588e9..13d35e1 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -19,6 +19,9 @@
 #ifndef __ASM_X86_HVM_IOREQ_H__
 #define __ASM_X86_HVM_IOREQ_H__
 
+#define HANDLE_BUFIOREQ(s) \
+    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
+
 bool hvm_io_pending(struct vcpu *v);
 bool handle_hvm_io_completion(struct vcpu *v);
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
@@ -55,6 +58,25 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
 void hvm_ioreq_init(struct domain *d);
 
+bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
+int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
+void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
+void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
+void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
+void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
+int arch_ioreq_server_map_mem_type(struct domain *d,
+                                   struct hvm_ioreq_server *s,
+                                   uint32_t flags);
+void arch_ioreq_server_map_mem_type_completed(struct domain *d,
+                                              struct hvm_ioreq_server *s,
+                                              uint32_t flags);
+bool arch_ioreq_server_destroy_all(struct domain *d);
+bool arch_ioreq_server_get_type_addr(const struct domain *d,
+                                     const ioreq_t *p,
+                                     uint8_t *type,
+                                     uint64_t *addr);
+void arch_ioreq_domain_init(struct domain *d);
+
 #endif /* __ASM_X86_HVM_IOREQ_H__ */
 
 /*
-- 
2.7.4




* [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
  2021-01-12 21:52 ` [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 15:17   ` Julien Grall
  2021-01-18  8:24   ` Paul Durrant
  2021-01-12 21:52 ` [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio() Oleksandr Tyshchenko
                   ` (22 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch continues the preparation of x86/hvm/ioreq.c
before moving it to the common code.

Add IOREQ_STATUS_* #define-s and update candidates for moving
since X86EMUL_* shouldn't be exposed to the common code in
that form.

This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes V2 -> V3:
 - new patch, was split from
   [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common

Changes V3 -> V4:
 - add Alex's R-b and Jan's A-b
 - add a comment above IOREQ_STATUS_* #define-s
---
 xen/arch/x86/hvm/ioreq.c        | 16 ++++++++--------
 xen/include/asm-x86/hvm/ioreq.h |  5 +++++
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 468fe84..ff9a546 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1405,7 +1405,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
     pg = iorp->va;
 
     if ( !pg )
-        return X86EMUL_UNHANDLEABLE;
+        return IOREQ_STATUS_UNHANDLED;
 
     /*
      * Return 0 for the cases we can't deal with:
@@ -1435,7 +1435,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
         break;
     default:
         gdprintk(XENLOG_WARNING, "unexpected ioreq size: %u\n", p->size);
-        return X86EMUL_UNHANDLEABLE;
+        return IOREQ_STATUS_UNHANDLED;
     }
 
     spin_lock(&s->bufioreq_lock);
@@ -1445,7 +1445,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
     {
         /* The queue is full: send the iopacket through the normal path. */
         spin_unlock(&s->bufioreq_lock);
-        return X86EMUL_UNHANDLEABLE;
+        return IOREQ_STATUS_UNHANDLED;
     }
 
     pg->buf_ioreq[pg->ptrs.write_pointer % IOREQ_BUFFER_SLOT_NUM] = bp;
@@ -1476,7 +1476,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
     notify_via_xen_event_channel(d, s->bufioreq_evtchn);
     spin_unlock(&s->bufioreq_lock);
 
-    return X86EMUL_OKAY;
+    return IOREQ_STATUS_HANDLED;
 }
 
 int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
@@ -1492,7 +1492,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
         return hvm_send_buffered_ioreq(s, proto_p);
 
     if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
-        return X86EMUL_RETRY;
+        return IOREQ_STATUS_RETRY;
 
     list_for_each_entry ( sv,
                           &s->ioreq_vcpu_list,
@@ -1532,11 +1532,11 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
             notify_via_xen_event_channel(d, port);
 
             sv->pending = true;
-            return X86EMUL_RETRY;
+            return IOREQ_STATUS_RETRY;
         }
     }
 
-    return X86EMUL_UNHANDLEABLE;
+    return IOREQ_STATUS_UNHANDLED;
 }
 
 unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
@@ -1550,7 +1550,7 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
         if ( !s->enabled )
             continue;
 
-        if ( hvm_send_ioreq(s, p, buffered) == X86EMUL_UNHANDLEABLE )
+        if ( hvm_send_ioreq(s, p, buffered) == IOREQ_STATUS_UNHANDLED )
             failed++;
     }
 
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index 13d35e1..f140ef4 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -77,6 +77,11 @@ bool arch_ioreq_server_get_type_addr(const struct domain *d,
                                      uint64_t *addr);
 void arch_ioreq_domain_init(struct domain *d);
 
+/* This correlation must not be altered */
+#define IOREQ_STATUS_HANDLED     X86EMUL_OKAY
+#define IOREQ_STATUS_UNHANDLED   X86EMUL_UNHANDLEABLE
+#define IOREQ_STATUS_RETRY       X86EMUL_RETRY
+
 #endif /* __ASM_X86_HVM_IOREQ_H__ */
 
 /*
-- 
2.7.4




* [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
  2021-01-12 21:52 ` [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common Oleksandr Tyshchenko
  2021-01-12 21:52 ` [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 14:48   ` Alex Bennée
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common Oleksandr Tyshchenko
                   ` (21 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

IOREQ is about to become a common feature, and Arm will have its own
implementation.

But the name of the function is pretty generic and can be confusing
on Arm (we already have a try_handle_mmio()).

In order not to rename the function (which is used for a varying
set of purposes on x86) globally, and to get a non-confusing variant on
Arm, provide a wrapper arch_ioreq_complete_mmio() to be used in common
and Arm code.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch

Changes V1 -> V2:
   - remove "handle"
   - add Jan's A-b

Changes V2 -> V3:
   - remove Jan's A-b
   - update patch subject/description
   - use out-of-line function instead of #define
   - put earlier in the series to avoid breakage

Changes V3 -> V4:
   - add Jan's R-b
   - rename ioreq_complete_mmio() to arch_ioreq_complete_mmio()
---
 xen/arch/x86/hvm/ioreq.c        | 7 ++++++-
 xen/include/asm-x86/hvm/ioreq.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index ff9a546..00c68f5 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -35,6 +35,11 @@
 #include <public/hvm/ioreq.h>
 #include <public/hvm/params.h>
 
+bool arch_ioreq_complete_mmio(void)
+{
+    return handle_mmio();
+}
+
 static void set_ioreq_server(struct domain *d, unsigned int id,
                              struct hvm_ioreq_server *s)
 {
@@ -225,7 +230,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
         break;
 
     case HVMIO_mmio_completion:
-        return handle_mmio();
+        return arch_ioreq_complete_mmio();
 
     case HVMIO_pio_completion:
         return handle_pio(vio->io_req.addr, vio->io_req.size,
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index f140ef4..0e64e76 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -58,6 +58,7 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
 void hvm_ioreq_init(struct domain *d);
 
+bool arch_ioreq_complete_mmio(void);
 bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
 int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
 void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
-- 
2.7.4




* [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (2 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio() Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 14:55   ` Alex Bennée
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common Oleksandr Tyshchenko
                   ` (20 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	Paul Durrant, Jun Nakajima, Kevin Tian, Tim Deegan, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

As a lot of x86 code can be re-used on Arm later on, this patch
moves previously prepared IOREQ support to the common code
(the code movement is a verbatim copy).

The "legacy" mechanism of mapping magic pages for the IOREQ servers
remains x86 specific and not exposed to the common code.

The common IOREQ feature is supposed to be built with the IOREQ_SERVER
option enabled, which is selected by x86's HVM config for now.

In order to avoid having a gigantic patch here, the subsequent
patches will update remaining bits in the common code step by step:
- Make IOREQ related structs/materials common
- Drop the "hvm" prefixes and infixes
- Remove layering violation by moving corresponding fields
  out of *arch.hvm* or abstracting away accesses to them

Also include <xen/domain_page.h>, which will be needed on Arm,
to avoid touching the common code again when introducing Arm-specific bits.

This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

***
Please note, this patch depends on the following which is
on review:
https://patchwork.kernel.org/patch/11816689/
***

Changes RFC -> V1:
   - was split into three patches:
     - x86/ioreq: Prepare IOREQ feature for making it common
     - xen/ioreq: Make x86's IOREQ feature common
     - xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
   - update MAINTAINERS file
   - do not use a separate subdir for the IOREQ stuff, move it to:
     - xen/common/ioreq.c
     - xen/include/xen/ioreq.h
   - update x86's files to include xen/ioreq.h
   - remove unneeded headers in arch/x86/hvm/ioreq.c
   - re-order the headers alphabetically in common/ioreq.c
   - update common/ioreq.c according to the newly introduced arch functions:
     arch_hvm_destroy_ioreq_server()/arch_handle_hvm_io_completion()

Changes V1 -> V2:
   - update patch description
   - make everything needed in the previous patch to achieve
     a true rename here
   - don't include unnecessary headers from asm-x86/hvm/ioreq.h
     and xen/ioreq.h
   - use __XEN_IOREQ_H__ instead of __IOREQ_H__
   - move get_ioreq_server() to common/ioreq.c

Changes V2 -> V3:
   - update patch description
   - make everything needed in the previous patch to not
     expose "legacy" interface to the common code here
   - update the patch according to the "legacy interface" being x86-specific
   - include <xen/domain_page.h> in common ioreq.c

Changes V3 -> V4:
   - rebase
   - don't include <xen/ioreq.h> from arch header
   - move all arch hook declarations to the common header
---
 MAINTAINERS                     |    8 +-
 xen/arch/x86/Kconfig            |    1 +
 xen/arch/x86/hvm/dm.c           |    2 +-
 xen/arch/x86/hvm/emulate.c      |    2 +-
 xen/arch/x86/hvm/hvm.c          |    2 +-
 xen/arch/x86/hvm/io.c           |    2 +-
 xen/arch/x86/hvm/ioreq.c        | 1347 ++-------------------------------------
 xen/arch/x86/hvm/stdvga.c       |    2 +-
 xen/arch/x86/hvm/vmx/vvmx.c     |    3 +-
 xen/arch/x86/mm.c               |    2 +-
 xen/arch/x86/mm/shadow/common.c |    2 +-
 xen/common/Kconfig              |    3 +
 xen/common/Makefile             |    1 +
 xen/common/ioreq.c              | 1290 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/ioreq.h |   59 --
 xen/include/xen/ioreq.h         |   93 +++
 16 files changed, 1455 insertions(+), 1364 deletions(-)
 create mode 100644 xen/common/ioreq.c
 create mode 100644 xen/include/xen/ioreq.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 6dbd99a..0160cab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -333,6 +333,13 @@ X:	xen/drivers/passthrough/vtd/
 X:	xen/drivers/passthrough/device_tree.c
 F:	xen/include/xen/iommu.h
 
+I/O EMULATION (IOREQ)
+M:	Paul Durrant <paul@xen.org>
+S:	Supported
+F:	xen/common/ioreq.c
+F:	xen/include/xen/ioreq.h
+F:	xen/include/public/hvm/ioreq.h
+
 KCONFIG
 M:	Doug Goldstein <cardoe@cardoe.com>
 S:	Supported
@@ -549,7 +556,6 @@ F:	xen/arch/x86/hvm/ioreq.c
 F:	xen/include/asm-x86/hvm/emulate.h
 F:	xen/include/asm-x86/hvm/io.h
 F:	xen/include/asm-x86/hvm/ioreq.h
-F:	xen/include/public/hvm/ioreq.h
 
 X86 MEMORY MANAGEMENT
 M:	Jan Beulich <jbeulich@suse.com>
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 24868aa..abe0fce 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -91,6 +91,7 @@ config PV_LINEAR_PT
 
 config HVM
 	def_bool !PV_SHIM_EXCLUSIVE
+	select IOREQ_SERVER
 	prompt "HVM support"
 	---help---
 	  Interfaces to support HVM domains.  HVM domains require hardware
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 71f5ca4..d3e2a9e 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -17,12 +17,12 @@
 #include <xen/event.h>
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
+#include <xen/ioreq.h>
 #include <xen/nospec.h>
 #include <xen/sched.h>
 
 #include <asm/hap.h>
 #include <asm/hvm/cacheattr.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/shadow.h>
 
 #include <xsm/xsm.h>
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 24cf85f..60ca465 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -10,6 +10,7 @@
  */
 
 #include <xen/init.h>
+#include <xen/ioreq.h>
 #include <xen/lib.h>
 #include <xen/sched.h>
 #include <xen/paging.h>
@@ -20,7 +21,6 @@
 #include <asm/xstate.h>
 #include <asm/hvm/emulate.h>
 #include <asm/hvm/hvm.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/hvm/monitor.h>
 #include <asm/hvm/trace.h>
 #include <asm/hvm/support.h>
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 54e32e4..bc96947 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -20,6 +20,7 @@
 
 #include <xen/ctype.h>
 #include <xen/init.h>
+#include <xen/ioreq.h>
 #include <xen/lib.h>
 #include <xen/trace.h>
 #include <xen/sched.h>
@@ -64,7 +65,6 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/monitor.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/hvm/viridian.h>
 #include <asm/hvm/vm_event.h>
 #include <asm/altp2m.h>
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 3e09d9b..11e007d 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -19,6 +19,7 @@
  */
 
 #include <xen/init.h>
+#include <xen/ioreq.h>
 #include <xen/mm.h>
 #include <xen/lib.h>
 #include <xen/errno.h>
@@ -35,7 +36,6 @@
 #include <asm/shadow.h>
 #include <asm/p2m.h>
 #include <asm/hvm/hvm.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/vpt.h>
 #include <asm/hvm/vpic.h>
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 00c68f5..177b964 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -19,6 +19,7 @@
 #include <xen/domain.h>
 #include <xen/event.h>
 #include <xen/init.h>
+#include <xen/ioreq.h>
 #include <xen/irq.h>
 #include <xen/lib.h>
 #include <xen/paging.h>
@@ -29,7 +30,6 @@
 
 #include <asm/hvm/emulate.h>
 #include <asm/hvm/hvm.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/hvm/vmx/vmx.h>
 
 #include <public/hvm/ioreq.h>
@@ -40,140 +40,6 @@ bool arch_ioreq_complete_mmio(void)
     return handle_mmio();
 }
 
-static void set_ioreq_server(struct domain *d, unsigned int id,
-                             struct hvm_ioreq_server *s)
-{
-    ASSERT(id < MAX_NR_IOREQ_SERVERS);
-    ASSERT(!s || !d->arch.hvm.ioreq_server.server[id]);
-
-    d->arch.hvm.ioreq_server.server[id] = s;
-}
-
-#define GET_IOREQ_SERVER(d, id) \
-    (d)->arch.hvm.ioreq_server.server[id]
-
-static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
-                                                 unsigned int id)
-{
-    if ( id >= MAX_NR_IOREQ_SERVERS )
-        return NULL;
-
-    return GET_IOREQ_SERVER(d, id);
-}
-
-/*
- * Iterate over all possible ioreq servers.
- *
- * NOTE: The iteration is backwards such that more recently created
- *       ioreq servers are favoured in hvm_select_ioreq_server().
- *       This is a semantic that previously existed when ioreq servers
- *       were held in a linked list.
- */
-#define FOR_EACH_IOREQ_SERVER(d, id, s) \
-    for ( (id) = MAX_NR_IOREQ_SERVERS; (id) != 0; ) \
-        if ( !(s = GET_IOREQ_SERVER(d, --(id))) ) \
-            continue; \
-        else
-
-static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
-{
-    shared_iopage_t *p = s->ioreq.va;
-
-    ASSERT((v == current) || !vcpu_runnable(v));
-    ASSERT(p != NULL);
-
-    return &p->vcpu_ioreq[v->vcpu_id];
-}
-
-static struct hvm_ioreq_vcpu *get_pending_vcpu(const struct vcpu *v,
-                                               struct hvm_ioreq_server **srvp)
-{
-    struct domain *d = v->domain;
-    struct hvm_ioreq_server *s;
-    unsigned int id;
-
-    FOR_EACH_IOREQ_SERVER(d, id, s)
-    {
-        struct hvm_ioreq_vcpu *sv;
-
-        list_for_each_entry ( sv,
-                              &s->ioreq_vcpu_list,
-                              list_entry )
-        {
-            if ( sv->vcpu == v && sv->pending )
-            {
-                if ( srvp )
-                    *srvp = s;
-                return sv;
-            }
-        }
-    }
-
-    return NULL;
-}
-
-bool hvm_io_pending(struct vcpu *v)
-{
-    return get_pending_vcpu(v, NULL);
-}
-
-static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
-{
-    unsigned int prev_state = STATE_IOREQ_NONE;
-    unsigned int state = p->state;
-    uint64_t data = ~0;
-
-    smp_rmb();
-
-    /*
-     * The only reason we should see this condition be false is when an
-     * emulator dying races with I/O being requested.
-     */
-    while ( likely(state != STATE_IOREQ_NONE) )
-    {
-        if ( unlikely(state < prev_state) )
-        {
-            gdprintk(XENLOG_ERR, "Weird HVM ioreq state transition %u -> %u\n",
-                     prev_state, state);
-            sv->pending = false;
-            domain_crash(sv->vcpu->domain);
-            return false; /* bail */
-        }
-
-        switch ( prev_state = state )
-        {
-        case STATE_IORESP_READY: /* IORESP_READY -> NONE */
-            p->state = STATE_IOREQ_NONE;
-            data = p->data;
-            break;
-
-        case STATE_IOREQ_READY:  /* IOREQ_{READY,INPROCESS} -> IORESP_READY */
-        case STATE_IOREQ_INPROCESS:
-            wait_on_xen_event_channel(sv->ioreq_evtchn,
-                                      ({ state = p->state;
-                                         smp_rmb();
-                                         state != prev_state; }));
-            continue;
-
-        default:
-            gdprintk(XENLOG_ERR, "Weird HVM iorequest state %u\n", state);
-            sv->pending = false;
-            domain_crash(sv->vcpu->domain);
-            return false; /* bail */
-        }
-
-        break;
-    }
-
-    p = &sv->vcpu->arch.hvm.hvm_io.io_req;
-    if ( hvm_ioreq_needs_completion(p) )
-        p->data = data;
-
-    sv->pending = false;
-
-    return true;
-}
-
 bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion)
 {
     switch ( io_completion )
@@ -197,52 +63,6 @@ bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion)
     return true;
 }
 
-bool handle_hvm_io_completion(struct vcpu *v)
-{
-    struct domain *d = v->domain;
-    struct hvm_vcpu_io *vio = &v->arch.hvm.hvm_io;
-    struct hvm_ioreq_server *s;
-    struct hvm_ioreq_vcpu *sv;
-    enum hvm_io_completion io_completion;
-
-    if ( has_vpci(d) && vpci_process_pending(v) )
-    {
-        raise_softirq(SCHEDULE_SOFTIRQ);
-        return false;
-    }
-
-    sv = get_pending_vcpu(v, &s);
-    if ( sv && !hvm_wait_for_io(sv, get_ioreq(s, v)) )
-        return false;
-
-    vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
-        STATE_IORESP_READY : STATE_IOREQ_NONE;
-
-    msix_write_completion(v);
-    vcpu_end_shutdown_deferral(v);
-
-    io_completion = vio->io_completion;
-    vio->io_completion = HVMIO_no_completion;
-
-    switch ( io_completion )
-    {
-    case HVMIO_no_completion:
-        break;
-
-    case HVMIO_mmio_completion:
-        return arch_ioreq_complete_mmio();
-
-    case HVMIO_pio_completion:
-        return handle_pio(vio->io_req.addr, vio->io_req.size,
-                          vio->io_req.dir);
-
-    default:
-        return arch_vcpu_ioreq_completion(io_completion);
-    }
-
-    return true;
-}
-
 static gfn_t hvm_alloc_legacy_ioreq_gfn(struct hvm_ioreq_server *s)
 {
     struct domain *d = s->target;
@@ -359,93 +179,6 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     return rc;
 }
 
-static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
-{
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
-    struct page_info *page;
-
-    if ( iorp->page )
-    {
-        /*
-         * If a guest frame has already been mapped (which may happen
-         * on demand if hvm_get_ioreq_server_info() is called), then
-         * allocating a page is not permitted.
-         */
-        if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
-            return -EPERM;
-
-        return 0;
-    }
-
-    page = alloc_domheap_page(s->target, MEMF_no_refcount);
-
-    if ( !page )
-        return -ENOMEM;
-
-    if ( !get_page_and_type(page, s->target, PGT_writable_page) )
-    {
-        /*
-         * The domain can't possibly know about this page yet, so failure
-         * here is a clear indication of something fishy going on.
-         */
-        domain_crash(s->emulator);
-        return -ENODATA;
-    }
-
-    iorp->va = __map_domain_page_global(page);
-    if ( !iorp->va )
-        goto fail;
-
-    iorp->page = page;
-    clear_page(iorp->va);
-    return 0;
-
- fail:
-    put_page_alloc_ref(page);
-    put_page_and_type(page);
-
-    return -ENOMEM;
-}
-
-static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
-{
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
-    struct page_info *page = iorp->page;
-
-    if ( !page )
-        return;
-
-    iorp->page = NULL;
-
-    unmap_domain_page_global(iorp->va);
-    iorp->va = NULL;
-
-    put_page_alloc_ref(page);
-    put_page_and_type(page);
-}
-
-bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
-{
-    const struct hvm_ioreq_server *s;
-    unsigned int id;
-    bool found = false;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    FOR_EACH_IOREQ_SERVER(d, id, s)
-    {
-        if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
-        {
-            found = true;
-            break;
-        }
-    }
-
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return found;
-}
-
 static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
@@ -480,125 +213,6 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     return rc;
 }
 
-static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
-                                    struct hvm_ioreq_vcpu *sv)
-{
-    ASSERT(spin_is_locked(&s->lock));
-
-    if ( s->ioreq.va != NULL )
-    {
-        ioreq_t *p = get_ioreq(s, sv->vcpu);
-
-        p->vp_eport = sv->ioreq_evtchn;
-    }
-}
-
-static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
-                                     struct vcpu *v)
-{
-    struct hvm_ioreq_vcpu *sv;
-    int rc;
-
-    sv = xzalloc(struct hvm_ioreq_vcpu);
-
-    rc = -ENOMEM;
-    if ( !sv )
-        goto fail1;
-
-    spin_lock(&s->lock);
-
-    rc = alloc_unbound_xen_event_channel(v->domain, v->vcpu_id,
-                                         s->emulator->domain_id, NULL);
-    if ( rc < 0 )
-        goto fail2;
-
-    sv->ioreq_evtchn = rc;
-
-    if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
-    {
-        rc = alloc_unbound_xen_event_channel(v->domain, 0,
-                                             s->emulator->domain_id, NULL);
-        if ( rc < 0 )
-            goto fail3;
-
-        s->bufioreq_evtchn = rc;
-    }
-
-    sv->vcpu = v;
-
-    list_add(&sv->list_entry, &s->ioreq_vcpu_list);
-
-    if ( s->enabled )
-        hvm_update_ioreq_evtchn(s, sv);
-
-    spin_unlock(&s->lock);
-    return 0;
-
- fail3:
-    free_xen_event_channel(v->domain, sv->ioreq_evtchn);
-
- fail2:
-    spin_unlock(&s->lock);
-    xfree(sv);
-
- fail1:
-    return rc;
-}
-
-static void hvm_ioreq_server_remove_vcpu(struct hvm_ioreq_server *s,
-                                         struct vcpu *v)
-{
-    struct hvm_ioreq_vcpu *sv;
-
-    spin_lock(&s->lock);
-
-    list_for_each_entry ( sv,
-                          &s->ioreq_vcpu_list,
-                          list_entry )
-    {
-        if ( sv->vcpu != v )
-            continue;
-
-        list_del(&sv->list_entry);
-
-        if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
-            free_xen_event_channel(v->domain, s->bufioreq_evtchn);
-
-        free_xen_event_channel(v->domain, sv->ioreq_evtchn);
-
-        xfree(sv);
-        break;
-    }
-
-    spin_unlock(&s->lock);
-}
-
-static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
-{
-    struct hvm_ioreq_vcpu *sv, *next;
-
-    spin_lock(&s->lock);
-
-    list_for_each_entry_safe ( sv,
-                               next,
-                               &s->ioreq_vcpu_list,
-                               list_entry )
-    {
-        struct vcpu *v = sv->vcpu;
-
-        list_del(&sv->list_entry);
-
-        if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
-            free_xen_event_channel(v->domain, s->bufioreq_evtchn);
-
-        free_xen_event_channel(v->domain, sv->ioreq_evtchn);
-
-        xfree(sv);
-    }
-
-    spin_unlock(&s->lock);
-}
-
 int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s)
 {
     int rc;
@@ -620,705 +234,80 @@ void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
     hvm_unmap_ioreq_gfn(s, false);
 }
 
-static int hvm_ioreq_server_alloc_pages(struct hvm_ioreq_server *s)
+void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
 {
-    int rc;
-
-    rc = hvm_alloc_ioreq_mfn(s, false);
-
-    if ( !rc && (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF) )
-        rc = hvm_alloc_ioreq_mfn(s, true);
-
-    if ( rc )
-        hvm_free_ioreq_mfn(s, false);
-
-    return rc;
+    hvm_remove_ioreq_gfn(s, false);
+    hvm_remove_ioreq_gfn(s, true);
 }
 
-static void hvm_ioreq_server_free_pages(struct hvm_ioreq_server *s)
+void arch_ioreq_server_disable(struct hvm_ioreq_server *s)
 {
-    hvm_free_ioreq_mfn(s, true);
-    hvm_free_ioreq_mfn(s, false);
+    hvm_add_ioreq_gfn(s, true);
+    hvm_add_ioreq_gfn(s, false);
 }
 
-static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
+/* Called when target domain is paused */
+void arch_ioreq_server_destroy(struct hvm_ioreq_server *s)
 {
-    unsigned int i;
-
-    for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
-        rangeset_destroy(s->range[i]);
+    p2m_set_ioreq_server(s->target, 0, s);
 }
 
-static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
-                                            ioservid_t id)
+/* Called with ioreq_server lock held */
+int arch_ioreq_server_map_mem_type(struct domain *d,
+                                   struct hvm_ioreq_server *s,
+                                   uint32_t flags)
 {
-    unsigned int i;
-    int rc;
+    return p2m_set_ioreq_server(d, flags, s);
+}
 
-    for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
+void arch_ioreq_server_map_mem_type_completed(struct domain *d,
+                                              struct hvm_ioreq_server *s,
+                                              uint32_t flags)
+{
+    if ( flags == 0 )
     {
-        char *name;
-
-        rc = asprintf(&name, "ioreq_server %d %s", id,
-                      (i == XEN_DMOP_IO_RANGE_PORT) ? "port" :
-                      (i == XEN_DMOP_IO_RANGE_MEMORY) ? "memory" :
-                      (i == XEN_DMOP_IO_RANGE_PCI) ? "pci" :
-                      "");
-        if ( rc )
-            goto fail;
-
-        s->range[i] = rangeset_new(s->target, name,
-                                   RANGESETF_prettyprint_hex);
-
-        xfree(name);
-
-        rc = -ENOMEM;
-        if ( !s->range[i] )
-            goto fail;
+        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
-        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
+        if ( read_atomic(&p2m->ioreq.entry_count) )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
     }
-
-    return 0;
-
- fail:
-    hvm_ioreq_server_free_rangesets(s);
-
-    return rc;
 }
 
-void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
+bool arch_ioreq_server_destroy_all(struct domain *d)
 {
-    hvm_remove_ioreq_gfn(s, false);
-    hvm_remove_ioreq_gfn(s, true);
+    return relocate_portio_handler(d, 0xcf8, 0xcf8, 4);
 }
 
-static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
+bool arch_ioreq_server_get_type_addr(const struct domain *d,
+                                     const ioreq_t *p,
+                                     uint8_t *type,
+                                     uint64_t *addr)
 {
-    struct hvm_ioreq_vcpu *sv;
-
-    spin_lock(&s->lock);
-
-    if ( s->enabled )
-        goto done;
-
-    arch_ioreq_server_enable(s);
+    unsigned int cf8 = d->arch.hvm.pci_cf8;
 
-    s->enabled = true;
+    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
+        return false;
 
-    list_for_each_entry ( sv,
-                          &s->ioreq_vcpu_list,
-                          list_entry )
-        hvm_update_ioreq_evtchn(s, sv);
+    if ( p->type == IOREQ_TYPE_PIO &&
+         (p->addr & ~3) == 0xcfc &&
+         CF8_ENABLED(cf8) )
+    {
+        unsigned int x86_fam, reg;
+        pci_sbdf_t sbdf;
 
-  done:
-    spin_unlock(&s->lock);
-}
+        reg = hvm_pci_decode_addr(cf8, p->addr, &sbdf);
 
-void arch_ioreq_server_disable(struct hvm_ioreq_server *s)
-{
-    hvm_add_ioreq_gfn(s, true);
-    hvm_add_ioreq_gfn(s, false);
-}
-
-static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
-{
-    spin_lock(&s->lock);
-
-    if ( !s->enabled )
-        goto done;
-
-    arch_ioreq_server_disable(s);
-
-    s->enabled = false;
-
- done:
-    spin_unlock(&s->lock);
-}
-
-static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
-                                 struct domain *d, int bufioreq_handling,
-                                 ioservid_t id)
-{
-    struct domain *currd = current->domain;
-    struct vcpu *v;
-    int rc;
-
-    s->target = d;
-
-    get_knownalive_domain(currd);
-    s->emulator = currd;
-
-    spin_lock_init(&s->lock);
-    INIT_LIST_HEAD(&s->ioreq_vcpu_list);
-    spin_lock_init(&s->bufioreq_lock);
-
-    s->ioreq.gfn = INVALID_GFN;
-    s->bufioreq.gfn = INVALID_GFN;
-
-    rc = hvm_ioreq_server_alloc_rangesets(s, id);
-    if ( rc )
-        return rc;
-
-    s->bufioreq_handling = bufioreq_handling;
-
-    for_each_vcpu ( d, v )
-    {
-        rc = hvm_ioreq_server_add_vcpu(s, v);
-        if ( rc )
-            goto fail_add;
-    }
-
-    return 0;
-
- fail_add:
-    hvm_ioreq_server_remove_all_vcpus(s);
-    arch_ioreq_server_unmap_pages(s);
-
-    hvm_ioreq_server_free_rangesets(s);
-
-    put_domain(s->emulator);
-    return rc;
-}
-
-static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
-{
-    ASSERT(!s->enabled);
-    hvm_ioreq_server_remove_all_vcpus(s);
-
-    /*
-     * NOTE: It is safe to call both arch_ioreq_server_unmap_pages() and
-     *       hvm_ioreq_server_free_pages() in that order.
-     *       This is because the former will do nothing if the pages
-     *       are not mapped, leaving the page to be freed by the latter.
-     *       However if the pages are mapped then the former will set
-     *       the page_info pointer to NULL, meaning the latter will do
-     *       nothing.
-     */
-    arch_ioreq_server_unmap_pages(s);
-    hvm_ioreq_server_free_pages(s);
-
-    hvm_ioreq_server_free_rangesets(s);
-
-    put_domain(s->emulator);
-}
-
-int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
-                            ioservid_t *id)
-{
-    struct hvm_ioreq_server *s;
-    unsigned int i;
-    int rc;
-
-    if ( bufioreq_handling > HVM_IOREQSRV_BUFIOREQ_ATOMIC )
-        return -EINVAL;
-
-    s = xzalloc(struct hvm_ioreq_server);
-    if ( !s )
-        return -ENOMEM;
-
-    domain_pause(d);
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    for ( i = 0; i < MAX_NR_IOREQ_SERVERS; i++ )
-    {
-        if ( !GET_IOREQ_SERVER(d, i) )
-            break;
-    }
-
-    rc = -ENOSPC;
-    if ( i >= MAX_NR_IOREQ_SERVERS )
-        goto fail;
-
-    /*
-     * It is safe to call set_ioreq_server() prior to
-     * hvm_ioreq_server_init() since the target domain is paused.
-     */
-    set_ioreq_server(d, i, s);
-
-    rc = hvm_ioreq_server_init(s, d, bufioreq_handling, i);
-    if ( rc )
-    {
-        set_ioreq_server(d, i, NULL);
-        goto fail;
-    }
-
-    if ( id )
-        *id = i;
-
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-    domain_unpause(d);
-
-    return 0;
-
- fail:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-    domain_unpause(d);
-
-    xfree(s);
-    return rc;
-}
-
-/* Called when target domain is paused */
-void arch_ioreq_server_destroy(struct hvm_ioreq_server *s)
-{
-    p2m_set_ioreq_server(s->target, 0, s);
-}
-
-int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
-{
-    struct hvm_ioreq_server *s;
-    int rc;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    s = get_ioreq_server(d, id);
-
-    rc = -ENOENT;
-    if ( !s )
-        goto out;
-
-    rc = -EPERM;
-    if ( s->emulator != current->domain )
-        goto out;
-
-    domain_pause(d);
-
-    arch_ioreq_server_destroy(s);
-
-    hvm_ioreq_server_disable(s);
-
-    /*
-     * It is safe to call hvm_ioreq_server_deinit() prior to
-     * set_ioreq_server() since the target domain is paused.
-     */
-    hvm_ioreq_server_deinit(s);
-    set_ioreq_server(d, id, NULL);
-
-    domain_unpause(d);
-
-    xfree(s);
-
-    rc = 0;
-
- out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return rc;
-}
-
-int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
-                              unsigned long *ioreq_gfn,
-                              unsigned long *bufioreq_gfn,
-                              evtchn_port_t *bufioreq_port)
-{
-    struct hvm_ioreq_server *s;
-    int rc;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    s = get_ioreq_server(d, id);
-
-    rc = -ENOENT;
-    if ( !s )
-        goto out;
-
-    rc = -EPERM;
-    if ( s->emulator != current->domain )
-        goto out;
-
-    if ( ioreq_gfn || bufioreq_gfn )
-    {
-        rc = arch_ioreq_server_map_pages(s);
-        if ( rc )
-            goto out;
-    }
-
-    if ( ioreq_gfn )
-        *ioreq_gfn = gfn_x(s->ioreq.gfn);
-
-    if ( HANDLE_BUFIOREQ(s) )
-    {
-        if ( bufioreq_gfn )
-            *bufioreq_gfn = gfn_x(s->bufioreq.gfn);
-
-        if ( bufioreq_port )
-            *bufioreq_port = s->bufioreq_evtchn;
-    }
-
-    rc = 0;
-
- out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return rc;
-}
-
-int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
-                               unsigned long idx, mfn_t *mfn)
-{
-    struct hvm_ioreq_server *s;
-    int rc;
-
-    ASSERT(is_hvm_domain(d));
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    s = get_ioreq_server(d, id);
-
-    rc = -ENOENT;
-    if ( !s )
-        goto out;
-
-    rc = -EPERM;
-    if ( s->emulator != current->domain )
-        goto out;
-
-    rc = hvm_ioreq_server_alloc_pages(s);
-    if ( rc )
-        goto out;
-
-    switch ( idx )
-    {
-    case XENMEM_resource_ioreq_server_frame_bufioreq:
-        rc = -ENOENT;
-        if ( !HANDLE_BUFIOREQ(s) )
-            goto out;
-
-        *mfn = page_to_mfn(s->bufioreq.page);
-        rc = 0;
-        break;
-
-    case XENMEM_resource_ioreq_server_frame_ioreq(0):
-        *mfn = page_to_mfn(s->ioreq.page);
-        rc = 0;
-        break;
-
-    default:
-        rc = -EINVAL;
-        break;
-    }
-
- out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return rc;
-}
-
-int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint64_t start,
-                                     uint64_t end)
-{
-    struct hvm_ioreq_server *s;
-    struct rangeset *r;
-    int rc;
-
-    if ( start > end )
-        return -EINVAL;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    s = get_ioreq_server(d, id);
-
-    rc = -ENOENT;
-    if ( !s )
-        goto out;
-
-    rc = -EPERM;
-    if ( s->emulator != current->domain )
-        goto out;
-
-    switch ( type )
-    {
-    case XEN_DMOP_IO_RANGE_PORT:
-    case XEN_DMOP_IO_RANGE_MEMORY:
-    case XEN_DMOP_IO_RANGE_PCI:
-        r = s->range[type];
-        break;
-
-    default:
-        r = NULL;
-        break;
-    }
-
-    rc = -EINVAL;
-    if ( !r )
-        goto out;
-
-    rc = -EEXIST;
-    if ( rangeset_overlaps_range(r, start, end) )
-        goto out;
-
-    rc = rangeset_add_range(r, start, end);
-
- out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return rc;
-}
-
-int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
-                                         uint32_t type, uint64_t start,
-                                         uint64_t end)
-{
-    struct hvm_ioreq_server *s;
-    struct rangeset *r;
-    int rc;
-
-    if ( start > end )
-        return -EINVAL;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    s = get_ioreq_server(d, id);
-
-    rc = -ENOENT;
-    if ( !s )
-        goto out;
-
-    rc = -EPERM;
-    if ( s->emulator != current->domain )
-        goto out;
-
-    switch ( type )
-    {
-    case XEN_DMOP_IO_RANGE_PORT:
-    case XEN_DMOP_IO_RANGE_MEMORY:
-    case XEN_DMOP_IO_RANGE_PCI:
-        r = s->range[type];
-        break;
-
-    default:
-        r = NULL;
-        break;
-    }
-
-    rc = -EINVAL;
-    if ( !r )
-        goto out;
-
-    rc = -ENOENT;
-    if ( !rangeset_contains_range(r, start, end) )
-        goto out;
-
-    rc = rangeset_remove_range(r, start, end);
-
- out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return rc;
-}
-
-/* Called with ioreq_server lock held */
-int arch_ioreq_server_map_mem_type(struct domain *d,
-                                   struct hvm_ioreq_server *s,
-                                   uint32_t flags)
-{
-    return p2m_set_ioreq_server(d, flags, s);
-}
-
-void arch_ioreq_server_map_mem_type_completed(struct domain *d,
-                                              struct hvm_ioreq_server *s,
-                                              uint32_t flags)
-{
-    if ( flags == 0 )
-    {
-        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
-
-        if ( read_atomic(&p2m->ioreq.entry_count) )
-            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
-    }
-}
-
-/*
- * Map or unmap an ioreq server to specific memory type. For now, only
- * HVMMEM_ioreq_server is supported, and in the future new types can be
- * introduced, e.g. HVMMEM_ioreq_serverX mapped to ioreq server X. And
- * currently, only write operations are to be forwarded to an ioreq server.
- * Support for the emulation of read operations can be added when an ioreq
- * server has such requirement in the future.
- */
-int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint32_t flags)
-{
-    struct hvm_ioreq_server *s;
-    int rc;
-
-    if ( type != HVMMEM_ioreq_server )
-        return -EINVAL;
-
-    if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
-        return -EINVAL;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    s = get_ioreq_server(d, id);
-
-    rc = -ENOENT;
-    if ( !s )
-        goto out;
-
-    rc = -EPERM;
-    if ( s->emulator != current->domain )
-        goto out;
-
-    rc = arch_ioreq_server_map_mem_type(d, s, flags);
-
- out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    if ( rc == 0 )
-        arch_ioreq_server_map_mem_type_completed(d, s, flags);
-
-    return rc;
-}
-
-int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
-                               bool enabled)
-{
-    struct hvm_ioreq_server *s;
-    int rc;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    s = get_ioreq_server(d, id);
-
-    rc = -ENOENT;
-    if ( !s )
-        goto out;
-
-    rc = -EPERM;
-    if ( s->emulator != current->domain )
-        goto out;
-
-    domain_pause(d);
-
-    if ( enabled )
-        hvm_ioreq_server_enable(s);
-    else
-        hvm_ioreq_server_disable(s);
-
-    domain_unpause(d);
-
-    rc = 0;
-
- out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-    return rc;
-}
-
-int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
-{
-    struct hvm_ioreq_server *s;
-    unsigned int id;
-    int rc;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    FOR_EACH_IOREQ_SERVER(d, id, s)
-    {
-        rc = hvm_ioreq_server_add_vcpu(s, v);
-        if ( rc )
-            goto fail;
-    }
-
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return 0;
-
- fail:
-    while ( ++id != MAX_NR_IOREQ_SERVERS )
-    {
-        s = GET_IOREQ_SERVER(d, id);
-
-        if ( !s )
-            continue;
-
-        hvm_ioreq_server_remove_vcpu(s, v);
-    }
-
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    return rc;
-}
-
-void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
-{
-    struct hvm_ioreq_server *s;
-    unsigned int id;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    FOR_EACH_IOREQ_SERVER(d, id, s)
-        hvm_ioreq_server_remove_vcpu(s, v);
-
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-}
-
-bool arch_ioreq_server_destroy_all(struct domain *d)
-{
-    return relocate_portio_handler(d, 0xcf8, 0xcf8, 4);
-}
-
-void hvm_destroy_all_ioreq_servers(struct domain *d)
-{
-    struct hvm_ioreq_server *s;
-    unsigned int id;
-
-    if ( !arch_ioreq_server_destroy_all(d) )
-        return;
-
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
-
-    /* No need to domain_pause() as the domain is being torn down */
-
-    FOR_EACH_IOREQ_SERVER(d, id, s)
-    {
-        hvm_ioreq_server_disable(s);
-
-        /*
-         * It is safe to call hvm_ioreq_server_deinit() prior to
-         * set_ioreq_server() since the target domain is being destroyed.
-         */
-        hvm_ioreq_server_deinit(s);
-        set_ioreq_server(d, id, NULL);
-
-        xfree(s);
-    }
-
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
-}
-
-bool arch_ioreq_server_get_type_addr(const struct domain *d,
-                                     const ioreq_t *p,
-                                     uint8_t *type,
-                                     uint64_t *addr)
-{
-    unsigned int cf8 = d->arch.hvm.pci_cf8;
-
-    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
-        return false;
-
-    if ( p->type == IOREQ_TYPE_PIO &&
-         (p->addr & ~3) == 0xcfc &&
-         CF8_ENABLED(cf8) )
-    {
-        unsigned int x86_fam, reg;
-        pci_sbdf_t sbdf;
-
-        reg = hvm_pci_decode_addr(cf8, p->addr, &sbdf);
-
-        /* PCI config data cycle */
-        *type = XEN_DMOP_IO_RANGE_PCI;
-        *addr = ((uint64_t)sbdf.sbdf << 32) | reg;
-        /* AMD extended configuration space access? */
-        if ( CF8_ADDR_HI(cf8) &&
-             d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
-             (x86_fam = get_cpu_family(
-                 d->arch.cpuid->basic.raw_fms, NULL, NULL)) >= 0x10 &&
-             x86_fam < 0x17 )
-        {
-            uint64_t msr_val;
+        /* PCI config data cycle */
+        *type = XEN_DMOP_IO_RANGE_PCI;
+        *addr = ((uint64_t)sbdf.sbdf << 32) | reg;
+        /* AMD extended configuration space access? */
+        if ( CF8_ADDR_HI(cf8) &&
+             d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
+             (x86_fam = get_cpu_family(
+                 d->arch.cpuid->basic.raw_fms, NULL, NULL)) >= 0x10 &&
+             x86_fam < 0x17 )
+        {
+            uint64_t msr_val;
 
             if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
                  (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
@@ -1335,233 +324,6 @@ bool arch_ioreq_server_get_type_addr(const struct domain *d,
     return true;
 }
 
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p)
-{
-    struct hvm_ioreq_server *s;
-    uint8_t type;
-    uint64_t addr;
-    unsigned int id;
-
-    if ( !arch_ioreq_server_get_type_addr(d, p, &type, &addr) )
-        return NULL;
-
-    FOR_EACH_IOREQ_SERVER(d, id, s)
-    {
-        struct rangeset *r;
-
-        if ( !s->enabled )
-            continue;
-
-        r = s->range[type];
-
-        switch ( type )
-        {
-            unsigned long start, end;
-
-        case XEN_DMOP_IO_RANGE_PORT:
-            start = addr;
-            end = start + p->size - 1;
-            if ( rangeset_contains_range(r, start, end) )
-                return s;
-
-            break;
-
-        case XEN_DMOP_IO_RANGE_MEMORY:
-            start = hvm_mmio_first_byte(p);
-            end = hvm_mmio_last_byte(p);
-
-            if ( rangeset_contains_range(r, start, end) )
-                return s;
-
-            break;
-
-        case XEN_DMOP_IO_RANGE_PCI:
-            if ( rangeset_contains_singleton(r, addr >> 32) )
-            {
-                p->type = IOREQ_TYPE_PCI_CONFIG;
-                p->addr = addr;
-                return s;
-            }
-
-            break;
-        }
-    }
-
-    return NULL;
-}
-
-static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
-{
-    struct domain *d = current->domain;
-    struct hvm_ioreq_page *iorp;
-    buffered_iopage_t *pg;
-    buf_ioreq_t bp = { .data = p->data,
-                       .addr = p->addr,
-                       .type = p->type,
-                       .dir = p->dir };
-    /* Timeoffset sends 64b data, but no address. Use two consecutive slots. */
-    int qw = 0;
-
-    /* Ensure buffered_iopage fits in a page */
-    BUILD_BUG_ON(sizeof(buffered_iopage_t) > PAGE_SIZE);
-
-    iorp = &s->bufioreq;
-    pg = iorp->va;
-
-    if ( !pg )
-        return IOREQ_STATUS_UNHANDLED;
-
-    /*
-     * Return 0 for the cases we can't deal with:
-     *  - 'addr' is only a 20-bit field, so we cannot address beyond 1MB
-     *  - we cannot buffer accesses to guest memory buffers, as the guest
-     *    may expect the memory buffer to be synchronously accessed
-     *  - the count field is usually used with data_is_ptr and since we don't
-     *    support data_is_ptr we do not waste space for the count field either
-     */
-    if ( (p->addr > 0xffffful) || p->data_is_ptr || (p->count != 1) )
-        return 0;
-
-    switch ( p->size )
-    {
-    case 1:
-        bp.size = 0;
-        break;
-    case 2:
-        bp.size = 1;
-        break;
-    case 4:
-        bp.size = 2;
-        break;
-    case 8:
-        bp.size = 3;
-        qw = 1;
-        break;
-    default:
-        gdprintk(XENLOG_WARNING, "unexpected ioreq size: %u\n", p->size);
-        return IOREQ_STATUS_UNHANDLED;
-    }
-
-    spin_lock(&s->bufioreq_lock);
-
-    if ( (pg->ptrs.write_pointer - pg->ptrs.read_pointer) >=
-         (IOREQ_BUFFER_SLOT_NUM - qw) )
-    {
-        /* The queue is full: send the iopacket through the normal path. */
-        spin_unlock(&s->bufioreq_lock);
-        return IOREQ_STATUS_UNHANDLED;
-    }
-
-    pg->buf_ioreq[pg->ptrs.write_pointer % IOREQ_BUFFER_SLOT_NUM] = bp;
-
-    if ( qw )
-    {
-        bp.data = p->data >> 32;
-        pg->buf_ioreq[(pg->ptrs.write_pointer+1) % IOREQ_BUFFER_SLOT_NUM] = bp;
-    }
-
-    /* Make the ioreq_t visible /before/ write_pointer. */
-    smp_wmb();
-    pg->ptrs.write_pointer += qw ? 2 : 1;
-
-    /* Canonicalize read/write pointers to prevent their overflow. */
-    while ( (s->bufioreq_handling == HVM_IOREQSRV_BUFIOREQ_ATOMIC) &&
-            qw++ < IOREQ_BUFFER_SLOT_NUM &&
-            pg->ptrs.read_pointer >= IOREQ_BUFFER_SLOT_NUM )
-    {
-        union bufioreq_pointers old = pg->ptrs, new;
-        unsigned int n = old.read_pointer / IOREQ_BUFFER_SLOT_NUM;
-
-        new.read_pointer = old.read_pointer - n * IOREQ_BUFFER_SLOT_NUM;
-        new.write_pointer = old.write_pointer - n * IOREQ_BUFFER_SLOT_NUM;
-        cmpxchg(&pg->ptrs.full, old.full, new.full);
-    }
-
-    notify_via_xen_event_channel(d, s->bufioreq_evtchn);
-    spin_unlock(&s->bufioreq_lock);
-
-    return IOREQ_STATUS_HANDLED;
-}
-
-int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
-                   bool buffered)
-{
-    struct vcpu *curr = current;
-    struct domain *d = curr->domain;
-    struct hvm_ioreq_vcpu *sv;
-
-    ASSERT(s);
-
-    if ( buffered )
-        return hvm_send_buffered_ioreq(s, proto_p);
-
-    if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
-        return IOREQ_STATUS_RETRY;
-
-    list_for_each_entry ( sv,
-                          &s->ioreq_vcpu_list,
-                          list_entry )
-    {
-        if ( sv->vcpu == curr )
-        {
-            evtchn_port_t port = sv->ioreq_evtchn;
-            ioreq_t *p = get_ioreq(s, curr);
-
-            if ( unlikely(p->state != STATE_IOREQ_NONE) )
-            {
-                gprintk(XENLOG_ERR, "device model set bad IO state %d\n",
-                        p->state);
-                break;
-            }
-
-            if ( unlikely(p->vp_eport != port) )
-            {
-                gprintk(XENLOG_ERR, "device model set bad event channel %d\n",
-                        p->vp_eport);
-                break;
-            }
-
-            proto_p->state = STATE_IOREQ_NONE;
-            proto_p->vp_eport = port;
-            *p = *proto_p;
-
-            prepare_wait_on_xen_event_channel(port);
-
-            /*
-             * Following happens /after/ blocking and setting up ioreq
-             * contents. prepare_wait_on_xen_event_channel() is an implicit
-             * barrier.
-             */
-            p->state = STATE_IOREQ_READY;
-            notify_via_xen_event_channel(d, port);
-
-            sv->pending = true;
-            return IOREQ_STATUS_RETRY;
-        }
-    }
-
-    return IOREQ_STATUS_UNHANDLED;
-}
-
-unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
-{
-    struct domain *d = current->domain;
-    struct hvm_ioreq_server *s;
-    unsigned int id, failed = 0;
-
-    FOR_EACH_IOREQ_SERVER(d, id, s)
-    {
-        if ( !s->enabled )
-            continue;
-
-        if ( hvm_send_ioreq(s, p, buffered) == IOREQ_STATUS_UNHANDLED )
-            failed++;
-    }
-
-    return failed;
-}
-
 static int hvm_access_cf8(
     int dir, unsigned int port, unsigned int bytes, uint32_t *val)
 {
@@ -1579,13 +341,6 @@ void arch_ioreq_domain_init(struct domain *d)
     register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 }
 
-void hvm_ioreq_init(struct domain *d)
-{
-    spin_lock_init(&d->arch.hvm.ioreq_server.lock);
-
-    arch_ioreq_domain_init(d);
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index e267513..fd7cadb 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -27,10 +27,10 @@
  *  can have side effects.
  */
 
+#include <xen/ioreq.h>
 #include <xen/types.h>
 #include <xen/sched.h>
 #include <xen/domain_page.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/hvm/support.h>
 #include <xen/numa.h>
 #include <xen/paging.h>
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 3a37e9e..0ddb6a4 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -19,10 +19,11 @@
  *
  */
 
+#include <xen/ioreq.h>
+
 #include <asm/types.h>
 #include <asm/mtrr.h>
 #include <asm/p2m.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/hvm/vmx/vmx.h>
 #include <asm/hvm/vmx/vvmx.h>
 #include <asm/hvm/nestedhvm.h>
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 79acf20..f6e128e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -100,6 +100,7 @@
  */
 
 #include <xen/init.h>
+#include <xen/ioreq.h>
 #include <xen/kernel.h>
 #include <xen/lib.h>
 #include <xen/mm.h>
@@ -140,7 +141,6 @@
 #include <asm/io_apic.h>
 #include <asm/pci.h>
 #include <asm/guest.h>
-#include <asm/hvm/ioreq.h>
 #include <asm/pv/domain.h>
 #include <asm/pv/mm.h>
 
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 3298711..5012a9c 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -20,6 +20,7 @@
  * along with this program; If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <xen/ioreq.h>
 #include <xen/types.h>
 #include <xen/mm.h>
 #include <xen/trace.h>
@@ -34,7 +35,6 @@
 #include <asm/current.h>
 #include <asm/flushtlb.h>
 #include <asm/shadow.h>
-#include <asm/hvm/ioreq.h>
 #include <xen/numa.h>
 #include "private.h"
 
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 0661328..fa049a6 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -136,6 +136,9 @@ config HYPFS_CONFIG
 	  Disable this option in case you want to spare some memory or you
 	  want to hide the .config contents from dom0.
 
+config IOREQ_SERVER
+	bool
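+	# No prompt here: this option is intended to be selected by code
+	# that provides an ioreq server implementation (e.g. x86 HVM,
+	# elsewhere in this series) rather than set directly by the user.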
+
 config KEXEC
 	bool "kexec support"
 	default y
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 7a4e652..b161381 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_GRANT_TABLE) += grant_table.o
 obj-y += guestcopy.o
 obj-bin-y += gunzip.init.o
 obj-$(CONFIG_HYPFS) += hypfs.o
+obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
 obj-y += irq.o
 obj-y += kernel.o
 obj-y += keyhandler.o
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
new file mode 100644
index 0000000..8a004c4
--- /dev/null
+++ b/xen/common/ioreq.c
@@ -0,0 +1,1290 @@
+/*
+ * ioreq.c: hardware virtual machine I/O emulation
+ *
+ * Copyright (c) 2016 Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/domain.h>
+#include <xen/domain_page.h>
+#include <xen/event.h>
+#include <xen/init.h>
+#include <xen/ioreq.h>
+#include <xen/irq.h>
+#include <xen/lib.h>
+#include <xen/paging.h>
+#include <xen/sched.h>
+#include <xen/softirq.h>
+#include <xen/trace.h>
+#include <xen/vpci.h>
+
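+/*
+ * The arch header is still needed at this stage of the series, for
+ * declarations that have not yet been moved to xen/ioreq.h.
+ */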
+#include <asm/hvm/ioreq.h>
+
+#include <public/hvm/ioreq.h>
+#include <public/hvm/params.h>
+
+static void set_ioreq_server(struct domain *d, unsigned int id,
+                             struct hvm_ioreq_server *s)
+{
+    ASSERT(id < MAX_NR_IOREQ_SERVERS);
+    ASSERT(!s || !d->arch.hvm.ioreq_server.server[id]);
+
+    d->arch.hvm.ioreq_server.server[id] = s;
+}
+
+#define GET_IOREQ_SERVER(d, id) \
+    (d)->arch.hvm.ioreq_server.server[id]
+
+static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
+                                                 unsigned int id)
+{
+    if ( id >= MAX_NR_IOREQ_SERVERS )
+        return NULL;
+
+    return GET_IOREQ_SERVER(d, id);
+}
+
+/*
+ * Iterate over all possible ioreq servers.
+ *
+ * NOTE: The iteration is backwards such that more recently created
+ *       ioreq servers are favoured in hvm_select_ioreq_server().
+ *       This is a semantic that previously existed when ioreq servers
+ *       were held in a linked list.
+ */
+#define FOR_EACH_IOREQ_SERVER(d, id, s) \
+    for ( (id) = MAX_NR_IOREQ_SERVERS; (id) != 0; ) \
+        if ( !(s = GET_IOREQ_SERVER(d, --(id))) ) \
+            continue; \
+        else
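+
+/*
+ * The odd if/else shape above keeps the macro usable as a single
+ * statement: NULL slots are skipped via 'continue', and the loop body
+ * supplied at the use site becomes the 'else' branch.
+ */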
+
+static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
+{
+    shared_iopage_t *p = s->ioreq.va;
+
+    ASSERT((v == current) || !vcpu_runnable(v));
+    ASSERT(p != NULL);
+
+    return &p->vcpu_ioreq[v->vcpu_id];
+}
+
+static struct hvm_ioreq_vcpu *get_pending_vcpu(const struct vcpu *v,
+                                               struct hvm_ioreq_server **srvp)
+{
+    struct domain *d = v->domain;
+    struct hvm_ioreq_server *s;
+    unsigned int id;
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        struct hvm_ioreq_vcpu *sv;
+
+        list_for_each_entry ( sv,
+                              &s->ioreq_vcpu_list,
+                              list_entry )
+        {
+            if ( sv->vcpu == v && sv->pending )
+            {
+                if ( srvp )
+                    *srvp = s;
+                return sv;
+            }
+        }
+    }
+
+    return NULL;
+}
+
+bool hvm_io_pending(struct vcpu *v)
+{
+    return get_pending_vcpu(v, NULL);
+}
+
+static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
+{
+    unsigned int prev_state = STATE_IOREQ_NONE;
+    unsigned int state = p->state;
+    uint64_t data = ~0;
+
+    smp_rmb();
+
+    /*
+     * The only reason we should see this condition be false is when a
+     * dying emulator races with I/O being requested.
+     */
+    while ( likely(state != STATE_IOREQ_NONE) )
+    {
+        if ( unlikely(state < prev_state) )
+        {
+            gdprintk(XENLOG_ERR, "Weird HVM ioreq state transition %u -> %u\n",
+                     prev_state, state);
+            sv->pending = false;
+            domain_crash(sv->vcpu->domain);
+            return false; /* bail */
+        }
+
+        switch ( prev_state = state )
+        {
+        case STATE_IORESP_READY: /* IORESP_READY -> NONE */
+            p->state = STATE_IOREQ_NONE;
+            data = p->data;
+            break;
+
+        case STATE_IOREQ_READY:  /* IOREQ_{READY,INPROCESS} -> IORESP_READY */
+        case STATE_IOREQ_INPROCESS:
+            wait_on_xen_event_channel(sv->ioreq_evtchn,
+                                      ({ state = p->state;
+                                         smp_rmb();
+                                         state != prev_state; }));
+            continue;
+
+        default:
+            gdprintk(XENLOG_ERR, "Weird HVM iorequest state %u\n", state);
+            sv->pending = false;
+            domain_crash(sv->vcpu->domain);
+            return false; /* bail */
+        }
+
+        break;
+    }
+
+    p = &sv->vcpu->arch.hvm.hvm_io.io_req;
+    if ( hvm_ioreq_needs_completion(p) )
+        p->data = data;
+
+    sv->pending = false;
+
+    return true;
+}
+
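+/*
+ * Called on the way back into guest context: wait for any I/O pending
+ * on this vCPU, then run whichever completion handler was recorded for
+ * the request. Returning false means the vCPU cannot resume yet.
+ */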
+bool handle_hvm_io_completion(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct hvm_vcpu_io *vio = &v->arch.hvm.hvm_io;
+    struct hvm_ioreq_server *s;
+    struct hvm_ioreq_vcpu *sv;
+    enum hvm_io_completion io_completion;
+
+    if ( has_vpci(d) && vpci_process_pending(v) )
+    {
+        raise_softirq(SCHEDULE_SOFTIRQ);
+        return false;
+    }
+
+    sv = get_pending_vcpu(v, &s);
+    if ( sv && !hvm_wait_for_io(sv, get_ioreq(s, v)) )
+        return false;
+
+    vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
+        STATE_IORESP_READY : STATE_IOREQ_NONE;
+
+    msix_write_completion(v);
+    vcpu_end_shutdown_deferral(v);
+
+    io_completion = vio->io_completion;
+    vio->io_completion = HVMIO_no_completion;
+
+    switch ( io_completion )
+    {
+    case HVMIO_no_completion:
+        break;
+
+    case HVMIO_mmio_completion:
+        return arch_ioreq_complete_mmio();
+
+    case HVMIO_pio_completion:
+        return handle_pio(vio->io_req.addr, vio->io_req.size,
+                          vio->io_req.dir);
+
+    default:
+        return arch_vcpu_ioreq_completion(io_completion);
+    }
+
+    return true;
+}
+
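+/*
+ * Two models exist for providing the ioreq pages: the legacy one maps
+ * guest frames nominated via HVM params (the
+ * arch_ioreq_server_map_pages() path), while the newer one allocates
+ * pages here for the emulator to acquire via XENMEM_acquire_resource.
+ * The two are mutually exclusive, hence the -EPERM below.
+ */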
+static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct page_info *page;
+
+    if ( iorp->page )
+    {
+        /*
+         * If a guest frame has already been mapped (which may happen
+         * on demand if hvm_get_ioreq_server_info() is called), then
+         * allocating a page is not permitted.
+         */
+        if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
+            return -EPERM;
+
+        return 0;
+    }
+
+    page = alloc_domheap_page(s->target, MEMF_no_refcount);
+
+    if ( !page )
+        return -ENOMEM;
+
+    if ( !get_page_and_type(page, s->target, PGT_writable_page) )
+    {
+        /*
+         * The domain can't possibly know about this page yet, so failure
+         * here is a clear indication of something fishy going on.
+         */
+        domain_crash(s->emulator);
+        return -ENODATA;
+    }
+
+    iorp->va = __map_domain_page_global(page);
+    if ( !iorp->va )
+        goto fail;
+
+    iorp->page = page;
+    clear_page(iorp->va);
+    return 0;
+
+ fail:
+    put_page_alloc_ref(page);
+    put_page_and_type(page);
+
+    return -ENOMEM;
+}
+
+static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct page_info *page = iorp->page;
+
+    if ( !page )
+        return;
+
+    iorp->page = NULL;
+
+    unmap_domain_page_global(iorp->va);
+    iorp->va = NULL;
+
+    put_page_alloc_ref(page);
+    put_page_and_type(page);
+}
+
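+/*
+ * Helper for code (such as the x86 page-type machinery) that needs to
+ * know whether a page is in use as one of a server's ioreq pages.
+ */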
+bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
+{
+    const struct hvm_ioreq_server *s;
+    unsigned int id;
+    bool found = false;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
+        {
+            found = true;
+            break;
+        }
+    }
+
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return found;
+}
+
+static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
+                                    struct hvm_ioreq_vcpu *sv)
+{
+    ASSERT(spin_is_locked(&s->lock));
+
+    if ( s->ioreq.va != NULL )
+    {
+        ioreq_t *p = get_ioreq(s, sv->vcpu);
+
+        p->vp_eport = sv->ioreq_evtchn;
+    }
+}
+
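+/*
+ * Each vCPU gets its own synchronous event channel; the buffered
+ * channel is shared and is allocated alongside vCPU 0.
+ */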
+static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
+                                     struct vcpu *v)
+{
+    struct hvm_ioreq_vcpu *sv;
+    int rc;
+
+    sv = xzalloc(struct hvm_ioreq_vcpu);
+
+    rc = -ENOMEM;
+    if ( !sv )
+        goto fail1;
+
+    spin_lock(&s->lock);
+
+    rc = alloc_unbound_xen_event_channel(v->domain, v->vcpu_id,
+                                         s->emulator->domain_id, NULL);
+    if ( rc < 0 )
+        goto fail2;
+
+    sv->ioreq_evtchn = rc;
+
+    if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
+    {
+        rc = alloc_unbound_xen_event_channel(v->domain, 0,
+                                             s->emulator->domain_id, NULL);
+        if ( rc < 0 )
+            goto fail3;
+
+        s->bufioreq_evtchn = rc;
+    }
+
+    sv->vcpu = v;
+
+    list_add(&sv->list_entry, &s->ioreq_vcpu_list);
+
+    if ( s->enabled )
+        hvm_update_ioreq_evtchn(s, sv);
+
+    spin_unlock(&s->lock);
+    return 0;
+
+ fail3:
+    free_xen_event_channel(v->domain, sv->ioreq_evtchn);
+
+ fail2:
+    spin_unlock(&s->lock);
+    xfree(sv);
+
+ fail1:
+    return rc;
+}
+
+static void hvm_ioreq_server_remove_vcpu(struct hvm_ioreq_server *s,
+                                         struct vcpu *v)
+{
+    struct hvm_ioreq_vcpu *sv;
+
+    spin_lock(&s->lock);
+
+    list_for_each_entry ( sv,
+                          &s->ioreq_vcpu_list,
+                          list_entry )
+    {
+        if ( sv->vcpu != v )
+            continue;
+
+        list_del(&sv->list_entry);
+
+        if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
+            free_xen_event_channel(v->domain, s->bufioreq_evtchn);
+
+        free_xen_event_channel(v->domain, sv->ioreq_evtchn);
+
+        xfree(sv);
+        break;
+    }
+
+    spin_unlock(&s->lock);
+}
+
+static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
+{
+    struct hvm_ioreq_vcpu *sv, *next;
+
+    spin_lock(&s->lock);
+
+    list_for_each_entry_safe ( sv,
+                               next,
+                               &s->ioreq_vcpu_list,
+                               list_entry )
+    {
+        struct vcpu *v = sv->vcpu;
+
+        list_del(&sv->list_entry);
+
+        if ( v->vcpu_id == 0 && HANDLE_BUFIOREQ(s) )
+            free_xen_event_channel(v->domain, s->bufioreq_evtchn);
+
+        free_xen_event_channel(v->domain, sv->ioreq_evtchn);
+
+        xfree(sv);
+    }
+
+    spin_unlock(&s->lock);
+}
+
+static int hvm_ioreq_server_alloc_pages(struct hvm_ioreq_server *s)
+{
+    int rc;
+
+    rc = hvm_alloc_ioreq_mfn(s, false);
+
+    if ( !rc && (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF) )
+        rc = hvm_alloc_ioreq_mfn(s, true);
+
+    if ( rc )
+        hvm_free_ioreq_mfn(s, false);
+
+    return rc;
+}
+
+static void hvm_ioreq_server_free_pages(struct hvm_ioreq_server *s)
+{
+    hvm_free_ioreq_mfn(s, true);
+    hvm_free_ioreq_mfn(s, false);
+}
+
+static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
+{
+    unsigned int i;
+
+    for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
+        rangeset_destroy(s->range[i]);
+}
+
+static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
+                                            ioservid_t id)
+{
+    unsigned int i;
+    int rc;
+
+    for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
+    {
+        char *name;
+
+        rc = asprintf(&name, "ioreq_server %d %s", id,
+                      (i == XEN_DMOP_IO_RANGE_PORT) ? "port" :
+                      (i == XEN_DMOP_IO_RANGE_MEMORY) ? "memory" :
+                      (i == XEN_DMOP_IO_RANGE_PCI) ? "pci" :
+                      "");
+        if ( rc )
+            goto fail;
+
+        s->range[i] = rangeset_new(s->target, name,
+                                   RANGESETF_prettyprint_hex);
+
+        xfree(name);
+
+        rc = -ENOMEM;
+        if ( !s->range[i] )
+            goto fail;
+
+        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
+    }
+
+    return 0;
+
+ fail:
+    hvm_ioreq_server_free_rangesets(s);
+
+    return rc;
+}
+
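+/*
+ * Enabling marks the server live; the arch hook (on x86) pulls the
+ * legacy ioreq GFNs out of the guest's p2m, and the event channel
+ * ports are (re)written into the shared page for each vCPU.
+ */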
+static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
+{
+    struct hvm_ioreq_vcpu *sv;
+
+    spin_lock(&s->lock);
+
+    if ( s->enabled )
+        goto done;
+
+    arch_ioreq_server_enable(s);
+
+    s->enabled = true;
+
+    list_for_each_entry ( sv,
+                          &s->ioreq_vcpu_list,
+                          list_entry )
+        hvm_update_ioreq_evtchn(s, sv);
+
+  done:
+    spin_unlock(&s->lock);
+}
+
+static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
+{
+    spin_lock(&s->lock);
+
+    if ( !s->enabled )
+        goto done;
+
+    arch_ioreq_server_disable(s);
+
+    s->enabled = false;
+
+ done:
+    spin_unlock(&s->lock);
+}
+
+static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
+                                 struct domain *d, int bufioreq_handling,
+                                 ioservid_t id)
+{
+    struct domain *currd = current->domain;
+    struct vcpu *v;
+    int rc;
+
+    s->target = d;
+
+    get_knownalive_domain(currd);
+    s->emulator = currd;
+
+    spin_lock_init(&s->lock);
+    INIT_LIST_HEAD(&s->ioreq_vcpu_list);
+    spin_lock_init(&s->bufioreq_lock);
+
+    s->ioreq.gfn = INVALID_GFN;
+    s->bufioreq.gfn = INVALID_GFN;
+
+    rc = hvm_ioreq_server_alloc_rangesets(s, id);
+    if ( rc )
+        return rc;
+
+    s->bufioreq_handling = bufioreq_handling;
+
+    for_each_vcpu ( d, v )
+    {
+        rc = hvm_ioreq_server_add_vcpu(s, v);
+        if ( rc )
+            goto fail_add;
+    }
+
+    return 0;
+
+ fail_add:
+    hvm_ioreq_server_remove_all_vcpus(s);
+    arch_ioreq_server_unmap_pages(s);
+
+    hvm_ioreq_server_free_rangesets(s);
+
+    put_domain(s->emulator);
+    return rc;
+}
+
+static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
+{
+    ASSERT(!s->enabled);
+    hvm_ioreq_server_remove_all_vcpus(s);
+
+    /*
+     * NOTE: It is safe to call both arch_ioreq_server_unmap_pages() and
+     *       hvm_ioreq_server_free_pages() in that order.
+     *       This is because the former will do nothing if the pages
+     *       are not mapped, leaving the page to be freed by the latter.
+     *       However, if the pages are mapped, then the former will set
+     *       the page_info pointer to NULL, meaning the latter will do
+     *       nothing.
+     */
+    arch_ioreq_server_unmap_pages(s);
+    hvm_ioreq_server_free_pages(s);
+
+    hvm_ioreq_server_free_rangesets(s);
+
+    put_domain(s->emulator);
+}
+
+int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
+                            ioservid_t *id)
+{
+    struct hvm_ioreq_server *s;
+    unsigned int i;
+    int rc;
+
+    if ( bufioreq_handling > HVM_IOREQSRV_BUFIOREQ_ATOMIC )
+        return -EINVAL;
+
+    s = xzalloc(struct hvm_ioreq_server);
+    if ( !s )
+        return -ENOMEM;
+
+    domain_pause(d);
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    for ( i = 0; i < MAX_NR_IOREQ_SERVERS; i++ )
+    {
+        if ( !GET_IOREQ_SERVER(d, i) )
+            break;
+    }
+
+    rc = -ENOSPC;
+    if ( i >= MAX_NR_IOREQ_SERVERS )
+        goto fail;
+
+    /*
+     * It is safe to call set_ioreq_server() prior to
+     * hvm_ioreq_server_init() since the target domain is paused.
+     */
+    set_ioreq_server(d, i, s);
+
+    rc = hvm_ioreq_server_init(s, d, bufioreq_handling, i);
+    if ( rc )
+    {
+        set_ioreq_server(d, i, NULL);
+        goto fail;
+    }
+
+    if ( id )
+        *id = i;
+
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    domain_unpause(d);
+
+    return 0;
+
+ fail:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    domain_unpause(d);
+
+    xfree(s);
+    return rc;
+}
+
+int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
+
+    rc = -EPERM;
+    if ( s->emulator != current->domain )
+        goto out;
+
+    domain_pause(d);
+
+    arch_ioreq_server_destroy(s);
+
+    hvm_ioreq_server_disable(s);
+
+    /*
+     * It is safe to call hvm_ioreq_server_deinit() prior to
+     * set_ioreq_server() since the target domain is paused.
+     */
+    hvm_ioreq_server_deinit(s);
+    set_ioreq_server(d, id, NULL);
+
+    domain_unpause(d);
+
+    xfree(s);
+
+    rc = 0;
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return rc;
+}
+
+int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
+                              unsigned long *ioreq_gfn,
+                              unsigned long *bufioreq_gfn,
+                              evtchn_port_t *bufioreq_port)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
+
+    rc = -EPERM;
+    if ( s->emulator != current->domain )
+        goto out;
+
+    if ( ioreq_gfn || bufioreq_gfn )
+    {
+        rc = arch_ioreq_server_map_pages(s);
+        if ( rc )
+            goto out;
+    }
+
+    if ( ioreq_gfn )
+        *ioreq_gfn = gfn_x(s->ioreq.gfn);
+
+    if ( HANDLE_BUFIOREQ(s) )
+    {
+        if ( bufioreq_gfn )
+            *bufioreq_gfn = gfn_x(s->bufioreq.gfn);
+
+        if ( bufioreq_port )
+            *bufioreq_port = s->bufioreq_evtchn;
+    }
+
+    rc = 0;
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return rc;
+}
+
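+/*
+ * Resource-mapping counterpart of hvm_get_ioreq_server_info(): hands
+ * back the MFN of an ioreq page so that the emulator can map it via
+ * XENMEM_acquire_resource rather than through legacy guest frames.
+ */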
+int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
+                               unsigned long idx, mfn_t *mfn)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    ASSERT(is_hvm_domain(d));
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
+
+    rc = -EPERM;
+    if ( s->emulator != current->domain )
+        goto out;
+
+    rc = hvm_ioreq_server_alloc_pages(s);
+    if ( rc )
+        goto out;
+
+    switch ( idx )
+    {
+    case XENMEM_resource_ioreq_server_frame_bufioreq:
+        rc = -ENOENT;
+        if ( !HANDLE_BUFIOREQ(s) )
+            goto out;
+
+        *mfn = page_to_mfn(s->bufioreq.page);
+        rc = 0;
+        break;
+
+    case XENMEM_resource_ioreq_server_frame_ioreq(0):
+        *mfn = page_to_mfn(s->ioreq.page);
+        rc = 0;
+        break;
+
+    default:
+        rc = -EINVAL;
+        break;
+    }
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return rc;
+}
+
+int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint64_t start,
+                                     uint64_t end)
+{
+    struct hvm_ioreq_server *s;
+    struct rangeset *r;
+    int rc;
+
+    if ( start > end )
+        return -EINVAL;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
+
+    rc = -EPERM;
+    if ( s->emulator != current->domain )
+        goto out;
+
+    switch ( type )
+    {
+    case XEN_DMOP_IO_RANGE_PORT:
+    case XEN_DMOP_IO_RANGE_MEMORY:
+    case XEN_DMOP_IO_RANGE_PCI:
+        r = s->range[type];
+        break;
+
+    default:
+        r = NULL;
+        break;
+    }
+
+    rc = -EINVAL;
+    if ( !r )
+        goto out;
+
+    rc = -EEXIST;
+    if ( rangeset_overlaps_range(r, start, end) )
+        goto out;
+
+    rc = rangeset_add_range(r, start, end);
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return rc;
+}
+
+int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
+                                         uint32_t type, uint64_t start,
+                                         uint64_t end)
+{
+    struct hvm_ioreq_server *s;
+    struct rangeset *r;
+    int rc;
+
+    if ( start > end )
+        return -EINVAL;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
+
+    rc = -EPERM;
+    if ( s->emulator != current->domain )
+        goto out;
+
+    switch ( type )
+    {
+    case XEN_DMOP_IO_RANGE_PORT:
+    case XEN_DMOP_IO_RANGE_MEMORY:
+    case XEN_DMOP_IO_RANGE_PCI:
+        r = s->range[type];
+        break;
+
+    default:
+        r = NULL;
+        break;
+    }
+
+    rc = -EINVAL;
+    if ( !r )
+        goto out;
+
+    rc = -ENOENT;
+    if ( !rangeset_contains_range(r, start, end) )
+        goto out;
+
+    rc = rangeset_remove_range(r, start, end);
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return rc;
+}
+
+/*
+ * Map or unmap an ioreq server to a specific memory type. For now only
+ * HVMMEM_ioreq_server is supported, and in the future new types can be
+ * introduced, e.g. HVMMEM_ioreq_serverX mapped to ioreq server X.
+ * Currently only write operations are forwarded to an ioreq server;
+ * support for the emulation of read operations can be added when an
+ * ioreq server has such a requirement in the future.
+ */
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint32_t flags)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    if ( type != HVMMEM_ioreq_server )
+        return -EINVAL;
+
+    if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
+        return -EINVAL;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
+
+    rc = -EPERM;
+    if ( s->emulator != current->domain )
+        goto out;
+
+    rc = arch_ioreq_server_map_mem_type(d, s, flags);
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    if ( rc == 0 )
+        arch_ioreq_server_map_mem_type_completed(d, s, flags);
+
+    return rc;
+}
+
+int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
+                               bool enabled)
+{
+    struct hvm_ioreq_server *s;
+    int rc;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    s = get_ioreq_server(d, id);
+
+    rc = -ENOENT;
+    if ( !s )
+        goto out;
+
+    rc = -EPERM;
+    if ( s->emulator != current->domain )
+        goto out;
+
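+    /* Pause the target so no vcpu is mid-I/O while the state changes. */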
+    domain_pause(d);
+
+    if ( enabled )
+        hvm_ioreq_server_enable(s);
+    else
+        hvm_ioreq_server_disable(s);
+
+    domain_unpause(d);
+
+    rc = 0;
+
+ out:
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    return rc;
+}
+
+int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
+{
+    struct hvm_ioreq_server *s;
+    unsigned int id;
+    int rc;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        rc = hvm_ioreq_server_add_vcpu(s, v);
+        if ( rc )
+            goto fail;
+    }
+
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return 0;
+
+ fail:
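+    /*
+     * FOR_EACH_IOREQ_SERVER() iterates ids downwards, so the servers
+     * already given a vcpu are those with ids above the failing one.
+     */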
+    while ( ++id != MAX_NR_IOREQ_SERVERS )
+    {
+        s = GET_IOREQ_SERVER(d, id);
+
+        if ( !s )
+            continue;
+
+        hvm_ioreq_server_remove_vcpu(s, v);
+    }
+
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    return rc;
+}
+
+void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
+{
+    struct hvm_ioreq_server *s;
+    unsigned int id;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+        hvm_ioreq_server_remove_vcpu(s, v);
+
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+}
+
+void hvm_destroy_all_ioreq_servers(struct domain *d)
+{
+    struct hvm_ioreq_server *s;
+    unsigned int id;
+
+    if ( !arch_ioreq_server_destroy_all(d) )
+        return;
+
+    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+
+    /* No need to domain_pause() as the domain is being torn down */
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        hvm_ioreq_server_disable(s);
+
+        /*
+         * It is safe to call hvm_ioreq_server_deinit() prior to
+         * set_ioreq_server() since the target domain is being destroyed.
+         */
+        hvm_ioreq_server_deinit(s);
+        set_ioreq_server(d, id, NULL);
+
+        xfree(s);
+    }
+
+    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+}
+
+struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
+                                                 ioreq_t *p)
+{
+    struct hvm_ioreq_server *s;
+    uint8_t type;
+    uint64_t addr;
+    unsigned int id;
+
+    if ( !arch_ioreq_server_get_type_addr(d, p, &type, &addr) )
+        return NULL;
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        struct rangeset *r;
+
+        if ( !s->enabled )
+            continue;
+
+        r = s->range[type];
+
+        switch ( type )
+        {
+            unsigned long start, end;
+
+        case XEN_DMOP_IO_RANGE_PORT:
+            start = addr;
+            end = start + p->size - 1;
+            if ( rangeset_contains_range(r, start, end) )
+                return s;
+
+            break;
+
+        case XEN_DMOP_IO_RANGE_MEMORY:
+            start = hvm_mmio_first_byte(p);
+            end = hvm_mmio_last_byte(p);
+
+            if ( rangeset_contains_range(r, start, end) )
+                return s;
+
+            break;
+
+        case XEN_DMOP_IO_RANGE_PCI:
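+            /*
+             * arch_ioreq_server_get_type_addr() encodes the PCI SBDF
+             * in the upper 32 bits of addr for config space accesses.
+             */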
+            if ( rangeset_contains_singleton(r, addr >> 32) )
+            {
+                p->type = IOREQ_TYPE_PCI_CONFIG;
+                p->addr = addr;
+                return s;
+            }
+
+            break;
+        }
+    }
+
+    return NULL;
+}
+
+static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
+{
+    struct domain *d = current->domain;
+    struct hvm_ioreq_page *iorp;
+    buffered_iopage_t *pg;
+    buf_ioreq_t bp = { .data = p->data,
+                       .addr = p->addr,
+                       .type = p->type,
+                       .dir = p->dir };
+    /* Timeoffset sends 64b data, but no address. Use two consecutive slots. */
+    int qw = 0;
+
+    /* Ensure buffered_iopage fits in a page */
+    BUILD_BUG_ON(sizeof(buffered_iopage_t) > PAGE_SIZE);
+
+    iorp = &s->bufioreq;
+    pg = iorp->va;
+
+    if ( !pg )
+        return IOREQ_STATUS_UNHANDLED;
+
+    /*
+     * Return 0 for the cases we can't deal with:
+     *  - 'addr' is only a 20-bit field, so we cannot address beyond 1MB
+     *  - we cannot buffer accesses to guest memory buffers, as the guest
+     *    may expect the memory buffer to be synchronously accessed
+     *  - the count field is usually used with data_is_ptr and since we don't
+     *    support data_is_ptr we do not waste space for the count field either
+     */
+    if ( (p->addr > 0xffffful) || p->data_is_ptr || (p->count != 1) )
+        return 0;
+
+    switch ( p->size )
+    {
+    case 1:
+        bp.size = 0;
+        break;
+    case 2:
+        bp.size = 1;
+        break;
+    case 4:
+        bp.size = 2;
+        break;
+    case 8:
+        bp.size = 3;
+        qw = 1;
+        break;
+    default:
+        gdprintk(XENLOG_WARNING, "unexpected ioreq size: %u\n", p->size);
+        return IOREQ_STATUS_UNHANDLED;
+    }
+
+    spin_lock(&s->bufioreq_lock);
+
+    if ( (pg->ptrs.write_pointer - pg->ptrs.read_pointer) >=
+         (IOREQ_BUFFER_SLOT_NUM - qw) )
+    {
+        /* The queue is full: send the iopacket through the normal path. */
+        spin_unlock(&s->bufioreq_lock);
+        return IOREQ_STATUS_UNHANDLED;
+    }
+
+    pg->buf_ioreq[pg->ptrs.write_pointer % IOREQ_BUFFER_SLOT_NUM] = bp;
+
+    if ( qw )
+    {
+        bp.data = p->data >> 32;
+        pg->buf_ioreq[(pg->ptrs.write_pointer+1) % IOREQ_BUFFER_SLOT_NUM] = bp;
+    }
+
+    /* Make the ioreq_t visible /before/ write_pointer. */
+    smp_wmb();
+    pg->ptrs.write_pointer += qw ? 2 : 1;
+
+    /* Canonicalize read/write pointers to prevent their overflow. */
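+    /*
+     * Both pointers are reduced by the same multiple of
+     * IOREQ_BUFFER_SLOT_NUM, preserving the distance between them;
+     * cmpxchg() copes with the emulator advancing read_pointer
+     * concurrently.
+     */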
+    while ( (s->bufioreq_handling == HVM_IOREQSRV_BUFIOREQ_ATOMIC) &&
+            qw++ < IOREQ_BUFFER_SLOT_NUM &&
+            pg->ptrs.read_pointer >= IOREQ_BUFFER_SLOT_NUM )
+    {
+        union bufioreq_pointers old = pg->ptrs, new;
+        unsigned int n = old.read_pointer / IOREQ_BUFFER_SLOT_NUM;
+
+        new.read_pointer = old.read_pointer - n * IOREQ_BUFFER_SLOT_NUM;
+        new.write_pointer = old.write_pointer - n * IOREQ_BUFFER_SLOT_NUM;
+        cmpxchg(&pg->ptrs.full, old.full, new.full);
+    }
+
+    notify_via_xen_event_channel(d, s->bufioreq_evtchn);
+    spin_unlock(&s->bufioreq_lock);
+
+    return IOREQ_STATUS_HANDLED;
+}
+
+int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
+                   bool buffered)
+{
+    struct vcpu *curr = current;
+    struct domain *d = curr->domain;
+    struct hvm_ioreq_vcpu *sv;
+
+    ASSERT(s);
+
+    if ( buffered )
+        return hvm_send_buffered_ioreq(s, proto_p);
+
+    if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
+        return IOREQ_STATUS_RETRY;
+
+    list_for_each_entry ( sv,
+                          &s->ioreq_vcpu_list,
+                          list_entry )
+    {
+        if ( sv->vcpu == curr )
+        {
+            evtchn_port_t port = sv->ioreq_evtchn;
+            ioreq_t *p = get_ioreq(s, curr);
+
+            if ( unlikely(p->state != STATE_IOREQ_NONE) )
+            {
+                gprintk(XENLOG_ERR, "device model set bad IO state %d\n",
+                        p->state);
+                break;
+            }
+
+            if ( unlikely(p->vp_eport != port) )
+            {
+                gprintk(XENLOG_ERR, "device model set bad event channel %d\n",
+                        p->vp_eport);
+                break;
+            }
+
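+            /*
+             * Copy the request across with state still NONE; it is
+             * only flipped to READY once the vcpu is set up to wait.
+             */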
+            proto_p->state = STATE_IOREQ_NONE;
+            proto_p->vp_eport = port;
+            *p = *proto_p;
+
+            prepare_wait_on_xen_event_channel(port);
+
+            /*
+             * Following happens /after/ blocking and setting up ioreq
+             * contents. prepare_wait_on_xen_event_channel() is an implicit
+             * barrier.
+             */
+            p->state = STATE_IOREQ_READY;
+            notify_via_xen_event_channel(d, port);
+
+            sv->pending = true;
+            return IOREQ_STATUS_RETRY;
+        }
+    }
+
+    return IOREQ_STATUS_UNHANDLED;
+}
+
+unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
+{
+    struct domain *d = current->domain;
+    struct hvm_ioreq_server *s;
+    unsigned int id, failed = 0;
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+    {
+        if ( !s->enabled )
+            continue;
+
+        if ( hvm_send_ioreq(s, p, buffered) == IOREQ_STATUS_UNHANDLED )
+            failed++;
+    }
+
+    return failed;
+}
+
+void hvm_ioreq_init(struct domain *d)
+{
+    spin_lock_init(&d->arch.hvm.ioreq_server.lock);
+
+    arch_ioreq_domain_init(d);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
index 0e64e76..9b2eb6f 100644
--- a/xen/include/asm-x86/hvm/ioreq.h
+++ b/xen/include/asm-x86/hvm/ioreq.h
@@ -19,65 +19,6 @@
 #ifndef __ASM_X86_HVM_IOREQ_H__
 #define __ASM_X86_HVM_IOREQ_H__
 
-#define HANDLE_BUFIOREQ(s) \
-    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
-
-bool hvm_io_pending(struct vcpu *v);
-bool handle_hvm_io_completion(struct vcpu *v);
-bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
-
-int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
-                            ioservid_t *id);
-int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id);
-int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
-                              unsigned long *ioreq_gfn,
-                              unsigned long *bufioreq_gfn,
-                              evtchn_port_t *bufioreq_port);
-int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
-                               unsigned long idx, mfn_t *mfn);
-int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint64_t start,
-                                     uint64_t end);
-int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
-                                         uint32_t type, uint64_t start,
-                                         uint64_t end);
-int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint32_t flags);
-int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
-                               bool enabled);
-
-int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v);
-void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v);
-void hvm_destroy_all_ioreq_servers(struct domain *d);
-
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p);
-int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
-                   bool buffered);
-unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
-
-void hvm_ioreq_init(struct domain *d);
-
-bool arch_ioreq_complete_mmio(void);
-bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
-int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
-void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
-void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
-void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
-void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
-int arch_ioreq_server_map_mem_type(struct domain *d,
-                                   struct hvm_ioreq_server *s,
-                                   uint32_t flags);
-void arch_ioreq_server_map_mem_type_completed(struct domain *d,
-                                              struct hvm_ioreq_server *s,
-                                              uint32_t flags);
-bool arch_ioreq_server_destroy_all(struct domain *d);
-bool arch_ioreq_server_get_type_addr(const struct domain *d,
-                                     const ioreq_t *p,
-                                     uint8_t *type,
-                                     uint64_t *addr);
-void arch_ioreq_domain_init(struct domain *d);
-
 /* This correlation must not be altered */
 #define IOREQ_STATUS_HANDLED     X86EMUL_OKAY
 #define IOREQ_STATUS_UNHANDLED   X86EMUL_UNHANDLEABLE
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
new file mode 100644
index 0000000..7b67950
--- /dev/null
+++ b/xen/include/xen/ioreq.h
@@ -0,0 +1,93 @@
+/*
+ * ioreq.h: Hardware virtual machine assist interface definitions.
+ *
+ * Copyright (c) 2016 Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __XEN_IOREQ_H__
+#define __XEN_IOREQ_H__
+
+#include <xen/sched.h>
+
+#define HANDLE_BUFIOREQ(s) \
+    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
+
+bool hvm_io_pending(struct vcpu *v);
+bool handle_hvm_io_completion(struct vcpu *v);
+bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
+
+int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
+                            ioservid_t *id);
+int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id);
+int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
+                              unsigned long *ioreq_gfn,
+                              unsigned long *bufioreq_gfn,
+                              evtchn_port_t *bufioreq_port);
+int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
+                               unsigned long idx, mfn_t *mfn);
+int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint64_t start,
+                                     uint64_t end);
+int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
+                                         uint32_t type, uint64_t start,
+                                         uint64_t end);
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint32_t flags);
+int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
+                               bool enabled);
+
+int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v);
+void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v);
+void hvm_destroy_all_ioreq_servers(struct domain *d);
+
+struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
+                                                 ioreq_t *p);
+int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
+                   bool buffered);
+unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
+
+void hvm_ioreq_init(struct domain *d);
+
+bool arch_ioreq_complete_mmio(void);
+bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
+int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
+void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
+void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
+void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
+void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
+int arch_ioreq_server_map_mem_type(struct domain *d,
+                                   struct hvm_ioreq_server *s,
+                                   uint32_t flags);
+void arch_ioreq_server_map_mem_type_completed(struct domain *d,
+                                              struct hvm_ioreq_server *s,
+                                              uint32_t flags);
+bool arch_ioreq_server_destroy_all(struct domain *d);
+bool arch_ioreq_server_get_type_addr(const struct domain *d,
+                                     const ioreq_t *p,
+                                     uint8_t *type,
+                                     uint64_t *addr);
+void arch_ioreq_domain_init(struct domain *d);
+
+#endif /* __XEN_IOREQ_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (3 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 15:25   ` Julien Grall
  2021-01-20  8:48   ` Alex Bennée
  2021-01-12 21:52 ` [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common Oleksandr Tyshchenko
                   ` (19 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

IOREQ is a common feature now and this helper will be used
on Arm as is. Move it to xen/ioreq.h and remove the "hvm" prefix.

Although PIO handling on Arm is not introduced with the current series
(it will be implemented when we add support for vPCI), PIOs do
technically exist on Arm (they are simply accessed the same way as MMIO),
so it is better not to diverge now.
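
To illustrate the semantics (a sketch, not part of the diff): a completed
PIO read still needs its result copied back into the vcpu, while a PIO
write does not:

    ioreq_t p = { .state = STATE_IOREQ_READY, .data_is_ptr = 0,
                  .type = IOREQ_TYPE_PIO, .dir = IOREQ_READ };
    ioreq_needs_completion(&p);   /* true: read data must be consumed */
    p.dir = IOREQ_WRITE;
    ioreq_needs_completion(&p);   /* false: nothing to copy back */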

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch, was split from:
     "[RFC PATCH V1 01/12] hvm/ioreq: Make x86's IOREQ feature common"

Changes V1 -> V2:
   - remove "hvm" prefix

Changes V2 -> V3:
   - add Paul's R-b

Changes V3 -> V4:
   - add Jan's A-b
---
 xen/arch/x86/hvm/emulate.c     | 4 ++--
 xen/arch/x86/hvm/io.c          | 2 +-
 xen/common/ioreq.c             | 4 ++--
 xen/include/asm-x86/hvm/vcpu.h | 7 -------
 xen/include/xen/ioreq.h        | 7 +++++++
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 60ca465..c3487b5 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -336,7 +336,7 @@ static int hvmemul_do_io(
             rc = hvm_send_ioreq(s, &p, 0);
             if ( rc != X86EMUL_RETRY || currd->is_shutting_down )
                 vio->io_req.state = STATE_IOREQ_NONE;
-            else if ( !hvm_ioreq_needs_completion(&vio->io_req) )
+            else if ( !ioreq_needs_completion(&vio->io_req) )
                 rc = X86EMUL_OKAY;
         }
         break;
@@ -2649,7 +2649,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     if ( rc == X86EMUL_OKAY && vio->mmio_retry )
         rc = X86EMUL_RETRY;
 
-    if ( !hvm_ioreq_needs_completion(&vio->io_req) )
+    if ( !ioreq_needs_completion(&vio->io_req) )
         completion = HVMIO_no_completion;
     else if ( completion == HVMIO_no_completion )
         completion = (vio->io_req.type != IOREQ_TYPE_PIO ||
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 11e007d..ef8286b 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -135,7 +135,7 @@ bool handle_pio(uint16_t port, unsigned int size, int dir)
 
     rc = hvmemul_do_pio_buffer(port, size, dir, &data);
 
-    if ( hvm_ioreq_needs_completion(&vio->io_req) )
+    if ( ioreq_needs_completion(&vio->io_req) )
         vio->io_completion = HVMIO_pio_completion;
 
     switch ( rc )
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index 8a004c4..47e38b6 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -160,7 +160,7 @@ static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
     }
 
     p = &sv->vcpu->arch.hvm.hvm_io.io_req;
-    if ( hvm_ioreq_needs_completion(p) )
+    if ( ioreq_needs_completion(p) )
         p->data = data;
 
     sv->pending = false;
@@ -186,7 +186,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
     if ( sv && !hvm_wait_for_io(sv, get_ioreq(s, v)) )
         return false;
 
-    vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
+    vio->io_req.state = ioreq_needs_completion(&vio->io_req) ?
         STATE_IORESP_READY : STATE_IOREQ_NONE;
 
     msix_write_completion(v);
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 5ccd075..6c1feda 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -91,13 +91,6 @@ struct hvm_vcpu_io {
     const struct g2m_ioport *g2m_ioport;
 };
 
-static inline bool hvm_ioreq_needs_completion(const ioreq_t *ioreq)
-{
-    return ioreq->state == STATE_IOREQ_READY &&
-           !ioreq->data_is_ptr &&
-           (ioreq->type != IOREQ_TYPE_PIO || ioreq->dir != IOREQ_WRITE);
-}
-
 struct nestedvcpu {
     bool_t nv_guestmode; /* vcpu in guestmode? */
     void *nv_vvmcx; /* l1 guest virtual VMCB/VMCS */
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index 7b67950..750d884 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -21,6 +21,13 @@
 
 #include <xen/sched.h>
 
+static inline bool ioreq_needs_completion(const ioreq_t *ioreq)
+{
+    return ioreq->state == STATE_IOREQ_READY &&
+           !ioreq->data_is_ptr &&
+           (ioreq->type != IOREQ_TYPE_PIO || ioreq->dir != IOREQ_WRITE);
+}
+
 #define HANDLE_BUFIOREQ(s) \
     ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
 
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (4 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 15:34   ` Julien Grall
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common Oleksandr Tyshchenko
                   ` (18 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

IOREQ is a common feature now and these helpers will be used
on Arm as is. Move them to xen/ioreq.h and replace the "hvm" prefixes
with "ioreq".

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch

Changes V1 -> V2:
   - replace "hvm" prefix by "ioreq"

Changes V2 -> V3:
   - add Paul's R-b

Changes V3 -> V4:
   - add Jan's A-b
---
 xen/arch/x86/hvm/intercept.c |  5 +++--
 xen/arch/x86/hvm/stdvga.c    |  4 ++--
 xen/common/ioreq.c           |  4 ++--
 xen/include/asm-x86/hvm/io.h | 16 ----------------
 xen/include/xen/ioreq.h      | 16 ++++++++++++++++
 5 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index cd4c4c1..02ca3b0 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -17,6 +17,7 @@
  * this program; If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <xen/ioreq.h>
 #include <xen/types.h>
 #include <xen/sched.h>
 #include <asm/regs.h>
@@ -34,7 +35,7 @@
 static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
                               const ioreq_t *p)
 {
-    paddr_t first = hvm_mmio_first_byte(p), last;
+    paddr_t first = ioreq_mmio_first_byte(p), last;
 
     BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
@@ -42,7 +43,7 @@ static bool_t hvm_mmio_accept(const struct hvm_io_handler *handler,
         return 0;
 
     /* Make sure the handler will accept the whole access. */
-    last = hvm_mmio_last_byte(p);
+    last = ioreq_mmio_last_byte(p);
     if ( last != first &&
          !handler->mmio.ops->check(current, last) )
         domain_crash(current->domain);
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index fd7cadb..17dee74 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -524,8 +524,8 @@ static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
      * deadlock when hvm_mmio_internal() is called from
      * hvm_copy_to/from_guest_phys() in hvm_process_io_intercept().
      */
-    if ( (hvm_mmio_first_byte(p) < VGA_MEM_BASE) ||
-         (hvm_mmio_last_byte(p) >= (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+    if ( (ioreq_mmio_first_byte(p) < VGA_MEM_BASE) ||
+         (ioreq_mmio_last_byte(p) >= (VGA_MEM_BASE + VGA_MEM_SIZE)) )
         return 0;
 
     spin_lock(&s->lock);
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index 47e38b6..a196e14 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -1078,8 +1078,8 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
             break;
 
         case XEN_DMOP_IO_RANGE_MEMORY:
-            start = hvm_mmio_first_byte(p);
-            end = hvm_mmio_last_byte(p);
+            start = ioreq_mmio_first_byte(p);
+            end = ioreq_mmio_last_byte(p);
 
             if ( rangeset_contains_range(r, start, end) )
                 return s;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 558426b..fb64294 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -40,22 +40,6 @@ struct hvm_mmio_ops {
     hvm_mmio_write_t write;
 };
 
-static inline paddr_t hvm_mmio_first_byte(const ioreq_t *p)
-{
-    return unlikely(p->df) ?
-           p->addr - (p->count - 1ul) * p->size :
-           p->addr;
-}
-
-static inline paddr_t hvm_mmio_last_byte(const ioreq_t *p)
-{
-    unsigned long size = p->size;
-
-    return unlikely(p->df) ?
-           p->addr + size - 1:
-           p->addr + (p->count * size) - 1;
-}
-
 typedef int (*portio_action_t)(
     int dir, unsigned int port, unsigned int bytes, uint32_t *val);
 
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index 750d884..aeea67e 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -21,6 +21,22 @@
 
 #include <xen/sched.h>
 
+static inline paddr_t ioreq_mmio_first_byte(const ioreq_t *p)
+{
+    return unlikely(p->df) ?
+           p->addr - (p->count - 1ul) * p->size :
+           p->addr;
+}
+
+static inline paddr_t ioreq_mmio_last_byte(const ioreq_t *p)
+{
+    unsigned long size = p->size;
+
+    return unlikely(p->df) ?
+           p->addr + size - 1:
+           p->addr + (p->count * size) - 1;
+}
+
 static inline bool ioreq_needs_completion(const ioreq_t *ioreq)
 {
     return ioreq->state == STATE_IOREQ_READY &&
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (5 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 15:36   ` Julien Grall
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain Oleksandr Tyshchenko
                   ` (17 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

IOREQ is a common feature now and these structs will be used
on Arm as is. Move them to xen/ioreq.h and remove the "hvm" prefixes.
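
For reference (an annotated excerpt, not part of the diff), the two domain
pointers in the renamed struct keep their meaning:

    struct ioreq_server {
        struct domain *target,    /* domain whose I/O is emulated */
                      *emulator;  /* domain running the device model */
        ...
    };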

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch

Changes V1 -> V2:
   - remove "hvm" prefix

Changes V2 -> V3:
   - update the patch now that the "legacy interface" is x86 specific

Changes V3 -> V4:
   - add Jan's A-b
---
 xen/arch/x86/hvm/emulate.c       |   2 +-
 xen/arch/x86/hvm/ioreq.c         |  38 +++++++-------
 xen/arch/x86/hvm/stdvga.c        |   2 +-
 xen/arch/x86/mm/p2m.c            |   8 +--
 xen/common/ioreq.c               | 108 +++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/domain.h |  36 +------------
 xen/include/asm-x86/p2m.h        |   8 +--
 xen/include/xen/ioreq.h          |  54 ++++++++++++++++----
 8 files changed, 128 insertions(+), 128 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index c3487b5..4d62199 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -287,7 +287,7 @@ static int hvmemul_do_io(
          * However, there's no cheap approach to avoid above situations in xen,
          * so the device model side needs to check the incoming ioreq event.
          */
-        struct hvm_ioreq_server *s = NULL;
+        struct ioreq_server *s = NULL;
         p2m_type_t p2mt = p2m_invalid;
 
         if ( is_mmio )
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 177b964..8393922 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -63,7 +63,7 @@ bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion)
     return true;
 }
 
-static gfn_t hvm_alloc_legacy_ioreq_gfn(struct hvm_ioreq_server *s)
+static gfn_t hvm_alloc_legacy_ioreq_gfn(struct ioreq_server *s)
 {
     struct domain *d = s->target;
     unsigned int i;
@@ -79,7 +79,7 @@ static gfn_t hvm_alloc_legacy_ioreq_gfn(struct hvm_ioreq_server *s)
     return INVALID_GFN;
 }
 
-static gfn_t hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
+static gfn_t hvm_alloc_ioreq_gfn(struct ioreq_server *s)
 {
     struct domain *d = s->target;
     unsigned int i;
@@ -97,7 +97,7 @@ static gfn_t hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
     return hvm_alloc_legacy_ioreq_gfn(s);
 }
 
-static bool hvm_free_legacy_ioreq_gfn(struct hvm_ioreq_server *s,
+static bool hvm_free_legacy_ioreq_gfn(struct ioreq_server *s,
                                       gfn_t gfn)
 {
     struct domain *d = s->target;
@@ -115,7 +115,7 @@ static bool hvm_free_legacy_ioreq_gfn(struct hvm_ioreq_server *s,
     return true;
 }
 
-static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s, gfn_t gfn)
+static void hvm_free_ioreq_gfn(struct ioreq_server *s, gfn_t gfn)
 {
     struct domain *d = s->target;
     unsigned int i = gfn_x(gfn) - d->arch.hvm.ioreq_gfn.base;
@@ -129,9 +129,9 @@ static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s, gfn_t gfn)
     }
 }
 
-static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
+static void hvm_unmap_ioreq_gfn(struct ioreq_server *s, bool buf)
 {
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
     if ( gfn_eq(iorp->gfn, INVALID_GFN) )
         return;
@@ -143,10 +143,10 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     iorp->gfn = INVALID_GFN;
 }
 
-static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
+static int hvm_map_ioreq_gfn(struct ioreq_server *s, bool buf)
 {
     struct domain *d = s->target;
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     int rc;
 
     if ( iorp->page )
@@ -179,11 +179,11 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     return rc;
 }
 
-static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
+static void hvm_remove_ioreq_gfn(struct ioreq_server *s, bool buf)
 
 {
     struct domain *d = s->target;
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
     if ( gfn_eq(iorp->gfn, INVALID_GFN) )
         return;
@@ -194,10 +194,10 @@ static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     clear_page(iorp->va);
 }
 
-static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
+static int hvm_add_ioreq_gfn(struct ioreq_server *s, bool buf)
 {
     struct domain *d = s->target;
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     int rc;
 
     if ( gfn_eq(iorp->gfn, INVALID_GFN) )
@@ -213,7 +213,7 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
     return rc;
 }
 
-int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s)
+int arch_ioreq_server_map_pages(struct ioreq_server *s)
 {
     int rc;
 
@@ -228,40 +228,40 @@ int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s)
     return rc;
 }
 
-void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
+void arch_ioreq_server_unmap_pages(struct ioreq_server *s)
 {
     hvm_unmap_ioreq_gfn(s, true);
     hvm_unmap_ioreq_gfn(s, false);
 }
 
-void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
+void arch_ioreq_server_enable(struct ioreq_server *s)
 {
     hvm_remove_ioreq_gfn(s, false);
     hvm_remove_ioreq_gfn(s, true);
 }
 
-void arch_ioreq_server_disable(struct hvm_ioreq_server *s)
+void arch_ioreq_server_disable(struct ioreq_server *s)
 {
     hvm_add_ioreq_gfn(s, true);
     hvm_add_ioreq_gfn(s, false);
 }
 
 /* Called when target domain is paused */
-void arch_ioreq_server_destroy(struct hvm_ioreq_server *s)
+void arch_ioreq_server_destroy(struct ioreq_server *s)
 {
     p2m_set_ioreq_server(s->target, 0, s);
 }
 
 /* Called with ioreq_server lock held */
 int arch_ioreq_server_map_mem_type(struct domain *d,
-                                   struct hvm_ioreq_server *s,
+                                   struct ioreq_server *s,
                                    uint32_t flags)
 {
     return p2m_set_ioreq_server(d, flags, s);
 }
 
 void arch_ioreq_server_map_mem_type_completed(struct domain *d,
-                                              struct hvm_ioreq_server *s,
+                                              struct ioreq_server *s,
                                               uint32_t flags)
 {
     if ( flags == 0 )
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 17dee74..ee13449 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -466,7 +466,7 @@ static int stdvga_mem_write(const struct hvm_io_handler *handler,
         .dir = IOREQ_WRITE,
         .data = data,
     };
-    struct hvm_ioreq_server *srv;
+    struct ioreq_server *srv;
 
     if ( !stdvga_cache_is_enabled(s) || !s->stdvga )
         goto done;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ad4bb94..71fda06 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -372,7 +372,7 @@ void p2m_memory_type_changed(struct domain *d)
 
 int p2m_set_ioreq_server(struct domain *d,
                          unsigned int flags,
-                         struct hvm_ioreq_server *s)
+                         struct ioreq_server *s)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int rc;
@@ -420,11 +420,11 @@ int p2m_set_ioreq_server(struct domain *d,
     return rc;
 }
 
-struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
-                                              unsigned int *flags)
+struct ioreq_server *p2m_get_ioreq_server(struct domain *d,
+                                          unsigned int *flags)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
 
     spin_lock(&p2m->ioreq.lock);
 
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index a196e14..3f631ec 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -35,7 +35,7 @@
 #include <public/hvm/params.h>
 
 static void set_ioreq_server(struct domain *d, unsigned int id,
-                             struct hvm_ioreq_server *s)
+                             struct ioreq_server *s)
 {
     ASSERT(id < MAX_NR_IOREQ_SERVERS);
     ASSERT(!s || !d->arch.hvm.ioreq_server.server[id]);
@@ -46,8 +46,8 @@ static void set_ioreq_server(struct domain *d, unsigned int id,
 #define GET_IOREQ_SERVER(d, id) \
     (d)->arch.hvm.ioreq_server.server[id]
 
-static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
-                                                 unsigned int id)
+static struct ioreq_server *get_ioreq_server(const struct domain *d,
+                                             unsigned int id)
 {
     if ( id >= MAX_NR_IOREQ_SERVERS )
         return NULL;
@@ -69,7 +69,7 @@ static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
             continue; \
         else
 
-static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
+static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
 {
     shared_iopage_t *p = s->ioreq.va;
 
@@ -79,16 +79,16 @@ static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
     return &p->vcpu_ioreq[v->vcpu_id];
 }
 
-static struct hvm_ioreq_vcpu *get_pending_vcpu(const struct vcpu *v,
-                                               struct hvm_ioreq_server **srvp)
+static struct ioreq_vcpu *get_pending_vcpu(const struct vcpu *v,
+                                           struct ioreq_server **srvp)
 {
     struct domain *d = v->domain;
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     unsigned int id;
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
-        struct hvm_ioreq_vcpu *sv;
+        struct ioreq_vcpu *sv;
 
         list_for_each_entry ( sv,
                               &s->ioreq_vcpu_list,
@@ -111,7 +111,7 @@ bool hvm_io_pending(struct vcpu *v)
     return get_pending_vcpu(v, NULL);
 }
 
-static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
+static bool hvm_wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
 {
     unsigned int prev_state = STATE_IOREQ_NONE;
     unsigned int state = p->state;
@@ -172,8 +172,8 @@ bool handle_hvm_io_completion(struct vcpu *v)
 {
     struct domain *d = v->domain;
     struct hvm_vcpu_io *vio = &v->arch.hvm.hvm_io;
-    struct hvm_ioreq_server *s;
-    struct hvm_ioreq_vcpu *sv;
+    struct ioreq_server *s;
+    struct ioreq_vcpu *sv;
     enum hvm_io_completion io_completion;
 
     if ( has_vpci(d) && vpci_process_pending(v) )
@@ -214,9 +214,9 @@ bool handle_hvm_io_completion(struct vcpu *v)
     return true;
 }
 
-static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+static int hvm_alloc_ioreq_mfn(struct ioreq_server *s, bool buf)
 {
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     struct page_info *page;
 
     if ( iorp->page )
@@ -262,9 +262,9 @@ static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
     return -ENOMEM;
 }
 
-static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+static void hvm_free_ioreq_mfn(struct ioreq_server *s, bool buf)
 {
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     struct page_info *page = iorp->page;
 
     if ( !page )
@@ -281,7 +281,7 @@ static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
 
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
-    const struct hvm_ioreq_server *s;
+    const struct ioreq_server *s;
     unsigned int id;
     bool found = false;
 
@@ -301,8 +301,8 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
     return found;
 }
 
-static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
-                                    struct hvm_ioreq_vcpu *sv)
+static void hvm_update_ioreq_evtchn(struct ioreq_server *s,
+                                    struct ioreq_vcpu *sv)
 {
     ASSERT(spin_is_locked(&s->lock));
 
@@ -314,13 +314,13 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
     }
 }
 
-static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
+static int hvm_ioreq_server_add_vcpu(struct ioreq_server *s,
                                      struct vcpu *v)
 {
-    struct hvm_ioreq_vcpu *sv;
+    struct ioreq_vcpu *sv;
     int rc;
 
-    sv = xzalloc(struct hvm_ioreq_vcpu);
+    sv = xzalloc(struct ioreq_vcpu);
 
     rc = -ENOMEM;
     if ( !sv )
@@ -366,10 +366,10 @@ static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
     return rc;
 }
 
-static void hvm_ioreq_server_remove_vcpu(struct hvm_ioreq_server *s,
+static void hvm_ioreq_server_remove_vcpu(struct ioreq_server *s,
                                          struct vcpu *v)
 {
-    struct hvm_ioreq_vcpu *sv;
+    struct ioreq_vcpu *sv;
 
     spin_lock(&s->lock);
 
@@ -394,9 +394,9 @@ static void hvm_ioreq_server_remove_vcpu(struct hvm_ioreq_server *s,
     spin_unlock(&s->lock);
 }
 
-static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_remove_all_vcpus(struct ioreq_server *s)
 {
-    struct hvm_ioreq_vcpu *sv, *next;
+    struct ioreq_vcpu *sv, *next;
 
     spin_lock(&s->lock);
 
@@ -420,7 +420,7 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static int hvm_ioreq_server_alloc_pages(struct hvm_ioreq_server *s)
+static int hvm_ioreq_server_alloc_pages(struct ioreq_server *s)
 {
     int rc;
 
@@ -435,13 +435,13 @@ static int hvm_ioreq_server_alloc_pages(struct hvm_ioreq_server *s)
     return rc;
 }
 
-static void hvm_ioreq_server_free_pages(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_free_pages(struct ioreq_server *s)
 {
     hvm_free_ioreq_mfn(s, true);
     hvm_free_ioreq_mfn(s, false);
 }
 
-static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_free_rangesets(struct ioreq_server *s)
 {
     unsigned int i;
 
@@ -449,7 +449,7 @@ static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
         rangeset_destroy(s->range[i]);
 }
 
-static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
+static int hvm_ioreq_server_alloc_rangesets(struct ioreq_server *s,
                                             ioservid_t id)
 {
     unsigned int i;
@@ -487,9 +487,9 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
     return rc;
 }
 
-static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_enable(struct ioreq_server *s)
 {
-    struct hvm_ioreq_vcpu *sv;
+    struct ioreq_vcpu *sv;
 
     spin_lock(&s->lock);
 
@@ -509,7 +509,7 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_disable(struct ioreq_server *s)
 {
     spin_lock(&s->lock);
 
@@ -524,7 +524,7 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
+static int hvm_ioreq_server_init(struct ioreq_server *s,
                                  struct domain *d, int bufioreq_handling,
                                  ioservid_t id)
 {
@@ -569,7 +569,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
     return rc;
 }
 
-static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
+static void hvm_ioreq_server_deinit(struct ioreq_server *s)
 {
     ASSERT(!s->enabled);
     hvm_ioreq_server_remove_all_vcpus(s);
@@ -594,14 +594,14 @@ static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
 int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
                             ioservid_t *id)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     unsigned int i;
     int rc;
 
     if ( bufioreq_handling > HVM_IOREQSRV_BUFIOREQ_ATOMIC )
         return -EINVAL;
 
-    s = xzalloc(struct hvm_ioreq_server);
+    s = xzalloc(struct ioreq_server);
     if ( !s )
         return -ENOMEM;
 
@@ -649,7 +649,7 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
 
 int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     int rc;
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -694,7 +694,7 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
                               unsigned long *bufioreq_gfn,
                               evtchn_port_t *bufioreq_port)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     int rc;
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -739,7 +739,7 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
 int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
                                unsigned long idx, mfn_t *mfn)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     int rc;
 
     ASSERT(is_hvm_domain(d));
@@ -791,7 +791,7 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint32_t type, uint64_t start,
                                      uint64_t end)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     struct rangeset *r;
     int rc;
 
@@ -843,7 +843,7 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
                                          uint32_t type, uint64_t start,
                                          uint64_t end)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     struct rangeset *r;
     int rc;
 
@@ -902,7 +902,7 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
 int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint32_t type, uint32_t flags)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     int rc;
 
     if ( type != HVMMEM_ioreq_server )
@@ -937,7 +937,7 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
 int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
                                bool enabled)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     int rc;
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -970,7 +970,7 @@ int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
 
 int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     unsigned int id;
     int rc;
 
@@ -1005,7 +1005,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
 
 void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     unsigned int id;
 
     spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
@@ -1018,7 +1018,7 @@ void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
 
 void hvm_destroy_all_ioreq_servers(struct domain *d)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     unsigned int id;
 
     if ( !arch_ioreq_server_destroy_all(d) )
@@ -1045,10 +1045,10 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
     spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
 }
 
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p)
+struct ioreq_server *hvm_select_ioreq_server(struct domain *d,
+                                             ioreq_t *p)
 {
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     uint8_t type;
     uint64_t addr;
     unsigned int id;
@@ -1101,10 +1101,10 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
     return NULL;
 }
 
-static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
+static int hvm_send_buffered_ioreq(struct ioreq_server *s, ioreq_t *p)
 {
     struct domain *d = current->domain;
-    struct hvm_ioreq_page *iorp;
+    struct ioreq_page *iorp;
     buffered_iopage_t *pg;
     buf_ioreq_t bp = { .data = p->data,
                        .addr = p->addr,
@@ -1194,12 +1194,12 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
     return IOREQ_STATUS_HANDLED;
 }
 
-int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
+int hvm_send_ioreq(struct ioreq_server *s, ioreq_t *proto_p,
                    bool buffered)
 {
     struct vcpu *curr = current;
     struct domain *d = curr->domain;
-    struct hvm_ioreq_vcpu *sv;
+    struct ioreq_vcpu *sv;
 
     ASSERT(s);
 
@@ -1257,7 +1257,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
 unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
 {
     struct domain *d = current->domain;
-    struct hvm_ioreq_server *s;
+    struct ioreq_server *s;
     unsigned int id, failed = 0;
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 9d247ba..1c4ca47 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -30,40 +30,6 @@
 
 #include <public/hvm/dm_op.h>
 
-struct hvm_ioreq_page {
-    gfn_t gfn;
-    struct page_info *page;
-    void *va;
-};
-
-struct hvm_ioreq_vcpu {
-    struct list_head list_entry;
-    struct vcpu      *vcpu;
-    evtchn_port_t    ioreq_evtchn;
-    bool             pending;
-};
-
-#define NR_IO_RANGE_TYPES (XEN_DMOP_IO_RANGE_PCI + 1)
-#define MAX_NR_IO_RANGES  256
-
-struct hvm_ioreq_server {
-    struct domain          *target, *emulator;
-
-    /* Lock to serialize toolstack modifications */
-    spinlock_t             lock;
-
-    struct hvm_ioreq_page  ioreq;
-    struct list_head       ioreq_vcpu_list;
-    struct hvm_ioreq_page  bufioreq;
-
-    /* Lock to serialize access to buffered ioreq ring */
-    spinlock_t             bufioreq_lock;
-    evtchn_port_t          bufioreq_evtchn;
-    struct rangeset        *range[NR_IO_RANGE_TYPES];
-    bool                   enabled;
-    uint8_t                bufioreq_handling;
-};
-
 #ifdef CONFIG_MEM_SHARING
 struct mem_sharing_domain
 {
@@ -110,7 +76,7 @@ struct hvm_domain {
     /* Lock protects all other values in the sub-struct and the default */
     struct {
         spinlock_t              lock;
-        struct hvm_ioreq_server *server[MAX_NR_IOREQ_SERVERS];
+        struct ioreq_server *server[MAX_NR_IOREQ_SERVERS];
     } ioreq_server;
 
     /* Cached CF8 for guest PCI config cycles */
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 6447696..7df2878 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -363,7 +363,7 @@ struct p2m_domain {
           * ioreq server who's responsible for the emulation of
           * gfns with specific p2m type(for now, p2m_ioreq_server).
           */
-         struct hvm_ioreq_server *server;
+         struct ioreq_server *server;
          /*
           * flags specifies whether read, write or both operations
           * are to be emulated by an ioreq server.
@@ -937,9 +937,9 @@ static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt, mfn_t mfn)
 }
 
 int p2m_set_ioreq_server(struct domain *d, unsigned int flags,
-                         struct hvm_ioreq_server *s);
-struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
-                                              unsigned int *flags);
+                         struct ioreq_server *s);
+struct ioreq_server *p2m_get_ioreq_server(struct domain *d,
+                                          unsigned int *flags);
 
 static inline int p2m_entry_modify(struct p2m_domain *p2m, p2m_type_t nt,
                                    p2m_type_t ot, mfn_t nfn, mfn_t ofn,
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index aeea67e..bc79c37 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -21,6 +21,40 @@
 
 #include <xen/sched.h>
 
+struct ioreq_page {
+    gfn_t gfn;
+    struct page_info *page;
+    void *va;
+};
+
+struct ioreq_vcpu {
+    struct list_head list_entry;
+    struct vcpu      *vcpu;
+    evtchn_port_t    ioreq_evtchn;
+    bool             pending;
+};
+
+#define NR_IO_RANGE_TYPES (XEN_DMOP_IO_RANGE_PCI + 1)
+#define MAX_NR_IO_RANGES  256
+
+struct ioreq_server {
+    struct domain          *target, *emulator;
+
+    /* Lock to serialize toolstack modifications */
+    spinlock_t             lock;
+
+    struct ioreq_page      ioreq;
+    struct list_head       ioreq_vcpu_list;
+    struct ioreq_page      bufioreq;
+
+    /* Lock to serialize access to buffered ioreq ring */
+    spinlock_t             bufioreq_lock;
+    evtchn_port_t          bufioreq_evtchn;
+    struct rangeset        *range[NR_IO_RANGE_TYPES];
+    bool                   enabled;
+    uint8_t                bufioreq_handling;
+};
+
 static inline paddr_t ioreq_mmio_first_byte(const ioreq_t *p)
 {
     return unlikely(p->df) ?
@@ -75,9 +109,9 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v);
 void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v);
 void hvm_destroy_all_ioreq_servers(struct domain *d);
 
-struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                                 ioreq_t *p);
-int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
+struct ioreq_server *hvm_select_ioreq_server(struct domain *d,
+                                             ioreq_t *p);
+int hvm_send_ioreq(struct ioreq_server *s, ioreq_t *proto_p,
                    bool buffered);
 unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
@@ -85,16 +119,16 @@ void hvm_ioreq_init(struct domain *d);
 
 bool arch_ioreq_complete_mmio(void);
 bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
-int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
-void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
-void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
-void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
-void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
+int arch_ioreq_server_map_pages(struct ioreq_server *s);
+void arch_ioreq_server_unmap_pages(struct ioreq_server *s);
+void arch_ioreq_server_enable(struct ioreq_server *s);
+void arch_ioreq_server_disable(struct ioreq_server *s);
+void arch_ioreq_server_destroy(struct ioreq_server *s);
 int arch_ioreq_server_map_mem_type(struct domain *d,
-                                   struct hvm_ioreq_server *s,
+                                   struct ioreq_server *s,
                                    uint32_t flags);
 void arch_ioreq_server_map_mem_type_completed(struct domain *d,
-                                              struct hvm_ioreq_server *s,
+                                              struct ioreq_server *s,
                                               uint32_t flags);
 bool arch_ioreq_server_destroy_all(struct domain *d);
 bool arch_ioreq_server_get_type_addr(const struct domain *d,
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (6 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 15:44   ` Julien Grall
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common Oleksandr Tyshchenko
                   ` (16 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper, George Dunlap,
	Ian Jackson, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

IOREQ is a common feature now and this struct will be used
on Arm as-is. Move it to the common struct domain. This also
significantly reduces the layering violation in the common code
(*arch.hvm* usage).

We don't move ioreq_gfn since it is not used in the common code
(the "legacy" mechanism is x86-specific).

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes V1 -> V2:
   - new patch

Changes V2 -> V3:
   - remove the mention of "ioreq_gfn" from patch subject/description
   - update the patch, as the "legacy interface" is x86-specific
   - drop hvm_params related changes in arch/x86/hvm/hvm.c
   - leave ioreq_gfn in hvm_domain

Changes V3 -> V4:
   - rebase
   - drop the stale part of the comment above struct ioreq_server
   - add Jan's A-b
---
 xen/common/ioreq.c               | 60 ++++++++++++++++++++--------------------
 xen/include/asm-x86/hvm/domain.h |  8 ------
 xen/include/xen/sched.h          | 10 +++++++
 3 files changed, 40 insertions(+), 38 deletions(-)
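
For orientation (an illustration, not part of the patch): the net effect
on the common code is that the arch.hvm indirection disappears, e.g.:

    /* Before: per-domain IOREQ state hangs off the x86 HVM sub-struct. */
    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
    s = d->arch.hvm.ioreq_server.server[id];

    /* After: it lives directly in the common struct domain,
     * guarded by CONFIG_IOREQ_SERVER (see the sched.h hunk below). */
    spin_lock_recursive(&d->ioreq_server.lock);
    s = d->ioreq_server.server[id];

The sub-struct itself moves verbatim, while ioreq_gfn stays behind in
the x86-only struct hvm_domain.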

diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index 3f631ec..a319c88 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -38,13 +38,13 @@ static void set_ioreq_server(struct domain *d, unsigned int id,
                              struct ioreq_server *s)
 {
     ASSERT(id < MAX_NR_IOREQ_SERVERS);
-    ASSERT(!s || !d->arch.hvm.ioreq_server.server[id]);
+    ASSERT(!s || !d->ioreq_server.server[id]);
 
-    d->arch.hvm.ioreq_server.server[id] = s;
+    d->ioreq_server.server[id] = s;
 }
 
 #define GET_IOREQ_SERVER(d, id) \
-    (d)->arch.hvm.ioreq_server.server[id]
+    (d)->ioreq_server.server[id]
 
 static struct ioreq_server *get_ioreq_server(const struct domain *d,
                                              unsigned int id)
@@ -285,7 +285,7 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
     unsigned int id;
     bool found = false;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
@@ -296,7 +296,7 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
         }
     }
 
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return found;
 }
@@ -606,7 +606,7 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
         return -ENOMEM;
 
     domain_pause(d);
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     for ( i = 0; i < MAX_NR_IOREQ_SERVERS; i++ )
     {
@@ -634,13 +634,13 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
     if ( id )
         *id = i;
 
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
     domain_unpause(d);
 
     return 0;
 
  fail:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
     domain_unpause(d);
 
     xfree(s);
@@ -652,7 +652,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
     struct ioreq_server *s;
     int rc;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     s = get_ioreq_server(d, id);
 
@@ -684,7 +684,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
     rc = 0;
 
  out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return rc;
 }
@@ -697,7 +697,7 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
     struct ioreq_server *s;
     int rc;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     s = get_ioreq_server(d, id);
 
@@ -731,7 +731,7 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
     rc = 0;
 
  out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return rc;
 }
@@ -744,7 +744,7 @@ int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
 
     ASSERT(is_hvm_domain(d));
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     s = get_ioreq_server(d, id);
 
@@ -782,7 +782,7 @@ int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
     }
 
  out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return rc;
 }
@@ -798,7 +798,7 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
     if ( start > end )
         return -EINVAL;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     s = get_ioreq_server(d, id);
 
@@ -834,7 +834,7 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
     rc = rangeset_add_range(r, start, end);
 
  out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return rc;
 }
@@ -850,7 +850,7 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     if ( start > end )
         return -EINVAL;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     s = get_ioreq_server(d, id);
 
@@ -886,7 +886,7 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     rc = rangeset_remove_range(r, start, end);
 
  out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return rc;
 }
@@ -911,7 +911,7 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     if ( flags & ~XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
         return -EINVAL;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     s = get_ioreq_server(d, id);
 
@@ -926,7 +926,7 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     rc = arch_ioreq_server_map_mem_type(d, s, flags);
 
  out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     if ( rc == 0 )
         arch_ioreq_server_map_mem_type_completed(d, s, flags);
@@ -940,7 +940,7 @@ int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
     struct ioreq_server *s;
     int rc;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     s = get_ioreq_server(d, id);
 
@@ -964,7 +964,7 @@ int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
     rc = 0;
 
  out:
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
     return rc;
 }
 
@@ -974,7 +974,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
     unsigned int id;
     int rc;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
@@ -983,7 +983,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
             goto fail;
     }
 
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return 0;
 
@@ -998,7 +998,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
         hvm_ioreq_server_remove_vcpu(s, v);
     }
 
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 
     return rc;
 }
@@ -1008,12 +1008,12 @@ void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
     struct ioreq_server *s;
     unsigned int id;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
         hvm_ioreq_server_remove_vcpu(s, v);
 
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 }
 
 void hvm_destroy_all_ioreq_servers(struct domain *d)
@@ -1024,7 +1024,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
     if ( !arch_ioreq_server_destroy_all(d) )
         return;
 
-    spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_recursive(&d->ioreq_server.lock);
 
     /* No need to domain_pause() as the domain is being torn down */
 
@@ -1042,7 +1042,7 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
         xfree(s);
     }
 
-    spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
+    spin_unlock_recursive(&d->ioreq_server.lock);
 }
 
 struct ioreq_server *hvm_select_ioreq_server(struct domain *d,
@@ -1274,7 +1274,7 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
 
 void hvm_ioreq_init(struct domain *d)
 {
-    spin_lock_init(&d->arch.hvm.ioreq_server.lock);
+    spin_lock_init(&d->ioreq_server.lock);
 
     arch_ioreq_domain_init(d);
 }
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 1c4ca47..b8be1ad 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -63,8 +63,6 @@ struct hvm_pi_ops {
     void (*vcpu_block)(struct vcpu *);
 };
 
-#define MAX_NR_IOREQ_SERVERS 8
-
 struct hvm_domain {
     /* Guest page range used for non-default ioreq servers */
     struct {
@@ -73,12 +71,6 @@ struct hvm_domain {
         unsigned long legacy_mask; /* indexed by HVM param number */
     } ioreq_gfn;
 
-    /* Lock protects all other values in the sub-struct and the default */
-    struct {
-        spinlock_t              lock;
-        struct ioreq_server *server[MAX_NR_IOREQ_SERVERS];
-    } ioreq_server;
-
     /* Cached CF8 for guest PCI config cycles */
     uint32_t                pci_cf8;
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 3e46384..ad0d761 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -318,6 +318,8 @@ struct sched_unit {
 
 struct evtchn_port_ops;
 
+#define MAX_NR_IOREQ_SERVERS 8
+
 struct domain
 {
     domid_t          domain_id;
@@ -533,6 +535,14 @@ struct domain
     struct {
         unsigned int val;
     } teardown;
+
+#ifdef CONFIG_IOREQ_SERVER
+    /* Lock protects all other values in the sub-struct */
+    struct {
+        spinlock_t              lock;
+        struct ioreq_server     *server[MAX_NR_IOREQ_SERVERS];
+    } ioreq_server;
+#endif
 };
 
 static inline struct page_list_head *page_to_list(
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (7 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-18  9:17   ` Paul Durrant
  2021-01-20 16:21   ` Jan Beulich
  2021-01-12 21:52 ` [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu Oleksandr Tyshchenko
                   ` (15 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Julien Grall, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Paul Durrant, Daniel De Graaf,
	Oleksandr Tyshchenko

From: Julien Grall <julien.grall@arm.com>

As a lot of x86 code can be re-used on Arm later on, this patch
moves the IOREQ-related dm-op handling to the common code.

The idea is to keep the top-level dm-op handling arch-specific
and call into ioreq_server_dm_op() for otherwise unhandled ops.
Pros:
- More natural than doing it the other way around (top-level dm-op
handling common).
- Leaves compat_dm_op() in x86 code.
Cons:
- Code duplication. Both arches have to duplicate do_dm_op(), etc.

Also update the XSM code a bit to let dm-op be used on Arm.

This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

***
I decided to keep the common dm.h for the struct dmop_args declaration
(to be included by Arm's dm.c). Alternatively, we could have avoided
introducing a new header by moving the declaration into an existing
one, but I failed to find a suitable header whose context would fit.
***

Changes RFC -> V1:
   - update XSM, related changes were pulled from:
     [RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features

Changes V1 -> V2:
   - update the author of a patch
   - update patch description
   - introduce xen/dm.h and move definitions here

Changes V2 -> V3:
   - no changes

Changes V3 -> V4:
   - rework to have the top level dm-op handling arch-specific
   - update patch subject/description, was "xen/dm: Make x86's DM feature common"
   - make a few functions static in common ioreq.c
---
 xen/arch/x86/hvm/dm.c   | 101 +-----------------------------------
 xen/common/ioreq.c      | 135 ++++++++++++++++++++++++++++++++++++++++++------
 xen/include/xen/dm.h    |  39 ++++++++++++++
 xen/include/xen/ioreq.h |  17 +-----
 xen/include/xsm/dummy.h |   4 +-
 xen/include/xsm/xsm.h   |   6 +--
 xen/xsm/dummy.c         |   2 +-
 xen/xsm/flask/hooks.c   |   5 +-
 8 files changed, 171 insertions(+), 138 deletions(-)
 create mode 100644 xen/include/xen/dm.h
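
For orientation: the Arm counterpart is not part of this patch, but with
the common ioreq_server_dm_op() in place, a per-arch top-level handler
only needs the boilerplate around it. A hypothetical sketch (names and
buffer handling assumed to mirror the x86 dm_op()/do_dm_op() code below):

    /* Hypothetical Arm dm.c -- a sketch only, not from this series. */
    static int dm_op(const struct dmop_args *op_args)
    {
        struct domain *d;
        struct xen_dm_op op;
        bool const_op = true;
        int rc;

        rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
        if ( rc )
            return rc;

        rc = xsm_dm_op(XSM_DM_PRIV, d);  /* usable on !x86 after this patch */
        if ( rc )
            goto out;

        /* ... copy struct xen_dm_op in from op_args->buf[0] ... */

        /* No arch-specific ops: hand everything to the common handler. */
        rc = ioreq_server_dm_op(&op, d, &const_op);

        /* ... copy op back to the guest if !const_op and rc == 0 ... */

     out:
        rcu_unlock_domain(d);

        return rc;
    }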

diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index d3e2a9e..dc8e47d 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -16,6 +16,7 @@
 
 #include <xen/event.h>
 #include <xen/guest_access.h>
+#include <xen/dm.h>
 #include <xen/hypercall.h>
 #include <xen/ioreq.h>
 #include <xen/nospec.h>
@@ -29,13 +30,6 @@
 
 #include <public/hvm/hvm_op.h>
 
-struct dmop_args {
-    domid_t domid;
-    unsigned int nr_bufs;
-    /* Reserve enough buf elements for all current hypercalls. */
-    struct xen_dm_op_buf buf[2];
-};
-
 static bool _raw_copy_from_guest_buf_offset(void *dst,
                                             const struct dmop_args *args,
                                             unsigned int buf_idx,
@@ -408,71 +402,6 @@ static int dm_op(const struct dmop_args *op_args)
 
     switch ( op.op )
     {
-    case XEN_DMOP_create_ioreq_server:
-    {
-        struct xen_dm_op_create_ioreq_server *data =
-            &op.u.create_ioreq_server;
-
-        const_op = false;
-
-        rc = -EINVAL;
-        if ( data->pad[0] || data->pad[1] || data->pad[2] )
-            break;
-
-        rc = hvm_create_ioreq_server(d, data->handle_bufioreq,
-                                     &data->id);
-        break;
-    }
-
-    case XEN_DMOP_get_ioreq_server_info:
-    {
-        struct xen_dm_op_get_ioreq_server_info *data =
-            &op.u.get_ioreq_server_info;
-        const uint16_t valid_flags = XEN_DMOP_no_gfns;
-
-        const_op = false;
-
-        rc = -EINVAL;
-        if ( data->flags & ~valid_flags )
-            break;
-
-        rc = hvm_get_ioreq_server_info(d, data->id,
-                                       (data->flags & XEN_DMOP_no_gfns) ?
-                                       NULL : &data->ioreq_gfn,
-                                       (data->flags & XEN_DMOP_no_gfns) ?
-                                       NULL : &data->bufioreq_gfn,
-                                       &data->bufioreq_port);
-        break;
-    }
-
-    case XEN_DMOP_map_io_range_to_ioreq_server:
-    {
-        const struct xen_dm_op_ioreq_server_range *data =
-            &op.u.map_io_range_to_ioreq_server;
-
-        rc = -EINVAL;
-        if ( data->pad )
-            break;
-
-        rc = hvm_map_io_range_to_ioreq_server(d, data->id, data->type,
-                                              data->start, data->end);
-        break;
-    }
-
-    case XEN_DMOP_unmap_io_range_from_ioreq_server:
-    {
-        const struct xen_dm_op_ioreq_server_range *data =
-            &op.u.unmap_io_range_from_ioreq_server;
-
-        rc = -EINVAL;
-        if ( data->pad )
-            break;
-
-        rc = hvm_unmap_io_range_from_ioreq_server(d, data->id, data->type,
-                                                  data->start, data->end);
-        break;
-    }
-
     case XEN_DMOP_map_mem_type_to_ioreq_server:
     {
         struct xen_dm_op_map_mem_type_to_ioreq_server *data =
@@ -523,32 +452,6 @@ static int dm_op(const struct dmop_args *op_args)
         break;
     }
 
-    case XEN_DMOP_set_ioreq_server_state:
-    {
-        const struct xen_dm_op_set_ioreq_server_state *data =
-            &op.u.set_ioreq_server_state;
-
-        rc = -EINVAL;
-        if ( data->pad )
-            break;
-
-        rc = hvm_set_ioreq_server_state(d, data->id, !!data->enabled);
-        break;
-    }
-
-    case XEN_DMOP_destroy_ioreq_server:
-    {
-        const struct xen_dm_op_destroy_ioreq_server *data =
-            &op.u.destroy_ioreq_server;
-
-        rc = -EINVAL;
-        if ( data->pad )
-            break;
-
-        rc = hvm_destroy_ioreq_server(d, data->id);
-        break;
-    }
-
     case XEN_DMOP_track_dirty_vram:
     {
         const struct xen_dm_op_track_dirty_vram *data =
@@ -703,7 +606,7 @@ static int dm_op(const struct dmop_args *op_args)
     }
 
     default:
-        rc = -EOPNOTSUPP;
+        rc = ioreq_server_dm_op(&op, d, &const_op);
         break;
     }
 
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index a319c88..72b5da0 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -591,8 +591,8 @@ static void hvm_ioreq_server_deinit(struct ioreq_server *s)
     put_domain(s->emulator);
 }
 
-int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
-                            ioservid_t *id)
+static int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
+                                   ioservid_t *id)
 {
     struct ioreq_server *s;
     unsigned int i;
@@ -647,7 +647,7 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
     return rc;
 }
 
-int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
+static int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 {
     struct ioreq_server *s;
     int rc;
@@ -689,10 +689,10 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
     return rc;
 }
 
-int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
-                              unsigned long *ioreq_gfn,
-                              unsigned long *bufioreq_gfn,
-                              evtchn_port_t *bufioreq_port)
+static int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
+                                     unsigned long *ioreq_gfn,
+                                     unsigned long *bufioreq_gfn,
+                                     evtchn_port_t *bufioreq_port)
 {
     struct ioreq_server *s;
     int rc;
@@ -787,9 +787,9 @@ int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
     return rc;
 }
 
-int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint64_t start,
-                                     uint64_t end)
+static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
+                                            uint32_t type, uint64_t start,
+                                            uint64_t end)
 {
     struct ioreq_server *s;
     struct rangeset *r;
@@ -839,9 +839,9 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
-int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
-                                         uint32_t type, uint64_t start,
-                                         uint64_t end)
+static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
+                                                uint32_t type, uint64_t start,
+                                                uint64_t end)
 {
     struct ioreq_server *s;
     struct rangeset *r;
@@ -934,8 +934,8 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
-int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
-                               bool enabled)
+static int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
+                                      bool enabled)
 {
     struct ioreq_server *s;
     int rc;
@@ -1279,6 +1279,111 @@ void hvm_ioreq_init(struct domain *d)
     arch_ioreq_domain_init(d);
 }
 
+int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
+{
+    long rc;
+
+    switch ( op->op )
+    {
+    case XEN_DMOP_create_ioreq_server:
+    {
+        struct xen_dm_op_create_ioreq_server *data =
+            &op->u.create_ioreq_server;
+
+        *const_op = false;
+
+        rc = -EINVAL;
+        if ( data->pad[0] || data->pad[1] || data->pad[2] )
+            break;
+
+        rc = hvm_create_ioreq_server(d, data->handle_bufioreq,
+                                     &data->id);
+        break;
+    }
+
+    case XEN_DMOP_get_ioreq_server_info:
+    {
+        struct xen_dm_op_get_ioreq_server_info *data =
+            &op->u.get_ioreq_server_info;
+        const uint16_t valid_flags = XEN_DMOP_no_gfns;
+
+        *const_op = false;
+
+        rc = -EINVAL;
+        if ( data->flags & ~valid_flags )
+            break;
+
+        rc = hvm_get_ioreq_server_info(d, data->id,
+                                       (data->flags & XEN_DMOP_no_gfns) ?
+                                       NULL : (unsigned long *)&data->ioreq_gfn,
+                                       (data->flags & XEN_DMOP_no_gfns) ?
+                                       NULL : (unsigned long *)&data->bufioreq_gfn,
+                                       &data->bufioreq_port);
+        break;
+    }
+
+    case XEN_DMOP_map_io_range_to_ioreq_server:
+    {
+        const struct xen_dm_op_ioreq_server_range *data =
+            &op->u.map_io_range_to_ioreq_server;
+
+        rc = -EINVAL;
+        if ( data->pad )
+            break;
+
+        rc = hvm_map_io_range_to_ioreq_server(d, data->id, data->type,
+                                              data->start, data->end);
+        break;
+    }
+
+    case XEN_DMOP_unmap_io_range_from_ioreq_server:
+    {
+        const struct xen_dm_op_ioreq_server_range *data =
+            &op->u.unmap_io_range_from_ioreq_server;
+
+        rc = -EINVAL;
+        if ( data->pad )
+            break;
+
+        rc = hvm_unmap_io_range_from_ioreq_server(d, data->id, data->type,
+                                                  data->start, data->end);
+        break;
+    }
+
+    case XEN_DMOP_set_ioreq_server_state:
+    {
+        const struct xen_dm_op_set_ioreq_server_state *data =
+            &op->u.set_ioreq_server_state;
+
+        rc = -EINVAL;
+        if ( data->pad )
+            break;
+
+        rc = hvm_set_ioreq_server_state(d, data->id, !!data->enabled);
+        break;
+    }
+
+    case XEN_DMOP_destroy_ioreq_server:
+    {
+        const struct xen_dm_op_destroy_ioreq_server *data =
+            &op->u.destroy_ioreq_server;
+
+        rc = -EINVAL;
+        if ( data->pad )
+            break;
+
+        rc = hvm_destroy_ioreq_server(d, data->id);
+        break;
+    }
+
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/dm.h b/xen/include/xen/dm.h
new file mode 100644
index 0000000..2c9952d
--- /dev/null
+++ b/xen/include/xen/dm.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (c) 2016 Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __XEN_DM_H__
+#define __XEN_DM_H__
+
+#include <xen/sched.h>
+
+struct dmop_args {
+    domid_t domid;
+    unsigned int nr_bufs;
+    /* Reserve enough buf elements for all current hypercalls. */
+    struct xen_dm_op_buf buf[2];
+};
+
+#endif /* __XEN_DM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index bc79c37..7a90873 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -85,25 +85,10 @@ bool hvm_io_pending(struct vcpu *v);
 bool handle_hvm_io_completion(struct vcpu *v);
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
 
-int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
-                            ioservid_t *id);
-int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id);
-int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
-                              unsigned long *ioreq_gfn,
-                              unsigned long *bufioreq_gfn,
-                              evtchn_port_t *bufioreq_port);
 int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
                                unsigned long idx, mfn_t *mfn);
-int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint64_t start,
-                                     uint64_t end);
-int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
-                                         uint32_t type, uint64_t start,
-                                         uint64_t end);
 int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
                                      uint32_t type, uint32_t flags);
-int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
-                               bool enabled);
 
 int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v);
 void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v);
@@ -117,6 +102,8 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
 
 void hvm_ioreq_init(struct domain *d);
 
+int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op);
+
 bool arch_ioreq_complete_mmio(void);
 bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
 int arch_ioreq_server_map_pages(struct ioreq_server *s);
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 7ae3c40..5c61d8e 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -707,14 +707,14 @@ static XSM_INLINE int xsm_pmu_op (XSM_DEFAULT_ARG struct domain *d, unsigned int
     }
 }
 
+#endif /* CONFIG_X86 */
+
 static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
 {
     XSM_ASSERT_ACTION(XSM_DM_PRIV);
     return xsm_default_action(action, current->domain, d);
 }
 
-#endif /* CONFIG_X86 */
-
 #ifdef CONFIG_ARGO
 static XSM_INLINE int xsm_argo_enable(const struct domain *d)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7bd03d8..91ecff4 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -176,8 +176,8 @@ struct xsm_operations {
     int (*ioport_permission) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow);
     int (*ioport_mapping) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow);
     int (*pmu_op) (struct domain *d, unsigned int op);
-    int (*dm_op) (struct domain *d);
 #endif
+    int (*dm_op) (struct domain *d);
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
 #ifdef CONFIG_ARGO
@@ -682,13 +682,13 @@ static inline int xsm_pmu_op (xsm_default_t def, struct domain *d, unsigned int
     return xsm_ops->pmu_op(d, op);
 }
 
+#endif /* CONFIG_X86 */
+
 static inline int xsm_dm_op(xsm_default_t def, struct domain *d)
 {
     return xsm_ops->dm_op(d);
 }
 
-#endif /* CONFIG_X86 */
-
 static inline int xsm_xen_version (xsm_default_t def, uint32_t op)
 {
     return xsm_ops->xen_version(op);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 9e09512..8bdffe7 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -147,8 +147,8 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, ioport_permission);
     set_to_dummy_if_null(ops, ioport_mapping);
     set_to_dummy_if_null(ops, pmu_op);
-    set_to_dummy_if_null(ops, dm_op);
 #endif
+    set_to_dummy_if_null(ops, dm_op);
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
 #ifdef CONFIG_ARGO
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 19b0d9e..11784d7 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1656,14 +1656,13 @@ static int flask_pmu_op (struct domain *d, unsigned int op)
         return -EPERM;
     }
 }
+#endif /* CONFIG_X86 */
 
 static int flask_dm_op(struct domain *d)
 {
     return current_has_perm(d, SECCLASS_HVM, HVM__DM);
 }
 
-#endif /* CONFIG_X86 */
-
 static int flask_xen_version (uint32_t op)
 {
     u32 dsid = domain_sid(current->domain);
@@ -1865,8 +1864,8 @@ static struct xsm_operations flask_ops = {
     .ioport_permission = flask_ioport_permission,
     .ioport_mapping = flask_ioport_mapping,
     .pmu_op = flask_pmu_op,
-    .dm_op = flask_dm_op,
 #endif
+    .dm_op = flask_dm_op,
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
 #ifdef CONFIG_ARGO
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (8 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 19:34   ` Julien Grall
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common Oleksandr Tyshchenko
                   ` (14 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Jun Nakajima, Kevin Tian, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

IOREQ is a common feature now and these fields will be used
on Arm as-is. Move them to the common struct vcpu as part of the new
struct vcpu_io and drop the duplicated "io" prefixes. Also move
enum hvm_io_completion to xen/sched.h and remove the "hvm" prefixes.

This patch completely removes the layering violation in the common code.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes V1 -> V2:
   - new patch

Changes V2 -> V3:
   - update the patch, as the "legacy interface" is x86-specific
   - update patch description
   - drop the "io" prefixes from the field names
   - wrap IO_realmode_completion

Changes V3 -> V4:
   - rename all hvm_vcpu_io locals to "hvio"
   - rename according to the new renaming scheme IO_ -> VIO_ (io_ -> vio_)
   - drop "io" prefix from io_completion locals
---
 xen/arch/x86/hvm/emulate.c        | 210 +++++++++++++++++++-------------------
 xen/arch/x86/hvm/hvm.c            |   2 +-
 xen/arch/x86/hvm/io.c             |  32 +++---
 xen/arch/x86/hvm/ioreq.c          |   6 +-
 xen/arch/x86/hvm/svm/nestedsvm.c  |   2 +-
 xen/arch/x86/hvm/vmx/realmode.c   |   8 +-
 xen/common/ioreq.c                |  26 ++---
 xen/include/asm-x86/hvm/emulate.h |   2 +-
 xen/include/asm-x86/hvm/vcpu.h    |  11 --
 xen/include/xen/ioreq.h           |   2 +-
 xen/include/xen/sched.h           |  19 ++++
 11 files changed, 164 insertions(+), 156 deletions(-)
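
For orientation: the new common state added to xen/include/xen/sched.h
looks roughly as follows, reconstructed from the uses in the hunks below
(the exact layout is a sketch; VIO_realmode_completion is x86-only, per
the "wrap IO_realmode_completion" note above):

    enum vio_completion {
        VIO_no_completion,
        VIO_mmio_completion,
        VIO_pio_completion,
    #ifdef CONFIG_X86
        VIO_realmode_completion,
    #endif
    };

    struct vcpu_io {
        /* I/O request in flight to device model. */
        enum vio_completion  completion;
        ioreq_t              req;
    };

    /* ... embedded in the common struct vcpu as: */
    struct vcpu_io io;   /* so callers read v->io.req and v->io.completion */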

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 4d62199..21051ce 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -140,15 +140,15 @@ static const struct hvm_io_handler ioreq_server_handler = {
  */
 void hvmemul_cancel(struct vcpu *v)
 {
-    struct hvm_vcpu_io *vio = &v->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &v->arch.hvm.hvm_io;
 
-    vio->io_req.state = STATE_IOREQ_NONE;
-    vio->io_completion = HVMIO_no_completion;
-    vio->mmio_cache_count = 0;
-    vio->mmio_insn_bytes = 0;
-    vio->mmio_access = (struct npfec){};
-    vio->mmio_retry = false;
-    vio->g2m_ioport = NULL;
+    v->io.req.state = STATE_IOREQ_NONE;
+    v->io.completion = VIO_no_completion;
+    hvio->mmio_cache_count = 0;
+    hvio->mmio_insn_bytes = 0;
+    hvio->mmio_access = (struct npfec){};
+    hvio->mmio_retry = false;
+    hvio->g2m_ioport = NULL;
 
     hvmemul_cache_disable(v);
 }
@@ -159,7 +159,7 @@ static int hvmemul_do_io(
 {
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct vcpu_io *vio = &curr->io;
     ioreq_t p = {
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
@@ -184,13 +184,13 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    switch ( vio->io_req.state )
+    switch ( vio->req.state )
     {
     case STATE_IOREQ_NONE:
         break;
     case STATE_IORESP_READY:
-        vio->io_req.state = STATE_IOREQ_NONE;
-        p = vio->io_req;
+        vio->req.state = STATE_IOREQ_NONE;
+        p = vio->req;
 
         /* Verify the emulation request has been correctly re-issued */
         if ( (p.type != (is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO)) ||
@@ -238,7 +238,7 @@ static int hvmemul_do_io(
     }
     ASSERT(p.count);
 
-    vio->io_req = p;
+    vio->req = p;
 
     rc = hvm_io_intercept(&p);
 
@@ -247,12 +247,12 @@ static int hvmemul_do_io(
      * our callers and mirror this into latched state.
      */
     ASSERT(p.count <= *reps);
-    *reps = vio->io_req.count = p.count;
+    *reps = vio->req.count = p.count;
 
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        vio->io_req.state = STATE_IOREQ_NONE;
+        vio->req.state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -305,7 +305,7 @@ static int hvmemul_do_io(
                 if ( s == NULL )
                 {
                     rc = X86EMUL_RETRY;
-                    vio->io_req.state = STATE_IOREQ_NONE;
+                    vio->req.state = STATE_IOREQ_NONE;
                     break;
                 }
 
@@ -316,7 +316,7 @@ static int hvmemul_do_io(
                 if ( dir == IOREQ_READ )
                 {
                     rc = hvm_process_io_intercept(&ioreq_server_handler, &p);
-                    vio->io_req.state = STATE_IOREQ_NONE;
+                    vio->req.state = STATE_IOREQ_NONE;
                     break;
                 }
             }
@@ -329,14 +329,14 @@ static int hvmemul_do_io(
         if ( !s )
         {
             rc = hvm_process_io_intercept(&null_handler, &p);
-            vio->io_req.state = STATE_IOREQ_NONE;
+            vio->req.state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_ioreq(s, &p, 0);
             if ( rc != X86EMUL_RETRY || currd->is_shutting_down )
-                vio->io_req.state = STATE_IOREQ_NONE;
-            else if ( !ioreq_needs_completion(&vio->io_req) )
+                vio->req.state = STATE_IOREQ_NONE;
+            else if ( !ioreq_needs_completion(&vio->req) )
                 rc = X86EMUL_OKAY;
         }
         break;
@@ -1005,14 +1005,14 @@ static int hvmemul_phys_mmio_access(
  * cache indexed by linear MMIO address.
  */
 static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
-    struct hvm_vcpu_io *vio, unsigned long gla, uint8_t dir, bool create)
+    struct hvm_vcpu_io *hvio, unsigned long gla, uint8_t dir, bool create)
 {
     unsigned int i;
     struct hvm_mmio_cache *cache;
 
-    for ( i = 0; i < vio->mmio_cache_count; i ++ )
+    for ( i = 0; i < hvio->mmio_cache_count; i ++ )
     {
-        cache = &vio->mmio_cache[i];
+        cache = &hvio->mmio_cache[i];
 
         if ( gla == cache->gla &&
              dir == cache->dir )
@@ -1022,13 +1022,13 @@ static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
     if ( !create )
         return NULL;
 
-    i = vio->mmio_cache_count;
-    if( i == ARRAY_SIZE(vio->mmio_cache) )
+    i = hvio->mmio_cache_count;
+    if( i == ARRAY_SIZE(hvio->mmio_cache) )
         return NULL;
 
-    ++vio->mmio_cache_count;
+    ++hvio->mmio_cache_count;
 
-    cache = &vio->mmio_cache[i];
+    cache = &hvio->mmio_cache[i];
     memset(cache, 0, sizeof (*cache));
 
     cache->gla = gla;
@@ -1037,26 +1037,26 @@ static struct hvm_mmio_cache *hvmemul_find_mmio_cache(
     return cache;
 }
 
-static void latch_linear_to_phys(struct hvm_vcpu_io *vio, unsigned long gla,
+static void latch_linear_to_phys(struct hvm_vcpu_io *hvio, unsigned long gla,
                                  unsigned long gpa, bool_t write)
 {
-    if ( vio->mmio_access.gla_valid )
+    if ( hvio->mmio_access.gla_valid )
         return;
 
-    vio->mmio_gla = gla & PAGE_MASK;
-    vio->mmio_gpfn = PFN_DOWN(gpa);
-    vio->mmio_access = (struct npfec){ .gla_valid = 1,
-                                       .read_access = 1,
-                                       .write_access = write };
+    hvio->mmio_gla = gla & PAGE_MASK;
+    hvio->mmio_gpfn = PFN_DOWN(gpa);
+    hvio->mmio_access = (struct npfec){ .gla_valid = 1,
+                                        .read_access = 1,
+                                        .write_access = write };
 }
 
 static int hvmemul_linear_mmio_access(
     unsigned long gla, unsigned int size, uint8_t dir, void *buffer,
     uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t known_gpfn)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io;
     unsigned long offset = gla & ~PAGE_MASK;
-    struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(vio, gla, dir, true);
+    struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(hvio, gla, dir, true);
     unsigned int chunk, buffer_offset = 0;
     paddr_t gpa;
     unsigned long one_rep = 1;
@@ -1068,7 +1068,7 @@ static int hvmemul_linear_mmio_access(
     chunk = min_t(unsigned int, size, PAGE_SIZE - offset);
 
     if ( known_gpfn )
-        gpa = pfn_to_paddr(vio->mmio_gpfn) | offset;
+        gpa = pfn_to_paddr(hvio->mmio_gpfn) | offset;
     else
     {
         rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
@@ -1076,7 +1076,7 @@ static int hvmemul_linear_mmio_access(
         if ( rc != X86EMUL_OKAY )
             return rc;
 
-        latch_linear_to_phys(vio, gla, gpa, dir == IOREQ_WRITE);
+        latch_linear_to_phys(hvio, gla, gpa, dir == IOREQ_WRITE);
     }
 
     for ( ;; )
@@ -1122,22 +1122,22 @@ static inline int hvmemul_linear_mmio_write(
 
 static bool known_gla(unsigned long addr, unsigned int bytes, uint32_t pfec)
 {
-    const struct hvm_vcpu_io *vio = &current->arch.hvm.hvm_io;
+    const struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io;
 
     if ( pfec & PFEC_write_access )
     {
-        if ( !vio->mmio_access.write_access )
+        if ( !hvio->mmio_access.write_access )
             return false;
     }
     else if ( pfec & PFEC_insn_fetch )
     {
-        if ( !vio->mmio_access.insn_fetch )
+        if ( !hvio->mmio_access.insn_fetch )
             return false;
     }
-    else if ( !vio->mmio_access.read_access )
+    else if ( !hvio->mmio_access.read_access )
             return false;
 
-    return (vio->mmio_gla == (addr & PAGE_MASK) &&
+    return (hvio->mmio_gla == (addr & PAGE_MASK) &&
             (addr & ~PAGE_MASK) + bytes <= PAGE_SIZE);
 }
 
@@ -1145,7 +1145,7 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data,
                        uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     pagefault_info_t pfinfo;
-    struct hvm_vcpu_io *vio = &current->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io;
     unsigned int offset = addr & ~PAGE_MASK;
     int rc = HVMTRANS_bad_gfn_to_mfn;
 
@@ -1167,7 +1167,7 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data,
      * we handle this access in the same way to guarantee completion and hence
      * clean up any interim state.
      */
-    if ( !hvmemul_find_mmio_cache(vio, addr, IOREQ_READ, false) )
+    if ( !hvmemul_find_mmio_cache(hvio, addr, IOREQ_READ, false) )
         rc = hvm_copy_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo);
 
     switch ( rc )
@@ -1200,7 +1200,7 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data,
                         uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     pagefault_info_t pfinfo;
-    struct hvm_vcpu_io *vio = &current->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io;
     unsigned int offset = addr & ~PAGE_MASK;
     int rc = HVMTRANS_bad_gfn_to_mfn;
 
@@ -1222,7 +1222,7 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data,
      * we handle this access in the same way to guarantee completion and hence
      * clean up any interim state.
      */
-    if ( !hvmemul_find_mmio_cache(vio, addr, IOREQ_WRITE, false) )
+    if ( !hvmemul_find_mmio_cache(hvio, addr, IOREQ_WRITE, false) )
         rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo);
 
     switch ( rc )
@@ -1599,7 +1599,7 @@ static int hvmemul_cmpxchg(
     struct vcpu *curr = current;
     unsigned long addr;
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
     int rc;
     void *mapping = NULL;
 
@@ -1625,8 +1625,8 @@ static int hvmemul_cmpxchg(
         /* Fix this in case the guest is really relying on r-m-w atomicity. */
         return hvmemul_linear_mmio_write(addr, bytes, p_new, pfec,
                                          hvmemul_ctxt,
-                                         vio->mmio_access.write_access &&
-                                         vio->mmio_gla == (addr & PAGE_MASK));
+                                         hvio->mmio_access.write_access &&
+                                         hvio->mmio_gla == (addr & PAGE_MASK));
     }
 
     switch ( bytes )
@@ -1823,7 +1823,7 @@ static int hvmemul_rep_movs(
     struct hvm_emulate_ctxt *hvmemul_ctxt =
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
     unsigned long saddr, daddr, bytes;
     paddr_t sgpa, dgpa;
     uint32_t pfec = PFEC_page_present;
@@ -1846,18 +1846,18 @@ static int hvmemul_rep_movs(
     if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 )
         pfec |= PFEC_user_mode;
 
-    if ( vio->mmio_access.read_access &&
-         (vio->mmio_gla == (saddr & PAGE_MASK)) &&
+    if ( hvio->mmio_access.read_access &&
+         (hvio->mmio_gla == (saddr & PAGE_MASK)) &&
          /*
           * Upon initial invocation don't truncate large batches just because
           * of a hit for the translation: Doing the guest page table walk is
           * cheaper than multiple round trips through the device model. Yet
           * when processing a response we can always re-use the translation.
           */
-         (vio->io_req.state == STATE_IORESP_READY ||
+         (curr->io.req.state == STATE_IORESP_READY ||
           ((!df || *reps == 1) &&
            PAGE_SIZE - (saddr & ~PAGE_MASK) >= *reps * bytes_per_rep)) )
-        sgpa = pfn_to_paddr(vio->mmio_gpfn) | (saddr & ~PAGE_MASK);
+        sgpa = pfn_to_paddr(hvio->mmio_gpfn) | (saddr & ~PAGE_MASK);
     else
     {
         rc = hvmemul_linear_to_phys(saddr, &sgpa, bytes_per_rep, reps, pfec,
@@ -1867,13 +1867,13 @@ static int hvmemul_rep_movs(
     }
 
     bytes = PAGE_SIZE - (daddr & ~PAGE_MASK);
-    if ( vio->mmio_access.write_access &&
-         (vio->mmio_gla == (daddr & PAGE_MASK)) &&
+    if ( hvio->mmio_access.write_access &&
+         (hvio->mmio_gla == (daddr & PAGE_MASK)) &&
          /* See comment above. */
-         (vio->io_req.state == STATE_IORESP_READY ||
+         (curr->io.req.state == STATE_IORESP_READY ||
           ((!df || *reps == 1) &&
            PAGE_SIZE - (daddr & ~PAGE_MASK) >= *reps * bytes_per_rep)) )
-        dgpa = pfn_to_paddr(vio->mmio_gpfn) | (daddr & ~PAGE_MASK);
+        dgpa = pfn_to_paddr(hvio->mmio_gpfn) | (daddr & ~PAGE_MASK);
     else
     {
         rc = hvmemul_linear_to_phys(daddr, &dgpa, bytes_per_rep, reps,
@@ -1892,14 +1892,14 @@ static int hvmemul_rep_movs(
 
     if ( sp2mt == p2m_mmio_dm )
     {
-        latch_linear_to_phys(vio, saddr, sgpa, 0);
+        latch_linear_to_phys(hvio, saddr, sgpa, 0);
         return hvmemul_do_mmio_addr(
             sgpa, reps, bytes_per_rep, IOREQ_READ, df, dgpa);
     }
 
     if ( dp2mt == p2m_mmio_dm )
     {
-        latch_linear_to_phys(vio, daddr, dgpa, 1);
+        latch_linear_to_phys(hvio, daddr, dgpa, 1);
         return hvmemul_do_mmio_addr(
             dgpa, reps, bytes_per_rep, IOREQ_WRITE, df, sgpa);
     }
@@ -1992,7 +1992,7 @@ static int hvmemul_rep_stos(
     struct hvm_emulate_ctxt *hvmemul_ctxt =
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
     unsigned long addr, bytes;
     paddr_t gpa;
     p2m_type_t p2mt;
@@ -2004,13 +2004,13 @@ static int hvmemul_rep_stos(
         return rc;
 
     bytes = PAGE_SIZE - (addr & ~PAGE_MASK);
-    if ( vio->mmio_access.write_access &&
-         (vio->mmio_gla == (addr & PAGE_MASK)) &&
+    if ( hvio->mmio_access.write_access &&
+         (hvio->mmio_gla == (addr & PAGE_MASK)) &&
          /* See respective comment in MOVS processing. */
-         (vio->io_req.state == STATE_IORESP_READY ||
+         (curr->io.req.state == STATE_IORESP_READY ||
           ((!df || *reps == 1) &&
            PAGE_SIZE - (addr & ~PAGE_MASK) >= *reps * bytes_per_rep)) )
-        gpa = pfn_to_paddr(vio->mmio_gpfn) | (addr & ~PAGE_MASK);
+        gpa = pfn_to_paddr(hvio->mmio_gpfn) | (addr & ~PAGE_MASK);
     else
     {
         uint32_t pfec = PFEC_page_present | PFEC_write_access;
@@ -2103,7 +2103,7 @@ static int hvmemul_rep_stos(
         return X86EMUL_UNHANDLEABLE;
 
     case p2m_mmio_dm:
-        latch_linear_to_phys(vio, addr, gpa, 1);
+        latch_linear_to_phys(hvio, addr, gpa, 1);
         return hvmemul_do_mmio_buffer(gpa, reps, bytes_per_rep, IOREQ_WRITE, df,
                                       p_data);
     }
@@ -2613,18 +2613,18 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
 };
 
 /*
- * Note that passing HVMIO_no_completion into this function serves as kind
+ * Note that passing VIO_no_completion into this function serves as kind
  * of (but not fully) an "auto select completion" indicator.  When there's
  * no completion needed, the passed in value will be ignored in any case.
  */
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     const struct x86_emulate_ops *ops,
-    enum hvm_io_completion completion)
+    enum vio_completion completion)
 {
     const struct cpu_user_regs *regs = hvmemul_ctxt->ctxt.regs;
     struct vcpu *curr = current;
     uint32_t new_intr_shadow;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
     int rc;
 
     /*
@@ -2632,45 +2632,45 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
      * untouched if it's already enabled, for re-execution to consume
      * entries populated by an earlier pass.
      */
-    if ( vio->cache->num_ents > vio->cache->max_ents )
+    if ( hvio->cache->num_ents > hvio->cache->max_ents )
     {
-        ASSERT(vio->io_req.state == STATE_IOREQ_NONE);
-        vio->cache->num_ents = 0;
+        ASSERT(curr->io.req.state == STATE_IOREQ_NONE);
+        hvio->cache->num_ents = 0;
     }
     else
-        ASSERT(vio->io_req.state == STATE_IORESP_READY);
+        ASSERT(curr->io.req.state == STATE_IORESP_READY);
 
-    hvm_emulate_init_per_insn(hvmemul_ctxt, vio->mmio_insn,
-                              vio->mmio_insn_bytes);
+    hvm_emulate_init_per_insn(hvmemul_ctxt, hvio->mmio_insn,
+                              hvio->mmio_insn_bytes);
 
-    vio->mmio_retry = 0;
+    hvio->mmio_retry = 0;
 
     rc = x86_emulate(&hvmemul_ctxt->ctxt, ops);
-    if ( rc == X86EMUL_OKAY && vio->mmio_retry )
+    if ( rc == X86EMUL_OKAY && hvio->mmio_retry )
         rc = X86EMUL_RETRY;
 
-    if ( !ioreq_needs_completion(&vio->io_req) )
-        completion = HVMIO_no_completion;
-    else if ( completion == HVMIO_no_completion )
-        completion = (vio->io_req.type != IOREQ_TYPE_PIO ||
-                      hvmemul_ctxt->is_mem_access) ? HVMIO_mmio_completion
-                                                   : HVMIO_pio_completion;
+    if ( !ioreq_needs_completion(&curr->io.req) )
+        completion = VIO_no_completion;
+    else if ( completion == VIO_no_completion )
+        completion = (curr->io.req.type != IOREQ_TYPE_PIO ||
+                      hvmemul_ctxt->is_mem_access) ? VIO_mmio_completion
+                                                   : VIO_pio_completion;
 
-    switch ( vio->io_completion = completion )
+    switch ( curr->io.completion = completion )
     {
-    case HVMIO_no_completion:
-    case HVMIO_pio_completion:
-        vio->mmio_cache_count = 0;
-        vio->mmio_insn_bytes = 0;
-        vio->mmio_access = (struct npfec){};
+    case VIO_no_completion:
+    case VIO_pio_completion:
+        hvio->mmio_cache_count = 0;
+        hvio->mmio_insn_bytes = 0;
+        hvio->mmio_access = (struct npfec){};
         hvmemul_cache_disable(curr);
         break;
 
-    case HVMIO_mmio_completion:
-    case HVMIO_realmode_completion:
-        BUILD_BUG_ON(sizeof(vio->mmio_insn) < sizeof(hvmemul_ctxt->insn_buf));
-        vio->mmio_insn_bytes = hvmemul_ctxt->insn_buf_bytes;
-        memcpy(vio->mmio_insn, hvmemul_ctxt->insn_buf, vio->mmio_insn_bytes);
+    case VIO_mmio_completion:
+    case VIO_realmode_completion:
+        BUILD_BUG_ON(sizeof(hvio->mmio_insn) < sizeof(hvmemul_ctxt->insn_buf));
+        hvio->mmio_insn_bytes = hvmemul_ctxt->insn_buf_bytes;
+        memcpy(hvio->mmio_insn, hvmemul_ctxt->insn_buf, hvio->mmio_insn_bytes);
         break;
 
     default:
@@ -2716,7 +2716,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 
 int hvm_emulate_one(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
-    enum hvm_io_completion completion)
+    enum vio_completion completion)
 {
     return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops, completion);
 }
@@ -2754,7 +2754,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
                           guest_cpu_user_regs());
     ctxt.ctxt.data = &mmio_ro_ctxt;
 
-    switch ( rc = _hvm_emulate_one(&ctxt, ops, HVMIO_no_completion) )
+    switch ( rc = _hvm_emulate_one(&ctxt, ops, VIO_no_completion) )
     {
     case X86EMUL_UNHANDLEABLE:
     case X86EMUL_UNIMPLEMENTED:
@@ -2782,28 +2782,28 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
     {
     case EMUL_KIND_NOWRITE:
         rc = _hvm_emulate_one(&ctx, &hvm_emulate_ops_no_write,
-                              HVMIO_no_completion);
+                              VIO_no_completion);
         break;
     case EMUL_KIND_SET_CONTEXT_INSN: {
         struct vcpu *curr = current;
-        struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+        struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
 
-        BUILD_BUG_ON(sizeof(vio->mmio_insn) !=
+        BUILD_BUG_ON(sizeof(hvio->mmio_insn) !=
                      sizeof(curr->arch.vm_event->emul.insn.data));
-        ASSERT(!vio->mmio_insn_bytes);
+        ASSERT(!hvio->mmio_insn_bytes);
 
         /*
          * Stash insn buffer into mmio buffer here instead of ctx
          * to avoid having to add more logic to hvm_emulate_one.
          */
-        vio->mmio_insn_bytes = sizeof(vio->mmio_insn);
-        memcpy(vio->mmio_insn, curr->arch.vm_event->emul.insn.data,
-               vio->mmio_insn_bytes);
+        hvio->mmio_insn_bytes = sizeof(hvio->mmio_insn);
+        memcpy(hvio->mmio_insn, curr->arch.vm_event->emul.insn.data,
+               hvio->mmio_insn_bytes);
     }
     /* Fall-through */
     default:
         ctx.set_context = (kind == EMUL_KIND_SET_CONTEXT_DATA);
-        rc = hvm_emulate_one(&ctx, HVMIO_no_completion);
+        rc = hvm_emulate_one(&ctx, VIO_no_completion);
     }
 
     switch ( rc )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bc96947..4ed929c 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3800,7 +3800,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
         return;
     }
 
-    switch ( hvm_emulate_one(&ctxt, HVMIO_no_completion) )
+    switch ( hvm_emulate_one(&ctxt, VIO_no_completion) )
     {
     case X86EMUL_UNHANDLEABLE:
     case X86EMUL_UNIMPLEMENTED:
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index ef8286b..dd733e1 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -85,7 +85,7 @@ bool hvm_emulate_one_insn(hvm_emulate_validate_t *validate, const char *descr)
 
     hvm_emulate_init_once(&ctxt, validate, guest_cpu_user_regs());
 
-    switch ( rc = hvm_emulate_one(&ctxt, HVMIO_no_completion) )
+    switch ( rc = hvm_emulate_one(&ctxt, VIO_no_completion) )
     {
     case X86EMUL_UNHANDLEABLE:
         hvm_dump_emulation_state(XENLOG_G_WARNING, descr, &ctxt, rc);
@@ -109,20 +109,20 @@ bool hvm_emulate_one_insn(hvm_emulate_validate_t *validate, const char *descr)
 bool handle_mmio_with_translation(unsigned long gla, unsigned long gpfn,
                                   struct npfec access)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io;
 
-    vio->mmio_access = access.gla_valid &&
-                       access.kind == npfec_kind_with_gla
-                       ? access : (struct npfec){};
-    vio->mmio_gla = gla & PAGE_MASK;
-    vio->mmio_gpfn = gpfn;
+    hvio->mmio_access = access.gla_valid &&
+                        access.kind == npfec_kind_with_gla
+                        ? access : (struct npfec){};
+    hvio->mmio_gla = gla & PAGE_MASK;
+    hvio->mmio_gpfn = gpfn;
     return handle_mmio();
 }
 
 bool handle_pio(uint16_t port, unsigned int size, int dir)
 {
     struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct vcpu_io *vio = &curr->io;
     unsigned int data;
     int rc;
 
@@ -135,8 +135,8 @@ bool handle_pio(uint16_t port, unsigned int size, int dir)
 
     rc = hvmemul_do_pio_buffer(port, size, dir, &data);
 
-    if ( ioreq_needs_completion(&vio->io_req) )
-        vio->io_completion = HVMIO_pio_completion;
+    if ( ioreq_needs_completion(&vio->req) )
+        vio->completion = VIO_pio_completion;
 
     switch ( rc )
     {
@@ -175,7 +175,7 @@ static bool_t g2m_portio_accept(const struct hvm_io_handler *handler,
 {
     struct vcpu *curr = current;
     const struct hvm_domain *hvm = &curr->domain->arch.hvm;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
     struct g2m_ioport *g2m_ioport;
     unsigned int start, end;
 
@@ -185,7 +185,7 @@ static bool_t g2m_portio_accept(const struct hvm_io_handler *handler,
         end = start + g2m_ioport->np;
         if ( (p->addr >= start) && (p->addr + p->size <= end) )
         {
-            vio->g2m_ioport = g2m_ioport;
+            hvio->g2m_ioport = g2m_ioport;
             return 1;
         }
     }
@@ -196,8 +196,8 @@ static bool_t g2m_portio_accept(const struct hvm_io_handler *handler,
 static int g2m_portio_read(const struct hvm_io_handler *handler,
                            uint64_t addr, uint32_t size, uint64_t *data)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm.hvm_io;
-    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io;
+    const struct g2m_ioport *g2m_ioport = hvio->g2m_ioport;
     unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
     switch ( size )
@@ -221,8 +221,8 @@ static int g2m_portio_read(const struct hvm_io_handler *handler,
 static int g2m_portio_write(const struct hvm_io_handler *handler,
                             uint64_t addr, uint32_t size, uint64_t data)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm.hvm_io;
-    const struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io;
+    const struct g2m_ioport *g2m_ioport = hvio->g2m_ioport;
     unsigned int mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
     switch ( size )
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 8393922..c00ee8e 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -40,11 +40,11 @@ bool arch_ioreq_complete_mmio(void)
     return handle_mmio();
 }
 
-bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion)
+bool arch_vcpu_ioreq_completion(enum vio_completion completion)
 {
-    switch ( io_completion )
+    switch ( completion )
     {
-    case HVMIO_realmode_completion:
+    case VIO_realmode_completion:
     {
         struct hvm_emulate_ctxt ctxt;
 
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index fcfccf7..6d90630 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1266,7 +1266,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm.hvm_io.io_req.state != STATE_IOREQ_NONE )
+        if ( v->io.req.state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exit_int_info.v )
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 768f01e..cc23afa 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -101,7 +101,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 
     perfc_incr(realmode_emulations);
 
-    rc = hvm_emulate_one(hvmemul_ctxt, HVMIO_realmode_completion);
+    rc = hvm_emulate_one(hvmemul_ctxt, VIO_realmode_completion);
 
     if ( rc == X86EMUL_UNHANDLEABLE )
     {
@@ -153,7 +153,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     struct vcpu *curr = current;
     struct hvm_emulate_ctxt hvmemul_ctxt;
     struct segment_register *sreg;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm.hvm_io;
+    struct hvm_vcpu_io *hvio = &curr->arch.hvm.hvm_io;
     unsigned long intr_info;
     unsigned int emulations = 0;
 
@@ -188,7 +188,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
         vmx_realmode_emulate_one(&hvmemul_ctxt);
 
-        if ( vio->io_req.state != STATE_IOREQ_NONE || vio->mmio_retry )
+        if ( curr->io.req.state != STATE_IOREQ_NONE || hvio->mmio_retry )
             break;
 
         /* Stop emulating unless our segment state is not safe */
@@ -202,7 +202,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_req.state != STATE_IOREQ_NONE )
+    if ( curr->io.req.state != STATE_IOREQ_NONE )
         curr->arch.hvm.vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm.vmx.vmx_emulate && !curr->arch.hvm.vmx.vmx_realmode )
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index 72b5da0..273683f 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -159,7 +159,7 @@ static bool hvm_wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
         break;
     }
 
-    p = &sv->vcpu->arch.hvm.hvm_io.io_req;
+    p = &sv->vcpu->io.req;
     if ( ioreq_needs_completion(p) )
         p->data = data;
 
@@ -171,10 +171,10 @@ static bool hvm_wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
 bool handle_hvm_io_completion(struct vcpu *v)
 {
     struct domain *d = v->domain;
-    struct hvm_vcpu_io *vio = &v->arch.hvm.hvm_io;
+    struct vcpu_io *vio = &v->io;
     struct ioreq_server *s;
     struct ioreq_vcpu *sv;
-    enum hvm_io_completion io_completion;
+    enum vio_completion completion;
 
     if ( has_vpci(d) && vpci_process_pending(v) )
     {
@@ -186,29 +186,29 @@ bool handle_hvm_io_completion(struct vcpu *v)
     if ( sv && !hvm_wait_for_io(sv, get_ioreq(s, v)) )
         return false;
 
-    vio->io_req.state = ioreq_needs_completion(&vio->io_req) ?
+    vio->req.state = ioreq_needs_completion(&vio->req) ?
         STATE_IORESP_READY : STATE_IOREQ_NONE;
 
     msix_write_completion(v);
     vcpu_end_shutdown_deferral(v);
 
-    io_completion = vio->io_completion;
-    vio->io_completion = HVMIO_no_completion;
+    completion = vio->completion;
+    vio->completion = VIO_no_completion;
 
-    switch ( io_completion )
+    switch ( completion )
     {
-    case HVMIO_no_completion:
+    case VIO_no_completion:
         break;
 
-    case HVMIO_mmio_completion:
+    case VIO_mmio_completion:
         return arch_ioreq_complete_mmio();
 
-    case HVMIO_pio_completion:
-        return handle_pio(vio->io_req.addr, vio->io_req.size,
-                          vio->io_req.dir);
+    case VIO_pio_completion:
+        return handle_pio(vio->req.addr, vio->req.size,
+                          vio->req.dir);
 
     default:
-        return arch_vcpu_ioreq_completion(io_completion);
+        return arch_vcpu_ioreq_completion(completion);
     }
 
     return true;
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index 1620cc7..610078b 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -65,7 +65,7 @@ bool __nonnull(1, 2) hvm_emulate_one_insn(
     const char *descr);
 int hvm_emulate_one(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
-    enum hvm_io_completion completion);
+    enum vio_completion completion);
 void hvm_emulate_one_vm_event(enum emul_kind kind,
     unsigned int trapnr,
     unsigned int errcode);
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 6c1feda..8adf455 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -28,13 +28,6 @@
 #include <asm/mtrr.h>
 #include <public/hvm/ioreq.h>
 
-enum hvm_io_completion {
-    HVMIO_no_completion,
-    HVMIO_mmio_completion,
-    HVMIO_pio_completion,
-    HVMIO_realmode_completion
-};
-
 struct hvm_vcpu_asid {
     uint64_t generation;
     uint32_t asid;
@@ -52,10 +45,6 @@ struct hvm_mmio_cache {
 };
 
 struct hvm_vcpu_io {
-    /* I/O request in flight to device model. */
-    enum hvm_io_completion io_completion;
-    ioreq_t                io_req;
-
     /*
      * HVM emulation:
      *  Linear address @mmio_gla maps to MMIO physical frame @mmio_gpfn.
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index 7a90873..dffed60 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -105,7 +105,7 @@ void hvm_ioreq_init(struct domain *d);
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op);
 
 bool arch_ioreq_complete_mmio(void);
-bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
+bool arch_vcpu_ioreq_completion(enum vio_completion completion);
 int arch_ioreq_server_map_pages(struct ioreq_server *s);
 void arch_ioreq_server_unmap_pages(struct ioreq_server *s);
 void arch_ioreq_server_enable(struct ioreq_server *s);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index ad0d761..7aea2bb 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -147,6 +147,21 @@ void evtchn_destroy_final(struct domain *d); /* from complete_domain_destroy */
 
 struct waitqueue_vcpu;
 
+enum vio_completion {
+    VIO_no_completion,
+    VIO_mmio_completion,
+    VIO_pio_completion,
+#ifdef CONFIG_X86
+    VIO_realmode_completion,
+#endif
+};
+
+struct vcpu_io {
+    /* I/O request in flight to device model. */
+    enum vio_completion  completion;
+    ioreq_t              req;
+};
+
 struct vcpu
 {
     int              vcpu_id;
@@ -258,6 +273,10 @@ struct vcpu
     struct vpci_vcpu vpci;
 
     struct arch_vcpu arch;
+
+#ifdef CONFIG_IOREQ_SERVER
+    struct vcpu_io io;
+#endif
 };
 
 struct sched_unit {
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (9 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-14  3:58   ` Wei Chen
  2021-01-18  9:38   ` Paul Durrant
  2021-01-12 21:52 ` [PATCH V4 12/24] xen/ioreq: Remove "hvm" prefixes from involved function names Oleksandr Tyshchenko
                   ` (13 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Julien Grall, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko

From: Julien Grall <julien.grall@arm.com>

As the x86 implementation of XENMEM_resource_ioreq_server can be
re-used on Arm later on, this patch makes it common and removes
arch_acquire_resource() as unneeded.

Also re-order #include-s alphabetically.

This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.
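
For illustration, a device model would consume this through the
generic resource mapping interface. A minimal, hypothetical sketch
(assuming the usual libxenforeignmemory API; the exact signature
should be checked against tools/include/xenforeignmemory.h):

    /* Map the ioreq server's page(s) into the emulator's address
     * space via XENMEM_acquire_resource (now handled in common code).
     * fmem, domid and ioservid are assumed to be set up already. */
    void *addr = NULL;
    xenforeignmemory_resource_handle *fres =
        xenforeignmemory_map_resource(fmem, domid,
                                      XENMEM_resource_ioreq_server,
                                      ioservid, 0 /* frame */,
                                      1 /* nr_frames */, &addr,
                                      PROT_READ | PROT_WRITE, 0);

Once the remaining Arm bits later in this series are in place, the
same call should work for an Arm guest as well, provided
CONFIG_IOREQ_SERVER is enabled.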

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - no changes

Changes V1 -> V2:
   - update the author of a patch

Changes V2 -> V3:
   - don't wrap #include <xen/ioreq.h>
   - limit the number of #ifdef-s
   - re-order #include-s alphabetically

Changes V3 -> V4:
   - rebase
   - Add Jan's R-b
---
 xen/arch/x86/mm.c        | 44 ---------------------------------
 xen/common/memory.c      | 63 +++++++++++++++++++++++++++++++++++++++---------
 xen/include/asm-arm/mm.h |  8 ------
 xen/include/asm-x86/mm.h |  4 ---
 4 files changed, 51 insertions(+), 68 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index f6e128e..54ac398 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4587,50 +4587,6 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
     return err || s > e ? err : _handle_iomem_range(s, e, p);
 }
 
-int arch_acquire_resource(struct domain *d, unsigned int type,
-                          unsigned int id, unsigned long frame,
-                          unsigned int nr_frames, xen_pfn_t mfn_list[])
-{
-    int rc;
-
-    switch ( type )
-    {
-#ifdef CONFIG_HVM
-    case XENMEM_resource_ioreq_server:
-    {
-        ioservid_t ioservid = id;
-        unsigned int i;
-
-        rc = -EINVAL;
-        if ( !is_hvm_domain(d) )
-            break;
-
-        if ( id != (unsigned int)ioservid )
-            break;
-
-        rc = 0;
-        for ( i = 0; i < nr_frames; i++ )
-        {
-            mfn_t mfn;
-
-            rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
-            if ( rc )
-                break;
-
-            mfn_list[i] = mfn_x(mfn);
-        }
-        break;
-    }
-#endif
-
-    default:
-        rc = -EOPNOTSUPP;
-        break;
-    }
-
-    return rc;
-}
-
 long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     int rc;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index b21b6c4..7e560b5 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -8,22 +8,23 @@
  */
 
 #include <xen/domain_page.h>
-#include <xen/types.h>
+#include <xen/errno.h>
+#include <xen/event.h>
+#include <xen/grant_table.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+#include <xen/iocap.h>
+#include <xen/ioreq.h>
 #include <xen/lib.h>
+#include <xen/mem_access.h>
 #include <xen/mm.h>
+#include <xen/numa.h>
+#include <xen/paging.h>
 #include <xen/param.h>
 #include <xen/perfc.h>
 #include <xen/sched.h>
-#include <xen/event.h>
-#include <xen/paging.h>
-#include <xen/iocap.h>
-#include <xen/guest_access.h>
-#include <xen/hypercall.h>
-#include <xen/errno.h>
-#include <xen/numa.h>
-#include <xen/mem_access.h>
 #include <xen/trace.h>
-#include <xen/grant_table.h>
+#include <xen/types.h>
 #include <asm/current.h>
 #include <asm/hardirq.h>
 #include <asm/p2m.h>
@@ -1090,6 +1091,40 @@ static int acquire_grant_table(struct domain *d, unsigned int id,
     return 0;
 }
 
+static int acquire_ioreq_server(struct domain *d,
+                                unsigned int id,
+                                unsigned long frame,
+                                unsigned int nr_frames,
+                                xen_pfn_t mfn_list[])
+{
+#ifdef CONFIG_IOREQ_SERVER
+    ioservid_t ioservid = id;
+    unsigned int i;
+    int rc;
+
+    if ( !is_hvm_domain(d) )
+        return -EINVAL;
+
+    if ( id != (unsigned int)ioservid )
+        return -EINVAL;
+
+    for ( i = 0; i < nr_frames; i++ )
+    {
+        mfn_t mfn;
+
+        rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
+        if ( rc )
+            return rc;
+
+        mfn_list[i] = mfn_x(mfn);
+    }
+
+    return 0;
+#else
+    return -EOPNOTSUPP;
+#endif
+}
+
 static int acquire_resource(
     XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
 {
@@ -1148,9 +1183,13 @@ static int acquire_resource(
                                  mfn_list);
         break;
 
+    case XENMEM_resource_ioreq_server:
+        rc = acquire_ioreq_server(d, xmar.id, xmar.frame, xmar.nr_frames,
+                                  mfn_list);
+        break;
+
     default:
-        rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
-                                   xmar.nr_frames, mfn_list);
+        rc = -EOPNOTSUPP;
         break;
     }
 
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index f8ba49b..0b7de31 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -358,14 +358,6 @@ static inline void put_page_and_type(struct page_info *page)
 
 void clear_and_clean_page(struct page_info *page);
 
-static inline
-int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id,
-                          unsigned long frame, unsigned int nr_frames,
-                          xen_pfn_t mfn_list[])
-{
-    return -EOPNOTSUPP;
-}
-
 unsigned int arch_get_dma_bitsize(void);
 
 #endif /*  __ARCH_ARM_MM__ */
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index deeba75..859214e 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -639,8 +639,4 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn)
     return mfn <= (virt_to_mfn(eva - 1) + 1);
 }
 
-int arch_acquire_resource(struct domain *d, unsigned int type,
-                          unsigned int id, unsigned long frame,
-                          unsigned int nr_frames, xen_pfn_t mfn_list[]);
-
 #endif /* __ASM_X86_MM_H__ */
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 12/24] xen/ioreq: Remove "hvm" prefixes from involved function names
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (10 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-18  9:55   ` Paul Durrant
  2021-01-12 21:52 ` [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg() Oleksandr Tyshchenko
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Paul Durrant, Jun Nakajima, Kevin Tian,
	Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch removes "hvm" prefixes and infixes from IOREQ related
function names in the common code and performs a renaming where
appropriate according to the more consistent new naming scheme:
- IOREQ server functions should start with "ioreq_server_"
- IOREQ functions should start with "ioreq_"

A few function names are clarified to better reflect their purpose:
handle_hvm_io_completion -> vcpu_ioreq_handle_completion
hvm_io_pending           -> vcpu_ioreq_pending
hvm_ioreq_init           -> ioreq_domain_init
hvm_alloc_ioreq_mfn      -> ioreq_server_alloc_mfn
hvm_free_ioreq_mfn       -> ioreq_server_free_mfn

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes V1 -> V2:
   - new patch

Changes V2 -> V3:
   - update patch according to the "legacy interface" being x86 specific
   - update patch description
   - rename everything touched according to new naming scheme

Changes V3 -> V4:
   - rebase
   - rename ioreq_update_evtchn() to ioreq_server_update_evtchn()
   - add Jan's R-b
---
 xen/arch/x86/hvm/dm.c       |   4 +-
 xen/arch/x86/hvm/emulate.c  |   6 +-
 xen/arch/x86/hvm/hvm.c      |  10 +--
 xen/arch/x86/hvm/io.c       |   6 +-
 xen/arch/x86/hvm/ioreq.c    |   2 +-
 xen/arch/x86/hvm/stdvga.c   |   4 +-
 xen/arch/x86/hvm/vmx/vvmx.c |   2 +-
 xen/common/ioreq.c          | 202 ++++++++++++++++++++++----------------------
 xen/common/memory.c         |   2 +-
 xen/include/xen/ioreq.h     |  30 +++----
 10 files changed, 134 insertions(+), 134 deletions(-)

diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index dc8e47d..f770536 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -415,8 +415,8 @@ static int dm_op(const struct dmop_args *op_args)
             break;
 
         if ( first_gfn == 0 )
-            rc = hvm_map_mem_type_to_ioreq_server(d, data->id,
-                                                  data->type, data->flags);
+            rc = ioreq_server_map_mem_type(d, data->id,
+                                           data->type, data->flags);
         else
             rc = 0;
 
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 21051ce..425c8dd 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -261,7 +261,7 @@ static int hvmemul_do_io(
          * an ioreq server that can handle it.
          *
          * Rules:
-         * A> PIO or MMIO accesses run through hvm_select_ioreq_server() to
+         * A> PIO or MMIO accesses run through ioreq_server_select() to
          * choose the ioreq server by range. If no server is found, the access
          * is ignored.
          *
@@ -323,7 +323,7 @@ static int hvmemul_do_io(
         }
 
         if ( !s )
-            s = hvm_select_ioreq_server(currd, &p);
+            s = ioreq_server_select(currd, &p);
 
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
@@ -333,7 +333,7 @@ static int hvmemul_do_io(
         }
         else
         {
-            rc = hvm_send_ioreq(s, &p, 0);
+            rc = ioreq_send(s, &p, 0);
             if ( rc != X86EMUL_RETRY || currd->is_shutting_down )
                 vio->req.state = STATE_IOREQ_NONE;
             else if ( !ioreq_needs_completion(&vio->req) )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4ed929c..0d7bb42 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -546,7 +546,7 @@ void hvm_do_resume(struct vcpu *v)
 
     pt_restore_timer(v);
 
-    if ( !handle_hvm_io_completion(v) )
+    if ( !vcpu_ioreq_handle_completion(v) )
         return;
 
     if ( unlikely(v->arch.vm_event) )
@@ -677,7 +677,7 @@ int hvm_domain_initialise(struct domain *d)
     register_g2m_portio_handler(d);
     register_vpci_portio_handler(d);
 
-    hvm_ioreq_init(d);
+    ioreq_domain_init(d);
 
     hvm_init_guest_time(d);
 
@@ -739,7 +739,7 @@ void hvm_domain_relinquish_resources(struct domain *d)
 
     viridian_domain_deinit(d);
 
-    hvm_destroy_all_ioreq_servers(d);
+    ioreq_server_destroy_all(d);
 
     msixtbl_pt_cleanup(d);
 
@@ -1582,7 +1582,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
     if ( rc )
         goto fail5;
 
-    rc = hvm_all_ioreq_servers_add_vcpu(d, v);
+    rc = ioreq_server_add_vcpu_all(d, v);
     if ( rc != 0 )
         goto fail6;
 
@@ -1618,7 +1618,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 {
     viridian_vcpu_deinit(v);
 
-    hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
+    ioreq_server_remove_vcpu_all(v->domain, v);
 
     if ( hvm_altp2m_supported() )
         altp2m_vcpu_destroy(v);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index dd733e1..66a37ee 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -60,7 +60,7 @@ void send_timeoffset_req(unsigned long timeoff)
     if ( timeoff == 0 )
         return;
 
-    if ( hvm_broadcast_ioreq(&p, true) != 0 )
+    if ( ioreq_broadcast(&p, true) != 0 )
         gprintk(XENLOG_ERR, "Unsuccessful timeoffset update\n");
 }
 
@@ -74,7 +74,7 @@ void send_invalidate_req(void)
         .data = ~0UL, /* flush all */
     };
 
-    if ( hvm_broadcast_ioreq(&p, false) != 0 )
+    if ( ioreq_broadcast(&p, false) != 0 )
         gprintk(XENLOG_ERR, "Unsuccessful map-cache invalidate\n");
 }
 
@@ -155,7 +155,7 @@ bool handle_pio(uint16_t port, unsigned int size, int dir)
          * We should not advance RIP/EIP if the domain is shutting down or
          * if X86EMUL_RETRY has been returned by an internal handler.
          */
-        if ( curr->domain->is_shutting_down || !hvm_io_pending(curr) )
+        if ( curr->domain->is_shutting_down || !vcpu_ioreq_pending(curr) )
             return false;
         break;
 
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index c00ee8e..5c9f3a5 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -153,7 +153,7 @@ static int hvm_map_ioreq_gfn(struct ioreq_server *s, bool buf)
     {
         /*
          * If a page has already been allocated (which will happen on
-         * demand if hvm_get_ioreq_server_frame() is called), then
+         * demand if ioreq_server_get_frame() is called), then
          * mapping a guest frame is not permitted.
          */
         if ( gfn_eq(iorp->gfn, INVALID_GFN) )
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index ee13449..ab9781d 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -507,11 +507,11 @@ static int stdvga_mem_write(const struct hvm_io_handler *handler,
     }
 
  done:
-    srv = hvm_select_ioreq_server(current->domain, &p);
+    srv = ioreq_server_select(current->domain, &p);
     if ( !srv )
         return X86EMUL_UNHANDLEABLE;
 
-    return hvm_send_ioreq(srv, &p, 1);
+    return ioreq_send(srv, &p, 1);
 }
 
 static bool_t stdvga_mem_accept(const struct hvm_io_handler *handler,
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 0ddb6a4..e9f94da 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1517,7 +1517,7 @@ void nvmx_switch_guest(void)
      * don't want to continue as this setup is not implemented nor supported
      * as of right now.
      */
-    if ( hvm_io_pending(v) )
+    if ( vcpu_ioreq_pending(v) )
         return;
     /*
      * a softirq may interrupt us between a virtual vmentry is
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index 273683f..d233a49 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -59,7 +59,7 @@ static struct ioreq_server *get_ioreq_server(const struct domain *d,
  * Iterate over all possible ioreq servers.
  *
  * NOTE: The iteration is backwards such that more recently created
- *       ioreq servers are favoured in hvm_select_ioreq_server().
+ *       ioreq servers are favoured in ioreq_server_select().
  *       This is a semantic that previously existed when ioreq servers
  *       were held in a linked list.
  */
@@ -106,12 +106,12 @@ static struct ioreq_vcpu *get_pending_vcpu(const struct vcpu *v,
     return NULL;
 }
 
-bool hvm_io_pending(struct vcpu *v)
+bool vcpu_ioreq_pending(struct vcpu *v)
 {
     return get_pending_vcpu(v, NULL);
 }
 
-static bool hvm_wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
+static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
 {
     unsigned int prev_state = STATE_IOREQ_NONE;
     unsigned int state = p->state;
@@ -168,7 +168,7 @@ static bool hvm_wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
     return true;
 }
 
-bool handle_hvm_io_completion(struct vcpu *v)
+bool vcpu_ioreq_handle_completion(struct vcpu *v)
 {
     struct domain *d = v->domain;
     struct vcpu_io *vio = &v->io;
@@ -183,7 +183,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
     }
 
     sv = get_pending_vcpu(v, &s);
-    if ( sv && !hvm_wait_for_io(sv, get_ioreq(s, v)) )
+    if ( sv && !wait_for_io(sv, get_ioreq(s, v)) )
         return false;
 
     vio->req.state = ioreq_needs_completion(&vio->req) ?
@@ -214,7 +214,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
     return true;
 }
 
-static int hvm_alloc_ioreq_mfn(struct ioreq_server *s, bool buf)
+static int ioreq_server_alloc_mfn(struct ioreq_server *s, bool buf)
 {
     struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     struct page_info *page;
@@ -223,7 +223,7 @@ static int hvm_alloc_ioreq_mfn(struct ioreq_server *s, bool buf)
     {
         /*
          * If a guest frame has already been mapped (which may happen
-         * on demand if hvm_get_ioreq_server_info() is called), then
+         * on demand if ioreq_server_get_info() is called), then
          * allocating a page is not permitted.
          */
         if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
@@ -262,7 +262,7 @@ static int hvm_alloc_ioreq_mfn(struct ioreq_server *s, bool buf)
     return -ENOMEM;
 }
 
-static void hvm_free_ioreq_mfn(struct ioreq_server *s, bool buf)
+static void ioreq_server_free_mfn(struct ioreq_server *s, bool buf)
 {
     struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
     struct page_info *page = iorp->page;
@@ -301,8 +301,8 @@ bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
     return found;
 }
 
-static void hvm_update_ioreq_evtchn(struct ioreq_server *s,
-                                    struct ioreq_vcpu *sv)
+static void ioreq_server_update_evtchn(struct ioreq_server *s,
+                                       struct ioreq_vcpu *sv)
 {
     ASSERT(spin_is_locked(&s->lock));
 
@@ -314,8 +314,8 @@ static void hvm_update_ioreq_evtchn(struct ioreq_server *s,
     }
 }
 
-static int hvm_ioreq_server_add_vcpu(struct ioreq_server *s,
-                                     struct vcpu *v)
+static int ioreq_server_add_vcpu(struct ioreq_server *s,
+                                 struct vcpu *v)
 {
     struct ioreq_vcpu *sv;
     int rc;
@@ -350,7 +350,7 @@ static int hvm_ioreq_server_add_vcpu(struct ioreq_server *s,
     list_add(&sv->list_entry, &s->ioreq_vcpu_list);
 
     if ( s->enabled )
-        hvm_update_ioreq_evtchn(s, sv);
+        ioreq_server_update_evtchn(s, sv);
 
     spin_unlock(&s->lock);
     return 0;
@@ -366,8 +366,8 @@ static int hvm_ioreq_server_add_vcpu(struct ioreq_server *s,
     return rc;
 }
 
-static void hvm_ioreq_server_remove_vcpu(struct ioreq_server *s,
-                                         struct vcpu *v)
+static void ioreq_server_remove_vcpu(struct ioreq_server *s,
+                                     struct vcpu *v)
 {
     struct ioreq_vcpu *sv;
 
@@ -394,7 +394,7 @@ static void hvm_ioreq_server_remove_vcpu(struct ioreq_server *s,
     spin_unlock(&s->lock);
 }
 
-static void hvm_ioreq_server_remove_all_vcpus(struct ioreq_server *s)
+static void ioreq_server_remove_all_vcpus(struct ioreq_server *s)
 {
     struct ioreq_vcpu *sv, *next;
 
@@ -420,28 +420,28 @@ static void hvm_ioreq_server_remove_all_vcpus(struct ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static int hvm_ioreq_server_alloc_pages(struct ioreq_server *s)
+static int ioreq_server_alloc_pages(struct ioreq_server *s)
 {
     int rc;
 
-    rc = hvm_alloc_ioreq_mfn(s, false);
+    rc = ioreq_server_alloc_mfn(s, false);
 
     if ( !rc && (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF) )
-        rc = hvm_alloc_ioreq_mfn(s, true);
+        rc = ioreq_server_alloc_mfn(s, true);
 
     if ( rc )
-        hvm_free_ioreq_mfn(s, false);
+        ioreq_server_free_mfn(s, false);
 
     return rc;
 }
 
-static void hvm_ioreq_server_free_pages(struct ioreq_server *s)
+static void ioreq_server_free_pages(struct ioreq_server *s)
 {
-    hvm_free_ioreq_mfn(s, true);
-    hvm_free_ioreq_mfn(s, false);
+    ioreq_server_free_mfn(s, true);
+    ioreq_server_free_mfn(s, false);
 }
 
-static void hvm_ioreq_server_free_rangesets(struct ioreq_server *s)
+static void ioreq_server_free_rangesets(struct ioreq_server *s)
 {
     unsigned int i;
 
@@ -449,8 +449,8 @@ static void hvm_ioreq_server_free_rangesets(struct ioreq_server *s)
         rangeset_destroy(s->range[i]);
 }
 
-static int hvm_ioreq_server_alloc_rangesets(struct ioreq_server *s,
-                                            ioservid_t id)
+static int ioreq_server_alloc_rangesets(struct ioreq_server *s,
+                                        ioservid_t id)
 {
     unsigned int i;
     int rc;
@@ -482,12 +482,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct ioreq_server *s,
     return 0;
 
  fail:
-    hvm_ioreq_server_free_rangesets(s);
+    ioreq_server_free_rangesets(s);
 
     return rc;
 }
 
-static void hvm_ioreq_server_enable(struct ioreq_server *s)
+static void ioreq_server_enable(struct ioreq_server *s)
 {
     struct ioreq_vcpu *sv;
 
@@ -503,13 +503,13 @@ static void hvm_ioreq_server_enable(struct ioreq_server *s)
     list_for_each_entry ( sv,
                           &s->ioreq_vcpu_list,
                           list_entry )
-        hvm_update_ioreq_evtchn(s, sv);
+        ioreq_server_update_evtchn(s, sv);
 
   done:
     spin_unlock(&s->lock);
 }
 
-static void hvm_ioreq_server_disable(struct ioreq_server *s)
+static void ioreq_server_disable(struct ioreq_server *s)
 {
     spin_lock(&s->lock);
 
@@ -524,9 +524,9 @@ static void hvm_ioreq_server_disable(struct ioreq_server *s)
     spin_unlock(&s->lock);
 }
 
-static int hvm_ioreq_server_init(struct ioreq_server *s,
-                                 struct domain *d, int bufioreq_handling,
-                                 ioservid_t id)
+static int ioreq_server_init(struct ioreq_server *s,
+                             struct domain *d, int bufioreq_handling,
+                             ioservid_t id)
 {
     struct domain *currd = current->domain;
     struct vcpu *v;
@@ -544,7 +544,7 @@ static int hvm_ioreq_server_init(struct ioreq_server *s,
     s->ioreq.gfn = INVALID_GFN;
     s->bufioreq.gfn = INVALID_GFN;
 
-    rc = hvm_ioreq_server_alloc_rangesets(s, id);
+    rc = ioreq_server_alloc_rangesets(s, id);
     if ( rc )
         return rc;
 
@@ -552,7 +552,7 @@ static int hvm_ioreq_server_init(struct ioreq_server *s,
 
     for_each_vcpu ( d, v )
     {
-        rc = hvm_ioreq_server_add_vcpu(s, v);
+        rc = ioreq_server_add_vcpu(s, v);
         if ( rc )
             goto fail_add;
     }
@@ -560,23 +560,23 @@ static int hvm_ioreq_server_init(struct ioreq_server *s,
     return 0;
 
  fail_add:
-    hvm_ioreq_server_remove_all_vcpus(s);
+    ioreq_server_remove_all_vcpus(s);
     arch_ioreq_server_unmap_pages(s);
 
-    hvm_ioreq_server_free_rangesets(s);
+    ioreq_server_free_rangesets(s);
 
     put_domain(s->emulator);
     return rc;
 }
 
-static void hvm_ioreq_server_deinit(struct ioreq_server *s)
+static void ioreq_server_deinit(struct ioreq_server *s)
 {
     ASSERT(!s->enabled);
-    hvm_ioreq_server_remove_all_vcpus(s);
+    ioreq_server_remove_all_vcpus(s);
 
     /*
      * NOTE: It is safe to call both arch_ioreq_server_unmap_pages() and
-     *       hvm_ioreq_server_free_pages() in that order.
+     *       ioreq_server_free_pages() in that order.
      *       This is because the former will do nothing if the pages
      *       are not mapped, leaving the page to be freed by the latter.
      *       However if the pages are mapped then the former will set
@@ -584,15 +584,15 @@ static void hvm_ioreq_server_deinit(struct ioreq_server *s)
      *       nothing.
      */
     arch_ioreq_server_unmap_pages(s);
-    hvm_ioreq_server_free_pages(s);
+    ioreq_server_free_pages(s);
 
-    hvm_ioreq_server_free_rangesets(s);
+    ioreq_server_free_rangesets(s);
 
     put_domain(s->emulator);
 }
 
-static int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
-                                   ioservid_t *id)
+static int ioreq_server_create(struct domain *d, int bufioreq_handling,
+                               ioservid_t *id)
 {
     struct ioreq_server *s;
     unsigned int i;
@@ -620,11 +620,11 @@ static int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
 
     /*
      * It is safe to call set_ioreq_server() prior to
-     * hvm_ioreq_server_init() since the target domain is paused.
+     * ioreq_server_init() since the target domain is paused.
      */
     set_ioreq_server(d, i, s);
 
-    rc = hvm_ioreq_server_init(s, d, bufioreq_handling, i);
+    rc = ioreq_server_init(s, d, bufioreq_handling, i);
     if ( rc )
     {
         set_ioreq_server(d, i, NULL);
@@ -647,7 +647,7 @@ static int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
     return rc;
 }
 
-static int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
+static int ioreq_server_destroy(struct domain *d, ioservid_t id)
 {
     struct ioreq_server *s;
     int rc;
@@ -668,13 +668,13 @@ static int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
 
     arch_ioreq_server_destroy(s);
 
-    hvm_ioreq_server_disable(s);
+    ioreq_server_disable(s);
 
     /*
-     * It is safe to call hvm_ioreq_server_deinit() prior to
+     * It is safe to call ioreq_server_deinit() prior to
      * set_ioreq_server() since the target domain is paused.
      */
-    hvm_ioreq_server_deinit(s);
+    ioreq_server_deinit(s);
     set_ioreq_server(d, id, NULL);
 
     domain_unpause(d);
@@ -689,10 +689,10 @@ static int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
     return rc;
 }
 
-static int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
-                                     unsigned long *ioreq_gfn,
-                                     unsigned long *bufioreq_gfn,
-                                     evtchn_port_t *bufioreq_port)
+static int ioreq_server_get_info(struct domain *d, ioservid_t id,
+                                 unsigned long *ioreq_gfn,
+                                 unsigned long *bufioreq_gfn,
+                                 evtchn_port_t *bufioreq_port)
 {
     struct ioreq_server *s;
     int rc;
@@ -736,8 +736,8 @@ static int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
     return rc;
 }
 
-int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
-                               unsigned long idx, mfn_t *mfn)
+int ioreq_server_get_frame(struct domain *d, ioservid_t id,
+                           unsigned long idx, mfn_t *mfn)
 {
     struct ioreq_server *s;
     int rc;
@@ -756,7 +756,7 @@ int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
     if ( s->emulator != current->domain )
         goto out;
 
-    rc = hvm_ioreq_server_alloc_pages(s);
+    rc = ioreq_server_alloc_pages(s);
     if ( rc )
         goto out;
 
@@ -787,9 +787,9 @@ int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
     return rc;
 }
 
-static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
-                                            uint32_t type, uint64_t start,
-                                            uint64_t end)
+static int ioreq_server_map_io_range(struct domain *d, ioservid_t id,
+                                     uint32_t type, uint64_t start,
+                                     uint64_t end)
 {
     struct ioreq_server *s;
     struct rangeset *r;
@@ -839,9 +839,9 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
-static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
-                                                uint32_t type, uint64_t start,
-                                                uint64_t end)
+static int ioreq_server_unmap_io_range(struct domain *d, ioservid_t id,
+                                       uint32_t type, uint64_t start,
+                                       uint64_t end)
 {
     struct ioreq_server *s;
     struct rangeset *r;
@@ -899,8 +899,8 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
  * Support for the emulation of read operations can be added when an ioreq
  * server has such requirement in the future.
  */
-int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint32_t flags)
+int ioreq_server_map_mem_type(struct domain *d, ioservid_t id,
+                              uint32_t type, uint32_t flags)
 {
     struct ioreq_server *s;
     int rc;
@@ -934,8 +934,8 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
     return rc;
 }
 
-static int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
-                                      bool enabled)
+static int ioreq_server_set_state(struct domain *d, ioservid_t id,
+                                  bool enabled)
 {
     struct ioreq_server *s;
     int rc;
@@ -955,9 +955,9 @@ static int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
     domain_pause(d);
 
     if ( enabled )
-        hvm_ioreq_server_enable(s);
+        ioreq_server_enable(s);
     else
-        hvm_ioreq_server_disable(s);
+        ioreq_server_disable(s);
 
     domain_unpause(d);
 
@@ -968,7 +968,7 @@ static int hvm_set_ioreq_server_state(struct domain *d, ioservid_t id,
     return rc;
 }
 
-int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
+int ioreq_server_add_vcpu_all(struct domain *d, struct vcpu *v)
 {
     struct ioreq_server *s;
     unsigned int id;
@@ -978,7 +978,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
-        rc = hvm_ioreq_server_add_vcpu(s, v);
+        rc = ioreq_server_add_vcpu(s, v);
         if ( rc )
             goto fail;
     }
@@ -995,7 +995,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
         if ( !s )
             continue;
 
-        hvm_ioreq_server_remove_vcpu(s, v);
+        ioreq_server_remove_vcpu(s, v);
     }
 
     spin_unlock_recursive(&d->ioreq_server.lock);
@@ -1003,7 +1003,7 @@ int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
     return rc;
 }
 
-void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
+void ioreq_server_remove_vcpu_all(struct domain *d, struct vcpu *v)
 {
     struct ioreq_server *s;
     unsigned int id;
@@ -1011,12 +1011,12 @@ void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
     spin_lock_recursive(&d->ioreq_server.lock);
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
-        hvm_ioreq_server_remove_vcpu(s, v);
+        ioreq_server_remove_vcpu(s, v);
 
     spin_unlock_recursive(&d->ioreq_server.lock);
 }
 
-void hvm_destroy_all_ioreq_servers(struct domain *d)
+void ioreq_server_destroy_all(struct domain *d)
 {
     struct ioreq_server *s;
     unsigned int id;
@@ -1030,13 +1030,13 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
 
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
-        hvm_ioreq_server_disable(s);
+        ioreq_server_disable(s);
 
         /*
-         * It is safe to call hvm_ioreq_server_deinit() prior to
+         * It is safe to call ioreq_server_deinit() prior to
          * set_ioreq_server() since the target domain is being destroyed.
          */
-        hvm_ioreq_server_deinit(s);
+        ioreq_server_deinit(s);
         set_ioreq_server(d, id, NULL);
 
         xfree(s);
@@ -1045,8 +1045,8 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
     spin_unlock_recursive(&d->ioreq_server.lock);
 }
 
-struct ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                             ioreq_t *p)
+struct ioreq_server *ioreq_server_select(struct domain *d,
+                                         ioreq_t *p)
 {
     struct ioreq_server *s;
     uint8_t type;
@@ -1101,7 +1101,7 @@ struct ioreq_server *hvm_select_ioreq_server(struct domain *d,
     return NULL;
 }
 
-static int hvm_send_buffered_ioreq(struct ioreq_server *s, ioreq_t *p)
+static int ioreq_send_buffered(struct ioreq_server *s, ioreq_t *p)
 {
     struct domain *d = current->domain;
     struct ioreq_page *iorp;
@@ -1194,8 +1194,8 @@ static int hvm_send_buffered_ioreq(struct ioreq_server *s, ioreq_t *p)
     return IOREQ_STATUS_HANDLED;
 }
 
-int hvm_send_ioreq(struct ioreq_server *s, ioreq_t *proto_p,
-                   bool buffered)
+int ioreq_send(struct ioreq_server *s, ioreq_t *proto_p,
+               bool buffered)
 {
     struct vcpu *curr = current;
     struct domain *d = curr->domain;
@@ -1204,7 +1204,7 @@ int hvm_send_ioreq(struct ioreq_server *s, ioreq_t *proto_p,
     ASSERT(s);
 
     if ( buffered )
-        return hvm_send_buffered_ioreq(s, proto_p);
+        return ioreq_send_buffered(s, proto_p);
 
     if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
         return IOREQ_STATUS_RETRY;
@@ -1254,7 +1254,7 @@ int hvm_send_ioreq(struct ioreq_server *s, ioreq_t *proto_p,
     return IOREQ_STATUS_UNHANDLED;
 }
 
-unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
+unsigned int ioreq_broadcast(ioreq_t *p, bool buffered)
 {
     struct domain *d = current->domain;
     struct ioreq_server *s;
@@ -1265,14 +1265,14 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
         if ( !s->enabled )
             continue;
 
-        if ( hvm_send_ioreq(s, p, buffered) == IOREQ_STATUS_UNHANDLED )
+        if ( ioreq_send(s, p, buffered) == IOREQ_STATUS_UNHANDLED )
             failed++;
     }
 
     return failed;
 }
 
-void hvm_ioreq_init(struct domain *d)
+void ioreq_domain_init(struct domain *d)
 {
     spin_lock_init(&d->ioreq_server.lock);
 
@@ -1296,8 +1296,8 @@ int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
         if ( data->pad[0] || data->pad[1] || data->pad[2] )
             break;
 
-        rc = hvm_create_ioreq_server(d, data->handle_bufioreq,
-                                     &data->id);
+        rc = ioreq_server_create(d, data->handle_bufioreq,
+                                 &data->id);
         break;
     }
 
@@ -1313,12 +1313,12 @@ int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
         if ( data->flags & ~valid_flags )
             break;
 
-        rc = hvm_get_ioreq_server_info(d, data->id,
-                                       (data->flags & XEN_DMOP_no_gfns) ?
-                                       NULL : (unsigned long *)&data->ioreq_gfn,
-                                       (data->flags & XEN_DMOP_no_gfns) ?
-                                       NULL : (unsigned long *)&data->bufioreq_gfn,
-                                       &data->bufioreq_port);
+        rc = ioreq_server_get_info(d, data->id,
+                                   (data->flags & XEN_DMOP_no_gfns) ?
+                                   NULL : (unsigned long *)&data->ioreq_gfn,
+                                   (data->flags & XEN_DMOP_no_gfns) ?
+                                   NULL : (unsigned long *)&data->bufioreq_gfn,
+                                   &data->bufioreq_port);
         break;
     }
 
@@ -1331,8 +1331,8 @@ int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
         if ( data->pad )
             break;
 
-        rc = hvm_map_io_range_to_ioreq_server(d, data->id, data->type,
-                                              data->start, data->end);
+        rc = ioreq_server_map_io_range(d, data->id, data->type,
+                                       data->start, data->end);
         break;
     }
 
@@ -1345,8 +1345,8 @@ int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
         if ( data->pad )
             break;
 
-        rc = hvm_unmap_io_range_from_ioreq_server(d, data->id, data->type,
-                                                  data->start, data->end);
+        rc = ioreq_server_unmap_io_range(d, data->id, data->type,
+                                         data->start, data->end);
         break;
     }
 
@@ -1359,7 +1359,7 @@ int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
         if ( data->pad )
             break;
 
-        rc = hvm_set_ioreq_server_state(d, data->id, !!data->enabled);
+        rc = ioreq_server_set_state(d, data->id, !!data->enabled);
         break;
     }
 
@@ -1372,7 +1372,7 @@ int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
         if ( data->pad )
             break;
 
-        rc = hvm_destroy_ioreq_server(d, data->id);
+        rc = ioreq_server_destroy(d, data->id);
         break;
     }
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 7e560b5..66828d9 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1112,7 +1112,7 @@ static int acquire_ioreq_server(struct domain *d,
     {
         mfn_t mfn;
 
-        rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
+        rc = ioreq_server_get_frame(d, id, frame + i, &mfn);
         if ( rc )
             return rc;
 
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index dffed60..ec7e98d 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -81,26 +81,26 @@ static inline bool ioreq_needs_completion(const ioreq_t *ioreq)
 #define HANDLE_BUFIOREQ(s) \
     ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
 
-bool hvm_io_pending(struct vcpu *v);
-bool handle_hvm_io_completion(struct vcpu *v);
+bool vcpu_ioreq_pending(struct vcpu *v);
+bool vcpu_ioreq_handle_completion(struct vcpu *v);
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
 
-int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
-                               unsigned long idx, mfn_t *mfn);
-int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
-                                     uint32_t type, uint32_t flags);
+int ioreq_server_get_frame(struct domain *d, ioservid_t id,
+                           unsigned long idx, mfn_t *mfn);
+int ioreq_server_map_mem_type(struct domain *d, ioservid_t id,
+                              uint32_t type, uint32_t flags);
 
-int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v);
-void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v);
-void hvm_destroy_all_ioreq_servers(struct domain *d);
+int ioreq_server_add_vcpu_all(struct domain *d, struct vcpu *v);
+void ioreq_server_remove_vcpu_all(struct domain *d, struct vcpu *v);
+void ioreq_server_destroy_all(struct domain *d);
 
-struct ioreq_server *hvm_select_ioreq_server(struct domain *d,
-                                             ioreq_t *p);
-int hvm_send_ioreq(struct ioreq_server *s, ioreq_t *proto_p,
-                   bool buffered);
-unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
+struct ioreq_server *ioreq_server_select(struct domain *d,
+                                         ioreq_t *p);
+int ioreq_send(struct ioreq_server *s, ioreq_t *proto_p,
+               bool buffered);
+unsigned int ioreq_broadcast(ioreq_t *p, bool buffered);
 
-void hvm_ioreq_init(struct domain *d);
+void ioreq_domain_init(struct domain *d);
 
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op);
 
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (11 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 12/24] xen/ioreq: Remove "hvm" prefixes from involved function names Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 19:37   ` Julien Grall
  2021-01-18 10:00   ` Paul Durrant
  2021-01-12 21:52 ` [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features Oleksandr Tyshchenko
                   ` (11 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Julien Grall,
	Stefano Stabellini, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

The cmpxchg() in ioreq_send_buffered() operates on memory shared
with the emulator domain (and the target domain if the legacy
interface is used).

In order to be on the safe side we need to switch to
guest_cmpxchg64() to prevent a domain from DoSing Xen on Arm.

As there is no plan to support the legacy interface on Arm, the page
will only be mapped in a single domain at a time, so we can use
s->emulator in guest_cmpxchg64() safely.

Thankfully the only user of the legacy interface is x86 so far, and
there is no concern regarding the atomic operations there.

Please note that the legacy interface *must* not be used on Arm
without revisiting the code.
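
For illustration, the pointer update in ioreq_send_buffered() ends up
as in the sketch below (this mirrors the hunk further down; the
comments are mine). On Arm, guest_cmpxchg64() bounds the time spent
retrying on memory writable by the guest, which is what closes the
DoS window:

    typeof(pg->ptrs) old, new;

    old.full = read_atomic(&pg->ptrs.full);
    new.read_pointer = old.read_pointer - n * IOREQ_BUFFER_SLOT_NUM;
    new.write_pointer = old.write_pointer - n * IOREQ_BUFFER_SLOT_NUM;

    /* Guest-safe variant: the emulator domain cannot stall Xen here. */
    guest_cmpxchg64(s->emulator, &pg->ptrs.full, old.full, new.full);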

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch

Changes V1 -> V2:
   - move earlier to avoid breaking arm32 compilation
   - add an explanation to commit description and hvm_allow_set_param()
   - pass s->emulator

Changes V2 -> V3:
   - update patch description

Changes V3 -> V4:
   - add Stefano's A-b
   - drop comment from arm/hvm.c
---
 xen/common/ioreq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index d233a49..d5f4dd3 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -29,6 +29,7 @@
 #include <xen/trace.h>
 #include <xen/vpci.h>
 
+#include <asm/guest_atomics.h>
 #include <asm/hvm/ioreq.h>
 
 #include <public/hvm/ioreq.h>
@@ -1185,7 +1186,7 @@ static int ioreq_send_buffered(struct ioreq_server *s, ioreq_t *p)
 
         new.read_pointer = old.read_pointer - n * IOREQ_BUFFER_SLOT_NUM;
         new.write_pointer = old.write_pointer - n * IOREQ_BUFFER_SLOT_NUM;
-        cmpxchg(&pg->ptrs.full, old.full, new.full);
+        guest_cmpxchg64(s->emulator, &pg->ptrs.full, old.full, new.full);
     }
 
     notify_via_xen_event_channel(d, s->bufioreq_evtchn);
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (12 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg() Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  0:55   ` Stefano Stabellini
  2021-01-15 20:26   ` Julien Grall
  2021-01-12 21:52 ` [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed Oleksandr Tyshchenko
                   ` (10 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Oleksandr Tyshchenko

From: Julien Grall <julien.grall@arm.com>

This patch adds basic IOREQ/DM support on Arm. The subsequent
patches will improve functionality and add remaining bits.

The IOREQ/DM features are supposed to be built with the IOREQ_SERVER
option enabled, which is disabled by default on Arm for now.

Please note, the "PIO handling" TODO is expected to be left
unaddressed in the current series. It is not a big issue for now
while Xen doesn't have support for vPCI on Arm. On Arm64 PIO
accesses are only used for PCI IO BARs, and we would probably want
to expose them to the emulator as PIO accesses to keep a DM
completely arch-agnostic. So "PIO handling" should be implemented
when we add support for vPCI.
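
To give an idea of the shape of the new code, the data abort path
forwards an unhandled MMIO access to an IOREQ server roughly as in
the sketch below (a simplified view of try_fwd_ioserv() from the
hunks further down; field handling is abridged and may differ in
detail):

    /* Build an ioreq_t describing the faulting MMIO access. */
    ioreq_t p = {
        .type  = IOREQ_TYPE_COPY,
        .addr  = info->gpa,
        .size  = 1U << info->dabt.size,
        .dir   = !info->dabt.write,
        .df    = 0,    /* a single, forward transfer */
        .count = 1,    /* there is no "rep" equivalent on Arm */
        .state = STATE_IOREQ_READY,
    };
    struct ioreq_server *s = ioreq_server_select(v->domain, &p);

    if ( !s )
        return IO_UNHANDLED;

    /* Hand the request to the emulator; completion is asynchronous. */
    return ioreq_send(s, &p, 0 /* not buffered */) == IOREQ_STATUS_HANDLED
           ? IO_HANDLED : IO_RETRY;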

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - was split into:
     - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
     - xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
   - update patch description
   - update asm-arm/hvm/ioreq.h according to the newly introduced arch functions:
     - arch_hvm_destroy_ioreq_server()
     - arch_handle_hvm_io_completion()
   - update arch files to include xen/ioreq.h
   - remove HVMOP plumbing
   - rewrite the logic to properly handle the case when hvm_send_ioreq() returns IO_RETRY
   - add logic to properly handle the handle_hvm_io_completion() return value
   - rename handle_mmio() to ioreq_handle_complete_mmio()
   - move paging_mark_pfn_dirty() to asm-arm/paging.h
   - remove forward declaration for hvm_ioreq_server in asm-arm/paging.h
   - move try_fwd_ioserv() to ioreq.c, provide stubs if !CONFIG_IOREQ_SERVER
   - do not remove #ifdef CONFIG_IOREQ_SERVER in memory.c for guarding xen/ioreq.h
   - use gdprintk in try_fwd_ioserv(), remove unneeded prints
   - update list of #include-s
   - move has_vpci() to asm-arm/domain.h
   - add a comment (TODO) to unimplemented yet handle_pio()
   - remove hvm_mmio_first(last)_byte() and hvm_ioreq_(page/vcpu/server) structs
     from the arch files, they were already moved to the common code
   - remove set_foreign_p2m_entry() changes, they will be properly implemented
     in the follow-up patch
   - select IOREQ_SERVER for Arm instead of Arm64 in Kconfig
   - remove x86's realmode and other unneeded stubs from xen/ioreq.h
   - clarify ioreq_t p.df usage in try_fwd_ioserv()
   - set ioreq_t p.count to 1 in try_fwd_ioserv()

Changes V1 -> V2:
   - was split into:
     - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
     - xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
   - update the author of a patch
   - update patch description
   - move a loop in leave_hypervisor_to_guest() to a separate patch
   - set IOREQ_SERVER disabled by default
   - remove already clarified /* XXX */
   - replace BUG() by ASSERT_UNREACHABLE() in handle_pio()
   - remove default case for handling the return value of try_handle_mmio()
   - remove struct hvm_domain, enum hvm_io_completion, struct hvm_vcpu_io,
     struct hvm_vcpu from asm-arm/domain.h, these are common materials now
   - update everything according to the recent changes (IOREQ related function
     names don't contain "hvm" prefixes/infixes anymore, IOREQ related fields
     are part of common struct vcpu/domain now, etc)

Changes V2 -> V3:
   - update patch according the "legacy interface" is x86 specific
   - add dummy arch hooks
   - remove dummy paging_mark_pfn_dirty()
   - don’t include <xen/domain_page.h> in common ioreq.c
   - don’t include <public/hvm/ioreq.h> in arch ioreq.h
   - remove #define ioreq_params(d, i)

Changes V3 -> V4:
   - rebase
   - update patch according to the renaming IO_ -> VIO_ (io_ -> vio_)
     and misc changes to arch hooks
   - update patch according to the IOREQ related dm-op handling changes
   - don't include <xen/ioreq.h> from arch header
   - make all arch hooks out-of-line
   - add a comment above IOREQ_STATUS_* #define-s
---
 xen/arch/arm/Makefile           |   2 +
 xen/arch/arm/dm.c               | 122 +++++++++++++++++++++++
 xen/arch/arm/domain.c           |   9 ++
 xen/arch/arm/io.c               |  12 ++-
 xen/arch/arm/ioreq.c            | 213 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/traps.c            |  13 +++
 xen/include/asm-arm/domain.h    |   3 +
 xen/include/asm-arm/hvm/ioreq.h |  72 ++++++++++++++
 xen/include/asm-arm/mmio.h      |   1 +
 9 files changed, 446 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/dm.c
 create mode 100644 xen/arch/arm/ioreq.c
 create mode 100644 xen/include/asm-arm/hvm/ioreq.h

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 512ffdd..16e6523 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -13,6 +13,7 @@ obj-y += cpuerrata.o
 obj-y += cpufeature.o
 obj-y += decode.o
 obj-y += device.o
+obj-$(CONFIG_IOREQ_SERVER) += dm.o
 obj-y += domain.o
 obj-y += domain_build.init.o
 obj-y += domctl.o
@@ -27,6 +28,7 @@ obj-y += guest_atomics.o
 obj-y += guest_walk.o
 obj-y += hvm.o
 obj-y += io.o
+obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
 obj-y += irq.o
 obj-y += kernel.init.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
diff --git a/xen/arch/arm/dm.c b/xen/arch/arm/dm.c
new file mode 100644
index 0000000..e6dedf4
--- /dev/null
+++ b/xen/arch/arm/dm.c
@@ -0,0 +1,122 @@
+/*
+ * Copyright (c) 2019 Arm ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/dm.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+#include <xen/ioreq.h>
+#include <xen/nospec.h>
+
+static int dm_op(const struct dmop_args *op_args)
+{
+    struct domain *d;
+    struct xen_dm_op op;
+    bool const_op = true;
+    long rc;
+    size_t offset;
+
+    static const uint8_t op_size[] = {
+        [XEN_DMOP_create_ioreq_server]              = sizeof(struct xen_dm_op_create_ioreq_server),
+        [XEN_DMOP_get_ioreq_server_info]            = sizeof(struct xen_dm_op_get_ioreq_server_info),
+        [XEN_DMOP_map_io_range_to_ioreq_server]     = sizeof(struct xen_dm_op_ioreq_server_range),
+        [XEN_DMOP_unmap_io_range_from_ioreq_server] = sizeof(struct xen_dm_op_ioreq_server_range),
+        [XEN_DMOP_set_ioreq_server_state]           = sizeof(struct xen_dm_op_set_ioreq_server_state),
+        [XEN_DMOP_destroy_ioreq_server]             = sizeof(struct xen_dm_op_destroy_ioreq_server),
+    };
+
+    rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
+    if ( rc )
+        return rc;
+
+    rc = xsm_dm_op(XSM_DM_PRIV, d);
+    if ( rc )
+        goto out;
+
+    offset = offsetof(struct xen_dm_op, u);
+
+    rc = -EFAULT;
+    if ( op_args->buf[0].size < offset )
+        goto out;
+
+    if ( copy_from_guest_offset((void *)&op, op_args->buf[0].h, 0, offset) )
+        goto out;
+
+    if ( op.op >= ARRAY_SIZE(op_size) )
+    {
+        rc = -EOPNOTSUPP;
+        goto out;
+    }
+
+    op.op = array_index_nospec(op.op, ARRAY_SIZE(op_size));
+
+    if ( op_args->buf[0].size < offset + op_size[op.op] )
+        goto out;
+
+    if ( copy_from_guest_offset((void *)&op.u, op_args->buf[0].h, offset,
+                                op_size[op.op]) )
+        goto out;
+
+    rc = -EINVAL;
+    if ( op.pad )
+        goto out;
+
+    rc = ioreq_server_dm_op(&op, d, &const_op);
+
+    if ( (!rc || rc == -ERESTART) &&
+         !const_op && copy_to_guest_offset(op_args->buf[0].h, offset,
+                                           (void *)&op.u, op_size[op.op]) )
+        rc = -EFAULT;
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+long do_dm_op(domid_t domid,
+              unsigned int nr_bufs,
+              XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
+{
+    struct dmop_args args;
+    int rc;
+
+    if ( nr_bufs > ARRAY_SIZE(args.buf) )
+        return -E2BIG;
+
+    args.domid = domid;
+    args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);
+
+    if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
+        return -EFAULT;
+
+    rc = dm_op(&args);
+
+    if ( rc == -ERESTART )
+        rc = hypercall_create_continuation(__HYPERVISOR_dm_op, "iih",
+                                           domid, nr_bufs, bufs);
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 18cafcd..8f55aba 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -15,6 +15,7 @@
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/init.h>
+#include <xen/ioreq.h>
 #include <xen/lib.h>
 #include <xen/livepatch.h>
 #include <xen/sched.h>
@@ -696,6 +697,10 @@ int arch_domain_create(struct domain *d,
 
     ASSERT(config != NULL);
 
+#ifdef CONFIG_IOREQ_SERVER
+    ioreq_domain_init(d);
+#endif
+
     /* p2m_init relies on some value initialized by the IOMMU subsystem */
     if ( (rc = iommu_domain_init(d, config->iommu_opts)) != 0 )
         goto fail;
@@ -1014,6 +1019,10 @@ int domain_relinquish_resources(struct domain *d)
         if (ret )
             return ret;
 
+#ifdef CONFIG_IOREQ_SERVER
+        ioreq_server_destroy_all(d);
+#endif
+
     PROGRESS(xen):
         ret = relinquish_memory(d, &d->xenpage_list);
         if ( ret )
diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
index ae7ef96..9814481 100644
--- a/xen/arch/arm/io.c
+++ b/xen/arch/arm/io.c
@@ -16,6 +16,7 @@
  * GNU General Public License for more details.
  */
 
+#include <xen/ioreq.h>
 #include <xen/lib.h>
 #include <xen/spinlock.h>
 #include <xen/sched.h>
@@ -23,6 +24,7 @@
 #include <asm/cpuerrata.h>
 #include <asm/current.h>
 #include <asm/mmio.h>
+#include <asm/hvm/ioreq.h>
 
 #include "decode.h"
 
@@ -123,7 +125,15 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs,
 
     handler = find_mmio_handler(v->domain, info.gpa);
     if ( !handler )
-        return IO_UNHANDLED;
+    {
+        int rc;
+
+        rc = try_fwd_ioserv(regs, v, &info);
+        if ( rc == IO_HANDLED )
+            return handle_ioserv(regs, v);
+
+        return rc;
+    }
 
     /* All the instructions used on emulated MMIO region should be valid */
     if ( !dabt.valid )
diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
new file mode 100644
index 0000000..3c4a24d
--- /dev/null
+++ b/xen/arch/arm/ioreq.c
@@ -0,0 +1,213 @@
+/*
+ * arm/ioreq.c: hardware virtual machine I/O emulation
+ *
+ * Copyright (c) 2019 Arm ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/domain.h>
+#include <xen/ioreq.h>
+
+#include <asm/traps.h>
+
+#include <public/hvm/ioreq.h>
+
+enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
+{
+    const union hsr hsr = { .bits = regs->hsr };
+    const struct hsr_dabt dabt = hsr.dabt;
+    /* Code is similar to handle_read */
+    uint8_t size = (1 << dabt.size) * 8;
+    register_t r = v->io.req.data;
+
+    /* We are done with the IO */
+    v->io.req.state = STATE_IOREQ_NONE;
+
+    if ( dabt.write )
+        return IO_HANDLED;
+
+    /*
+     * Sign extend if required.
+     * Note that we expect the read handler to have zeroed the bits
+     * outside the requested access size.
+     */
+    if ( dabt.sign && (r & (1UL << (size - 1))) )
+    {
+        /*
+         * We are relying on register_t using the same as
+         * an unsigned long in order to keep the 32-bit assembly
+         * code smaller.
+         */
+        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
+        r |= (~0UL) << size;
+    }
+
+    set_user_reg(regs, dabt.reg, r);
+
+    return IO_HANDLED;
+}
+
+enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
+                             struct vcpu *v, mmio_info_t *info)
+{
+    struct vcpu_io *vio = &v->io;
+    ioreq_t p = {
+        .type = IOREQ_TYPE_COPY,
+        .addr = info->gpa,
+        .size = 1 << info->dabt.size,
+        .count = 1,
+        .dir = !info->dabt.write,
+        /*
+         * On x86, df is used by the 'rep' instruction to tell the direction
+         * to iterate (forward or backward).
+         * On Arm, all the accesses to MMIO regions will do a single
+         * memory access. So for now, we can safely always set it to 0.
+         */
+        .df = 0,
+        .data = get_user_reg(regs, info->dabt.reg),
+        .state = STATE_IOREQ_READY,
+    };
+    struct ioreq_server *s = NULL;
+    enum io_state rc;
+
+    switch ( vio->req.state )
+    {
+    case STATE_IOREQ_NONE:
+        break;
+
+    case STATE_IORESP_READY:
+        return IO_HANDLED;
+
+    default:
+        gdprintk(XENLOG_ERR, "wrong state %u\n", vio->req.state);
+        return IO_ABORT;
+    }
+
+    s = ioreq_server_select(v->domain, &p);
+    if ( !s )
+        return IO_UNHANDLED;
+
+    if ( !info->dabt.valid )
+        return IO_ABORT;
+
+    vio->req = p;
+
+    rc = ioreq_send(s, &p, 0);
+    if ( rc != IO_RETRY || v->domain->is_shutting_down )
+        vio->req.state = STATE_IOREQ_NONE;
+    else if ( !ioreq_needs_completion(&vio->req) )
+        rc = IO_HANDLED;
+    else
+        vio->completion = VIO_mmio_completion;
+
+    return rc;
+}
+
+bool arch_ioreq_complete_mmio(void)
+{
+    struct vcpu *v = current;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    const union hsr hsr = { .bits = regs->hsr };
+    paddr_t addr = v->io.req.addr;
+
+    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
+    {
+        advance_pc(regs, hsr);
+        return true;
+    }
+
+    return false;
+}
+
+bool arch_vcpu_ioreq_completion(enum vio_completion completion)
+{
+    ASSERT_UNREACHABLE();
+    return true;
+}
+
+/*
+ * The "legacy" mechanism of mapping magic pages for the IOREQ servers
+ * is x86 specific, so the following hooks don't need to be implemented on Arm:
+ * - arch_ioreq_server_map_pages
+ * - arch_ioreq_server_unmap_pages
+ * - arch_ioreq_server_enable
+ * - arch_ioreq_server_disable
+ */
+int arch_ioreq_server_map_pages(struct ioreq_server *s)
+{
+    return -EOPNOTSUPP;
+}
+
+void arch_ioreq_server_unmap_pages(struct ioreq_server *s)
+{
+}
+
+void arch_ioreq_server_enable(struct ioreq_server *s)
+{
+}
+
+void arch_ioreq_server_disable(struct ioreq_server *s)
+{
+}
+
+void arch_ioreq_server_destroy(struct ioreq_server *s)
+{
+}
+
+int arch_ioreq_server_map_mem_type(struct domain *d,
+                                   struct ioreq_server *s,
+                                   uint32_t flags)
+{
+    return -EOPNOTSUPP;
+}
+
+void arch_ioreq_server_map_mem_type_completed(struct domain *d,
+                                              struct ioreq_server *s,
+                                              uint32_t flags)
+{
+}
+
+bool arch_ioreq_server_destroy_all(struct domain *d)
+{
+    return true;
+}
+
+bool arch_ioreq_server_get_type_addr(const struct domain *d,
+                                     const ioreq_t *p,
+                                     uint8_t *type,
+                                     uint64_t *addr)
+{
+    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
+        return false;
+
+    *type = (p->type == IOREQ_TYPE_PIO) ?
+             XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
+    *addr = p->addr;
+
+    return true;
+}
+
+void arch_ioreq_domain_init(struct domain *d)
+{
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 22bd1bd..036b13f 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -21,6 +21,7 @@
 #include <xen/hypercall.h>
 #include <xen/init.h>
 #include <xen/iocap.h>
+#include <xen/ioreq.h>
 #include <xen/irq.h>
 #include <xen/lib.h>
 #include <xen/mem_access.h>
@@ -1385,6 +1386,9 @@ static arm_hypercall_t arm_hypercall_table[] = {
 #ifdef CONFIG_HYPFS
     HYPERCALL(hypfs_op, 5),
 #endif
+#ifdef CONFIG_IOREQ_SERVER
+    HYPERCALL(dm_op, 3),
+#endif
 };
 
 #ifndef NDEBUG
@@ -1956,6 +1960,9 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
             case IO_HANDLED:
                 advance_pc(regs, hsr);
                 return;
+            case IO_RETRY:
+                /* finish later */
+                return;
             case IO_UNHANDLED:
                 /* IO unhandled, try another way to handle it. */
                 break;
@@ -2254,6 +2261,12 @@ static void check_for_vcpu_work(void)
 {
     struct vcpu *v = current;
 
+#ifdef CONFIG_IOREQ_SERVER
+    local_irq_enable();
+    vcpu_ioreq_handle_completion(v);
+    local_irq_disable();
+#endif
+
     if ( likely(!v->arch.need_flush_to_ram) )
         return;
 
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 6819a3b..c235e5b 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -10,6 +10,7 @@
 #include <asm/gic.h>
 #include <asm/vgic.h>
 #include <asm/vpl011.h>
+#include <public/hvm/dm_op.h>
 #include <public/hvm/params.h>
 
 struct hvm_domain
@@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
 
 #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
 
+#define has_vpci(d)    ({ (void)(d); false; })
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/include/asm-arm/hvm/ioreq.h b/xen/include/asm-arm/hvm/ioreq.h
new file mode 100644
index 0000000..19e1247
--- /dev/null
+++ b/xen/include/asm-arm/hvm/ioreq.h
@@ -0,0 +1,72 @@
+/*
+ * hvm.h: Hardware virtual machine assist interface definitions.
+ *
+ * Copyright (c) 2016 Citrix Systems Inc.
+ * Copyright (c) 2019 Arm ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_ARM_HVM_IOREQ_H__
+#define __ASM_ARM_HVM_IOREQ_H__
+
+#ifdef CONFIG_IOREQ_SERVER
+enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v);
+enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
+                             struct vcpu *v, mmio_info_t *info);
+#else
+static inline enum io_state handle_ioserv(struct cpu_user_regs *regs,
+                                          struct vcpu *v)
+{
+    return IO_UNHANDLED;
+}
+
+static inline enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
+                                           struct vcpu *v, mmio_info_t *info)
+{
+    return IO_UNHANDLED;
+}
+#endif
+
+bool ioreq_complete_mmio(void);
+
+static inline bool handle_pio(uint16_t port, unsigned int size, int dir)
+{
+    /*
+     * TODO: For Arm64, the main user will be PCI. So this should be
+     * implemented when we add support for vPCI.
+     */
+    ASSERT_UNREACHABLE();
+    return true;
+}
+
+static inline void msix_write_completion(struct vcpu *v)
+{
+}
+
+/* This correlation must not be altered */
+#define IOREQ_STATUS_HANDLED     IO_HANDLED
+#define IOREQ_STATUS_UNHANDLED   IO_UNHANDLED
+#define IOREQ_STATUS_RETRY       IO_RETRY
+
+#endif /* __ASM_ARM_HVM_IOREQ_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/mmio.h b/xen/include/asm-arm/mmio.h
index 8dbfb27..7ab873c 100644
--- a/xen/include/asm-arm/mmio.h
+++ b/xen/include/asm-arm/mmio.h
@@ -37,6 +37,7 @@ enum io_state
     IO_ABORT,       /* The IO was handled by the helper and led to an abort. */
     IO_HANDLED,     /* The IO was successfully handled by the helper. */
     IO_UNHANDLED,   /* The IO was not handled by the helper. */
+    IO_RETRY,       /* The IO was deferred; the emulation must be retried later. */
 };
 
 typedef int (*mmio_read_t)(struct vcpu *v, mmio_info_t *info,
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (13 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  1:12   ` Stefano Stabellini
  2021-01-15 20:55   ` Julien Grall
  2021-01-12 21:52 ` [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm Oleksandr Tyshchenko
                   ` (9 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch adds proper handling of return value of
vcpu_ioreq_handle_completion() which involves using a loop in
leave_hypervisor_to_guest().

The reason to use an unbounded loop here is the fact that a vCPU
shouldn't continue until the I/O has completed.

The IOREQ code is using wait_on_xen_event_channel(). Yet, this can
still "exit" early if an event has been received. But this doesn't mean
the I/O has completed (it can be just a spurious wake-up). So we need
to check if the I/O has completed and wait again if it hasn't (we will
block the vCPU again until an event is received). This loop makes sure
that all the vCPU work is done before we return to the guest.

The call chain below:
check_for_vcpu_work -> vcpu_ioreq_handle_completion -> wait_for_io ->
wait_on_xen_event_channel

The worst that can happen here is that the vCPU never runs again
(the I/O never completes). But, in Xen's case, if the I/O never
completes then it most likely means that something went horribly
wrong with the Device Emulator. And it is most likely not safe
to continue. So letting the vCPU spin forever if the I/O never
completes is a safer action than letting it continue and leaving
the guest in an unclear state, and is the best we can do for now.

Please note, using this loop we will not spin forever on a pCPU,
preventing any other vCPUs from being scheduled. At every iteration
we will call check_for_pcpu_work() that will process pending
softirqs. In case of failure, the guest will crash and the vCPU
will be unscheduled. In the normal case, if rescheduling is necessary
(it might be requested by a timer or by a caller in check_for_vcpu_work(),
where wait_for_io() is a preemption point), the vCPU will be rescheduled
to give way to someone else.
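
To illustrate the spurious wake-up point, here is a simplified sketch
(not the literal common IOREQ code) of the wait performed by
wait_for_io(), where "p" is the vCPU's ioreq and "sv" its per-server
state:

    /*
     * Block the vCPU until the emulator signals completion. The event
     * channel can fire for other reasons, so re-check the state and
     * wait again on a spurious wake-up.
     */
    while ( ACCESS_ONCE(p->state) != STATE_IORESP_READY )
        wait_on_xen_event_channel(sv->ioreq_evtchn,
                                  ACCESS_ONCE(p->state) == STATE_IORESP_READY);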

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes V1 -> V2:
   - new patch, changes were derived from (+ new explanation):
     arm/ioreq: Introduce arch specific bits for IOREQ/DM features

Changes V2 -> V3:
   - update patch description

Changes V3 -> V4:
   - update patch description and comment in code
---
 xen/arch/arm/traps.c | 38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 036b13f..4a83e1e 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2257,18 +2257,23 @@ static void check_for_pcpu_work(void)
  * Process pending work for the vCPU. Any call should be fast or
  * implement preemption.
  */
-static void check_for_vcpu_work(void)
+static bool check_for_vcpu_work(void)
 {
     struct vcpu *v = current;
 
 #ifdef CONFIG_IOREQ_SERVER
+    bool handled;
+
     local_irq_enable();
-    vcpu_ioreq_handle_completion(v);
+    handled = vcpu_ioreq_handle_completion(v);
     local_irq_disable();
+
+    if ( !handled )
+        return true;
 #endif
 
     if ( likely(!v->arch.need_flush_to_ram) )
-        return;
+        return false;
 
     /*
      * Give a chance for the pCPU to process work before handling the vCPU
@@ -2279,6 +2284,8 @@ static void check_for_vcpu_work(void)
     local_irq_enable();
     p2m_flush_vm(v);
     local_irq_disable();
+
+    return false;
 }
 
 /*
@@ -2291,8 +2298,29 @@ void leave_hypervisor_to_guest(void)
 {
     local_irq_disable();
 
-    check_for_vcpu_work();
-    check_for_pcpu_work();
+    /*
+     * The reason to use an unbounded loop here is the fact that a vCPU
+     * shouldn't continue until the I/O has completed.
+     *
+     * The worst that can happen here is that the vCPU never runs again
+     * (the I/O never completes). But, in Xen's case, if the I/O never
+     * completes then it most likely means that something went horribly
+     * wrong with the Device Emulator. And it is most likely not safe
+     * to continue. So letting the vCPU spin forever if the I/O never
+     * completes is a safer action than letting it continue and leaving
+     * the guest in an unclear state, and is the best we can do for now.
+     *
+     * Please note, using this loop we will not spin forever on a pCPU,
+     * preventing any other vCPUs from being scheduled. At every iteration
+     * we will call check_for_pcpu_work() that will process pending
+     * softirqs. In case of failure, the guest will crash and the vCPU
+     * will be unscheduled. In the normal case, if rescheduling is needed
+     * (e.g. requested by a timer, or in check_for_vcpu_work() where
+     * wait_for_io() is a preemption point), the vCPU gives way to others.
+     */
+    do {
+        check_for_pcpu_work();
+    } while ( check_for_vcpu_work() );
 
     vgic_sync_to_lrs();
 
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (14 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  1:19   ` Stefano Stabellini
                     ` (2 more replies)
  2021-01-12 21:52 ` [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server() Oleksandr Tyshchenko
                   ` (8 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Wei Liu, Roger Pau Monné,
	Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch implements reference counting of foreign entries in
set_foreign_p2m_entry() on Arm. This is a mandatory action if
we want to run an emulator (IOREQ server) in a domain other than
dom0, as we can't trust it to do the right thing if it is not
running in dom0. So we need to grab a reference on the page to
avoid it disappearing.

It is valid to always pass the "p2m_map_foreign_rw" type to
guest_physmap_add_entry() since the current and foreign domains
are always different. A case when they are equal would be
rejected by rcu_lock_remote_domain_by_id(). Besides the similar
comment in the code, put a respective ASSERT() to catch incorrect
usage in the future.

It was tested with the IOREQ feature to confirm that all the pages
given to this function belong to a domain, so we can use the same
approach as for XENMAPSPACE_gmfn_foreign handling in
xenmem_add_to_physmap_one().

This involves adding an extra parameter for the foreign domain to
set_foreign_p2m_entry() and a helper to indicate whether the arch
supports the reference counting of foreign entries, so that the
restriction to the hardware domain in the common code can be
skipped on such architectures.
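
For completeness, the reference taken here is dropped when the foreign
entry is later removed from the p2m. A simplified sketch of the
balancing release, modelled on Arm's existing p2m_put_l3_page():

    /*
     * Removing a foreign entry releases the reference taken in
     * set_foreign_p2m_entry(), so the page may now disappear.
     */
    if ( p2m_is_foreign(pte.p2m.type) )
    {
        mfn_t mfn = lpae_get_mfn(pte);

        ASSERT(mfn_valid(mfn));
        put_page(mfn_to_page(mfn));
    }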

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch, was split from:
     "[RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features"
   - rewrite the logic to properly handle the reference in set_foreign_p2m_entry()
     instead of treating foreign entries as p2m_ram_rw

Changes V1 -> V2:
   - rebase according to the recent changes to acquire_resource()
   - update patch description
   - introduce arch_refcounts_p2m()
   - add an explanation why p2m_map_foreign_rw is valid
   - move set_foreign_p2m_entry() to p2m-common.h
   - add const to new parameter

Changes V2 -> V3:
   - update patch description
   - rename arch_refcounts_p2m() to arch_acquire_resource_check()
   - move comment to x86’s arch_acquire_resource_check()
   - return rc in Arm's set_foreign_p2m_entry()
   - put a respective ASSERT() into Arm's set_foreign_p2m_entry()

Changes V3 -> V4:
   - update arch_acquire_resource_check() implementation on x86
     and common code which uses it, pass struct domain to the function
   - put ASSERT() to x86/Arm set_foreign_p2m_entry()
   - use arch_acquire_resource_check() in p2m_add_foreign()
     instead of open-coding it
---
 xen/arch/arm/p2m.c           | 26 ++++++++++++++++++++++++++
 xen/arch/x86/mm/p2m.c        |  9 ++++++---
 xen/common/memory.c          |  9 ++-------
 xen/include/asm-arm/p2m.h    | 19 +++++++++----------
 xen/include/asm-x86/p2m.h    | 19 ++++++++++++++++---
 xen/include/xen/p2m-common.h |  4 ++++
 6 files changed, 63 insertions(+), 23 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 4eeb867..d41c4fa 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1380,6 +1380,32 @@ int guest_physmap_remove_page(struct domain *d, gfn_t gfn, mfn_t mfn,
     return p2m_remove_mapping(d, gfn, (1 << page_order), mfn);
 }
 
+int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
+                          unsigned long gfn, mfn_t mfn)
+{
+    struct page_info *page = mfn_to_page(mfn);
+    int rc;
+
+    ASSERT(arch_acquire_resource_check(d));
+
+    if ( !get_page(page, fd) )
+        return -EINVAL;
+
+    /*
+     * It is valid to always use p2m_map_foreign_rw here because, if this
+     * gets called, then d != fd. A case when d == fd would be rejected by
+     * rcu_lock_remote_domain_by_id() earlier. Put a respective ASSERT()
+     * to catch incorrect usage in the future.
+     */
+    ASSERT(d != fd);
+
+    rc = guest_physmap_add_entry(d, _gfn(gfn), mfn, 0, p2m_map_foreign_rw);
+    if ( rc )
+        put_page(page);
+
+    return rc;
+}
+
 static struct page_info *p2m_allocate_root(void)
 {
     struct page_info *page;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 71fda06..cbeea85 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1323,8 +1323,11 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn_l,
 }
 
 /* Set foreign mfn in the given guest's p2m table. */
-int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
+                          unsigned long gfn, mfn_t mfn)
 {
+    ASSERT(arch_acquire_resource_check(d));
+
     return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
                                p2m_get_hostp2m(d)->default_access);
 }
@@ -2579,7 +2582,7 @@ static int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
      * hvm fixme: until support is added to p2m teardown code to cleanup any
      * foreign entries, limit this to hardware domain only.
      */
-    if ( !is_hardware_domain(tdom) )
+    if ( !arch_acquire_resource_check(tdom) )
         return -EPERM;
 
     if ( foreigndom == DOMID_XEN )
@@ -2635,7 +2638,7 @@ static int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
      * will update the m2p table which will result in  mfn -> gpfn of dom0
      * and not fgfn of domU.
      */
-    rc = set_foreign_p2m_entry(tdom, gpfn, mfn);
+    rc = set_foreign_p2m_entry(tdom, fdom, gpfn, mfn);
     if ( rc )
         gdprintk(XENLOG_WARNING, "set_foreign_p2m_entry failed. "
                  "gpfn:%lx mfn:%lx fgfn:%lx td:%d fd:%d\n",
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 66828d9..d625a9b 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1138,12 +1138,7 @@ static int acquire_resource(
     xen_pfn_t mfn_list[32];
     int rc;
 
-    /*
-     * FIXME: Until foreign pages inserted into the P2M are properly
-     *        reference counted, it is unsafe to allow mapping of
-     *        resource pages unless the caller is the hardware domain.
-     */
-    if ( paging_mode_translate(currd) && !is_hardware_domain(currd) )
+    if ( !arch_acquire_resource_check(currd) )
         return -EACCES;
 
     if ( copy_from_guest(&xmar, arg, 1) )
@@ -1211,7 +1206,7 @@ static int acquire_resource(
 
         for ( i = 0; !rc && i < xmar.nr_frames; i++ )
         {
-            rc = set_foreign_p2m_entry(currd, gfn_list[i],
+            rc = set_foreign_p2m_entry(currd, d, gfn_list[i],
                                        _mfn(mfn_list[i]));
             /* rc should be -EIO for any iteration other than the first */
             if ( rc && i )
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 28ca9a8..4f8b3b0 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -161,6 +161,15 @@ typedef enum {
 #endif
 #include <xen/p2m-common.h>
 
+static inline bool arch_acquire_resource_check(struct domain *d)
+{
+    /*
+     * The reference counting of foreign entries in set_foreign_p2m_entry()
+     * is supported on Arm.
+     */
+    return true;
+}
+
 static inline
 void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
 {
@@ -392,16 +401,6 @@ static inline gfn_t gfn_next_boundary(gfn_t gfn, unsigned int order)
     return gfn_add(gfn, 1UL << order);
 }
 
-static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
-                                        mfn_t mfn)
-{
-    /*
-     * NOTE: If this is implemented then proper reference counting of
-     *       foreign entries will need to be implemented.
-     */
-    return -EOPNOTSUPP;
-}
-
 /*
  * A vCPU has cache enabled only when the MMU is enabled and data cache
  * is enabled.
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 7df2878..1d64c12 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -382,6 +382,22 @@ struct p2m_domain {
 #endif
 #include <xen/p2m-common.h>
 
+static inline bool arch_acquire_resource_check(struct domain *d)
+{
+    /*
+     * The reference counting of foreign entries in set_foreign_p2m_entry()
+     * is not supported for translated domains on x86.
+     *
+     * FIXME: Until foreign pages inserted into the P2M are properly
+     * reference counted, it is unsafe to allow mapping of
+     * resource pages unless the caller is the hardware domain.
+     */
+    if ( paging_mode_translate(d) && !is_hardware_domain(d) )
+        return false;
+
+    return true;
+}
+
 /*
  * Updates vCPU's n2pm to match its np2m_base in VMCx12 and returns that np2m.
  */
@@ -647,9 +663,6 @@ int p2m_finish_type_change(struct domain *d,
 int p2m_is_logdirty_range(struct p2m_domain *, unsigned long start,
                           unsigned long end);
 
-/* Set foreign entry in the p2m table (for priv-mapping) */
-int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
-
 /* Set mmio addresses in the p2m table (for pass-through) */
 int set_mmio_p2m_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
                        unsigned int order);
diff --git a/xen/include/xen/p2m-common.h b/xen/include/xen/p2m-common.h
index 58031a6..b4bc709 100644
--- a/xen/include/xen/p2m-common.h
+++ b/xen/include/xen/p2m-common.h
@@ -3,6 +3,10 @@
 
 #include <xen/mm.h>
 
+/* Set foreign entry in the p2m table */
+int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
+                          unsigned long gfn, mfn_t mfn);
+
 /* Remove a page from a domain's p2m table */
 int __must_check
 guest_physmap_remove_page(struct domain *d, gfn_t gfn, mfn_t mfn,
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server()
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (15 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  1:24   ` Stefano Stabellini
  2021-01-18 10:23   ` Paul Durrant
  2021-01-12 21:52 ` [PATCH V4 18/24] xen/dm: Introduce xendevicemodel_set_irq_level DM op Oleksandr Tyshchenko
                   ` (7 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Paul Durrant, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch introduces a helper whose main purpose is to check
whether a domain is using IOREQ server(s).

On Arm the current benefit is to avoid calling vcpu_ioreq_handle_completion()
(which implies iterating over all possible IOREQ servers anyway)
on every return in leave_hypervisor_to_guest() if there are no active
servers for the particular domain.
This helper will also be used by one of the subsequent patches on Arm.
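
The helper can return as soon as the iterator yields an entry, because
FOR_EACH_IOREQ_SERVER() skips empty slots. Its shape in the common code
is roughly the following (simplified from xen/common/ioreq.c):

    /*
     * Iterate over populated IOREQ server slots only; NULL entries are
     * skipped, so entering the loop body means a server exists.
     */
    #define FOR_EACH_IOREQ_SERVER(d, id, s)             \
        for ( (id) = MAX_NR_IOREQ_SERVERS; (id) != 0; ) \
            if ( !(s = GET_IOREQ_SERVER(d, --(id))) )   \
                continue;                               \
            else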

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - new patch

Changes V1 -> V2:
   - update patch description
   - guard helper with CONFIG_IOREQ_SERVER
   - remove "hvm" prefix
   - modify helper to just return d->arch.hvm.ioreq_server.nr_servers
   - put suitable ASSERT()s
   - use ASSERT(d->ioreq_server.server[id] ? !s : !!s) in set_ioreq_server()
   - remove d->ioreq_server.nr_servers = 0 from hvm_ioreq_init()

Changes V2 -> V3:
   - update patch description
   - remove ASSERT()s from the helper, add a comment
   - use #ifdef CONFIG_IOREQ_SERVER inside function body
   - use new ASSERT() construction in set_ioreq_server()

Changes V3 -> V4:
   - update patch description
   - drop per-domain variable "nr_servers"
   - reimplement a helper to count the non-NULL entries
   - make the helper out-of-line
---
 xen/arch/arm/traps.c    | 15 +++++++++------
 xen/common/ioreq.c      | 16 ++++++++++++++++
 xen/include/xen/ioreq.h |  2 ++
 3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 4a83e1e..35094d8 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2262,14 +2262,17 @@ static bool check_for_vcpu_work(void)
     struct vcpu *v = current;
 
 #ifdef CONFIG_IOREQ_SERVER
-    bool handled;
+    if ( domain_has_ioreq_server(v->domain) )
+    {
+        bool handled;
 
-    local_irq_enable();
-    handled = vcpu_ioreq_handle_completion(v);
-    local_irq_disable();
+        local_irq_enable();
+        handled = vcpu_ioreq_handle_completion(v);
+        local_irq_disable();
 
-    if ( !handled )
-        return true;
+        if ( !handled )
+            return true;
+    }
 #endif
 
     if ( likely(!v->arch.need_flush_to_ram) )
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index d5f4dd3..59f4990 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -80,6 +80,22 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
     return &p->vcpu_ioreq[v->vcpu_id];
 }
 
+/*
+ * This should only be used when d == current->domain or when they're
+ * distinct and d is paused. Otherwise the result is stale before
+ * the caller can inspect it.
+ */
+bool domain_has_ioreq_server(const struct domain *d)
+{
+    const struct ioreq_server *s;
+    unsigned int id;
+
+    FOR_EACH_IOREQ_SERVER(d, id, s)
+        return true;
+
+    return false;
+}
+
 static struct ioreq_vcpu *get_pending_vcpu(const struct vcpu *v,
                                            struct ioreq_server **srvp)
 {
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index ec7e98d..f0908af 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -81,6 +81,8 @@ static inline bool ioreq_needs_completion(const ioreq_t *ioreq)
 #define HANDLE_BUFIOREQ(s) \
     ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
 
+bool domain_has_ioreq_server(const struct domain *d);
+
 bool vcpu_ioreq_pending(struct vcpu *v);
 bool vcpu_ioreq_handle_completion(struct vcpu *v);
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 18/24] xen/dm: Introduce xendevicemodel_set_irq_level DM op
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (16 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server() Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  1:32   ` Stefano Stabellini
  2021-01-12 21:52 ` [PATCH V4 19/24] xen/arm: io: Abstract sign-extension Oleksandr Tyshchenko
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Julien Grall, Ian Jackson, Wei Liu, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko

From: Julien Grall <julien.grall@arm.com>

This patch adds the ability for the device emulator to notify the
other end (some entity running in the guest) using an SPI and
implements the Arm specific bits for it. The proposed interface
allows the emulator to set the logical level of one of a domain's
IRQ lines.

We can't reuse the existing DM op (xen_dm_op_set_isa_irq_level)
to inject an interrupt as the "isa_irq" field is only 8-bit and
able to cover IRQ 0 - 255, whereas we need a wider range (0 - 1020).

Please note, for an edge-triggered interrupt (which is used for
the virtio-mmio emulation) we only trigger the interrupt on Arm
if the level is asserted (rising edge) and do nothing if the level
is deasserted (falling edge), so the call could be named "trigger_irq"
(without the level parameter). But, in order to model the line closely
(to be able to support level-triggered interrupts) we need to know
whether the line is low or high, so the proposed interface has been
chosen. However, it is worth mentioning that in the case of a
level-triggered interrupt, we should keep injecting the interrupt
into the guest until the line is deasserted (this is not covered by
the current patch).
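
As a usage sketch (assuming a libxendevicemodel handle opened elsewhere
and a made-up SPI number), an emulator modelling an edge-triggered
virtio-mmio interrupt would pulse the line like this:

    /* Rising edge: Xen injects the SPI into the guest on this call. */
    xendevicemodel_set_irq_level(dmod, domid, 33 /* hypothetical SPI */, 1);

    /* Falling edge: nothing is injected; this just models the line low. */
    xendevicemodel_set_irq_level(dmod, domid, 33 /* hypothetical SPI */, 0);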

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - check incoming parameters in arch_dm_op()
   - add explicit padding to struct xen_dm_op_set_irq_level

Changes V1 -> V2:
   - update the author of a patch
   - update patch description
   - check that padding is always 0
   - mention that interface is Arm only and only SPIs are
     supported for now
   - allow to set the logical level of a line for non-allocated
     interrupts only
   - add xen_dm_op_set_irq_level_t

Changes V2 -> V3:
   - no changes

Changes V3 -> V4:
   - update patch description
   - update patch according to the IOREQ related dm-op handling changes
---
 tools/include/xendevicemodel.h               |  4 +++
 tools/libs/devicemodel/core.c                | 18 ++++++++++
 tools/libs/devicemodel/libxendevicemodel.map |  1 +
 xen/arch/arm/dm.c                            | 54 +++++++++++++++++++++++++++-
 xen/include/public/hvm/dm_op.h               | 16 +++++++++
 5 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/tools/include/xendevicemodel.h b/tools/include/xendevicemodel.h
index e877f5c..c06b3c8 100644
--- a/tools/include/xendevicemodel.h
+++ b/tools/include/xendevicemodel.h
@@ -209,6 +209,10 @@ int xendevicemodel_set_isa_irq_level(
     xendevicemodel_handle *dmod, domid_t domid, uint8_t irq,
     unsigned int level);
 
+int xendevicemodel_set_irq_level(
+    xendevicemodel_handle *dmod, domid_t domid, unsigned int irq,
+    unsigned int level);
+
 /**
  * This function maps a PCI INTx line to a an IRQ line.
  *
diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
index 4d40639..30bd79f 100644
--- a/tools/libs/devicemodel/core.c
+++ b/tools/libs/devicemodel/core.c
@@ -430,6 +430,24 @@ int xendevicemodel_set_isa_irq_level(
     return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
 }
 
+int xendevicemodel_set_irq_level(
+    xendevicemodel_handle *dmod, domid_t domid, unsigned int irq,
+    unsigned int level)
+{
+    struct xen_dm_op op;
+    struct xen_dm_op_set_irq_level *data;
+
+    memset(&op, 0, sizeof(op));
+
+    op.op = XEN_DMOP_set_irq_level;
+    data = &op.u.set_irq_level;
+
+    data->irq = irq;
+    data->level = level;
+
+    return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
+}
+
 int xendevicemodel_set_pci_link_route(
     xendevicemodel_handle *dmod, domid_t domid, uint8_t link, uint8_t irq)
 {
diff --git a/tools/libs/devicemodel/libxendevicemodel.map b/tools/libs/devicemodel/libxendevicemodel.map
index 561c62d..a0c3012 100644
--- a/tools/libs/devicemodel/libxendevicemodel.map
+++ b/tools/libs/devicemodel/libxendevicemodel.map
@@ -32,6 +32,7 @@ VERS_1.2 {
 	global:
 		xendevicemodel_relocate_memory;
 		xendevicemodel_pin_memory_cacheattr;
+		xendevicemodel_set_irq_level;
 } VERS_1.1;
 
 VERS_1.3 {
diff --git a/xen/arch/arm/dm.c b/xen/arch/arm/dm.c
index e6dedf4..804830a 100644
--- a/xen/arch/arm/dm.c
+++ b/xen/arch/arm/dm.c
@@ -20,6 +20,8 @@
 #include <xen/ioreq.h>
 #include <xen/nospec.h>
 
+#include <asm/vgic.h>
+
 static int dm_op(const struct dmop_args *op_args)
 {
     struct domain *d;
@@ -35,6 +37,7 @@ static int dm_op(const struct dmop_args *op_args)
         [XEN_DMOP_unmap_io_range_from_ioreq_server] = sizeof(struct xen_dm_op_ioreq_server_range),
         [XEN_DMOP_set_ioreq_server_state]           = sizeof(struct xen_dm_op_set_ioreq_server_state),
         [XEN_DMOP_destroy_ioreq_server]             = sizeof(struct xen_dm_op_destroy_ioreq_server),
+        [XEN_DMOP_set_irq_level]                    = sizeof(struct xen_dm_op_set_irq_level),
     };
 
     rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
@@ -73,7 +76,56 @@ static int dm_op(const struct dmop_args *op_args)
     if ( op.pad )
         goto out;
 
-    rc = ioreq_server_dm_op(&op, d, &const_op);
+    switch ( op.op )
+    {
+    case XEN_DMOP_set_irq_level:
+    {
+        const struct xen_dm_op_set_irq_level *data =
+            &op.u.set_irq_level;
+        unsigned int i;
+
+        /* Only SPIs are supported */
+        if ( (data->irq < NR_LOCAL_IRQS) || (data->irq >= vgic_num_irqs(d)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        if ( data->level != 0 && data->level != 1 )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        /* Check that padding is always 0 */
+        for ( i = 0; i < sizeof(data->pad); i++ )
+        {
+            if ( data->pad[i] )
+                break;
+        }
+
+        /* A break in the loop above only exits the loop, so check here */
+        if ( i != sizeof(data->pad) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        /*
+         * Allow setting the logical level of a line for non-allocated
+         * interrupts only.
+         */
+        if ( test_bit(data->irq, d->arch.vgic.allocated_irqs) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        vgic_inject_irq(d, NULL, data->irq, data->level);
+        rc = 0;
+        break;
+    }
+
+    default:
+        rc = ioreq_server_dm_op(&op, d, &const_op);
+        break;
+    }
 
     if ( (!rc || rc == -ERESTART) &&
          !const_op && copy_to_guest_offset(op_args->buf[0].h, offset,
diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
index 66cae1a..1f70d58 100644
--- a/xen/include/public/hvm/dm_op.h
+++ b/xen/include/public/hvm/dm_op.h
@@ -434,6 +434,21 @@ struct xen_dm_op_pin_memory_cacheattr {
 };
 typedef struct xen_dm_op_pin_memory_cacheattr xen_dm_op_pin_memory_cacheattr_t;
 
+/*
+ * XEN_DMOP_set_irq_level: Set the logical level of one of a domain's
+ *                         IRQ lines (currently Arm only).
+ * Only SPIs are supported.
+ */
+#define XEN_DMOP_set_irq_level 19
+
+struct xen_dm_op_set_irq_level {
+    uint32_t irq;
+    /* IN - Level: 0 -> deasserted, 1 -> asserted */
+    uint8_t level;
+    uint8_t pad[3];
+};
+typedef struct xen_dm_op_set_irq_level xen_dm_op_set_irq_level_t;
+
 struct xen_dm_op {
     uint32_t op;
     uint32_t pad;
@@ -447,6 +462,7 @@ struct xen_dm_op {
         xen_dm_op_track_dirty_vram_t track_dirty_vram;
         xen_dm_op_set_pci_intx_level_t set_pci_intx_level;
         xen_dm_op_set_isa_irq_level_t set_isa_irq_level;
+        xen_dm_op_set_irq_level_t set_irq_level;
         xen_dm_op_set_pci_link_route_t set_pci_link_route;
         xen_dm_op_modified_memory_t modified_memory;
         xen_dm_op_set_mem_type_t set_mem_type;
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 19/24] xen/arm: io: Abstract sign-extension
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (17 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 18/24] xen/dm: Introduce xendevicemodel_set_irq_level DM op Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  1:35   ` Stefano Stabellini
  2021-01-12 21:52 ` [PATCH V4 20/24] xen/arm: io: Harden sign extension check Oleksandr Tyshchenko
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

In order to avoid code duplication (both handle_read() and
handle_ioserv() contain the same code for the sign-extension)
put this code in a common helper to be used by both.
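
As a worked example (not part of the patch), consider a signed byte load
(dabt.size = 0, dabt.sign = 1) whose read handler returned 0x80 with the
upper bits already zeroed:

    uint8_t size = (1 << 0) * 8;   /* 8-bit access */
    register_t r = 0x80;           /* bit 7, the sign bit, is set */

    r |= (~0UL) << size;           /* r == 0xffffffffffffff80 on arm64 */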

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes V1 -> V2:
   - new patch

Changes V2 -> V3:
   - no changes

Changes V3 -> V4:
   - no changes here, but in new patch:
     "xen/arm: io: Harden sign extension check"
---
 xen/arch/arm/io.c           | 18 ++----------------
 xen/arch/arm/ioreq.c        | 17 +----------------
 xen/include/asm-arm/traps.h | 24 ++++++++++++++++++++++++
 3 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
index 9814481..307c521 100644
--- a/xen/arch/arm/io.c
+++ b/xen/arch/arm/io.c
@@ -24,6 +24,7 @@
 #include <asm/cpuerrata.h>
 #include <asm/current.h>
 #include <asm/mmio.h>
+#include <asm/traps.h>
 #include <asm/hvm/ioreq.h>
 
 #include "decode.h"
@@ -40,26 +41,11 @@ static enum io_state handle_read(const struct mmio_handler *handler,
      * setting r).
      */
     register_t r = 0;
-    uint8_t size = (1 << dabt.size) * 8;
 
     if ( !handler->ops->read(v, info, &r, handler->priv) )
         return IO_ABORT;
 
-    /*
-     * Sign extend if required.
-     * Note that we expect the read handler to have zeroed the bits
-     * outside the requested access size.
-     */
-    if ( dabt.sign && (r & (1UL << (size - 1))) )
-    {
-        /*
-         * We are relying on register_t using the same as
-         * an unsigned long in order to keep the 32-bit assembly
-         * code smaller.
-         */
-        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
-        r |= (~0UL) << size;
-    }
+    r = sign_extend(dabt, r);
 
     set_user_reg(regs, dabt.reg, r);
 
diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
index 3c4a24d..40b9e59 100644
--- a/xen/arch/arm/ioreq.c
+++ b/xen/arch/arm/ioreq.c
@@ -28,7 +28,6 @@ enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
     const union hsr hsr = { .bits = regs->hsr };
     const struct hsr_dabt dabt = hsr.dabt;
     /* Code is similar to handle_read */
-    uint8_t size = (1 << dabt.size) * 8;
     register_t r = v->io.req.data;
 
     /* We are done with the IO */
@@ -37,21 +36,7 @@ enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
     if ( dabt.write )
         return IO_HANDLED;
 
-    /*
-     * Sign extend if required.
-     * Note that we expect the read handler to have zeroed the bits
-     * outside the requested access size.
-     */
-    if ( dabt.sign && (r & (1UL << (size - 1))) )
-    {
-        /*
-         * We are relying on register_t using the same as
-         * an unsigned long in order to keep the 32-bit assembly
-         * code smaller.
-         */
-        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
-        r |= (~0UL) << size;
-    }
+    r = sign_extend(dabt, r);
 
     set_user_reg(regs, dabt.reg, r);
 
diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
index 997c378..e301c44 100644
--- a/xen/include/asm-arm/traps.h
+++ b/xen/include/asm-arm/traps.h
@@ -83,6 +83,30 @@ static inline bool VABORT_GEN_BY_GUEST(const struct cpu_user_regs *regs)
         (unsigned long)abort_guest_exit_end == regs->pc;
 }
 
+/* Check whether the sign extension is required and perform it */
+static inline register_t sign_extend(const struct hsr_dabt dabt, register_t r)
+{
+    uint8_t size = (1 << dabt.size) * 8;
+
+    /*
+     * Sign extend if required.
+     * Note that we expect the read handler to have zeroed the bits
+     * outside the requested access size.
+     */
+    if ( dabt.sign && (r & (1UL << (size - 1))) )
+    {
+        /*
+         * We are relying on register_t using the same as
+         * an unsigned long in order to keep the 32-bit assembly
+         * code smaller.
+         */
+        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
+        r |= (~0UL) << size;
+    }
+
+    return r;
+}
+
 #endif /* __ASM_ARM_TRAPS__ */
 /*
  * Local variables:
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 20/24] xen/arm: io: Harden sign extension check
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (18 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 19/24] xen/arm: io: Abstract sign-extension Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  1:48   ` Stefano Stabellini
  2021-01-22 10:15   ` Volodymyr Babchuk
  2021-01-12 21:52 ` [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common Oleksandr Tyshchenko
                   ` (4 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

In an ideal world we would never get undefined behavior when
propagating the sign bit, since that bit can only be set for access
sizes smaller than the register size (i.e. byte/half-word for aarch32,
byte/half-word/word for aarch64).

In the real world we need to care about *possible* hardware bugs, such
as advertising a sign extension for 64-bit accesses on Arm64 (resp.
32-bit accesses on Arm32). In that case "size" equals the register
width, so the shift "(~0UL) << size" would be by the full width of the
type, which C leaves undefined.

So harden the code a bit more to prevent undefined behavior when
propagating the sign bit in case of buggy hardware.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes V3 -> V4:
   - new patch
---
 xen/include/asm-arm/traps.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
index e301c44..992d537 100644
--- a/xen/include/asm-arm/traps.h
+++ b/xen/include/asm-arm/traps.h
@@ -93,7 +93,8 @@ static inline register_t sign_extend(const struct hsr_dabt dabt, register_t r)
      * Note that we expect the read handler to have zeroed the bits
      * outside the requested access size.
      */
-    if ( dabt.sign && (r & (1UL << (size - 1))) )
+    if ( dabt.sign && (size < sizeof(register_t) * 8) &&
+         (r & (1UL << (size - 1))) )
     {
         /*
          * We are relying on register_t using the same as
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (19 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 20/24] xen/arm: io: Harden sign extension check Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-18 10:31   ` Paul Durrant
  2021-01-12 21:52 ` [PATCH V4 22/24] xen/arm: Add mapcache invalidation handling Oleksandr Tyshchenko
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Paul Durrant, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

As IOREQ is a common feature now, and we also need to
invalidate the qemu/demu mapcache on Arm when the required condition
occurs, this patch moves this function to the common code
(and renames it to ioreq_signal_mapcache_invalidate).
This patch also moves the per-domain qemu_mapcache_invalidate
variable out of the arch sub-struct (and drops the "qemu" prefix).

We don't put this variable inside the #ifdef CONFIG_IOREQ_SERVER
at the end of struct domain, but in the hole next to the group
of 5 bools further up, which is more efficient.
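
As a sketch of the layout argument (field names are illustrative, this
is not the real struct domain): with 8-byte pointer alignment the five
adjacent bools leave 3 bytes of padding before the next member, so the
new bool fits there for free, whereas appending it at the end of this
demo struct would grow it from 24 to 32 bytes:

    #include <stdbool.h>

    struct layout_demo {
        void *before;               /* offset  0, 8 bytes           */
        bool  a, b, c, d, e;        /* offsets 8..12                */
        bool  mapcache_invalidate;  /* offset 13, fills the hole    */
                                    /* 2 bytes of padding remain    */
        void *after;                /* offset 16; sizeof stays 24   */
    };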

The subsequent patch will add mapcache invalidation handling on Arm.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - move send_invalidate_req() to the common code
   - update patch subject/description
   - move qemu_mapcache_invalidate out of the arch sub-struct,
     update checks
   - remove #if defined(CONFIG_ARM64) from the common code

Changes V1 -> V2:
   - was split into:
     - xen/ioreq: Make x86's send_invalidate_req() common
     - xen/arm: Add mapcache invalidation handling
   - update patch description/subject
   - move Arm bits to a separate patch
   - don't alter the common code, the flag is set by arch code
   - rename send_invalidate_req() to send_invalidate_ioreq()
   - guard qemu_mapcache_invalidate with CONFIG_IOREQ_SERVER
   - use bool instead of bool_t
   - remove blank line between head comment and #include-s

Changes V2 -> V3:
   - update patch description
   - drop "qemu" prefix from the variable name
   - rename send_invalidate_req() to ioreq_signal_mapcache_invalidate()

Changes V3 -> V4:
   - change variable location in struct domain
---
 xen/arch/x86/hvm/hypercall.c     |  9 +++++----
 xen/arch/x86/hvm/io.c            | 14 --------------
 xen/common/ioreq.c               | 14 ++++++++++++++
 xen/include/asm-x86/hvm/domain.h |  1 -
 xen/include/asm-x86/hvm/io.h     |  1 -
 xen/include/xen/ioreq.h          |  1 +
 xen/include/xen/sched.h          |  5 +++++
 7 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index ac573c8..6d41c56 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -20,6 +20,7 @@
  */
 #include <xen/lib.h>
 #include <xen/hypercall.h>
+#include <xen/ioreq.h>
 #include <xen/nospec.h>
 
 #include <asm/hvm/emulate.h>
@@ -47,7 +48,7 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         rc = compat_memory_op(cmd, arg);
 
     if ( (cmd & MEMOP_CMD_MASK) == XENMEM_decrease_reservation )
-        curr->domain->arch.hvm.qemu_mapcache_invalidate = true;
+        curr->domain->mapcache_invalidate = true;
 
     return rc;
 }
@@ -326,9 +327,9 @@ int hvm_hypercall(struct cpu_user_regs *regs)
 
     HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%lu -> %lx", eax, regs->rax);
 
-    if ( unlikely(currd->arch.hvm.qemu_mapcache_invalidate) &&
-         test_and_clear_bool(currd->arch.hvm.qemu_mapcache_invalidate) )
-        send_invalidate_req();
+    if ( unlikely(currd->mapcache_invalidate) &&
+         test_and_clear_bool(currd->mapcache_invalidate) )
+        ioreq_signal_mapcache_invalidate();
 
     return curr->hcall_preempted ? HVM_HCALL_preempted : HVM_HCALL_completed;
 }
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 66a37ee..046a8eb 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -64,20 +64,6 @@ void send_timeoffset_req(unsigned long timeoff)
         gprintk(XENLOG_ERR, "Unsuccessful timeoffset update\n");
 }
 
-/* Ask ioemu mapcache to invalidate mappings. */
-void send_invalidate_req(void)
-{
-    ioreq_t p = {
-        .type = IOREQ_TYPE_INVALIDATE,
-        .size = 4,
-        .dir = IOREQ_WRITE,
-        .data = ~0UL, /* flush all */
-    };
-
-    if ( ioreq_broadcast(&p, false) != 0 )
-        gprintk(XENLOG_ERR, "Unsuccessful map-cache invalidate\n");
-}
-
 bool hvm_emulate_one_insn(hvm_emulate_validate_t *validate, const char *descr)
 {
     struct hvm_emulate_ctxt ctxt;
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index 59f4990..050891f 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -35,6 +35,20 @@
 #include <public/hvm/ioreq.h>
 #include <public/hvm/params.h>
 
+/* Ask ioemu mapcache to invalidate mappings. */
+void ioreq_signal_mapcache_invalidate(void)
+{
+    ioreq_t p = {
+        .type = IOREQ_TYPE_INVALIDATE,
+        .size = 4,
+        .dir = IOREQ_WRITE,
+        .data = ~0UL, /* flush all */
+    };
+
+    if ( ioreq_broadcast(&p, false) != 0 )
+        gprintk(XENLOG_ERR, "Unsuccessful map-cache invalidate\n");
+}
+
 static void set_ioreq_server(struct domain *d, unsigned int id,
                              struct ioreq_server *s)
 {
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index b8be1ad..cf959f6 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -122,7 +122,6 @@ struct hvm_domain {
 
     struct viridian_domain *viridian;
 
-    bool_t                 qemu_mapcache_invalidate;
     bool_t                 is_s3_suspended;
 
     /*
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index fb64294..3da0136 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -97,7 +97,6 @@ bool relocate_portio_handler(
     unsigned int size);
 
 void send_timeoffset_req(unsigned long timeoff);
-void send_invalidate_req(void);
 bool handle_mmio_with_translation(unsigned long gla, unsigned long gpfn,
                                   struct npfec);
 bool handle_pio(uint16_t port, unsigned int size, int dir);
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index f0908af..dc47ec7 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -101,6 +101,7 @@ struct ioreq_server *ioreq_server_select(struct domain *d,
 int ioreq_send(struct ioreq_server *s, ioreq_t *proto_p,
                bool buffered);
 unsigned int ioreq_broadcast(ioreq_t *p, bool buffered);
+void ioreq_signal_mapcache_invalidate(void);
 
 void ioreq_domain_init(struct domain *d);
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 7aea2bb..5139b44 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -444,6 +444,11 @@ struct domain
      * unpaused for the first time by the systemcontroller.
      */
     bool             creation_finished;
+    /*
+     * Indicates that mapcache invalidation request should be sent to
+     * the device emulator.
+     */
+    bool             mapcache_invalidate;
 
     /* Which guest this guest has privileges on */
     struct domain   *target;
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 22/24] xen/arm: Add mapcache invalidation handling
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (20 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15  2:11   ` Stefano Stabellini
  2021-01-12 21:52 ` [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm Oleksandr Tyshchenko
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

We need to send a mapcache invalidation request to qemu/demu every time
a page gets removed from a guest.

At the moment, the Arm code doesn't explicitly remove the existing
mapping before inserting the new mapping. Instead, this is done
implicitly by __p2m_set_entry().

So we need to recognize the case when the old entry is a RAM page *and*
the new MFN is different, in order to set the corresponding flag.
The most suitable place to do this is p2m_free_entry(), where
we can find the correct leaf type. The invalidation request
will be sent in do_trap_hypercall() later on.
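
For context, a device emulator that caches foreign mappings of guest
pages (qemu/demu style) is expected to react to this request by
dropping its cached mappings, since they may now refer to pages the
guest no longer owns. A rough, hypothetical sketch of the receiving
side -- mapcache_invalidate_all() is an illustrative name, not a real
qemu/demu function:

    /* Hypothetical fragment of the emulator's IOREQ dispatch loop. */
    static void handle_ioreq(ioreq_t *req) /* ioreq_t: public/hvm/ioreq.h */
    {
        switch ( req->type )
        {
        case IOREQ_TYPE_INVALIDATE:
            /* Xen sets data = ~0UL, i.e. "flush all". */
            mapcache_invalidate_all();
            break;
        /* ... MMIO reads/writes and other request types ... */
        }
    }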

Taking into account the following, do_trap_hypercall() is the best
place to send the invalidation request:
 - The only way a guest can modify its P2M on Arm is via a hypercall
 - When sending the invalidation request, the vCPU will be blocked
   until all the IOREQ servers have acknowledged the invalidation

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

***
Please note, this patch depends on the following which is
on review:
https://patchwork.kernel.org/patch/11803383/

This patch is on par with x86 code (whether it is buggy or not).
If there is a need to improve/harden something, this can be done on
a follow-up.
***

Changes V1 -> V2:
   - new patch, some changes were derived from (+ new explanation):
     xen/ioreq: Make x86's invalidate qemu mapcache handling common
   - put setting of the flag into __p2m_set_entry()
   - clarify the conditions when the flag should be set
   - use domain_has_ioreq_server()
   - update do_trap_hypercall() by adding local variable

Changes V2 -> V3:
   - update patch description
   - move check to p2m_free_entry()
   - add a comment
   - use "curr" instead of "v" in do_trap_hypercall()

Changes V3 -> V4:
   - update patch description
   - re-order check in p2m_free_entry() to call domain_has_ioreq_server()
     only if p2m->domain == current->domain
   - add a comment in do_trap_hypercall()
---
 xen/arch/arm/p2m.c   | 25 +++++++++++++++++--------
 xen/arch/arm/traps.c | 20 +++++++++++++++++---
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index d41c4fa..26acb95d 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1,6 +1,7 @@
 #include <xen/cpu.h>
 #include <xen/domain_page.h>
 #include <xen/iocap.h>
+#include <xen/ioreq.h>
 #include <xen/lib.h>
 #include <xen/sched.h>
 #include <xen/softirq.h>
@@ -749,17 +750,25 @@ static void p2m_free_entry(struct p2m_domain *p2m,
     if ( !p2m_is_valid(entry) )
         return;
 
-    /* Nothing to do but updating the stats if the entry is a super-page. */
-    if ( p2m_is_superpage(entry, level) )
+    if ( p2m_is_superpage(entry, level) || (level == 3) )
     {
-        p2m->stats.mappings[level]--;
-        return;
-    }
+#ifdef CONFIG_IOREQ_SERVER
+        /*
+         * If this gets called (non-recursively) then either the entry
+         * was replaced by an entry with a different base (valid case) or
+         * the shattering of a superpage failed (error case).
+         * So, at worst, a spurious mapcache invalidation might be sent.
+         */
+        if ( (p2m->domain == current->domain) &&
+              domain_has_ioreq_server(p2m->domain) &&
+              p2m_is_ram(entry.p2m.type) )
+            p2m->domain->mapcache_invalidate = true;
+#endif
 
-    if ( level == 3 )
-    {
         p2m->stats.mappings[level]--;
-        p2m_put_l3_page(entry);
+        /* Nothing to do if the entry is a super-page. */
+        if ( level == 3 )
+            p2m_put_l3_page(entry);
         return;
     }
 
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 35094d8..1070d1b 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1443,6 +1443,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
                               const union hsr hsr)
 {
     arm_hypercall_fn_t call = NULL;
+    struct vcpu *curr = current;
 
     BUILD_BUG_ON(NR_hypercalls < ARRAY_SIZE(arm_hypercall_table) );
 
@@ -1459,7 +1460,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
         return;
     }
 
-    current->hcall_preempted = false;
+    curr->hcall_preempted = false;
 
     perfc_incra(hypercalls, *nr);
     call = arm_hypercall_table[*nr].fn;
@@ -1472,7 +1473,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
     HYPERCALL_RESULT_REG(regs) = call(HYPERCALL_ARGS(regs));
 
 #ifndef NDEBUG
-    if ( !current->hcall_preempted )
+    if ( !curr->hcall_preempted )
     {
         /* Deliberately corrupt parameter regs used by this hypercall. */
         switch ( arm_hypercall_table[*nr].nr_args ) {
@@ -1489,8 +1490,21 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
 #endif
 
     /* Ensure the hypercall trap instruction is re-executed. */
-    if ( current->hcall_preempted )
+    if ( curr->hcall_preempted )
         regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
+
+#ifdef CONFIG_IOREQ_SERVER
+    /*
+     * Taking into account the following, do_trap_hypercall() is
+     * the best place to send the invalidation request:
+     * - The only way a guest can modify its P2M on Arm is via a hypercall
+     * - When sending the invalidation request, the vCPU will be blocked
+     *   until all the IOREQ servers have acknowledged the invalidation
+     */
+    if ( unlikely(curr->domain->mapcache_invalidate) &&
+         test_and_clear_bool(curr->domain->mapcache_invalidate) )
+        ioreq_signal_mapcache_invalidate();
+#endif
 }
 
 void arch_hypercall_tasklet_result(struct vcpu *v, long res)
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (21 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 22/24] xen/arm: Add mapcache invalidation handling Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-15 21:30   ` Julien Grall
  2021-01-12 21:52 ` [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration Oleksandr Tyshchenko
  2021-01-14  3:55 ` [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Wei Chen
  24 siblings, 1 reply; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Julien Grall, Ian Jackson, Wei Liu, Anthony PERARD,
	Stefano Stabellini, Julien Grall, Volodymyr Babchuk,
	Oleksandr Tyshchenko

From: Julien Grall <julien.grall@arm.com>

This patch creates a specific device node in the guest device-tree
with an allocated MMIO range and SPI interrupt if the 'virtio'
property is present in the domain config.
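
For reference (not part of the patch): with the defaults added below
(GUEST_VIRTIO_MMIO_BASE 0x02000000, GUEST_VIRTIO_MMIO_SIZE 0x200,
GUEST_VIRTIO_MMIO_SPI 33), the generated node should look roughly like
this in the guest device-tree -- note the second interrupts cell is the
SPI number minus 32:

    virtio@2000000 {
        compatible = "virtio,mmio";
        reg = <0x0 0x2000000 0x0 0x200>;
        interrupts = <0x0 0x1 0xf01>; /* SPI 33, cpumask 0xf, edge rising */
        dma-coherent;
    };

On the guest config side, the xl_parse.c hunk below means this is
switched on with a plain "virtio=1" line in the domain config.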

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - was squashed with:
     "[RFC PATCH V1 09/12] libxl: Handle virtio-mmio irq in more correct way"
     "[RFC PATCH V1 11/12] libxl: Insert "dma-coherent" property into virtio-mmio device node"
     "[RFC PATCH V1 12/12] libxl: Fix duplicate memory node in DT"
   - move VirtIO MMIO #define-s to xen/include/public/arch-arm.h

Changes V1 -> V2:
   - update the author of a patch

Changes V2 -> V3:
   - no changes

Changes V3 -> V4:
   - no changes
---
 tools/libs/light/libxl_arm.c     | 58 ++++++++++++++++++++++++++++++++++++++--
 tools/libs/light/libxl_types.idl |  1 +
 tools/xl/xl_parse.c              |  1 +
 xen/include/public/arch-arm.h    |  5 ++++
 4 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 66e8a06..588ee5a 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -26,8 +26,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
 {
     uint32_t nr_spis = 0;
     unsigned int i;
-    uint32_t vuart_irq;
-    bool vuart_enabled = false;
+    uint32_t vuart_irq, virtio_irq;
+    bool vuart_enabled = false, virtio_enabled = false;
 
     /*
      * If pl011 vuart is enabled then increment the nr_spis to allow allocation
@@ -39,6 +39,17 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         vuart_enabled = true;
     }
 
+    /*
+     * XXX: Handle virtio properly
+     * A proper solution would be for the toolstack to allocate the interrupts
+     * used by each virtio backend and let the backend know which one is used
+     */
+    if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {
+        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
+        virtio_irq = GUEST_VIRTIO_MMIO_SPI;
+        virtio_enabled = true;
+    }
+
     for (i = 0; i < d_config->b_info.num_irqs; i++) {
         uint32_t irq = d_config->b_info.irqs[i];
         uint32_t spi;
@@ -58,6 +69,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
             return ERROR_FAIL;
         }
 
+        /* The same check as for vpl011 */
+        if (virtio_enabled && irq == virtio_irq) {
+            LOG(ERROR, "Physical IRQ %u conflicting with virtio SPI\n", irq);
+            return ERROR_FAIL;
+        }
+
         if (irq < 32)
             continue;
 
@@ -658,6 +675,39 @@ static int make_vpl011_uart_node(libxl__gc *gc, void *fdt,
     return 0;
 }
 
+static int make_virtio_mmio_node(libxl__gc *gc, void *fdt,
+                                 uint64_t base, uint32_t irq)
+{
+    int res;
+    gic_interrupt intr;
+    /* Placeholder for virtio@ + a 64-bit number + \0 */
+    char buf[24];
+
+    snprintf(buf, sizeof(buf), "virtio@%"PRIx64, base);
+    res = fdt_begin_node(fdt, buf);
+    if (res) return res;
+
+    res = fdt_property_compat(gc, fdt, 1, "virtio,mmio");
+    if (res) return res;
+
+    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+                            1, base, GUEST_VIRTIO_MMIO_SIZE);
+    if (res) return res;
+
+    set_interrupt(intr, irq, 0xf, DT_IRQ_TYPE_EDGE_RISING);
+    res = fdt_property_interrupts(gc, fdt, &intr, 1);
+    if (res) return res;
+
+    res = fdt_property(fdt, "dma-coherent", NULL, 0);
+    if (res) return res;
+
+    res = fdt_end_node(fdt);
+    if (res) return res;
+
+    return 0;
+
+}
+
 static const struct arch_info *get_arch_info(libxl__gc *gc,
                                              const struct xc_dom_image *dom)
 {
@@ -961,6 +1011,9 @@ next_resize:
         if (info->tee == LIBXL_TEE_TYPE_OPTEE)
             FDT( make_optee_node(gc, fdt) );
 
+        if (libxl_defbool_val(info->arch_arm.virtio))
+            FDT( make_virtio_mmio_node(gc, fdt, GUEST_VIRTIO_MMIO_BASE, GUEST_VIRTIO_MMIO_SPI) );
+
         if (pfdt)
             FDT( copy_partial_fdt(gc, fdt, pfdt) );
 
@@ -1178,6 +1231,7 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
 {
     /* ACPI is disabled by default */
     libxl_defbool_setdefault(&b_info->acpi, false);
+    libxl_defbool_setdefault(&b_info->arch_arm.virtio, false);
 
     if (b_info->type != LIBXL_DOMAIN_TYPE_PV)
         return;
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 0532473..839df86 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -640,6 +640,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
 
 
     ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
+                               ("virtio", libxl_defbool),
                                ("vuart", libxl_vuart_type),
                               ])),
     # Alternate p2m is not bound to any architecture or guest type, as it is
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 4ebf396..2a3364b 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -2581,6 +2581,7 @@ skip_usbdev:
     }
 
     xlu_cfg_get_defbool(config, "dm_restrict", &b_info->dm_restrict, 0);
+    xlu_cfg_get_defbool(config, "virtio", &b_info->arch_arm.virtio, 0);
 
     if (c_info->type == LIBXL_DOMAIN_TYPE_HVM) {
         if (!xlu_cfg_get_string (config, "vga", &buf, 0)) {
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index c365b1b..be7595f 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -464,6 +464,11 @@ typedef uint64_t xen_callback_t;
 #define PSCI_cpu_on      2
 #define PSCI_migrate     3
 
+/* VirtIO MMIO definitions */
+#define GUEST_VIRTIO_MMIO_BASE  xen_mk_ullong(0x02000000)
+#define GUEST_VIRTIO_MMIO_SIZE  xen_mk_ullong(0x200)
+#define GUEST_VIRTIO_MMIO_SPI   33
+
 #endif
 
 #ifndef __ASSEMBLY__
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (22 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm Oleksandr Tyshchenko
@ 2021-01-12 21:52 ` Oleksandr Tyshchenko
  2021-01-14 17:20   ` Ian Jackson
  2021-01-15 22:01   ` Julien Grall
  2021-01-14  3:55 ` [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Wei Chen
  24 siblings, 2 replies; 144+ messages in thread
From: Oleksandr Tyshchenko @ 2021-01-12 21:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, Ian Jackson, Wei Liu, Anthony PERARD,
	Julien Grall, Stefano Stabellini

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch adds basic support for configuring and assisting a virtio-disk
backend (emulator) which is intended to run outside of Qemu and could be
run in any domain.

Xenstore was chosen as the communication interface so that an emulator
running in a non-toolstack domain is able to get its configuration either
by reading Xenstore directly or by receiving command line parameters (an
updated 'xl devd' running in the same domain would read Xenstore
beforehand and call the backend executable with the required arguments).

An example of domain configuration (two disks are assigned to the guest,
the latter is in readonly mode):

vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]

Where per-disk Xenstore entries are:
- filename and readonly flag (configured via "vdisk" property)
- base and irq (allocated dynamically)

Besides handling the 'visible' params described in the configuration
file, the patch also allocates virtio-mmio specific ones for each device
and writes them into Xenstore. The virtio-mmio params (irq and base) are
unique per guest domain; they are allocated at domain creation time
and passed through to the emulator. Each VirtIO device has at least
one pair of these params.
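
As an illustration of the resulting per-disk entries (the paths are
schematic -- the exact prefix comes from the generic libxl device
machinery -- and the values correspond to the example config above,
with base/irq allocated from GUEST_VIRTIO_MMIO_BASE/_SPI):

    .../device/virtio_disk/0/0/filename = "/dev/mmcblk0p3"
    .../device/virtio_disk/0/0/readonly = "0"
    .../device/virtio_disk/0/0/base     = "33554432"   (0x02000000)
    .../device/virtio_disk/0/0/irq      = "33"
    .../device/virtio_disk/0/1/filename = "/dev/mmcblk1p3"
    .../device/virtio_disk/0/1/readonly = "1"
    .../device/virtio_disk/0/1/base     = "33554944"   (0x02000200)
    .../device/virtio_disk/0/1/irq      = "34"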

TODO:
1. An extra "virtio" property could be removed.
2. Update documentation.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>

---
Changes RFC -> V1:
   - no changes

Changes V1 -> V2:
   - rebase according to the new location of libxl_virtio_disk.c

Changes V2 -> V3:
   - no changes

Changes V3 -> V4:
   - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT

Please note, there is a real concern about VirtIO interrupts allocation.
[Just copy here what Stefano said in RFC thread]

So, if we end up allocating let's say 6 virtio interrupts for a domain,
the chance of a clash with a physical interrupt of a passthrough device is real.

I am not entirely sure how to solve it, but these are a few ideas:
- choosing virtio interrupts that are less likely to conflict (maybe > 1000)
- make the virtio irq (optionally) configurable so that a user could
  override the default irq and specify one that doesn't conflict
- implementing support for virq != pirq (even the xl interface doesn't
  allow to specify the virq number for passthrough devices, see "irqs")

Also there is one suggestion from Wei Chen regarding a parameter for the
domain config file which I haven't addressed yet.
[Just copy here what Wei said in V2 thread]
Can we keep using the same 'disk' parameter for virtio-disk, but add an option like
"model=virtio-disk"?
For example:
disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ]
Just like what Xen has done for x86 virtio-net.
---
 tools/libs/light/Makefile                 |   1 +
 tools/libs/light/libxl_arm.c              |  56 ++++++++++++---
 tools/libs/light/libxl_create.c           |   1 +
 tools/libs/light/libxl_internal.h         |   1 +
 tools/libs/light/libxl_types.idl          |  15 ++++
 tools/libs/light/libxl_types_internal.idl |   1 +
 tools/libs/light/libxl_virtio_disk.c      | 109 ++++++++++++++++++++++++++++
 tools/xl/Makefile                         |   2 +-
 tools/xl/xl.h                             |   3 +
 tools/xl/xl_cmdtable.c                    |  15 ++++
 tools/xl/xl_parse.c                       | 115 ++++++++++++++++++++++++++++++
 tools/xl/xl_virtio_disk.c                 |  46 ++++++++++++
 12 files changed, 354 insertions(+), 11 deletions(-)
 create mode 100644 tools/libs/light/libxl_virtio_disk.c
 create mode 100644 tools/xl/xl_virtio_disk.c

diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 68f6fa3..ccc91b9 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -115,6 +115,7 @@ SRCS-y += libxl_genid.c
 SRCS-y += _libxl_types.c
 SRCS-y += libxl_flask.c
 SRCS-y += _libxl_types_internal.c
+SRCS-y += libxl_virtio_disk.c
 
 ifeq ($(CONFIG_LIBNL),y)
 CFLAGS_LIBXL += $(LIBNL3_CFLAGS)
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 588ee5a..9eb3022 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -8,6 +8,12 @@
 #include <assert.h>
 #include <xen/device_tree_defs.h>
 
+#ifndef container_of
+#define container_of(ptr, type, member) ({			\
+        typeof( ((type *)0)->member ) *__mptr = (ptr);	\
+        (type *)( (char *)__mptr - offsetof(type,member) );})
+#endif
+
 static const char *gicv_to_string(libxl_gic_version gic_version)
 {
     switch (gic_version) {
@@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         vuart_enabled = true;
     }
 
-    /*
-     * XXX: Handle virtio properly
-     * A proper solution would be for the toolstack to allocate the interrupts
-     * used by each virtio backend and let the backend know which one is used
-     */
     if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {
-        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
+        uint64_t virtio_base;
+        libxl_device_virtio_disk *virtio_disk;
+
+        virtio_base = GUEST_VIRTIO_MMIO_BASE;
         virtio_irq = GUEST_VIRTIO_MMIO_SPI;
+
+        if (!d_config->num_virtio_disks) {
+            LOG(ERROR, "Virtio is enabled, but no Virtio devices present\n");
+            return ERROR_FAIL;
+        }
+        virtio_disk = &d_config->virtio_disks[0];
+
+        for (i = 0; i < virtio_disk->num_disks; i++) {
+            virtio_disk->disks[i].base = virtio_base;
+            virtio_disk->disks[i].irq = virtio_irq;
+
+            LOG(DEBUG, "Allocate Virtio MMIO params: IRQ %u BASE 0x%"PRIx64,
+                virtio_irq, virtio_base);
+
+            virtio_irq ++;
+            virtio_base += GUEST_VIRTIO_MMIO_SIZE;
+        }
+        virtio_irq --;
+
+        nr_spis += (virtio_irq - 32) + 1;
         virtio_enabled = true;
     }
 
@@ -70,8 +94,9 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         }
 
         /* The same check as for vpl011 */
-        if (virtio_enabled && irq == virtio_irq) {
-            LOG(ERROR, "Physical IRQ %u conflicting with virtio SPI\n", irq);
+        if (virtio_enabled &&
+           (irq >= GUEST_VIRTIO_MMIO_SPI && irq <= virtio_irq)) {
+            LOG(ERROR, "Physical IRQ %u conflicting with Virtio IRQ range\n", irq);
             return ERROR_FAIL;
         }
 
@@ -1011,8 +1036,19 @@ next_resize:
         if (info->tee == LIBXL_TEE_TYPE_OPTEE)
             FDT( make_optee_node(gc, fdt) );
 
-        if (libxl_defbool_val(info->arch_arm.virtio))
-            FDT( make_virtio_mmio_node(gc, fdt, GUEST_VIRTIO_MMIO_BASE, GUEST_VIRTIO_MMIO_SPI) );
+        if (libxl_defbool_val(info->arch_arm.virtio)) {
+            libxl_domain_config *d_config =
+                container_of(info, libxl_domain_config, b_info);
+            libxl_device_virtio_disk *virtio_disk = &d_config->virtio_disks[0];
+            unsigned int i;
+
+            for (i = 0; i < virtio_disk->num_disks; i++) {
+                uint64_t base = virtio_disk->disks[i].base;
+                uint32_t irq = virtio_disk->disks[i].irq;
+
+                FDT( make_virtio_mmio_node(gc, fdt, base, irq) );
+            }
+        }
 
         if (pfdt)
             FDT( copy_partial_fdt(gc, fdt, pfdt) );
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 86f4a83..1734fcd 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -1821,6 +1821,7 @@ const libxl__device_type *device_type_tbl[] = {
     &libxl__dtdev_devtype,
     &libxl__vdispl_devtype,
     &libxl__vsnd_devtype,
+    &libxl__virtio_disk_devtype,
     NULL
 };
 
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index c79523b..5edef85 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -3999,6 +3999,7 @@ extern const libxl__device_type libxl__vdispl_devtype;
 extern const libxl__device_type libxl__p9_devtype;
 extern const libxl__device_type libxl__pvcallsif_devtype;
 extern const libxl__device_type libxl__vsnd_devtype;
+extern const libxl__device_type libxl__virtio_disk_devtype;
 
 extern const libxl__device_type *device_type_tbl[];
 
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 839df86..2c40bc2 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -936,6 +936,20 @@ libxl_device_vsnd = Struct("device_vsnd", [
     ("pcms", Array(libxl_vsnd_pcm, "num_vsnd_pcms"))
     ])
 
+libxl_virtio_disk_param = Struct("virtio_disk_param", [
+    ("filename", string),
+    ("readonly", bool),
+    ("irq", uint32),
+    ("base", uint64),
+    ])
+
+libxl_device_virtio_disk = Struct("device_virtio_disk", [
+    ("backend_domid", libxl_domid),
+    ("backend_domname", string),
+    ("devid", libxl_devid),
+    ("disks", Array(libxl_virtio_disk_param, "num_disks")),
+    ])
+
 libxl_domain_config = Struct("domain_config", [
     ("c_info", libxl_domain_create_info),
     ("b_info", libxl_domain_build_info),
@@ -952,6 +966,7 @@ libxl_domain_config = Struct("domain_config", [
     ("pvcallsifs", Array(libxl_device_pvcallsif, "num_pvcallsifs")),
     ("vdispls", Array(libxl_device_vdispl, "num_vdispls")),
     ("vsnds", Array(libxl_device_vsnd, "num_vsnds")),
+    ("virtio_disks", Array(libxl_device_virtio_disk, "num_virtio_disks")),
     # a channel manifests as a console with a name,
     # see docs/misc/channels.txt
     ("channels", Array(libxl_device_channel, "num_channels")),
diff --git a/tools/libs/light/libxl_types_internal.idl b/tools/libs/light/libxl_types_internal.idl
index 3593e21..8f71980 100644
--- a/tools/libs/light/libxl_types_internal.idl
+++ b/tools/libs/light/libxl_types_internal.idl
@@ -32,6 +32,7 @@ libxl__device_kind = Enumeration("device_kind", [
     (14, "PVCALLS"),
     (15, "VSND"),
     (16, "VINPUT"),
+    (17, "VIRTIO_DISK"),
     ])
 
 libxl__console_backend = Enumeration("console_backend", [
diff --git a/tools/libs/light/libxl_virtio_disk.c b/tools/libs/light/libxl_virtio_disk.c
new file mode 100644
index 0000000..be769ad
--- /dev/null
+++ b/tools/libs/light/libxl_virtio_disk.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright (C) 2020 EPAM Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_internal.h"
+
+static int libxl__device_virtio_disk_setdefault(libxl__gc *gc, uint32_t domid,
+                                                libxl_device_virtio_disk *virtio_disk,
+                                                bool hotplug)
+{
+    return libxl__resolve_domid(gc, virtio_disk->backend_domname,
+                                &virtio_disk->backend_domid);
+}
+
+static int libxl__virtio_disk_from_xenstore(libxl__gc *gc, const char *libxl_path,
+                                            libxl_devid devid,
+                                            libxl_device_virtio_disk *virtio_disk)
+{
+    const char *be_path;
+    int rc;
+
+    virtio_disk->devid = devid;
+    rc = libxl__xs_read_mandatory(gc, XBT_NULL,
+                                  GCSPRINTF("%s/backend", libxl_path),
+                                  &be_path);
+    if (rc) return rc;
+
+    rc = libxl__backendpath_parse_domid(gc, be_path, &virtio_disk->backend_domid);
+    if (rc) return rc;
+
+    return 0;
+}
+
+static void libxl__update_config_virtio_disk(libxl__gc *gc,
+                                             libxl_device_virtio_disk *dst,
+                                             libxl_device_virtio_disk *src)
+{
+    dst->devid = src->devid;
+}
+
+static int libxl_device_virtio_disk_compare(libxl_device_virtio_disk *d1,
+                                            libxl_device_virtio_disk *d2)
+{
+    return COMPARE_DEVID(d1, d2);
+}
+
+static void libxl__device_virtio_disk_add(libxl__egc *egc, uint32_t domid,
+                                          libxl_device_virtio_disk *virtio_disk,
+                                          libxl__ao_device *aodev)
+{
+    libxl__device_add_async(egc, domid, &libxl__virtio_disk_devtype, virtio_disk, aodev);
+}
+
+static int libxl__set_xenstore_virtio_disk(libxl__gc *gc, uint32_t domid,
+                                           libxl_device_virtio_disk *virtio_disk,
+                                           flexarray_t *back, flexarray_t *front,
+                                           flexarray_t *ro_front)
+{
+    int rc;
+    unsigned int i;
+
+    for (i = 0; i < virtio_disk->num_disks; i++) {
+        rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/filename", i),
+                                   GCSPRINTF("%s", virtio_disk->disks[i].filename));
+        if (rc) return rc;
+
+        rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/readonly", i),
+                                   GCSPRINTF("%d", virtio_disk->disks[i].readonly));
+        if (rc) return rc;
+
+        rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/base", i),
+                                   GCSPRINTF("%lu", virtio_disk->disks[i].base));
+        if (rc) return rc;
+
+        rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/irq", i),
+                                   GCSPRINTF("%u", virtio_disk->disks[i].irq));
+        if (rc) return rc;
+    }
+
+    return 0;
+}
+
+static LIBXL_DEFINE_UPDATE_DEVID(virtio_disk)
+static LIBXL_DEFINE_DEVICE_FROM_TYPE(virtio_disk)
+static LIBXL_DEFINE_DEVICES_ADD(virtio_disk)
+
+DEFINE_DEVICE_TYPE_STRUCT(virtio_disk, VIRTIO_DISK, virtio_disks,
+    .update_config = (device_update_config_fn_t) libxl__update_config_virtio_disk,
+    .from_xenstore = (device_from_xenstore_fn_t) libxl__virtio_disk_from_xenstore,
+    .set_xenstore_config = (device_set_xenstore_config_fn_t) libxl__set_xenstore_virtio_disk
+);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/xl/Makefile b/tools/xl/Makefile
index bdf67c8..9d8f2aa 100644
--- a/tools/xl/Makefile
+++ b/tools/xl/Makefile
@@ -23,7 +23,7 @@ XL_OBJS += xl_vtpm.o xl_block.o xl_nic.o xl_usb.o
 XL_OBJS += xl_sched.o xl_pci.o xl_vcpu.o xl_cdrom.o xl_mem.o
 XL_OBJS += xl_info.o xl_console.o xl_misc.o
 XL_OBJS += xl_vmcontrol.o xl_saverestore.o xl_migrate.o
-XL_OBJS += xl_vdispl.o xl_vsnd.o xl_vkb.o
+XL_OBJS += xl_vdispl.o xl_vsnd.o xl_vkb.o xl_virtio_disk.o
 
 $(XL_OBJS): CFLAGS += $(CFLAGS_libxentoollog)
 $(XL_OBJS): CFLAGS += $(CFLAGS_XL)
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index 06569c6..3d26f19 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -178,6 +178,9 @@ int main_vsnddetach(int argc, char **argv);
 int main_vkbattach(int argc, char **argv);
 int main_vkblist(int argc, char **argv);
 int main_vkbdetach(int argc, char **argv);
+int main_virtio_diskattach(int argc, char **argv);
+int main_virtio_disklist(int argc, char **argv);
+int main_virtio_diskdetach(int argc, char **argv);
 int main_usbctrl_attach(int argc, char **argv);
 int main_usbctrl_detach(int argc, char **argv);
 int main_usbdev_attach(int argc, char **argv);
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 6ab5e47..696b190 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -435,6 +435,21 @@ struct cmd_spec cmd_table[] = {
       "Destroy a domain's virtual sound device",
       "<Domain> <DevId>",
     },
+    { "virtio-disk-attach",
+      &main_virtio_diskattach, 1, 1,
+      "Create a new virtio block device",
+      " TBD\n"
+    },
+    { "virtio-disk-list",
+      &main_virtio_disklist, 0, 0,
+      "List virtio block devices for a domain",
+      "<Domain(s)>",
+    },
+    { "virtio-disk-detach",
+      &main_virtio_diskdetach, 0, 1,
+      "Destroy a domain's virtio block device",
+      "<Domain> <DevId>",
+    },
     { "uptime",
       &main_uptime, 0, 0,
       "Print uptime for all/some domains",
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 2a3364b..054a0c9 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1204,6 +1204,120 @@ out:
     if (rc) exit(EXIT_FAILURE);
 }
 
+#define MAX_VIRTIO_DISKS 4
+
+static int parse_virtio_disk_config(libxl_device_virtio_disk *virtio_disk, char *token)
+{
+    char *oparg;
+    libxl_string_list disks = NULL;
+    int i, rc;
+
+    if (MATCH_OPTION("backend", token, oparg)) {
+        virtio_disk->backend_domname = strdup(oparg);
+    } else if (MATCH_OPTION("disks", token, oparg)) {
+        split_string_into_string_list(oparg, ";", &disks);
+
+        virtio_disk->num_disks = libxl_string_list_length(&disks);
+        if (virtio_disk->num_disks > MAX_VIRTIO_DISKS) {
+            fprintf(stderr, "vdisk: currently only %d disks are supported",
+                    MAX_VIRTIO_DISKS);
+            return 1;
+        }
+        virtio_disk->disks = xcalloc(virtio_disk->num_disks,
+                                     sizeof(*virtio_disk->disks));
+
+        for(i = 0; i < virtio_disk->num_disks; i++) {
+            char *disk_opt;
+
+            rc = split_string_into_pair(disks[i], ":", &disk_opt,
+                                        &virtio_disk->disks[i].filename);
+            if (rc) {
+                fprintf(stderr, "vdisk: failed to split \"%s\" into pair\n",
+                        disks[i]);
+                goto out;
+            }
+
+            if (!strcmp(disk_opt, "ro"))
+                virtio_disk->disks[i].readonly = 1;
+            else if (!strcmp(disk_opt, "rw"))
+                virtio_disk->disks[i].readonly = 0;
+            else {
+                fprintf(stderr, "vdisk: failed to parse \"%s\" disk option\n",
+                        disk_opt);
+                rc = 1;
+            }
+            free(disk_opt);
+
+            if (rc) goto out;
+        }
+    } else {
+        fprintf(stderr, "Unknown string \"%s\" in vdisk spec\n", token);
+        rc = 1; goto out;
+    }
+
+    rc = 0;
+
+out:
+    libxl_string_list_dispose(&disks);
+    return rc;
+}
+
+static void parse_virtio_disk_list(const XLU_Config *config,
+                            libxl_domain_config *d_config)
+{
+    XLU_ConfigList *virtio_disks;
+    const char *item;
+    char *buf = NULL;
+    int rc;
+
+    if (!xlu_cfg_get_list (config, "vdisk", &virtio_disks, 0, 0)) {
+        libxl_domain_build_info *b_info = &d_config->b_info;
+        int entry = 0;
+
+        /* XXX Remove an extra property */
+        libxl_defbool_setdefault(&b_info->arch_arm.virtio, false);
+        if (!libxl_defbool_val(b_info->arch_arm.virtio)) {
+            fprintf(stderr, "Virtio device requires Virtio property to be set\n");
+            exit(EXIT_FAILURE);
+        }
+
+        while ((item = xlu_cfg_get_listitem(virtio_disks, entry)) != NULL) {
+            libxl_device_virtio_disk *virtio_disk;
+            char *p;
+
+            virtio_disk = ARRAY_EXTEND_INIT(d_config->virtio_disks,
+                                            d_config->num_virtio_disks,
+                                            libxl_device_virtio_disk_init);
+
+            buf = strdup(item);
+
+            p = strtok (buf, ",");
+            while (p != NULL)
+            {
+                while (*p == ' ') p++;
+
+                rc = parse_virtio_disk_config(virtio_disk, p);
+                if (rc) goto out;
+
+                p = strtok (NULL, ",");
+            }
+
+            entry++;
+
+            if (virtio_disk->num_disks == 0) {
+                fprintf(stderr, "At least one virtio disk should be specified\n");
+                rc = 1; goto out;
+            }
+        }
+    }
+
+    rc = 0;
+
+out:
+    free(buf);
+    if (rc) exit(EXIT_FAILURE);
+}
+
 void parse_config_data(const char *config_source,
                        const char *config_data,
                        int config_len,
@@ -2734,6 +2848,7 @@ skip_usbdev:
     }
 
     parse_vkb_list(config, d_config);
+    parse_virtio_disk_list(config, d_config);
 
     xlu_cfg_get_defbool(config, "xend_suspend_evtchn_compat",
                         &c_info->xend_suspend_evtchn_compat, 0);
diff --git a/tools/xl/xl_virtio_disk.c b/tools/xl/xl_virtio_disk.c
new file mode 100644
index 0000000..808a7da
--- /dev/null
+++ b/tools/xl/xl_virtio_disk.c
@@ -0,0 +1,46 @@
+/*
+ * Copyright (C) 2020 EPAM Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include <stdlib.h>
+
+#include <libxl.h>
+#include <libxl_utils.h>
+#include <libxlutil.h>
+
+#include "xl.h"
+#include "xl_utils.h"
+#include "xl_parse.h"
+
+int main_virtio_diskattach(int argc, char **argv)
+{
+    return 0;
+}
+
+int main_virtio_disklist(int argc, char **argv)
+{
+   return 0;
+}
+
+int main_virtio_diskdetach(int argc, char **argv)
+{
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* RE: [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
  2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
                   ` (23 preceding siblings ...)
  2021-01-12 21:52 ` [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration Oleksandr Tyshchenko
@ 2021-01-14  3:55 ` Wei Chen
  2021-01-14 15:23   ` Oleksandr
  24 siblings, 1 reply; 144+ messages in thread
From: Wei Chen @ 2021-01-14  3:55 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Jun Nakajima, Kevin Tian, Tim Deegan,
	Daniel De Graaf, Volodymyr Babchuk, Anthony PERARD,
	Bertrand Marquis, Kaly Xin, Artem Mygaiev, Alex Bennée

Hi Oleksandr,

I have tested this series with the latest master and staging branches.
The virtio function works well for Arm, as it did in v3.

For the latest staging branch, it needs a tiny rebase for:
0011 xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
As the staging branch changes rapidly, I did the rebase manually and ran the test.
It should not affect the review.

Tested-by: Wei Chen <Wei.Chen@arm.com>

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Oleksandr Tyshchenko
> Sent: 13 January 2021 5:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant
> <paul@xen.org>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>;
> Wei Liu <wl@xen.org>; Julien Grall <Julien.Grall@arm.com>; George Dunlap
> <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall
> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Jun Nakajima
> <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>; Tim Deegan
> <tim@xen.org>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Volodymyr
> Babchuk <Volodymyr_Babchuk@epam.com>; Anthony PERARD
> <anthony.perard@citrix.com>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
> Wei Chen <Wei.Chen@arm.com>; Kaly Xin <Kaly.Xin@arm.com>; Artem
> Mygaiev <joculator@gmail.com>; Alex Bennée <alex.bennee@linaro.org>
> Subject: [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Hello all.
> 
> The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
> You can find an initial discussion at [1] and RFC-V3 series at [2]-[5].
> Xen on Arm requires some implementation to forward guest MMIO access to
> a device model in order to implement a virtio-mmio backend or even a
> mediator outside of the hypervisor.
> As Xen on x86 already contains the required support, this series tries
> to make it common and introduce the Arm-specific bits plus some new
> functionality. The patch series is based on Julien's PoC "xen/arm: Add
> support for Guest IO forwarding to a device emulator".
> Besides splitting the existing IOREQ/DM support and introducing the Arm
> side, the series also includes virtio-mmio related changes (the last 2
> patches, for the toolstack) so that reviewers can see what the whole
> picture could look like and give it a try.
> 
> According to the initial/subsequent discussions, there are a few open
> questions/concerns regarding security and performance in the VirtIO
> solution:
> 1. virtio-mmio vs virtio-pci, SPI vs MSI, or even a composition of
>    virtio-mmio + MSI; different use-cases require different transports...
> 2. the virtio backend is able to access all guest memory, so some kind
>    of protection is needed: 'virtio-iommu in Xen' vs 'pre-shared-memory
>    & memcpys in guest', etc. (for the first two Alex has provided
>    valuable input at [6])
> 3. interface between toolstack and 'out-of-qemu' virtio backend, avoid using
>    Xenstore in virtio backend if possible. Also, there is a desire to make VirtIO
>    backend hypervisor-agnostic.
> 4. a lot of 'foreign mapping' could lead to memory exhaustion on the
>    host side, as we are stealing a page from host memory in order to
>    map the guest page. Julien has some ideas regarding that.
> 5. Julien also has some ideas on how to optimize the IOREQ code:
>    5.1 vcpu_ioreq_handle_completion (former handle_hvm_io_completion)
>        is called in a hotpath on Arm (every time we re-enter the guest):
>        Ideally, vcpu_ioreq_handle_completion should be a NOP (at most
>        a few instructions) if there is nothing to do (if we don't have
>        I/O forwarded to an IOREQ server). Maybe we want to introduce a
>        per-vCPU flag indicating if an I/O has been forwarded to an
>        IOREQ server. This would allow us to bypass most of the function
>        if there is nothing to do.
>    5.2 The current way to handle MMIO is the following:
>        - Pause the vCPU
>        - Forward the access to the backend domain
>        - Schedule the backend domain
>        - Wait for the access to be handled
>        - Unpause the vCPU
>        The sequence is going to be fairly expensive on Xen.
>        It might be possible to optimize the ACK and avoid waiting for
>        the backend to handle the access.
> 
> Looks like all of them are valid and worth considering, but the first thing
> which we need on Arm is a mechanism to forward guest IO to a device emulator,
> so let's focus on it in the first place.
> 
> ***
> 
> There are a lot of changes since the RFC series: almost all TODOs were
> resolved on Arm, the Arm code was improved and hardened, the common
> IOREQ/DM code became really arch-agnostic (without HVM-isms), the
> "legacy" mechanism of mapping magic pages for the IOREQ servers was
> left x86-specific, etc. Also the patch that makes the DM code public
> was reworked to have the top level dm-op handling arch-specific and
> call into ioreq_server_dm_op() for otherwise unhandled ops.
> But one TODO still remains, which is "PIO handling" on Arm.
> The "PIO handling" TODO is expected to be left unaddressed for the
> current series. It is not a big issue for now, while Xen doesn't have
> support for vPCI on Arm. On Arm64 PIOs are only used for PCI IO BARs
> and we would probably want to expose them to the emulator as PIO
> accesses to make a DM completely arch-agnostic. So "PIO handling"
> should be implemented when we add support for vPCI.
> 
> I left the interface untouched in the following patch
> "xen/dm: Introduce xendevicemodel_set_irq_level DM op"
> since there is still an open discussion about what interface to use /
> what information to pass to the hypervisor.
> 
> There are patches under review that this series depends on:
> https://patchwork.kernel.org/patch/11816689
> https://patchwork.kernel.org/patch/11803383
> 
> Please note that the IOREQ feature is disabled by default on Arm
> within the current series.
> 
> ***
> 
> The patch series [7] was rebased on a recent "staging" branch
> (7ba2ab4 x86/p2m: Fix paging_gva_to_gfn() for nested virt) and tested
> on a Renesas Salvator-X board + H3 ES3.0 SoC (Arm64) with a virtio-mmio
> disk backend [8] running in a driver domain and an unmodified Linux
> guest running on the existing virtio-blk driver (frontend). No issues
> were observed. Guest domain 'reboot/destroy' use-cases work properly.
> The patch series was only build-tested on x86.
> 
> Please note, build-test passed for the following modes:
> 1. x86: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y (default)
> 2. x86: #CONFIG_HVM is not set / #CONFIG_IOREQ_SERVER is not set
> 3. Arm64: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
> 4. Arm64: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)
> 5. Arm32: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
> 6. Arm32: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)
> 
> ***
> 
> Any feedback/help would be highly appreciated.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00825.html
> [2] https://lists.xenproject.org/archives/html/xen-devel/2020-08/msg00071.html
> [3] https://lists.xenproject.org/archives/html/xen-devel/2020-09/msg00732.html
> [4] https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01077.html
> [5] https://lists.xenproject.org/archives/html/xen-devel/2020-11/msg02188.html
> [6] https://lists.xenproject.org/archives/html/xen-devel/2020-11/msg02212.html
> [7] https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml5
> [8] https://github.com/xen-troops/virtio-disk/commits/ioreq_ml1
> 
> Julien Grall (5):
>   xen/ioreq: Make x86's IOREQ related dm-op handling common
>   xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
>   arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>   xen/dm: Introduce xendevicemodel_set_irq_level DM op
>   libxl: Introduce basic virtio-mmio support on Arm
> 
> Oleksandr Tyshchenko (19):
>   x86/ioreq: Prepare IOREQ feature for making it common
>   x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
>   x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
>   xen/ioreq: Make x86's IOREQ feature common
>   xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
>   xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
>   xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
>   xen/ioreq: Move x86's ioreq_server to struct domain
>   xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
>   xen/ioreq: Remove "hvm" prefixes from involved function names
>   xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
>   xen/arm: Stick around in leave_hypervisor_to_guest until I/O has
>     completed
>   xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
>   xen/ioreq: Introduce domain_has_ioreq_server()
>   xen/arm: io: Abstract sign-extension
>   xen/arm: io: Harden sign extension check
>   xen/ioreq: Make x86's send_invalidate_req() common
>   xen/arm: Add mapcache invalidation handling
>   [RFC] libxl: Add support for virtio-disk configuration
> 
>  MAINTAINERS                                  |    8 +-
>  tools/include/xendevicemodel.h               |    4 +
>  tools/libs/devicemodel/core.c                |   18 +
>  tools/libs/devicemodel/libxendevicemodel.map |    1 +
>  tools/libs/light/Makefile                    |    1 +
>  tools/libs/light/libxl_arm.c                 |   94 +-
>  tools/libs/light/libxl_create.c              |    1 +
>  tools/libs/light/libxl_internal.h            |    1 +
>  tools/libs/light/libxl_types.idl             |   16 +
>  tools/libs/light/libxl_types_internal.idl    |    1 +
>  tools/libs/light/libxl_virtio_disk.c         |  109 ++
>  tools/xl/Makefile                            |    2 +-
>  tools/xl/xl.h                                |    3 +
>  tools/xl/xl_cmdtable.c                       |   15 +
>  tools/xl/xl_parse.c                          |  116 +++
>  tools/xl/xl_virtio_disk.c                    |   46 +
>  xen/arch/arm/Makefile                        |    2 +
>  xen/arch/arm/dm.c                            |  174 ++++
>  xen/arch/arm/domain.c                        |    9 +
>  xen/arch/arm/io.c                            |   30 +-
>  xen/arch/arm/ioreq.c                         |  198 ++++
>  xen/arch/arm/p2m.c                           |   51 +-
>  xen/arch/arm/traps.c                         |   72 +-
>  xen/arch/x86/Kconfig                         |    1 +
>  xen/arch/x86/hvm/dm.c                        |  107 +-
>  xen/arch/x86/hvm/emulate.c                   |  220 ++--
>  xen/arch/x86/hvm/hvm.c                       |   14 +-
>  xen/arch/x86/hvm/hypercall.c                 |    9 +-
>  xen/arch/x86/hvm/intercept.c                 |    5 +-
>  xen/arch/x86/hvm/io.c                        |   52 +-
>  xen/arch/x86/hvm/ioreq.c                     | 1375 ++-----------------------
>  xen/arch/x86/hvm/stdvga.c                    |   12 +-
>  xen/arch/x86/hvm/svm/nestedsvm.c             |    2 +-
>  xen/arch/x86/hvm/vmx/realmode.c              |    8 +-
>  xen/arch/x86/hvm/vmx/vvmx.c                  |    5 +-
>  xen/arch/x86/mm.c                            |   46 +-
>  xen/arch/x86/mm/p2m.c                        |   17 +-
>  xen/arch/x86/mm/shadow/common.c              |    2 +-
>  xen/common/Kconfig                           |    3 +
>  xen/common/Makefile                          |    1 +
>  xen/common/ioreq.c                           | 1426 ++++++++++++++++++++++++++
>  xen/common/memory.c                          |   72 +-
>  xen/include/asm-arm/domain.h                 |    3 +
>  xen/include/asm-arm/hvm/ioreq.h              |   72 ++
>  xen/include/asm-arm/mm.h                     |    8 -
>  xen/include/asm-arm/mmio.h                   |    1 +
>  xen/include/asm-arm/p2m.h                    |   19 +-
>  xen/include/asm-arm/traps.h                  |   25 +
>  xen/include/asm-x86/hvm/domain.h             |   43 -
>  xen/include/asm-x86/hvm/emulate.h            |    2 +-
>  xen/include/asm-x86/hvm/io.h                 |   17 -
>  xen/include/asm-x86/hvm/ioreq.h              |   39 +-
>  xen/include/asm-x86/hvm/vcpu.h               |   18 -
>  xen/include/asm-x86/mm.h                     |    4 -
>  xen/include/asm-x86/p2m.h                    |   27 +-
>  xen/include/public/arch-arm.h                |    5 +
>  xen/include/public/hvm/dm_op.h               |   16 +
>  xen/include/xen/dm.h                         |   39 +
>  xen/include/xen/ioreq.h                      |  140 +++
>  xen/include/xen/p2m-common.h                 |    4 +
>  xen/include/xen/sched.h                      |   34 +
>  xen/include/xsm/dummy.h                      |    4 +-
>  xen/include/xsm/xsm.h                        |    6 +-
>  xen/xsm/dummy.c                              |    2 +-
>  xen/xsm/flask/hooks.c                        |    5 +-
>  65 files changed, 3073 insertions(+), 1809 deletions(-)
>  create mode 100644 tools/libs/light/libxl_virtio_disk.c
>  create mode 100644 tools/xl/xl_virtio_disk.c
>  create mode 100644 xen/arch/arm/dm.c
>  create mode 100644 xen/arch/arm/ioreq.c
>  create mode 100644 xen/common/ioreq.c
>  create mode 100644 xen/include/asm-arm/hvm/ioreq.h
>  create mode 100644 xen/include/xen/dm.h
>  create mode 100644 xen/include/xen/ioreq.h
> 
> --
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* RE: [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
  2021-01-12 21:52 ` [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common Oleksandr Tyshchenko
@ 2021-01-14  3:58   ` Wei Chen
  2021-01-14 15:31     ` Oleksandr
  2021-01-18  9:38   ` Paul Durrant
  1 sibling, 1 reply; 144+ messages in thread
From: Wei Chen @ 2021-01-14  3:58 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Julien Grall, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko

Hi Oleksandr,

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Oleksandr Tyshchenko
> Sent: 13 January 2021 5:52
> To: xen-devel@lists.xenproject.org
> Cc: Julien Grall <Julien.Grall@arm.com>; Jan Beulich <jbeulich@suse.com>;
> Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap
> <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall
> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Volodymyr
> Babchuk <Volodymyr_Babchuk@epam.com>; Oleksandr Tyshchenko
> <oleksandr_tyshchenko@epam.com>
> Subject: [PATCH V4 11/24] xen/mm: Make x86's
> XENMEM_resource_ioreq_server handling common
> 
> From: Julien Grall <julien.grall@arm.com>
> 
> As the x86 implementation of XENMEM_resource_ioreq_server can be
> re-used on Arm later on, this patch makes it common and removes
> arch_acquire_resource as unneeded.
> 
> Also re-order #include-s alphabetically.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - no changes
> 
> Changes V1 -> V2:
>    - update the author of a patch
> 
> Changes V2 -> V3:
>    - don't wrap #include <xen/ioreq.h>
>    - limit the number of #ifdef-s
>    - re-order #include-s alphabetically
> 
> Changes V3 -> V4:
>    - rebase
>    - Add Jan's R-b
> ---
>  xen/arch/x86/mm.c        | 44 ---------------------------------
>  xen/common/memory.c      | 63 +++++++++++++++++++++++++++++++++++++++---------
>  xen/include/asm-arm/mm.h |  8 ------
>  xen/include/asm-x86/mm.h |  4 ---
>  4 files changed, 51 insertions(+), 68 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index f6e128e..54ac398 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4587,50 +4587,6 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
>      return err || s > e ? err : _handle_iomem_range(s, e, p);
>  }
> 
> -int arch_acquire_resource(struct domain *d, unsigned int type,
> -                          unsigned int id, unsigned long frame,
> -                          unsigned int nr_frames, xen_pfn_t mfn_list[])
> -{
> -    int rc;
> -
> -    switch ( type )
> -    {
> -#ifdef CONFIG_HVM
> -    case XENMEM_resource_ioreq_server:
> -    {
> -        ioservid_t ioservid = id;
> -        unsigned int i;
> -
> -        rc = -EINVAL;
> -        if ( !is_hvm_domain(d) )
> -            break;
> -
> -        if ( id != (unsigned int)ioservid )
> -            break;
> -
> -        rc = 0;
> -        for ( i = 0; i < nr_frames; i++ )
> -        {
> -            mfn_t mfn;
> -
> -            rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
> -            if ( rc )
> -                break;
> -
> -            mfn_list[i] = mfn_x(mfn);
> -        }
> -        break;
> -    }
> -#endif
> -
> -    default:
> -        rc = -EOPNOTSUPP;
> -        break;
> -    }
> -
> -    return rc;
> -}
> -
>  long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
>      int rc;
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index b21b6c4..7e560b5 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -8,22 +8,23 @@
>   */
> 
>  #include <xen/domain_page.h>
> -#include <xen/types.h>
> +#include <xen/errno.h>
> +#include <xen/event.h>
> +#include <xen/grant_table.h>
> +#include <xen/guest_access.h>
> +#include <xen/hypercall.h>
> +#include <xen/iocap.h>
> +#include <xen/ioreq.h>
>  #include <xen/lib.h>
> +#include <xen/mem_access.h>
>  #include <xen/mm.h>
> +#include <xen/numa.h>
> +#include <xen/paging.h>
>  #include <xen/param.h>
>  #include <xen/perfc.h>
>  #include <xen/sched.h>
> -#include <xen/event.h>
> -#include <xen/paging.h>
> -#include <xen/iocap.h>
> -#include <xen/guest_access.h>
> -#include <xen/hypercall.h>
> -#include <xen/errno.h>
> -#include <xen/numa.h>
> -#include <xen/mem_access.h>
>  #include <xen/trace.h>
> -#include <xen/grant_table.h>
> +#include <xen/types.h>
>  #include <asm/current.h>
>  #include <asm/hardirq.h>
>  #include <asm/p2m.h>
> @@ -1090,6 +1091,40 @@ static int acquire_grant_table(struct domain *d, unsigned int id,
>      return 0;
>  }
> 
> +static int acquire_ioreq_server(struct domain *d,
> +                                unsigned int id,
> +                                unsigned long frame,
> +                                unsigned int nr_frames,
> +                                xen_pfn_t mfn_list[])
> +{
> +#ifdef CONFIG_IOREQ_SERVER
> +    ioservid_t ioservid = id;
> +    unsigned int i;
> +    int rc;
> +
> +    if ( !is_hvm_domain(d) )
> +        return -EINVAL;
> +
> +    if ( id != (unsigned int)ioservid )
> +        return -EINVAL;
> +
> +    for ( i = 0; i < nr_frames; i++ )
> +    {
> +        mfn_t mfn;
> +
> +        rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
> +        if ( rc )
> +            return rc;
> +
> +        mfn_list[i] = mfn_x(mfn);
> +    }
> +
> +    return 0;
> +#else
> +    return -EOPNOTSUPP;
> +#endif
> +}
> +
>  static int acquire_resource(
>      XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
>  {
> @@ -1148,9 +1183,13 @@ static int acquire_resource(
>                                   mfn_list);
>          break;
> 
> +    case XENMEM_resource_ioreq_server:
> +        rc = acquire_ioreq_server(d, xmar.id, xmar.frame, xmar.nr_frames,
> +                                  mfn_list);
> +        break;
> +
>      default:
> -        rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
> -                                   xmar.nr_frames, mfn_list);
> +        rc = -EOPNOTSUPP;
>          break;
>      }
> 
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index f8ba49b..0b7de31 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -358,14 +358,6 @@ static inline void put_page_and_type(struct page_info *page)
> 
>  void clear_and_clean_page(struct page_info *page);
> 
> -static inline
> -int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id,
> -                          unsigned long frame, unsigned int nr_frames,
> -                          xen_pfn_t mfn_list[])
> -{
> -    return -EOPNOTSUPP;
> -}
> -
>  unsigned int arch_get_dma_bitsize(void);
> 

This change could not be applied to the latest staging branch.

>  #endif /*  __ARCH_ARM_MM__ */
> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
> index deeba75..859214e 100644
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -639,8 +639,4 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn)
>      return mfn <= (virt_to_mfn(eva - 1) + 1);
>  }
> 
> -int arch_acquire_resource(struct domain *d, unsigned int type,
> -                          unsigned int id, unsigned long frame,
> -                          unsigned int nr_frames, xen_pfn_t mfn_list[]);
> -
>  #endif /* __ASM_X86_MM_H__ */
> --
> 2.7.4
> 
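
For context on how this resource type gets consumed, here is a hedged
sketch of the device-model side, i.e. how an emulator might map the ioreq
server pages exposed above. This is not part of the series; it assumes the
existing stable libxenforeignmemory interface, elides error handling, and
map_ioreq_pages() is a made-up helper name:

    #include <stdint.h>
    #include <sys/mman.h>          /* PROT_READ / PROT_WRITE */
    #include <xenforeignmemory.h>
    #include <xen/memory.h>        /* XENMEM_resource_ioreq_server */

    /* Map nr_frames ioreq pages of server 'id' in domain 'domid'. */
    static void *map_ioreq_pages(xenforeignmemory_handle *fmem,
                                 uint32_t domid, unsigned int id,
                                 unsigned long nr_frames,
                                 xenforeignmemory_resource_handle **fres)
    {
        void *addr = NULL;

        *fres = xenforeignmemory_map_resource(
            fmem, domid, XENMEM_resource_ioreq_server, id,
            XENMEM_resource_ioreq_server_frame_ioreq(0), nr_frames,
            &addr, PROT_READ | PROT_WRITE, 0);

        return *fres ? addr : NULL;
    }

On teardown the mapping is released with xenforeignmemory_unmap_resource().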


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
  2021-01-14  3:55 ` [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Wei Chen
@ 2021-01-14 15:23   ` Oleksandr
  2021-01-07 14:35     ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-14 15:23 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, Oleksandr Tyshchenko, Paul Durrant, Jan Beulich,
	Andrew Cooper, Roger Pau Monné,
	Wei Liu, Julien Grall, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Jun Nakajima, Kevin Tian, Tim Deegan,
	Daniel De Graaf, Volodymyr Babchuk, Anthony PERARD,
	Bertrand Marquis, Kaly Xin, Artem Mygaiev, Alex Bennée


On 14.01.21 05:55, Wei Chen wrote:
> Hi Oleksandr,

Hi Wei.


>
> I have tested this series with the latest master and staging branches.
> The virtio function works well on Arm, as it did for v3.

Thank you! This is good news.


>
> For the latest staging branch, it needs a tiny rebase for:
> 0011 xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
> As the staging branch changes rapidly, I did the rebase manually and ran
> the test. It should not affect the review.

Yes, very rapidly I would say :) I will need to rebase due to the recent
"xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE" patch.


>
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Thanks.


>
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
>> Oleksandr Tyshchenko
>> Sent: 13 January 2021 5:52
>> To: xen-devel@lists.xenproject.org
>> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant
>> <paul@xen.org>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
>> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>;
>> Wei Liu <wl@xen.org>; Julien Grall <Julien.Grall@arm.com>; George Dunlap
>> <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall
>> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Jun Nakajima
>> <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>; Tim Deegan
>> <tim@xen.org>; Daniel De Graaf <dgdegra@tycho.nsa.gov>; Volodymyr
>> Babchuk <Volodymyr_Babchuk@epam.com>; Anthony PERARD
>> <anthony.perard@citrix.com>; Bertrand Marquis <Bertrand.Marquis@arm.com>;
>> Wei Chen <Wei.Chen@arm.com>; Kaly Xin <Kaly.Xin@arm.com>; Artem
>> Mygaiev <joculator@gmail.com>; Alex Bennée <alex.bennee@linaro.org>
>> Subject: [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
>>
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> Hello all.
>>
>> The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
>> You can find an initial discussion at [1] and RFC-V3 series at [2]-[5].
>> Xen on Arm requires some implementation to forward guest MMIO accesses to a
>> device model in order to implement a virtio-mmio backend or even a mediator
>> outside of the hypervisor. As Xen on x86 already contains the required
>> support, this series tries to make it common and introduce the Arm-specific
>> bits plus some new functionality. The patch series is based on Julien's PoC
>> "xen/arm: Add support for Guest IO forwarding to a device emulator".
>> Besides splitting the existing IOREQ/DM support and introducing the Arm
>> side, the series also includes virtio-mmio related changes (the last 2
>> patches, for the toolstack) so that reviewers can see what the whole
>> picture could look like and give it a try.
>>
>> According to the initial/subsequent discussions there are a few open
>> questions/concerns regarding security and performance in the VirtIO
>> solution:
>> 1. virtio-mmio vs virtio-pci, SPI vs MSI, or even a composition of
>>    virtio-mmio + MSI; different use-cases require different transports...
>> 2. The virtio backend is able to access all guest memory, so some kind of
>>    protection is needed: 'virtio-iommu in Xen' vs 'pre-shared-memory &
>>    memcpys in guest', etc (for the first two Alex has provided valuable
>>    input at [6]).
>> 3. The interface between the toolstack and the 'out-of-qemu' virtio
>>    backend; avoid using Xenstore in the virtio backend if possible. Also,
>>    there is a desire to make the VirtIO backend hypervisor-agnostic.
>> 4. A lot of 'foreign mapping' could lead to memory exhaustion on the host
>>    side, as we are stealing a page from host memory in order to map each
>>    guest page. Julien has some ideas regarding that.
>> 5. Julien also has some ideas on how to optimize the IOREQ code:
>>    5.1 vcpu_ioreq_handle_completion (former handle_hvm_io_completion) is
>>        called on a hot path on Arm (every time we re-enter the guest).
>>        Ideally, vcpu_ioreq_handle_completion should be a NOP (at most a
>>        few instructions) if there is nothing to do, i.e. if no I/O was
>>        forwarded to an IOREQ server. Maybe we want to introduce a per-vCPU
>>        flag indicating whether an I/O has been forwarded to an IOREQ
>>        server; this would allow us to bypass most of the function if
>>        there is nothing to do (see the sketch below).
>>    5.2 The current way to handle MMIO is the following:
>>        - Pause the vCPU
>>        - Forward the access to the backend domain
>>        - Schedule the backend domain
>>        - Wait for the access to be handled
>>        - Unpause the vCPU
>>        The sequence is going to be fairly expensive on Xen. It might be
>>        possible to optimize the ACK and avoid waiting for the backend to
>>        handle the access.
>>
>> All of them look valid and worth considering, but the first thing we need
>> on Arm is a mechanism to forward guest I/O to a device emulator, so let's
>> focus on that in the first place.
>>
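
A minimal sketch of the per-vCPU flag idea from 5.1 above. The field and
helper names are made up for illustration; only vcpu_ioreq_handle_completion()
is a name the series actually uses:

    #include <stdbool.h>

    struct vcpu {
        /* hypothetical: set when an I/O was forwarded to an IOREQ server */
        bool io_forwarded;
    };

    /* Slow path: wait for and consume the IOREQ server's response. */
    static void wait_for_io_completion(struct vcpu *v)
    {
        v->io_forwarded = false;
    }

    void vcpu_ioreq_handle_completion(struct vcpu *v)
    {
        /* Fast path: a few instructions when no I/O is outstanding. */
        if ( !v->io_forwarded )
            return;

        wait_for_io_completion(v);
    }
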
>> ***
>>
>> There are a lot of changes since the RFC series: almost all TODOs were
>> resolved on Arm, the Arm code was improved and hardened, the common
>> IOREQ/DM code became really arch-agnostic (without HVM-isms), the "legacy"
>> mechanism of mapping magic pages for the IOREQ servers was left
>> x86-specific, etc. Also, the patch that makes the DM code public was
>> reworked to have the top-level dm-op handling arch-specific and call into
>> ioreq_server_dm_op() for otherwise unhandled ops.
>> But one TODO still remains, which is "PIO handling" on Arm.
>> The "PIO handling" TODO is expected to be left unaddressed for the current
>> series. It is not a big issue for now while Xen doesn't have support for
>> vPCI on Arm. On Arm64 PIO accesses are only used for PCI I/O BARs and we
>> would probably want to expose them to the emulator as PIO accesses to make
>> a DM completely arch-agnostic. So "PIO handling" should be implemented
>> when we add support for vPCI.
>>
>> I left the interface untouched in the following patch,
>> "xen/dm: Introduce xendevicemodel_set_irq_level DM op",
>> since there is still an open discussion about what interface to use and
>> what information to pass to the hypervisor.
>>
>> There are patches on review this series depends on:
>> https://patchwork.kernel.org/patch/11816689
>> https://patchwork.kernel.org/patch/11803383
>>
>> Please note that the IOREQ feature is disabled by default on Arm within
>> the current series.
>>
>> ***
>>
>> Patch series [7] was rebased on a recent "staging" branch
>> (7ba2ab4 x86/p2m: Fix paging_gva_to_gfn() for nested virt) and tested on a
>> Renesas Salvator-X board + H3 ES3.0 SoC (Arm64) with the virtio-mmio disk
>> backend [8] running in a driver domain and an unmodified Linux guest
>> running on the existing virtio-blk driver (frontend). No issues were
>> observed. Guest domain 'reboot/destroy' use-cases work properly. The
>> patch series was only build-tested on x86.
>>
>> Please note, build-test passed for the following modes:
>> 1. x86: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y (default)
>> 2. x86: #CONFIG_HVM is not set / #CONFIG_IOREQ_SERVER is not set
>> 3. Arm64: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
>> 4. Arm64: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)
>> 5. Arm32: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
>> 6. Arm32: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set  (default)
>>
>> ***
>>
>> Any feedback/help would be highly appreciated.
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00825.html
>> [2] https://lists.xenproject.org/archives/html/xen-devel/2020-08/msg00071.html
>> [3] https://lists.xenproject.org/archives/html/xen-devel/2020-09/msg00732.html
>> [4] https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01077.html
>> [5] https://lists.xenproject.org/archives/html/xen-devel/2020-11/msg02188.html
>> [6] https://lists.xenproject.org/archives/html/xen-devel/2020-11/msg02212.html
>> [7] https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml5
>> [8] https://github.com/xen-troops/virtio-disk/commits/ioreq_ml1
>>
>> Julien Grall (5):
>>    xen/ioreq: Make x86's IOREQ related dm-op handling common
>>    xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
>>    arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>>    xen/dm: Introduce xendevicemodel_set_irq_level DM op
>>    libxl: Introduce basic virtio-mmio support on Arm
>>
>> Oleksandr Tyshchenko (19):
>>    x86/ioreq: Prepare IOREQ feature for making it common
>>    x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
>>    x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
>>    xen/ioreq: Make x86's IOREQ feature common
>>    xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
>>    xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
>>    xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
>>    xen/ioreq: Move x86's ioreq_server to struct domain
>>    xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
>>    xen/ioreq: Remove "hvm" prefixes from involved function names
>>    xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
>>    xen/arm: Stick around in leave_hypervisor_to_guest until I/O has
>>      completed
>>    xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
>>    xen/ioreq: Introduce domain_has_ioreq_server()
>>    xen/arm: io: Abstract sign-extension
>>    xen/arm: io: Harden sign extension check
>>    xen/ioreq: Make x86's send_invalidate_req() common
>>    xen/arm: Add mapcache invalidation handling
>>    [RFC] libxl: Add support for virtio-disk configuration
>>
>>   MAINTAINERS                                  |    8 +-
>>   tools/include/xendevicemodel.h               |    4 +
>>   tools/libs/devicemodel/core.c                |   18 +
>>   tools/libs/devicemodel/libxendevicemodel.map |    1 +
>>   tools/libs/light/Makefile                    |    1 +
>>   tools/libs/light/libxl_arm.c                 |   94 +-
>>   tools/libs/light/libxl_create.c              |    1 +
>>   tools/libs/light/libxl_internal.h            |    1 +
>>   tools/libs/light/libxl_types.idl             |   16 +
>>   tools/libs/light/libxl_types_internal.idl    |    1 +
>>   tools/libs/light/libxl_virtio_disk.c         |  109 ++
>>   tools/xl/Makefile                            |    2 +-
>>   tools/xl/xl.h                                |    3 +
>>   tools/xl/xl_cmdtable.c                       |   15 +
>>   tools/xl/xl_parse.c                          |  116 +++
>>   tools/xl/xl_virtio_disk.c                    |   46 +
>>   xen/arch/arm/Makefile                        |    2 +
>>   xen/arch/arm/dm.c                            |  174 ++++
>>   xen/arch/arm/domain.c                        |    9 +
>>   xen/arch/arm/io.c                            |   30 +-
>>   xen/arch/arm/ioreq.c                         |  198 ++++
>>   xen/arch/arm/p2m.c                           |   51 +-
>>   xen/arch/arm/traps.c                         |   72 +-
>>   xen/arch/x86/Kconfig                         |    1 +
>>   xen/arch/x86/hvm/dm.c                        |  107 +-
>>   xen/arch/x86/hvm/emulate.c                   |  220 ++--
>>   xen/arch/x86/hvm/hvm.c                       |   14 +-
>>   xen/arch/x86/hvm/hypercall.c                 |    9 +-
>>   xen/arch/x86/hvm/intercept.c                 |    5 +-
>>   xen/arch/x86/hvm/io.c                        |   52 +-
>>   xen/arch/x86/hvm/ioreq.c                     | 1375 ++-----------------------
>>   xen/arch/x86/hvm/stdvga.c                    |   12 +-
>>   xen/arch/x86/hvm/svm/nestedsvm.c             |    2 +-
>>   xen/arch/x86/hvm/vmx/realmode.c              |    8 +-
>>   xen/arch/x86/hvm/vmx/vvmx.c                  |    5 +-
>>   xen/arch/x86/mm.c                            |   46 +-
>>   xen/arch/x86/mm/p2m.c                        |   17 +-
>>   xen/arch/x86/mm/shadow/common.c              |    2 +-
>>   xen/common/Kconfig                           |    3 +
>>   xen/common/Makefile                          |    1 +
>>   xen/common/ioreq.c                           | 1426 ++++++++++++++++++++++++++
>>   xen/common/memory.c                          |   72 +-
>>   xen/include/asm-arm/domain.h                 |    3 +
>>   xen/include/asm-arm/hvm/ioreq.h              |   72 ++
>>   xen/include/asm-arm/mm.h                     |    8 -
>>   xen/include/asm-arm/mmio.h                   |    1 +
>>   xen/include/asm-arm/p2m.h                    |   19 +-
>>   xen/include/asm-arm/traps.h                  |   25 +
>>   xen/include/asm-x86/hvm/domain.h             |   43 -
>>   xen/include/asm-x86/hvm/emulate.h            |    2 +-
>>   xen/include/asm-x86/hvm/io.h                 |   17 -
>>   xen/include/asm-x86/hvm/ioreq.h              |   39 +-
>>   xen/include/asm-x86/hvm/vcpu.h               |   18 -
>>   xen/include/asm-x86/mm.h                     |    4 -
>>   xen/include/asm-x86/p2m.h                    |   27 +-
>>   xen/include/public/arch-arm.h                |    5 +
>>   xen/include/public/hvm/dm_op.h               |   16 +
>>   xen/include/xen/dm.h                         |   39 +
>>   xen/include/xen/ioreq.h                      |  140 +++
>>   xen/include/xen/p2m-common.h                 |    4 +
>>   xen/include/xen/sched.h                      |   34 +
>>   xen/include/xsm/dummy.h                      |    4 +-
>>   xen/include/xsm/xsm.h                        |    6 +-
>>   xen/xsm/dummy.c                              |    2 +-
>>   xen/xsm/flask/hooks.c                        |    5 +-
>>   65 files changed, 3073 insertions(+), 1809 deletions(-)
>>   create mode 100644 tools/libs/light/libxl_virtio_disk.c
>>   create mode 100644 tools/xl/xl_virtio_disk.c
>>   create mode 100644 xen/arch/arm/dm.c
>>   create mode 100644 xen/arch/arm/ioreq.c
>>   create mode 100644 xen/common/ioreq.c
>>   create mode 100644 xen/include/asm-arm/hvm/ioreq.h
>>   create mode 100644 xen/include/xen/dm.h
>>   create mode 100644 xen/include/xen/ioreq.h
>>
>> --
>> 2.7.4
>>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
  2021-01-14  3:58   ` Wei Chen
@ 2021-01-14 15:31     ` Oleksandr
  2021-01-15 14:35       ` Alex Bennée
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-14 15:31 UTC (permalink / raw)
  To: Wei Chen
  Cc: xen-devel, Julien Grall, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko


On 14.01.21 05:58, Wei Chen wrote:
> Hi Oleksandr,

Hi Wei


>
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
>> Oleksandr Tyshchenko
>> Sent: 13 January 2021 5:52
>> To: xen-devel@lists.xenproject.org
>> Cc: Julien Grall <Julien.Grall@arm.com>; Jan Beulich <jbeulich@suse.com>;
>> Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
>> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap
>> <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall
>> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Volodymyr
>> Babchuk <Volodymyr_Babchuk@epam.com>; Oleksandr Tyshchenko
>> <oleksandr_tyshchenko@epam.com>
>> Subject: [PATCH V4 11/24] xen/mm: Make x86's
>> XENMEM_resource_ioreq_server handling common
>>
>> From: Julien Grall <julien.grall@arm.com>
>>
>> As the x86 implementation of XENMEM_resource_ioreq_server can be
>> re-used on Arm later on, this patch makes it common and removes
>> arch_acquire_resource as unneeded.
>>
>> Also re-order #include-s alphabetically.
>>
>> This support is going to be used on Arm to be able to run a device
>> emulator outside of the Xen hypervisor.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> Changes RFC -> V1:
>>     - no changes
>>
>> Changes V1 -> V2:
>>     - update the author of a patch
>>
>> Changes V2 -> V3:
>>     - don't wrap #include <xen/ioreq.h>
>>     - limit the number of #ifdef-s
>>     - re-order #include-s alphabetically
>>
>> Changes V3 -> V4:
>>     - rebase
>>     - Add Jan's R-b
>> ---
>>   xen/arch/x86/mm.c        | 44 ---------------------------------
>>   xen/common/memory.c      | 63 +++++++++++++++++++++++++++++++++++++++---------
>>   xen/include/asm-arm/mm.h |  8 ------
>>   xen/include/asm-x86/mm.h |  4 ---
>>   4 files changed, 51 insertions(+), 68 deletions(-)
>>
>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>> index f6e128e..54ac398 100644
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -4587,50 +4587,6 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
>>       return err || s > e ? err : _handle_iomem_range(s, e, p);
>>   }
>>
>> -int arch_acquire_resource(struct domain *d, unsigned int type,
>> -                          unsigned int id, unsigned long frame,
>> -                          unsigned int nr_frames, xen_pfn_t mfn_list[])
>> -{
>> -    int rc;
>> -
>> -    switch ( type )
>> -    {
>> -#ifdef CONFIG_HVM
>> -    case XENMEM_resource_ioreq_server:
>> -    {
>> -        ioservid_t ioservid = id;
>> -        unsigned int i;
>> -
>> -        rc = -EINVAL;
>> -        if ( !is_hvm_domain(d) )
>> -            break;
>> -
>> -        if ( id != (unsigned int)ioservid )
>> -            break;
>> -
>> -        rc = 0;
>> -        for ( i = 0; i < nr_frames; i++ )
>> -        {
>> -            mfn_t mfn;
>> -
>> -            rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
>> -            if ( rc )
>> -                break;
>> -
>> -            mfn_list[i] = mfn_x(mfn);
>> -        }
>> -        break;
>> -    }
>> -#endif
>> -
>> -    default:
>> -        rc = -EOPNOTSUPP;
>> -        break;
>> -    }
>> -
>> -    return rc;
>> -}
>> -
>>   long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>   {
>>       int rc;
>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>> index b21b6c4..7e560b5 100644
>> --- a/xen/common/memory.c
>> +++ b/xen/common/memory.c
>> @@ -8,22 +8,23 @@
>>    */
>>
>>   #include <xen/domain_page.h>
>> -#include <xen/types.h>
>> +#include <xen/errno.h>
>> +#include <xen/event.h>
>> +#include <xen/grant_table.h>
>> +#include <xen/guest_access.h>
>> +#include <xen/hypercall.h>
>> +#include <xen/iocap.h>
>> +#include <xen/ioreq.h>
>>   #include <xen/lib.h>
>> +#include <xen/mem_access.h>
>>   #include <xen/mm.h>
>> +#include <xen/numa.h>
>> +#include <xen/paging.h>
>>   #include <xen/param.h>
>>   #include <xen/perfc.h>
>>   #include <xen/sched.h>
>> -#include <xen/event.h>
>> -#include <xen/paging.h>
>> -#include <xen/iocap.h>
>> -#include <xen/guest_access.h>
>> -#include <xen/hypercall.h>
>> -#include <xen/errno.h>
>> -#include <xen/numa.h>
>> -#include <xen/mem_access.h>
>>   #include <xen/trace.h>
>> -#include <xen/grant_table.h>
>> +#include <xen/types.h>
>>   #include <asm/current.h>
>>   #include <asm/hardirq.h>
>>   #include <asm/p2m.h>
>> @@ -1090,6 +1091,40 @@ static int acquire_grant_table(struct domain *d, unsigned int id,
>>       return 0;
>>   }
>>
>> +static int acquire_ioreq_server(struct domain *d,
>> +                                unsigned int id,
>> +                                unsigned long frame,
>> +                                unsigned int nr_frames,
>> +                                xen_pfn_t mfn_list[])
>> +{
>> +#ifdef CONFIG_IOREQ_SERVER
>> +    ioservid_t ioservid = id;
>> +    unsigned int i;
>> +    int rc;
>> +
>> +    if ( !is_hvm_domain(d) )
>> +        return -EINVAL;
>> +
>> +    if ( id != (unsigned int)ioservid )
>> +        return -EINVAL;
>> +
>> +    for ( i = 0; i < nr_frames; i++ )
>> +    {
>> +        mfn_t mfn;
>> +
>> +        rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
>> +        if ( rc )
>> +            return rc;
>> +
>> +        mfn_list[i] = mfn_x(mfn);
>> +    }
>> +
>> +    return 0;
>> +#else
>> +    return -EOPNOTSUPP;
>> +#endif
>> +}
>> +
>>   static int acquire_resource(
>>       XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
>>   {
>> @@ -1148,9 +1183,13 @@ static int acquire_resource(
>>                                    mfn_list);
>>           break;
>>
>> +    case XENMEM_resource_ioreq_server:
>> +        rc = acquire_ioreq_server(d, xmar.id, xmar.frame, xmar.nr_frames,
>> +                                  mfn_list);
>> +        break;
>> +
>>       default:
>> -        rc = arch_acquire_resource(d, xmar.type, xmar.id, xmar.frame,
>> -                                   xmar.nr_frames, mfn_list);
>> +        rc = -EOPNOTSUPP;
>>           break;
>>       }
>>
>> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
>> index f8ba49b..0b7de31 100644
>> --- a/xen/include/asm-arm/mm.h
>> +++ b/xen/include/asm-arm/mm.h
>> @@ -358,14 +358,6 @@ static inline void put_page_and_type(struct page_info *page)
>>
>>   void clear_and_clean_page(struct page_info *page);
>>
>> -static inline
>> -int arch_acquire_resource(struct domain *d, unsigned int type, unsigned int id,
>> -                          unsigned long frame, unsigned int nr_frames,
>> -                          xen_pfn_t mfn_list[])
>> -{
>> -    return -EOPNOTSUPP;
>> -}
>> -
>>   unsigned int arch_get_dma_bitsize(void);
>>
> This change could not be applied to the latest staging branch.

Yes, thank you for noticing that.  The code around it was changed a bit (the
patch series is based on a 10-day-old staging branch); I will update it for
the next version.


>
>>   #endif /*  __ARCH_ARM_MM__ */
>> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
>> index deeba75..859214e 100644
>> --- a/xen/include/asm-x86/mm.h
>> +++ b/xen/include/asm-x86/mm.h
>> @@ -639,8 +639,4 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn)
>>       return mfn <= (virt_to_mfn(eva - 1) + 1);
>>   }
>>
>> -int arch_acquire_resource(struct domain *d, unsigned int type,
>> -                          unsigned int id, unsigned long frame,
>> -                          unsigned int nr_frames, xen_pfn_t mfn_list[]);
>> -
>>   #endif /* __ASM_X86_MM_H__ */
>> --
>> 2.7.4
>>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-07 14:35     ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
  2021-01-07 15:45       ` Oleksandr
@ 2021-01-14 16:06       ` Ian Jackson
  2021-01-14 19:02         ` Andrew Cooper
  1 sibling, 1 reply; 144+ messages in thread
From: Ian Jackson @ 2021-01-14 16:06 UTC (permalink / raw)
  To: xen-devel, committers

The last posting date for new feature patches for Xen 4.15 is
tomorrow. [1]  We seem to be getting a reasonably good flood of stuff
trying to meet this deadline :-).

Patches for new features posted after tomorrow will be deferred to the
next Xen release after 4.15.  NB the primary responsibility for
driving a feature's progress to meet the release schedule lies with
the feature's proponent(s).


  As a reminder, here is the release schedule:
+ (unchanged information indented with spaces):

   Friday 15th January    Last posting date

       Patches adding new features should be posted to the mailing list
     by this date, although perhaps not in their final version.

   Friday 29th January    Feature freeze

       Patches adding new features should be committed by this date.
       Straightforward bugfixes may continue to be accepted by
       maintainers.

   Friday 12th February **tentative**   Code freeze

       Bugfixes only, all changes to be approved by the Release Manager.

   Week of 12th March **tentative**    Release
       (probably Tuesday or Wednesday)

  Any patches containing substantial refactoring are to be treated as
  new features, even if their intent is to fix bugs.

  Freeze exceptions will not be routine, but may be granted in
  exceptional cases for small changes on the basis of risk assessment.
  Large series will not get exceptions.  Contributors *must not* rely on
  getting, or expect, a freeze exception.

+ New or improved tests (provided they do not involve refactoring or
+ even build system reorganisation), and documentation improvements,
+ will generally be treated as bugfixes.

  The code freeze and release dates are provisional and will be adjusted
  in the light of apparent code quality etc.

  If as a feature proponent you feel your feature is at risk and there
  is something the Xen Project could do to help, please consult me or
  the Community Manager.  In such situations please reach out earlier
  rather than later.


In my last update I asked this:

> If you are working on a feature you want in 4.15 please let me know
> about it.  Ideally I'd like a little stanza like this:
> 
> S: feature name
> O: feature owner (proponent) name
> E: feature owner (proponent) email address
> P: your current estimate of the probability of it making 4.15, as a %age
> 
> But free-form text is OK too.  Please reply to this mail.

I received one mail.  Thanks to Oleksandr Tyshchenko for his update
on the following feature:

  IOREQ feature (+ virtio-mmio) on Arm
  https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg87002.html

  Julien Grall <julien@xen.org>
  Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

I see that V4 of this series was just posted.  Thanks, Oleksandr.
I'll make a separate enquiry about your series.

I think if people don't find the traditional feature tracking useful,
I will try to assemble Release Notes information later, during the
freeze, when fewer people are rushing to try to meet the deadlines.


Thanks,
Ian.


[1] The precise nominal cutoff time is 2021-01-15 23:59:00 in time
zone International Date Line West
  https://en.wikipedia.org/wiki/UTC%E2%88%9212:00
based on the timestamp recorded in the earliest Received: line
inserted by lists.xenproject.org, in the message containing the
last-received patch of the series.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
  2021-01-07 15:45       ` Oleksandr
@ 2021-01-14 16:11         ` Ian Jackson
  2021-01-14 18:41           ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Ian Jackson @ 2021-01-14 16:11 UTC (permalink / raw)
  To: Oleksandr; +Cc: Julien Grall, xen-devel

Hi, thanks for giving this update.

Since you were the only person who took the time to send such an
update I feel I can spend some time on trying to help with any
obstacles you may face.  Hence this enquiry:

Oleksandr writes ("Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking"):
> I work on virtio-mmio on Arm which involves x86's IOREQ/DM features.
> Currently I am working on making the said features common, implementing
> missing bits, code cleanup and hardening, etc.
> I don't think the virtio-mmio is 4.15 material, but it would be great
> to have at least "common" IOREQ/DM in 4.15.
..
> > P: your current estimate of the probability of it making 4.15, as a %age
> 
> Difficult to say, it depends ...
> RFC was posted Aug 3, 2020, The last posted version is V3. Currently I 
> am in the middle of preparing v4, still need to find a common ground for 
> few bits.

So, I'm replying to V4 here.  Did you resolve your issues ?
What are the major outstanding risks to this series and do you need
any help from the Xen Project (eg from me as Release Manager) ?

NB I have not been following this series in detail - I'm just looking
at your mail and your 00/ posting and so on.  So if there is some
blocker or risk I am probably unaware of it.

I notice that there's one libxl RFC patch in there.  Since that's in
my bailiwick I will try to review it soon.

Regards,
Ian.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-01-12 21:52 ` [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration Oleksandr Tyshchenko
@ 2021-01-14 17:20   ` Ian Jackson
  2021-01-16  9:05     ` Oleksandr
  2021-01-15 22:01   ` Julien Grall
  1 sibling, 1 reply; 144+ messages in thread
From: Ian Jackson @ 2021-01-14 17:20 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Wei Liu, Anthony PERARD,
	Julien Grall, Stefano Stabellini

Oleksandr Tyshchenko writes ("[PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration"):
> This patch adds basic support for configuring and assisting a virtio-disk
> backend (emulator) which is intended to run outside of Qemu and could be
> run in any domain.

Thanks.  I think this is a very important feature.  But I think this
part at least needs some work.  (That's not inappropriate for an RFC
patch - so please don't feel you have done anything wrong.  I hope you
will find my comments constructive.)


> An example of domain configuration (two disks are assigned to the guest,
> the latter is in readonly mode):
> 
> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]

I can see why you have done it like this but I am concerned that this
is not well-integrated with the existing disk configuration system.

As a result not only is your new feature lacking support for many
existing libxl features (block backend scripts, cdroms tagged as such,
non-raw formats) that could otherwise be made available, but I think
adding them later would be quite awkward.

I think it would be better to reuse (and, if necessary, adapt) the existing
disk parsing logic in libxl, so that the syntax for your new vdisks =
[...] parameter is the same as for the existing disks.  Or even
better, simply make your new kind of disk a new flag on the existing
disk structure.
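
For instance (purely illustrative, not something the series proposes), the
flag could be a single new field in the existing libxl_device_disk entry in
libxl_types.idl; the field name "virtio" below is made up:

    # hypothetical libxl_types.idl fragment inside libxl_device_disk
    ("virtio",       libxl_defbool),

That way all of the existing disk option handling keeps working unchanged.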

> Also there is one suggestion from Wei Chen regarding a parameter for
> domain config file which I haven't addressed yet.
> [Just copy here what Wei said in V2 thread]
> Can we keep using the same 'disk' parameter for virtio-disk, but add
> an option like "model=virtio-disk"?
> For example:
> disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ]
> Just like what Xen has done for x86 virtio-net.

This is the same suggestion I make above, basically.  It would be much
better, yes.


> Xenstore was chosen as a communication interface for the emulator
> running in non-toolstack domain to be able to get configuration
> either by reading Xenstore directly or by receiving command line
> parameters (an updated 'xl devd' running in the same domain would
> read Xenstore beforehand and call backend executable with the
> required arguments).

I was surprised to read this because I would expect that qemu upstream
would be resistant to this approach.  As far as the Xen Project's
point of view goes, I think using xenstore for this is fine, but we
would definitely want the support in upstream qemu.

Can you please explain the status of the corresponding qemu feature ?
(Ideally, in a formal way in the commit message.)

> Please note, there is a real concern about VirtIO interrupts allocation.
> [Just copy here what Stefano said in RFC thread]
> 
> So, if we end up allocating let's say 6 virtio interrupts for a
> domain, the chance of a clash with a physical interrupt of a
> passthrough device is real.
> 
> I am not entirely sure how to solve it, but these are a few ideas:
> - choosing virtio interrupts that are less likely to conflict (maybe > 1000)
> - make the virtio irq (optionally) configurable so that a user could
>   override the default irq and specify one that doesn't conflict
> - implementing support for virq != pirq (even the xl interface doesn't
>   allow to specify the virq number for passthrough devices, see "irqs")

I think here you have chosen to make the interrupt configurable ?

The implication is that someone using this with passthrough would
have to choose non-clashing IRQs ?  In the non-passthrough case (ie, a
guest with no passthrough devices), can your code choose an
appropriate IRQ, if the user doesn't specify one ?
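
For example (hypothetical syntax, reusing the existing disk spec as
suggested above; the "irq=" key is made up), an override might look like:

    disk = [ '/dev/mmcblk0p3,raw,xvda,rw,backend=DomD,model=virtio-disk,irq=1001' ]

with irq= optional and auto-allocated when absent.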


I don't see any changes to the xl documentation in this patch.  That
would be the place to explain the irq stuff, and would be needed
anyway.  Indeed with anything substantial like your proposal, it is
often a good idea to write (at least a sketch of) the documentation
*first*, and then you know what you're aiming to implement.


I have some comments on the code details but I think you will probably
want to focus on the overall approach, first:

> +#ifndef container_of
> +#define container_of(ptr, type, member) ({			\
> +        typeof( ((type *)0)->member ) *__mptr = (ptr);	\
> +        (type *)( (char *)__mptr - offsetof(type,member) );})
> +#endif

Please use the existing CONTAINER_OF which we have already.
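
For reference, a minimal sketch of the substitution, assuming libxl's
CONTAINER_OF takes (inner_ptr, outer_lvalue, inner_field) as elsewhere in
libxl; the structure and field names below are made up:

    /* hypothetical state structure with an embedded member */
    typedef struct {
        int devid;
        libxl__ev_time ev;
    } libxl__virtio_state;

    static libxl__virtio_state *state_from_ev(libxl__ev_time *ev)
    {
        libxl__virtio_state *vs = CONTAINER_OF(ev, *vs, ev);
        return vs;
    }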

>  static const char *gicv_to_string(libxl_gic_version gic_version)
>  {
>      switch (gic_version) {
> @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>          vuart_enabled = true;
>      }
>  
> -    /*
> -     * XXX: Handle properly virtio
> -     * A proper solution would be the toolstack to allocate the interrupts
> -     * used by each virtio backend and let the backend now which one is used
> -     */
>      if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {
> -        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
> +        uint64_t virtio_base;
> +        libxl_device_virtio_disk *virtio_disk;
> +
> +        virtio_base = GUEST_VIRTIO_MMIO_BASE;
>          virtio_irq = GUEST_VIRTIO_MMIO_SPI;

I would like to see a review of these changes to virtio handling by
someone who understands virtio.

> +static int libxl__device_virtio_disk_setdefault(libxl__gc *gc, uint32_t domid,
> +                                                libxl_device_virtio_disk *virtio_disk,
> +                                                bool hotplug)
> +{
> +    return libxl__resolve_domid(gc, virtio_disk->backend_domname,
> +                                &virtio_disk->backend_domid);

There are some line length problems here.

I haven't reviewed your parsing code because I think this ought to be
done as an option or addition to with the existing disk spec parsing.

> diff --git a/tools/xl/xl_virtio_disk.c b/tools/xl/xl_virtio_disk.c
> new file mode 100644
> index 0000000..808a7da
> --- /dev/null
> +++ b/tools/xl/xl_virtio_disk.c
> @@ -0,0 +1,46 @@
> +/*
> + * Copyright (C) 2020 EPAM Systems Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include <stdlib.h>
> +
> +#include <libxl.h>
> +#include <libxl_utils.h>
> +#include <libxlutil.h>
> +
> +#include "xl.h"
> +#include "xl_utils.h"
> +#include "xl_parse.h"
> +
> +int main_virtio_diskattach(int argc, char **argv)
> +{
> +    return 0;
> +}
> +
> +int main_virtio_disklist(int argc, char **argv)
> +{
> +   return 0;
> +}
> +
> +int main_virtio_diskdetach(int argc, char **argv)
> +{
> +    return 0;
> +}

This seems to be a stray early test file left over in the patch ?


Thanks,
Ian.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
  2021-01-14 16:11         ` [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Ian Jackson
@ 2021-01-14 18:41           ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-14 18:41 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Julien Grall, xen-devel


On 14.01.21 18:11, Ian Jackson wrote:

Hi Ian

> Hi, thanks for giving this update.
>
> Since you were the only person who took the time to send such an
> update I feel I can spend some time on trying to help with any
> obstacles you may face.

Thank you.


>    
> Hence this enquiry:
>
> Oleksandr writes ("Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking"):
>> I work on virtio-mmio on Arm which involves x86's IOREQ/DM features.
>> Currently I am working on making the said features common, implementing
>> missing bits, code cleanup and hardening, etc.
>> I don't think the virtio-mmio is 4.15 material, but it would be great
>> to have at least "common" IOREQ/DM in 4.15.
> ..
>>> P: your current estimate of the probability of it making 4.15, as a %age
>> Difficult to say, it depends ...
>> RFC was posted Aug 3, 2020, The last posted version is V3. Currently I
>> am in the middle of preparing v4, still need to find a common ground for
>> few bits.
> So, I'm replying to V4 here.  Did you resolve your issues ?

I think so, yes. I hope I addressed all review comments/requests for V3.


> What are the major outstanding risks to this series and do you need
> any help from the Xen Project (eg from me as Release Manager) ?

I would like to get review of V4 soon, so that I am able to make the
required changes in time.
The last 2 patches could be skipped for now; I don't expect the VirtIO bits
to be in 4.15 (I keep these patches for visibility reasons and test purposes):
- libxl: Introduce basic virtio-mmio support on Arm
- [RFC] libxl: Add support for virtio-disk configuration
But the first 22 patches (IOREQ/DM) I would like to see in 4.15.

Also, what worries me the most is that quite a big series hasn't been
fully tested on x86 (only build-tested).


>
> NB I have not been following this series in detail - I'm just looking
> at your mail and your 00/ posting and so on.  So if there is some
> blocker or risk I am probably unaware of it.
>
> I notice that there's one libxl RFC patch in there.  Since that's in
> my bailiwick I will try to review it soon.
Thank you. Please note, this RFC patch is not a target for 4.15, so no rush :)
For the next version I will drop these VirtIO related patches.


-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-14 16:06       ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
@ 2021-01-14 19:02         ` Andrew Cooper
  2021-01-15  9:57           ` Jan Beulich
                             ` (3 more replies)
  0 siblings, 4 replies; 144+ messages in thread
From: Andrew Cooper @ 2021-01-14 19:02 UTC (permalink / raw)
  To: Ian Jackson, xen-devel, committers, Tamas K Lengyel,
	Michał Leszczyński

On 14/01/2021 16:06, Ian Jackson wrote:
> The last posting date for new feature patches for Xen 4.15 is
> tomorrow. [1]  We seem to be getting a reasonably good flood of stuff
> trying to meet this deadline :-).
>
> Patches for new features posted after tomorrow will be deferred to the
> next Xen release after 4.15.  NB the primary responsibility for
> driving a feature's progress to meet the release schedule lies with
> the feature's proponent(s).
>
>
>   As a reminder, here is the release schedule:
> + (unchanged information indented with spaces):
>
>    Friday 15th January    Last posting date
>
>        Patches adding new features should be posted to the mailing list
>        by this date, although perhaps not in their final version.
>
>    Friday 29th January    Feature freeze
>
>        Patches adding new features should be committed by this date.
>        Straightforward bugfixes may continue to be accepted by
>        maintainers.
>
>    Friday 12th February **tentative**   Code freeze
>
>        Bugfixes only, all changes to be approved by the Release Manager.
>
>    Week of 12th March **tentative**    Release
>        (probably Tuesday or Wednesday)
>
>   Any patches containing substantial refactoring are to be treated as
>   new features, even if their intent is to fix bugs.
>
>   Freeze exceptions will not be routine, but may be granted in
>   exceptional cases for small changes on the basis of risk assessment.
>   Large series will not get exceptions.  Contributors *must not* rely on
>   getting, or expect, a freeze exception.
>
> + New or improved tests (provided they do not involve refactoring or
> + even build system reorganisation), and documentation improvements,
> + will generally be treated as bugfixes.
>
>   The code freeze and release dates are provisional and will be adjusted
>   in the light of apparent code quality etc.
>
>   If as a feature proponent you feel your feature is at risk and there
>   is something the Xen Project could do to help, please consult me or
>   the Community Manager.  In such situations please reach out earlier
>   rather than later.
>
>
> In my last update I asked this:
>
>> If you are working on a feature you want in 4.15 please let me know
>> about it.  Ideally I'd like a little stanza like this:
>>
>> S: feature name
>> O: feature owner (proponent) name
>> E: feature owner (proponent) email address
>> P: your current estimate of the probability of it making 4.15, as a %age
>>
>> But free-form text is OK too.  Please reply to this mail.
> I received one mail.  Thanks to Oleksandr Tyshchenko for his update
> on the following feature:
>
>   IOREQ feature (+ virtio-mmio) on Arm
>   https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg87002.html
>
>   Julien Grall <julien@xen.org>
>   Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> I see that V4 of this series was just posted.  Thanks, Oleksandr.
> I'll make a separate enquiry about your series.
>
> I think if people don't find the traditional feature tracking useful,
> I will try to assemble Release Notes information later, during the
> freeze, when fewer people are rushing to try to meet the deadlines.

(Now I have working email).

Features:

1) acquire_resource fixes.

Not really a new feature - entirely bugfixing a preexisting one.
Developed by me to help 2).  Reasonably well acked, but awaiting
feedback on v3.

2) External Processor Trace support.

Developed by Michał.  Depends on 1), and awaiting a new version being
posted.

As far as I'm aware, both Intel and CERT have production systems
deployed using this functionality, so it is very highly desirable to get
into 4.15.

3) Initial Trenchboot+SKINIT support.

I've got two patches I need to clean up and submit, which are the first
part of the Trenchboot + Dynamic Root of Trust on AMD support.  This
will get Xen into a position where it can be started via the new grub
"secure_launch" protocol.

Later patches (i.e. post 4.15) will do support for Intel TXT (i.e.
without tboot), as well as the common infrastructure for the TPM event
log and further measurements during the boot process.

4) "simple" autotest support.


Bugs:

1) HPET/PIT issue on newer Intel systems.  This has had literally tens
of reports across the devel and users mailing lists, and prevents Xen
from booting at all on the past two generations of Intel laptop.  I've
finally got a repro and posted a fix to the list, but still in progress.

2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
working, and very little investigation on whats going on.  Suspicion is
that there might be two bugs, one with smt=0 on recent AMD hardware, and
one more general "some workloads cause negative credit" and might or
might not be specific to credit2 (debugging feedback differs - also
might be 3 underlying issue).

All of these have had repeated bug reports.  I'd classify them as
blockers, given the impact they're having on people.

~Andrew


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-12 21:52 ` [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features Oleksandr Tyshchenko
@ 2021-01-15  0:55   ` Stefano Stabellini
  2021-01-17 12:45     ` Oleksandr
  2021-01-15 20:26   ` Julien Grall
  1 sibling, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  0:55 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Oleksandr Tyshchenko


On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> This patch adds basic IOREQ/DM support on Arm. The subsequent
> patches will improve functionality and add remaining bits.
> 
> The IOREQ/DM features are supposed to be built with IOREQ_SERVER
> option enabled, which is disabled by default on Arm for now.
> 
> Please note, the "PIO handling" TODO is expected to be left unaddressed
> for the current series. It is not a big issue for now while Xen
> doesn't have support for vPCI on Arm. On Arm64 PIO accesses are only used
> for PCI I/O BARs and we would probably want to expose them to the emulator
> as PIO accesses to make a DM completely arch-agnostic. So "PIO handling"
> should be implemented when we add support for vPCI.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - was split into:
>      - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>      - xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
>    - update patch description
>    - update asm-arm/hvm/ioreq.h according to the newly introduced arch functions:
>      - arch_hvm_destroy_ioreq_server()
>      - arch_handle_hvm_io_completion()
>    - update arch files to include xen/ioreq.h
>    - remove HVMOP plumbing
>    - rewrite a logic to handle properly case when hvm_send_ioreq() returns IO_RETRY
>    - add a logic to handle properly handle_hvm_io_completion() return value
>    - rename handle_mmio() to ioreq_handle_complete_mmio()
>    - move paging_mark_pfn_dirty() to asm-arm/paging.h
>    - remove forward declaration for hvm_ioreq_server in asm-arm/paging.h
>    - move try_fwd_ioserv() to ioreq.c, provide stubs if !CONFIG_IOREQ_SERVER
>    - do not remove #ifdef CONFIG_IOREQ_SERVER in memory.c for guarding xen/ioreq.h
>    - use gdprintk in try_fwd_ioserv(), remove unneeded prints
>    - update list of #include-s
>    - move has_vpci() to asm-arm/domain.h
>    - add a comment (TODO) to unimplemented yet handle_pio()
>    - remove hvm_mmio_first(last)_byte() and hvm_ioreq_(page/vcpu/server) structs
>      from the arch files, they were already moved to the common code
>    - remove set_foreign_p2m_entry() changes, they will be properly implemented
>      in the follow-up patch
>    - select IOREQ_SERVER for Arm instead of Arm64 in Kconfig
>    - remove x86's realmode and other unneeded stubs from xen/ioreq.h
>    - clarify ioreq_t p.df usage in try_fwd_ioserv()
>    - set ioreq_t p.count to 1 in try_fwd_ioserv()
> 
> Changes V1 -> V2:
>    - was split into:
>      - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>      - xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
>    - update the author of a patch
>    - update patch description
>    - move a loop in leave_hypervisor_to_guest() to a separate patch
>    - set IOREQ_SERVER disabled by default
>    - remove already clarified /* XXX */
>    - replace BUG() by ASSERT_UNREACHABLE() in handle_pio()
>    - remove default case for handling the return value of try_handle_mmio()
>    - remove struct hvm_domain, enum hvm_io_completion, struct hvm_vcpu_io,
>      struct hvm_vcpu from asm-arm/domain.h, these are common materials now
>    - update everything according to the recent changes (IOREQ related function
>      names don't contain "hvm" prefixes/infixes anymore, IOREQ related fields
>      are part of common struct vcpu/domain now, etc)
> 
> Changes V2 -> V3:
>    - update patch according the "legacy interface" is x86 specific
>    - add dummy arch hooks
>    - remove dummy paging_mark_pfn_dirty()
>    - don’t include <xen/domain_page.h> in common ioreq.c
>    - don’t include <public/hvm/ioreq.h> in arch ioreq.h
>    - remove #define ioreq_params(d, i)
> 
> Changes V3 -> V4:
>    - rebase
>    - update patch according to the renaming IO_ -> VIO_ (io_ -> vio_)
>      and misc changes to arch hooks
>    - update patch according to the IOREQ related dm-op handling changes
>    - don't include <xen/ioreq.h> from arch header
>    - make all arch hooks out-of-line
>    - add a comment above IOREQ_STATUS_* #define-s
> ---
>  xen/arch/arm/Makefile           |   2 +
>  xen/arch/arm/dm.c               | 122 +++++++++++++++++++++++
>  xen/arch/arm/domain.c           |   9 ++
>  xen/arch/arm/io.c               |  12 ++-
>  xen/arch/arm/ioreq.c            | 213 ++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/traps.c            |  13 +++
>  xen/include/asm-arm/domain.h    |   3 +
>  xen/include/asm-arm/hvm/ioreq.h |  72 ++++++++++++++
>  xen/include/asm-arm/mmio.h      |   1 +
>  9 files changed, 446 insertions(+), 1 deletion(-)
>  create mode 100644 xen/arch/arm/dm.c
>  create mode 100644 xen/arch/arm/ioreq.c
>  create mode 100644 xen/include/asm-arm/hvm/ioreq.h
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 512ffdd..16e6523 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -13,6 +13,7 @@ obj-y += cpuerrata.o
>  obj-y += cpufeature.o
>  obj-y += decode.o
>  obj-y += device.o
> +obj-$(CONFIG_IOREQ_SERVER) += dm.o
>  obj-y += domain.o
>  obj-y += domain_build.init.o
>  obj-y += domctl.o
> @@ -27,6 +28,7 @@ obj-y += guest_atomics.o
>  obj-y += guest_walk.o
>  obj-y += hvm.o
>  obj-y += io.o
> +obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
>  obj-y += irq.o
>  obj-y += kernel.init.o
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o
> diff --git a/xen/arch/arm/dm.c b/xen/arch/arm/dm.c
> new file mode 100644
> index 0000000..e6dedf4
> --- /dev/null
> +++ b/xen/arch/arm/dm.c
> @@ -0,0 +1,122 @@
> +/*
> + * Copyright (c) 2019 Arm ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/dm.h>
> +#include <xen/guest_access.h>
> +#include <xen/hypercall.h>
> +#include <xen/ioreq.h>
> +#include <xen/nospec.h>
> +
> +static int dm_op(const struct dmop_args *op_args)
> +{
> +    struct domain *d;
> +    struct xen_dm_op op;
> +    bool const_op = true;
> +    long rc;
> +    size_t offset;
> +
> +    static const uint8_t op_size[] = {
> +        [XEN_DMOP_create_ioreq_server]              = sizeof(struct xen_dm_op_create_ioreq_server),
> +        [XEN_DMOP_get_ioreq_server_info]            = sizeof(struct xen_dm_op_get_ioreq_server_info),
> +        [XEN_DMOP_map_io_range_to_ioreq_server]     = sizeof(struct xen_dm_op_ioreq_server_range),
> +        [XEN_DMOP_unmap_io_range_from_ioreq_server] = sizeof(struct xen_dm_op_ioreq_server_range),
> +        [XEN_DMOP_set_ioreq_server_state]           = sizeof(struct xen_dm_op_set_ioreq_server_state),
> +        [XEN_DMOP_destroy_ioreq_server]             = sizeof(struct xen_dm_op_destroy_ioreq_server),
> +    };
> +
> +    rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
> +    if ( rc )
> +        return rc;
> +
> +    rc = xsm_dm_op(XSM_DM_PRIV, d);
> +    if ( rc )
> +        goto out;
> +
> +    offset = offsetof(struct xen_dm_op, u);
> +
> +    rc = -EFAULT;
> +    if ( op_args->buf[0].size < offset )
> +        goto out;
> +
> +    if ( copy_from_guest_offset((void *)&op, op_args->buf[0].h, 0, offset) )
> +        goto out;
> +
> +    if ( op.op >= ARRAY_SIZE(op_size) )
> +    {
> +        rc = -EOPNOTSUPP;
> +        goto out;
> +    }
> +
> +    op.op = array_index_nospec(op.op, ARRAY_SIZE(op_size));
> +
> +    if ( op_args->buf[0].size < offset + op_size[op.op] )
> +        goto out;
> +
> +    if ( copy_from_guest_offset((void *)&op.u, op_args->buf[0].h, offset,
> +                                op_size[op.op]) )
> +        goto out;
> +
> +    rc = -EINVAL;
> +    if ( op.pad )
> +        goto out;
> +
> +    rc = ioreq_server_dm_op(&op, d, &const_op);
> +
> +    if ( (!rc || rc == -ERESTART) &&
> +         !const_op && copy_to_guest_offset(op_args->buf[0].h, offset,
> +                                           (void *)&op.u, op_size[op.op]) )
> +        rc = -EFAULT;
> +
> + out:
> +    rcu_unlock_domain(d);
> +
> +    return rc;
> +}
> +
> +long do_dm_op(domid_t domid,
> +              unsigned int nr_bufs,
> +              XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
> +{
> +    struct dmop_args args;
> +    int rc;
> +
> +    if ( nr_bufs > ARRAY_SIZE(args.buf) )
> +        return -E2BIG;
> +
> +    args.domid = domid;
> +    args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);
> +
> +    if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
> +        return -EFAULT;
> +
> +    rc = dm_op(&args);
> +
> +    if ( rc == -ERESTART )
> +        rc = hypercall_create_continuation(__HYPERVISOR_dm_op, "iih",
> +                                           domid, nr_bufs, bufs);
> +
> +    return rc;
> +}

I might have missed something in the discussions but this function is
identical to xen/arch/x86/hvm/dm.c:do_dm_op, why not make it common?

Also the previous function dm_op is very similar to
xen/arch/x86/hvm/dm.c:dm_op; I would prefer to make them common if
possible. Was this already discussed?
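
Something along these lines is what I have in mind -- just a rough,
untested sketch, assuming we would introduce an arch_dm_op() hook for
the arch-specific sub-ops (names made up):

    /* In a hypothetical xen/common/dm.c, after the generic checks: */
    rc = arch_dm_op(&op, d, op_args, &const_op);

    /* xen/arch/arm/dm.c would then shrink to little more than: */
    int arch_dm_op(struct xen_dm_op *op, struct domain *d,
                   const struct dmop_args *op_args, bool *const_op)
    {
        return ioreq_server_dm_op(op, d, const_op);
    }

with x86 keeping its extra sub-ops in its own arch_dm_op().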


> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 18cafcd..8f55aba 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -15,6 +15,7 @@
>  #include <xen/guest_access.h>
>  #include <xen/hypercall.h>
>  #include <xen/init.h>
> +#include <xen/ioreq.h>
>  #include <xen/lib.h>
>  #include <xen/livepatch.h>
>  #include <xen/sched.h>
> @@ -696,6 +697,10 @@ int arch_domain_create(struct domain *d,
>  
>      ASSERT(config != NULL);
>  
> +#ifdef CONFIG_IOREQ_SERVER
> +    ioreq_domain_init(d);
> +#endif
> +
>      /* p2m_init relies on some value initialized by the IOMMU subsystem */
>      if ( (rc = iommu_domain_init(d, config->iommu_opts)) != 0 )
>          goto fail;
> @@ -1014,6 +1019,10 @@ int domain_relinquish_resources(struct domain *d)
>          if (ret )
>              return ret;
>  
> +#ifdef CONFIG_IOREQ_SERVER
> +        ioreq_server_destroy_all(d);
> +#endif
> +
>      PROGRESS(xen):
>          ret = relinquish_memory(d, &d->xenpage_list);
>          if ( ret )
> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
> index ae7ef96..9814481 100644
> --- a/xen/arch/arm/io.c
> +++ b/xen/arch/arm/io.c
> @@ -16,6 +16,7 @@
>   * GNU General Public License for more details.
>   */
>  
> +#include <xen/ioreq.h>
>  #include <xen/lib.h>
>  #include <xen/spinlock.h>
>  #include <xen/sched.h>
> @@ -23,6 +24,7 @@
>  #include <asm/cpuerrata.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
> +#include <asm/hvm/ioreq.h>
>  
>  #include "decode.h"
>  
> @@ -123,7 +125,15 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs,
>  
>      handler = find_mmio_handler(v->domain, info.gpa);
>      if ( !handler )
> -        return IO_UNHANDLED;
> +    {
> +        int rc;
> +
> +        rc = try_fwd_ioserv(regs, v, &info);
> +        if ( rc == IO_HANDLED )
> +            return handle_ioserv(regs, v);
> +
> +        return rc;
> +    }
>  
>      /* All the instructions used on emulated MMIO region should be valid */
>      if ( !dabt.valid )
> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> new file mode 100644
> index 0000000..3c4a24d
> --- /dev/null
> +++ b/xen/arch/arm/ioreq.c
> @@ -0,0 +1,213 @@
> +/*
> + * arm/ioreq.c: hardware virtual machine I/O emulation
> + *
> + * Copyright (c) 2019 Arm ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/domain.h>
> +#include <xen/ioreq.h>
> +
> +#include <asm/traps.h>
> +
> +#include <public/hvm/ioreq.h>
> +
> +enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
> +{
> +    const union hsr hsr = { .bits = regs->hsr };
> +    const struct hsr_dabt dabt = hsr.dabt;
> +    /* Code is similar to handle_read */
> +    uint8_t size = (1 << dabt.size) * 8;
> +    register_t r = v->io.req.data;
> +
> +    /* We are done with the IO */
> +    v->io.req.state = STATE_IOREQ_NONE;
> +
> +    if ( dabt.write )
> +        return IO_HANDLED;
> +
> +    /*
> +     * Sign extend if required.
> +     * Note that we expect the read handler to have zeroed the bits
> +     * outside the requested access size.
> +     */
> +    if ( dabt.sign && (r & (1UL << (size - 1))) )
> +    {
> +        /*
> +         * We are relying on register_t using the same as
> +         * an unsigned long in order to keep the 32-bit assembly
> +         * code smaller.
> +         */
> +        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
> +        r |= (~0UL) << size;
> +    }
> +
> +    set_user_reg(regs, dabt.reg, r);
> +
> +    return IO_HANDLED;
> +}
> +
> +enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
> +                             struct vcpu *v, mmio_info_t *info)
> +{
> +    struct vcpu_io *vio = &v->io;
> +    ioreq_t p = {
> +        .type = IOREQ_TYPE_COPY,
> +        .addr = info->gpa,
> +        .size = 1 << info->dabt.size,
> +        .count = 1,
> +        .dir = !info->dabt.write,
> +        /*
> +         * On x86, df is used by 'rep' instruction to tell the direction
> +         * to iterate (forward or backward).
> +         * On Arm, all the accesses to MMIO region will do a single
> +         * memory access. So for now, we can safely always set to 0.
> +         */
> +        .df = 0,
> +        .data = get_user_reg(regs, info->dabt.reg),
> +        .state = STATE_IOREQ_READY,
> +    };
> +    struct ioreq_server *s = NULL;
> +    enum io_state rc;
> +
> +    switch ( vio->req.state )
> +    {
> +    case STATE_IOREQ_NONE:
> +        break;
> +
> +    case STATE_IORESP_READY:
> +        return IO_HANDLED;
> +
> +    default:
> +        gdprintk(XENLOG_ERR, "wrong state %u\n", vio->req.state);
> +        return IO_ABORT;
> +    }
> +
> +    s = ioreq_server_select(v->domain, &p);
> +    if ( !s )
> +        return IO_UNHANDLED;
> +
> +    if ( !info->dabt.valid )
> +        return IO_ABORT;
> +
> +    vio->req = p;
> +
> +    rc = ioreq_send(s, &p, 0);
> +    if ( rc != IO_RETRY || v->domain->is_shutting_down )
> +        vio->req.state = STATE_IOREQ_NONE;
> +    else if ( !ioreq_needs_completion(&vio->req) )
> +        rc = IO_HANDLED;
> +    else
> +        vio->completion = VIO_mmio_completion;
> +
> +    return rc;
> +}
> +
> +bool arch_ioreq_complete_mmio(void)
> +{
> +    struct vcpu *v = current;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    const union hsr hsr = { .bits = regs->hsr };
> +    paddr_t addr = v->io.req.addr;
> +
> +    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
> +    {
> +        advance_pc(regs, hsr);
> +        return true;
> +    }
> +
> +    return false;
> +}
> +
> +bool arch_vcpu_ioreq_completion(enum vio_completion completion)
> +{
> +    ASSERT_UNREACHABLE();
> +    return true;
> +}
> +
> +/*
> + * The "legacy" mechanism of mapping magic pages for the IOREQ servers
> + * is x86 specific, so the following hooks don't need to be implemented on Arm:
> + * - arch_ioreq_server_map_pages
> + * - arch_ioreq_server_unmap_pages
> + * - arch_ioreq_server_enable
> + * - arch_ioreq_server_disable
> + */
> +int arch_ioreq_server_map_pages(struct ioreq_server *s)
> +{
> +    return -EOPNOTSUPP;
> +}
> +
> +void arch_ioreq_server_unmap_pages(struct ioreq_server *s)
> +{
> +}
> +
> +void arch_ioreq_server_enable(struct ioreq_server *s)
> +{
> +}
> +
> +void arch_ioreq_server_disable(struct ioreq_server *s)
> +{
> +}
> +
> +void arch_ioreq_server_destroy(struct ioreq_server *s)
> +{
> +}
> +
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct ioreq_server *s,
> +                                   uint32_t flags)
> +{
> +    return -EOPNOTSUPP;
> +}
> +
> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
> +                                              struct ioreq_server *s,
> +                                              uint32_t flags)
> +{
> +}
> +
> +bool arch_ioreq_server_destroy_all(struct domain *d)
> +{
> +    return true;
> +}
> +
> +bool arch_ioreq_server_get_type_addr(const struct domain *d,
> +                                     const ioreq_t *p,
> +                                     uint8_t *type,
> +                                     uint64_t *addr)
> +{
> +    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
> +        return false;
> +
> +    *type = (p->type == IOREQ_TYPE_PIO) ?
> +             XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
> +    *addr = p->addr;
> +
> +    return true;
> +}
> +
> +void arch_ioreq_domain_init(struct domain *d)
> +{
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 22bd1bd..036b13f 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -21,6 +21,7 @@
>  #include <xen/hypercall.h>
>  #include <xen/init.h>
>  #include <xen/iocap.h>
> +#include <xen/ioreq.h>
>  #include <xen/irq.h>
>  #include <xen/lib.h>
>  #include <xen/mem_access.h>
> @@ -1385,6 +1386,9 @@ static arm_hypercall_t arm_hypercall_table[] = {
>  #ifdef CONFIG_HYPFS
>      HYPERCALL(hypfs_op, 5),
>  #endif
> +#ifdef CONFIG_IOREQ_SERVER
> +    HYPERCALL(dm_op, 3),
> +#endif
>  };
>  
>  #ifndef NDEBUG
> @@ -1956,6 +1960,9 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>              case IO_HANDLED:
>                  advance_pc(regs, hsr);
>                  return;
> +            case IO_RETRY:
> +                /* finish later */
> +                return;
>              case IO_UNHANDLED:
>                  /* IO unhandled, try another way to handle it. */
>                  break;
> @@ -2254,6 +2261,12 @@ static void check_for_vcpu_work(void)
>  {
>      struct vcpu *v = current;
>  
> +#ifdef CONFIG_IOREQ_SERVER
> +    local_irq_enable();
> +    vcpu_ioreq_handle_completion(v);
> +    local_irq_disable();
> +#endif
> +
>      if ( likely(!v->arch.need_flush_to_ram) )
>          return;
>  
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 6819a3b..c235e5b 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -10,6 +10,7 @@
>  #include <asm/gic.h>
>  #include <asm/vgic.h>
>  #include <asm/vpl011.h>
> +#include <public/hvm/dm_op.h>
>  #include <public/hvm/params.h>
>  
>  struct hvm_domain
> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>  
>  #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>  
> +#define has_vpci(d)    ({ (void)(d); false; })
> +
>  #endif /* __ASM_DOMAIN_H__ */
>  
>  /*
> diff --git a/xen/include/asm-arm/hvm/ioreq.h b/xen/include/asm-arm/hvm/ioreq.h
> new file mode 100644
> index 0000000..19e1247
> --- /dev/null
> +++ b/xen/include/asm-arm/hvm/ioreq.h
> @@ -0,0 +1,72 @@
> +/*
> + * hvm.h: Hardware virtual machine assist interface definitions.
> + *
> + * Copyright (c) 2016 Citrix Systems Inc.
> + * Copyright (c) 2019 Arm ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __ASM_ARM_HVM_IOREQ_H__
> +#define __ASM_ARM_HVM_IOREQ_H__
> +
> +#ifdef CONFIG_IOREQ_SERVER
> +enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v);
> +enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
> +                             struct vcpu *v, mmio_info_t *info);
> +#else
> +static inline enum io_state handle_ioserv(struct cpu_user_regs *regs,
> +                                          struct vcpu *v)
> +{
> +    return IO_UNHANDLED;
> +}
> +
> +static inline enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
> +                                           struct vcpu *v, mmio_info_t *info)
> +{
> +    return IO_UNHANDLED;
> +}
> +#endif
> +
> +bool ioreq_complete_mmio(void);
> +
> +static inline bool handle_pio(uint16_t port, unsigned int size, int dir)
> +{
> +    /*
> +     * TODO: For Arm64, the main user will be PCI. So this should be
> +     * implemented when we add support for vPCI.
> +     */
> +    ASSERT_UNREACHABLE();
> +    return true;
> +}
> +
> +static inline void msix_write_completion(struct vcpu *v)
> +{
> +}
> +
> +/* This correlation must not be altered */
> +#define IOREQ_STATUS_HANDLED     IO_HANDLED
> +#define IOREQ_STATUS_UNHANDLED   IO_UNHANDLED
> +#define IOREQ_STATUS_RETRY       IO_RETRY
> +
> +#endif /* __ASM_ARM_HVM_IOREQ_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-arm/mmio.h b/xen/include/asm-arm/mmio.h
> index 8dbfb27..7ab873c 100644
> --- a/xen/include/asm-arm/mmio.h
> +++ b/xen/include/asm-arm/mmio.h
> @@ -37,6 +37,7 @@ enum io_state
>      IO_ABORT,       /* The IO was handled by the helper and led to an abort. */
>      IO_HANDLED,     /* The IO was successfully handled by the helper. */
>      IO_UNHANDLED,   /* The IO was not handled by the helper. */
> +    IO_RETRY,       /* Retry the emulation for some reason */
>  };
>  
>  typedef int (*mmio_read_t)(struct vcpu *v, mmio_info_t *info,
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
  2021-01-12 21:52 ` [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed Oleksandr Tyshchenko
@ 2021-01-15  1:12   ` Stefano Stabellini
  2021-01-15 20:55   ` Julien Grall
  1 sibling, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  1:12 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Julien Grall, Volodymyr Babchuk, Julien Grall

On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch adds proper handling of the return value of
> vcpu_ioreq_handle_completion(), which involves using a loop in
> leave_hypervisor_to_guest().
> 
> The reason to use an unbounded loop here is the fact that a vCPU
> shouldn't continue until the I/O has completed.
> 
> The IOREQ code is using wait_on_xen_event_channel(). Yet, this can
> still "exit" early if an event has been received. But this doesn't mean
> the I/O has completed (it can be just a spurious wake-up). So we need
> to check if the I/O has completed and wait again if it hasn't (we will
> block the vCPU again until an event is received). This loop makes sure
> that all the vCPU work is done before we return to the guest.
> 
> The call chain below:
> check_for_vcpu_work -> vcpu_ioreq_handle_completion -> wait_for_io ->
> wait_on_xen_event_channel
> 
> The worst that can happen here is that the vCPU never runs again
> (the I/O never completes). But, in Xen's case, if the I/O never
> completes then it most likely means that something went horribly
> wrong with the Device Emulator, and it is most likely not safe
> to continue. So letting the vCPU spin forever if the I/O never
> completes is a safer action than letting it continue and leaving
> the guest in an unclear state, and is the best we can do for now.
> 
> Please note, using this loop we will not spin forever on a pCPU,
> preventing any other vCPUs from being scheduled. At every loop
> iteration we will call check_for_pcpu_work() to process pending
> softirqs. In case of failure, the guest will crash and the vCPU
> will be unscheduled. In the normal case, if rescheduling is necessary
> (it might be requested by a timer or by a caller in check_for_vcpu_work(),
> where wait_for_io() is a preemption point), the vCPU will be rescheduled
> to give way to someone else.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes V1 -> V2:
>    - new patch, changes were derived from (+ new explanation):
>      arm/ioreq: Introduce arch specific bits for IOREQ/DM features
> 
> Changes V2 -> V3:
>    - update patch description
> 
> Changes V3 -> V4:
>    - update patch description and comment in code
> ---
>  xen/arch/arm/traps.c | 38 +++++++++++++++++++++++++++++++++-----
>  1 file changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 036b13f..4a83e1e 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2257,18 +2257,23 @@ static void check_for_pcpu_work(void)
>   * Process pending work for the vCPU. Any call should be fast or
>   * implement preemption.
>   */
> -static void check_for_vcpu_work(void)
> +static bool check_for_vcpu_work(void)
>  {
>      struct vcpu *v = current;
>  
>  #ifdef CONFIG_IOREQ_SERVER
> +    bool handled;
> +
>      local_irq_enable();
> -    vcpu_ioreq_handle_completion(v);
> +    handled = vcpu_ioreq_handle_completion(v);
>      local_irq_disable();
> +
> +    if ( !handled )
> +        return true;
>  #endif
>  
>      if ( likely(!v->arch.need_flush_to_ram) )
> -        return;
> +        return false;
>  
>      /*
>       * Give a chance for the pCPU to process work before handling the vCPU
> @@ -2279,6 +2284,8 @@ static void check_for_vcpu_work(void)
>      local_irq_enable();
>      p2m_flush_vm(v);
>      local_irq_disable();
> +
> +    return false;
>  }
>  
>  /*
> @@ -2291,8 +2298,29 @@ void leave_hypervisor_to_guest(void)
>  {
>      local_irq_disable();
>  
> -    check_for_vcpu_work();
> -    check_for_pcpu_work();
> +    /*
> +     * The reason to use an unbounded loop here is the fact that a vCPU
> +     * shouldn't continue until the I/O has completed.
> +     *
> +     * The worst that can happen here is that the vCPU never runs again
> +     * (the I/O never completes). But, in Xen's case, if the I/O never
> +     * completes then it most likely means that something went horribly
> +     * wrong with the Device Emulator, and it is most likely not safe
> +     * to continue. So letting the vCPU spin forever if the I/O never
> +     * completes is a safer action than letting it continue and leaving
> +     * the guest in an unclear state, and is the best we can do for now.
> +     *
> +     * Please note, using this loop we will not spin forever on a pCPU,
> +     * preventing any other vCPUs from being scheduled. At every loop
> +     * iteration we will call check_for_pcpu_work() to process pending
> +     * softirqs. In case of failure, the guest will crash and the vCPU
> +     * will be unscheduled. In the normal case, if rescheduling is necessary
> +     * (it might be requested by a timer or by a caller in check_for_vcpu_work()),
> +     * the vCPU will be rescheduled to give way to someone else.
> +     */
> +    do {
> +        check_for_pcpu_work();
> +    } while ( check_for_vcpu_work() );
>  
>      vgic_sync_to_lrs();
>  
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
  2021-01-12 21:52 ` [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm Oleksandr Tyshchenko
@ 2021-01-15  1:19   ` Stefano Stabellini
  2021-01-15 20:59   ` Julien Grall
  2021-01-21 13:57   ` Jan Beulich
  2 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  1:19 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Julien Grall, Volodymyr Babchuk, Andrew Cooper, George Dunlap,
	Ian Jackson, Jan Beulich, Wei Liu, Roger Pau Monné,
	Julien Grall


On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch implements reference counting of foreign entries in
> set_foreign_p2m_entry() on Arm. This is a mandatory action if
> we want to run an emulator (IOREQ server) in a domain other than dom0,
> as we can't trust it to do the right thing if it is not running
> in dom0. So we need to grab a reference on the page to avoid it
> disappearing.
> 
> It is valid to always pass the "p2m_map_foreign_rw" type to
> guest_physmap_add_entry() since the current and foreign domains
> would always be different. A case where they are equal would be
> rejected by rcu_lock_remote_domain_by_id(). Besides the similar
> comment in the code, a respective ASSERT() is put in place to catch
> incorrect usage in future.
> 
> It was tested with the IOREQ feature to confirm that all the pages given
> to this function belong to a domain, so we can use the same approach
> as for XENMAPSPACE_gmfn_foreign handling in xenmem_add_to_physmap_one().
> 
> This involves adding an extra parameter for the foreign domain to
> set_foreign_p2m_entry() and a helper to indicate whether the arch
> supports reference counting of foreign entries, so that the restriction
> to the hardware domain in the common code can be skipped for such archs.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>
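
(For the record: IIRC the reference taken here with get_page(page, fd)
is dropped again by the Arm p2m code -- p2m_put_l3_page() -- when the
foreign entry is removed, so the page indeed can't disappear while it
is mapped.)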


> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - new patch, was split from:
>      "[RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features"
>    - rewrite a logic to handle properly reference in set_foreign_p2m_entry()
>      instead of treating foreign entries as p2m_ram_rw
> 
> Changes V1 -> V2:
>    - rebase according to the recent changes to acquire_resource()
>    - update patch description
>    - introduce arch_refcounts_p2m()
>    - add an explanation why p2m_map_foreign_rw is valid
>    - move set_foreign_p2m_entry() to p2m-common.h
>    - add const to new parameter
> 
> Changes V2 -> V3:
>    - update patch description
>    - rename arch_refcounts_p2m() to arch_acquire_resource_check()
>    - move comment to x86’s arch_acquire_resource_check()
>    - return rc in Arm's set_foreign_p2m_entry()
>    - put a respective ASSERT() into Arm's set_foreign_p2m_entry()
> 
> Changes V3 -> V4:
>    - update arch_acquire_resource_check() implementation on x86
>      and common code which uses it, pass struct domain to the function
>    - put ASSERT() to x86/Arm set_foreign_p2m_entry()
>    - use arch_acquire_resource_check() in p2m_add_foreign()
>      instead of open-coding it
> ---
>  xen/arch/arm/p2m.c           | 26 ++++++++++++++++++++++++++
>  xen/arch/x86/mm/p2m.c        |  9 ++++++---
>  xen/common/memory.c          |  9 ++-------
>  xen/include/asm-arm/p2m.h    | 19 +++++++++----------
>  xen/include/asm-x86/p2m.h    | 19 ++++++++++++++++---
>  xen/include/xen/p2m-common.h |  4 ++++
>  6 files changed, 63 insertions(+), 23 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 4eeb867..d41c4fa 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1380,6 +1380,32 @@ int guest_physmap_remove_page(struct domain *d, gfn_t gfn, mfn_t mfn,
>      return p2m_remove_mapping(d, gfn, (1 << page_order), mfn);
>  }
>  
> +int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
> +                          unsigned long gfn, mfn_t mfn)
> +{
> +    struct page_info *page = mfn_to_page(mfn);
> +    int rc;
> +
> +    ASSERT(arch_acquire_resource_check(d));
> +
> +    if ( !get_page(page, fd) )
> +        return -EINVAL;
> +
> +    /*
> +     * It is valid to always use p2m_map_foreign_rw here as if this gets
> +     * called then d != fd. A case when d == fd would be rejected by
> +     * rcu_lock_remote_domain_by_id() earlier. Put a respective ASSERT()
> +     * to catch incorrect usage in future.
> +     */
> +    ASSERT(d != fd);
> +
> +    rc = guest_physmap_add_entry(d, _gfn(gfn), mfn, 0, p2m_map_foreign_rw);
> +    if ( rc )
> +        put_page(page);
> +
> +    return rc;
> +}
> +
>  static struct page_info *p2m_allocate_root(void)
>  {
>      struct page_info *page;
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 71fda06..cbeea85 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1323,8 +1323,11 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn_l,
>  }
>  
>  /* Set foreign mfn in the given guest's p2m table. */
> -int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
> +int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
> +                          unsigned long gfn, mfn_t mfn)
>  {
> +    ASSERT(arch_acquire_resource_check(d));
> +
>      return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
>                                 p2m_get_hostp2m(d)->default_access);
>  }
> @@ -2579,7 +2582,7 @@ static int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
>       * hvm fixme: until support is added to p2m teardown code to cleanup any
>       * foreign entries, limit this to hardware domain only.
>       */
> -    if ( !is_hardware_domain(tdom) )
> +    if ( !arch_acquire_resource_check(tdom) )
>          return -EPERM;
>  
>      if ( foreigndom == DOMID_XEN )
> @@ -2635,7 +2638,7 @@ static int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
>       * will update the m2p table which will result in  mfn -> gpfn of dom0
>       * and not fgfn of domU.
>       */
> -    rc = set_foreign_p2m_entry(tdom, gpfn, mfn);
> +    rc = set_foreign_p2m_entry(tdom, fdom, gpfn, mfn);
>      if ( rc )
>          gdprintk(XENLOG_WARNING, "set_foreign_p2m_entry failed. "
>                   "gpfn:%lx mfn:%lx fgfn:%lx td:%d fd:%d\n",
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 66828d9..d625a9b 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1138,12 +1138,7 @@ static int acquire_resource(
>      xen_pfn_t mfn_list[32];
>      int rc;
>  
> -    /*
> -     * FIXME: Until foreign pages inserted into the P2M are properly
> -     *        reference counted, it is unsafe to allow mapping of
> -     *        resource pages unless the caller is the hardware domain.
> -     */
> -    if ( paging_mode_translate(currd) && !is_hardware_domain(currd) )
> +    if ( !arch_acquire_resource_check(currd) )
>          return -EACCES;
>  
>      if ( copy_from_guest(&xmar, arg, 1) )
> @@ -1211,7 +1206,7 @@ static int acquire_resource(
>  
>          for ( i = 0; !rc && i < xmar.nr_frames; i++ )
>          {
> -            rc = set_foreign_p2m_entry(currd, gfn_list[i],
> +            rc = set_foreign_p2m_entry(currd, d, gfn_list[i],
>                                         _mfn(mfn_list[i]));
>              /* rc should be -EIO for any iteration other than the first */
>              if ( rc && i )
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 28ca9a8..4f8b3b0 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -161,6 +161,15 @@ typedef enum {
>  #endif
>  #include <xen/p2m-common.h>
>  
> +static inline bool arch_acquire_resource_check(struct domain *d)
> +{
> +    /*
> +     * The reference counting of foreign entries in set_foreign_p2m_entry()
> +     * is supported on Arm.
> +     */
> +    return true;
> +}
> +
>  static inline
>  void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
>  {
> @@ -392,16 +401,6 @@ static inline gfn_t gfn_next_boundary(gfn_t gfn, unsigned int order)
>      return gfn_add(gfn, 1UL << order);
>  }
>  
> -static inline int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
> -                                        mfn_t mfn)
> -{
> -    /*
> -     * NOTE: If this is implemented then proper reference counting of
> -     *       foreign entries will need to be implemented.
> -     */
> -    return -EOPNOTSUPP;
> -}
> -
>  /*
>   * A vCPU has cache enabled only when the MMU is enabled and data cache
>   * is enabled.
> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
> index 7df2878..1d64c12 100644
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -382,6 +382,22 @@ struct p2m_domain {
>  #endif
>  #include <xen/p2m-common.h>
>  
> +static inline bool arch_acquire_resource_check(struct domain *d)
> +{
> +    /*
> +     * The reference counting of foreign entries in set_foreign_p2m_entry()
> +     * is not supported for translated domains on x86.
> +     *
> +     * FIXME: Until foreign pages inserted into the P2M are properly
> +     * reference counted, it is unsafe to allow mapping of
> +     * resource pages unless the caller is the hardware domain.
> +     */
> +    if ( paging_mode_translate(d) && !is_hardware_domain(d) )
> +        return false;
> +
> +    return true;
> +}
> +
>  /*
>   * Updates vCPU's n2pm to match its np2m_base in VMCx12 and returns that np2m.
>   */
> @@ -647,9 +663,6 @@ int p2m_finish_type_change(struct domain *d,
>  int p2m_is_logdirty_range(struct p2m_domain *, unsigned long start,
>                            unsigned long end);
>  
> -/* Set foreign entry in the p2m table (for priv-mapping) */
> -int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
> -
>  /* Set mmio addresses in the p2m table (for pass-through) */
>  int set_mmio_p2m_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
>                         unsigned int order);
> diff --git a/xen/include/xen/p2m-common.h b/xen/include/xen/p2m-common.h
> index 58031a6..b4bc709 100644
> --- a/xen/include/xen/p2m-common.h
> +++ b/xen/include/xen/p2m-common.h
> @@ -3,6 +3,10 @@
>  
>  #include <xen/mm.h>
>  
> +/* Set foreign entry in the p2m table */
> +int set_foreign_p2m_entry(struct domain *d, const struct domain *fd,
> +                          unsigned long gfn, mfn_t mfn);
> +
>  /* Remove a page from a domain's p2m table */
>  int __must_check
>  guest_physmap_remove_page(struct domain *d, gfn_t gfn, mfn_t mfn,
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server()
  2021-01-12 21:52 ` [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server() Oleksandr Tyshchenko
@ 2021-01-15  1:24   ` Stefano Stabellini
  2021-01-18 10:23   ` Paul Durrant
  1 sibling, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  1:24 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Julien Grall, Volodymyr Babchuk, Paul Durrant, Julien Grall

On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch introduces a helper whose main purpose is to check
> whether a domain is using IOREQ server(s).
> 
> On Arm the current benefit is to avoid calling vcpu_ioreq_handle_completion()
> (which implies iterating over all possible IOREQ servers anyway)
> on every return in leave_hypervisor_to_guest() if there are no active
> servers for the particular domain.
> Also this helper will be used by one of the subsequent patches on Arm.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - new patch
> 
> Changes V1 -> V2:
>    - update patch description
>    - guard helper with CONFIG_IOREQ_SERVER
>    - remove "hvm" prefix
>    - modify helper to just return d->arch.hvm.ioreq_server.nr_servers
>    - put suitable ASSERT()s
>    - use ASSERT(d->ioreq_server.server[id] ? !s : !!s) in set_ioreq_server()
>    - remove d->ioreq_server.nr_servers = 0 from hvm_ioreq_init()
> 
> Changes V2 -> V3:
>    - update patch description
>    - remove ASSERT()s from the helper, add a comment
>    - use #ifdef CONFIG_IOREQ_SERVER inside function body
>    - use new ASSERT() construction in set_ioreq_server()
> 
> Changes V3 -> V4:
>    - update patch description
>    - drop per-domain variable "nr_servers"
>    - reimplement a helper to count the non-NULL entries
>    - make the helper out-of-line
> ---
>  xen/arch/arm/traps.c    | 15 +++++++++------
>  xen/common/ioreq.c      | 16 ++++++++++++++++
>  xen/include/xen/ioreq.h |  2 ++
>  3 files changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 4a83e1e..35094d8 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2262,14 +2262,17 @@ static bool check_for_vcpu_work(void)
>      struct vcpu *v = current;
>  
>  #ifdef CONFIG_IOREQ_SERVER
> -    bool handled;
> +    if ( domain_has_ioreq_server(v->domain) )
> +    {
> +        bool handled;
>  
> -    local_irq_enable();
> -    handled = vcpu_ioreq_handle_completion(v);
> -    local_irq_disable();
> +        local_irq_enable();
> +        handled = vcpu_ioreq_handle_completion(v);
> +        local_irq_disable();
>  
> -    if ( !handled )
> -        return true;
> +        if ( !handled )
> +            return true;
> +    }
>  #endif
>  
>      if ( likely(!v->arch.need_flush_to_ram) )
> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> index d5f4dd3..59f4990 100644
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -80,6 +80,22 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
>      return &p->vcpu_ioreq[v->vcpu_id];
>  }
>  
> +/*
> + * This should only be used when d == current->domain or when they're
> + * distinct and d is paused. Otherwise the result is stale before
> + * the caller can inspect it.
> + */
> +bool domain_has_ioreq_server(const struct domain *d)
> +{
> +    const struct ioreq_server *s;
> +    unsigned int id;
> +
> +    FOR_EACH_IOREQ_SERVER(d, id, s)
> +        return true;
> +
> +    return false;
> +}
> +
>  static struct ioreq_vcpu *get_pending_vcpu(const struct vcpu *v,
>                                             struct ioreq_server **srvp)
>  {
> diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
> index ec7e98d..f0908af 100644
> --- a/xen/include/xen/ioreq.h
> +++ b/xen/include/xen/ioreq.h
> @@ -81,6 +81,8 @@ static inline bool ioreq_needs_completion(const ioreq_t *ioreq)
>  #define HANDLE_BUFIOREQ(s) \
>      ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
>  
> +bool domain_has_ioreq_server(const struct domain *d);
> +
>  bool vcpu_ioreq_pending(struct vcpu *v);
>  bool vcpu_ioreq_handle_completion(struct vcpu *v);
>  bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 18/24] xen/dm: Introduce xendevicemodel_set_irq_level DM op
  2021-01-12 21:52 ` [PATCH V4 18/24] xen/dm: Introduce xendevicemodel_set_irq_level DM op Oleksandr Tyshchenko
@ 2021-01-15  1:32   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  1:32 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Julien Grall, Ian Jackson, Wei Liu, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Oleksandr Tyshchenko

On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> This patch adds the ability for the device emulator to notify the other
> end (some entity running in the guest) using an SPI, and implements the
> Arm-specific bits for it. The proposed interface allows the emulator to
> set the logical level of one of a domain's IRQ lines.
> 
> We can't reuse the existing DM op (xen_dm_op_set_isa_irq_level)
> to inject an interrupt as the "isa_irq" field is only 8 bits wide
> and can only cover IRQs 0 - 255, whereas we need a wider range (0 - 1020).
> 
> Please note, for an edge-triggered interrupt (which is used for
> the virtio-mmio emulation) we only trigger the interrupt on Arm
> if the level is asserted (rising edge) and do nothing if the level
> is deasserted (falling edge), so the call could be named "trigger_irq"
> (without the level parameter). But, in order to model the line closely
> (to be able to support level-triggered interrupts) we need to know whether
> the line is low or high, so the proposed interface has been chosen.
> However, it is worth mentioning that in the case of a level-triggered
> interrupt, we should keep injecting the interrupt to the guest until
> the line is deasserted (this is not covered by the current patch).
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>
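
For reference, from the emulator side the new op would be used roughly
like this (an untested sketch, error handling mostly omitted; the SPI
number would come from the guest's device tree):

    #include <xendevicemodel.h>

    /* Pulse a guest's (edge-triggered) virtio-mmio SPI. */
    static int notify_guest(domid_t domid, uint32_t spi)
    {
        xendevicemodel_handle *dmod = xendevicemodel_open(NULL, 0);
        int rc;

        if ( !dmod )
            return -1;

        rc = xendevicemodel_set_irq_level(dmod, domid, spi, 1);
        if ( !rc )
            rc = xendevicemodel_set_irq_level(dmod, domid, spi, 0);

        xendevicemodel_close(dmod);

        return rc;
    }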


> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - check incoming parameters in arch_dm_op()
>    - add explicit padding to struct xen_dm_op_set_irq_level
> 
> Changes V1 -> V2:
>    - update the author of a patch
>    - update patch description
>    - check that padding is always 0
>    - mention that interface is Arm only and only SPIs are
>      supported for now
>    - allow to set the logical level of a line for non-allocated
>      interrupts only
>    - add xen_dm_op_set_irq_level_t
> 
> Changes V2 -> V3:
>    - no changes
> 
> Changes V3 -> V4:
>    - update patch description
>    - update patch according to the IOREQ related dm-op handling changes
> ---
>  tools/include/xendevicemodel.h               |  4 +++
>  tools/libs/devicemodel/core.c                | 18 ++++++++++
>  tools/libs/devicemodel/libxendevicemodel.map |  1 +
>  xen/arch/arm/dm.c                            | 54 +++++++++++++++++++++++++++-
>  xen/include/public/hvm/dm_op.h               | 16 +++++++++
>  5 files changed, 92 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/include/xendevicemodel.h b/tools/include/xendevicemodel.h
> index e877f5c..c06b3c8 100644
> --- a/tools/include/xendevicemodel.h
> +++ b/tools/include/xendevicemodel.h
> @@ -209,6 +209,10 @@ int xendevicemodel_set_isa_irq_level(
>      xendevicemodel_handle *dmod, domid_t domid, uint8_t irq,
>      unsigned int level);
>  
> +int xendevicemodel_set_irq_level(
> +    xendevicemodel_handle *dmod, domid_t domid, unsigned int irq,
> +    unsigned int level);
> +
>  /**
>   * This function maps a PCI INTx line to a an IRQ line.
>   *
> diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
> index 4d40639..30bd79f 100644
> --- a/tools/libs/devicemodel/core.c
> +++ b/tools/libs/devicemodel/core.c
> @@ -430,6 +430,24 @@ int xendevicemodel_set_isa_irq_level(
>      return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
>  }
>  
> +int xendevicemodel_set_irq_level(
> +    xendevicemodel_handle *dmod, domid_t domid, uint32_t irq,
> +    unsigned int level)
> +{
> +    struct xen_dm_op op;
> +    struct xen_dm_op_set_irq_level *data;
> +
> +    memset(&op, 0, sizeof(op));
> +
> +    op.op = XEN_DMOP_set_irq_level;
> +    data = &op.u.set_irq_level;
> +
> +    data->irq = irq;
> +    data->level = level;
> +
> +    return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
> +}
> +
>  int xendevicemodel_set_pci_link_route(
>      xendevicemodel_handle *dmod, domid_t domid, uint8_t link, uint8_t irq)
>  {
> diff --git a/tools/libs/devicemodel/libxendevicemodel.map b/tools/libs/devicemodel/libxendevicemodel.map
> index 561c62d..a0c3012 100644
> --- a/tools/libs/devicemodel/libxendevicemodel.map
> +++ b/tools/libs/devicemodel/libxendevicemodel.map
> @@ -32,6 +32,7 @@ VERS_1.2 {
>  	global:
>  		xendevicemodel_relocate_memory;
>  		xendevicemodel_pin_memory_cacheattr;
> +		xendevicemodel_set_irq_level;
>  } VERS_1.1;
>  
>  VERS_1.3 {
> diff --git a/xen/arch/arm/dm.c b/xen/arch/arm/dm.c
> index e6dedf4..804830a 100644
> --- a/xen/arch/arm/dm.c
> +++ b/xen/arch/arm/dm.c
> @@ -20,6 +20,8 @@
>  #include <xen/ioreq.h>
>  #include <xen/nospec.h>
>  
> +#include <asm/vgic.h>
> +
>  static int dm_op(const struct dmop_args *op_args)
>  {
>      struct domain *d;
> @@ -35,6 +37,7 @@ static int dm_op(const struct dmop_args *op_args)
>          [XEN_DMOP_unmap_io_range_from_ioreq_server] = sizeof(struct xen_dm_op_ioreq_server_range),
>          [XEN_DMOP_set_ioreq_server_state]           = sizeof(struct xen_dm_op_set_ioreq_server_state),
>          [XEN_DMOP_destroy_ioreq_server]             = sizeof(struct xen_dm_op_destroy_ioreq_server),
> +        [XEN_DMOP_set_irq_level]                    = sizeof(struct xen_dm_op_set_irq_level),
>      };
>  
>      rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
> @@ -73,7 +76,56 @@ static int dm_op(const struct dmop_args *op_args)
>      if ( op.pad )
>          goto out;
>  
> -    rc = ioreq_server_dm_op(&op, d, &const_op);
> +    switch ( op.op )
> +    {
> +    case XEN_DMOP_set_irq_level:
> +    {
> +        const struct xen_dm_op_set_irq_level *data =
> +            &op.u.set_irq_level;
> +        unsigned int i;
> +
> +        /* Only SPIs are supported */
> +        if ( (data->irq < NR_LOCAL_IRQS) || (data->irq >= vgic_num_irqs(d)) )
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
> +
> +        if ( data->level != 0 && data->level != 1 )
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
> +
> +        /* Check that padding is always 0 */
> +        for ( i = 0; i < sizeof(data->pad); i++ )
> +        {
> +            if ( data->pad[i] )
> +            {
> +                rc = -EINVAL;
> +                break;
> +            }
> +        }
> +
> +        /*
> +         * Allow to set the logical level of a line for non-allocated
> +         * interrupts only.
> +         */
> +        if ( test_bit(data->irq, d->arch.vgic.allocated_irqs) )
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
> +
> +        vgic_inject_irq(d, NULL, data->irq, data->level);
> +        rc = 0;
> +        break;
> +    }
> +
> +    default:
> +        rc = ioreq_server_dm_op(&op, d, &const_op);
> +        break;
> +    }
>  
>      if ( (!rc || rc == -ERESTART) &&
>           !const_op && copy_to_guest_offset(op_args->buf[0].h, offset,
> diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
> index 66cae1a..1f70d58 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -434,6 +434,21 @@ struct xen_dm_op_pin_memory_cacheattr {
>  };
>  typedef struct xen_dm_op_pin_memory_cacheattr xen_dm_op_pin_memory_cacheattr_t;
>  
> +/*
> + * XEN_DMOP_set_irq_level: Set the logical level of one of a domain's
> + *                         IRQ lines (currently Arm only).
> + * Only SPIs are supported.
> + */
> +#define XEN_DMOP_set_irq_level 19
> +
> +struct xen_dm_op_set_irq_level {
> +    uint32_t irq;
> +    /* IN - Level: 0 -> deasserted, 1 -> asserted */
> +    uint8_t level;
> +    uint8_t pad[3];
> +};
> +typedef struct xen_dm_op_set_irq_level xen_dm_op_set_irq_level_t;
> +
>  struct xen_dm_op {
>      uint32_t op;
>      uint32_t pad;
> @@ -447,6 +462,7 @@ struct xen_dm_op {
>          xen_dm_op_track_dirty_vram_t track_dirty_vram;
>          xen_dm_op_set_pci_intx_level_t set_pci_intx_level;
>          xen_dm_op_set_isa_irq_level_t set_isa_irq_level;
> +        xen_dm_op_set_irq_level_t set_irq_level;
>          xen_dm_op_set_pci_link_route_t set_pci_link_route;
>          xen_dm_op_modified_memory_t modified_memory;
>          xen_dm_op_set_mem_type_t set_mem_type;
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 19/24] xen/arm: io: Abstract sign-extension
  2021-01-12 21:52 ` [PATCH V4 19/24] xen/arm: io: Abstract sign-extension Oleksandr Tyshchenko
@ 2021-01-15  1:35   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  1:35 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Julien Grall, Volodymyr Babchuk, Julien Grall

On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> In order to avoid code duplication (both handle_read() and
> handle_ioserv() contain the same code for the sign extension),
> move this code into a common helper to be used by both.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
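
As a quick standalone illustration of what the helper does (plain C,
not Xen code): a 1-byte signed read of 0x80 must end up as
0xffffffffffffff80 in the destination register.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned long r = 0x80; /* read handler zeroed the upper bits */
        uint8_t size = 8;       /* (1 << dabt.size) * 8 with dabt.size == 0 */

        if ( r & (1UL << (size - 1)) )  /* sign bit of the access set... */
            r |= (~0UL) << size;        /* ...so propagate it upwards */

        printf("%#lx\n", r);            /* prints 0xffffffffffffff80 on LP64 */
        return 0;
    }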


> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes V1 -> V2:
>    - new patch
> 
> Changes V2 -> V3:
>    - no changes
> 
> Changes V3 -> V4:
>    - no changes here, but in new patch:
>      "xen/arm: io: Harden sign extension check"
> ---
>  xen/arch/arm/io.c           | 18 ++----------------
>  xen/arch/arm/ioreq.c        | 17 +----------------
>  xen/include/asm-arm/traps.h | 24 ++++++++++++++++++++++++
>  3 files changed, 27 insertions(+), 32 deletions(-)
> 
> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
> index 9814481..307c521 100644
> --- a/xen/arch/arm/io.c
> +++ b/xen/arch/arm/io.c
> @@ -24,6 +24,7 @@
>  #include <asm/cpuerrata.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
> +#include <asm/traps.h>
>  #include <asm/hvm/ioreq.h>
>  
>  #include "decode.h"
> @@ -40,26 +41,11 @@ static enum io_state handle_read(const struct mmio_handler *handler,
>       * setting r).
>       */
>      register_t r = 0;
> -    uint8_t size = (1 << dabt.size) * 8;
>  
>      if ( !handler->ops->read(v, info, &r, handler->priv) )
>          return IO_ABORT;
>  
> -    /*
> -     * Sign extend if required.
> -     * Note that we expect the read handler to have zeroed the bits
> -     * outside the requested access size.
> -     */
> -    if ( dabt.sign && (r & (1UL << (size - 1))) )
> -    {
> -        /*
> -         * We are relying on register_t using the same as
> -         * an unsigned long in order to keep the 32-bit assembly
> -         * code smaller.
> -         */
> -        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
> -        r |= (~0UL) << size;
> -    }
> +    r = sign_extend(dabt, r);
>  
>      set_user_reg(regs, dabt.reg, r);
>  
> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> index 3c4a24d..40b9e59 100644
> --- a/xen/arch/arm/ioreq.c
> +++ b/xen/arch/arm/ioreq.c
> @@ -28,7 +28,6 @@ enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
>      const union hsr hsr = { .bits = regs->hsr };
>      const struct hsr_dabt dabt = hsr.dabt;
>      /* Code is similar to handle_read */
> -    uint8_t size = (1 << dabt.size) * 8;
>      register_t r = v->io.req.data;
>  
>      /* We are done with the IO */
> @@ -37,21 +36,7 @@ enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
>      if ( dabt.write )
>          return IO_HANDLED;
>  
> -    /*
> -     * Sign extend if required.
> -     * Note that we expect the read handler to have zeroed the bits
> -     * outside the requested access size.
> -     */
> -    if ( dabt.sign && (r & (1UL << (size - 1))) )
> -    {
> -        /*
> -         * We are relying on register_t using the same as
> -         * an unsigned long in order to keep the 32-bit assembly
> -         * code smaller.
> -         */
> -        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
> -        r |= (~0UL) << size;
> -    }
> +    r = sign_extend(dabt, r);
>  
>      set_user_reg(regs, dabt.reg, r);
>  
> diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
> index 997c378..e301c44 100644
> --- a/xen/include/asm-arm/traps.h
> +++ b/xen/include/asm-arm/traps.h
> @@ -83,6 +83,30 @@ static inline bool VABORT_GEN_BY_GUEST(const struct cpu_user_regs *regs)
>          (unsigned long)abort_guest_exit_end == regs->pc;
>  }
>  
> +/* Check whether the sign extension is required and perform it */
> +static inline register_t sign_extend(const struct hsr_dabt dabt, register_t r)
> +{
> +    uint8_t size = (1 << dabt.size) * 8;
> +
> +    /*
> +     * Sign extend if required.
> +     * Note that we expect the read handler to have zeroed the bits
> +     * outside the requested access size.
> +     */
> +    if ( dabt.sign && (r & (1UL << (size - 1))) )
> +    {
> +        /*
> +         * We are relying on register_t using the same as
> +         * an unsigned long in order to keep the 32-bit assembly
> +         * code smaller.
> +         */
> +        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
> +        r |= (~0UL) << size;
> +    }
> +
> +    return r;
> +}
> +
>  #endif /* __ASM_ARM_TRAPS__ */
>  /*
>   * Local variables:
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 20/24] xen/arm: io: Harden sign extension check
  2021-01-12 21:52 ` [PATCH V4 20/24] xen/arm: io: Harden sign extension check Oleksandr Tyshchenko
@ 2021-01-15  1:48   ` Stefano Stabellini
  2021-01-22 10:15   ` Volodymyr Babchuk
  1 sibling, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  1:48 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Julien Grall, Volodymyr Babchuk, Julien Grall

On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> In an ideal world we would never get undefined behavior when
> propagating the sign bit, since that bit can only be set for access
> sizes smaller than the register size (i.e. byte/half-word for aarch32,
> byte/half-word/word for aarch64).
> 
> In the real world we need to care for *possible* hardware bugs such as
> advertising a sign extension for a 64-bit (resp. 32-bit) access on Arm64
> (resp. Arm32).
> 
> So harden the code a bit more to prevent undefined behavior when
> propagating the sign bit in case of buggy hardware.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
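
As an aside, a quick user-space sketch of the undefined behaviour the
extra check guards against (a minimal sketch assuming a 64-bit
register_t, not part of the patch):

#include <stdio.h>

typedef unsigned long register_t;

static register_t sign_extend_checked(unsigned int size, int sign,
                                      register_t r)
{
    /*
     * Without the "size < bits of register_t" check, a buggy 64-bit
     * access advertising a sign extension would evaluate
     * (~0UL) << 64, which is undefined behaviour in C.
     */
    if ( sign && (size < sizeof(register_t) * 8) &&
         (r & (1UL << (size - 1))) )
        r |= (~0UL) << size;

    return r;
}

int main(void)
{
    /* A sign-extended byte read: 0x80 becomes 0xffffffffffffff80. */
    printf("%lx\n", sign_extend_checked(8, 1, 0x80));
    /* A (bogus) sign-extending 64-bit read is now left untouched. */
    printf("%lx\n", sign_extend_checked(64, 1, 0x8000000000000000UL));
    return 0;
}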


> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes V3 -> V4:
>    - new patch
> ---
>  xen/include/asm-arm/traps.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
> index e301c44..992d537 100644
> --- a/xen/include/asm-arm/traps.h
> +++ b/xen/include/asm-arm/traps.h
> @@ -93,7 +93,8 @@ static inline register_t sign_extend(const struct hsr_dabt dabt, register_t r)
>       * Note that we expect the read handler to have zeroed the bits
>       * outside the requested access size.
>       */
> -    if ( dabt.sign && (r & (1UL << (size - 1))) )
> +    if ( dabt.sign && (size < sizeof(register_t) * 8) &&
> +         (r & (1UL << (size - 1))) )
>      {
>          /*
>           * We are relying on register_t using the same as
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 22/24] xen/arm: Add mapcache invalidation handling
  2021-01-12 21:52 ` [PATCH V4 22/24] xen/arm: Add mapcache invalidation handling Oleksandr Tyshchenko
@ 2021-01-15  2:11   ` Stefano Stabellini
  2021-01-21 19:47     ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-15  2:11 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Julien Grall, Volodymyr Babchuk, Julien Grall

On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> We need to send a mapcache invalidation request to qemu/demu every time
> a page gets removed from a guest.
> 
> At the moment, the Arm code doesn't explicitly remove the existing
> mapping before inserting the new mapping. Instead, this is done
> implicitly by __p2m_set_entry().
> 
> So we need to recognize the case when the old entry is a RAM page *and*
> the new MFN is different in order to set the corresponding flag.
> The most suitable place to do this is p2m_free_entry(), where
> we can find the correct leaf type. The invalidation request
> will be sent in do_trap_hypercall() later on.
> 
> Taking into account the following, do_trap_hypercall()
> is the best place to send the invalidation request:
>  - The only way a guest can modify its P2M on Arm is via an hypercall
>  - When sending the invalidation request, the vCPU will be blocked
>    until all the IOREQ servers have acknowledged the invalidation
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> ***
> Please note, this patch depends on the following which is
> on review:
> https://patchwork.kernel.org/patch/11803383/
> 
> This patch is on par with x86 code (whether it is buggy or not).
> If there is a need to improve/harden something, this can be done on
> a follow-up.
> ***
> 
> Changes V1 -> V2:
>    - new patch, some changes were derived from (+ new explanation):
>      xen/ioreq: Make x86's invalidate qemu mapcache handling common
>    - put setting of the flag into __p2m_set_entry()
>    - clarify the conditions when the flag should be set
>    - use domain_has_ioreq_server()
>    - update do_trap_hypercall() by adding local variable
> 
> Changes V2 -> V3:
>    - update patch description
>    - move check to p2m_free_entry()
>    - add a comment
>    - use "curr" instead of "v" in do_trap_hypercall()
> 
> Changes V3 -> V4:
>    - update patch description
>    - re-order check in p2m_free_entry() to call domain_has_ioreq_server()
>      only if p2m->domain == current->domain
>    - add a comment in do_trap_hypercall()
> ---
>  xen/arch/arm/p2m.c   | 25 +++++++++++++++++--------
>  xen/arch/arm/traps.c | 20 +++++++++++++++++---
>  2 files changed, 34 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index d41c4fa..26acb95d 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1,6 +1,7 @@
>  #include <xen/cpu.h>
>  #include <xen/domain_page.h>
>  #include <xen/iocap.h>
> +#include <xen/ioreq.h>
>  #include <xen/lib.h>
>  #include <xen/sched.h>
>  #include <xen/softirq.h>
> @@ -749,17 +750,25 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>      if ( !p2m_is_valid(entry) )
>          return;
>  
> -    /* Nothing to do but updating the stats if the entry is a super-page. */
> -    if ( p2m_is_superpage(entry, level) )
> +    if ( p2m_is_superpage(entry, level) || (level == 3) )
>      {
> -        p2m->stats.mappings[level]--;
> -        return;
> -    }
> +#ifdef CONFIG_IOREQ_SERVER
> +        /*
> +         * If this gets called (non-recursively) then either the entry
> +         * was replaced by an entry with a different base (valid case) or
> +         * the shattering of a superpage failed (error case).
> +         * So, at worst, a spurious mapcache invalidation might be sent.
> +         */
> +        if ( (p2m->domain == current->domain) &&
> +              domain_has_ioreq_server(p2m->domain) &&
> +              p2m_is_ram(entry.p2m.type) )
> +            p2m->domain->mapcache_invalidate = true;
> +#endif
>  
> -    if ( level == 3 )
> -    {
>          p2m->stats.mappings[level]--;
> -        p2m_put_l3_page(entry);
> +        /* Nothing to do if the entry is a super-page. */
> +        if ( level == 3 )
> +            p2m_put_l3_page(entry);
>          return;
>      }
>  
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 35094d8..1070d1b 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1443,6 +1443,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>                                const union hsr hsr)
>  {
>      arm_hypercall_fn_t call = NULL;
> +    struct vcpu *curr = current;
>  
>      BUILD_BUG_ON(NR_hypercalls < ARRAY_SIZE(arm_hypercall_table) );
>  
> @@ -1459,7 +1460,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>          return;
>      }
>  
> -    current->hcall_preempted = false;
> +    curr->hcall_preempted = false;
>  
>      perfc_incra(hypercalls, *nr);
>      call = arm_hypercall_table[*nr].fn;
> @@ -1472,7 +1473,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>      HYPERCALL_RESULT_REG(regs) = call(HYPERCALL_ARGS(regs));
>  
>  #ifndef NDEBUG
> -    if ( !current->hcall_preempted )
> +    if ( !curr->hcall_preempted )
>      {
>          /* Deliberately corrupt parameter regs used by this hypercall. */
>          switch ( arm_hypercall_table[*nr].nr_args ) {
> @@ -1489,8 +1490,21 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>  #endif
>  
>      /* Ensure the hypercall trap instruction is re-executed. */
> -    if ( current->hcall_preempted )
> +    if ( curr->hcall_preempted )
>          regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
> +
> +#ifdef CONFIG_IOREQ_SERVER
> +    /*
> +     * Taking into the account the following the do_trap_hypercall()
> +     * is the best place to send invalidation request:
> +     * - The only way a guest can modify its P2M on Arm is via an hypercall
> +     * - When sending the invalidation request, the vCPU will be blocked
> +     *   until all the IOREQ servers have acknowledged the invalidation

NIT: I suggest rewording it as follows to make it sound better.

We call ioreq_signal_mapcache_invalidate from do_trap_hypercall()
because the only way a guest can modify its P2M on Arm is via an
hypercall. Note that sending the invalidation request causes the vCPU to
block until all the IOREQ servers have acknowledged the invalidation.


Could be done on commit.

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> +     */
> +    if ( unlikely(curr->domain->mapcache_invalidate) &&
> +         test_and_clear_bool(curr->domain->mapcache_invalidate) )
> +        ioreq_signal_mapcache_invalidate();
> +#endif
>  }
>  
>  void arch_hypercall_tasklet_result(struct vcpu *v, long res)
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-14 19:02         ` Andrew Cooper
@ 2021-01-15  9:57           ` Jan Beulich
  2021-01-15 10:00             ` Julien Grall
  2021-01-15 10:52             ` Andrew Cooper
  2021-01-15 10:43           ` Bertrand Marquis
                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 144+ messages in thread
From: Jan Beulich @ 2021-01-15  9:57 UTC (permalink / raw)
  To: Andrew Cooper, Ian Jackson
  Cc: xen-devel, committers, Tamas K Lengyel, Michał Leszczyński

On 14.01.2021 20:02, Andrew Cooper wrote:
> Bugs:
> 
> 1) HPET/PIT issue on newer Intel systems.  This has had literally tens
> of reports across the devel and users mailing lists, and prevents Xen
> from booting at all on the past two generations of Intel laptop.  I've
> finally got a repro and posted a fix to the list, but still in progress.
> 
> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
> working, and very little investigation on what's going on.  Suspicion is
> that there might be two bugs, one with smt=0 on recent AMD hardware, and
> one more general "some workloads cause negative credit" and might or
> might not be specific to credit2 (debugging feedback differs - also
> might be 3 underlying issues).
> 
> All of these have had repeated bug reports.  I'd classify them as
> blockers, given the impact they're having on people.

3) Fallout from MSR handling behavioral change.

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-15  9:57           ` Jan Beulich
@ 2021-01-15 10:00             ` Julien Grall
  2021-01-15 10:52             ` Andrew Cooper
  1 sibling, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 10:00 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper, Ian Jackson
  Cc: xen-devel, committers, Tamas K Lengyel, Michał Leszczyński



On 15/01/2021 09:57, Jan Beulich wrote:
> On 14.01.2021 20:02, Andrew Cooper wrote:
>> Bugs:
>>
>> 1) HPET/PIT issue on newer Intel systems.  This has had literally tens
>> of reports across the devel and users mailing lists, and prevents Xen
>> from booting at all on the past two generations of Intel laptop.  I've
>> finally got a repro and posted a fix to the list, but still in progress.
>>
>> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
>> working, and very little investigation on what's going on.  Suspicion is
>> that there might be two bugs, one with smt=0 on recent AMD hardware, and
>> one more general "some workloads cause negative credit" and might or
>> might not be specific to credit2 (debugging feedback differs - also
>> might be 3 underlying issues).
>>
>> All of these have had repeated bug reports.  I'd classify them as
>> blockers, given the impact they're having on people.
> 
> 3) Fallout from MSR handling behavioral change.

4) Use-after-free in the IOMMU code (this should be a blocker).

See  "xen/iommu: Collection of bug fixes for IOMMU teadorwn"

<20201222154338.9459-1-julien@xen.org>

Cheers,

> 
> Jan
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-14 19:02         ` Andrew Cooper
  2021-01-15  9:57           ` Jan Beulich
@ 2021-01-15 10:43           ` Bertrand Marquis
  2021-01-15 15:14           ` Lengyel, Tamas
  2021-01-28 18:26           ` Dario Faggioli
  3 siblings, 0 replies; 144+ messages in thread
From: Bertrand Marquis @ 2021-01-15 10:43 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ian Jackson, xen-devel, committers, Tamas K Lengyel,
	Michał Leszczyński

Hi,

> On 14 Jan 2021, at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> On 14/01/2021 16:06, Ian Jackson wrote:
>> The last posting date for new feature patches for Xen 4.15 is
>> tomorrow. [1]  We seem to be getting a reasonably good flood of stuff
>> trying to meet this deadline :-).
>> 
>> Patches for new features posted after tomorrow will be deferred to the
>> next Xen release after 4.15.  NB the primary responsibility for
>> driving a feature's progress to meet the release schedule, lies with
>> the feature's proponent(s).
>> 
>> 
>>  As a reminder, here is the release schedule:
>> + (unchanged information indented with spaces):
>> 
>>   Friday 15th January    Last posting date
>> 
>>       Patches adding new features should be posted to the mailing list
>>       by this date, although perhaps not in their final version.
>> 
>>   Friday 29th January    Feature freeze
>> 
>>       Patches adding new features should be committed by this date.
>>       Straightforward bugfixes may continue to be accepted by
>>       maintainers.
>> 
>>   Friday 12th February **tentative**   Code freeze
>> 
>>       Bugfixes only, all changes to be approved by the Release Manager.
>> 
>>   Week of 12th March **tentative**    Release
>>       (probably Tuesday or Wednesday)
>> 
>>  Any patches containing substantial refactoring are to be treated as
>>  new features, even if their intent is to fix bugs.
>> 
>>  Freeze exceptions will not be routine, but may be granted in
>>  exceptional cases for small changes on the basis of risk assessment.
>>  Large series will not get exceptions.  Contributors *must not* rely on
>>  getting, or expect, a freeze exception.
>> 
>> + New or improved tests (supposing they do not involve refactoring,
>> + even build system reorganisation), and documentation improvements,
>> + will generally be treated as bugfixes.
>> 
>>  The codefreeze and release dates are provisional and will be adjusted
>>  in the light of apparent code quality etc.
>> 
>>  If as a feature proponent you feel your feature is at risk and there
>>  is something the Xen Project could do to help, please consult me or
>>  the Community Manager.  In such situations please reach out earlier
>>  rather than later.
>> 
>> 
>> In my last update I asked this:
>> 
>>> If you are working on a feature you want in 4.15 please let me know
>>> about it.  Ideally I'd like a little stanza like this:
>>> 
>>> S: feature name
>>> O: feature owner (proponent) name
>>> E: feature owner (proponent) email address
>>> P: your current estimate of the probability it making 4.15, as a %age
>>> 
>>> But free-form text is OK too.  Please reply to this mail.
>> I received one mail.  Thanks to Oleksandr Andrushchenko for his update
>> on the following feature:
>> 
>>  IOREQ feature (+ virtio-mmio) on Arm
>>  https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg87002.html
>> 
>>  Julien Grall <julien@xen.org>
>>  Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> 
>> I see that V4 of this series was just posted.  Thanks, Oleksandr.
>> I'll make a separate enquiry about your series.
>> 
>> I think if people don't find the traditional feature tracking useful,
>> I will try to assemble Release Notes information later, during the
>> freeze, when fewer people are rushing to try to meet the deadlines.
> 
> (Now I have working email).
> 
> Features:
> 
> 1) acquire_resource fixes.
> 
> Not really a new feature - entirely bugfixing a preexisting one.
> Developed by me to help 2).  Reasonably well acked, but awaiting
> feedback on v3.
> 
> 2) External Processor Trace support.
> 
> Developed by Michał.  Depends on 1), and awaiting a new version being
> posted.
> 
> As far as I'm aware, both Intel and CERT have production systems
> deployed using this functionality, so it is very highly desirable to get
> into 4.15.
> 
> 3) Initial Trenchboot+SKINIT support.
> 
> I've got two patches I need to clean up and submit, which are the first
> part of the Trenchboot + Dynamic Root of Trust on AMD support.  This
> will get Xen into a position where it can be started via the new grub
> "secure_launch" protocol.
> 
> Later patches (i.e. post 4.15) will do support for Intel TXT (i.e.
> without tboot), as well as the common infrastructure for the TPM event
> log and further measurements during the boot process.
> 
> 4) "simple" autotest support.


5) SMMU-v3 Support from Rahul Singh

See "xen/arm: Add support for SMMUv3 driver”
https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg00429.html

Almost everything in the series is already acked.

Cheers
Bertrand

> 
> 
> Bugs:
> 
> 1) HPET/PIT issue on newer Intel systems.  This has had literally tens
> of reports across the devel and users mailing lists, and prevents Xen
> from booting at all on the past two generations of Intel laptop.  I've
> finally got a repro and posted a fix to the list, but still in progress.
> 
> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
> working, and very little investigation on what's going on.  Suspicion is
> that there might be two bugs, one with smt=0 on recent AMD hardware, and
> one more general "some workloads cause negative credit" and might or
> might not be specific to credit2 (debugging feedback differs - also
> might be 3 underlying issues).
> 
> All of these have had repeated bug reports.  I'd classify them as
> blockers, given the impact they're having on people.
> 
> ~Andrew


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-15  9:57           ` Jan Beulich
  2021-01-15 10:00             ` Julien Grall
@ 2021-01-15 10:52             ` Andrew Cooper
  2021-01-15 10:59               ` Andrew Cooper
  1 sibling, 1 reply; 144+ messages in thread
From: Andrew Cooper @ 2021-01-15 10:52 UTC (permalink / raw)
  To: Jan Beulich, Ian Jackson
  Cc: xen-devel, committers, Tamas K Lengyel, Michał Leszczyński

On 15/01/2021 09:57, Jan Beulich wrote:
> On 14.01.2021 20:02, Andrew Cooper wrote:
>> Bugs:
>>
>> 1) HPET/PIT issue on newer Intel systems.  This has had literally tens
>> of reports across the devel and users mailing lists, and prevents Xen
>> from booting at all on the past two generations of Intel laptop.  I've
>> finally got a repro and posted a fix to the list, but still in progress.
>>
>> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
>> working, and very little investigation on what's going on.  Suspicion is
>> that there might be two bugs, one with smt=0 on recent AMD hardware, and
>> one more general "some workloads cause negative credit" and might or
>> might not be specific to credit2 (debugging feedback differs - also
>> might be 3 underlying issues).
>>
>> All of these have had repeated bug reports.  I'd classify them as
>> blockers, given the impact they're having on people.
> 3) Fallout from MSR handling behavioral change.

Yes, sorry for forgetting.  I was literally working on it while writing
this email - no idea why I forgot it.

4) zstd support to unbreak Fedora.  (I'm deliberately putting this in
the bugs rather than the feature category).

~Andrew


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-15 10:52             ` Andrew Cooper
@ 2021-01-15 10:59               ` Andrew Cooper
  2021-01-15 11:08                 ` Jan Beulich
  0 siblings, 1 reply; 144+ messages in thread
From: Andrew Cooper @ 2021-01-15 10:59 UTC (permalink / raw)
  To: Jan Beulich, Ian Jackson
  Cc: xen-devel, committers, Tamas K Lengyel, Michał Leszczyński

On 15/01/2021 10:52, Andrew Cooper wrote:
> On 15/01/2021 09:57, Jan Beulich wrote:
>> On 14.01.2021 20:02, Andrew Cooper wrote:
>>> Bugs:
>>>
>>> 1) HPET/PIT issue on newer Intel systems.  This has had literally tens
>>> of reports across the devel and users mailing lists, and prevents Xen
>>> from booting at all on the past two generations of Intel laptop.  I've
>>> finally got a repro and posted a fix to the list, but still in progress.
>>>
>>> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
>>> working, and very little investigation on what's going on.  Suspicion is
>>> that there might be two bugs, one with smt=0 on recent AMD hardware, and
>>> one more general "some workloads cause negative credit" and might or
>>> might not be specific to credit2 (debugging feedback differs - also
>>> might be 3 underlying issues).
>>>
>>> All of these have had repeated bug reports.  I'd classify them as
>>> blockers, given the impact they're having on people.
>> 3) Fallout from MSR handling behavioral change.
> Yes, sorry for forgetting.  I was literally working on it while writing
> this email - no idea why I forgot it.
>
> 4) zstd support to unbreak Fedora.  (I'm deliberately putting this in
> the bugs rather than the feature category).

Ha!  I should have read further through my emails before replying.

But we do at least want this item tracking.

~Andrew


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-15 10:59               ` Andrew Cooper
@ 2021-01-15 11:08                 ` Jan Beulich
  0 siblings, 0 replies; 144+ messages in thread
From: Jan Beulich @ 2021-01-15 11:08 UTC (permalink / raw)
  To: Andrew Cooper, Ian Jackson
  Cc: xen-devel, committers, Tamas K Lengyel, Michał Leszczyński

On 15.01.2021 11:59, Andrew Cooper wrote:
> On 15/01/2021 10:52, Andrew Cooper wrote:
>> 4) zstd support to unbreak Fedora.  (I'm deliberately putting this in
>> the bugs rather than the feature category).
> 
> Ha!  I should have read further through my emails before replying.

What I've sent doesn't cover DomU-s, though, so ...

> But we do at least want this item tracking.

... I definitely agree here from whichever perspective I take.

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
  2021-01-14 15:31     ` Oleksandr
@ 2021-01-15 14:35       ` Alex Bennée
  2021-01-18 17:42         ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Alex Bennée @ 2021-01-15 14:35 UTC (permalink / raw)
  To: Oleksandr
  Cc: Wei Chen, Julien Grall, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko,
	xen-devel


Oleksandr <olekstysh@gmail.com> writes:

> On 14.01.21 05:58, Wei Chen wrote:
>> Hi Oleksandr,
>
> Hi Wei
<snip>
>>> @@ -1090,6 +1091,40 @@ static int acquire_grant_table(struct domain *d,
>>> unsigned int id,
>>>       return 0;
>>>   }
>>>
>>> +static int acquire_ioreq_server(struct domain *d,
>>> +                                unsigned int id,
>>> +                                unsigned long frame,
>>> +                                unsigned int nr_frames,
>>> +                                xen_pfn_t mfn_list[])
>>> +{
>>> +#ifdef CONFIG_IOREQ_SERVER
>>> +    ioservid_t ioservid = id;
>>> +    unsigned int i;
>>> +    int rc;
>>> +
>>> +    if ( !is_hvm_domain(d) )
>>> +        return -EINVAL;
>>> +
>>> +    if ( id != (unsigned int)ioservid )
>>> +        return -EINVAL;
>>> +
>>> +    for ( i = 0; i < nr_frames; i++ )
>>> +    {
>>> +        mfn_t mfn;
>>> +
>>> +        rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
>>> +        if ( rc )
>>> +            return rc;
>>> +
>>> +        mfn_list[i] = mfn_x(mfn);
>>> +    }
>>> +
>>> +    return 0;
>>> +#else
>>> +    return -EOPNOTSUPP;
>>> +#endif
>>> +}
>>> +
<snip>
>>>
>> This change could not be applied to the latest staging branch.
>
> Yes, thank you for noticing that.  The surrounding code was changed a bit
> (the patch series is based on 10-day-old staging); I will update for the
> next version.

I think the commit that introduced config ARCH_ACQUIRE_RESOURCE could
probably be reverted as it achieves pretty much the same thing as the
above code by moving the logic into the common code path.

The only real practical difference is an inline stub vs a general-purpose
function with IOREQ-specific #ifdeferry.
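
Roughly, the two shapes being compared (a minimal sketch; the per-arch
hook name is assumed for illustration):

/* Shape 1: per-arch hook, stubbed out on architectures without support. */
static inline int arch_acquire_resource(struct domain *d, unsigned int type,
                                        unsigned int id, unsigned long frame,
                                        unsigned int nr_frames,
                                        xen_pfn_t mfn_list[])
{
    return -EOPNOTSUPP;
}

/* Shape 2: one common function that degrades per configuration. */
static int acquire_ioreq_server(struct domain *d, unsigned int id,
                                unsigned long frame, unsigned int nr_frames,
                                xen_pfn_t mfn_list[])
{
#ifdef CONFIG_IOREQ_SERVER
    /* ... the real work, as in the hunk quoted above ... */
    return 0;
#else
    return -EOPNOTSUPP;
#endif
}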
<snip>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
  2021-01-12 21:52 ` [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio() Oleksandr Tyshchenko
@ 2021-01-15 14:48   ` Alex Bennée
  2021-01-15 15:19   ` Julien Grall
  2021-01-18  8:29   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Alex Bennée @ 2021-01-15 14:48 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall,
	xen-devel


Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> The IOREQ is about to be a common feature and Arm will have its own
> implementation.
>
> But the name of the function is pretty generic and can be confusing
> on Arm (we already have a try_handle_mmio()).
>
> In order not to rename the function (which is used for a varying
> set of purposes on x86) globally, and to get a non-confusing variant
> on Arm, provide a wrapper arch_ioreq_complete_mmio() to be used in
> common and Arm code.
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common
  2021-01-12 21:52 ` [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common Oleksandr Tyshchenko
@ 2021-01-15 14:55   ` Alex Bennée
  2021-01-15 15:23   ` Julien Grall
  2021-01-18  8:48   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Alex Bennée @ 2021-01-15 14:55 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	Paul Durrant, Jun Nakajima, Kevin Tian, Tim Deegan, Julien Grall,
	xen-devel


Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> As a lot of x86 code can be re-used on Arm later on, this patch
> moves previously prepared IOREQ support to the common code
> (the code movement is a verbatim copy).
>
> The "legacy" mechanism of mapping magic pages for the IOREQ servers
> remains x86 specific and not exposed to the common code.
>
> The common IOREQ feature is supposed to be built with IOREQ_SERVER
> option enabled, which is selected for x86's config HVM for now.
>
> In order to avoid having a gigantic patch here, the subsequent
> patches will update remaining bits in the common code step by step:
> - Make IOREQ related structs/materials common
> - Drop the "hvm" prefixes and infixes
> - Remove layering violation by moving corresponding fields
>   out of *arch.hvm* or abstracting away accesses to them
>
> Also include <xen/domain_page.h>, which will be needed on Arm,
> to avoid touching the common code again when introducing Arm specific bits.
>
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
>
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
>
> ***
> Please note, this patch depends on the following which is
> on review:
> https://patchwork.kernel.org/patch/11816689/
> ***

Just a note on process because I got tripped up again after applying the
series to a clean branch.

I tend to include any pre-requisite patches in my series, just to
make it easy to apply as a standalone series, even if I'm expecting the
master version of the patch to get merged before mine. It usually
disappears on the next rebase ;-)

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 144+ messages in thread

* RE: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-14 19:02         ` Andrew Cooper
  2021-01-15  9:57           ` Jan Beulich
  2021-01-15 10:43           ` Bertrand Marquis
@ 2021-01-15 15:14           ` Lengyel, Tamas
  2021-01-28 22:55             ` Dario Faggioli
  2021-01-28 18:26           ` Dario Faggioli
  3 siblings, 1 reply; 144+ messages in thread
From: Lengyel, Tamas @ 2021-01-15 15:14 UTC (permalink / raw)
  To: Cooper, Andrew, Ian Jackson, xen-devel, committers,
	Tamas K Lengyel, Michał Leszczyński

> Features:
> 
> 1) acquire_resource fixes.
> 
> Not really a new feature - entirely bugfixing a preexisting one.
> Developed by me to help 2).  Reasonably well acked, but awaiting feedback
> on v3.
> 
> 2) External Processor Trace support.
> 
> Development by Michał.  Depends on 1), and awaiting a new version being
> posted.
> 
> As far as I'm aware, both Intel and CERT have production systems deployed
> using this functionality, so it is very highly desirable to get into 4.15.

We are actively using a backported version on top of 4.14.1; having this in 4.15 would be absolutely huge. We've run over 10 billion fuzz cycles with it so far using VM forks, and it works great. Several other researchers in the community are using it as well.

> 1) HPET/PIT issue on newer Intel systems.  This has had literally tens of
> reports across the devel and users mailing lists, and prevents Xen from
> booting at all on the past two generations of Intel laptop.  I've finally got a
> repro and posted a fix to the list, but still in progress.

We've run into this on multiple systems; Andrew's patch does fix it.

> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not working,
> and very little investigation on whats going on.  Suspicion is that there
> might be two bugs, one with smt=0 on recent AMD hardware, and one
> more general "some workloads cause negative credit" and might or might
> not be specific to credit2 (debugging feedback differs - also might be 3
> underlying issue).

We've also run into intermittent Xen lockups requiring power-cycling servers. We switched back to credit1 and have had no issues since. It's hard to tell whether it was related to the scheduler or to the pile of other experimental stuff we are running with, but right now we have stable systems across the board with credit1.

> 
> All of these have had repeated bug reports.  I'd classify them as blockers,
> given the impact they're having on people.

+1

Thanks,
Tamas

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common
  2021-01-12 21:52 ` [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common Oleksandr Tyshchenko
@ 2021-01-15 15:16   ` Julien Grall
  2021-01-15 16:41   ` Jan Beulich
  2021-01-18  8:22   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:16 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> As a lot of x86 code can be re-used on Arm later on, this
> patch makes some preparation to x86/hvm/ioreq.c before moving
> to the common code. This way we will get a verbatim copy
> for the code movement in a subsequent patch.
> 
> This patch mostly introduces specific hooks to abstract arch
> specific materials, taking into account the requirement to leave
> the "legacy" mechanism of mapping magic pages for the IOREQ servers
> x86 specific and not expose it to the common code.
> 
> These hooks are named according to the more consistent new naming
> scheme right away (including dropping the "hvm" prefixes and infixes):
> - IOREQ server functions should start with "ioreq_server_"
> - IOREQ functions should start with "ioreq_"
> other functions will be renamed in subsequent patches.
> 
> Also re-order #include-s alphabetically.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
  2021-01-12 21:52 ` [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving Oleksandr Tyshchenko
@ 2021-01-15 15:17   ` Julien Grall
  2021-01-18  8:24   ` Paul Durrant
  1 sibling, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:17 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch continues to make some preparation to x86/hvm/ioreq.c
> before moving to the common code.
> 
> Add IOREQ_STATUS_* #define-s and update candidates for moving
> since X86EMUL_* shouldn't be exposed to the common code in
> that form.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
  2021-01-12 21:52 ` [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio() Oleksandr Tyshchenko
  2021-01-15 14:48   ` Alex Bennée
@ 2021-01-15 15:19   ` Julien Grall
  2021-01-18  8:29   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:19 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is about to be a common feature and Arm will have its own
> implementation.
> 
> But the name of the function is pretty generic and can be confusing
> on Arm (we already have a try_handle_mmio()).
> 
> In order not to rename the function (which is used for a varying
> set of purposes on x86) globally, and to get a non-confusing variant
> on Arm, provide a wrapper arch_ioreq_complete_mmio() to be used in
> common and Arm code.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common
  2021-01-12 21:52 ` [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common Oleksandr Tyshchenko
  2021-01-15 14:55   ` Alex Bennée
@ 2021-01-15 15:23   ` Julien Grall
  2021-01-18  8:48   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:23 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel, Paul Durrant
  Cc: Oleksandr Tyshchenko, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Jun Nakajima, Kevin Tian, Tim Deegan, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> As a lot of x86 code can be re-used on Arm later on, this patch
> moves previously prepared IOREQ support to the common code
> (the code movement is a verbatim copy).
> 
> The "legacy" mechanism of mapping magic pages for the IOREQ servers
> remains x86 specific and not exposed to the common code.
> 
> The common IOREQ feature is supposed to be built with IOREQ_SERVER
> option enabled, which is selected for x86's config HVM for now.
> 
> In order to avoid having a gigantic patch here, the subsequent
> patches will update remaining bits in the common code step by step:
> - Make IOREQ related structs/materials common
> - Drop the "hvm" prefixes and infixes
> - Remove layering violation by moving corresponding fields
>    out of *arch.hvm* or abstracting away accesses to them
> 
> Also include <xen/domain_page.h>, which will be needed on Arm,
> to avoid touching the common code again when introducing Arm specific bits.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> ***
> Please note, this patch depends on the following which is
> on review:
> https://patchwork.kernel.org/patch/11816689/
> ***

The effort was paused because we found a security issue around that code 
(see XSA-348). @Paul do you plan to revive it for 4.15?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
  2021-01-12 21:52 ` [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common Oleksandr Tyshchenko
@ 2021-01-15 15:25   ` Julien Grall
  2021-01-20  8:48   ` Alex Bennée
  1 sibling, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:25 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and this helper will be used
> on Arm as is. Move it to xen/ioreq.h and remove "hvm" prefix.
> 
> Although PIO handling on Arm is not introduced with the current series
> (it will be implemented when we add support for vPCI), technically
> the PIOs exist on Arm (however they are accessed the same way as MMIO)
> and it would be better not to diverge now.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Paul Durrant <paul@xen.org>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
  2021-01-12 21:52 ` [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common Oleksandr Tyshchenko
@ 2021-01-15 15:34   ` Julien Grall
  2021-01-20  8:57   ` Alex Bennée
  2021-01-20 16:15   ` Jan Beulich
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:34 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and these helpers will be used
> on Arm as is. Move them to xen/ioreq.h and replace "hvm" prefixes
> with "ioreq".
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Paul Durrant <paul@xen.org>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
  2021-01-12 21:52 ` [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common Oleksandr Tyshchenko
@ 2021-01-15 15:36   ` Julien Grall
  2021-01-18  8:59   ` Paul Durrant
  2021-01-20  8:58   ` Alex Bennée
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:36 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Stefano Stabellini, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and these structs will be used
> on Arm as is. Move them to xen/ioreq.h and remove "hvm" prefixes.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain
  2021-01-12 21:52 ` [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain Oleksandr Tyshchenko
@ 2021-01-15 15:44   ` Julien Grall
  2021-01-18  9:09   ` Paul Durrant
  2021-01-20  9:00   ` Alex Bennée
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 15:44 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper, George Dunlap,
	Ian Jackson, Jan Beulich, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and this struct will be used
> on Arm as is. Move it to common struct domain. This also
> significantly reduces the layering violation in the common code
> (*arch.hvm* usage).
> 
> We don't move ioreq_gfn since it is not used in the common code
> (the "legacy" mechanism is x86 specific).
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common
  2021-01-12 21:52 ` [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common Oleksandr Tyshchenko
  2021-01-15 15:16   ` Julien Grall
@ 2021-01-15 16:41   ` Jan Beulich
  2021-01-16  9:48     ` Oleksandr
  2021-01-18  8:22   ` Paul Durrant
  2 siblings, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-15 16:41 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall,
	xen-devel

On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
> @@ -1080,6 +1104,27 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>      return rc;
>  }
>  
> +/* Called with ioreq_server lock held */
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct hvm_ioreq_server *s,
> +                                   uint32_t flags)
> +{
> +    return p2m_set_ioreq_server(d, flags, s);
> +}
> +
> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
> +                                              struct hvm_ioreq_server *s,
> +                                              uint32_t flags)
> +{
> +    if ( flags == 0 )
> +    {
> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +        if ( read_atomic(&p2m->ioreq.entry_count) )
> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);

If I were the maintainer of this code, I'd ask that such single-use
variables, unless needed to sensibly deal with line length
restrictions, be removed.
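
I.e. something along these lines (a sketch only):

void arch_ioreq_server_map_mem_type_completed(struct domain *d,
                                              struct hvm_ioreq_server *s,
                                              uint32_t flags)
{
    /* The single-use p2m variable folded into the condition. */
    if ( flags == 0 &&
         read_atomic(&p2m_get_hostp2m(d)->ioreq.entry_count) )
        p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
}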

> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -19,6 +19,9 @@
>  #ifndef __ASM_X86_HVM_IOREQ_H__
>  #define __ASM_X86_HVM_IOREQ_H__
>  
> +#define HANDLE_BUFIOREQ(s) \
> +    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
> +
>  bool hvm_io_pending(struct vcpu *v);
>  bool handle_hvm_io_completion(struct vcpu *v);
>  bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
> @@ -55,6 +58,25 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
>  
>  void hvm_ioreq_init(struct domain *d);
>  
> +bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
> +int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct hvm_ioreq_server *s,
> +                                   uint32_t flags);
> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
> +                                              struct hvm_ioreq_server *s,
> +                                              uint32_t flags);
> +bool arch_ioreq_server_destroy_all(struct domain *d);
> +bool arch_ioreq_server_get_type_addr(const struct domain *d,
> +                                     const ioreq_t *p,
> +                                     uint8_t *type,
> +                                     uint64_t *addr);
> +void arch_ioreq_domain_init(struct domain *d);

As indicated before, I don't think these declarations should
live here. Even if a later patch moves them, I don't see
why they couldn't be put in their final resting place right
away.

Also, where possible without violating line length restrictions,
please still try to put multiple parameters on a single line,
as is done higher up in this file.
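
E.g. (a sketch):

int arch_ioreq_server_map_mem_type(struct domain *d,
                                   struct hvm_ioreq_server *s, uint32_t flags);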

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
  2021-01-12 21:52 ` [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu Oleksandr Tyshchenko
@ 2021-01-15 19:34   ` Julien Grall
  2021-01-18  9:35   ` Paul Durrant
  2021-01-20 16:24   ` Jan Beulich
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 19:34 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Stefano Stabellini,
	Jun Nakajima, Kevin Tian, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and these fields will be used
> on Arm as is. Move them to the common struct vcpu as part of a new
> struct vcpu_io and drop the duplicated "io" prefixes. Also move
> enum hvm_io_completion to xen/sched.h and remove the "hvm" prefixes.
> 
> This patch completely removes the layering violation in the common code.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
  2021-01-12 21:52 ` [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg() Oleksandr Tyshchenko
@ 2021-01-15 19:37   ` Julien Grall
  2021-01-17 11:32     ` Oleksandr
  2021-01-18 10:00   ` Paul Durrant
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-15 19:37 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Paul Durrant, Stefano Stabellini, Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The cmpxchg() in ioreq_send_buffered() operates on memory shared
> with the emulator domain (and the target domain if the legacy
> interface is used).
> 
> In order to be on the safe side we need to switch
> to guest_cmpxchg64() to prevent a domain from DoSing Xen on Arm.
> 
> As there is no plan to support the legacy interface on Arm,
> the page will be mapped in a single domain at a time,
> so we can use s->emulator in guest_cmpxchg64() safely.

I think you want to explain why you are using the 64-bit version of the helper.
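
For reference, if I remember the layout correctly, the two 32-bit ring
pointers share a single 64-bit word, so the compare-exchange has to
cover both of them at once. Roughly (a sketch, field names from memory):

/* From the buffered ioreq page, as seen by Xen. */
union bufioreq_pointers {
    struct {
        uint32_t read_pointer;
        uint32_t write_pointer;
    };
    uint64_t full;
};

/* Exchanging "full" updates both pointers in one atomic operation. */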

> 
> Thankfully the only user of the legacy interface is x86 so far
> and there is no concern regarding the atomic operations.
> 
> Please note, that the legacy interface *must* not be used on Arm
> without revisiting the code.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>     - new patch
> 
> Changes V1 -> V2:
>     - move earlier to avoid breaking arm32 compilation
>     - add an explanation to commit description and hvm_allow_set_param()
>     - pass s->emulator
> 
> Changes V2 -> V3:
>     - update patch description
> 
> Changes V3 -> V4:
>     - add Stefano's A-b
>     - drop comment from arm/hvm.c
> ---
>   xen/common/ioreq.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> index d233a49..d5f4dd3 100644
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -29,6 +29,7 @@
>   #include <xen/trace.h>
>   #include <xen/vpci.h>
>   
> +#include <asm/guest_atomics.h>
>   #include <asm/hvm/ioreq.h>
>   
>   #include <public/hvm/ioreq.h>
> @@ -1185,7 +1186,7 @@ static int ioreq_send_buffered(struct ioreq_server *s, ioreq_t *p)
>   
>           new.read_pointer = old.read_pointer - n * IOREQ_BUFFER_SLOT_NUM;
>           new.write_pointer = old.write_pointer - n * IOREQ_BUFFER_SLOT_NUM;
> -        cmpxchg(&pg->ptrs.full, old.full, new.full);
> +        guest_cmpxchg64(s->emulator, &pg->ptrs.full, old.full, new.full);
>       }
>   
>       notify_via_xen_event_channel(d, s->bufioreq_evtchn);
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-12 21:52 ` [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features Oleksandr Tyshchenko
  2021-01-15  0:55   ` Stefano Stabellini
@ 2021-01-15 20:26   ` Julien Grall
  2021-01-17 17:11     ` Oleksandr
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-15 20:26 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko



On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 18cafcd..8f55aba 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -15,6 +15,7 @@
>   #include <xen/guest_access.h>
>   #include <xen/hypercall.h>
>   #include <xen/init.h>
> +#include <xen/ioreq.h>
>   #include <xen/lib.h>
>   #include <xen/livepatch.h>
>   #include <xen/sched.h>
> @@ -696,6 +697,10 @@ int arch_domain_create(struct domain *d,
>   
>       ASSERT(config != NULL);
>   
> +#ifdef CONFIG_IOREQ_SERVER
> +    ioreq_domain_init(d);
> +#endif
> +
>       /* p2m_init relies on some value initialized by the IOMMU subsystem */
>       if ( (rc = iommu_domain_init(d, config->iommu_opts)) != 0 )
>           goto fail;
> @@ -1014,6 +1019,10 @@ int domain_relinquish_resources(struct domain *d)
>           if (ret )
>               return ret;
>   
> +#ifdef CONFIG_IOREQ_SERVER
> +        ioreq_server_destroy_all(d);
> +#endif

The placement of this call feels quite odd. Shouldn't this be moved to case 0?

> +
>       PROGRESS(xen):
>           ret = relinquish_memory(d, &d->xenpage_list);
>           if ( ret )
> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
> index ae7ef96..9814481 100644
> --- a/xen/arch/arm/io.c
> +++ b/xen/arch/arm/io.c
> @@ -16,6 +16,7 @@
>    * GNU General Public License for more details.
>    */
>   
> +#include <xen/ioreq.h>
>   #include <xen/lib.h>
>   #include <xen/spinlock.h>
>   #include <xen/sched.h>
> @@ -23,6 +24,7 @@
>   #include <asm/cpuerrata.h>
>   #include <asm/current.h>
>   #include <asm/mmio.h>
> +#include <asm/hvm/ioreq.h>

Shouldn't this have been included by "xen/ioreq.h"?

>   
>   #include "decode.h"
>   
> @@ -123,7 +125,15 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs,
>   
>       handler = find_mmio_handler(v->domain, info.gpa);
>       if ( !handler )
> -        return IO_UNHANDLED;
> +    {
> +        int rc;
> +
> +        rc = try_fwd_ioserv(regs, v, &info);
> +        if ( rc == IO_HANDLED )
> +            return handle_ioserv(regs, v);
> +
> +        return rc;
> +    }
>   
>       /* All the instructions used on emulated MMIO region should be valid */
>       if ( !dabt.valid )
> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> new file mode 100644
> index 0000000..3c4a24d
> --- /dev/null
> +++ b/xen/arch/arm/ioreq.c
> @@ -0,0 +1,213 @@
> +/*
> + * arm/ioreq.c: hardware virtual machine I/O emulation
> + *
> + * Copyright (c) 2019 Arm ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/domain.h>
> +#include <xen/ioreq.h>
> +
> +#include <asm/traps.h>
> +
> +#include <public/hvm/ioreq.h>
> +
> +enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
> +{
> +    const union hsr hsr = { .bits = regs->hsr };
> +    const struct hsr_dabt dabt = hsr.dabt;
> +    /* Code is similar to handle_read */
> +    uint8_t size = (1 << dabt.size) * 8;
> +    register_t r = v->io.req.data;
> +
> +    /* We are done with the IO */
> +    v->io.req.state = STATE_IOREQ_NONE;
> +
> +    if ( dabt.write )
> +        return IO_HANDLED;
> +
> +    /*
> +     * Sign extend if required.
> +     * Note that we expect the read handler to have zeroed the bits
> +     * outside the requested access size.
> +     */
> +    if ( dabt.sign && (r & (1UL << (size - 1))) )
> +    {
> +        /*
> +         * We are relying on register_t using the same as
> +         * an unsigned long in order to keep the 32-bit assembly
> +         * code smaller.
> +         */
> +        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
> +        r |= (~0UL) << size;
> +    }

Looking at the rest of the series, this code is going to be refactored 
in patch #19 and then hardened. It would have been better to do the 
refactoring first and then use it.

This would help a lot with the review and reduce what I would call churn
in the series.

I am OK to keep it like that for this series.

> +
> +    set_user_reg(regs, dabt.reg, r);
> +
> +    return IO_HANDLED;
> +}
> +
> +enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
> +                             struct vcpu *v, mmio_info_t *info)
> +{
> +    struct vcpu_io *vio = &v->io;
> +    ioreq_t p = {
> +        .type = IOREQ_TYPE_COPY,
> +        .addr = info->gpa,
> +        .size = 1 << info->dabt.size,
> +        .count = 1,
> +        .dir = !info->dabt.write,
> +        /*
> +         * On x86, df is used by 'rep' instruction to tell the direction
> +         * to iterate (forward or backward).
> +         * On Arm, all the accesses to MMIO region will do a single
> +         * memory access. So for now, we can safely always set to 0.
> +         */
> +        .df = 0,
> +        .data = get_user_reg(regs, info->dabt.reg),
> +        .state = STATE_IOREQ_READY,
> +    };
> +    struct ioreq_server *s = NULL;
> +    enum io_state rc;
> +
> +    switch ( vio->req.state )
> +    {
> +    case STATE_IOREQ_NONE:
> +        break;
> +
> +    case STATE_IORESP_READY:
> +        return IO_HANDLED;

With the Arm code in mind, I am a bit confused with this check. If 
vio->req.state == STATE_IORESP_READY, then it would imply that the 
previous I/O emulation was somehow not completed (from Xen PoV).

If you return IO_HANDLED here, then it means that we will take care of
the previous I/O but the current one is going to be ignored. So shouldn't we
use the default path here as well?
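
I.e. something along these lines (sketch):

switch ( vio->req.state )
{
case STATE_IOREQ_NONE:
    break;

case STATE_IORESP_READY: /* a stale response is equally unexpected */
default:
    gdprintk(XENLOG_ERR, "wrong state %u\n", vio->req.state);
    return IO_ABORT;
}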

> +
> +    default:
> +        gdprintk(XENLOG_ERR, "wrong state %u\n", vio->req.state);
> +        return IO_ABORT;
> +    }
> +
> +    s = ioreq_server_select(v->domain, &p);
> +    if ( !s )
> +        return IO_UNHANDLED;
> +
> +    if ( !info->dabt.valid )
> +        return IO_ABORT;
> +
> +    vio->req = p;
> +
> +    rc = ioreq_send(s, &p, 0);
> +    if ( rc != IO_RETRY || v->domain->is_shutting_down )
> +        vio->req.state = STATE_IOREQ_NONE;
> +    else if ( !ioreq_needs_completion(&vio->req) )
> +        rc = IO_HANDLED;
> +    else
> +        vio->completion = VIO_mmio_completion;
> +
> +    return rc;
> +}
> +
> +bool arch_ioreq_complete_mmio(void)
> +{
> +    struct vcpu *v = current;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    const union hsr hsr = { .bits = regs->hsr };
> +    paddr_t addr = v->io.req.addr;
> +
> +    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
> +    {
> +        advance_pc(regs, hsr);
> +        return true;
> +    }
> +
> +    return false;
> +}
> +
> +bool arch_vcpu_ioreq_completion(enum vio_completion completion)
> +{
> +    ASSERT_UNREACHABLE();
> +    return true;
> +}
> +
> +/*
> + * The "legacy" mechanism of mapping magic pages for the IOREQ servers
> + * is x86 specific, so the following hooks don't need to be implemented on Arm:
> + * - arch_ioreq_server_map_pages
> + * - arch_ioreq_server_unmap_pages
> + * - arch_ioreq_server_enable
> + * - arch_ioreq_server_disable
> + */
> +int arch_ioreq_server_map_pages(struct ioreq_server *s)
> +{
> +    return -EOPNOTSUPP;
> +}
> +
> +void arch_ioreq_server_unmap_pages(struct ioreq_server *s)
> +{
> +}
> +
> +void arch_ioreq_server_enable(struct ioreq_server *s)
> +{
> +}
> +
> +void arch_ioreq_server_disable(struct ioreq_server *s)
> +{
> +}
> +
> +void arch_ioreq_server_destroy(struct ioreq_server *s)
> +{
> +}
> +
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct ioreq_server *s,
> +                                   uint32_t flags)
> +{
> +    return -EOPNOTSUPP;
> +}
> +
> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
> +                                              struct ioreq_server *s,
> +                                              uint32_t flags)
> +{
> +}
> +
> +bool arch_ioreq_server_destroy_all(struct domain *d)
> +{
> +    return true;
> +}
> +
> +bool arch_ioreq_server_get_type_addr(const struct domain *d,
> +                                     const ioreq_t *p,
> +                                     uint8_t *type,
> +                                     uint64_t *addr)
> +{
> +    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
> +        return false;
> +
> +    *type = (p->type == IOREQ_TYPE_PIO) ?
> +             XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
> +    *addr = p->addr;
> +
> +    return true;
> +}
> +
> +void arch_ioreq_domain_init(struct domain *d)
> +{
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 22bd1bd..036b13f 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -21,6 +21,7 @@
>   #include <xen/hypercall.h>
>   #include <xen/init.h>
>   #include <xen/iocap.h>
> +#include <xen/ioreq.h>
>   #include <xen/irq.h>
>   #include <xen/lib.h>
>   #include <xen/mem_access.h>
> @@ -1385,6 +1386,9 @@ static arm_hypercall_t arm_hypercall_table[] = {
>   #ifdef CONFIG_HYPFS
>       HYPERCALL(hypfs_op, 5),
>   #endif
> +#ifdef CONFIG_IOREQ_SERVER
> +    HYPERCALL(dm_op, 3),
> +#endif
>   };
>   
>   #ifndef NDEBUG
> @@ -1956,6 +1960,9 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>               case IO_HANDLED:
>                   advance_pc(regs, hsr);
>                   return;
> +            case IO_RETRY:
> +                /* finish later */
> +                return;
>               case IO_UNHANDLED:
>                   /* IO unhandled, try another way to handle it. */
>                   break;
> @@ -2254,6 +2261,12 @@ static void check_for_vcpu_work(void)
>   {
>       struct vcpu *v = current;
>   
> +#ifdef CONFIG_IOREQ_SERVER
> +    local_irq_enable();
> +    vcpu_ioreq_handle_completion(v);
> +    local_irq_disable();
> +#endif
> +
>       if ( likely(!v->arch.need_flush_to_ram) )
>           return;
>   
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 6819a3b..c235e5b 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -10,6 +10,7 @@
>   #include <asm/gic.h>
>   #include <asm/vgic.h>
>   #include <asm/vpl011.h>
> +#include <public/hvm/dm_op.h>

May I ask, why do you need to include dm_op.h here?

>   #include <public/hvm/params.h>
>   
>   struct hvm_domain
> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>   
>   #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>   
> +#define has_vpci(d)    ({ (void)(d); false; })
> +
>   #endif /* __ASM_DOMAIN_H__ */
>   
>   /*
> diff --git a/xen/include/asm-arm/hvm/ioreq.h b/xen/include/asm-arm/hvm/ioreq.h
> new file mode 100644
> index 0000000..19e1247
> --- /dev/null
> +++ b/xen/include/asm-arm/hvm/ioreq.h

Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ as 
the IOREQ is now meant to be agnostic?

> @@ -0,0 +1,72 @@
> +/*
> + * hvm.h: Hardware virtual machine assist interface definitions.
> + *
> + * Copyright (c) 2016 Citrix Systems Inc.
> + * Copyright (c) 2019 Arm ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __ASM_ARM_HVM_IOREQ_H__
> +#define __ASM_ARM_HVM_IOREQ_H__
> +
> +#ifdef CONFIG_IOREQ_SERVER
> +enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v);
> +enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
> +                             struct vcpu *v, mmio_info_t *info);
> +#else
> +static inline enum io_state handle_ioserv(struct cpu_user_regs *regs,
> +                                          struct vcpu *v)
> +{
> +    return IO_UNHANDLED;
> +}
> +
> +static inline enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
> +                                           struct vcpu *v, mmio_info_t *info)
> +{
> +    return IO_UNHANDLED;
> +}
> +#endif
> +
> +bool ioreq_complete_mmio(void);
> +
> +static inline bool handle_pio(uint16_t port, unsigned int size, int dir)
> +{
> +    /*
> +     * TODO: For Arm64, the main user will be PCI. So this should be
> +     * implemented when we add support for vPCI.
> +     */
> +    ASSERT_UNREACHABLE();
> +    return true;
> +}
> +
> +static inline void msix_write_completion(struct vcpu *v)
> +{
> +}
> +
> +/* This correlation must not be altered */
> +#define IOREQ_STATUS_HANDLED     IO_HANDLED
> +#define IOREQ_STATUS_UNHANDLED   IO_UNHANDLED
> +#define IOREQ_STATUS_RETRY       IO_RETRY
> +
> +#endif /* __ASM_ARM_HVM_IOREQ_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-arm/mmio.h b/xen/include/asm-arm/mmio.h
> index 8dbfb27..7ab873c 100644
> --- a/xen/include/asm-arm/mmio.h
> +++ b/xen/include/asm-arm/mmio.h
> @@ -37,6 +37,7 @@ enum io_state
>       IO_ABORT,       /* The IO was handled by the helper and led to an abort. */
>       IO_HANDLED,     /* The IO was successfully handled by the helper. */
>       IO_UNHANDLED,   /* The IO was not handled by the helper. */
> +    IO_RETRY,       /* Retry the emulation for some reason */
>   };
>   
>   typedef int (*mmio_read_t)(struct vcpu *v, mmio_info_t *info,
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
  2021-01-12 21:52 ` [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed Oleksandr Tyshchenko
  2021-01-15  1:12   ` Stefano Stabellini
@ 2021-01-15 20:55   ` Julien Grall
  2021-01-17 20:23     ` Oleksandr
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-15 20:55 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Volodymyr Babchuk,
	Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch adds proper handling of return value of
> vcpu_ioreq_handle_completion() which involves using a loop in
> leave_hypervisor_to_guest().
> 
> The reason to use an unbounded loop here is the fact that vCPU shouldn't
> continue until the I/O has completed.
> 
> The IOREQ code is using wait_on_xen_event_channel(). Yet, this can
> still "exit" early if an event has been received. But this doesn't mean
> the I/O has completed (it can be just a spurious wake-up).

While I agree we need the loop, I don't think the reason is correct 
here. If you receive a spurious event, then the loop in wait_for_io() 
will catch it.

The only way to get out of that loop is if the I/O has been handled or 
the state in the IOREQ page is invalid.

In addition to that, handle_hvm_io_completion() will only return false
if the state is invalid or there is vCPU work to do.
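
To illustrate, the wait is roughly shaped like this (a simplified
sketch, not the exact code):

for ( ;; )
{
    unsigned int state = p->state;

    if ( state == STATE_IORESP_READY || state == STATE_IOREQ_NONE )
        break;                /* the I/O has completed */

    if ( state != STATE_IOREQ_READY && state != STATE_IOREQ_INPROCESS )
        return false;         /* invalid state: the domain gets crashed */

    /* A spurious wake-up simply brings us back around the loop. */
    wait_on_xen_event_channel(sv->ioreq_evtchn, p->state != state);
}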

> So we need
> to check if the I/O has completed and wait again if it hasn't (we will
> block the vCPU again until an event is received). This loop makes sure
> that all the vCPU works are done before we return to the guest.
> 
> The call chain below:
> check_for_vcpu_work -> vcpu_ioreq_handle_completion -> wait_for_io ->
> wait_on_xen_event_channel
> 
> The worst that can happen here is that the vCPU will never run again
> (the I/O will never complete). But, in the Xen case, if the I/O never
> completes then it most likely means that something went horribly
> wrong with the Device Emulator. And it is most likely not safe
> to continue. So letting the vCPU spin forever if the I/O never
> completes is a safer action than letting it continue and leaving
> the guest in an unclear state, and is the best we can do for now.
> 
> Please note, using this loop we will not spin forever on a pCPU,
> preventing any other vCPUs from being scheduled. At every loop
> we will call check_for_pcpu_work() that will process pending
> softirqs. In case of failure, the guest will crash and the vCPU
> will be unscheduled. In normal case, if the rescheduling is necessary
> (might be set by a timer or by a caller in check_for_vcpu_work(),
> where wait_for_io() is a preemption point) the vCPU will be rescheduled
> to give place to someone else.
> 
What you describe here is a bug that was introduced by this series. If 
you think the code requires a separate patch, then please split off 
patch #14 so the code calling vcpu_ioreq_handle_completion() happens here.

> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>


> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes V1 -> V2:
>     - new patch, changes were derived from (+ new explanation):
>       arm/ioreq: Introduce arch specific bits for IOREQ/DM features
> 
> Changes V2 -> V3:
>     - update patch description
> 
> Changes V3 -> V4:
>     - update patch description and comment in code
> ---
>   xen/arch/arm/traps.c | 38 +++++++++++++++++++++++++++++++++-----
>   1 file changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 036b13f..4a83e1e 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -2257,18 +2257,23 @@ static void check_for_pcpu_work(void)
>    * Process pending work for the vCPU. Any call should be fast or
>    * implement preemption.
>    */
> -static void check_for_vcpu_work(void)
> +static bool check_for_vcpu_work(void)
>   {
>       struct vcpu *v = current;
>   
>   #ifdef CONFIG_IOREQ_SERVER
> +    bool handled;
> +
>       local_irq_enable();
> -    vcpu_ioreq_handle_completion(v);
> +    handled = vcpu_ioreq_handle_completion(v);
>       local_irq_disable();
> +
> +    if ( !handled )
> +        return true;
>   #endif
>   
>       if ( likely(!v->arch.need_flush_to_ram) )
> -        return;
> +        return false;
>   
>       /*
>        * Give a chance for the pCPU to process work before handling the vCPU
> @@ -2279,6 +2284,8 @@ static void check_for_vcpu_work(void)
>       local_irq_enable();
>       p2m_flush_vm(v);
>       local_irq_disable();
> +
> +    return false;
>   }
>   
>   /*
> @@ -2291,8 +2298,29 @@ void leave_hypervisor_to_guest(void)
>   {
>       local_irq_disable();
>   
> -    check_for_vcpu_work();
> -    check_for_pcpu_work();
> +    /*
> +     * The reason to use an unbounded loop here is the fact that vCPU
> +     * shouldn't continue until the I/O has completed.
> +     *
> +     * The worst that can happen here is that the vCPU will never run
> +     * again (the I/O will never complete). But, in the Xen case, if the
> +     * I/O never completes then it most likely means that something went
> +     * horribly wrong with the Device Emulator. And it is most likely not
> +     * safe to continue. So letting the vCPU spin forever if the I/O never
> +     * completes is a safer action than letting it continue and leaving
> +     * the guest in an unclear state, and is the best we can do for now.
> +     *
> +     * Please note, using this loop we will not spin forever on a pCPU,
> +     * preventing any other vCPUs from being scheduled. At every loop
> +     * we will call check_for_pcpu_work() that will process pending
> +     * softirqs. In case of failure, the guest will crash and the vCPU
> +     * will be unscheduled. In normal case, if the rescheduling is necessary
> +     * (might be set by a timer or by a caller in check_for_vcpu_work(),
> +     * the vCPU will be rescheduled to give place to someone else.

TBH, I think this comment is a bit too much and sort of out of context 
because it describes the inner implementation of check_for_vcpu_work().

How about the following:

/*
  * check_for_vcpu_work() may return true if there is more work to do
  * before the vCPU can safely resume. This gives us an opportunity
  * to deschedule the vCPU if needed.
  */

> +     */
> +    do {
> +        check_for_pcpu_work();
> +    } while ( check_for_vcpu_work() );

So there are two important changes in this new implementation:
   1) Without CONFIG_IOREQ_SERVER=y, we will call check_for_pcpu_work() 
twice in a row when handling set/way.
   2) After handling the pCPU work, we will now return to the guest 
directly. Before, we gave another opportunity for Xen to schedule
different work. This means we may return to the vCPU for a very short
time and will introduce more overhead.

So I would rework the loop to write it as:

while ( check_for_vcpu_work() )
    check_for_pcpu_work();
check_for_pcpu_work();

>   
>       vgic_sync_to_lrs();
>   
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
  2021-01-12 21:52 ` [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm Oleksandr Tyshchenko
  2021-01-15  1:19   ` Stefano Stabellini
@ 2021-01-15 20:59   ` Julien Grall
  2021-01-21 13:57   ` Jan Beulich
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-15 20:59 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Volodymyr Babchuk,
	Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich, Wei Liu,
	Roger Pau Monné,
	Julien Grall

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch implements reference counting of foreign entries in
> in set_foreign_p2m_entry() on Arm. This is a mandatory action if
> we want to run emulator (IOREQ server) in other than dom0 domain,
> as we can't trust it to do the right thing if it is not running
> in dom0. So we need to grab a reference on the page to avoid it
> disappearing.
> 
> It is valid to always pass "p2m_map_foreign_rw" type to
> guest_physmap_add_entry() since the current and foreign domains
> would be always different. A case when they are equal would be
> rejected by rcu_lock_remote_domain_by_id(). Besides the similar
> comment in the code put a respective ASSERT() to catch incorrect
> usage in future.
> 
> It was tested with IOREQ feature to confirm that all the pages given
> to this function belong to a domain, so we can use the same approach
> as for XENMAPSPACE_gmfn_foreign handling in xenmem_add_to_physmap_one().
> 
> This involves adding an extra parameter for the foreign domain to
> set_foreign_p2m_entry() and a helper to indicate whether the arch
> supports the reference counting of foreign entries and the restriction
> for the hardware domain in the common code can be skipped for it.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm
  2021-01-12 21:52 ` [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm Oleksandr Tyshchenko
@ 2021-01-15 21:30   ` Julien Grall
  2021-01-17 22:22     ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-15 21:30 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Julien Grall, Ian Jackson, Wei Liu, Anthony PERARD,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> This patch creates specific device node in the Guest device-tree
> with allocated MMIO range and SPI interrupt if specific 'virtio'
> property is present in domain config.

 From my understanding, for each virtio device using the MMIO transport,
we would need to reserve an area in memory for its exclusive use.

If I were an admin, I would expect to only describe the list of virtio 
devices I want to assign to my guest and then let the toolstack figure 
out how to expose them.

So I am not quite sure how this new parameter is meant to be used. Could
you expand on it?
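
To illustrate the kind of interface I have in mind (syntax entirely
made up):

virtio_disks = [ 'backend=DomD, rw:/dev/mmcblk0p3', 'backend=DomD, ro:/dev/mmcblk1p3' ]

with the toolstack then allocating the MMIO region and interrupt for
each device behind the scenes.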

> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>     - was squashed with:
>       "[RFC PATCH V1 09/12] libxl: Handle virtio-mmio irq in more correct way"
>       "[RFC PATCH V1 11/12] libxl: Insert "dma-coherent" property into virtio-mmio device node"
>       "[RFC PATCH V1 12/12] libxl: Fix duplicate memory node in DT"
>     - move VirtIO MMIO #define-s to xen/include/public/arch-arm.h
> 
> Changes V1 -> V2:
>     - update the author of a patch
> 
> Changes V2 -> V3:
>     - no changes
> 
> Changes V3 -> V4:
>     - no changes
> ---
>   tools/libs/light/libxl_arm.c     | 58 ++++++++++++++++++++++++++++++++++++++--
>   tools/libs/light/libxl_types.idl |  1 +
>   tools/xl/xl_parse.c              |  1 +
>   xen/include/public/arch-arm.h    |  5 ++++
>   4 files changed, 63 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index 66e8a06..588ee5a 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -26,8 +26,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>   {
>       uint32_t nr_spis = 0;
>       unsigned int i;
> -    uint32_t vuart_irq;
> -    bool vuart_enabled = false;
> +    uint32_t vuart_irq, virtio_irq;
> +    bool vuart_enabled = false, virtio_enabled = false;
>   
>       /*
>        * If pl011 vuart is enabled then increment the nr_spis to allow allocation
> @@ -39,6 +39,17 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>           vuart_enabled = true;
>       }
>   
> +    /*
> +     * XXX: Handle properly virtio
> +     * A proper solution would be the toolstack to allocate the interrupts
> +     * used by each virtio backend and let the backend know which one is used
> +     */
> +    if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {
> +        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
> +        virtio_irq = GUEST_VIRTIO_MMIO_SPI;
> +        virtio_enabled = true;
> +    }
> +
>       for (i = 0; i < d_config->b_info.num_irqs; i++) {
>           uint32_t irq = d_config->b_info.irqs[i];
>           uint32_t spi;
> @@ -58,6 +69,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>               return ERROR_FAIL;
>           }
>   
> +        /* The same check as for vpl011 */
> +        if (virtio_enabled && irq == virtio_irq) {
> +            LOG(ERROR, "Physical IRQ %u conflicting with virtio SPI\n", irq);
> +            return ERROR_FAIL;
> +        }
> +
>           if (irq < 32)
>               continue;
>   
> @@ -658,6 +675,39 @@ static int make_vpl011_uart_node(libxl__gc *gc, void *fdt,
>       return 0;
>   }
>   
> +static int make_virtio_mmio_node(libxl__gc *gc, void *fdt,
> +                                 uint64_t base, uint32_t irq)
> +{
> +    int res;
> +    gic_interrupt intr;
> +    /* Placeholder for virtio@ + a 64-bit number + \0 */
> +    char buf[24];
> +
> +    snprintf(buf, sizeof(buf), "virtio@%"PRIx64, base);
> +    res = fdt_begin_node(fdt, buf);
> +    if (res) return res;
> +
> +    res = fdt_property_compat(gc, fdt, 1, "virtio,mmio");
> +    if (res) return res;
> +
> +    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +                            1, base, GUEST_VIRTIO_MMIO_SIZE);
> +    if (res) return res;
> +
> +    set_interrupt(intr, irq, 0xf, DT_IRQ_TYPE_EDGE_RISING);
> +    res = fdt_property_interrupts(gc, fdt, &intr, 1);
> +    if (res) return res;
> +
> +    res = fdt_property(fdt, "dma-coherent", NULL, 0);
> +    if (res) return res;
> +
> +    res = fdt_end_node(fdt);
> +    if (res) return res;
> +
> +    return 0;
> +
> +}
> +
>   static const struct arch_info *get_arch_info(libxl__gc *gc,
>                                                const struct xc_dom_image *dom)
>   {
> @@ -961,6 +1011,9 @@ next_resize:
>           if (info->tee == LIBXL_TEE_TYPE_OPTEE)
>               FDT( make_optee_node(gc, fdt) );
>   
> +        if (libxl_defbool_val(info->arch_arm.virtio))
> +            FDT( make_virtio_mmio_node(gc, fdt, GUEST_VIRTIO_MMIO_BASE, GUEST_VIRTIO_MMIO_SPI) ); > +
>           if (pfdt)
>               FDT( copy_partial_fdt(gc, fdt, pfdt) );
>   
> @@ -1178,6 +1231,7 @@ void libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>   {
>       /* ACPI is disabled by default */
>       libxl_defbool_setdefault(&b_info->acpi, false);
> +    libxl_defbool_setdefault(&b_info->arch_arm.virtio, false);
>   
>       if (b_info->type != LIBXL_DOMAIN_TYPE_PV)
>           return;
> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
> index 0532473..839df86 100644
> --- a/tools/libs/light/libxl_types.idl
> +++ b/tools/libs/light/libxl_types.idl
> @@ -640,6 +640,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>   
>   
>       ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
> +                               ("virtio", libxl_defbool),

Regardless of the question above, this doesn't sound very Arm specific.


I think we want to get the virtio configuration arch-agnostic because an 
admin should not need to know the arch internals to be able to assign
virtio devices.

That said, you can leave it completely unimplemented for anything other 
than Arm.

If you add new parameters in the idl, you will also want to introduce a 
define in libxl.h so an external toolstack (such as libvirt) can detect 
whether the field is supported by the installed version of libxl. See 
the other LIBXL_HAVE_*.
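
Something like the below (the exact name is hypothetical and should
match whatever the field ends up being called; this just illustrates
the pattern):

/*
 * LIBXL_HAVE_BUILDINFO_ARCH_ARM_VIRTIO indicates that
 * libxl_domain_build_info has the arch_arm.virtio field.
 */
#define LIBXL_HAVE_BUILDINFO_ARCH_ARM_VIRTIO 1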

>                                  ("vuart", libxl_vuart_type),
>                                 ])),
>       # Alternate p2m is not bound to any architecture or guest type, as it is
> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 4ebf396..2a3364b 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -2581,6 +2581,7 @@ skip_usbdev:
>       }
>   
>       xlu_cfg_get_defbool(config, "dm_restrict", &b_info->dm_restrict, 0);
> +    xlu_cfg_get_defbool(config, "virtio", &b_info->arch_arm.virtio, 0);

Regardless of the question above, any addition to the configuration file
should be documented in docs/man/xl.cfg.5.pod.in.

>   
>       if (c_info->type == LIBXL_DOMAIN_TYPE_HVM) {
>           if (!xlu_cfg_get_string (config, "vga", &buf, 0)) {
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index c365b1b..be7595f 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -464,6 +464,11 @@ typedef uint64_t xen_callback_t;
>   #define PSCI_cpu_on      2
>   #define PSCI_migrate     3
>   
> +/* VirtIO MMIO definitions */
> +#define GUEST_VIRTIO_MMIO_BASE  xen_mk_ullong(0x02000000)

You will want to define any new region with the other *_{BASE, SIZE} 
above. Note that they should be ordered from bottom to the top of the 
memory layout.

> +#define GUEST_VIRTIO_MMIO_SIZE  xen_mk_ullong(0x200)

AFAICT, the size of the virtio mmio region should be 0x100. So why is it 
0x200?

> +#define GUEST_VIRTIO_MMIO_SPI   33

This will want to be defined with the other GUEST_*_SPI above.

Most likely, you will want to reserve a range of SPIs rather than a
single one.
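
I.e. something like (values illustrative only):

#define GUEST_VIRTIO_MMIO_SPI_FIRST   33
#define GUEST_VIRTIO_MMIO_SPI_LAST    43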

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-01-12 21:52 ` [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration Oleksandr Tyshchenko
  2021-01-14 17:20   ` Ian Jackson
@ 2021-01-15 22:01   ` Julien Grall
  2021-01-18  8:32     ` Oleksandr
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-15 22:01 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Ian Jackson, Wei Liu, Anthony PERARD,
	Stefano Stabellini

Hi Oleksandr,

On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch adds basic support for configuring and assisting virtio-disk
> backend (emualator) which is intended to run out of Qemu and could be run
> in any domain.
> 
> Xenstore was chosen as a communication interface for the emulator running
> in non-toolstack domain to be able to get configuration either by reading
> Xenstore directly or by receiving command line parameters (an updated 'xl devd'
> running in the same domain would read Xenstore beforehand and call backend
> executable with the required arguments).
> 
> An example of domain configuration (two disks are assigned to the guest,
> the latter is in readonly mode):
> 
> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]
> 
> Where per-disk Xenstore entries are:
> - filename and readonly flag (configured via "vdisk" property)
> - base and irq (allocated dynamically)
> 
> Besides handling 'visible' params described in configuration file,
> patch also allocates virtio-mmio specific ones for each device and
> writes them into Xenstore. virtio-mmio params (irq and base) are
> unique per guest domain, they allocated at the domain creation time
> and passed through to the emulator. Each VirtIO device has at least
> one pair of these params.
> 
> TODO:
> 1. An extra "virtio" property could be removed.
> 2. Update documentation.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Changes RFC -> V1:
>     - no changes
> 
> Changes V1 -> V2:
>     - rebase according to the new location of libxl_virtio_disk.c
> 
> Changes V2 -> V3:
>     - no changes
> 
> Changes V3 -> V4:
>     - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT
> 
> Please note, there is a real concern about VirtIO interrupts allocation.
> [Just copy here what Stefano said in RFC thread]
> 
> So, if we end up allocating let's say 6 virtio interrupts for a domain,
> the chance of a clash with a physical interrupt of a passthrough device is real.

For the first version, I think a static approach is fine because it 
doesn't bind us to anything yet (there is no interface change). We can 
refine it on follow-ups as we figure out how virtio is going to be used 
in the field.

> 
> I am not entirely sure how to solve it, but these are a few ideas:
> - choosing virtio interrupts that are less likely to conflict (maybe > 1000)

Well, we only support 988 interrupts :). However, we will waste some 
memory in the vGIC structure (we would need to allocate memory for the 
988 interrupts) if you choose an interrupt towards the end.

> - make the virtio irq (optionally) configurable so that a user could
>    override the default irq and specify one that doesn't conflict

This is not very ideal because it makes the use of virtio quite 
unfriendly with passthrough. Note that platform device passthrough is 
already unfriendly, but I am thinking PCI :).

> - implementing support for virq != pirq (even the xl interface doesn't
>    allow to specify the virq number for passthrough devices, see "irqs")

I can't remember whether I had a reason to not support virq != pirq when
this was initially implemented. This is one possibility, but it is as 
unfriendly as the previous option.

I will add a 4th one:
    - Automatically allocate the virtio IRQ. This should be possible to 
do without too much trouble as we know in advance which IRQs will be
passed through.

My preference is the 4th one; that said, we may also want to pick either
2 or 3 to give some flexibility to an admin if they wish to get their
hands dirty.
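
For the 4th option, the allocation could look something like this
(sketch; irq_is_reserved() is a hypothetical helper checking the
passthrough IRQs in the domain configuration):

static uint32_t alloc_virtio_spi(const libxl_domain_config *d_config,
                                 uint32_t *next_spi)
{
    /* Skip any SPI already taken by a passthrough device. */
    while ( irq_is_reserved(d_config, *next_spi) )
        (*next_spi)++;

    return (*next_spi)++;
}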

> 
> Also there is one suggestion from Wei Chen regarding a parameter for domain
> config file which I haven't addressed yet.
> [Just copy here what Wei said in V2 thread]
> Can we keep use the same 'disk' parameter for virtio-disk, but add an option like
> "model=virtio-disk"?
> For example:
> disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ]
> Just like what Xen has done for x86 virtio-net.
> ---
>   tools/libs/light/Makefile                 |   1 +
>   tools/libs/light/libxl_arm.c              |  56 ++++++++++++---
>   tools/libs/light/libxl_create.c           |   1 +
>   tools/libs/light/libxl_internal.h         |   1 +
>   tools/libs/light/libxl_types.idl          |  15 ++++
>   tools/libs/light/libxl_types_internal.idl |   1 +
>   tools/libs/light/libxl_virtio_disk.c      | 109 ++++++++++++++++++++++++++++
>   tools/xl/Makefile                         |   2 +-
>   tools/xl/xl.h                             |   3 +
>   tools/xl/xl_cmdtable.c                    |  15 ++++
>   tools/xl/xl_parse.c                       | 115 ++++++++++++++++++++++++++++++
>   tools/xl/xl_virtio_disk.c                 |  46 ++++++++++++
>   12 files changed, 354 insertions(+), 11 deletions(-)
>   create mode 100644 tools/libs/light/libxl_virtio_disk.c
>   create mode 100644 tools/xl/xl_virtio_disk.c
> 
> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
> index 68f6fa3..ccc91b9 100644
> --- a/tools/libs/light/Makefile
> +++ b/tools/libs/light/Makefile
> @@ -115,6 +115,7 @@ SRCS-y += libxl_genid.c
>   SRCS-y += _libxl_types.c
>   SRCS-y += libxl_flask.c
>   SRCS-y += _libxl_types_internal.c
> +SRCS-y += libxl_virtio_disk.c
>   
>   ifeq ($(CONFIG_LIBNL),y)
>   CFLAGS_LIBXL += $(LIBNL3_CFLAGS)
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index 588ee5a..9eb3022 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -8,6 +8,12 @@
>   #include <assert.h>
>   #include <xen/device_tree_defs.h>
>   
> +#ifndef container_of
> +#define container_of(ptr, type, member) ({			\
> +        typeof( ((type *)0)->member ) *__mptr = (ptr);	\
> +        (type *)( (char *)__mptr - offsetof(type,member) );})
> +#endif
> +
>   static const char *gicv_to_string(libxl_gic_version gic_version)
>   {
>       switch (gic_version) {
> @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>           vuart_enabled = true;
>       }
>   
> -    /*
> -     * XXX: Handle properly virtio
> -     * A proper solution would be the toolstack to allocate the interrupts
> -     * used by each virtio backend and let the backend know which one is used
> -     */

Ok, so you added some code in patch #23 that is going to be mostly 
dropped here. I think you want to rethink how you do the split here.

One possible approach would be to have a patch which adds the 
infrastructure but no callers. It would contain:
   1) Allocate a space in the virtio region and an interrupt
   2) Create the bindings.

Those helpers can then be called in this patch.

>       if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {

It feels to me that this parameter is not necessary. You can easily 
infer it based whether you have a virtio disks attached or not.

> -        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
> +        uint64_t virtio_base;
> +        libxl_device_virtio_disk *virtio_disk;
> +
> +        virtio_base = GUEST_VIRTIO_MMIO_BASE;
>           virtio_irq = GUEST_VIRTIO_MMIO_SPI;

Looking at patch #23, you defined a single SPI and a region that can 
only fit one virtio device. However, here, you are going to define multiple
virtio devices.

I think you want to define the following:

  - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window
  - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may 
contain multiple devices)
  - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio
  - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio

The per-device size doesn't need to be defined in arch-arm.h. Instead, I 
would only define it internally (unless we can use a virtio.h header from
Linux?).
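
With that, the per-device allocation could be derived directly (sketch;
VIRTIO_MMIO_DEV_SIZE being the internal per-device size mentioned above):

base = GUEST_VIRTIO_MMIO_BASE + i * VIRTIO_MMIO_DEV_SIZE;
irq  = GUEST_VIRTIO_SPI_FIRST + i;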

> +
> +        if (!d_config->num_virtio_disks) {
> +            LOG(ERROR, "Virtio is enabled, but no Virtio devices present\n");
> +            return ERROR_FAIL;
> +        }
> +        virtio_disk = &d_config->virtio_disks[0];
> +
> +        for (i = 0; i < virtio_disk->num_disks; i++) {
> +            virtio_disk->disks[i].base = virtio_base;
> +            virtio_disk->disks[i].irq = virtio_irq;
> +
> +            LOG(DEBUG, "Allocate Virtio MMIO params: IRQ %u BASE 0x%"PRIx64,
> +                virtio_irq, virtio_base);
> +
> +            virtio_irq ++;

NIT: We usually don't have space before ++ or ...

> +            virtio_base += GUEST_VIRTIO_MMIO_SIZE;
> +        }
> +        virtio_irq --;

... --;

> +
> +        nr_spis += (virtio_irq - 32) + 1;
>           virtio_enabled = true;
>       }
>   

[...]

> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 2a3364b..054a0c9 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -1204,6 +1204,120 @@ out:
>       if (rc) exit(EXIT_FAILURE);
>   }
>   
> +#define MAX_VIRTIO_DISKS 4

May I ask why this is hardcoded to 4?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-01-14 17:20   ` Ian Jackson
@ 2021-01-16  9:05     ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-16  9:05 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Oleksandr Tyshchenko, Wei Liu, Anthony PERARD,
	Julien Grall, Stefano Stabellini


Hi Ian


On 14.01.21 19:20, Ian Jackson wrote:
> Oleksandr Tyshchenko writes ("[PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration"):
>> This patch adds basic support for configuring and assisting virtio-disk
>> backend (emualator) which is intended to run out of Qemu and could be run
>> in any domain.
> Thanks.  I think this is a very important feature.  But I think this
> part at least needs some work.  (That's not inappropriate for an RFC
> patch - so please don't feel you have done anything wrong.  I hope you
> will find my comments constructive.)

ok


>
>
>> An example of domain configuration (two disks are assigned to the guest,
>> the latter is in readonly mode):
>>
>> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]
> I can see why you have done it like this but I am concerned that this
> is not well-integrated with the existing disk configuration system.
>
> As a result not only is your new feature lacking support for many
> existing libxl features (block backend scripts, cdroms tagged as such,
> non-raw formats) that could otherwise be made available, but I think
> adding them later would be quite awkward.
>
> It would be better to reuse (and, if necessary, adapt) the existing
> disk parsing logic in libxl, so that the syntax for your new vdisks =
> [...] parameter is the same as for the existing disks.  Or even
> better, simply make your new kind of disk a new flag on the existing
> disk structure.
I got your point and agree. Almost the same suggestion (to reuse the
existing disk parameter rather than introduce a new one) was proposed by
Wei. This is not forgotten, but is in my TODO list to investigate (and
implement). I will come up with clarifying questions if any.


>
>> Also there is one suggestion from Wei Chen regarding a parameter for
>> domain config file which I haven't addressed yet.
>> [Just copy here what Wei said in V2 thread]
>> Can we keep use the same 'disk' parameter for virtio-disk, but add
>> an option like "model=virtio-disk"?
>> For example:
>> disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ]
>> Just like what Xen has done for x86 virtio-net.
> This is the same suggestion I make above, basically.  It would be much
> better, yes.

ok


>
>
>> Xenstore was chosen as a communication interface for the emulator
>> running in non-toolstack domain to be able to get configuration
>> either by reading Xenstore directly or by receiving command line
>> parameters (an updated 'xl devd' running in the same domain would
>> read Xenstore beforehand and call backend executable with the
>> required arguments).
> I was surprised to read this because I would expect that qemu upstream
> would be resistant to this approach.  As far as the Xen Project's
> point of view goes, I think using xenstore for this is fine, but we
> would definitely want the support in upstream qemu.
>
> Can you please explain the status of the corresponding qemu feature ?
> (Ideally, in a formal way in the commit message.)
I am afraid I don't entirely get what "the corresponding qemu feature" is.
I haven't looked at the Qemu direction yet (we don't use Qemu in our
target system), so I have no idea what should be done there (if indeed
needed) to support a standalone "out-of-Qemu" virtio backend.
Could you please clarify what support is needed in Qemu for that purpose?


>
>> Please note, there is a real concern about VirtIO interrupts allocation.
>> [Just copy here what Stefano said in RFC thread]
>>
>> So, if we end up allocating let's say 6 virtio interrupts for a
>> domain, the chance of a clash with a physical interrupt of a
>> passthrough device is real.
>>
>> I am not entirely sure how to solve it, but these are a few ideas:
>> - choosing virtio interrupts that are less likely to conflict (maybe > 1000)
>> - make the virtio irq (optionally) configurable so that a user could
>>    override the default irq and specify one that doesn't conflict
>> - implementing support for virq != pirq (even the xl interface doesn't
>>    allow to specify the virq number for passthrough devices, see "irqs")
> I think here you have chosen to make the interrupt configurable ?
>
> The implications are that someone using this with passthrough would
> have to choose non-clashing IRQs ?
Yes
>    In the non-passthrough case (ie, a
> guest with no passthrough devices), can your code choose an
> appropriate IRQ, if the user doesn't specify one ?

Yes

Personally, I think it would be good if we could come up with a way
_without_ any user involvement at all.


>
>
> I don't see any changes to the xl documentation in this patch.  That
> would be the place to explain the irq stuff, and would be needed
> anyway.  Indeed with anything substantial like your proposal, it is
> often a good idea to write (at least a sketch of) the documentation
> *first*, and then you know what you're aiming to implement.

Indeed, this ought to be documented. This is on my TODO list, will 
definitely update in the next version.


>
>
> I have some comments on the code details but I think you will probably
> want to focus on the overall approach, first:
>
>> +#ifndef container_of
>> +#define container_of(ptr, type, member) ({			\
>> +        typeof( ((type *)0)->member ) *__mptr = (ptr);	\
>> +        (type *)( (char *)__mptr - offsetof(type,member) );})
>> +#endif
> Please use the existing CONTAINER_OF which we have already.
Oh, it is present, great. I failed to find something suitable (for some
reason) when writing that code.
Will reuse.


>
>>   static const char *gicv_to_string(libxl_gic_version gic_version)
>>   {
>>       switch (gic_version) {
>> @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>           vuart_enabled = true;
>>       }
>>   
>> -    /*
>> -     * XXX: Handle properly virtio
>> -     * A proper solution would be the toolstack to allocate the interrupts
>> -     * used by each virtio backend and let the backend now which one is used
>> -     */
>>       if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {
>> -        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
>> +        uint64_t virtio_base;
>> +        libxl_device_virtio_disk *virtio_disk;
>> +
>> +        virtio_base = GUEST_VIRTIO_MMIO_BASE;
>>           virtio_irq = GUEST_VIRTIO_MMIO_SPI;
> I would like to see a review of these changes to virtio handling by
> someone who understands virtio.

Makes sense.


>
>> +static int libxl__device_virtio_disk_setdefault(libxl__gc *gc, uint32_t domid,
>> +                                                libxl_device_virtio_disk *virtio_disk,
>> +                                                bool hotplug)
>> +{
>> +    return libxl__resolve_domid(gc, virtio_disk->backend_domname,
>> +                                &virtio_disk->backend_domid);
> There are some line length problems here.

Will correct.


>
> I haven't reviewed your parsing code because I think this ought to be
> done as an option or addition to with the existing disk spec parsing.

I got it, fair enough.


>
>> diff --git a/tools/xl/xl_virtio_disk.c b/tools/xl/xl_virtio_disk.c
>> new file mode 100644
>> index 0000000..808a7da
>> --- /dev/null
>> +++ b/tools/xl/xl_virtio_disk.c
>> @@ -0,0 +1,46 @@
>> +/*
>> + * Copyright (C) 2020 EPAM Systems Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU Lesser General Public License as published
>> + * by the Free Software Foundation; version 2.1 only. with the special
>> + * exception on linking described in file LICENSE.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU Lesser General Public License for more details.
>> + */
>> +
>> +#include <stdlib.h>
>> +
>> +#include <libxl.h>
>> +#include <libxl_utils.h>
>> +#include <libxlutil.h>
>> +
>> +#include "xl.h"
>> +#include "xl_utils.h"
>> +#include "xl_parse.h"
>> +
>> +int main_virtio_diskattach(int argc, char **argv)
>> +{
>> +    return 0;
>> +}
>> +
>> +int main_virtio_disklist(int argc, char **argv)
>> +{
>> +   return 0;
>> +}
>> +
>> +int main_virtio_diskdetach(int argc, char **argv)
>> +{
>> +    return 0;
>> +}
> This seems to be a stray early test file left over in the patch ?

Will implement these bits.


>
>
> Thanks,
> Ian.

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common
  2021-01-15 16:41   ` Jan Beulich
@ 2021-01-16  9:48     ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-16  9:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall,
	xen-devel


On 15.01.21 18:41, Jan Beulich wrote:

Hi Jan

> On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
>> @@ -1080,6 +1104,27 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>>       return rc;
>>   }
>>   
>> +/* Called with ioreq_server lock held */
>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>> +                                   struct hvm_ioreq_server *s,
>> +                                   uint32_t flags)
>> +{
>> +    return p2m_set_ioreq_server(d, flags, s);
>> +}
>> +
>> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
>> +                                              struct hvm_ioreq_server *s,
>> +                                              uint32_t flags)
>> +{
>> +    if ( flags == 0 )
>> +    {
>> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +        if ( read_atomic(&p2m->ioreq.entry_count) )
>> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> If I was the maintainer of this code, I'd ask that such single
> use variables, unless needed to sensibly deal with line length
> restrictions, be removed.

ok, will remove. With that we could omit the braces and have a combined 
condition on a single line.
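
I.e. (sketch, folding the quoted code above into one condition):

if ( flags == 0 && read_atomic(&p2m_get_hostp2m(d)->ioreq.entry_count) )
    p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);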


>
>> --- a/xen/include/asm-x86/hvm/ioreq.h
>> +++ b/xen/include/asm-x86/hvm/ioreq.h
>> @@ -19,6 +19,9 @@
>>   #ifndef __ASM_X86_HVM_IOREQ_H__
>>   #define __ASM_X86_HVM_IOREQ_H__
>>   
>> +#define HANDLE_BUFIOREQ(s) \
>> +    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
>> +
>>   bool hvm_io_pending(struct vcpu *v);
>>   bool handle_hvm_io_completion(struct vcpu *v);
>>   bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
>> @@ -55,6 +58,25 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
>>   
>>   void hvm_ioreq_init(struct domain *d);
>>   
>> +bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
>> +int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
>> +void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>> +                                   struct hvm_ioreq_server *s,
>> +                                   uint32_t flags);
>> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
>> +                                              struct hvm_ioreq_server *s,
>> +                                              uint32_t flags);
>> +bool arch_ioreq_server_destroy_all(struct domain *d);
>> +bool arch_ioreq_server_get_type_addr(const struct domain *d,
>> +                                     const ioreq_t *p,
>> +                                     uint8_t *type,
>> +                                     uint64_t *addr);
>> +void arch_ioreq_domain_init(struct domain *d);
> As indicated before, I don't think these declarations should
> live here. Even if a later patch moves them I wouldn't see
> why they couldn't be put in their final resting place right
> away.

Well, will introduce common ioreq.h right away.


>
> Also where possible without violating line length restrictions
> please still try to put multiple parameters on a single line,
> as is done higher up in this file.

Got it.


>
> Jan

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
  2021-01-15 19:37   ` Julien Grall
@ 2021-01-17 11:32     ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-17 11:32 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Oleksandr Tyshchenko, Paul Durrant,
	Stefano Stabellini, Julien Grall


On 15.01.21 21:37, Julien Grall wrote:
> Hi Oleksandr,


Hi Julien


>
> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> The cmpxchg() in ioreq_send_buffered() operates on memory shared
>> with the emulator domain (and the target domain if the legacy
>> interface is used).
>>
>> In order to be on the safe side we need to switch
>> to guest_cmpxchg64() to prevent a domain to DoS Xen on Arm.
>>
>> As there is no plan to support the legacy interface on Arm,
>> we will have a page to be mapped in a single domain at the time,
>> so we can use s->emulator in guest_cmpxchg64() safely.
>
> I think you want to explain why you are using the 64-bit version of 
> helper.

The point of using the 64-bit version of the helper is to support Arm32,
since the IOREQ code uses cmpxchg() with a 64-bit value.

I will update the patch description.
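
For context, the value being cmpxchg'd is the 64-bit "full" view of the
two 32-bit ring pointers in the buffered ioreq page (roughly, from
public/hvm/ioreq.h):

union bufioreq_pointers {
    struct {
        uint32_t read_pointer;
        uint32_t write_pointer;
    };
    uint64_t full;
} ptrs;

Both pointers have to be updated atomically, hence a 64-bit
compare-exchange even on Arm32.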


>
>>
>> Thankfully the only user of the legacy interface is x86 so far
>> and there is not concern regarding the atomics operations.
>>
>> Please note, that the legacy interface *must* not be used on Arm
>> without revisiting the code.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
>> CC: Julien Grall <julien.grall@arm.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> Changes RFC -> V1:
>>     - new patch
>>
>> Changes V1 -> V2:
>>     - move earlier to avoid breaking arm32 compilation
>>     - add an explanation to commit description and hvm_allow_set_param()
>>     - pass s->emulator
>>
>> Changes V2 -> V3:
>>     - update patch description
>>
>> Changes V3 -> V4:
>>     - add Stefano's A-b
>>     - drop comment from arm/hvm.c
>> ---
>>   xen/common/ioreq.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
>> index d233a49..d5f4dd3 100644
>> --- a/xen/common/ioreq.c
>> +++ b/xen/common/ioreq.c
>> @@ -29,6 +29,7 @@
>>   #include <xen/trace.h>
>>   #include <xen/vpci.h>
>>   +#include <asm/guest_atomics.h>
>>   #include <asm/hvm/ioreq.h>
>>     #include <public/hvm/ioreq.h>
>> @@ -1185,7 +1186,7 @@ static int ioreq_send_buffered(struct 
>> ioreq_server *s, ioreq_t *p)
>>             new.read_pointer = old.read_pointer - n * 
>> IOREQ_BUFFER_SLOT_NUM;
>>           new.write_pointer = old.write_pointer - n * 
>> IOREQ_BUFFER_SLOT_NUM;
>> -        cmpxchg(&pg->ptrs.full, old.full, new.full);
>> +        guest_cmpxchg64(s->emulator, &pg->ptrs.full, old.full, 
>> new.full);
>>       }
>>         notify_via_xen_event_channel(d, s->bufioreq_evtchn);
>>
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-15  0:55   ` Stefano Stabellini
@ 2021-01-17 12:45     ` Oleksandr
  2021-01-20  0:23       ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-17 12:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Julien Grall, Julien Grall, Volodymyr Babchuk,
	Oleksandr Tyshchenko


On 15.01.21 02:55, Stefano Stabellini wrote:

Hi Stefano


> On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
>> From: Julien Grall <julien.grall@arm.com>
>>
>> This patch adds basic IOREQ/DM support on Arm. The subsequent
>> patches will improve functionality and add remaining bits.
>>
>> The IOREQ/DM features are supposed to be built with IOREQ_SERVER
>> option enabled, which is disabled by default on Arm for now.
>>
>> Please note that the "PIO handling" TODO is expected to be left
>> unaddressed in the current series. It is not a big issue for now,
>> while Xen doesn't have support for vPCI on Arm. On Arm64 PIOs are only
>> used for PCI IO BARs, and we would probably want to expose them to the
>> emulator as PIO accesses to make a DM completely arch-agnostic. So
>> "PIO handling" should be implemented when we add support for vPCI.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> Changes RFC -> V1:
>>     - was split into:
>>       - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>>       - xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
>>     - update patch description
>>     - update asm-arm/hvm/ioreq.h according to the newly introduced arch functions:
>>       - arch_hvm_destroy_ioreq_server()
>>       - arch_handle_hvm_io_completion()
>>     - update arch files to include xen/ioreq.h
>>     - remove HVMOP plumbing
>>     - rewrite the logic to properly handle the case when hvm_send_ioreq() returns IO_RETRY
>>     - add logic to properly handle the handle_hvm_io_completion() return value
>>     - rename handle_mmio() to ioreq_handle_complete_mmio()
>>     - move paging_mark_pfn_dirty() to asm-arm/paging.h
>>     - remove forward declaration for hvm_ioreq_server in asm-arm/paging.h
>>     - move try_fwd_ioserv() to ioreq.c, provide stubs if !CONFIG_IOREQ_SERVER
>>     - do not remove #ifdef CONFIG_IOREQ_SERVER in memory.c for guarding xen/ioreq.h
>>     - use gdprintk in try_fwd_ioserv(), remove unneeded prints
>>     - update list of #include-s
>>     - move has_vpci() to asm-arm/domain.h
>>     - add a comment (TODO) to unimplemented yet handle_pio()
>>     - remove hvm_mmio_first(last)_byte() and hvm_ioreq_(page/vcpu/server) structs
>>       from the arch files, they were already moved to the common code
>>     - remove set_foreign_p2m_entry() changes, they will be properly implemented
>>       in the follow-up patch
>>     - select IOREQ_SERVER for Arm instead of Arm64 in Kconfig
>>     - remove x86's realmode and other unneeded stubs from xen/ioreq.h
>>     - clarify ioreq_t p.df usage in try_fwd_ioserv()
>>     - set ioreq_t p.count to 1 in try_fwd_ioserv()
>>
>> Changes V1 -> V2:
>>     - was split into:
>>       - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>>       - xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
>>     - update the author of a patch
>>     - update patch description
>>     - move a loop in leave_hypervisor_to_guest() to a separate patch
>>     - set IOREQ_SERVER disabled by default
>>     - remove already clarified /* XXX */
>>     - replace BUG() by ASSERT_UNREACHABLE() in handle_pio()
>>     - remove default case for handling the return value of try_handle_mmio()
>>     - remove struct hvm_domain, enum hvm_io_completion, struct hvm_vcpu_io,
>>       struct hvm_vcpu from asm-arm/domain.h, these are common materials now
>>     - update everything according to the recent changes (IOREQ related function
>>       names don't contain "hvm" prefixes/infixes anymore, IOREQ related fields
>>       are part of common struct vcpu/domain now, etc)
>>
>> Changes V2 -> V3:
>>     - update patch according to the "legacy interface" being x86-specific
>>     - add dummy arch hooks
>>     - remove dummy paging_mark_pfn_dirty()
>>     - don’t include <xen/domain_page.h> in common ioreq.c
>>     - don’t include <public/hvm/ioreq.h> in arch ioreq.h
>>     - remove #define ioreq_params(d, i)
>>
>> Changes V3 -> V4:
>>     - rebase
>>     - update patch according to the renaming IO_ -> VIO_ (io_ -> vio_)
>>       and misc changes to arch hooks
>>     - update patch according to the IOREQ related dm-op handling changes
>>     - don't include <xen/ioreq.h> from arch header
>>     - make all arch hooks out-of-line
>>     - add a comment above IOREQ_STATUS_* #define-s
>> ---
>>   xen/arch/arm/Makefile           |   2 +
>>   xen/arch/arm/dm.c               | 122 +++++++++++++++++++++++
>>   xen/arch/arm/domain.c           |   9 ++
>>   xen/arch/arm/io.c               |  12 ++-
>>   xen/arch/arm/ioreq.c            | 213 ++++++++++++++++++++++++++++++++++++++++
>>   xen/arch/arm/traps.c            |  13 +++
>>   xen/include/asm-arm/domain.h    |   3 +
>>   xen/include/asm-arm/hvm/ioreq.h |  72 ++++++++++++++
>>   xen/include/asm-arm/mmio.h      |   1 +
>>   9 files changed, 446 insertions(+), 1 deletion(-)
>>   create mode 100644 xen/arch/arm/dm.c
>>   create mode 100644 xen/arch/arm/ioreq.c
>>   create mode 100644 xen/include/asm-arm/hvm/ioreq.h
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 512ffdd..16e6523 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -13,6 +13,7 @@ obj-y += cpuerrata.o
>>   obj-y += cpufeature.o
>>   obj-y += decode.o
>>   obj-y += device.o
>> +obj-$(CONFIG_IOREQ_SERVER) += dm.o
>>   obj-y += domain.o
>>   obj-y += domain_build.init.o
>>   obj-y += domctl.o
>> @@ -27,6 +28,7 @@ obj-y += guest_atomics.o
>>   obj-y += guest_walk.o
>>   obj-y += hvm.o
>>   obj-y += io.o
>> +obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
>>   obj-y += irq.o
>>   obj-y += kernel.init.o
>>   obj-$(CONFIG_LIVEPATCH) += livepatch.o
>> diff --git a/xen/arch/arm/dm.c b/xen/arch/arm/dm.c
>> new file mode 100644
>> index 0000000..e6dedf4
>> --- /dev/null
>> +++ b/xen/arch/arm/dm.c
>> @@ -0,0 +1,122 @@
>> +/*
>> + * Copyright (c) 2019 Arm ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/dm.h>
>> +#include <xen/guest_access.h>
>> +#include <xen/hypercall.h>
>> +#include <xen/ioreq.h>
>> +#include <xen/nospec.h>
>> +
>> +static int dm_op(const struct dmop_args *op_args)
>> +{
>> +    struct domain *d;
>> +    struct xen_dm_op op;
>> +    bool const_op = true;
>> +    long rc;
>> +    size_t offset;
>> +
>> +    static const uint8_t op_size[] = {
>> +        [XEN_DMOP_create_ioreq_server]              = sizeof(struct xen_dm_op_create_ioreq_server),
>> +        [XEN_DMOP_get_ioreq_server_info]            = sizeof(struct xen_dm_op_get_ioreq_server_info),
>> +        [XEN_DMOP_map_io_range_to_ioreq_server]     = sizeof(struct xen_dm_op_ioreq_server_range),
>> +        [XEN_DMOP_unmap_io_range_from_ioreq_server] = sizeof(struct xen_dm_op_ioreq_server_range),
>> +        [XEN_DMOP_set_ioreq_server_state]           = sizeof(struct xen_dm_op_set_ioreq_server_state),
>> +        [XEN_DMOP_destroy_ioreq_server]             = sizeof(struct xen_dm_op_destroy_ioreq_server),
>> +    };
>> +
>> +    rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
>> +    if ( rc )
>> +        return rc;
>> +
>> +    rc = xsm_dm_op(XSM_DM_PRIV, d);
>> +    if ( rc )
>> +        goto out;
>> +
>> +    offset = offsetof(struct xen_dm_op, u);
>> +
>> +    rc = -EFAULT;
>> +    if ( op_args->buf[0].size < offset )
>> +        goto out;
>> +
>> +    if ( copy_from_guest_offset((void *)&op, op_args->buf[0].h, 0, offset) )
>> +        goto out;
>> +
>> +    if ( op.op >= ARRAY_SIZE(op_size) )
>> +    {
>> +        rc = -EOPNOTSUPP;
>> +        goto out;
>> +    }
>> +
>> +    op.op = array_index_nospec(op.op, ARRAY_SIZE(op_size));
>> +
>> +    if ( op_args->buf[0].size < offset + op_size[op.op] )
>> +        goto out;
>> +
>> +    if ( copy_from_guest_offset((void *)&op.u, op_args->buf[0].h, offset,
>> +                                op_size[op.op]) )
>> +        goto out;
>> +
>> +    rc = -EINVAL;
>> +    if ( op.pad )
>> +        goto out;
>> +
>> +    rc = ioreq_server_dm_op(&op, d, &const_op);
>> +
>> +    if ( (!rc || rc == -ERESTART) &&
>> +         !const_op && copy_to_guest_offset(op_args->buf[0].h, offset,
>> +                                           (void *)&op.u, op_size[op.op]) )
>> +        rc = -EFAULT;
>> +
>> + out:
>> +    rcu_unlock_domain(d);
>> +
>> +    return rc;
>> +}
>> +
>> +long do_dm_op(domid_t domid,
>> +              unsigned int nr_bufs,
>> +              XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
>> +{
>> +    struct dmop_args args;
>> +    int rc;
>> +
>> +    if ( nr_bufs > ARRAY_SIZE(args.buf) )
>> +        return -E2BIG;
>> +
>> +    args.domid = domid;
>> +    args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);
>> +
>> +    if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
>> +        return -EFAULT;
>> +
>> +    rc = dm_op(&args);
>> +
>> +    if ( rc == -ERESTART )
>> +        rc = hypercall_create_continuation(__HYPERVISOR_dm_op, "iih",
>> +                                           domid, nr_bufs, bufs);
>> +
>> +    return rc;
>> +}
> I might have missed something in the discussions, but this function is
> identical to xen/arch/x86/hvm/dm.c:do_dm_op; why not make it common?
>
> Also, the previous function dm_op() is very similar to
> xen/arch/x86/hvm/dm.c:dm_op. I would prefer to make them common if
> possible. Was this already discussed?
Well, let me explain. Both dm_op() and do_dm_op() were indeed common
(the top-level dm-op handling was common) in previous versions, so Arm's
dm.c didn't contain this stuff.
The idea to make it the other way around (top-level dm-op handling
arch-specific, calling into ioreq_server_dm_op() for otherwise unhandled
ops) was discussed at [1]; besides its pros it leads to code
duplication, so Arm's dm.c has to duplicate some stuff, etc.
I was thinking about moving do_dm_op(), which is the _same_ for both
arches, to common code, but I am not sure whether that is conceptually
correct with the new "alternative" approach of handling dm-op (see the
rough sketch below the links).
Please see [2].



[1] 
https://lore.kernel.org/xen-devel/1606732298-22107-10-git-send-email-olekstysh@gmail.com/
[2] 
https://lore.kernel.org/xen-devel/1610488352-18494-10-git-send-email-olekstysh@gmail.com/
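
For illustration only, the two layouts side by side (a rough sketch of
the call flow, not code from the series):

/*
 * Previous approach: one common entry point.
 *   common/dm.c:       do_dm_op() -> dm_op() -> per-op handlers
 *
 * Approach discussed at [1]: per-arch entry point, common server ops.
 *   arch/<arch>/dm.c:  do_dm_op() -> dm_op()
 *                        -> ioreq_server_dm_op() for the IOREQ ops
 *                        -> arch-only ops handled locally
 */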

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-15 20:26   ` Julien Grall
@ 2021-01-17 17:11     ` Oleksandr
  2021-01-17 18:07       ` Julien Grall
  2021-01-18 10:44       ` Jan Beulich
  0 siblings, 2 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-17 17:11 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko


On 15.01.21 22:26, Julien Grall wrote:

Hi Julien

>
>
> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 18cafcd..8f55aba 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -15,6 +15,7 @@
>>   #include <xen/guest_access.h>
>>   #include <xen/hypercall.h>
>>   #include <xen/init.h>
>> +#include <xen/ioreq.h>
>>   #include <xen/lib.h>
>>   #include <xen/livepatch.h>
>>   #include <xen/sched.h>
>> @@ -696,6 +697,10 @@ int arch_domain_create(struct domain *d,
>>         ASSERT(config != NULL);
>>   +#ifdef CONFIG_IOREQ_SERVER
>> +    ioreq_domain_init(d);
>> +#endif
>> +
>>       /* p2m_init relies on some value initialized by the IOMMU 
>> subsystem */
>>       if ( (rc = iommu_domain_init(d, config->iommu_opts)) != 0 )
>>           goto fail;
>> @@ -1014,6 +1019,10 @@ int domain_relinquish_resources(struct domain *d)
>>           if (ret )
>>               return ret;
>>   +#ifdef CONFIG_IOREQ_SERVER
>> +        ioreq_server_destroy_all(d);
>> +#endif
>
> The placement of this call feels quite odd. Shouldn't this be moved
> into case 0?

Indeed, it is odd to call it here; I will move it.



>
>> +
>>       PROGRESS(xen):
>>           ret = relinquish_memory(d, &d->xenpage_list);
>>           if ( ret )
>> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
>> index ae7ef96..9814481 100644
>> --- a/xen/arch/arm/io.c
>> +++ b/xen/arch/arm/io.c
>> @@ -16,6 +16,7 @@
>>    * GNU General Public License for more details.
>>    */
>>   +#include <xen/ioreq.h>
>>   #include <xen/lib.h>
>>   #include <xen/spinlock.h>
>>   #include <xen/sched.h>
>> @@ -23,6 +24,7 @@
>>   #include <asm/cpuerrata.h>
>>   #include <asm/current.h>
>>   #include <asm/mmio.h>
>> +#include <asm/hvm/ioreq.h>
>
> Shouldn't this have been included by "xen/ioreq.h"?
Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But it turned
out that there was nothing inside the common header that required the
arch one to be included, and I was asked to include the arch header
where it was indeed needed (in several *.c files).


>
>
>>     #include "decode.h"
>>   @@ -123,7 +125,15 @@ enum io_state try_handle_mmio(struct 
>> cpu_user_regs *regs,
>>         handler = find_mmio_handler(v->domain, info.gpa);
>>       if ( !handler )
>> -        return IO_UNHANDLED;
>> +    {
>> +        int rc;
>> +
>> +        rc = try_fwd_ioserv(regs, v, &info);
>> +        if ( rc == IO_HANDLED )
>> +            return handle_ioserv(regs, v);
>> +
>> +        return rc;
>> +    }
>>         /* All the instructions used on emulated MMIO region should 
>> be valid */
>>       if ( !dabt.valid )
>> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
>> new file mode 100644
>> index 0000000..3c4a24d
>> --- /dev/null
>> +++ b/xen/arch/arm/ioreq.c
>> @@ -0,0 +1,213 @@
>> +/*
>> + * arm/ioreq.c: hardware virtual machine I/O emulation
>> + *
>> + * Copyright (c) 2019 Arm ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or 
>> modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but 
>> WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of 
>> MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public 
>> License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License 
>> along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/domain.h>
>> +#include <xen/ioreq.h>
>> +
>> +#include <asm/traps.h>
>> +
>> +#include <public/hvm/ioreq.h>
>> +
>> +enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu *v)
>> +{
>> +    const union hsr hsr = { .bits = regs->hsr };
>> +    const struct hsr_dabt dabt = hsr.dabt;
>> +    /* Code is similar to handle_read */
>> +    uint8_t size = (1 << dabt.size) * 8;
>> +    register_t r = v->io.req.data;
>> +
>> +    /* We are done with the IO */
>> +    v->io.req.state = STATE_IOREQ_NONE;
>> +
>> +    if ( dabt.write )
>> +        return IO_HANDLED;
>> +
>> +    /*
>> +     * Sign extend if required.
>> +     * Note that we expect the read handler to have zeroed the bits
>> +     * outside the requested access size.
>> +     */
>> +    if ( dabt.sign && (r & (1UL << (size - 1))) )
>> +    {
>> +        /*
>> +         * We are relying on register_t using the same as
>> +         * an unsigned long in order to keep the 32-bit assembly
>> +         * code smaller.
>> +         */
>> +        BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
>> +        r |= (~0UL) << size;
>> +    }
>
> Looking at the rest of the series, this code is going to be refactored 
> in patch #19 and then hardened. It would have been better to do the 
> refactoring first and then use it.
>
> That would help a lot with the review and reduce what I would call
> churn in the series.

Agree, it would be better.
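
For the record, a tiny worked example of the sign extension above (a
sketch, assuming a 32-bit unsigned long as on Arm32):

    unsigned long r = 0x80;        /* 1-byte signed read returned 0x80 */
    uint8_t size = (1 << 0) * 8;   /* dabt.size == 0 -> 8-bit access */

    if ( r & (1UL << (size - 1)) ) /* bit 7 is set -> negative value */
        r |= (~0UL) << size;       /* r == 0xffffff80, i.e. -128 */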


>
>
> I am OK to keep it like that for this series.

Thank you, this saves me some time.


>
>
>> +
>> +    set_user_reg(regs, dabt.reg, r);
>> +
>> +    return IO_HANDLED;
>> +}
>> +
>> +enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>> +                             struct vcpu *v, mmio_info_t *info)
>> +{
>> +    struct vcpu_io *vio = &v->io;
>> +    ioreq_t p = {
>> +        .type = IOREQ_TYPE_COPY,
>> +        .addr = info->gpa,
>> +        .size = 1 << info->dabt.size,
>> +        .count = 1,
>> +        .dir = !info->dabt.write,
>> +        /*
>> +         * On x86, df is used by 'rep' instruction to tell the 
>> direction
>> +         * to iterate (forward or backward).
>> +         * On Arm, all the accesses to MMIO region will do a single
>> +         * memory access. So for now, we can safely always set to 0.
>> +         */
>> +        .df = 0,
>> +        .data = get_user_reg(regs, info->dabt.reg),
>> +        .state = STATE_IOREQ_READY,
>> +    };
>> +    struct ioreq_server *s = NULL;
>> +    enum io_state rc;
>> +
>> +    switch ( vio->req.state )
>> +    {
>> +    case STATE_IOREQ_NONE:
>> +        break;
>> +
>> +    case STATE_IORESP_READY:
>> +        return IO_HANDLED;
>
> With the Arm code in mind, I am a bit confused by this check. If 
> vio->req.state == STATE_IORESP_READY, then it would imply that the 
> previous I/O emulation was somehow not completed (from Xen PoV).

Agree


>
> If you return IO_HANDLED here, then it means that we will take care of
> the previous I/O but the current one is going to be ignored. 
Which current one? As I understand it, if try_fwd_ioserv() gets called
with vio->req.state == STATE_IORESP_READY then this is a second round,
after the emulator completes the emulation (the first round was when we
returned IO_RETRY down the function and claimed that we would need a
completion), so we are still dealing with the previous I/O.
vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio() ->
try_handle_mmio() -> try_fwd_ioserv() -> handle_ioserv()
And after we return IO_HANDLED here, handle_ioserv() will be called to
complete the handling of this previous I/O emulation.
Or did I really miss something?
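
To spell out the flow I have in mind (a sketch of the call paths in this
series, not new code):

/*
 * Round 1: guest data abort, no MMIO handler found
 *   try_handle_mmio() -> try_fwd_ioserv():
 *       vio->req.state == STATE_IOREQ_NONE
 *       -> ioreq_send() returns IO_RETRY
 *       -> vio->completion = VIO_mmio_completion (wait for the emulator)
 *
 * Round 2: the emulator has filled in the response
 *   vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio()
 *     -> try_handle_mmio() -> try_fwd_ioserv():
 *         vio->req.state == STATE_IORESP_READY -> return IO_HANDLED
 *     -> handle_ioserv() writes back the data, then advance_pc()
 */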


> So shouldn't we use the default path here as well?

I am afraid I don't entirely get the suggestion.


>
>
>> +
>> +    default:
>> +        gdprintk(XENLOG_ERR, "wrong state %u\n", vio->req.state);
>> +        return IO_ABORT;
>> +    }
>> +
>> +    s = ioreq_server_select(v->domain, &p);
>> +    if ( !s )
>> +        return IO_UNHANDLED;
>> +
>> +    if ( !info->dabt.valid )
>> +        return IO_ABORT;
>> +
>> +    vio->req = p;
>> +
>> +    rc = ioreq_send(s, &p, 0);
>> +    if ( rc != IO_RETRY || v->domain->is_shutting_down )
>> +        vio->req.state = STATE_IOREQ_NONE;
>> +    else if ( !ioreq_needs_completion(&vio->req) )
>> +        rc = IO_HANDLED;
>> +    else
>> +        vio->completion = VIO_mmio_completion;
>> +
>> +    return rc;
>> +}
>> +
>> +bool arch_ioreq_complete_mmio(void)
>> +{
>> +    struct vcpu *v = current;
>> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
>> +    const union hsr hsr = { .bits = regs->hsr };
>> +    paddr_t addr = v->io.req.addr;
>> +
>> +    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
>> +    {
>> +        advance_pc(regs, hsr);
>> +        return true;
>> +    }
>> +
>> +    return false;
>> +}
>> +
>> +bool arch_vcpu_ioreq_completion(enum vio_completion completion)
>> +{
>> +    ASSERT_UNREACHABLE();
>> +    return true;
>> +}
>> +
>> +/*
>> + * The "legacy" mechanism of mapping magic pages for the IOREQ servers
>> + * is x86 specific, so the following hooks don't need to be 
>> implemented on Arm:
>> + * - arch_ioreq_server_map_pages
>> + * - arch_ioreq_server_unmap_pages
>> + * - arch_ioreq_server_enable
>> + * - arch_ioreq_server_disable
>> + */
>> +int arch_ioreq_server_map_pages(struct ioreq_server *s)
>> +{
>> +    return -EOPNOTSUPP;
>> +}
>> +
>> +void arch_ioreq_server_unmap_pages(struct ioreq_server *s)
>> +{
>> +}
>> +
>> +void arch_ioreq_server_enable(struct ioreq_server *s)
>> +{
>> +}
>> +
>> +void arch_ioreq_server_disable(struct ioreq_server *s)
>> +{
>> +}
>> +
>> +void arch_ioreq_server_destroy(struct ioreq_server *s)
>> +{
>> +}
>> +
>> +int arch_ioreq_server_map_mem_type(struct domain *d,
>> +                                   struct ioreq_server *s,
>> +                                   uint32_t flags)
>> +{
>> +    return -EOPNOTSUPP;
>> +}
>> +
>> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
>> +                                              struct ioreq_server *s,
>> +                                              uint32_t flags)
>> +{
>> +}
>> +
>> +bool arch_ioreq_server_destroy_all(struct domain *d)
>> +{
>> +    return true;
>> +}
>> +
>> +bool arch_ioreq_server_get_type_addr(const struct domain *d,
>> +                                     const ioreq_t *p,
>> +                                     uint8_t *type,
>> +                                     uint64_t *addr)
>> +{
>> +    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
>> +        return false;
>> +
>> +    *type = (p->type == IOREQ_TYPE_PIO) ?
>> +             XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
>> +    *addr = p->addr;
>> +
>> +    return true;
>> +}
>> +
>> +void arch_ioreq_domain_init(struct domain *d)
>> +{
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 22bd1bd..036b13f 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -21,6 +21,7 @@
>>   #include <xen/hypercall.h>
>>   #include <xen/init.h>
>>   #include <xen/iocap.h>
>> +#include <xen/ioreq.h>
>>   #include <xen/irq.h>
>>   #include <xen/lib.h>
>>   #include <xen/mem_access.h>
>> @@ -1385,6 +1386,9 @@ static arm_hypercall_t arm_hypercall_table[] = {
>>   #ifdef CONFIG_HYPFS
>>       HYPERCALL(hypfs_op, 5),
>>   #endif
>> +#ifdef CONFIG_IOREQ_SERVER
>> +    HYPERCALL(dm_op, 3),
>> +#endif
>>   };
>>     #ifndef NDEBUG
>> @@ -1956,6 +1960,9 @@ static void do_trap_stage2_abort_guest(struct 
>> cpu_user_regs *regs,
>>               case IO_HANDLED:
>>                   advance_pc(regs, hsr);
>>                   return;
>> +            case IO_RETRY:
>> +                /* finish later */
>> +                return;
>>               case IO_UNHANDLED:
>>                   /* IO unhandled, try another way to handle it. */
>>                   break;
>> @@ -2254,6 +2261,12 @@ static void check_for_vcpu_work(void)
>>   {
>>       struct vcpu *v = current;
>>   +#ifdef CONFIG_IOREQ_SERVER
>> +    local_irq_enable();
>> +    vcpu_ioreq_handle_completion(v);
>> +    local_irq_disable();
>> +#endif
>> +
>>       if ( likely(!v->arch.need_flush_to_ram) )
>>           return;
>>   diff --git a/xen/include/asm-arm/domain.h 
>> b/xen/include/asm-arm/domain.h
>> index 6819a3b..c235e5b 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -10,6 +10,7 @@
>>   #include <asm/gic.h>
>>   #include <asm/vgic.h>
>>   #include <asm/vpl011.h>
>> +#include <public/hvm/dm_op.h>
>
> May I ask, why do you need to include dm_op.h here?
I needed to include that header to make some bits visible
(XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here is a
really good question.
I don't remember exactly; probably I followed x86's domain.h, which also
includes it.
Trying to remove the inclusion here, I get several build failures on
Arm, which could be fixed by including that header from dm.h and
ioreq.h instead:

Shall I do it this way?


diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 6bd2d85..9de7c4e 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -10,7 +10,6 @@
  #include <asm/gic.h>
  #include <asm/vgic.h>
  #include <asm/vpl011.h>
-#include <public/hvm/dm_op.h>
  #include <public/hvm/params.h>

  struct hvm_domain
diff --git a/xen/include/xen/dm.h b/xen/include/xen/dm.h
index 2c9952d..4ce6655 100644
--- a/xen/include/xen/dm.h
+++ b/xen/include/xen/dm.h
@@ -19,6 +19,8 @@

  #include <xen/sched.h>

+#include <public/hvm/dm_op.h>
+
  struct dmop_args {
      domid_t domid;
      unsigned int nr_bufs;
diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
index dc47ec7..7b74983 100644
--- a/xen/include/xen/ioreq.h
+++ b/xen/include/xen/ioreq.h
@@ -21,6 +21,8 @@

  #include <xen/sched.h>

+#include <public/hvm/dm_op.h>
+
  struct ioreq_page {
      gfn_t gfn;
      struct page_info *page;


>
>
>>   #include <public/hvm/params.h>
>>     struct hvm_domain
>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu 
>> *v) {}
>>     #define arch_vm_assist_valid_mask(d) (1UL << 
>> VMASST_TYPE_runstate_update_flag)
>>   +#define has_vpci(d)    ({ (void)(d); false; })
>> +
>>   #endif /* __ASM_DOMAIN_H__ */
>>     /*
>> diff --git a/xen/include/asm-arm/hvm/ioreq.h 
>> b/xen/include/asm-arm/hvm/ioreq.h
>> new file mode 100644
>> index 0000000..19e1247
>> --- /dev/null
>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>
> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ as 
> the IOREQ is now meant to be agnostic?
Good question... The _common_ IOREQ code is indeed arch-agnostic. But
can the _arch_ IOREQ code be treated as really subarch-agnostic?
I think on Arm it can, and it is most likely OK to keep it in
"asm-arm/", but how would that correlate with x86's IOREQ code, which
is HVM-specific and located in the "hvm" subdir?



>
>> @@ -0,0 +1,72 @@
>> +/*
>> + * hvm.h: Hardware virtual machine assist interface definitions.
>> + *
>> + * Copyright (c) 2016 Citrix Systems Inc.
>> + * Copyright (c) 2019 Arm ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or 
>> modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but 
>> WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of 
>> MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public 
>> License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License 
>> along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef __ASM_ARM_HVM_IOREQ_H__
>> +#define __ASM_ARM_HVM_IOREQ_H__
>> +
>> +#ifdef CONFIG_IOREQ_SERVER
>> +enum io_state handle_ioserv(struct cpu_user_regs *regs, struct vcpu 
>> *v);
>> +enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>> +                             struct vcpu *v, mmio_info_t *info);
>> +#else
>> +static inline enum io_state handle_ioserv(struct cpu_user_regs *regs,
>> +                                          struct vcpu *v)
>> +{
>> +    return IO_UNHANDLED;
>> +}
>> +
>> +static inline enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>> +                                           struct vcpu *v, 
>> mmio_info_t *info)
>> +{
>> +    return IO_UNHANDLED;
>> +}
>> +#endif
>> +
>> +bool ioreq_complete_mmio(void);
>> +
>> +static inline bool handle_pio(uint16_t port, unsigned int size, int 
>> dir)
>> +{
>> +    /*
>> +     * TODO: For Arm64, the main user will be PCI. So this should be
>> +     * implemented when we add support for vPCI.
>> +     */
>> +    ASSERT_UNREACHABLE();
>> +    return true;
>> +}
>> +
>> +static inline void msix_write_completion(struct vcpu *v)
>> +{
>> +}
>> +
>> +/* This correlation must not be altered */
>> +#define IOREQ_STATUS_HANDLED     IO_HANDLED
>> +#define IOREQ_STATUS_UNHANDLED   IO_UNHANDLED
>> +#define IOREQ_STATUS_RETRY       IO_RETRY
>> +
>> +#endif /* __ASM_ARM_HVM_IOREQ_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/include/asm-arm/mmio.h b/xen/include/asm-arm/mmio.h
>> index 8dbfb27..7ab873c 100644
>> --- a/xen/include/asm-arm/mmio.h
>> +++ b/xen/include/asm-arm/mmio.h
>> @@ -37,6 +37,7 @@ enum io_state
>>       IO_ABORT,       /* The IO was handled by the helper and led to 
>> an abort. */
>>       IO_HANDLED,     /* The IO was successfully handled by the 
>> helper. */
>>       IO_UNHANDLED,   /* The IO was not handled by the helper. */
>> +    IO_RETRY,       /* Retry the emulation for some reason */
>>   };
>>     typedef int (*mmio_read_t)(struct vcpu *v, mmio_info_t *info,
>>
>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-17 17:11     ` Oleksandr
@ 2021-01-17 18:07       ` Julien Grall
  2021-01-17 18:52         ` Oleksandr
  2021-01-18 10:44       ` Jan Beulich
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-17 18:07 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko



On 17/01/2021 17:11, Oleksandr wrote:
> 
> On 15.01.21 22:26, Julien Grall wrote:
> 
> Hi Julien

Hi Oleksandr,

>>
>>> +
>>>       PROGRESS(xen):
>>>           ret = relinquish_memory(d, &d->xenpage_list);
>>>           if ( ret )
>>> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
>>> index ae7ef96..9814481 100644
>>> --- a/xen/arch/arm/io.c
>>> +++ b/xen/arch/arm/io.c
>>> @@ -16,6 +16,7 @@
>>>    * GNU General Public License for more details.
>>>    */
>>>   +#include <xen/ioreq.h>
>>>   #include <xen/lib.h>
>>>   #include <xen/spinlock.h>
>>>   #include <xen/sched.h>
>>> @@ -23,6 +24,7 @@
>>>   #include <asm/cpuerrata.h>
>>>   #include <asm/current.h>
>>>   #include <asm/mmio.h>
>>> +#include <asm/hvm/ioreq.h>
>>
>> Shouldn't this have been included by "xen/ioreq.h"?
> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But it turned
> out that there was nothing inside the common header that required the
> arch one to be included, and I was asked to include the arch header
> where it was indeed needed (in several *.c files).

Fair enough.

[...]

>>
>> If you return IO_HANDLED here, then it means that we will take care of
>> the previous I/O but the current one is going to be ignored. 
> Which current one? As I understand it, if try_fwd_ioserv() gets called
> with vio->req.state == STATE_IORESP_READY then this is a second round,
> after the emulator completes the emulation (the first round was when we
> returned IO_RETRY down the function and claimed that we would need a
> completion), so we are still dealing with the previous I/O.
> vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio() ->
> try_handle_mmio() -> try_fwd_ioserv() -> handle_ioserv()
> And after we return IO_HANDLED here, handle_ioserv() will be called to
> complete the handling of this previous I/O emulation.
> Or did I really miss something?

Hmmm... I somehow thought try_fwd_ioserv() would only be called the
first time. Do you have a branch with your code applied? That would help
to follow the different paths.

>>>   diff --git a/xen/include/asm-arm/domain.h 
>>> b/xen/include/asm-arm/domain.h
>>> index 6819a3b..c235e5b 100644
>>> --- a/xen/include/asm-arm/domain.h
>>> +++ b/xen/include/asm-arm/domain.h
>>> @@ -10,6 +10,7 @@
>>>   #include <asm/gic.h>
>>>   #include <asm/vgic.h>
>>>   #include <asm/vpl011.h>
>>> +#include <public/hvm/dm_op.h>
>>
>> May I ask, why do you need to include dm_op.h here?
> I needed to include that header to make some bits visible
> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here is a
> really good question.
> I don't remember exactly; probably I followed x86's domain.h, which
> also includes it.
> Trying to remove the inclusion here, I get several build failures on
> Arm, which could be fixed by including that header from dm.h and
> ioreq.h instead:
>
> Shall I do it this way?

If the failures are indeed because ioreq.h and dm.h use definitions from
public/hvm/dm_op.h, then yes. Can you post the errors?

[...]

>>>   #include <public/hvm/params.h>
>>>     struct hvm_domain
>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu 
>>> *v) {}
>>>     #define arch_vm_assist_valid_mask(d) (1UL << 
>>> VMASST_TYPE_runstate_update_flag)
>>>   +#define has_vpci(d)    ({ (void)(d); false; })
>>> +
>>>   #endif /* __ASM_DOMAIN_H__ */
>>>     /*
>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h 
>>> b/xen/include/asm-arm/hvm/ioreq.h
>>> new file mode 100644
>>> index 0000000..19e1247
>>> --- /dev/null
>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>
>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ as 
>> the IOREQ is now meant to be agnostic?
> Good question... The _common_ IOREQ code is indeed arch-agnostic. But
> can the _arch_ IOREQ code be treated as really subarch-agnostic?
> I think on Arm it can, and it is most likely OK to keep it in
> "asm-arm/", but how would that correlate with x86's IOREQ code, which
> is HVM-specific and located in the "hvm" subdir?

Sorry, I don't understand your answer/questions. So let me ask the
question differently: is asm-arm/hvm/ioreq.h going to be included from
common code?

If the answer is no, then I see no reason to follow x86 here.
If the answer is yes, then I am quite confused why half of the series
tried to remove "hvm" from the function names while we still include
"asm/hvm/ioreq.h".

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-17 18:07       ` Julien Grall
@ 2021-01-17 18:52         ` Oleksandr
  2021-01-18 19:17           ` Julien Grall
  2021-01-20 15:50           ` Julien Grall
  0 siblings, 2 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-17 18:52 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko

[-- Attachment #1: Type: text/plain, Size: 5311 bytes --]


On 17.01.21 20:07, Julien Grall wrote:
>
>
> On 17/01/2021 17:11, Oleksandr wrote:
>>
>> On 15.01.21 22:26, Julien Grall wrote:
>>
>> Hi Julien
>
> Hi Oleksandr,


Hi Julien



>
>>>
>>>> +
>>>>       PROGRESS(xen):
>>>>           ret = relinquish_memory(d, &d->xenpage_list);
>>>>           if ( ret )
>>>> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
>>>> index ae7ef96..9814481 100644
>>>> --- a/xen/arch/arm/io.c
>>>> +++ b/xen/arch/arm/io.c
>>>> @@ -16,6 +16,7 @@
>>>>    * GNU General Public License for more details.
>>>>    */
>>>>   +#include <xen/ioreq.h>
>>>>   #include <xen/lib.h>
>>>>   #include <xen/spinlock.h>
>>>>   #include <xen/sched.h>
>>>> @@ -23,6 +24,7 @@
>>>>   #include <asm/cpuerrata.h>
>>>>   #include <asm/current.h>
>>>>   #include <asm/mmio.h>
>>>> +#include <asm/hvm/ioreq.h>
>>>
>>> Shouldn't this have been included by "xen/ioreq.h"?
>> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But it
>> turned out that there was nothing inside the common header that
>> required the arch one to be included, and I was asked to include the
>> arch header where it was indeed needed (in several *.c files).
>
> Fair enough.
>
> [...]
>
>>>
>>> If you return IO_HANDLED here, then it means that we will take care
>>> of the previous I/O but the current one is going to be ignored. 
>> Which current one? As I understand it, if try_fwd_ioserv() gets called
>> with vio->req.state == STATE_IORESP_READY then this is a second round,
>> after the emulator completes the emulation (the first round was when
>> we returned IO_RETRY down the function and claimed that we would need
>> a completion), so we are still dealing with the previous I/O.
>> vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio() ->
>> try_handle_mmio() -> try_fwd_ioserv() -> handle_ioserv()
>> And after we return IO_HANDLED here, handle_ioserv() will be called
>> to complete the handling of this previous I/O emulation.
>> Or did I really miss something?
>
> Hmmm... I somehow thought try_fwd_ioserv() would only be called the
> first time. Do you have a branch with your code applied? That would
> help to follow the different paths.
Yes, I mentioned it in the cover letter.

Please see
https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml5
(why 5 - because I started counting from the RFC)



>
>>>>   diff --git a/xen/include/asm-arm/domain.h 
>>>> b/xen/include/asm-arm/domain.h
>>>> index 6819a3b..c235e5b 100644
>>>> --- a/xen/include/asm-arm/domain.h
>>>> +++ b/xen/include/asm-arm/domain.h
>>>> @@ -10,6 +10,7 @@
>>>>   #include <asm/gic.h>
>>>>   #include <asm/vgic.h>
>>>>   #include <asm/vpl011.h>
>>>> +#include <public/hvm/dm_op.h>
>>>
>>> May I ask, why do you need to include dm_op.h here?
>> I needed to include that header to make some bits visible
>> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here is a
>> really good question.
>> I don't remember exactly; probably I followed x86's domain.h, which
>> also includes it.
>> Trying to remove the inclusion here, I get several build failures
>> on Arm, which could be fixed by including that header from dm.h and
>> ioreq.h instead:
>>
>> Shall I do it this way?
>
> If the failures are indeed because ioreq.h and dm.h use definitions
> from public/hvm/dm_op.h, then yes. Can you post the errors?
Please see attached. Although I built for Arm32 (and the whole series),
I think the errors are valid for Arm64 also.
error1.txt - when removing #include <public/hvm/dm_op.h> from asm-arm/domain.h
error2.txt - when adding #include <public/hvm/dm_op.h> to xen/ioreq.h
error3.txt - when adding #include <public/hvm/dm_op.h> to xen/dm.h
(In error3.txt the failures are in common/ioreq.c, because xen/ioreq.h
itself uses ioservid_t and the XEN_DMOP_* definitions.)


>
>
> [...]
>
>>>>   #include <public/hvm/params.h>
>>>>     struct hvm_domain
>>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu 
>>>> *v) {}
>>>>     #define arch_vm_assist_valid_mask(d) (1UL << 
>>>> VMASST_TYPE_runstate_update_flag)
>>>>   +#define has_vpci(d)    ({ (void)(d); false; })
>>>> +
>>>>   #endif /* __ASM_DOMAIN_H__ */
>>>>     /*
>>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h 
>>>> b/xen/include/asm-arm/hvm/ioreq.h
>>>> new file mode 100644
>>>> index 0000000..19e1247
>>>> --- /dev/null
>>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>>
>>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ 
>>> as the IOREQ is now meant to be agnostic?
>> Good question... The _common_ IOREQ code is indeed arch-agnostic.
>> But can the _arch_ IOREQ code be treated as really subarch-agnostic?
>> I think on Arm it can, and it is most likely OK to keep it in
>> "asm-arm/", but how would that correlate with x86's IOREQ code,
>> which is HVM-specific and located in the "hvm" subdir?
>
> Sorry, I don't understand your answer/questions. So let me ask the
> question differently: is asm-arm/hvm/ioreq.h going to be included from
> common code?

Sorry if I was unclear.


>
> If the answer is no, then I see no reason to follow x86 here.
> If the answer is yes, then I am quite confused why half of the series
> tried to remove "hvm" from the function names while we still include
> "asm/hvm/ioreq.h".

The answer is yes. Even if we could somehow avoid including that header
from the common code, we would still have #include <public/hvm/*> and
is_hvm_domain().


>
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko


[-- Attachment #2: error3.txt --]
[-- Type: text/plain, Size: 21026 bytes --]

make xen XEN_TARGET_ARCH=arm32 CROSS_COMPILE=/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-
make -C xen install
make[1]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -f Rules.mk _install
make[2]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -C tools
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make symbols
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make[4]: 'symbols' is up to date.
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk include/xen/compile.h
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
 Xen 4.15-unstable
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
[ -e include/asm ] || ln -sf asm-arm include/asm
[ -e arch/arm/efi ] && for f in $(cd common/efi; echo *.[ch]); \
	do test -r arch/arm/efi/$f || \
	   ln -nsf ../../../common/efi/$f arch/arm/efi/; \
	done; \
	true
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C include
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include'
make[3]: Nothing to be done for 'all'.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arch/arm asm-offsets.s
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make[3]: 'asm-offsets.s' is up to date.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk include/asm-arm/asm-offsets.h
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make[3]: 'include/asm-arm/asm-offsets.h' is up to date.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arch/arm /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common built_in.o
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common'
sed "s! $PWD/! !" .ioreq.o.d >.ioreq.o.d2.tmp && mv -f .ioreq.o.d2.tmp .ioreq.o.d2
sed "s! $PWD/! !" .memory.o.d >.memory.o.d2.tmp && mv -f .memory.o.d2.tmp .memory.o.d2
sed "s! $PWD/! !" .version.o.d >.version.o.d2.tmp && mv -f .version.o.d2.tmp .version.o.d2
/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -MMD -MP -MF ./.ioreq.o.d -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/config.h -Wa,--strip-local-absolute -g -msoft-float -mcpu=cortex-a15  -I/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include -fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables -Wnested-externs '-D__OBJECT_FILE__="ioreq.o"'  -c ioreq.c -o ioreq.o
In file included from ioreq.c:23:0:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:39:28: error: ‘XEN_DMOP_IO_RANGE_PCI’ undeclared here (not in a function); did you mean ‘XEN_DOMCTL_DEV_PCI’?
 #define NR_IO_RANGE_TYPES (XEN_DMOP_IO_RANGE_PCI + 1)
                            ^
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:55:35: note: in expansion of macro ‘NR_IO_RANGE_TYPES’
     struct rangeset        *range[NR_IO_RANGE_TYPES];
                                   ^~~~~~~~~~~~~~~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:92:46: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_get_frame(struct domain *d, ioservid_t id,
                                              ^~~~~~~~~~
                                              nodeid_t
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:94:49: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_map_mem_type(struct domain *d, ioservid_t id,
                                                 ^~~~~~~~~~
                                                 nodeid_t
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:110:31: error: ‘struct xen_dm_op’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op);
                               ^~~~~~~~~
ioreq.c:484:41: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
                                         ioservid_t id)
                                         ^~~~~~~~~~
                                         nodeid_t
ioreq.c:560:30: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
                              ioservid_t id)
                              ^~~~~~~~~~
                              nodeid_t
ioreq.c:626:32: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
                                ioservid_t *id)
                                ^~~~~~~~~~
                                nodeid_t
ioreq.c:681:51: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_destroy(struct domain *d, ioservid_t id)
                                                   ^~~~~~~~~~
                                                   nodeid_t
ioreq.c:723:52: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_get_info(struct domain *d, ioservid_t id,
                                                    ^~~~~~~~~~
                                                    nodeid_t
ioreq.c:770:46: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_get_frame(struct domain *d, ioservid_t id,
                                              ^~~~~~~~~~
                                              nodeid_t
ioreq.c:821:56: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_map_io_range(struct domain *d, ioservid_t id,
                                                        ^~~~~~~~~~
                                                        nodeid_t
ioreq.c:873:58: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_unmap_io_range(struct domain *d, ioservid_t id,
                                                          ^~~~~~~~~~
                                                          nodeid_t
ioreq.c:933:49: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_map_mem_type(struct domain *d, ioservid_t id,
                                                 ^~~~~~~~~~
                                                 nodeid_t
ioreq.c:968:53: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_set_state(struct domain *d, ioservid_t id,
                                                     ^~~~~~~~~~
                                                     nodeid_t
ioreq.c: In function ‘ioreq_server_select’:
ioreq.c:1103:14: error: ‘XEN_DMOP_IO_RANGE_PORT’ undeclared (first use in this function); did you mean ‘XEN_DMOP_IO_RANGE_PCI’?
         case XEN_DMOP_IO_RANGE_PORT:
              ^~~~~~~~~~~~~~~~~~~~~~
              XEN_DMOP_IO_RANGE_PCI
ioreq.c:1103:14: note: each undeclared identifier is reported only once for each function it appears in
ioreq.c:1111:14: error: ‘XEN_DMOP_IO_RANGE_MEMORY’ undeclared (first use in this function); did you mean ‘XEN_DMOP_IO_RANGE_PORT’?
         case XEN_DMOP_IO_RANGE_MEMORY:
              ^~~~~~~~~~~~~~~~~~~~~~~~
              XEN_DMOP_IO_RANGE_PORT
ioreq.c: At top level:
ioreq.c:1313:31: error: ‘struct xen_dm_op’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
                               ^~~~~~~~~
ioreq.c:1313:5: error: conflicting types for ‘ioreq_server_dm_op’
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
     ^~~~~~~~~~~~~~~~~~
In file included from ioreq.c:23:0:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:110:5: note: previous declaration of ‘ioreq_server_dm_op’ was here
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op);
     ^~~~~~~~~~~~~~~~~~
ioreq.c: In function ‘ioreq_server_dm_op’:
ioreq.c:1317:16: error: dereferencing pointer to incomplete type ‘struct xen_dm_op’
     switch ( op->op )
                ^~
ioreq.c:1319:10: error: ‘XEN_DMOP_create_ioreq_server’ undeclared (first use in this function); did you mean ‘XENMEM_resource_ioreq_server’?
     case XEN_DMOP_create_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XENMEM_resource_ioreq_server
ioreq.c:1327:18: error: dereferencing pointer to incomplete type ‘struct xen_dm_op_create_ioreq_server’
         if ( data->pad[0] || data->pad[1] || data->pad[2] )
                  ^~
ioreq.c:1330:14: error: implicit declaration of function ‘ioreq_server_create’; did you mean ‘ioreq_server_disable’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_create(d, data->handle_bufioreq,
              ^~~~~~~~~~~~~~~~~~~
              ioreq_server_disable
ioreq.c:1330:14: error: nested extern declaration of ‘ioreq_server_create’ [-Werror=nested-externs]
ioreq.c:1335:10: error: ‘XEN_DMOP_get_ioreq_server_info’ undeclared (first use in this function); did you mean ‘XEN_DMOP_create_ioreq_server’?
     case XEN_DMOP_get_ioreq_server_info:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_create_ioreq_server
ioreq.c:1339:38: error: ‘XEN_DMOP_no_gfns’ undeclared (first use in this function)
         const uint16_t valid_flags = XEN_DMOP_no_gfns;
                                      ^~~~~~~~~~~~~~~~
ioreq.c:1344:18: error: dereferencing pointer to incomplete type ‘struct xen_dm_op_get_ioreq_server_info’
         if ( data->flags & ~valid_flags )
                  ^~
ioreq.c:1347:14: error: implicit declaration of function ‘ioreq_server_get_info’; did you mean ‘ioreq_server_deinit’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_get_info(d, data->id,
              ^~~~~~~~~~~~~~~~~~~~~
              ioreq_server_deinit
ioreq.c:1347:14: error: nested extern declaration of ‘ioreq_server_get_info’ [-Werror=nested-externs]
ioreq.c:1356:10: error: ‘XEN_DMOP_map_io_range_to_ioreq_server’ undeclared (first use in this function); did you mean ‘XEN_DMOP_create_ioreq_server’?
     case XEN_DMOP_map_io_range_to_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_create_ioreq_server
ioreq.c:1362:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_ioreq_server_range’
         if ( data->pad )
                  ^~
ioreq.c:1365:14: error: implicit declaration of function ‘ioreq_server_map_io_range’; did you mean ‘ioreq_server_alloc_pages’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_map_io_range(d, data->id, data->type,
              ^~~~~~~~~~~~~~~~~~~~~~~~~
              ioreq_server_alloc_pages
ioreq.c:1365:14: error: nested extern declaration of ‘ioreq_server_map_io_range’ [-Werror=nested-externs]
ioreq.c:1370:10: error: ‘XEN_DMOP_unmap_io_range_from_ioreq_server’ undeclared (first use in this function); did you mean ‘XEN_DMOP_map_io_range_to_ioreq_server’?
     case XEN_DMOP_unmap_io_range_from_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_map_io_range_to_ioreq_server
ioreq.c:1376:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_ioreq_server_range’
         if ( data->pad )
                  ^~
ioreq.c:1379:14: error: implicit declaration of function ‘ioreq_server_unmap_io_range’; did you mean ‘ioreq_server_alloc_pages’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_unmap_io_range(d, data->id, data->type,
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~
              ioreq_server_alloc_pages
ioreq.c:1379:14: error: nested extern declaration of ‘ioreq_server_unmap_io_range’ [-Werror=nested-externs]
ioreq.c:1384:10: error: ‘XEN_DMOP_set_ioreq_server_state’ undeclared (first use in this function); did you mean ‘XEN_DMOP_get_ioreq_server_info’?
     case XEN_DMOP_set_ioreq_server_state:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_get_ioreq_server_info
ioreq.c:1390:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_set_ioreq_server_state’
         if ( data->pad )
                  ^~
ioreq.c:1393:14: error: implicit declaration of function ‘ioreq_server_set_state’; did you mean ‘ioreq_server_select’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_set_state(d, data->id, !!data->enabled);
              ^~~~~~~~~~~~~~~~~~~~~~
              ioreq_server_select
ioreq.c:1393:14: error: nested extern declaration of ‘ioreq_server_set_state’ [-Werror=nested-externs]
ioreq.c:1397:10: error: ‘XEN_DMOP_destroy_ioreq_server’ undeclared (first use in this function); did you mean ‘XEN_DMOP_create_ioreq_server’?
     case XEN_DMOP_destroy_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_create_ioreq_server
ioreq.c:1403:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_destroy_ioreq_server’
         if ( data->pad )
                  ^~
ioreq.c:1406:14: error: implicit declaration of function ‘ioreq_server_destroy’; did you mean ‘ioreq_server_destroy_all’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_destroy(d, data->id);
              ^~~~~~~~~~~~~~~~~~~~
              ioreq_server_destroy_all
ioreq.c:1406:14: error: nested extern declaration of ‘ioreq_server_destroy’ [-Werror=nested-externs]
At top level:
ioreq.c:521:13: error: ‘ioreq_server_enable’ defined but not used [-Werror=unused-function]
 static void ioreq_server_enable(struct ioreq_server *s)
             ^~~~~~~~~~~~~~~~~~~
ioreq.c:454:12: error: ‘ioreq_server_alloc_pages’ defined but not used [-Werror=unused-function]
 static int ioreq_server_alloc_pages(struct ioreq_server *s)
            ^~~~~~~~~~~~~~~~~~~~~~~~
ioreq.c:64:29: error: ‘get_ioreq_server’ defined but not used [-Werror=unused-function]
 static struct ioreq_server *get_ioreq_server(const struct domain *d,
                             ^~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk:197: recipe for target 'ioreq.o' failed
make[4]: *** [ioreq.o] Error 1
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common'
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk:180: recipe for target '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/built_in.o' failed
make[3]: *** [/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/built_in.o] Error 2
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
Makefile:359: recipe for target '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen' failed
make[2]: *** [/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen] Error 2
make[2]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
Makefile:263: recipe for target 'install' failed
make[1]: *** [install] Error 2
make[1]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
Makefile:130: recipe for target 'install-xen' failed
make: *** [install-xen] Error 2
otyshchenko@otyshchenko:/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git$ 


[-- Attachment #3: error2.txt --]
[-- Type: text/plain, Size: 29386 bytes --]

make xen XEN_TARGET_ARCH=arm32 CROSS_COMPILE=/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-
make -C xen install
make[1]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -f Rules.mk _install
make[2]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -C tools
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make symbols
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make[4]: 'symbols' is up to date.
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk include/xen/compile.h
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
 Xen 4.15-unstable
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
[ -e include/asm ] || ln -sf asm-arm include/asm
[ -e arch/arm/efi ] && for f in $(cd common/efi; echo *.[ch]); \
	do test -r arch/arm/efi/$f || \
	   ln -nsf ../../../common/efi/$f arch/arm/efi/; \
	done; \
	true
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C include
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include'
make[3]: Nothing to be done for 'all'.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arch/arm asm-offsets.s
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make[3]: 'asm-offsets.s' is up to date.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk include/asm-arm/asm-offsets.h
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make[3]: 'include/asm-arm/asm-offsets.h' is up to date.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arch/arm /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common built_in.o
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common'
/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -MMD -MP -MF ./.ioreq.o.d -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/config.h -Wa,--strip-local-absolute -g -msoft-float -mcpu=cortex-a15  -I/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include -fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables -Wnested-externs '-D__OBJECT_FILE__="ioreq.o"'  -c ioreq.c -o ioreq.o
/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -MMD -MP -MF ./.memory.o.d -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/config.h -Wa,--strip-local-absolute -g -msoft-float -mcpu=cortex-a15  -I/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include -fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables -Wnested-externs '-D__OBJECT_FILE__="memory.o"'  -c memory.c -o memory.o
/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -MMD -MP -MF ./.version.o.d -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/config.h -Wa,--strip-local-absolute -g -msoft-float -mcpu=cortex-a15  -I/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include -fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables -Wnested-externs '-D__OBJECT_FILE__="version.o"'  -c version.c -o version.o
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C sched built_in.o
make[5]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/sched'
make[5]: 'built_in.o' is up to date.
make[5]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/sched'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C libfdt built_in.o
make[5]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/libfdt'
make[5]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/libfdt'
/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-ld    -EL  -r -o built_in.o bitmap.o config_data.o cpu.o device_tree.o domain.o event_2l.o event_channel.o event_fifo.o grant_table.o guestcopy.o hypfs.o ioreq.o irq.o kernel.o keyhandler.o lib.o memory.o multicall.o notifier.o page_alloc.o pdx.o preempt.o random.o rangeset.o radix-tree.o rcupdate.o rwlock.o shutdown.o softirq.o smp.o spinlock.o stop_machine.o string.o symbols.o tasklet.o time.o timer.o trace.o version.o virtual_region.o vm_event.o vmap.o vsprintf.o wait.o xmalloc_tlsf.o domctl.o monitor.o sysctl.o sched/built_in.o libfdt/built_in.o gunzip.init.o warning.init.o
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers built_in.o
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C char built_in.o
make[5]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers/char'
make[5]: 'built_in.o' is up to date.
make[5]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers/char'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C passthrough built_in.o
make[5]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers/passthrough'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arm built_in.o
make[6]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers/passthrough/arm'
make[6]: 'built_in.o' is up to date.
make[6]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers/passthrough/arm'
make[5]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers/passthrough'
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/drivers'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/lib built_in.o
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/lib'
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/lib'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm built_in.o
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C flask built_in.o
make[5]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm/flask'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C ss built_in.o
make[6]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm/flask/ss'
make[6]: 'built_in.o' is up to date.
make[6]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm/flask/ss'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/../tools/flask/policy/Makefile.common -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/../tools/flask/policy FLASK_BUILD_DIR=/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm/flask
make[6]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/tools/flask/policy'
make[6]: Nothing to be done for 'all'.
make[6]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/tools/flask/policy'
cmp -s /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm/flask/xenpolicy-4.15-unstable policy.bin || cp /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm/flask/xenpolicy-4.15-unstable policy.bin
make[5]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm/flask'
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xsm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm built_in.o
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arm32 built_in.o
make[5]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/arm32'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C lib built_in.o
make[6]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/arm32/lib'
make[6]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/arm32/lib'
make[5]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/arm32'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C platforms built_in.o
make[5]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/platforms'
make[5]: 'built_in.o' is up to date.
make[5]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/platforms'
/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -MMD -MP -MF ./.dm.o.d -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/config.h -Wa,--strip-local-absolute -g -msoft-float -mcpu=cortex-a15  -I/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include -fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables -Wnested-externs '-D__OBJECT_FILE__="dm.o"'  -c dm.c -o dm.o
In file included from dm.c:17:0:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/dm.h:28:26: error: array type has incomplete element type ‘struct xen_dm_op_buf’
     struct xen_dm_op_buf buf[2];
                          ^~~
In file included from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/asm/system.h:5:0,
                 from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/asm/time.h:5,
                 from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/time.h:76,
                 from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/spinlock.h:4,
                 from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/sched.h:6,
                 from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/dm.h:20,
                 from dm.c:17:
dm.c: In function ‘do_dm_op’:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/lib.h:45:36: error: expression in static assertion is not an integer
     sizeof(struct { _Static_assert(!(cond), "!(" #cond ")"); })
                                    ^
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/compiler.h:94:3: note: in expansion of macro ‘BUILD_BUG_ON_ZERO’
   BUILD_BUG_ON_ZERO(__builtin_types_compatible_p(typeof(a), typeof(&a[0])))
   ^~~~~~~~~~~~~~~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/lib.h:76:53: note: in expansion of macro ‘__must_be_array’
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]) + __must_be_array(x))
                                                     ^~~~~~~~~~~~~~~
dm.c:148:20: note: in expansion of macro ‘ARRAY_SIZE’
     if ( nr_bufs > ARRAY_SIZE(args.buf) )
                    ^~~~~~~~~~
In file included from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/sched.h:18:0,
                 from /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/dm.h:20,
                 from dm.c:17:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/lib.h:45:36: error: expression in static assertion is not an integer
     sizeof(struct { _Static_assert(!(cond), "!(" #cond ")"); })
                                    ^
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/nospec.h:54:12: note: in definition of macro ‘array_index_nospec’
     typeof(size) _s = (size);                                           \
            ^~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/compiler.h:94:3: note: in expansion of macro ‘BUILD_BUG_ON_ZERO’
   BUILD_BUG_ON_ZERO(__builtin_types_compatible_p(typeof(a), typeof(&a[0])))
   ^~~~~~~~~~~~~~~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/lib.h:76:53: note: in expansion of macro ‘__must_be_array’
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]) + __must_be_array(x))
                                                     ^~~~~~~~~~~~~~~
dm.c:152:48: note: in expansion of macro ‘ARRAY_SIZE’
     args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);
                                                ^~~~~~~~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/lib.h:45:36: error: expression in static assertion is not an integer
     sizeof(struct { _Static_assert(!(cond), "!(" #cond ")"); })
                                    ^
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/nospec.h:54:24: note: in definition of macro ‘array_index_nospec’
     typeof(size) _s = (size);                                           \
                        ^~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/compiler.h:94:3: note: in expansion of macro ‘BUILD_BUG_ON_ZERO’
   BUILD_BUG_ON_ZERO(__builtin_types_compatible_p(typeof(a), typeof(&a[0])))
   ^~~~~~~~~~~~~~~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/lib.h:76:53: note: in expansion of macro ‘__must_be_array’
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]) + __must_be_array(x))
                                                     ^~~~~~~~~~~~~~~
dm.c:152:48: note: in expansion of macro ‘ARRAY_SIZE’
     args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);
                                                ^~~~~~~~~~
In file included from dm.c:18:0:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/guest_access.h:83:32: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
     const typeof(*(ptr)) *_s = (hnd).p;                 \
                                ^
dm.c:154:10: note: in expansion of macro ‘copy_from_guest_offset’
     if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
          ^~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk:197: recipe for target 'dm.o' failed
make[4]: *** [dm.o] Error 1
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk:180: recipe for target '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/built_in.o' failed
make[3]: *** [/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm/built_in.o] Error 2
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
Makefile:359: recipe for target '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen' failed
make[2]: *** [/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen] Error 2
make[2]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
Makefile:263: recipe for target 'install' failed
make[1]: *** [install] Error 2
make[1]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
Makefile:130: recipe for target 'install-xen' failed
make: *** [install-xen] Error 2
otyshchenko@otyshchenko:/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git$ 


[-- Attachment #4: error1.txt --]
[-- Type: text/plain, Size: 21070 bytes --]

make xen XEN_TARGET_ARCH=arm32 CROSS_COMPILE=/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-
make -C xen install
make[1]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -f Rules.mk _install
make[2]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
sed "s! $PWD/! !" ..xen-syms.0.o.d >..xen-syms.0.o.d2.tmp && mv -f ..xen-syms.0.o.d2.tmp ..xen-syms.0.o.d2
sed "s! $PWD/! !" ..xen-syms.1.o.d >..xen-syms.1.o.d2.tmp && mv -f ..xen-syms.1.o.d2.tmp ..xen-syms.1.o.d2
make -C tools
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make symbols
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make[4]: 'symbols' is up to date.
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/tools'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk include/xen/compile.h
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
 Xen 4.15-unstable
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
[ -e include/asm ] || ln -sf asm-arm include/asm
[ -e arch/arm/efi ] && for f in $(cd common/efi; echo *.[ch]); \
	do test -r arch/arm/efi/$f || \
	   ln -nsf ../../../common/efi/$f arch/arm/efi/; \
	done; \
	true
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C include
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include'
make[3]: Nothing to be done for 'all'.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arch/arm asm-offsets.s
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make[3]: 'asm-offsets.s' is up to date.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk include/asm-arm/asm-offsets.h
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make[3]: 'include/asm-arm/asm-offsets.h' is up to date.
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C arch/arm /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen
make[3]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
make -f /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk -C /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common built_in.o
make[4]: Entering directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common'
sed "s! $PWD/! !" .version.o.d >.version.o.d2.tmp && mv -f .version.o.d2.tmp .version.o.d2
/home/otyshchenko/work/projects/toolchain/gcc-linaro-7.5.0-2019.12-i686_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -MMD -MP -MF ./.ioreq.o.d -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/config.h -Wa,--strip-local-absolute -g -msoft-float -mcpu=cortex-a15  -I/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include -fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables -Wnested-externs '-D__OBJECT_FILE__="ioreq.o"'  -c ioreq.c -o ioreq.o
In file included from ioreq.c:23:0:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:39:28: error: ‘XEN_DMOP_IO_RANGE_PCI’ undeclared here (not in a function); did you mean ‘XEN_DOMCTL_DEV_PCI’?
 #define NR_IO_RANGE_TYPES (XEN_DMOP_IO_RANGE_PCI + 1)
                            ^
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:55:35: note: in expansion of macro ‘NR_IO_RANGE_TYPES’
     struct rangeset        *range[NR_IO_RANGE_TYPES];
                                   ^~~~~~~~~~~~~~~~~
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:92:46: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_get_frame(struct domain *d, ioservid_t id,
                                              ^~~~~~~~~~
                                              nodeid_t
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:94:49: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_map_mem_type(struct domain *d, ioservid_t id,
                                                 ^~~~~~~~~~
                                                 nodeid_t
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:110:31: error: ‘struct xen_dm_op’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op);
                               ^~~~~~~~~
ioreq.c:484:41: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
                                         ioservid_t id)
                                         ^~~~~~~~~~
                                         nodeid_t
ioreq.c:560:30: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
                              ioservid_t id)
                              ^~~~~~~~~~
                              nodeid_t
ioreq.c:626:32: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
                                ioservid_t *id)
                                ^~~~~~~~~~
                                nodeid_t
ioreq.c:681:51: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_destroy(struct domain *d, ioservid_t id)
                                                   ^~~~~~~~~~
                                                   nodeid_t
ioreq.c:723:52: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_get_info(struct domain *d, ioservid_t id,
                                                    ^~~~~~~~~~
                                                    nodeid_t
ioreq.c:770:46: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_get_frame(struct domain *d, ioservid_t id,
                                              ^~~~~~~~~~
                                              nodeid_t
ioreq.c:821:56: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_map_io_range(struct domain *d, ioservid_t id,
                                                        ^~~~~~~~~~
                                                        nodeid_t
ioreq.c:873:58: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_unmap_io_range(struct domain *d, ioservid_t id,
                                                          ^~~~~~~~~~
                                                          nodeid_t
ioreq.c:933:49: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 int ioreq_server_map_mem_type(struct domain *d, ioservid_t id,
                                                 ^~~~~~~~~~
                                                 nodeid_t
ioreq.c:968:53: error: unknown type name ‘ioservid_t’; did you mean ‘nodeid_t’?
 static int ioreq_server_set_state(struct domain *d, ioservid_t id,
                                                     ^~~~~~~~~~
                                                     nodeid_t
ioreq.c: In function ‘ioreq_server_select’:
ioreq.c:1103:14: error: ‘XEN_DMOP_IO_RANGE_PORT’ undeclared (first use in this function); did you mean ‘XEN_DMOP_IO_RANGE_PCI’?
         case XEN_DMOP_IO_RANGE_PORT:
              ^~~~~~~~~~~~~~~~~~~~~~
              XEN_DMOP_IO_RANGE_PCI
ioreq.c:1103:14: note: each undeclared identifier is reported only once for each function it appears in
ioreq.c:1111:14: error: ‘XEN_DMOP_IO_RANGE_MEMORY’ undeclared (first use in this function); did you mean ‘XEN_DMOP_IO_RANGE_PORT’?
         case XEN_DMOP_IO_RANGE_MEMORY:
              ^~~~~~~~~~~~~~~~~~~~~~~~
              XEN_DMOP_IO_RANGE_PORT
ioreq.c: At top level:
ioreq.c:1313:31: error: ‘struct xen_dm_op’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
                               ^~~~~~~~~
ioreq.c:1313:5: error: conflicting types for ‘ioreq_server_dm_op’
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op)
     ^~~~~~~~~~~~~~~~~~
In file included from ioreq.c:23:0:
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/include/xen/ioreq.h:110:5: note: previous declaration of ‘ioreq_server_dm_op’ was here
 int ioreq_server_dm_op(struct xen_dm_op *op, struct domain *d, bool *const_op);
     ^~~~~~~~~~~~~~~~~~
ioreq.c: In function ‘ioreq_server_dm_op’:
ioreq.c:1317:16: error: dereferencing pointer to incomplete type ‘struct xen_dm_op’
     switch ( op->op )
                ^~
ioreq.c:1319:10: error: ‘XEN_DMOP_create_ioreq_server’ undeclared (first use in this function); did you mean ‘XENMEM_resource_ioreq_server’?
     case XEN_DMOP_create_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XENMEM_resource_ioreq_server
ioreq.c:1327:18: error: dereferencing pointer to incomplete type ‘struct xen_dm_op_create_ioreq_server’
         if ( data->pad[0] || data->pad[1] || data->pad[2] )
                  ^~
ioreq.c:1330:14: error: implicit declaration of function ‘ioreq_server_create’; did you mean ‘ioreq_server_disable’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_create(d, data->handle_bufioreq,
              ^~~~~~~~~~~~~~~~~~~
              ioreq_server_disable
ioreq.c:1330:14: error: nested extern declaration of ‘ioreq_server_create’ [-Werror=nested-externs]
ioreq.c:1335:10: error: ‘XEN_DMOP_get_ioreq_server_info’ undeclared (first use in this function); did you mean ‘XEN_DMOP_create_ioreq_server’?
     case XEN_DMOP_get_ioreq_server_info:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_create_ioreq_server
ioreq.c:1339:38: error: ‘XEN_DMOP_no_gfns’ undeclared (first use in this function)
         const uint16_t valid_flags = XEN_DMOP_no_gfns;
                                      ^~~~~~~~~~~~~~~~
ioreq.c:1344:18: error: dereferencing pointer to incomplete type ‘struct xen_dm_op_get_ioreq_server_info’
         if ( data->flags & ~valid_flags )
                  ^~
ioreq.c:1347:14: error: implicit declaration of function ‘ioreq_server_get_info’; did you mean ‘ioreq_server_deinit’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_get_info(d, data->id,
              ^~~~~~~~~~~~~~~~~~~~~
              ioreq_server_deinit
ioreq.c:1347:14: error: nested extern declaration of ‘ioreq_server_get_info’ [-Werror=nested-externs]
ioreq.c:1356:10: error: ‘XEN_DMOP_map_io_range_to_ioreq_server’ undeclared (first use in this function); did you mean ‘XEN_DMOP_create_ioreq_server’?
     case XEN_DMOP_map_io_range_to_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_create_ioreq_server
ioreq.c:1362:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_ioreq_server_range’
         if ( data->pad )
                  ^~
ioreq.c:1365:14: error: implicit declaration of function ‘ioreq_server_map_io_range’; did you mean ‘ioreq_server_alloc_pages’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_map_io_range(d, data->id, data->type,
              ^~~~~~~~~~~~~~~~~~~~~~~~~
              ioreq_server_alloc_pages
ioreq.c:1365:14: error: nested extern declaration of ‘ioreq_server_map_io_range’ [-Werror=nested-externs]
ioreq.c:1370:10: error: ‘XEN_DMOP_unmap_io_range_from_ioreq_server’ undeclared (first use in this function); did you mean ‘XEN_DMOP_map_io_range_to_ioreq_server’?
     case XEN_DMOP_unmap_io_range_from_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_map_io_range_to_ioreq_server
ioreq.c:1376:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_ioreq_server_range’
         if ( data->pad )
                  ^~
ioreq.c:1379:14: error: implicit declaration of function ‘ioreq_server_unmap_io_range’; did you mean ‘ioreq_server_alloc_pages’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_unmap_io_range(d, data->id, data->type,
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~
              ioreq_server_alloc_pages
ioreq.c:1379:14: error: nested extern declaration of ‘ioreq_server_unmap_io_range’ [-Werror=nested-externs]
ioreq.c:1384:10: error: ‘XEN_DMOP_set_ioreq_server_state’ undeclared (first use in this function); did you mean ‘XEN_DMOP_get_ioreq_server_info’?
     case XEN_DMOP_set_ioreq_server_state:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_get_ioreq_server_info
ioreq.c:1390:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_set_ioreq_server_state’
         if ( data->pad )
                  ^~
ioreq.c:1393:14: error: implicit declaration of function ‘ioreq_server_set_state’; did you mean ‘ioreq_server_select’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_set_state(d, data->id, !!data->enabled);
              ^~~~~~~~~~~~~~~~~~~~~~
              ioreq_server_select
ioreq.c:1393:14: error: nested extern declaration of ‘ioreq_server_set_state’ [-Werror=nested-externs]
ioreq.c:1397:10: error: ‘XEN_DMOP_destroy_ioreq_server’ undeclared (first use in this function); did you mean ‘XEN_DMOP_create_ioreq_server’?
     case XEN_DMOP_destroy_ioreq_server:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          XEN_DMOP_create_ioreq_server
ioreq.c:1403:18: error: dereferencing pointer to incomplete type ‘const struct xen_dm_op_destroy_ioreq_server’
         if ( data->pad )
                  ^~
ioreq.c:1406:14: error: implicit declaration of function ‘ioreq_server_destroy’; did you mean ‘ioreq_server_destroy_all’? [-Werror=implicit-function-declaration]
         rc = ioreq_server_destroy(d, data->id);
              ^~~~~~~~~~~~~~~~~~~~
              ioreq_server_destroy_all
ioreq.c:1406:14: error: nested extern declaration of ‘ioreq_server_destroy’ [-Werror=nested-externs]
At top level:
ioreq.c:521:13: error: ‘ioreq_server_enable’ defined but not used [-Werror=unused-function]
 static void ioreq_server_enable(struct ioreq_server *s)
             ^~~~~~~~~~~~~~~~~~~
ioreq.c:454:12: error: ‘ioreq_server_alloc_pages’ defined but not used [-Werror=unused-function]
 static int ioreq_server_alloc_pages(struct ioreq_server *s)
            ^~~~~~~~~~~~~~~~~~~~~~~~
ioreq.c:64:29: error: ‘get_ioreq_server’ defined but not used [-Werror=unused-function]
 static struct ioreq_server *get_ioreq_server(const struct domain *d,
                             ^~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk:197: recipe for target 'ioreq.o' failed
make[4]: *** [ioreq.o] Error 1
make[4]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common'
/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/Rules.mk:180: recipe for target '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/built_in.o' failed
make[3]: *** [/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/common/built_in.o] Error 2
make[3]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/arch/arm'
Makefile:359: recipe for target '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen' failed
make[2]: *** [/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen/xen] Error 2
make[2]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
Makefile:263: recipe for target 'install' failed
make[1]: *** [install] Error 2
make[1]: Leaving directory '/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git/xen'
Makefile:130: recipe for target 'install-xen' failed
make: *** [install-xen] Error 2
otyshchenko@otyshchenko:/media/b/build/build/tmp/work/x86_64-xt-linux/domd-image-weston/1.0-r0/repo/build/tmp/work/aarch64-poky-linux/xen/4.14.0+gitAUTOINC+2c6e5a8ceb-r0/git$ 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
  2021-01-15 20:55   ` Julien Grall
@ 2021-01-17 20:23     ` Oleksandr
  2021-01-18 10:57       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-17 20:23 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Volodymyr Babchuk, Julien Grall


On 15.01.21 22:55, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien



>
> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> This patch adds proper handling of the return value of
>> vcpu_ioreq_handle_completion() which involves using a loop in
>> leave_hypervisor_to_guest().
>>
>> The reason to use an unbounded loop here is the fact that the vCPU
>> shouldn't continue until the I/O has completed.
>>
>> The IOREQ code is using wait_on_xen_event_channel(). Yet, this can
>> still "exit" early if an event has been received. But this doesn't mean
>> the I/O has completed (it can be just a spurious wake-up).
>
> While I agree we need the loop, I don't think the reason is correct 
> here. If you receive a spurious event, then the loop in wait_for_io() 
> will catch it.
>
> The only way to get out of that loop is if the I/O has been handled or 
> the state in the IOREQ page is invalid.
>
> In addition to that, handle_hvm_io_completion() will only return
> false if the state is invalid or there is vCPU work to do.

Agree, will update the description.
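
To make sure we are on the same page, the way I read wait_for_io(), it
follows the classic re-check pattern sketched below (a simplified sketch
only, not the actual Xen code; read_ioreq_state(), state_is_valid() and
block_on_event_channel() are illustrative stand-ins for the real
primitives):

    /* Sketch of a spurious-wakeup-safe wait loop (illustrative names). */
    for ( ;; )
    {
        unsigned int state = read_ioreq_state(sv);     /* hypothetical */

        if ( state == STATE_IORESP_READY )
            break;                      /* the I/O has really completed */

        if ( !state_is_valid(state) )                  /* hypothetical */
        {
            /* Invalid state in the IOREQ page: give up. */
            domain_crash(current->domain);
            break;
        }

        /*
         * Block until an event arrives. A spurious event simply brings
         * us back to the top of the loop, where the state is re-checked,
         * so we cannot leave before the I/O has actually been handled.
         */
        block_on_event_channel(sv);                    /* hypothetical */
    }

So, as you say, a spurious wake-up only costs one extra iteration and
cannot make us exit this loop early.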


>
>
>> So we need
>> to check if the I/O has completed and wait again if it hasn't (we will
>> block the vCPU again until an event is received). This loop makes sure
>> that all the vCPU work is done before we return to the guest.
>>
>> The call chain below:
>> check_for_vcpu_work -> vcpu_ioreq_handle_completion -> wait_for_io ->
>> wait_on_xen_event_channel
>>
>> The worst that can happen here is that the vCPU never runs again
>> (the I/O never completes). But, in the Xen case, if the I/O never
>> completes then it most likely means that something went horribly
>> wrong with the Device Emulator, and it is most likely not safe
>> to continue. So letting the vCPU spin forever if the I/O never
>> completes is safer than letting it continue and leaving the guest
>> in an unclear state, and is the best we can do for now.
>>
>> Please note, using this loop we will not spin forever on a pCPU,
>> preventing any other vCPUs from being scheduled. At every loop
>> we will call check_for_pcpu_work() that will process pending
>> softirqs. In case of failure, the guest will crash and the vCPU
>> will be unscheduled. In the normal case, if rescheduling is necessary
>> (might be set by a timer or by a caller in check_for_vcpu_work(),
>> where wait_for_io() is a preemption point) the vCPU will be rescheduled
>> to make room for someone else.
>>
> What you describe here is a bug that was introduced by this series. If 
> you think the code requires a separate patch, then please split off 
> patch #14 so the code calling vcpu_ioreq_handle_completion() happens here.
I am afraid I don't understand which bug you are talking about. I just
tried to explain why using a loop is not bad (there wouldn't be any
impact on other vCPUs, etc) and the worst case which could happen.
Also, I don't see a reason why the code requires a separate patch
(probably, if I understood the bug, I would see a reason ...). Could you
please clarify?


>
>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
>
>> CC: Julien Grall <julien.grall@arm.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> Changes V1 -> V2:
>>     - new patch, changes were derived from (+ new explanation):
>>       arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>>
>> Changes V2 -> V3:
>>     - update patch description
>>
>> Changes V3 -> V4:
>>     - update patch description and comment in code
>> ---
>>   xen/arch/arm/traps.c | 38 +++++++++++++++++++++++++++++++++-----
>>   1 file changed, 33 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 036b13f..4a83e1e 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -2257,18 +2257,23 @@ static void check_for_pcpu_work(void)
>>    * Process pending work for the vCPU. Any call should be fast or
>>    * implement preemption.
>>    */
>> -static void check_for_vcpu_work(void)
>> +static bool check_for_vcpu_work(void)
>>   {
>>       struct vcpu *v = current;
>>     #ifdef CONFIG_IOREQ_SERVER
>> +    bool handled;
>> +
>>       local_irq_enable();
>> -    vcpu_ioreq_handle_completion(v);
>> +    handled = vcpu_ioreq_handle_completion(v);
>>       local_irq_disable();
>> +
>> +    if ( !handled )
>> +        return true;
>>   #endif
>>         if ( likely(!v->arch.need_flush_to_ram) )
>> -        return;
>> +        return false;
>>         /*
>>        * Give a chance for the pCPU to process work before handling 
>> the vCPU
>> @@ -2279,6 +2284,8 @@ static void check_for_vcpu_work(void)
>>       local_irq_enable();
>>       p2m_flush_vm(v);
>>       local_irq_disable();
>> +
>> +    return false;
>>   }
>>     /*
>> @@ -2291,8 +2298,29 @@ void leave_hypervisor_to_guest(void)
>>   {
>>       local_irq_disable();
>>   -    check_for_vcpu_work();
>> -    check_for_pcpu_work();
>> +    /*
>> +     * The reason to use an unbounded loop here is the fact that a vCPU
>> +     * shouldn't continue until the I/O has completed.
>> +     *
>> +     * The worst that can happen here is that the vCPU will never run
>> +     * again (the I/O will never complete). But, in Xen's case, if the
>> +     * I/O never completes then it most likely means that something went
>> +     * horribly wrong with the Device Emulator. And it is most likely
>> +     * not safe to continue. So letting the vCPU spin forever if the I/O
>> +     * never completes is a safer action than letting it continue and
>> +     * leaving the guest in an unclear state, and is the best we can do
>> +     * for now.
>> +     *
>> +     * Please note, using this loop we will not spin forever on a pCPU,
>> +     * preventing any other vCPUs from being scheduled. At every loop
>> +     * we will call check_for_pcpu_work() that will process pending
>> +     * softirqs. In case of failure, the guest will crash and the vCPU
>> +     * will be unscheduled. In the normal case, if rescheduling is
>> +     * necessary (might be set by a timer or by a caller in
>> +     * check_for_vcpu_work(), where wait_for_io() is a preemption point)
>> +     * the vCPU will be rescheduled to make room for someone else.
>
> TBH, I think this comment is a bit too much and sort of out of context 
> because it is describing the inner implementation of 
> check_for_vcpu_work().
>
> How about the following:
>
> /*
>  * check_for_vcpu_work() may return true if there is more work to do
>  * before the vCPU can safely resume. This gives us an opportunity
>  * to deschedule the vCPU if needed.
>  */

I am fine with that.


>
>> +     */
>> +    do {
>> +        check_for_pcpu_work();
>> +    } while ( check_for_vcpu_work() );
>
> So there are two important changes in this new implementation:
>   1) Without CONFIG_IOREQ_SERVER=y, we will call check_for_pcpu_work() 
> twice in a row when handling set/way.

hmm, yes


>
>   2) After handling the pCPU work, we will now return to the guest 
> directly. Before, we gave another opportunity for Xen to schedule
> different work. This means we may return to the vCPU for a very short
> time and this will introduce more overhead.

yes, I hadn't even imagined this could cause such a difference in behavior


>
>
> So I would rework the loop to write it as:
>
> while ( check_for_pcpu_work() )
>    check_for_pcpu_work();
> check_for_pcpu_work();

makes sense, I assume you meant while ( check_for_vcpu_work() ) ...
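
If so, the reworked tail of leave_hypervisor_to_guest() would look roughly
like this (a sketch, assuming check_for_vcpu_work() keeps the boolean
return type introduced by this patch):

    while ( check_for_vcpu_work() )
        check_for_pcpu_work();
    check_for_pcpu_work();

i.e. keep processing pCPU work while the vCPU still has outstanding work,
plus one final pass before returning to the guest.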


>
>>         vgic_sync_to_lrs();
>>
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm
  2021-01-15 21:30   ` Julien Grall
@ 2021-01-17 22:22     ` Oleksandr
  2021-01-20 16:40       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-17 22:22 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Julien Grall, Ian Jackson, Wei Liu, Anthony PERARD,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko


On 15.01.21 23:30, Julien Grall wrote:
> Hi Oleksandr,


Hi Julien


>
> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>> From: Julien Grall <julien.grall@arm.com>
>>
>> This patch creates a specific device node in the guest device-tree
>> with an allocated MMIO range and SPI interrupt if the specific 'virtio'
>> property is present in the domain config.
>
> From my understanding, for each virtio device using the MMIO transport,
> we would need to reserve an area in memory for its exclusive use.
>
> If I were an admin, I would expect to only describe the list of virtio 
> devices I want to assign to my guest and then let the toolstack figure 
> out how to expose them.

Yes, I think in the same way.


>
>
> So I am not quite sure how this new parameter can be used. Could
> you expand on it?
The original idea was to set it if we are going to assign virtio
device(s) to the guest.
To be honest, I plan to remove this extra parameter. It might not be
obvious looking at the current patch, but the next patch will show that
we can avoid introducing it at all.
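
For example, reusing the (not yet final) syntax from the last patch of
this series, the mere presence of a vdisk stanza:

    vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3' ]

would be enough for the toolstack to know that the virtio-mmio transport
needs to be set up for the guest.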


>
>
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> Changes RFC -> V1:
>>     - was squashed with:
>>       "[RFC PATCH V1 09/12] libxl: Handle virtio-mmio irq in more 
>> correct way"
>>       "[RFC PATCH V1 11/12] libxl: Insert "dma-coherent" property 
>> into virtio-mmio device node"
>>       "[RFC PATCH V1 12/12] libxl: Fix duplicate memory node in DT"
>>     - move VirtIO MMIO #define-s to xen/include/public/arch-arm.h
>>
>> Changes V1 -> V2:
>>     - update the author of a patch
>>
>> Changes V2 -> V3:
>>     - no changes
>>
>> Changes V3 -> V4:
>>     - no changes
>> ---
>>   tools/libs/light/libxl_arm.c     | 58 
>> ++++++++++++++++++++++++++++++++++++++--
>>   tools/libs/light/libxl_types.idl |  1 +
>>   tools/xl/xl_parse.c              |  1 +
>>   xen/include/public/arch-arm.h    |  5 ++++
>>   4 files changed, 63 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>> index 66e8a06..588ee5a 100644
>> --- a/tools/libs/light/libxl_arm.c
>> +++ b/tools/libs/light/libxl_arm.c
>> @@ -26,8 +26,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>   {
>>       uint32_t nr_spis = 0;
>>       unsigned int i;
>> -    uint32_t vuart_irq;
>> -    bool vuart_enabled = false;
>> +    uint32_t vuart_irq, virtio_irq;
>> +    bool vuart_enabled = false, virtio_enabled = false;
>>         /*
>>        * If pl011 vuart is enabled then increment the nr_spis to 
>> allow allocation
>> @@ -39,6 +39,17 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>           vuart_enabled = true;
>>       }
>>   +    /*
>> +     * XXX: Handle virtio properly
>> +     * A proper solution would be for the toolstack to allocate the
>> +     * interrupts used by each virtio backend and let the backend know
>> +     * which one is used
>> +     */
>> +    if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {
>> +        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
>> +        virtio_irq = GUEST_VIRTIO_MMIO_SPI;
>> +        virtio_enabled = true;
>> +    }
>> +
>>       for (i = 0; i < d_config->b_info.num_irqs; i++) {
>>           uint32_t irq = d_config->b_info.irqs[i];
>>           uint32_t spi;
>> @@ -58,6 +69,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>               return ERROR_FAIL;
>>           }
>>   +        /* The same check as for vpl011 */
>> +        if (virtio_enabled && irq == virtio_irq) {
>> +            LOG(ERROR, "Physical IRQ %u conflicting with virtio 
>> SPI\n", irq);
>> +            return ERROR_FAIL;
>> +        }
>> +
>>           if (irq < 32)
>>               continue;
>>   @@ -658,6 +675,39 @@ static int make_vpl011_uart_node(libxl__gc 
>> *gc, void *fdt,
>>       return 0;
>>   }
>>   +static int make_virtio_mmio_node(libxl__gc *gc, void *fdt,
>> +                                 uint64_t base, uint32_t irq)
>> +{
>> +    int res;
>> +    gic_interrupt intr;
>> +    /* Placeholder for virtio@ + a 64-bit number + \0 */
>> +    char buf[24];
>> +
>> +    snprintf(buf, sizeof(buf), "virtio@%"PRIx64, base);
>> +    res = fdt_begin_node(fdt, buf);
>> +    if (res) return res;
>> +
>> +    res = fdt_property_compat(gc, fdt, 1, "virtio,mmio");
>> +    if (res) return res;
>> +
>> +    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>> GUEST_ROOT_SIZE_CELLS,
>> +                            1, base, GUEST_VIRTIO_MMIO_SIZE);
>> +    if (res) return res;
>> +
>> +    set_interrupt(intr, irq, 0xf, DT_IRQ_TYPE_EDGE_RISING);
>> +    res = fdt_property_interrupts(gc, fdt, &intr, 1);
>> +    if (res) return res;
>> +
>> +    res = fdt_property(fdt, "dma-coherent", NULL, 0);
>> +    if (res) return res;
>> +
>> +    res = fdt_end_node(fdt);
>> +    if (res) return res;
>> +
>> +    return 0;
>> +
>> +}
>> +
>>   static const struct arch_info *get_arch_info(libxl__gc *gc,
>>                                                const struct 
>> xc_dom_image *dom)
>>   {
>> @@ -961,6 +1011,9 @@ next_resize:
>>           if (info->tee == LIBXL_TEE_TYPE_OPTEE)
>>               FDT( make_optee_node(gc, fdt) );
>>   +        if (libxl_defbool_val(info->arch_arm.virtio))
>> +            FDT( make_virtio_mmio_node(gc, fdt, GUEST_VIRTIO_MMIO_BASE, GUEST_VIRTIO_MMIO_SPI) );
>> +
>>           if (pfdt)
>>               FDT( copy_partial_fdt(gc, fdt, pfdt) );
>>   @@ -1178,6 +1231,7 @@ void 
>> libxl__arch_domain_build_info_setdefault(libxl__gc *gc,
>>   {
>>       /* ACPI is disabled by default */
>>       libxl_defbool_setdefault(&b_info->acpi, false);
>> +    libxl_defbool_setdefault(&b_info->arch_arm.virtio, false);
>>         if (b_info->type != LIBXL_DOMAIN_TYPE_PV)
>>           return;
>> diff --git a/tools/libs/light/libxl_types.idl 
>> b/tools/libs/light/libxl_types.idl
>> index 0532473..839df86 100644
>> --- a/tools/libs/light/libxl_types.idl
>> +++ b/tools/libs/light/libxl_types.idl
>> @@ -640,6 +640,7 @@ libxl_domain_build_info = 
>> Struct("domain_build_info",[
>>           ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
>> +                               ("virtio", libxl_defbool),
>
> Regardless of the question above, this doesn't sound very Arm specific.

yes


>
>
>
> I think we want to make the virtio configuration arch-agnostic because
> an admin should not need to know the arch internals to be able to
> assign virtio devices.

sounds reasonable


>
>
> That said, you can leave it completely unimplemented for anything 
> other than Arm.

got it


>
>
> If you add new parameters in the idl, you will also want to introduce 
> a define in libxl.h so an external toolstack (such as libvirt) can 
> detect whether the field is supported by the installed version of 
> libxl. See the other LIBXL_HAVE_*.

hmm, I didn't know about that, thank you.
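
So, IIUC, something along these lines in libxl.h (the name here is just
an illustration, to be finalized together with the field):

    /*
     * LIBXL_HAVE_BUILDINFO_ARCH_VIRTIO indicates that the 'virtio' field
     * is present in libxl_domain_build_info.
     */
    #define LIBXL_HAVE_BUILDINFO_ARCH_VIRTIO 1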


>
>>                                  ("vuart", libxl_vuart_type),
>>                                 ])),
>>       # Alternate p2m is not bound to any architecture or guest type, 
>> as it is
>> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
>> index 4ebf396..2a3364b 100644
>> --- a/tools/xl/xl_parse.c
>> +++ b/tools/xl/xl_parse.c
>> @@ -2581,6 +2581,7 @@ skip_usbdev:
>>       }
>>         xlu_cfg_get_defbool(config, "dm_restrict", 
>> &b_info->dm_restrict, 0);
>> +    xlu_cfg_get_defbool(config, "virtio", &b_info->arch_arm.virtio, 0);
>
> Regardless of the question above, any addition to the configuration file
> should be documented in docs/man/xl.cfg.5.pod.in.

yes, documentation is next on my list.


>
>
>>         if (c_info->type == LIBXL_DOMAIN_TYPE_HVM) {
>>           if (!xlu_cfg_get_string (config, "vga", &buf, 0)) {
>> diff --git a/xen/include/public/arch-arm.h 
>> b/xen/include/public/arch-arm.h
>> index c365b1b..be7595f 100644
>> --- a/xen/include/public/arch-arm.h
>> +++ b/xen/include/public/arch-arm.h
>> @@ -464,6 +464,11 @@ typedef uint64_t xen_callback_t;
>>   #define PSCI_cpu_on      2
>>   #define PSCI_migrate     3
>>   +/* VirtIO MMIO definitions */
>> +#define GUEST_VIRTIO_MMIO_BASE  xen_mk_ullong(0x02000000)
>
> You will want to define any new region with the other *_{BASE, SIZE} 
> above. Note that they should be ordered from bottom to the top of the 
> memory layout.

I got it, this one should be put at the very beginning (before vGIC v2 
mappings).


>
>
>> +#define GUEST_VIRTIO_MMIO_SIZE xen_mk_ullong(0x200)
>
> AFAICT, the size of the virtio mmio region should be 0x100. So why is 
> it 0x200?


I didn't find the total size requirement for the mmio region in the virtio
specification v1.1 (the size of the control registers is indeed 0x100 and
the device-specific configuration registers start at offset 0x100;
however, their size depends on the device and the driver).

kvmtool uses 0x200 [1], and in some Linux device-trees we can see 0x200 [2]
(however, the device-tree bindings example has 0x100 [3]), so what would
be the proper value for the Xen code?
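
For reference, with the current values the node generated by
make_virtio_mmio_node() would look roughly like this (a sketch; the
exact reg/interrupts cells depend on GUEST_ROOT_ADDRESS_CELLS,
GUEST_ROOT_SIZE_CELLS and set_interrupt()):

    virtio@2000000 {
            compatible = "virtio,mmio";
            reg = <0x0 0x2000000 0x0 0x200>;
            interrupts = <0x0 0x1 0xf01>; /* SPI 33, edge rising, cpumask 0xf */
            dma-coherent;
    };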


>
>> +#define GUEST_VIRTIO_MMIO_SPI   33
>
> This will want to be defined with the other GUEST_*_SPI above.

ok


>
>
> Most likely, you will want to reserve a range

It seems so, yes, good point. BTW, a range is needed for the mmio region
as well, correct?


>
> Cheers,
>
[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git/tree/include/kvm/virtio-mmio.h#n9
[2] 
https://elixir.bootlin.com/linux/v5.11-rc3/source/arch/arm64/boot/dts/arm/foundation-v8.dtsi#L226
[3] 
https://elixir.bootlin.com/linux/v5.11-rc3/source/Documentation/devicetree/bindings/virtio/mmio.txt#L31


-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* RE: [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common
  2021-01-12 21:52 ` [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common Oleksandr Tyshchenko
  2021-01-15 15:16   ` Julien Grall
  2021-01-15 16:41   ` Jan Beulich
@ 2021-01-18  8:22   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  8:22 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'Julien Grall',
	'Stefano Stabellini', 'Julien Grall'

> -----Original Message-----
> From: Oleksandr Tyshchenko <olekstysh@gmail.com>
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant <paul@xen.org>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Stefano Stabellini
> <sstabellini@kernel.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> As a lot of x86 code can be re-used on Arm later on, this
> patch makes some preparation to x86/hvm/ioreq.c before moving
> to the common code. This way we will get a verbatim copy
> for a code movement in subsequent patch.
> 
> This patch mostly introduces specific hooks to abstract arch
> specific materials, taking into account the requirement to leave
> the "legacy" mechanism of mapping magic pages for the IOREQ servers
> x86 specific and not expose it to the common code.
> 
> These hooks are named according to the more consistent new naming
> scheme right away (including dropping the "hvm" prefixes and infixes):
> - IOREQ server functions should start with "ioreq_server_"
> - IOREQ functions should start with "ioreq_"
> other functions will be renamed in subsequent patches.
> 
> Also re-order #include-s alphabetically.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Paul Durrant <paul@xen.org>

> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - new patch, was split from:
>      "[RFC PATCH V1 01/12] hvm/ioreq: Make x86's IOREQ feature common"
>    - fold the check of p->type into hvm_get_ioreq_server_range_type()
>      and make it return success/failure
>    - remove relocate_portio_handler() call from arch_hvm_ioreq_destroy()
>      in arch/x86/hvm/ioreq.c
>    - introduce arch_hvm_destroy_ioreq_server()/arch_handle_hvm_io_completion()
> 
> Changes V1 -> V2:
>    - update patch description
>    - make arch functions inline and put them into arch header
>      to achieve a truly rename by the subsequent patch
>    - return void in arch_hvm_destroy_ioreq_server()
>    - return bool in arch_hvm_ioreq_destroy()
>    - bring relocate_portio_handler() back to arch_hvm_ioreq_destroy()
>    - rename IOREQ_IO* to IOREQ_STATUS*
>    - remove *handle* from arch_handle_hvm_io_completion()
>    - re-order #include-s alphabetically
>    - rename hvm_get_ioreq_server_range_type() to hvm_ioreq_server_get_type_addr()
>      and add "const" to several arguments
> 
> Changes V2 -> V3:
>    - update patch description
>    - name new arch hooks according to the new naming scheme
>    - don't make arch hooks inline, move them ioreq.c
>    - make get_ioreq_server() local again
>    - rework the whole patch taking into the account that "legacy" interface
>      should remain x86 specific (additional arch hooks, etc)
>    - update the code to be able to use hvm_map_mem_type_to_ioreq_server()
>      in the common code (an extra arch hook, etc)
>    - don’t include <asm/hvm/emulate.h> from arch header
>    - add "arch" prefix to hvm_ioreq_server_get_type_addr()
>    - move IOREQ_STATUS_* #define-s introduction to the separate patch
>    - move HANDLE_BUFIOREQ to the arch header
>    - just return relocate_portio_handler() from arch_ioreq_server_destroy_all()
>    - misc adjustments proposed by Jan (adding const, unsigned int instead of uint32_t)
> 
> Changes V3 -> V4:
>    - add Alex's R-b
>    - update patch description
>    - make arch_ioreq_server_get_type_addr return bool
>    - drop #include <xen/ctype.h>
>    - use two arch hooks in hvm_map_mem_type_to_ioreq_server()
>      to avoid calling p2m_change_entry_type_global() with lock held
> ---
>  xen/arch/x86/hvm/ioreq.c        | 179 ++++++++++++++++++++++++++--------------
>  xen/include/asm-x86/hvm/ioreq.h |  22 +++++
>  2 files changed, 141 insertions(+), 60 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 1cc27df..468fe84 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -16,16 +16,15 @@
>   * this program; If not, see <http://www.gnu.org/licenses/>.
>   */
> 
> -#include <xen/ctype.h>
> +#include <xen/domain.h>
> +#include <xen/event.h>
>  #include <xen/init.h>
> +#include <xen/irq.h>
>  #include <xen/lib.h>
> -#include <xen/trace.h>
> +#include <xen/paging.h>
>  #include <xen/sched.h>
> -#include <xen/irq.h>
>  #include <xen/softirq.h>
> -#include <xen/domain.h>
> -#include <xen/event.h>
> -#include <xen/paging.h>
> +#include <xen/trace.h>
>  #include <xen/vpci.h>
> 
>  #include <asm/hvm/emulate.h>
> @@ -170,6 +169,29 @@ static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
>      return true;
>  }
> 
> +bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion)
> +{
> +    switch ( io_completion )
> +    {
> +    case HVMIO_realmode_completion:
> +    {
> +        struct hvm_emulate_ctxt ctxt;
> +
> +        hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
> +        vmx_realmode_emulate_one(&ctxt);
> +        hvm_emulate_writeback(&ctxt);
> +
> +        break;
> +    }
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +        break;
> +    }
> +
> +    return true;
> +}
> +
>  bool handle_hvm_io_completion(struct vcpu *v)
>  {
>      struct domain *d = v->domain;
> @@ -209,19 +231,8 @@ bool handle_hvm_io_completion(struct vcpu *v)
>          return handle_pio(vio->io_req.addr, vio->io_req.size,
>                            vio->io_req.dir);
> 
> -    case HVMIO_realmode_completion:
> -    {
> -        struct hvm_emulate_ctxt ctxt;
> -
> -        hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
> -        vmx_realmode_emulate_one(&ctxt);
> -        hvm_emulate_writeback(&ctxt);
> -
> -        break;
> -    }
>      default:
> -        ASSERT_UNREACHABLE();
> -        break;
> +        return arch_vcpu_ioreq_completion(io_completion);
>      }
> 
>      return true;
> @@ -477,9 +488,6 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server *s,
>      }
>  }
> 
> -#define HANDLE_BUFIOREQ(s) \
> -    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
> -
>  static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
>                                       struct vcpu *v)
>  {
> @@ -586,7 +594,7 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
>      spin_unlock(&s->lock);
>  }
> 
> -static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
> +int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s)
>  {
>      int rc;
> 
> @@ -601,7 +609,7 @@ static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
>      return rc;
>  }
> 
> -static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
>  {
>      hvm_unmap_ioreq_gfn(s, true);
>      hvm_unmap_ioreq_gfn(s, false);
> @@ -674,6 +682,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>      return rc;
>  }
> 
> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s)
> +{
> +    hvm_remove_ioreq_gfn(s, false);
> +    hvm_remove_ioreq_gfn(s, true);
> +}
> +
>  static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
>  {
>      struct hvm_ioreq_vcpu *sv;
> @@ -683,8 +697,7 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
>      if ( s->enabled )
>          goto done;
> 
> -    hvm_remove_ioreq_gfn(s, false);
> -    hvm_remove_ioreq_gfn(s, true);
> +    arch_ioreq_server_enable(s);
> 
>      s->enabled = true;
> 
> @@ -697,6 +710,12 @@ static void hvm_ioreq_server_enable(struct hvm_ioreq_server *s)
>      spin_unlock(&s->lock);
>  }
> 
> +void arch_ioreq_server_disable(struct hvm_ioreq_server *s)
> +{
> +    hvm_add_ioreq_gfn(s, true);
> +    hvm_add_ioreq_gfn(s, false);
> +}
> +
>  static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
>  {
>      spin_lock(&s->lock);
> @@ -704,8 +723,7 @@ static void hvm_ioreq_server_disable(struct hvm_ioreq_server *s)
>      if ( !s->enabled )
>          goto done;
> 
> -    hvm_add_ioreq_gfn(s, true);
> -    hvm_add_ioreq_gfn(s, false);
> +    arch_ioreq_server_disable(s);
> 
>      s->enabled = false;
> 
> @@ -750,7 +768,7 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
> 
>   fail_add:
>      hvm_ioreq_server_remove_all_vcpus(s);
> -    hvm_ioreq_server_unmap_pages(s);
> +    arch_ioreq_server_unmap_pages(s);
> 
>      hvm_ioreq_server_free_rangesets(s);
> 
> @@ -764,7 +782,7 @@ static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
>      hvm_ioreq_server_remove_all_vcpus(s);
> 
>      /*
> -     * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
> +     * NOTE: It is safe to call both arch_ioreq_server_unmap_pages() and
>       *       hvm_ioreq_server_free_pages() in that order.
>       *       This is because the former will do nothing if the pages
>       *       are not mapped, leaving the page to be freed by the latter.
> @@ -772,7 +790,7 @@ static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
>       *       the page_info pointer to NULL, meaning the latter will do
>       *       nothing.
>       */
> -    hvm_ioreq_server_unmap_pages(s);
> +    arch_ioreq_server_unmap_pages(s);
>      hvm_ioreq_server_free_pages(s);
> 
>      hvm_ioreq_server_free_rangesets(s);
> @@ -836,6 +854,12 @@ int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
>      return rc;
>  }
> 
> +/* Called when target domain is paused */
> +void arch_ioreq_server_destroy(struct hvm_ioreq_server *s)
> +{
> +    p2m_set_ioreq_server(s->target, 0, s);
> +}
> +
>  int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
>  {
>      struct hvm_ioreq_server *s;
> @@ -855,7 +879,7 @@ int hvm_destroy_ioreq_server(struct domain *d, ioservid_t id)
> 
>      domain_pause(d);
> 
> -    p2m_set_ioreq_server(d, 0, s);
> +    arch_ioreq_server_destroy(s);
> 
>      hvm_ioreq_server_disable(s);
> 
> @@ -900,7 +924,7 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
> 
>      if ( ioreq_gfn || bufioreq_gfn )
>      {
> -        rc = hvm_ioreq_server_map_pages(s);
> +        rc = arch_ioreq_server_map_pages(s);
>          if ( rc )
>              goto out;
>      }
> @@ -1080,6 +1104,27 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>      return rc;
>  }
> 
> +/* Called with ioreq_server lock held */
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct hvm_ioreq_server *s,
> +                                   uint32_t flags)
> +{
> +    return p2m_set_ioreq_server(d, flags, s);
> +}
> +
> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
> +                                              struct hvm_ioreq_server *s,
> +                                              uint32_t flags)
> +{
> +    if ( flags == 0 )
> +    {
> +        const struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> +        if ( read_atomic(&p2m->ioreq.entry_count) )
> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> +    }
> +}
> +
>  /*
>   * Map or unmap an ioreq server to specific memory type. For now, only
>   * HVMMEM_ioreq_server is supported, and in the future new types can be
> @@ -1112,18 +1157,13 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>      if ( s->emulator != current->domain )
>          goto out;
> 
> -    rc = p2m_set_ioreq_server(d, flags, s);
> +    rc = arch_ioreq_server_map_mem_type(d, s, flags);
> 
>   out:
>      spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
> 
> -    if ( rc == 0 && flags == 0 )
> -    {
> -        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -
> -        if ( read_atomic(&p2m->ioreq.entry_count) )
> -            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> -    }
> +    if ( rc == 0 )
> +        arch_ioreq_server_map_mem_type_completed(d, s, flags);
> 
>      return rc;
>  }
> @@ -1210,12 +1250,17 @@ void hvm_all_ioreq_servers_remove_vcpu(struct domain *d, struct vcpu *v)
>      spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>  }
> 
> +bool arch_ioreq_server_destroy_all(struct domain *d)
> +{
> +    return relocate_portio_handler(d, 0xcf8, 0xcf8, 4);
> +}
> +
>  void hvm_destroy_all_ioreq_servers(struct domain *d)
>  {
>      struct hvm_ioreq_server *s;
>      unsigned int id;
> 
> -    if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) )
> +    if ( !arch_ioreq_server_destroy_all(d) )
>          return;
> 
>      spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
> @@ -1239,33 +1284,28 @@ void hvm_destroy_all_ioreq_servers(struct domain *d)
>      spin_unlock_recursive(&d->arch.hvm.ioreq_server.lock);
>  }
> 
> -struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
> -                                                 ioreq_t *p)
> +bool arch_ioreq_server_get_type_addr(const struct domain *d,
> +                                     const ioreq_t *p,
> +                                     uint8_t *type,
> +                                     uint64_t *addr)
>  {
> -    struct hvm_ioreq_server *s;
> -    uint32_t cf8;
> -    uint8_t type;
> -    uint64_t addr;
> -    unsigned int id;
> +    unsigned int cf8 = d->arch.hvm.pci_cf8;
> 
>      if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
> -        return NULL;
> -
> -    cf8 = d->arch.hvm.pci_cf8;
> +        return false;
> 
>      if ( p->type == IOREQ_TYPE_PIO &&
>           (p->addr & ~3) == 0xcfc &&
>           CF8_ENABLED(cf8) )
>      {
> -        uint32_t x86_fam;
> +        unsigned int x86_fam, reg;
>          pci_sbdf_t sbdf;
> -        unsigned int reg;
> 
>          reg = hvm_pci_decode_addr(cf8, p->addr, &sbdf);
> 
>          /* PCI config data cycle */
> -        type = XEN_DMOP_IO_RANGE_PCI;
> -        addr = ((uint64_t)sbdf.sbdf << 32) | reg;
> +        *type = XEN_DMOP_IO_RANGE_PCI;
> +        *addr = ((uint64_t)sbdf.sbdf << 32) | reg;
>          /* AMD extended configuration space access? */
>          if ( CF8_ADDR_HI(cf8) &&
>               d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
> @@ -1277,16 +1317,30 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
> 
>              if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
>                   (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
> -                addr |= CF8_ADDR_HI(cf8);
> +                *addr |= CF8_ADDR_HI(cf8);
>          }
>      }
>      else
>      {
> -        type = (p->type == IOREQ_TYPE_PIO) ?
> -                XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
> -        addr = p->addr;
> +        *type = (p->type == IOREQ_TYPE_PIO) ?
> +                 XEN_DMOP_IO_RANGE_PORT : XEN_DMOP_IO_RANGE_MEMORY;
> +        *addr = p->addr;
>      }
> 
> +    return true;
> +}
> +
> +struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
> +                                                 ioreq_t *p)
> +{
> +    struct hvm_ioreq_server *s;
> +    uint8_t type;
> +    uint64_t addr;
> +    unsigned int id;
> +
> +    if ( !arch_ioreq_server_get_type_addr(d, p, &type, &addr) )
> +        return NULL;
> +
>      FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
>          struct rangeset *r;
> @@ -1515,11 +1569,16 @@ static int hvm_access_cf8(
>      return X86EMUL_UNHANDLEABLE;
>  }
> 
> +void arch_ioreq_domain_init(struct domain *d)
> +{
> +    register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
> +}
> +
>  void hvm_ioreq_init(struct domain *d)
>  {
>      spin_lock_init(&d->arch.hvm.ioreq_server.lock);
> 
> -    register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
> +    arch_ioreq_domain_init(d);
>  }
> 
>  /*
> diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
> index e2588e9..13d35e1 100644
> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -19,6 +19,9 @@
>  #ifndef __ASM_X86_HVM_IOREQ_H__
>  #define __ASM_X86_HVM_IOREQ_H__
> 
> +#define HANDLE_BUFIOREQ(s) \
> +    ((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
> +
>  bool hvm_io_pending(struct vcpu *v);
>  bool handle_hvm_io_completion(struct vcpu *v);
>  bool is_ioreq_server_page(struct domain *d, const struct page_info *page);
> @@ -55,6 +58,25 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
> 
>  void hvm_ioreq_init(struct domain *d);
> 
> +bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
> +int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_enable(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_disable(struct hvm_ioreq_server *s);
> +void arch_ioreq_server_destroy(struct hvm_ioreq_server *s);
> +int arch_ioreq_server_map_mem_type(struct domain *d,
> +                                   struct hvm_ioreq_server *s,
> +                                   uint32_t flags);
> +void arch_ioreq_server_map_mem_type_completed(struct domain *d,
> +                                              struct hvm_ioreq_server *s,
> +                                              uint32_t flags);
> +bool arch_ioreq_server_destroy_all(struct domain *d);
> +bool arch_ioreq_server_get_type_addr(const struct domain *d,
> +                                     const ioreq_t *p,
> +                                     uint8_t *type,
> +                                     uint64_t *addr);
> +void arch_ioreq_domain_init(struct domain *d);
> +
>  #endif /* __ASM_X86_HVM_IOREQ_H__ */
> 
>  /*
> --
> 2.7.4




^ permalink raw reply	[flat|nested] 144+ messages in thread

* RE: [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
  2021-01-12 21:52 ` [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving Oleksandr Tyshchenko
  2021-01-15 15:17   ` Julien Grall
@ 2021-01-18  8:24   ` Paul Durrant
  1 sibling, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  8:24 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'Julien Grall',
	'Stefano Stabellini', 'Julien Grall'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant <paul@xen.org>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Stefano Stabellini
> <sstabellini@kernel.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch continues the preparation of x86/hvm/ioreq.c
> before moving to the common code.
> 
> Add IOREQ_STATUS_* #define-s and update candidates for moving
> since X86EMUL_* shouldn't be exposed to the common code in
> that form.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Paul Durrant <paul@xen.org>

> Acked-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes V2 -> V3:
>  - new patch, was split from
>    [PATCH V2 01/23] x86/ioreq: Prepare IOREQ feature for making it common
> 
> Changes V3 -> V4:
>  - add Alex's R-b and Jan's A-b
>  - add a comment above IOREQ_STATUS_* #define-s
> ---
>  xen/arch/x86/hvm/ioreq.c        | 16 ++++++++--------
>  xen/include/asm-x86/hvm/ioreq.h |  5 +++++
>  2 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 468fe84..ff9a546 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1405,7 +1405,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
>      pg = iorp->va;
> 
>      if ( !pg )
> -        return X86EMUL_UNHANDLEABLE;
> +        return IOREQ_STATUS_UNHANDLED;
> 
>      /*
>       * Return 0 for the cases we can't deal with:
> @@ -1435,7 +1435,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
>          break;
>      default:
>          gdprintk(XENLOG_WARNING, "unexpected ioreq size: %u\n", p->size);
> -        return X86EMUL_UNHANDLEABLE;
> +        return IOREQ_STATUS_UNHANDLED;
>      }
> 
>      spin_lock(&s->bufioreq_lock);
> @@ -1445,7 +1445,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
>      {
>          /* The queue is full: send the iopacket through the normal path. */
>          spin_unlock(&s->bufioreq_lock);
> -        return X86EMUL_UNHANDLEABLE;
> +        return IOREQ_STATUS_UNHANDLED;
>      }
> 
>      pg->buf_ioreq[pg->ptrs.write_pointer % IOREQ_BUFFER_SLOT_NUM] = bp;
> @@ -1476,7 +1476,7 @@ static int hvm_send_buffered_ioreq(struct hvm_ioreq_server *s, ioreq_t *p)
>      notify_via_xen_event_channel(d, s->bufioreq_evtchn);
>      spin_unlock(&s->bufioreq_lock);
> 
> -    return X86EMUL_OKAY;
> +    return IOREQ_STATUS_HANDLED;
>  }
> 
>  int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
> @@ -1492,7 +1492,7 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
>          return hvm_send_buffered_ioreq(s, proto_p);
> 
>      if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
> -        return X86EMUL_RETRY;
> +        return IOREQ_STATUS_RETRY;
> 
>      list_for_each_entry ( sv,
>                            &s->ioreq_vcpu_list,
> @@ -1532,11 +1532,11 @@ int hvm_send_ioreq(struct hvm_ioreq_server *s, ioreq_t *proto_p,
>              notify_via_xen_event_channel(d, port);
> 
>              sv->pending = true;
> -            return X86EMUL_RETRY;
> +            return IOREQ_STATUS_RETRY;
>          }
>      }
> 
> -    return X86EMUL_UNHANDLEABLE;
> +    return IOREQ_STATUS_UNHANDLED;
>  }
> 
>  unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
> @@ -1550,7 +1550,7 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered)
>          if ( !s->enabled )
>              continue;
> 
> -        if ( hvm_send_ioreq(s, p, buffered) == X86EMUL_UNHANDLEABLE )
> +        if ( hvm_send_ioreq(s, p, buffered) == IOREQ_STATUS_UNHANDLED )
>              failed++;
>      }
> 
> diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
> index 13d35e1..f140ef4 100644
> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -77,6 +77,11 @@ bool arch_ioreq_server_get_type_addr(const struct domain *d,
>                                       uint64_t *addr);
>  void arch_ioreq_domain_init(struct domain *d);
> 
> +/* This correlation must not be altered */
> +#define IOREQ_STATUS_HANDLED     X86EMUL_OKAY
> +#define IOREQ_STATUS_UNHANDLED   X86EMUL_UNHANDLEABLE
> +#define IOREQ_STATUS_RETRY       X86EMUL_RETRY
> +
>  #endif /* __ASM_X86_HVM_IOREQ_H__ */
> 
>  /*
> --
> 2.7.4
> 




^ permalink raw reply	[flat|nested] 144+ messages in thread

* RE: [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
  2021-01-12 21:52 ` [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio() Oleksandr Tyshchenko
  2021-01-15 14:48   ` Alex Bennée
  2021-01-15 15:19   ` Julien Grall
@ 2021-01-18  8:29   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  8:29 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'Julien Grall',
	'Stefano Stabellini', 'Julien Grall'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant <paul@xen.org>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; Julien Grall <julien@xen.org>; Stefano Stabellini
> <sstabellini@kernel.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is about to become a common feature and Arm will have its own
> implementation.
> 
> But the name of the function is pretty generic and can be confusing
> on Arm (we already have a try_handle_mmio()).
> 
> In order not to rename the function (which is used for a varying
> set of purposes on x86) globally and to get a non-confusing variant on Arm,
> provide a wrapper arch_ioreq_complete_mmio() to be used in common
> and Arm code.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Paul Durrant <paul@xen.org>

> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - new patch
> 
> Changes V1 -> V2:
>    - remove "handle"
>    - add Jan's A-b
> 
> Changes V2 -> V3:
>    - remove Jan's A-b
>    - update patch subject/description
>    - use out-of-line function instead of #define
>    - put earlier in the series to avoid breakage
> 
> Changes V3 -> V4:
>    - add Jan's R-b
>    - rename ioreq_complete_mmio() to arch_ioreq_complete_mmio()
> ---
>  xen/arch/x86/hvm/ioreq.c        | 7 ++++++-
>  xen/include/asm-x86/hvm/ioreq.h | 1 +
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index ff9a546..00c68f5 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -35,6 +35,11 @@
>  #include <public/hvm/ioreq.h>
>  #include <public/hvm/params.h>
> 
> +bool arch_ioreq_complete_mmio(void)
> +{
> +    return handle_mmio();
> +}
> +
>  static void set_ioreq_server(struct domain *d, unsigned int id,
>                               struct hvm_ioreq_server *s)
>  {
> @@ -225,7 +230,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
>          break;
> 
>      case HVMIO_mmio_completion:
> -        return handle_mmio();
> +        return arch_ioreq_complete_mmio();
> 
>      case HVMIO_pio_completion:
>          return handle_pio(vio->io_req.addr, vio->io_req.size,
> diff --git a/xen/include/asm-x86/hvm/ioreq.h b/xen/include/asm-x86/hvm/ioreq.h
> index f140ef4..0e64e76 100644
> --- a/xen/include/asm-x86/hvm/ioreq.h
> +++ b/xen/include/asm-x86/hvm/ioreq.h
> @@ -58,6 +58,7 @@ unsigned int hvm_broadcast_ioreq(ioreq_t *p, bool buffered);
> 
>  void hvm_ioreq_init(struct domain *d);
> 
> +bool arch_ioreq_complete_mmio(void);
>  bool arch_vcpu_ioreq_completion(enum hvm_io_completion io_completion);
>  int arch_ioreq_server_map_pages(struct hvm_ioreq_server *s);
>  void arch_ioreq_server_unmap_pages(struct hvm_ioreq_server *s);
> --
> 2.7.4
> 




^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-01-15 22:01   ` Julien Grall
@ 2021-01-18  8:32     ` Oleksandr
  2021-01-20 17:05       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-18  8:32 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Oleksandr Tyshchenko, Ian Jackson, Wei Liu,
	Anthony PERARD, Stefano Stabellini


On 16.01.21 00:01, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien


>
> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> This patch adds basic support for configuring and assisting a virtio-disk
>> backend (emulator) which is intended to run outside of Qemu and could
>> be run in any domain.
>>
>> Xenstore was chosen as a communication interface for the emulator running
>> in a non-toolstack domain to be able to get its configuration either by
>> reading Xenstore directly or by receiving command line parameters (an
>> updated 'xl devd' running in the same domain would read Xenstore
>> beforehand and call the backend executable with the required arguments).
>>
>> An example of domain configuration (two disks are assigned to the guest,
>> the latter is in readonly mode):
>>
>> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]
>>
>> Where per-disk Xenstore entries are:
>> - filename and readonly flag (configured via "vdisk" property)
>> - base and irq (allocated dynamically)
>>
>> Besides handling the 'visible' params described in the configuration file,
>> the patch also allocates virtio-mmio specific ones for each device and
>> writes them into Xenstore. virtio-mmio params (irq and base) are
>> unique per guest domain; they are allocated at domain creation time
>> and passed through to the emulator. Each VirtIO device has at least
>> one pair of these params.
>>
>> TODO:
>> 1. An extra "virtio" property could be removed.
>> 2. Update documentation.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Changes RFC -> V1:
>>     - no changes
>>
>> Changes V1 -> V2:
>>     - rebase according to the new location of libxl_virtio_disk.c
>>
>> Changes V2 -> V3:
>>     - no changes
>>
>> Changes V3 -> V4:
>>     - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT
>>
>> Please note, there is a real concern about VirtIO interrupts allocation.
>> [Just copy here what Stefano said in RFC thread]
>>
>> So, if we end up allocating let's say 6 virtio interrupts for a domain,
>> the chance of a clash with a physical interrupt of a passthrough 
>> device is real.
>
> For the first version, I think a static approach is fine because it 
> doesn't bind us to anything yet (there is no interface change). We can 
> refine it on follow-ups as we figure out how virtio is going to be 
> used in the field.
>
>>
>> I am not entirely sure how to solve it, but these are a few ideas:
>> - choosing virtio interrupts that are less likely to conflict (maybe 
>> > 1000)
>
> Well, we only support 988 interrupts :). However, we will waste some 
> memory in the vGIC structure (we would need to allocate memory for the 
> 988 interrupts) if you choose an interrupt towards the end.
>
>> - make the virtio irq (optionally) configurable so that a user could
>>    override the default irq and specify one that doesn't conflict
>
> This is not very ideal because it makes the use of virtio quite 
> unfriendly with passthrough. Note that platform device passthrough is 
> already unfriendly, but I am thinking PCI :).
>
>> - implementing support for virq != pirq (even the xl interface doesn't
>>    allow to specify the virq number for passthrough devices, see "irqs")
> I can't remember whether I had a reason to not support virq != pirq 
> when this was initially implemented. This is one possibility, but it 
> is as unfriendly as the previous option.
>
> I will add a 4th one:
>    - Automatically allocate the virtio IRQ. This should be possible to 
> do without too much trouble as we know in advance which IRQs will 
> be passthrough.
As I understand it, the IRQs for passthrough are described in the "irqs"
property and stored in d_config->b_info.irqs[i], so yes, we know in
advance which IRQs will be used for passthrough and we will be able to
choose non-clashing ones (iterating over all IRQs in a reserved range)
for the virtio devices. The question is how many IRQs should be reserved.
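
Just to illustrate (an untested sketch; spi_is_passthrough() is a made-up
helper and GUEST_VIRTIO_SPI_{FIRST,LAST} are the names you suggest below),
the allocation in libxl__arch_domain_prepare_config() could look like:

    /* True if the SPI is already assigned to a passthrough device. */
    static bool spi_is_passthrough(const libxl_domain_build_info *b_info,
                                   uint32_t irq)
    {
        unsigned int i;

        for (i = 0; i < b_info->num_irqs; i++)
            if (b_info->irqs[i] == irq)
                return true;

        return false;
    }

    ...
        for (i = 0, virtio_irq = GUEST_VIRTIO_SPI_FIRST;
             i < virtio_disk->num_disks; i++) {
            while (virtio_irq <= GUEST_VIRTIO_SPI_LAST &&
                   spi_is_passthrough(&d_config->b_info, virtio_irq))
                virtio_irq++;
            if (virtio_irq > GUEST_VIRTIO_SPI_LAST)
                return ERROR_FAIL; /* reserved range exhausted */
            virtio_disk->disks[i].irq = virtio_irq++;
        }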


>
> My preference is the 4th one; that said, we may also want to pick 
> either 2 or 3 to give some flexibility to an admin if they wish to get 
> their hands dirty.

Personally I would be ok with the 4th one, but without admin involvement.


>
>
>>
>> Also there is one suggestion from Wei Chen regarding a parameter for 
>> domain
>> config file which I haven't addressed yet.
>> [Just copy here what Wei said in V2 thread]
>> Can we keep using the same 'disk' parameter for virtio-disk, but add
>> an option like
>> "model=virtio-disk"?
>> For example:
>> disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ]
>> Just like what Xen has done for x86 virtio-net.
>> ---
>>   tools/libs/light/Makefile                 |   1 +
>>   tools/libs/light/libxl_arm.c              |  56 ++++++++++++---
>>   tools/libs/light/libxl_create.c           |   1 +
>>   tools/libs/light/libxl_internal.h         |   1 +
>>   tools/libs/light/libxl_types.idl          |  15 ++++
>>   tools/libs/light/libxl_types_internal.idl |   1 +
>>   tools/libs/light/libxl_virtio_disk.c      | 109 
>> ++++++++++++++++++++++++++++
>>   tools/xl/Makefile                         |   2 +-
>>   tools/xl/xl.h                             |   3 +
>>   tools/xl/xl_cmdtable.c                    |  15 ++++
>>   tools/xl/xl_parse.c                       | 115 
>> ++++++++++++++++++++++++++++++
>>   tools/xl/xl_virtio_disk.c                 |  46 ++++++++++++
>>   12 files changed, 354 insertions(+), 11 deletions(-)
>>   create mode 100644 tools/libs/light/libxl_virtio_disk.c
>>   create mode 100644 tools/xl/xl_virtio_disk.c
>>
>> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
>> index 68f6fa3..ccc91b9 100644
>> --- a/tools/libs/light/Makefile
>> +++ b/tools/libs/light/Makefile
>> @@ -115,6 +115,7 @@ SRCS-y += libxl_genid.c
>>   SRCS-y += _libxl_types.c
>>   SRCS-y += libxl_flask.c
>>   SRCS-y += _libxl_types_internal.c
>> +SRCS-y += libxl_virtio_disk.c
>>     ifeq ($(CONFIG_LIBNL),y)
>>   CFLAGS_LIBXL += $(LIBNL3_CFLAGS)
>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>> index 588ee5a..9eb3022 100644
>> --- a/tools/libs/light/libxl_arm.c
>> +++ b/tools/libs/light/libxl_arm.c
>> @@ -8,6 +8,12 @@
>>   #include <assert.h>
>>   #include <xen/device_tree_defs.h>
>>   +#ifndef container_of
>> +#define container_of(ptr, type, member) ({            \
>> +        typeof( ((type *)0)->member ) *__mptr = (ptr);    \
>> +        (type *)( (char *)__mptr - offsetof(type,member) );})
>> +#endif
>> +
>>   static const char *gicv_to_string(libxl_gic_version gic_version)
>>   {
>>       switch (gic_version) {
>> @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>>           vuart_enabled = true;
>>       }
>>   -    /*
>> -     * XXX: Handle virtio properly
>> -     * A proper solution would be for the toolstack to allocate the
>> -     * interrupts used by each virtio backend and let the backend know
>> -     * which one is used
>> -     */
>
> Ok, so you added some code in patch #23 that is going to be mostly 
> dropped here. I think you want to rethink how you do the split here.
>
> One possible approach would be to have a patch which adds the 
> infrastructure but no callers. It would contain:
>   1) Allocate a space in the virtio region and an interrupt
>   2) Create the bindings.
>
> Those helpers can then be called in this patch.

Sounds reasonable.


>
>
>>       if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) {
>
> It feels to me that this parameter is not necessary. You can easily 
> infer it based on whether you have virtio disks attached or not.

Yes!


>
>
>> -        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
>> +        uint64_t virtio_base;
>> +        libxl_device_virtio_disk *virtio_disk;
>> +
>> +        virtio_base = GUEST_VIRTIO_MMIO_BASE;
>>           virtio_irq = GUEST_VIRTIO_MMIO_SPI;
>
> Looking at patch #23, you defined a single SPI and a region that can 
> only fit virtio device. However, here, you are going to define 
> multiple virtio devices.
>
> I think you want to define the following:
>
>  - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window
>  - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may 
> contain multiple devices)
>  - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio
>  - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio
>
> The per-device size doesn't need to be defined in arch-arm.h. Instead, 
> I would only define it internally (unless we can use a virtio.h header 
> from Linux?).

I think I got the idea. What are the preferences for these values?
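
Just to check my understanding, something along these lines in
arch-arm.h (the values here are picked arbitrarily for illustration)?

    /* A 16 MB window fits plenty of 0x200-sized virtio-mmio devices */
    #define GUEST_VIRTIO_MMIO_BASE   xen_mk_ullong(0x02000000)
    #define GUEST_VIRTIO_MMIO_SIZE   xen_mk_ullong(0x01000000)

    #define GUEST_VIRTIO_SPI_FIRST   33
    #define GUEST_VIRTIO_SPI_LAST    43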


>
>
>> +
>> +        if (!d_config->num_virtio_disks) {
>> +            LOG(ERROR, "Virtio is enabled, but no Virtio devices 
>> present\n");
>> +            return ERROR_FAIL;
>> +        }
>> +        virtio_disk = &d_config->virtio_disks[0];
>> +
>> +        for (i = 0; i < virtio_disk->num_disks; i++) {
>> +            virtio_disk->disks[i].base = virtio_base;
>> +            virtio_disk->disks[i].irq = virtio_irq;
>> +
>> +            LOG(DEBUG, "Allocate Virtio MMIO params: IRQ %u BASE 
>> 0x%"PRIx64,
>> +                virtio_irq, virtio_base);
>> +
>> +            virtio_irq ++;
>
> NIT: We usually don't have space before ++ or ...

ok


>
>> +            virtio_base += GUEST_VIRTIO_MMIO_SIZE;
>> +        }
>> +        virtio_irq --;
>
> ... --;

ok


>
>> +
>> +        nr_spis += (virtio_irq - 32) + 1;
>>           virtio_enabled = true;
>>       }
>
> [...]
>
>> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
>> index 2a3364b..054a0c9 100644
>> --- a/tools/xl/xl_parse.c
>> +++ b/tools/xl/xl_parse.c
>> @@ -1204,6 +1204,120 @@ out:
>>       if (rc) exit(EXIT_FAILURE);
>>   }
>>   +#define MAX_VIRTIO_DISKS 4
>
> May I ask why this is hardcoded to 4?

I found 4 to be a reasonable value for the initial implementation.
It determines how many disks a single device instance can handle.
Do you think we need to change it?


>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* RE: [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common
  2021-01-12 21:52 ` [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common Oleksandr Tyshchenko
  2021-01-15 14:55   ` Alex Bennée
  2021-01-15 15:23   ` Julien Grall
@ 2021-01-18  8:48   ` Paul Durrant
  2 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  8:48 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Andrew Cooper',
	'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Julien Grall',
	'Stefano Stabellini', 'Wei Liu',
	'Roger Pau Monné', 'Jun Nakajima',
	'Kevin Tian', 'Tim Deegan',
	'Julien Grall'

> -----Original Message-----
> From: Oleksandr Tyshchenko <olekstysh@gmail.com>
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Andrew Cooper <andrew.cooper3@citrix.com>;
> George Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Jan Beulich
> <jbeulich@suse.com>; Julien Grall <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Wei
> Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>; Paul Durrant <paul@xen.org>; Jun Nakajima
> <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>; Tim Deegan <tim@xen.org>; Julien Grall
> <julien.grall@arm.com>
> Subject: [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> As a lot of x86 code can be re-used on Arm later on, this patch
> moves previously prepared IOREQ support to the common code
> (the code movement is a verbatim copy).
> 
> The "legacy" mechanism of mapping magic pages for the IOREQ servers
> remains x86 specific and not exposed to the common code.
> 
> The common IOREQ feature is supposed to be built with IOREQ_SERVER
> option enabled, which is selected for x86's config HVM for now.
> 
> In order to avoid having a gigantic patch here, the subsequent
> patches will update remaining bits in the common code step by step:
> - Make IOREQ related structs/materials common
> - Drop the "hvm" prefixes and infixes
> - Remove layering violation by moving corresponding fields
>   out of *arch.hvm* or abstracting away accesses to them
> 
> Also include <xen/domain_page.h> which will be needed on Arm
> to avoid touching the common code again when introducing Arm-specific bits.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Paul Durrant <paul@xen.org>



^ permalink raw reply	[flat|nested] 144+ messages in thread

* RE: [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
  2021-01-12 21:52 ` [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common Oleksandr Tyshchenko
  2021-01-15 15:36   ` Julien Grall
@ 2021-01-18  8:59   ` Paul Durrant
  2021-01-20  8:58   ` Alex Bennée
  2 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  8:59 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap',
	'Julien Grall', 'Stefano Stabellini',
	'Julien Grall'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant <paul@xen.org>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap <george.dunlap@citrix.com>; Julien Grall
> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and these structs will be used
> on Arm as is. Move them to xen/ioreq.h and remove "hvm" prefixes.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Paul Durrant <paul@xen.org>

... with one small nit below (if you happen to do a v5)

> Acked-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - new patch
> 
> Changes V1 -> V2:
>    - remove "hvm" prefix
> 
> Changes V2 -> V3:
>    - update the patch, as the "legacy interface" is x86-specific
> 
> Changes V3 -> V4:
>    - add Jan's A-b
> ---
>  xen/arch/x86/hvm/emulate.c       |   2 +-
>  xen/arch/x86/hvm/ioreq.c         |  38 +++++++-------
>  xen/arch/x86/hvm/stdvga.c        |   2 +-
>  xen/arch/x86/mm/p2m.c            |   8 +--
>  xen/common/ioreq.c               | 108 +++++++++++++++++++--------------------
>  xen/include/asm-x86/hvm/domain.h |  36 +------------
>  xen/include/asm-x86/p2m.h        |   8 +--
>  xen/include/xen/ioreq.h          |  54 ++++++++++++++++----
>  8 files changed, 128 insertions(+), 128 deletions(-)
> 
[snip]
>  #ifdef CONFIG_MEM_SHARING
>  struct mem_sharing_domain
>  {
> @@ -110,7 +76,7 @@ struct hvm_domain {
>      /* Lock protects all other values in the sub-struct and the default */
>      struct {
>          spinlock_t              lock;
> -        struct hvm_ioreq_server *server[MAX_NR_IOREQ_SERVERS];
> +        struct ioreq_server *server[MAX_NR_IOREQ_SERVERS];

NIT: this breaks the alignment... you should also remove some of the indent from the line above.

>      } ioreq_server;
> 
>      /* Cached CF8 for guest PCI config cycles */




* RE: [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain
  2021-01-12 21:52 ` [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain Oleksandr Tyshchenko
  2021-01-15 15:44   ` Julien Grall
@ 2021-01-18  9:09   ` Paul Durrant
  2021-01-20  9:00   ` Alex Bennée
  2 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  9:09 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Andrew Cooper',
	'George Dunlap', 'Ian Jackson',
	'Jan Beulich', 'Julien Grall',
	'Stefano Stabellini', 'Wei Liu',
	'Roger Pau Monné', 'Julien Grall'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant <paul@xen.org>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Ian Jackson
> <iwj@xenproject.org>; Jan Beulich <jbeulich@suse.com>; Julien Grall <julien@xen.org>; Stefano
> Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>;
> Julien Grall <julien.grall@arm.com>
> Subject: [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and this struct will be used
> on Arm as is. Move it to the common struct domain. This also
> significantly reduces the layering violation in the common code
> (*arch.hvm* usage).
> 
> We don't move ioreq_gfn since it is not used in the common code
> (the "legacy" mechanism is x86 specific).
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Paul Durrant <paul@xen.org>

...and I see you fix the alignment issue as part of the code movement in this patch :-)
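
For readers skimming the thread, the end state being described is
roughly the following shape in the common struct domain (a sketch
assembled from the hunk quoted in the previous message; surrounding
context elided):

    struct domain {
        /* ... */
    #ifdef CONFIG_IOREQ_SERVER
        struct {
            /* Lock protects all other values in the sub-struct. */
            spinlock_t lock;
            struct ioreq_server *server[MAX_NR_IOREQ_SERVERS];
        } ioreq_server;
    #endif
        /* ... */
    };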





* RE: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-12 21:52 ` [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common Oleksandr Tyshchenko
@ 2021-01-18  9:17   ` Paul Durrant
  2021-01-18 10:19     ` Oleksandr
  2021-01-20 16:21   ` Jan Beulich
  1 sibling, 1 reply; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  9:17 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Julien Grall', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Daniel De Graaf', 'Oleksandr Tyshchenko'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Julien Grall <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George
> Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall <julien@xen.org>;
> Stefano Stabellini <sstabellini@kernel.org>; Paul Durrant <paul@xen.org>; Daniel De Graaf
> <dgdegra@tycho.nsa.gov>; Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Subject: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
> 
> From: Julien Grall <julien.grall@arm.com>
> 
> As a lot of x86 code can be re-used on Arm later on, this patch
> moves the IOREQ related dm-op handling to the common code.
> 
> The idea is to have the top level dm-op handling arch-specific
> and call into ioreq_server_dm_op() for otherwise unhandled ops.
> Pros:
> - More natural than doing it the other way around (top level dm-op
> handling common).
> - Leave compat_dm_op() in x86 code.
> Cons:
> - Code duplication. Both arches have to duplicate do_dm_op(), etc.
> 
> Also update XSM code a bit to let dm-op be used on Arm.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> ***
> I decided to keep the common dm.h for the struct dmop_args declaration
> (to be included by Arm's dm.c); alternatively we could avoid
> introducing a new header by moving the declaration into an existing
> one, but I failed to find a suitable header whose context would fit.
> ***
> 
> Changes RFC -> V1:
>    - update XSM, related changes were pulled from:
>      [RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features
> 
> Changes V1 -> V2:
>    - update the author of a patch
>    - update patch description
>    - introduce xen/dm.h and move definitions here
> 
> Changes V2 -> V3:
>    - no changes
> 
> Changes V3 -> V4:
>    - rework to have the top level dm-op handling arch-specific
>    - update patch subject/description, was "xen/dm: Make x86's DM feature common"
>    - make a few functions static in common ioreq.c
> ---
>  xen/arch/x86/hvm/dm.c   | 101 +-----------------------------------
>  xen/common/ioreq.c      | 135 ++++++++++++++++++++++++++++++++++++++++++------
>  xen/include/xen/dm.h    |  39 ++++++++++++++
>  xen/include/xen/ioreq.h |  17 +-----
>  xen/include/xsm/dummy.h |   4 +-
>  xen/include/xsm/xsm.h   |   6 +--
>  xen/xsm/dummy.c         |   2 +-
>  xen/xsm/flask/hooks.c   |   5 +-
>  8 files changed, 171 insertions(+), 138 deletions(-)
>  create mode 100644 xen/include/xen/dm.h
> 
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> index d3e2a9e..dc8e47d 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -16,6 +16,7 @@
> 
>  #include <xen/event.h>
>  #include <xen/guest_access.h>
> +#include <xen/dm.h>
>  #include <xen/hypercall.h>
>  #include <xen/ioreq.h>
>  #include <xen/nospec.h>
> @@ -29,13 +30,6 @@
> 
>  #include <public/hvm/hvm_op.h>
> 
> -struct dmop_args {
> -    domid_t domid;
> -    unsigned int nr_bufs;
> -    /* Reserve enough buf elements for all current hypercalls. */
> -    struct xen_dm_op_buf buf[2];
> -};
> -
>  static bool _raw_copy_from_guest_buf_offset(void *dst,
>                                              const struct dmop_args *args,
>                                              unsigned int buf_idx,
> @@ -408,71 +402,6 @@ static int dm_op(const struct dmop_args *op_args)
> 
>      switch ( op.op )
>      {
> -    case XEN_DMOP_create_ioreq_server:
> -    {
> -        struct xen_dm_op_create_ioreq_server *data =
> -            &op.u.create_ioreq_server;
> -
> -        const_op = false;
> -
> -        rc = -EINVAL;
> -        if ( data->pad[0] || data->pad[1] || data->pad[2] )
> -            break;
> -
> -        rc = hvm_create_ioreq_server(d, data->handle_bufioreq,
> -                                     &data->id);
> -        break;
> -    }
> -
> -    case XEN_DMOP_get_ioreq_server_info:
> -    {
> -        struct xen_dm_op_get_ioreq_server_info *data =
> -            &op.u.get_ioreq_server_info;
> -        const uint16_t valid_flags = XEN_DMOP_no_gfns;
> -
> -        const_op = false;
> -
> -        rc = -EINVAL;
> -        if ( data->flags & ~valid_flags )
> -            break;
> -
> -        rc = hvm_get_ioreq_server_info(d, data->id,
> -                                       (data->flags & XEN_DMOP_no_gfns) ?
> -                                       NULL : &data->ioreq_gfn,
> -                                       (data->flags & XEN_DMOP_no_gfns) ?
> -                                       NULL : &data->bufioreq_gfn,
> -                                       &data->bufioreq_port);
> -        break;
> -    }
> -
> -    case XEN_DMOP_map_io_range_to_ioreq_server:
> -    {
> -        const struct xen_dm_op_ioreq_server_range *data =
> -            &op.u.map_io_range_to_ioreq_server;
> -
> -        rc = -EINVAL;
> -        if ( data->pad )
> -            break;
> -
> -        rc = hvm_map_io_range_to_ioreq_server(d, data->id, data->type,
> -                                              data->start, data->end);
> -        break;
> -    }
> -
> -    case XEN_DMOP_unmap_io_range_from_ioreq_server:
> -    {
> -        const struct xen_dm_op_ioreq_server_range *data =
> -            &op.u.unmap_io_range_from_ioreq_server;
> -
> -        rc = -EINVAL;
> -        if ( data->pad )
> -            break;
> -
> -        rc = hvm_unmap_io_range_from_ioreq_server(d, data->id, data->type,
> -                                                  data->start, data->end);
> -        break;
> -    }
> -
>      case XEN_DMOP_map_mem_type_to_ioreq_server:
>      {
>          struct xen_dm_op_map_mem_type_to_ioreq_server *data =
> @@ -523,32 +452,6 @@ static int dm_op(const struct dmop_args *op_args)
>          break;
>      }
> 
> -    case XEN_DMOP_set_ioreq_server_state:
> -    {
> -        const struct xen_dm_op_set_ioreq_server_state *data =
> -            &op.u.set_ioreq_server_state;
> -
> -        rc = -EINVAL;
> -        if ( data->pad )
> -            break;
> -
> -        rc = hvm_set_ioreq_server_state(d, data->id, !!data->enabled);
> -        break;
> -    }
> -
> -    case XEN_DMOP_destroy_ioreq_server:
> -    {
> -        const struct xen_dm_op_destroy_ioreq_server *data =
> -            &op.u.destroy_ioreq_server;
> -
> -        rc = -EINVAL;
> -        if ( data->pad )
> -            break;
> -
> -        rc = hvm_destroy_ioreq_server(d, data->id);
> -        break;
> -    }
> -
>      case XEN_DMOP_track_dirty_vram:
>      {
>          const struct xen_dm_op_track_dirty_vram *data =
> @@ -703,7 +606,7 @@ static int dm_op(const struct dmop_args *op_args)
>      }
> 
>      default:
> -        rc = -EOPNOTSUPP;
> +        rc = ioreq_server_dm_op(&op, d, &const_op);
>          break;
>      }
> 
> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> index a319c88..72b5da0 100644
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -591,8 +591,8 @@ static void hvm_ioreq_server_deinit(struct ioreq_server *s)
>      put_domain(s->emulator);
>  }
> 
> -int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
> -                            ioservid_t *id)
> +static int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
> +                                   ioservid_t *id)

Would this not be a good opportunity to drop the 'hvm_' prefix (here and elsewhere)?

  Paul
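
For context, the dispatch pattern that results from this rework looks
roughly like the following on each architecture (a sketch distilled from
the diff above, not the verbatim patch; the copy-in/copy-out and error
paths are elided):

    static int dm_op(const struct dmop_args *op_args)
    {
        /* ... copy in the op, arch-specific validation ... */

        switch ( op.op )
        {
        /* arch-specific XEN_DMOP_* cases stay here */

        default:
            /* Otherwise-unhandled ops are forwarded to common code. */
            rc = ioreq_server_dm_op(&op, d, &const_op);
            break;
        }

        /* ... copy the op back to the guest unless const_op ... */

        return rc;
    }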




* RE: [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
  2021-01-12 21:52 ` [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu Oleksandr Tyshchenko
  2021-01-15 19:34   ` Julien Grall
@ 2021-01-18  9:35   ` Paul Durrant
  2021-01-20 16:24   ` Jan Beulich
  2 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  9:35 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Jun Nakajima', 'Kevin Tian',
	'Julien Grall'

> -----Original Message-----
> From: Oleksandr Tyshchenko <olekstysh@gmail.com>
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant <paul@xen.org>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap <george.dunlap@citrix.com>; Ian Jackson
> <iwj@xenproject.org>; Julien Grall <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Jun
> Nakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>; Julien Grall
> <julien.grall@arm.com>
> Subject: [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and these fields will be used
> on Arm as is. Move them to the common struct vcpu as part of a new
> struct vcpu_io and drop the duplicating "io" prefixes. Also move
> enum hvm_io_completion to xen/sched.h and remove "hvm" prefixes.
> 
> This patch completely removes layering violation in the common code.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Paul Durrant <paul@xen.org>
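
As a mental model for this move, the destination looks something like
the following (field and type names are inferred from the description
above; the final naming in the tree may differ):

    /* Sketch: common per-vCPU I/O state, per the description. */
    struct vcpu_io {
        enum io_completion completion;  /* pending completion handling */
        ioreq_t            req;         /* I/O request in flight */
    };

    struct vcpu {
        /* ... */
        struct vcpu_io io;
        /* ... */
    };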




* RE: [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
  2021-01-12 21:52 ` [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common Oleksandr Tyshchenko
  2021-01-14  3:58   ` Wei Chen
@ 2021-01-18  9:38   ` Paul Durrant
  1 sibling, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  9:38 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Julien Grall', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Volodymyr Babchuk', 'Oleksandr Tyshchenko'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Julien Grall <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George
> Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall <julien@xen.org>;
> Stefano Stabellini <sstabellini@kernel.org>; Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Oleksandr
> Tyshchenko <oleksandr_tyshchenko@epam.com>
> Subject: [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
> 
> From: Julien Grall <julien.grall@arm.com>
> 
> As the x86 implementation of XENMEM_resource_ioreq_server can be
> re-used on Arm later on, this patch makes it common and removes
> arch_acquire_resource as unneeded.
> 
> Also re-order #include-s alphabetically.
> 
> This support is going to be used on Arm to be able to run a device
> emulator outside of the Xen hypervisor.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 

Reviewed-by: Paul Durrant <paul@xen.org>




* RE: [PATCH V4 12/24] xen/ioreq: Remove "hvm" prefixes from involved function names
  2021-01-12 21:52 ` [PATCH V4 12/24] xen/ioreq: Remove "hvm" prefixes from involved function names Oleksandr Tyshchenko
@ 2021-01-18  9:55   ` Paul Durrant
  0 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18  9:55 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Jun Nakajima', 'Kevin Tian',
	'Julien Grall'

> -----Original Message-----
> From: Oleksandr Tyshchenko <olekstysh@gmail.com>
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Jan Beulich <jbeulich@suse.com>; Andrew
> Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>;
> George Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall
> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Paul Durrant <paul@xen.org>; Jun
> Nakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>; Julien Grall
> <julien.grall@arm.com>
> Subject: [PATCH V4 12/24] xen/ioreq: Remove "hvm" prefixes from involved function names
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch removes "hvm" prefixes and infixes from IOREQ related
> function names in the common code and performs a renaming where
> appropriate according to the more consistent new naming scheme:
> - IOREQ server functions should start with "ioreq_server_"
> - IOREQ functions should start with "ioreq_"
> 
> A few function names are clarified to better fit into their purposes:
> handle_hvm_io_completion -> vcpu_ioreq_handle_completion
> hvm_io_pending           -> vcpu_ioreq_pending
> hvm_ioreq_init           -> ioreq_domain_init
> hvm_alloc_ioreq_mfn      -> ioreq_server_alloc_mfn
> hvm_free_ioreq_mfn       -> ioreq_server_free_mfn
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 

Reviewed-by: Paul Durrant <paul@xen.org>





* RE: [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
  2021-01-12 21:52 ` [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg() Oleksandr Tyshchenko
  2021-01-15 19:37   ` Julien Grall
@ 2021-01-18 10:00   ` Paul Durrant
  1 sibling, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18 10:00 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Julien Grall',
	'Stefano Stabellini', 'Julien Grall'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Paul Durrant <paul@xen.org>; Julien Grall
> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The cmpxchg() in ioreq_send_buffered() operates on memory shared
> with the emulator domain (and the target domain if the legacy
> interface is used).
> 
> In order to be on the safe side we need to switch
> to guest_cmpxchg64() to prevent a domain from DoSing Xen on Arm.
> 
> As there is no plan to support the legacy interface on Arm,
> the page will only be mapped in a single domain at a time,
> so we can use s->emulator in guest_cmpxchg64() safely.
> 
> Thankfully the only user of the legacy interface is x86 so far
> and there is no concern regarding the atomics operations.
> 
> Please note, that the legacy interface *must* not be used on Arm
> without revisiting the code.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Paul Durrant <paul@xen.org>
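
For reference, the change at the call site amounts to something like the
following (a before/after sketch in ioreq_send_buffered(), assuming the
usual shared-ring pointer layout; the exact line may differ):

    /* Before: plain cmpxchg() on memory shared with the emulator domain. */
    cmpxchg(&pg->ptrs.full, old.full, new.full);

    /* After: hardened variant; safe since only s->emulator maps the page. */
    guest_cmpxchg64(s->emulator, &pg->ptrs.full, old.full, new.full);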




* Re: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-18  9:17   ` Paul Durrant
@ 2021-01-18 10:19     ` Oleksandr
  2021-01-18 10:34       ` Paul Durrant
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-18 10:19 UTC (permalink / raw)
  To: paul
  Cc: xen-devel, 'Julien Grall', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Daniel De Graaf', 'Oleksandr Tyshchenko'


On 18.01.21 11:17, Paul Durrant wrote:

Hi Paul



>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
>> Sent: 12 January 2021 21:52
>> To: xen-devel@lists.xenproject.org
>> Cc: Julien Grall <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
>> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George
>> Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall <julien@xen.org>;
>> Stefano Stabellini <sstabellini@kernel.org>; Paul Durrant <paul@xen.org>; Daniel De Graaf
>> <dgdegra@tycho.nsa.gov>; Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> Subject: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
>>
>> From: Julien Grall <julien.grall@arm.com>
>>
>> As a lot of x86 code can be re-used on Arm later on, this patch
>> moves the IOREQ related dm-op handling to the common code.
>>
>> The idea is to have the top level dm-op handling arch-specific
>> and call into ioreq_server_dm_op() for otherwise unhandled ops.
>> Pros:
>> - More natural than doing it the other way around (top level dm-op
>> handling common).
>> - Leave compat_dm_op() in x86 code.
>> Cons:
>> - Code duplication. Both arches have to duplicate do_dm_op(), etc.
>>
>> Also update XSM code a bit to let dm-op be used on Arm.
>>
>> This support is going to be used on Arm to be able to run a device
>> emulator outside of the Xen hypervisor.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> ***
>> I decided to keep the common dm.h for the struct dmop_args declaration
>> (to be included by Arm's dm.c); alternatively we could avoid
>> introducing a new header by moving the declaration into an existing
>> one, but I failed to find a suitable header whose context would fit.
>> ***
>>
>> Changes RFC -> V1:
>>     - update XSM, related changes were pulled from:
>>       [RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features
>>
>> Changes V1 -> V2:
>>     - update the author of a patch
>>     - update patch description
>>     - introduce xen/dm.h and move definitions here
>>
>> Changes V2 -> V3:
>>     - no changes
>>
>> Changes V3 -> V4:
>>     - rework to have the top level dm-op handling arch-specific
>>     - update patch subject/description, was "xen/dm: Make x86's DM feature common"
>>     - make a few functions static in common ioreq.c
>> ---
>>   xen/arch/x86/hvm/dm.c   | 101 +-----------------------------------
>>   xen/common/ioreq.c      | 135 ++++++++++++++++++++++++++++++++++++++++++------
>>   xen/include/xen/dm.h    |  39 ++++++++++++++
>>   xen/include/xen/ioreq.h |  17 +-----
>>   xen/include/xsm/dummy.h |   4 +-
>>   xen/include/xsm/xsm.h   |   6 +--
>>   xen/xsm/dummy.c         |   2 +-
>>   xen/xsm/flask/hooks.c   |   5 +-
>>   8 files changed, 171 insertions(+), 138 deletions(-)
>>   create mode 100644 xen/include/xen/dm.h
>>
>> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
>> index d3e2a9e..dc8e47d 100644
>> --- a/xen/arch/x86/hvm/dm.c
>> +++ b/xen/arch/x86/hvm/dm.c
>> @@ -16,6 +16,7 @@
>>
>>   #include <xen/event.h>
>>   #include <xen/guest_access.h>
>> +#include <xen/dm.h>
>>   #include <xen/hypercall.h>
>>   #include <xen/ioreq.h>
>>   #include <xen/nospec.h>
>> @@ -29,13 +30,6 @@
>>
>>   #include <public/hvm/hvm_op.h>
>>
>> -struct dmop_args {
>> -    domid_t domid;
>> -    unsigned int nr_bufs;
>> -    /* Reserve enough buf elements for all current hypercalls. */
>> -    struct xen_dm_op_buf buf[2];
>> -};
>> -
>>   static bool _raw_copy_from_guest_buf_offset(void *dst,
>>                                               const struct dmop_args *args,
>>                                               unsigned int buf_idx,
>> @@ -408,71 +402,6 @@ static int dm_op(const struct dmop_args *op_args)
>>
>>       switch ( op.op )
>>       {
>> -    case XEN_DMOP_create_ioreq_server:
>> -    {
>> -        struct xen_dm_op_create_ioreq_server *data =
>> -            &op.u.create_ioreq_server;
>> -
>> -        const_op = false;
>> -
>> -        rc = -EINVAL;
>> -        if ( data->pad[0] || data->pad[1] || data->pad[2] )
>> -            break;
>> -
>> -        rc = hvm_create_ioreq_server(d, data->handle_bufioreq,
>> -                                     &data->id);
>> -        break;
>> -    }
>> -
>> -    case XEN_DMOP_get_ioreq_server_info:
>> -    {
>> -        struct xen_dm_op_get_ioreq_server_info *data =
>> -            &op.u.get_ioreq_server_info;
>> -        const uint16_t valid_flags = XEN_DMOP_no_gfns;
>> -
>> -        const_op = false;
>> -
>> -        rc = -EINVAL;
>> -        if ( data->flags & ~valid_flags )
>> -            break;
>> -
>> -        rc = hvm_get_ioreq_server_info(d, data->id,
>> -                                       (data->flags & XEN_DMOP_no_gfns) ?
>> -                                       NULL : &data->ioreq_gfn,
>> -                                       (data->flags & XEN_DMOP_no_gfns) ?
>> -                                       NULL : &data->bufioreq_gfn,
>> -                                       &data->bufioreq_port);
>> -        break;
>> -    }
>> -
>> -    case XEN_DMOP_map_io_range_to_ioreq_server:
>> -    {
>> -        const struct xen_dm_op_ioreq_server_range *data =
>> -            &op.u.map_io_range_to_ioreq_server;
>> -
>> -        rc = -EINVAL;
>> -        if ( data->pad )
>> -            break;
>> -
>> -        rc = hvm_map_io_range_to_ioreq_server(d, data->id, data->type,
>> -                                              data->start, data->end);
>> -        break;
>> -    }
>> -
>> -    case XEN_DMOP_unmap_io_range_from_ioreq_server:
>> -    {
>> -        const struct xen_dm_op_ioreq_server_range *data =
>> -            &op.u.unmap_io_range_from_ioreq_server;
>> -
>> -        rc = -EINVAL;
>> -        if ( data->pad )
>> -            break;
>> -
>> -        rc = hvm_unmap_io_range_from_ioreq_server(d, data->id, data->type,
>> -                                                  data->start, data->end);
>> -        break;
>> -    }
>> -
>>       case XEN_DMOP_map_mem_type_to_ioreq_server:
>>       {
>>           struct xen_dm_op_map_mem_type_to_ioreq_server *data =
>> @@ -523,32 +452,6 @@ static int dm_op(const struct dmop_args *op_args)
>>           break;
>>       }
>>
>> -    case XEN_DMOP_set_ioreq_server_state:
>> -    {
>> -        const struct xen_dm_op_set_ioreq_server_state *data =
>> -            &op.u.set_ioreq_server_state;
>> -
>> -        rc = -EINVAL;
>> -        if ( data->pad )
>> -            break;
>> -
>> -        rc = hvm_set_ioreq_server_state(d, data->id, !!data->enabled);
>> -        break;
>> -    }
>> -
>> -    case XEN_DMOP_destroy_ioreq_server:
>> -    {
>> -        const struct xen_dm_op_destroy_ioreq_server *data =
>> -            &op.u.destroy_ioreq_server;
>> -
>> -        rc = -EINVAL;
>> -        if ( data->pad )
>> -            break;
>> -
>> -        rc = hvm_destroy_ioreq_server(d, data->id);
>> -        break;
>> -    }
>> -
>>       case XEN_DMOP_track_dirty_vram:
>>       {
>>           const struct xen_dm_op_track_dirty_vram *data =
>> @@ -703,7 +606,7 @@ static int dm_op(const struct dmop_args *op_args)
>>       }
>>
>>       default:
>> -        rc = -EOPNOTSUPP;
>> +        rc = ioreq_server_dm_op(&op, d, &const_op);
>>           break;
>>       }
>>
>> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
>> index a319c88..72b5da0 100644
>> --- a/xen/common/ioreq.c
>> +++ b/xen/common/ioreq.c
>> @@ -591,8 +591,8 @@ static void hvm_ioreq_server_deinit(struct ioreq_server *s)
>>       put_domain(s->emulator);
>>   }
>>
>> -int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
>> -                            ioservid_t *id)
>> +static int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
>> +                                   ioservid_t *id)
> Would this not be a good opportunity to drop the 'hvm_' prefix (here and elsewhere)?

It would be, I will drop.


May I ask, are you ok with the alternative approach proposed by Jan and 
already implemented in the current version (top-level dm-op handling is 
arch-specific and calls into ioreq_server_dm_op() for otherwise unhandled 
ops)?

Initial discussion here:
https://lore.kernel.org/xen-devel/1606732298-22107-10-git-send-email-olekstysh@gmail.com/

-- 
Regards,

Oleksandr Tyshchenko




* RE: [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server()
  2021-01-12 21:52 ` [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server() Oleksandr Tyshchenko
  2021-01-15  1:24   ` Stefano Stabellini
@ 2021-01-18 10:23   ` Paul Durrant
  1 sibling, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18 10:23 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Stefano Stabellini',
	'Julien Grall', 'Volodymyr Babchuk',
	'Julien Grall'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Stefano Stabellini <sstabellini@kernel.org>;
> Julien Grall <julien@xen.org>; Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Paul Durrant
> <paul@xen.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server()
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch introduces a helper whose main purpose is to check
> whether a domain is using IOREQ server(s).
> 
> On Arm the current benefit is to avoid calling vcpu_ioreq_handle_completion()
> (which implies iterating over all possible IOREQ servers anyway)
> on every return in leave_hypervisor_to_guest() if there are no active
> servers for the particular domain.
> Also this helper will be used by one of the subsequent patches on Arm.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 

Reviewed-by: Paul Durrant <paul@xen.org>
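
For the curious, such a helper can be as small as the following (a
sketch assuming a per-domain count of registered servers is maintained;
the actual patch may differ in detail):

    /* Sketch: cheap check for hot paths like leave_hypervisor_to_guest(). */
    static inline bool domain_has_ioreq_server(const struct domain *d)
    {
        return d->ioreq_server.nr_servers;
    }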




* RE: [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common
  2021-01-12 21:52 ` [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common Oleksandr Tyshchenko
@ 2021-01-18 10:31   ` Paul Durrant
  2021-01-21 14:02     ` Jan Beulich
  0 siblings, 1 reply; 144+ messages in thread
From: Paul Durrant @ 2021-01-18 10:31 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko', xen-devel
  Cc: 'Oleksandr Tyshchenko', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Julien Grall'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
> Sent: 12 January 2021 21:52
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Jan Beulich <jbeulich@suse.com>; Andrew
> Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>;
> George Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall
> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Paul Durrant <paul@xen.org>; Julien
> Grall <julien.grall@arm.com>
> Subject: [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> As the IOREQ is a common feature now and we also need to
> invalidate the qemu/demu mapcache on Arm when the required condition
> occurs, this patch moves this function to the common code
> (and renames it to ioreq_signal_mapcache_invalidate).
> This patch also moves the per-domain qemu_mapcache_invalidate
> variable out of the arch sub-struct (and drops the "qemu" prefix).
> We don't put this variable inside the #ifdef CONFIG_IOREQ_SERVER
> at the end of struct domain, but in the hole next to the group
> of 5 bools further up, which is more efficient.
> 
> The subsequent patch will add mapcache invalidation handling on Arm.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Paul Durrant <paul@xen.org>
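
As a reminder of what is being moved, the function's job is to broadcast
an invalidation request to the emulator(s). In sketch form (based on the
x86 send_invalidate_req(); details may have shifted in the move):

    /* Sketch: ask the device model(s) to flush their map caches. */
    void ioreq_signal_mapcache_invalidate(void)
    {
        ioreq_t p = {
            .type = IOREQ_TYPE_INVALIDATE,
            .size = 4,
            .dir = IOREQ_WRITE,
            .data = ~0UL, /* flush all */
        };

        if ( ioreq_broadcast(&p, false) != 0 )
            gprintk(XENLOG_ERR, "Unsuccessful map-cache invalidate\n");
    }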




* RE: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-18 10:19     ` Oleksandr
@ 2021-01-18 10:34       ` Paul Durrant
  0 siblings, 0 replies; 144+ messages in thread
From: Paul Durrant @ 2021-01-18 10:34 UTC (permalink / raw)
  To: 'Oleksandr'
  Cc: xen-devel, 'Julien Grall', 'Jan Beulich',
	'Andrew Cooper', 'Roger Pau Monné',
	'Wei Liu', 'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Daniel De Graaf', 'Oleksandr Tyshchenko'

> -----Original Message-----
[snip]
> >> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> >> index a319c88..72b5da0 100644
> >> --- a/xen/common/ioreq.c
> >> +++ b/xen/common/ioreq.c
> >> @@ -591,8 +591,8 @@ static void hvm_ioreq_server_deinit(struct ioreq_server *s)
> >>       put_domain(s->emulator);
> >>   }
> >>
> >> -int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
> >> -                            ioservid_t *id)
> >> +static int hvm_create_ioreq_server(struct domain *d, int bufioreq_handling,
> >> +                                   ioservid_t *id)
> > Would this not be a good opportunity to drop the 'hvm_' prefix (here and elsewhere)?
> 
> It would be, I will drop.
> 
> 
> May I ask, are you ok with the alternative approach proposed by Jan and
> already implemented in the current version (top-level dm-op handling is
> arch-specific and calls into ioreq_server_dm_op() for otherwise unhandled
> ops)?
> 

Yes, I think per-arch hypercall handlers is the tidiest way to go.

  Paul

> Initial discussion here:
> https://lore.kernel.org/xen-devel/1606732298-22107-10-git-send-email-olekstysh@gmail.com/
> 
> --
> Regards,
> 
> Oleksandr Tyshchenko





* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-17 17:11     ` Oleksandr
  2021-01-17 18:07       ` Julien Grall
@ 2021-01-18 10:44       ` Jan Beulich
  2021-01-18 15:52         ` Oleksandr
  1 sibling, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-18 10:44 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko, Julien Grall

On 17.01.2021 18:11, Oleksandr wrote:
> On 15.01.21 22:26, Julien Grall wrote:
>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>> --- a/xen/arch/arm/io.c
>>> +++ b/xen/arch/arm/io.c
>>> @@ -16,6 +16,7 @@
>>>    * GNU General Public License for more details.
>>>    */
>>>   +#include <xen/ioreq.h>
>>>   #include <xen/lib.h>
>>>   #include <xen/spinlock.h>
>>>   #include <xen/sched.h>
>>> @@ -23,6 +24,7 @@
>>>   #include <asm/cpuerrata.h>
>>>   #include <asm/current.h>
>>>   #include <asm/mmio.h>
>>> +#include <asm/hvm/ioreq.h>
>>
>> Shouldn't this have been included by "xen/ioreq.h"?
> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But it turned
> out that nothing inside the common header required the arch one to be
> included, and I was asked to include the arch header where it was
> actually needed (in several *.c files).

I guess the general usage model of the two headers needs to be
established first: If the per-arch header declares only stuff
needed by the soon-to-be-common ioreq.c, then indeed it should be
only that file and the producer(s) of the arch_*() functions
which include that header; it should then in particular not be
included by xen/ioreq.h.

However, with the change request on patch 1 I think that usage
model goes away at least for now, at which point the question
is what exactly the per-arch header still declares, and based
on that it would need to be decided whether xen/ioreq.h
should include it.

>>> --- a/xen/include/asm-arm/domain.h
>>> +++ b/xen/include/asm-arm/domain.h
>>> @@ -10,6 +10,7 @@
>>>   #include <asm/gic.h>
>>>   #include <asm/vgic.h>
>>>   #include <asm/vpl011.h>
>>> +#include <public/hvm/dm_op.h>
>>
>> May I ask, why do you need to include dm_op.h here?
> I needed to include that header to make some bits visible
> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here is a
> really good question.
> I don't remember exactly; probably I followed x86's domain.h, which also
> included it.
> So, trying to remove the inclusion here, I get several build failures on
> Arm which could be fixed if I include that header from dm.h and ioreq.h:
> 
> Shall I do it this way?

The general rule ought to be that headers include what they need,
but not more. Header dependencies are quite problematic already,
so every dependency we can avoid (or eliminate) will help. This
goes as far as only forward declaring structures where possible.

>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu 
>>> *v) {}
>>>     #define arch_vm_assist_valid_mask(d) (1UL << 
>>> VMASST_TYPE_runstate_update_flag)
>>>   +#define has_vpci(d)    ({ (void)(d); false; })
>>> +
>>>   #endif /* __ASM_DOMAIN_H__ */
>>>     /*
>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h 
>>> b/xen/include/asm-arm/hvm/ioreq.h
>>> new file mode 100644
>>> index 0000000..19e1247
>>> --- /dev/null
>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>
>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ as 
>> the IOREQ is now meant to be agnostic?
> Good question... The _common_ IOREQ code is indeed arch-agnostic. But
> can the _arch_ IOREQ code be treated as really subarch-agnostic?
> I think on Arm it can, and it is most likely ok to keep it in
> "asm-arm/", but how would that correlate with x86's IOREQ code, which
> is HVM-specific and located in the "hvm" subdir?

I think for Arm's sake this should be used as asm/ioreq.h, where
x86 would gain a new header consisting of just

#include <asm/hvm/ioreq.h>

as there the functionality is needed for HVM only.
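
Spelling that out, the wrapper would be trivial (a sketch of the
hypothetical xen/include/asm-x86/ioreq.h):

    #ifndef __ASM_X86_IOREQ_H__
    #define __ASM_X86_IOREQ_H__

    #include <asm/hvm/ioreq.h>

    #endif /* __ASM_X86_IOREQ_H__ */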

Jan



* Re: [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
  2021-01-17 20:23     ` Oleksandr
@ 2021-01-18 10:57       ` Julien Grall
  2021-01-18 13:23         ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-18 10:57 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Volodymyr Babchuk, Julien Grall

Hi Oleksandr,

On 17/01/2021 20:23, Oleksandr wrote:
> 
> On 15.01.21 22:55, Julien Grall wrote:
>>> So we need
>>> to check if the I/O has completed and wait again if it hasn't (we will
>>> block the vCPU again until an event is received). This loop makes sure
>>> that all the vCPU works are done before we return to the guest.
>>>
>>> The call chain below:
>>> check_for_vcpu_work -> vcpu_ioreq_handle_completion -> wait_for_io ->
>>> wait_on_xen_event_channel
>>>
>>> The worst that can happen here is that the vCPU will never run again
>>> (the I/O will never complete). But, in Xen case, if the I/O never
>>> completes then it most likely means that something went horribly
>>> wrong with the Device Emulator. And it is most likely not safe
>>> to continue. So letting the vCPU spin forever if the I/O never
>>> completes is safer than letting it continue and leaving the guest
>>> in an unclear state, and is the best we can do for now.
>>>
>>> Please note, using this loop we will not spin forever on a pCPU,
>>> preventing any other vCPUs from being scheduled. At every loop
>>> we will call check_for_pcpu_work() that will process pending
>>> softirqs. In case of failure, the guest will crash and the vCPU
>>> will be unscheduled. In normal case, if the rescheduling is necessary
>>> (might be set by a timer or by a caller in check_for_vcpu_work(),
>>> where wait_for_io() is a preemption point) the vCPU will be rescheduled
>>> to give place to someone else.
>>>
>> What you describe here is a bug that was introduced by this series. If
>> you think the code requires a separate patch, then please split off
>> patch #14 so the code calling vcpu_ioreq_handle_completion() happens
>> here.
> I am afraid I don't understand which bug you are talking about; I just
> tried to explain why using a loop is not bad (there wouldn't be any
> impact on other vCPUs, etc) and the worst case which could happen.
> Also I don't see a reason why the code requires a separate patch
> (probably, if I understood the bug, I would see a reason ...). Could you
> please clarify?

Your commit message begins with:

"This patch adds proper handling of return value of
vcpu_ioreq_handle_completion() which involves using a loop in
leave_hypervisor_to_guest()."

I read this as "there was a bug in the code base and we are now fixing
it". AFAICT, this patch would not be necessary if we didn't apply patch
#14 in Xen (assuming the rest of IOREQ is complete).

Therefore you are fixing a bug that you introduced in the same series.

It is considered bad practice because it means
   1) we have to review code that is known to be "buggy" (patch #14)
   2) it adds more churn in the series than necessary

Instead, it would be better to split your changes in
check_for_vcpu_work() from patch #14 and add them here.

[...]

>> So I would rework the loop to write it as:
>>
>> while ( check_for_pcpu_work() )
>>    check_for_pcpu_work();
>> check_for_pcpu_work();
> 
> makes sense, I assume you meant while ( check_for_vcpu_work() ) ...

Yes. Sorry for the typo.
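
So, with the typo corrected, the intended shape of the exit path is (a
sketch using the names from this thread):

    /*
     * Sketch: drain vCPU work (e.g. I/O completion) before returning to
     * the guest, processing pending softirqs on every iteration.
     */
    while ( check_for_vcpu_work() )
        check_for_pcpu_work();
    check_for_pcpu_work();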

Cheers,

-- 
Julien Grall



* Re: [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed
  2021-01-18 10:57       ` Julien Grall
@ 2021-01-18 13:23         ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-18 13:23 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Volodymyr Babchuk, Julien Grall


On 18.01.21 12:57, Julien Grall wrote:
> Hi Oleksandr,


Hi Julien



>
> On 17/01/2021 20:23, Oleksandr wrote:
>>
>> On 15.01.21 22:55, Julien Grall wrote:
>>>> So we need
>>>> to check if the I/O has completed and wait again if it hasn't (we will
>>>> block the vCPU again until an event is received). This loop makes sure
>>>> that all the vCPU works are done before we return to the guest.
>>>>
>>>> The call chain below:
>>>> check_for_vcpu_work -> vcpu_ioreq_handle_completion -> wait_for_io ->
>>>> wait_on_xen_event_channel
>>>>
>>>> The worst that can happen here is that the vCPU will never run again
>>>> (the I/O will never complete). But, in Xen case, if the I/O never
>>>> completes then it most likely means that something went horribly
>>>> wrong with the Device Emulator. And it is most likely not safe
>>>> to continue. So letting the vCPU spin forever if the I/O never
>>>> completes is safer than letting it continue and leaving the guest
>>>> in an unclear state, and is the best we can do for now.
>>>>
>>>> Please note, using this loop we will not spin forever on a pCPU,
>>>> preventing any other vCPUs from being scheduled. At every loop
>>>> we will call check_for_pcpu_work() that will process pending
>>>> softirqs. In case of failure, the guest will crash and the vCPU
>>>> will be unscheduled. In normal case, if the rescheduling is necessary
>>>> (might be set by a timer or by a caller in check_for_vcpu_work(),
>>>> where wait_for_io() is a preemption point) the vCPU will be 
>>>> rescheduled
>>>> to give place to someone else.
>>>>
>>> What you describe here is a bug that was introduced by this series.
>>> If you think the code requires a separate patch, then please split
>>> off patch #14 so the code calling vcpu_ioreq_handle_completion()
>>> happens here.
>> I am afraid I don't understand which bug you are talking about; I
>> just tried to explain why using a loop is not bad (there wouldn't be
>> any impact on other vCPUs, etc) and the worst case which could happen.
>> Also I don't see a reason why the code requires a separate patch
>> (probably, if I understood the bug, I would see a reason ...). Could
>> you please clarify?
>
> Your commit message begins with:
>
> "This patch adds proper handling of return value of
> vcpu_ioreq_handle_completion() which involves using a loop in
> leave_hypervisor_to_guest()."
>
> I read this as "there was a bug in the code base and we are now fixing
> it". AFAICT, this patch would not be necessary if we didn't apply
> patch #14 in Xen (assuming the rest of IOREQ is complete).
>
> Therefore you are fixing a bug that you introduced in the same series.

Now I got it. Thank you for the clarification.


>
>
> It is considered bad practice because it means
>   1) we have to review code that is known to be "buggy" (patch #14)
>   2) it adds more churn in the series than necessary
>
> Instead, it would be better to split your changes in
> check_for_vcpu_work() from patch #14 and add them here.

Now I agree, will do (and update patch subject).


>
>
> [...]
>
>>> So I would rework the loop to write it as:
>>>
>>> while ( check_for_pcpu_work() )
>>>    check_for_pcpu_work();
>>> check_for_pcpu_work();
>>
>> makes sense, I assume you meant while ( check_for_vcpu_work() ) ...
>
> Yes. Sorry for the typo.
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko




* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-18 10:44       ` Jan Beulich
@ 2021-01-18 15:52         ` Oleksandr
  2021-01-18 16:00           ` Jan Beulich
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-18 15:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko, Julien Grall


On 18.01.21 12:44, Jan Beulich wrote:

Hi Jan

> On 17.01.2021 18:11, Oleksandr wrote:
>> On 15.01.21 22:26, Julien Grall wrote:
>>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>>> --- a/xen/arch/arm/io.c
>>>> +++ b/xen/arch/arm/io.c
>>>> @@ -16,6 +16,7 @@
>>>>     * GNU General Public License for more details.
>>>>     */
>>>>    +#include <xen/ioreq.h>
>>>>    #include <xen/lib.h>
>>>>    #include <xen/spinlock.h>
>>>>    #include <xen/sched.h>
>>>> @@ -23,6 +24,7 @@
>>>>    #include <asm/cpuerrata.h>
>>>>    #include <asm/current.h>
>>>>    #include <asm/mmio.h>
>>>> +#include <asm/hvm/ioreq.h>


Note to self:

Remove obsolete bool ioreq_complete_mmio(void) from asm-arm/hvm/ioreq.h



>>> Shouldn't this have been included by "xen/ioreq.h"?
>> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But it turned
>> out that nothing inside the common header required the arch one to be
>> included, and I was asked to include the arch header where it was
>> actually needed (in several *.c files).
> I guess the general usage model of the two headers needs to be
> established first: If the per-arch header declares only stuff
> needed by the soon-to-be-common ioreq.c, then indeed it should be
> only that file and the producer(s) of the arch_*() functions
> which include that header; it should then in particular not be
> included by xen/ioreq.h.
>
> However, with the change request on patch 1 I think that usage
> model goes away at least for now, at which point the question
> is what exactly the per-arch header still declares, and based
> on that it would need to be decided whether xen/ioreq.h
> should include it.

ok, well.

x86's arch header now contains a few IOREQ_STATUS_* #define-s, but Arm's
contains more besides that:
- stuff which is needed by common/ioreq.c, mostly stubs which are not
  implemented yet (handle_pio, etc)
- stuff which is not needed by common/ioreq.c, internal Arm bits
  (handle_ioserv, try_fwd_ioserv)

Could we please decide based on the information above?
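
For concreteness, the x86 side boils down to a handful of status
mappings along these lines (a sketch; exact names to be checked against
the tree):

    /* Sketch: x86 maps the arch-neutral IOREQ status codes onto X86EMUL_*. */
    #define IOREQ_STATUS_HANDLED     X86EMUL_OKAY
    #define IOREQ_STATUS_UNHANDLED   X86EMUL_UNHANDLEABLE
    #define IOREQ_STATUS_RETRY       X86EMUL_RETRY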



>
>>>> --- a/xen/include/asm-arm/domain.h
>>>> +++ b/xen/include/asm-arm/domain.h
>>>> @@ -10,6 +10,7 @@
>>>>    #include <asm/gic.h>
>>>>    #include <asm/vgic.h>
>>>>    #include <asm/vpl011.h>
>>>> +#include <public/hvm/dm_op.h>
>>> May I ask, why do you need to include dm_op.h here?
>> I needed to include that header to make some bits visible
>> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here is a
>> really good question.
>> I don't remember exactly; probably I followed x86's domain.h, which also
>> included it.
>> So, trying to remove the inclusion here, I get several build failures on
>> Arm which could be fixed if I include that header from dm.h and ioreq.h:
>>
>> Shall I do it this way?
> The general rule ought to be that headers include what they need,
> but not more. Header dependencies are quite problematic already,
> so every dependency we can avoid (or eliminate) will help. This
> goes as far as only forward declaring structures where possible.

I got it.


>
>>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu
>>>> *v) {}
>>>>      #define arch_vm_assist_valid_mask(d) (1UL <<
>>>> VMASST_TYPE_runstate_update_flag)
>>>>    +#define has_vpci(d)    ({ (void)(d); false; })
>>>> +
>>>>    #endif /* __ASM_DOMAIN_H__ */
>>>>      /*
>>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h
>>>> b/xen/include/asm-arm/hvm/ioreq.h
>>>> new file mode 100644
>>>> index 0000000..19e1247
>>>> --- /dev/null
>>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ as
>>> the IOREQ is now meant to be agnostic?
>> Good question... The _common_ IOREQ code is indeed arch-agnostic. But
>> can the _arch_ IOREQ code be treated as really subarch-agnostic?
>> I think on Arm it can, and it is most likely ok to keep it in
>> "asm-arm/", but how would that correlate with x86's IOREQ code, which
>> is HVM-specific and located in the "hvm" subdir?
> I think for Arm's sake this should be used as asm/ioreq.h, where
> x86 would gain a new header consisting of just
>
> #include <asm/hvm/ioreq.h>
>
> as there the functionality is needed for HVM only.
For me this sounds perfectly fine. I think this would also address
Julien's question.
May I introduce that new header together with moving IOREQ to the common
code (patch #4)?


-- 
Regards,

Oleksandr Tyshchenko




* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-18 15:52         ` Oleksandr
@ 2021-01-18 16:00           ` Jan Beulich
  2021-01-18 16:29             ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-18 16:00 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko, Julien Grall

On 18.01.2021 16:52, Oleksandr wrote:
> On 18.01.21 12:44, Jan Beulich wrote:
>> On 17.01.2021 18:11, Oleksandr wrote:
>>> On 15.01.21 22:26, Julien Grall wrote:
>>>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>>>> --- a/xen/arch/arm/io.c
>>>>> +++ b/xen/arch/arm/io.c
>>>>> @@ -16,6 +16,7 @@
>>>>>     * GNU General Public License for more details.
>>>>>     */
>>>>>    +#include <xen/ioreq.h>
>>>>>    #include <xen/lib.h>
>>>>>    #include <xen/spinlock.h>
>>>>>    #include <xen/sched.h>
>>>>> @@ -23,6 +24,7 @@
>>>>>    #include <asm/cpuerrata.h>
>>>>>    #include <asm/current.h>
>>>>>    #include <asm/mmio.h>
>>>>> +#include <asm/hvm/ioreq.h>
> 
> 
> Note to self:
> 
> Remove obsolete bool ioreq_complete_mmio(void) from asm-arm/hvm/ioreq.h
> 
> 
> 
>>>> Shouldn't this have been included by "xen/ioreq.h"?
>>> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But it turned
>>> out that nothing inside the common header required the arch one to be
>>> included, and I was asked to include the arch header where it was
>>> actually needed (in several *.c files).
>> I guess the general usage model of the two headers needs to be
>> established first: If the per-arch header declares only stuff
>> needed by the soon-to-be-common ioreq.c, then indeed it should be
>> only that file and the producer(s) of the arch_*() functions
>> which include that header; it should then in particular not be
>> included by xen/ioreq.h.
>>
>> However, with the change request on patch 1 I think that usage
>> model goes away at least for now, at which point the question
>> is what exactly the per-arch header still declares, and based
>> on that it would need to be decided whether xen/ioreq.h
>> should include it.
> 
> ok, well.
> 
> x86's arch header now contains a few IOREQ_STATUS_* #define-s, but Arm's 
> contains more stuff
> besides that:
> - stuff which is needed by common/ioreq.c, mostly stubs which are not 
> implemented yet (handle_pio, etc)
> - stuff which is not needed by common/ioreq.c, internal Arm bits 
> (handle_ioserv, try_fwd_ioserv)
> 
> Could we please decide based on the information above?

You're in the best position to tell. The IOREQ_STATUS_* you
mention may require including from xen/ioreq.h, but as said,
...

>>>>> --- a/xen/include/asm-arm/domain.h
>>>>> +++ b/xen/include/asm-arm/domain.h
>>>>> @@ -10,6 +10,7 @@
>>>>>    #include <asm/gic.h>
>>>>>    #include <asm/vgic.h>
>>>>>    #include <asm/vpl011.h>
>>>>> +#include <public/hvm/dm_op.h>
>>>> May I ask, why do you need to include dm_op.h here?
>>> I needed to include that header to make some bits visible
>>> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here - is a
>>> really good question.
>>> I don't remember exactly, probably I followed x86's domain.h which also
>>> included it.
>>> So, trying to remove the inclusion here, I get several build failures on
>>> Arm which could be fixed if I include that header from dm.h and ioreq.h:
>>>
>>> Shall I do it this way?
>> The general rule ought to be that headers include what they need,
>> but not more. Header dependencies are quite problematic already,
>> so every dependency we can avoid (or eliminate) will help. This
>> goes as far as only forward declaring structures where possible.
> 
> I got it.

... it depends. If xen/ioreq.h needs nothing from asm/ioreq.h,
then I wouldn't see why it should include it.

>>>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu
>>>>> *v) {}
>>>>>      #define arch_vm_assist_valid_mask(d) (1UL <<
>>>>> VMASST_TYPE_runstate_update_flag)
>>>>>    +#define has_vpci(d)    ({ (void)(d); false; })
>>>>> +
>>>>>    #endif /* __ASM_DOMAIN_H__ */
>>>>>      /*
>>>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h
>>>>> b/xen/include/asm-arm/hvm/ioreq.h
>>>>> new file mode 100644
>>>>> index 0000000..19e1247
>>>>> --- /dev/null
>>>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ as
>>>> the IOREQ is now meant to be agnostic?
>>> Good question... The _common_ IOREQ code is indeed arch-agnostic. But,
>>> can the _arch_ IOREQ code be treated as really subarch-agnostic?
>>> I think, on Arm it can and it is most likely ok to keep it in
>>> "asm-arm/", but how it would be correlated with x86's IOREQ code which
>>> is HVM specific and located
>>> in "hvm" subdir?
>> I think for Arm's sake this should be used as asm/ioreq.h, where
>> x86 would gain a new header consisting of just
>>
>> #include <asm/hvm/ioreq.h>
>>
>> as there the functionality is needed for HVM only.
> For me this sounds perfectly fine. I think, this would also address 
> Julien's question.
> May I introduce that new header together with moving IOREQ to the common 
> code (patch #4)?

As with about everything, introduce new things the first time you
need them, unless this results in overly big patches (in which
case suitably splitting up is desirable, but of course not always
possible). IOW if you introduce xen/ioreq.h and it needs to
include asm/ioreq.h, then of course at this point you also need
to introduce the asm-x86/ioreq.h wrapper.

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-18 16:00           ` Jan Beulich
@ 2021-01-18 16:29             ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-18 16:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko, Julien Grall


On 18.01.21 18:00, Jan Beulich wrote:

Hi Jan

> On 18.01.2021 16:52, Oleksandr wrote:
>> On 18.01.21 12:44, Jan Beulich wrote:
>>> On 17.01.2021 18:11, Oleksandr wrote:
>>>> On 15.01.21 22:26, Julien Grall wrote:
>>>>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>>>>> --- a/xen/arch/arm/io.c
>>>>>> +++ b/xen/arch/arm/io.c
>>>>>> @@ -16,6 +16,7 @@
>>>>>>      * GNU General Public License for more details.
>>>>>>      */
>>>>>>     +#include <xen/ioreq.h>
>>>>>>     #include <xen/lib.h>
>>>>>>     #include <xen/spinlock.h>
>>>>>>     #include <xen/sched.h>
>>>>>> @@ -23,6 +24,7 @@
>>>>>>     #include <asm/cpuerrata.h>
>>>>>>     #include <asm/current.h>
>>>>>>     #include <asm/mmio.h>
>>>>>> +#include <asm/hvm/ioreq.h>
>>
>> Note to self:
>>
>> Remove obsolete bool ioreq_complete_mmio(void) from asm-arm/hvm/ioreq.h
>>
>>
>>
>>>>> Shouldn't this have been included by "xen/ioreq.h"?
>>>> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But, it turned
>>>> out that there was nothing inside the common header that required the arch
>>>> one to be included, and
>>>> I was asked to include the arch header where it was indeed needed (several
>>>> *.c files).
>>> I guess the general usage model of the two headers needs to be
>>> established first: If the per-arch header declares only stuff
>>> needed by the soon-to-be common/ioreq.c, then indeed it should be
>>> only that file and the producer(s) of the arch_*() functions
>>> which include that header; it should then in particular not be
>>> included by xen/ioreq.h.
>>>
>>> However, with the change request on patch 1 I think that usage
>>> model goes away at least for now, at which point the question
>>> is what exactly the per-arch header still declares, and based
>>> on that it would need to be decided whether xen/ioreq.h
>>> should include it.
>> ok, well.
>>
>> x86's arch header now contains a few IOREQ_STATUS_* #define-s, but Arm's
>> contains more stuff
>> besides that:
>> - stuff which is needed by common/ioreq.c, mostly stubs which are not
>> implemented yet (handle_pio, etc)
>> - stuff which is not needed by common/ioreq.c, internal Arm bits
>> (handle_ioserv, try_fwd_ioserv)
>>
>> Could we please decide based on the information above?
> You're in the best position to tell. The IOREQ_STATUS_* you
> mention may require including from xen/ioreq.h, but as said,
> ...
>
>>>>>> --- a/xen/include/asm-arm/domain.h
>>>>>> +++ b/xen/include/asm-arm/domain.h
>>>>>> @@ -10,6 +10,7 @@
>>>>>>     #include <asm/gic.h>
>>>>>>     #include <asm/vgic.h>
>>>>>>     #include <asm/vpl011.h>
>>>>>> +#include <public/hvm/dm_op.h>
>>>>> May I ask, why do you need to include dm_op.h here?
>>>> I needed to include that header to make some bits visible
>>>> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here - is a
>>>> really good question.
>>>> I don't remember exactly, probably I followed x86's domain.h which also
>>>> included it.
>>>> So, trying to remove the inclusion here, I get several build failures on
>>>> Arm which could be fixed if I include that header from dm.h and ioreq.h:
>>>>
>>>> Shall I do this way?
>>> The general rule ought to be that headers include what they need,
>>> but not more. Header dependencies are quite problematic already,
>>> so every dependency we can avoid (or eliminate) will help. This
>>> goes as far as only forward declaring structures where possible.
>> I got it.
> ... it depends. If xen/ioreq.h needs nothing from asm/ioreq.h,
> then I wouldn't see why it should include it.
>
>>>>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu
>>>>>> *v) {}
>>>>>>       #define arch_vm_assist_valid_mask(d) (1UL <<
>>>>>> VMASST_TYPE_runstate_update_flag)
>>>>>>     +#define has_vpci(d)    ({ (void)(d); false; })
>>>>>> +
>>>>>>     #endif /* __ASM_DOMAIN_H__ */
>>>>>>       /*
>>>>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h
>>>>>> b/xen/include/asm-arm/hvm/ioreq.h
>>>>>> new file mode 100644
>>>>>> index 0000000..19e1247
>>>>>> --- /dev/null
>>>>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>>>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ as
>>>>> the IOREQ is now meant to be agnostic?
>>>> Good question... The _common_ IOREQ code is indeed arch-agnostic. But,
>>>> can the _arch_ IOREQ code be treated as really subarch-agnostic?
>>>> I think, on Arm it can and it is most likely ok to keep it in
>>>> "asm-arm/", but how it would be correlated with x86's IOREQ code which
>>>> is HVM specific and located
>>>> in "hvm" subdir?
>>> I think for Arm's sake this should be used as asm/ioreq.h, where
>>> x86 would gain a new header consisting of just
>>>
>>> #include <asm/hvm/ioreq.h>
>>>
>>> as there the functionality is needed for HVM only.
>> For me this sounds perfectly fine. I think, this would also address
>> Julien's question.
>> May I introduce that new header together with moving IOREQ to the common
>> code (patch #4)?
> As with about everything, introduce new things the first time you
> need them, unless this results in overly big patches (in which
> case suitably splitting up is desirable, but of course not always
> possible). IOW if you introduce xen/ioreq.h and it needs to
> include asm/ioreq.h, then of course at this point you also need
> to introduce the asm-x86/ioreq.h wrapper.


Thank you for the clarification.

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
  2021-01-15 14:35       ` Alex Bennée
@ 2021-01-18 17:42         ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-18 17:42 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Wei Chen, Julien Grall, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko,
	xen-devel


On 15.01.21 16:35, Alex Bennée wrote:

Hi Alex

> Oleksandr <olekstysh@gmail.com> writes:
>
>> On 14.01.21 05:58, Wei Chen wrote:
>>> Hi Oleksandr,
>> Hi Wei
> <snip>
>>>> @@ -1090,6 +1091,40 @@ static int acquire_grant_table(struct domain *d,
>>>> unsigned int id,
>>>>        return 0;
>>>>    }
>>>>
>>>> +static int acquire_ioreq_server(struct domain *d,
>>>> +                                unsigned int id,
>>>> +                                unsigned long frame,
>>>> +                                unsigned int nr_frames,
>>>> +                                xen_pfn_t mfn_list[])
>>>> +{
>>>> +#ifdef CONFIG_IOREQ_SERVER
>>>> +    ioservid_t ioservid = id;
>>>> +    unsigned int i;
>>>> +    int rc;
>>>> +
>>>> +    if ( !is_hvm_domain(d) )
>>>> +        return -EINVAL;
>>>> +
>>>> +    if ( id != (unsigned int)ioservid )
>>>> +        return -EINVAL;
>>>> +
>>>> +    for ( i = 0; i < nr_frames; i++ )
>>>> +    {
>>>> +        mfn_t mfn;
>>>> +
>>>> +        rc = hvm_get_ioreq_server_frame(d, id, frame + i, &mfn);
>>>> +        if ( rc )
>>>> +            return rc;
>>>> +
>>>> +        mfn_list[i] = mfn_x(mfn);
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +#else
>>>> +    return -EOPNOTSUPP;
>>>> +#endif
>>>> +}
>>>> +
> <snip>
>>> This change could not be applied to the latest staging branch.
>> Yes, thank you for noticing that.  The code around was changed a bit (the patch
>> series is based on 10-day-old staging), I will update for the next
>> version.
> I think the commit that introduced config ARCH_ACQUIRE_RESOURCE could
> probably be reverted as it achieves pretty much the same thing as the
> above code by moving the logic into the common code path.
>
> The only real practical difference is an inline stub vs a general-purpose
> function with an IOREQ-specific #ifdeferry.
> <snip>
Hmm, thank you for noticing that.
So, yes, I should either add an extra patch for V5 to revert 
ARCH_ACQUIRE_RESOURCE before applying this one
or rebase it onto the current codebase (and likely drop all collected R-bs 
because of the additional changes removing ARCH_ACQUIRE_RESOURCE bits).


-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-17 18:52         ` Oleksandr
@ 2021-01-18 19:17           ` Julien Grall
  2021-01-19 15:20             ` Oleksandr
  2021-01-20 15:50           ` Julien Grall
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-18 19:17 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko

Hi Oleksandr,

On 17/01/2021 18:52, Oleksandr wrote:
> 
> On 17.01.21 20:07, Julien Grall wrote:
>>
>>
>> On 17/01/2021 17:11, Oleksandr wrote:
>>>
>>> On 15.01.21 22:26, Julien Grall wrote:
>>>
>>> Hi Julien
>>
>> Hi Oleksandr,
> 
> 
> Hi Julien
> 
> 
> 
>>
>>>>
>>>>> +
>>>>>       PROGRESS(xen):
>>>>>           ret = relinquish_memory(d, &d->xenpage_list);
>>>>>           if ( ret )
>>>>> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
>>>>> index ae7ef96..9814481 100644
>>>>> --- a/xen/arch/arm/io.c
>>>>> +++ b/xen/arch/arm/io.c
>>>>> @@ -16,6 +16,7 @@
>>>>>    * GNU General Public License for more details.
>>>>>    */
>>>>>   +#include <xen/ioreq.h>
>>>>>   #include <xen/lib.h>
>>>>>   #include <xen/spinlock.h>
>>>>>   #include <xen/sched.h>
>>>>> @@ -23,6 +24,7 @@
>>>>>   #include <asm/cpuerrata.h>
>>>>>   #include <asm/current.h>
>>>>>   #include <asm/mmio.h>
>>>>> +#include <asm/hvm/ioreq.h>
>>>>
>>>> Shouldn't this have been included by "xen/ioreq.h"?
>>> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But, it 
>>> turned out that there was nothing inside the common header that required 
>>> the arch one to be included, and
>>> I was asked to include the arch header where it was indeed needed 
>>> (several *.c files).
>>
>> Fair enough.
>>
>> [...]
>>
>>>>
>>>> If you return IO_HANDLED here, then it means that we will take care 
>>>> of the previous I/O but the current one is going to be ignored. 
>>> Which current one? As I understand, if try_fwd_ioserv() gets called 
>>> with vio->req.state == STATE_IORESP_READY then this is a second round 
>>> after emulator completes the emulation (the first round was when
>>> we returned IO_RETRY down the function and claimed that we would need 
>>> a completion), so we are still dealing with previous I/O.
>>> vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio() -> 
>>> try_handle_mmio() -> try_fwd_ioserv() -> handle_ioserv()
>>> And after we return IO_HANDLED here, handle_ioserv() will be called 
>>> to complete the handling of this previous I/O emulation.
>>> Or I really missed something?
>>
>> Hmmm... I somehow thought try_fw_ioserv() would only be called the 
>> first time. Do you have a branch with your code applied? This would 
>> help to follow the different paths.
> Yes, I mentioned it in the cover letter.
> 
> Please see
> https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml5
> (why 5 - because I started counting from the RFC)

Oh, I looked at the cover letter and didn't find it. Hence why I asked. 
I should have looked more carefully. Thanks!

I have looked closer at the question and I am not sure I understand why 
arch_ioreq_complete_mmio() is going to call try_handle_mmio().

This looks pretty inefficient to me because we already know the IO was 
handled by the IOREQ server.

I realize that x86 is calling handle_mmio() again. However, I don't 
think we need the same on Arm because the instructions for accessing 
device memory are a lot simpler (you can only read or store at most a 
64-bit value).

So I would like to keep our emulation simple and not rely on 
try_ioserv_fw() to always return true when called from completion (AFAICT 
it is not possible to return false then).
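
For readers following along, a simplified sketch of the two rounds
being discussed (reconstructed from the call chain quoted above; not
verbatim code):

    /*
     * Round 1 - the guest access traps into Xen:
     *   try_handle_mmio() -> try_fwd_ioserv()
     *     sends the ioreq to the emulator and returns IO_RETRY,
     *     claiming a completion will be needed.
     *
     * ... the emulator finishes the emulation; vio->req.state
     *     becomes STATE_IORESP_READY ...
     *
     * Round 2 - completion when the vCPU resumes:
     *   vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio()
     *     -> try_handle_mmio() -> try_fwd_ioserv() -> handle_ioserv()
     *   i.e. the re-walk of the MMIO handlers being questioned here.
     */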

I will answer to the rest separately.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-18 19:17           ` Julien Grall
@ 2021-01-19 15:20             ` Oleksandr
  2021-01-20  0:50               ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-19 15:20 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko


Hi Julien


>>
>>
>>>
>>>>>
>>>>>> +
>>>>>>       PROGRESS(xen):
>>>>>>           ret = relinquish_memory(d, &d->xenpage_list);
>>>>>>           if ( ret )
>>>>>> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
>>>>>> index ae7ef96..9814481 100644
>>>>>> --- a/xen/arch/arm/io.c
>>>>>> +++ b/xen/arch/arm/io.c
>>>>>> @@ -16,6 +16,7 @@
>>>>>>    * GNU General Public License for more details.
>>>>>>    */
>>>>>>   +#include <xen/ioreq.h>
>>>>>>   #include <xen/lib.h>
>>>>>>   #include <xen/spinlock.h>
>>>>>>   #include <xen/sched.h>
>>>>>> @@ -23,6 +24,7 @@
>>>>>>   #include <asm/cpuerrata.h>
>>>>>>   #include <asm/current.h>
>>>>>>   #include <asm/mmio.h>
>>>>>> +#include <asm/hvm/ioreq.h>
>>>>>
>>>>> Shouldn't this have been included by "xen/ioreq.h"?
>>>> Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But, it 
>>>> turned out that there was nothing inside common header required 
>>>> arch one to be included and
>>>> I was asked to include arch header where it was indeed needed 
>>>> (several *.c files).
>>>
>>> Fair enough.
>>>
>>> [...]
>>>
>>>>>
>>>>> If you return IO_HANDLED here, then it means that we will take care 
>>>>> of the previous I/O but the current one is going to be ignored. 
>>>> Which current one? As I understand, if try_fwd_ioserv() gets called 
>>>> with vio->req.state == STATE_IORESP_READY then this is a second 
>>>> round after emulator completes the emulation (the first round was when
>>>> we returned IO_RETRY down the function and claimed that we would 
>>>> need a completion), so we are still dealing with previous I/O.
>>>> vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio() -> 
>>>> try_handle_mmio() -> try_fwd_ioserv() -> handle_ioserv()
>>>> And after we return IO_HANDLED here, handle_ioserv() will be called 
>>>> to complete the handling of this previous I/O emulation.
>>>> Or I really missed something?
>>>
>>> Hmmm... I somehow thought try_fw_ioserv() would only be called the 
>>> first time. Do you have a branch with your code applied? This would 
>>> help to follow the different paths.
>> Yes, I mentioned it in the cover letter.
>>
>> Please see
>> https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml5
>> (why 5 - because I started counting from the RFC)
>
> Oh, I looked at the cover letter and didn't find it. Hence why I 
> asked. I should have looked more carefully. Thanks!
>
> I have looked closer at the question and I am not sure I understand 
> why arch_ioreq_complete_mmio() is going to call try_handle_mmio().
>
> This looks pretty inefficient to me because we already know the IO was 
> handled by the IOREQ server.
>
> I realize that x86 is calling handle_mmio() again. However, I don't 
> think we need the same on Arm because the instructions for accessing 
> device memory are a lot simpler (you can only read or store at most a 
> 64-bit value).

I think, I agree.


>
> So I would like to keep our emulation simple and not rely on 
> try_ioserv_fw() to always return true when called from completion 
> (AFAICT it is not possible to return false then).


So what you are proposing is, technically, just replacing try_ioserv_fw() 
with handle_ioserv()?


diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
index 40b9e59..0508bd8 100644
--- a/xen/arch/arm/ioreq.c
+++ b/xen/arch/arm/ioreq.c
@@ -101,12 +101,10 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
*regs,

  bool arch_ioreq_complete_mmio(void)
  {
-    struct vcpu *v = current;
      struct cpu_user_regs *regs = guest_cpu_user_regs();
      const union hsr hsr = { .bits = regs->hsr };
-    paddr_t addr = v->io.req.addr;

-    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
+    if ( handle_ioserv(regs, current) == IO_HANDLED )
      {
          advance_pc(regs, hsr);
          return true;


>
> I will answer to the rest separately.

Thank you.


>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-17 12:45     ` Oleksandr
@ 2021-01-20  0:23       ` Stefano Stabellini
  2021-01-21  9:51         ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-20  0:23 UTC (permalink / raw)
  To: Oleksandr
  Cc: Stefano Stabellini, xen-devel, Julien Grall, Julien Grall,
	Volodymyr Babchuk, Oleksandr Tyshchenko

On Sun, 17 Jan 2021, Oleksandr wrote:
> On 15.01.21 02:55, Stefano Stabellini wrote:
> > On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
> > > From: Julien Grall <julien.grall@arm.com>
> > > 
> > > This patch adds basic IOREQ/DM support on Arm. The subsequent
> > > patches will improve functionality and add remaining bits.
> > > 
> > > The IOREQ/DM features are supposed to be built with IOREQ_SERVER
> > > option enabled, which is disabled by default on Arm for now.
> > > 
> > > Please note, the "PIO handling" TODO is expected to be left unaddressed
> > > for the current series. It is not a big issue for now while Xen
> > > doesn't have support for vPCI on Arm. On Arm64 they are only used
> > > for the PCI IO BAR and we would probably want to expose them to the emulator
> > > as PIO access to make a DM completely arch-agnostic. So "PIO handling"
> > > should be implemented when we add support for vPCI.
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > > [On Arm only]
> > > Tested-by: Wei Chen <Wei.Chen@arm.com>
> > > 
> > > ---
> > > Please note, this is a split/cleanup/hardening of Julien's PoC:
> > > "Add support for Guest IO forwarding to a device emulator"
> > > 
> > > Changes RFC -> V1:
> > >     - was split into:
> > >       - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
> > >       - xen/mm: Handle properly reference in set_foreign_p2m_entry() on
> > > Arm
> > >     - update patch description
> > >     - update asm-arm/hvm/ioreq.h according to the newly introduced arch
> > > functions:
> > >       - arch_hvm_destroy_ioreq_server()
> > >       - arch_handle_hvm_io_completion()
> > >     - update arch files to include xen/ioreq.h
> > >     - remove HVMOP plumbing
> > >     - rewrite the logic to properly handle the case when hvm_send_ioreq()
> > > returns IO_RETRY
> > >     - add logic to properly handle the handle_hvm_io_completion() return
> > > value
> > >     - rename handle_mmio() to ioreq_handle_complete_mmio()
> > >     - move paging_mark_pfn_dirty() to asm-arm/paging.h
> > >     - remove forward declaration for hvm_ioreq_server in asm-arm/paging.h
> > >     - move try_fwd_ioserv() to ioreq.c, provide stubs if
> > > !CONFIG_IOREQ_SERVER
> > >     - do not remove #ifdef CONFIG_IOREQ_SERVER in memory.c for guarding
> > > xen/ioreq.h
> > >     - use gdprintk in try_fwd_ioserv(), remove unneeded prints
> > >     - update list of #include-s
> > >     - move has_vpci() to asm-arm/domain.h
> > >     - add a comment (TODO) to unimplemented yet handle_pio()
> > >     - remove hvm_mmio_first(last)_byte() and hvm_ioreq_(page/vcpu/server)
> > > structs
> > >       from the arch files, they were already moved to the common code
> > >     - remove set_foreign_p2m_entry() changes, they will be properly
> > > implemented
> > >       in the follow-up patch
> > >     - select IOREQ_SERVER for Arm instead of Arm64 in Kconfig
> > >     - remove x86's realmode and other unneeded stubs from xen/ioreq.h
> > >     - clarify ioreq_t p.df usage in try_fwd_ioserv()
> > >     - set ioreq_t p.count to 1 in try_fwd_ioserv()
> > > 
> > > Changes V1 -> V2:
> > >     - was split into:
> > >       - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
> > >       - xen/arm: Stick around in leave_hypervisor_to_guest until I/O has
> > > completed
> > >     - update the author of a patch
> > >     - update patch description
> > >     - move a loop in leave_hypervisor_to_guest() to a separate patch
> > >     - set IOREQ_SERVER disabled by default
> > >     - remove already clarified /* XXX */
> > >     - replace BUG() by ASSERT_UNREACHABLE() in handle_pio()
> > >     - remove default case for handling the return value of
> > > try_handle_mmio()
> > >     - remove struct hvm_domain, enum hvm_io_completion, struct
> > > hvm_vcpu_io,
> > >       struct hvm_vcpu from asm-arm/domain.h, these are common materials
> > > now
> > >     - update everything according to the recent changes (IOREQ related
> > > function
> > >       names don't contain "hvm" prefixes/infixes anymore, IOREQ related
> > > fields
> > >       are part of common struct vcpu/domain now, etc)
> > > 
> > > Changes V2 -> V3:
> > >     - update patch according the "legacy interface" is x86 specific
> > >     - add dummy arch hooks
> > >     - remove dummy paging_mark_pfn_dirty()
> > >     - don’t include <xen/domain_page.h> in common ioreq.c
> > >     - don’t include <public/hvm/ioreq.h> in arch ioreq.h
> > >     - remove #define ioreq_params(d, i)
> > > 
> > > Changes V3 -> V4:
> > >     - rebase
> > >     - update patch according to the renaming IO_ -> VIO_ (io_ -> vio_)
> > >       and misc changes to arch hooks
> > >     - update patch according to the IOREQ related dm-op handling changes
> > >     - don't include <xen/ioreq.h> from arch header
> > >     - make all arch hooks out-of-line
> > >     - add a comment above IOREQ_STATUS_* #define-s
> > > ---
> > >   xen/arch/arm/Makefile           |   2 +
> > >   xen/arch/arm/dm.c               | 122 +++++++++++++++++++++++
> > >   xen/arch/arm/domain.c           |   9 ++
> > >   xen/arch/arm/io.c               |  12 ++-
> > >   xen/arch/arm/ioreq.c            | 213
> > > ++++++++++++++++++++++++++++++++++++++++
> > >   xen/arch/arm/traps.c            |  13 +++
> > >   xen/include/asm-arm/domain.h    |   3 +
> > >   xen/include/asm-arm/hvm/ioreq.h |  72 ++++++++++++++
> > >   xen/include/asm-arm/mmio.h      |   1 +
> > >   9 files changed, 446 insertions(+), 1 deletion(-)
> > >   create mode 100644 xen/arch/arm/dm.c
> > >   create mode 100644 xen/arch/arm/ioreq.c
> > >   create mode 100644 xen/include/asm-arm/hvm/ioreq.h
> > > 
> > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > > index 512ffdd..16e6523 100644
> > > --- a/xen/arch/arm/Makefile
> > > +++ b/xen/arch/arm/Makefile
> > > @@ -13,6 +13,7 @@ obj-y += cpuerrata.o
> > >   obj-y += cpufeature.o
> > >   obj-y += decode.o
> > >   obj-y += device.o
> > > +obj-$(CONFIG_IOREQ_SERVER) += dm.o
> > >   obj-y += domain.o
> > >   obj-y += domain_build.init.o
> > >   obj-y += domctl.o
> > > @@ -27,6 +28,7 @@ obj-y += guest_atomics.o
> > >   obj-y += guest_walk.o
> > >   obj-y += hvm.o
> > >   obj-y += io.o
> > > +obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
> > >   obj-y += irq.o
> > >   obj-y += kernel.init.o
> > >   obj-$(CONFIG_LIVEPATCH) += livepatch.o
> > > diff --git a/xen/arch/arm/dm.c b/xen/arch/arm/dm.c
> > > new file mode 100644
> > > index 0000000..e6dedf4
> > > --- /dev/null
> > > +++ b/xen/arch/arm/dm.c
> > > @@ -0,0 +1,122 @@
> > > +/*
> > > + * Copyright (c) 2019 Arm ltd.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > it
> > > + * under the terms and conditions of the GNU General Public License,
> > > + * version 2, as published by the Free Software Foundation.
> > > + *
> > > + * This program is distributed in the hope it will be useful, but WITHOUT
> > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> > > for
> > > + * more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License
> > > along with
> > > + * this program; If not, see <http://www.gnu.org/licenses/>.
> > > + */
> > > +
> > > +#include <xen/dm.h>
> > > +#include <xen/guest_access.h>
> > > +#include <xen/hypercall.h>
> > > +#include <xen/ioreq.h>
> > > +#include <xen/nospec.h>
> > > +
> > > +static int dm_op(const struct dmop_args *op_args)
> > > +{
> > > +    struct domain *d;
> > > +    struct xen_dm_op op;
> > > +    bool const_op = true;
> > > +    long rc;
> > > +    size_t offset;
> > > +
> > > +    static const uint8_t op_size[] = {
> > > +        [XEN_DMOP_create_ioreq_server]              = sizeof(struct
> > > xen_dm_op_create_ioreq_server),
> > > +        [XEN_DMOP_get_ioreq_server_info]            = sizeof(struct
> > > xen_dm_op_get_ioreq_server_info),
> > > +        [XEN_DMOP_map_io_range_to_ioreq_server]     = sizeof(struct
> > > xen_dm_op_ioreq_server_range),
> > > +        [XEN_DMOP_unmap_io_range_from_ioreq_server] = sizeof(struct
> > > xen_dm_op_ioreq_server_range),
> > > +        [XEN_DMOP_set_ioreq_server_state]           = sizeof(struct
> > > xen_dm_op_set_ioreq_server_state),
> > > +        [XEN_DMOP_destroy_ioreq_server]             = sizeof(struct
> > > xen_dm_op_destroy_ioreq_server),
> > > +    };
> > > +
> > > +    rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
> > > +    if ( rc )
> > > +        return rc;
> > > +
> > > +    rc = xsm_dm_op(XSM_DM_PRIV, d);
> > > +    if ( rc )
> > > +        goto out;
> > > +
> > > +    offset = offsetof(struct xen_dm_op, u);
> > > +
> > > +    rc = -EFAULT;
> > > +    if ( op_args->buf[0].size < offset )
> > > +        goto out;
> > > +
> > > +    if ( copy_from_guest_offset((void *)&op, op_args->buf[0].h, 0,
> > > offset) )
> > > +        goto out;
> > > +
> > > +    if ( op.op >= ARRAY_SIZE(op_size) )
> > > +    {
> > > +        rc = -EOPNOTSUPP;
> > > +        goto out;
> > > +    }
> > > +
> > > +    op.op = array_index_nospec(op.op, ARRAY_SIZE(op_size));
> > > +
> > > +    if ( op_args->buf[0].size < offset + op_size[op.op] )
> > > +        goto out;
> > > +
> > > +    if ( copy_from_guest_offset((void *)&op.u, op_args->buf[0].h, offset,
> > > +                                op_size[op.op]) )
> > > +        goto out;
> > > +
> > > +    rc = -EINVAL;
> > > +    if ( op.pad )
> > > +        goto out;
> > > +
> > > +    rc = ioreq_server_dm_op(&op, d, &const_op);
> > > +
> > > +    if ( (!rc || rc == -ERESTART) &&
> > > +         !const_op && copy_to_guest_offset(op_args->buf[0].h, offset,
> > > +                                           (void *)&op.u, op_size[op.op])
> > > )
> > > +        rc = -EFAULT;
> > > +
> > > + out:
> > > +    rcu_unlock_domain(d);
> > > +
> > > +    return rc;
> > > +}
> > > +
> > > +long do_dm_op(domid_t domid,
> > > +              unsigned int nr_bufs,
> > > +              XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
> > > +{
> > > +    struct dmop_args args;
> > > +    int rc;
> > > +
> > > +    if ( nr_bufs > ARRAY_SIZE(args.buf) )
> > > +        return -E2BIG;
> > > +
> > > +    args.domid = domid;
> > > +    args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);
> > > +
> > > +    if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
> > > +        return -EFAULT;
> > > +
> > > +    rc = dm_op(&args);
> > > +
> > > +    if ( rc == -ERESTART )
> > > +        rc = hypercall_create_continuation(__HYPERVISOR_dm_op, "iih",
> > > +                                           domid, nr_bufs, bufs);
> > > +
> > > +    return rc;
> > > +}
> > I might have missed something in the discussions but this function is
> > identical to xen/arch/x86/hvm/dm.c:do_dm_op, why not make it common?
> > 
> > Also the previous function dm_op is very similar to
> > xen/arch/x86/hvm/dm.c:dm_op I would prefer to make them common if
> > possible. Was this already discussed?
> Well, let me explain. Both dm_op() and do_dm_op() were indeed common (top
> level dm-op handling common) for previous versions, so Arm's dm.c didn't
> contain this stuff.
> The idea to make it the other way around (top-level dm-op handling arch-specific,
> calling into ioreq_server_dm_op() for otherwise unhandled ops) was discussed
> at [1] which, besides
> its pros, leads to code duplication, so Arm's dm.c has to duplicate some
> stuff, etc.
> I was thinking about moving do_dm_op(), which is the _same_ for both arches, to
> common code, but I am not sure whether it is conceptually correct with that
> new "alternative" approach of handling dm-op.

Yes, I think it makes sense to make do_dm_op common because it is
identical. That should be easy.

I realize that the common part of dm_op is the initial boilerplate which
is similar for every hypercall, so I think it is also OK if we don't
share it and leave it as it is in this version of the series.
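
For concreteness, the function that could move verbatim to common code
(the body below is taken from the patch above; only the suggested new
home, e.g. xen/common/dm.c, is an assumption):

    long do_dm_op(domid_t domid,
                  unsigned int nr_bufs,
                  XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
    {
        struct dmop_args args;
        int rc;

        if ( nr_bufs > ARRAY_SIZE(args.buf) )
            return -E2BIG;

        args.domid = domid;
        args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);

        if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
            return -EFAULT;

        /* dm_op() stays arch-specific in this approach. */
        rc = dm_op(&args);

        if ( rc == -ERESTART )
            rc = hypercall_create_continuation(__HYPERVISOR_dm_op, "iih",
                                               domid, nr_bufs, bufs);

        return rc;
    }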

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-19 15:20             ` Oleksandr
@ 2021-01-20  0:50               ` Stefano Stabellini
  2021-01-20 15:57                 ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-20  0:50 UTC (permalink / raw)
  To: Oleksandr
  Cc: Julien Grall, xen-devel, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Oleksandr Tyshchenko

On Tue, 19 Jan 2021, Oleksandr wrote:
> > > > > > >       PROGRESS(xen):
> > > > > > >           ret = relinquish_memory(d, &d->xenpage_list);
> > > > > > >           if ( ret )
> > > > > > > diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
> > > > > > > index ae7ef96..9814481 100644
> > > > > > > --- a/xen/arch/arm/io.c
> > > > > > > +++ b/xen/arch/arm/io.c
> > > > > > > @@ -16,6 +16,7 @@
> > > > > > >    * GNU General Public License for more details.
> > > > > > >    */
> > > > > > >   +#include <xen/ioreq.h>
> > > > > > >   #include <xen/lib.h>
> > > > > > >   #include <xen/spinlock.h>
> > > > > > >   #include <xen/sched.h>
> > > > > > > @@ -23,6 +24,7 @@
> > > > > > >   #include <asm/cpuerrata.h>
> > > > > > >   #include <asm/current.h>
> > > > > > >   #include <asm/mmio.h>
> > > > > > > +#include <asm/hvm/ioreq.h>
> > > > > > 
> > > > > > Shouldn't this have been included by "xen/ioreq.h"?
> > > > > Well, for V1 asm/hvm/ioreq.h was included by xen/ioreq.h. But, it
> > > > > turned out that there was nothing inside the common header that required
> > > > > the arch one to be included, and
> > > > > I was asked to include the arch header where it was indeed needed (several
> > > > > *.c files).
> > > > 
> > > > Fair enough.
> > > > 
> > > > [...]
> > > > 
> > > > > > 
> > > > > > If you return IO_HANDLED here, then it means that we will take care
> > > > > > of the previous I/O but the current one is going to be ignored. 
> > > > > Which current one? As I understand, if try_fwd_ioserv() gets called
> > > > > with vio->req.state == STATE_IORESP_READY then this is a second round
> > > > > after emulator completes the emulation (the first round was when
> > > > > we returned IO_RETRY down the function and claimed that we would need
> > > > > a completion), so we are still dealing with previous I/O.
> > > > > vcpu_ioreq_handle_completion() -> arch_ioreq_complete_mmio() ->
> > > > > try_handle_mmio() -> try_fwd_ioserv() -> handle_ioserv()
> > > > > And after we return IO_HANDLED here, handle_ioserv() will be called to
> > > > > complete the handling of this previous I/O emulation.
> > > > > Or I really missed something?
> > > > 
> > > > Hmmm... I somehow thought try_fw_ioserv() would only be called the first
> > > > time. Do you have a branch with your code applied? This would help to
> > > > follow the different paths.
> > > Yes, I mentioned it in the cover letter.
> > > 
> > > Please see
> > > https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml5
> > > (why 5 - because I started counting from the RFC)
> > 
> > Oh, I looked at the cover letter and didn't find it. Hence why I asked. I
> > should have looked more carefully. Thanks!
> > 
> > I have looked closer at the question and I am not sure I understand why
> > arch_ioreq_complete_mmio() is going to call try_handle_mmio().
> > 
> > This looks pretty inefficient to me because we already know the IO was
> > handled by the IOREQ server.
> > 
> > I realize that x86 is calling handle_mmio() again. However, I don't think we
> > need the same on Arm because the instructions for accessing device memory are
> > a lot simpler (you can only read or store at most a 64-bit value).
> 
> I think, I agree.

Yes I agree too


> > So I would like to keep our emulation simple and not rely on try_ioserv_fw()
> > to always return true when called from completion (AFAICT it is not possible
> > to return false then).
> 
> 
> So what you are proposing is, technically, just replacing try_ioserv_fw()
> with handle_ioserv()?
> 
> 
> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> index 40b9e59..0508bd8 100644
> --- a/xen/arch/arm/ioreq.c
> +++ b/xen/arch/arm/ioreq.c
> @@ -101,12 +101,10 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
> 
>  bool arch_ioreq_complete_mmio(void)
>  {
> -    struct vcpu *v = current;
>      struct cpu_user_regs *regs = guest_cpu_user_regs();
>      const union hsr hsr = { .bits = regs->hsr };
> -    paddr_t addr = v->io.req.addr;
> 
> -    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
> +    if ( handle_ioserv(regs, current) == IO_HANDLED )
>      {
>          advance_pc(regs, hsr);
>          return true;

Yes, but I think we want to keep the check

    vio->req.state == STATE_IORESP_READY

So maybe (uncompiled, untested):

    if ( v->io.req.state != STATE_IORESP_READY )
        return false;

    if ( handle_ioserv(regs, current) == IO_HANDLED )
    {
        advance_pc(regs, hsr);
        return true;
    }

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
  2021-01-12 21:52 ` [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common Oleksandr Tyshchenko
  2021-01-15 15:25   ` Julien Grall
@ 2021-01-20  8:48   ` Alex Bennée
  2021-01-20  9:31     ` Julien Grall
  1 sibling, 1 reply; 144+ messages in thread
From: Alex Bennée @ 2021-01-20  8:48 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall,
	xen-devel


Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> The IOREQ is a common feature now and this helper will be used
> on Arm as is. Move it to xen/ioreq.h and remove "hvm" prefix.
>
> Although PIO handling on Arm is not introduced with the current series
> (it will be implemented when we add support for vPCI), technically
> the PIOs exist on Arm (however they are accessed the same way as MMIO)
> and it would be better not to diverge now.

I find this description a little confusing. When you say PIO do you mean
using instructions like in/out on the x86? If so then AFAIK it's a
legacy feature of x86 as everything I've come across since just does
MMIO, including PCI.

The code changes look fine to me though:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
  2021-01-12 21:52 ` [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common Oleksandr Tyshchenko
  2021-01-15 15:34   ` Julien Grall
@ 2021-01-20  8:57   ` Alex Bennée
  2021-01-20 16:15   ` Jan Beulich
  2 siblings, 0 replies; 144+ messages in thread
From: Alex Bennée @ 2021-01-20  8:57 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall,
	xen-devel


Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> The IOREQ is a common feature now and these helpers will be used
> on Arm as is. Move them to xen/ioreq.h and replace "hvm" prefixes
> with "ioreq".
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Paul Durrant <paul@xen.org>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
  2021-01-12 21:52 ` [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common Oleksandr Tyshchenko
  2021-01-15 15:36   ` Julien Grall
  2021-01-18  8:59   ` Paul Durrant
@ 2021-01-20  8:58   ` Alex Bennée
  2 siblings, 0 replies; 144+ messages in thread
From: Alex Bennée @ 2021-01-20  8:58 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Julien Grall, xen-devel


Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> The IOREQ is a common feature now and these structs will be used
> on Arm as is. Move them to xen/ioreq.h and remove "hvm" prefixes.
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain
  2021-01-12 21:52 ` [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain Oleksandr Tyshchenko
  2021-01-15 15:44   ` Julien Grall
  2021-01-18  9:09   ` Paul Durrant
@ 2021-01-20  9:00   ` Alex Bennée
  2 siblings, 0 replies; 144+ messages in thread
From: Alex Bennée @ 2021-01-20  9:00 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper, George Dunlap,
	Ian Jackson, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	Julien Grall, xen-devel


Oleksandr Tyshchenko <olekstysh@gmail.com> writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> The IOREQ is a common feature now and this struct will be used
> on Arm as is. Move it to common struct domain. This also
> significantly reduces the layering violation in the common code
> (*arch.hvm* usage).
>
> We don't move ioreq_gfn since it is not used in the common code
> (the "legacy" mechanism is x86 specific).
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
  2021-01-20  8:48   ` Alex Bennée
@ 2021-01-20  9:31     ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-20  9:31 UTC (permalink / raw)
  To: Alex Bennée, Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Stefano Stabellini, Julien Grall, xen-devel

Hi Alex,

On 20/01/2021 08:48, Alex Bennée wrote:
> 
> Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
> 
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> The IOREQ is a common feature now and this helper will be used
>> on Arm as is. Move it to xen/ioreq.h and remove "hvm" prefix.
>>
>> Although PIO handling on Arm is not introduced with the current series
>> (it will be implemented when we add support for vPCI), technically
>> the PIOs exist on Arm (however they are accessed the same way as MMIO)
>> and it would be better not to diverge now.
> 
> I find this description a little confusing. When you say PIO do you mean
> using instructions like in/out on the x86? If so then AFAIK it's a
> legacy feature of x86 as everything I've come across since just does
> MMIO, including PCI.

Stefano and I had quite a long discussion about this a few months ago 
(see [1]).

From my understanding, while Arm will access the PCI I/O BAR via MMIO, 
the BAR itself will be configured using an offset from a fixed I/O 
window base address. IOW, we don't configure the BAR with a full MMIO 
address.

In the case the hostbridge is emulated in Xen, I would like to re-use 
the TYPE_PIO for such access because it makes the device model 
arch-agnostic.

I believe this would behave the same way as a real PCI device card: you 
can plug it anywhere without having to understand the underlying 
architecture.

If we were going to use the MMIO type, then we would need:
   1) Inform each device model where the I/O window is (necessary to be 
able to know we are accessing the I/O BAR)
   2) Have arch boilerplate in the device model
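
As a rough sketch of that forwarding choice, in the style of the ioreq
construction in try_fwd_ioserv() (GUEST_IO_WINDOW_BASE/SIZE are
invented names for the fixed I/O window; IOREQ_TYPE_PIO and
IOREQ_TYPE_COPY are the existing public ioreq types):

    /* Classify a trapped guest access so the device model sees an
     * arch-agnostic port offset instead of a full MMIO address. */
    if ( addr >= GUEST_IO_WINDOW_BASE &&
         addr < GUEST_IO_WINDOW_BASE + GUEST_IO_WINDOW_SIZE )
    {
        p.type = IOREQ_TYPE_PIO;               /* I/O BAR access */
        p.addr = addr - GUEST_IO_WINDOW_BASE;  /* offset in the window */
    }
    else
    {
        p.type = IOREQ_TYPE_COPY;              /* regular MMIO */
        p.addr = addr;
    }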

> 
> The code changes look fine to me though:
> 
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> 

Cheers,

[1] <4cbe37bd-abd2-3d02-535e-cca6f7497ef2@xen.org>


-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-17 18:52         ` Oleksandr
  2021-01-18 19:17           ` Julien Grall
@ 2021-01-20 15:50           ` Julien Grall
  2021-01-21  8:50             ` Oleksandr
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-20 15:50 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko

Hi Oleksandr,

On 17/01/2021 18:52, Oleksandr wrote:
>>>>>   diff --git a/xen/include/asm-arm/domain.h 
>>>>> b/xen/include/asm-arm/domain.h
>>>>> index 6819a3b..c235e5b 100644
>>>>> --- a/xen/include/asm-arm/domain.h
>>>>> +++ b/xen/include/asm-arm/domain.h
>>>>> @@ -10,6 +10,7 @@
>>>>>   #include <asm/gic.h>
>>>>>   #include <asm/vgic.h>
>>>>>   #include <asm/vpl011.h>
>>>>> +#include <public/hvm/dm_op.h>
>>>>
>>>> May I ask, why do you need to include dm_op.h here?
>>> I needed to include that header to make some bits visible 
>>> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here - is a 
>>> really good question.
>>> I don't remember exactly, probably I followed x86's domain.h which 
>>> also included it.
>>> So, trying to remove the inclusion here, I get several build failures 
>>> on Arm which could be fixed if I include that header from dm.h and 
>>> ioreq.h:
>>>
>>> Shall I do it this way?
>>
>> If the failures are indeed because ioreq.h and dm.h use definitions 
>> from public/hvm/dm_op.h, then yes. Can you post the errors?
> Please see attached, although I built for Arm32 (and the whole series), 
> I think errors are valid for Arm64 also.

Thanks!

> error1.txt - when removing #include <public/hvm/dm_op.h> from 
> asm-arm/domain.h

For this one, I agree that including <public/hvm/dm_op.h> in xen.h looks like 
the best solution.

> error2.txt - when adding #include <public/hvm/dm_op.h> to xen/ioreq.h

It looks like the error is happening in dm.c rather than xen/dm.h. Any 
reason to not include <public/hvm/dm_op.h> in dm.c directly?


> error3.txt - when adding #include <public/hvm/dm_op.h> to xen/dm.h

I am a bit confused with this one. Isn't it the same as error1.txt?

> 
> 
>>
>>
>> [...]
>>
>>>>>   #include <public/hvm/params.h>
>>>>>     struct hvm_domain
>>>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct vcpu 
>>>>> *v) {}
>>>>>     #define arch_vm_assist_valid_mask(d) (1UL << 
>>>>> VMASST_TYPE_runstate_update_flag)
>>>>>   +#define has_vpci(d)    ({ (void)(d); false; })
>>>>> +
>>>>>   #endif /* __ASM_DOMAIN_H__ */
>>>>>     /*
>>>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h 
>>>>> b/xen/include/asm-arm/hvm/ioreq.h
>>>>> new file mode 100644
>>>>> index 0000000..19e1247
>>>>> --- /dev/null
>>>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>>>
>>>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ 
>>>> as the IOREQ is now meant to be agnostic?
>>> Good question... The _common_ IOREQ code is indeed arch-agnostic. 
>>> But, can the _arch_ IOREQ code be treated as really subarch-agnostic?
>>> I think, on Arm it can and it is most likely ok to keep it in 
>>> "asm-arm/", but how it would be correlated with x86's IOREQ code 
>>> which is HVM specific and located
>>> in "hvm" subdir?
>>
>> Sorry, I don't understand your answer/questions. So let me ask the 
>> question differently, is asm-arm/hvm/ioreq.h going to be included from 
>> common code?
> 
> Sorry if I was unclear.
> 
> 
>>
>> If the answer is no, then I see no reason to follow the x86 here.
>> If the answer is yes, then I am quite confused why half of the series 
>> tried to remove "hvm" from the function name but we still include 
>> "asm/hvm/ioreq.h".
> 
> The answer is yes. Even if we could avoid including that header from the 
> common code somehow, we would still have #include <public/hvm/*>, 
> is_hvm_domain().

I saw Jan answered this one. Let me know if you need more input.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-20  0:50               ` Stefano Stabellini
@ 2021-01-20 15:57                 ` Julien Grall
  2021-01-20 19:47                   ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-20 15:57 UTC (permalink / raw)
  To: Stefano Stabellini, Oleksandr
  Cc: xen-devel, Julien Grall, Volodymyr Babchuk, Oleksandr Tyshchenko

Hi Stefano,

On 20/01/2021 00:50, Stefano Stabellini wrote:
> On Tue, 19 Jan 2021, Oleksandr wrote:
>> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
>> index 40b9e59..0508bd8 100644
>> --- a/xen/arch/arm/ioreq.c
>> +++ b/xen/arch/arm/ioreq.c
>> @@ -101,12 +101,10 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>>
>>   bool arch_ioreq_complete_mmio(void)
>>   {
>> -    struct vcpu *v = current;
>>       struct cpu_user_regs *regs = guest_cpu_user_regs();
>>       const union hsr hsr = { .bits = regs->hsr };
>> -    paddr_t addr = v->io.req.addr;
>>
>> -    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
>> +    if ( handle_ioserv(regs, current) == IO_HANDLED )
>>       {
>>           advance_pc(regs, hsr);
>>           return true;
> 
> Yes, but I think we want to keep the check
> 
>      vio->req.state == STATE_IORESP_READY
> 
> So maybe (uncompiled, untested):
> 
>      if ( v->io.req.state != STATE_IORESP_READY )
>          return false;

Is it possible to reach this function with v->io.req.state != 
STATE_IORESP_READY? If not, then I would suggest adding an 
ASSERT_UNREACHABLE() before the return.
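
Putting the two suggestions together, the completion hook might end up
looking like this (uncompiled sketch built from the snippets above):

    bool arch_ioreq_complete_mmio(void)
    {
        struct vcpu *v = current;
        struct cpu_user_regs *regs = guest_cpu_user_regs();
        const union hsr hsr = { .bits = regs->hsr };

        /* Completion is only expected once the emulator has responded. */
        if ( v->io.req.state != STATE_IORESP_READY )
        {
            ASSERT_UNREACHABLE();
            return false;
        }

        if ( handle_ioserv(regs, v) == IO_HANDLED )
        {
            advance_pc(regs, hsr);
            return true;
        }

        return false;
    }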

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
  2021-01-12 21:52 ` [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common Oleksandr Tyshchenko
  2021-01-15 15:34   ` Julien Grall
  2021-01-20  8:57   ` Alex Bennée
@ 2021-01-20 16:15   ` Jan Beulich
  2021-01-20 20:47     ` Oleksandr
  2 siblings, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-20 16:15 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall,
	xen-devel

On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and these helpers will be used
> on Arm as is. Move them to xen/ioreq.h and replace "hvm" prefixes
> with "ioreq".
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Paul Durrant <paul@xen.org>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>    - new patch
> 
> Changes V1 -> V2:
>    - replace "hvm" prefix by "ioreq"
> 
> Changes V2 -> V3:
>    - add Paul's R-b
> 
> Changes V32 -> V4:
>    - add Jan's A-b

Did you?

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-12 21:52 ` [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common Oleksandr Tyshchenko
  2021-01-18  9:17   ` Paul Durrant
@ 2021-01-20 16:21   ` Jan Beulich
  2021-01-21 10:23     ` Oleksandr
  1 sibling, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-20 16:21 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Julien Grall, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Paul Durrant, Daniel De Graaf,
	Oleksandr Tyshchenko, xen-devel

On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> As a lot of x86 code can be re-used on Arm later on, this patch
> moves the IOREQ related dm-op handling to the common code.
> 
> The idea is to have the top level dm-op handling arch-specific
> and call into ioreq_server_dm_op() for otherwise unhandled ops.
> Pros:
> - More natural than doing it other way around (top level dm-op
> handling common).
> - Leave compat_dm_op() in x86 code.
> Cons:
> - Code duplication. Both arches have to duplicate do_dm_op(), etc.
> 
> Also update XSM code a bit to let dm-op be used on Arm.
> 
> This support is going to be used on Arm to be able run device
> emulator outside of Xen hypervisor.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Assuming the moved code is indeed just being moved (which is
quite hard to ascertain by just looking at the diff),
applicable parts
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
  2021-01-12 21:52 ` [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu Oleksandr Tyshchenko
  2021-01-15 19:34   ` Julien Grall
  2021-01-18  9:35   ` Paul Durrant
@ 2021-01-20 16:24   ` Jan Beulich
  2 siblings, 0 replies; 144+ messages in thread
From: Jan Beulich @ 2021-01-20 16:24 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Jun Nakajima, Kevin Tian, Julien Grall,
	xen-devel

On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IOREQ is a common feature now and these fields will be used
> on Arm as is. Move them to common struct vcpu as a part of new
> struct vcpu_io and drop duplicating "io" prefixes. Also move
> enum hvm_io_completion to xen/sched.h and remove "hvm" prefixes.
> 
> This patch completely removes layering violation in the common code.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

Applicable parts
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm
  2021-01-17 22:22     ` Oleksandr
@ 2021-01-20 16:40       ` Julien Grall
  2021-01-20 20:35         ` Stefano Stabellini
  2021-02-09 21:04         ` Oleksandr
  0 siblings, 2 replies; 144+ messages in thread
From: Julien Grall @ 2021-01-20 16:40 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Ian Jackson, Wei Liu, Anthony PERARD,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko

Hi Oleksandr,

On 17/01/2021 22:22, Oleksandr wrote:
> On 15.01.21 23:30, Julien Grall wrote:
>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>> From: Julien Grall <julien.grall@arm.com>
>> So I am not quite sure how this new parameter can be used. Could 
>> you expand on it?
> The original idea was to set it if we are going to assign virtio 
> device(s) to the guest.
> To be honest, I plan to remove this extra parameter. It might not 
> be obvious looking at the current patch, but the next patch will show that 
> we can avoid introducing it at all.

Right, so I think we want to avoid introducing the parameter. I have 
suggested in patch #24 a different way to split the code introduced by #23 
and #24.

[...]

>>
>>> +#define GUEST_VIRTIO_MMIO_SIZE xen_mk_ullong(0x200)
>>
>> AFAICT, the size of the virtio mmio region should be 0x100. So why is 
>> it 0x200?
> 
> 
> I didn't find the total size requirement for the mmio region in the virtio
> specification v1.1 (the size of the control registers is indeed 0x100 and
> the device-specific configuration registers start at offset 0x100,
> however their size depends on the device and the driver).
> 
> kvmtool uses 0x200 [1], in some Linux device-trees we can see 0x200 [2] 
> (however, device-tree bindings example has 0x100 [3]), so what would be 
> the proper value for Xen code?

Hmm... I missed that fact. I would say we want to use the biggest size 
possible so we can cover most of the devices.

Although, as you pointed out, this may not cover all the devices. So 
maybe we want to allow the user to configure the size via xl.cfg for the 
ones not conforming with 0x200.

This could be implemented in the future. Stefano/Ian, what do you think?

>> Most likely, you will want to reserve a range
> 
> It seems so, yes, good point. BTW, the range is needed for the mmio region
> as well, correct?

I would reserve 1MB (just for the sake of avoiding a region size in KB).

For the SPIs, I would consider reserving 10-20 interrupts. Do you think 
this will cover your use cases?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-01-18  8:32     ` Oleksandr
@ 2021-01-20 17:05       ` Julien Grall
  2021-02-10  9:02         ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2021-01-20 17:05 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Oleksandr Tyshchenko, Ian Jackson, Wei Liu,
	Anthony PERARD, Stefano Stabellini

Hi Oleksandr,

On 18/01/2021 08:32, Oleksandr wrote:
> 
> On 16.01.21 00:01, Julien Grall wrote:
>> Hi Oleksandr,
> 
> Hi Julien
> 
> 
>>
>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> This patch adds basic support for configuring and assisting a virtio-disk
>>> backend (emulator) which is intended to run outside of QEMU and could
>>> be run in any domain.
>>>
>>> Xenstore was chosen as a communication interface for the emulator
>>> running in a non-toolstack domain to be able to get its configuration
>>> either by reading Xenstore directly or by receiving command line
>>> parameters (an updated 'xl devd' running in the same domain would read
>>> Xenstore beforehand and call the backend executable with the required
>>> arguments).
>>>
>>> An example of domain configuration (two disks are assigned to the guest,
>>> the latter is in readonly mode):
>>>
>>> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]
>>>
>>> Where per-disk Xenstore entries are:
>>> - filename and readonly flag (configured via "vdisk" property)
>>> - base and irq (allocated dynamically)
>>>
>>> Besides handling the 'visible' params described in the configuration
>>> file, the patch also allocates virtio-mmio specific ones for each device
>>> and writes them into Xenstore. The virtio-mmio params (irq and base) are
>>> unique per guest domain; they are allocated at domain creation time
>>> and passed through to the emulator. Each VirtIO device has at least
>>> one pair of these params.
>>>
>>> TODO:
>>> 1. An extra "virtio" property could be removed.
>>> 2. Update documentation.
>>>
>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>> [On Arm only]
>>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>>
>>> ---
>>> Changes RFC -> V1:
>>>     - no changes
>>>
>>> Changes V1 -> V2:
>>>     - rebase according to the new location of libxl_virtio_disk.c
>>>
>>> Changes V2 -> V3:
>>>     - no changes
>>>
>>> Changes V3 -> V4:
>>>     - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT
>>>
>>> Please note, there is a real concern about VirtIO interrupt allocation.
>>> [Just copy here what Stefano said in RFC thread]
>>>
>>> So, if we end up allocating let's say 6 virtio interrupts for a domain,
>>> the chance of a clash with a physical interrupt of a passthrough 
>>> device is real.
>>
>> For the first version, I think a static approach is fine because it 
>> doesn't bind us to anything yet (there is no interface change). We can 
>> refine it on follow-ups as we figure out how virtio is going to be 
>> used in the field.
>>
>>>
>>> I am not entirely sure how to solve it, but these are a few ideas:
>>> - choosing virtio interrupts that are less likely to conflict (maybe 
>>> > 1000)
>>
>> Well, we only support 988 interrupts :). However, we will waste some
>> memory in the vGIC structure (we would need to allocate memory for all
>> 988 interrupts) if you choose an interrupt towards the end.
>>
>>> - make the virtio irq (optionally) configurable so that a user could
>>>    override the default irq and specify one that doesn't conflict
>>
>> This is not ideal because it makes the use of virtio quite 
>> unfriendly with passthrough. Note that platform device passthrough is 
>> already unfriendly, but I am thinking PCI :).
>>
>>> - implementing support for virq != pirq (even the xl interface doesn't
>>>    allow to specify the virq number for passthrough devices, see "irqs")
>> I can't remember whether I had a reason to not support virq != pirq 
>> when this was initially implemented. This is one possibility, but it 
>> is as unfriendly as the previous option.
>>
>> I will add a 4th one:
>>    - Automatically allocate the virtio IRQ. This should be possible to
>> do without too much trouble as we know in advance which IRQs will
>> be used for passthrough.
> As I understand it, the IRQs for passthrough are described in the "irqs"
> property and stored in d_config->b_info.irqs[i], so yes, we know in advance
> which IRQs will be used for passthrough and we will be able to choose
> non-clashing ones (iterating over all IRQs in a reserved range) for the
> virtio devices. The question is how many IRQs should be reserved.

If we are automatically selecting the interrupts for virtio devices, then
I don't think we need to reserve a batch. Instead, we can allocate them one
by one as we create each virtio device in libxl.

For the static case, a range of 10-20 might be sufficient for now.
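
To illustrate the dynamic approach, a minimal sketch (uncompiled and
untested; the helper name is invented here, and GUEST_VIRTIO_SPI_FIRST/LAST
are only the names suggested in the patch #23 discussion, not existing code):

/* The caller starts with *next_spi = GUEST_VIRTIO_SPI_FIRST. */
static int pick_virtio_spi(const libxl_domain_build_info *b_info,
                           uint32_t *next_spi)
{
    /* Walk the reserved range, skipping IRQs already used for passthrough. */
    for ( ; *next_spi <= GUEST_VIRTIO_SPI_LAST; (*next_spi)++ )
    {
        bool clash = false;
        int i;

        for ( i = 0; i < b_info->num_irqs; i++ )
            if ( b_info->irqs[i] == *next_spi )
                clash = true;

        if ( !clash )
            return (*next_spi)++;
    }

    return -1; /* reserved range exhausted */
}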

[...]

>>> -        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
>>> +        uint64_t virtio_base;
>>> +        libxl_device_virtio_disk *virtio_disk;
>>> +
>>> +        virtio_base = GUEST_VIRTIO_MMIO_BASE;
>>>           virtio_irq = GUEST_VIRTIO_MMIO_SPI;
>>
>> Looking at patch #23, you defined a single SPI and a region that can 
>> only fit virtio device. However, here, you are going to define 
>> multiple virtio devices.
>>
>> I think you want to define the following:
>>
>>  - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window
>>  - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may 
>> contain multiple devices)
>>  - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio
>>  - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio
>>
>> The per-device size doesn't need to be defined in arch-arm.h. Instead, 
>> I would only define it internally (unless we can use a virtio.h header 
>> from Linux?).
> 
> I think I got the idea. What are the preferences for these values?

I have suggested some values in patch #23. Let me know what you think there.
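
For illustration, the full set could be shaped like the below (the values
are made up here, not necessarily the ones proposed in patch #23):

#define GUEST_VIRTIO_MMIO_BASE   xen_mk_ullong(0x02000000)
#define GUEST_VIRTIO_MMIO_SIZE   xen_mk_ullong(0x00100000) /* 1MB window */
#define GUEST_VIRTIO_SPI_FIRST   33
#define GUEST_VIRTIO_SPI_LAST    43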

[...]

>>> +
>>> +        nr_spis += (virtio_irq - 32) + 1;
>>>           virtio_enabled = true;
>>>       }
>>
>> [...]
>>
>>> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
>>> index 2a3364b..054a0c9 100644
>>> --- a/tools/xl/xl_parse.c
>>> +++ b/tools/xl/xl_parse.c
>>> @@ -1204,6 +1204,120 @@ out:
>>>       if (rc) exit(EXIT_FAILURE);
>>>   }
>>>   +#define MAX_VIRTIO_DISKS 4
>>
>> May I ask why this is hardcoded to 4?
> 
> I found 4 as a reasonable value for the initial implementation.
> This is how many disks a single device instance can handle.

Right, the question is why do you need to impose a limit in xl?

Looking at the code, the value is only used in:

+        if (virtio_disk->num_disks > MAX_VIRTIO_DISKS) {
+            fprintf(stderr, "vdisk: currently only %d disks are supported",
+                    MAX_VIRTIO_DISKS);

The rest of the code (at least in libxl/xl) seems to be completely 
agnostic to the number of disks. So it feels strange to me to impose 
what looks like an arbitrary limit in the tools.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-20 15:57                 ` Julien Grall
@ 2021-01-20 19:47                   ` Stefano Stabellini
  2021-01-21  9:31                     ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-20 19:47 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Oleksandr, xen-devel, Julien Grall,
	Volodymyr Babchuk, Oleksandr Tyshchenko

On Wed, 20 Jan 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 20/01/2021 00:50, Stefano Stabellini wrote:
> > On Tue, 19 Jan 2021, Oleksandr wrote:
> > > diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> > > index 40b9e59..0508bd8 100644
> > > --- a/xen/arch/arm/ioreq.c
> > > +++ b/xen/arch/arm/ioreq.c
> > > @@ -101,12 +101,10 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs
> > > *regs,
> > > 
> > >   bool arch_ioreq_complete_mmio(void)
> > >   {
> > > -    struct vcpu *v = current;
> > >       struct cpu_user_regs *regs = guest_cpu_user_regs();
> > >       const union hsr hsr = { .bits = regs->hsr };
> > > -    paddr_t addr = v->io.req.addr;
> > > 
> > > -    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
> > > +    if ( handle_ioserv(regs, current) == IO_HANDLED )
> > >       {
> > >           advance_pc(regs, hsr);
> > >           return true;
> > 
> > Yes, but I think we want to keep the check
> > 
> >      vio->req.state == STATE_IORESP_READY
> > 
> > So maybe (uncompiled, untested):
> > 
> >      if ( v->io.req.state != STATE_IORESP_READY )
> >          return false;
> 
> Is it possible to reach this function with v->io.req.state !=
> STATE_IORESP_READY? If not, then I would suggest adding an
> ASSERT_UNREACHABLE() before the return.

If I am reading the state machine right, it should *not* be possible to
get here with v->io.req.state != STATE_IORESP_READY, so yes,
ASSERT_UNREACHABLE() would work.

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm
  2021-01-20 16:40       ` Julien Grall
@ 2021-01-20 20:35         ` Stefano Stabellini
  2021-02-09 21:04         ` Oleksandr
  1 sibling, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-20 20:35 UTC (permalink / raw)
  To: Julien Grall
  Cc: Oleksandr, xen-devel, Julien Grall, Ian Jackson, Wei Liu,
	Anthony PERARD, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko

On Wed, 20 Jan 2021, Julien Grall wrote:
> > > > +#define GUEST_VIRTIO_MMIO_SIZE xen_mk_ullong(0x200)
> > > 
> > > AFAICT, the size of the virtio mmio region should be 0x100. So why is it
> > > 0x200?
> > 
> > 
> > I didn't find the total size requirement for the mmio region in the virtio
> > specification v1.1 (the size of the control registers is indeed 0x100 and
> > the device-specific configuration registers start at offset 0x100, however
> > their size depends on the device and the driver).
> > 
> > kvmtool uses 0x200 [1], in some Linux device-trees we can see 0x200 [2]
> > (however, device-tree bindings example has 0x100 [3]), so what would be the
> > proper value for Xen code?
> 
> Hmm... I missed that fact. I would say we want to use the biggest size
> possible so we can cover most of the devices.
> 
> Although, as you pointed out, this may not cover all the devices. So maybe we
> want to allow the user to configure the size via xl.cfg for the ones not
> conforming with 0x200.
> 
> This could be implemented in the future. Stefano/Ian, what do you think?

I agree it could be implemented in the future. For now, I would pick
0x200.
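
(If such a knob is added later, it could be a per-device key along these
lines; the "mmio_size" name is purely hypothetical and does not exist
today:)

    vdisk = [ 'backend=DomD, mmio_size=0x400, disks=rw:/dev/mmcblk0p3' ]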


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
  2021-01-20 16:15   ` Jan Beulich
@ 2021-01-20 20:47     ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-20 20:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Oleksandr Tyshchenko, Paul Durrant, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, Julien Grall, Stefano Stabellini, Julien Grall,
	xen-devel


On 20.01.21 18:15, Jan Beulich wrote:

Hi Jan

> On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> IOREQ is a common feature now and these helpers will be used
>> on Arm as is. Move them to xen/ioreq.h and replace "hvm" prefixes
>> with "ioreq".
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> Reviewed-by: Paul Durrant <paul@xen.org>
>> CC: Julien Grall <julien.grall@arm.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> Changes RFC -> V1:
>>     - new patch
>>
>> Changes V1 -> V2:
>>     - replace "hvm" prefix by "ioreq"
>>
>> Changes V2 -> V3:
>>     - add Paul's R-b
>>
>> Changes V3 -> V4:
>>     - add Jan's A-b
> Did you?

Oops, I didn't. I will add.


>
> Jan

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-20 15:50           ` Julien Grall
@ 2021-01-21  8:50             ` Oleksandr
  2021-01-27 10:24               ` Jan Beulich
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-21  8:50 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko


On 20.01.21 17:50, Julien Grall wrote:
> Hi Oleksandr,


Hi Julien



>
> On 17/01/2021 18:52, Oleksandr wrote:
>>>>>>   diff --git a/xen/include/asm-arm/domain.h 
>>>>>> b/xen/include/asm-arm/domain.h
>>>>>> index 6819a3b..c235e5b 100644
>>>>>> --- a/xen/include/asm-arm/domain.h
>>>>>> +++ b/xen/include/asm-arm/domain.h
>>>>>> @@ -10,6 +10,7 @@
>>>>>>   #include <asm/gic.h>
>>>>>>   #include <asm/vgic.h>
>>>>>>   #include <asm/vpl011.h>
>>>>>> +#include <public/hvm/dm_op.h>
>>>>>
>>>>> May I ask, why do you need to include dm_op.h here?
>>>> I needed to include that header to make some bits visible
>>>> (XEN_DMOP_IO_RANGE_PCI, struct xen_dm_op_buf, etc). Why here is a
>>>> really good question.
>>>> I don't remember exactly, probably I followed x86's domain.h which 
>>>> also included it.
>>>> So, trying to remove the inclusion here, I get several build 
>>>> failures on Arm which could be fixed if I include that header from 
>>>> dm.h and ioreq.h:
>>>>
>>>> Shall I do it this way?
>>>
>>> If the failures are indeed because ioreq.h and dm.h use definitions
>>> from public/hvm/dm_op.h, then yes. Can you post the errors?
>> Please see attached, although I built for Arm32 (and the whole 
>> series), I think errors are valid for Arm64 also.
>
> Thanks!
>
>> error1.txt - when remove #include <public/hvm/dm_op.h> from 
>> asm-arm/domain.h
>
> For this one, I agree that including <public/hvm/dm_op.h> in xen.h 
> looks the best solution.

Yes, I assume you meant in "ioreq.h"

>
>
>> error2.txt - when add #include <public/hvm/dm_op.h> to xen/ioreq.h
>
> It looks like the error is happening in dm.c rather than xen/dm.h. Any 
> reason to not include <public/hvm/dm_op.h> in dm.c directly?
Including it directly doesn't solve the build issue.
If I am not mistaken, in order to follow the requirements on how to include
headers (alphabetic order, public* should be included after xen* and
asm* ones, etc),
dm.h gets included first in dm.c, and dm_op.h gets included last. We could
avoid the build issue by changing the inclusion order a bit,
i.e. by including dm.h after hypercall.h at least (because
hypercall.h already includes dm_op.h). But this breaks the requirements
and is not the way to go.
Now I am in doubt about how to overcome this.
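
To make the ordering problem concrete, the include block in dm.c looks
roughly like this (the annotations are mine; this assumes nothing before
hypercall.h pulls in the public header):

#include <xen/dm.h>           /* first alphabetically; needs xen_dm_op_buf etc. */
#include <xen/guest_access.h>
#include <xen/hypercall.h>    /* pulls in public/hvm/dm_op.h ... */
#include <xen/ioreq.h>
#include <xen/nospec.h>       /* ... but dm.h has already been parsed by then */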


>
>
>
>> error3.txt - when add #include <public/hvm/dm_op.h> to xen/dm.h
>
> I am a bit confused with this one. Isn't it the same as error1.txt?

The same, please ignore them, sorry for the confusion.


>
>
>>
>>
>>>
>>>
>>> [...]
>>>
>>>>>>   #include <public/hvm/params.h>
>>>>>>     struct hvm_domain
>>>>>> @@ -262,6 +263,8 @@ static inline void arch_vcpu_block(struct 
>>>>>> vcpu *v) {}
>>>>>>     #define arch_vm_assist_valid_mask(d) (1UL << 
>>>>>> VMASST_TYPE_runstate_update_flag)
>>>>>>   +#define has_vpci(d)    ({ (void)(d); false; })
>>>>>> +
>>>>>>   #endif /* __ASM_DOMAIN_H__ */
>>>>>>     /*
>>>>>> diff --git a/xen/include/asm-arm/hvm/ioreq.h 
>>>>>> b/xen/include/asm-arm/hvm/ioreq.h
>>>>>> new file mode 100644
>>>>>> index 0000000..19e1247
>>>>>> --- /dev/null
>>>>>> +++ b/xen/include/asm-arm/hvm/ioreq.h
>>>>>
>>>>> Shouldn't this directly be under asm-arm/ rather than asm-arm/hvm/ 
>>>>> as the IOREQ is now meant to be agnostic?
>>>> Good question... The _common_ IOREQ code is indeed arch-agnostic.
>>>> But can the _arch_ IOREQ code be treated as really subarch-agnostic?
>>>> I think on Arm it can, and it is most likely ok to keep it in
>>>> "asm-arm/", but how would it correlate with x86's IOREQ code, which
>>>> is HVM-specific and located in the "hvm" subdir?
>>>
>>> Sorry, I don't understand your answer/questions. So let me ask the 
>>> question differently, is asm-arm/hvm/ioreq.h going to be included 
>>> from common code?
>>
>> Sorry if I was unclear.
>>
>>
>>>
>>> If the answer is no, then I see no reason to follow the x86 here.
>>> If the answer is yes, then I am quite confused why half of the 
>>> series tried to remove "hvm" from the function name but we still 
>>> include "asm/hvm/ioreq.h".
>>
>> The answer is yes. Even if we could avoid including that header from
>> the common code somehow, we would still have #include <public/hvm/*>,
>> is_hvm_domain().
>
> I saw Jan answered about this one. Let me know if you need more input.

Thank you, I think, no. Everything is clear at the moment.


>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-20 19:47                   ` Stefano Stabellini
@ 2021-01-21  9:31                     ` Oleksandr
  2021-01-21 21:34                       ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-21  9:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: xen-devel, Julien Grall, Volodymyr Babchuk, Oleksandr Tyshchenko


On 20.01.21 21:47, Stefano Stabellini wrote:

Hi Julien, Stefano


> On Wed, 20 Jan 2021, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 20/01/2021 00:50, Stefano Stabellini wrote:
>>> On Tue, 19 Jan 2021, Oleksandr wrote:
>>>> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
>>>> index 40b9e59..0508bd8 100644
>>>> --- a/xen/arch/arm/ioreq.c
>>>> +++ b/xen/arch/arm/ioreq.c
>>>> @@ -101,12 +101,10 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs
>>>> *regs,
>>>>
>>>>    bool arch_ioreq_complete_mmio(void)
>>>>    {
>>>> -    struct vcpu *v = current;
>>>>        struct cpu_user_regs *regs = guest_cpu_user_regs();
>>>>        const union hsr hsr = { .bits = regs->hsr };
>>>> -    paddr_t addr = v->io.req.addr;
>>>>
>>>> -    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
>>>> +    if ( handle_ioserv(regs, current) == IO_HANDLED )
>>>>        {
>>>>            advance_pc(regs, hsr);
>>>>            return true;
>>> Yes, but I think we want to keep the check
>>>
>>>       vio->req.state == STATE_IORESP_READY
>>>
>>> So maybe (uncompiled, untested):
>>>
>>>       if ( v->io.req.state != STATE_IORESP_READY )
>>>           return false;
>> Is it possible to reach this function with v->io.req.state !=
>> STATE_IORESP_READY? If not, then I would suggest adding an
>> ASSERT_UNREACHABLE() before the return.
> If I am reading the state machine right it should *not* be possible to
> get here with v->io.req.state != STATE_IORESP_READY, so yes,
> ASSERT_UNREACHABLE() would work.
Agree here. If the assumption is not correct (unlikely), I think I will 
catch this during testing.
In addition, we can probably drop the STATE_IORESP_READY case in
try_fwd_ioserv().


[not tested]


diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
index 40b9e59..c7ee1a7 100644
--- a/xen/arch/arm/ioreq.c
+++ b/xen/arch/arm/ioreq.c
@@ -71,9 +71,6 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
      case STATE_IOREQ_NONE:
          break;

-    case STATE_IORESP_READY:
-        return IO_HANDLED;
-
      default:
          gdprintk(XENLOG_ERR, "wrong state %u\n", vio->req.state);
          return IO_ABORT;
@@ -104,9 +101,14 @@ bool arch_ioreq_complete_mmio(void)
      struct vcpu *v = current;
      struct cpu_user_regs *regs = guest_cpu_user_regs();
      const union hsr hsr = { .bits = regs->hsr };
-    paddr_t addr = v->io.req.addr;

-    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
+    if ( v->io.req.state != STATE_IORESP_READY )
+    {
+        ASSERT_UNREACHABLE();
+        return false;
+    }
+
+    if ( handle_ioserv(regs, v) == IO_HANDLED )
      {
          advance_pc(regs, hsr);
          return true;


-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply related	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-20  0:23       ` Stefano Stabellini
@ 2021-01-21  9:51         ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-21  9:51 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Julien Grall, Julien Grall, Volodymyr Babchuk,
	Oleksandr Tyshchenko


On 20.01.21 02:23, Stefano Stabellini wrote:

Hi Stefano


> On Sun, 17 Jan 2021, Oleksandr wrote:
>> On 15.01.21 02:55, Stefano Stabellini wrote:
>>> On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
>>>> From: Julien Grall <julien.grall@arm.com>
>>>>
>>>> This patch adds basic IOREQ/DM support on Arm. The subsequent
>>>> patches will improve functionality and add remaining bits.
>>>>
>>>> The IOREQ/DM features are supposed to be built with IOREQ_SERVER
>>>> option enabled, which is disabled by default on Arm for now.
>>>>
>>>> Please note, the "PIO handling" TODO is expected to be left unaddressed
>>>> for the current series. It is not a big issue for now while Xen
>>>> doesn't have support for vPCI on Arm. On Arm64 PIOs are only used
>>>> for PCI IO BARs and we would probably want to expose them to the emulator
>>>> as PIO accesses to make a DM completely arch-agnostic. So "PIO handling"
>>>> should be implemented when we add support for vPCI.
>>>>
>>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>> [On Arm only]
>>>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>>>
>>>> ---
>>>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>>>> "Add support for Guest IO forwarding to a device emulator"
>>>>
>>>> Changes RFC -> V1:
>>>>      - was split into:
>>>>        - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>>>>        - xen/mm: Handle properly reference in set_foreign_p2m_entry() on
>>>> Arm
>>>>      - update patch description
>>>>      - update asm-arm/hvm/ioreq.h according to the newly introduced arch
>>>> functions:
>>>>        - arch_hvm_destroy_ioreq_server()
>>>>        - arch_handle_hvm_io_completion()
>>>>      - update arch files to include xen/ioreq.h
>>>>      - remove HVMOP plumbing
>>>>      - rewrite a logic to handle properly case when hvm_send_ioreq()
>>>> returns IO_RETRY
>>>>      - add a logic to handle properly handle_hvm_io_completion() return
>>>> value
>>>>      - rename handle_mmio() to ioreq_handle_complete_mmio()
>>>>      - move paging_mark_pfn_dirty() to asm-arm/paging.h
>>>>      - remove forward declaration for hvm_ioreq_server in asm-arm/paging.h
>>>>      - move try_fwd_ioserv() to ioreq.c, provide stubs if
>>>> !CONFIG_IOREQ_SERVER
>>>>      - do not remove #ifdef CONFIG_IOREQ_SERVER in memory.c for guarding
>>>> xen/ioreq.h
>>>>      - use gdprintk in try_fwd_ioserv(), remove unneeded prints
>>>>      - update list of #include-s
>>>>      - move has_vpci() to asm-arm/domain.h
>>>>      - add a comment (TODO) to unimplemented yet handle_pio()
>>>>      - remove hvm_mmio_first(last)_byte() and hvm_ioreq_(page/vcpu/server)
>>>> structs
>>>>        from the arch files, they were already moved to the common code
>>>>      - remove set_foreign_p2m_entry() changes, they will be properly
>>>> implemented
>>>>        in the follow-up patch
>>>>      - select IOREQ_SERVER for Arm instead of Arm64 in Kconfig
>>>>      - remove x86's realmode and other unneeded stubs from xen/ioreq.h
>>>>      - clarify ioreq_t p.df usage in try_fwd_ioserv()
>>>>      - set ioreq_t p.count to 1 in try_fwd_ioserv()
>>>>
>>>> Changes V1 -> V2:
>>>>      - was split into:
>>>>        - arm/ioreq: Introduce arch specific bits for IOREQ/DM features
>>>>        - xen/arm: Stick around in leave_hypervisor_to_guest until I/O has
>>>> completed
>>>>      - update the author of a patch
>>>>      - update patch description
>>>>      - move a loop in leave_hypervisor_to_guest() to a separate patch
>>>>      - set IOREQ_SERVER disabled by default
>>>>      - remove already clarified /* XXX */
>>>>      - replace BUG() by ASSERT_UNREACHABLE() in handle_pio()
>>>>      - remove default case for handling the return value of
>>>> try_handle_mmio()
>>>>      - remove struct hvm_domain, enum hvm_io_completion, struct
>>>> hvm_vcpu_io,
>>>>        struct hvm_vcpu from asm-arm/domain.h, these are common materials
>>>> now
>>>>      - update everything according to the recent changes (IOREQ related
>>>> function
>>>>        names don't contain "hvm" prefixes/infixes anymore, IOREQ related
>>>> fields
>>>>        are part of common struct vcpu/domain now, etc)
>>>>
>>>> Changes V2 -> V3:
>>>>      - update patch according the "legacy interface" is x86 specific
>>>>      - add dummy arch hooks
>>>>      - remove dummy paging_mark_pfn_dirty()
>>>>      - don’t include <xen/domain_page.h> in common ioreq.c
>>>>      - don’t include <public/hvm/ioreq.h> in arch ioreq.h
>>>>      - remove #define ioreq_params(d, i)
>>>>
>>>> Changes V3 -> V4:
>>>>      - rebase
>>>>      - update patch according to the renaming IO_ -> VIO_ (io_ -> vio_)
>>>>        and misc changes to arch hooks
>>>>      - update patch according to the IOREQ related dm-op handling changes
>>>>      - don't include <xen/ioreq.h> from arch header
>>>>      - make all arch hooks out-of-line
>>>>      - add a comment above IOREQ_STATUS_* #define-s
>>>> ---
>>>>    xen/arch/arm/Makefile           |   2 +
>>>>    xen/arch/arm/dm.c               | 122 +++++++++++++++++++++++
>>>>    xen/arch/arm/domain.c           |   9 ++
>>>>    xen/arch/arm/io.c               |  12 ++-
>>>>    xen/arch/arm/ioreq.c            | 213
>>>> ++++++++++++++++++++++++++++++++++++++++
>>>>    xen/arch/arm/traps.c            |  13 +++
>>>>    xen/include/asm-arm/domain.h    |   3 +
>>>>    xen/include/asm-arm/hvm/ioreq.h |  72 ++++++++++++++
>>>>    xen/include/asm-arm/mmio.h      |   1 +
>>>>    9 files changed, 446 insertions(+), 1 deletion(-)
>>>>    create mode 100644 xen/arch/arm/dm.c
>>>>    create mode 100644 xen/arch/arm/ioreq.c
>>>>    create mode 100644 xen/include/asm-arm/hvm/ioreq.h
>>>>
>>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>>> index 512ffdd..16e6523 100644
>>>> --- a/xen/arch/arm/Makefile
>>>> +++ b/xen/arch/arm/Makefile
>>>> @@ -13,6 +13,7 @@ obj-y += cpuerrata.o
>>>>    obj-y += cpufeature.o
>>>>    obj-y += decode.o
>>>>    obj-y += device.o
>>>> +obj-$(CONFIG_IOREQ_SERVER) += dm.o
>>>>    obj-y += domain.o
>>>>    obj-y += domain_build.init.o
>>>>    obj-y += domctl.o
>>>> @@ -27,6 +28,7 @@ obj-y += guest_atomics.o
>>>>    obj-y += guest_walk.o
>>>>    obj-y += hvm.o
>>>>    obj-y += io.o
>>>> +obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
>>>>    obj-y += irq.o
>>>>    obj-y += kernel.init.o
>>>>    obj-$(CONFIG_LIVEPATCH) += livepatch.o
>>>> diff --git a/xen/arch/arm/dm.c b/xen/arch/arm/dm.c
>>>> new file mode 100644
>>>> index 0000000..e6dedf4
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/dm.c
>>>> @@ -0,0 +1,122 @@
>>>> +/*
>>>> + * Copyright (c) 2019 Arm ltd.
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> it
>>>> + * under the terms and conditions of the GNU General Public License,
>>>> + * version 2, as published by the Free Software Foundation.
>>>> + *
>>>> + * This program is distributed in the hope it will be useful, but WITHOUT
>>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>> for
>>>> + * more details.
>>>> + *
>>>> + * You should have received a copy of the GNU General Public License
>>>> along with
>>>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>>>> + */
>>>> +
>>>> +#include <xen/dm.h>
>>>> +#include <xen/guest_access.h>
>>>> +#include <xen/hypercall.h>
>>>> +#include <xen/ioreq.h>
>>>> +#include <xen/nospec.h>
>>>> +
>>>> +static int dm_op(const struct dmop_args *op_args)
>>>> +{
>>>> +    struct domain *d;
>>>> +    struct xen_dm_op op;
>>>> +    bool const_op = true;
>>>> +    long rc;
>>>> +    size_t offset;
>>>> +
>>>> +    static const uint8_t op_size[] = {
>>>> +        [XEN_DMOP_create_ioreq_server]              = sizeof(struct
>>>> xen_dm_op_create_ioreq_server),
>>>> +        [XEN_DMOP_get_ioreq_server_info]            = sizeof(struct
>>>> xen_dm_op_get_ioreq_server_info),
>>>> +        [XEN_DMOP_map_io_range_to_ioreq_server]     = sizeof(struct
>>>> xen_dm_op_ioreq_server_range),
>>>> +        [XEN_DMOP_unmap_io_range_from_ioreq_server] = sizeof(struct
>>>> xen_dm_op_ioreq_server_range),
>>>> +        [XEN_DMOP_set_ioreq_server_state]           = sizeof(struct
>>>> xen_dm_op_set_ioreq_server_state),
>>>> +        [XEN_DMOP_destroy_ioreq_server]             = sizeof(struct
>>>> xen_dm_op_destroy_ioreq_server),
>>>> +    };
>>>> +
>>>> +    rc = rcu_lock_remote_domain_by_id(op_args->domid, &d);
>>>> +    if ( rc )
>>>> +        return rc;
>>>> +
>>>> +    rc = xsm_dm_op(XSM_DM_PRIV, d);
>>>> +    if ( rc )
>>>> +        goto out;
>>>> +
>>>> +    offset = offsetof(struct xen_dm_op, u);
>>>> +
>>>> +    rc = -EFAULT;
>>>> +    if ( op_args->buf[0].size < offset )
>>>> +        goto out;
>>>> +
>>>> +    if ( copy_from_guest_offset((void *)&op, op_args->buf[0].h, 0,
>>>> offset) )
>>>> +        goto out;
>>>> +
>>>> +    if ( op.op >= ARRAY_SIZE(op_size) )
>>>> +    {
>>>> +        rc = -EOPNOTSUPP;
>>>> +        goto out;
>>>> +    }
>>>> +
>>>> +    op.op = array_index_nospec(op.op, ARRAY_SIZE(op_size));
>>>> +
>>>> +    if ( op_args->buf[0].size < offset + op_size[op.op] )
>>>> +        goto out;
>>>> +
>>>> +    if ( copy_from_guest_offset((void *)&op.u, op_args->buf[0].h, offset,
>>>> +                                op_size[op.op]) )
>>>> +        goto out;
>>>> +
>>>> +    rc = -EINVAL;
>>>> +    if ( op.pad )
>>>> +        goto out;
>>>> +
>>>> +    rc = ioreq_server_dm_op(&op, d, &const_op);
>>>> +
>>>> +    if ( (!rc || rc == -ERESTART) &&
>>>> +         !const_op && copy_to_guest_offset(op_args->buf[0].h, offset,
>>>> +                                           (void *)&op.u, op_size[op.op])
>>>> )
>>>> +        rc = -EFAULT;
>>>> +
>>>> + out:
>>>> +    rcu_unlock_domain(d);
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +long do_dm_op(domid_t domid,
>>>> +              unsigned int nr_bufs,
>>>> +              XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
>>>> +{
>>>> +    struct dmop_args args;
>>>> +    int rc;
>>>> +
>>>> +    if ( nr_bufs > ARRAY_SIZE(args.buf) )
>>>> +        return -E2BIG;
>>>> +
>>>> +    args.domid = domid;
>>>> +    args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);
>>>> +
>>>> +    if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
>>>> +        return -EFAULT;
>>>> +
>>>> +    rc = dm_op(&args);
>>>> +
>>>> +    if ( rc == -ERESTART )
>>>> +        rc = hypercall_create_continuation(__HYPERVISOR_dm_op, "iih",
>>>> +                                           domid, nr_bufs, bufs);
>>>> +
>>>> +    return rc;
>>>> +}
>>> I might have missed something in the discussions, but this function is
>>> identical to xen/arch/x86/hvm/dm.c:do_dm_op, so why not make it common?
>>>
>>> Also the previous function dm_op is very similar to
>>> xen/arch/x86/hvm/dm.c:dm_op; I would prefer to make them common if
>>> possible. Was this already discussed?
>> Well, let me explain. Both dm_op() and do_dm_op() were indeed common (top
>> level dm-op handling common) in previous versions, so Arm's dm.c didn't
>> contain this stuff.
>> The idea to make it the other way around (top level dm-op handling
>> arch-specific, calling into ioreq_server_dm_op() for otherwise unhandled
>> ops) was discussed at [1]; besides its Pros, it leads to code duplication,
>> so Arm's dm.c has to duplicate some stuff, etc.
>> I was thinking about moving do_dm_op(), which is the _same_ for both
>> arches, to common code, but I am not sure whether it is conceptually
>> correct with that new "alternative" approach of handling dm-op.
> Yes, I think it makes sense to make do_dm_op common because it is
> identical. That should be easy.
Absolutely identical :) Agree, technically it is not hard. Well, let's
continue the discussion in [1]
(the approach which actually leads to the duplication) to see if all parties
are happy with that.


>
> I realize that the common part of dm_op is the initial boilerplate which
> is similar for every hypercall, so I think it is also OK if we don't
> share it and leave it as it is in this version of the series.
ok


[1] 
https://lore.kernel.org/xen-devel/1610488352-18494-10-git-send-email-olekstysh@gmail.com/

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-20 16:21   ` Jan Beulich
@ 2021-01-21 10:23     ` Oleksandr
  2021-01-21 10:27       ` Jan Beulich
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-21 10:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Julien Grall, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Paul Durrant, Daniel De Graaf,
	Oleksandr Tyshchenko, xen-devel


On 20.01.21 18:21, Jan Beulich wrote:

Hi Jan

> On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
>> From: Julien Grall <julien.grall@arm.com>
>>
>> As a lot of x86 code can be re-used on Arm later on, this patch
>> moves the IOREQ related dm-op handling to the common code.
>>
>> The idea is to have the top level dm-op handling arch-specific
>> and call into ioreq_server_dm_op() for otherwise unhandled ops.
>> Pros:
>> - More natural than doing it the other way around (top level dm-op
>> handling common).
>> - Leave compat_dm_op() in x86 code.
>> Cons:
>> - Code duplication. Both arches have to duplicate do_dm_op(), etc.
>>
>> Also update XSM code a bit to let dm-op be used on Arm.
>>
>> This support is going to be used on Arm to be able to run a device
>> emulator outside of the Xen hypervisor.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
> Assuming the moved code is indeed just being moved (which is
> quite hard to ascertain by just looking at the diff),

I have checked and will double-check again.


> applicable parts
> Acked-by: Jan Beulich <jbeulich@suse.com>

Thanks.

I would like to clarify regarding do_dm_op(), which is identical for both
arches and could *probably* be moved to the common code (we can bring
common dm.c back to put it there) and make dm_op() global.
Would you/Paul be happy with that change? Or are there some reasons
(which we are not aware of yet) for not doing it this way?

Initial discussion happened in [1] (which, let say, suffers from the 
duplication) and more precise in [2].


[1] 
https://lore.kernel.org/xen-devel/1610488352-18494-15-git-send-email-olekstysh@gmail.com/
[2] 
https://lore.kernel.org/xen-devel/alpine.DEB.2.21.2101191620050.14528@sstabellini-ThinkPad-T480s/

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-21 10:23     ` Oleksandr
@ 2021-01-21 10:27       ` Jan Beulich
  2021-01-21 11:13         ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-21 10:27 UTC (permalink / raw)
  To: Oleksandr
  Cc: Julien Grall, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Paul Durrant, Daniel De Graaf,
	Oleksandr Tyshchenko, xen-devel

On 21.01.2021 11:23, Oleksandr wrote:
> I would like to clarify regarding do_dm_op(), which is identical for both
> arches and could *probably* be moved to the common code (we can bring
> common dm.c back to put it there) and make dm_op() global.
> Would you/Paul be happy with that change? Or are there some reasons
> (which we are not aware of yet) for not doing it this way?

Probably reasonable to do; the only reason against it that I
could see is that dm_op() then has to become non-static.

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common
  2021-01-21 10:27       ` Jan Beulich
@ 2021-01-21 11:13         ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-21 11:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Julien Grall, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Paul Durrant, Daniel De Graaf,
	Oleksandr Tyshchenko, xen-devel


On 21.01.21 12:27, Jan Beulich wrote:

Hi Jan

> On 21.01.2021 11:23, Oleksandr wrote:
>> I would like to clarify regarding do_dm_op(), which is identical for both
>> arches and could *probably* be moved to the common code (we can bring
>> common dm.c back to put it there) and make dm_op() global.
>> Would you/Paul be happy with that change? Or are there some reasons
>> (which we are not aware of yet) for not doing it this way?
> Probably reasonable to do; the only reason not to that I
> could see is that then dm_op() has to become non-static.
Thank you for the clarification. So I will make this change for V5 if there
are no objections in the coming days.
I am going to keep your ack; please let me know if you think otherwise.
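
For the record, the change would look roughly like this (a sketch only,
assuming we reintroduce a common dm.c; the exact placement is my guess):

/* xen/include/xen/dm.h: dm_op() becomes non-static, one per arch. */
int dm_op(const struct dmop_args *op_args);

/* xen/common/dm.c: the do_dm_op() currently duplicated per arch. */
long do_dm_op(domid_t domid,
              unsigned int nr_bufs,
              XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs)
{
    struct dmop_args args;
    int rc;

    if ( nr_bufs > ARRAY_SIZE(args.buf) )
        return -E2BIG;

    args.domid = domid;
    args.nr_bufs = array_index_nospec(nr_bufs, ARRAY_SIZE(args.buf) + 1);

    if ( copy_from_guest_offset(&args.buf[0], bufs, 0, args.nr_bufs) )
        return -EFAULT;

    rc = dm_op(&args);

    if ( rc == -ERESTART )
        rc = hypercall_create_continuation(__HYPERVISOR_dm_op, "iih",
                                           domid, nr_bufs, bufs);

    return rc;
}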


-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
  2021-01-12 21:52 ` [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm Oleksandr Tyshchenko
  2021-01-15  1:19   ` Stefano Stabellini
  2021-01-15 20:59   ` Julien Grall
@ 2021-01-21 13:57   ` Jan Beulich
  2021-01-21 18:42     ` Oleksandr
  2 siblings, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-21 13:57 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Ian Jackson,
	Wei Liu, Roger Pau Monné,
	Julien Grall, xen-devel

On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch implements reference counting of foreign entries in
> set_foreign_p2m_entry() on Arm. This is a mandatory action if
> we want to run an emulator (IOREQ server) in a domain other than dom0,
> as we can't trust it to do the right thing if it is not running
> in dom0. So we need to grab a reference on the page to avoid it
> disappearing.
> 
> It is valid to always pass the "p2m_map_foreign_rw" type to
> guest_physmap_add_entry() since the current and foreign domains
> would always be different. A case when they are equal would be
> rejected by rcu_lock_remote_domain_by_id(). Besides the similar
> comment in the code, a respective ASSERT() is put in place to catch
> incorrect usage in the future.
> 
> It was tested with the IOREQ feature to confirm that all the pages given
> to this function belong to a domain, so we can use the same approach
> as for the XENMAPSPACE_gmfn_foreign handling in xenmem_add_to_physmap_one().
> 
> This involves adding an extra parameter for the foreign domain to
> set_foreign_p2m_entry() and a helper to indicate whether the arch
> supports reference counting of foreign entries, so that the restriction
> to the hardware domain in the common code can be skipped for it.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>
> [On Arm only]
> Tested-by: Wei Chen <Wei.Chen@arm.com>

In principle x86 parts
Reviewed-by: Jan Beulich <jbeulich@suse.com>
However, being a maintainer of ...

> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -382,6 +382,22 @@ struct p2m_domain {
>  #endif
>  #include <xen/p2m-common.h>
>  
> +static inline bool arch_acquire_resource_check(struct domain *d)
> +{
> +    /*
> +     * The reference counting of foreign entries in set_foreign_p2m_entry()
> +     * is not supported for translated domains on x86.
> +     *
> +     * FIXME: Until foreign pages inserted into the P2M are properly
> +     * reference counted, it is unsafe to allow mapping of
> +     * resource pages unless the caller is the hardware domain.
> +     */
> +    if ( paging_mode_translate(d) && !is_hardware_domain(d) )
> +        return false;
> +
> +    return true;
> +}


... this code, I'd like to ask that such constructs be avoided
and this be a single return statement:

    return !paging_mode_translate(d) || is_hardware_domain(d);

I also think you may want to consider dropping the initial
"The" from the comment. I'm further unconvinced "foreign
entries" needs saying when set_foreign_p2m_entry() deals with
exclusively such. In the end the original comment moved here
would probably suffice, no need for any more additions than
perhaps a simple "(see set_foreign_p2m_entry())".

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common
  2021-01-18 10:31   ` Paul Durrant
@ 2021-01-21 14:02     ` Jan Beulich
  0 siblings, 0 replies; 144+ messages in thread
From: Jan Beulich @ 2021-01-21 14:02 UTC (permalink / raw)
  To: 'Oleksandr Tyshchenko'
  Cc: 'Oleksandr Tyshchenko', 'Andrew Cooper',
	'Roger Pau Monné', 'Wei Liu',
	'George Dunlap', 'Ian Jackson',
	'Julien Grall', 'Stefano Stabellini',
	'Julien Grall',
	paul, xen-devel

On 18.01.2021 11:31, Paul Durrant wrote:
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Oleksandr Tyshchenko
>> Sent: 12 January 2021 21:52
>> To: xen-devel@lists.xenproject.org
>> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>; Jan Beulich <jbeulich@suse.com>; Andrew
>> Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>;
>> George Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Julien Grall
>> <julien@xen.org>; Stefano Stabellini <sstabellini@kernel.org>; Paul Durrant <paul@xen.org>; Julien
>> Grall <julien.grall@arm.com>
>> Subject: [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common
>>
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> As IOREQ is a common feature now and we also need to
>> invalidate the qemu/demu mapcache on Arm when the required condition
>> occurs, this patch moves this function to the common code
>> (and renames it to ioreq_signal_mapcache_invalidate).
>> This patch also moves the per-domain qemu_mapcache_invalidate
>> variable out of the arch sub-struct (and drops the "qemu" prefix).
>>
>> We don't put this variable inside the #ifdef CONFIG_IOREQ_SERVER
>> at the end of struct domain, but in the hole next to the group
>> of 5 bools further up which is more efficient.
>>
>> The subsequent patch will add mapcache invalidation handling on Arm.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Julien Grall <julien.grall@arm.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
> 
> Reviewed-by: Paul Durrant <paul@xen.org>

Applicable parts
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
  2021-01-21 13:57   ` Jan Beulich
@ 2021-01-21 18:42     ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-21 18:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Oleksandr Tyshchenko, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Ian Jackson,
	Wei Liu, Roger Pau Monné,
	Julien Grall, xen-devel


On 21.01.21 15:57, Jan Beulich wrote:

Hi Jan


> On 12.01.2021 22:52, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> This patch implements reference counting of foreign entries in
>> set_foreign_p2m_entry() on Arm. This is a mandatory action if
>> we want to run an emulator (IOREQ server) in a domain other than dom0,
>> as we can't trust it to do the right thing if it is not running
>> in dom0. So we need to grab a reference on the page to avoid it
>> disappearing.
>>
>> It is valid to always pass the "p2m_map_foreign_rw" type to
>> guest_physmap_add_entry() since the current and foreign domains
>> would always be different. A case when they are equal would be
>> rejected by rcu_lock_remote_domain_by_id(). Besides the similar
>> comment in the code, a respective ASSERT() is put in place to catch
>> incorrect usage in the future.
>>
>> It was tested with the IOREQ feature to confirm that all the pages given
>> to this function belong to a domain, so we can use the same approach
>> as for the XENMAPSPACE_gmfn_foreign handling in xenmem_add_to_physmap_one().
>>
>> This involves adding an extra parameter for the foreign domain to
>> set_foreign_p2m_entry() and a helper to indicate whether the arch
>> supports reference counting of foreign entries, so that the restriction
>> to the hardware domain in the common code can be skipped for it.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Julien Grall <julien.grall@arm.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
> In principle x86 parts
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.


> However, being a maintainer of ...
>
>> --- a/xen/include/asm-x86/p2m.h
>> +++ b/xen/include/asm-x86/p2m.h
>> @@ -382,6 +382,22 @@ struct p2m_domain {
>>   #endif
>>   #include <xen/p2m-common.h>
>>   
>> +static inline bool arch_acquire_resource_check(struct domain *d)
>> +{
>> +    /*
>> +     * The reference counting of foreign entries in set_foreign_p2m_entry()
>> +     * is not supported for translated domains on x86.
>> +     *
>> +     * FIXME: Until foreign pages inserted into the P2M are properly
>> +     * reference counted, it is unsafe to allow mapping of
>> +     * resource pages unless the caller is the hardware domain.
>> +     */
>> +    if ( paging_mode_translate(d) && !is_hardware_domain(d) )
>> +        return false;
>> +
>> +    return true;
>> +}
>
> ... this code, I'd like to ask that such constructs be avoided
> and this be a single return statement:
>
>      return !paging_mode_translate(d) || is_hardware_domain(d);

ok, looks better.


>
> I also think you may want to consider dropping the initial
> "The" from the comment. I'm further unconvinced "foreign
> entries" needs saying when set_foreign_p2m_entry() deals with
> exclusively such. In the end the original comment moved here
> would probably suffice, no need for any more additions than
> perhaps a simple "(see set_foreign_p2m_entry())".

ok
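
For context, the Arm helper described in the commit message is shaped
roughly like this (a sketch reconstructed from the description above, not
the exact hunk from the series):

int set_foreign_p2m_entry(struct domain *d, struct domain *fd,
                          unsigned long gfn, mfn_t mfn)
{
    struct page_info *page = mfn_to_page(mfn);
    int rc;

    /* Grab a reference so the foreign page cannot disappear under us. */
    if ( !get_page(page, fd) )
        return -EINVAL;

    /*
     * It is valid to always use p2m_map_foreign_rw: the current and
     * foreign domains are always different here (equal ones are rejected
     * by rcu_lock_remote_domain_by_id()).
     */
    ASSERT(d != fd);

    rc = guest_physmap_add_entry(d, _gfn(gfn), mfn, 0, p2m_map_foreign_rw);
    if ( rc )
        put_page(page);

    return rc;
}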

-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 22/24] xen/arm: Add mapcache invalidation handling
  2021-01-15  2:11   ` Stefano Stabellini
@ 2021-01-21 19:47     ` Oleksandr
  0 siblings, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-01-21 19:47 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Oleksandr Tyshchenko, Julien Grall, Volodymyr Babchuk,
	Julien Grall


On 15.01.21 04:11, Stefano Stabellini wrote:

Hi Stefano

> On Tue, 12 Jan 2021, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> We need to send a mapcache invalidation request to qemu/demu every time
>> a page gets removed from a guest.
>>
>> At the moment, the Arm code doesn't explicitly remove the existing
>> mapping before inserting the new mapping. Instead, this is done
>> implicitly by __p2m_set_entry().
>>
>> So we need to recognize the case when the old entry is a RAM page *and*
>> the new MFN is different, in order to set the corresponding flag.
>> The most suitable place to do this is p2m_free_entry(), where
>> we can find the correct leaf type. The invalidation request
>> will be sent in do_trap_hypercall() later on.
>>
>> Taking into account the following, do_trap_hypercall()
>> is the best place to send the invalidation request:
>>   - The only way a guest can modify its P2M on Arm is via an hypercall
>>   - When sending the invalidation request, the vCPU will be blocked
>>     until all the IOREQ servers have acknowledged the invalidation
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> CC: Julien Grall <julien.grall@arm.com>
>> [On Arm only]
>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>
>> ---
>> Please note, this is a split/cleanup/hardening of Julien's PoC:
>> "Add support for Guest IO forwarding to a device emulator"
>>
>> ***
>> Please note, this patch depends on the following which is
>> on review:
>> https://patchwork.kernel.org/patch/11803383/
>>
>> This patch is on par with x86 code (whether it is buggy or not).
>> If there is a need to improve/harden something, this can be done on
>> a follow-up.
>> ***
>>
>> Changes V1 -> V2:
>>     - new patch, some changes were derived from (+ new explanation):
>>       xen/ioreq: Make x86's invalidate qemu mapcache handling common
>>     - put setting of the flag into __p2m_set_entry()
>>     - clarify the conditions when the flag should be set
>>     - use domain_has_ioreq_server()
>>     - update do_trap_hypercall() by adding local variable
>>
>> Changes V2 -> V3:
>>     - update patch description
>>     - move check to p2m_free_entry()
>>     - add a comment
>>     - use "curr" instead of "v" in do_trap_hypercall()
>>
>> Changes V3 -> V4:
>>     - update patch description
>>     - re-order check in p2m_free_entry() to call domain_has_ioreq_server()
>>       only if p2m->domain == current->domain
>>     - add a comment in do_trap_hypercall()
>> ---
>>   xen/arch/arm/p2m.c   | 25 +++++++++++++++++--------
>>   xen/arch/arm/traps.c | 20 +++++++++++++++++---
>>   2 files changed, 34 insertions(+), 11 deletions(-)
>>
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index d41c4fa..26acb95d 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1,6 +1,7 @@
>>   #include <xen/cpu.h>
>>   #include <xen/domain_page.h>
>>   #include <xen/iocap.h>
>> +#include <xen/ioreq.h>
>>   #include <xen/lib.h>
>>   #include <xen/sched.h>
>>   #include <xen/softirq.h>
>> @@ -749,17 +750,25 @@ static void p2m_free_entry(struct p2m_domain *p2m,
>>       if ( !p2m_is_valid(entry) )
>>           return;
>>   
>> -    /* Nothing to do but updating the stats if the entry is a super-page. */
>> -    if ( p2m_is_superpage(entry, level) )
>> +    if ( p2m_is_superpage(entry, level) || (level == 3) )
>>       {
>> -        p2m->stats.mappings[level]--;
>> -        return;
>> -    }
>> +#ifdef CONFIG_IOREQ_SERVER
>> +        /*
>> +         * If this gets called (non-recursively) then either the entry
>> +         * was replaced by an entry with a different base (valid case) or
>> +         * the shattering of a superpage was failed (error case).
>> +         * So, at worst, the spurious mapcache invalidation might be sent.
>> +         */
>> +        if ( (p2m->domain == current->domain) &&
>> +              domain_has_ioreq_server(p2m->domain) &&
>> +              p2m_is_ram(entry.p2m.type) )
>> +            p2m->domain->mapcache_invalidate = true;
>> +#endif
>>   
>> -    if ( level == 3 )
>> -    {
>>           p2m->stats.mappings[level]--;
>> -        p2m_put_l3_page(entry);
>> +        /* Nothing to do if the entry is a super-page. */
>> +        if ( level == 3 )
>> +            p2m_put_l3_page(entry);
>>           return;
>>       }
>>   
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 35094d8..1070d1b 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -1443,6 +1443,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>>                                 const union hsr hsr)
>>   {
>>       arm_hypercall_fn_t call = NULL;
>> +    struct vcpu *curr = current;
>>   
>>       BUILD_BUG_ON(NR_hypercalls < ARRAY_SIZE(arm_hypercall_table) );
>>   
>> @@ -1459,7 +1460,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>>           return;
>>       }
>>   
>> -    current->hcall_preempted = false;
>> +    curr->hcall_preempted = false;
>>   
>>       perfc_incra(hypercalls, *nr);
>>       call = arm_hypercall_table[*nr].fn;
>> @@ -1472,7 +1473,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>>       HYPERCALL_RESULT_REG(regs) = call(HYPERCALL_ARGS(regs));
>>   
>>   #ifndef NDEBUG
>> -    if ( !current->hcall_preempted )
>> +    if ( !curr->hcall_preempted )
>>       {
>>           /* Deliberately corrupt parameter regs used by this hypercall. */
>>           switch ( arm_hypercall_table[*nr].nr_args ) {
>> @@ -1489,8 +1490,21 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>>   #endif
>>   
>>       /* Ensure the hypercall trap instruction is re-executed. */
>> -    if ( current->hcall_preempted )
>> +    if ( curr->hcall_preempted )
>>           regs->pc -= 4;  /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
>> +
>> +#ifdef CONFIG_IOREQ_SERVER
>> +    /*
>> +     * Taking into the account the following the do_trap_hypercall()
>> +     * is the best place to send invalidation request:
>> +     * - The only way a guest can modify its P2M on Arm is via an hypercall
>> +     * - When sending the invalidation request, the vCPU will be blocked
>> +     *   until all the IOREQ servers have acknowledged the invalidation
> NIT: I suggest to reword it as follows to make it sound better.
>
> We call ioreq_signal_mapcache_invalidate from do_trap_hypercall()
> because the only way a guest can modify its P2M on Arm is via an
> hypercall. Note that sending the invalidation request causes the vCPU to
> block until all the IOREQ servers have acknowledged the invalidation.

Agree


>
>
> Could be done on commit.

Thank you, I am preparing V5, so will update.


>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thanks.


>
>
>> +     */
>> +    if ( unlikely(curr->domain->mapcache_invalidate) &&
>> +         test_and_clear_bool(curr->domain->mapcache_invalidate) )
>> +        ioreq_signal_mapcache_invalidate();
>> +#endif
>>   }
>>   
>>   void arch_hypercall_tasklet_result(struct vcpu *v, long res)
>> -- 
>> 2.7.4
>>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-21  9:31                     ` Oleksandr
@ 2021-01-21 21:34                       ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2021-01-21 21:34 UTC (permalink / raw)
  To: Oleksandr
  Cc: Stefano Stabellini, Julien Grall, xen-devel, Julien Grall,
	Volodymyr Babchuk, Oleksandr Tyshchenko


On Thu, 21 Jan 2021, Oleksandr wrote:
> On 20.01.21 21:47, Stefano Stabellini wrote:
> > On Wed, 20 Jan 2021, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 20/01/2021 00:50, Stefano Stabellini wrote:
> > > > On Tue, 19 Jan 2021, Oleksandr wrote:
> > > > > diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> > > > > index 40b9e59..0508bd8 100644
> > > > > --- a/xen/arch/arm/ioreq.c
> > > > > +++ b/xen/arch/arm/ioreq.c
> > > > > @@ -101,12 +101,10 @@ enum io_state try_fwd_ioserv(struct
> > > > > cpu_user_regs
> > > > > *regs,
> > > > > 
> > > > >    bool arch_ioreq_complete_mmio(void)
> > > > >    {
> > > > > -    struct vcpu *v = current;
> > > > >        struct cpu_user_regs *regs = guest_cpu_user_regs();
> > > > >        const union hsr hsr = { .bits = regs->hsr };
> > > > > -    paddr_t addr = v->io.req.addr;
> > > > > 
> > > > > -    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
> > > > > +    if ( handle_ioserv(regs, current) == IO_HANDLED )
> > > > >        {
> > > > >            advance_pc(regs, hsr);
> > > > >            return true;
> > > > Yes, but I think we want to keep the check
> > > > 
> > > >       vio->req.state == STATE_IORESP_READY
> > > > 
> > > > So maybe (uncompiled, untested):
> > > > 
> > > >       if ( v->io.req.state != STATE_IORESP_READY )
> > > >           return false;
> > > Is it possible to reach this function with v->io.req.state !=
> > > STATE_IORESP_READY? If not, then I would suggest to add an
> > > ASSERT_UNREACHABLE() before the return.
> > If I am reading the state machine right it should *not* be possible to
> > get here with v->io.req.state != STATE_IORESP_READY, so yes,
> > ASSERT_UNREACHABLE() would work.
> Agree here. If the assumption is not correct (unlikely), I think I will catch
> this during testing.
> In addition, we can probably drop case STATE_IORESP_READY in try_fwd_ioserv().
> 
> 
> [not tested]

Yes, looks OK

 
> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> index 40b9e59..c7ee1a7 100644
> --- a/xen/arch/arm/ioreq.c
> +++ b/xen/arch/arm/ioreq.c
> @@ -71,9 +71,6 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>      case STATE_IOREQ_NONE:
>          break;
> 
> -    case STATE_IORESP_READY:
> -        return IO_HANDLED;
> -
>      default:
>          gdprintk(XENLOG_ERR, "wrong state %u\n", vio->req.state);
>          return IO_ABORT;
> @@ -104,9 +101,14 @@ bool arch_ioreq_complete_mmio(void)
>      struct vcpu *v = current;
>      struct cpu_user_regs *regs = guest_cpu_user_regs();
>      const union hsr hsr = { .bits = regs->hsr };
> -    paddr_t addr = v->io.req.addr;
> 
> -    if ( try_handle_mmio(regs, hsr, addr) == IO_HANDLED )
> +    if ( v->io.req.state != STATE_IORESP_READY )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return false;
> +    }
> +
> +    if ( handle_ioserv(regs, v) == IO_HANDLED )
>      {
>          advance_pc(regs, hsr);
>          return true;


* Re: [PATCH V4 20/24] xen/arm: io: Harden sign extension check
  2021-01-12 21:52 ` [PATCH V4 20/24] xen/arm: io: Harden sign extension check Oleksandr Tyshchenko
  2021-01-15  1:48   ` Stefano Stabellini
@ 2021-01-22 10:15   ` Volodymyr Babchuk
  1 sibling, 0 replies; 144+ messages in thread
From: Volodymyr Babchuk @ 2021-01-22 10:15 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Oleksandr Tyshchenko, Stefano Stabellini,
	Julien Grall, Julien Grall


Hi Oleksandr,

Oleksandr Tyshchenko writes:

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> In the ideal world we would never get an undefined behavior when
> propagating the sign bit since that bit can only be set for access
> size smaller than the register size (i.e byte/half-word for aarch32,
> byte/half-word/word for aarch64).
>
> In the real world we need to care for *possible* hardware bug such as
> advertising a sign extension for either 64-bit (or 32-bit) on Arm64
> (resp. Arm32).
>
> So harden a bit more the code to prevent undefined behavior when
> propagating the sign bit in case of buggy hardware.
>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> CC: Julien Grall <julien.grall@arm.com>

Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
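
As a side note for other readers: the undefined behaviour in question is
the shift-based sign extension itself. A minimal illustration (not from
the patch; just a sketch assuming arm64, where register_t is 64-bit):

    uint64_t r = 1UL << 63;     /* full-width read with the top bit set */
    unsigned int size = 64;     /* buggy hw advertising sign extension */

    /* Testing the sign bit is fine: a shift by 63 is well defined. */
    bool sign = r & (1UL << (size - 1));

    /* Extending it is not: shifting a 64-bit value by 64 is UB in C. */
    uint64_t ext = ~0UL << size;

With the added (size < sizeof(register_t) * 8) check, the extension is
simply skipped for full-width accesses, so no shift count can reach the
register width.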

>
> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
>
> Changes V3 -> V4:
>    - new patch
> ---
>  xen/include/asm-arm/traps.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/xen/include/asm-arm/traps.h b/xen/include/asm-arm/traps.h
> index e301c44..992d537 100644
> --- a/xen/include/asm-arm/traps.h
> +++ b/xen/include/asm-arm/traps.h
> @@ -93,7 +93,8 @@ static inline register_t sign_extend(const struct hsr_dabt dabt, register_t r)
>       * Note that we expect the read handler to have zeroed the bits
>       * outside the requested access size.
>       */
> -    if ( dabt.sign && (r & (1UL << (size - 1))) )
> +    if ( dabt.sign && (size < sizeof(register_t) * 8) &&
> +         (r & (1UL << (size - 1))) )
>      {
>          /*
>           * We are relying on register_t using the same as


-- 
Volodymyr Babchuk at EPAM


* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-21  8:50             ` Oleksandr
@ 2021-01-27 10:24               ` Jan Beulich
  2021-01-27 12:22                 ` Oleksandr
  0 siblings, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-27 10:24 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko, Julien Grall

On 21.01.2021 09:50, Oleksandr wrote:
> On 20.01.21 17:50, Julien Grall wrote:
>> On 17/01/2021 18:52, Oleksandr wrote:
>>> error2.txt - when add #include <public/hvm/dm_op.h> to xen/ioreq.h
>>
>> It looks like the error is happening in dm.c rather than xen/dm.h. Any 
>> reason to not include <public/hvm/dm_op.h> in dm.c directly?
> Including it directly doesn't solve the build issue.
> If I am not mistaken, in order to follow the requirements on how to
> include headers (alphabetic order, public* should be included after
> xen* and asm* ones, etc.), dm.h gets included first in dm.c, and
> dm_op.h gets included last. We can avoid the build issue by changing
> the inclusion order a bit, i.e. by including dm.h after hypercall.h at
> least (because hypercall.h already includes dm_op.h). But this breaks
> the requirements and is not the way to go.
> Now I am unsure how to overcome this.

First, violating the alphabetic ordering rule is perhaps less
of a problem than putting seemingly stray #include-s anywhere.
However, as soon as ordering starts mattering, this is
indicative of problems with the headers: Either the (seemingly)
too early included one lacks some #include-s, or you've run
into a circular dependency. In the former case the needed
#include-s should be added, and all ought to be fine. In the
latter case, however, disentangling may be a significant
effort, and hence it may be sensible and acceptable to instead
use unusual ordering of #include-s in the one place where it
matters (suitably justified in the description). Ideally such
would come with a promise to try to sort this later on, when
time is less constrained.

Jan



* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-27 10:24               ` Jan Beulich
@ 2021-01-27 12:22                 ` Oleksandr
  2021-01-27 12:52                   ` Jan Beulich
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-01-27 12:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko, Julien Grall


On 27.01.21 12:24, Jan Beulich wrote:

Hi Jan

> On 21.01.2021 09:50, Oleksandr wrote:
>> On 20.01.21 17:50, Julien Grall wrote:
>>> On 17/01/2021 18:52, Oleksandr wrote:
>>>> error2.txt - when add #include <public/hvm/dm_op.h> to xen/ioreq.h
>>> It looks like the error is happening in dm.c rather than xen/dm.h. Any
>>> reason to not include <public/hvm/dm_op.h> in dm.c directly?
>> Including it directly doesn't solve the build issue.
>> If I am not mistaken, in order to follow the requirements on how to
>> include headers (alphabetic order, public* should be included after
>> xen* and asm* ones, etc.), dm.h gets included first in dm.c, and
>> dm_op.h gets included last. We can avoid the build issue by changing
>> the inclusion order a bit, i.e. by including dm.h after hypercall.h at
>> least (because hypercall.h already includes dm_op.h). But this breaks
>> the requirements and is not the way to go.
>> Now I am unsure how to overcome this.
> First, violating the alphabetic ordering rule is perhaps less
> of a problem than putting seemingly stray #include-s anywhere.
> However, as soon as ordering starts mattering, this is
> indicative of problems with the headers: Either the (seemingly)
> too early included one lacks some #include-s, or you've run
> into a circular dependency. In the former case the needed
> #include-s should be added, and all ought to be fine. In the
> latter case, however, disentangling may be a significant
> effort, and hence it may be sensible and acceptable to instead
> use unusual ordering of #include-s in the one place where it
> matters (suitably justified in the description). Ideally such
> would come with a promise to try to sort this later on, when
> time is less constrained.
Thank you for the explanation. I think I am facing the former case (the
too-early-included header lacks some #include-s); actually both
common/dm.c and arch/arm/dm.c suffer from that.
It works for me if I add the following to both files (violating the
alphabetic ordering rule):

+#include <xen/types.h>
+#include <public/hvm/dm_op.h>
+
  #include <xen/dm.h>


So, if I got your point correctly, we could include both of these
headers from dm.h itself. Would that be appropriate (with suitable
justification, of course)?
I think we could avoid including xen/sched.h from dm.h (need to
recheck), just these two headers above.
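
For concreteness, the header itself would then look roughly like this
(a sketch only; the struct contents below are placeholders rather than
the final interface):

/* xen/include/xen/dm.h -- sketch */
#ifndef __XEN_DM_H__
#define __XEN_DM_H__

#include <xen/types.h>

#include <public/hvm/dm_op.h>

struct dmop_args {
    domid_t domid;
    unsigned int nr_bufs;
    /* ... */
};

int dm_op(const struct dmop_args *op_args);

#endif /* __XEN_DM_H__ */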


>
> Jan

-- 
Regards,

Oleksandr Tyshchenko




* Re: [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features
  2021-01-27 12:22                 ` Oleksandr
@ 2021-01-27 12:52                   ` Jan Beulich
  0 siblings, 0 replies; 144+ messages in thread
From: Jan Beulich @ 2021-01-27 12:52 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Oleksandr Tyshchenko, Julien Grall

On 27.01.2021 13:22, Oleksandr wrote:
> On 27.01.21 12:24, Jan Beulich wrote:
>> On 21.01.2021 09:50, Oleksandr wrote:
>>> On 20.01.21 17:50, Julien Grall wrote:
>>>> On 17/01/2021 18:52, Oleksandr wrote:
>>>>> error2.txt - when add #include <public/hvm/dm_op.h> to xen/ioreq.h
>>>> It looks like the error is happening in dm.c rather than xen/dm.h. Any
>>>> reason to not include <public/hvm/dm_op.h> in dm.c directly?
>>> Including it directly doesn't solve the build issue.
>>> If I am not mistaken, in order to follow the requirements on how to
>>> include headers (alphabetic order, public* should be included after
>>> xen* and asm* ones, etc.), dm.h gets included first in dm.c, and
>>> dm_op.h gets included last. We can avoid the build issue by changing
>>> the inclusion order a bit, i.e. by including dm.h after hypercall.h at
>>> least (because hypercall.h already includes dm_op.h). But this breaks
>>> the requirements and is not the way to go.
>>> Now I am unsure how to overcome this.
>> First, violating the alphabetic ordering rule is perhaps less
>> of a problem than putting seemingly stray #include-s anywhere.
>> However, as soon as ordering starts mattering, this is
>> indicative of problems with the headers: Either the (seemingly)
>> too early included one lacks some #include-s, or you've run
>> into a circular dependency. In the former case the needed
>> #include-s should be added, and all ought to be fine. In the
>> latter case, however, disentangling may be a significant
>> effort, and hence it may be sensible and acceptable to instead
>> use unusual ordering of #include-s in the one place where it
>> matters (suitably justified in the description). Ideally such
>> would come with a promise to try to sort this later on, when
>> time is less constrained.
> Thank you for the explanation. I think I am facing the former case (the
> too-early-included header lacks some #include-s); actually both
> common/dm.c and arch/arm/dm.c suffer from that.
> It works for me if I add the following to both files (violating the 
> alphabetic ordering rule):
> 
> +#include <xen/types.h>
> +#include <public/hvm/dm_op.h>
> +
>   #include <xen/dm.h>
> 
> 
> So, if I got your point correctly, we could include both of these
> headers from dm.h itself. Would that be appropriate (with suitable
> justification, of course)?

Perhaps - this is a header you introduce aiui, so it's up to
you to arrange for it to include all further headers it
depends upon. In such a case (new header) you don't need to
explicitly justify what you include, but of course you don't
want to include excessive ones, or you risk getting back
"Why?" from reviewers.

Jan



* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-14 19:02         ` Andrew Cooper
                             ` (2 preceding siblings ...)
  2021-01-15 15:14           ` Lengyel, Tamas
@ 2021-01-28 18:26           ` Dario Faggioli
  2021-01-28 22:15             ` Dario Faggioli
  2021-01-29  8:38             ` Jan Beulich
  3 siblings, 2 replies; 144+ messages in thread
From: Dario Faggioli @ 2021-01-28 18:26 UTC (permalink / raw)
  To: Andrew Cooper, Ian Jackson, xen-devel, committers,
	Tamas K Lengyel, Michał Leszczyński


On Thu, 2021-01-14 at 19:02 +0000, Andrew Cooper wrote:
> On 14/01/2021 16:06, Ian Jackson wrote:
> > The last posting date for new feature patches for Xen 4.15 is
> > tomorrow. [1]  We seem to be getting a reasonably good flood of
> > stuff
> > trying to meet this deadline :-).
> > 
> > Patches for new fetures posted after tomorrow will be deferred to
> > the
> > next Xen release after 4.15.  NB the primary responsibility for
> > driving a feature's progress to meet the release schedule, lies
> > with
> > the feature's proponent(s).
> > 
> > 
> >   As a reminder, here is the release schedule:
> > + (unchanged information indented with spaces):
> > 
> >    Friday 15th January    Last posting date
> > 
> >        Patches adding new features should be posted to the mailing
> > list
> >        by this date, although perhaps not in their final version.
> > 
> >    Friday 29th January    Feature freeze
> > 
> >        Patches adding new features should be committed by this
> > date.
> >        Straightforward bugfixes may continue to be accepted by
> >        maintainers.
> > 
> >    Friday 12th February **tentative**   Code freeze
> > 
> >        Bugfixes only, all changes to be approved by the Release
> > Manager.
> > 
> >    Week of 12th March **tentative**    Release
> >        (probably Tuesday or Wednesday)
> > 
> >   Any patches containing substantial refactoring are to be treated as
> >   new features, even if their intent is to fix bugs.
> > 
> >   Freeze exceptions will not be routine, but may be granted in
> >   exceptional cases for small changes on the basis of risk
> > assessment.
> >   Large series will not get exceptions.  Contributors *must not*
> > rely on
> >   getting, or expect, a freeze exception.
> > 
> > + New or improved tests (supposing they do not involve refactoring,
> > + even build system reorganisation), and documentation
> > improvements,
> > + will generally be treated as bugfixes.
> > 
> >   The codefreeze and release dates are provisional and will be
> > adjusted
> >   in the light of apparent code quality etc.
> > 
> >   If as a feature proponent you feel your feature is at risk and
> > there
> >   is something the Xen Project could do to help, please consult me
> > or
> >   the Community Manager.  In such situations please reach out
> > earlier
> >   rather than later.
> > 
> > 
> > In my last update I asked this:
> > 
> > > If you are working on a feature you want in 4.15 please let me
> > > know
> > > about it.  Ideally I'd like a little stanza like this:
> > > 
> > > S: feature name
> > > O: feature owner (proponent) name
> > > E: feature owner (proponent) email address
> > > P: your current estimate of the probability it making 4.15, as a
> > > %age
> > > 
> > > But free-form text is OK too.  Please reply to this mail.
> > I received one mail.  Thanks to Oleksandr Andrushchenko for his
> > update
> > on the following feeature:
> > 
> >   IOREQ feature (+ virtio-mmio) on Arm
> >   
> > https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg87002.html
> > 
> >   Julien Grall <julien@xen.org>
> >   Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > 
> > I see that V4 of this series was just posted.  Thanks, Oleksandr.
> > I'll make a separate enquiry about your series.
> > 
> > I think if people don't find the traditional feature tracking
> > useful,
> > I will try to assemble Release Notes information later, during the
> > freeze, when fewer people are rushing to try to meet the deadlines.
> 
> (Now I have working email).
> 
> Features:
> 
> 1) acquire_resource fixes.
> 
> Not really a new feature - entirely bugfixing a preexisting one.
> Developed by me to help 2).  Reasonably well acked, but awaiting
> feedback on v3.
> 
> 2) External Processor Trace support.
> 
> Development by Michał.  Depends on 1), and awaiting a new version
> being
> posted.
> 
> As far as I'm aware, both Intel and CERT have production systems
> deployed using this functionality, so it is very highly desirable to
> get
> into 4.15.
> 
> 3) Initial Trenchboot+SKINIT support.
> 
> I've got two patches I need to clean up and submit which is the first
> part of the Trenchboot + Dynamic Root of Trust on AMD support.  This
> will get Xen into a position where it can be started via the new grub
> "secure_launch" protocol.
> 
> Later patches (i.e. post 4.15) will do support for Intel TXT (i.e.
> without tboot), as well as the common infrastructure for the TPM
> event
> log and further measurements during the boot process.
> 
> 4) "simple" autotest support.
> 
> 
> Bugs:
> 
> 1) HPET/PIT issue on newer Intel systems.  This has had literally
> tens
> of reports across the devel and users mailing lists, and prevents Xen
> from booting at all on the past two generations of Intel laptops. 
> I've
> finally got a repro and posted a fix to the list, but still in
> progress.
> 
> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
> working, and very little investigation on what's going on.  Suspicion
> is
> that there might be two bugs, one with smt=0 on recent AMD hardware,
> and
> one more general "some workloads cause negative credit" and might or
> might not be specific to credit2 (debugging feedback differs - also
> might be 3 underlying issues).
> 
Yep, so, let's try to summarize/collect the ones I think you may be
referring to:

1) There is one report about Credit2 not working, while Credit1 was
fine. It's this one:

https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01561.html

It's the one where it somehow happens that one or more vCPUs manage to
run for a really, really long timeslice, much longer than the scheduler
would have allowed them to, and this causes problems. _If_ that's it, my
investigation so far seems to show that this happens despite the
scheduler code trying to enforce (via timers) the proper timeslice
limits. When it happens, it makes the scheduler very unhappy. I've seen
reports of it occurring both on Credit and Credit2, but Credit2
definitely seems to be more sensitive to it.

I've actually been trying to track it down for a while now, but I can't
easily reproduce it, so it's proving to be challenging.

2) Then there has been this one:

https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01005.html

Here, the reporter said that "[credit1] results is an observable
delay, unusable performance; credit2 seems to be the only usable
scheduler". This is the one that Andrew also mentions, happening on
Ryzen and with SMT disabled (as this is on QubesOS, IIRC).

Here, doing "dom0_max_vcpus=1 dom0_vcpus_pin" seemed to mitigate the
problem but, of course, with obvious limitations. I don't have a Ryzen
handy, but I have a Zen and a Zen2. I checked there and again could not
reproduce (although, what I tried was upstream Xen, not QubesOS).

3) Then I recall this one:

https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01800.html

This also started as a "scheduler, probably Credit2" bug. But it then
turned out to manifest on both Credit1 and Credit2, and it started to
happen on 4.14, while it was not there in 4.13... And nothing major
changed in scheduling between these two releases, I think.

During the analysis, we thought we identified a livelock, but then
could not pinpoint what was exactly going on. Oh, and then it was also
discovered that Credit2 + PVH dom0 seemed to be a working
configuration, and it's weird for a scheduling issue to have a (dom0)
domain type dependency, I think. But that could be anything really...
and I'm sure happy to keep digging.

4) There's the NULL scheduler + ARM + vwfi=native issue:

https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg01634.html

This looks like something that we saw before, but remained unfixed,
although not exactly like that. If it's that one, analysis is done, and
we're working on a patch. If it's something else or even something
similar but slightly different... Well, we'll have to see when we have
the patch.

5) We're also dealing with this bugreport, although this is being
reported against Xen 4.13 (openSUSE 's packaged version of it):

https://bugzilla.opensuse.org/show_bug.cgi?id=1179246

This is again on recent AMD hardware and here, "dom0_max_vcpus=4
dom0_vcpus_pin" works ok, but only until a (Windows) HVM guest is
started. When that happens, though, we have crashes/hangs.

If guests are PV, things are apparently fine. If the HVM guests use a
different set of CPUs than dom0 (e.g., vm.cpumask="4-63" in xl.conf;
see the sketch after this list), things are fine as well.

Again a scheduler issue and a scheduling algorithm dependency was
theorized and will be investigated (if the user can come back with
answers, which may take some time, as explained in the report). The
different behavior with different kinds of guests is a little weird for
an issue of this kind, IME, but let's see.

6) If we want, we can include this too (hopefully just for reference):

https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg01376.html

As indeed the symptoms were similar, such as hanging during boot, but
all fine with dom0_max_vcpus=1. However, Jan is currently investigating
this one, and they're heading toward problems with TSC reliability
reporting and rendezvous, but let's see.
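
Just to make the mitigation mentioned in 5) concrete, this is the kind
of configuration I mean (an illustrative sketch; the values are taken
straight from the report):

  # Xen command line: a few pinned dom0 vCPUs
  dom0_max_vcpus=4 dom0_vcpus_pin

  # /etc/xen/xl.conf: keep HVM guests off the CPUs used by dom0
  vm.cpumask="4-63"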

Did I forget any?

As for "the plan", I am currently working on 4 (trying to come up with
a patch that fixes it) and on 1 (trying to come up with a way to track
down and uncover what I believe is the real issue).

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-28 18:26           ` Dario Faggioli
@ 2021-01-28 22:15             ` Dario Faggioli
  2021-01-29  8:38             ` Jan Beulich
  1 sibling, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2021-01-28 22:15 UTC (permalink / raw)
  To: Andrew Cooper, Ian Jackson, xen-devel, committers,
	Tamas K Lengyel, Michał Leszczyński


On Thu, 2021-01-28 at 19:26 +0100, Dario Faggioli wrote:
> 
> [...]
>
> 2) Then there has been this one:
> 
>  
> https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01005.html
> 
> Here, the reporter said that "[credit1] results is an
> observable
> delay, unusable performance; credit2 seems to be the only usable
> scheduler". This is the one that Andrew also mentions, happening on
> Ryzen and with SMT disabled (as this is on QubesOS, IIRC).
> 
> Here, doing "dom0_max_vcpus=1 dom0_vcpus_pin" seemed to mitigate the
> problem but, of course, with obvious limitations. I don't have a
> Ryzen
> handy, but I have a Zen and a Zen2. 
>
And, what I meant here was "I have an EPYC and an EPYC2". Sorry.

Also, in my previous email, I forgot to properly trim the context above
my actual reply.

Sorry about that too.
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-15 15:14           ` Lengyel, Tamas
@ 2021-01-28 22:55             ` Dario Faggioli
  0 siblings, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2021-01-28 22:55 UTC (permalink / raw)
  To: Lengyel, Tamas, Cooper, Andrew, Ian Jackson, xen-devel,
	committers, Tamas K Lengyel, Michał Leszczyński


On Fri, 2021-01-15 at 15:14 +0000, Lengyel, Tamas wrote:
> 
> > 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
> > working,
> > and very little investigation on whats going on.  Suspicion is that
> > there
> > might be two bugs, one with smt=0 on recent AMD hardware, and one
> > more general "some workloads cause negative credit" and might or
> > might
> > not be specific to credit2 (debugging feedback differs - also might
> > be 3
> > underlying issue).
> 
> We've also ran into intermittent Xen lockups requiring power-cycling
> servers. We switched back to credit1 and had no issues since. 
>
Ah, that's interesting... Among the issues that I listed in my other
email, when trying to do a quick summary, "only" number 1 is about
Credit working when Credit2 does not. This one you're mentioning here
may be the second... or it may be the same! :-O

As said there, my theory so far is that there's a bug somewhere, not
necessarily in scheduling code, to which the two algorithms react
differently. Of course this is a theory, and I've not been able to
confirm it yet (otherwise I also would have fixed the problem. :-P).

But really, it would be interesting to double check if at least the
symptoms are the same than the ones of the issue reported here.

> Hard to tell if it was related to the scheduler or the pile of other
> experimental stuff we are running with but right now we have stable
> systems across the board with credit1.
> 
Well, sure, that's understandable. :-) Which is why it's tricky at
times to debug these issues. In fact, I cannot reproduce them myself,
and users, rightfully, move on once they have found a workaround.

Anyway, if at some point you decide to investigate, I'll be happy to
help.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-28 18:26           ` Dario Faggioli
  2021-01-28 22:15             ` Dario Faggioli
@ 2021-01-29  8:38             ` Jan Beulich
  2021-01-29  9:22               ` Dario Faggioli
  1 sibling, 1 reply; 144+ messages in thread
From: Jan Beulich @ 2021-01-29  8:38 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: committers, Tamas K Lengyel, Andrew Cooper,
	Michał Leszczyński, Ian Jackson, xen-devel

On 28.01.2021 19:26, Dario Faggioli wrote:
> On Thu, 2021-01-14 at 19:02 +0000, Andrew Cooper wrote:
>> 2) "scheduler broken" bugs.  We've had 4 or 5 reports of Xen not
>> working, and very little investigation on what's going on.  Suspicion
>> is
>> that there might be two bugs, one with smt=0 on recent AMD hardware,
>> and
>> one more general "some workloads cause negative credit" and might or
>> might not be specific to credit2 (debugging feedback differs - also
>> might be 3 underlying issues).
>>
> Yep, so, let's try to summarize/collect the ones I think you may be
> referring to:
> 
> 1) There is one report about Credit2 not working, while Credit1 was
> fine. It's this one:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01561.html
> 
> It's the one where it somehow happens that one or more vCPUs manage to
> run for a really, really long timeslice, much longer than the scheduler
> would have allowed them to, and this causes problems. _If_ that's it, my
> investigation so far seems to show that this happens despite the
> scheduler code trying to enforce (via timers) the proper timeslice
> limits. When it happens, it makes the scheduler very unhappy. I've seen
> reports of it occurring both on Credit and Credit2, but Credit2
> definitely seems to be more sensitive to it.
> 
> I've actually been trying to track it down for a while now, but I can't
> easily reproduce it, so it's proving to be challenging.
> 
> 2) Then there has been this one:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01005.html
> 
> Here, the reporter said that "[credit1] results is an observable
> delay, unusable performance; credit2 seems to be the only usable
> scheduler". This is the one that Andrew also mentions, happening on
> Ryzen and with SMT disabled (as this is on QubesOS, IIRC).
> 
> Here, doing "dom0_max_vcpus=1 dom0_vcpus_pin" seemed to mitigate the
> problem but, of course, with obvious limitations. I don't have a Ryzen
> handy, but I have a Zen and a Zen2. I checked there and again could not
> reproduce (although, what I tried was upstream Xen, not QubesOS).
> 
> 3) Then I recall this one:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01800.html
> 
> This also started as a "scheduler, probably Credit2" bug. But it then
> turned out to manifest on both Credit1 and Credit2, and it started to
> happen on 4.14, while it was not there in 4.13... And nothing major
> changed in scheduling between these two releases, I think.
> 
> During the analysis, we thought we identified a livelock, but then
> could not pinpoint what was exactly going on. Oh, and then it was also
> discovered that Credit2 + PVH dom0 seemed to be a working
> configuration, and it's weird for a scheduling issue to have a (dom0)
> domain type dependency, I think. But that could be anything really...
> and I'm sure happy to keep digging.
> 
> 4) There's the NULL scheduler + ARM + vwfi=native issue:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg01634.html
> 
> This looks like something that we saw before, but remained unfixed,
> although not exactly like that. If it's that one, analysis is done, and
> we're working on a patch. If it's something else or even something
> similar but slightly different... Well, we'll have to see when we have
> the patch.
> 
> 5) We're also dealing with this bugreport, although this is being
> reported against Xen 4.13 (openSUSE 's packaged version of it):
> 
> https://bugzilla.opensuse.org/show_bug.cgi?id=1179246
> 
> This is again on recent AMD hardware and here, "dom0_max_vcpus=4
> dom0_vcpus_pin" works ok, but only until a (Windows) HVM guest is
> started. When that happens, though, we have crashes/hangs.
> 
> If guests are PV, things are apparently fine. If the HVM guests use a
> different set of CPUs than dom0 (e.g., vm.cpumask="4-63" in xl.conf),
> things are fine as well.
> 
> Again a scheduler issue and a scheduling algorithm dependency was
> theorized and will be investigated (if the user can come back with
> answers, which may take some time, as explained in the report). The
> different behavior with different kinds of guests is a little weird for
> an issue of this kind, IME, but let's see.
> 
> 6) If we want, we can include this too (hopefully just for reference):
> 
> https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg01376.html
> 
> As indeed the symptoms were similar, such as hanging during boot, but
> all fine with dom0_max_vcpus=1. However, Jan is currently investigating
> this one, and they're heading toward problems with TSC reliability
> reporting and rendezvous, but let's see.
> 
> Did I forget any?

Going just from my mailbox, where I didn't keep all of the still
unaddressed reports, but some (another one I have there is among
the ones you've mentioned above):

https://lists.xen.org/archives/html/xen-devel/2020-03/msg01251.html
https://lists.xen.org/archives/html/xen-devel/2020-05/msg01985.html

Jan



* Re: [ANNOUNCE] Xen 4.15 release schedule and feature tracking
  2021-01-29  8:38             ` Jan Beulich
@ 2021-01-29  9:22               ` Dario Faggioli
  0 siblings, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2021-01-29  9:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: committers, Tamas K Lengyel, Andrew Cooper,
	Michał Leszczyński, Ian Jackson, xen-devel


On Fri, 2021-01-29 at 09:38 +0100, Jan Beulich wrote:
> On 28.01.2021 19:26, Dario Faggioli wrote:
> > 
> > Did I forget any?
> 
> Going just from my mailbox, where I didn't keep all of the still
> unaddressed reports, but some (another one I have there is among
> the ones you've mentioned above):
> 
> https://lists.xen.org/archives/html/xen-devel/2020-03/msg01251.html
>
Yes, thanks! Now that you mention it, I do remember this one. It
definitely does look scheduling related (probably not the scheduling
algorithm, but "scheduling as a whole").

Well, I now have boxes that support suspend that I can use, so I will
try *again* to reproduce it locally.

> https://lists.xen.org/archives/html/xen-devel/2020-05/msg01985.html
> 
Yes. So, my working theory is that this is the same issue as:

https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01561.html

I.e., the one that I listed as 1) in my "recap" (actually, I thought I
mentioned it there somewhere, but now that I check, I actually didn't,
so thanks for this too).

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)



* Re: [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm
  2021-01-20 16:40       ` Julien Grall
  2021-01-20 20:35         ` Stefano Stabellini
@ 2021-02-09 21:04         ` Oleksandr
  1 sibling, 0 replies; 144+ messages in thread
From: Oleksandr @ 2021-02-09 21:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Julien Grall, Ian Jackson, Wei Liu, Anthony PERARD,
	Stefano Stabellini, Volodymyr Babchuk, Oleksandr Tyshchenko


On 20.01.21 18:40, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien


Sorry for the late response.


>
> On 17/01/2021 22:22, Oleksandr wrote:
>> On 15.01.21 23:30, Julien Grall wrote:
>>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>>> From: Julien Grall <julien.grall@arm.com>
>>> So I am not quite sure how this new parameter can be used. Could 
>>> you expand it?
>> The original idea was to set it if we are going to assign virtio 
>> device(s) to the guest.
>> Being honest, I have a plan to remove this extra parameter. It might 
>> not be obvious looking at the current patch, but next patch will show 
>> that we can avoid introducing it at all.
>
> Right, so I think we want to avoid introducing the parameter. I have 
> suggested in patch #24 a different way to split code introduced by #23 
> and #24.

Got it. Will take it into account for the next version.


>
>
> [...]
>
>>>
>>>> +#define GUEST_VIRTIO_MMIO_SIZE xen_mk_ullong(0x200)
>>>
>>> AFAICT, the size of the virtio mmio region should be 0x100. So why 
>>> is it 0x200?
>>
>>
>> I didn't find the total size requirement for the mmio region in 
>> virtio specification v1.1 (the size of control registers is indeed 
>> 0x100 and the device-specific configuration region starts at offset
>> 0x100, however its size depends on the device and the driver).
>>
>> kvmtool uses 0x200 [1], in some Linux device-trees we can see 0x200 
>> [2] (however, device-tree bindings example has 0x100 [3]), so what 
>> would be the proper value for Xen code?
>
> Hmm... I missed that fact. I would say we want to use the biggest size 
> possible so we can cover most of the devices.
>
> Although, as you pointed out, this may not cover all the devices. So 
> maybe we want to allow the user to configure the size via xl.cfg for 
> the ones not conforming to 0x200.
>
> This could be implemented in the future. Stefano/Ian, what do you think?

I see that Stefano has already agreed on that, so let's leave 0x200 for now.
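
For reference, the per-device layout we are assuming here, going by the
virtio spec v1.1 and the sizes mentioned above:

  0x000 - 0x0ff   control registers (fixed size)
  0x100 - ...     device-specific configuration (device/driver dependent)

so 0x200 leaves another 0x100 for the device-specific part.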


>
>
>>> Most likely, you will want to reserve a range
>>
>> it seems yes, good point. BTW, the range is needed for the mmio 
>> region as well, correct?
>
> I would reserve 1MB (just for the sake of avoid region size in KB).
>
> For the SPIs, I would consider reserving 10-20 interrupts. Do you 
> think this will cover your use cases?

Yes, I think it would be enough for now.
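
So, for the next version I would end up with something like the below
in arch-arm.h (just a sketch to make sure we mean the same thing; names
and exact values still to be finalized against patch #23):

#define GUEST_VIRTIO_MMIO_BASE   xen_mk_ullong(0x02000000)
#define GUEST_VIRTIO_MMIO_SIZE   xen_mk_ullong(0x00100000) /* 1MB window */

/* 11 SPIs reserved for virtio, within the 10-20 suggested above. */
#define GUEST_VIRTIO_MMIO_SPI_FIRST   33
#define GUEST_VIRTIO_MMIO_SPI_LAST    43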


>
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko




* Re: [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-01-20 17:05       ` Julien Grall
@ 2021-02-10  9:02         ` Oleksandr
  2021-03-06 19:52           ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Oleksandr @ 2021-02-10  9:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Oleksandr Tyshchenko, Ian Jackson, Wei Liu,
	Anthony PERARD, Stefano Stabellini


On 20.01.21 19:05, Julien Grall wrote:
> Hi Oleksandr,


Hi Julien


Sorry for the late response.


>
> On 18/01/2021 08:32, Oleksandr wrote:
>>
>> On 16.01.21 00:01, Julien Grall wrote:
>>> Hi Oleksandr,
>>
>> Hi Julien
>>
>>
>>>
>>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>
>>>> This patch adds basic support for configuring and assisting 
>>>> virtio-disk
>>>> backend (emulator) which is intended to run outside of Qemu and could 
>>>> be run
>>>> in any domain.
>>>>
>>>> Xenstore was chosen as a communication interface for the emulator 
>>>> running
>>>> in non-toolstack domain to be able to get configuration either by 
>>>> reading
>>>> Xenstore directly or by receiving command line parameters (an 
>>>> updated 'xl devd'
>>>> running in the same domain would read Xenstore beforehand and call 
>>>> backend
>>>> executable with the required arguments).
>>>>
>>>> An example of domain configuration (two disks are assigned to the 
>>>> guest,
>>>> the latter is in readonly mode):
>>>>
>>>> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ]
>>>>
>>>> Where per-disk Xenstore entries are:
>>>> - filename and readonly flag (configured via "vdisk" property)
>>>> - base and irq (allocated dynamically)
>>>>
>>>> Besides handling 'visible' params described in the configuration file,
>>>> the patch also allocates virtio-mmio specific ones for each device and
>>>> writes them into Xenstore. virtio-mmio params (irq and base) are
>>>> unique per guest domain; they are allocated at domain creation time
>>>> and passed through to the emulator. Each VirtIO device has at least
>>>> one pair of these params.
>>>>
>>>> TODO:
>>>> 1. An extra "virtio" property could be removed.
>>>> 2. Update documentation.
>>>>
>>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>> [On Arm only]
>>>> Tested-by: Wei Chen <Wei.Chen@arm.com>
>>>>
>>>> ---
>>>> Changes RFC -> V1:
>>>>     - no changes
>>>>
>>>> Changes V1 -> V2:
>>>>     - rebase according to the new location of libxl_virtio_disk.c
>>>>
>>>> Changes V2 -> V3:
>>>>     - no changes
>>>>
>>>> Changes V3 -> V4:
>>>>     - rebase according to the new argument for 
>>>> DEFINE_DEVICE_TYPE_STRUCT
>>>>
>>>> Please note, there is a real concern about VirtIO interrupts 
>>>> allocation.
>>>> [Just copy here what Stefano said in RFC thread]
>>>>
>>>> So, if we end up allocating let's say 6 virtio interrupts for a 
>>>> domain,
>>>> the chance of a clash with a physical interrupt of a passthrough 
>>>> device is real.
>>>
>>> For the first version, I think a static approach is fine because it 
>>> doesn't bind us to anything yet (there is no interface change). We 
>>> can refine it on follow-ups as we figure out how virtio is going to 
>>> be used in the field.
>>>
>>>>
>>>> I am not entirely sure how to solve it, but these are a few ideas:
>>>> - choosing virtio interrupts that are less likely to conflict 
>>>> (maybe > 1000)
>>>
>>> Well, we only support 988 interrupts :). However, we will waste some 
>>> memory in the vGIC structure (we would need to allocate memory for 
>>> the 988 interrupts) if you chose an interrupt towards the end.
>>>
>>>> - make the virtio irq (optionally) configurable so that a user could
>>>>    override the default irq and specify one that doesn't conflict
>>>
>>> This is not very ideal because it makes the use of virtio quite 
>>> unfriendly with passthrough. Note that platform device passthrough 
>>> is already unfriendly, but I am thinking PCI :).
>>>
>>>> - implementing support for virq != pirq (even the xl interface doesn't
>>>>    allow to specify the virq number for passthrough devices, see 
>>>> "irqs")
>>> I can't remember whether I had a reason to not support virq != pirq 
>>> when this was initially implemented. This is one possibility, but it 
>>> is as unfriendly as the previous option.
>>>
>>> I will add a 4th one:
>>>    - Automatically allocate the virtio IRQ. This should be possible 
>>> to do it without too much trouble as we know in advance which IRQs 
>>> will be passthrough.
>> As I understand it, the IRQs for passthrough are described in the
>> "irq" property and stored in d_config->b_info.irqs[i], so yes, we know
>> in advance which IRQs will be used for passthrough, and we will be
>> able to choose non-clashing ones (iterating over all IRQs in a
>> reserved range) for the virtio devices.  The question is how many
>> IRQs should be reserved.
>
> If we are automatically selecting the interrupt for virtio devices, 
> then I don't think we need to reserve a batch. Instead, we can 
> allocate one by one as we create the virtio device in libxl.

Looks like, yes, the reserved range is not needed if we use the 4th option.


>
>
> For the static case, then a range of 10-20 might be sufficient for now.

ok


Thinking a bit more about which approach to choose...
I would tend to automatically allocate the virtio IRQ (4th option)
rather than use the static approach with reserved IRQs, in order to
eliminate the chance of a clash with physical IRQs completely from the
very beginning. On the other hand, we could indeed use the static
approach (as the simpler one) for now and then refine it when we have
more understanding of how virtio will be used.
What do you think?
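
To illustrate what I mean by the 4th option, an uncompiled sketch (the
helper name is made up; it relies only on the passthrough IRQs already
visible to libxl via b_info->irqs[] and on the reserved range constants
discussed for patch #23):

static int alloc_virtio_irq(const libxl_domain_build_info *b_info)
{
    /* Return the first SPI in the reserved virtio range which does not
     * clash with any passthrough IRQ from the "irq" property. */
    for (uint32_t spi = GUEST_VIRTIO_MMIO_SPI_FIRST;
         spi <= GUEST_VIRTIO_MMIO_SPI_LAST; spi++) {
        bool clash = false;
        int i;

        for (i = 0; i < b_info->num_irqs; i++) {
            if (b_info->irqs[i] == spi) {
                clash = true;
                break;
            }
        }
        if (!clash)
            return spi;
    }

    return ERROR_FAIL; /* no free SPI left in the reserved range */
}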


>
>
> [...]
>
>>>> -        nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1;
>>>> +        uint64_t virtio_base;
>>>> +        libxl_device_virtio_disk *virtio_disk;
>>>> +
>>>> +        virtio_base = GUEST_VIRTIO_MMIO_BASE;
>>>>           virtio_irq = GUEST_VIRTIO_MMIO_SPI;
>>>
>>> Looking at patch #23, you defined a single SPI and a region that can 
>>> only fit virtio device. However, here, you are going to define 
>>> multiple virtio devices.
>>>
>>> I think you want to define the following:
>>>
>>>  - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window
>>>  - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may 
>>> contain multiple devices)
>>>  - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio
>>>  - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio
>>>
>>> The per-device size doesn't need to be defined in arch-arm.h. 
>>>  Instead, I would only define it internally (unless we can use a 
>>> virtio.h header from Linux?).
>>
>> I think I got the idea. What are the preferences for these values?
>
> I have suggested some values in patch #23. Let me know what you think 
> there.

ok, thank you. I agree with the values.


>
>
> [...]
>
>>>> +
>>>> +        nr_spis += (virtio_irq - 32) + 1;
>>>>           virtio_enabled = true;
>>>>       }
>>>
>>> [...]
>>>
>>>> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
>>>> index 2a3364b..054a0c9 100644
>>>> --- a/tools/xl/xl_parse.c
>>>> +++ b/tools/xl/xl_parse.c
>>>> @@ -1204,6 +1204,120 @@ out:
>>>>       if (rc) exit(EXIT_FAILURE);
>>>>   }
>>>>   +#define MAX_VIRTIO_DISKS 4
>>>
>>> May I ask why this is hardcoded to 4?
>>
>> I found 4 as a reasonable value for the initial implementation.
>> This is how many disks a single device instance can handle.
>
> Right, the question is why do you need to impose a limit in xl?
>
> Looking at the code, the value is only used in:
>
> +        if (virtio_disk->num_disks > MAX_VIRTIO_DISKS) {
> +            fprintf(stderr, "vdisk: currently only %d disks are 
> supported",
> +                    MAX_VIRTIO_DISKS);
>
> The rest of the code (at least in libxl/xl) seems to be completely 
> agnostic to the number of disks. So it feels strange to me to impose 
> what looks like an arbitrary limit in the tools.

Well, will drop this limit here.


>
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko




* Re: [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration
  2021-02-10  9:02         ` Oleksandr
@ 2021-03-06 19:52           ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2021-03-06 19:52 UTC (permalink / raw)
  To: Oleksandr
  Cc: xen-devel, Oleksandr Tyshchenko, Ian Jackson, Wei Liu,
	Anthony PERARD, Stefano Stabellini

Hi Oleksandr,

On 10/02/2021 09:02, Oleksandr wrote:
> 
> On 20.01.21 19:05, Julien Grall wrote:
> Thinking a bit more what approach to choose...
> I would tend to automatically allocate the virtio IRQ (4th option) 
> rather than use static approach with reserved IRQs
> in order to eliminate the chance of a clash with a physical IRQs 
> completely from the very beginning. From other side
> we can indeed use static approach (as simpler one) for now and then 
> refine it when we have more understanding about the virtio usage.
> What do you think?

The static approach should be fine for now.

Cheers,

-- 
Julien Grall



end of thread

Thread overview: 144+ messages
2021-01-12 21:52 [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Oleksandr Tyshchenko
2021-01-12 21:52 ` [PATCH V4 01/24] x86/ioreq: Prepare IOREQ feature for making it common Oleksandr Tyshchenko
2021-01-15 15:16   ` Julien Grall
2021-01-15 16:41   ` Jan Beulich
2021-01-16  9:48     ` Oleksandr
2021-01-18  8:22   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 02/24] x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving Oleksandr Tyshchenko
2021-01-15 15:17   ` Julien Grall
2021-01-18  8:24   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 03/24] x86/ioreq: Provide out-of-line wrapper for the handle_mmio() Oleksandr Tyshchenko
2021-01-15 14:48   ` Alex Bennée
2021-01-15 15:19   ` Julien Grall
2021-01-18  8:29   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 04/24] xen/ioreq: Make x86's IOREQ feature common Oleksandr Tyshchenko
2021-01-15 14:55   ` Alex Bennée
2021-01-15 15:23   ` Julien Grall
2021-01-18  8:48   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 05/24] xen/ioreq: Make x86's hvm_ioreq_needs_completion() common Oleksandr Tyshchenko
2021-01-15 15:25   ` Julien Grall
2021-01-20  8:48   ` Alex Bennée
2021-01-20  9:31     ` Julien Grall
2021-01-12 21:52 ` [PATCH V4 06/24] xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common Oleksandr Tyshchenko
2021-01-15 15:34   ` Julien Grall
2021-01-20  8:57   ` Alex Bennée
2021-01-20 16:15   ` Jan Beulich
2021-01-20 20:47     ` Oleksandr
2021-01-12 21:52 ` [PATCH V4 07/24] xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common Oleksandr Tyshchenko
2021-01-15 15:36   ` Julien Grall
2021-01-18  8:59   ` Paul Durrant
2021-01-20  8:58   ` Alex Bennée
2021-01-12 21:52 ` [PATCH V4 08/24] xen/ioreq: Move x86's ioreq_server to struct domain Oleksandr Tyshchenko
2021-01-15 15:44   ` Julien Grall
2021-01-18  9:09   ` Paul Durrant
2021-01-20  9:00   ` Alex Bennée
2021-01-12 21:52 ` [PATCH V4 09/24] xen/ioreq: Make x86's IOREQ related dm-op handling common Oleksandr Tyshchenko
2021-01-18  9:17   ` Paul Durrant
2021-01-18 10:19     ` Oleksandr
2021-01-18 10:34       ` Paul Durrant
2021-01-20 16:21   ` Jan Beulich
2021-01-21 10:23     ` Oleksandr
2021-01-21 10:27       ` Jan Beulich
2021-01-21 11:13         ` Oleksandr
2021-01-12 21:52 ` [PATCH V4 10/24] xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu Oleksandr Tyshchenko
2021-01-15 19:34   ` Julien Grall
2021-01-18  9:35   ` Paul Durrant
2021-01-20 16:24   ` Jan Beulich
2021-01-12 21:52 ` [PATCH V4 11/24] xen/mm: Make x86's XENMEM_resource_ioreq_server handling common Oleksandr Tyshchenko
2021-01-14  3:58   ` Wei Chen
2021-01-14 15:31     ` Oleksandr
2021-01-15 14:35       ` Alex Bennée
2021-01-18 17:42         ` Oleksandr
2021-01-18  9:38   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 12/24] xen/ioreq: Remove "hvm" prefixes from involved function names Oleksandr Tyshchenko
2021-01-18  9:55   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 13/24] xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg() Oleksandr Tyshchenko
2021-01-15 19:37   ` Julien Grall
2021-01-17 11:32     ` Oleksandr
2021-01-18 10:00   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 14/24] arm/ioreq: Introduce arch specific bits for IOREQ/DM features Oleksandr Tyshchenko
2021-01-15  0:55   ` Stefano Stabellini
2021-01-17 12:45     ` Oleksandr
2021-01-20  0:23       ` Stefano Stabellini
2021-01-21  9:51         ` Oleksandr
2021-01-15 20:26   ` Julien Grall
2021-01-17 17:11     ` Oleksandr
2021-01-17 18:07       ` Julien Grall
2021-01-17 18:52         ` Oleksandr
2021-01-18 19:17           ` Julien Grall
2021-01-19 15:20             ` Oleksandr
2021-01-20  0:50               ` Stefano Stabellini
2021-01-20 15:57                 ` Julien Grall
2021-01-20 19:47                   ` Stefano Stabellini
2021-01-21  9:31                     ` Oleksandr
2021-01-21 21:34                       ` Stefano Stabellini
2021-01-20 15:50           ` Julien Grall
2021-01-21  8:50             ` Oleksandr
2021-01-27 10:24               ` Jan Beulich
2021-01-27 12:22                 ` Oleksandr
2021-01-27 12:52                   ` Jan Beulich
2021-01-18 10:44       ` Jan Beulich
2021-01-18 15:52         ` Oleksandr
2021-01-18 16:00           ` Jan Beulich
2021-01-18 16:29             ` Oleksandr
2021-01-12 21:52 ` [PATCH V4 15/24] xen/arm: Stick around in leave_hypervisor_to_guest until I/O has completed Oleksandr Tyshchenko
2021-01-15  1:12   ` Stefano Stabellini
2021-01-15 20:55   ` Julien Grall
2021-01-17 20:23     ` Oleksandr
2021-01-18 10:57       ` Julien Grall
2021-01-18 13:23         ` Oleksandr
2021-01-12 21:52 ` [PATCH V4 16/24] xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm Oleksandr Tyshchenko
2021-01-15  1:19   ` Stefano Stabellini
2021-01-15 20:59   ` Julien Grall
2021-01-21 13:57   ` Jan Beulich
2021-01-21 18:42     ` Oleksandr
2021-01-12 21:52 ` [PATCH V4 17/24] xen/ioreq: Introduce domain_has_ioreq_server() Oleksandr Tyshchenko
2021-01-15  1:24   ` Stefano Stabellini
2021-01-18 10:23   ` Paul Durrant
2021-01-12 21:52 ` [PATCH V4 18/24] xen/dm: Introduce xendevicemodel_set_irq_level DM op Oleksandr Tyshchenko
2021-01-15  1:32   ` Stefano Stabellini
2021-01-12 21:52 ` [PATCH V4 19/24] xen/arm: io: Abstract sign-extension Oleksandr Tyshchenko
2021-01-15  1:35   ` Stefano Stabellini
2021-01-12 21:52 ` [PATCH V4 20/24] xen/arm: io: Harden sign extension check Oleksandr Tyshchenko
2021-01-15  1:48   ` Stefano Stabellini
2021-01-22 10:15   ` Volodymyr Babchuk
2021-01-12 21:52 ` [PATCH V4 21/24] xen/ioreq: Make x86's send_invalidate_req() common Oleksandr Tyshchenko
2021-01-18 10:31   ` Paul Durrant
2021-01-21 14:02     ` Jan Beulich
2021-01-12 21:52 ` [PATCH V4 22/24] xen/arm: Add mapcache invalidation handling Oleksandr Tyshchenko
2021-01-15  2:11   ` Stefano Stabellini
2021-01-21 19:47     ` Oleksandr
2021-01-12 21:52 ` [PATCH V4 23/24] libxl: Introduce basic virtio-mmio support on Arm Oleksandr Tyshchenko
2021-01-15 21:30   ` Julien Grall
2021-01-17 22:22     ` Oleksandr
2021-01-20 16:40       ` Julien Grall
2021-01-20 20:35         ` Stefano Stabellini
2021-02-09 21:04         ` Oleksandr
2021-01-12 21:52 ` [PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration Oleksandr Tyshchenko
2021-01-14 17:20   ` Ian Jackson
2021-01-16  9:05     ` Oleksandr
2021-01-15 22:01   ` Julien Grall
2021-01-18  8:32     ` Oleksandr
2021-01-20 17:05       ` Julien Grall
2021-02-10  9:02         ` Oleksandr
2021-03-06 19:52           ` Julien Grall
2021-01-14  3:55 ` [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Wei Chen
2021-01-14 15:23   ` Oleksandr
2021-01-07 14:35     ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
2021-01-07 15:45       ` Oleksandr
2021-01-14 16:11         ` [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm Ian Jackson
2021-01-14 18:41           ` Oleksandr
2021-01-14 16:06       ` [ANNOUNCE] Xen 4.15 release schedule and feature tracking Ian Jackson
2021-01-14 19:02         ` Andrew Cooper
2021-01-15  9:57           ` Jan Beulich
2021-01-15 10:00             ` Julien Grall
2021-01-15 10:52             ` Andrew Cooper
2021-01-15 10:59               ` Andrew Cooper
2021-01-15 11:08                 ` Jan Beulich
2021-01-15 10:43           ` Bertrand Marquis
2021-01-15 15:14           ` Lengyel, Tamas
2021-01-28 22:55             ` Dario Faggioli
2021-01-28 18:26           ` Dario Faggioli
2021-01-28 22:15             ` Dario Faggioli
2021-01-29  8:38             ` Jan Beulich
2021-01-29  9:22               ` Dario Faggioli
