All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain
@ 2019-09-14 15:37 Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-14 15:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Suravee Suthikulpanit, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Marek Marczykowski-Górecki, Tim Deegan, Julien Grall,
	Jan Beulich, Anthony PERARD, Daniel De Graaf, Brian Woods,
	Roger Pau Monné

In this version, I add PHYSDEVOP_interrupt_control to allow stubdomain
enabling MSI after mapping it, and also disabling INTx beforehand. Actual
hypercall refuse to enable both of them.

Related article:
https://www.qubes-os.org/news/2017/10/18/msi-support/

Changes in v2:
 - new "xen/x86: Allow stubdom access to irq created for msi" patch
 - applied review comments from v1
Changes is v3:
 - apply suggestions by Roger
 - add PHYSDEVOP_msi_msix_set_enable
Changes in v4:
 - implement suggestions by Wei, Roger, Jan
 - plug new physdevop into XSM
Changes in v5:
 - rebase on master
 - rename to PHYSDEVOP_msi_control
 - move granting access to IRQ into create_irq
Changes in v6:
 - simplify granting IRQ access, record dm domid for cleanup
 - rename to PHYSDEVOP_interrupt_control
 - include INTx control in the hypercall

---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>

Marek Marczykowski-Górecki (6):
  libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use
  libxl: attach PCI device to qemu only after setting pciback/pcifront
  libxl: don't try to manipulate json config for stubdomain
  xen/x86: Allow stubdom access to irq created for msi.
  xen/x86: add PHYSDEVOP_interrupt_control
  tools/libxc: add wrapper for PHYSDEVOP_interrupt_control

 tools/libxc/include/xenctrl.h            |  6 ++-
 tools/libxc/xc_physdev.c                 | 15 ++++++-
 tools/libxl/libxl_pci.c                  | 63 +++++++++++++++++--------
 xen/arch/x86/hpet.c                      |  3 +-
 xen/arch/x86/irq.c                       | 51 ++++++++++++++------
 xen/arch/x86/msi.c                       | 45 ++++++++++++++++++-
 xen/arch/x86/physdev.c                   | 53 +++++++++++++++++++++-
 xen/arch/x86/x86_64/physdev.c            |  4 ++-
 xen/drivers/char/ns16550.c               |  2 +-
 xen/drivers/passthrough/amd/iommu_init.c |  2 +-
 xen/drivers/passthrough/vtd/iommu.c      |  3 +-
 xen/include/asm-x86/irq.h                |  7 ++-
 xen/include/asm-x86/msi.h                |  2 +-
 xen/include/public/physdev.h             | 23 +++++++++-
 xen/include/xlat.lst                     |  1 +-
 xen/include/xsm/dummy.h                  |  7 +++-
 xen/include/xsm/xsm.h                    |  6 ++-
 xen/xsm/dummy.c                          |  1 +-
 xen/xsm/flask/hooks.c                    | 24 ++++++++++-
 xen/xsm/flask/policy/access_vectors      |  1 +-
 20 files changed, 281 insertions(+), 38 deletions(-)

base-commit: 6c9639a72f0ca3a9430ef75f375877182281fdef
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Xen-devel] [PATCH v6 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use
  2019-09-14 15:37 [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
@ 2019-09-14 15:37 ` Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront Marek Marczykowski-Górecki
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-14 15:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Anthony PERARD, Ian Jackson, Wei Liu,
	Marek Marczykowski-Górecki, Wei Liu

HVM domains use IOMMU and device model assistance for communicating with
PCI devices, xen-pcifront/pciback isn't directly needed by HVM domain.
But pciback serve also second function - it reset the device when it is
deassigned from the guest and for this reason pciback needs to be used
with HVM domain too.
When HVM domain has device model in stubdomain, attaching xen-pciback to
the target domain itself may prevent attaching xen-pciback to the
(PV) stubdomain, effectively breaking PCI passthrough.

Fix this by attaching pciback only to one domain: if PV stubdomain is in
use, let it be stubdomain (the commit prevents attaching device to target
HVM in this case); otherwise, attach it to the target domain.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes in v2:
 - previously called "libxl: attach xen-pciback only to PV domains"
 - instead of excluding all HVMs, change the condition to what actually
   matters here - check if stubdomain is in use; this way xen-pciback is
   always in use (either for the target domain, or it's stubdomain),
   fixing PCI reset by xen-pciback concerns
Changes in v3:
 - adjust commit message
---
 tools/libxl/libxl_pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 03beb86..2e06a45 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -1106,7 +1106,7 @@ out:
         }
     }
 
-    if (!starting)
+    if (!starting && !libxl_get_stubdom_id(CTX, domid))
         rc = libxl__device_pci_add_xenstore(gc, domid, pcidev, starting);
     else
         rc = 0;
@@ -1302,7 +1302,7 @@ static void libxl__add_pcidevs(libxl__egc *egc, libxl__ao *ao, uint32_t domid,
         }
     }
 
-    if (d_config->num_pcidevs > 0) {
+    if (d_config->num_pcidevs > 0 && !libxl_get_stubdom_id(CTX, domid)) {
         rc = libxl__create_pci_backend(gc, domid, d_config->pcidevs,
             d_config->num_pcidevs);
         if (rc < 0) {
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Xen-devel] [PATCH v6 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront
  2019-09-14 15:37 [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
@ 2019-09-14 15:37 ` Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 3/6] libxl: don't try to manipulate json config for stubdomain Marek Marczykowski-Górecki
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-14 15:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Anthony PERARD, Ian Jackson, Wei Liu,
	Marek Marczykowski-Górecki, Wei Liu

When qemu is running in stubdomain, handling "pci-ins" command will fail
if pcifront is not initialized already. Fix this by sending such command
only after confirming that pciback/front is running.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes in v2:
- Fixed code style since previous version.
---
 tools/libxl/libxl_pci.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 2e06a45..578535f 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -1191,6 +1191,7 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
 {
     libxl_ctx *ctx = libxl__gc_owner(gc);
     unsigned int orig_vdev, pfunc_mask;
+    char *be_path;
     libxl_device_pci *assigned;
     int num_assigned, i, rc;
     int stubdomid = 0;
@@ -1245,6 +1246,14 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
         rc = do_pci_add(gc, stubdomid, &pcidev_s, 0);
         if ( rc )
             goto out;
+        /* Wait for the device actually being connected, otherwise device model
+         * running there will fail to find the device. */
+        be_path = libxl__sprintf(gc, "%s/backend/pci/%d/0",
+                                 libxl__xs_get_dompath(gc, 0), stubdomid);
+        rc = libxl__wait_for_backend(gc, be_path,
+                                     GCSPRINTF("%d", XenbusStateConnected));
+        if (rc)
+            goto out;
     }
 
     orig_vdev = pcidev->vdevfn & ~7U;
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Xen-devel] [PATCH v6 3/6] libxl: don't try to manipulate json config for stubdomain
  2019-09-14 15:37 [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront Marek Marczykowski-Górecki
@ 2019-09-14 15:37 ` Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-14 15:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Anthony PERARD, Ian Jackson, Wei Liu,
	Marek Marczykowski-Górecki, Wei Liu

Stubdomain do not have it's own config file - its configuration is
derived from target domains. Do not try to manipulate it when attaching
PCI device.

This bug prevented starting HVM with stubdomain and PCI passthrough
device attached.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes in v3:
 - skip libxl__dm_check_start too, as stubdomain is guaranteed to be
   running at this stage already
 - do not init d_config at all, as it is used only for json manipulation
Changes in v4:
 - adjust comment style
---
 tools/libxl/libxl_pci.c | 50 ++++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 578535f..d26fc9a 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -120,10 +120,14 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc, uint32_t domid, libxl_d
     libxl_domain_config d_config;
     libxl_device_pci pcidev_saved;
     libxl__domain_userdata_lock *lock = NULL;
+    bool is_stubdomain = libxl_is_stubdom(CTX, domid, NULL);
 
-    libxl_domain_config_init(&d_config);
-    libxl_device_pci_init(&pcidev_saved);
-    libxl_device_pci_copy(CTX, &pcidev_saved, pcidev);
+    /* Stubdomain doesn't have own config. */
+    if (!is_stubdomain) {
+        libxl_domain_config_init(&d_config);
+        libxl_device_pci_init(&pcidev_saved);
+        libxl_device_pci_copy(CTX, &pcidev_saved, pcidev);
+    }
 
     be_path = libxl__domain_device_backend_path(gc, 0, domid, 0,
                                                 LIBXL__DEVICE_KIND_PCI);
@@ -152,27 +156,35 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc, uint32_t domid, libxl_d
     GCNEW(device);
     libxl__device_from_pcidev(gc, domid, pcidev, device);
 
-    lock = libxl__lock_domain_userdata(gc, domid);
-    if (!lock) {
-        rc = ERROR_LOCK_FAIL;
-        goto out;
-    }
+    /* 
+     * Stubdomin config is derived from its target domain, it doesn't have
+     * its own file.
+     */
+    if (!is_stubdomain) {
+        lock = libxl__lock_domain_userdata(gc, domid);
+        if (!lock) {
+            rc = ERROR_LOCK_FAIL;
+            goto out;
+        }
 
-    rc = libxl__get_domain_configuration(gc, domid, &d_config);
-    if (rc) goto out;
+        rc = libxl__get_domain_configuration(gc, domid, &d_config);
+        if (rc) goto out;
 
-    device_add_domain_config(gc, &d_config, &libxl__pcidev_devtype,
-                             &pcidev_saved);
+        device_add_domain_config(gc, &d_config, &libxl__pcidev_devtype,
+                                 &pcidev_saved);
 
-    rc = libxl__dm_check_start(gc, &d_config, domid);
-    if (rc) goto out;
+        rc = libxl__dm_check_start(gc, &d_config, domid);
+        if (rc) goto out;
+    }
 
     for (;;) {
         rc = libxl__xs_transaction_start(gc, &t);
         if (rc) goto out;
 
-        rc = libxl__set_domain_configuration(gc, domid, &d_config);
-        if (rc) goto out;
+        if (lock) {
+            rc = libxl__set_domain_configuration(gc, domid, &d_config);
+            if (rc) goto out;
+        }
 
         libxl__xs_writev(gc, t, be_path, libxl__xs_kvs_of_flexarray(gc, back));
 
@@ -184,8 +196,10 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc, uint32_t domid, libxl_d
 out:
     libxl__xs_transaction_abort(gc, &t);
     if (lock) libxl__unlock_domain_userdata(lock);
-    libxl_device_pci_dispose(&pcidev_saved);
-    libxl_domain_config_dispose(&d_config);
+    if (!is_stubdomain) {
+        libxl_device_pci_dispose(&pcidev_saved);
+        libxl_domain_config_dispose(&d_config);
+    }
     return rc;
 }
 
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Xen-devel] [PATCH v6 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-09-14 15:37 [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
                   ` (2 preceding siblings ...)
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 3/6] libxl: don't try to manipulate json config for stubdomain Marek Marczykowski-Górecki
@ 2019-09-14 15:37 ` Marek Marczykowski-Górecki
  2019-09-20  9:23   ` Jan Beulich
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 6/6] tools/libxc: add wrapper for PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
  5 siblings, 1 reply; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-14 15:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Suravee Suthikulpanit, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Marek Marczykowski-Górecki, Tim Deegan, Simon Gaiser,
	Julien Grall, Jan Beulich, Brian Woods, Roger Pau Monné

Stubdomains need to be given sufficient privilege over the guest which it
provides emulation for in order for PCI passthrough to work correctly.
When a HVM domain try to enable MSI, QEMU in stubdomain calls
PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
part of xc_domain_update_msi_irq. Allow for that as part of
PHYSDEVOP_map_pirq.

This is not needed for PCI INTx, because IRQ in that case is known
beforehand and the stubdomain is given permissions over this IRQ by
libxl__device_pci_add (there's a do_pci_add against the stubdomain).

create_irq() already grant IRQ access to hardware_domain, with
assumption the device model (something managing this IRQ) lives there.
Modify create_irq() to take additional parameter pointing at device
model domain - which may be dom0 or stubdomain.  Save ID of the domain
given permission, to revoke it in destroy_irq() - easier and cleaner
than replaying logic of create_irq() parameter. Use domid instead of
actual reference to the domain, because it might get destroyed before
destroying IRQ (stubdomain is destroyed before its target domain). And
it is not an issue, because IRQ permissions live within domain
structure, so destroying a domain also implicitly revoke the permission.
Potential domid reuse is detected by by checking if that domain does
have permission over the IRQ being destroyed.

Then, adjust all callers to provide the parameter. In case of calls not
related to stubdomain-initiated allocations, give it either
hardware_domain (so the behavior is unchanged there), or NULL for
interrupts used by Xen internally.

Inspired by https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.

Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - extend commit message
Changes in v4:
 - add missing destroy_irq on error path
Changes in v5:
 - move irq_{grant,revoke}_access() to {create,destroy}_irq(), which
   basically make it a different patch
 - add get_dm_domain() helper
 - do not give hardware_domain permission over IRQs used in Xen
   internally
 - rename create_irq argument to just 'd', to avoid confusion
   when it's called by hardware domain
 - verify that device is de-assigned before pci_remove_device call
 - save ID of domain given permission in create_irq(), to revoke it in
 destroy_irq()
 - drop domain parameter from destroy_irq() and msi_free_irq()
 - do not give hardware domain permission over IRQ created in
 iommu_set_interrupt()
Changes in v6:
 - do not give permission over hpet irq to hardware_domain
 - move creator_domid to arch_irq_desc
 - fix creator_domid initialization
 - always give current->domain permission instead of using
 get_dm_domain() helper. Analysis of all its use cases tells that it is
 the only value it returns.
 - drop unrelated change
---
 xen/arch/x86/hpet.c                      |  3 +-
 xen/arch/x86/irq.c                       | 51 ++++++++++++++++++-------
 xen/drivers/char/ns16550.c               |  2 +-
 xen/drivers/passthrough/amd/iommu_init.c |  2 +-
 xen/drivers/passthrough/vtd/iommu.c      |  3 +-
 xen/include/asm-x86/irq.h                |  7 ++-
 6 files changed, 50 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 4b08488..5ed4405 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -11,6 +11,7 @@
 #include <xen/softirq.h>
 #include <xen/irq.h>
 #include <xen/numa.h>
+#include <xen/sched.h>
 #include <asm/fixmap.h>
 #include <asm/div64.h>
 #include <asm/hpet.h>
@@ -368,7 +369,7 @@ static int __init hpet_assign_irq(struct hpet_event_channel *ch)
 {
     int irq;
 
-    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
+    if ( (irq = create_irq(NUMA_NO_NODE, NULL)) < 0 )
         return irq;
 
     ch->msi.irq = irq;
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 0ee3346..0b4c20a 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -254,7 +254,13 @@ void __init clear_irq_vector(int irq)
 /*
  * Dynamic irq allocate and deallocation for MSI
  */
-int create_irq(nodeid_t node)
+
+/*
+ * create_irq - allocate irq for MSI
+ * @d domain that will get permission over the allocated irq; this permission
+ * will automatically be revoked on destroy_irq
+ */
+int create_irq(nodeid_t node, struct domain *d)
 {
     int irq, ret;
     struct irq_desc *desc;
@@ -282,23 +288,30 @@ int create_irq(nodeid_t node)
         }
         ret = assign_irq_vector(irq, mask);
     }
+    ASSERT(desc->arch.creator_domid == DOMID_INVALID);
     if (ret < 0)
     {
         desc->arch.used = IRQ_UNUSED;
         irq = ret;
     }
-    else if ( hardware_domain )
+    else if ( d )
     {
-        ret = irq_permit_access(hardware_domain, irq);
+        ASSERT(d == current->domain);
+        ret = irq_permit_access(d, irq);
         if ( ret )
             printk(XENLOG_G_ERR
-                   "Could not grant Dom0 access to IRQ%d (error %d)\n",
-                   irq, ret);
+                   "Could not grant Dom%u access to IRQ%d (error %d)\n",
+                   d->domain_id, irq, ret);
+        else
+            desc->arch.creator_domid = d->domain_id;
     }
 
     return irq;
 }
 
+/*
+ * destroy_irq - deallocate irq for MSI
+ */
 void destroy_irq(unsigned int irq)
 {
     struct irq_desc *desc = irq_to_desc(irq);
@@ -307,14 +320,25 @@ void destroy_irq(unsigned int irq)
 
     BUG_ON(!MSI_IRQ(irq));
 
-    if ( hardware_domain )
+    if ( desc->arch.creator_domid != DOMID_INVALID )
     {
-        int err = irq_deny_access(hardware_domain, irq);
+        struct domain *d = get_domain_by_id(desc->arch.creator_domid);
 
-        if ( err )
-            printk(XENLOG_G_ERR
-                   "Could not revoke Dom0 access to IRQ%u (error %d)\n",
-                   irq, err);
+        if ( d && irq_access_permitted(d, irq) )
+        {
+            int err;
+
+            err = irq_deny_access(d, irq);
+            if ( err )
+                printk(XENLOG_G_ERR
+                       "Could not revoke Dom%u access to IRQ%u (error %d)\n",
+                       d->domain_id, irq, err);
+        }
+
+        if ( d )
+            put_domain(d);
+
+        desc->arch.creator_domid = DOMID_INVALID;
     }
 
     spin_lock_irqsave(&desc->lock, flags);
@@ -381,6 +405,7 @@ int arch_init_one_irq_desc(struct irq_desc *desc)
 
     desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
     desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
+    desc->arch.creator_domid = DOMID_INVALID;
 
     return 0;
 }
@@ -2133,7 +2158,7 @@ int map_domain_pirq(
             spin_unlock_irqrestore(&desc->lock, flags);
 
             info = NULL;
-            irq = create_irq(NUMA_NO_NODE);
+            irq = create_irq(NUMA_NO_NODE, current->domain);
             ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, &info)
                            : irq;
             if ( ret < 0 )
@@ -2818,7 +2843,7 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
         if ( irq == -1 )
         {
     case MAP_PIRQ_TYPE_MULTI_MSI:
-            irq = create_irq(NUMA_NO_NODE);
+            irq = create_irq(NUMA_NO_NODE, current->domain);
         }
 
         if ( irq < nr_irqs_gsi || irq >= nr_irqs )
diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index 8667de6..66cc680 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -722,7 +722,7 @@ static void __init ns16550_init_irq(struct serial_port *port)
     struct ns16550 *uart = port->uart;
 
     if ( uart->msi )
-        uart->irq = create_irq(0);
+        uart->irq = create_irq(0, NULL);
 #endif
 }
 
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index bb9f33e..9af4b7c 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -765,7 +765,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
 {
     int irq, ret;
 
-    irq = create_irq(NUMA_NO_NODE);
+    irq = create_irq(NUMA_NO_NODE, NULL);
     if ( irq <= 0 )
     {
         dprintk(XENLOG_ERR, "IOMMU: no irqs\n");
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 5d72270..7440bac 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1138,7 +1138,8 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
     struct irq_desc *desc;
 
     irq = create_irq(rhsa ? pxm_to_node(rhsa->proximity_domain)
-                          : NUMA_NO_NODE);
+                          : NUMA_NO_NODE,
+                     NULL);
     if ( irq <= 0 )
     {
         dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: no irq available!\n");
diff --git a/xen/include/asm-x86/irq.h b/xen/include/asm-x86/irq.h
index bc0c0c1..7cf8a1b 100644
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -45,6 +45,11 @@ struct arch_irq_desc {
         unsigned move_cleanup_count;
         u8 move_in_progress : 1;
         s8 used;
+        /*
+         * weak reference to domain having permission over this IRQ (which can
+         * be different from the domain actually havint the IRQ assigned)
+         */
+        domid_t creator_domid;
 };
 
 /* For use with irq_desc.arch.used */
@@ -161,7 +166,7 @@ int  init_irq_data(void);
 void clear_irq_vector(int irq);
 
 int irq_to_vector(int irq);
-int create_irq(nodeid_t node);
+int create_irq(nodeid_t node, struct domain *d);
 void destroy_irq(unsigned int irq);
 int assign_irq_vector(int irq, const cpumask_t *);
 
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-14 15:37 [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
                   ` (3 preceding siblings ...)
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
@ 2019-09-14 15:37 ` Marek Marczykowski-Górecki
  2019-09-20 10:10   ` Jan Beulich
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 6/6] tools/libxc: add wrapper for PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
  5 siblings, 1 reply; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-14 15:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson,
	Marek Marczykowski-Górecki, Tim Deegan, Julien Grall,
	Jan Beulich, Daniel De Graaf, Roger Pau Monné

Allow device model running in stubdomain to enable/disable INTx/MSI(-X),
bypassing pciback. While pciback is still used to access config space
from within stubdomain, it refuse to write to
PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE/PCI_COMMAND_INTX_DISABLE
in non-permissive mode. Which is the right thing to do for PV domain
(the main use case for pciback), as PV domain should use XEN_PCI_OP_*
commands for that. Unfortunately those commands are not good for
stubdomain use, as they configure MSI in dom0's kernel too, which should
not happen for HVM domain.

This new physdevop is allowed only for stubdomain controlling the domain
which own the device.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - new patch
Changes in v4:
 - adjust code style
 - s/msi_msix/msi/
 - add msi_set_enable XSM hook
 - flatten struct physdev_msi_set_enable
 - add to include/xlat.lst
Changes in v5:
 - rename to PHYSDEVOP_msi_control
 - combine "mode" and "enable" into "flags"
 - refuse to enable both MSI and MSI-X, and also to enable MSI(-X) on
   incapable device
 - disable/enable INTx when enabling/disabling MSI (?)
 - refuse if !use_msi
 - adjust flask hook to make more sense (require "setup" access on
   device, not on domain)
 - rebase on master
Changes in v6:
 - rename to PHYSDEVOP_interrupt_control
 - extend with INTx control
 - Ensure than MSI(-X) can't be enabled together with INTx and the other MSI(-X).
 - deduplicate code in msi_control
 - explicitly refuse to operate on hidden devices
 - expand flags to uint16_t to avoid implicit padding

I'm not sure if XSM part is correct, compile-tested only, as I'm not
sure how to set the policy.
---
 xen/arch/x86/msi.c                  | 45 +++++++++++++++++++++++++-
 xen/arch/x86/physdev.c              | 53 ++++++++++++++++++++++++++++++-
 xen/arch/x86/x86_64/physdev.c       |  4 ++-
 xen/include/asm-x86/msi.h           |  2 +-
 xen/include/public/physdev.h        | 23 +++++++++++++-
 xen/include/xlat.lst                |  1 +-
 xen/include/xsm/dummy.h             |  7 ++++-
 xen/include/xsm/xsm.h               |  6 +++-
 xen/xsm/dummy.c                     |  1 +-
 xen/xsm/flask/hooks.c               | 24 ++++++++++++++-
 xen/xsm/flask/policy/access_vectors |  1 +-
 11 files changed, 167 insertions(+)

diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index d630600..ecea91a 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1443,6 +1443,51 @@ int pci_restore_msi_state(struct pci_dev *pdev)
     return 0;
 }
 
+int msi_control(struct pci_dev *pdev, bool msix, bool enable)
+{
+    unsigned int cap = msix ? PCI_CAP_ID_MSIX : PCI_CAP_ID_MSI;
+    unsigned int other_cap = msix ? PCI_CAP_ID_MSI : PCI_CAP_ID_MSIX;
+    uint16_t cmd;
+
+    if ( !use_msi )
+        return -EOPNOTSUPP;
+
+    if ( !pci_find_cap_offset(pdev->seg,
+                              pdev->bus,
+                              PCI_SLOT(pdev->devfn),
+                              PCI_FUNC(pdev->devfn),
+                              cap) )
+        return -ENODEV;
+
+    cmd = pci_conf_read16(pdev->sbdf, PCI_COMMAND);
+
+    /* don't allow enabling MSI(-X) and INTx at the same time */
+    if ( enable && ! (cmd & PCI_COMMAND_INTX_DISABLE) )
+        return -EBUSY;
+
+    /* don't allow enabling both MSI and MSI-X at the same time */
+    if ( enable && find_msi_entry(pdev, -1, other_cap) )
+        return -EBUSY;
+
+    if ( msix )
+        msix_set_enable(pdev, enable);
+    else
+        msi_set_enable(pdev, enable);
+
+    return 0;
+}
+
+int intx_control(struct pci_dev *pdev, bool enable)
+{
+    /* don't allow enabling INTx if MSI(-X) is already enabled */
+    if ( enable && find_msi_entry(pdev, -1, PCI_CAP_ID_MSI) )
+        return -EBUSY;
+    if ( enable && find_msi_entry(pdev, -1, PCI_CAP_ID_MSIX) )
+        return -EBUSY;
+    pci_intx(pdev, enable);
+    return 0;
+}
+
 void __init early_msi_init(void)
 {
     if ( use_msi < 0 )
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 3a3c158..7b71039 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -662,6 +662,59 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case PHYSDEVOP_interrupt_control: {
+        struct physdev_interrupt_control op;
+        struct pci_dev *pdev;
+        int intr_type;
+        bool enable;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&op, arg, 1) )
+            break;
+
+        ret = -EINVAL;
+        if ( op.flags & ~(PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK |
+                          PHYSDEVOP_INTERRUPT_CONTROL_ENABLE) )
+            break;
+
+        intr_type = op.flags & PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK;
+        enable = op.flags & PHYSDEVOP_INTERRUPT_CONTROL_ENABLE;
+
+        pcidevs_lock();
+        pdev = pci_get_pdev(op.seg, op.bus, op.devfn);
+        ret = -ENODEV;
+        /* explicitly exclude hidden devices */
+        if ( !pdev || pdev->domain == dom_xen )
+            goto pci_unlock;
+
+        ret = xsm_interrupt_control(XSM_DM_PRIV,
+                                    pdev->domain,
+                                    pdev->sbdf.sbdf,
+                                    intr_type,
+                                    enable);
+        if ( ret )
+            goto pci_unlock;
+
+        switch ( intr_type )
+        {
+            case PHYSDEVOP_INTERRUPT_CONTROL_INTX:
+                ret = intx_control(pdev, enable);
+                break;
+            case PHYSDEVOP_INTERRUPT_CONTROL_MSI:
+                ret = msi_control(pdev, false, enable);
+                break;
+            case PHYSDEVOP_INTERRUPT_CONTROL_MSIX:
+                ret = msi_control(pdev, true, enable);
+                break;
+            default:
+                ret = -EINVAL;
+                break;
+        }
+pci_unlock:
+        pcidevs_unlock();
+        break;
+    }
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/arch/x86/x86_64/physdev.c b/xen/arch/x86/x86_64/physdev.c
index c5a00ea..6e0e488 100644
--- a/xen/arch/x86/x86_64/physdev.c
+++ b/xen/arch/x86/x86_64/physdev.c
@@ -76,6 +76,10 @@ CHECK_physdev_pci_device_add
 CHECK_physdev_pci_device
 #undef xen_physdev_pci_device
 
+#define xen_physdev_interrupt_control physdev_interrupt_control
+CHECK_physdev_interrupt_control
+#undef xen_physdev_interrupt_control
+
 #define COMPAT
 #undef guest_handle_okay
 #define guest_handle_okay          compat_handle_okay
diff --git a/xen/include/asm-x86/msi.h b/xen/include/asm-x86/msi.h
index 10387dc..4c13e6b 100644
--- a/xen/include/asm-x86/msi.h
+++ b/xen/include/asm-x86/msi.h
@@ -252,5 +252,7 @@ void guest_mask_msi_irq(struct irq_desc *, bool mask);
 void ack_nonmaskable_msi_irq(struct irq_desc *);
 void end_nonmaskable_msi_irq(struct irq_desc *, u8 vector);
 void set_msi_affinity(struct irq_desc *, const cpumask_t *);
+int msi_control(struct pci_dev *pdev, bool msix, bool enable);
+int intx_control(struct pci_dev *pdev, bool enable);
 
 #endif /* __ASM_MSI_H */
diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
index b6faf83..689c11e 100644
--- a/xen/include/public/physdev.h
+++ b/xen/include/public/physdev.h
@@ -345,6 +345,29 @@ typedef struct physdev_dbgp_op physdev_dbgp_op_t;
 DEFINE_XEN_GUEST_HANDLE(physdev_dbgp_op_t);
 
 /*
+ * Choose which interrupt type to control. If neither MSI nor MSI-X is chosen,
+ * will apply to INTx - for convenience define PHYSDEVOP_INTERRUPT_CONTROL_INTX
+ * and PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK
+ */
+#define PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK 3
+#define PHYSDEVOP_INTERRUPT_CONTROL_INTX      0
+#define PHYSDEVOP_INTERRUPT_CONTROL_MSI       1
+#define PHYSDEVOP_INTERRUPT_CONTROL_MSIX      2
+/* when PHYSDEVOP_INTERRUPT_CONTROL_ENABLE not set, disable */
+#define PHYSDEVOP_INTERRUPT_CONTROL_ENABLE    4
+
+#define PHYSDEVOP_interrupt_control   32
+struct physdev_interrupt_control {
+    /* IN */
+    uint16_t seg;
+    uint8_t bus;
+    uint8_t devfn;
+    uint16_t flags;
+};
+typedef struct physdev_interrupt_control physdev_interrupt_control_t;
+DEFINE_XEN_GUEST_HANDLE(physdev_interrupt_control_t);
+
+/*
  * Notify that some PIRQ-bound event channels have been unmasked.
  * ** This command is obsolete since interface version 0x00030202 and is **
  * ** unsupported by newer versions of Xen.                              **
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 95f5e55..18af663 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -104,6 +104,7 @@
 !	vnuma_topology_info		memory.h
 ?	physdev_eoi			physdev.h
 ?	physdev_get_free_pirq		physdev.h
+?	physdev_interrupt_control	physdev.h
 ?	physdev_irq			physdev.h
 ?	physdev_irq_status_query	physdev.h
 ?	physdev_manage_pci		physdev.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index ef52bb1..5a758c5 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -514,6 +514,13 @@ static XSM_INLINE int xsm_pci_config_permission(XSM_DEFAULT_ARG struct domain *d
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_interrupt_control(XSM_DEFAULT_ARG struct domain *d, uint32_t machine_bdf,
+                                            uint8_t intr_type, uint8_t enable)
+{
+    XSM_ASSERT_ACTION(XSM_DM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_add_to_physmap(XSM_DEFAULT_ARG struct domain *d1, struct domain *d2)
 {
     XSM_ASSERT_ACTION(XSM_TARGET);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index e22d616..f080189 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -106,6 +106,7 @@ struct xsm_operations {
     int (*iomem_permission) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
     int (*iomem_mapping) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
     int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t end, uint8_t access);
+    int (*interrupt_control) (struct domain *d, uint32_t machine_bdf, uint8_t intr_type, uint8_t enable);
 
 #if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
     int (*get_device_group) (uint32_t machine_bdf);
@@ -464,6 +465,11 @@ static inline int xsm_pci_config_permission (xsm_default_t def, struct domain *d
     return xsm_ops->pci_config_permission(d, machine_bdf, start, end, access);
 }
 
+static inline int xsm_interrupt_control (xsm_default_t def, struct domain *d, uint32_t machine_bdf, uint8_t msix, uint8_t enable)
+{
+    return xsm_ops->interrupt_control(d, machine_bdf, msix, enable);
+}
+
 #if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
 static inline int xsm_get_device_group(xsm_default_t def, uint32_t machine_bdf)
 {
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 5705e52..3080ae7 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -81,6 +81,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, iomem_permission);
     set_to_dummy_if_null(ops, iomem_mapping);
     set_to_dummy_if_null(ops, pci_config_permission);
+    set_to_dummy_if_null(ops, interrupt_control);
     set_to_dummy_if_null(ops, get_vnumainfo);
 
 #if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 791c1f6..ee2fc52 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1083,6 +1083,29 @@ static int flask_pci_config_permission(struct domain *d, uint32_t machine_bdf, u
 
 }
 
+static int flask_interrupt_control(struct domain *d, uint32_t machine_bdf, uint8_t type, uint8_t enable)
+{
+    uint32_t dsid, rsid;
+    int rc = -EPERM;
+    struct avc_audit_data ad;
+    uint32_t perm;
+
+    AVC_AUDIT_DATA_INIT(&ad, DEV);
+    ad.device = machine_bdf;
+
+    rc = security_device_sid(machine_bdf, &rsid);
+    if ( rc )
+        return rc;
+
+    rc = avc_current_has_perm(rsid, SECCLASS_RESOURCE, RESOURCE__SETUP, &ad);
+    if ( rc )
+        return rc;
+
+    perm = flask_iommu_resource_use_perm();
+    dsid = domain_sid(d);
+    return avc_has_perm(dsid, rsid, SECCLASS_RESOURCE, perm, &ad);
+}
+
 static int flask_resource_plug_core(void)
 {
     return avc_current_has_perm(SECINITSID_DOMXEN, SECCLASS_RESOURCE, RESOURCE__PLUG, NULL);
@@ -1800,6 +1823,7 @@ static struct xsm_operations flask_ops = {
     .iomem_permission = flask_iomem_permission,
     .iomem_mapping = flask_iomem_mapping,
     .pci_config_permission = flask_pci_config_permission,
+    .interrupt_control = flask_interrupt_control,
 
     .resource_plug_core = flask_resource_plug_core,
     .resource_unplug_core = flask_resource_unplug_core,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 194d743..82eaeac 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -466,6 +466,7 @@ class resource
 # checked for PHYSDEVOP_restore_msi* (target PCI device)
 # checked for PHYSDEVOP_setup_gsi (target IRQ)
 # checked for PHYSDEVOP_pci_mmcfg_reserved (target xen_t)
+# checked for PHYSDEVOP_interrupt_control (target PCI device)
     setup
 }
 
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Xen-devel] [PATCH v6 6/6] tools/libxc: add wrapper for PHYSDEVOP_interrupt_control
  2019-09-14 15:37 [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
                   ` (4 preceding siblings ...)
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
@ 2019-09-14 15:37 ` Marek Marczykowski-Górecki
  2019-09-16 10:54   ` Wei Liu
  5 siblings, 1 reply; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-14 15:37 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Marek Marczykowski-Górecki, Wei Liu

Add libxc wrapper for PHYSDEVOP_interrupt_control introduced in previous
commit.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - new patch
Changes in v4:
 - adjust for updated previous patch
Changes in v5:
 - rename to PHYSDEVOP_msi_control, adjust arguments
Change in v6:
 - initialize struct physdev_interrupt_control inline, drop pointless rc
   variable
 - rename to PHYSDEVOP_interrupt_control
---
 tools/libxc/include/xenctrl.h |  6 ++++++
 tools/libxc/xc_physdev.c      | 15 +++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 0ff6ed9..2adb114 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1639,6 +1639,12 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
                           uint32_t domid,
                           int pirq);
 
+int xc_physdev_interrupt_control(xc_interface *xch,
+                                 int seg,
+                                 int bus,
+                                 int devfn,
+                                 int flags);
+
 /*
  *  LOGGING AND ERROR REPORTING
  */
diff --git a/tools/libxc/xc_physdev.c b/tools/libxc/xc_physdev.c
index 460a8e7..5af8296 100644
--- a/tools/libxc/xc_physdev.c
+++ b/tools/libxc/xc_physdev.c
@@ -111,3 +111,18 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
     return rc;
 }
 
+int xc_physdev_interrupt_control(xc_interface *xch,
+                                 int seg,
+                                 int bus,
+                                 int devfn,
+                                 int flags)
+{
+    struct physdev_interrupt_control op = {
+        .seg = seg,
+        .bus = bus,
+        .devfn = devfn,
+        .flags = flags,
+    };
+
+    return do_physdev_op(xch, PHYSDEVOP_interrupt_control, &op, sizeof(op));
+}
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 6/6] tools/libxc: add wrapper for PHYSDEVOP_interrupt_control
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 6/6] tools/libxc: add wrapper for PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
@ 2019-09-16 10:54   ` Wei Liu
  0 siblings, 0 replies; 18+ messages in thread
From: Wei Liu @ 2019-09-16 10:54 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: Xen-devel, Ian Jackson, Wei Liu

On Sat, 14 Sep 2019 at 16:38, Marek Marczykowski-Górecki
<marmarek@invisiblethingslab.com> wrote:
>
> Add libxc wrapper for PHYSDEVOP_interrupt_control introduced in previous
> commit.
>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

Subject to acceptance of earlier patches.

Acked-by: Wei Liu <wl@xen.org>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
@ 2019-09-20  9:23   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2019-09-20  9:23 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Simon Gaiser, Julien Grall, Suravee Suthikulpanit, xen-devel,
	Roger Pau Monné

On 14.09.2019 17:37, Marek Marczykowski-Górecki  wrote:
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -254,7 +254,13 @@ void __init clear_irq_vector(int irq)
>  /*
>   * Dynamic irq allocate and deallocation for MSI
>   */
> -int create_irq(nodeid_t node)
> +
> +/*
> + * create_irq - allocate irq for MSI
> + * @d domain that will get permission over the allocated irq; this permission
> + * will automatically be revoked on destroy_irq
> + */
> +int create_irq(nodeid_t node, struct domain *d)

I think there's nothing wrong with the pointer getting constified,
but see also below.

> @@ -282,23 +288,30 @@ int create_irq(nodeid_t node)
>          }
>          ret = assign_irq_vector(irq, mask);
>      }
> +    ASSERT(desc->arch.creator_domid == DOMID_INVALID);
>      if (ret < 0)

I think this insertion wants to gain blank lines on both sides.

>      {
>          desc->arch.used = IRQ_UNUSED;
>          irq = ret;
>      }
> -    else if ( hardware_domain )
> +    else if ( d )
>      {
> -        ret = irq_permit_access(hardware_domain, irq);
> +        ASSERT(d == current->domain);

Why pass in the domain then in the first place? Could by just a
boolean, couldn't it? Suitably named it might even eliminate
the need for the explanatory comment (see also below).

> +        ret = irq_permit_access(d, irq);
>          if ( ret )
>              printk(XENLOG_G_ERR
> -                   "Could not grant Dom0 access to IRQ%d (error %d)\n",
> -                   irq, ret);
> +                   "Could not grant Dom%u access to IRQ%d (error %d)\n",
> +                   d->domain_id, irq, ret);

Please use %pd here (and elsewhere).

> +        else
> +            desc->arch.creator_domid = d->domain_id;
>      }
>  
>      return irq;
>  }
>  
> +/*
> + * destroy_irq - deallocate irq for MSI
> + */
>  void destroy_irq(unsigned int irq)

I don't think this is a very helpful comment to add; in fact I think
the respective part on the other function would better be dropped as
well, seeing the further comment ahead of both functions. (Otherwise
I'd have to point out that this is a single line comment.)

> @@ -307,14 +320,25 @@ void destroy_irq(unsigned int irq)
>  
>      BUG_ON(!MSI_IRQ(irq));
>  
> -    if ( hardware_domain )
> +    if ( desc->arch.creator_domid != DOMID_INVALID )
>      {
> -        int err = irq_deny_access(hardware_domain, irq);
> +        struct domain *d = get_domain_by_id(desc->arch.creator_domid);
>  
> -        if ( err )
> -            printk(XENLOG_G_ERR
> -                   "Could not revoke Dom0 access to IRQ%u (error %d)\n",
> -                   irq, err);
> +        if ( d && irq_access_permitted(d, irq) )
> +        {
> +            int err;
> +
> +            err = irq_deny_access(d, irq);

Please keep prior code structure, i.e. the function call being the
initializer of the variable.

> +            if ( err )
> +                printk(XENLOG_G_ERR
> +                       "Could not revoke Dom%u access to IRQ%u (error %d)\n",
> +                       d->domain_id, irq, err);
> +        }

Why the irq_access_permitted() check around this? You go to some
lengths to explain this in the description, but if the domain has
no permission over the IRQ (because of domain ID re-use),
irq_deny_access() will simply do nothing, won't it? I.e. the way
this gets done and explained (saying that MSI IRQs can't be
shared between domains) wants to change a little.

> --- a/xen/include/asm-x86/irq.h
> +++ b/xen/include/asm-x86/irq.h
> @@ -45,6 +45,11 @@ struct arch_irq_desc {
>          unsigned move_cleanup_count;
>          u8 move_in_progress : 1;
>          s8 used;
> +        /*
> +         * weak reference to domain having permission over this IRQ (which can
> +         * be different from the domain actually havint the IRQ assigned)
> +         */
> +        domid_t creator_domid;

Comment style (should start with a capital letter).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-14 15:37 ` [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
@ 2019-09-20 10:10   ` Jan Beulich
  2019-09-20 16:02     ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2019-09-20 10:10 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné

On 14.09.2019 17:37, Marek Marczykowski-Górecki  wrote:
> Allow device model running in stubdomain to enable/disable INTx/MSI(-X),
> bypassing pciback. While pciback is still used to access config space
> from within stubdomain, it refuse to write to
> PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE/PCI_COMMAND_INTX_DISABLE
> in non-permissive mode. Which is the right thing to do for PV domain
> (the main use case for pciback), as PV domain should use XEN_PCI_OP_*
> commands for that. Unfortunately those commands are not good for
> stubdomain use, as they configure MSI in dom0's kernel too, which should
> not happen for HVM domain.

Why the "for HVM domain" here? I.e. why would this be correct for
a PV domain? Besides my dislike for such a bypass (imo all of the
handling should go through pciback, or none of it) I continue to
wonder whether the problem can't be addressed by a pciback change.
And even if not, I'd still wonder whether the request shouldn't go
through pciback, to retain proper layering. Ultimately it may be
better to have even the map/unmap go through pciback (it's at
least an apparent violation of the original physdev-op model that
these two are XSM_DM_PRIV).

Irrespective of this a couple of comments on the patch itself:

> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -1443,6 +1443,51 @@ int pci_restore_msi_state(struct pci_dev *pdev)
>      return 0;
>  }
>  
> +int msi_control(struct pci_dev *pdev, bool msix, bool enable)
> +{
> +    unsigned int cap = msix ? PCI_CAP_ID_MSIX : PCI_CAP_ID_MSI;
> +    unsigned int other_cap = msix ? PCI_CAP_ID_MSI : PCI_CAP_ID_MSIX;
> +    uint16_t cmd;
> +
> +    if ( !use_msi )
> +        return -EOPNOTSUPP;
> +
> +    if ( !pci_find_cap_offset(pdev->seg,
> +                              pdev->bus,
> +                              PCI_SLOT(pdev->devfn),
> +                              PCI_FUNC(pdev->devfn),

Please don't use PCI_SLOT() and PCI_FUNC() anymore, now that we
have pdev->dev and pdev->fn. And please put multiple arguments
on one line, as many as will fit.

> +                              cap) )
> +        return -ENODEV;
> +
> +    cmd = pci_conf_read16(pdev->sbdf, PCI_COMMAND);
> +
> +    /* don't allow enabling MSI(-X) and INTx at the same time */
> +    if ( enable && ! (cmd & PCI_COMMAND_INTX_DISABLE) )

Stray blank after ! .

> +        return -EBUSY;
> +
> +    /* don't allow enabling both MSI and MSI-X at the same time */
> +    if ( enable && find_msi_entry(pdev, -1, other_cap) )
> +        return -EBUSY;

Combine both if()-s, since they both check "enable"?

Also - comment style again (should start with capital letter);
more instances elsewhere.

> +int intx_control(struct pci_dev *pdev, bool enable)
> +{
> +    /* don't allow enabling INTx if MSI(-X) is already enabled */
> +    if ( enable && find_msi_entry(pdev, -1, PCI_CAP_ID_MSI) )
> +        return -EBUSY;
> +    if ( enable && find_msi_entry(pdev, -1, PCI_CAP_ID_MSIX) )
> +        return -EBUSY;

Here even more so you want to combine both if()-s.

> +    pci_intx(pdev, enable);
> +    return 0;
> +}

Blank line ahead of main return statement please, and I guess
another blank line ahead of the pci_intx() invocation wouldn't
hurt either.

> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -662,6 +662,59 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case PHYSDEVOP_interrupt_control: {
> +        struct physdev_interrupt_control op;
> +        struct pci_dev *pdev;
> +        int intr_type;
> +        bool enable;
> +
> +        ret = -EFAULT;
> +        if ( copy_from_guest(&op, arg, 1) )
> +            break;
> +
> +        ret = -EINVAL;
> +        if ( op.flags & ~(PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK |
> +                          PHYSDEVOP_INTERRUPT_CONTROL_ENABLE) )
> +            break;
> +
> +        intr_type = op.flags & PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK;
> +        enable = op.flags & PHYSDEVOP_INTERRUPT_CONTROL_ENABLE;
> +
> +        pcidevs_lock();
> +        pdev = pci_get_pdev(op.seg, op.bus, op.devfn);
> +        ret = -ENODEV;
> +        /* explicitly exclude hidden devices */
> +        if ( !pdev || pdev->domain == dom_xen )

The right side should be avoided by reducing the scope of the device
lookup right away, through use of pci_get_pdev_by_domain(). This
will also ensure we don't exclusively rely on the XSM check below to
prevent abuse of this operation. (FAOD, while
pci_get_pdev_by_domain() doesn't assert that the pcidevs lock is
held, you should still acquire it here. That missing ASSERT() should
get added as soon as other violators of the locking requirement have
been taken care of.)

> +            goto pci_unlock;
> +
> +        ret = xsm_interrupt_control(XSM_DM_PRIV,
> +                                    pdev->domain,
> +                                    pdev->sbdf.sbdf,
> +                                    intr_type,
> +                                    enable);
> +        if ( ret )
> +            goto pci_unlock;
> +
> +        switch ( intr_type )
> +        {
> +            case PHYSDEVOP_INTERRUPT_CONTROL_INTX:
> +                ret = intx_control(pdev, enable);
> +                break;
> +            case PHYSDEVOP_INTERRUPT_CONTROL_MSI:
> +                ret = msi_control(pdev, false, enable);
> +                break;
> +            case PHYSDEVOP_INTERRUPT_CONTROL_MSIX:
> +                ret = msi_control(pdev, true, enable);
> +                break;
> +            default:
> +                ret = -EINVAL;
> +                break;

Indentation and blank lines between independent case blocks please.

> +        }
> +pci_unlock:

Labels indented by at least one blank, please.

> --- a/xen/include/public/physdev.h
> +++ b/xen/include/public/physdev.h
> @@ -345,6 +345,29 @@ typedef struct physdev_dbgp_op physdev_dbgp_op_t;
>  DEFINE_XEN_GUEST_HANDLE(physdev_dbgp_op_t);
>  
>  /*
> + * Choose which interrupt type to control. If neither MSI nor MSI-X is chosen,
> + * will apply to INTx - for convenience define PHYSDEVOP_INTERRUPT_CONTROL_INTX
> + * and PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK
> + */
> +#define PHYSDEVOP_INTERRUPT_CONTROL_TYPE_MASK 3
> +#define PHYSDEVOP_INTERRUPT_CONTROL_INTX      0
> +#define PHYSDEVOP_INTERRUPT_CONTROL_MSI       1
> +#define PHYSDEVOP_INTERRUPT_CONTROL_MSIX      2
> +/* when PHYSDEVOP_INTERRUPT_CONTROL_ENABLE not set, disable */
> +#define PHYSDEVOP_INTERRUPT_CONTROL_ENABLE    4
> +
> +#define PHYSDEVOP_interrupt_control   32
> +struct physdev_interrupt_control {
> +    /* IN */
> +    uint16_t seg;
> +    uint8_t bus;
> +    uint8_t devfn;

Please re-use struct physdev_pci_device for these.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-20 10:10   ` Jan Beulich
@ 2019-09-20 16:02     ` Marek Marczykowski-Górecki
  2019-09-23  7:58       ` Jan Beulich
  2019-09-23 15:16       ` Roger Pau Monné
  0 siblings, 2 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-20 16:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné


[-- Attachment #1.1: Type: text/plain, Size: 2641 bytes --]

On Fri, Sep 20, 2019 at 12:10:09PM +0200, Jan Beulich wrote:
> On 14.09.2019 17:37, Marek Marczykowski-Górecki  wrote:
> > Allow device model running in stubdomain to enable/disable INTx/MSI(-X),
> > bypassing pciback. While pciback is still used to access config space
> > from within stubdomain, it refuse to write to
> > PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE/PCI_COMMAND_INTX_DISABLE
> > in non-permissive mode. Which is the right thing to do for PV domain
> > (the main use case for pciback), as PV domain should use XEN_PCI_OP_*
> > commands for that. Unfortunately those commands are not good for
> > stubdomain use, as they configure MSI in dom0's kernel too, which should
> > not happen for HVM domain.
> 
> Why the "for HVM domain" here? I.e. why would this be correct for
> a PV domain? Besides my dislike for such a bypass (imo all of the
> handling should go through pciback, or none of it) I continue to
> wonder whether the problem can't be addressed by a pciback change.
> And even if not, I'd still wonder whether the request shouldn't go
> through pciback, to retain proper layering. Ultimately it may be
> better to have even the map/unmap go through pciback (it's at
> least an apparent violation of the original physdev-op model that
> these two are XSM_DM_PRIV).

Technically it should be possible to move this part to pciback, and in
fact this is what I've considered in the first version of this series.
But Roger points out on each version[1] of this series that pciback is
meant to serve *PV* domains, where a PCI passthrough is a completely
different different beast. In fact, I even consider that using pcifront
in a Linux stubdomain as a proxy for qemu there may be a bad idea (one
needs to be careful to avoid stubdomain kernel fighting with qemu about
device state).

Roger, what is the state of Xen internal vPCI? If handling PCI
passthrough in Xen (or maybe standalone emulator), without qemu help is
going to happen sooner than later (I guess not 4.13, but maybe 4.14?),
then maybe this whole patch doesn't make sense as a temporary measure?

Anyway, if you all agree that pciback should be the way to go, I can go
that route too. In practice, it would be a flag (set by the toolstack?)
allowing writes to appropriate config space registers directly (with
appropriate checks, as in this patch).

[1] https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00486.html

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-20 16:02     ` Marek Marczykowski-Górecki
@ 2019-09-23  7:58       ` Jan Beulich
  2019-09-23 10:47         ` Marek Marczykowski-Górecki
  2019-09-23 15:16       ` Roger Pau Monné
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2019-09-23  7:58 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné

On 20.09.2019 18:02, Marek Marczykowski-Górecki  wrote:
> On Fri, Sep 20, 2019 at 12:10:09PM +0200, Jan Beulich wrote:
>> On 14.09.2019 17:37, Marek Marczykowski-Górecki  wrote:
>>> Allow device model running in stubdomain to enable/disable INTx/MSI(-X),
>>> bypassing pciback. While pciback is still used to access config space
>>> from within stubdomain, it refuse to write to
>>> PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE/PCI_COMMAND_INTX_DISABLE
>>> in non-permissive mode. Which is the right thing to do for PV domain
>>> (the main use case for pciback), as PV domain should use XEN_PCI_OP_*
>>> commands for that. Unfortunately those commands are not good for
>>> stubdomain use, as they configure MSI in dom0's kernel too, which should
>>> not happen for HVM domain.
>>
>> Why the "for HVM domain" here? I.e. why would this be correct for
>> a PV domain? Besides my dislike for such a bypass (imo all of the
>> handling should go through pciback, or none of it) I continue to
>> wonder whether the problem can't be addressed by a pciback change.
>> And even if not, I'd still wonder whether the request shouldn't go
>> through pciback, to retain proper layering. Ultimately it may be
>> better to have even the map/unmap go through pciback (it's at
>> least an apparent violation of the original physdev-op model that
>> these two are XSM_DM_PRIV).
> 
> Technically it should be possible to move this part to pciback, and in
> fact this is what I've considered in the first version of this series.
> But Roger points out on each version[1] of this series that pciback is
> meant to serve *PV* domains, where a PCI passthrough is a completely
> different different beast. In fact, I even consider that using pcifront
> in a Linux stubdomain as a proxy for qemu there may be a bad idea (one
> needs to be careful to avoid stubdomain kernel fighting with qemu about
> device state).

Well, not using pciback _at all_ in this case would be another option.
What I dislike is the furthering of hybrid-ness.

> Anyway, if you all agree that pciback should be the way to go, I can go
> that route too. In practice, it would be a flag (set by the toolstack?)
> allowing writes to appropriate config space registers directly (with
> appropriate checks, as in this patch).

I'm afraid I don't agree: How would allowing writes to more config space
registers by a stubdom be safe?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-23  7:58       ` Jan Beulich
@ 2019-09-23 10:47         ` Marek Marczykowski-Górecki
  2019-09-23 12:05           ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-23 10:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné


[-- Attachment #1.1: Type: text/plain, Size: 3451 bytes --]

On Mon, Sep 23, 2019 at 09:58:27AM +0200, Jan Beulich wrote:
> On 20.09.2019 18:02, Marek Marczykowski-Górecki  wrote:
> > On Fri, Sep 20, 2019 at 12:10:09PM +0200, Jan Beulich wrote:
> >> On 14.09.2019 17:37, Marek Marczykowski-Górecki  wrote:
> >>> Allow device model running in stubdomain to enable/disable INTx/MSI(-X),
> >>> bypassing pciback. While pciback is still used to access config space
> >>> from within stubdomain, it refuse to write to
> >>> PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE/PCI_COMMAND_INTX_DISABLE
> >>> in non-permissive mode. Which is the right thing to do for PV domain
> >>> (the main use case for pciback), as PV domain should use XEN_PCI_OP_*
> >>> commands for that. Unfortunately those commands are not good for
> >>> stubdomain use, as they configure MSI in dom0's kernel too, which should
> >>> not happen for HVM domain.
> >>
> >> Why the "for HVM domain" here? I.e. why would this be correct for
> >> a PV domain? Besides my dislike for such a bypass (imo all of the
> >> handling should go through pciback, or none of it) I continue to
> >> wonder whether the problem can't be addressed by a pciback change.
> >> And even if not, I'd still wonder whether the request shouldn't go
> >> through pciback, to retain proper layering. Ultimately it may be
> >> better to have even the map/unmap go through pciback (it's at
> >> least an apparent violation of the original physdev-op model that
> >> these two are XSM_DM_PRIV).
> > 
> > Technically it should be possible to move this part to pciback, and in
> > fact this is what I've considered in the first version of this series.
> > But Roger points out on each version[1] of this series that pciback is
> > meant to serve *PV* domains, where a PCI passthrough is a completely
> > different different beast. In fact, I even consider that using pcifront
> > in a Linux stubdomain as a proxy for qemu there may be a bad idea (one
> > needs to be careful to avoid stubdomain kernel fighting with qemu about
> > device state).
> 
> Well, not using pciback _at all_ in this case would be another option.
> What I dislike is the furthering of hybrid-ness.

Ah, I see. This may be a good idea, if this type of PCI passthrough is
going to stay. If we're going away from qemu towards other options
mentioned in previous email, I'd say such a rework is too much work for a
time limited usefulness.
> 
> > Anyway, if you all agree that pciback should be the way to go, I can go
> > that route too. In practice, it would be a flag (set by the toolstack?)
> > allowing writes to appropriate config space registers directly (with
> > appropriate checks, as in this patch).
> 
> I'm afraid I don't agree: How would allowing writes to more config space
> registers by a stubdom be safe?

Exactly the same as in this patch: pciback would perform the same
validation (prohibit enabling MSI together with INTx etc).

BTW what are the risks (besides DoS) of allowing full config space
access, assuming VT-d with interrupt remapping present? This sounds
similar to risks of malicious device connected to some domU, right? Can
such device (or a domain controlling such device) break out to Xen or
dom0? Can it steal data from other domains?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-23 10:47         ` Marek Marczykowski-Górecki
@ 2019-09-23 12:05           ` Jan Beulich
  2019-09-23 12:25             ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2019-09-23 12:05 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné

On 23.09.2019 12:47, Marek Marczykowski-Górecki  wrote:
> On Mon, Sep 23, 2019 at 09:58:27AM +0200, Jan Beulich wrote:
>> On 20.09.2019 18:02, Marek Marczykowski-Górecki  wrote:
>>> Anyway, if you all agree that pciback should be the way to go, I can go
>>> that route too. In practice, it would be a flag (set by the toolstack?)
>>> allowing writes to appropriate config space registers directly (with
>>> appropriate checks, as in this patch).
>>
>> I'm afraid I don't agree: How would allowing writes to more config space
>> registers by a stubdom be safe?
> 
> Exactly the same as in this patch: pciback would perform the same
> validation (prohibit enabling MSI together with INTx etc).
> 
> BTW what are the risks (besides DoS) of allowing full config space
> access, assuming VT-d with interrupt remapping present? This sounds
> similar to risks of malicious device connected to some domU, right? Can
> such device (or a domain controlling such device) break out to Xen or
> dom0? Can it steal data from other domains?

There shouldn't be, but this would need proving. The direction of
proof then should be the other way around (and I realize it may be
[close to] impossible): Widening what guests (including stub
domains) are allowed to do should be proven to add no additional
risks. It shouldn't be (by example, as I imply from your question)
that an actual issue needs to be pointed out.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-23 12:05           ` Jan Beulich
@ 2019-09-23 12:25             ` Marek Marczykowski-Górecki
  2019-09-23 13:02               ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-23 12:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné


[-- Attachment #1.1: Type: text/plain, Size: 1918 bytes --]

On Mon, Sep 23, 2019 at 02:05:58PM +0200, Jan Beulich wrote:
> On 23.09.2019 12:47, Marek Marczykowski-Górecki  wrote:
> > On Mon, Sep 23, 2019 at 09:58:27AM +0200, Jan Beulich wrote:
> >> On 20.09.2019 18:02, Marek Marczykowski-Górecki  wrote:
> >>> Anyway, if you all agree that pciback should be the way to go, I can go
> >>> that route too. In practice, it would be a flag (set by the toolstack?)
> >>> allowing writes to appropriate config space registers directly (with
> >>> appropriate checks, as in this patch).
> >>
> >> I'm afraid I don't agree: How would allowing writes to more config space
> >> registers by a stubdom be safe?
> > 
> > Exactly the same as in this patch: pciback would perform the same
> > validation (prohibit enabling MSI together with INTx etc).
> > 
> > BTW what are the risks (besides DoS) of allowing full config space
> > access, assuming VT-d with interrupt remapping present? This sounds
> > similar to risks of malicious device connected to some domU, right? Can
> > such device (or a domain controlling such device) break out to Xen or
> > dom0? Can it steal data from other domains?
> 
> There shouldn't be, but this would need proving. The direction of
> proof then should be the other way around (and I realize it may be
> [close to] impossible): Widening what guests (including stub
> domains) are allowed to do should be proven to add no additional
> risks. It shouldn't be (by example, as I imply from your question)
> that an actual issue needs to be pointed out.

What about this: HVM guest can already do all of this when qemu is
running in dom0. So, allowing those actions when qemu is running in
stubdomain should not introduce _additional_ risks.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-23 12:25             ` Marek Marczykowski-Górecki
@ 2019-09-23 13:02               ` Jan Beulich
  2019-09-23 13:24                 ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2019-09-23 13:02 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné

On 23.09.2019 14:25, Marek Marczykowski-Górecki  wrote:
> What about this: HVM guest can already do all of this when qemu is
> running in dom0. So, allowing those actions when qemu is running in
> stubdomain should not introduce _additional_ risks.

Well, in a way - yes. But I don't think it's right to have qemu do
the direct writes it does (and I wouldn't be surprised if there
were still actual issues with this model). Hence it's not going to
be an improvement if this suspicious underlying design got
extended to other components.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-23 13:02               ` Jan Beulich
@ 2019-09-23 13:24                 ` Marek Marczykowski-Górecki
  0 siblings, 0 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-09-23 13:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel De Graaf, Roger Pau Monné


[-- Attachment #1.1: Type: text/plain, Size: 1470 bytes --]

On Mon, Sep 23, 2019 at 03:02:49PM +0200, Jan Beulich wrote:
> On 23.09.2019 14:25, Marek Marczykowski-Górecki  wrote:
> > What about this: HVM guest can already do all of this when qemu is
> > running in dom0. So, allowing those actions when qemu is running in
> > stubdomain should not introduce _additional_ risks.
> 
> Well, in a way - yes. But I don't think it's right to have qemu do
> the direct writes it does (and I wouldn't be surprised if there
> were still actual issues with this model). Hence it's not going to
> be an improvement if this suspicious underlying design got
> extended to other components.

This sounds like any workflow involving qemu would be inferior. And I
agree with that. But also I do need PCI passthrough working, so I need a
solution until we have an alternative implementation. If that alternative
is going to happen soon, I'm also fine with carrying patches in Qubes
package (like I already do). This wouldn't be nice for the rest of the
community (I believe many other Xen-based projects also carry similar
patches already), but it looks like upstreaming this is taking way too
much effort than it's worth for a temporary solution.

So, in the next version of this series I'm going to drop this patch (and
the next one).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control
  2019-09-20 16:02     ` Marek Marczykowski-Górecki
  2019-09-23  7:58       ` Jan Beulich
@ 2019-09-23 15:16       ` Roger Pau Monné
  1 sibling, 0 replies; 18+ messages in thread
From: Roger Pau Monné @ 2019-09-23 15:16 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, xen-devel, Daniel De Graaf

On Fri, Sep 20, 2019 at 06:02:50PM +0200, Marek Marczykowski-Górecki wrote:
> On Fri, Sep 20, 2019 at 12:10:09PM +0200, Jan Beulich wrote:
> > On 14.09.2019 17:37, Marek Marczykowski-Górecki  wrote:
> > > Allow device model running in stubdomain to enable/disable INTx/MSI(-X),
> > > bypassing pciback. While pciback is still used to access config space
> > > from within stubdomain, it refuse to write to
> > > PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE/PCI_COMMAND_INTX_DISABLE
> > > in non-permissive mode. Which is the right thing to do for PV domain
> > > (the main use case for pciback), as PV domain should use XEN_PCI_OP_*
> > > commands for that. Unfortunately those commands are not good for
> > > stubdomain use, as they configure MSI in dom0's kernel too, which should
> > > not happen for HVM domain.
> > 
> > Why the "for HVM domain" here? I.e. why would this be correct for
> > a PV domain? Besides my dislike for such a bypass (imo all of the
> > handling should go through pciback, or none of it) I continue to
> > wonder whether the problem can't be addressed by a pciback change.
> > And even if not, I'd still wonder whether the request shouldn't go
> > through pciback, to retain proper layering. Ultimately it may be
> > better to have even the map/unmap go through pciback (it's at
> > least an apparent violation of the original physdev-op model that
> > these two are XSM_DM_PRIV).
> 
> Technically it should be possible to move this part to pciback, and in
> fact this is what I've considered in the first version of this series.
> But Roger points out on each version[1] of this series that pciback is
> meant to serve *PV* domains, where a PCI passthrough is a completely
> different different beast. In fact, I even consider that using pcifront
> in a Linux stubdomain as a proxy for qemu there may be a bad idea (one
> needs to be careful to avoid stubdomain kernel fighting with qemu about
> device state).

Right, it's (as show by this series) tricky to proxy HVM passthrough
over the PV pciif protocol used by pcifront and pciback, because that
protocol was designed for PV guests pci-passthrough.

While it's indeed possible to expand the pciif protocol so it's also
suitable to proxy HVM passthrough by a QEMU stubdomain that would
require changes to Linux pciback at least (and to pcifront maybe?),
and it's usage would need to be limited to stubdomains only to not
risk expanding the attack surface of pciback.

> Roger, what is the state of Xen internal vPCI? If handling PCI
> passthrough in Xen (or maybe standalone emulator), without qemu help is
> going to happen sooner than later (I guess not 4.13, but maybe 4.14?),
> then maybe this whole patch doesn't make sense as a temporary measure?

I've got an initial series posted to convert vPCI to an internal ioreq
server, so it can co-exist with other ioreq servers that also trap
accesses to the pci configuration space. Once that's done the main
work will be to make vPCI safe for unprivileged domains. Right now
vPCI is too permissive since it's designed for dom0 only.

I hope 4.14 will have at least experimental code for vPCI for domUs,
but I cannot guarantee anything at this point.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-09-23 15:17 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-14 15:37 [Xen-devel] [PATCH v6 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
2019-09-14 15:37 ` [Xen-devel] [PATCH v6 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
2019-09-14 15:37 ` [Xen-devel] [PATCH v6 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront Marek Marczykowski-Górecki
2019-09-14 15:37 ` [Xen-devel] [PATCH v6 3/6] libxl: don't try to manipulate json config for stubdomain Marek Marczykowski-Górecki
2019-09-14 15:37 ` [Xen-devel] [PATCH v6 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
2019-09-20  9:23   ` Jan Beulich
2019-09-14 15:37 ` [Xen-devel] [PATCH v6 5/6] xen/x86: add PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
2019-09-20 10:10   ` Jan Beulich
2019-09-20 16:02     ` Marek Marczykowski-Górecki
2019-09-23  7:58       ` Jan Beulich
2019-09-23 10:47         ` Marek Marczykowski-Górecki
2019-09-23 12:05           ` Jan Beulich
2019-09-23 12:25             ` Marek Marczykowski-Górecki
2019-09-23 13:02               ` Jan Beulich
2019-09-23 13:24                 ` Marek Marczykowski-Górecki
2019-09-23 15:16       ` Roger Pau Monné
2019-09-14 15:37 ` [Xen-devel] [PATCH v6 6/6] tools/libxc: add wrapper for PHYSDEVOP_interrupt_control Marek Marczykowski-Górecki
2019-09-16 10:54   ` Wei Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.