All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain
@ 2019-02-07  0:07 Marek Marczykowski-Górecki
  2019-02-07  0:07 ` [PATCH v4 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
                   ` (5 more replies)
  0 siblings, 6 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07  0:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Marek Marczykowski-Górecki

In this version, I add PHYSDEVOP_msi_set_enable to allow stubdomain
enabling MSI after mapping it.

Related article:
https://www.qubes-os.org/news/2017/10/18/msi-support/

Changes in v2:
 - new "xen/x86: Allow stubdom access to irq created for msi" patch
 - applied review comments from v1
Changes is v3:
 - apply suggestions by Roger
 - add PHYSDEVOP_msi_msix_set_enable
Changes in v4:
 - implement suggestions by Wei, Roger, Jan
 - plug new physdevop into XSM

Marek Marczykowski-Górecki (5):
  libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use
  libxl: attach PCI device to qemu only after setting pciback/pcifront
  libxl: don't try to manipulate json config for stubdomain
  xen/x86: add PHYSDEVOP_msi_set_enable
  tools/libxc: add wrapper for PHYSDEVOP_msi_set_enable

Simon Gaiser (1):
  xen/x86: Allow stubdom access to irq created for msi.

 tools/libxc/include/xenctrl.h |  7 ++++-
 tools/libxc/xc_physdev.c      | 21 ++++++++++++-
 tools/libxl/libxl_pci.c       | 63 ++++++++++++++++++++++++------------
 xen/arch/x86/irq.c            | 24 ++++++++++++++-
 xen/arch/x86/msi.c            | 24 ++++++++++++++-
 xen/arch/x86/physdev.c        | 33 +++++++++++++++++++-
 xen/arch/x86/x86_64/physdev.c |  4 ++-
 xen/include/asm-x86/msi.h     |  1 +-
 xen/include/public/physdev.h  | 15 +++++++++-
 xen/include/xlat.lst          |  1 +-
 xen/include/xsm/dummy.h       |  7 ++++-
 xen/include/xsm/xsm.h         |  6 +++-
 xen/xsm/dummy.c               |  1 +-
 xen/xsm/flask/hooks.c         | 25 ++++++++++++++-
 14 files changed, 212 insertions(+), 20 deletions(-)

base-commit: 93a62c544e20ba9e141e411bbaae3d65259d13a3
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v4 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use
  2019-02-07  0:07 [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
@ 2019-02-07  0:07 ` Marek Marczykowski-Górecki
  2019-02-07  0:07 ` [PATCH v4 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront Marek Marczykowski-Górecki
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07  0:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Ian Jackson, Marek Marczykowski-Górecki

HVM domains use IOMMU and device model assistance for communicating with
PCI devices, xen-pcifront/pciback isn't directly needed by HVM domain.
But pciback serve also second function - it reset the device when it is
deassigned from the guest and for this reason pciback needs to be used
with HVM domain too.
When HVM domain has device model in stubdomain, attaching xen-pciback to
the target domain itself may prevent attaching xen-pciback to the
(PV) stubdomain, effectively breaking PCI passthrough.

Fix this by attaching pciback only to one domain: if PV stubdomain is in
use, let it be stubdomain (the commit prevents attaching device to target
HVM in this case); otherwise, attach it to the target domain.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes in v2:
 - previously called "libxl: attach xen-pciback only to PV domains"
 - instead of excluding all HVMs, change the condition to what actually
   matters here - check if stubdomain is in use; this way xen-pciback is
   always in use (either for the target domain, or it's stubdomain),
   fixing PCI reset by xen-pciback concerns
Changes in v3:
 - adjust commit message
---
 tools/libxl/libxl_pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 87afa03..3b6b23c 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -1106,7 +1106,7 @@ out:
         }
     }
 
-    if (!starting)
+    if (!starting && !libxl_get_stubdom_id(CTX, domid))
         rc = libxl__device_pci_add_xenstore(gc, domid, pcidev, starting);
     else
         rc = 0;
@@ -1302,7 +1302,7 @@ static void libxl__add_pcidevs(libxl__egc *egc, libxl__ao *ao, uint32_t domid,
         }
     }
 
-    if (d_config->num_pcidevs > 0) {
+    if (d_config->num_pcidevs > 0 && !libxl_get_stubdom_id(CTX, domid)) {
         rc = libxl__create_pci_backend(gc, domid, d_config->pcidevs,
             d_config->num_pcidevs);
         if (rc < 0) {
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront
  2019-02-07  0:07 [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
  2019-02-07  0:07 ` [PATCH v4 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
@ 2019-02-07  0:07 ` Marek Marczykowski-Górecki
  2019-02-07  0:07 ` [PATCH v4 3/6] libxl: don't try to manipulate json config for stubdomain Marek Marczykowski-Górecki
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07  0:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Ian Jackson, Marek Marczykowski-Górecki

When qemu is running in stubdomain, handling "pci-ins" command will fail
if pcifront is not initialized already. Fix this by sending such command
only after confirming that pciback/front is running.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes in v2:
- Fixed code style since previous version.
---
 tools/libxl/libxl_pci.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 3b6b23c..1bde537 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -1191,6 +1191,7 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
 {
     libxl_ctx *ctx = libxl__gc_owner(gc);
     unsigned int orig_vdev, pfunc_mask;
+    char *be_path;
     libxl_device_pci *assigned;
     int num_assigned, i, rc;
     int stubdomid = 0;
@@ -1245,6 +1246,14 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
         rc = do_pci_add(gc, stubdomid, &pcidev_s, 0);
         if ( rc )
             goto out;
+        /* Wait for the device actually being connected, otherwise device model
+         * running there will fail to find the device. */
+        be_path = libxl__sprintf(gc, "%s/backend/pci/%d/0",
+                                 libxl__xs_get_dompath(gc, 0), stubdomid);
+        rc = libxl__wait_for_backend(gc, be_path,
+                                     GCSPRINTF("%d", XenbusStateConnected));
+        if (rc)
+            goto out;
     }
 
     orig_vdev = pcidev->vdevfn & ~7U;
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 3/6] libxl: don't try to manipulate json config for stubdomain
  2019-02-07  0:07 [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
  2019-02-07  0:07 ` [PATCH v4 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
  2019-02-07  0:07 ` [PATCH v4 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront Marek Marczykowski-Górecki
@ 2019-02-07  0:07 ` Marek Marczykowski-Górecki
  2019-02-21 16:16   ` Wei Liu
  2019-02-07  0:07 ` [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07  0:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Ian Jackson, Marek Marczykowski-Górecki

Stubdomain do not have it's own config file - its configuration is
derived from target domains. Do not try to manipulate it when attaching
PCI device.

This bug prevented starting HVM with stubdomain and PCI passthrough
device attached.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - skip libxl__dm_check_start too, as stubdomain is guaranteed to be
   running at this stage already
 - do not init d_config at all, as it is used only for json manipulation
Changes in v4:
 - adjust comment style
---
 tools/libxl/libxl_pci.c | 50 ++++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 1bde537..0e70f0d 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -120,10 +120,14 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc, uint32_t domid, libxl_d
     libxl_domain_config d_config;
     libxl_device_pci pcidev_saved;
     libxl__domain_userdata_lock *lock = NULL;
+    bool is_stubdomain = libxl_is_stubdom(CTX, domid, NULL);
 
-    libxl_domain_config_init(&d_config);
-    libxl_device_pci_init(&pcidev_saved);
-    libxl_device_pci_copy(CTX, &pcidev_saved, pcidev);
+    /* Stubdomain doesn't have own config. */
+    if (!is_stubdomain) {
+        libxl_domain_config_init(&d_config);
+        libxl_device_pci_init(&pcidev_saved);
+        libxl_device_pci_copy(CTX, &pcidev_saved, pcidev);
+    }
 
     be_path = libxl__domain_device_backend_path(gc, 0, domid, 0,
                                                 LIBXL__DEVICE_KIND_PCI);
@@ -152,27 +156,35 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc, uint32_t domid, libxl_d
     GCNEW(device);
     libxl__device_from_pcidev(gc, domid, pcidev, device);
 
-    lock = libxl__lock_domain_userdata(gc, domid);
-    if (!lock) {
-        rc = ERROR_LOCK_FAIL;
-        goto out;
-    }
+    /* 
+     * Stubdomin config is derived from its target domain, it doesn't have
+     * its own file.
+     */
+    if (!is_stubdomain) {
+        lock = libxl__lock_domain_userdata(gc, domid);
+        if (!lock) {
+            rc = ERROR_LOCK_FAIL;
+            goto out;
+        }
 
-    rc = libxl__get_domain_configuration(gc, domid, &d_config);
-    if (rc) goto out;
+        rc = libxl__get_domain_configuration(gc, domid, &d_config);
+        if (rc) goto out;
 
-    device_add_domain_config(gc, &d_config, &libxl__pcidev_devtype,
-                             &pcidev_saved);
+        device_add_domain_config(gc, &d_config, &libxl__pcidev_devtype,
+                                 &pcidev_saved);
 
-    rc = libxl__dm_check_start(gc, &d_config, domid);
-    if (rc) goto out;
+        rc = libxl__dm_check_start(gc, &d_config, domid);
+        if (rc) goto out;
+    }
 
     for (;;) {
         rc = libxl__xs_transaction_start(gc, &t);
         if (rc) goto out;
 
-        rc = libxl__set_domain_configuration(gc, domid, &d_config);
-        if (rc) goto out;
+        if (lock) {
+            rc = libxl__set_domain_configuration(gc, domid, &d_config);
+            if (rc) goto out;
+        }
 
         libxl__xs_writev(gc, t, be_path, libxl__xs_kvs_of_flexarray(gc, back));
 
@@ -184,8 +196,10 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc, uint32_t domid, libxl_d
 out:
     libxl__xs_transaction_abort(gc, &t);
     if (lock) libxl__unlock_domain_userdata(lock);
-    libxl_device_pci_dispose(&pcidev_saved);
-    libxl_domain_config_dispose(&d_config);
+    if (!is_stubdomain) {
+        libxl_device_pci_dispose(&pcidev_saved);
+        libxl_domain_config_dispose(&d_config);
+    }
     return rc;
 }
 
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07  0:07 [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
                   ` (2 preceding siblings ...)
  2019-02-07  0:07 ` [PATCH v4 3/6] libxl: don't try to manipulate json config for stubdomain Marek Marczykowski-Górecki
@ 2019-02-07  0:07 ` Marek Marczykowski-Górecki
  2019-02-07  9:57   ` Roger Pau Monné
  2019-02-07  0:07 ` [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki
  2019-02-07  0:07 ` [PATCH v4 6/6] tools/libxc: add wrapper for PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki
  5 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07  0:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Andrew Cooper, Marek Marczykowski-Górecki,
	Simon Gaiser, Jan Beulich, Roger Pau Monné

From: Simon Gaiser <simon@invisiblethingslab.com>

Stubdomains need to be given sufficient privilege over the guest which it
provides emulation for in order for PCI passthrough to work correctly.
When a HVM domain try to enable MSI, QEMU in stubdomain calls
PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
part of xc_domain_update_msi_irq. Allow for that as part of
PHYSDEVOP_map_pirq.

This is not needed for PCI INTx, because IRQ in that case is known
beforehand and the stubdomain is given permissions over this IRQ by
libxl__device_pci_add (there's a do_pci_add against the stubdomain).

Based on https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.

Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - extend commit message
Changes in v4:
 - add missing destroy_irq on error path

With this patch, stubdomain will be able to create and map multiple irq
(DoS possibility?), as only target domain is validated in practice. Is
that ok? If not, what additional limits could be applied here?
In INTx case the problem doesn't apply, because toolstack grant access
to particular IRQ and no allocation happen on stubdomain request. But in
MSI case, it isn't that easy as IRQ number isn't known before (as
explained in the commit message).
---
 xen/arch/x86/irq.c     | 24 ++++++++++++++++++++++++
 xen/arch/x86/physdev.c |  9 +++++++++
 2 files changed, 33 insertions(+)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 8b44d6c..5e5dcac 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
         {
     case MAP_PIRQ_TYPE_MULTI_MSI:
             irq = create_irq(NUMA_NO_NODE);
+            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
+                    current->domain->target == d )
+            {
+                ret = irq_permit_access(current->domain, irq);
+                if ( ret ) {
+                    dprintk(XENLOG_G_ERR,
+                            "dom%d: can't grant it's stubdom (%d) access to "
+                            "irq %d for msi: %d!\n",
+                            d->domain_id,
+                            current->domain->domain_id,
+                            irq,
+                            ret);
+                    destroy_irq(irq);
+                    return ret;
+                }
+            }
         }
 
         if ( irq < nr_irqs_gsi || irq >= nr_irqs )
@@ -2717,7 +2733,15 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
         case MAP_PIRQ_TYPE_MSI:
             if ( index == -1 )
         case MAP_PIRQ_TYPE_MULTI_MSI:
+            {
+                if ( current->domain->target == d &&
+                        irq_deny_access(current->domain, irq) )
+                    dprintk(XENLOG_G_ERR,
+                            "dom%d: can't revoke stubdom's access to irq %d!\n",
+                            d->domain_id,
+                            irq);
                 destroy_irq(irq);
+            }
             break;
         }
     }
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 3a3c158..de59e39 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -164,6 +164,15 @@ int physdev_unmap_pirq(domid_t domid, int pirq)
 
     pcidevs_lock();
     spin_lock(&d->event_lock);
+    if ( current->domain->target == d)
+    {
+        int irq = domain_pirq_to_irq(d, pirq);
+        if ( irq <= 0 || irq_deny_access(current->domain, irq) )
+            dprintk(XENLOG_G_ERR,
+                    "dom%d: can't revoke stubdom's access to irq %d!\n",
+                    d->domain_id,
+                    irq);
+    }
     ret = unmap_domain_pirq(d, pirq);
     spin_unlock(&d->event_lock);
     pcidevs_unlock();
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-07  0:07 [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
                   ` (3 preceding siblings ...)
  2019-02-07  0:07 ` [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
@ 2019-02-07  0:07 ` Marek Marczykowski-Górecki
  2019-02-07 10:25   ` Roger Pau Monné
  2019-02-27 11:41   ` Jan Beulich
  2019-02-07  0:07 ` [PATCH v4 6/6] tools/libxc: add wrapper for PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki
  5 siblings, 2 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07  0:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson,
	Marek Marczykowski-Górecki, Tim Deegan, Julien Grall,
	Jan Beulich, Daniel De Graaf, Roger Pau Monné

Allow device model running in stubdomain to enable/disable MSI(-X),
bypassing pciback. While pciback is still used to access config space
from within stubdomain, it refuse to write to
PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE in non-permissive mode. Which
is the right thing to do for PV domain (the main use case for pciback),
as PV domain should use XEN_PCI_OP_* commands for that. Unfortunately
those commands are not good for stubdomain use, as they configure MSI in
dom0's kernel too, which should not happen for HVM domain.

This new physdevop is allowed only for stubdomain controlling the domain
which own the device.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - new patch
Changes in v4:
 - adjust code style
 - s/msi_msix/msi/
 - add msi_set_enable XSM hook
 - flatten struct physdev_msi_set_enable
 - add to include/xlat.lst

I'm not sure if XSM part is correct, compile-tested only, as I'm not
sure how to set the policy.
---
 xen/arch/x86/msi.c            | 24 ++++++++++++++++++++++++
 xen/arch/x86/physdev.c        | 24 ++++++++++++++++++++++++
 xen/arch/x86/x86_64/physdev.c |  4 ++++
 xen/include/asm-x86/msi.h     |  1 +
 xen/include/public/physdev.h  | 15 +++++++++++++++
 xen/include/xlat.lst          |  1 +
 xen/include/xsm/dummy.h       |  7 +++++++
 xen/include/xsm/xsm.h         |  6 ++++++
 xen/xsm/dummy.c               |  1 +
 xen/xsm/flask/hooks.c         | 25 +++++++++++++++++++++++++
 10 files changed, 108 insertions(+)

diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index babc414..c490c67 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1474,6 +1474,30 @@ int pci_restore_msi_state(struct pci_dev *pdev)
     return 0;
 }
 
+int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable)
+{
+    int ret;
+
+    ret = xsm_msi_set_enable(XSM_DM_PRIV, pdev->domain,
+                             (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn,
+                             mode, enable);
+    if ( ret )
+        return ret;
+
+    switch ( mode )
+    {
+    case PHYSDEVOP_MSI_SET_ENABLE_MSI:
+        msi_set_enable(pdev, enable);
+        break;
+
+    case PHYSDEVOP_MSI_SET_ENABLE_MSIX:
+        msix_set_enable(pdev, enable);
+        break;
+    }
+
+    return 0;
+}
+
 void __init early_msi_init(void)
 {
     if ( use_msi < 0 )
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index de59e39..ead8af9 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -671,6 +671,30 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case PHYSDEVOP_msi_set_enable: {
+        struct physdev_msi_set_enable op;
+        struct pci_dev *pdev;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&op, arg, 1) )
+            break;
+
+        ret = -EINVAL;
+        if ( op.mode != PHYSDEVOP_MSI_SET_ENABLE_MSI &&
+             op.mode != PHYSDEVOP_MSI_SET_ENABLE_MSIX )
+            break;
+
+        pcidevs_lock();
+        pdev = pci_get_pdev(op.seg, op.bus, op.devfn);
+        if ( pdev )
+            ret = msi_msix_set_enable(pdev, op.mode, !!op.enable);
+        else
+            ret = -ENODEV;
+        pcidevs_unlock();
+        break;
+
+    }
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/arch/x86/x86_64/physdev.c b/xen/arch/x86/x86_64/physdev.c
index c5a00ea..cb26b1e 100644
--- a/xen/arch/x86/x86_64/physdev.c
+++ b/xen/arch/x86/x86_64/physdev.c
@@ -76,6 +76,10 @@ CHECK_physdev_pci_device_add
 CHECK_physdev_pci_device
 #undef xen_physdev_pci_device
 
+#define xen_physdev_msi_set_enable physdev_msi_set_enable
+CHECK_physdev_msi_set_enable
+#undef xen_physdev_msi_set_enable
+
 #define COMPAT
 #undef guest_handle_okay
 #define guest_handle_okay          compat_handle_okay
diff --git a/xen/include/asm-x86/msi.h b/xen/include/asm-x86/msi.h
index 10387dc..7a22595 100644
--- a/xen/include/asm-x86/msi.h
+++ b/xen/include/asm-x86/msi.h
@@ -252,5 +252,6 @@ void guest_mask_msi_irq(struct irq_desc *, bool mask);
 void ack_nonmaskable_msi_irq(struct irq_desc *);
 void end_nonmaskable_msi_irq(struct irq_desc *, u8 vector);
 void set_msi_affinity(struct irq_desc *, const cpumask_t *);
+int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable);
 
 #endif /* __ASM_MSI_H */
diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
index b6faf83..187fc23 100644
--- a/xen/include/public/physdev.h
+++ b/xen/include/public/physdev.h
@@ -344,6 +344,21 @@ struct physdev_dbgp_op {
 typedef struct physdev_dbgp_op physdev_dbgp_op_t;
 DEFINE_XEN_GUEST_HANDLE(physdev_dbgp_op_t);
 
+#define PHYSDEVOP_MSI_SET_ENABLE_MSI  0
+#define PHYSDEVOP_MSI_SET_ENABLE_MSIX 1
+
+#define PHYSDEVOP_msi_set_enable   32
+struct physdev_msi_set_enable {
+    /* IN */
+    uint16_t seg;
+    uint8_t bus;
+    uint8_t devfn;
+    uint8_t mode;
+    uint8_t enable;
+};
+typedef struct physdev_msi_set_enable physdev_msi_set_enable_t;
+DEFINE_XEN_GUEST_HANDLE(physdev_msi_set_enable_t);
+
 /*
  * Notify that some PIRQ-bound event channels have been unmasked.
  * ** This command is obsolete since interface version 0x00030202 and is **
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 5273320..cbd34a9 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -106,6 +106,7 @@
 ?	physdev_restore_msi		physdev.h
 ?	physdev_set_iopl		physdev.h
 ?	physdev_setup_gsi		physdev.h
+?	physdev_msi_set_enable		physdev.h
 !	pct_register			platform.h
 !	power_register			platform.h
 ?	processor_csd			platform.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index a29d1ef..a10c980 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -520,6 +520,13 @@ static XSM_INLINE int xsm_pci_config_permission(XSM_DEFAULT_ARG struct domain *d
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_msi_set_enable(XSM_DEFAULT_ARG struct domain *d, uint32_t machine_bdf,
+                                         uint8_t mode, uint8_t enable)
+{
+    XSM_ASSERT_ACTION(XSM_DM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_add_to_physmap(XSM_DEFAULT_ARG struct domain *d1, struct domain *d2)
 {
     XSM_ASSERT_ACTION(XSM_TARGET);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3b192b5..6ca4c9e 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -106,6 +106,7 @@ struct xsm_operations {
     int (*iomem_permission) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
     int (*iomem_mapping) (struct domain *d, uint64_t s, uint64_t e, uint8_t allow);
     int (*pci_config_permission) (struct domain *d, uint32_t machine_bdf, uint16_t start, uint16_t end, uint8_t access);
+    int (*msi_set_enable) (struct domain *d, uint32_t machine_bdf, uint8_t mode, uint8_t enable);
 
 #if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
     int (*get_device_group) (uint32_t machine_bdf);
@@ -458,6 +459,11 @@ static inline int xsm_pci_config_permission (xsm_default_t def, struct domain *d
     return xsm_ops->pci_config_permission(d, machine_bdf, start, end, access);
 }
 
+static inline int xsm_msi_set_enable (xsm_default_t def, struct domain *d, uint32_t machine_bdf, uint8_t mode, uint8_t enable)
+{
+    return xsm_ops->msi_set_enable(d, machine_bdf, mode, enable);
+}
+
 #if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
 static inline int xsm_get_device_group(xsm_default_t def, uint32_t machine_bdf)
 {
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 5701047..3a9e275 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -81,6 +81,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, iomem_permission);
     set_to_dummy_if_null(ops, iomem_mapping);
     set_to_dummy_if_null(ops, pci_config_permission);
+    set_to_dummy_if_null(ops, msi_set_enable);
     set_to_dummy_if_null(ops, get_vnumainfo);
 
 #if defined(CONFIG_HAS_PASSTHROUGH) && defined(CONFIG_HAS_PCI)
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 96d31aa..1c201fd 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1084,6 +1084,30 @@ static int flask_pci_config_permission(struct domain *d, uint32_t machine_bdf, u
 
 }
 
+static int flask_msi_set_enable(struct domain *d, uint32_t machine_bdf, uint8_t mode, uint8_t enable)
+{
+    u32 dsid, rsid;
+    int rc = -EPERM;
+    struct avc_audit_data ad;
+    u32 perm;
+
+    AVC_AUDIT_DATA_INIT(&ad, DEV);
+    ad.device = (unsigned long) machine_bdf;
+
+    rc = security_device_sid(machine_bdf, &rsid);
+    if ( rc )
+        return rc;
+
+    perm = flask_iommu_resource_use_perm();
+
+    rc = avc_current_has_perm(rsid, SECCLASS_RESOURCE, perm, &ad);
+    if ( rc )
+        return rc;
+
+    dsid = domain_sid(d);
+    return avc_current_has_perm(dsid, SECCLASS_RESOURCE, RESOURCE__SETUP, &ad);
+}
+
 static int flask_resource_plug_core(void)
 {
     return avc_current_has_perm(SECINITSID_DOMXEN, SECCLASS_RESOURCE, RESOURCE__PLUG, NULL);
@@ -1779,6 +1803,7 @@ static struct xsm_operations flask_ops = {
     .iomem_permission = flask_iomem_permission,
     .iomem_mapping = flask_iomem_mapping,
     .pci_config_permission = flask_pci_config_permission,
+    .msi_set_enable = flask_msi_set_enable,
 
     .resource_plug_core = flask_resource_plug_core,
     .resource_unplug_core = flask_resource_unplug_core,
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 6/6] tools/libxc: add wrapper for PHYSDEVOP_msi_set_enable
  2019-02-07  0:07 [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
                   ` (4 preceding siblings ...)
  2019-02-07  0:07 ` [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki
@ 2019-02-07  0:07 ` Marek Marczykowski-Górecki
  5 siblings, 0 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07  0:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Ian Jackson, Marek Marczykowski-Górecki

Add libxc wrapper for PHYSDEVOP_msi_set_enable introduced in
previous commit.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - new patch
Changes in v4:
 - adjust for updated previous patch
---
 tools/libxc/include/xenctrl.h |  7 +++++++
 tools/libxc/xc_physdev.c      | 21 +++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 31cdda7..879e926 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1639,6 +1639,13 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
                           uint32_t domid,
                           int pirq);
 
+int xc_physdev_msi_set_enable(xc_interface *xch,
+                              int seg,
+                              int bus,
+                              int devfn,
+                              int mode,
+                              int enable);
+
 /*
  *  LOGGING AND ERROR REPORTING
  */
diff --git a/tools/libxc/xc_physdev.c b/tools/libxc/xc_physdev.c
index 460a8e7..af6116f 100644
--- a/tools/libxc/xc_physdev.c
+++ b/tools/libxc/xc_physdev.c
@@ -111,3 +111,24 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
     return rc;
 }
 
+int xc_physdev_msi_set_enable(xc_interface *xch,
+                              int seg,
+                              int bus,
+                              int devfn,
+                              int mode,
+                              int enable)
+{
+    int rc;
+    struct physdev_msi_set_enable op;
+
+    memset(&op, 0, sizeof(struct physdev_msi_set_enable));
+    op.seg = seg;
+    op.bus = bus;
+    op.devfn = devfn;
+    op.mode = mode;
+    op.enable = enable;
+
+    rc = do_physdev_op(xch, PHYSDEVOP_msi_set_enable, &op, sizeof(op));
+
+    return rc;
+}
-- 
git-series 0.9.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07  0:07 ` [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
@ 2019-02-07  9:57   ` Roger Pau Monné
  2019-02-07 13:21     ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-07  9:57 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Thu, Feb 07, 2019 at 01:07:47AM +0100, Marek Marczykowski-Górecki wrote:
> From: Simon Gaiser <simon@invisiblethingslab.com>
> 
> Stubdomains need to be given sufficient privilege over the guest which it
> provides emulation for in order for PCI passthrough to work correctly.
> When a HVM domain try to enable MSI, QEMU in stubdomain calls
> PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> part of xc_domain_update_msi_irq. Allow for that as part of
> PHYSDEVOP_map_pirq.
> 
> This is not needed for PCI INTx, because IRQ in that case is known
> beforehand and the stubdomain is given permissions over this IRQ by
> libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> 
> Based on https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> 
> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> ---
> Changes in v3:
>  - extend commit message
> Changes in v4:
>  - add missing destroy_irq on error path
> 
> With this patch, stubdomain will be able to create and map multiple irq
> (DoS possibility?), as only target domain is validated in practice. Is
> that ok? If not, what additional limits could be applied here?
> In INTx case the problem doesn't apply, because toolstack grant access
> to particular IRQ and no allocation happen on stubdomain request. But in
> MSI case, it isn't that easy as IRQ number isn't known before (as
> explained in the commit message).
> ---
>  xen/arch/x86/irq.c     | 24 ++++++++++++++++++++++++
>  xen/arch/x86/physdev.c |  9 +++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> index 8b44d6c..5e5dcac 100644
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
>          {
>      case MAP_PIRQ_TYPE_MULTI_MSI:
>              irq = create_irq(NUMA_NO_NODE);
> +            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
> +                    current->domain->target == d )
> +            {
> +                ret = irq_permit_access(current->domain, irq);
> +                if ( ret ) {
> +                    dprintk(XENLOG_G_ERR,
> +                            "dom%d: can't grant it's stubdom (%d) access to "
> +                            "irq %d for msi: %d!\n",
> +                            d->domain_id,
> +                            current->domain->domain_id,
> +                            irq,
> +                            ret);
> +                    destroy_irq(irq);
> +                    return ret;

I'm afraid his won't work for devices that support multiple MSI vectors.
Note that map_domain_pirq also has a call to create_irq, and you are
not adding the sutbdom permissions there.

IMO, the safer way to fix this would be to modify create_irq and
destroy_irq so that you give permissions to the subtdomain in the same
place that hardware domain permissions are given. Note that you will
have to change the function to take an extra domain parameter
AFAICT.

Alternatively the permissions could be granted/revoked in
{un}map_domain_pirq, which already contains a call to
irq_access_permitted/irq_deny_access, I think I've suggested this in a
previous version already [0]. This seems less intrusive that modifying
create_irq/destroy_irq if viable.

Thanks, Roger.

[0] https://lists.xenproject.org/archives/html/xen-devel/2019-01/msg01240.html

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-07  0:07 ` [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki
@ 2019-02-07 10:25   ` Roger Pau Monné
  2019-02-27 11:41   ` Jan Beulich
  1 sibling, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-07 10:25 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, xen-devel, Daniel De Graaf

On Thu, Feb 07, 2019 at 01:07:48AM +0100, Marek Marczykowski-Górecki wrote:
> Allow device model running in stubdomain to enable/disable MSI(-X),
> bypassing pciback. While pciback is still used to access config space
> from within stubdomain, it refuse to write to
> PCI_MSI_FLAGS_ENABLE/PCI_MSIX_FLAGS_ENABLE in non-permissive mode. Which
> is the right thing to do for PV domain (the main use case for pciback),
> as PV domain should use XEN_PCI_OP_* commands for that. Unfortunately
> those commands are not good for stubdomain use, as they configure MSI in
> dom0's kernel too, which should not happen for HVM domain.
> 
> This new physdevop is allowed only for stubdomain controlling the domain
> which own the device.
> 
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> ---
> Changes in v3:
>  - new patch
> Changes in v4:
>  - adjust code style
>  - s/msi_msix/msi/
>  - add msi_set_enable XSM hook
>  - flatten struct physdev_msi_set_enable
>  - add to include/xlat.lst
> 
> I'm not sure if XSM part is correct, compile-tested only, as I'm not
> sure how to set the policy.
> ---
>  xen/arch/x86/msi.c            | 24 ++++++++++++++++++++++++
>  xen/arch/x86/physdev.c        | 24 ++++++++++++++++++++++++
>  xen/arch/x86/x86_64/physdev.c |  4 ++++
>  xen/include/asm-x86/msi.h     |  1 +
>  xen/include/public/physdev.h  | 15 +++++++++++++++
>  xen/include/xlat.lst          |  1 +
>  xen/include/xsm/dummy.h       |  7 +++++++
>  xen/include/xsm/xsm.h         |  6 ++++++
>  xen/xsm/dummy.c               |  1 +
>  xen/xsm/flask/hooks.c         | 25 +++++++++++++++++++++++++
>  10 files changed, 108 insertions(+)
> 
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index babc414..c490c67 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -1474,6 +1474,30 @@ int pci_restore_msi_state(struct pci_dev *pdev)
>      return 0;
>  }
>  
> +int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable)
> +{
> +    int ret;
> +
> +    ret = xsm_msi_set_enable(XSM_DM_PRIV, pdev->domain,
> +                             (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn,

There's a helper for this: PCI_SBDF3.

> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 96d31aa..1c201fd 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -1084,6 +1084,30 @@ static int flask_pci_config_permission(struct domain *d, uint32_t machine_bdf, u
>  
>  }
>  
> +static int flask_msi_set_enable(struct domain *d, uint32_t machine_bdf, uint8_t mode, uint8_t enable)
> +{
> +    u32 dsid, rsid;
> +    int rc = -EPERM;
> +    struct avc_audit_data ad;
> +    u32 perm;

Nit: I think we are trying to get away from using u32, so I would use
uint32_t. I know the file makes heavy use of the short u32/64 types
but we should try to not introduce more.

> +
> +    AVC_AUDIT_DATA_INIT(&ad, DEV);
> +    ad.device = (unsigned long) machine_bdf;

Do you really need the cast here?

> +
> +    rc = security_device_sid(machine_bdf, &rsid);
> +    if ( rc )
> +        return rc;
> +
> +    perm = flask_iommu_resource_use_perm();
> +
> +    rc = avc_current_has_perm(rsid, SECCLASS_RESOURCE, perm, &ad);
> +    if ( rc )
> +        return rc;
> +
> +    dsid = domain_sid(d);
> +    return avc_current_has_perm(dsid, SECCLASS_RESOURCE, RESOURCE__SETUP, &ad);

I think you need to add a line to xsm/flask/policy/access_vectors
noting that PHYSDEVOP_msi_set_enable makes use of the setup
permissions (like it's done for PHYSDEVOP_setup_gsi for example).

Anyway I'm not very familiar with XSM, so Daniel might have better
input on this.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07  9:57   ` Roger Pau Monné
@ 2019-02-07 13:21     ` Marek Marczykowski-Górecki
  2019-02-07 13:40       ` Roger Pau Monné
  2019-02-07 14:52       ` Marek Marczykowski-Górecki
  0 siblings, 2 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07 13:21 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper


[-- Attachment #1.1: Type: text/plain, Size: 4349 bytes --]

On Thu, Feb 07, 2019 at 10:57:19AM +0100, Roger Pau Monné wrote:
> On Thu, Feb 07, 2019 at 01:07:47AM +0100, Marek Marczykowski-Górecki wrote:
> > From: Simon Gaiser <simon@invisiblethingslab.com>
> > 
> > Stubdomains need to be given sufficient privilege over the guest which it
> > provides emulation for in order for PCI passthrough to work correctly.
> > When a HVM domain try to enable MSI, QEMU in stubdomain calls
> > PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> > part of xc_domain_update_msi_irq. Allow for that as part of
> > PHYSDEVOP_map_pirq.
> > 
> > This is not needed for PCI INTx, because IRQ in that case is known
> > beforehand and the stubdomain is given permissions over this IRQ by
> > libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> > 
> > Based on https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> > 
> > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > ---
> > Changes in v3:
> >  - extend commit message
> > Changes in v4:
> >  - add missing destroy_irq on error path
> > 
> > With this patch, stubdomain will be able to create and map multiple irq
> > (DoS possibility?), as only target domain is validated in practice. Is
> > that ok? If not, what additional limits could be applied here?
> > In INTx case the problem doesn't apply, because toolstack grant access
> > to particular IRQ and no allocation happen on stubdomain request. But in
> > MSI case, it isn't that easy as IRQ number isn't known before (as
> > explained in the commit message).
> > ---
> >  xen/arch/x86/irq.c     | 24 ++++++++++++++++++++++++
> >  xen/arch/x86/physdev.c |  9 +++++++++
> >  2 files changed, 33 insertions(+)
> > 
> > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > index 8b44d6c..5e5dcac 100644
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> >          {
> >      case MAP_PIRQ_TYPE_MULTI_MSI:
> >              irq = create_irq(NUMA_NO_NODE);
> > +            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
> > +                    current->domain->target == d )
> > +            {
> > +                ret = irq_permit_access(current->domain, irq);
> > +                if ( ret ) {
> > +                    dprintk(XENLOG_G_ERR,
> > +                            "dom%d: can't grant it's stubdom (%d) access to "
> > +                            "irq %d for msi: %d!\n",
> > +                            d->domain_id,
> > +                            current->domain->domain_id,
> > +                            irq,
> > +                            ret);
> > +                    destroy_irq(irq);
> > +                    return ret;
> 
> I'm afraid his won't work for devices that support multiple MSI vectors.
> Note that map_domain_pirq also has a call to create_irq, and you are
> not adding the sutbdom permissions there.
> 
> IMO, the safer way to fix this would be to modify create_irq and
> destroy_irq so that you give permissions to the subtdomain in the same
> place that hardware domain permissions are given. Note that you will
> have to change the function to take an extra domain parameter
> AFAICT.

That may be a good idea, I'll try.

> Alternatively the permissions could be granted/revoked in
> {un}map_domain_pirq, which already contains a call to
> irq_access_permitted/irq_deny_access, I think I've suggested this in a
> previous version already [0]. This seems less intrusive that modifying
> create_irq/destroy_irq if viable.

I've already tried that. And as responded there, it won't fly, as the
first thing map_domain_pirq does is checking irq permission, which would
fail for irq allocated by allocate_and_map_msi_pirq.

> Thanks, Roger.
> 
> [0] https://lists.xenproject.org/archives/html/xen-devel/2019-01/msg01240.html

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07 13:21     ` Marek Marczykowski-Górecki
@ 2019-02-07 13:40       ` Roger Pau Monné
  2019-02-07 14:52       ` Marek Marczykowski-Górecki
  1 sibling, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-07 13:40 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Thu, Feb 07, 2019 at 02:21:24PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Feb 07, 2019 at 10:57:19AM +0100, Roger Pau Monné wrote:
> > On Thu, Feb 07, 2019 at 01:07:47AM +0100, Marek Marczykowski-Górecki wrote:
> > > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > > index 8b44d6c..5e5dcac 100644
> > > --- a/xen/arch/x86/irq.c
> > > +++ b/xen/arch/x86/irq.c
> > > @@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> > >          {
> > >      case MAP_PIRQ_TYPE_MULTI_MSI:
> > >              irq = create_irq(NUMA_NO_NODE);
> > > +            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
> > > +                    current->domain->target == d )
> > > +            {
> > > +                ret = irq_permit_access(current->domain, irq);
> > > +                if ( ret ) {
> > > +                    dprintk(XENLOG_G_ERR,
> > > +                            "dom%d: can't grant it's stubdom (%d) access to "
> > > +                            "irq %d for msi: %d!\n",
> > > +                            d->domain_id,
> > > +                            current->domain->domain_id,
> > > +                            irq,
> > > +                            ret);
> > > +                    destroy_irq(irq);
> > > +                    return ret;
> > 
> > I'm afraid his won't work for devices that support multiple MSI vectors.
> > Note that map_domain_pirq also has a call to create_irq, and you are
> > not adding the sutbdom permissions there.
> > 
> > IMO, the safer way to fix this would be to modify create_irq and
> > destroy_irq so that you give permissions to the subtdomain in the same
> > place that hardware domain permissions are given. Note that you will
> > have to change the function to take an extra domain parameter
> > AFAICT.
> 
> That may be a good idea, I'll try.
> 
> > Alternatively the permissions could be granted/revoked in
> > {un}map_domain_pirq, which already contains a call to
> > irq_access_permitted/irq_deny_access, I think I've suggested this in a
> > previous version already [0]. This seems less intrusive that modifying
> > create_irq/destroy_irq if viable.
> 
> I've already tried that. And as responded there, it won't fly, as the
> first thing map_domain_pirq does is checking irq permission, which would
> fail for irq allocated by allocate_and_map_msi_pirq.

Oh sorry, that thread then went on about the addition of the new
hypercall and I likely missed that reply.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07 13:21     ` Marek Marczykowski-Górecki
  2019-02-07 13:40       ` Roger Pau Monné
@ 2019-02-07 14:52       ` Marek Marczykowski-Górecki
  2019-02-07 14:57         ` Roger Pau Monné
  1 sibling, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07 14:52 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper


[-- Attachment #1.1: Type: text/plain, Size: 4373 bytes --]

On Thu, Feb 07, 2019 at 02:21:27PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Feb 07, 2019 at 10:57:19AM +0100, Roger Pau Monné wrote:
> > On Thu, Feb 07, 2019 at 01:07:47AM +0100, Marek Marczykowski-Górecki wrote:
> > > From: Simon Gaiser <simon@invisiblethingslab.com>
> > > 
> > > Stubdomains need to be given sufficient privilege over the guest which it
> > > provides emulation for in order for PCI passthrough to work correctly.
> > > When a HVM domain try to enable MSI, QEMU in stubdomain calls
> > > PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> > > part of xc_domain_update_msi_irq. Allow for that as part of
> > > PHYSDEVOP_map_pirq.
> > > 
> > > This is not needed for PCI INTx, because IRQ in that case is known
> > > beforehand and the stubdomain is given permissions over this IRQ by
> > > libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> > > 
> > > Based on https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> > > 
> > > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > > ---
> > > Changes in v3:
> > >  - extend commit message
> > > Changes in v4:
> > >  - add missing destroy_irq on error path
> > > 
> > > With this patch, stubdomain will be able to create and map multiple irq
> > > (DoS possibility?), as only target domain is validated in practice. Is
> > > that ok? If not, what additional limits could be applied here?
> > > In INTx case the problem doesn't apply, because toolstack grant access
> > > to particular IRQ and no allocation happen on stubdomain request. But in
> > > MSI case, it isn't that easy as IRQ number isn't known before (as
> > > explained in the commit message).
> > > ---
> > >  xen/arch/x86/irq.c     | 24 ++++++++++++++++++++++++
> > >  xen/arch/x86/physdev.c |  9 +++++++++
> > >  2 files changed, 33 insertions(+)
> > > 
> > > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > > index 8b44d6c..5e5dcac 100644
> > > --- a/xen/arch/x86/irq.c
> > > +++ b/xen/arch/x86/irq.c
> > > @@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> > >          {
> > >      case MAP_PIRQ_TYPE_MULTI_MSI:
> > >              irq = create_irq(NUMA_NO_NODE);
> > > +            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
> > > +                    current->domain->target == d )
> > > +            {
> > > +                ret = irq_permit_access(current->domain, irq);
> > > +                if ( ret ) {
> > > +                    dprintk(XENLOG_G_ERR,
> > > +                            "dom%d: can't grant it's stubdom (%d) access to "
> > > +                            "irq %d for msi: %d!\n",
> > > +                            d->domain_id,
> > > +                            current->domain->domain_id,
> > > +                            irq,
> > > +                            ret);
> > > +                    destroy_irq(irq);
> > > +                    return ret;
> > 
> > I'm afraid his won't work for devices that support multiple MSI vectors.
> > Note that map_domain_pirq also has a call to create_irq, and you are
> > not adding the sutbdom permissions there.
> > 
> > IMO, the safer way to fix this would be to modify create_irq and
> > destroy_irq so that you give permissions to the subtdomain in the same
> > place that hardware domain permissions are given. Note that you will
> > have to change the function to take an extra domain parameter
> > AFAICT.
> 
> That may be a good idea, I'll try.

Hmm, looking at the code, wouldn't it make sense to give device model
domain access to the IRQ _instead of_ hardware domain? If stubdomain is
in use, I don't see why dom0 would need access to that irq. Simply
provide what the device model domain is as parameter - either
hardware_domain, or stubdomain. Something like:

    create_irq(..., current->domain->target == d ? current->domain : hardware_domain);

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07 14:52       ` Marek Marczykowski-Górecki
@ 2019-02-07 14:57         ` Roger Pau Monné
  2019-02-07 15:41           ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-07 14:57 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Thu, Feb 07, 2019 at 03:52:38PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Feb 07, 2019 at 02:21:27PM +0100, Marek Marczykowski-Górecki wrote:
> > On Thu, Feb 07, 2019 at 10:57:19AM +0100, Roger Pau Monné wrote:
> > > On Thu, Feb 07, 2019 at 01:07:47AM +0100, Marek Marczykowski-Górecki wrote:
> > > > From: Simon Gaiser <simon@invisiblethingslab.com>
> > > > 
> > > > Stubdomains need to be given sufficient privilege over the guest which it
> > > > provides emulation for in order for PCI passthrough to work correctly.
> > > > When a HVM domain try to enable MSI, QEMU in stubdomain calls
> > > > PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> > > > part of xc_domain_update_msi_irq. Allow for that as part of
> > > > PHYSDEVOP_map_pirq.
> > > > 
> > > > This is not needed for PCI INTx, because IRQ in that case is known
> > > > beforehand and the stubdomain is given permissions over this IRQ by
> > > > libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> > > > 
> > > > Based on https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> > > > 
> > > > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > > > ---
> > > > Changes in v3:
> > > >  - extend commit message
> > > > Changes in v4:
> > > >  - add missing destroy_irq on error path
> > > > 
> > > > With this patch, stubdomain will be able to create and map multiple irq
> > > > (DoS possibility?), as only target domain is validated in practice. Is
> > > > that ok? If not, what additional limits could be applied here?
> > > > In INTx case the problem doesn't apply, because toolstack grant access
> > > > to particular IRQ and no allocation happen on stubdomain request. But in
> > > > MSI case, it isn't that easy as IRQ number isn't known before (as
> > > > explained in the commit message).
> > > > ---
> > > >  xen/arch/x86/irq.c     | 24 ++++++++++++++++++++++++
> > > >  xen/arch/x86/physdev.c |  9 +++++++++
> > > >  2 files changed, 33 insertions(+)
> > > > 
> > > > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > > > index 8b44d6c..5e5dcac 100644
> > > > --- a/xen/arch/x86/irq.c
> > > > +++ b/xen/arch/x86/irq.c
> > > > @@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> > > >          {
> > > >      case MAP_PIRQ_TYPE_MULTI_MSI:
> > > >              irq = create_irq(NUMA_NO_NODE);
> > > > +            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
> > > > +                    current->domain->target == d )
> > > > +            {
> > > > +                ret = irq_permit_access(current->domain, irq);
> > > > +                if ( ret ) {
> > > > +                    dprintk(XENLOG_G_ERR,
> > > > +                            "dom%d: can't grant it's stubdom (%d) access to "
> > > > +                            "irq %d for msi: %d!\n",
> > > > +                            d->domain_id,
> > > > +                            current->domain->domain_id,
> > > > +                            irq,
> > > > +                            ret);
> > > > +                    destroy_irq(irq);
> > > > +                    return ret;
> > > 
> > > I'm afraid his won't work for devices that support multiple MSI vectors.
> > > Note that map_domain_pirq also has a call to create_irq, and you are
> > > not adding the sutbdom permissions there.
> > > 
> > > IMO, the safer way to fix this would be to modify create_irq and
> > > destroy_irq so that you give permissions to the subtdomain in the same
> > > place that hardware domain permissions are given. Note that you will
> > > have to change the function to take an extra domain parameter
> > > AFAICT.
> > 
> > That may be a good idea, I'll try.
> 
> Hmm, looking at the code, wouldn't it make sense to give device model
> domain access to the IRQ _instead of_ hardware domain? If stubdomain is
> in use, I don't see why dom0 would need access to that irq. Simply
> provide what the device model domain is as parameter - either
> hardware_domain, or stubdomain. Something like:
> 
>     create_irq(..., current->domain->target == d ? current->domain : hardware_domain);

Isn't there some cleanup that likely needs to be done by dom0 if it's
not done by the stubdom, or in case the stubdom crashes for some
reason?

Or maybe that's already done on domain destruction by Xen itself, in
which case not giving permissions to dom0 would be fine.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07 14:57         ` Roger Pau Monné
@ 2019-02-07 15:41           ` Marek Marczykowski-Górecki
  2019-02-07 17:40             ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07 15:41 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper


[-- Attachment #1.1: Type: text/plain, Size: 5902 bytes --]

On Thu, Feb 07, 2019 at 03:57:54PM +0100, Roger Pau Monné wrote:
> On Thu, Feb 07, 2019 at 03:52:38PM +0100, Marek Marczykowski-Górecki wrote:
> > On Thu, Feb 07, 2019 at 02:21:27PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Thu, Feb 07, 2019 at 10:57:19AM +0100, Roger Pau Monné wrote:
> > > > On Thu, Feb 07, 2019 at 01:07:47AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > From: Simon Gaiser <simon@invisiblethingslab.com>
> > > > > 
> > > > > Stubdomains need to be given sufficient privilege over the guest which it
> > > > > provides emulation for in order for PCI passthrough to work correctly.
> > > > > When a HVM domain try to enable MSI, QEMU in stubdomain calls
> > > > > PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> > > > > part of xc_domain_update_msi_irq. Allow for that as part of
> > > > > PHYSDEVOP_map_pirq.
> > > > > 
> > > > > This is not needed for PCI INTx, because IRQ in that case is known
> > > > > beforehand and the stubdomain is given permissions over this IRQ by
> > > > > libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> > > > > 
> > > > > Based on https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> > > > > 
> > > > > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> > > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > > > > ---
> > > > > Changes in v3:
> > > > >  - extend commit message
> > > > > Changes in v4:
> > > > >  - add missing destroy_irq on error path
> > > > > 
> > > > > With this patch, stubdomain will be able to create and map multiple irq
> > > > > (DoS possibility?), as only target domain is validated in practice. Is
> > > > > that ok? If not, what additional limits could be applied here?
> > > > > In INTx case the problem doesn't apply, because toolstack grant access
> > > > > to particular IRQ and no allocation happen on stubdomain request. But in
> > > > > MSI case, it isn't that easy as IRQ number isn't known before (as
> > > > > explained in the commit message).
> > > > > ---
> > > > >  xen/arch/x86/irq.c     | 24 ++++++++++++++++++++++++
> > > > >  xen/arch/x86/physdev.c |  9 +++++++++
> > > > >  2 files changed, 33 insertions(+)
> > > > > 
> > > > > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > > > > index 8b44d6c..5e5dcac 100644
> > > > > --- a/xen/arch/x86/irq.c
> > > > > +++ b/xen/arch/x86/irq.c
> > > > > @@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> > > > >          {
> > > > >      case MAP_PIRQ_TYPE_MULTI_MSI:
> > > > >              irq = create_irq(NUMA_NO_NODE);
> > > > > +            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
> > > > > +                    current->domain->target == d )
> > > > > +            {
> > > > > +                ret = irq_permit_access(current->domain, irq);
> > > > > +                if ( ret ) {
> > > > > +                    dprintk(XENLOG_G_ERR,
> > > > > +                            "dom%d: can't grant it's stubdom (%d) access to "
> > > > > +                            "irq %d for msi: %d!\n",
> > > > > +                            d->domain_id,
> > > > > +                            current->domain->domain_id,
> > > > > +                            irq,
> > > > > +                            ret);
> > > > > +                    destroy_irq(irq);
> > > > > +                    return ret;
> > > > 
> > > > I'm afraid his won't work for devices that support multiple MSI vectors.
> > > > Note that map_domain_pirq also has a call to create_irq, and you are
> > > > not adding the sutbdom permissions there.
> > > > 
> > > > IMO, the safer way to fix this would be to modify create_irq and
> > > > destroy_irq so that you give permissions to the subtdomain in the same
> > > > place that hardware domain permissions are given. Note that you will
> > > > have to change the function to take an extra domain parameter
> > > > AFAICT.
> > > 
> > > That may be a good idea, I'll try.
> > 
> > Hmm, looking at the code, wouldn't it make sense to give device model
> > domain access to the IRQ _instead of_ hardware domain? If stubdomain is
> > in use, I don't see why dom0 would need access to that irq. Simply
> > provide what the device model domain is as parameter - either
> > hardware_domain, or stubdomain. Something like:
> > 
> >     create_irq(..., current->domain->target == d ? current->domain : hardware_domain);
> 
> Isn't there some cleanup that likely needs to be done by dom0 if it's
> not done by the stubdom, or in case the stubdom crashes for some
> reason?

I don't think toolstack know anything about IRQs allocated by device
model, looks like it does cleanup only for INTx interrupts.

> Or maybe that's already done on domain destruction by Xen itself, in
> which case not giving permissions to dom0 would be fine.

There is free_domain_pirqs() call in arch_domain_destroy(). But I don't
have device model reference there. Is there a way to get target ->
stubdomain mapping (other than iterating over all the domains)? I see
also domain->target field, which is the other way around.
The only thing needed is irq_deny_access() call there (in case of domain
ID reuse). Since such IRQs are not mapped to stubdomain itself,
free_domain_pirqs() for stubdomain will not clean this up.
Or maybe, _if stubdomain is guaranteed to be destroyed before its
target_, we can iterate over target domain's IRQs during stubdomain
destruction for this purpose?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07 15:41           ` Marek Marczykowski-Górecki
@ 2019-02-07 17:40             ` Roger Pau Monné
  2019-02-07 17:51               ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-07 17:40 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Thu, Feb 07, 2019 at 04:41:38PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Feb 07, 2019 at 03:57:54PM +0100, Roger Pau Monné wrote:
> > On Thu, Feb 07, 2019 at 03:52:38PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Thu, Feb 07, 2019 at 02:21:27PM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Thu, Feb 07, 2019 at 10:57:19AM +0100, Roger Pau Monné wrote:
> > > > > On Thu, Feb 07, 2019 at 01:07:47AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > From: Simon Gaiser <simon@invisiblethingslab.com>
> > > > > > 
> > > > > > Stubdomains need to be given sufficient privilege over the guest which it
> > > > > > provides emulation for in order for PCI passthrough to work correctly.
> > > > > > When a HVM domain try to enable MSI, QEMU in stubdomain calls
> > > > > > PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> > > > > > part of xc_domain_update_msi_irq. Allow for that as part of
> > > > > > PHYSDEVOP_map_pirq.
> > > > > > 
> > > > > > This is not needed for PCI INTx, because IRQ in that case is known
> > > > > > beforehand and the stubdomain is given permissions over this IRQ by
> > > > > > libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> > > > > > 
> > > > > > Based on https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> > > > > > 
> > > > > > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> > > > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > > > > > ---
> > > > > > Changes in v3:
> > > > > >  - extend commit message
> > > > > > Changes in v4:
> > > > > >  - add missing destroy_irq on error path
> > > > > > 
> > > > > > With this patch, stubdomain will be able to create and map multiple irq
> > > > > > (DoS possibility?), as only target domain is validated in practice. Is
> > > > > > that ok? If not, what additional limits could be applied here?
> > > > > > In INTx case the problem doesn't apply, because toolstack grant access
> > > > > > to particular IRQ and no allocation happen on stubdomain request. But in
> > > > > > MSI case, it isn't that easy as IRQ number isn't known before (as
> > > > > > explained in the commit message).
> > > > > > ---
> > > > > >  xen/arch/x86/irq.c     | 24 ++++++++++++++++++++++++
> > > > > >  xen/arch/x86/physdev.c |  9 +++++++++
> > > > > >  2 files changed, 33 insertions(+)
> > > > > > 
> > > > > > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > > > > > index 8b44d6c..5e5dcac 100644
> > > > > > --- a/xen/arch/x86/irq.c
> > > > > > +++ b/xen/arch/x86/irq.c
> > > > > > @@ -2674,6 +2674,22 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> > > > > >          {
> > > > > >      case MAP_PIRQ_TYPE_MULTI_MSI:
> > > > > >              irq = create_irq(NUMA_NO_NODE);
> > > > > > +            if ( !(irq < nr_irqs_gsi || irq >= nr_irqs) &&
> > > > > > +                    current->domain->target == d )
> > > > > > +            {
> > > > > > +                ret = irq_permit_access(current->domain, irq);
> > > > > > +                if ( ret ) {
> > > > > > +                    dprintk(XENLOG_G_ERR,
> > > > > > +                            "dom%d: can't grant it's stubdom (%d) access to "
> > > > > > +                            "irq %d for msi: %d!\n",
> > > > > > +                            d->domain_id,
> > > > > > +                            current->domain->domain_id,
> > > > > > +                            irq,
> > > > > > +                            ret);
> > > > > > +                    destroy_irq(irq);
> > > > > > +                    return ret;
> > > > > 
> > > > > I'm afraid his won't work for devices that support multiple MSI vectors.
> > > > > Note that map_domain_pirq also has a call to create_irq, and you are
> > > > > not adding the sutbdom permissions there.
> > > > > 
> > > > > IMO, the safer way to fix this would be to modify create_irq and
> > > > > destroy_irq so that you give permissions to the subtdomain in the same
> > > > > place that hardware domain permissions are given. Note that you will
> > > > > have to change the function to take an extra domain parameter
> > > > > AFAICT.
> > > > 
> > > > That may be a good idea, I'll try.
> > > 
> > > Hmm, looking at the code, wouldn't it make sense to give device model
> > > domain access to the IRQ _instead of_ hardware domain? If stubdomain is
> > > in use, I don't see why dom0 would need access to that irq. Simply
> > > provide what the device model domain is as parameter - either
> > > hardware_domain, or stubdomain. Something like:
> > > 
> > >     create_irq(..., current->domain->target == d ? current->domain : hardware_domain);
> > 
> > Isn't there some cleanup that likely needs to be done by dom0 if it's
> > not done by the stubdom, or in case the stubdom crashes for some
> > reason?
> 
> I don't think toolstack know anything about IRQs allocated by device
> model, looks like it does cleanup only for INTx interrupts.
> 
> > Or maybe that's already done on domain destruction by Xen itself, in
> > which case not giving permissions to dom0 would be fine.
> 
> There is free_domain_pirqs() call in arch_domain_destroy(). But I don't
> have device model reference there. Is there a way to get target ->
> stubdomain mapping (other than iterating over all the domains)? I see
> also domain->target field, which is the other way around.
> The only thing needed is irq_deny_access() call there (in case of domain
> ID reuse). Since such IRQs are not mapped to stubdomain itself,
> free_domain_pirqs() for stubdomain will not clean this up.
> Or maybe, _if stubdomain is guaranteed to be destroyed before its
> target_, we can iterate over target domain's IRQs during stubdomain
> destruction for this purpose?

The list of allowed irqs is stored inside of the domain struct,
which means that it goes away when the domain is destroyed, there's no
need to do any specific cleanup when the stubdomain is destroyed
AFAICT. Now if the target domain is destroyed, those permissions over
the irqs must be removed from the stubdomain, because the irqs will be
freed and likely reused. The current model assumes that the hardware
domain is always the controlling owner of such irqs, but if we allow
stubdomains to also be the controlling owner then we need to keep some
track of this, or else Xen could be leaking permissions.

I'm certainly open to suggestions in order to track this relation,
maybe a new field in the domain struct that points to it's stubdom if
it has one? That would be kind of limiting since we would then assume
a domain can only have a single stubdomain.

Maybe introduce some structure to keep track of irqs that contains a
reference to the owner and the target?

Sorry this all likely more complicated that you expected.

Finally I agree with you there's no need for the hardware domain to
have permissions over this irqs, the stubdomain and the target being
the only ones having permissions LGTM.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07 17:40             ` Roger Pau Monné
@ 2019-02-07 17:51               ` Marek Marczykowski-Górecki
  2019-02-08  9:35                 ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-07 17:51 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper


[-- Attachment #1.1: Type: text/plain, Size: 3172 bytes --]

On Thu, Feb 07, 2019 at 06:40:16PM +0100, Roger Pau Monné wrote:
> On Thu, Feb 07, 2019 at 04:41:38PM +0100, Marek Marczykowski-Górecki wrote:
> > On Thu, Feb 07, 2019 at 03:57:54PM +0100, Roger Pau Monné wrote:
> > > On Thu, Feb 07, 2019 at 03:52:38PM +0100, Marek Marczykowski-Górecki wrote:
> > > > Hmm, looking at the code, wouldn't it make sense to give device model
> > > > domain access to the IRQ _instead of_ hardware domain? If stubdomain is
> > > > in use, I don't see why dom0 would need access to that irq. Simply
> > > > provide what the device model domain is as parameter - either
> > > > hardware_domain, or stubdomain. Something like:
> > > > 
> > > >     create_irq(..., current->domain->target == d ? current->domain : hardware_domain);
> > > 
> > > Isn't there some cleanup that likely needs to be done by dom0 if it's
> > > not done by the stubdom, or in case the stubdom crashes for some
> > > reason?
> > 
> > I don't think toolstack know anything about IRQs allocated by device
> > model, looks like it does cleanup only for INTx interrupts.
> > 
> > > Or maybe that's already done on domain destruction by Xen itself, in
> > > which case not giving permissions to dom0 would be fine.
> > 
> > There is free_domain_pirqs() call in arch_domain_destroy(). But I don't
> > have device model reference there. Is there a way to get target ->
> > stubdomain mapping (other than iterating over all the domains)? I see
> > also domain->target field, which is the other way around.
> > The only thing needed is irq_deny_access() call there (in case of domain
> > ID reuse). Since such IRQs are not mapped to stubdomain itself,
> > free_domain_pirqs() for stubdomain will not clean this up.
> > Or maybe, _if stubdomain is guaranteed to be destroyed before its
> > target_, we can iterate over target domain's IRQs during stubdomain
> > destruction for this purpose?
> 
> The list of allowed irqs is stored inside of the domain struct,
> which means that it goes away when the domain is destroyed, there's no
> need to do any specific cleanup when the stubdomain is destroyed
> AFAICT. Now if the target domain is destroyed, those permissions over
> the irqs must be removed from the stubdomain, because the irqs will be
> freed and likely reused. The current model assumes that the hardware
> domain is always the controlling owner of such irqs, but if we allow
> stubdomains to also be the controlling owner then we need to keep some
> track of this, or else Xen could be leaking permissions.

This looks to be not a problem, because stubdomain keeps reference to
its target domain (and there is corresponding put_domain(d->target) on
domain destroy), so to answer my own question "if stubdomain is
guaranteed to be destroyed before its target" - yes, it is.

So, if domain destruction also implicitly revoke all _its_ irq
permissions (because of where they are stored), there is no additional
cleanup needed here.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-07 17:51               ` Marek Marczykowski-Górecki
@ 2019-02-08  9:35                 ` Roger Pau Monné
  2019-02-08 10:15                   ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-08  9:35 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Thu, Feb 07, 2019 at 06:51:57PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Feb 07, 2019 at 06:40:16PM +0100, Roger Pau Monné wrote:
> > On Thu, Feb 07, 2019 at 04:41:38PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Thu, Feb 07, 2019 at 03:57:54PM +0100, Roger Pau Monné wrote:
> > > > On Thu, Feb 07, 2019 at 03:52:38PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > Hmm, looking at the code, wouldn't it make sense to give device model
> > > > > domain access to the IRQ _instead of_ hardware domain? If stubdomain is
> > > > > in use, I don't see why dom0 would need access to that irq. Simply
> > > > > provide what the device model domain is as parameter - either
> > > > > hardware_domain, or stubdomain. Something like:
> > > > > 
> > > > >     create_irq(..., current->domain->target == d ? current->domain : hardware_domain);
> > > > 
> > > > Isn't there some cleanup that likely needs to be done by dom0 if it's
> > > > not done by the stubdom, or in case the stubdom crashes for some
> > > > reason?
> > > 
> > > I don't think toolstack know anything about IRQs allocated by device
> > > model, looks like it does cleanup only for INTx interrupts.
> > > 
> > > > Or maybe that's already done on domain destruction by Xen itself, in
> > > > which case not giving permissions to dom0 would be fine.
> > > 
> > > There is free_domain_pirqs() call in arch_domain_destroy(). But I don't
> > > have device model reference there. Is there a way to get target ->
> > > stubdomain mapping (other than iterating over all the domains)? I see
> > > also domain->target field, which is the other way around.
> > > The only thing needed is irq_deny_access() call there (in case of domain
> > > ID reuse). Since such IRQs are not mapped to stubdomain itself,
> > > free_domain_pirqs() for stubdomain will not clean this up.
> > > Or maybe, _if stubdomain is guaranteed to be destroyed before its
> > > target_, we can iterate over target domain's IRQs during stubdomain
> > > destruction for this purpose?
> > 
> > The list of allowed irqs is stored inside of the domain struct,
> > which means that it goes away when the domain is destroyed, there's no
> > need to do any specific cleanup when the stubdomain is destroyed
> > AFAICT. Now if the target domain is destroyed, those permissions over
> > the irqs must be removed from the stubdomain, because the irqs will be
> > freed and likely reused. The current model assumes that the hardware
> > domain is always the controlling owner of such irqs, but if we allow
> > stubdomains to also be the controlling owner then we need to keep some
> > track of this, or else Xen could be leaking permissions.
> 
> This looks to be not a problem, because stubdomain keeps reference to
> its target domain (and there is corresponding put_domain(d->target) on
> domain destroy), so to answer my own question "if stubdomain is
> guaranteed to be destroyed before its target" - yes, it is.

Oh, that's good to know. So stubdomain will always be destroyed first,
making sure the irq permissions are removed from the stubdomain before
destroying the target domain and unmapping/destroying the irqs.

> So, if domain destruction also implicitly revoke all _its_ irq
> permissions (because of where they are stored), there is no additional
> cleanup needed here.

Not on domain destroy AFAICT.

What about hot-unplug? The proper flow there would be to ask the
stubdomain to detach the device, and only deassign it after the
stubdomain has reported successful detach. I think that's already the
case.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-08  9:35                 ` Roger Pau Monné
@ 2019-02-08 10:15                   ` Marek Marczykowski-Górecki
  2019-02-08 10:17                     ` [PATCH v4.1 " Marek Marczykowski-Górecki
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-08 10:15 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Simon Gaiser, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper


[-- Attachment #1.1: Type: text/plain, Size: 944 bytes --]

On Fri, Feb 08, 2019 at 10:35:37AM +0100, Roger Pau Monné wrote:
> What about hot-unplug? The proper flow there would be to ask the
> stubdomain to detach the device, and only deassign it after the
> stubdomain has reported successful detach. I think that's already the
> case.

I think so.

The approach with handling stubdomain permission in {create,destroy}_irq
seems to work, the patch will follow. 
There is one code path where I haven't managed to properly extract
possible stubdomain in use:
pci_remove_device()
 -> pci_cleanup_msi()
   -> msi_free_irqs()
     -> msi_free_irq()
       -> destroy_irq()
    
For now I've hardcoded hardware_domain there (in msi_free_irqs). Can it
happen when device is still assigned to some domU?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-08 10:15                   ` Marek Marczykowski-Górecki
@ 2019-02-08 10:17                     ` Marek Marczykowski-Górecki
  2019-02-21 16:47                       ` Roger Pau Monné
  2019-02-27 11:07                       ` Jan Beulich
  0 siblings, 2 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-08 10:17 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Marek Marczykowski-Górecki, Tim Deegan, Simon Gaiser,
	Julien Grall, Jan Beulich, Brian Woods, Roger Pau Monné

Stubdomains need to be given sufficient privilege over the guest which it
provides emulation for in order for PCI passthrough to work correctly.
When a HVM domain try to enable MSI, QEMU in stubdomain calls
PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
part of xc_domain_update_msi_irq. Allow for that as part of
PHYSDEVOP_map_pirq.

This is not needed for PCI INTx, because IRQ in that case is known
beforehand and the stubdomain is given permissions over this IRQ by
libxl__device_pci_add (there's a do_pci_add against the stubdomain).

create_irq() already grant IRQ access to hardware_domain, with
assumption the device model (something managing this IRQ) lives there.
Modify create_irq() to take additional parameter pointing at device
model domain - which may be dom0 or stubdomain. Do the same with
destroy_irq() (and msi_free_irq() which calls it) to reverse the
operation.

Then, adjust all callers to provide the parameter. In case of calls not
related to stubdomain-initiated allocations, give it hardware_domain, so
the behavior is unchanged there.

Inspired by https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.

Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
 - extend commit message
Changes in v4:
 - add missing destroy_irq on error path
Changes in v4.1:
 - move irq_{grant,revoke}_access() to {create,destroy}_irq(), which
   basically make it a different patch

There is one code path where I haven't managed to properly extract
possible stubdomain in use:
pci_remove_device()
 -> pci_cleanup_msi()
   -> msi_free_irqs()
     -> msi_free_irq()
       -> destroy_irq()

For now I've hardcoded hardware_domain there (in msi_free_irqs). Can it happen
when device is still assigned to some domU?
---
 xen/arch/x86/hpet.c                      |  5 +--
 xen/arch/x86/irq.c                       | 46 ++++++++++++++----------
 xen/arch/x86/msi.c                       |  6 ++--
 xen/drivers/char/ns16550.c               |  6 ++--
 xen/drivers/passthrough/amd/iommu_init.c |  4 +--
 xen/drivers/passthrough/vtd/iommu.c      |  7 ++--
 xen/include/asm-x86/irq.h                |  4 +--
 xen/include/asm-x86/msi.h                |  2 +-
 8 files changed, 46 insertions(+), 34 deletions(-)

diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 4b08488ef1..6db71dfd71 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -11,6 +11,7 @@
 #include <xen/softirq.h>
 #include <xen/irq.h>
 #include <xen/numa.h>
+#include <xen/sched.h>
 #include <asm/fixmap.h>
 #include <asm/div64.h>
 #include <asm/hpet.h>
@@ -368,13 +369,13 @@ static int __init hpet_assign_irq(struct hpet_event_channel *ch)
 {
     int irq;
 
-    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
+    if ( (irq = create_irq(NUMA_NO_NODE, hardware_domain)) < 0 )
         return irq;
 
     ch->msi.irq = irq;
     if ( hpet_setup_msi_irq(ch) )
     {
-        destroy_irq(irq);
+        destroy_irq(irq, hardware_domain);
         return -EINVAL;
     }
 
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 8b44d6ce0b..d41b32b2f4 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -157,7 +157,7 @@ int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
 /*
  * Dynamic irq allocate and deallocation for MSI
  */
-int create_irq(nodeid_t node)
+int create_irq(nodeid_t node, struct domain *dm_domain)
 {
     int irq, ret;
     struct irq_desc *desc;
@@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
         desc->arch.used = IRQ_UNUSED;
         irq = ret;
     }
-    else if ( hardware_domain )
+    else if ( dm_domain )
     {
-        ret = irq_permit_access(hardware_domain, irq);
+        ret = irq_permit_access(dm_domain, irq);
         if ( ret )
             printk(XENLOG_G_ERR
-                   "Could not grant Dom0 access to IRQ%d (error %d)\n",
-                   irq, ret);
+                   "Could not grant Dom%u access to IRQ%d (error %d)\n",
+                   dm_domain->domain_id, irq, ret);
     }
 
     return irq;
 }
 
-void destroy_irq(unsigned int irq)
+void destroy_irq(unsigned int irq, struct domain *dm_domain)
 {
     struct irq_desc *desc = irq_to_desc(irq);
     unsigned long flags;
@@ -210,14 +210,14 @@ void destroy_irq(unsigned int irq)
 
     BUG_ON(!MSI_IRQ(irq));
 
-    if ( hardware_domain )
+    if ( dm_domain )
     {
-        int err = irq_deny_access(hardware_domain, irq);
+        int err = irq_deny_access(dm_domain, irq);
 
         if ( err )
             printk(XENLOG_G_ERR
-                   "Could not revoke Dom0 access to IRQ%u (error %d)\n",
-                   irq, err);
+                   "Could not revoke Dom%u access to IRQ%u (error %d)\n",
+                   dm_domain->domain_id, irq, err);
     }
 
     spin_lock_irqsave(&desc->lock, flags);
@@ -2010,7 +2010,9 @@ int map_domain_pirq(
                     d->domain_id, irq);
             pci_disable_msi(msi_desc);
             msi_desc->irq = -1;
-            msi_free_irq(msi_desc);
+            msi_free_irq(msi_desc,
+                         current->domain->target == d ? current->domain
+                                                      : hardware_domain);
             ret = -EBUSY;
             goto done;
         }
@@ -2038,7 +2040,9 @@ int map_domain_pirq(
             spin_unlock_irqrestore(&desc->lock, flags);
 
             info = NULL;
-            irq = create_irq(NUMA_NO_NODE);
+            irq = create_irq(NUMA_NO_NODE,
+                             current->domain->target == d ? current->domain
+                                                          : hardware_domain);
             ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, &info)
                            : irq;
             if ( ret < 0 )
@@ -2095,7 +2099,9 @@ int map_domain_pirq(
                 irq = info->arch.irq;
             }
             msi_desc->irq = -1;
-            msi_free_irq(msi_desc);
+            msi_free_irq(msi_desc,
+                         current->domain->target == d ? current->domain
+                                                      : hardware_domain);
             goto done;
         }
 
@@ -2255,7 +2261,9 @@ int unmap_domain_pirq(struct domain *d, int pirq)
     }
 
     if (msi_desc)
-        msi_free_irq(msi_desc);
+        msi_free_irq(msi_desc,
+                     current->domain->target == d ? current->domain
+                                                  : hardware_domain);
 
  done:
     return ret;
@@ -2671,10 +2679,10 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
             msi->entry_nr = 1;
         irq = index;
         if ( irq == -1 )
-        {
     case MAP_PIRQ_TYPE_MULTI_MSI:
-            irq = create_irq(NUMA_NO_NODE);
-        }
+            irq = create_irq(NUMA_NO_NODE,
+                             current->domain->target == d ? current->domain
+                                                          : hardware_domain);
 
         if ( irq < nr_irqs_gsi || irq >= nr_irqs )
         {
@@ -2717,7 +2725,9 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
         case MAP_PIRQ_TYPE_MSI:
             if ( index == -1 )
         case MAP_PIRQ_TYPE_MULTI_MSI:
-                destroy_irq(irq);
+                destroy_irq(irq,
+                            current->domain->target == d ? current->domain
+                                                         : hardware_domain);
             break;
         }
     }
diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index babc4147c4..66026e3ca5 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -633,7 +633,7 @@ int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc,
     return ret;
 }
 
-int msi_free_irq(struct msi_desc *entry)
+int msi_free_irq(struct msi_desc *entry, struct domain *dm_domain)
 {
     unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
                       ? entry->msi.nvec : 1;
@@ -641,7 +641,7 @@ int msi_free_irq(struct msi_desc *entry)
     while ( nr-- )
     {
         if ( entry[nr].irq >= 0 )
-            destroy_irq(entry[nr].irq);
+            destroy_irq(entry[nr].irq, dm_domain);
 
         /* Free the unused IRTE if intr remap enabled */
         if ( iommu_intremap )
@@ -1280,7 +1280,7 @@ static void msi_free_irqs(struct pci_dev* dev)
     list_for_each_entry_safe( entry, tmp, &dev->msi_list, list )
     {
         pci_disable_msi(entry);
-        msi_free_irq(entry);
+        msi_free_irq(entry, hardware_domain);
     }
 }
 
diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index 189e121b7e..2037bbbf08 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -719,7 +719,7 @@ static void __init ns16550_init_irq(struct serial_port *port)
     struct ns16550 *uart = port->uart;
 
     if ( uart->msi )
-        uart->irq = create_irq(0);
+        uart->irq = create_irq(0, hardware_domain);
 #endif
 }
 
@@ -812,9 +812,9 @@ static void __init ns16550_init_postirq(struct serial_port *port)
                 {
                     uart->irq = 0;
                     if ( msi_desc )
-                        msi_free_irq(msi_desc);
+                        msi_free_irq(msi_desc, hardware_domain);
                     else
-                        destroy_irq(msi.irq);
+                        destroy_irq(msi.irq, hardware_domain);
                 }
             }
 
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 17f39552a9..423c473a63 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -780,7 +780,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
     hw_irq_controller *handler;
     u16 control;
 
-    irq = create_irq(NUMA_NO_NODE);
+    irq = create_irq(NUMA_NO_NODE, hardware_domain);
     if ( irq <= 0 )
     {
         dprintk(XENLOG_ERR, "IOMMU: no irqs\n");
@@ -816,7 +816,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
         ret = request_irq(irq, 0, iommu_interrupt_handler, "amd_iommu", iommu);
     if ( ret )
     {
-        destroy_irq(irq);
+        destroy_irq(irq, hardware_domain);
         AMD_IOMMU_DEBUG("can't request irq\n");
         return 0;
     }
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 50a0e25224..73cf16c145 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1138,7 +1138,8 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
     struct irq_desc *desc;
 
     irq = create_irq(rhsa ? pxm_to_node(rhsa->proximity_domain)
-                          : NUMA_NO_NODE);
+                          : NUMA_NO_NODE,
+                     hardware_domain);
     if ( irq <= 0 )
     {
         dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: no irq available!\n");
@@ -1151,7 +1152,7 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
     if ( ret )
     {
         desc->handler = &no_irq_type;
-        destroy_irq(irq);
+        destroy_irq(irq, hardware_domain);
         dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: can't request irq\n");
         return ret;
     }
@@ -1286,7 +1287,7 @@ void __init iommu_free(struct acpi_drhd_unit *drhd)
 
     free_intel_iommu(iommu->intel);
     if ( iommu->msi.irq >= 0 )
-        destroy_irq(iommu->msi.irq);
+        destroy_irq(iommu->msi.irq, hardware_domain);
     xfree(iommu);
 }
 
diff --git a/xen/include/asm-x86/irq.h b/xen/include/asm-x86/irq.h
index 4b39997f09..cf28a5c5ff 100644
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -155,8 +155,8 @@ int  init_irq_data(void);
 void clear_irq_vector(int irq);
 
 int irq_to_vector(int irq);
-int create_irq(nodeid_t node);
-void destroy_irq(unsigned int irq);
+int create_irq(nodeid_t node, struct domain *dm_domain);
+void destroy_irq(unsigned int irq, struct domain *dm_domain);
 int assign_irq_vector(int irq, const cpumask_t *);
 
 extern void irq_complete_move(struct irq_desc *);
diff --git a/xen/include/asm-x86/msi.h b/xen/include/asm-x86/msi.h
index 10387dce2e..1a5ba84ccb 100644
--- a/xen/include/asm-x86/msi.h
+++ b/xen/include/asm-x86/msi.h
@@ -134,7 +134,7 @@ struct msi_desc {
 #define MSI_TYPE_IOMMU   2
 
 int msi_maskable_irq(const struct msi_desc *);
-int msi_free_irq(struct msi_desc *entry);
+int msi_free_irq(struct msi_desc *entry, struct domain *dm_domain);
 
 /*
  * Assume the maximum number of hot plug slots supported by the system is about
-- 
2.17.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 3/6] libxl: don't try to manipulate json config for stubdomain
  2019-02-07  0:07 ` [PATCH v4 3/6] libxl: don't try to manipulate json config for stubdomain Marek Marczykowski-Górecki
@ 2019-02-21 16:16   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2019-02-21 16:16 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: xen-devel, Ian Jackson, Wei Liu

On Thu, Feb 07, 2019 at 01:07:46AM +0100, Marek Marczykowski-Górecki wrote:
> Stubdomain do not have it's own config file - its configuration is
> derived from target domains. Do not try to manipulate it when attaching
> PCI device.
> 
> This bug prevented starting HVM with stubdomain and PCI passthrough
> device attached.
> 
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-08 10:17                     ` [PATCH v4.1 " Marek Marczykowski-Górecki
@ 2019-02-21 16:47                       ` Roger Pau Monné
  2019-02-21 17:40                         ` Marek Marczykowski-Górecki
  2019-02-27 11:07                       ` Jan Beulich
  1 sibling, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-21 16:47 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods

On Fri, Feb 08, 2019 at 11:17:05AM +0100, Marek Marczykowski-Górecki wrote:
> Stubdomains need to be given sufficient privilege over the guest which it
> provides emulation for in order for PCI passthrough to work correctly.
> When a HVM domain try to enable MSI, QEMU in stubdomain calls
> PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> part of xc_domain_update_msi_irq. Allow for that as part of
> PHYSDEVOP_map_pirq.
> 
> This is not needed for PCI INTx, because IRQ in that case is known
> beforehand and the stubdomain is given permissions over this IRQ by
> libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> 
> create_irq() already grant IRQ access to hardware_domain, with
> assumption the device model (something managing this IRQ) lives there.
> Modify create_irq() to take additional parameter pointing at device
> model domain - which may be dom0 or stubdomain. Do the same with
> destroy_irq() (and msi_free_irq() which calls it) to reverse the
> operation.
> 
> Then, adjust all callers to provide the parameter. In case of calls not
> related to stubdomain-initiated allocations, give it hardware_domain, so
> the behavior is unchanged there.
> 
> Inspired by https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> 
> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

Thanks for working on this! I've got some comments below.

> ---
> Changes in v3:
>  - extend commit message
> Changes in v4:
>  - add missing destroy_irq on error path
> Changes in v4.1:
>  - move irq_{grant,revoke}_access() to {create,destroy}_irq(), which
>    basically make it a different patch
> 
> There is one code path where I haven't managed to properly extract
> possible stubdomain in use:
> pci_remove_device()
>  -> pci_cleanup_msi()
>    -> msi_free_irqs()
>      -> msi_free_irq()
>        -> destroy_irq()
> 
> For now I've hardcoded hardware_domain there (in msi_free_irqs). Can it happen
> when device is still assigned to some domU?
> ---
>  xen/arch/x86/hpet.c                      |  5 +--
>  xen/arch/x86/irq.c                       | 46 ++++++++++++++----------
>  xen/arch/x86/msi.c                       |  6 ++--
>  xen/drivers/char/ns16550.c               |  6 ++--
>  xen/drivers/passthrough/amd/iommu_init.c |  4 +--
>  xen/drivers/passthrough/vtd/iommu.c      |  7 ++--
>  xen/include/asm-x86/irq.h                |  4 +--
>  xen/include/asm-x86/msi.h                |  2 +-
>  8 files changed, 46 insertions(+), 34 deletions(-)
> 
> diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
> index 4b08488ef1..6db71dfd71 100644
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -11,6 +11,7 @@
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/numa.h>
> +#include <xen/sched.h>
>  #include <asm/fixmap.h>
>  #include <asm/div64.h>
>  #include <asm/hpet.h>
> @@ -368,13 +369,13 @@ static int __init hpet_assign_irq(struct hpet_event_channel *ch)
>  {
>      int irq;
>  
> -    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
> +    if ( (irq = create_irq(NUMA_NO_NODE, hardware_domain)) < 0 )
>          return irq;
>  
>      ch->msi.irq = irq;
>      if ( hpet_setup_msi_irq(ch) )
>      {
> -        destroy_irq(irq);
> +        destroy_irq(irq, hardware_domain);

Hm, here you should use an explicit NULL instead of hardware_domain
(which will also be NULL by the time this gets called). This irq is
used by Xen, and it doesn't make sense logically to give permissions
over it to the hardware domain.

>          return -EINVAL;
>      }
>  
> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> index 8b44d6ce0b..d41b32b2f4 100644
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -157,7 +157,7 @@ int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
>  /*
>   * Dynamic irq allocate and deallocation for MSI
>   */
> -int create_irq(nodeid_t node)
> +int create_irq(nodeid_t node, struct domain *dm_domain)

Using plain 'd' would be fine for me here, same below for
destroy_irq.

>  {
>      int irq, ret;
>      struct irq_desc *desc;
> @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
>          desc->arch.used = IRQ_UNUSED;
>          irq = ret;
>      }
> -    else if ( hardware_domain )
> +    else if ( dm_domain )

Can you guarantee that dm_domain is always current->domain?

I think you need to assert for this, or else take a reference to
dm_domain (get_domain) before accessing it's fields, or else you risk
the domain being destroyed while modifying it's fields.

>      {
> -        ret = irq_permit_access(hardware_domain, irq);
> +        ret = irq_permit_access(dm_domain, irq);
>          if ( ret )
>              printk(XENLOG_G_ERR
> -                   "Could not grant Dom0 access to IRQ%d (error %d)\n",
> -                   irq, ret);
> +                   "Could not grant Dom%u access to IRQ%d (error %d)\n",
> +                   dm_domain->domain_id, irq, ret);
>      }
>  
>      return irq;
>  }
>  
> -void destroy_irq(unsigned int irq)
> +void destroy_irq(unsigned int irq, struct domain *dm_domain)
>  {
>      struct irq_desc *desc = irq_to_desc(irq);
>      unsigned long flags;
> @@ -210,14 +210,14 @@ void destroy_irq(unsigned int irq)
>  
>      BUG_ON(!MSI_IRQ(irq));
>  
> -    if ( hardware_domain )
> +    if ( dm_domain )
>      {
> -        int err = irq_deny_access(hardware_domain, irq);
> +        int err = irq_deny_access(dm_domain, irq);
>  
>          if ( err )
>              printk(XENLOG_G_ERR
> -                   "Could not revoke Dom0 access to IRQ%u (error %d)\n",
> -                   irq, err);
> +                   "Could not revoke Dom%u access to IRQ%u (error %d)\n",
> +                   dm_domain->domain_id, irq, err);
>      }
>  
>      spin_lock_irqsave(&desc->lock, flags);
> @@ -2010,7 +2010,9 @@ int map_domain_pirq(
>                      d->domain_id, irq);
>              pci_disable_msi(msi_desc);
>              msi_desc->irq = -1;
> -            msi_free_irq(msi_desc);
> +            msi_free_irq(msi_desc,
> +                         current->domain->target == d ? current->domain
> +                                                      : hardware_domain);
>              ret = -EBUSY;
>              goto done;
>          }
> @@ -2038,7 +2040,9 @@ int map_domain_pirq(
>              spin_unlock_irqrestore(&desc->lock, flags);
>  
>              info = NULL;
> -            irq = create_irq(NUMA_NO_NODE);
> +            irq = create_irq(NUMA_NO_NODE,
> +                             current->domain->target == d ? current->domain
> +                                                          : hardware_domain);
>              ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, &info)
>                             : irq;
>              if ( ret < 0 )
> @@ -2095,7 +2099,9 @@ int map_domain_pirq(
>                  irq = info->arch.irq;
>              }
>              msi_desc->irq = -1;
> -            msi_free_irq(msi_desc);
> +            msi_free_irq(msi_desc,
> +                         current->domain->target == d ? current->domain
> +                                                      : hardware_domain);
>              goto done;
>          }
>  
> @@ -2255,7 +2261,9 @@ int unmap_domain_pirq(struct domain *d, int pirq)
>      }
>  
>      if (msi_desc)
> -        msi_free_irq(msi_desc);
> +        msi_free_irq(msi_desc,
> +                     current->domain->target == d ? current->domain
> +                                                  : hardware_domain);
>  
>   done:
>      return ret;
> @@ -2671,10 +2679,10 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
>              msi->entry_nr = 1;
>          irq = index;
>          if ( irq == -1 )
> -        {
>      case MAP_PIRQ_TYPE_MULTI_MSI:
> -            irq = create_irq(NUMA_NO_NODE);
> -        }
> +            irq = create_irq(NUMA_NO_NODE,
> +                             current->domain->target == d ? current->domain
> +                                                          : hardware_domain);
>  
>          if ( irq < nr_irqs_gsi || irq >= nr_irqs )
>          {
> @@ -2717,7 +2725,9 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
>          case MAP_PIRQ_TYPE_MSI:
>              if ( index == -1 )
>          case MAP_PIRQ_TYPE_MULTI_MSI:
> -                destroy_irq(irq);
> +                destroy_irq(irq,
> +                            current->domain->target == d ? current->domain
> +                                                         : hardware_domain);

This pattern seems common enough to get a helper, maybe get_dm_domain?

>              break;
>          }
>      }
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index babc4147c4..66026e3ca5 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -633,7 +633,7 @@ int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc,
>      return ret;
>  }
>  
> -int msi_free_irq(struct msi_desc *entry)
> +int msi_free_irq(struct msi_desc *entry, struct domain *dm_domain)
>  {
>      unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
>                        ? entry->msi.nvec : 1;
> @@ -641,7 +641,7 @@ int msi_free_irq(struct msi_desc *entry)
>      while ( nr-- )
>      {
>          if ( entry[nr].irq >= 0 )
> -            destroy_irq(entry[nr].irq);
> +            destroy_irq(entry[nr].irq, dm_domain);
>  
>          /* Free the unused IRTE if intr remap enabled */
>          if ( iommu_intremap )
> @@ -1280,7 +1280,7 @@ static void msi_free_irqs(struct pci_dev* dev)
>      list_for_each_entry_safe( entry, tmp, &dev->msi_list, list )
>      {
>          pci_disable_msi(entry);
> -        msi_free_irq(entry);
> +        msi_free_irq(entry, hardware_domain);

This likely requires some comment to clarify why is the hardcoding of
hardware_domain correct here. AFAICT this will be called by
pci_remove_device, which I assume assures that the device has been
deassigned from any domain before attempting to remove it, hence it
can only have irqs assigned to the hardware_domain if any?

>      }
>  }
>  
> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> index 189e121b7e..2037bbbf08 100644
> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -719,7 +719,7 @@ static void __init ns16550_init_irq(struct serial_port *port)
>      struct ns16550 *uart = port->uart;
>  
>      if ( uart->msi )
> -        uart->irq = create_irq(0);
> +        uart->irq = create_irq(0, hardware_domain);
>  #endif
>  }
>  
> @@ -812,9 +812,9 @@ static void __init ns16550_init_postirq(struct serial_port *port)
>                  {
>                      uart->irq = 0;
>                      if ( msi_desc )
> -                        msi_free_irq(msi_desc);
> +                        msi_free_irq(msi_desc, hardware_domain);
>                      else
> -                        destroy_irq(msi.irq);
> +                        destroy_irq(msi.irq, hardware_domain);

All this calls should pass a NULL domain IMO, since this is the serial
console driver used by Xen.

>                  }
>              }
>  
> diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
> index 17f39552a9..423c473a63 100644
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -780,7 +780,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
>      hw_irq_controller *handler;
>      u16 control;
>  
> -    irq = create_irq(NUMA_NO_NODE);
> +    irq = create_irq(NUMA_NO_NODE, hardware_domain);
>      if ( irq <= 0 )
>      {
>          dprintk(XENLOG_ERR, "IOMMU: no irqs\n");
> @@ -816,7 +816,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
>          ret = request_irq(irq, 0, iommu_interrupt_handler, "amd_iommu", iommu);
>      if ( ret )
>      {
> -        destroy_irq(irq);
> +        destroy_irq(irq, hardware_domain);

Same here, this is the iommu used by Xen, hence the domain parameter
should be NULL.

>          AMD_IOMMU_DEBUG("can't request irq\n");
>          return 0;
>      }
> diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
> index 50a0e25224..73cf16c145 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1138,7 +1138,8 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
>      struct irq_desc *desc;
>  
>      irq = create_irq(rhsa ? pxm_to_node(rhsa->proximity_domain)
> -                          : NUMA_NO_NODE);
> +                          : NUMA_NO_NODE,
> +                     hardware_domain);
>      if ( irq <= 0 )
>      {
>          dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: no irq available!\n");
> @@ -1151,7 +1152,7 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
>      if ( ret )
>      {
>          desc->handler = &no_irq_type;
> -        destroy_irq(irq);
> +        destroy_irq(irq, hardware_domain);
>          dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: can't request irq\n");
>          return ret;
>      }
> @@ -1286,7 +1287,7 @@ void __init iommu_free(struct acpi_drhd_unit *drhd)
>  
>      free_intel_iommu(iommu->intel);
>      if ( iommu->msi.irq >= 0 )
> -        destroy_irq(iommu->msi.irq);
> +        destroy_irq(iommu->msi.irq, hardware_domain);

Same here, should be all NULL.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-21 16:47                       ` Roger Pau Monné
@ 2019-02-21 17:40                         ` Marek Marczykowski-Górecki
  2019-02-22 10:42                           ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-02-21 17:40 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods


[-- Attachment #1.1: Type: text/plain, Size: 15390 bytes --]

On Thu, Feb 21, 2019 at 05:47:51PM +0100, Roger Pau Monné wrote:
> On Fri, Feb 08, 2019 at 11:17:05AM +0100, Marek Marczykowski-Górecki wrote:
> > Stubdomains need to be given sufficient privilege over the guest which it
> > provides emulation for in order for PCI passthrough to work correctly.
> > When a HVM domain try to enable MSI, QEMU in stubdomain calls
> > PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> > part of xc_domain_update_msi_irq. Allow for that as part of
> > PHYSDEVOP_map_pirq.
> > 
> > This is not needed for PCI INTx, because IRQ in that case is known
> > beforehand and the stubdomain is given permissions over this IRQ by
> > libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> > 
> > create_irq() already grant IRQ access to hardware_domain, with
> > assumption the device model (something managing this IRQ) lives there.
> > Modify create_irq() to take additional parameter pointing at device
> > model domain - which may be dom0 or stubdomain. Do the same with
> > destroy_irq() (and msi_free_irq() which calls it) to reverse the
> > operation.
> > 
> > Then, adjust all callers to provide the parameter. In case of calls not
> > related to stubdomain-initiated allocations, give it hardware_domain, so
> > the behavior is unchanged there.
> > 
> > Inspired by https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch by Eric Chanudet <chanudete@ainfosec.com>.
> > 
> > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> 
> Thanks for working on this! I've got some comments below.
> 
> > ---
> > Changes in v3:
> >  - extend commit message
> > Changes in v4:
> >  - add missing destroy_irq on error path
> > Changes in v4.1:
> >  - move irq_{grant,revoke}_access() to {create,destroy}_irq(), which
> >    basically make it a different patch
> > 
> > There is one code path where I haven't managed to properly extract
> > possible stubdomain in use:
> > pci_remove_device()
> >  -> pci_cleanup_msi()
> >    -> msi_free_irqs()
> >      -> msi_free_irq()
> >        -> destroy_irq()
> > 
> > For now I've hardcoded hardware_domain there (in msi_free_irqs). Can it happen
> > when device is still assigned to some domU?
> > ---
> >  xen/arch/x86/hpet.c                      |  5 +--
> >  xen/arch/x86/irq.c                       | 46 ++++++++++++++----------
> >  xen/arch/x86/msi.c                       |  6 ++--
> >  xen/drivers/char/ns16550.c               |  6 ++--
> >  xen/drivers/passthrough/amd/iommu_init.c |  4 +--
> >  xen/drivers/passthrough/vtd/iommu.c      |  7 ++--
> >  xen/include/asm-x86/irq.h                |  4 +--
> >  xen/include/asm-x86/msi.h                |  2 +-
> >  8 files changed, 46 insertions(+), 34 deletions(-)
> > 
> > diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
> > index 4b08488ef1..6db71dfd71 100644
> > --- a/xen/arch/x86/hpet.c
> > +++ b/xen/arch/x86/hpet.c
> > @@ -11,6 +11,7 @@
> >  #include <xen/softirq.h>
> >  #include <xen/irq.h>
> >  #include <xen/numa.h>
> > +#include <xen/sched.h>
> >  #include <asm/fixmap.h>
> >  #include <asm/div64.h>
> >  #include <asm/hpet.h>
> > @@ -368,13 +369,13 @@ static int __init hpet_assign_irq(struct hpet_event_channel *ch)
> >  {
> >      int irq;
> >  
> > -    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
> > +    if ( (irq = create_irq(NUMA_NO_NODE, hardware_domain)) < 0 )
> >          return irq;
> >  
> >      ch->msi.irq = irq;
> >      if ( hpet_setup_msi_irq(ch) )
> >      {
> > -        destroy_irq(irq);
> > +        destroy_irq(irq, hardware_domain);
> 
> Hm, here you should use an explicit NULL instead of hardware_domain
> (which will also be NULL by the time this gets called). This irq is
> used by Xen, and it doesn't make sense logically to give permissions
> over it to the hardware domain.

Ok.

> >          return -EINVAL;
> >      }
> >  
> > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > index 8b44d6ce0b..d41b32b2f4 100644
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -157,7 +157,7 @@ int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
> >  /*
> >   * Dynamic irq allocate and deallocation for MSI
> >   */
> > -int create_irq(nodeid_t node)
> > +int create_irq(nodeid_t node, struct domain *dm_domain)
> 
> Using plain 'd' would be fine for me here, same below for
> destroy_irq.

I think it may be misleading, as the parameter is not about domain that
will handle that IRQ, but what domain will get access to it.

> >  {
> >      int irq, ret;
> >      struct irq_desc *desc;
> > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> >          desc->arch.used = IRQ_UNUSED;
> >          irq = ret;
> >      }
> > -    else if ( hardware_domain )
> > +    else if ( dm_domain )
> 
> Can you guarantee that dm_domain is always current->domain?

No, in some cases it will be hardware_domain.

> I think you need to assert for this, or else take a reference to
> dm_domain (get_domain) before accessing it's fields, or else you risk
> the domain being destroyed while modifying it's fields.

Can hardware_domain be destroyed without panicking Xen? If so,
get_domain would be indeed needed.

> >      {
> > -        ret = irq_permit_access(hardware_domain, irq);
> > +        ret = irq_permit_access(dm_domain, irq);
> >          if ( ret )
> >              printk(XENLOG_G_ERR
> > -                   "Could not grant Dom0 access to IRQ%d (error %d)\n",
> > -                   irq, ret);
> > +                   "Could not grant Dom%u access to IRQ%d (error %d)\n",
> > +                   dm_domain->domain_id, irq, ret);
> >      }
> >  
> >      return irq;
> >  }
> >  
> > -void destroy_irq(unsigned int irq)
> > +void destroy_irq(unsigned int irq, struct domain *dm_domain)
> >  {
> >      struct irq_desc *desc = irq_to_desc(irq);
> >      unsigned long flags;
> > @@ -210,14 +210,14 @@ void destroy_irq(unsigned int irq)
> >  
> >      BUG_ON(!MSI_IRQ(irq));
> >  
> > -    if ( hardware_domain )
> > +    if ( dm_domain )
> >      {
> > -        int err = irq_deny_access(hardware_domain, irq);
> > +        int err = irq_deny_access(dm_domain, irq);
> >  
> >          if ( err )
> >              printk(XENLOG_G_ERR
> > -                   "Could not revoke Dom0 access to IRQ%u (error %d)\n",
> > -                   irq, err);
> > +                   "Could not revoke Dom%u access to IRQ%u (error %d)\n",
> > +                   dm_domain->domain_id, irq, err);
> >      }
> >  
> >      spin_lock_irqsave(&desc->lock, flags);
> > @@ -2010,7 +2010,9 @@ int map_domain_pirq(
> >                      d->domain_id, irq);
> >              pci_disable_msi(msi_desc);
> >              msi_desc->irq = -1;
> > -            msi_free_irq(msi_desc);
> > +            msi_free_irq(msi_desc,
> > +                         current->domain->target == d ? current->domain
> > +                                                      : hardware_domain);
> >              ret = -EBUSY;
> >              goto done;
> >          }
> > @@ -2038,7 +2040,9 @@ int map_domain_pirq(
> >              spin_unlock_irqrestore(&desc->lock, flags);
> >  
> >              info = NULL;
> > -            irq = create_irq(NUMA_NO_NODE);
> > +            irq = create_irq(NUMA_NO_NODE,
> > +                             current->domain->target == d ? current->domain
> > +                                                          : hardware_domain);
> >              ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, &info)
> >                             : irq;
> >              if ( ret < 0 )
> > @@ -2095,7 +2099,9 @@ int map_domain_pirq(
> >                  irq = info->arch.irq;
> >              }
> >              msi_desc->irq = -1;
> > -            msi_free_irq(msi_desc);
> > +            msi_free_irq(msi_desc,
> > +                         current->domain->target == d ? current->domain
> > +                                                      : hardware_domain);
> >              goto done;
> >          }
> >  
> > @@ -2255,7 +2261,9 @@ int unmap_domain_pirq(struct domain *d, int pirq)
> >      }
> >  
> >      if (msi_desc)
> > -        msi_free_irq(msi_desc);
> > +        msi_free_irq(msi_desc,
> > +                     current->domain->target == d ? current->domain
> > +                                                  : hardware_domain);
> >  
> >   done:
> >      return ret;
> > @@ -2671,10 +2679,10 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> >              msi->entry_nr = 1;
> >          irq = index;
> >          if ( irq == -1 )
> > -        {
> >      case MAP_PIRQ_TYPE_MULTI_MSI:
> > -            irq = create_irq(NUMA_NO_NODE);
> > -        }
> > +            irq = create_irq(NUMA_NO_NODE,
> > +                             current->domain->target == d ? current->domain
> > +                                                          : hardware_domain);
> >  
> >          if ( irq < nr_irqs_gsi || irq >= nr_irqs )
> >          {
> > @@ -2717,7 +2725,9 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p,
> >          case MAP_PIRQ_TYPE_MSI:
> >              if ( index == -1 )
> >          case MAP_PIRQ_TYPE_MULTI_MSI:
> > -                destroy_irq(irq);
> > +                destroy_irq(irq,
> > +                            current->domain->target == d ? current->domain
> > +                                                         : hardware_domain);
> 
> This pattern seems common enough to get a helper, maybe get_dm_domain?

Ok.

> >              break;
> >          }
> >      }
> > diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> > index babc4147c4..66026e3ca5 100644
> > --- a/xen/arch/x86/msi.c
> > +++ b/xen/arch/x86/msi.c
> > @@ -633,7 +633,7 @@ int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc,
> >      return ret;
> >  }
> >  
> > -int msi_free_irq(struct msi_desc *entry)
> > +int msi_free_irq(struct msi_desc *entry, struct domain *dm_domain)
> >  {
> >      unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
> >                        ? entry->msi.nvec : 1;
> > @@ -641,7 +641,7 @@ int msi_free_irq(struct msi_desc *entry)
> >      while ( nr-- )
> >      {
> >          if ( entry[nr].irq >= 0 )
> > -            destroy_irq(entry[nr].irq);
> > +            destroy_irq(entry[nr].irq, dm_domain);
> >  
> >          /* Free the unused IRTE if intr remap enabled */
> >          if ( iommu_intremap )
> > @@ -1280,7 +1280,7 @@ static void msi_free_irqs(struct pci_dev* dev)
> >      list_for_each_entry_safe( entry, tmp, &dev->msi_list, list )
> >      {
> >          pci_disable_msi(entry);
> > -        msi_free_irq(entry);
> > +        msi_free_irq(entry, hardware_domain);
> 
> This likely requires some comment to clarify why is the hardcoding of
> hardware_domain correct here. AFAICT this will be called by
> pci_remove_device, which I assume assures that the device has been
> deassigned from any domain before attempting to remove it, hence it
> can only have irqs assigned to the hardware_domain if any?

That's also my understanding, but I'm not 100% sure about that.
See comment just bellow the commit message.

> >      }
> >  }
> >  
> > diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> > index 189e121b7e..2037bbbf08 100644
> > --- a/xen/drivers/char/ns16550.c
> > +++ b/xen/drivers/char/ns16550.c
> > @@ -719,7 +719,7 @@ static void __init ns16550_init_irq(struct serial_port *port)
> >      struct ns16550 *uart = port->uart;
> >  
> >      if ( uart->msi )
> > -        uart->irq = create_irq(0);
> > +        uart->irq = create_irq(0, hardware_domain);
> >  #endif
> >  }
> >  
> > @@ -812,9 +812,9 @@ static void __init ns16550_init_postirq(struct serial_port *port)
> >                  {
> >                      uart->irq = 0;
> >                      if ( msi_desc )
> > -                        msi_free_irq(msi_desc);
> > +                        msi_free_irq(msi_desc, hardware_domain);
> >                      else
> > -                        destroy_irq(msi.irq);
> > +                        destroy_irq(msi.irq, hardware_domain);
> 
> All this calls should pass a NULL domain IMO, since this is the serial
> console driver used by Xen.

Ok.

> >                  }
> >              }
> >  
> > diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
> > index 17f39552a9..423c473a63 100644
> > --- a/xen/drivers/passthrough/amd/iommu_init.c
> > +++ b/xen/drivers/passthrough/amd/iommu_init.c
> > @@ -780,7 +780,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
> >      hw_irq_controller *handler;
> >      u16 control;
> >  
> > -    irq = create_irq(NUMA_NO_NODE);
> > +    irq = create_irq(NUMA_NO_NODE, hardware_domain);
> >      if ( irq <= 0 )
> >      {
> >          dprintk(XENLOG_ERR, "IOMMU: no irqs\n");
> > @@ -816,7 +816,7 @@ static bool_t __init set_iommu_interrupt_handler(struct amd_iommu *iommu)
> >          ret = request_irq(irq, 0, iommu_interrupt_handler, "amd_iommu", iommu);
> >      if ( ret )
> >      {
> > -        destroy_irq(irq);
> > +        destroy_irq(irq, hardware_domain);
> 
> Same here, this is the iommu used by Xen, hence the domain parameter
> should be NULL.

Ok.

> >          AMD_IOMMU_DEBUG("can't request irq\n");
> >          return 0;
> >      }
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
> > index 50a0e25224..73cf16c145 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -1138,7 +1138,8 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
> >      struct irq_desc *desc;
> >  
> >      irq = create_irq(rhsa ? pxm_to_node(rhsa->proximity_domain)
> > -                          : NUMA_NO_NODE);
> > +                          : NUMA_NO_NODE,
> > +                     hardware_domain);
> >      if ( irq <= 0 )
> >      {
> >          dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: no irq available!\n");
> > @@ -1151,7 +1152,7 @@ static int __init iommu_set_interrupt(struct acpi_drhd_unit *drhd)
> >      if ( ret )
> >      {
> >          desc->handler = &no_irq_type;
> > -        destroy_irq(irq);
> > +        destroy_irq(irq, hardware_domain);
> >          dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: can't request irq\n");
> >          return ret;
> >      }
> > @@ -1286,7 +1287,7 @@ void __init iommu_free(struct acpi_drhd_unit *drhd)
> >  
> >      free_intel_iommu(iommu->intel);
> >      if ( iommu->msi.irq >= 0 )
> > -        destroy_irq(iommu->msi.irq);
> > +        destroy_irq(iommu->msi.irq, hardware_domain);
> 
> Same here, should be all NULL.

Ok.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-21 17:40                         ` Marek Marczykowski-Górecki
@ 2019-02-22 10:42                           ` Roger Pau Monné
  2019-02-22 11:11                             ` Jan Beulich
  2019-03-07  0:50                             ` Marek Marczykowski-Górecki
  0 siblings, 2 replies; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-22 10:42 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods

On Thu, Feb 21, 2019 at 06:40:40PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Feb 21, 2019 at 05:47:51PM +0100, Roger Pau Monné wrote:
> > On Fri, Feb 08, 2019 at 11:17:05AM +0100, Marek Marczykowski-Górecki wrote:
> > >          return -EINVAL;
> > >      }
> > >  
> > > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > > index 8b44d6ce0b..d41b32b2f4 100644
> > > --- a/xen/arch/x86/irq.c
> > > +++ b/xen/arch/x86/irq.c
> > > @@ -157,7 +157,7 @@ int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
> > >  /*
> > >   * Dynamic irq allocate and deallocation for MSI
> > >   */
> > > -int create_irq(nodeid_t node)
> > > +int create_irq(nodeid_t node, struct domain *dm_domain)
> > 
> > Using plain 'd' would be fine for me here, same below for
> > destroy_irq.
> 
> I think it may be misleading, as the parameter is not about domain that
> will handle that IRQ, but what domain will get access to it.

Right, but dom0 will also end up using this function for it's own irqs
(irqs allocated and mapped to dom0), in which case naming this
parameter dm_domain is misleading. IMO it's better to just name it
'd', which is the domain that gets permissions over the created irq.

> > >  {
> > >      int irq, ret;
> > >      struct irq_desc *desc;
> > > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> > >          desc->arch.used = IRQ_UNUSED;
> > >          irq = ret;
> > >      }
> > > -    else if ( hardware_domain )
> > > +    else if ( dm_domain )
> > 
> > Can you guarantee that dm_domain is always current->domain?
> 
> No, in some cases it will be hardware_domain.

Right, but in that case current->domain == hardware_domain I guess?

> 
> > I think you need to assert for this, or else take a reference to
> > dm_domain (get_domain) before accessing it's fields, or else you risk
> > the domain being destroyed while modifying it's fields.
> 
> Can hardware_domain be destroyed without panicking Xen? If so,
> get_domain would be indeed needed.

What about other callers that are not the hardware_domain? You need to
make sure those domains are not destroyed while {create/destroy}_irq
is changing the permissions.

If you can guarantee that dm_domain == current->domain then you are
safe, if not you need to get a reference before modifying any fields
of the domain struct.

So IMO you either need to add an assert or a get_domain depending on
the answer to the question above.

> > >              break;
> > >          }
> > >      }
> > > diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> > > index babc4147c4..66026e3ca5 100644
> > > --- a/xen/arch/x86/msi.c
> > > +++ b/xen/arch/x86/msi.c
> > > @@ -633,7 +633,7 @@ int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc,
> > >      return ret;
> > >  }
> > >  
> > > -int msi_free_irq(struct msi_desc *entry)
> > > +int msi_free_irq(struct msi_desc *entry, struct domain *dm_domain)
> > >  {
> > >      unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
> > >                        ? entry->msi.nvec : 1;
> > > @@ -641,7 +641,7 @@ int msi_free_irq(struct msi_desc *entry)
> > >      while ( nr-- )
> > >      {
> > >          if ( entry[nr].irq >= 0 )
> > > -            destroy_irq(entry[nr].irq);
> > > +            destroy_irq(entry[nr].irq, dm_domain);
> > >  
> > >          /* Free the unused IRTE if intr remap enabled */
> > >          if ( iommu_intremap )
> > > @@ -1280,7 +1280,7 @@ static void msi_free_irqs(struct pci_dev* dev)
> > >      list_for_each_entry_safe( entry, tmp, &dev->msi_list, list )
> > >      {
> > >          pci_disable_msi(entry);
> > > -        msi_free_irq(entry);
> > > +        msi_free_irq(entry, hardware_domain);
> > 
> > This likely requires some comment to clarify why is the hardcoding of
> > hardware_domain correct here. AFAICT this will be called by
> > pci_remove_device, which I assume assures that the device has been
> > deassigned from any domain before attempting to remove it, hence it
> > can only have irqs assigned to the hardware_domain if any?
> 
> That's also my understanding, but I'm not 100% sure about that.
> See comment just bellow the commit message.

I would add an assert that pdev->domain == hardware_domain just to be
sure.

And then in pci_remove_device I think Xen should also check that
pdev->domain == hardware_domain or else refuse to remove the device.

Jan do you know whether pci_remove_device is supposed to be used
against devices assigned to a domain different than the hardware
domain?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-22 10:42                           ` Roger Pau Monné
@ 2019-02-22 11:11                             ` Jan Beulich
  2019-03-07  0:50                             ` Marek Marczykowski-Górecki
  1 sibling, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2019-02-22 11:11 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Marek Marczykowski,
	Tim Deegan, Simon Gaiser, Julien Grall, Suravee Suthikulpanit,
	xen-devel, Brian Woods

>>> On 22.02.19 at 11:42, <roger.pau@citrix.com> wrote:
> Jan do you know whether pci_remove_device is supposed to be used
> against devices assigned to a domain different than the hardware
> domain?

No, I don't think it ought to be used on any other devices. I guess
the omission of the check goes back to assuming sane Dom0
behavior, properly de-assigning devices before hot-unplugging
them, and hence before invoking the respective physdev-ops.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-08 10:17                     ` [PATCH v4.1 " Marek Marczykowski-Górecki
  2019-02-21 16:47                       ` Roger Pau Monné
@ 2019-02-27 11:07                       ` Jan Beulich
  2019-02-27 15:18                         ` Marek Marczykowski
  1 sibling, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2019-02-27 11:07 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Simon Gaiser, Julien Grall, Suravee Suthikulpanit, xen-devel,
	Brian Woods, Roger Pau Monne

>>> On 08.02.19 at 11:17, <marmarek@invisiblethingslab.com> wrote:
> There is one code path where I haven't managed to properly extract
> possible stubdomain in use:
> pci_remove_device()
>  -> pci_cleanup_msi()
>    -> msi_free_irqs()
>      -> msi_free_irq()
>        -> destroy_irq()
> 
> For now I've hardcoded hardware_domain there (in msi_free_irqs). Can it happen
> when device is still assigned to some domU?

In case this question is still open: No, it can't with current code,
and provided Dom0 behaves correctly.

> @@ -368,13 +369,13 @@ static int __init hpet_assign_irq(struct hpet_event_channel *ch)
>  {
>      int irq;
>  
> -    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
> +    if ( (irq = create_irq(NUMA_NO_NODE, hardware_domain)) < 0 )
>          return irq;
>  
>      ch->msi.irq = irq;
>      if ( hpet_setup_msi_irq(ch) )
>      {
> -        destroy_irq(irq);
> +        destroy_irq(irq, hardware_domain);
>          return -EINVAL;
>      }

Why don't you take the opportunity here (and elsewhere) and properly
remove hwdom access to such internal-to-Xen IRQs? Simply pass NULL
here, and skip permission granting in this case (create_irq() already
checks for NULL anyway).

> @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
>          desc->arch.used = IRQ_UNUSED;
>          irq = ret;
>      }
> -    else if ( hardware_domain )
> +    else if ( dm_domain )
>      {
> -        ret = irq_permit_access(hardware_domain, irq);
> +        ret = irq_permit_access(dm_domain, irq);

Doesn't this imply that Dom0 has no way of cleaning up after the
guest/stubdom pair? IOW I wonder whether both dm and hwdom
should be granted access.

> @@ -2095,7 +2099,9 @@ int map_domain_pirq(
>                  irq = info->arch.irq;
>              }
>              msi_desc->irq = -1;
> -            msi_free_irq(msi_desc);
> +            msi_free_irq(msi_desc,
> +                         current->domain->target == d ? current->domain
> +                                                      : hardware_domain);

Note how ->irq gets set to -1 prior to the call (and also in at least
one other instance), which will lead to skipping of the destroy_irq()
call, and hence skipping of the permission removal. Or wait, that's
going to be taken care of in the caller as it seems. If this is also
your understanding, then please add a sentence to the description
pointing this out. The split logic isn't really helpful here (I know it
was me who wrote it, in an attempt to avoid re-writing everything
basically from scratch).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-07  0:07 ` [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki
  2019-02-07 10:25   ` Roger Pau Monné
@ 2019-02-27 11:41   ` Jan Beulich
  2019-02-27 15:05     ` Marek Marczykowski
  1 sibling, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2019-02-27 11:41 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne

>>> On 07.02.19 at 01:07, <marmarek@invisiblethingslab.com> wrote:
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -1474,6 +1474,30 @@ int pci_restore_msi_state(struct pci_dev *pdev)
>      return 0;
>  }
>  
> +int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable)

unsigned int mode, bool enable

I'm also not happy about the function name. Perhaps simply msi_enable()
or msi_control()?

> +{
> +    int ret;
> +
> +    ret = xsm_msi_set_enable(XSM_DM_PRIV, pdev->domain,
> +                             (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn,
> +                             mode, enable);
> +    if ( ret )
> +        return ret;
> +
> +    switch ( mode )
> +    {
> +    case PHYSDEVOP_MSI_SET_ENABLE_MSI:
> +        msi_set_enable(pdev, enable);
> +        break;
> +
> +    case PHYSDEVOP_MSI_SET_ENABLE_MSIX:
> +        msix_set_enable(pdev, enable);
> +        break;
> +    }

What about a call to pci_intx()? And what about invocations for
the wrong device (e.g. MSI-X request for MSI-X incapable device)?

> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -671,6 +671,30 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case PHYSDEVOP_msi_set_enable: {
> +        struct physdev_msi_set_enable op;
> +        struct pci_dev *pdev;
> +
> +        ret = -EFAULT;
> +        if ( copy_from_guest(&op, arg, 1) )
> +            break;
> +
> +        ret = -EINVAL;
> +        if ( op.mode != PHYSDEVOP_MSI_SET_ENABLE_MSI &&
> +             op.mode != PHYSDEVOP_MSI_SET_ENABLE_MSIX )
> +            break;
> +
> +        pcidevs_lock();
> +        pdev = pci_get_pdev(op.seg, op.bus, op.devfn);
> +        if ( pdev )
> +            ret = msi_msix_set_enable(pdev, op.mode, !!op.enable);

Unnecessary !! .

> +        else
> +            ret = -ENODEV;
> +        pcidevs_unlock();
> +        break;
> +
> +    }

Stray blank line above here.

> --- a/xen/include/public/physdev.h
> +++ b/xen/include/public/physdev.h
> @@ -344,6 +344,21 @@ struct physdev_dbgp_op {
>  typedef struct physdev_dbgp_op physdev_dbgp_op_t;
>  DEFINE_XEN_GUEST_HANDLE(physdev_dbgp_op_t);
>  
> +#define PHYSDEVOP_MSI_SET_ENABLE_MSI  0
> +#define PHYSDEVOP_MSI_SET_ENABLE_MSIX 1
> +
> +#define PHYSDEVOP_msi_set_enable   32
> +struct physdev_msi_set_enable {

Can this please also be something like msi_control?

> +    /* IN */
> +    uint16_t seg;
> +    uint8_t bus;
> +    uint8_t devfn;
> +    uint8_t mode;
> +    uint8_t enable;

"mode" and "enable" don't really make clear which of the two is the
boolean, and which is the operation. I'd anyway prefer a single
flags field with descriptive #define-s, which will also make more
obvious how to extend this if need be.

> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -106,6 +106,7 @@
>  ?	physdev_restore_msi		physdev.h
>  ?	physdev_set_iopl		physdev.h
>  ?	physdev_setup_gsi		physdev.h
> +?	physdev_msi_set_enable		physdev.h

Please insert at the alphabetically correct place.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-27 11:41   ` Jan Beulich
@ 2019-02-27 15:05     ` Marek Marczykowski
  2019-02-28 10:58       ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski @ 2019-02-27 15:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne


[-- Attachment #1.1: Type: text/plain, Size: 4191 bytes --]

On Wed, Feb 27, 2019 at 04:41:37AM -0700, Jan Beulich wrote:
> >>> On 07.02.19 at 01:07, <marmarek@invisiblethingslab.com> wrote:
> > --- a/xen/arch/x86/msi.c
> > +++ b/xen/arch/x86/msi.c
> > @@ -1474,6 +1474,30 @@ int pci_restore_msi_state(struct pci_dev *pdev)
> >      return 0;
> >  }
> >  
> > +int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable)
> 
> unsigned int mode, bool enable
> 
> I'm also not happy about the function name. Perhaps simply msi_enable()
> or msi_control()?

Ok, will change to msi_control().

> > +{
> > +    int ret;
> > +
> > +    ret = xsm_msi_set_enable(XSM_DM_PRIV, pdev->domain,
> > +                             (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn,
> > +                             mode, enable);
> > +    if ( ret )
> > +        return ret;
> > +
> > +    switch ( mode )
> > +    {
> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSI:
> > +        msi_set_enable(pdev, enable);
> > +        break;
> > +
> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSIX:
> > +        msix_set_enable(pdev, enable);
> > +        break;
> > +    }
> 
> What about a call to pci_intx()? 

Should pci_intx(dev, !enable) be called in all those cases?

> And what about invocations for
> the wrong device (e.g. MSI-X request for MSI-X incapable device)?

Looking at msi(x)_set_enable(), it is no-op for incapable devices, but
if the function would do anything else, indeed such check should be
added. Is pci_find_cap_offset(..., PCI_CAP_ID_MSI(X)) the correct way
of doing that?

> > --- a/xen/arch/x86/physdev.c
> > +++ b/xen/arch/x86/physdev.c
> > @@ -671,6 +671,30 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >          break;
> >      }
> >  
> > +    case PHYSDEVOP_msi_set_enable: {
> > +        struct physdev_msi_set_enable op;
> > +        struct pci_dev *pdev;
> > +
> > +        ret = -EFAULT;
> > +        if ( copy_from_guest(&op, arg, 1) )
> > +            break;
> > +
> > +        ret = -EINVAL;
> > +        if ( op.mode != PHYSDEVOP_MSI_SET_ENABLE_MSI &&
> > +             op.mode != PHYSDEVOP_MSI_SET_ENABLE_MSIX )
> > +            break;
> > +
> > +        pcidevs_lock();
> > +        pdev = pci_get_pdev(op.seg, op.bus, op.devfn);
> > +        if ( pdev )
> > +            ret = msi_msix_set_enable(pdev, op.mode, !!op.enable);
> 
> Unnecessary !! .
> 
> > +        else
> > +            ret = -ENODEV;
> > +        pcidevs_unlock();
> > +        break;
> > +
> > +    }
> 
> Stray blank line above here.
> 
> > --- a/xen/include/public/physdev.h
> > +++ b/xen/include/public/physdev.h
> > @@ -344,6 +344,21 @@ struct physdev_dbgp_op {
> >  typedef struct physdev_dbgp_op physdev_dbgp_op_t;
> >  DEFINE_XEN_GUEST_HANDLE(physdev_dbgp_op_t);
> >  
> > +#define PHYSDEVOP_MSI_SET_ENABLE_MSI  0
> > +#define PHYSDEVOP_MSI_SET_ENABLE_MSIX 1
> > +
> > +#define PHYSDEVOP_msi_set_enable   32
> > +struct physdev_msi_set_enable {
> 
> Can this please also be something like msi_control?

Sure.

> > +    /* IN */
> > +    uint16_t seg;
> > +    uint8_t bus;
> > +    uint8_t devfn;
> > +    uint8_t mode;
> > +    uint8_t enable;
> 
> "mode" and "enable" don't really make clear which of the two is the
> boolean, and which is the operation. I'd anyway prefer a single
> flags field with descriptive #define-s, which will also make more
> obvious how to extend this if need be.

You mean:

#define PHYSDEVOP_MSI_CONTROL_ENABLE 1
#define PHYSDEVOP_MSI_CONTROL_MSI    2 
#define PHYSDEVOP_MSI_CONTROL_MSIX   4

Then use PHYSDEVOP_MSI_CONTROL_MSI(X) with or without
PHYSDEVOP_MSI_CONTROL_ENABLE ?

> > --- a/xen/include/xlat.lst
> > +++ b/xen/include/xlat.lst
> > @@ -106,6 +106,7 @@
> >  ?	physdev_restore_msi		physdev.h
> >  ?	physdev_set_iopl		physdev.h
> >  ?	physdev_setup_gsi		physdev.h
> > +?	physdev_msi_set_enable		physdev.h
> 
> Please insert at the alphabetically correct place.
> 
> Jan
> 
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-27 11:07                       ` Jan Beulich
@ 2019-02-27 15:18                         ` Marek Marczykowski
  2019-02-28 10:50                           ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski @ 2019-02-27 15:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Simon Gaiser, Julien Grall, Suravee Suthikulpanit, xen-devel,
	Brian Woods, Roger Pau Monne


[-- Attachment #1.1: Type: text/plain, Size: 3543 bytes --]

On Wed, Feb 27, 2019 at 04:07:54AM -0700, Jan Beulich wrote:
> >>> On 08.02.19 at 11:17, <marmarek@invisiblethingslab.com> wrote:
> > There is one code path where I haven't managed to properly extract
> > possible stubdomain in use:
> > pci_remove_device()
> >  -> pci_cleanup_msi()
> >    -> msi_free_irqs()
> >      -> msi_free_irq()
> >        -> destroy_irq()
> > 
> > For now I've hardcoded hardware_domain there (in msi_free_irqs). Can it happen
> > when device is still assigned to some domU?
> 
> In case this question is still open: No, it can't with current code,
> and provided Dom0 behaves correctly.

Thanks for confirmation.

> > @@ -368,13 +369,13 @@ static int __init hpet_assign_irq(struct hpet_event_channel *ch)
> >  {
> >      int irq;
> >  
> > -    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
> > +    if ( (irq = create_irq(NUMA_NO_NODE, hardware_domain)) < 0 )
> >          return irq;
> >  
> >      ch->msi.irq = irq;
> >      if ( hpet_setup_msi_irq(ch) )
> >      {
> > -        destroy_irq(irq);
> > +        destroy_irq(irq, hardware_domain);
> >          return -EINVAL;
> >      }
> 
> Why don't you take the opportunity here (and elsewhere) and properly
> remove hwdom access to such internal-to-Xen IRQs? Simply pass NULL
> here, and skip permission granting in this case (create_irq() already
> checks for NULL anyway).

Already queued for v5, per Roger's review.

> > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> >          desc->arch.used = IRQ_UNUSED;
> >          irq = ret;
> >      }
> > -    else if ( hardware_domain )
> > +    else if ( dm_domain )
> >      {
> > -        ret = irq_permit_access(hardware_domain, irq);
> > +        ret = irq_permit_access(dm_domain, irq);
> 
> Doesn't this imply that Dom0 has no way of cleaning up after the
> guest/stubdom pair? IOW I wonder whether both dm and hwdom
> should be granted access.

See discussion with Roger on this very patch.
In short: since permissions are stored in domain struct, not irq, there
is not much to cleanup after domain destruction. Also, toolstack in dom0
has no idea about IRQs allocated by stubdomain, so it couldn't do such
cleanup anyway.

> > @@ -2095,7 +2099,9 @@ int map_domain_pirq(
> >                  irq = info->arch.irq;
> >              }
> >              msi_desc->irq = -1;
> > -            msi_free_irq(msi_desc);
> > +            msi_free_irq(msi_desc,
> > +                         current->domain->target == d ? current->domain
> > +                                                      : hardware_domain);
> 
> Note how ->irq gets set to -1 prior to the call (and also in at least
> one other instance), which will lead to skipping of the destroy_irq()
> call, and hence skipping of the permission removal. Or wait, that's
> going to be taken care of in the caller as it seems. If this is also
> your understanding, then please add a sentence to the description
> pointing this out. The split logic isn't really helpful here (I know it
> was me who wrote it, in an attempt to avoid re-writing everything
> basically from scratch).

Yes, that matches my understanding - the caller will call on error
destroy_irq(), if it called create_irq() before (which may not always be
the case - and I think this is why it isn't destroyed here).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-27 15:18                         ` Marek Marczykowski
@ 2019-02-28 10:50                           ` Jan Beulich
  2019-02-28 11:41                             ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2019-02-28 10:50 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Simon Gaiser, Julien Grall, Suravee Suthikulpanit, xen-devel,
	Brian Woods, Roger Pau Monne

>>> On 27.02.19 at 16:18, <marmarek@invisiblethingslab.com> wrote:
> On Wed, Feb 27, 2019 at 04:07:54AM -0700, Jan Beulich wrote:
>> >>> On 08.02.19 at 11:17, <marmarek@invisiblethingslab.com> wrote:
>> > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
>> >          desc->arch.used = IRQ_UNUSED;
>> >          irq = ret;
>> >      }
>> > -    else if ( hardware_domain )
>> > +    else if ( dm_domain )
>> >      {
>> > -        ret = irq_permit_access(hardware_domain, irq);
>> > +        ret = irq_permit_access(dm_domain, irq);
>> 
>> Doesn't this imply that Dom0 has no way of cleaning up after the
>> guest/stubdom pair? IOW I wonder whether both dm and hwdom
>> should be granted access.
> 
> See discussion with Roger on this very patch.
> In short: since permissions are stored in domain struct, not irq, there
> is not much to cleanup after domain destruction.

My point wasn't about cleaning up permissions, but about cleaning
up the IRQs. Dom0 can't do anything with them without being
given permission.

> Also, toolstack in dom0
> has no idea about IRQs allocated by stubdomain, so it couldn't do such
> cleanup anyway.

Well, a last resort cleanup could always be to iterate over all possible
IRQs. Whether that's helpful on top of Xen internal cleanup depends
on how much we care about cleaning up resources of zombie domains
as much as possible.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-27 15:05     ` Marek Marczykowski
@ 2019-02-28 10:58       ` Jan Beulich
  2019-02-28 12:25         ` Marek Marczykowski
  2019-03-03  3:26         ` Marek Marczykowski
  0 siblings, 2 replies; 43+ messages in thread
From: Jan Beulich @ 2019-02-28 10:58 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne

>>> On 27.02.19 at 16:05, <marmarek@invisiblethingslab.com> wrote:
> On Wed, Feb 27, 2019 at 04:41:37AM -0700, Jan Beulich wrote:
>> >>> On 07.02.19 at 01:07, <marmarek@invisiblethingslab.com> wrote:
>> > +int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable)
>> > +{
>> > +    int ret;
>> > +
>> > +    ret = xsm_msi_set_enable(XSM_DM_PRIV, pdev->domain,
>> > +                             (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn,
>> > +                             mode, enable);
>> > +    if ( ret )
>> > +        return ret;
>> > +
>> > +    switch ( mode )
>> > +    {
>> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSI:
>> > +        msi_set_enable(pdev, enable);
>> > +        break;
>> > +
>> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSIX:
>> > +        msix_set_enable(pdev, enable);
>> > +        break;
>> > +    }
>> 
>> What about a call to pci_intx()? 
> 
> Should pci_intx(dev, !enable) be called in all those cases?

Well, that depends whether Dom0 is involved, which is where the
operation would normally be done. But since this is about bypassing
pciback, I think it may be needed.

>> And what about invocations for
>> the wrong device (e.g. MSI-X request for MSI-X incapable device)?
> 
> Looking at msi(x)_set_enable(), it is no-op for incapable devices, but
> if the function would do anything else, indeed such check should be
> added. Is pci_find_cap_offset(..., PCI_CAP_ID_MSI(X)) the correct way
> of doing that?

Well, for MSI-X you could simply check pdev->msix to be non-NULL.
For MSI I think looking for the capability is your only choice.

Another thing: You're also bypassing the MSI{,-X}-already-enabled
checks that __pci_enable_msi{,x}() do, yet allowing to enable both
on a device would be a security issue.

>> > +    /* IN */
>> > +    uint16_t seg;
>> > +    uint8_t bus;
>> > +    uint8_t devfn;
>> > +    uint8_t mode;
>> > +    uint8_t enable;
>> 
>> "mode" and "enable" don't really make clear which of the two is the
>> boolean, and which is the operation. I'd anyway prefer a single
>> flags field with descriptive #define-s, which will also make more
>> obvious how to extend this if need be.
> 
> You mean:
> 
> #define PHYSDEVOP_MSI_CONTROL_ENABLE 1
> #define PHYSDEVOP_MSI_CONTROL_MSI    2 
> #define PHYSDEVOP_MSI_CONTROL_MSIX   4

Not exactly - you need just two flags: One selecting between
enable and disable, and a second selecting between MSI and
MSI-X. Otherwise, in your model, what do 0 or ENABLE alone
mean?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-28 10:50                           ` Jan Beulich
@ 2019-02-28 11:41                             ` Roger Pau Monné
  0 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2019-02-28 11:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Marek Marczykowski,
	Tim Deegan, Simon Gaiser, Julien Grall, Suravee Suthikulpanit,
	xen-devel, Brian Woods

On Thu, Feb 28, 2019 at 03:50:14AM -0700, Jan Beulich wrote:
> >>> On 27.02.19 at 16:18, <marmarek@invisiblethingslab.com> wrote:
> > On Wed, Feb 27, 2019 at 04:07:54AM -0700, Jan Beulich wrote:
> >> >>> On 08.02.19 at 11:17, <marmarek@invisiblethingslab.com> wrote:
> >> > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> >> >          desc->arch.used = IRQ_UNUSED;
> >> >          irq = ret;
> >> >      }
> >> > -    else if ( hardware_domain )
> >> > +    else if ( dm_domain )
> >> >      {
> >> > -        ret = irq_permit_access(hardware_domain, irq);
> >> > +        ret = irq_permit_access(dm_domain, irq);
> >> 
> >> Doesn't this imply that Dom0 has no way of cleaning up after the
> >> guest/stubdom pair? IOW I wonder whether both dm and hwdom
> >> should be granted access.
> > 
> > See discussion with Roger on this very patch.
> > In short: since permissions are stored in domain struct, not irq, there
> > is not much to cleanup after domain destruction.
> 
> My point wasn't about cleaning up permissions, but about cleaning
> up the IRQs. Dom0 can't do anything with them without being
> given permission.

Irqs should be cleaned up by the domain model in case of hot-unplug,
or by Xen when the domain is destroyed and the device is deassigned.
This will be fixed by the Intel patches posted by Chao IIRC.

AFAICT this is no different from when the device model in dom0 crashes
and Xen has to perform the clean up of MSI/MSI-X irqs, since neither
pciback, nor the toolstack know anything about those irqs.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-28 10:58       ` Jan Beulich
@ 2019-02-28 12:25         ` Marek Marczykowski
  2019-03-03  1:10           ` Marek Marczykowski
  2019-03-03  3:26         ` Marek Marczykowski
  1 sibling, 1 reply; 43+ messages in thread
From: Marek Marczykowski @ 2019-02-28 12:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne


[-- Attachment #1.1: Type: text/plain, Size: 3080 bytes --]

On Thu, Feb 28, 2019 at 03:58:37AM -0700, Jan Beulich wrote:
> >>> On 27.02.19 at 16:05, <marmarek@invisiblethingslab.com> wrote:
> > On Wed, Feb 27, 2019 at 04:41:37AM -0700, Jan Beulich wrote:
> >> >>> On 07.02.19 at 01:07, <marmarek@invisiblethingslab.com> wrote:
> >> > +int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable)
> >> > +{
> >> > +    int ret;
> >> > +
> >> > +    ret = xsm_msi_set_enable(XSM_DM_PRIV, pdev->domain,
> >> > +                             (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn,
> >> > +                             mode, enable);
> >> > +    if ( ret )
> >> > +        return ret;
> >> > +
> >> > +    switch ( mode )
> >> > +    {
> >> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSI:
> >> > +        msi_set_enable(pdev, enable);
> >> > +        break;
> >> > +
> >> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSIX:
> >> > +        msix_set_enable(pdev, enable);
> >> > +        break;
> >> > +    }
> >> 
> >> What about a call to pci_intx()? 
> > 
> > Should pci_intx(dev, !enable) be called in all those cases?
> 
> Well, that depends whether Dom0 is involved, which is where the
> operation would normally be done. But since this is about bypassing
> pciback, I think it may be needed.
> 
> >> And what about invocations for
> >> the wrong device (e.g. MSI-X request for MSI-X incapable device)?
> > 
> > Looking at msi(x)_set_enable(), it is no-op for incapable devices, but
> > if the function would do anything else, indeed such check should be
> > added. Is pci_find_cap_offset(..., PCI_CAP_ID_MSI(X)) the correct way
> > of doing that?
> 
> Well, for MSI-X you could simply check pdev->msix to be non-NULL.
> For MSI I think looking for the capability is your only choice.
> 
> Another thing: You're also bypassing the MSI{,-X}-already-enabled
> checks that __pci_enable_msi{,x}() do, yet allowing to enable both
> on a device would be a security issue.

Ok.

> >> > +    /* IN */
> >> > +    uint16_t seg;
> >> > +    uint8_t bus;
> >> > +    uint8_t devfn;
> >> > +    uint8_t mode;
> >> > +    uint8_t enable;
> >> 
> >> "mode" and "enable" don't really make clear which of the two is the
> >> boolean, and which is the operation. I'd anyway prefer a single
> >> flags field with descriptive #define-s, which will also make more
> >> obvious how to extend this if need be.
> > 
> > You mean:
> > 
> > #define PHYSDEVOP_MSI_CONTROL_ENABLE 1
> > #define PHYSDEVOP_MSI_CONTROL_MSI    2 
> > #define PHYSDEVOP_MSI_CONTROL_MSIX   4
> 
> Not exactly - you need just two flags: One selecting between
> enable and disable, and a second selecting between MSI and
> MSI-X. Otherwise, in your model, what do 0 or ENABLE alone
> mean?

I put 3 flags there for easier extending it in the future.
But maybe indeed two flags + error on any other bit set would be enough
too.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-28 12:25         ` Marek Marczykowski
@ 2019-03-03  1:10           ` Marek Marczykowski
  2019-03-04 10:19             ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski @ 2019-03-03  1:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne


[-- Attachment #1.1: Type: text/plain, Size: 610 bytes --]

On Thu, Feb 28, 2019 at 01:25:50PM +0100, Marek Marczykowski wrote:
> On Thu, Feb 28, 2019 at 03:58:37AM -0700, Jan Beulich wrote:
> > Another thing: You're also bypassing the MSI{,-X}-already-enabled
> > checks that __pci_enable_msi{,x}() do, yet allowing to enable both
> > on a device would be a security issue.
> 
> Ok.

Hmm, could you explain more? Is that only the case when interrupt
remapping is missing?


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-02-28 10:58       ` Jan Beulich
  2019-02-28 12:25         ` Marek Marczykowski
@ 2019-03-03  3:26         ` Marek Marczykowski
  1 sibling, 0 replies; 43+ messages in thread
From: Marek Marczykowski @ 2019-03-03  3:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne


[-- Attachment #1.1: Type: text/plain, Size: 1616 bytes --]

On Thu, Feb 28, 2019 at 03:58:37AM -0700, Jan Beulich wrote:
> >>> On 27.02.19 at 16:05, <marmarek@invisiblethingslab.com> wrote:
> > On Wed, Feb 27, 2019 at 04:41:37AM -0700, Jan Beulich wrote:
> >> >>> On 07.02.19 at 01:07, <marmarek@invisiblethingslab.com> wrote:
> >> > +int msi_msix_set_enable(struct pci_dev *pdev, int mode, int enable)
> >> > +{
> >> > +    int ret;
> >> > +
> >> > +    ret = xsm_msi_set_enable(XSM_DM_PRIV, pdev->domain,
> >> > +                             (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn,
> >> > +                             mode, enable);
> >> > +    if ( ret )
> >> > +        return ret;
> >> > +
> >> > +    switch ( mode )
> >> > +    {
> >> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSI:
> >> > +        msi_set_enable(pdev, enable);
> >> > +        break;
> >> > +
> >> > +    case PHYSDEVOP_MSI_SET_ENABLE_MSIX:
> >> > +        msix_set_enable(pdev, enable);
> >> > +        break;
> >> > +    }
> >> 
> >> What about a call to pci_intx()? 
> > 
> > Should pci_intx(dev, !enable) be called in all those cases?
> 
> Well, that depends whether Dom0 is involved, which is where the
> operation would normally be done. But since this is about bypassing
> pciback, I think it may be needed.

Shouldn't that be done by device model itself? Or even on command from
the target domain? Automatically toggling INTx when manipulating MSI(-X)
seems wrong.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-03-03  1:10           ` Marek Marczykowski
@ 2019-03-04 10:19             ` Roger Pau Monné
  2019-03-04 10:22               ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-03-04 10:19 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, xen-devel, Daniel de Graaf

On Sun, Mar 03, 2019 at 02:10:24AM +0100, Marek Marczykowski wrote:
> On Thu, Feb 28, 2019 at 01:25:50PM +0100, Marek Marczykowski wrote:
> > On Thu, Feb 28, 2019 at 03:58:37AM -0700, Jan Beulich wrote:
> > > Another thing: You're also bypassing the MSI{,-X}-already-enabled
> > > checks that __pci_enable_msi{,x}() do, yet allowing to enable both
> > > on a device would be a security issue.
> > 
> > Ok.
> 
> Hmm, could you explain more? Is that only the case when interrupt
> remapping is missing?

I think what Jan mentions is that the hypercall to enable MSI(-X)
should make sure PCI INTx is disabled, and prevent enabling both MSI
and MSI-X on the same device.

The device model that manages the passthrough device should already
make sure of that, but Xen should also protect itself against
bad-behaved device models when possible.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable
  2019-03-04 10:19             ` Roger Pau Monné
@ 2019-03-04 10:22               ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2019-03-04 10:22 UTC (permalink / raw)
  To: roger.pau, marmarek
  Cc: sstabellini, wei.liu2, konrad.wilk, George.Dunlap,
	andrew.cooper3, Ian.Jackson, tim, julien.grall, xen-devel,
	dgdegra

>>> Roger Pau Monné <roger.pau@citrix.com> 03/04/19 11:19 AM >>>
>On Sun, Mar 03, 2019 at 02:10:24AM +0100, Marek Marczykowski wrote:
>> On Thu, Feb 28, 2019 at 01:25:50PM +0100, Marek Marczykowski wrote:
>> > On Thu, Feb 28, 2019 at 03:58:37AM -0700, Jan Beulich wrote:
>> > > Another thing: You're also bypassing the MSI{,-X}-already-enabled
>> > > checks that __pci_enable_msi{,x}() do, yet allowing to enable both
>> > > on a device would be a security issue.
>> > 
>> > Ok.
>> 
>> Hmm, could you explain more? Is that only the case when interrupt
>> remapping is missing?
>
>I think what Jan mentions is that the hypercall to enable MSI(-X)
>should make sure PCI INTx is disabled, and prevent enabling both MSI
>and MSI-X on the same device.
>
>The device model that manages the passthrough device should already
>make sure of that, but Xen should also protect itself against
>bad-behaved device models when possible.

Right, and specifically in the case where the device model itself runs with
limited privileges.

Jan




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-02-22 10:42                           ` Roger Pau Monné
  2019-02-22 11:11                             ` Jan Beulich
@ 2019-03-07  0:50                             ` Marek Marczykowski-Górecki
  2019-03-07 14:48                               ` Roger Pau Monné
  1 sibling, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-03-07  0:50 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods


[-- Attachment #1.1: Type: text/plain, Size: 8113 bytes --]

On Fri, Feb 22, 2019 at 11:42:22AM +0100, Roger Pau Monné wrote:
> On Thu, Feb 21, 2019 at 06:40:40PM +0100, Marek Marczykowski-Górecki wrote:
> > On Thu, Feb 21, 2019 at 05:47:51PM +0100, Roger Pau Monné wrote:
> > > On Fri, Feb 08, 2019 at 11:17:05AM +0100, Marek Marczykowski-Górecki wrote:
> > > >  {
> > > >      int irq, ret;
> > > >      struct irq_desc *desc;
> > > > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> > > >          desc->arch.used = IRQ_UNUSED;
> > > >          irq = ret;
> > > >      }
> > > > -    else if ( hardware_domain )
> > > > +    else if ( dm_domain )
> > > 
> > > Can you guarantee that dm_domain is always current->domain?
> > 
> > No, in some cases it will be hardware_domain.
> 
> Right, but in that case current->domain == hardware_domain I guess?
> 
> > 
> > > I think you need to assert for this, or else take a reference to
> > > dm_domain (get_domain) before accessing it's fields, or else you risk
> > > the domain being destroyed while modifying it's fields.
> > 
> > Can hardware_domain be destroyed without panicking Xen? If so,
> > get_domain would be indeed needed.
> 
> What about other callers that are not the hardware_domain? You need to
> make sure those domains are not destroyed while {create/destroy}_irq
> is changing the permissions.
> 
> If you can guarantee that dm_domain == current->domain then you are
> safe, if not you need to get a reference before modifying any fields
> of the domain struct.
> 
> So IMO you either need to add an assert or a get_domain depending on
> the answer to the question above.

I've added an assert, and I think I have the answer to the above question:

    (XEN) Assertion 'd == current->domain' failed at irq.c:232
    (XEN) ----[ Xen-4.12.0-rc  x86_64  debug=y   Not tainted ]----
    (XEN) CPU:    2
    (XEN) RIP:    e008:[<ffff82d08028f545>] destroy_irq+0x126/0x143
    (XEN) RFLAGS: 0000000000010206   CONTEXT: hypervisor
(...)
    (XEN) Xen call trace:
    (XEN)    [<ffff82d08028f545>] destroy_irq+0x126/0x143
    (XEN)    [<ffff82d08028ce1e>] msi_free_irq+0x51/0x1b8
    (XEN)    [<ffff82d0802923e1>] unmap_domain_pirq+0x487/0x4d4
    (XEN)    [<ffff82d08029249f>] free_domain_pirqs+0x71/0x8f
    (XEN)    [<ffff82d0802819e0>] arch_domain_destroy+0x41/0xa1
    (XEN)    [<ffff82d080207d22>] domain.c#complete_domain_destroy+0x86/0x159
    (XEN)    [<ffff82d08022a658>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cc
    (XEN)    [<ffff82d08023c4fa>] softirq.c#__do_softirq+0x78/0x9a
    (XEN)    [<ffff82d08023c566>] do_softirq+0x13/0x15
    (XEN)    [<ffff82d080280532>] domain.c#idle_loop+0x63/0xb9

In this case, current->domain obviously isn't the stubdomain, because at
this point it is already destroyed (it keeps reference to the target
domain, so target domain couldn't be destroyed before its stubdomain).

In fact, in this case get_dm_domain() returns wrong value, since it
isn't called by device model. At the point where stubdomain is already
destroyed, I think it should return NULL, but it returns
hardware_domain. But it isn't that easy, because target domain don't
have any reference to its stubdomain. Especially already destroyed one.

I's defined as:

    static struct domain *get_dm_domain(struct domain *d)
    {
        return current->domain->target == d ? current->domain :
                                              hardware_domain;
    }

Since hardware domain couldn't be destroyed(*) while the system is
running, in practice this wrong return value it isn't a problem.
Hardware didn't have permission over this IRQ (if stubdomain have
created it), so irq_deny_access is a no-op.

So, I would adjust assert in destroy_irq to allow also hardware_domain,
and add a comment that get_dm_domain may return hardware_domain during
domain destruction. Is that ok?

(*) is this actually true? I see shutdown_domain(hardware_domain)
cause Xen to shutdown. But I don't see anything like this in
domain_kill/domain_destroy. Normally no other domain than dom0 is able
to destroy other domains (and domain can't destroy itself), so it should
be ok. But with some XSM policy, I think it could be possible to allow
other domain to destroy hardware domain. In fact, just having hardware
domain != dom0 is enough to violate this assumption.
Am I missing anything?

Full crash message:
(XEN) Assertion 'd == current->domain' failed at irq.c:232
(XEN) ----[ Xen-4.12.0-rc  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff82d08028f545>] destroy_irq+0x126/0x143
(XEN) RFLAGS: 0000000000010206   CONTEXT: hypervisor
(XEN) rax: ffff830422264000   rbx: 000000000000001e   rcx: 0000000000000000
(XEN) rdx: ffff8304222de000   rsi: ffff830422235000   rdi: 000000000000001e
(XEN) rbp: ffff83042225fcc0   rsp: ffff83042225fc90   r8:  0000000000000000
(XEN) r9:  ffff830385868880   r10: 0000000000000004   r11: ffff8304222ba010
(XEN) r12: ffff830422235000   r13: 000000000000001e   r14: ffff830422280000
(XEN) r15: ffff83042d8d1000   cr0: 000000008005003b   cr4: 00000000000426e0
(XEN) cr3: 00000000ca49d000   cr2: ffff88011a179c00
(XEN) fsb: 0000000000000000   gsb: ffff880135640000   gss: 0000000000000000
(XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d08028f545> (destroy_irq+0x126/0x143):
(XEN)  41 5e 41 5f 5d c3 0f 0b <0f> 0b 41 0f b7 34 24 89 c1 44 89 ea 48 8d 3d b0
(XEN) Xen stack trace from rsp=ffff83042225fc90:
(XEN)    ffff83042225fca0 0000000000000000 ffff83025f33a860 ffff830422235000
(XEN)    ffff83025f33a860 ffff83042d8d1000 ffff83042225fd00 ffff82d08028ce1e
(XEN)    ffff83025f33a700 ffff83025f33a860 0000000000000000 ffff830422281e00
(XEN)    000000000000001e ffff83042d8d1000 ffff83042225fd90 ffff82d0802923e1
(XEN)    ffffffffffffff26 ffffffffffffffc9 ffffffffffffffc9 ffff83042d8d1858
(XEN)    000000002d8d1000 0000000000000286 0000000000000037 ffff83025f33a860
(XEN)    ffff83042d8d10f0 0000003700000001 ffff83042225fd80 0000000000000037
(XEN)    ffff83042d8d1000 ffff83042d8d1b28 ffff83042d8d10f0 ffff83042d8d10cc
(XEN)    ffff83042225fdd0 ffff82d08029249f 0000000000000013 ffff83042d8d1000
(XEN)    00000000ffffffff ffff83042d8d1b28 ffff83042d8d1000 fffffffffffffff8
(XEN)    ffff83042225fdf0 ffff82d0802819e0 ffff83042d8d1b28 ffff8303d2db1000
(XEN)    ffff83042225fe30 ffff82d080207d22 000008f2621ab810 ffff830422262040
(XEN)    0000000000000000 0000000000000001 ffff83042225ffff ffff82d0805c3880
(XEN)    ffff83042225fe60 ffff82d08022a658 000000000000001d ffff82d0805bb980
(XEN)    ffff82d0805bb880 ffffffffffffffff ffff83042225fea0 ffff82d08023c4fa
(XEN)    ffff82d0805bb980 0000000000000002 ffff82d0805e9af0 ffff82d0805bb880
(XEN)    ffff82d0805d4a20 ffff83042225ffff ffff83042225feb0 ffff82d08023c566
(XEN)    ffff83042225fef0 ffff82d080280532 ffff830422264000 ffff830422264000
(XEN)    ffff830422217000 0000000000000002 ffff830422235000 ffff8304222de000
(XEN)    ffff83042225fd98 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d08028f545>] destroy_irq+0x126/0x143
(XEN)    [<ffff82d08028ce1e>] msi_free_irq+0x51/0x1b8
(XEN)    [<ffff82d0802923e1>] unmap_domain_pirq+0x487/0x4d4
(XEN)    [<ffff82d08029249f>] free_domain_pirqs+0x71/0x8f
(XEN)    [<ffff82d0802819e0>] arch_domain_destroy+0x41/0xa1
(XEN)    [<ffff82d080207d22>] domain.c#complete_domain_destroy+0x86/0x159
(XEN)    [<ffff82d08022a658>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cc
(XEN)    [<ffff82d08023c4fa>] softirq.c#__do_softirq+0x78/0x9a
(XEN)    [<ffff82d08023c566>] do_softirq+0x13/0x15
(XEN)    [<ffff82d080280532>] domain.c#idle_loop+0x63/0xb9
(XEN) 
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Assertion 'd == current->domain' failed at irq.c:232
(XEN) ****************************************


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-03-07  0:50                             ` Marek Marczykowski-Górecki
@ 2019-03-07 14:48                               ` Roger Pau Monné
  2019-03-07 22:28                                 ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-03-07 14:48 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods

On Thu, Mar 07, 2019 at 01:50:04AM +0100, Marek Marczykowski-Górecki wrote:
> On Fri, Feb 22, 2019 at 11:42:22AM +0100, Roger Pau Monné wrote:
> > On Thu, Feb 21, 2019 at 06:40:40PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Thu, Feb 21, 2019 at 05:47:51PM +0100, Roger Pau Monné wrote:
> > > > On Fri, Feb 08, 2019 at 11:17:05AM +0100, Marek Marczykowski-Górecki wrote:
> > > > >  {
> > > > >      int irq, ret;
> > > > >      struct irq_desc *desc;
> > > > > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> > > > >          desc->arch.used = IRQ_UNUSED;
> > > > >          irq = ret;
> > > > >      }
> > > > > -    else if ( hardware_domain )
> > > > > +    else if ( dm_domain )
> > > > 
> > > > Can you guarantee that dm_domain is always current->domain?
> > > 
> > > No, in some cases it will be hardware_domain.
> > 
> > Right, but in that case current->domain == hardware_domain I guess?
> > 
> > > 
> > > > I think you need to assert for this, or else take a reference to
> > > > dm_domain (get_domain) before accessing it's fields, or else you risk
> > > > the domain being destroyed while modifying it's fields.
> > > 
> > > Can hardware_domain be destroyed without panicking Xen? If so,
> > > get_domain would be indeed needed.
> > 
> > What about other callers that are not the hardware_domain? You need to
> > make sure those domains are not destroyed while {create/destroy}_irq
> > is changing the permissions.
> > 
> > If you can guarantee that dm_domain == current->domain then you are
> > safe, if not you need to get a reference before modifying any fields
> > of the domain struct.
> > 
> > So IMO you either need to add an assert or a get_domain depending on
> > the answer to the question above.
> 
> I've added an assert, and I think I have the answer to the above question:
> 
>     (XEN) Assertion 'd == current->domain' failed at irq.c:232
>     (XEN) ----[ Xen-4.12.0-rc  x86_64  debug=y   Not tainted ]----
>     (XEN) CPU:    2
>     (XEN) RIP:    e008:[<ffff82d08028f545>] destroy_irq+0x126/0x143
>     (XEN) RFLAGS: 0000000000010206   CONTEXT: hypervisor
> (...)
>     (XEN) Xen call trace:
>     (XEN)    [<ffff82d08028f545>] destroy_irq+0x126/0x143
>     (XEN)    [<ffff82d08028ce1e>] msi_free_irq+0x51/0x1b8
>     (XEN)    [<ffff82d0802923e1>] unmap_domain_pirq+0x487/0x4d4
>     (XEN)    [<ffff82d08029249f>] free_domain_pirqs+0x71/0x8f
>     (XEN)    [<ffff82d0802819e0>] arch_domain_destroy+0x41/0xa1
>     (XEN)    [<ffff82d080207d22>] domain.c#complete_domain_destroy+0x86/0x159
>     (XEN)    [<ffff82d08022a658>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cc
>     (XEN)    [<ffff82d08023c4fa>] softirq.c#__do_softirq+0x78/0x9a
>     (XEN)    [<ffff82d08023c566>] do_softirq+0x13/0x15
>     (XEN)    [<ffff82d080280532>] domain.c#idle_loop+0x63/0xb9
> 
> In this case, current->domain obviously isn't the stubdomain, because at
> this point it is already destroyed (it keeps reference to the target
> domain, so target domain couldn't be destroyed before its stubdomain).
> 
> In fact, in this case get_dm_domain() returns wrong value, since it
> isn't called by device model. At the point where stubdomain is already
> destroyed, I think it should return NULL, but it returns
> hardware_domain. But it isn't that easy, because target domain don't
> have any reference to its stubdomain. Especially already destroyed one.
> 
> I's defined as:
> 
>     static struct domain *get_dm_domain(struct domain *d)
>     {
>         return current->domain->target == d ? current->domain :
>                                               hardware_domain;
>     }

So get_dm_domain works fine when used by create_irq, because in that
case current->domain == d AFAICT.

As you pointed out however using the same mechanism for destroy is not
suitable, since the stubdomain might be already destroyed, and
unmap_domain_pirq called from the idle vCPU.

> Since hardware domain couldn't be destroyed(*) while the system is
> running, in practice this wrong return value it isn't a problem.
> Hardware didn't have permission over this IRQ (if stubdomain have
> created it), so irq_deny_access is a no-op.
> 
> So, I would adjust assert in destroy_irq to allow also hardware_domain,
> and add a comment that get_dm_domain may return hardware_domain during
> domain destruction. Is that ok?

Hm, albeit I agree with your analysis, I feel like this proposed
solution is kind of a workaround. Given the context, I think the best
way to deal with this issue in destroy_irq is to iterate over the list
of domains and make sure no domain has permissions over the destroyed
irq. Note that with this proposed solution you will have to drop the
domain parameter from destroy_irq.

Another option would be to store which domains are given permissions
over which irqs in a way that's optimized to get the list of domains
given an irq (ie: without having to iterate over the list of domains
like my proposed solution above).

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-03-07 14:48                               ` Roger Pau Monné
@ 2019-03-07 22:28                                 ` Marek Marczykowski-Górecki
  2019-03-08 10:26                                   ` Roger Pau Monné
  2019-03-08 12:33                                   ` Jan Beulich
  0 siblings, 2 replies; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-03-07 22:28 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods


[-- Attachment #1.1: Type: text/plain, Size: 7072 bytes --]

On Thu, Mar 07, 2019 at 03:48:01PM +0100, Roger Pau Monné wrote:
> On Thu, Mar 07, 2019 at 01:50:04AM +0100, Marek Marczykowski-Górecki wrote:
> > On Fri, Feb 22, 2019 at 11:42:22AM +0100, Roger Pau Monné wrote:
> > > On Thu, Feb 21, 2019 at 06:40:40PM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Thu, Feb 21, 2019 at 05:47:51PM +0100, Roger Pau Monné wrote:
> > > > > On Fri, Feb 08, 2019 at 11:17:05AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > >  {
> > > > > >      int irq, ret;
> > > > > >      struct irq_desc *desc;
> > > > > > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> > > > > >          desc->arch.used = IRQ_UNUSED;
> > > > > >          irq = ret;
> > > > > >      }
> > > > > > -    else if ( hardware_domain )
> > > > > > +    else if ( dm_domain )
> > > > > 
> > > > > Can you guarantee that dm_domain is always current->domain?
> > > > 
> > > > No, in some cases it will be hardware_domain.
> > > 
> > > Right, but in that case current->domain == hardware_domain I guess?
> > > 
> > > > 
> > > > > I think you need to assert for this, or else take a reference to
> > > > > dm_domain (get_domain) before accessing it's fields, or else you risk
> > > > > the domain being destroyed while modifying it's fields.
> > > > 
> > > > Can hardware_domain be destroyed without panicking Xen? If so,
> > > > get_domain would be indeed needed.
> > > 
> > > What about other callers that are not the hardware_domain? You need to
> > > make sure those domains are not destroyed while {create/destroy}_irq
> > > is changing the permissions.
> > > 
> > > If you can guarantee that dm_domain == current->domain then you are
> > > safe, if not you need to get a reference before modifying any fields
> > > of the domain struct.
> > > 
> > > So IMO you either need to add an assert or a get_domain depending on
> > > the answer to the question above.
> > 
> > I've added an assert, and I think I have the answer to the above question:
> > 
> >     (XEN) Assertion 'd == current->domain' failed at irq.c:232
> >     (XEN) ----[ Xen-4.12.0-rc  x86_64  debug=y   Not tainted ]----
> >     (XEN) CPU:    2
> >     (XEN) RIP:    e008:[<ffff82d08028f545>] destroy_irq+0x126/0x143
> >     (XEN) RFLAGS: 0000000000010206   CONTEXT: hypervisor
> > (...)
> >     (XEN) Xen call trace:
> >     (XEN)    [<ffff82d08028f545>] destroy_irq+0x126/0x143
> >     (XEN)    [<ffff82d08028ce1e>] msi_free_irq+0x51/0x1b8
> >     (XEN)    [<ffff82d0802923e1>] unmap_domain_pirq+0x487/0x4d4
> >     (XEN)    [<ffff82d08029249f>] free_domain_pirqs+0x71/0x8f
> >     (XEN)    [<ffff82d0802819e0>] arch_domain_destroy+0x41/0xa1
> >     (XEN)    [<ffff82d080207d22>] domain.c#complete_domain_destroy+0x86/0x159
> >     (XEN)    [<ffff82d08022a658>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cc
> >     (XEN)    [<ffff82d08023c4fa>] softirq.c#__do_softirq+0x78/0x9a
> >     (XEN)    [<ffff82d08023c566>] do_softirq+0x13/0x15
> >     (XEN)    [<ffff82d080280532>] domain.c#idle_loop+0x63/0xb9
> > 
> > In this case, current->domain obviously isn't the stubdomain, because at
> > this point it is already destroyed (it keeps reference to the target
> > domain, so target domain couldn't be destroyed before its stubdomain).
> > 
> > In fact, in this case get_dm_domain() returns wrong value, since it
> > isn't called by device model. At the point where stubdomain is already
> > destroyed, I think it should return NULL, but it returns
> > hardware_domain. But it isn't that easy, because target domain don't
> > have any reference to its stubdomain. Especially already destroyed one.
> > 
> > I's defined as:
> > 
> >     static struct domain *get_dm_domain(struct domain *d)
> >     {
> >         return current->domain->target == d ? current->domain :
> >                                               hardware_domain;
> >     }
> 
> So get_dm_domain works fine when used by create_irq, because in that
> case current->domain == d AFAICT.
> 
> As you pointed out however using the same mechanism for destroy is not
> suitable, since the stubdomain might be already destroyed, and
> unmap_domain_pirq called from the idle vCPU.
> 
> > Since hardware domain couldn't be destroyed(*) while the system is
> > running, in practice this wrong return value it isn't a problem.
> > Hardware didn't have permission over this IRQ (if stubdomain have
> > created it), so irq_deny_access is a no-op.
> > 
> > So, I would adjust assert in destroy_irq to allow also hardware_domain,
> > and add a comment that get_dm_domain may return hardware_domain during
> > domain destruction. Is that ok?
> 
> Hm, albeit I agree with your analysis, I feel like this proposed
> solution is kind of a workaround. Given the context, I think the best
> way to deal with this issue in destroy_irq is to iterate over the list
> of domains and make sure no domain has permissions over the destroyed
> irq. Note that with this proposed solution you will have to drop the
> domain parameter from destroy_irq.

I'd really like to avoid iterating the whole domain list. Jan, what do
you think? Code-wise this seems to be the easiest solution.

> Another option would be to store which domains are given permissions
> over which irqs in a way that's optimized to get the list of domains
> given an irq (ie: without having to iterate over the list of domains
> like my proposed solution above).

This may make sense, but if that would be instead of the current
structure, we'd have another problem: when destroying stubdomain, you'd
need to get list IRQs it has access to, to explicitly revoke
stubdomain's access. In theory it could be done by looking at the target
domain and iterating over its IRQs, but this is getting more and more
complex.

I think the tricky part is unmap_domain_pirq->msi_free_irq, which can be
called:
1. from PHYSDEVOP_unmap_pirq, by stubdomain
2. from PHYSDEVOP_unmap_pirq, by dom0 when device model runs there
3. from PHYSDEVOP_unmap_pirq, by dom0 even with device model in
stubdomain - normally it shouldn't happen for MSI allocated by
stubdomain, but dom0 is allowed to do so, and it shouldn't cause any
harm
4. from free_domain_pirqs, during domain destruction
5. various error paths

If unmap_domain_pirq would know where device model is running, even if
not called by it, that would help. What about adding back reference in
struct domain to a stubdomain? That would help a lot here. The only
problem is a circular dependency stubdom->target->stubdom. This cycle
would need to be broken during stubdom teardown. domain_kill(stubdom)
looks like a good place to break it. Is it guaranteed to be called, or
is there some other path to destroying a stubdomain?

Can one HVM domain have multiple stubdomains? If so, it should a be
list, not a single field.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-03-07 22:28                                 ` Marek Marczykowski-Górecki
@ 2019-03-08 10:26                                   ` Roger Pau Monné
  2019-03-08 16:49                                     ` Marek Marczykowski-Górecki
  2019-03-08 12:33                                   ` Jan Beulich
  1 sibling, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2019-03-08 10:26 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods

On Thu, Mar 07, 2019 at 11:28:25PM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Mar 07, 2019 at 03:48:01PM +0100, Roger Pau Monné wrote:
> > On Thu, Mar 07, 2019 at 01:50:04AM +0100, Marek Marczykowski-Górecki wrote:
> > > On Fri, Feb 22, 2019 at 11:42:22AM +0100, Roger Pau Monné wrote:
> > > > On Thu, Feb 21, 2019 at 06:40:40PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Thu, Feb 21, 2019 at 05:47:51PM +0100, Roger Pau Monné wrote:
> > > > > > On Fri, Feb 08, 2019 at 11:17:05AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > >  {
> > > > > > >      int irq, ret;
> > > > > > >      struct irq_desc *desc;
> > > > > > > @@ -190,19 +190,19 @@ int create_irq(nodeid_t node)
> > > > > > >          desc->arch.used = IRQ_UNUSED;
> > > > > > >          irq = ret;
> > > > > > >      }
> > > > > > > -    else if ( hardware_domain )
> > > > > > > +    else if ( dm_domain )
> > > > > > 
> > > > > > Can you guarantee that dm_domain is always current->domain?
> > > > > 
> > > > > No, in some cases it will be hardware_domain.
> > > > 
> > > > Right, but in that case current->domain == hardware_domain I guess?
> > > > 
> > > > > 
> > > > > > I think you need to assert for this, or else take a reference to
> > > > > > dm_domain (get_domain) before accessing it's fields, or else you risk
> > > > > > the domain being destroyed while modifying it's fields.
> > > > > 
> > > > > Can hardware_domain be destroyed without panicking Xen? If so,
> > > > > get_domain would be indeed needed.
> > > > 
> > > > What about other callers that are not the hardware_domain? You need to
> > > > make sure those domains are not destroyed while {create/destroy}_irq
> > > > is changing the permissions.
> > > > 
> > > > If you can guarantee that dm_domain == current->domain then you are
> > > > safe, if not you need to get a reference before modifying any fields
> > > > of the domain struct.
> > > > 
> > > > So IMO you either need to add an assert or a get_domain depending on
> > > > the answer to the question above.
> > > 
> > > I've added an assert, and I think I have the answer to the above question:
> > > 
> > >     (XEN) Assertion 'd == current->domain' failed at irq.c:232
> > >     (XEN) ----[ Xen-4.12.0-rc  x86_64  debug=y   Not tainted ]----
> > >     (XEN) CPU:    2
> > >     (XEN) RIP:    e008:[<ffff82d08028f545>] destroy_irq+0x126/0x143
> > >     (XEN) RFLAGS: 0000000000010206   CONTEXT: hypervisor
> > > (...)
> > >     (XEN) Xen call trace:
> > >     (XEN)    [<ffff82d08028f545>] destroy_irq+0x126/0x143
> > >     (XEN)    [<ffff82d08028ce1e>] msi_free_irq+0x51/0x1b8
> > >     (XEN)    [<ffff82d0802923e1>] unmap_domain_pirq+0x487/0x4d4
> > >     (XEN)    [<ffff82d08029249f>] free_domain_pirqs+0x71/0x8f
> > >     (XEN)    [<ffff82d0802819e0>] arch_domain_destroy+0x41/0xa1
> > >     (XEN)    [<ffff82d080207d22>] domain.c#complete_domain_destroy+0x86/0x159
> > >     (XEN)    [<ffff82d08022a658>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cc
> > >     (XEN)    [<ffff82d08023c4fa>] softirq.c#__do_softirq+0x78/0x9a
> > >     (XEN)    [<ffff82d08023c566>] do_softirq+0x13/0x15
> > >     (XEN)    [<ffff82d080280532>] domain.c#idle_loop+0x63/0xb9
> > > 
> > > In this case, current->domain obviously isn't the stubdomain, because at
> > > this point it is already destroyed (it keeps reference to the target
> > > domain, so target domain couldn't be destroyed before its stubdomain).
> > > 
> > > In fact, in this case get_dm_domain() returns wrong value, since it
> > > isn't called by device model. At the point where stubdomain is already
> > > destroyed, I think it should return NULL, but it returns
> > > hardware_domain. But it isn't that easy, because target domain don't
> > > have any reference to its stubdomain. Especially already destroyed one.
> > > 
> > > I's defined as:
> > > 
> > >     static struct domain *get_dm_domain(struct domain *d)
> > >     {
> > >         return current->domain->target == d ? current->domain :
> > >                                               hardware_domain;
> > >     }
> > 
> > So get_dm_domain works fine when used by create_irq, because in that
> > case current->domain == d AFAICT.
> > 
> > As you pointed out however using the same mechanism for destroy is not
> > suitable, since the stubdomain might be already destroyed, and
> > unmap_domain_pirq called from the idle vCPU.
> > 
> > > Since hardware domain couldn't be destroyed(*) while the system is
> > > running, in practice this wrong return value it isn't a problem.
> > > Hardware didn't have permission over this IRQ (if stubdomain have
> > > created it), so irq_deny_access is a no-op.
> > > 
> > > So, I would adjust assert in destroy_irq to allow also hardware_domain,
> > > and add a comment that get_dm_domain may return hardware_domain during
> > > domain destruction. Is that ok?
> > 
> > Hm, albeit I agree with your analysis, I feel like this proposed
> > solution is kind of a workaround. Given the context, I think the best
> > way to deal with this issue in destroy_irq is to iterate over the list
> > of domains and make sure no domain has permissions over the destroyed
> > irq. Note that with this proposed solution you will have to drop the
> > domain parameter from destroy_irq.
> 
> I'd really like to avoid iterating the whole domain list. Jan, what do
> you think? Code-wise this seems to be the easiest solution.
> 
> > Another option would be to store which domains are given permissions
> > over which irqs in a way that's optimized to get the list of domains
> > given an irq (ie: without having to iterate over the list of domains
> > like my proposed solution above).
> 
> This may make sense, but if that would be instead of the current
> structure, we'd have another problem: when destroying stubdomain, you'd
> need to get list IRQs it has access to, to explicitly revoke
> stubdomain's access.

Indeed. You would have to come up with a structure that allows to both
get the list of irqs given a domain, or get the domain list given a
irq.

Maybe as a start you could expand irq_desc to contain the list of
domains that have permission over the irq, and then use this
information on cleanup?

The main issue with this approach is that you would also need to
cleanup this information from irq_desc if the stubdomain is destroyed
before destroying the IRQ.

> In theory it could be done by looking at the target
> domain and iterating over its IRQs, but this is getting more and more
> complex.
> 
> I think the tricky part is unmap_domain_pirq->msi_free_irq, which can be
> called:
> 1. from PHYSDEVOP_unmap_pirq, by stubdomain
> 2. from PHYSDEVOP_unmap_pirq, by dom0 when device model runs there
> 3. from PHYSDEVOP_unmap_pirq, by dom0 even with device model in
> stubdomain - normally it shouldn't happen for MSI allocated by
> stubdomain, but dom0 is allowed to do so, and it shouldn't cause any
> harm
> 4. from free_domain_pirqs, during domain destruction
> 5. various error paths
> 
> If unmap_domain_pirq would know where device model is running, even if
> not called by it, that would help. What about adding back reference in
> struct domain to a stubdomain? That would help a lot here. The only
> problem is a circular dependency stubdom->target->stubdom. This cycle
> would need to be broken during stubdom teardown. domain_kill(stubdom)
> looks like a good place to break it. Is it guaranteed to be called, or
> is there some other path to destroying a stubdomain?

A stubdomain AFAICT is handled like others domains from a hypervisor
PoV, there's no distinction between guests and stubdomains inside the
hypervisor, so I think domain_kill would be an appropriate place.

> Can one HVM domain have multiple stubdomains? If so, it should a be
> list, not a single field.

Yes, I think that's a supported setup. You can register multiple ioreq
servers handling accesses for a single domain, and there's no
restriction that forces running all those ioreq servers in the same
control domain.

Similarly you could have a single domain that has control permissions
over multiple domains, ie: like the hardware domain. domain->target
should likely be a list also in order to support this use case, but I
guess no one has yet required such use-case.

But maybe I'm just overdesigning this when there's no use-case of a
domain having multiple stubdomains, or a stubdomain serving multiple
domains.

Maybe it's enough to have a clear statement of the scope of this
mechanism and it's current limitations:

 - A domain different that the hardware domain can only have control
   permissions over a single other domain.

 - When a domain with passed through devices that have mediators
   running in a domain different than the hardware domain is destroyed
   the domain running those mediators must have been destroyed
   beforehand.

With those limitations in mind I think you could then use
get_dm_domain in destroy_irq. IMO I think this is fragile, but would
be enough to solve the issue you are currently facing.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-03-07 22:28                                 ` Marek Marczykowski-Górecki
  2019-03-08 10:26                                   ` Roger Pau Monné
@ 2019-03-08 12:33                                   ` Jan Beulich
  1 sibling, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2019-03-08 12:33 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Simon Gaiser, Julien Grall, Suravee Suthikulpanit, xen-devel,
	Brian Woods, Roger Pau Monne

>>> On 07.03.19 at 23:28, <marmarek@invisiblethingslab.com> wrote:
> On Thu, Mar 07, 2019 at 03:48:01PM +0100, Roger Pau Monné wrote:
>> Hm, albeit I agree with your analysis, I feel like this proposed
>> solution is kind of a workaround. Given the context, I think the best
>> way to deal with this issue in destroy_irq is to iterate over the list
>> of domains and make sure no domain has permissions over the destroyed
>> irq. Note that with this proposed solution you will have to drop the
>> domain parameter from destroy_irq.
> 
> I'd really like to avoid iterating the whole domain list. Jan, what do
> you think? Code-wise this seems to be the easiest solution.

I'd certainly like to avoid iterations over the domain list, too.

I don't think a back pointer is necessary though. A domain ID would
seem sufficient - if (at the time you need a domain pointer) it doesn't
resolve to a valid domain, or to one whose ->target points back at
the domain you have in hands, then it's gone (at least far enough
for the purposes here).

> Can one HVM domain have multiple stubdomains? If so, it should a be
> list, not a single field.

Other than what Roger has said in his subsequent reply, I'd prefer
if at least no new restrictions were introduced. Yet I agree with
his over-engineering remark, and hence I could see the amount
getting limited to a small number for now (say 4 for 64-bit and 2
for 32-bit, such that the resulting array of domain IDs consumes
just a single pointer's worth of memory).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-03-08 10:26                                   ` Roger Pau Monné
@ 2019-03-08 16:49                                     ` Marek Marczykowski-Górecki
  2019-03-08 17:04                                       ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Marek Marczykowski-Górecki @ 2019-03-08 16:49 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Simon Gaiser, Julien Grall, Jan Beulich, xen-devel,
	Brian Woods


[-- Attachment #1.1: Type: text/plain, Size: 6056 bytes --]

On Fri, Mar 08, 2019 at 11:26:28AM +0100, Roger Pau Monné wrote:
> On Thu, Mar 07, 2019 at 11:28:25PM +0100, Marek Marczykowski-Górecki wrote:
> > On Thu, Mar 07, 2019 at 03:48:01PM +0100, Roger Pau Monné wrote:
> > > Another option would be to store which domains are given permissions
> > > over which irqs in a way that's optimized to get the list of domains
> > > given an irq (ie: without having to iterate over the list of domains
> > > like my proposed solution above).
> > 
> > This may make sense, but if that would be instead of the current
> > structure, we'd have another problem: when destroying stubdomain, you'd
> > need to get list IRQs it has access to, to explicitly revoke
> > stubdomain's access.
> 
> Indeed. You would have to come up with a structure that allows to both
> get the list of irqs given a domain, or get the domain list given a
> irq.
> 
> Maybe as a start you could expand irq_desc to contain the list of
> domains that have permission over the irq, and then use this
> information on cleanup?
> 
> The main issue with this approach is that you would also need to
> cleanup this information from irq_desc if the stubdomain is destroyed
> before destroying the IRQ.

In case of stubdomain specifically, such cleanup wouldn't be that hard,
as I can get list if IRQs by iterating over pirqs of d->target. Also, I
can simply lookup to what IRQs such stubdomain have access through
d->irq_caps.  But similar to target->stubdomain mapping idea, this would
introduce a circular dependency, which needs to be broken at some point.
Adding irq_desc->dm_domain (*) is slightly easier than
target->stubdomain mapping, as a single entry is enough here. This is
because specific (stub)domain have created this IRQ.

On the other hand, having generic mechanism revoking IRQ permissions
during destroy_irq(), regardless of where irq_permit_access() was called
would be nice. A generic mechanism would require a domain list in
irq_desc, not a single entry. Is it worth it?

(*) Regarding dm_domain name, if qemu is running in dom0, then dom0 is
device model domain for this thing.

> > In theory it could be done by looking at the target
> > domain and iterating over its IRQs, but this is getting more and more
> > complex.
> > 
> > I think the tricky part is unmap_domain_pirq->msi_free_irq, which can be
> > called:
> > 1. from PHYSDEVOP_unmap_pirq, by stubdomain
> > 2. from PHYSDEVOP_unmap_pirq, by dom0 when device model runs there
> > 3. from PHYSDEVOP_unmap_pirq, by dom0 even with device model in
> > stubdomain - normally it shouldn't happen for MSI allocated by
> > stubdomain, but dom0 is allowed to do so, and it shouldn't cause any
> > harm
> > 4. from free_domain_pirqs, during domain destruction
> > 5. various error paths
> > 
> > If unmap_domain_pirq would know where device model is running, even if
> > not called by it, that would help. What about adding back reference in
> > struct domain to a stubdomain? That would help a lot here. The only
> > problem is a circular dependency stubdom->target->stubdom. This cycle
> > would need to be broken during stubdom teardown. domain_kill(stubdom)
> > looks like a good place to break it. Is it guaranteed to be called, or
> > is there some other path to destroying a stubdomain?
> 
> A stubdomain AFAICT is handled like others domains from a hypervisor
> PoV, there's no distinction between guests and stubdomains inside the
> hypervisor, so I think domain_kill would be an appropriate place.

As Jan said in the other message, this also could be solved by storing
domain id, not a pointer, because it's easy to detect if that domain id
is stale.

> > Can one HVM domain have multiple stubdomains? If so, it should a be
> > list, not a single field.
> 
> Yes, I think that's a supported setup. You can register multiple ioreq
> servers handling accesses for a single domain, and there's no
> restriction that forces running all those ioreq servers in the same
> control domain.

I was looking at it, and adding a list of struct domain pointers would
be easy, but would require breaking circular dependency during teardown
(in domain_kill). And will add yet another struct list_head into
struct domain. On the other hand, having list of domain ids would
require a wrapper structure or such - I don't see any API in Xen for
storing list of just ints (domid_t specifically). Using an array for
something that should really be a list looks clumsy - all the searching
for first available space, ignoring empty slots etc. Maybe rangeset
could be used for that?

> Similarly you could have a single domain that has control permissions
> over multiple domains, ie: like the hardware domain. domain->target
> should likely be a list also in order to support this use case, but I
> guess no one has yet required such use-case.

DOMCTL_set_target currently returns -EINVAL if you try assign a second
stubdomain to a HVM.

> But maybe I'm just overdesigning this when there's no use-case of a
> domain having multiple stubdomains, or a stubdomain serving multiple
> domains.
> 
> Maybe it's enough to have a clear statement of the scope of this
> mechanism and it's current limitations:
> 
>  - A domain different that the hardware domain can only have control
>    permissions over a single other domain.
> 
>  - When a domain with passed through devices that have mediators
>    running in a domain different than the hardware domain is destroyed
>    the domain running those mediators must have been destroyed
>    beforehand.

Yes, both of those are true, even without my patches.

> With those limitations in mind I think you could then use
> get_dm_domain in destroy_irq. IMO I think this is fragile, but would
> be enough to solve the issue you are currently facing.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4.1 4/6] xen/x86: Allow stubdom access to irq created for msi.
  2019-03-08 16:49                                     ` Marek Marczykowski-Górecki
@ 2019-03-08 17:04                                       ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2019-03-08 17:04 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Simon Gaiser, Julien Grall, Suravee Suthikulpanit, xen-devel,
	Brian Woods, Roger Pau Monne

>>> On 08.03.19 at 17:49, <marmarek@invisiblethingslab.com> wrote:
> On Fri, Mar 08, 2019 at 11:26:28AM +0100, Roger Pau Monné wrote:
>> On Thu, Mar 07, 2019 at 11:28:25PM +0100, Marek Marczykowski-Górecki wrote:
>> > Can one HVM domain have multiple stubdomains? If so, it should a be
>> > list, not a single field.
>> 
>> Yes, I think that's a supported setup. You can register multiple ioreq
>> servers handling accesses for a single domain, and there's no
>> restriction that forces running all those ioreq servers in the same
>> control domain.
> 
> I was looking at it, and adding a list of struct domain pointers would
> be easy, but would require breaking circular dependency during teardown
> (in domain_kill). And will add yet another struct list_head into
> struct domain. On the other hand, having list of domain ids would
> require a wrapper structure or such - I don't see any API in Xen for
> storing list of just ints (domid_t specifically). Using an array for
> something that should really be a list looks clumsy - all the searching
> for first available space, ignoring empty slots etc. Maybe rangeset
> could be used for that?

For one rangesets are meant primarily for things which come in
(consecutive) groups, which doesn't seem like a god fit here. And
then its overhead is even higher (especially for the common case
of just a single stubdom) compared to the simplest dynamic form
of allocating and maintaining an array of suitable size (in which
case you'd probably not even need to skip empty slots or serach
for available space).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2019-03-08 17:04 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-07  0:07 [PATCH v4 0/6] Fix PCI passthrough for HVM with stubdomain Marek Marczykowski-Górecki
2019-02-07  0:07 ` [PATCH v4 1/6] libxl: do not attach xen-pciback to HVM domain, if stubdomain is in use Marek Marczykowski-Górecki
2019-02-07  0:07 ` [PATCH v4 2/6] libxl: attach PCI device to qemu only after setting pciback/pcifront Marek Marczykowski-Górecki
2019-02-07  0:07 ` [PATCH v4 3/6] libxl: don't try to manipulate json config for stubdomain Marek Marczykowski-Górecki
2019-02-21 16:16   ` Wei Liu
2019-02-07  0:07 ` [PATCH v4 4/6] xen/x86: Allow stubdom access to irq created for msi Marek Marczykowski-Górecki
2019-02-07  9:57   ` Roger Pau Monné
2019-02-07 13:21     ` Marek Marczykowski-Górecki
2019-02-07 13:40       ` Roger Pau Monné
2019-02-07 14:52       ` Marek Marczykowski-Górecki
2019-02-07 14:57         ` Roger Pau Monné
2019-02-07 15:41           ` Marek Marczykowski-Górecki
2019-02-07 17:40             ` Roger Pau Monné
2019-02-07 17:51               ` Marek Marczykowski-Górecki
2019-02-08  9:35                 ` Roger Pau Monné
2019-02-08 10:15                   ` Marek Marczykowski-Górecki
2019-02-08 10:17                     ` [PATCH v4.1 " Marek Marczykowski-Górecki
2019-02-21 16:47                       ` Roger Pau Monné
2019-02-21 17:40                         ` Marek Marczykowski-Górecki
2019-02-22 10:42                           ` Roger Pau Monné
2019-02-22 11:11                             ` Jan Beulich
2019-03-07  0:50                             ` Marek Marczykowski-Górecki
2019-03-07 14:48                               ` Roger Pau Monné
2019-03-07 22:28                                 ` Marek Marczykowski-Górecki
2019-03-08 10:26                                   ` Roger Pau Monné
2019-03-08 16:49                                     ` Marek Marczykowski-Górecki
2019-03-08 17:04                                       ` Jan Beulich
2019-03-08 12:33                                   ` Jan Beulich
2019-02-27 11:07                       ` Jan Beulich
2019-02-27 15:18                         ` Marek Marczykowski
2019-02-28 10:50                           ` Jan Beulich
2019-02-28 11:41                             ` Roger Pau Monné
2019-02-07  0:07 ` [PATCH v4 5/6] xen/x86: add PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki
2019-02-07 10:25   ` Roger Pau Monné
2019-02-27 11:41   ` Jan Beulich
2019-02-27 15:05     ` Marek Marczykowski
2019-02-28 10:58       ` Jan Beulich
2019-02-28 12:25         ` Marek Marczykowski
2019-03-03  1:10           ` Marek Marczykowski
2019-03-04 10:19             ` Roger Pau Monné
2019-03-04 10:22               ` Jan Beulich
2019-03-03  3:26         ` Marek Marczykowski
2019-02-07  0:07 ` [PATCH v4 6/6] tools/libxc: add wrapper for PHYSDEVOP_msi_set_enable Marek Marczykowski-Górecki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.