All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] x86/dpci: bind legacy PCI interrupts to PVHv2 Dom0
@ 2017-04-19 15:11 Roger Pau Monne
  2017-04-19 15:11 ` [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq Roger Pau Monne
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Roger Pau Monne @ 2017-04-19 15:11 UTC (permalink / raw)
  To: xen-devel; +Cc: boris.ostrovsky

Hello,

The following patches allow binding bare-metal GSIs into a PVHv2 Dom0, by
snooping on the vIO APICs writes made by Dom0.

This new version has reduced the number of patches from 5 to 3, first two
patches introduce the necessary code to bind GSIs into a PVH Dom0, and patch 3
snoops on vIO APIC writes for unmask and binds the GSI to Dom0.

The series has been tested with a PVHv2 Dom0 on a box with 3 IO APICs,
although all devices are wired up into the first IO APIC sadly.

A branch with the changes can be found at:

git://xenbits.xen.org/people/royger/xen.git dom0_gsi_v2

Thanks, Roger.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq
  2017-04-19 15:11 [PATCH v2 0/3] x86/dpci: bind legacy PCI interrupts to PVHv2 Dom0 Roger Pau Monne
@ 2017-04-19 15:11 ` Roger Pau Monne
  2017-05-12 12:56   ` Jan Beulich
  2017-05-16 11:52   ` Roger Pau Monne
  2017-04-19 15:11 ` [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0 Roger Pau Monne
  2017-04-19 15:11 ` [PATCH v2 3/3] x86/vioapic: bind interrupts to " Roger Pau Monne
  2 siblings, 2 replies; 19+ messages in thread
From: Roger Pau Monne @ 2017-04-19 15:11 UTC (permalink / raw)
  To: xen-devel; +Cc: boris.ostrovsky, Roger Pau Monne

Move the code to allocate and map a domain pirq (either GSI or MSI) into the
x86 irq code base, so that it can be used outside of the physdev ops.

This change shouldn't affect the functionality of the already existing physdev
ops.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Jan Beulich <jbeulich@suse.com>
Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/irq.c        | 175 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/physdev.c    | 124 ++------------------------------
 xen/include/asm-x86/irq.h |   5 ++
 3 files changed, 185 insertions(+), 119 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 676ba5216f..8f6a91994c 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2537,3 +2537,178 @@ bool_t hvm_domain_use_pirq(const struct domain *d, const struct pirq *pirq)
     return is_hvm_domain(d) && pirq &&
            pirq->arch.hvm.emuirq != IRQ_UNBOUND; 
 }
+
+int allocate_and_map_gsi_pirq(struct domain *d, int *index, int *pirq_p)
+{
+    int irq, pirq, ret;
+
+    if ( *index < 0 || *index >= nr_irqs_gsi )
+    {
+        dprintk(XENLOG_G_ERR, "dom%d: map invalid irq %d\n", d->domain_id,
+                *index);
+        return -EINVAL;
+    }
+
+    irq = domain_pirq_to_irq(current->domain, *index);
+    if ( irq <= 0 )
+    {
+        if ( is_hardware_domain(current->domain) )
+            irq = *index;
+        else {
+            dprintk(XENLOG_G_ERR, "dom%d: map pirq with incorrect irq!\n",
+                    d->domain_id);
+            return -EINVAL;
+        }
+    }
+
+    pcidevs_lock();
+    /* Verify or get pirq. */
+    spin_lock(&d->event_lock);
+    pirq = domain_irq_to_pirq(d, irq);
+
+    if ( *pirq_p < 0 )
+    {
+        if ( pirq )
+        {
+            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
+                    d->domain_id, *index, *pirq_p, pirq);
+            if ( pirq < 0 )
+            {
+                ret = -EBUSY;
+                goto done;
+            }
+        }
+        else
+        {
+            pirq = get_free_pirq(d, MAP_PIRQ_TYPE_GSI);
+            if ( pirq < 0 )
+            {
+                dprintk(XENLOG_G_ERR, "dom%d: no free pirq\n", d->domain_id);
+                ret = pirq;
+                goto done;
+            }
+        }
+    }
+    else
+    {
+        if ( pirq && pirq != *pirq_p )
+        {
+            dprintk(XENLOG_G_ERR, "dom%d: pirq %d conflicts with irq %d\n",
+                    d->domain_id, *index, *pirq_p);
+            ret = -EEXIST;
+            goto done;
+        }
+        else
+            pirq = *pirq_p;
+    }
+
+    ret = map_domain_pirq(d, pirq, irq, MAP_PIRQ_TYPE_GSI, NULL);
+    if ( ret == 0 )
+        *pirq_p = pirq;
+
+ done:
+    spin_unlock(&d->event_lock);
+    pcidevs_unlock();
+    return ret;
+}
+
+int allocate_and_map_msi_pirq(struct domain *d, int *index, int *pirq_p,
+                              struct msi_info *msi)
+{
+    int irq, pirq, ret, type;
+
+    irq = *index;
+    if ( irq == -1 || msi->entry_nr > 1 )
+        irq = create_irq(NUMA_NO_NODE);
+
+    if ( irq < nr_irqs_gsi || irq >= nr_irqs )
+    {
+        dprintk(XENLOG_G_ERR, "dom%d: can't create irq for msi!\n",
+                d->domain_id);
+        ret = -EINVAL;
+        goto free_irq;
+    }
+
+    msi->irq = irq;
+    type = (msi->entry_nr > 1 && !msi->table_base) ? MAP_PIRQ_TYPE_MULTI_MSI
+                                                   : MAP_PIRQ_TYPE_MSI;
+
+    pcidevs_lock();
+    /* Verify or get pirq. */
+    spin_lock(&d->event_lock);
+    pirq = domain_irq_to_pirq(d, irq);
+    if ( *pirq_p < 0 )
+    {
+        if ( pirq )
+        {
+            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
+                    d->domain_id, *index, *pirq_p, pirq);
+            if ( pirq < 0 )
+            {
+                ret = -EBUSY;
+                goto done;
+            }
+        }
+        else if ( msi->entry_nr > 1 && !msi->table_base )
+        {
+            if ( msi->entry_nr <= 0 || msi->entry_nr > 32 )
+                ret = -EDOM;
+            else if ( msi->entry_nr != 1 && !iommu_intremap )
+                ret = -EOPNOTSUPP;
+            else
+            {
+                while ( msi->entry_nr & (msi->entry_nr - 1) )
+                    msi->entry_nr += msi->entry_nr & -msi->entry_nr;
+                pirq = get_free_pirqs(d, msi->entry_nr);
+                ret = 0;
+                if ( pirq < 0 )
+                {
+                    while ( (msi->entry_nr >>= 1) > 1 )
+                        if ( get_free_pirqs(d, msi->entry_nr) > 0 )
+                            break;
+                    dprintk(XENLOG_G_ERR, "dom%d: no block of %d free pirqs\n",
+                            d->domain_id, msi->entry_nr << 1);
+                    ret = pirq;
+                }
+            }
+            if ( ret < 0 )
+                goto done;
+        }
+        else
+        {
+            pirq = get_free_pirq(d, type);
+            if ( pirq < 0 )
+            {
+                dprintk(XENLOG_G_ERR, "dom%d: no free pirq\n", d->domain_id);
+                ret = pirq;
+                goto done;
+            }
+        }
+    }
+    else
+    {
+        if ( pirq && pirq != *pirq_p )
+        {
+            dprintk(XENLOG_G_ERR, "dom%d: pirq %d conflicts with irq %d\n",
+                    d->domain_id, *index, *pirq_p);
+            ret = -EEXIST;
+            goto done;
+        }
+        else
+            pirq = *pirq_p;
+    }
+
+    ret = map_domain_pirq(d, pirq, irq, type, msi);
+    if ( ret == 0 )
+        *pirq_p = pirq;
+
+ done:
+    spin_unlock(&d->event_lock);
+    pcidevs_unlock();
+ free_irq:
+    if ( ret != 0 &&
+         ((msi->entry_nr > 1 && !msi->table_base) || *index == -1) )
+        destroy_irq(irq);
+    return ret;
+}
+
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index eec4a41231..b5eded4be9 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -92,8 +92,7 @@ int physdev_map_pirq(domid_t domid, int type, int *index, int *pirq_p,
                      struct msi_info *msi)
 {
     struct domain *d = current->domain;
-    int pirq, irq, ret = 0;
-    void *map_data = NULL;
+    int ret;
 
     if ( domid == DOMID_SELF && is_hvm_domain(d) && has_pirq(d) )
     {
@@ -119,135 +118,22 @@ int physdev_map_pirq(domid_t domid, int type, int *index, int *pirq_p,
     switch ( type )
     {
     case MAP_PIRQ_TYPE_GSI:
-        if ( *index < 0 || *index >= nr_irqs_gsi )
-        {
-            dprintk(XENLOG_G_ERR, "dom%d: map invalid irq %d\n",
-                    d->domain_id, *index);
-            ret = -EINVAL;
-            goto free_domain;
-        }
-
-        irq = domain_pirq_to_irq(current->domain, *index);
-        if ( irq <= 0 )
-        {
-            if ( is_hardware_domain(current->domain) )
-                irq = *index;
-            else {
-                dprintk(XENLOG_G_ERR, "dom%d: map pirq with incorrect irq!\n",
-                        d->domain_id);
-                ret = -EINVAL;
-                goto free_domain;
-            }
-        }
+        ret = allocate_and_map_gsi_pirq(d, index, pirq_p);
         break;
-
     case MAP_PIRQ_TYPE_MSI:
         if ( !msi->table_base )
             msi->entry_nr = 1;
-        irq = *index;
-        if ( irq == -1 )
+        /* fallthrough */
     case MAP_PIRQ_TYPE_MULTI_MSI:
-            irq = create_irq(NUMA_NO_NODE);
-
-        if ( irq < nr_irqs_gsi || irq >= nr_irqs )
-        {
-            dprintk(XENLOG_G_ERR, "dom%d: can't create irq for msi!\n",
-                    d->domain_id);
-            ret = -EINVAL;
-            goto free_domain;
-        }
-
-        msi->irq = irq;
-        map_data = msi;
+        ret = allocate_and_map_msi_pirq(d, index, pirq_p, msi);
         break;
-
     default:
         dprintk(XENLOG_G_ERR, "dom%d: wrong map_pirq type %x\n",
                 d->domain_id, type);
         ret = -EINVAL;
-        goto free_domain;
-    }
-
-    pcidevs_lock();
-    /* Verify or get pirq. */
-    spin_lock(&d->event_lock);
-    pirq = domain_irq_to_pirq(d, irq);
-    if ( *pirq_p < 0 )
-    {
-        if ( pirq )
-        {
-            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
-                    d->domain_id, *index, *pirq_p, pirq);
-            if ( pirq < 0 )
-            {
-                ret = -EBUSY;
-                goto done;
-            }
-        }
-        else if ( type == MAP_PIRQ_TYPE_MULTI_MSI )
-        {
-            if ( msi->entry_nr <= 0 || msi->entry_nr > 32 )
-                ret = -EDOM;
-            else if ( msi->entry_nr != 1 && !iommu_intremap )
-                ret = -EOPNOTSUPP;
-            else
-            {
-                while ( msi->entry_nr & (msi->entry_nr - 1) )
-                    msi->entry_nr += msi->entry_nr & -msi->entry_nr;
-                pirq = get_free_pirqs(d, msi->entry_nr);
-                if ( pirq < 0 )
-                {
-                    while ( (msi->entry_nr >>= 1) > 1 )
-                        if ( get_free_pirqs(d, msi->entry_nr) > 0 )
-                            break;
-                    dprintk(XENLOG_G_ERR, "dom%d: no block of %d free pirqs\n",
-                            d->domain_id, msi->entry_nr << 1);
-                    ret = pirq;
-                }
-            }
-            if ( ret < 0 )
-                goto done;
-        }
-        else
-        {
-            pirq = get_free_pirq(d, type);
-            if ( pirq < 0 )
-            {
-                dprintk(XENLOG_G_ERR, "dom%d: no free pirq\n", d->domain_id);
-                ret = pirq;
-                goto done;
-            }
-        }
-    }
-    else
-    {
-        if ( pirq && pirq != *pirq_p )
-        {
-            dprintk(XENLOG_G_ERR, "dom%d: pirq %d conflicts with irq %d\n",
-                    d->domain_id, *index, *pirq_p);
-            ret = -EEXIST;
-            goto done;
-        }
-        else
-            pirq = *pirq_p;
+        break;
     }
 
-    ret = map_domain_pirq(d, pirq, irq, type, map_data);
-    if ( ret == 0 )
-        *pirq_p = pirq;
-
- done:
-    spin_unlock(&d->event_lock);
-    pcidevs_unlock();
-    if ( ret != 0 )
-        switch ( type )
-        {
-        case MAP_PIRQ_TYPE_MSI:
-            if ( *index == -1 )
-        case MAP_PIRQ_TYPE_MULTI_MSI:
-                destroy_irq(irq);
-            break;
-        }
  free_domain:
     rcu_unlock_domain(d);
     return ret;
diff --git a/xen/include/asm-x86/irq.h b/xen/include/asm-x86/irq.h
index ef625ebb13..2b1701e67b 100644
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -200,4 +200,9 @@ bool_t cpu_has_pending_apic_eoi(void);
 
 static inline void arch_move_irqs(struct vcpu *v) { }
 
+struct msi_info;
+int allocate_and_map_gsi_pirq(struct domain *d, int *index, int *pirq_p);
+int allocate_and_map_msi_pirq(struct domain *d, int *index, int *pirq_p,
+                              struct msi_info *msi);
+
 #endif /* _ASM_HW_IRQ_H */
-- 
2.11.0 (Apple Git-81)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-04-19 15:11 [PATCH v2 0/3] x86/dpci: bind legacy PCI interrupts to PVHv2 Dom0 Roger Pau Monne
  2017-04-19 15:11 ` [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq Roger Pau Monne
@ 2017-04-19 15:11 ` Roger Pau Monne
  2017-05-12 13:42   ` Jan Beulich
  2017-04-19 15:11 ` [PATCH v2 3/3] x86/vioapic: bind interrupts to " Roger Pau Monne
  2 siblings, 1 reply; 19+ messages in thread
From: Roger Pau Monne @ 2017-04-19 15:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, boris.ostrovsky, Roger Pau Monne, Jan Beulich

Achieve this by expanding pt_irq_create_bind in order to support mapping
interrupts of type PT_IRQ_TYPE_PCI to a PVH Dom0. GSIs bound to Dom0 are always
identity bound, which means the all the fields inside of the u.pci sub-struct
are ignored, and only the machine_irq is actually used in order to determine
which GSI the caller wants to bind.

Also, the hvm_irq_dpci struct is not used by a PVH Dom0, since that's used to
route interrupts and allow different host to guest GSI mappings, which is not
used by a PVH Dom0.

This requires adding some specific handlers for such directly mapped GSIs,
which bypass the PCI interrupt routing done by Xen for HVM guests.

Note that currently there's no support for unbinding this interrupts.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v1:
 - Remove the PT_IRQ_TYPE_GSI and instead just use PT_IRQ_TYPE_PCI with a
   hardware domain special casing.
 - Check the trigger mode of the Dom0 vIO APIC in order to set the shareable
   flags in pt_irq_create_bind.
---
 xen/arch/x86/hvm/irq.c       |  28 ++++++
 xen/drivers/passthrough/io.c | 212 ++++++++++++++++++++++++++++++++-----------
 xen/include/xen/hvm/irq.h    |   6 ++
 3 files changed, 195 insertions(+), 51 deletions(-)

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index 86255847a6..cb9084e3f3 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -126,6 +126,34 @@ void hvm_pci_intx_deassert(
     spin_unlock(&d->arch.hvm_domain.irq_lock);
 }
 
+void hvm_gsi_assert(struct domain *d, unsigned int gsi)
+{
+    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
+
+    ASSERT(gsi < hvm_irq->nr_gsis);
+    ASSERT(!has_vpic(d));
+    spin_lock(&d->arch.hvm_domain.irq_lock);
+    if ( !hvm_irq->gsi_assert_count[gsi] )
+    {
+        hvm_irq->gsi_assert_count[gsi]++;
+        assert_gsi(d, gsi);
+    }
+    spin_unlock(&d->arch.hvm_domain.irq_lock);
+}
+
+void hvm_gsi_deassert(struct domain *d, unsigned int gsi)
+{
+    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
+
+    ASSERT(gsi < hvm_irq->nr_gsis);
+    ASSERT(!has_vpic(d));
+    spin_lock(&d->arch.hvm_domain.irq_lock);
+    if ( hvm_irq->gsi_assert_count[gsi] )
+        hvm_irq->gsi_assert_count[gsi]--;
+    ASSERT(!hvm_irq->gsi_assert_count[gsi]);
+    spin_unlock(&d->arch.hvm_domain.irq_lock);
+}
+
 void hvm_isa_irq_assert(
     struct domain *d, unsigned int isa_irq)
 {
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index e5a43e508f..1c89ae1642 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -164,6 +164,20 @@ static void pt_irq_time_out(void *data)
 
     spin_lock(&irq_map->dom->event_lock);
 
+    if ( irq_map->flags & HVM_IRQ_DPCI_IDENTITY_GSI )
+    {
+        struct pirq *pirq = dpci_pirq(irq_map);
+
+        ASSERT(is_hardware_domain(irq_map->dom));
+        /*
+         * Identity mapped, no need to iterate over the guest GSI list to find
+         * other pirqs sharing the same guest GSI.
+         */
+        irq_map->flags |= HVM_IRQ_DPCI_EOI_LATCH;
+        hvm_gsi_deassert(irq_map->dom, pirq->pirq);
+        goto out;
+    }
+
     dpci = domain_get_irq_dpci(irq_map->dom);
     if ( unlikely(!dpci) )
     {
@@ -185,6 +199,7 @@ static void pt_irq_time_out(void *data)
         hvm_pci_intx_deassert(irq_map->dom, digl->device, digl->intx);
     }
 
+ out:
     pt_pirq_iterate(irq_map->dom, pt_irq_guest_eoi, NULL);
 
     spin_unlock(&irq_map->dom->event_lock);
@@ -274,10 +289,16 @@ int pt_irq_create_bind(
     spin_lock(&d->event_lock);
 
     hvm_irq_dpci = domain_get_irq_dpci(d);
-    if ( hvm_irq_dpci == NULL )
+    if ( hvm_irq_dpci == NULL && !is_hardware_domain(d) )
     {
         unsigned int i;
 
+        /*
+         * NB: the hardware domain doesn't use a hvm_irq_dpci struct because
+         * it's only allowed to identity map GSIs, and so the data contained in
+         * that struct (used to map guest GSIs into machine GSIs and perform
+         * interrupt routing) it's completely useless for it.
+         */
         hvm_irq_dpci = xzalloc(struct hvm_irq_dpci);
         if ( hvm_irq_dpci == NULL )
         {
@@ -422,35 +443,51 @@ int pt_irq_create_bind(
     case PT_IRQ_TYPE_PCI:
     case PT_IRQ_TYPE_MSI_TRANSLATE:
     {
-        unsigned int bus = pt_irq_bind->u.pci.bus;
-        unsigned int device = pt_irq_bind->u.pci.device;
-        unsigned int intx = pt_irq_bind->u.pci.intx;
-        unsigned int guest_gsi = hvm_pci_intx_gsi(device, intx);
-        unsigned int link = hvm_pci_intx_link(device, intx);
-        struct dev_intx_gsi_link *digl = xmalloc(struct dev_intx_gsi_link);
-        struct hvm_girq_dpci_mapping *girq =
-            xmalloc(struct hvm_girq_dpci_mapping);
+        struct dev_intx_gsi_link *digl = NULL;
+        struct hvm_girq_dpci_mapping *girq = NULL;
+        unsigned int guest_gsi;
 
-        if ( !digl || !girq )
+        /*
+         * Mapping GSIs for the hardware domain is different than doing it for
+         * an unpriviledged guest, the hardware domain is only allowed to
+         * identity map GSIs, and as such all the data in the u.pci union is
+         * discarded.
+         */
+        if ( !is_hardware_domain(d) )
         {
-            spin_unlock(&d->event_lock);
-            xfree(girq);
-            xfree(digl);
-            return -ENOMEM;
-        }
+            unsigned int link;
+
+            digl = xmalloc(struct dev_intx_gsi_link);
+            girq = xmalloc(struct hvm_girq_dpci_mapping);
+
+            if ( !digl || !girq )
+            {
+                spin_unlock(&d->event_lock);
+                xfree(girq);
+                xfree(digl);
+                return -ENOMEM;
+            }
+
+            girq->bus = digl->bus = pt_irq_bind->u.pci.bus;
+            girq->device = digl->device = pt_irq_bind->u.pci.device;
+            girq->intx = digl->intx = pt_irq_bind->u.pci.intx;
+            list_add_tail(&digl->list, &pirq_dpci->digl_list);
 
-        hvm_irq_dpci->link_cnt[link]++;
+            guest_gsi = hvm_pci_intx_gsi(digl->device, digl->intx);
+            link = hvm_pci_intx_link(digl->device, digl->intx);
 
-        digl->bus = bus;
-        digl->device = device;
-        digl->intx = intx;
-        list_add_tail(&digl->list, &pirq_dpci->digl_list);
+            hvm_irq_dpci->link_cnt[link]++;
 
-        girq->bus = bus;
-        girq->device = device;
-        girq->intx = intx;
-        girq->machine_gsi = pirq;
-        list_add_tail(&girq->list, &hvm_irq_dpci->girq[guest_gsi]);
+            girq->machine_gsi = pirq;
+            list_add_tail(&girq->list, &hvm_irq_dpci->girq[guest_gsi]);
+        }
+        else
+        {
+            /* MSI_TRANSLATE is not supported by the hardware domain. */
+            ASSERT(pt_irq_bind->irq_type == PT_IRQ_TYPE_PCI);
+            guest_gsi = pirq;
+            ASSERT(guest_gsi < hvm_domain_irq(d)->nr_gsis);
+        }
 
         /* Bind the same mirq once in the same domain */
         if ( !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
@@ -472,7 +509,27 @@ int pt_irq_create_bind(
                 pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED |
                                    HVM_IRQ_DPCI_MACH_PCI |
                                    HVM_IRQ_DPCI_GUEST_PCI;
-                share = BIND_PIRQ__WILL_SHARE;
+                if ( !is_hardware_domain(d) )
+                    share = BIND_PIRQ__WILL_SHARE;
+                else
+                {
+                    unsigned int pin;
+                    struct hvm_vioapic *vioapic = gsi_vioapic(d, guest_gsi,
+                                                              &pin);
+
+                    if ( !vioapic )
+                    {
+                        ASSERT_UNREACHABLE();
+                        return -EINVAL;
+                    }
+                    pirq_dpci->flags |= HVM_IRQ_DPCI_IDENTITY_GSI;
+                    /*
+                     * Check if the corresponding vIO APIC pin is configured
+                     * level or edge trigger, level triggered interrupts will
+                     * be marked as shareable.
+                     */
+                    share = vioapic->redirtbl[pin].fields.trig_mode;
+                }
             }
 
             /* Init timer before binding */
@@ -489,9 +546,15 @@ int pt_irq_create_bind(
                  * IRQ_GUEST is not set. As such we can reset 'dom' directly.
                  */
                 pirq_dpci->dom = NULL;
-                list_del(&girq->list);
-                list_del(&digl->list);
-                hvm_irq_dpci->link_cnt[link]--;
+                if ( !is_hardware_domain(d) )
+                {
+                    unsigned int link = hvm_pci_intx_link(digl->device,
+                                                          digl->intx);
+
+                    list_del(&girq->list);
+                    list_del(&digl->list);
+                    hvm_irq_dpci->link_cnt[link]--;
+                }
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
                 spin_unlock(&d->event_lock);
@@ -504,10 +567,18 @@ int pt_irq_create_bind(
         spin_unlock(&d->event_lock);
 
         if ( iommu_verbose )
-            printk(XENLOG_G_INFO
-                   "d%d: bind: m_gsi=%u g_gsi=%u dev=%02x.%02x.%u intx=%u\n",
-                   d->domain_id, pirq, guest_gsi, bus,
-                   PCI_SLOT(device), PCI_FUNC(device), intx);
+        {
+            char buf[50];
+
+            if ( !is_hardware_domain(d) )
+                snprintf(buf, ARRAY_SIZE(buf), " dev=%02x.%02x.%u intx=%u",
+                         digl->bus, PCI_SLOT(digl->device),
+                         PCI_FUNC(digl->device), digl->intx);
+
+            printk(XENLOG_G_INFO "d%d: bind: m_gsi=%u g_gsi=%u%s\n",
+                   d->domain_id, pirq, guest_gsi,
+                   !is_hardware_domain(d) ? buf : "");
+        }
         break;
     }
 
@@ -522,7 +593,6 @@ int pt_irq_create_bind(
 int pt_irq_destroy_bind(
     struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
 {
-    struct hvm_irq_dpci *hvm_irq_dpci;
     struct hvm_pirq_dpci *pirq_dpci;
     unsigned int machine_gsi = pt_irq_bind->machine_irq;
     struct pirq *pirq;
@@ -552,17 +622,15 @@ int pt_irq_destroy_bind(
 
     spin_lock(&d->event_lock);
 
-    hvm_irq_dpci = domain_get_irq_dpci(d);
-
-    if ( hvm_irq_dpci == NULL )
+    pirq = pirq_info(d, machine_gsi);
+    pirq_dpci = pirq_dpci(pirq);
+    if ( pirq_dpci->flags & HVM_IRQ_DPCI_IDENTITY_GSI )
     {
+        ASSERT(is_hardware_domain(d));
         spin_unlock(&d->event_lock);
-        return -EINVAL;
+        return -EOPNOTSUPP;
     }
 
-    pirq = pirq_info(d, machine_gsi);
-    pirq_dpci = pirq_dpci(pirq);
-
     if ( pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI )
     {
         unsigned int bus = pt_irq_bind->u.pci.bus;
@@ -570,9 +638,16 @@ int pt_irq_destroy_bind(
         unsigned int intx = pt_irq_bind->u.pci.intx;
         unsigned int guest_gsi = hvm_pci_intx_gsi(device, intx);
         unsigned int link = hvm_pci_intx_link(device, intx);
+        struct hvm_irq_dpci *hvm_irq_dpci = domain_get_irq_dpci(d);
         struct hvm_girq_dpci_mapping *girq;
         struct dev_intx_gsi_link *digl, *tmp;
 
+        if ( hvm_irq_dpci == NULL )
+        {
+            spin_unlock(&d->event_lock);
+            return -EINVAL;
+        }
+
         list_for_each_entry ( girq, &hvm_irq_dpci->girq[guest_gsi], list )
         {
             if ( girq->bus         == bus &&
@@ -696,7 +771,7 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq)
     struct hvm_irq_dpci *dpci = domain_get_irq_dpci(d);
     struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
 
-    if ( !iommu_enabled || !dpci || !pirq_dpci ||
+    if ( !iommu_enabled || (!is_hardware_domain(d) && !dpci) || !pirq_dpci ||
          !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
         return 0;
 
@@ -757,7 +832,7 @@ void hvm_dpci_msi_eoi(struct domain *d, int vector)
 
 static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
 {
-    if ( unlikely(!hvm_domain_irq(d)->dpci) )
+    if ( unlikely(!hvm_domain_irq(d)->dpci) && !is_hardware_domain(d) )
     {
         ASSERT_UNREACHABLE();
         return;
@@ -789,10 +864,17 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
 
         list_for_each_entry ( digl, &pirq_dpci->digl_list, list )
         {
+            ASSERT(!(pirq_dpci->flags & HVM_IRQ_DPCI_IDENTITY_GSI));
             hvm_pci_intx_assert(d, digl->device, digl->intx);
             pirq_dpci->pending++;
         }
 
+        if ( pirq_dpci->flags & HVM_IRQ_DPCI_IDENTITY_GSI )
+        {
+            hvm_gsi_assert(d, pirq->pirq);
+            pirq_dpci->pending++;
+        }
+
         if ( pirq_dpci->flags & HVM_IRQ_DPCI_TRANSLATE )
         {
             /* for translated MSI to INTx interrupt, eoi as early as possible */
@@ -814,17 +896,12 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
     spin_unlock(&d->event_lock);
 }
 
-static void __hvm_dpci_eoi(struct domain *d,
-                           const struct hvm_girq_dpci_mapping *girq,
+static void __hvm_pirq_eoi(struct pirq *pirq,
                            const union vioapic_redir_entry *ent)
 {
-    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
-    struct hvm_pirq_dpci *pirq_dpci;
-
-    if ( !hvm_domain_use_pirq(d, pirq) )
-        hvm_pci_intx_deassert(d, girq->device, girq->intx);
+    struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
 
-    pirq_dpci = pirq_dpci(pirq);
+    ASSERT(pirq_dpci);
 
     /*
      * No need to get vector lock for timer
@@ -839,6 +916,32 @@ static void __hvm_dpci_eoi(struct domain *d,
     pirq_guest_eoi(pirq);
 }
 
+static void __hvm_dpci_eoi(struct domain *d,
+                           const struct hvm_girq_dpci_mapping *girq,
+                           const union vioapic_redir_entry *ent)
+{
+    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
+
+    if ( !hvm_domain_use_pirq(d, pirq) )
+        hvm_pci_intx_deassert(d, girq->device, girq->intx);
+
+    __hvm_pirq_eoi(pirq, ent);
+}
+
+static void __hvm_gsi_eoi(struct domain *d, unsigned int gsi,
+                          const union vioapic_redir_entry *ent)
+{
+    struct pirq *pirq = pirq_info(d, gsi);
+    struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
+
+    /* Check if GSI is actually mapped. */
+    if ( !pirq_dpci )
+        return;
+
+    hvm_gsi_deassert(d, gsi);
+    __hvm_pirq_eoi(pirq, ent);
+}
+
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_gsi,
                   const union vioapic_redir_entry *ent)
 {
@@ -848,6 +951,13 @@ void hvm_dpci_eoi(struct domain *d, unsigned int guest_gsi,
     if ( !iommu_enabled )
         return;
 
+    if ( is_hardware_domain(d) )
+    {
+        spin_lock(&d->event_lock);
+        __hvm_gsi_eoi(d, guest_gsi, ent);
+        goto unlock;
+    }
+
     if ( guest_gsi < NR_ISAIRQS )
     {
         hvm_dpci_isairq_eoi(d, guest_gsi);
diff --git a/xen/include/xen/hvm/irq.h b/xen/include/xen/hvm/irq.h
index 671a6f2e06..0d2c72c109 100644
--- a/xen/include/xen/hvm/irq.h
+++ b/xen/include/xen/hvm/irq.h
@@ -40,6 +40,7 @@ struct dev_intx_gsi_link {
 #define _HVM_IRQ_DPCI_EOI_LATCH_SHIFT           3
 #define _HVM_IRQ_DPCI_GUEST_PCI_SHIFT           4
 #define _HVM_IRQ_DPCI_GUEST_MSI_SHIFT           5
+#define _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT        6
 #define _HVM_IRQ_DPCI_TRANSLATE_SHIFT          15
 #define HVM_IRQ_DPCI_MACH_PCI        (1 << _HVM_IRQ_DPCI_MACH_PCI_SHIFT)
 #define HVM_IRQ_DPCI_MACH_MSI        (1 << _HVM_IRQ_DPCI_MACH_MSI_SHIFT)
@@ -47,6 +48,7 @@ struct dev_intx_gsi_link {
 #define HVM_IRQ_DPCI_EOI_LATCH       (1 << _HVM_IRQ_DPCI_EOI_LATCH_SHIFT)
 #define HVM_IRQ_DPCI_GUEST_PCI       (1 << _HVM_IRQ_DPCI_GUEST_PCI_SHIFT)
 #define HVM_IRQ_DPCI_GUEST_MSI       (1 << _HVM_IRQ_DPCI_GUEST_MSI_SHIFT)
+#define HVM_IRQ_DPCI_IDENTITY_GSI    (1 << _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT)
 #define HVM_IRQ_DPCI_TRANSLATE       (1 << _HVM_IRQ_DPCI_TRANSLATE_SHIFT)
 
 #define VMSI_DEST_ID_MASK 0xff
@@ -123,6 +125,10 @@ void hvm_isa_irq_assert(
 void hvm_isa_irq_deassert(
     struct domain *d, unsigned int isa_irq);
 
+/* Modify state of GSIs. */
+void hvm_gsi_assert(struct domain *d, unsigned int gsi);
+void hvm_gsi_deassert(struct domain *d, unsigned int gsi);
+
 int hvm_set_pci_link_route(struct domain *d, u8 link, u8 isa_irq);
 
 int hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data);
-- 
2.11.0 (Apple Git-81)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/3] x86/vioapic: bind interrupts to PVH Dom0
  2017-04-19 15:11 [PATCH v2 0/3] x86/dpci: bind legacy PCI interrupts to PVHv2 Dom0 Roger Pau Monne
  2017-04-19 15:11 ` [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq Roger Pau Monne
  2017-04-19 15:11 ` [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0 Roger Pau Monne
@ 2017-04-19 15:11 ` Roger Pau Monne
  2017-05-12 13:55   ` Jan Beulich
  2 siblings, 1 reply; 19+ messages in thread
From: Roger Pau Monne @ 2017-04-19 15:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, boris.ostrovsky, Roger Pau Monne, Jan Beulich

Add the glue in order to bind the PVH Dom0 GSI from bare metal. This is done
when Dom0 unmasks the vIO APIC pins, by fetching the current pin settings and
setting up the PIRQ, which will then be bound to Dom0.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v1:
 - Mask the pin on error (instead of panicking).
 - Factor out the Dom0 specific code into a function.
 - Use the newly introduced allocate_and_map_gsi_pirq instead of
   physdev_map_pirq.
---
 xen/arch/x86/hvm/vioapic.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index abcc473c88..4544e11abe 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -158,6 +158,62 @@ static int vioapic_read(
     return X86EMUL_OKAY;
 }
 
+static int vioapic_dom0_map_gsi(unsigned int gsi, unsigned int trig,
+                                unsigned int pol)
+{
+    struct domain *d = current->domain;
+    xen_domctl_bind_pt_irq_t pt_irq_bind = {
+        .irq_type = PT_IRQ_TYPE_PCI,
+        .machine_irq = gsi,
+        .hvm_domid = DOMID_SELF,
+    };
+    int ret, pirq = gsi;
+
+    ASSERT(is_hardware_domain(d));
+
+    /* Interrupt has been unmasked, bind it now. */
+    ret = mp_register_gsi(gsi, trig, pol);
+    if ( ret && ret != -EEXIST )
+    {
+        gdprintk(XENLOG_WARNING, "vioapic: error registering GSI %u: %d\n",
+                 gsi, ret);
+        goto error;
+    }
+    else if ( ret )
+        /* Already in use. */
+        return 0;
+
+    ret = allocate_and_map_gsi_pirq(d, &pirq, &pirq);
+    if ( ret )
+    {
+        gdprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
+                 gsi, ret);
+        goto error;
+    }
+
+    pcidevs_lock();
+    ret = pt_irq_create_bind(d, &pt_irq_bind);
+    pcidevs_unlock();
+    if ( ret )
+    {
+        gdprintk(XENLOG_WARNING, "vioapic: error binding GSI %u: %d\n",
+                 gsi, ret);
+        goto error_unmap;
+    }
+
+    return 0;
+
+ error_unmap:
+    pcidevs_lock();
+    spin_lock(&d->event_lock);
+    unmap_domain_pirq(d, pirq);
+    spin_unlock(&d->event_lock);
+    pcidevs_unlock();
+ error:
+    ASSERT(ret);
+    return ret;
+}
+
 static void vioapic_write_redirent(
     struct hvm_vioapic *vioapic, unsigned int idx,
     int top_word, uint32_t val)
@@ -188,6 +244,20 @@ static void vioapic_write_redirent(
         unmasked = unmasked && !ent.fields.mask;
     }
 
+    if ( is_hardware_domain(d) && unmasked )
+    {
+        int ret;
+
+        ret = vioapic_dom0_map_gsi(gsi, ent.fields.trig_mode,
+                                   ent.fields.polarity);
+        if ( ret )
+        {
+            /* Mask the entry again. */
+            ent.fields.mask = 1;
+            unmasked = 0;
+        }
+    }
+
     *pent = ent;
 
     if ( gsi == 0 )
-- 
2.11.0 (Apple Git-81)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq
  2017-04-19 15:11 ` [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq Roger Pau Monne
@ 2017-05-12 12:56   ` Jan Beulich
  2017-05-16  9:47     ` Roger Pau Monne
  2017-05-16 11:52   ` Roger Pau Monne
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-05-12 12:56 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, boris.ostrovsky

>>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> +int allocate_and_map_gsi_pirq(struct domain *d, int *index, int *pirq_p)
> +{
> +    int irq, pirq, ret;
> +
> +    if ( *index < 0 || *index >= nr_irqs_gsi )
> +    {
> +        dprintk(XENLOG_G_ERR, "dom%d: map invalid irq %d\n", d->domain_id,
> +                *index);
> +        return -EINVAL;
> +    }
> +
> +    irq = domain_pirq_to_irq(current->domain, *index);
> +    if ( irq <= 0 )
> +    {
> +        if ( is_hardware_domain(current->domain) )
> +            irq = *index;
> +        else {

Please adjust coding style issues like the brace placement here.

> +    pcidevs_lock();
> +    /* Verify or get pirq. */
> +    spin_lock(&d->event_lock);
> +    pirq = domain_irq_to_pirq(d, irq);
> +
> +    if ( *pirq_p < 0 )
> +    {
> +        if ( pirq )
> +        {
> +            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
> +                    d->domain_id, *index, *pirq_p, pirq);
> +            if ( pirq < 0 )
> +            {
> +                ret = -EBUSY;
> +                goto done;
> +            }
> +        }
> +        else
> +        {
> +            pirq = get_free_pirq(d, MAP_PIRQ_TYPE_GSI);
> +            if ( pirq < 0 )
> +            {
> +                dprintk(XENLOG_G_ERR, "dom%d: no free pirq\n", d->domain_id);
> +                ret = pirq;
> +                goto done;
> +            }
> +        }
> +    }
> +    else
> +    {
> +        if ( pirq && pirq != *pirq_p )
> +        {
> +            dprintk(XENLOG_G_ERR, "dom%d: pirq %d conflicts with irq %d\n",
> +                    d->domain_id, *index, *pirq_p);
> +            ret = -EEXIST;
> +            goto done;
> +        }
> +        else
> +            pirq = *pirq_p;
> +    }
> +
> +    ret = map_domain_pirq(d, pirq, irq, MAP_PIRQ_TYPE_GSI, NULL);
> +    if ( ret == 0 )
> +        *pirq_p = pirq;
> +
> + done:
> +    spin_unlock(&d->event_lock);
> +    pcidevs_unlock();

All of the code above is being repeated in allocate_and_map_msi_pirq(),
merely with the multi-MSI addition. This is too much code duplication for
my taste. Additionally, with it being split like this it is then questionable
what acquiring the PCI devices lock is good for here (I would think it is
needed at most in the MSI case).

> +int allocate_and_map_msi_pirq(struct domain *d, int *index, int *pirq_p,
> +                              struct msi_info *msi)
> +{
> +    int irq, pirq, ret, type;
> +
> +    irq = *index;
> +    if ( irq == -1 || msi->entry_nr > 1 )
> +        irq = create_irq(NUMA_NO_NODE);

This doesn't look to be an exact equivalent of the original code: Even
with MAP_PIRQ_TYPE_MULTI_MSI entry_nr may be 1, and the original
code calls create_irq() also in that case. If this is an intended change,
the rationale should be provided in the commit message. But as you
don't alter map_domain_pirq(), I doubt this is correct.

> +    if ( irq < nr_irqs_gsi || irq >= nr_irqs )
> +    {
> +        dprintk(XENLOG_G_ERR, "dom%d: can't create irq for msi!\n",
> +                d->domain_id);
> +        ret = -EINVAL;
> +        goto free_irq;
> +    }
> +
> +    msi->irq = irq;
> +    type = (msi->entry_nr > 1 && !msi->table_base) ? MAP_PIRQ_TYPE_MULTI_MSI
> +                                                   : MAP_PIRQ_TYPE_MSI;

Same here - you're not necessarily reconstructing the type passed
into the hypercall.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-04-19 15:11 ` [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0 Roger Pau Monne
@ 2017-05-12 13:42   ` Jan Beulich
  2017-05-16 15:55     ` Roger Pau Monne
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-05-12 13:42 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

>>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> Note that currently there's no support for unbinding this interrupts.

Do you plan to deal with that before this changes goes in? Aiui this
not working means you can't pass through devices with pin based
interrupts once Dom0 chose to bind to them. Otoh hand you modify
pt_irq_destroy_bind(), so I'm a little puzzled ...

> --- a/xen/arch/x86/hvm/irq.c
> +++ b/xen/arch/x86/hvm/irq.c
> @@ -126,6 +126,34 @@ void hvm_pci_intx_deassert(
>      spin_unlock(&d->arch.hvm_domain.irq_lock);
>  }
>  
> +void hvm_gsi_assert(struct domain *d, unsigned int gsi)
> +{
> +    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
> +
> +    ASSERT(gsi < hvm_irq->nr_gsis);

This would probably better match the alternative construct in
__hvm_pci_intx_assert().

> +    ASSERT(!has_vpic(d));

This doesn't look to be relevant for the rest of the function. Is there
a particular reason you've added it? If so, a brief comment might
help.

> +    spin_lock(&d->arch.hvm_domain.irq_lock);
> +    if ( !hvm_irq->gsi_assert_count[gsi] )
> +    {
> +        hvm_irq->gsi_assert_count[gsi]++;

Why is this an increment instead of a simple write of 1? Or the
other way around - why is this not always incrementing, just like
__hvm_pci_intx_assert() does? (Symmetric questions then for
hvm_gsi_deassert()).

> @@ -274,10 +289,16 @@ int pt_irq_create_bind(
>      spin_lock(&d->event_lock);
>  
>      hvm_irq_dpci = domain_get_irq_dpci(d);
> -    if ( hvm_irq_dpci == NULL )
> +    if ( hvm_irq_dpci == NULL && !is_hardware_domain(d) )
>      {
>          unsigned int i;
>  
> +        /*
> +         * NB: the hardware domain doesn't use a hvm_irq_dpci struct because
> +         * it's only allowed to identity map GSIs, and so the data contained in
> +         * that struct (used to map guest GSIs into machine GSIs and perform
> +         * interrupt routing) it's completely useless for it.

"...) is completely useless to it."

> @@ -422,35 +443,51 @@ int pt_irq_create_bind(
>      case PT_IRQ_TYPE_PCI:
>      case PT_IRQ_TYPE_MSI_TRANSLATE:
>      {
> -        unsigned int bus = pt_irq_bind->u.pci.bus;
> -        unsigned int device = pt_irq_bind->u.pci.device;
> -        unsigned int intx = pt_irq_bind->u.pci.intx;
> -        unsigned int guest_gsi = hvm_pci_intx_gsi(device, intx);
> -        unsigned int link = hvm_pci_intx_link(device, intx);
> -        struct dev_intx_gsi_link *digl = xmalloc(struct dev_intx_gsi_link);
> -        struct hvm_girq_dpci_mapping *girq =
> -            xmalloc(struct hvm_girq_dpci_mapping);
> +        struct dev_intx_gsi_link *digl = NULL;
> +        struct hvm_girq_dpci_mapping *girq = NULL;
> +        unsigned int guest_gsi;
>  
> -        if ( !digl || !girq )
> +        /*
> +         * Mapping GSIs for the hardware domain is different than doing it for
> +         * an unpriviledged guest, the hardware domain is only allowed to
> +         * identity map GSIs, and as such all the data in the u.pci union is
> +         * discarded.
> +         */
> +        if ( !is_hardware_domain(d) )
>          {
> -            spin_unlock(&d->event_lock);
> -            xfree(girq);
> -            xfree(digl);
> -            return -ENOMEM;
> -        }
> +            unsigned int link;
> +
> +            digl = xmalloc(struct dev_intx_gsi_link);
> +            girq = xmalloc(struct hvm_girq_dpci_mapping);
> +
> +            if ( !digl || !girq )
> +            {
> +                spin_unlock(&d->event_lock);
> +                xfree(girq);
> +                xfree(digl);
> +                return -ENOMEM;
> +            }
> +
> +            girq->bus = digl->bus = pt_irq_bind->u.pci.bus;
> +            girq->device = digl->device = pt_irq_bind->u.pci.device;
> +            girq->intx = digl->intx = pt_irq_bind->u.pci.intx;
> +            list_add_tail(&digl->list, &pirq_dpci->digl_list);
>  
> -        hvm_irq_dpci->link_cnt[link]++;
> +            guest_gsi = hvm_pci_intx_gsi(digl->device, digl->intx);
> +            link = hvm_pci_intx_link(digl->device, digl->intx);
>  
> -        digl->bus = bus;
> -        digl->device = device;
> -        digl->intx = intx;
> -        list_add_tail(&digl->list, &pirq_dpci->digl_list);
> +            hvm_irq_dpci->link_cnt[link]++;
>  
> -        girq->bus = bus;
> -        girq->device = device;
> -        girq->intx = intx;
> -        girq->machine_gsi = pirq;
> -        list_add_tail(&girq->list, &hvm_irq_dpci->girq[guest_gsi]);
> +            girq->machine_gsi = pirq;
> +            list_add_tail(&girq->list, &hvm_irq_dpci->girq[guest_gsi]);
> +        }
> +        else
> +        {
> +            /* MSI_TRANSLATE is not supported by the hardware domain. */
> +            ASSERT(pt_irq_bind->irq_type == PT_IRQ_TYPE_PCI);
> +            guest_gsi = pirq;
> +            ASSERT(guest_gsi < hvm_domain_irq(d)->nr_gsis);
> +        }

This is dangerous: For one it is impossible to judge the correctness
of at least the first of these assertions for the hwdom case without
looking at patch 3. And then the domctl path leading here does not
exclude the subject domain equaling the calling one, i.e. you
potentially assert guest input correctness here. Yes, we have XSA-77
in place, but no, we shouldn't introduce new issues anywhere unless
that's entirely unavoidable.

> @@ -504,10 +567,18 @@ int pt_irq_create_bind(
>          spin_unlock(&d->event_lock);
>  
>          if ( iommu_verbose )
> -            printk(XENLOG_G_INFO
> -                   "d%d: bind: m_gsi=%u g_gsi=%u dev=%02x.%02x.%u intx=%u\n",
> -                   d->domain_id, pirq, guest_gsi, bus,
> -                   PCI_SLOT(device), PCI_FUNC(device), intx);
> +        {
> +            char buf[50];
> +
> +            if ( !is_hardware_domain(d) )
> +                snprintf(buf, ARRAY_SIZE(buf), " dev=%02x.%02x.%u intx=%u",
> +                         digl->bus, PCI_SLOT(digl->device),
> +                         PCI_FUNC(digl->device), digl->intx);

The buffer array seems heavily over-sized - my counting gives at best
slightly over 20 characters you actually need.

> @@ -522,7 +593,6 @@ int pt_irq_create_bind(
>  int pt_irq_destroy_bind(
>      struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
>  {
> -    struct hvm_irq_dpci *hvm_irq_dpci;
>      struct hvm_pirq_dpci *pirq_dpci;
>      unsigned int machine_gsi = pt_irq_bind->machine_irq;
>      struct pirq *pirq;
> @@ -552,17 +622,15 @@ int pt_irq_destroy_bind(
>  
>      spin_lock(&d->event_lock);
>  
> -    hvm_irq_dpci = domain_get_irq_dpci(d);
> -
> -    if ( hvm_irq_dpci == NULL )
> +    pirq = pirq_info(d, machine_gsi);
> +    pirq_dpci = pirq_dpci(pirq);
> +    if ( pirq_dpci->flags & HVM_IRQ_DPCI_IDENTITY_GSI )

I'm sure we've discussed aspects of this before: pirq_dpci may be
NULL here, i.e. you can't blindly dereference it. All other uses in
the function indeed have a NULL check first.

> @@ -570,9 +638,16 @@ int pt_irq_destroy_bind(
>          unsigned int intx = pt_irq_bind->u.pci.intx;
>          unsigned int guest_gsi = hvm_pci_intx_gsi(device, intx);
>          unsigned int link = hvm_pci_intx_link(device, intx);
> +        struct hvm_irq_dpci *hvm_irq_dpci = domain_get_irq_dpci(d);
>          struct hvm_girq_dpci_mapping *girq;
>          struct dev_intx_gsi_link *digl, *tmp;
>  
> +        if ( hvm_irq_dpci == NULL )
> +        {
> +            spin_unlock(&d->event_lock);
> +            return -EINVAL;
> +        }

Moving this check here is a behavioral modification. Perhaps a
good one, yet it doesn't belong into this patch. Instead it should
be properly reasoned about in a separate patch, if it is a correct
thing to do.

> @@ -814,17 +896,12 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
>      spin_unlock(&d->event_lock);
>  }
>  
> -static void __hvm_dpci_eoi(struct domain *d,
> -                           const struct hvm_girq_dpci_mapping *girq,
> +static void __hvm_pirq_eoi(struct pirq *pirq,

Please drop the double leading underscores.

>                             const union vioapic_redir_entry *ent)
>  {
> -    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
> -    struct hvm_pirq_dpci *pirq_dpci;
> -
> -    if ( !hvm_domain_use_pirq(d, pirq) )
> -        hvm_pci_intx_deassert(d, girq->device, girq->intx);
> +    struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
>  
> -    pirq_dpci = pirq_dpci(pirq);
> +    ASSERT(pirq_dpci);

Why is this useful / needed all of the sudden?

> @@ -839,6 +916,32 @@ static void __hvm_dpci_eoi(struct domain *d,
>      pirq_guest_eoi(pirq);
>  }
>  
> +static void __hvm_dpci_eoi(struct domain *d,
> +                           const struct hvm_girq_dpci_mapping *girq,
> +                           const union vioapic_redir_entry *ent)
> +{
> +    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
> +
> +    if ( !hvm_domain_use_pirq(d, pirq) )
> +        hvm_pci_intx_deassert(d, girq->device, girq->intx);
> +
> +    __hvm_pirq_eoi(pirq, ent);
> +}
> +
> +static void __hvm_gsi_eoi(struct domain *d, unsigned int gsi,

Same here for the double underscores. For the pre-existing
function I'd leave it up to you whether to also drop them. What
I care about is that we don't gain new non-conforming names.

> +                          const union vioapic_redir_entry *ent)
> +{
> +    struct pirq *pirq = pirq_info(d, gsi);
> +    struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
> +
> +    /* Check if GSI is actually mapped. */
> +    if ( !pirq_dpci )

Please avoid the local variable when used just once here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/3] x86/vioapic: bind interrupts to PVH Dom0
  2017-04-19 15:11 ` [PATCH v2 3/3] x86/vioapic: bind interrupts to " Roger Pau Monne
@ 2017-05-12 13:55   ` Jan Beulich
  2017-05-16 16:15     ` Roger Pau Monne
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-05-12 13:55 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

>>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -158,6 +158,62 @@ static int vioapic_read(
>      return X86EMUL_OKAY;
>  }
>  
> +static int vioapic_dom0_map_gsi(unsigned int gsi, unsigned int trig,

Considering the conditional in the caller, please use hwdom instead
of dom0.

> +                                unsigned int pol)
> +{
> +    struct domain *d = current->domain;
> +    xen_domctl_bind_pt_irq_t pt_irq_bind = {
> +        .irq_type = PT_IRQ_TYPE_PCI,
> +        .machine_irq = gsi,
> +        .hvm_domid = DOMID_SELF,

I'm struggling with this field: Did you not notice it's entirely
unused? We should really delete it from the interface, as
redundant with the domain specifier in the domctl container
structure.

> +    };
> +    int ret, pirq = gsi;
> +
> +    ASSERT(is_hardware_domain(d));
> +
> +    /* Interrupt has been unmasked, bind it now. */
> +    ret = mp_register_gsi(gsi, trig, pol);
> +    if ( ret && ret != -EEXIST )
> +    {
> +        gdprintk(XENLOG_WARNING, "vioapic: error registering GSI %u: %d\n",
> +                 gsi, ret);
> +        goto error;
> +    }
> +    else if ( ret )
> +        /* Already in use. */
> +        return 0;

I think this would better be

    if ( ret == -EEXIST )
        return 0;
    if ( ret )
        ....

> +    ret = allocate_and_map_gsi_pirq(d, &pirq, &pirq);
> +    if ( ret )
> +    {
> +        gdprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> +                 gsi, ret);
> +        goto error;
> +    }
> +
> +    pcidevs_lock();
> +    ret = pt_irq_create_bind(d, &pt_irq_bind);
> +    pcidevs_unlock();
> +    if ( ret )
> +    {
> +        gdprintk(XENLOG_WARNING, "vioapic: error binding GSI %u: %d\n",
> +                 gsi, ret);
> +        goto error_unmap;
> +    }
> +
> +    return 0;
> +
> + error_unmap:

I can live with the "error" label below, but the one above clearly can be
avoided quite easily by simply inverting the preceding if().

Also, considering this is Dom0-only, I wonder whether all of the log
messages wouldn't better use gprintk().

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq
  2017-05-12 12:56   ` Jan Beulich
@ 2017-05-16  9:47     ` Roger Pau Monne
  2017-05-16 12:03       ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Roger Pau Monne @ 2017-05-16  9:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, boris.ostrovsky

On Fri, May 12, 2017 at 06:56:01AM -0600, Jan Beulich wrote:
> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> > +int allocate_and_map_gsi_pirq(struct domain *d, int *index, int *pirq_p)
> > +{
> > +    int irq, pirq, ret;
> > +
> > +    if ( *index < 0 || *index >= nr_irqs_gsi )
> > +    {
> > +        dprintk(XENLOG_G_ERR, "dom%d: map invalid irq %d\n", d->domain_id,
> > +                *index);
> > +        return -EINVAL;
> > +    }
> > +
> > +    irq = domain_pirq_to_irq(current->domain, *index);
> > +    if ( irq <= 0 )
> > +    {
> > +        if ( is_hardware_domain(current->domain) )
> > +            irq = *index;
> > +        else {
> 
> Please adjust coding style issues like the brace placement here.

Done.

> > +    pcidevs_lock();
> > +    /* Verify or get pirq. */
> > +    spin_lock(&d->event_lock);
> > +    pirq = domain_irq_to_pirq(d, irq);
> > +
> > +    if ( *pirq_p < 0 )
> > +    {
> > +        if ( pirq )
> > +        {
> > +            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
> > +                    d->domain_id, *index, *pirq_p, pirq);
> > +            if ( pirq < 0 )
> > +            {
> > +                ret = -EBUSY;
> > +                goto done;
> > +            }
> > +        }
> > +        else
> > +        {
> > +            pirq = get_free_pirq(d, MAP_PIRQ_TYPE_GSI);
> > +            if ( pirq < 0 )
> > +            {
> > +                dprintk(XENLOG_G_ERR, "dom%d: no free pirq\n", d->domain_id);
> > +                ret = pirq;
> > +                goto done;
> > +            }
> > +        }
> > +    }
> > +    else
> > +    {
> > +        if ( pirq && pirq != *pirq_p )
> > +        {
> > +            dprintk(XENLOG_G_ERR, "dom%d: pirq %d conflicts with irq %d\n",
> > +                    d->domain_id, *index, *pirq_p);
> > +            ret = -EEXIST;
> > +            goto done;
> > +        }
> > +        else
> > +            pirq = *pirq_p;
> > +    }
> > +
> > +    ret = map_domain_pirq(d, pirq, irq, MAP_PIRQ_TYPE_GSI, NULL);
> > +    if ( ret == 0 )
> > +        *pirq_p = pirq;
> > +
> > + done:
> > +    spin_unlock(&d->event_lock);
> > +    pcidevs_unlock();
> 
> All of the code above is being repeated in allocate_and_map_msi_pirq(),
> merely with the multi-MSI addition. This is too much code duplication for
> my taste.

I can try to factor this out into a separate helper that's used by both.

> Additionally, with it being split like this it is then questionable
> what acquiring the PCI devices lock is good for here (I would think it is
> needed at most in the MSI case).

Right, also I'm not sure why the PCI devices lock is acquired before calling
into domain_irq_to_pirq, is that because of lock ordering rules with the domain
event lock?

> > +int allocate_and_map_msi_pirq(struct domain *d, int *index, int *pirq_p,
> > +                              struct msi_info *msi)
> > +{
> > +    int irq, pirq, ret, type;
> > +
> > +    irq = *index;
> > +    if ( irq == -1 || msi->entry_nr > 1 )
> > +        irq = create_irq(NUMA_NO_NODE);
> 
> This doesn't look to be an exact equivalent of the original code: Even
> with MAP_PIRQ_TYPE_MULTI_MSI entry_nr may be 1, and the original
> code calls create_irq() also in that case. If this is an intended change,
> the rationale should be provided in the commit message. But as you
> don't alter map_domain_pirq(), I doubt this is correct.

My bad then, it's quite hard for me to figure out exactly what's the
meaning/usage of those types, since they are not documented anywhere that I can
find. physdev.h contains 3 different MSI related types:

* MAP_PIRQ_TYPE_MSI_SEG maps into MAP_PIRQ_TYPE_MSI.
* MAP_PIRQ_TYPE_MULTI_MSI is only available to map MSI interrupts (no MSI-X).
* MAP_PIRQ_TYPE_MSI can map both MSI/MSI-X:
   - If table_base != 0 it's a MSI-X interrupt.
   - If table_base == 0 it's a single MSI interrupt AND entry_nr must be 1.

Let me know if this is accurate.

> > +    if ( irq < nr_irqs_gsi || irq >= nr_irqs )
> > +    {
> > +        dprintk(XENLOG_G_ERR, "dom%d: can't create irq for msi!\n",
> > +                d->domain_id);
> > +        ret = -EINVAL;
> > +        goto free_irq;
> > +    }
> > +
> > +    msi->irq = irq;
> > +    type = (msi->entry_nr > 1 && !msi->table_base) ? MAP_PIRQ_TYPE_MULTI_MSI
> > +                                                   : MAP_PIRQ_TYPE_MSI;
> 
> Same here - you're not necessarily reconstructing the type passed
> into the hypercall.

Right, see my comment above.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq
  2017-04-19 15:11 ` [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq Roger Pau Monne
  2017-05-12 12:56   ` Jan Beulich
@ 2017-05-16 11:52   ` Roger Pau Monne
  2017-05-16 12:13     ` Jan Beulich
  1 sibling, 1 reply; 19+ messages in thread
From: Roger Pau Monne @ 2017-05-16 11:52 UTC (permalink / raw)
  To: xen-devel, konrad.wilk, boris.ostrovsky; +Cc: Jan Beulich

On Wed, Apr 19, 2017 at 04:11:26PM +0100, Roger Pau Monne wrote:
[...]
> +    if ( *pirq_p < 0 )
> +    {
> +        if ( pirq )
> +        {
> +            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
> +                    d->domain_id, *index, *pirq_p, pirq);
> +            if ( pirq < 0 )
> +            {
> +                ret = -EBUSY;
> +                goto done;
> +            }
> +        }
> +        else if ( msi->entry_nr > 1 && !msi->table_base )
> +        {
> +            if ( msi->entry_nr <= 0 || msi->entry_nr > 32 )
> +                ret = -EDOM;
> +            else if ( msi->entry_nr != 1 && !iommu_intremap )
> +                ret = -EOPNOTSUPP;
> +            else
> +            {
> +                while ( msi->entry_nr & (msi->entry_nr - 1) )
> +                    msi->entry_nr += msi->entry_nr & -msi->entry_nr;
> +                pirq = get_free_pirqs(d, msi->entry_nr);
> +                ret = 0;
> +                if ( pirq < 0 )
> +                {
> +                    while ( (msi->entry_nr >>= 1) > 1 )
> +                        if ( get_free_pirqs(d, msi->entry_nr) > 0 )
> +                            break;

Maybe I'm missing something, but I think the code above should be:

                        if ( (pirq = get_free_pirqs(d, msi->entry_nr)) > 0 )
                            break;

Or else the pirq returned by get_free_pirqs is not stored anywhere, and thus
pirq will be -ENOSPC even when PIRQs have been allocated.

Let me know if this is true or not, so that I can send a patch to fix the
current code (which also has this issue).

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq
  2017-05-16  9:47     ` Roger Pau Monne
@ 2017-05-16 12:03       ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2017-05-16 12:03 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, boris.ostrovsky

>>> On 16.05.17 at 11:47, <roger.pau@citrix.com> wrote:
> On Fri, May 12, 2017 at 06:56:01AM -0600, Jan Beulich wrote:
>> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
>> > +    pcidevs_lock();
>> > +    /* Verify or get pirq. */
>> > +    spin_lock(&d->event_lock);
>> > +    pirq = domain_irq_to_pirq(d, irq);
>> > +
>> > +    if ( *pirq_p < 0 )
>> > +    {
>> > +        if ( pirq )
>> > +        {
>> > +            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
>> > +                    d->domain_id, *index, *pirq_p, pirq);
>> > +            if ( pirq < 0 )
>> > +            {
>> > +                ret = -EBUSY;
>> > +                goto done;
>> > +            }
>> > +        }
>> > +        else
>> > +        {
>> > +            pirq = get_free_pirq(d, MAP_PIRQ_TYPE_GSI);
>> > +            if ( pirq < 0 )
>> > +            {
>> > +                dprintk(XENLOG_G_ERR, "dom%d: no free pirq\n", d->domain_id);
>> > +                ret = pirq;
>> > +                goto done;
>> > +            }
>> > +        }
>> > +    }
>> > +    else
>> > +    {
>> > +        if ( pirq && pirq != *pirq_p )
>> > +        {
>> > +            dprintk(XENLOG_G_ERR, "dom%d: pirq %d conflicts with irq %d\n",
>> > +                    d->domain_id, *index, *pirq_p);
>> > +            ret = -EEXIST;
>> > +            goto done;
>> > +        }
>> > +        else
>> > +            pirq = *pirq_p;
>> > +    }
>> > +
>> > +    ret = map_domain_pirq(d, pirq, irq, MAP_PIRQ_TYPE_GSI, NULL);
>> > +    if ( ret == 0 )
>> > +        *pirq_p = pirq;
>> > +
>> > + done:
>> > +    spin_unlock(&d->event_lock);
>> > +    pcidevs_unlock();
>> 
>> All of the code above is being repeated in allocate_and_map_msi_pirq(),
>> merely with the multi-MSI addition. This is too much code duplication for
>> my taste.
> 
> I can try to factor this out into a separate helper that's used by both.
> 
>> Additionally, with it being split like this it is then questionable
>> what acquiring the PCI devices lock is good for here (I would think it is
>> needed at most in the MSI case).
> 
> Right, also I'm not sure why the PCI devices lock is acquired before calling
> into domain_irq_to_pirq, is that because of lock ordering rules with the domain
> event lock?

Yes. map_domain_pirq() in the MSI case requires both locks to be
held.

>> > +int allocate_and_map_msi_pirq(struct domain *d, int *index, int *pirq_p,
>> > +                              struct msi_info *msi)
>> > +{
>> > +    int irq, pirq, ret, type;
>> > +
>> > +    irq = *index;
>> > +    if ( irq == -1 || msi->entry_nr > 1 )
>> > +        irq = create_irq(NUMA_NO_NODE);
>> 
>> This doesn't look to be an exact equivalent of the original code: Even
>> with MAP_PIRQ_TYPE_MULTI_MSI entry_nr may be 1, and the original
>> code calls create_irq() also in that case. If this is an intended change,
>> the rationale should be provided in the commit message. But as you
>> don't alter map_domain_pirq(), I doubt this is correct.
> 
> My bad then, it's quite hard for me to figure out exactly what's the
> meaning/usage of those types, since they are not documented anywhere that I can
> find. physdev.h contains 3 different MSI related types:
> 
> * MAP_PIRQ_TYPE_MSI_SEG maps into MAP_PIRQ_TYPE_MSI.

This was needed because of the extra segment information which
wasn't part of the original MAP_PIRQ_TYPE_MSI. Otherwise the
two are identical, so ..

> * MAP_PIRQ_TYPE_MULTI_MSI is only available to map MSI interrupts (no MSI-X).
> * MAP_PIRQ_TYPE_MSI can map both MSI/MSI-X:
>    - If table_base != 0 it's a MSI-X interrupt.
>    - If table_base == 0 it's a single MSI interrupt AND entry_nr must be 1.
> 
> Let me know if this is accurate.

... almost - entry_nr when coming in as hypercall argument isn't
required to be 1; instead physdev_map_pirq() sets it to 1 when
table_base is zero.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq
  2017-05-16 11:52   ` Roger Pau Monne
@ 2017-05-16 12:13     ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2017-05-16 12:13 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, boris.ostrovsky

>>> On 16.05.17 at 13:52, <roger.pau@citrix.com> wrote:
> On Wed, Apr 19, 2017 at 04:11:26PM +0100, Roger Pau Monne wrote:
> [...]
>> +    if ( *pirq_p < 0 )
>> +    {
>> +        if ( pirq )
>> +        {
>> +            dprintk(XENLOG_G_ERR, "dom%d: %d:%d already mapped to %d\n",
>> +                    d->domain_id, *index, *pirq_p, pirq);
>> +            if ( pirq < 0 )
>> +            {
>> +                ret = -EBUSY;
>> +                goto done;
>> +            }
>> +        }
>> +        else if ( msi->entry_nr > 1 && !msi->table_base )
>> +        {
>> +            if ( msi->entry_nr <= 0 || msi->entry_nr > 32 )
>> +                ret = -EDOM;
>> +            else if ( msi->entry_nr != 1 && !iommu_intremap )
>> +                ret = -EOPNOTSUPP;
>> +            else
>> +            {
>> +                while ( msi->entry_nr & (msi->entry_nr - 1) )
>> +                    msi->entry_nr += msi->entry_nr & -msi->entry_nr;
>> +                pirq = get_free_pirqs(d, msi->entry_nr);
>> +                ret = 0;
>> +                if ( pirq < 0 )
>> +                {
>> +                    while ( (msi->entry_nr >>= 1) > 1 )
>> +                        if ( get_free_pirqs(d, msi->entry_nr) > 0 )
>> +                            break;
> 
> Maybe I'm missing something, but I think the code above should be:
> 
>                         if ( (pirq = get_free_pirqs(d, msi->entry_nr)) > 0 )
>                             break;
> 
> Or else the pirq returned by get_free_pirqs is not stored anywhere, and thus
> pirq will be -ENOSPC even when PIRQs have been allocated.
> 
> Let me know if this is true or not, so that I can send a patch to fix the
> current code (which also has this issue).

The code is correct. get_free_pirqs() doesn't alter the slots it
finds free, and here we only want to know how many we could
allocate at this point (given that we weren't able to allocate as
many as were requested).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-05-12 13:42   ` Jan Beulich
@ 2017-05-16 15:55     ` Roger Pau Monne
  2017-05-16 16:23       ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Roger Pau Monne @ 2017-05-16 15:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

On Fri, May 12, 2017 at 07:42:51AM -0600, Jan Beulich wrote:
> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> > Note that currently there's no support for unbinding this interrupts.
> 
> Do you plan to deal with that before this changes goes in? Aiui this
> not working means you can't pass through devices with pin based
> interrupts once Dom0 chose to bind to them. Otoh hand you modify
> pt_irq_destroy_bind(), so I'm a little puzzled ...

Yes, I modify pt_irq_destroy_bind to return EOPNOTSUPP when trying to unbind
such an interrupt. I can implement the unbind, but it's not going to be used
ATM.

> > --- a/xen/arch/x86/hvm/irq.c
> > +++ b/xen/arch/x86/hvm/irq.c
> > @@ -126,6 +126,34 @@ void hvm_pci_intx_deassert(
> >      spin_unlock(&d->arch.hvm_domain.irq_lock);
> >  }
> >  
> > +void hvm_gsi_assert(struct domain *d, unsigned int gsi)
> > +{
> > +    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
> > +
> > +    ASSERT(gsi < hvm_irq->nr_gsis);
> 
> This would probably better match the alternative construct in
> __hvm_pci_intx_assert().

OK, changed.

> > +    ASSERT(!has_vpic(d));
> 
> This doesn't look to be relevant for the rest of the function. Is there
> a particular reason you've added it? If so, a brief comment might
> help.

I've added this because hvm_gsi_assert doesn't call assert_irq, so a PIC would
not work properly, but it's probably pointless.

> > +    spin_lock(&d->arch.hvm_domain.irq_lock);
> > +    if ( !hvm_irq->gsi_assert_count[gsi] )
> > +    {
> > +        hvm_irq->gsi_assert_count[gsi]++;
> 
> Why is this an increment instead of a simple write of 1? Or the
> other way around - why is this not always incrementing, just like
> __hvm_pci_intx_assert() does? (Symmetric questions then for
> hvm_gsi_deassert()).

__hvm_pci_intx_{de}assert has an array that tracks the status of each interrupt
line, and Xen does the routing based on that (the __test_and_clear_bit at the
top of __hvm_pci_intx_assert). That prevents the same line from triggering
multiple times, which is not available here, and thus Xen needs to rely on
gsi_assert_count in order to know if the GSI is pending or not.

Switched to use a set to 1/0 instead of the increment, which I agree makes this
clearer.

> 
> > @@ -274,10 +289,16 @@ int pt_irq_create_bind(
> >      spin_lock(&d->event_lock);
> >  
> >      hvm_irq_dpci = domain_get_irq_dpci(d);
> > -    if ( hvm_irq_dpci == NULL )
> > +    if ( hvm_irq_dpci == NULL && !is_hardware_domain(d) )
> >      {
> >          unsigned int i;
> >  
> > +        /*
> > +         * NB: the hardware domain doesn't use a hvm_irq_dpci struct because
> > +         * it's only allowed to identity map GSIs, and so the data contained in
> > +         * that struct (used to map guest GSIs into machine GSIs and perform
> > +         * interrupt routing) it's completely useless for it.
> 
> "...) is completely useless to it."

Fixed, thanks.

> > @@ -422,35 +443,51 @@ int pt_irq_create_bind(
> >      case PT_IRQ_TYPE_PCI:
> >      case PT_IRQ_TYPE_MSI_TRANSLATE:
> >      {
> > -        unsigned int bus = pt_irq_bind->u.pci.bus;
> > -        unsigned int device = pt_irq_bind->u.pci.device;
> > -        unsigned int intx = pt_irq_bind->u.pci.intx;
> > -        unsigned int guest_gsi = hvm_pci_intx_gsi(device, intx);
> > -        unsigned int link = hvm_pci_intx_link(device, intx);
> > -        struct dev_intx_gsi_link *digl = xmalloc(struct dev_intx_gsi_link);
> > -        struct hvm_girq_dpci_mapping *girq =
> > -            xmalloc(struct hvm_girq_dpci_mapping);
> > +        struct dev_intx_gsi_link *digl = NULL;
> > +        struct hvm_girq_dpci_mapping *girq = NULL;
> > +        unsigned int guest_gsi;
> >  
> > -        if ( !digl || !girq )
> > +        /*
> > +         * Mapping GSIs for the hardware domain is different than doing it for
> > +         * an unpriviledged guest, the hardware domain is only allowed to
> > +         * identity map GSIs, and as such all the data in the u.pci union is
> > +         * discarded.
> > +         */
> > +        if ( !is_hardware_domain(d) )
> >          {
> > -            spin_unlock(&d->event_lock);
> > -            xfree(girq);
> > -            xfree(digl);
> > -            return -ENOMEM;
> > -        }
> > +            unsigned int link;
> > +
> > +            digl = xmalloc(struct dev_intx_gsi_link);
> > +            girq = xmalloc(struct hvm_girq_dpci_mapping);
> > +
> > +            if ( !digl || !girq )
> > +            {
> > +                spin_unlock(&d->event_lock);
> > +                xfree(girq);
> > +                xfree(digl);
> > +                return -ENOMEM;
> > +            }
> > +
> > +            girq->bus = digl->bus = pt_irq_bind->u.pci.bus;
> > +            girq->device = digl->device = pt_irq_bind->u.pci.device;
> > +            girq->intx = digl->intx = pt_irq_bind->u.pci.intx;
> > +            list_add_tail(&digl->list, &pirq_dpci->digl_list);
> >  
> > -        hvm_irq_dpci->link_cnt[link]++;
> > +            guest_gsi = hvm_pci_intx_gsi(digl->device, digl->intx);
> > +            link = hvm_pci_intx_link(digl->device, digl->intx);
> >  
> > -        digl->bus = bus;
> > -        digl->device = device;
> > -        digl->intx = intx;
> > -        list_add_tail(&digl->list, &pirq_dpci->digl_list);
> > +            hvm_irq_dpci->link_cnt[link]++;
> >  
> > -        girq->bus = bus;
> > -        girq->device = device;
> > -        girq->intx = intx;
> > -        girq->machine_gsi = pirq;
> > -        list_add_tail(&girq->list, &hvm_irq_dpci->girq[guest_gsi]);
> > +            girq->machine_gsi = pirq;
> > +            list_add_tail(&girq->list, &hvm_irq_dpci->girq[guest_gsi]);
> > +        }
> > +        else
> > +        {
> > +            /* MSI_TRANSLATE is not supported by the hardware domain. */
> > +            ASSERT(pt_irq_bind->irq_type == PT_IRQ_TYPE_PCI);
> > +            guest_gsi = pirq;
> > +            ASSERT(guest_gsi < hvm_domain_irq(d)->nr_gsis);
> > +        }
> 
> This is dangerous: For one it is impossible to judge the correctness
> of at least the first of these assertions for the hwdom case without
> looking at patch 3. And then the domctl path leading here does not
> exclude the subject domain equaling the calling one, i.e. you
> potentially assert guest input correctness here. Yes, we have XSA-77
> in place, but no, we shouldn't introduce new issues anywhere unless
> that's entirely unavoidable.

OK, let me return error instead to be on the safe side then.

> > @@ -504,10 +567,18 @@ int pt_irq_create_bind(
> >          spin_unlock(&d->event_lock);
> >  
> >          if ( iommu_verbose )
> > -            printk(XENLOG_G_INFO
> > -                   "d%d: bind: m_gsi=%u g_gsi=%u dev=%02x.%02x.%u intx=%u\n",
> > -                   d->domain_id, pirq, guest_gsi, bus,
> > -                   PCI_SLOT(device), PCI_FUNC(device), intx);
> > +        {
> > +            char buf[50];
> > +
> > +            if ( !is_hardware_domain(d) )
> > +                snprintf(buf, ARRAY_SIZE(buf), " dev=%02x.%02x.%u intx=%u",
> > +                         digl->bus, PCI_SLOT(digl->device),
> > +                         PCI_FUNC(digl->device), digl->intx);
> 
> The buffer array seems heavily over-sized - my counting gives at best
> slightly over 20 characters you actually need.

AFAICT max length should be 21, would you be fine with me setting it to 24 to
be safe?

> 
> > @@ -522,7 +593,6 @@ int pt_irq_create_bind(
> >  int pt_irq_destroy_bind(
> >      struct domain *d, xen_domctl_bind_pt_irq_t *pt_irq_bind)
> >  {
> > -    struct hvm_irq_dpci *hvm_irq_dpci;
> >      struct hvm_pirq_dpci *pirq_dpci;
> >      unsigned int machine_gsi = pt_irq_bind->machine_irq;
> >      struct pirq *pirq;
> > @@ -552,17 +622,15 @@ int pt_irq_destroy_bind(
> >  
> >      spin_lock(&d->event_lock);
> >  
> > -  hvm_irq_dpci = domain_get_irq_dpci(d);
> > -
> > -    if ( hvm_irq_dpci == NULL )
> > +    pirq = pirq_info(d, machine_gsi);
> > +    pirq_dpci = pirq_dpci(pirq);
> > +    if ( pirq_dpci->flags & HVM_IRQ_DPCI_IDENTITY_GSI )
> 
> I'm sure we've discussed aspects of this before: pirq_dpci may be
> NULL here, i.e. you can't blindly dereference it. All other uses in
> the function indeed have a NULL check first.

OK, I've removed this and fixed the function so it can unbind
HVM_IRQ_DPCI_IDENTITY_GSI.

> > @@ -570,9 +638,16 @@ int pt_irq_destroy_bind(
> >          unsigned int intx = pt_irq_bind->u.pci.intx;
> >          unsigned int guest_gsi = hvm_pci_intx_gsi(device, intx);
> >          unsigned int link = hvm_pci_intx_link(device, intx);
> > +        struct hvm_irq_dpci *hvm_irq_dpci = domain_get_irq_dpci(d);
> >          struct hvm_girq_dpci_mapping *girq;
> >          struct dev_intx_gsi_link *digl, *tmp;
> >  
> > +        if ( hvm_irq_dpci == NULL )
> > +        {
> > +            spin_unlock(&d->event_lock);
> > +            return -EINVAL;
> > +        }
> 
> Moving this check here is a behavioral modification. Perhaps a
> good one, yet it doesn't belong into this patch. Instead it should
> be properly reasoned about in a separate patch, if it is a correct
> thing to do.

I've left it in it's previous place and added a !is_hardware_domain check.

> > @@ -814,17 +896,12 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
> >      spin_unlock(&d->event_lock);
> >  }
> >  
> > -static void __hvm_dpci_eoi(struct domain *d,
> > -                           const struct hvm_girq_dpci_mapping *girq,
> > +static void __hvm_pirq_eoi(struct pirq *pirq,
> 
> Please drop the double leading underscores.

Done.

> 
> >                             const union vioapic_redir_entry *ent)
> >  {
> > -    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
> > -    struct hvm_pirq_dpci *pirq_dpci;
> > -
> > -    if ( !hvm_domain_use_pirq(d, pirq) )
> > -        hvm_pci_intx_deassert(d, girq->device, girq->intx);
> > +    struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
> >  
> > -    pirq_dpci = pirq_dpci(pirq);
> > +    ASSERT(pirq_dpci);
> 
> Why is this useful / needed all of the sudden?

Hm, I don't think it's needed, probably just a leftover from when I was testing
the patches.

> > @@ -839,6 +916,32 @@ static void __hvm_dpci_eoi(struct domain *d,
> >      pirq_guest_eoi(pirq);
> >  }
> >  
> > +static void __hvm_dpci_eoi(struct domain *d,
> > +                           const struct hvm_girq_dpci_mapping *girq,
> > +                           const union vioapic_redir_entry *ent)
> > +{
> > +    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
> > +
> > +    if ( !hvm_domain_use_pirq(d, pirq) )
> > +        hvm_pci_intx_deassert(d, girq->device, girq->intx);
> > +
> > +    __hvm_pirq_eoi(pirq, ent);
> > +}
> > +
> > +static void __hvm_gsi_eoi(struct domain *d, unsigned int gsi,
> 
> Same here for the double underscores. For the pre-existing
> function I'd leave it up to you whether to also drop them. What
> I care about is that we don't gain new non-conforming names.

I will leave the others as they are, or else they should be changed in a
separate patch.

> > +                          const union vioapic_redir_entry *ent)
> > +{
> > +    struct pirq *pirq = pirq_info(d, gsi);
> > +    struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
> > +
> > +    /* Check if GSI is actually mapped. */
> > +    if ( !pirq_dpci )
> 
> Please avoid the local variable when used just once here.

Done.

Thanks for the review, Roger.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/3] x86/vioapic: bind interrupts to PVH Dom0
  2017-05-12 13:55   ` Jan Beulich
@ 2017-05-16 16:15     ` Roger Pau Monne
  0 siblings, 0 replies; 19+ messages in thread
From: Roger Pau Monne @ 2017-05-16 16:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

On Fri, May 12, 2017 at 07:55:04AM -0600, Jan Beulich wrote:
> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/vioapic.c
> > +++ b/xen/arch/x86/hvm/vioapic.c
> > @@ -158,6 +158,62 @@ static int vioapic_read(
> >      return X86EMUL_OKAY;
> >  }
> >  
> > +static int vioapic_dom0_map_gsi(unsigned int gsi, unsigned int trig,
> 
> Considering the conditional in the caller, please use hwdom instead
> of dom0.

Done.

> > +                                unsigned int pol)
> > +{
> > +    struct domain *d = current->domain;
> > +    xen_domctl_bind_pt_irq_t pt_irq_bind = {
> > +        .irq_type = PT_IRQ_TYPE_PCI,
> > +        .machine_irq = gsi,
> > +        .hvm_domid = DOMID_SELF,
> 
> I'm struggling with this field: Did you not notice it's entirely
> unused? We should really delete it from the interface, as
> redundant with the domain specifier in the domctl container
> structure.

Right, I've removed it.

> > +    };
> > +    int ret, pirq = gsi;
> > +
> > +    ASSERT(is_hardware_domain(d));
> > +
> > +    /* Interrupt has been unmasked, bind it now. */
> > +    ret = mp_register_gsi(gsi, trig, pol);
> > +    if ( ret && ret != -EEXIST )
> > +    {
> > +        gdprintk(XENLOG_WARNING, "vioapic: error registering GSI %u: %d\n",
> > +                 gsi, ret);
> > +        goto error;
> > +    }
> > +    else if ( ret )
> > +        /* Already in use. */
> > +        return 0;
> 
> I think this would better be
> 
>     if ( ret == -EEXIST )
>         return 0;
>     if ( ret )
>         ....

Done.

> > +    ret = allocate_and_map_gsi_pirq(d, &pirq, &pirq);
> > +    if ( ret )
> > +    {
> > +        gdprintk(XENLOG_WARNING, "vioapic: error mapping GSI %u: %d\n",
> > +                 gsi, ret);
> > +        goto error;
> > +    }
> > +
> > +    pcidevs_lock();
> > +    ret = pt_irq_create_bind(d, &pt_irq_bind);
> > +    pcidevs_unlock();
> > +    if ( ret )
> > +    {
> > +        gdprintk(XENLOG_WARNING, "vioapic: error binding GSI %u: %d\n",
> > +                 gsi, ret);
> > +        goto error_unmap;
> > +    }
> > +
> > +    return 0;
> > +
> > + error_unmap:
> 
> I can live with the "error" label below, but the one above clearly can be
> avoided quite easily by simply inverting the preceding if().

I've changed this to:

    pcidevs_lock();
    ret = pt_irq_create_bind(d, &pt_irq_bind);
    if ( ret )
    {
        gprintk(XENLOG_WARNING, "vioapic: error binding GSI %u: %d\n",
                gsi, ret);
        spin_lock(&d->event_lock);
        unmap_domain_pirq(d, pirq);
        spin_unlock(&d->event_lock);
    }
    pcidevs_unlock();

    return ret;

And got rid of both labels.

> Also, considering this is Dom0-only, I wonder whether all of the log
> messages wouldn't better use gprintk().

Done.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-05-16 15:55     ` Roger Pau Monne
@ 2017-05-16 16:23       ` Jan Beulich
  2017-05-17  8:26         ` Roger Pau Monne
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-05-16 16:23 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

>>> On 16.05.17 at 17:55, <roger.pau@citrix.com> wrote:
> On Fri, May 12, 2017 at 07:42:51AM -0600, Jan Beulich wrote:
>> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
>> > Note that currently there's no support for unbinding this interrupts.
>> 
>> Do you plan to deal with that before this changes goes in? Aiui this
>> not working means you can't pass through devices with pin based
>> interrupts once Dom0 chose to bind to them. Otoh hand you modify
>> pt_irq_destroy_bind(), so I'm a little puzzled ...
> 
> Yes, I modify pt_irq_destroy_bind to return EOPNOTSUPP when trying to unbind
> such an interrupt. I can implement the unbind, but it's not going to be used
> ATM.

Is it not? I can see the mentioned pass-through case to be of no
interest, but wouldn't a well behaved kernel perhaps unmap IRQs
while shutting down?

>> > +    spin_lock(&d->arch.hvm_domain.irq_lock);
>> > +    if ( !hvm_irq->gsi_assert_count[gsi] )
>> > +    {
>> > +        hvm_irq->gsi_assert_count[gsi]++;
>> 
>> Why is this an increment instead of a simple write of 1? Or the
>> other way around - why is this not always incrementing, just like
>> __hvm_pci_intx_assert() does? (Symmetric questions then for
>> hvm_gsi_deassert()).
> 
> __hvm_pci_intx_{de}assert has an array that tracks the status of each interrupt
> line, and Xen does the routing based on that (the __test_and_clear_bit at the
> top of __hvm_pci_intx_assert). That prevents the same line from triggering
> multiple times, which is not available here, and thus Xen needs to rely on
> gsi_assert_count in order to know if the GSI is pending or not.
> 
> Switched to use a set to 1/0 instead of the increment, which I agree makes this
> clearer.

And altogether this likely would benefit from a comment put
somewhere.

>> > @@ -504,10 +567,18 @@ int pt_irq_create_bind(
>> >          spin_unlock(&d->event_lock);
>> >  
>> >          if ( iommu_verbose )
>> > -            printk(XENLOG_G_INFO
>> > -                   "d%d: bind: m_gsi=%u g_gsi=%u dev=%02x.%02x.%u intx=%u\n",
>> > -                   d->domain_id, pirq, guest_gsi, bus,
>> > -                   PCI_SLOT(device), PCI_FUNC(device), intx);
>> > +        {
>> > +            char buf[50];
>> > +
>> > +            if ( !is_hardware_domain(d) )
>> > +                snprintf(buf, ARRAY_SIZE(buf), " dev=%02x.%02x.%u intx=%u",
>> > +                         digl->bus, PCI_SLOT(digl->device),
>> > +                         PCI_FUNC(digl->device), digl->intx);
>> 
>> The buffer array seems heavily over-sized - my counting gives at best
>> slightly over 20 characters you actually need.
> 
> AFAICT max length should be 21, would you be fine with me setting it to 24 to
> be safe?

Sure.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-05-16 16:23       ` Jan Beulich
@ 2017-05-17  8:26         ` Roger Pau Monne
  2017-05-17  9:02           ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Roger Pau Monne @ 2017-05-17  8:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

On Tue, May 16, 2017 at 10:23:48AM -0600, Jan Beulich wrote:
> >>> On 16.05.17 at 17:55, <roger.pau@citrix.com> wrote:
> > On Fri, May 12, 2017 at 07:42:51AM -0600, Jan Beulich wrote:
> >> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> >> > Note that currently there's no support for unbinding this interrupts.
> >> 
> >> Do you plan to deal with that before this changes goes in? Aiui this
> >> not working means you can't pass through devices with pin based
> >> interrupts once Dom0 chose to bind to them. Otoh hand you modify
> >> pt_irq_destroy_bind(), so I'm a little puzzled ...
> > 
> > Yes, I modify pt_irq_destroy_bind to return EOPNOTSUPP when trying to unbind
> > such an interrupt. I can implement the unbind, but it's not going to be used
> > ATM.
> 
> Is it not? I can see the mentioned pass-through case to be of no
> interest, but wouldn't a well behaved kernel perhaps unmap IRQs
> while shutting down?

I guess I haven't explained myself correctly, what I meant is that right now I
don't have any use-case for this, I haven't started working on pci-passtrhough
for guests, and the Dom0 implementation I have doesn't unbind interrupts on
shutdown.

I could unbind them when Dom0 masks the vIO APIC pin, but I think that's
going to be awfully slow.

> >> > +    spin_lock(&d->arch.hvm_domain.irq_lock);
> >> > +    if ( !hvm_irq->gsi_assert_count[gsi] )
> >> > +    {
> >> > +        hvm_irq->gsi_assert_count[gsi]++;
> >> 
> >> Why is this an increment instead of a simple write of 1? Or the
> >> other way around - why is this not always incrementing, just like
> >> __hvm_pci_intx_assert() does? (Symmetric questions then for
> >> hvm_gsi_deassert()).
> > 
> > __hvm_pci_intx_{de}assert has an array that tracks the status of each interrupt
> > line, and Xen does the routing based on that (the __test_and_clear_bit at the
> > top of __hvm_pci_intx_assert). That prevents the same line from triggering
> > multiple times, which is not available here, and thus Xen needs to rely on
> > gsi_assert_count in order to know if the GSI is pending or not.
> > 
> > Switched to use a set to 1/0 instead of the increment, which I agree makes this
> > clearer.
> 
> And altogether this likely would benefit from a comment put
> somewhere.

Done.

Thanks.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-05-17  8:26         ` Roger Pau Monne
@ 2017-05-17  9:02           ` Jan Beulich
  2017-05-17  9:34             ` Roger Pau Monne
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-05-17  9:02 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

>>> On 17.05.17 at 10:26, <roger.pau@citrix.com> wrote:
> On Tue, May 16, 2017 at 10:23:48AM -0600, Jan Beulich wrote:
>> >>> On 16.05.17 at 17:55, <roger.pau@citrix.com> wrote:
>> > On Fri, May 12, 2017 at 07:42:51AM -0600, Jan Beulich wrote:
>> >> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
>> >> > Note that currently there's no support for unbinding this interrupts.
>> >> 
>> >> Do you plan to deal with that before this changes goes in? Aiui this
>> >> not working means you can't pass through devices with pin based
>> >> interrupts once Dom0 chose to bind to them. Otoh hand you modify
>> >> pt_irq_destroy_bind(), so I'm a little puzzled ...
>> > 
>> > Yes, I modify pt_irq_destroy_bind to return EOPNOTSUPP when trying to unbind
>> > such an interrupt. I can implement the unbind, but it's not going to be used
>> > ATM.
>> 
>> Is it not? I can see the mentioned pass-through case to be of no
>> interest, but wouldn't a well behaved kernel perhaps unmap IRQs
>> while shutting down?
> 
> I guess I haven't explained myself correctly, what I meant is that right now I
> don't have any use-case for this, I haven't started working on pci-passtrhough
> for guests, and the Dom0 implementation I have doesn't unbind interrupts on
> shutdown.
> 
> I could unbind them when Dom0 masks the vIO APIC pin, but I think that's
> going to be awfully slow.

Well, doesn't this point out another weakness of the no-physdevops
model you advocate for? Implying an unbind from the mask bit being
set in an RTE would certainly be undesirable (there are reasons to
transiently mask an interrupt, after all). Hence there's no way Dom0
could indicate "I'm done with this interrupt", unless I'm missing
something.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-05-17  9:02           ` Jan Beulich
@ 2017-05-17  9:34             ` Roger Pau Monne
  2017-05-17  9:51               ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Roger Pau Monne @ 2017-05-17  9:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

On Wed, May 17, 2017 at 03:02:26AM -0600, Jan Beulich wrote:
> >>> On 17.05.17 at 10:26, <roger.pau@citrix.com> wrote:
> > On Tue, May 16, 2017 at 10:23:48AM -0600, Jan Beulich wrote:
> >> >>> On 16.05.17 at 17:55, <roger.pau@citrix.com> wrote:
> >> > On Fri, May 12, 2017 at 07:42:51AM -0600, Jan Beulich wrote:
> >> >> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> >> >> > Note that currently there's no support for unbinding this interrupts.
> >> >> 
> >> >> Do you plan to deal with that before this changes goes in? Aiui this
> >> >> not working means you can't pass through devices with pin based
> >> >> interrupts once Dom0 chose to bind to them. Otoh hand you modify
> >> >> pt_irq_destroy_bind(), so I'm a little puzzled ...
> >> > 
> >> > Yes, I modify pt_irq_destroy_bind to return EOPNOTSUPP when trying to unbind
> >> > such an interrupt. I can implement the unbind, but it's not going to be used
> >> > ATM.
> >> 
> >> Is it not? I can see the mentioned pass-through case to be of no
> >> interest, but wouldn't a well behaved kernel perhaps unmap IRQs
> >> while shutting down?
> > 
> > I guess I haven't explained myself correctly, what I meant is that right now I
> > don't have any use-case for this, I haven't started working on pci-passtrhough
> > for guests, and the Dom0 implementation I have doesn't unbind interrupts on
> > shutdown.
> > 
> > I could unbind them when Dom0 masks the vIO APIC pin, but I think that's
> > going to be awfully slow.
> 
> Well, doesn't this point out another weakness of the no-physdevops
> model you advocate for? Implying an unbind from the mask bit being
> set in an RTE would certainly be undesirable (there are reasons to
> transiently mask an interrupt, after all). Hence there's no way Dom0
> could indicate "I'm done with this interrupt", unless I'm missing
> something.

The only reason I could see for the hardware domain to unbind an interrupt is
when doing pci-passthrough, and there the toolstack is involved, which could
unbind the interrupt using the already existing hypercalls.

Just to be clear, my no-physdevops thing is about how guests interact with
physical devices when using them in a native way. Things like pci-passthrough
(that are not part of how an OS operates) should indeed use hypercalls (because
there's no native way to do this).

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-05-17  9:34             ` Roger Pau Monne
@ 2017-05-17  9:51               ` Jan Beulich
  2017-05-17 10:13                 ` Roger Pau Monne
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-05-17  9:51 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

>>> On 17.05.17 at 11:34, <roger.pau@citrix.com> wrote:
> On Wed, May 17, 2017 at 03:02:26AM -0600, Jan Beulich wrote:
>> >>> On 17.05.17 at 10:26, <roger.pau@citrix.com> wrote:
>> > On Tue, May 16, 2017 at 10:23:48AM -0600, Jan Beulich wrote:
>> >> >>> On 16.05.17 at 17:55, <roger.pau@citrix.com> wrote:
>> >> > On Fri, May 12, 2017 at 07:42:51AM -0600, Jan Beulich wrote:
>> >> >> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
>> >> >> > Note that currently there's no support for unbinding this interrupts.
>> >> >> 
>> >> >> Do you plan to deal with that before this changes goes in? Aiui this
>> >> >> not working means you can't pass through devices with pin based
>> >> >> interrupts once Dom0 chose to bind to them. Otoh hand you modify
>> >> >> pt_irq_destroy_bind(), so I'm a little puzzled ...
>> >> > 
>> >> > Yes, I modify pt_irq_destroy_bind to return EOPNOTSUPP when trying to 
> unbind
>> >> > such an interrupt. I can implement the unbind, but it's not going to be 
> used
>> >> > ATM.
>> >> 
>> >> Is it not? I can see the mentioned pass-through case to be of no
>> >> interest, but wouldn't a well behaved kernel perhaps unmap IRQs
>> >> while shutting down?
>> > 
>> > I guess I haven't explained myself correctly, what I meant is that right 
> now I
>> > don't have any use-case for this, I haven't started working on 
> pci-passtrhough
>> > for guests, and the Dom0 implementation I have doesn't unbind interrupts on
>> > shutdown.
>> > 
>> > I could unbind them when Dom0 masks the vIO APIC pin, but I think that's
>> > going to be awfully slow.
>> 
>> Well, doesn't this point out another weakness of the no-physdevops
>> model you advocate for? Implying an unbind from the mask bit being
>> set in an RTE would certainly be undesirable (there are reasons to
>> transiently mask an interrupt, after all). Hence there's no way Dom0
>> could indicate "I'm done with this interrupt", unless I'm missing
>> something.
> 
> The only reason I could see for the hardware domain to unbind an interrupt is
> when doing pci-passthrough, and there the toolstack is involved, which could
> unbind the interrupt using the already existing hypercalls.

I'm not convinced the tool stack doing this behind the back of the
kernel is an acceptable thing to do, even more so when thinking
of shared interrupts.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0
  2017-05-17  9:51               ` Jan Beulich
@ 2017-05-17 10:13                 ` Roger Pau Monne
  0 siblings, 0 replies; 19+ messages in thread
From: Roger Pau Monne @ 2017-05-17 10:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, boris.ostrovsky, xen-devel

On Wed, May 17, 2017 at 03:51:43AM -0600, Jan Beulich wrote:
> >>> On 17.05.17 at 11:34, <roger.pau@citrix.com> wrote:
> > On Wed, May 17, 2017 at 03:02:26AM -0600, Jan Beulich wrote:
> >> >>> On 17.05.17 at 10:26, <roger.pau@citrix.com> wrote:
> >> > On Tue, May 16, 2017 at 10:23:48AM -0600, Jan Beulich wrote:
> >> >> >>> On 16.05.17 at 17:55, <roger.pau@citrix.com> wrote:
> >> >> > On Fri, May 12, 2017 at 07:42:51AM -0600, Jan Beulich wrote:
> >> >> >> >>> On 19.04.17 at 17:11, <roger.pau@citrix.com> wrote:
> >> >> >> > Note that currently there's no support for unbinding this interrupts.
> >> >> >> 
> >> >> >> Do you plan to deal with that before this changes goes in? Aiui this
> >> >> >> not working means you can't pass through devices with pin based
> >> >> >> interrupts once Dom0 chose to bind to them. Otoh hand you modify
> >> >> >> pt_irq_destroy_bind(), so I'm a little puzzled ...
> >> >> > 
> >> >> > Yes, I modify pt_irq_destroy_bind to return EOPNOTSUPP when trying to 
> > unbind
> >> >> > such an interrupt. I can implement the unbind, but it's not going to be 
> > used
> >> >> > ATM.
> >> >> 
> >> >> Is it not? I can see the mentioned pass-through case to be of no
> >> >> interest, but wouldn't a well behaved kernel perhaps unmap IRQs
> >> >> while shutting down?
> >> > 
> >> > I guess I haven't explained myself correctly, what I meant is that right 
> > now I
> >> > don't have any use-case for this, I haven't started working on 
> > pci-passtrhough
> >> > for guests, and the Dom0 implementation I have doesn't unbind interrupts on
> >> > shutdown.
> >> > 
> >> > I could unbind them when Dom0 masks the vIO APIC pin, but I think that's
> >> > going to be awfully slow.
> >> 
> >> Well, doesn't this point out another weakness of the no-physdevops
> >> model you advocate for? Implying an unbind from the mask bit being
> >> set in an RTE would certainly be undesirable (there are reasons to
> >> transiently mask an interrupt, after all). Hence there's no way Dom0
> >> could indicate "I'm done with this interrupt", unless I'm missing
> >> something.
> > 
> > The only reason I could see for the hardware domain to unbind an interrupt is
> > when doing pci-passthrough, and there the toolstack is involved, which could
> > unbind the interrupt using the already existing hypercalls.
> 
> I'm not convinced the tool stack doing this behind the back of the
> kernel is an acceptable thing to do, even more so when thinking
> of shared interrupts.

The toolstack could figure out whether the interrupt is shared or not (edge or
level triggered) and then decide whether it needs unbinding or not. The kernel
must obviously not be using the device by that point. I'm again not opposed to
have an in-kernel (for Dom0) driver for managing passthrough, but I would like
to avoid it if possible.

In any case, I think the interface for how Dom0 interacts with interrupts from
it's assigned devices vs the interface used to detach devices for
pci-passthrough is orthogonal (one can be completely separated from the
other).

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-05-17 10:14 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-19 15:11 [PATCH v2 0/3] x86/dpci: bind legacy PCI interrupts to PVHv2 Dom0 Roger Pau Monne
2017-04-19 15:11 ` [PATCH v2 1/3] x86/physdev: factor out the code to allocate and map a pirq Roger Pau Monne
2017-05-12 12:56   ` Jan Beulich
2017-05-16  9:47     ` Roger Pau Monne
2017-05-16 12:03       ` Jan Beulich
2017-05-16 11:52   ` Roger Pau Monne
2017-05-16 12:13     ` Jan Beulich
2017-04-19 15:11 ` [PATCH v2 2/3] x86/pt: enable binding of GSIs to a PVH Dom0 Roger Pau Monne
2017-05-12 13:42   ` Jan Beulich
2017-05-16 15:55     ` Roger Pau Monne
2017-05-16 16:23       ` Jan Beulich
2017-05-17  8:26         ` Roger Pau Monne
2017-05-17  9:02           ` Jan Beulich
2017-05-17  9:34             ` Roger Pau Monne
2017-05-17  9:51               ` Jan Beulich
2017-05-17 10:13                 ` Roger Pau Monne
2017-04-19 15:11 ` [PATCH v2 3/3] x86/vioapic: bind interrupts to " Roger Pau Monne
2017-05-12 13:55   ` Jan Beulich
2017-05-16 16:15     ` Roger Pau Monne

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.