All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0
@ 2018-11-20 16:01 Roger Pau Monne
  2018-11-20 16:01 ` [PATCH v5 1/6] vpci: fix updating the command register Roger Pau Monne
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Roger Pau Monne @ 2018-11-20 16:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne

Hello,

The following series contain miscellaneous fixes for a PVH Dom0. I've
found out this issues while trying to boot on an AMD EPYC box.

The series can be found on my git repo:

git://xenbits.xen.org/people/royger/xen.git fixes-pvh-v5

Roger Pau Monne (6):
  vpci: fix updating the command register
  vpci: fix deferral of long operations
  vpci/msix: carve p2m hole for MSIX MMIO regions
  pci: add a segment parameter to pci_hide_device
  amd/iommu: assign iommu devices to Xen
  amd/iommu: skip bridge devices when updating IOMMU page tables

 xen/arch/x86/hvm/ioreq.c                 |  9 ++--
 xen/drivers/char/ehci-dbgp.c             |  2 +-
 xen/drivers/char/ns16550.c               |  2 +-
 xen/drivers/passthrough/amd/iommu_init.c |  4 ++
 xen/drivers/passthrough/amd/iommu_map.c  |  4 ++
 xen/drivers/passthrough/pci.c            | 16 ++++--
 xen/drivers/video/vga.c                  |  2 +-
 xen/drivers/vpci/header.c                | 63 ++++++++++++++++--------
 xen/drivers/vpci/msix.c                  | 49 ++++++++++++++++++
 xen/include/xen/pci.h                    |  2 +-
 xen/include/xen/vpci.h                   |  5 +-
 11 files changed, 124 insertions(+), 34 deletions(-)

-- 
2.19.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v5 1/6] vpci: fix updating the command register
  2018-11-20 16:01 [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0 Roger Pau Monne
@ 2018-11-20 16:01 ` Roger Pau Monne
  2018-11-26 11:20   ` Jan Beulich
  2018-11-20 16:01 ` [PATCH v5 2/6] vpci: fix deferral of long operations Roger Pau Monne
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monne @ 2018-11-20 16:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Roger Pau Monne

When switching the memory decoding bit in the command register the
rest of the changes where dropped, leading to only the memory decoding
bit being updated.

Fix this by writing the command register once the guest physmap
manipulations are done if there are changes to the memory decoding
bit.

Note that when only mapping/unmapping the ROM BAR a fabricated command
register value is passed to modify_bars which is only used to signal
whether the action is a mapping or unmapping, but the value is never
written to the device command register. Turn the maodify_decoding
ASSERT into an ASSERT_UNREACHABLE and make sure that non-debug builds
won't end up writing to the command register if only modifying the ROM
BAR.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v4:
 - Turn the ASSERT in modify_decoding to an ASSERT_UNREACHABLE to be
   sure the cmd value is never written when only {un}mapping the ROM
   BAR.
 - Add a comment in rom_write about the fabricated command register
   passed to modify_bars.

Changes since v3:
 - Only update the command register once after the physmap changes are
   done.
---
 xen/drivers/vpci/header.c | 47 ++++++++++++++++++++++-----------------
 xen/include/xen/vpci.h    |  2 +-
 2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 4573ccadf0..39dffb21fb 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -81,11 +81,12 @@ static int map_range(unsigned long s, unsigned long e, void *data,
  * BAR's enable bit has changed with the memory decoding bit already enabled.
  * If rom_only is not set then it's the memory decoding bit that changed.
  */
-static void modify_decoding(const struct pci_dev *pdev, bool map, bool rom_only)
+static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
+                            bool rom_only)
 {
     struct vpci_header *header = &pdev->vpci->header;
     uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
-    uint16_t cmd;
+    bool map = cmd & PCI_COMMAND_MEMORY;
     unsigned int i;
 
     for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
@@ -110,12 +111,11 @@ static void modify_decoding(const struct pci_dev *pdev, bool map, bool rom_only)
             header->bars[i].enabled = map;
     }
 
-    ASSERT(!rom_only);
-    cmd = pci_conf_read16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND);
-    cmd &= ~PCI_COMMAND_MEMORY;
-    cmd |= map ? PCI_COMMAND_MEMORY : 0;
-    pci_conf_write16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND,
-                     cmd);
+    if ( !rom_only )
+        pci_conf_write16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND,
+                         cmd);
+    else
+        ASSERT_UNREACHABLE();
 }
 
 bool vpci_process_pending(struct vcpu *v)
@@ -124,7 +124,7 @@ bool vpci_process_pending(struct vcpu *v)
     {
         struct map_data data = {
             .d = v->domain,
-            .map = v->vpci.map,
+            .map = v->vpci.cmd & PCI_COMMAND_MEMORY,
         };
         int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
 
@@ -133,7 +133,8 @@ bool vpci_process_pending(struct vcpu *v)
 
         spin_lock(&v->vpci.pdev->vpci->lock);
         /* Disable memory decoding unconditionally on failure. */
-        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
+        modify_decoding(v->vpci.pdev,
+                        rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY : v->vpci.cmd,
                         !rc && v->vpci.rom_only);
         spin_unlock(&v->vpci.pdev->vpci->lock);
 
@@ -154,7 +155,7 @@ bool vpci_process_pending(struct vcpu *v)
 }
 
 static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
-                            struct rangeset *mem)
+                            struct rangeset *mem, uint16_t cmd)
 {
     struct map_data data = { .d = d, .map = true };
     int rc;
@@ -163,13 +164,13 @@ static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
         process_pending_softirqs();
     rangeset_destroy(mem);
     if ( !rc )
-        modify_decoding(pdev, true, false);
+        modify_decoding(pdev, cmd, false);
 
     return rc;
 }
 
 static void defer_map(struct domain *d, struct pci_dev *pdev,
-                      struct rangeset *mem, bool map, bool rom_only)
+                      struct rangeset *mem, uint16_t cmd, bool rom_only)
 {
     struct vcpu *curr = current;
 
@@ -181,11 +182,11 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
      */
     curr->vpci.pdev = pdev;
     curr->vpci.mem = mem;
-    curr->vpci.map = map;
+    curr->vpci.cmd = cmd;
     curr->vpci.rom_only = rom_only;
 }
 
-static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
+static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
 {
     struct vpci_header *header = &pdev->vpci->header;
     struct rangeset *mem = rangeset_new(NULL, NULL, 0);
@@ -305,11 +306,11 @@ static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
          * be called iff the memory decoding bit is enabled, thus the operation
          * will always be to establish mappings and process all the BARs.
          */
-        ASSERT(map && !rom_only);
-        return apply_map(pdev->domain, pdev, mem);
+        ASSERT((cmd & PCI_COMMAND_MEMORY) && !rom_only);
+        return apply_map(pdev->domain, pdev, mem, cmd);
     }
 
-    defer_map(dev->domain, dev, mem, map, rom_only);
+    defer_map(dev->domain, dev, mem, cmd, rom_only);
 
     return 0;
 }
@@ -332,7 +333,7 @@ static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
          * memory decoding bit has not been changed, so leave everything as-is,
          * hoping the guest will realize and try again.
          */
-        modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false);
+        modify_bars(pdev, cmd, false);
     else
         pci_conf_write16(pdev->seg, pdev->bus, slot, func, reg, cmd);
 }
@@ -413,7 +414,11 @@ static void rom_write(const struct pci_dev *pdev, unsigned int reg,
         header->rom_enabled = new_enabled;
         pci_conf_write32(pdev->seg, pdev->bus, slot, func, reg, val);
     }
-    else if ( modify_bars(pdev, new_enabled, true) )
+    /*
+     * Pass PCI_COMMAND_MEMORY or 0 to signal a map/unmap request, note that
+     * this fabricated command is never going to be written to the register.
+     */
+    else if ( modify_bars(pdev, new_enabled ? PCI_COMMAND_MEMORY : 0, true) )
         /*
          * No memory has been added or removed from the p2m (because the actual
          * p2m changes are deferred in defer_map) and the ROM enable bit has
@@ -549,7 +554,7 @@ static int init_bars(struct pci_dev *pdev)
             rom->type = VPCI_BAR_EMPTY;
     }
 
-    return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, false) : 0;
+    return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, cmd, false) : 0;
 }
 REGISTER_VPCI_INIT(init_bars, VPCI_PRIORITY_MIDDLE);
 
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index af2b8580ee..44104b75b6 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -145,7 +145,7 @@ struct vpci_vcpu {
     /* Per-vcpu structure to store state while {un}mapping of PCI BARs. */
     struct rangeset *mem;
     struct pci_dev *pdev;
-    bool map      : 1;
+    uint16_t cmd;
     bool rom_only : 1;
 };
 
-- 
2.19.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 2/6] vpci: fix deferral of long operations
  2018-11-20 16:01 [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0 Roger Pau Monne
  2018-11-20 16:01 ` [PATCH v5 1/6] vpci: fix updating the command register Roger Pau Monne
@ 2018-11-20 16:01 ` Roger Pau Monne
  2018-11-26 11:26   ` Jan Beulich
  2018-11-26 11:41   ` Jan Beulich
  2018-11-20 16:01 ` [PATCH v5 3/6] vpci/msix: carve p2m hole for MSIX MMIO regions Roger Pau Monne
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Roger Pau Monne @ 2018-11-20 16:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich, Roger Pau Monne

Current logic to handle long running operations is flawed because it
doesn't prevent the guest vcpu from running. Fix this by raising a
scheduler softirq when preemption is required, so that the do_softirq
call in the guest entry path performs a rescheduling. Also move the
call to vpci_process_pending into handle_hvm_io_completion, together
with the IOREQ code that handles pending IO instructions.

Note that a scheduler softirq is also raised when the long running
operation is queued in order to prevent the guest vcpu from resuming
execution.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
---
Changes since v4:
 - Add a comment to clarify defer_map raising a scheduler softirq.
 - Reword commit message.
 - Raise the scheduler softirq in handle_hvm_io_completion rather than
   vpci_process_pending.

Changes since v3:
 - Don't use a tasklet.
---
 xen/arch/x86/hvm/ioreq.c  | 9 ++++++---
 xen/drivers/vpci/header.c | 5 +++++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index a56d634f31..71f23227e6 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -85,9 +85,6 @@ bool hvm_io_pending(struct vcpu *v)
     struct hvm_ioreq_server *s;
     unsigned int id;
 
-    if ( has_vpci(d) && vpci_process_pending(v) )
-        return true;
-
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
@@ -186,6 +183,12 @@ bool handle_hvm_io_completion(struct vcpu *v)
     enum hvm_io_completion io_completion;
     unsigned int id;
 
+    if ( has_vpci(d) && vpci_process_pending(v) )
+    {
+        raise_softirq(SCHEDULE_SOFTIRQ);
+        return false;
+    }
+
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 39dffb21fb..c9bdc2ced3 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -184,6 +184,11 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
     curr->vpci.mem = mem;
     curr->vpci.cmd = cmd;
     curr->vpci.rom_only = rom_only;
+    /*
+     * Raise a scheduler softirq in order to prevent the guest from resuming
+     * execution with pending mapping operations.
+     */
+    raise_softirq(SCHEDULE_SOFTIRQ);
 }
 
 static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
-- 
2.19.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 3/6] vpci/msix: carve p2m hole for MSIX MMIO regions
  2018-11-20 16:01 [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0 Roger Pau Monne
  2018-11-20 16:01 ` [PATCH v5 1/6] vpci: fix updating the command register Roger Pau Monne
  2018-11-20 16:01 ` [PATCH v5 2/6] vpci: fix deferral of long operations Roger Pau Monne
@ 2018-11-20 16:01 ` Roger Pau Monne
  2018-11-20 16:01 ` [PATCH v5 4/6] pci: add a segment parameter to pci_hide_device Roger Pau Monne
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Roger Pau Monne @ 2018-11-20 16:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Roger Pau Monne

Make sure the MSIX MMIO regions don't have p2m entries setup, so that
accesses to them trap into the hypervisor and can be handled by vpci.

Commit 042678762 ("x86/iommu: add map-reserved dom0-iommu option to
map reserved memory ranges") added mappings for all the reserved
regions into the PVH Dom0 p2m, and some of those reserved regions
might contain MSIX MMIO regions, hence the need to make sure there are
no mappings established.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v3:
 - Allow p2m_invalid for unmapped regions.
 - Add a DomU FIXME comment.
 - Reword commit message.
---
 xen/drivers/vpci/header.c | 11 +++++++++
 xen/drivers/vpci/msix.c   | 49 +++++++++++++++++++++++++++++++++++++++
 xen/include/xen/vpci.h    |  3 +++
 3 files changed, 63 insertions(+)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index c9bdc2ced3..dfb2ba4297 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -89,6 +89,17 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
     bool map = cmd & PCI_COMMAND_MEMORY;
     unsigned int i;
 
+    /*
+     * Make sure there are no mappings in the MSIX MMIO areas, so that accesses
+     * can be trapped (and emulated) by Xen when the memory decoding bit is
+     * enabled.
+     *
+     * FIXME: punching holes after the p2m has been set up might be racy for
+     * DomU usage, needs to be revisited.
+     */
+    if ( map && !rom_only && vpci_make_msix_hole(pdev) )
+        return;
+
     for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
     {
         if ( !MAPPABLE_BAR(&header->bars[i]) )
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 1960dae123..af3ffa087d 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -21,6 +21,7 @@
 #include <xen/vpci.h>
 
 #include <asm/msi.h>
+#include <asm/p2m.h>
 
 #define VMSIX_SIZE(num) offsetof(struct vpci_msix, entries[num])
 
@@ -395,6 +396,54 @@ static const struct hvm_mmio_ops vpci_msix_table_ops = {
     .write = msix_write,
 };
 
+int vpci_make_msix_hole(const struct pci_dev *pdev)
+{
+    struct domain *d = pdev->domain;
+    unsigned int i;
+
+    if ( !pdev->vpci->msix )
+        return 0;
+
+    /* Make sure there's a hole for the MSIX table/PBA in the p2m. */
+    for ( i = 0; i < ARRAY_SIZE(pdev->vpci->msix->tables); i++ )
+    {
+        unsigned long start = PFN_DOWN(vmsix_table_addr(pdev->vpci, i));
+        unsigned long end = PFN_DOWN(vmsix_table_addr(pdev->vpci, i) +
+                                     vmsix_table_size(pdev->vpci, i) - 1);
+
+        for ( ; start <= end; start++ )
+        {
+            p2m_type_t t;
+            mfn_t mfn = get_gfn_query(d, start, &t);
+
+            switch ( t )
+            {
+            case p2m_mmio_dm:
+            case p2m_invalid:
+                break;
+            case p2m_mmio_direct:
+                if ( mfn_x(mfn) == start )
+                {
+                    clear_identity_p2m_entry(d, start);
+                    break;
+                }
+                /* fallthrough. */
+            default:
+                put_gfn(d, start);
+                gprintk(XENLOG_WARNING,
+                        "%04x:%02x:%02x.%u: existing mapping (mfn: %" PRI_mfn
+                        "type: %d) at %#lx clobbers MSIX MMIO area\n",
+                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                        PCI_FUNC(pdev->devfn), mfn_x(mfn), t, start);
+                return -EEXIST;
+            }
+            put_gfn(d, start);
+        }
+    }
+
+    return 0;
+}
+
 static int init_msix(struct pci_dev *pdev)
 {
     struct domain *d = pdev->domain;
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 44104b75b6..4cf233c779 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -152,6 +152,9 @@ struct vpci_vcpu {
 #ifdef __XEN__
 void vpci_dump_msi(void);
 
+/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
+int vpci_make_msix_hole(const struct pci_dev *pdev);
+
 /* Arch-specific vPCI MSI helpers. */
 void vpci_msi_arch_mask(struct vpci_msi *msi, const struct pci_dev *pdev,
                         unsigned int entry, bool mask);
-- 
2.19.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 4/6] pci: add a segment parameter to pci_hide_device
  2018-11-20 16:01 [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0 Roger Pau Monne
                   ` (2 preceding siblings ...)
  2018-11-20 16:01 ` [PATCH v5 3/6] vpci/msix: carve p2m hole for MSIX MMIO regions Roger Pau Monne
@ 2018-11-20 16:01 ` Roger Pau Monne
  2018-11-26 11:29   ` Jan Beulich
  2018-11-20 16:01 ` [PATCH v5 5/6] amd/iommu: assign iommu devices to Xen Roger Pau Monne
  2018-11-20 16:01 ` [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables Roger Pau Monne
  5 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monne @ 2018-11-20 16:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Roger Pau Monne

No functional change expected.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v4:
 - New in this version.
---
 xen/drivers/char/ehci-dbgp.c  |  2 +-
 xen/drivers/char/ns16550.c    |  2 +-
 xen/drivers/passthrough/pci.c | 16 +++++++++++-----
 xen/drivers/video/vga.c       |  2 +-
 xen/include/xen/pci.h         |  2 +-
 5 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/xen/drivers/char/ehci-dbgp.c b/xen/drivers/char/ehci-dbgp.c
index d0071d3114..475dc41767 100644
--- a/xen/drivers/char/ehci-dbgp.c
+++ b/xen/drivers/char/ehci-dbgp.c
@@ -1362,7 +1362,7 @@ static void __init ehci_dbgp_init_postirq(struct serial_port *port)
 
     ehci_dbgp_setup_postirq(dbgp);
 
-    pci_hide_device(dbgp->bus, PCI_DEVFN(dbgp->slot, dbgp->func));
+    pci_hide_device(0, dbgp->bus, PCI_DEVFN(dbgp->slot, dbgp->func));
 }
 
 static int ehci_dbgp_check_release(struct ehci_dbgp *dbgp)
diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index f32dbd3247..3c66e65b1c 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -761,7 +761,7 @@ static void __init ns16550_init_postirq(struct serial_port *port)
     if ( uart->bar || uart->ps_bdf_enable )
     {
         if ( !uart->param )
-            pci_hide_device(uart->ps_bdf[0], PCI_DEVFN(uart->ps_bdf[1],
+            pci_hide_device(0, uart->ps_bdf[0], PCI_DEVFN(uart->ps_bdf[1],
                             uart->ps_bdf[2]));
         else
         {
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e5b9602762..7584ce2fbb 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -440,17 +440,23 @@ static void _pci_hide_device(struct pci_dev *pdev)
     list_add(&pdev->domain_list, &dom_xen->arch.pdev_list);
 }
 
-int __init pci_hide_device(int bus, int devfn)
+int __init pci_hide_device(unsigned int seg, unsigned int bus,
+                           unsigned int devfn)
 {
     struct pci_dev *pdev;
+    struct pci_seg *pseg;
     int rc = -ENOMEM;
 
     pcidevs_lock();
-    pdev = alloc_pdev(get_pseg(0), bus, devfn);
-    if ( pdev )
+    pseg = alloc_pseg(seg);
+    if ( pseg )
     {
-        _pci_hide_device(pdev);
-        rc = 0;
+        pdev = alloc_pdev(pseg, bus, devfn);
+        if ( pdev )
+        {
+            _pci_hide_device(pdev);
+            rc = 0;
+        }
     }
     pcidevs_unlock();
 
diff --git a/xen/drivers/video/vga.c b/xen/drivers/video/vga.c
index 7dc07b13ed..6a64fd9013 100644
--- a/xen/drivers/video/vga.c
+++ b/xen/drivers/video/vga.c
@@ -157,7 +157,7 @@ void __init video_endboot(void)
                 {
                     printk(XENLOG_INFO "Boot video device %02x:%02x.%u\n",
                            bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
-                    pci_hide_device(bus, devfn);
+                    pci_hide_device(0, bus, devfn);
                 }
             }
     }
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 580e820a33..3c361cf0c0 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -150,7 +150,7 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
                    const struct pci_dev_info *, nodeid_t node);
 int pci_remove_device(u16 seg, u8 bus, u8 devfn);
 int pci_ro_device(int seg, int bus, int devfn);
-int pci_hide_device(int bus, int devfn);
+int pci_hide_device(unsigned int seg, unsigned int bus, unsigned int devfn);
 struct pci_dev *pci_get_pdev(int seg, int bus, int devfn);
 struct pci_dev *pci_get_real_pdev(int seg, int bus, int devfn);
 struct pci_dev *pci_get_pdev_by_domain(const struct domain *, int seg,
-- 
2.19.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 5/6] amd/iommu: assign iommu devices to Xen
  2018-11-20 16:01 [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0 Roger Pau Monne
                   ` (3 preceding siblings ...)
  2018-11-20 16:01 ` [PATCH v5 4/6] pci: add a segment parameter to pci_hide_device Roger Pau Monne
@ 2018-11-20 16:01 ` Roger Pau Monne
  2018-11-20 16:14   ` Jan Beulich
  2018-11-20 16:01 ` [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables Roger Pau Monne
  5 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monne @ 2018-11-20 16:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

AMD IOMMU devices are exposed on the PCI bus, and thus are assigned by
default to the hardware domain. This can cause issues because the
IOMMU devices themselves are not behind an IOMMU, so update_paging_mode will
return an error if Xen tries to expand the page tables of a domain
that has assigned devices not behind an IOMMU. update_paging_mode
failing will cause the domain to be destroyed.

Fix this by hiding PCI IOMMU devices, so they are not assigned to the
hardware domain.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>
---
Changes since v4:
 - Use pci_hide_device.
 - Expand commit message.
---
 xen/drivers/passthrough/amd/iommu_init.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 15c10b0929..17f39552a9 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -993,6 +993,8 @@ static void * __init allocate_ppr_log(struct amd_iommu *iommu)
 
 static int __init amd_iommu_init_one(struct amd_iommu *iommu)
 {
+    pci_hide_device(iommu->seg, PCI_BUS(iommu->bdf), PCI_DEVFN2(iommu->bdf));
+
     if ( map_iommu_mmio_region(iommu) != 0 )
         goto error_out;
 
-- 
2.19.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-20 16:01 [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0 Roger Pau Monne
                   ` (4 preceding siblings ...)
  2018-11-20 16:01 ` [PATCH v5 5/6] amd/iommu: assign iommu devices to Xen Roger Pau Monne
@ 2018-11-20 16:01 ` Roger Pau Monne
  2018-11-20 16:45   ` Jan Beulich
  5 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monne @ 2018-11-20 16:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

Bridges are not behind an IOMMU, and are already special cased and
skipped in amd_iommu_add_device. Apply the same special casing when
updating page tables.

This is required or else update_paging_mode will fail and return an
error to the caller (amd_iommu_{un}map_page) which will destroy the
domain.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>
---
Changes since v4:
 - Invert condition order so they match the order in
   amd_iommu_add_device.
 - Expand commit message to spell out why this is required.
---
 xen/drivers/passthrough/amd/iommu_map.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index c1daba8422..7752e7542f 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -612,6 +612,10 @@ static int update_paging_mode(struct domain *d, unsigned long dfn)
         /* Update device table entries using new root table and paging mode */
         for_each_pdev( d, pdev )
         {
+            if ( pdev->type == DEV_TYPE_PCI_HOST_BRIDGE &&
+                 is_hardware_domain(d) )
+                continue;
+
             bdf = PCI_BDF2(pdev->bus, pdev->devfn);
             iommu = find_iommu_for_device(pdev->seg, bdf);
             if ( !iommu )
-- 
2.19.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 5/6] amd/iommu: assign iommu devices to Xen
  2018-11-20 16:01 ` [PATCH v5 5/6] amd/iommu: assign iommu devices to Xen Roger Pau Monne
@ 2018-11-20 16:14   ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-11-20 16:14 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> AMD IOMMU devices are exposed on the PCI bus, and thus are assigned by
> default to the hardware domain. This can cause issues because the
> IOMMU devices themselves are not behind an IOMMU, so update_paging_mode will
> return an error if Xen tries to expand the page tables of a domain
> that has assigned devices not behind an IOMMU. update_paging_mode
> failing will cause the domain to be destroyed.
> 
> Fix this by hiding PCI IOMMU devices, so they are not assigned to the
> hardware domain.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-20 16:01 ` [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables Roger Pau Monne
@ 2018-11-20 16:45   ` Jan Beulich
  2018-11-20 23:26     ` Woods, Brian
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-11-20 16:45 UTC (permalink / raw)
  To: Brian Woods, Suravee Suthikulpanit, Roger Pau Monne; +Cc: xen-devel

>>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> Bridges are not behind an IOMMU, and are already special cased and
> skipped in amd_iommu_add_device. Apply the same special casing when
> updating page tables.
> 
> This is required or else update_paging_mode will fail and return an
> error to the caller (amd_iommu_{un}map_page) which will destroy the
> domain.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Brian Woods <brian.woods@amd.com>
> ---
> Changes since v4:
>  - Invert condition order so they match the order in
>    amd_iommu_add_device.
>  - Expand commit message to spell out why this is required.

Thanks. Nevertheless ...

> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -612,6 +612,10 @@ static int update_paging_mode(struct domain *d, unsigned long dfn)
>          /* Update device table entries using new root table and paging mode */
>          for_each_pdev( d, pdev )
>          {
> +            if ( pdev->type == DEV_TYPE_PCI_HOST_BRIDGE &&
> +                 is_hardware_domain(d) )
> +                continue;

... before spreading the issue I'd still like to see clarification of /
justification for the is_hardware_domain() part of the condition
you clone. Suravee, Brian?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-20 16:45   ` Jan Beulich
@ 2018-11-20 23:26     ` Woods, Brian
  2018-11-21  9:21       ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Woods, Brian @ 2018-11-20 23:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Woods, Brian, Suthikulpanit, Suravee, Roger Pau Monne

On Tue, Nov 20, 2018 at 09:45:13AM -0700, Jan Beulich wrote:
> >>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> > Bridges are not behind an IOMMU, and are already special cased and
> > skipped in amd_iommu_add_device. Apply the same special casing when
> > updating page tables.
> > 
> > This is required or else update_paging_mode will fail and return an
> > error to the caller (amd_iommu_{un}map_page) which will destroy the
> > domain.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > Cc: Brian Woods <brian.woods@amd.com>
> > ---
> > Changes since v4:
> >  - Invert condition order so they match the order in
> >    amd_iommu_add_device.
> >  - Expand commit message to spell out why this is required.
> 
> Thanks. Nevertheless ...
> 
> > --- a/xen/drivers/passthrough/amd/iommu_map.c
> > +++ b/xen/drivers/passthrough/amd/iommu_map.c
> > @@ -612,6 +612,10 @@ static int update_paging_mode(struct domain *d, unsigned long dfn)
> >          /* Update device table entries using new root table and paging mode */
> >          for_each_pdev( d, pdev )
> >          {
> > +            if ( pdev->type == DEV_TYPE_PCI_HOST_BRIDGE &&
> > +                 is_hardware_domain(d) )
> > +                continue;
> 
> ... before spreading the issue I'd still like to see clarification of /
> justification for the is_hardware_domain() part of the condition
> you clone. Suravee, Brian?
> 
> Jan
> 

The original commit 0af438757d455f8eb6b5a6ae9a990ae245f230fd

The commit that adds is_hardware_domain (and rearrange things)
7c275549f46c5c46611592f7107c1345e93ed457

The orginal commit used the function like
setup_dom0_pci_devices(d, amd_iommu_setup_dom0_device);
which was because IOMMU needed to skip the host bridge devices on dom0.

So I assume you added the is_hardware_domain because it only needed to
be done on dom0.  I'm not familiar with the IOMMU/PCI history wrt to
what it mapped/passed through so.

-- 
Brian Woods

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-20 23:26     ` Woods, Brian
@ 2018-11-21  9:21       ` Jan Beulich
  2018-11-21 10:37         ` Roger Pau Monné
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-11-21  9:21 UTC (permalink / raw)
  To: Brian Woods, Roger Pau Monne; +Cc: xen-devel, Suravee Suthikulpanit

>>> On 21.11.18 at 00:26, <Brian.Woods@amd.com> wrote:
> The original commit 0af438757d455f8eb6b5a6ae9a990ae245f230fd
> 
> The commit that adds is_hardware_domain (and rearrange things)
> 7c275549f46c5c46611592f7107c1345e93ed457
> 
> The orginal commit used the function like
> setup_dom0_pci_devices(d, amd_iommu_setup_dom0_device);
> which was because IOMMU needed to skip the host bridge devices on dom0.
> 
> So I assume you added the is_hardware_domain because it only needed to
> be done on dom0.  I'm not familiar with the IOMMU/PCI history wrt to
> what it mapped/passed through so.

Well, I added it presumably to retain original semantics. I still
think that the extra check would better be dropped there, not
the least to also cover the case of devices eventually getting
assigned to dom_xen.

Looking at this another time I find some other questionable
aspect though (both to pre-existing code and to the change
made here): "host bridge" != "bridge". The title here as much
as the comment next to the original piece of code both
suggest the wider general category is meant, but the code
cloned checks for host bridges only. In
amd_iommu_add_device() the check is used solely to emit a
less scary log message, but the change here goes beyond
that.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-21  9:21       ` Jan Beulich
@ 2018-11-21 10:37         ` Roger Pau Monné
  2018-11-21 10:58           ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2018-11-21 10:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Wed, Nov 21, 2018 at 02:21:36AM -0700, Jan Beulich wrote:
> >>> On 21.11.18 at 00:26, <Brian.Woods@amd.com> wrote:
> > The original commit 0af438757d455f8eb6b5a6ae9a990ae245f230fd
> > 
> > The commit that adds is_hardware_domain (and rearrange things)
> > 7c275549f46c5c46611592f7107c1345e93ed457
> > 
> > The orginal commit used the function like
> > setup_dom0_pci_devices(d, amd_iommu_setup_dom0_device);
> > which was because IOMMU needed to skip the host bridge devices on dom0.
> > 
> > So I assume you added the is_hardware_domain because it only needed to
> > be done on dom0.  I'm not familiar with the IOMMU/PCI history wrt to
> > what it mapped/passed through so.
> 
> Well, I added it presumably to retain original semantics. I still
> think that the extra check would better be dropped there, not
> the least to also cover the case of devices eventually getting
> assigned to dom_xen.
> 
> Looking at this another time I find some other questionable
> aspect though (both to pre-existing code and to the change
> made here): "host bridge" != "bridge". The title here as much
> as the comment next to the original piece of code both
> suggest the wider general category is meant, but the code
> cloned checks for host bridges only. In
> amd_iommu_add_device() the check is used solely to emit a
> less scary log message, but the change here goes beyond
> that.

The check in amd_iommu_add_device allows host bridge devices to be
assigned to the hardware domain without returning an error since they
are not behind an IOMMU. Note that even if amd_iommu_add_device
returned an error the device would still be added to the hardware
domain since the error is eaten by setup_one_hwdom_device and the
device is assigned to the hardware domain regardless.

So either all devices that are not behind an IOMMU are not assigned to
the hardware domain, or update_paging_mode needs this workaround in
order to be able to handle IOMMU page table expansion for the hardware
domain.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-21 10:37         ` Roger Pau Monné
@ 2018-11-21 10:58           ` Jan Beulich
  2018-11-21 11:51             ` Roger Pau Monné
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-11-21 10:58 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 21.11.18 at 11:37, <roger.pau@citrix.com> wrote:
> On Wed, Nov 21, 2018 at 02:21:36AM -0700, Jan Beulich wrote:
>> >>> On 21.11.18 at 00:26, <Brian.Woods@amd.com> wrote:
>> > The original commit 0af438757d455f8eb6b5a6ae9a990ae245f230fd
>> > 
>> > The commit that adds is_hardware_domain (and rearrange things)
>> > 7c275549f46c5c46611592f7107c1345e93ed457
>> > 
>> > The orginal commit used the function like
>> > setup_dom0_pci_devices(d, amd_iommu_setup_dom0_device);
>> > which was because IOMMU needed to skip the host bridge devices on dom0.
>> > 
>> > So I assume you added the is_hardware_domain because it only needed to
>> > be done on dom0.  I'm not familiar with the IOMMU/PCI history wrt to
>> > what it mapped/passed through so.
>> 
>> Well, I added it presumably to retain original semantics. I still
>> think that the extra check would better be dropped there, not
>> the least to also cover the case of devices eventually getting
>> assigned to dom_xen.
>> 
>> Looking at this another time I find some other questionable
>> aspect though (both to pre-existing code and to the change
>> made here): "host bridge" != "bridge". The title here as much
>> as the comment next to the original piece of code both
>> suggest the wider general category is meant, but the code
>> cloned checks for host bridges only. In
>> amd_iommu_add_device() the check is used solely to emit a
>> less scary log message, but the change here goes beyond
>> that.
> 
> The check in amd_iommu_add_device allows host bridge devices to be
> assigned to the hardware domain without returning an error since they
> are not behind an IOMMU. Note that even if amd_iommu_add_device
> returned an error the device would still be added to the hardware
> domain since the error is eaten by setup_one_hwdom_device and the
> device is assigned to the hardware domain regardless.

Hence me saying this check is mainly/just to make the log message
less scary.

> So either all devices that are not behind an IOMMU are not assigned to
> the hardware domain, or update_paging_mode needs this workaround in
> order to be able to handle IOMMU page table expansion for the hardware
> domain.

It looks like we're talking about different aspects: I don't put under
question that assignment wants to be avoided. What I question is
why the avoidance gets restricted to the hardware domain.

And then the secondary question I put up was why this restriction
applies to host bridges only, despite title here and comment there
saying "bridge" in general.

Your reply doesn't appear to relate to either of these.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-21 10:58           ` Jan Beulich
@ 2018-11-21 11:51             ` Roger Pau Monné
  2018-11-21 13:23               ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2018-11-21 11:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Wed, Nov 21, 2018 at 03:58:34AM -0700, Jan Beulich wrote:
> >>> On 21.11.18 at 11:37, <roger.pau@citrix.com> wrote:
> > On Wed, Nov 21, 2018 at 02:21:36AM -0700, Jan Beulich wrote:
> >> >>> On 21.11.18 at 00:26, <Brian.Woods@amd.com> wrote:
> >> > The original commit 0af438757d455f8eb6b5a6ae9a990ae245f230fd
> >> > 
> >> > The commit that adds is_hardware_domain (and rearrange things)
> >> > 7c275549f46c5c46611592f7107c1345e93ed457
> >> > 
> >> > The orginal commit used the function like
> >> > setup_dom0_pci_devices(d, amd_iommu_setup_dom0_device);
> >> > which was because IOMMU needed to skip the host bridge devices on dom0.
> >> > 
> >> > So I assume you added the is_hardware_domain because it only needed to
> >> > be done on dom0.  I'm not familiar with the IOMMU/PCI history wrt to
> >> > what it mapped/passed through so.
> >> 
> >> Well, I added it presumably to retain original semantics. I still
> >> think that the extra check would better be dropped there, not
> >> the least to also cover the case of devices eventually getting
> >> assigned to dom_xen.
> >> 
> >> Looking at this another time I find some other questionable
> >> aspect though (both to pre-existing code and to the change
> >> made here): "host bridge" != "bridge". The title here as much
> >> as the comment next to the original piece of code both
> >> suggest the wider general category is meant, but the code
> >> cloned checks for host bridges only. In
> >> amd_iommu_add_device() the check is used solely to emit a
> >> less scary log message, but the change here goes beyond
> >> that.
> > 
> > The check in amd_iommu_add_device allows host bridge devices to be
> > assigned to the hardware domain without returning an error since they
> > are not behind an IOMMU. Note that even if amd_iommu_add_device
> > returned an error the device would still be added to the hardware
> > domain since the error is eaten by setup_one_hwdom_device and the
> > device is assigned to the hardware domain regardless.
> 
> Hence me saying this check is mainly/just to make the log message
> less scary.
> 
> > So either all devices that are not behind an IOMMU are not assigned to
> > the hardware domain, or update_paging_mode needs this workaround in
> > order to be able to handle IOMMU page table expansion for the hardware
> > domain.
> 
> It looks like we're talking about different aspects: I don't put under
> question that assignment wants to be avoided. What I question is
> why the avoidance gets restricted to the hardware domain.

Not having written this code I can only make guesses. I would say this
is because you don't want to hand a PCI host bridge to a guest so it
cannot play with the registers there and likely move the MCFG or the
MMIO windows. I think the passthrough code in QEMU would already
prevent any such accesses.

> And then the secondary question I put up was why this restriction
> applies to host bridges only, despite title here and comment there
> saying "bridge" in general.

I think this needs to be answered by AMD, but one possibility is that
host bridges are the only bridges that are not behind an IOMMU, the
rest of bridges are expected to be behind an IOMMU.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-21 11:51             ` Roger Pau Monné
@ 2018-11-21 13:23               ` Jan Beulich
  2018-11-22 12:47                 ` Roger Pau Monné
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-11-21 13:23 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 21.11.18 at 12:51, <roger.pau@citrix.com> wrote:
> On Wed, Nov 21, 2018 at 03:58:34AM -0700, Jan Beulich wrote:
>> >>> On 21.11.18 at 11:37, <roger.pau@citrix.com> wrote:
>> > On Wed, Nov 21, 2018 at 02:21:36AM -0700, Jan Beulich wrote:
>> >> >>> On 21.11.18 at 00:26, <Brian.Woods@amd.com> wrote:
>> >> > The original commit 0af438757d455f8eb6b5a6ae9a990ae245f230fd
>> >> > 
>> >> > The commit that adds is_hardware_domain (and rearrange things)
>> >> > 7c275549f46c5c46611592f7107c1345e93ed457
>> >> > 
>> >> > The orginal commit used the function like
>> >> > setup_dom0_pci_devices(d, amd_iommu_setup_dom0_device);
>> >> > which was because IOMMU needed to skip the host bridge devices on dom0.
>> >> > 
>> >> > So I assume you added the is_hardware_domain because it only needed to
>> >> > be done on dom0.  I'm not familiar with the IOMMU/PCI history wrt to
>> >> > what it mapped/passed through so.
>> >> 
>> >> Well, I added it presumably to retain original semantics. I still
>> >> think that the extra check would better be dropped there, not
>> >> the least to also cover the case of devices eventually getting
>> >> assigned to dom_xen.
>> >> 
>> >> Looking at this another time I find some other questionable
>> >> aspect though (both to pre-existing code and to the change
>> >> made here): "host bridge" != "bridge". The title here as much
>> >> as the comment next to the original piece of code both
>> >> suggest the wider general category is meant, but the code
>> >> cloned checks for host bridges only. In
>> >> amd_iommu_add_device() the check is used solely to emit a
>> >> less scary log message, but the change here goes beyond
>> >> that.
>> > 
>> > The check in amd_iommu_add_device allows host bridge devices to be
>> > assigned to the hardware domain without returning an error since they
>> > are not behind an IOMMU. Note that even if amd_iommu_add_device
>> > returned an error the device would still be added to the hardware
>> > domain since the error is eaten by setup_one_hwdom_device and the
>> > device is assigned to the hardware domain regardless.
>> 
>> Hence me saying this check is mainly/just to make the log message
>> less scary.
>> 
>> > So either all devices that are not behind an IOMMU are not assigned to
>> > the hardware domain, or update_paging_mode needs this workaround in
>> > order to be able to handle IOMMU page table expansion for the hardware
>> > domain.
>> 
>> It looks like we're talking about different aspects: I don't put under
>> question that assignment wants to be avoided. What I question is
>> why the avoidance gets restricted to the hardware domain.
> 
> Not having written this code I can only make guesses. I would say this
> is because you don't want to hand a PCI host bridge to a guest so it
> cannot play with the registers there and likely move the MCFG or the
> MMIO windows.

That's all correct, but only indirectly related. The question is what's
wrong when the conditionals (both yours and the one you derive
from) have the is_hardware_domain() part dropped.

>> And then the secondary question I put up was why this restriction
>> applies to host bridges only, despite title here and comment there
>> saying "bridge" in general.
> 
> I think this needs to be answered by AMD, but one possibility is that
> host bridges are the only bridges that are not behind an IOMMU, the
> rest of bridges are expected to be behind an IOMMU.

Quite possible, in which case the title of your patch would need to
change, and the implementation could remain as is (perhaps with the
is_hardware_domain() aspect taken care of).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-21 13:23               ` Jan Beulich
@ 2018-11-22 12:47                 ` Roger Pau Monné
  2018-11-22 13:20                   ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2018-11-22 12:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Wed, Nov 21, 2018 at 06:23:30AM -0700, Jan Beulich wrote:
> >>> On 21.11.18 at 12:51, <roger.pau@citrix.com> wrote:
> > On Wed, Nov 21, 2018 at 03:58:34AM -0700, Jan Beulich wrote:
> >> >>> On 21.11.18 at 11:37, <roger.pau@citrix.com> wrote:
> >> > On Wed, Nov 21, 2018 at 02:21:36AM -0700, Jan Beulich wrote:
> >> >> >>> On 21.11.18 at 00:26, <Brian.Woods@amd.com> wrote:
> >> >> > The original commit 0af438757d455f8eb6b5a6ae9a990ae245f230fd
> >> >> > 
> >> >> > The commit that adds is_hardware_domain (and rearrange things)
> >> >> > 7c275549f46c5c46611592f7107c1345e93ed457
> >> >> > 
> >> >> > The orginal commit used the function like
> >> >> > setup_dom0_pci_devices(d, amd_iommu_setup_dom0_device);
> >> >> > which was because IOMMU needed to skip the host bridge devices on dom0.
> >> >> > 
> >> >> > So I assume you added the is_hardware_domain because it only needed to
> >> >> > be done on dom0.  I'm not familiar with the IOMMU/PCI history wrt to
> >> >> > what it mapped/passed through so.
> >> >> 
> >> >> Well, I added it presumably to retain original semantics. I still
> >> >> think that the extra check would better be dropped there, not
> >> >> the least to also cover the case of devices eventually getting
> >> >> assigned to dom_xen.
> >> >> 
> >> >> Looking at this another time I find some other questionable
> >> >> aspect though (both to pre-existing code and to the change
> >> >> made here): "host bridge" != "bridge". The title here as much
> >> >> as the comment next to the original piece of code both
> >> >> suggest the wider general category is meant, but the code
> >> >> cloned checks for host bridges only. In
> >> >> amd_iommu_add_device() the check is used solely to emit a
> >> >> less scary log message, but the change here goes beyond
> >> >> that.
> >> > 
> >> > The check in amd_iommu_add_device allows host bridge devices to be
> >> > assigned to the hardware domain without returning an error since they
> >> > are not behind an IOMMU. Note that even if amd_iommu_add_device
> >> > returned an error the device would still be added to the hardware
> >> > domain since the error is eaten by setup_one_hwdom_device and the
> >> > device is assigned to the hardware domain regardless.
> >> 
> >> Hence me saying this check is mainly/just to make the log message
> >> less scary.
> >> 
> >> > So either all devices that are not behind an IOMMU are not assigned to
> >> > the hardware domain, or update_paging_mode needs this workaround in
> >> > order to be able to handle IOMMU page table expansion for the hardware
> >> > domain.
> >> 
> >> It looks like we're talking about different aspects: I don't put under
> >> question that assignment wants to be avoided. What I question is
> >> why the avoidance gets restricted to the hardware domain.
> > 
> > Not having written this code I can only make guesses. I would say this
> > is because you don't want to hand a PCI host bridge to a guest so it
> > cannot play with the registers there and likely move the MCFG or the
> > MMIO windows.
> 
> That's all correct, but only indirectly related. The question is what's
> wrong when the conditionals (both yours and the one you derive
> from) have the is_hardware_domain() part dropped.

I think the is_hardware_domain part can be dropped from the
conditional I'm adding. update_paging_mode shouldn't be used to decide
whether a domain can or cannot have bridges attached. Whether a DomU
can or cannot have a host bridge assigned should be decided at
assignation time, and hence update_paging_mode shouldn't have hardware
domain specific checks.

Regarding the check in amd_iommu_add_device, if it's removed from
there amd_iommu_add_device would return an error when adding a host
bridge device, and that would cause setup_one_hwdom_device to return
early and not setup vPCI handlers for host bridges, so I think we want
to leave that one as-is.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-22 12:47                 ` Roger Pau Monné
@ 2018-11-22 13:20                   ` Jan Beulich
  2018-11-23 14:36                     ` Roger Pau Monné
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-11-22 13:20 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 22.11.18 at 13:47, <roger.pau@citrix.com> wrote:
> I think the is_hardware_domain part can be dropped from the
> conditional I'm adding. update_paging_mode shouldn't be used to decide
> whether a domain can or cannot have bridges attached. Whether a DomU
> can or cannot have a host bridge assigned should be decided at
> assignation time, and hence update_paging_mode shouldn't have hardware
> domain specific checks.

Okay, we're in agreement then.

> Regarding the check in amd_iommu_add_device, if it's removed from
> there amd_iommu_add_device would return an error when adding a host
> bridge device, and that would cause setup_one_hwdom_device to return
> early and not setup vPCI handlers for host bridges, so I think we want
> to leave that one as-is.

Right, I can see why it may be better to retain it there.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
  2018-11-22 13:20                   ` Jan Beulich
@ 2018-11-23 14:36                     ` Roger Pau Monné
       [not found]                       ` <60B388B7020000D60063616D@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 24+ messages in thread
From: Roger Pau Monné @ 2018-11-23 14:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Thu, Nov 22, 2018 at 06:20:46AM -0700, Jan Beulich wrote:
> >>> On 22.11.18 at 13:47, <roger.pau@citrix.com> wrote:
> > I think the is_hardware_domain part can be dropped from the
> > conditional I'm adding. update_paging_mode shouldn't be used to decide
> > whether a domain can or cannot have bridges attached. Whether a DomU
> > can or cannot have a host bridge assigned should be decided at
> > assignation time, and hence update_paging_mode shouldn't have hardware
> > domain specific checks.
> 
> Okay, we're in agreement then.
> 
> > Regarding the check in amd_iommu_add_device, if it's removed from
> > there amd_iommu_add_device would return an error when adding a host
> > bridge device, and that would cause setup_one_hwdom_device to return
> > early and not setup vPCI handlers for host bridges, so I think we want
> > to leave that one as-is.
> 
> Right, I can see why it may be better to retain it there.

Thanks. Would you like me to resend the series with this fix, or
should I wait for feedback on the remaining patches?

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables
       [not found]                       ` <60B388B7020000D60063616D@prv1-mh.provo.novell.com>
@ 2018-11-26  8:42                         ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-11-26  8:42 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 23.11.18 at 15:36, <roger.pau@citrix.com> wrote:
> On Thu, Nov 22, 2018 at 06:20:46AM -0700, Jan Beulich wrote:
>> >>> On 22.11.18 at 13:47, <roger.pau@citrix.com> wrote:
>> > I think the is_hardware_domain part can be dropped from the
>> > conditional I'm adding. update_paging_mode shouldn't be used to decide
>> > whether a domain can or cannot have bridges attached. Whether a DomU
>> > can or cannot have a host bridge assigned should be decided at
>> > assignation time, and hence update_paging_mode shouldn't have hardware
>> > domain specific checks.
>> 
>> Okay, we're in agreement then.
>> 
>> > Regarding the check in amd_iommu_add_device, if it's removed from
>> > there amd_iommu_add_device would return an error when adding a host
>> > bridge device, and that would cause setup_one_hwdom_device to return
>> > early and not setup vPCI handlers for host bridges, so I think we want
>> > to leave that one as-is.
>> 
>> Right, I can see why it may be better to retain it there.
> 
> Thanks. Would you like me to resend the series with this fix, or
> should I wait for feedback on the remaining patches?

I'd leave that up to you. I have yet to find time to look at patches
1, 2, and 3 again.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 1/6] vpci: fix updating the command register
  2018-11-20 16:01 ` [PATCH v5 1/6] vpci: fix updating the command register Roger Pau Monne
@ 2018-11-26 11:20   ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-11-26 11:20 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> When switching the memory decoding bit in the command register the
> rest of the changes where dropped, leading to only the memory decoding
> bit being updated.
> 
> Fix this by writing the command register once the guest physmap
> manipulations are done if there are changes to the memory decoding
> bit.
> 
> Note that when only mapping/unmapping the ROM BAR a fabricated command
> register value is passed to modify_bars which is only used to signal
> whether the action is a mapping or unmapping, but the value is never
> written to the device command register. Turn the maodify_decoding
> ASSERT into an ASSERT_UNREACHABLE and make sure that non-debug builds
> won't end up writing to the command register if only modifying the ROM
> BAR.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 2/6] vpci: fix deferral of long operations
  2018-11-20 16:01 ` [PATCH v5 2/6] vpci: fix deferral of long operations Roger Pau Monne
@ 2018-11-26 11:26   ` Jan Beulich
  2018-11-26 11:41   ` Jan Beulich
  1 sibling, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-11-26 11:26 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, xen-devel

>>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> Current logic to handle long running operations is flawed because it
> doesn't prevent the guest vcpu from running. Fix this by raising a
> scheduler softirq when preemption is required, so that the do_softirq
> call in the guest entry path performs a rescheduling. Also move the
> call to vpci_process_pending into handle_hvm_io_completion, together
> with the IOREQ code that handles pending IO instructions.
> 
> Note that a scheduler softirq is also raised when the long running
> operation is queued in order to prevent the guest vcpu from resuming
> execution.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one remark:

> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -184,6 +184,11 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
>      curr->vpci.mem = mem;
>      curr->vpci.cmd = cmd;
>      curr->vpci.rom_only = rom_only;
> +    /*
> +     * Raise a scheduler softirq in order to prevent the guest from resuming
> +     * execution with pending mapping operations.

..., to trigger the invocation of vpci_process_pending().

If you don't mind I can add this (or something similar if you
dislike this wording) while committing.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 4/6] pci: add a segment parameter to pci_hide_device
  2018-11-20 16:01 ` [PATCH v5 4/6] pci: add a segment parameter to pci_hide_device Roger Pau Monne
@ 2018-11-26 11:29   ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-11-26 11:29 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> No functional change expected.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 2/6] vpci: fix deferral of long operations
  2018-11-20 16:01 ` [PATCH v5 2/6] vpci: fix deferral of long operations Roger Pau Monne
  2018-11-26 11:26   ` Jan Beulich
@ 2018-11-26 11:41   ` Jan Beulich
  2018-11-26 11:57     ` Paul Durrant
  1 sibling, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-11-26 11:41 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Roger Pau Monne

>>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> Current logic to handle long running operations is flawed because it
> doesn't prevent the guest vcpu from running. Fix this by raising a
> scheduler softirq when preemption is required, so that the do_softirq
> call in the guest entry path performs a rescheduling. Also move the
> call to vpci_process_pending into handle_hvm_io_completion, together
> with the IOREQ code that handles pending IO instructions.
> 
> Note that a scheduler softirq is also raised when the long running
> operation is queued in order to prevent the guest vcpu from resuming
> execution.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Paul Durrant <paul.durrant@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> ---
> Changes since v4:
>  - Add a comment to clarify defer_map raising a scheduler softirq.
>  - Reword commit message.
>  - Raise the scheduler softirq in handle_hvm_io_completion rather than
>    vpci_process_pending.
> 
> Changes since v3:
>  - Don't use a tasklet.
> ---
>  xen/arch/x86/hvm/ioreq.c  | 9 ++++++---

Paul - I'll take the liberty and reinstate your v4 R-b for v5.

Jan

>  xen/drivers/vpci/header.c | 5 +++++
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index a56d634f31..71f23227e6 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -85,9 +85,6 @@ bool hvm_io_pending(struct vcpu *v)
>      struct hvm_ioreq_server *s;
>      unsigned int id;
>  
> -    if ( has_vpci(d) && vpci_process_pending(v) )
> -        return true;
> -
>      FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
>          struct hvm_ioreq_vcpu *sv;
> @@ -186,6 +183,12 @@ bool handle_hvm_io_completion(struct vcpu *v)
>      enum hvm_io_completion io_completion;
>      unsigned int id;
>  
> +    if ( has_vpci(d) && vpci_process_pending(v) )
> +    {
> +        raise_softirq(SCHEDULE_SOFTIRQ);
> +        return false;
> +    }
> +
>      FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
>          struct hvm_ioreq_vcpu *sv;
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 39dffb21fb..c9bdc2ced3 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -184,6 +184,11 @@ static void defer_map(struct domain *d, struct pci_dev 
> *pdev,
>      curr->vpci.mem = mem;
>      curr->vpci.cmd = cmd;
>      curr->vpci.rom_only = rom_only;
> +    /*
> +     * Raise a scheduler softirq in order to prevent the guest from 
> resuming
> +     * execution with pending mapping operations.
> +     */
> +    raise_softirq(SCHEDULE_SOFTIRQ);
>  }
>  
>  static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool 
> rom_only)
> -- 
> 2.19.1




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 2/6] vpci: fix deferral of long operations
  2018-11-26 11:41   ` Jan Beulich
@ 2018-11-26 11:57     ` Paul Durrant
  0 siblings, 0 replies; 24+ messages in thread
From: Paul Durrant @ 2018-11-26 11:57 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson,
	Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 26 November 2018 11:42
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; Wei
> Liu <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; xen-devel <xen-devel@lists.xenproject.org>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org)
> <tim@xen.org>
> Subject: Re: [PATCH v5 2/6] vpci: fix deferral of long operations
> 
> >>> On 20.11.18 at 17:01, <roger.pau@citrix.com> wrote:
> > Current logic to handle long running operations is flawed because it
> > doesn't prevent the guest vcpu from running. Fix this by raising a
> > scheduler softirq when preemption is required, so that the do_softirq
> > call in the guest entry path performs a rescheduling. Also move the
> > call to vpci_process_pending into handle_hvm_io_completion, together
> > with the IOREQ code that handles pending IO instructions.
> >
> > Note that a scheduler softirq is also raised when the long running
> > operation is queued in order to prevent the guest vcpu from resuming
> > execution.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Cc: Paul Durrant <paul.durrant@citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Julien Grall <julien.grall@arm.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Tim Deegan <tim@xen.org>
> > ---
> > Changes since v4:
> >  - Add a comment to clarify defer_map raising a scheduler softirq.
> >  - Reword commit message.
> >  - Raise the scheduler softirq in handle_hvm_io_completion rather than
> >    vpci_process_pending.
> >
> > Changes since v3:
> >  - Don't use a tasklet.
> > ---
> >  xen/arch/x86/hvm/ioreq.c  | 9 ++++++---
> 
> Paul - I'll take the liberty and reinstate your v4 R-b for v5.
> 

That's fine.

  Paul

> Jan
> 
> >  xen/drivers/vpci/header.c | 5 +++++
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > index a56d634f31..71f23227e6 100644
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -85,9 +85,6 @@ bool hvm_io_pending(struct vcpu *v)
> >      struct hvm_ioreq_server *s;
> >      unsigned int id;
> >
> > -    if ( has_vpci(d) && vpci_process_pending(v) )
> > -        return true;
> > -
> >      FOR_EACH_IOREQ_SERVER(d, id, s)
> >      {
> >          struct hvm_ioreq_vcpu *sv;
> > @@ -186,6 +183,12 @@ bool handle_hvm_io_completion(struct vcpu *v)
> >      enum hvm_io_completion io_completion;
> >      unsigned int id;
> >
> > +    if ( has_vpci(d) && vpci_process_pending(v) )
> > +    {
> > +        raise_softirq(SCHEDULE_SOFTIRQ);
> > +        return false;
> > +    }
> > +
> >      FOR_EACH_IOREQ_SERVER(d, id, s)
> >      {
> >          struct hvm_ioreq_vcpu *sv;
> > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> > index 39dffb21fb..c9bdc2ced3 100644
> > --- a/xen/drivers/vpci/header.c
> > +++ b/xen/drivers/vpci/header.c
> > @@ -184,6 +184,11 @@ static void defer_map(struct domain *d, struct
> pci_dev
> > *pdev,
> >      curr->vpci.mem = mem;
> >      curr->vpci.cmd = cmd;
> >      curr->vpci.rom_only = rom_only;
> > +    /*
> > +     * Raise a scheduler softirq in order to prevent the guest from
> > resuming
> > +     * execution with pending mapping operations.
> > +     */
> > +    raise_softirq(SCHEDULE_SOFTIRQ);
> >  }
> >
> >  static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool
> > rom_only)
> > --
> > 2.19.1
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-11-26 11:57 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-20 16:01 [PATCH v5 0/6] x86/pvh: fixes for PVH Dom0 Roger Pau Monne
2018-11-20 16:01 ` [PATCH v5 1/6] vpci: fix updating the command register Roger Pau Monne
2018-11-26 11:20   ` Jan Beulich
2018-11-20 16:01 ` [PATCH v5 2/6] vpci: fix deferral of long operations Roger Pau Monne
2018-11-26 11:26   ` Jan Beulich
2018-11-26 11:41   ` Jan Beulich
2018-11-26 11:57     ` Paul Durrant
2018-11-20 16:01 ` [PATCH v5 3/6] vpci/msix: carve p2m hole for MSIX MMIO regions Roger Pau Monne
2018-11-20 16:01 ` [PATCH v5 4/6] pci: add a segment parameter to pci_hide_device Roger Pau Monne
2018-11-26 11:29   ` Jan Beulich
2018-11-20 16:01 ` [PATCH v5 5/6] amd/iommu: assign iommu devices to Xen Roger Pau Monne
2018-11-20 16:14   ` Jan Beulich
2018-11-20 16:01 ` [PATCH v5 6/6] amd/iommu: skip bridge devices when updating IOMMU page tables Roger Pau Monne
2018-11-20 16:45   ` Jan Beulich
2018-11-20 23:26     ` Woods, Brian
2018-11-21  9:21       ` Jan Beulich
2018-11-21 10:37         ` Roger Pau Monné
2018-11-21 10:58           ` Jan Beulich
2018-11-21 11:51             ` Roger Pau Monné
2018-11-21 13:23               ` Jan Beulich
2018-11-22 12:47                 ` Roger Pau Monné
2018-11-22 13:20                   ` Jan Beulich
2018-11-23 14:36                     ` Roger Pau Monné
     [not found]                       ` <60B388B7020000D60063616D@prv1-mh.provo.novell.com>
2018-11-26  8:42                         ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.