All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0
@ 2018-10-09  9:42 Roger Pau Monne
  2018-10-09  9:42 ` [PATCH 1/6] x86/pvh: fix TSC mode setup " Roger Pau Monne
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Roger Pau Monne @ 2018-10-09  9:42 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne

Hello,

The following series contain miscellaneous fixes for a PVH Dom0. I've
found out this issues while trying to boot on an AMD EPYC box.

The series can be found on my git repo on top of "x86/vtd: fix
iommu_share_p2m_table":

git://xenbits.xen.org/people/royger/xen.git fixes_pvh

Thanks, Roger.

Roger Pau Monne (6):
  x86/pvh: fix TSC mode setup for PVH Dom0
  x86/dom0: switch parse_dom0_param to use parse_boolean
  x86/pvh: allow PVH Dom0 to use the debug IO port console
  vpci: fix updating the command register
  vpci: fix execution of long running operations
  vpci/msix: carve p2m hole for MSIX MMIO regions

 docs/misc/xen-command-line.markdown | 11 ++++++++
 xen/arch/x86/dom0_build.c           | 22 +++++++++++++---
 xen/arch/x86/hvm/ioreq.c            |  6 ++---
 xen/arch/x86/time.c                 |  2 +-
 xen/drivers/vpci/header.c           | 30 ++++++++++++++++------
 xen/drivers/vpci/msix.c             | 40 +++++++++++++++++++++++++++++
 xen/include/xen/vpci.h              |  9 ++++---
 7 files changed, 101 insertions(+), 19 deletions(-)

-- 
2.19.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/6] x86/pvh: fix TSC mode setup for PVH Dom0
  2018-10-09  9:42 [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0 Roger Pau Monne
@ 2018-10-09  9:42 ` Roger Pau Monne
  2018-10-12 15:34   ` Jan Beulich
  2018-10-09  9:42 ` [PATCH 2/6] x86/dom0: switch parse_dom0_param to use parse_boolean Roger Pau Monne
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monne @ 2018-10-09  9:42 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

A PVH Dom0 might use TSC scaling or other HVM specific TSC
adjustments, so only short-circuit the TSC setup for a classic PV
Dom0.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 553698d4ab..03f792e7e5 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2125,7 +2125,7 @@ void tsc_set_info(struct domain *d,
 {
     ASSERT(!is_system_domain(d));
 
-    if ( is_hardware_domain(d) )
+    if ( is_pv_domain(d) && is_hardware_domain(d) )
     {
         d->arch.vtsc = 0;
         return;
-- 
2.19.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/6] x86/dom0: switch parse_dom0_param to use parse_boolean
  2018-10-09  9:42 [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0 Roger Pau Monne
  2018-10-09  9:42 ` [PATCH 1/6] x86/pvh: fix TSC mode setup " Roger Pau Monne
@ 2018-10-09  9:42 ` Roger Pau Monne
  2018-10-12 15:49   ` Jan Beulich
  2018-10-09  9:42 ` [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console Roger Pau Monne
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monne @ 2018-10-09  9:42 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

No functional change expected.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/dom0_build.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 86eb7db1da..dcd7afb058 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -225,16 +225,17 @@ static int __init parse_dom0_param(const char *s)
     int rc = 0;
 
     do {
+        int val;
 
         ss = strchr(s, ',');
         if ( !ss )
             ss = strchr(s, '\0');
 
-        if ( !strncmp(s, "pvh", ss - s) )
-            dom0_pvh = true;
+        if ( (val = parse_boolean("pvh", s, ss)) >= 0 )
+            dom0_pvh = val;
 #ifdef CONFIG_SHADOW_PAGING
-        else if ( !strncmp(s, "shadow", ss - s) )
-            opt_dom0_shadow = true;
+        else if ( (val = parse_boolean("shadow", s, ss)) >= 0 )
+            opt_dom0_shadow = val;
 #endif
         else
             rc = -EINVAL;
-- 
2.19.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console
  2018-10-09  9:42 [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0 Roger Pau Monne
  2018-10-09  9:42 ` [PATCH 1/6] x86/pvh: fix TSC mode setup " Roger Pau Monne
  2018-10-09  9:42 ` [PATCH 2/6] x86/dom0: switch parse_dom0_param to use parse_boolean Roger Pau Monne
@ 2018-10-09  9:42 ` Roger Pau Monne
  2018-10-12 15:55   ` Jan Beulich
  2018-10-15 12:56   ` Andrew Cooper
  2018-10-09  9:42 ` [PATCH 4/6] vpci: fix updating the command register Roger Pau Monne
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 15+ messages in thread
From: Roger Pau Monne @ 2018-10-09  9:42 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

Add an option to allow trapping accesses to IO port 0xe9 for a PVH
Dom0, so it can print to the HVM debug console. Note this is not
enabled by default in order to prevent clashes with hardware on the
system using 0xe9.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 docs/misc/xen-command-line.markdown | 11 +++++++++++
 xen/arch/x86/dom0_build.c           | 13 +++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 1ffd586224..173cbe12f1 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -681,6 +681,17 @@ Flag that makes a dom0 boot in PVHv2 mode.
 Flag that makes a dom0 use shadow paging. Only works when "pvh" is
 enabled.
 
+> `debug-ioport`
+
+> Default: `false`
+
+Flag that enables the HVM debug console for a PVH Dom0. Xen will trap accesses
+to IO port 0xe9 so that Dom0 kernel can print output using this IO port before
+setting up the hypercall page.
+
+Note this option is not enabled by default because it might clash with hardware
+on the system using IO port 0xe9 and should only be used for debug purposes.
+
 ### dom0-iommu
 > `= List of [ passthrough | strict | map-inclusive ]`
 
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index dcd7afb058..c36ffc8c05 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -212,12 +212,16 @@ struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0)
 bool __initdata opt_dom0_shadow;
 #endif
 bool __initdata dom0_pvh;
+#ifdef CONFIG_HVM
+bool __initdata opt_dom0_debug_ioport;
+#endif
 
 /*
  * List of parameters that affect Dom0 creation:
  *
  *  - pvh               Create a PVHv2 Dom0.
  *  - shadow            Use shadow paging for Dom0.
+ *  - debug-ioport      Trap accesses to 0xe9 (HVM debug console).
  */
 static int __init parse_dom0_param(const char *s)
 {
@@ -236,6 +240,10 @@ static int __init parse_dom0_param(const char *s)
 #ifdef CONFIG_SHADOW_PAGING
         else if ( (val = parse_boolean("shadow", s, ss)) >= 0 )
             opt_dom0_shadow = val;
+#endif
+#ifdef CONFIG_HVM
+        else if ( (val = parse_boolean("debug-ioport", s, ss)) >= 0 )
+            opt_dom0_debug_ioport = val;
 #endif
         else
             rc = -EINVAL;
@@ -433,6 +441,11 @@ int __init dom0_setup_permissions(struct domain *d)
         rc |= ioports_deny_access(d, pmtmr_ioport, pmtmr_ioport + 3);
     /* PCI configuration space (NB. 0xcf8 has special treatment). */
     rc |= ioports_deny_access(d, 0xcfc, 0xcff);
+#ifdef CONFIG_HVM
+    if ( is_hvm_domain(d) && opt_dom0_debug_ioport )
+        /* HVM debug console IO port. */
+        rc |= ioports_deny_access(d, 0xe9, 0xe9);
+#endif
     /* Command-line I/O ranges. */
     process_dom0_ioports_disable(d);
 
-- 
2.19.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/6] vpci: fix updating the command register
  2018-10-09  9:42 [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0 Roger Pau Monne
                   ` (2 preceding siblings ...)
  2018-10-09  9:42 ` [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console Roger Pau Monne
@ 2018-10-09  9:42 ` Roger Pau Monne
  2018-10-09  9:42 ` [PATCH 5/6] vpci: fix execution of long running operations Roger Pau Monne
  2018-10-09  9:42 ` [PATCH 6/6] vpci/msix: carve p2m hole for MSIX MMIO regions Roger Pau Monne
  5 siblings, 0 replies; 15+ messages in thread
From: Roger Pau Monne @ 2018-10-09  9:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Roger Pau Monne

When switching the memory decoding bit in the command register the
rest of the changes where dropped, leading to only the memory decoding
bit being updated.

Fix this by unconditionally writing the guest-requested command except
for the memory decoding bit, which will be updated once the p2m
changes are performed.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/drivers/vpci/header.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 0ec4c082a6..9234de9b26 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -333,8 +333,10 @@ static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
          * hoping the guest will realize and try again.
          */
         modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false);
-    else
-        pci_conf_write16(pdev->seg, pdev->bus, slot, func, reg, cmd);
+
+    /* Write the new command without updating the memory decoding bit. */
+    cmd = (cmd & ~PCI_COMMAND_MEMORY) | (current_cmd & PCI_COMMAND_MEMORY);
+    pci_conf_write16(pdev->seg, pdev->bus, slot, func, reg, cmd);
 }
 
 static void bar_write(const struct pci_dev *pdev, unsigned int reg,
-- 
2.19.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/6] vpci: fix execution of long running operations
  2018-10-09  9:42 [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0 Roger Pau Monne
                   ` (3 preceding siblings ...)
  2018-10-09  9:42 ` [PATCH 4/6] vpci: fix updating the command register Roger Pau Monne
@ 2018-10-09  9:42 ` Roger Pau Monne
  2018-10-10  9:09   ` Roger Pau Monné
  2018-10-09  9:42 ` [PATCH 6/6] vpci/msix: carve p2m hole for MSIX MMIO regions Roger Pau Monne
  5 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monne @ 2018-10-09  9:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich, Roger Pau Monne

BAR map/unmap is a long running operation that needs to be preempted
in order to avoid overrunning the assigned vCPU time (or even
triggering the watchdog).

Current logic for this preemption is wrong, and won't work at all for
AMD since only Intel makes use of hvm_io_pending (and even in that
case the current code is wrong).

Instead move the code that performs the mapping/unmapping to
handle_hvm_io_completion and use do_softirq in order to execute the
pending softirqs while the {un}mapping takes place. Note that
do_softirq can also trigger a context switch to another vCPU, so
having pending vpci operations shouldn't prevent fair scheduling.

When the {un}map operation is queued (as a result of a trapped PCI
access) a schedule softirq is raised in order to force a context
switch and the execution of handle_hvm_io_completion.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/hvm/ioreq.c  |  6 +++---
 xen/drivers/vpci/header.c | 16 ++++++++++------
 xen/include/xen/vpci.h    |  6 +++---
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 3569beaad5..cf3abd0f58 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -85,9 +85,6 @@ bool hvm_io_pending(struct vcpu *v)
     struct hvm_ioreq_server *s;
     unsigned int id;
 
-    if ( has_vpci(d) && vpci_process_pending(v) )
-        return true;
-
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
@@ -186,6 +183,9 @@ bool handle_hvm_io_completion(struct vcpu *v)
     enum hvm_io_completion io_completion;
     unsigned int id;
 
+    if ( has_vpci(d) )
+        vpci_process_pending(v);
+
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 9234de9b26..7a1380a5e7 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -118,7 +118,7 @@ static void modify_decoding(const struct pci_dev *pdev, bool map, bool rom_only)
                      cmd);
 }
 
-bool vpci_process_pending(struct vcpu *v)
+void vpci_process_pending(struct vcpu *v)
 {
     if ( v->vpci.mem )
     {
@@ -126,10 +126,11 @@ bool vpci_process_pending(struct vcpu *v)
             .d = v->domain,
             .map = v->vpci.map,
         };
-        int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
+        int rc;
 
-        if ( rc == -ERESTART )
-            return true;
+        while ( (rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data)) ==
+                -ERESTART )
+            do_softirq();
 
         spin_lock(&v->vpci.pdev->vpci->lock);
         /* Disable memory decoding unconditionally on failure. */
@@ -149,8 +150,6 @@ bool vpci_process_pending(struct vcpu *v)
              */
             vpci_remove_device(v->vpci.pdev);
     }
-
-    return false;
 }
 
 static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
@@ -183,6 +182,11 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
     curr->vpci.mem = mem;
     curr->vpci.map = map;
     curr->vpci.rom_only = rom_only;
+    /*
+     * Force a scheduler softirq in order to execute handle_hvm_io_completion
+     * (as part of hvm_do_resume) before attempting to return to guest context.
+     */
+    raise_softirq(SCHEDULE_SOFTIRQ);
 }
 
 static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index af2b8580ee..df0537f523 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -50,10 +50,10 @@ uint32_t vpci_hw_read32(const struct pci_dev *pdev, unsigned int reg,
                         void *data);
 
 /*
- * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
- * should not run.
+ * Execute pending vPCI operations on this vcpu.
+ * Note that this call might force a rescheduling.
  */
-bool __must_check vpci_process_pending(struct vcpu *v);
+void vpci_process_pending(struct vcpu *v);
 
 struct vpci {
     /* List of vPCI handlers for a device. */
-- 
2.19.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/6] vpci/msix: carve p2m hole for MSIX MMIO regions
  2018-10-09  9:42 [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0 Roger Pau Monne
                   ` (4 preceding siblings ...)
  2018-10-09  9:42 ` [PATCH 5/6] vpci: fix execution of long running operations Roger Pau Monne
@ 2018-10-09  9:42 ` Roger Pau Monne
  5 siblings, 0 replies; 15+ messages in thread
From: Roger Pau Monne @ 2018-10-09  9:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Roger Pau Monne

Make sure the MSIX MMIO regions don't have p2m entries setup, so that
accesses to them trap into the hypervisor and can be handled by vpci.

This is a side-effect of commit 042678762 for PVH Dom0, which added
mappings for all the reserved regions into the Dom0 p2m.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/drivers/vpci/header.c |  8 ++++++++
 xen/drivers/vpci/msix.c   | 40 +++++++++++++++++++++++++++++++++++++++
 xen/include/xen/vpci.h    |  3 +++
 3 files changed, 51 insertions(+)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 7a1380a5e7..68cd91c0ec 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -88,6 +88,14 @@ static void modify_decoding(const struct pci_dev *pdev, bool map, bool rom_only)
     uint16_t cmd;
     unsigned int i;
 
+    /*
+     * Make sure there are no mappings in the MSIX MMIO areas, so that accesses
+     * can be trapped (and emulated) by Xen when the memory decoding bit is
+     * enabled.
+     */
+    if ( map && !rom_only && vpci_make_msix_hole(pdev) )
+        return;
+
     for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
     {
         if ( !MAPPABLE_BAR(&header->bars[i]) )
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 1960dae123..5759551724 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -21,6 +21,7 @@
 #include <xen/vpci.h>
 
 #include <asm/msi.h>
+#include <asm/p2m.h>
 
 #define VMSIX_SIZE(num) offsetof(struct vpci_msix, entries[num])
 
@@ -395,6 +396,45 @@ static const struct hvm_mmio_ops vpci_msix_table_ops = {
     .write = msix_write,
 };
 
+int vpci_make_msix_hole(const struct pci_dev *pdev)
+{
+    struct domain *d = pdev->domain;
+    unsigned int i;
+
+    if ( !pdev->vpci->msix )
+        return 0;
+
+    /* Make sure there's a hole for the MSIX table/PBA in the p2m. */
+    for ( i = 0; i < ARRAY_SIZE(pdev->vpci->msix->tables); i++ )
+    {
+        unsigned long start = PFN_DOWN(vmsix_table_addr(pdev->vpci, i));
+        unsigned long end = PFN_DOWN(vmsix_table_addr(pdev->vpci, i) +
+                                     vmsix_table_size(pdev->vpci, i) - 1);
+
+        for ( ; start <= end; start++ )
+        {
+            p2m_type_t t;
+            mfn_t mfn = get_gfn_query(d, start, &t);
+
+            if ( t == p2m_mmio_direct && mfn_x(mfn) == start )
+                    clear_identity_p2m_entry(d, start);
+            else if ( t != p2m_mmio_dm )
+            {
+                put_gfn(d, start);
+                gprintk(XENLOG_WARNING,
+                        "%04x:%02x:%02x.%u: existing mapping (mfn: %" PRI_mfn
+                        "type: %d) at %#lx clobbers MSIX MMIO area\n",
+                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                        PCI_FUNC(pdev->devfn), mfn_x(mfn), t, start);
+                return -EEXIST;
+            }
+            put_gfn(d, start);
+        }
+    }
+
+    return 0;
+}
+
 static int init_msix(struct pci_dev *pdev)
 {
     struct domain *d = pdev->domain;
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index df0537f523..c7ddc7e63d 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -152,6 +152,9 @@ struct vpci_vcpu {
 #ifdef __XEN__
 void vpci_dump_msi(void);
 
+/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
+int vpci_make_msix_hole(const struct pci_dev *pdev);
+
 /* Arch-specific vPCI MSI helpers. */
 void vpci_msi_arch_mask(struct vpci_msi *msi, const struct pci_dev *pdev,
                         unsigned int entry, bool mask);
-- 
2.19.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/6] vpci: fix execution of long running operations
  2018-10-09  9:42 ` [PATCH 5/6] vpci: fix execution of long running operations Roger Pau Monne
@ 2018-10-10  9:09   ` Roger Pau Monné
  2018-10-10  9:19     ` Paul Durrant
  0 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monné @ 2018-10-10  9:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich

On Tue, Oct 09, 2018 at 11:42:35AM +0200, Roger Pau Monne wrote:
> BAR map/unmap is a long running operation that needs to be preempted
> in order to avoid overrunning the assigned vCPU time (or even
> triggering the watchdog).
> 
> Current logic for this preemption is wrong, and won't work at all for
> AMD since only Intel makes use of hvm_io_pending (and even in that
> case the current code is wrong).
> 
> Instead move the code that performs the mapping/unmapping to
> handle_hvm_io_completion and use do_softirq in order to execute the
> pending softirqs while the {un}mapping takes place. Note that
> do_softirq can also trigger a context switch to another vCPU, so
> having pending vpci operations shouldn't prevent fair scheduling.
> 
> When the {un}map operation is queued (as a result of a trapped PCI
> access) a schedule softirq is raised in order to force a context
> switch and the execution of handle_hvm_io_completion.

The logic of this patch is wrong, and calling do_softirq in order to
preempt can lead to stack overflow due to recursion because the callee
vCPU is not blocked and can be scheduled when calling do_softirq,
leading to a recursive execution of vpci_process_pending if the MMIO
area to modify is big enough.

Instead of running those operation on the vCPU context I think it's
easier to switch to a task and instead pause the guest vCPU until the
BARs are mapped and memory decoding is enabled.

Below is the updated patch.

---8<---
From 5c9f8802c46594a39de776d2414210fb21127b78 Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Wed, 10 Oct 2018 11:04:45 +0200
Subject: [PATCH 5/6] vpci: fix execution of long running operations
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

BAR map/unmap is a long running operation that needs to be preempted
in order to avoid overrunning the assigned vCPU time (or even
triggering the watchdog).

Current logic for this preemption is wrong, and won't work at all for
AMD since only Intel makes use of hvm_io_pending (and even in that
case the current code is wrong).

Instead move the code that performs the mapping/unmapping to
a tasklet and pause the guest vCPU until the {un}mapping is done and
the memory decoding bit has been toggled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/hvm/ioreq.c  |  3 --
 xen/common/domain.c       |  2 ++
 xen/drivers/vpci/header.c | 70 +++++++++++++++++++++++----------------
 xen/include/xen/vpci.h    | 15 +++------
 4 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 3569beaad5..31429265f8 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -85,9 +85,6 @@ bool hvm_io_pending(struct vcpu *v)
     struct hvm_ioreq_server *s;
     unsigned int id;
 
-    if ( has_vpci(d) && vpci_process_pending(v) )
-        return true;
-
     FOR_EACH_IOREQ_SERVER(d, id, s)
     {
         struct hvm_ioreq_vcpu *sv;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 65151e2ac4..54d2c26bd9 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -153,6 +153,8 @@ struct vcpu *vcpu_create(
 
     grant_table_init_vcpu(v);
 
+    vpci_init_vcpu(v);
+
     if ( !zalloc_cpumask_var(&v->cpu_hard_affinity) ||
          !zalloc_cpumask_var(&v->cpu_hard_affinity_tmp) ||
          !zalloc_cpumask_var(&v->cpu_hard_affinity_saved) ||
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 9234de9b26..4af85d3c02 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -118,39 +118,48 @@ static void modify_decoding(const struct pci_dev *pdev, bool map, bool rom_only)
                      cmd);
 }
 
-bool vpci_process_pending(struct vcpu *v)
+static void vpci_process_pending(unsigned long data)
 {
-    if ( v->vpci.mem )
+    struct vcpu *v = (void *)data;
+    struct map_data map_data = {
+        .d = v->domain,
+        .map = v->vpci.map,
+    };
+    int rc;
+
+    ASSERT(v->vpci.mem && atomic_read(&v->pause_count));
+
+    while ( (rc = rangeset_consume_ranges(v->vpci.mem, map_range, &map_data)) ==
+            -ERESTART )
     {
-        struct map_data data = {
-            .d = v->domain,
-            .map = v->vpci.map,
-        };
-        int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
-
-        if ( rc == -ERESTART )
-            return true;
-
-        spin_lock(&v->vpci.pdev->vpci->lock);
-        /* Disable memory decoding unconditionally on failure. */
-        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
-                        !rc && v->vpci.rom_only);
-        spin_unlock(&v->vpci.pdev->vpci->lock);
-
-        rangeset_destroy(v->vpci.mem);
-        v->vpci.mem = NULL;
-        if ( rc )
-            /*
-             * FIXME: in case of failure remove the device from the domain.
-             * Note that there might still be leftover mappings. While this is
-             * safe for Dom0, for DomUs the domain will likely need to be
-             * killed in order to avoid leaking stale p2m mappings on
-             * failure.
-             */
-            vpci_remove_device(v->vpci.pdev);
+        tasklet_schedule(&v->vpci.task);
+        return;
     }
 
-    return false;
+    spin_lock(&v->vpci.pdev->vpci->lock);
+    /* Disable memory decoding unconditionally on failure. */
+    modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
+                    !rc && v->vpci.rom_only);
+    spin_unlock(&v->vpci.pdev->vpci->lock);
+
+    rangeset_destroy(v->vpci.mem);
+    v->vpci.mem = NULL;
+    if ( rc )
+        /*
+         * FIXME: in case of failure remove the device from the domain.
+         * Note that there might still be leftover mappings. While this is
+         * safe for Dom0, for DomUs the domain will likely need to be
+         * killed in order to avoid leaking stale p2m mappings on
+         * failure.
+         */
+        vpci_remove_device(v->vpci.pdev);
+
+    vcpu_unpause(v);
+}
+
+void vpci_init_vcpu(struct vcpu *v)
+{
+    tasklet_init(&v->vpci.task, vpci_process_pending, (unsigned long)v);
 }
 
 static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
@@ -183,6 +192,9 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
     curr->vpci.mem = mem;
     curr->vpci.map = map;
     curr->vpci.rom_only = rom_only;
+    /* Pause the vCPU and schedule the tasklet that will perform the mapping. */
+    vcpu_pause_nosync(curr);
+    tasklet_schedule(&curr->vpci.task);
 }
 
 static int modify_bars(const struct pci_dev *pdev, bool map, bool rom_only)
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index af2b8580ee..f97c48a8f1 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -49,11 +49,8 @@ uint32_t vpci_hw_read16(const struct pci_dev *pdev, unsigned int reg,
 uint32_t vpci_hw_read32(const struct pci_dev *pdev, unsigned int reg,
                         void *data);
 
-/*
- * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
- * should not run.
- */
-bool __must_check vpci_process_pending(struct vcpu *v);
+/* Init per-vCPU vPCI data. */
+void vpci_init_vcpu(struct vcpu *v);
 
 struct vpci {
     /* List of vPCI handlers for a device. */
@@ -142,6 +139,8 @@ struct vpci {
 };
 
 struct vpci_vcpu {
+    /* Per-vCPU tasklet to enqueue pending operations. */
+    struct tasklet task;
     /* Per-vcpu structure to store state while {un}mapping of PCI BARs. */
     struct rangeset *mem;
     struct pci_dev *pdev;
@@ -233,11 +232,7 @@ static inline void vpci_write(pci_sbdf_t sbdf, unsigned int reg,
     ASSERT_UNREACHABLE();
 }
 
-static inline bool vpci_process_pending(struct vcpu *v)
-{
-    ASSERT_UNREACHABLE();
-    return false;
-}
+static inline void vpci_init_vcpu(struct vcpu *v) { }
 #endif
 
 #endif
-- 
2.19.0

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/6] vpci: fix execution of long running operations
  2018-10-10  9:09   ` Roger Pau Monné
@ 2018-10-10  9:19     ` Paul Durrant
  0 siblings, 0 replies; 15+ messages in thread
From: Paul Durrant @ 2018-10-10  9:19 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Roger Pau Monne
> Sent: 10 October 2018 10:10
> To: xen-devel@lists.xenproject.org
> Cc: Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu
> <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Julien Grall <julien.grall@arm.com>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH 5/6] vpci: fix execution of long running operations
> 
> On Tue, Oct 09, 2018 at 11:42:35AM +0200, Roger Pau Monne wrote:
> > BAR map/unmap is a long running operation that needs to be preempted
> > in order to avoid overrunning the assigned vCPU time (or even
> > triggering the watchdog).
> >
> > Current logic for this preemption is wrong, and won't work at all for
> > AMD since only Intel makes use of hvm_io_pending (and even in that
> > case the current code is wrong).
> >
> > Instead move the code that performs the mapping/unmapping to
> > handle_hvm_io_completion and use do_softirq in order to execute the
> > pending softirqs while the {un}mapping takes place. Note that
> > do_softirq can also trigger a context switch to another vCPU, so
> > having pending vpci operations shouldn't prevent fair scheduling.
> >
> > When the {un}map operation is queued (as a result of a trapped PCI
> > access) a schedule softirq is raised in order to force a context
> > switch and the execution of handle_hvm_io_completion.
> 
> The logic of this patch is wrong, and calling do_softirq in order to
> preempt can lead to stack overflow due to recursion because the callee
> vCPU is not blocked and can be scheduled when calling do_softirq,
> leading to a recursive execution of vpci_process_pending if the MMIO
> area to modify is big enough.
> 
> Instead of running those operation on the vCPU context I think it's
> easier to switch to a task and instead pause the guest vCPU until the
> BARs are mapped and memory decoding is enabled.
> 
> Below is the updated patch.
> 
> ---8<---
> From 5c9f8802c46594a39de776d2414210fb21127b78 Mon Sep 17 00:00:00 2001
> From: Roger Pau Monne <roger.pau@citrix.com>
> Date: Wed, 10 Oct 2018 11:04:45 +0200
> Subject: [PATCH 5/6] vpci: fix execution of long running operations
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> BAR map/unmap is a long running operation that needs to be preempted
> in order to avoid overrunning the assigned vCPU time (or even
> triggering the watchdog).
> 
> Current logic for this preemption is wrong, and won't work at all for
> AMD since only Intel makes use of hvm_io_pending (and even in that
> case the current code is wrong).
> 
> Instead move the code that performs the mapping/unmapping to
> a tasklet and pause the guest vCPU until the {un}mapping is done and
> the memory decoding bit has been toggled.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

ioreq modification...

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> Cc: Paul Durrant <paul.durrant@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> ---
>  xen/arch/x86/hvm/ioreq.c  |  3 --
>  xen/common/domain.c       |  2 ++
>  xen/drivers/vpci/header.c | 70 +++++++++++++++++++++++----------------
>  xen/include/xen/vpci.h    | 15 +++------
>  4 files changed, 48 insertions(+), 42 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 3569beaad5..31429265f8 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -85,9 +85,6 @@ bool hvm_io_pending(struct vcpu *v)
>      struct hvm_ioreq_server *s;
>      unsigned int id;
> 
> -    if ( has_vpci(d) && vpci_process_pending(v) )
> -        return true;
> -
>      FOR_EACH_IOREQ_SERVER(d, id, s)
>      {
>          struct hvm_ioreq_vcpu *sv;
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 65151e2ac4..54d2c26bd9 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -153,6 +153,8 @@ struct vcpu *vcpu_create(
> 
>      grant_table_init_vcpu(v);
> 
> +    vpci_init_vcpu(v);
> +
>      if ( !zalloc_cpumask_var(&v->cpu_hard_affinity) ||
>           !zalloc_cpumask_var(&v->cpu_hard_affinity_tmp) ||
>           !zalloc_cpumask_var(&v->cpu_hard_affinity_saved) ||
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 9234de9b26..4af85d3c02 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -118,39 +118,48 @@ static void modify_decoding(const struct pci_dev
> *pdev, bool map, bool rom_only)
>                       cmd);
>  }
> 
> -bool vpci_process_pending(struct vcpu *v)
> +static void vpci_process_pending(unsigned long data)
>  {
> -    if ( v->vpci.mem )
> +    struct vcpu *v = (void *)data;
> +    struct map_data map_data = {
> +        .d = v->domain,
> +        .map = v->vpci.map,
> +    };
> +    int rc;
> +
> +    ASSERT(v->vpci.mem && atomic_read(&v->pause_count));
> +
> +    while ( (rc = rangeset_consume_ranges(v->vpci.mem, map_range,
> &map_data)) ==
> +            -ERESTART )
>      {
> -        struct map_data data = {
> -            .d = v->domain,
> -            .map = v->vpci.map,
> -        };
> -        int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
> -
> -        if ( rc == -ERESTART )
> -            return true;
> -
> -        spin_lock(&v->vpci.pdev->vpci->lock);
> -        /* Disable memory decoding unconditionally on failure. */
> -        modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> -                        !rc && v->vpci.rom_only);
> -        spin_unlock(&v->vpci.pdev->vpci->lock);
> -
> -        rangeset_destroy(v->vpci.mem);
> -        v->vpci.mem = NULL;
> -        if ( rc )
> -            /*
> -             * FIXME: in case of failure remove the device from the
> domain.
> -             * Note that there might still be leftover mappings. While
> this is
> -             * safe for Dom0, for DomUs the domain will likely need to be
> -             * killed in order to avoid leaking stale p2m mappings on
> -             * failure.
> -             */
> -            vpci_remove_device(v->vpci.pdev);
> +        tasklet_schedule(&v->vpci.task);
> +        return;
>      }
> 
> -    return false;
> +    spin_lock(&v->vpci.pdev->vpci->lock);
> +    /* Disable memory decoding unconditionally on failure. */
> +    modify_decoding(v->vpci.pdev, !rc && v->vpci.map,
> +                    !rc && v->vpci.rom_only);
> +    spin_unlock(&v->vpci.pdev->vpci->lock);
> +
> +    rangeset_destroy(v->vpci.mem);
> +    v->vpci.mem = NULL;
> +    if ( rc )
> +        /*
> +         * FIXME: in case of failure remove the device from the domain.
> +         * Note that there might still be leftover mappings. While this
> is
> +         * safe for Dom0, for DomUs the domain will likely need to be
> +         * killed in order to avoid leaking stale p2m mappings on
> +         * failure.
> +         */
> +        vpci_remove_device(v->vpci.pdev);
> +
> +    vcpu_unpause(v);
> +}
> +
> +void vpci_init_vcpu(struct vcpu *v)
> +{
> +    tasklet_init(&v->vpci.task, vpci_process_pending, (unsigned long)v);
>  }
> 
>  static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
> @@ -183,6 +192,9 @@ static void defer_map(struct domain *d, struct pci_dev
> *pdev,
>      curr->vpci.mem = mem;
>      curr->vpci.map = map;
>      curr->vpci.rom_only = rom_only;
> +    /* Pause the vCPU and schedule the tasklet that will perform the
> mapping. */
> +    vcpu_pause_nosync(curr);
> +    tasklet_schedule(&curr->vpci.task);
>  }
> 
>  static int modify_bars(const struct pci_dev *pdev, bool map, bool
> rom_only)
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index af2b8580ee..f97c48a8f1 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -49,11 +49,8 @@ uint32_t vpci_hw_read16(const struct pci_dev *pdev,
> unsigned int reg,
>  uint32_t vpci_hw_read32(const struct pci_dev *pdev, unsigned int reg,
>                          void *data);
> 
> -/*
> - * Check for pending vPCI operations on this vcpu. Returns true if the
> vcpu
> - * should not run.
> - */
> -bool __must_check vpci_process_pending(struct vcpu *v);
> +/* Init per-vCPU vPCI data. */
> +void vpci_init_vcpu(struct vcpu *v);
> 
>  struct vpci {
>      /* List of vPCI handlers for a device. */
> @@ -142,6 +139,8 @@ struct vpci {
>  };
> 
>  struct vpci_vcpu {
> +    /* Per-vCPU tasklet to enqueue pending operations. */
> +    struct tasklet task;
>      /* Per-vcpu structure to store state while {un}mapping of PCI BARs.
> */
>      struct rangeset *mem;
>      struct pci_dev *pdev;
> @@ -233,11 +232,7 @@ static inline void vpci_write(pci_sbdf_t sbdf,
> unsigned int reg,
>      ASSERT_UNREACHABLE();
>  }
> 
> -static inline bool vpci_process_pending(struct vcpu *v)
> -{
> -    ASSERT_UNREACHABLE();
> -    return false;
> -}
> +static inline void vpci_init_vcpu(struct vcpu *v) { }
>  #endif
> 
>  #endif
> --
> 2.19.0

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/6] x86/pvh: fix TSC mode setup for PVH Dom0
  2018-10-09  9:42 ` [PATCH 1/6] x86/pvh: fix TSC mode setup " Roger Pau Monne
@ 2018-10-12 15:34   ` Jan Beulich
  2018-10-13  7:41     ` Roger Pau Monné
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-10-12 15:34 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 09.10.18 at 11:42, <roger.pau@citrix.com> wrote:
> A PVH Dom0 might use TSC scaling or other HVM specific TSC
> adjustments, so only short-circuit the TSC setup for a classic PV
> Dom0.

How that?

> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -2125,7 +2125,7 @@ void tsc_set_info(struct domain *d,
>  {
>      ASSERT(!is_system_domain(d));
>  
> -    if ( is_hardware_domain(d) )
> +    if ( is_pv_domain(d) && is_hardware_domain(d) )

This code path is reachable only from arch_domain_create()
(setting defaults) and XEN_DOMCTL_settscinfo (which Dom0 can't
issue against itself). So either there's some important part missing
from the description, or I fail to see what use this change is. Hmm,
on a platform where host_tsc_is_safe() returns false the setting
of defaults would have an effect, but I think the description
should then say so, instead of giving the impression that the
functionality would become available in full.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/6] x86/dom0: switch parse_dom0_param to use parse_boolean
  2018-10-09  9:42 ` [PATCH 2/6] x86/dom0: switch parse_dom0_param to use parse_boolean Roger Pau Monne
@ 2018-10-12 15:49   ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2018-10-12 15:49 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 09.10.18 at 11:42, <roger.pau@citrix.com> wrote:
> No functional change expected.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console
  2018-10-09  9:42 ` [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console Roger Pau Monne
@ 2018-10-12 15:55   ` Jan Beulich
  2018-10-15 15:15     ` Roger Pau Monné
  2018-10-15 12:56   ` Andrew Cooper
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-10-12 15:55 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

>>> On 09.10.18 at 11:42, <roger.pau@citrix.com> wrote:
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -681,6 +681,17 @@ Flag that makes a dom0 boot in PVHv2 mode.
>  Flag that makes a dom0 use shadow paging. Only works when "pvh" is
>  enabled.
>  
> +> `debug-ioport`
> +
> +> Default: `false`
> +
> +Flag that enables the HVM debug console for a PVH Dom0. Xen will trap accesses
> +to IO port 0xe9 so that Dom0 kernel can print output using this IO port before
> +setting up the hypercall page.
> +
> +Note this option is not enabled by default because it might clash with hardware
> +on the system using IO port 0xe9 and should only be used for debug purposes.

You also want to extend the "List of" at the top of the option.

> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -212,12 +212,16 @@ struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0)
>  bool __initdata opt_dom0_shadow;
>  #endif
>  bool __initdata dom0_pvh;
> +#ifdef CONFIG_HVM
> +bool __initdata opt_dom0_debug_ioport;
> +#endif

And dom0_pvh does not go into this same conditional?

> @@ -236,6 +240,10 @@ static int __init parse_dom0_param(const char *s)
>  #ifdef CONFIG_SHADOW_PAGING
>          else if ( (val = parse_boolean("shadow", s, ss)) >= 0 )
>              opt_dom0_shadow = val;
> +#endif
> +#ifdef CONFIG_HVM
> +        else if ( (val = parse_boolean("debug-ioport", s, ss)) >= 0 )
> +            opt_dom0_debug_ioport = val;
>  #endif
>          else
>              rc = -EINVAL;
> @@ -433,6 +441,11 @@ int __init dom0_setup_permissions(struct domain *d)
>          rc |= ioports_deny_access(d, pmtmr_ioport, pmtmr_ioport + 3);
>      /* PCI configuration space (NB. 0xcf8 has special treatment). */
>      rc |= ioports_deny_access(d, 0xcfc, 0xcff);
> +#ifdef CONFIG_HVM
> +    if ( is_hvm_domain(d) && opt_dom0_debug_ioport )
> +        /* HVM debug console IO port. */
> +        rc |= ioports_deny_access(d, 0xe9, 0xe9);

I think we finally need a manifest constant for this port number.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/6] x86/pvh: fix TSC mode setup for PVH Dom0
  2018-10-12 15:34   ` Jan Beulich
@ 2018-10-13  7:41     ` Roger Pau Monné
  0 siblings, 0 replies; 15+ messages in thread
From: Roger Pau Monné @ 2018-10-13  7:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Fri, Oct 12, 2018 at 09:34:35AM -0600, Jan Beulich wrote:
> >>> On 09.10.18 at 11:42, <roger.pau@citrix.com> wrote:
> > A PVH Dom0 might use TSC scaling or other HVM specific TSC
> > adjustments, so only short-circuit the TSC setup for a classic PV
> > Dom0.
> 
> How that?

If the hardware supports hvm_tsc_scaling_supported
d->arch.hvm.tsc_scaling_ratio would be left uninitialized for a PVH
Dom0, leading to wrong scaling being applied.

hvm_domain_initialise also attempts to set
d->arch.hvm.tsc_scaling_ratio but AFAICT this is bogus, for a PVH Dom0
this lead to a non-working PV clock, only tsc_set_info seems to set
the correct values (also note the logic difference in setting
d->arch.hvm.tsc_scaling_ratio from hvm_domain_initialise or
tsc_set_info).

> > --- a/xen/arch/x86/time.c
> > +++ b/xen/arch/x86/time.c
> > @@ -2125,7 +2125,7 @@ void tsc_set_info(struct domain *d,
> >  {
> >      ASSERT(!is_system_domain(d));
> >  
> > -    if ( is_hardware_domain(d) )
> > +    if ( is_pv_domain(d) && is_hardware_domain(d) )
> 
> This code path is reachable only from arch_domain_create()
> (setting defaults) and XEN_DOMCTL_settscinfo (which Dom0 can't
> issue against itself). So either there's some important part missing
> from the description, or I fail to see what use this change is. Hmm,
> on a platform where host_tsc_is_safe() returns false the setting
> of defaults would have an effect, but I think the description
> should then say so, instead of giving the impression that the
> functionality would become available in full.

This is called from arch_domain_create with the defaults, which is
called during PVH Dom0 construction. What might be relevant for a PVH
Dom0 is the last chunk of the function, the code inside the
is_hvm_domain check.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console
  2018-10-09  9:42 ` [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console Roger Pau Monne
  2018-10-12 15:55   ` Jan Beulich
@ 2018-10-15 12:56   ` Andrew Cooper
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2018-10-15 12:56 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Wei Liu, Jan Beulich

On 09/10/18 10:42, Roger Pau Monne wrote:
> Add an option to allow trapping accesses to IO port 0xe9 for a PVH
> Dom0, so it can print to the HVM debug console. Note this is not
> enabled by default in order to prevent clashes with hardware on the
> system using 0xe9.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

0xe9 was originally a BOCHS debug port, and was chosen because it isn't
typically used in an IBM compatible PC.

The only reference I can find which shows it as an option is "PC radio
by CoZet Info Systems".

I'd write it up with Xen's meaning, and leave the patch there.  I doubt
any HVM-capable system has an ISA radio card in it.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console
  2018-10-12 15:55   ` Jan Beulich
@ 2018-10-15 15:15     ` Roger Pau Monné
  0 siblings, 0 replies; 15+ messages in thread
From: Roger Pau Monné @ 2018-10-15 15:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Fri, Oct 12, 2018 at 09:55:10AM -0600, Jan Beulich wrote:
> >>> On 09.10.18 at 11:42, <roger.pau@citrix.com> wrote:
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -681,6 +681,17 @@ Flag that makes a dom0 boot in PVHv2 mode.
> >  Flag that makes a dom0 use shadow paging. Only works when "pvh" is
> >  enabled.
> >  
> > +> `debug-ioport`
> > +
> > +> Default: `false`
> > +
> > +Flag that enables the HVM debug console for a PVH Dom0. Xen will trap accesses
> > +to IO port 0xe9 so that Dom0 kernel can print output using this IO port before
> > +setting up the hypercall page.
> > +
> > +Note this option is not enabled by default because it might clash with hardware
> > +on the system using IO port 0xe9 and should only be used for debug purposes.
> 
> You also want to extend the "List of" at the top of the option.
> 
> > --- a/xen/arch/x86/dom0_build.c
> > +++ b/xen/arch/x86/dom0_build.c
> > @@ -212,12 +212,16 @@ struct vcpu *__init alloc_dom0_vcpu0(struct domain *dom0)
> >  bool __initdata opt_dom0_shadow;
> >  #endif
> >  bool __initdata dom0_pvh;
> > +#ifdef CONFIG_HVM
> > +bool __initdata opt_dom0_debug_ioport;
> > +#endif
> 
> And dom0_pvh does not go into this same conditional?

dom0_pvh is used by __start_xen without any guards, so while I agree
that it should be guarded with CONFIG_HVM it will require some
unrelated changes.

> > @@ -236,6 +240,10 @@ static int __init parse_dom0_param(const char *s)
> >  #ifdef CONFIG_SHADOW_PAGING
> >          else if ( (val = parse_boolean("shadow", s, ss)) >= 0 )
> >              opt_dom0_shadow = val;
> > +#endif
> > +#ifdef CONFIG_HVM
> > +        else if ( (val = parse_boolean("debug-ioport", s, ss)) >= 0 )
> > +            opt_dom0_debug_ioport = val;
> >  #endif
> >          else
> >              rc = -EINVAL;
> > @@ -433,6 +441,11 @@ int __init dom0_setup_permissions(struct domain *d)
> >          rc |= ioports_deny_access(d, pmtmr_ioport, pmtmr_ioport + 3);
> >      /* PCI configuration space (NB. 0xcf8 has special treatment). */
> >      rc |= ioports_deny_access(d, 0xcfc, 0xcff);
> > +#ifdef CONFIG_HVM
> > +    if ( is_hvm_domain(d) && opt_dom0_debug_ioport )
> > +        /* HVM debug console IO port. */
> > +        rc |= ioports_deny_access(d, 0xe9, 0xe9);
> 
> I think we finally need a manifest constant for this port number.

I've added a patch to introduce the constant define.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-10-15 15:21 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-09  9:42 [PATCH 0/6] x86/pvh: fix fixes for PVH Dom0 Roger Pau Monne
2018-10-09  9:42 ` [PATCH 1/6] x86/pvh: fix TSC mode setup " Roger Pau Monne
2018-10-12 15:34   ` Jan Beulich
2018-10-13  7:41     ` Roger Pau Monné
2018-10-09  9:42 ` [PATCH 2/6] x86/dom0: switch parse_dom0_param to use parse_boolean Roger Pau Monne
2018-10-12 15:49   ` Jan Beulich
2018-10-09  9:42 ` [PATCH 3/6] x86/pvh: allow PVH Dom0 to use the debug IO port console Roger Pau Monne
2018-10-12 15:55   ` Jan Beulich
2018-10-15 15:15     ` Roger Pau Monné
2018-10-15 12:56   ` Andrew Cooper
2018-10-09  9:42 ` [PATCH 4/6] vpci: fix updating the command register Roger Pau Monne
2018-10-09  9:42 ` [PATCH 5/6] vpci: fix execution of long running operations Roger Pau Monne
2018-10-10  9:09   ` Roger Pau Monné
2018-10-10  9:19     ` Paul Durrant
2018-10-09  9:42 ` [PATCH 6/6] vpci/msix: carve p2m hole for MSIX MMIO regions Roger Pau Monne

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.