All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] x86/PVH: Dom0 building adjustments
@ 2021-08-30 13:01 Jan Beulich
  2021-08-30 13:02 ` [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory Jan Beulich
                   ` (5 more replies)
  0 siblings, 6 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-30 13:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

The code building PVH Dom0 made use of sequences of P2M changes
which are disallowed as of XSA-378. First of all population of the
first Mb of memory needs to be redone. Then, largely as a
workaround, checking introduced by XSA-378 needs to be slightly
relaxed.

Note that with these adjustments I get Dom0 to start booting on my
development system, but the Dom0 kernel then gets stuck. Since it
was the first time for me to try PVH Dom0 in this context (see
below for why I was hesitant), I cannot tell yet whether this is
due further fallout from the XSA, or some further unrelated
problem. Dom0's BSP is in VPF_blocked state while all APs are
still in VPF_down. The 'd' debug key, unhelpfully, doesn't produce
any output, so it's non-trivial to check whether (like PV likes to
do) Dom0 has panic()ed without leaving any (visible) output.

[And there was another rather basic issue to fight first (patch
 will be submitted separately): vPCI wasn't aware of hidden PCI
 devices, hitting an ASSERT(). Obviously I couldn't afford not
 having a functioning serial console.]

In the course I ran into an oom condition while populating Dom0's
RAM. Hence next some re-work of dom0_compute_nr_pages(). In turn
in the course of putting that together I did notice that PV Dom0,
when run in shadow mode, wouldn't have its shadow allocation
properly set.

1: PVH: de-duplicate mappings for first Mb of Dom0 memory
2: P2M: relax guarding of MMIO entries
3: PVH: improve Dom0 memory size calculation
4: PV: properly set shadow allocation for Dom0

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-30 13:01 [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
@ 2021-08-30 13:02 ` Jan Beulich
  2021-08-31 13:02   ` Andrew Cooper
  2021-08-30 13:02 ` [PATCH 2/4] x86/P2M: relax guarding of MMIO entries Jan Beulich
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-08-30 13:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

One of the changes comprising the fixes for XSA-378 disallows replacing
MMIO mappings by unintended (for this purpose) code paths. This means we
need to be more careful about the mappings put in place in this range -
mappings should be created exactly once:
- iommu_hwdom_init() comes first; it should avoid the first Mb,
- pvh_populate_p2m() should insert identity mappings only into ranges
  not populated as RAM,
- pvh_setup_acpi() should again avoid the first Mb, which was already
  dealt with at that point.

Fixes: 753cb68e6530 ("x86/p2m: guard (in particular) identity mapping entries")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Note: Conflicts with the previously submitted "IOMMU/x86: perform PV
      Dom0 mappings in batches".

--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -430,17 +430,6 @@ static int __init pvh_populate_p2m(struc
     int rc;
 #define MB1_PAGES PFN_DOWN(MB(1))
 
-    /*
-     * Memory below 1MB is identity mapped initially. RAM regions are
-     * populated and copied below, replacing the respective mappings.
-     */
-    rc = modify_identity_mmio(d, 0, MB1_PAGES, true);
-    if ( rc )
-    {
-        printk("Failed to identity map low 1MB: %d\n", rc);
-        return rc;
-    }
-
     /* Populate memory map. */
     for ( i = 0; i < d->arch.nr_e820; i++ )
     {
@@ -472,6 +461,23 @@ static int __init pvh_populate_p2m(struc
         }
     }
 
+    /* Non-RAM regions of space below 1MB get identity mapped. */
+    for ( i = rc = 0; i < MB1_PAGES; ++i )
+    {
+        p2m_type_t p2mt;
+
+        if ( mfn_eq(get_gfn_query(d, i, &p2mt), INVALID_MFN) )
+            rc = set_mmio_p2m_entry(d, _gfn(i), _mfn(i), PAGE_ORDER_4K);
+        else
+            ASSERT(p2mt == p2m_ram_rw);
+        put_gfn(d, i);
+        if ( rc )
+        {
+            printk("Failed to identity map PFN %x: %d\n", i, rc);
+            return rc;
+        }
+    }
+
     if ( cpu_has_vmx && paging_mode_hap(d) && !vmx_unrestricted_guest(v) )
     {
         /*
@@ -1095,6 +1101,17 @@ static int __init pvh_setup_acpi(struct
         nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) +
                           d->arch.e820[i].size);
 
+        /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */
+        if ( pfn < PFN_DOWN(MB(1)) )
+        {
+            if ( pfn + nr_pages <= PFN_DOWN(MB(1)) )
+                continue;
+
+            /* This shouldn't happen, but is easy to deal with. */
+            nr_pages -= PFN_DOWN(MB(1)) - pfn;
+            pfn = PFN_DOWN(MB(1));
+        }
+
         rc = modify_identity_mmio(d, pfn, nr_pages, true);
         if ( rc )
         {
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -337,7 +337,13 @@ void __hwdom_init arch_iommu_hwdom_init(
     max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
     top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
 
-    for ( i = 0; i < top; i++ )
+    /*
+     * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
+     * setting up potentially conflicting mappings here.
+     */
+    i = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
+
+    for ( ; i < top; i++ )
     {
         unsigned long pfn = pdx_to_pfn(i);
         int rc;



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-30 13:01 [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
  2021-08-30 13:02 ` [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory Jan Beulich
@ 2021-08-30 13:02 ` Jan Beulich
  2021-08-31 13:16   ` Jan Beulich
  2021-08-31 13:16   ` Andrew Cooper
  2021-08-30 13:03 ` [PATCH 3/4] x86/PVH: improve Dom0 memory size calculation Jan Beulich
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-30 13:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

One of the changes comprising the fixes for XSA-378 disallows replacing
MMIO mappings by unintended (for this purpose) code paths. At least in
the case of PVH Dom0 hitting an RMRR covered by an E820 ACPI region,
this is too strict. Generally short-circuit requests establishing the
same kind of mapping that's already in place.

Further permit "access" to differ in the "executable" attribute. While
ideally only ROM regions would get mapped with X set, getting there is
quite a bit of work. Therefore, as a temporary measure, permit X to
vary. For Dom0 the more permissive of the types will be used, while for
DomU it'll be the more restrictive one.

While there, also add a log message to the other domain_crash()
invocation that did prevent PVH Dom0 from coming up after the XSA-378
changes.

Fixes: 753cb68e6530 ("x86/p2m: guard (in particular) identity mapping entries")
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -958,9 +958,13 @@ guest_physmap_add_entry(struct domain *d
         if ( p2m_is_special(ot) )
         {
             /* Don't permit unmapping grant/foreign/direct-MMIO this way. */
-            domain_crash(d);
             p2m_unlock(p2m);
-            
+            printk(XENLOG_G_ERR
+                   "%pd: GFN %lx (%lx:%u:%u) -> (%lx:%u:%u) not permitted\n",
+                   d, gfn_x(gfn) + i,
+                   mfn_x(omfn), ot, a,
+                   mfn_x(mfn) + i, t, p2m->default_access);
+            domain_crash(d);
             return -EPERM;
         }
         else if ( p2m_is_ram(ot) && !p2m_is_paged(ot) )
@@ -1302,9 +1306,50 @@ static int set_typed_p2m_entry(struct do
     }
     if ( p2m_is_special(ot) )
     {
-        gfn_unlock(p2m, gfn, order);
-        domain_crash(d);
-        return -EPERM;
+        bool done = false, bad = true;
+
+        /* Special-case (almost) identical mappings. */
+        if ( mfn_eq(mfn, omfn) && gfn_p2mt == ot )
+        {
+            /*
+             * For MMIO allow X to differ in the requests (to cover for
+             * set_identity_p2m_entry() and set_mmio_p2m_entry() differing in
+             * the way they specify "access"). For the hardware domain put (or
+             * leave) in place the more permissive of the two possibilities,
+             * while for DomU-s go with the more restrictive variant.
+             */
+            if ( gfn_p2mt == p2m_mmio_direct &&
+                 access <= p2m_access_rwx &&
+                 (access ^ a) == p2m_access_x )
+            {
+                if ( is_hardware_domain(d) )
+                    access |= p2m_access_x;
+                else
+                    access &= ~p2m_access_x;
+                bad = access == p2m_access_n;
+            }
+
+            if ( access == a )
+                done = true;
+        }
+
+        if ( done )
+        {
+            gfn_unlock(p2m, gfn, order);
+            return 0;
+        }
+
+        if ( bad )
+        {
+            gfn_unlock(p2m, gfn, order);
+            printk(XENLOG_G_ERR
+                   "%pd: GFN %lx (%lx:%u:%u:%u) -> (%lx:%u:%u:%u) not permitted\n",
+                   d, gfn_l,
+                   mfn_x(omfn), cur_order, ot, a,
+                   mfn_x(mfn), order, gfn_p2mt, access);
+            domain_crash(d);
+            return -EPERM;
+        }
     }
     else if ( p2m_is_ram(ot) )
     {



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 3/4] x86/PVH: improve Dom0 memory size calculation
  2021-08-30 13:01 [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
  2021-08-30 13:02 ` [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory Jan Beulich
  2021-08-30 13:02 ` [PATCH 2/4] x86/P2M: relax guarding of MMIO entries Jan Beulich
@ 2021-08-30 13:03 ` Jan Beulich
  2021-08-31 14:07   ` Andrew Cooper
  2021-08-30 13:03 ` [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0 Jan Beulich
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-08-30 13:03 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Assuming that the accounting for IOMMU page tables will also take care
of the P2M needs was wrong: dom0_paging_pages() can determine a far
higher value, high enough for the system to run out of memory while
setting up Dom0. Hence in the case of shared page tables the larger of
the two values needs to be used (without shared page tables the sum of
both continues to be applicable).

While there also account for two further aspects in the PV case: With
"iommu=dom0-passthrough" no IOMMU page tables would get allocated, so
none need accounting for. And if shadow mode is to be enabled, setting
aside a suitable amount for the P2M pool to get populated is also
necessary (i.e. similar to the non-shared-page-tables case of PVH).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -318,7 +318,7 @@ unsigned long __init dom0_compute_nr_pag
     struct domain *d, struct elf_dom_parms *parms, unsigned long initrd_len)
 {
     nodeid_t node;
-    unsigned long avail = 0, nr_pages, min_pages, max_pages;
+    unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 0;
     bool need_paging;
 
     /* The ordering of operands is to work around a clang5 issue. */
@@ -337,18 +337,23 @@ unsigned long __init dom0_compute_nr_pag
         avail -= d->max_vcpus - 1;
 
     /* Reserve memory for iommu_dom0_init() (rough estimate). */
-    if ( is_iommu_enabled(d) )
+    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
     {
         unsigned int s;
 
         for ( s = 9; s < BITS_PER_LONG; s += 9 )
-            avail -= max_pdx >> s;
+            iommu_pages += max_pdx >> s;
+
+        avail -= iommu_pages;
     }
 
-    need_paging = is_hvm_domain(d) &&
-        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));
+    need_paging = is_hvm_domain(d)
+                  ? !iommu_use_hap_pt(d) || !paging_mode_hap(d)
+                  : opt_dom0_shadow;
     for ( ; ; need_paging = false )
     {
+        unsigned long paging_pages;
+
         nr_pages = get_memsize(&dom0_size, avail);
         min_pages = get_memsize(&dom0_min_size, avail);
         max_pages = get_memsize(&dom0_max_size, avail);
@@ -377,11 +382,20 @@ unsigned long __init dom0_compute_nr_pag
         nr_pages = min(nr_pages, max_pages);
         nr_pages = min(nr_pages, avail);
 
-        if ( !need_paging )
-            break;
+        paging_pages = paging_mode_enabled(d) || need_paging
+                       ? dom0_paging_pages(d, nr_pages) : 0;
 
         /* Reserve memory for shadow or HAP. */
-        avail -= dom0_paging_pages(d, nr_pages);
+        if ( !need_paging )
+        {
+            if ( paging_pages <= iommu_pages )
+                break;
+
+            avail -= paging_pages - iommu_pages;
+        }
+        else
+            avail -= paging_pages;
+        iommu_pages = paging_pages;
     }
 
     if ( is_pv_domain(d) &&



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0
  2021-08-30 13:01 [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
                   ` (2 preceding siblings ...)
  2021-08-30 13:03 ` [PATCH 3/4] x86/PVH: improve Dom0 memory size calculation Jan Beulich
@ 2021-08-30 13:03 ` Jan Beulich
  2021-08-31 13:47   ` Andrew Cooper
  2021-08-31 21:08   ` Tim Deegan
  2021-08-31  8:53 ` [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
  2021-09-01 15:06 ` Jan Beulich
  5 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-30 13:03 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné, Tim Deegan

Leaving shadow setup just to the L1TF tasklet means running Dom0 on a
minimally acceptable shadow memory pool, rather than what normally
would be used (also, for example, for PVH). Populate the pool before
triggering the tasklet, on a best effort basis (again like done for
PVH).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1298,7 +1298,7 @@ int shadow_set_allocation(struct domain
 {
     struct page_info *sp;
 
-    ASSERT(paging_locked_by_me(d));
+    ASSERT(paging_locked_by_me(d) || system_state < SYS_STATE_active);
 
     if ( pages > 0 )
     {
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -21,6 +21,7 @@
 #include <asm/page.h>
 #include <asm/pv/mm.h>
 #include <asm/setup.h>
+#include <asm/shadow.h>
 
 /* Allow ring-3 access in long mode as guest cannot use ring 1 ... */
 #define BASE_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_USER)
@@ -933,7 +934,17 @@ int __init dom0_construct_pv(struct doma
 #ifdef CONFIG_SHADOW_PAGING
     if ( opt_dom0_shadow )
     {
+        bool preempted;
+
         printk("Switching dom0 to using shadow paging\n");
+
+        do {
+            preempted = false;
+            shadow_set_allocation(d, dom0_paging_pages(d, nr_pages),
+                                  &preempted);
+            process_pending_softirqs();
+        } while ( preempted );
+
         tasklet_schedule(&d->arch.paging.shadow.pv_l1tf_tasklet);
     }
 #endif



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-08-30 13:01 [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
                   ` (3 preceding siblings ...)
  2021-08-30 13:03 ` [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0 Jan Beulich
@ 2021-08-31  8:53 ` Jan Beulich
  2021-09-01 13:56   ` Roger Pau Monné
  2021-09-01 15:06 ` Jan Beulich
  5 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-08-31  8:53 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

On 30.08.2021 15:01, Jan Beulich wrote:
> The code building PVH Dom0 made use of sequences of P2M changes
> which are disallowed as of XSA-378. First of all population of the
> first Mb of memory needs to be redone. Then, largely as a
> workaround, checking introduced by XSA-378 needs to be slightly
> relaxed.
> 
> Note that with these adjustments I get Dom0 to start booting on my
> development system, but the Dom0 kernel then gets stuck. Since it
> was the first time for me to try PVH Dom0 in this context (see
> below for why I was hesitant), I cannot tell yet whether this is
> due further fallout from the XSA, or some further unrelated
> problem. Dom0's BSP is in VPF_blocked state while all APs are
> still in VPF_down. The 'd' debug key, unhelpfully, doesn't produce
> any output, so it's non-trivial to check whether (like PV likes to
> do) Dom0 has panic()ed without leaving any (visible) output.

Correction: I did mean '0' here, producing merely

(XEN) '0' pressed -> dumping Dom0's registers
(XEN) *** Dumping Dom0 vcpu#0 state: ***
(XEN) *** Dumping Dom0 vcpu#1 state: ***
(XEN) *** Dumping Dom0 vcpu#2 state: ***
(XEN) *** Dumping Dom0 vcpu#3 state: ***

'd' output supports the "system is idle" that was also visible from
'q' output.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-30 13:02 ` [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory Jan Beulich
@ 2021-08-31 13:02   ` Andrew Cooper
  2021-08-31 13:14     ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-08-31 13:02 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 30/08/2021 14:02, Jan Beulich wrote:
> One of the changes comprising the fixes for XSA-378 disallows replacing
> MMIO mappings by unintended (for this purpose) code paths.

I'd drop the brackets.  All it does is confuse the sentence.

>  This means we
> need to be more careful about the mappings put in place in this range -
> mappings should be created exactly once:
> - iommu_hwdom_init() comes first; it should avoid the first Mb,
> - pvh_populate_p2m() should insert identity mappings only into ranges
>   not populated as RAM,
> - pvh_setup_acpi() should again avoid the first Mb, which was already
>   dealt with at that point.

This really is a mess.  It also seems very fragile.

Why is iommu_hwdom_init() separate in the first place?  It only does
mappings up to 4G in the first place, and with this change, it is now
[1M-4G)

All IOMMU modifications should be as a side effect of regular p2m/guest
physmap operations.  I suppose this is another breakage from the PV side
of things.

> @@ -1095,6 +1101,17 @@ static int __init pvh_setup_acpi(struct
>          nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) +
>                            d->arch.e820[i].size);
>  
> +        /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */
> +        if ( pfn < PFN_DOWN(MB(1)) )
> +        {
> +            if ( pfn + nr_pages <= PFN_DOWN(MB(1)) )
> +                continue;
> +
> +            /* This shouldn't happen, but is easy to deal with. */

I'm not sure this comment is helpful.

Under PVH, there is nothing special about the 1M boundary, and we can
reasonably have something else here or crossing the boundary.


Preferably with this removed, Acked-by: Andrew Cooper
<andrew.cooper3@citrix.com>, but only because this is an emergency fix. 
I really don't think it is an improvement to the logic.

~Andrew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-31 13:02   ` Andrew Cooper
@ 2021-08-31 13:14     ` Jan Beulich
  2021-08-31 13:19       ` Jan Beulich
  2021-09-01  8:41       ` Roger Pau Monné
  0 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 13:14 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31.08.2021 15:02, Andrew Cooper wrote:
> On 30/08/2021 14:02, Jan Beulich wrote:
>> One of the changes comprising the fixes for XSA-378 disallows replacing
>> MMIO mappings by unintended (for this purpose) code paths.
> 
> I'd drop the brackets.  All it does is confuse the sentence.
> 
>>  This means we
>> need to be more careful about the mappings put in place in this range -
>> mappings should be created exactly once:
>> - iommu_hwdom_init() comes first; it should avoid the first Mb,
>> - pvh_populate_p2m() should insert identity mappings only into ranges
>>   not populated as RAM,
>> - pvh_setup_acpi() should again avoid the first Mb, which was already
>>   dealt with at that point.
> 
> This really is a mess.  It also seems very fragile.

So it seems to me.

> Why is iommu_hwdom_init() separate in the first place?  It only does
> mappings up to 4G in the first place, and with this change, it is now
> [1M-4G)

I guess we'll want to wait for Roger to return to shed some light on
this.

>> @@ -1095,6 +1101,17 @@ static int __init pvh_setup_acpi(struct
>>          nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) +
>>                            d->arch.e820[i].size);
>>  
>> +        /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */
>> +        if ( pfn < PFN_DOWN(MB(1)) )
>> +        {
>> +            if ( pfn + nr_pages <= PFN_DOWN(MB(1)) )
>> +                continue;
>> +
>> +            /* This shouldn't happen, but is easy to deal with. */
> 
> I'm not sure this comment is helpful.
> 
> Under PVH, there is nothing special about the 1M boundary, and we can
> reasonably have something else here or crossing the boundary.

As long as we have this special treatment of the low Mb, the boundary
is meaningful imo. I'd see the comment go away when the handling in
general gets streamlined.

> Preferably with this removed, Acked-by: Andrew Cooper
> <andrew.cooper3@citrix.com>, but only because this is an emergency fix.

Thanks. I see. You'll like patch 2 even less; at least I do. And I'm
not really certain that change is enough to cover all possible
systems.

> I really don't think it is an improvement to the logic.

Yet I suppose you also have no immediate suggestions towards doing
better? Of course right here a full rework is out of scope. But if
there were smaller bits that - if adjusted - would make you feel
better about the change as a whole, I'd be happy to consider making
adjustments.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-30 13:02 ` [PATCH 2/4] x86/P2M: relax guarding of MMIO entries Jan Beulich
@ 2021-08-31 13:16   ` Jan Beulich
  2021-08-31 13:16   ` Andrew Cooper
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 13:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

On 30.08.2021 15:02, Jan Beulich wrote:
> One of the changes comprising the fixes for XSA-378 disallows replacing
> MMIO mappings by unintended (for this purpose) code paths. At least in
> the case of PVH Dom0 hitting an RMRR covered by an E820 ACPI region,
> this is too strict. Generally short-circuit requests establishing the
> same kind of mapping that's already in place.
> 
> Further permit "access" to differ in the "executable" attribute. While
> ideally only ROM regions would get mapped with X set, getting there is
> quite a bit of work. Therefore, as a temporary measure, permit X to
> vary. For Dom0 the more permissive of the types will be used, while for
> DomU it'll be the more restrictive one.
> 
> While there, also add a log message to the other domain_crash()
> invocation that did prevent PVH Dom0 from coming up after the XSA-378
> changes.
> 
> Fixes: 753cb68e6530 ("x86/p2m: guard (in particular) identity mapping entries")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Btw, I had meant to have this post-commit-message remark here:

TBD: This could be generalized to all of R, W, and X. Dealing with just X
     is merely the minimum I found is immediately necessary.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-30 13:02 ` [PATCH 2/4] x86/P2M: relax guarding of MMIO entries Jan Beulich
  2021-08-31 13:16   ` Jan Beulich
@ 2021-08-31 13:16   ` Andrew Cooper
  2021-08-31 13:26     ` Jan Beulich
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-08-31 13:16 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 30/08/2021 14:02, Jan Beulich wrote:
> One of the changes comprising the fixes for XSA-378 disallows replacing
> MMIO mappings by unintended (for this purpose) code paths.

Drop the brackets.

> At least in
> the case of PVH Dom0 hitting an RMRR covered by an E820 ACPI region,
> this is too strict. Generally short-circuit requests establishing the
> same kind of mapping that's already in place.
>
> Further permit "access" to differ in the "executable" attribute. While
> ideally only ROM regions would get mapped with X set, getting there is
> quite a bit of work. Therefore, as a temporary measure, permit X to
> vary. For Dom0 the more permissive of the types will be used, while for
> DomU it'll be the more restrictive one.

Split behaviour between dom0 and domU based on types alone cannot
possibly be correct.

DomU's need to execute ROMs too, and this looks like will malfunction if
a ROM ends up in the region that HVMLoader relocated RAM from.

As this is a temporary bodge emergency bugfix, don't try to be clever -
just take the latest access.

> While there, also add a log message to the other domain_crash()
> invocation that did prevent PVH Dom0 from coming up after the XSA-378
> changes.
>
> Fixes: 753cb68e6530 ("x86/p2m: guard (in particular) identity mapping entries")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -958,9 +958,13 @@ guest_physmap_add_entry(struct domain *d
>          if ( p2m_is_special(ot) )
>          {
>              /* Don't permit unmapping grant/foreign/direct-MMIO this way. */
> -            domain_crash(d);
>              p2m_unlock(p2m);
> -            
> +            printk(XENLOG_G_ERR
> +                   "%pd: GFN %lx (%lx:%u:%u) -> (%lx:%u:%u) not permitted\n",

type and access need to be rendered in hex, or you need to use 0x
prefixes to distinguish the two bases.

Also, use commas rather than colons.  Visually, this is ambiguous with
PCI BDFs, and commas match tuple notation in most programming languages
which is the construct you're trying to represent.

Same below.

> +                   d, gfn_x(gfn) + i,
> +                   mfn_x(omfn), ot, a,
> +                   mfn_x(mfn) + i, t, p2m->default_access);
> +            domain_crash(d);
>              return -EPERM;
>          }
>          else if ( p2m_is_ram(ot) && !p2m_is_paged(ot) )
> @@ -1302,9 +1306,50 @@ static int set_typed_p2m_entry(struct do
>      }
>      if ( p2m_is_special(ot) )
>      {
> -        gfn_unlock(p2m, gfn, order);
> -        domain_crash(d);
> -        return -EPERM;
> +        bool done = false, bad = true;
> +
> +        /* Special-case (almost) identical mappings. */
> +        if ( mfn_eq(mfn, omfn) && gfn_p2mt == ot )
> +        {
> +            /*
> +             * For MMIO allow X to differ in the requests (to cover for
> +             * set_identity_p2m_entry() and set_mmio_p2m_entry() differing in
> +             * the way they specify "access"). For the hardware domain put (or
> +             * leave) in place the more permissive of the two possibilities,
> +             * while for DomU-s go with the more restrictive variant.

This comment needs to identify clearly that it is a temporary bodge
intended to be removed.

~Andrew

> +             */
> +            if ( gfn_p2mt == p2m_mmio_direct &&
> +                 access <= p2m_access_rwx &&
> +                 (access ^ a) == p2m_access_x )
> +            {
> +                if ( is_hardware_domain(d) )
> +                    access |= p2m_access_x;
> +                else
> +                    access &= ~p2m_access_x;
> +                bad = access == p2m_access_n;
> +            }
> +
> +            if ( access == a )
> +                done = true;
> +        }
> +
> +        if ( done )
> +        {
> +            gfn_unlock(p2m, gfn, order);
> +            return 0;
> +        }
> +
> +        if ( bad )
> +        {
> +            gfn_unlock(p2m, gfn, order);
> +            printk(XENLOG_G_ERR
> +                   "%pd: GFN %lx (%lx:%u:%u:%u) -> (%lx:%u:%u:%u) not permitted\n",
> +                   d, gfn_l,
> +                   mfn_x(omfn), cur_order, ot, a,
> +                   mfn_x(mfn), order, gfn_p2mt, access);
> +            domain_crash(d);
> +            return -EPERM;
> +        }
>      }
>      else if ( p2m_is_ram(ot) )
>      {
>




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-31 13:14     ` Jan Beulich
@ 2021-08-31 13:19       ` Jan Beulich
  2021-08-31 13:27         ` Andrew Cooper
  2021-09-01  8:41       ` Roger Pau Monné
  1 sibling, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 13:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31.08.2021 15:14, Jan Beulich wrote:
> On 31.08.2021 15:02, Andrew Cooper wrote:
>> On 30/08/2021 14:02, Jan Beulich wrote:
>>> One of the changes comprising the fixes for XSA-378 disallows replacing
>>> MMIO mappings by unintended (for this purpose) code paths.
>>
>> I'd drop the brackets.  All it does is confuse the sentence.
>>
>>>  This means we
>>> need to be more careful about the mappings put in place in this range -
>>> mappings should be created exactly once:
>>> - iommu_hwdom_init() comes first; it should avoid the first Mb,
>>> - pvh_populate_p2m() should insert identity mappings only into ranges
>>>   not populated as RAM,
>>> - pvh_setup_acpi() should again avoid the first Mb, which was already
>>>   dealt with at that point.
>>
>> This really is a mess.  It also seems very fragile.
> 
> So it seems to me.
> 
>> Why is iommu_hwdom_init() separate in the first place?  It only does
>> mappings up to 4G in the first place, and with this change, it is now
>> [1M-4G)
> 
> I guess we'll want to wait for Roger to return to shed some light on
> this.
> 
>>> @@ -1095,6 +1101,17 @@ static int __init pvh_setup_acpi(struct
>>>          nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) +
>>>                            d->arch.e820[i].size);
>>>  
>>> +        /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */
>>> +        if ( pfn < PFN_DOWN(MB(1)) )
>>> +        {
>>> +            if ( pfn + nr_pages <= PFN_DOWN(MB(1)) )
>>> +                continue;
>>> +
>>> +            /* This shouldn't happen, but is easy to deal with. */
>>
>> I'm not sure this comment is helpful.
>>
>> Under PVH, there is nothing special about the 1M boundary, and we can
>> reasonably have something else here or crossing the boundary.
> 
> As long as we have this special treatment of the low Mb, the boundary
> is meaningful imo. I'd see the comment go away when the handling in
> general gets streamlined.

I should have added: And as long as Dom0's E820 map gets cloned from
the host's, which will necessarily consider the 1Mb boundary special.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-31 13:16   ` Andrew Cooper
@ 2021-08-31 13:26     ` Jan Beulich
  2021-08-31 15:25       ` Andrew Cooper
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 13:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31.08.2021 15:16, Andrew Cooper wrote:
> On 30/08/2021 14:02, Jan Beulich wrote:
>> Further permit "access" to differ in the "executable" attribute. While
>> ideally only ROM regions would get mapped with X set, getting there is
>> quite a bit of work. Therefore, as a temporary measure, permit X to
>> vary. For Dom0 the more permissive of the types will be used, while for
>> DomU it'll be the more restrictive one.
> 
> Split behaviour between dom0 and domU based on types alone cannot
> possibly be correct.

True, but what do you do.

> DomU's need to execute ROMs too, and this looks like will malfunction if
> a ROM ends up in the region that HVMLoader relocated RAM from.
> 
> As this is a temporary bodge emergency bugfix, don't try to be clever -
> just take the latest access.

And how do we know that that's what is going to work? We should
strictly accumulate for Dom0. And what we do for DomU is moot for
the moment, until PCI passthrough becomes a thing for PVH. Hence
I've opted to be restrictive there - I'd rather see things break
(and getting adjusted) when this future work actually gets carried
out, than leave things permissive for no-one to notice that it's
too permissive, leading to an XSA.

>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -958,9 +958,13 @@ guest_physmap_add_entry(struct domain *d
>>          if ( p2m_is_special(ot) )
>>          {
>>              /* Don't permit unmapping grant/foreign/direct-MMIO this way. */
>> -            domain_crash(d);
>>              p2m_unlock(p2m);
>> -            
>> +            printk(XENLOG_G_ERR
>> +                   "%pd: GFN %lx (%lx:%u:%u) -> (%lx:%u:%u) not permitted\n",
> 
> type and access need to be rendered in hex, or you need to use 0x
> prefixes to distinguish the two bases.

Will use %#lx then.

> Also, use commas rather than colons.  Visually, this is ambiguous with
> PCI BDFs, and commas match tuple notation in most programming languages
> which is the construct you're trying to represent.
> 
> Same below.

Sure, will do.

>> @@ -1302,9 +1306,50 @@ static int set_typed_p2m_entry(struct do
>>      }
>>      if ( p2m_is_special(ot) )
>>      {
>> -        gfn_unlock(p2m, gfn, order);
>> -        domain_crash(d);
>> -        return -EPERM;
>> +        bool done = false, bad = true;
>> +
>> +        /* Special-case (almost) identical mappings. */
>> +        if ( mfn_eq(mfn, omfn) && gfn_p2mt == ot )
>> +        {
>> +            /*
>> +             * For MMIO allow X to differ in the requests (to cover for
>> +             * set_identity_p2m_entry() and set_mmio_p2m_entry() differing in
>> +             * the way they specify "access"). For the hardware domain put (or
>> +             * leave) in place the more permissive of the two possibilities,
>> +             * while for DomU-s go with the more restrictive variant.
> 
> This comment needs to identify clearly that it is a temporary bodge
> intended to be removed.

Okay.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-31 13:19       ` Jan Beulich
@ 2021-08-31 13:27         ` Andrew Cooper
  2021-08-31 13:36           ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-08-31 13:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31/08/2021 14:19, Jan Beulich wrote:
>>>> @@ -1095,6 +1101,17 @@ static int __init pvh_setup_acpi(struct
>>>>          nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) +
>>>>                            d->arch.e820[i].size);
>>>>  
>>>> +        /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */
>>>> +        if ( pfn < PFN_DOWN(MB(1)) )
>>>> +        {
>>>> +            if ( pfn + nr_pages <= PFN_DOWN(MB(1)) )
>>>> +                continue;
>>>> +
>>>> +            /* This shouldn't happen, but is easy to deal with. */
>>> I'm not sure this comment is helpful.
>>>
>>> Under PVH, there is nothing special about the 1M boundary, and we can
>>> reasonably have something else here or crossing the boundary.
>> As long as we have this special treatment of the low Mb, the boundary
>> is meaningful imo. I'd see the comment go away when the handling in
>> general gets streamlined.
> I should have added: And as long as Dom0's E820 map gets cloned from
> the host's, which will necessarily consider the 1Mb boundary special.

Not when you're booting virtualised in the first place.

~Andrew


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-31 13:27         ` Andrew Cooper
@ 2021-08-31 13:36           ` Jan Beulich
  2021-09-01 11:49             ` Andrew Cooper
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 13:36 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31.08.2021 15:27, Andrew Cooper wrote:
> On 31/08/2021 14:19, Jan Beulich wrote:
>>>>> @@ -1095,6 +1101,17 @@ static int __init pvh_setup_acpi(struct
>>>>>          nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) +
>>>>>                            d->arch.e820[i].size);
>>>>>  
>>>>> +        /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */
>>>>> +        if ( pfn < PFN_DOWN(MB(1)) )
>>>>> +        {
>>>>> +            if ( pfn + nr_pages <= PFN_DOWN(MB(1)) )
>>>>> +                continue;
>>>>> +
>>>>> +            /* This shouldn't happen, but is easy to deal with. */
>>>> I'm not sure this comment is helpful.
>>>>
>>>> Under PVH, there is nothing special about the 1M boundary, and we can
>>>> reasonably have something else here or crossing the boundary.
>>> As long as we have this special treatment of the low Mb, the boundary
>>> is meaningful imo. I'd see the comment go away when the handling in
>>> general gets streamlined.
>> I should have added: And as long as Dom0's E820 map gets cloned from
>> the host's, which will necessarily consider the 1Mb boundary special.
> 
> Not when you're booting virtualised in the first place.

You mean when Xen itself runs in PVH (not HVM) mode, and then in turn
has a PVH Dom0? And if the underlying Xen has not in turn cloned
the E820 from the host's? That's surely too exotic a case to warrant
removing this comment. If you insist, I can mention that case as a
possible exception.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0
  2021-08-30 13:03 ` [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0 Jan Beulich
@ 2021-08-31 13:47   ` Andrew Cooper
  2021-08-31 14:25     ` Jan Beulich
  2021-08-31 21:08   ` Tim Deegan
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-08-31 13:47 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné, Tim Deegan

On 30/08/2021 14:03, Jan Beulich wrote:
> @@ -933,7 +934,17 @@ int __init dom0_construct_pv(struct doma
>  #ifdef CONFIG_SHADOW_PAGING
>      if ( opt_dom0_shadow )
>      {
> +        bool preempted;
> +
>          printk("Switching dom0 to using shadow paging\n");
> +
> +        do {
> +            preempted = false;
> +            shadow_set_allocation(d, dom0_paging_pages(d, nr_pages),
> +                                  &preempted);
> +            process_pending_softirqs();
> +        } while ( preempted );

This isn't correct.  The shadow pool is needed even without
opt_dom0_shadow, because some downstreams have elected not to retain
upstream's security vulnerability in default setting of opt_pv_l1tf_hwdom.

Also, dom0_paging_pages() isn't a trivial calculation, so should be
called once and cached.

~Andrew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] x86/PVH: improve Dom0 memory size calculation
  2021-08-30 13:03 ` [PATCH 3/4] x86/PVH: improve Dom0 memory size calculation Jan Beulich
@ 2021-08-31 14:07   ` Andrew Cooper
  2021-08-31 15:30     ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-08-31 14:07 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 30/08/2021 14:03, Jan Beulich wrote:
> Assuming that the accounting for IOMMU page tables will also take care
> of the P2M needs was wrong: dom0_paging_pages() can determine a far
> higher value, high enough for the system to run out of memory while
> setting up Dom0. Hence in the case of shared page tables the larger of
> the two values needs to be used (without shared page tables the sum of
> both continues to be applicable).

I'm afraid that I can't follow this at all.

What causes the system to run out of RAM while constructing dom0?

>
> While there also account for two further aspects in the PV case: With
> "iommu=dom0-passthrough" no IOMMU page tables would get allocated, so
> none need accounting for. And if shadow mode is to be enabled, setting
> aside a suitable amount for the P2M pool to get populated is also
> necessary (i.e. similar to the non-shared-page-tables case of PVH).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -318,7 +318,7 @@ unsigned long __init dom0_compute_nr_pag
>      struct domain *d, struct elf_dom_parms *parms, unsigned long initrd_len)
>  {
>      nodeid_t node;
> -    unsigned long avail = 0, nr_pages, min_pages, max_pages;
> +    unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 0;
>      bool need_paging;
>  
>      /* The ordering of operands is to work around a clang5 issue. */
> @@ -337,18 +337,23 @@ unsigned long __init dom0_compute_nr_pag
>          avail -= d->max_vcpus - 1;
>  
>      /* Reserve memory for iommu_dom0_init() (rough estimate). */
> -    if ( is_iommu_enabled(d) )
> +    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
>      {
>          unsigned int s;
>  
>          for ( s = 9; s < BITS_PER_LONG; s += 9 )
> -            avail -= max_pdx >> s;
> +            iommu_pages += max_pdx >> s;
> +
> +        avail -= iommu_pages;
>      }
>  
> -    need_paging = is_hvm_domain(d) &&
> -        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));
> +    need_paging = is_hvm_domain(d)
> +                  ? !iommu_use_hap_pt(d) || !paging_mode_hap(d)
> +                  : opt_dom0_shadow;

As per patch 4, this condition needs adjusting.

~Andrew

>      for ( ; ; need_paging = false )
>      {
> +        unsigned long paging_pages;
> +
>          nr_pages = get_memsize(&dom0_size, avail);
>          min_pages = get_memsize(&dom0_min_size, avail);
>          max_pages = get_memsize(&dom0_max_size, avail);
> @@ -377,11 +382,20 @@ unsigned long __init dom0_compute_nr_pag
>          nr_pages = min(nr_pages, max_pages);
>          nr_pages = min(nr_pages, avail);
>  
> -        if ( !need_paging )
> -            break;
> +        paging_pages = paging_mode_enabled(d) || need_paging
> +                       ? dom0_paging_pages(d, nr_pages) : 0;
>  
>          /* Reserve memory for shadow or HAP. */
> -        avail -= dom0_paging_pages(d, nr_pages);
> +        if ( !need_paging )
> +        {
> +            if ( paging_pages <= iommu_pages )
> +                break;
> +
> +            avail -= paging_pages - iommu_pages;
> +        }
> +        else
> +            avail -= paging_pages;
> +        iommu_pages = paging_pages;
>      }
>  
>      if ( is_pv_domain(d) &&
>




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0
  2021-08-31 13:47   ` Andrew Cooper
@ 2021-08-31 14:25     ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 14:25 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, Tim Deegan, xen-devel

On 31.08.2021 15:47, Andrew Cooper wrote:
> On 30/08/2021 14:03, Jan Beulich wrote:
>> @@ -933,7 +934,17 @@ int __init dom0_construct_pv(struct doma
>>  #ifdef CONFIG_SHADOW_PAGING
>>      if ( opt_dom0_shadow )
>>      {
>> +        bool preempted;
>> +
>>          printk("Switching dom0 to using shadow paging\n");
>> +
>> +        do {
>> +            preempted = false;
>> +            shadow_set_allocation(d, dom0_paging_pages(d, nr_pages),
>> +                                  &preempted);
>> +            process_pending_softirqs();
>> +        } while ( preempted );
> 
> This isn't correct.  The shadow pool is needed even without
> opt_dom0_shadow, because some downstreams have elected not to retain
> upstream's security vulnerability in default setting of opt_pv_l1tf_hwdom.

Are you suggesting to set up a (perhaps large) shadow pool just in
case we need to enable shadow mode on Dom0? And all of this memory
to then remain unused in the majority of cases?

Plus even if so, I'd view this as a 2nd, independent step, largely
orthogonal to the handling of "dom0=shadow". If somebody really
wanted that, I think this should be driven by an explicit setting
of the shadow pool size, indicating the admin is willing to waste
the memory.

I'm further puzzled by "not to retain upstream's security
vulnerability" - are you saying upstream is vulnerable in some way,
while perhaps you (XenServer) are not? In general I don't think I
view downstream decisions as a driving factor for what upstream
does, when the result is deliberately different behavior from
upstream.

> Also, dom0_paging_pages() isn't a trivial calculation, so should be
> called once and cached.

Sure, can do that. You did notice though that all I did is take
PVH's similar code?

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-31 13:26     ` Jan Beulich
@ 2021-08-31 15:25       ` Andrew Cooper
  2021-08-31 15:38         ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-08-31 15:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31/08/2021 14:26, Jan Beulich wrote:
> On 31.08.2021 15:16, Andrew Cooper wrote:
>> On 30/08/2021 14:02, Jan Beulich wrote:
>>> Further permit "access" to differ in the "executable" attribute. While
>>> ideally only ROM regions would get mapped with X set, getting there is
>>> quite a bit of work. Therefore, as a temporary measure, permit X to
>>> vary. For Dom0 the more permissive of the types will be used, while for
>>> DomU it'll be the more restrictive one.
>> Split behaviour between dom0 and domU based on types alone cannot
>> possibly be correct.
> True, but what do you do.
>
>> DomU's need to execute ROMs too, and this looks like will malfunction if
>> a ROM ends up in the region that HVMLoader relocated RAM from.
>>
>> As this is a temporary bodge emergency bugfix, don't try to be clever -
>> just take the latest access.
> And how do we know that that's what is going to work?

Because it's the pre-existing behaviour.

>  We should
> strictly accumulate for Dom0. And what we do for DomU is moot for
> the moment, until PCI passthrough becomes a thing for PVH. Hence
> I've opted to be restrictive there - I'd rather see things break
> (and getting adjusted) when this future work actually gets carried
> out, than leave things permissive for no-one to notice that it's
> too permissive, leading to an XSA.

Restricting execute permissions is something unique to virt.  It doesn't
exist in a non-virtualised system, as I and D side reads are
indistinguishable outside of the core.

Furthermore, it is inexpressible on some systems/configurations.

Introspection is the only technology which should be restricting execute
permissions in the p2m, and only when it takes responsibility for
dealing with the fallout.

~Andrew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/4] x86/PVH: improve Dom0 memory size calculation
  2021-08-31 14:07   ` Andrew Cooper
@ 2021-08-31 15:30     ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 15:30 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31.08.2021 16:07, Andrew Cooper wrote:
> On 30/08/2021 14:03, Jan Beulich wrote:
>> Assuming that the accounting for IOMMU page tables will also take care
>> of the P2M needs was wrong: dom0_paging_pages() can determine a far
>> higher value, high enough for the system to run out of memory while
>> setting up Dom0. Hence in the case of shared page tables the larger of
>> the two values needs to be used (without shared page tables the sum of
>> both continues to be applicable).
> 
> I'm afraid that I can't follow this at all.
> 
> What causes the system to run out of RAM while constructing dom0?

Without any "dom0_mem=" we set aside 128Mb. In my .config that was in
use I default to "dom0_mem=-255". Yet this was still far from enough to
cover the gap between the original IOMMU page table accounting and the
value returned from dom0_paging_pages(). Since this is what also gets
used to actually populate Dom0's P2M pool, with the P2M pool populated
there wasn't enough RAM anymore to reach the Dom0 size which
dom0_compute_nr_pages() did establish.

Putting it in a simplified formula, what we did so far when sharing
page tables was

RAM = total - IOMMU

but what we need is

RAM = total - max(IOMMU, P2M)

In the non-shared case we already correctly did

RAM = total - (IOMMU + P2M)

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-31 15:25       ` Andrew Cooper
@ 2021-08-31 15:38         ` Jan Beulich
  2021-09-01  8:08           ` Jan Beulich
                             ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Jan Beulich @ 2021-08-31 15:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31.08.2021 17:25, Andrew Cooper wrote:
> On 31/08/2021 14:26, Jan Beulich wrote:
>> On 31.08.2021 15:16, Andrew Cooper wrote:
>>> On 30/08/2021 14:02, Jan Beulich wrote:
>>>> Further permit "access" to differ in the "executable" attribute. While
>>>> ideally only ROM regions would get mapped with X set, getting there is
>>>> quite a bit of work. Therefore, as a temporary measure, permit X to
>>>> vary. For Dom0 the more permissive of the types will be used, while for
>>>> DomU it'll be the more restrictive one.
>>> Split behaviour between dom0 and domU based on types alone cannot
>>> possibly be correct.
>> True, but what do you do.
>>
>>> DomU's need to execute ROMs too, and this looks like will malfunction if
>>> a ROM ends up in the region that HVMLoader relocated RAM from.
>>>
>>> As this is a temporary bodge emergency bugfix, don't try to be clever -
>>> just take the latest access.
>> And how do we know that that's what is going to work?
> 
> Because it's the pre-existing behaviour.

Valid point. But for the DomU case there simply has not been any
pre-existing behavior. Hence my desire to be restrictive initially
there.

>>  We should
>> strictly accumulate for Dom0. And what we do for DomU is moot for
>> the moment, until PCI passthrough becomes a thing for PVH. Hence
>> I've opted to be restrictive there - I'd rather see things break
>> (and getting adjusted) when this future work actually gets carried
>> out, than leave things permissive for no-one to notice that it's
>> too permissive, leading to an XSA.
> 
> Restricting execute permissions is something unique to virt.  It doesn't
> exist in a non-virtualised system, as I and D side reads are
> indistinguishable outside of the core.
> 
> Furthermore, it is inexpressible on some systems/configurations.
> 
> Introspection is the only technology which should be restricting execute
> permissions in the p2m, and only when it takes responsibility for
> dealing with the fallout.

IOW are you saying that the calls to set_identity_p2m_entry()
(pre-dating XSA-378) were wrong to use p2m_access_rw? Because that's
what's getting the way here.

Plus, as a side note, then we don't even have e.g. IOMMUF_executable.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0
  2021-08-30 13:03 ` [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0 Jan Beulich
  2021-08-31 13:47   ` Andrew Cooper
@ 2021-08-31 21:08   ` Tim Deegan
  1 sibling, 0 replies; 40+ messages in thread
From: Tim Deegan @ 2021-08-31 21:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné

At 15:03 +0200 on 30 Aug (1630335824), Jan Beulich wrote:
> Leaving shadow setup just to the L1TF tasklet means running Dom0 on a
> minimally acceptable shadow memory pool, rather than what normally
> would be used (also, for example, for PVH). Populate the pool before
> triggering the tasklet, on a best effort basis (again like done for
> PVH).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Tim Deegan <tim@xen.org>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-31 15:38         ` Jan Beulich
@ 2021-09-01  8:08           ` Jan Beulich
  2021-09-01  8:50           ` Roger Pau Monné
  2021-09-01 12:47           ` Andrew Cooper
  2 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-09-01  8:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31.08.2021 17:38, Jan Beulich wrote:
> On 31.08.2021 17:25, Andrew Cooper wrote:
>> On 31/08/2021 14:26, Jan Beulich wrote:
>>> On 31.08.2021 15:16, Andrew Cooper wrote:
>>>> On 30/08/2021 14:02, Jan Beulich wrote:
>>>>> Further permit "access" to differ in the "executable" attribute. While
>>>>> ideally only ROM regions would get mapped with X set, getting there is
>>>>> quite a bit of work. Therefore, as a temporary measure, permit X to
>>>>> vary. For Dom0 the more permissive of the types will be used, while for
>>>>> DomU it'll be the more restrictive one.
>>>> Split behaviour between dom0 and domU based on types alone cannot
>>>> possibly be correct.
>>> True, but what do you do.
>>>
>>>> DomU's need to execute ROMs too, and this looks like will malfunction if
>>>> a ROM ends up in the region that HVMLoader relocated RAM from.
>>>>
>>>> As this is a temporary bodge emergency bugfix, don't try to be clever -
>>>> just take the latest access.
>>> And how do we know that that's what is going to work?
>>
>> Because it's the pre-existing behaviour.
> 
> Valid point. But for the DomU case there simply has not been any
> pre-existing behavior. Hence my desire to be restrictive initially
> there.

Further to this: Using the last-value-set approach also puts us at
risk of running into a similar issue again when the ordering of
some operations changes elsewhere.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-31 13:14     ` Jan Beulich
  2021-08-31 13:19       ` Jan Beulich
@ 2021-09-01  8:41       ` Roger Pau Monné
  1 sibling, 0 replies; 40+ messages in thread
From: Roger Pau Monné @ 2021-09-01  8:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Tue, Aug 31, 2021 at 03:14:06PM +0200, Jan Beulich wrote:

Sorry for the delay, and likely not being of much help right now.

> On 31.08.2021 15:02, Andrew Cooper wrote:
> > On 30/08/2021 14:02, Jan Beulich wrote:
> >> One of the changes comprising the fixes for XSA-378 disallows replacing
> >> MMIO mappings by unintended (for this purpose) code paths.
> > 
> > I'd drop the brackets.  All it does is confuse the sentence.
> > 
> >>  This means we
> >> need to be more careful about the mappings put in place in this range -
> >> mappings should be created exactly once:
> >> - iommu_hwdom_init() comes first; it should avoid the first Mb,
> >> - pvh_populate_p2m() should insert identity mappings only into ranges
> >>   not populated as RAM,
> >> - pvh_setup_acpi() should again avoid the first Mb, which was already
> >>   dealt with at that point.
> > 
> > This really is a mess.  It also seems very fragile.
> 
> So it seems to me.
> 
> > Why is iommu_hwdom_init() separate in the first place?  It only does
> > mappings up to 4G in the first place, and with this change, it is now
> > [1M-4G)
> 
> I guess we'll want to wait for Roger to return to shed some light on
> this.

iommu_hwdom_init should cover from [0, max_pdx], or 4G if max_pdx is
below that.

IIRC first PVH dom0 implementations used to have a behavior more
similar to PV in iommu_hwdom_init, as they would get almost everything
below 4GB that's not RAM identity mapped in the (IOMMU) page tables.

PVH dom0 has since diverged, and now iommu_hwdom_init just identity
maps reserved regions. We could likely move this somewhere else, but
given it's still shared with PV dom0 (by using the same command line
option, dom0-iommu=map-reserved) I think it would just make option
handling more confusing.

One way to simplify things would be to rely on the hardware provided
memory map to correctly have the regions in the low 1M marked as
reserved, so that iommu_hwdom_init would identity map them. Then we
would just need a bit of special handling to duplicate the data in RAM
regions for the low 1M but we could avoid a lot of complexity.  This
however requires trusting the hardware is not missing required regions
in the low 1M.

Another option might be to slightly modify hwdom_iommu_map so that for
PVH it returns true for all non-RAM regions in the low 1M. That would
avoid having to add another loop in pvh_populate_p2m to map those.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-31 15:38         ` Jan Beulich
  2021-09-01  8:08           ` Jan Beulich
@ 2021-09-01  8:50           ` Roger Pau Monné
  2021-09-01  9:53             ` Jan Beulich
  2021-09-01 12:47           ` Andrew Cooper
  2 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2021-09-01  8:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Tue, Aug 31, 2021 at 05:38:49PM +0200, Jan Beulich wrote:
> On 31.08.2021 17:25, Andrew Cooper wrote:
> > On 31/08/2021 14:26, Jan Beulich wrote:
> >> On 31.08.2021 15:16, Andrew Cooper wrote:
> >>> On 30/08/2021 14:02, Jan Beulich wrote:
> >>>> Further permit "access" to differ in the "executable" attribute. While
> >>>> ideally only ROM regions would get mapped with X set, getting there is
> >>>> quite a bit of work. Therefore, as a temporary measure, permit X to
> >>>> vary. For Dom0 the more permissive of the types will be used, while for
> >>>> DomU it'll be the more restrictive one.
> >>> Split behaviour between dom0 and domU based on types alone cannot
> >>> possibly be correct.
> >> True, but what do you do.
> >>
> >>> DomU's need to execute ROMs too, and this looks like will malfunction if
> >>> a ROM ends up in the region that HVMLoader relocated RAM from.
> >>>
> >>> As this is a temporary bodge emergency bugfix, don't try to be clever -
> >>> just take the latest access.
> >> And how do we know that that's what is going to work?
> > 
> > Because it's the pre-existing behaviour.
> 
> Valid point. But for the DomU case there simply has not been any
> pre-existing behavior. Hence my desire to be restrictive initially
> there.
> 
> >>  We should
> >> strictly accumulate for Dom0. And what we do for DomU is moot for
> >> the moment, until PCI passthrough becomes a thing for PVH. Hence
> >> I've opted to be restrictive there - I'd rather see things break
> >> (and getting adjusted) when this future work actually gets carried
> >> out, than leave things permissive for no-one to notice that it's
> >> too permissive, leading to an XSA.
> > 
> > Restricting execute permissions is something unique to virt.  It doesn't
> > exist in a non-virtualised system, as I and D side reads are
> > indistinguishable outside of the core.
> > 
> > Furthermore, it is inexpressible on some systems/configurations.
> > 
> > Introspection is the only technology which should be restricting execute
> > permissions in the p2m, and only when it takes responsibility for
> > dealing with the fallout.
> 
> IOW are you saying that the calls to set_identity_p2m_entry()
> (pre-dating XSA-378) were wrong to use p2m_access_rw? Because that's
> what's getting the way here.

I did wonder this before, because I saw some messages on a couple of
systems about mappings override, and I'm not sure why we need to use
p2m_access_rw. My first thought was to suggest to switch to use the
default access type for the domain, like set_mmio_p2m_entry does.

I have to admit I'm not sure I see the point of preventing execution,
but it's possible I'm missing something.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-09-01  8:50           ` Roger Pau Monné
@ 2021-09-01  9:53             ` Jan Beulich
  2021-09-01 13:48               ` Roger Pau Monné
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-09-01  9:53 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 01.09.2021 10:50, Roger Pau Monné wrote:
> On Tue, Aug 31, 2021 at 05:38:49PM +0200, Jan Beulich wrote:
>> On 31.08.2021 17:25, Andrew Cooper wrote:
>>> On 31/08/2021 14:26, Jan Beulich wrote:
>>>> On 31.08.2021 15:16, Andrew Cooper wrote:
>>>>> On 30/08/2021 14:02, Jan Beulich wrote:
>>>>>> Further permit "access" to differ in the "executable" attribute. While
>>>>>> ideally only ROM regions would get mapped with X set, getting there is
>>>>>> quite a bit of work. Therefore, as a temporary measure, permit X to
>>>>>> vary. For Dom0 the more permissive of the types will be used, while for
>>>>>> DomU it'll be the more restrictive one.
>>>>> Split behaviour between dom0 and domU based on types alone cannot
>>>>> possibly be correct.
>>>> True, but what do you do.
>>>>
>>>>> DomU's need to execute ROMs too, and this looks like will malfunction if
>>>>> a ROM ends up in the region that HVMLoader relocated RAM from.
>>>>>
>>>>> As this is a temporary bodge emergency bugfix, don't try to be clever -
>>>>> just take the latest access.
>>>> And how do we know that that's what is going to work?
>>>
>>> Because it's the pre-existing behaviour.
>>
>> Valid point. But for the DomU case there simply has not been any
>> pre-existing behavior. Hence my desire to be restrictive initially
>> there.
>>
>>>>  We should
>>>> strictly accumulate for Dom0. And what we do for DomU is moot for
>>>> the moment, until PCI passthrough becomes a thing for PVH. Hence
>>>> I've opted to be restrictive there - I'd rather see things break
>>>> (and getting adjusted) when this future work actually gets carried
>>>> out, than leave things permissive for no-one to notice that it's
>>>> too permissive, leading to an XSA.
>>>
>>> Restricting execute permissions is something unique to virt.  It doesn't
>>> exist in a non-virtualised system, as I and D side reads are
>>> indistinguishable outside of the core.
>>>
>>> Furthermore, it is inexpressible on some systems/configurations.
>>>
>>> Introspection is the only technology which should be restricting execute
>>> permissions in the p2m, and only when it takes responsibility for
>>> dealing with the fallout.
>>
>> IOW are you saying that the calls to set_identity_p2m_entry()
>> (pre-dating XSA-378) were wrong to use p2m_access_rw? Because that's
>> what's getting the way here.
> 
> I did wonder this before, because I saw some messages on a couple of
> systems about mappings override, and I'm not sure why we need to use
> p2m_access_rw. My first thought was to suggest to switch to use the
> default access type for the domain, like set_mmio_p2m_entry does.
> 
> I have to admit I'm not sure I see the point of preventing execution,
> but it's possible I'm missing something.

Well, what good can come from allowing execution from, say, the
IO-APIC or LAPIC pages? Or other MMIO-mapped register space? Insn
fetches might even trip bad hardware behavior in such a case by
being the wrong granularity. It's imo really only ROM space which
ought to have execution permitted.

The issue isn't just with execution, though, and as a result I may
need to change the logic here to also include at least W. As of
one of the XSA-378 changes we may now pass just p2m_access_r to
iommu_identity_mapping(), if the ACPI tables on an AMD system were
saying so. (We may also pass p2m_access_w, but I sincerely hope no
firmware would specify write but no read access.)

Similarly in "IOMMU/x86: restrict IO-APIC mappings for PV Dom0" I
now pass p2m_access_r to set_identity_p2m_entry().

I suppose an underlying issue is the mixed purpose of using
p2m_access_*, which possibly has been against the intentions in the
first place. We cannot, for example, express r/o access to an MMIO
page without using p2m_access_r (or p2m_access_rx), as there's no
suitable p2m type to express this via type alone. We may need to
split p2m_mmio_direct into multiple types (up to 7), I guess, if
we wanted to remove this (ab)use of p2m_access_*.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory
  2021-08-31 13:36           ` Jan Beulich
@ 2021-09-01 11:49             ` Andrew Cooper
  0 siblings, 0 replies; 40+ messages in thread
From: Andrew Cooper @ 2021-09-01 11:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31/08/2021 14:36, Jan Beulich wrote:
> On 31.08.2021 15:27, Andrew Cooper wrote:
>> On 31/08/2021 14:19, Jan Beulich wrote:
>>>>>> @@ -1095,6 +1101,17 @@ static int __init pvh_setup_acpi(struct
>>>>>>          nr_pages = PFN_UP((d->arch.e820[i].addr & ~PAGE_MASK) +
>>>>>>                            d->arch.e820[i].size);
>>>>>>  
>>>>>> +        /* Memory below 1MB has been dealt with by pvh_populate_p2m(). */
>>>>>> +        if ( pfn < PFN_DOWN(MB(1)) )
>>>>>> +        {
>>>>>> +            if ( pfn + nr_pages <= PFN_DOWN(MB(1)) )
>>>>>> +                continue;
>>>>>> +
>>>>>> +            /* This shouldn't happen, but is easy to deal with. */
>>>>> I'm not sure this comment is helpful.
>>>>>
>>>>> Under PVH, there is nothing special about the 1M boundary, and we can
>>>>> reasonably have something else here or crossing the boundary.
>>>> As long as we have this special treatment of the low Mb, the boundary
>>>> is meaningful imo. I'd see the comment go away when the handling in
>>>> general gets streamlined.
>>> I should have added: And as long as Dom0's E820 map gets cloned from
>>> the host's, which will necessarily consider the 1Mb boundary special.
>> Not when you're booting virtualised in the first place.
> You mean when Xen itself runs in PVH (not HVM) mode, and then in turn
> has a PVH Dom0? And if the underlying Xen has not in turn cloned
> the E820 from the host's? That's surely too exotic a case to warrant
> removing this comment. If you insist, I can mention that case as a
> possible exception.

It's a long way from exotic.

Also the magic surrounding the 1M boundary is disappearing on real
hardware with legacy BIOS support being dropped from platforms.


The comment is misleading and should be dropped.  The logic is still
perfectly clear given the outer comment.

~Andrew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-08-31 15:38         ` Jan Beulich
  2021-09-01  8:08           ` Jan Beulich
  2021-09-01  8:50           ` Roger Pau Monné
@ 2021-09-01 12:47           ` Andrew Cooper
  2021-09-01 13:08             ` Jan Beulich
  2 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-09-01 12:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 31/08/2021 16:38, Jan Beulich wrote:
> On 31.08.2021 17:25, Andrew Cooper wrote:
>> On 31/08/2021 14:26, Jan Beulich wrote:
>>> On 31.08.2021 15:16, Andrew Cooper wrote:
>>>> On 30/08/2021 14:02, Jan Beulich wrote:
>>>>> Further permit "access" to differ in the "executable" attribute. While
>>>>> ideally only ROM regions would get mapped with X set, getting there is
>>>>> quite a bit of work. Therefore, as a temporary measure, permit X to
>>>>> vary. For Dom0 the more permissive of the types will be used, while for
>>>>> DomU it'll be the more restrictive one.
>>>> Split behaviour between dom0 and domU based on types alone cannot
>>>> possibly be correct.
>>> True, but what do you do.
>>>
>>>> DomU's need to execute ROMs too, and this looks like will malfunction if
>>>> a ROM ends up in the region that HVMLoader relocated RAM from.
>>>>
>>>> As this is a temporary bodge emergency bugfix, don't try to be clever -
>>>> just take the latest access.
>>> And how do we know that that's what is going to work?
>> Because it's the pre-existing behaviour.
> Valid point. But for the DomU case there simply has not been any
> pre-existing behavior. Hence my desire to be restrictive initially
> there.

But you're conflating a feature (under question anyway, because I gave
you an example where I expect this will collide in a regular domU
already), with an emergency bugfix to unbreak staging caused by an
unexpected interaction in a security hotfix.

At an absolute minimum, this patch needs splitting in two to separate
the bugfix from the proposed feature.

>>>  We should
>>> strictly accumulate for Dom0. And what we do for DomU is moot for
>>> the moment, until PCI passthrough becomes a thing for PVH. Hence
>>> I've opted to be restrictive there - I'd rather see things break
>>> (and getting adjusted) when this future work actually gets carried
>>> out, than leave things permissive for no-one to notice that it's
>>> too permissive, leading to an XSA.
>> Restricting execute permissions is something unique to virt.  It doesn't
>> exist in a non-virtualised system, as I and D side reads are
>> indistinguishable outside of the core.
>>
>> Furthermore, it is inexpressible on some systems/configurations.
>>
>> Introspection is the only technology which should be restricting execute
>> permissions in the p2m, and only when it takes responsibility for
>> dealing with the fallout.
> IOW are you saying that the calls to set_identity_p2m_entry()
> (pre-dating XSA-378) were wrong to use p2m_access_rw?

Yes.

>  Because that's
> what's getting the way here.

On a real machine, you really can write some executable code into an
E820 reserved region and jump to it.  You can also execute code from
real BARs is you happen to know that they are prefetchable (or you're a
glutton for UC reads...)

And there is the WPBT ACPI table which exists specifically to let
firmware inject drivers/applications into a windows environment, and may
come out of the SPI ROM in the first place.


Is it sensible to execute an E820 reserved region, or unmarked BAR? 
Probably not.

Should it work, because that's how real hardware behaves?  Absolutely.

Any restrictions beyond that want handling by some kind of introspection
agent which has a policy of what to do with legal-but-dodgy-looking actions.

> Plus, as a side note, then we don't even have e.g. IOMMUF_executable.

Just as I vs D side reads are indistinguishable outside of the CPU core,
the same is in principle true for PCI devices which execute code (e.g.
GPU shaders).  Reads on the bus are just reads.

That said, the latest SIOV spec does appear to include the ER (Execution
Request) bit for use with PASSID/Shared-Virtual-Memory, which interacts
with the EPT-X/ia32-NX bits, and goes as far as having SMEP/SMAP bits in
the IOMMU configuration.  I'm not sure if this is in released hardware
yet, but it's clearly on the horizon.  I can't spot any execute related
controls in the AMD IOMMU spec, although it does have user/supervisor
for PASSID/SVM.

~Andrew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-09-01 12:47           ` Andrew Cooper
@ 2021-09-01 13:08             ` Jan Beulich
  2021-09-06 19:53               ` Andrew Cooper
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-09-01 13:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 01.09.2021 14:47, Andrew Cooper wrote:
> On 31/08/2021 16:38, Jan Beulich wrote:
>> On 31.08.2021 17:25, Andrew Cooper wrote:
>>> On 31/08/2021 14:26, Jan Beulich wrote:
>>>> On 31.08.2021 15:16, Andrew Cooper wrote:
>>>>> On 30/08/2021 14:02, Jan Beulich wrote:
>>>>>> Further permit "access" to differ in the "executable" attribute. While
>>>>>> ideally only ROM regions would get mapped with X set, getting there is
>>>>>> quite a bit of work. Therefore, as a temporary measure, permit X to
>>>>>> vary. For Dom0 the more permissive of the types will be used, while for
>>>>>> DomU it'll be the more restrictive one.
>>>>> Split behaviour between dom0 and domU based on types alone cannot
>>>>> possibly be correct.
>>>> True, but what do you do.
>>>>
>>>>> DomU's need to execute ROMs too, and this looks like will malfunction if
>>>>> a ROM ends up in the region that HVMLoader relocated RAM from.
>>>>>
>>>>> As this is a temporary bodge emergency bugfix, don't try to be clever -
>>>>> just take the latest access.
>>>> And how do we know that that's what is going to work?
>>> Because it's the pre-existing behaviour.
>> Valid point. But for the DomU case there simply has not been any
>> pre-existing behavior. Hence my desire to be restrictive initially
>> there.
> 
> But you're conflating a feature (under question anyway, because I gave
> you an example where I expect this will collide in a regular domU
> already),

I don't think your example fits: hvmloader moving RAM will first
convert the p2m slot to non-present. Then a ROM page can get mapped
there quite fine. A direct transition (without going through n/p)
would not work independent of the change here: The MFNs would
differ, as would the p2m types.

> with an emergency bugfix to unbreak staging caused by an
> unexpected interaction in a security hotfix.
> 
> At an absolute minimum, this patch needs splitting in two to separate
> the bugfix from the proposed feature.

Well, okay, I will split the patch, despite not being convinced this
will do us any good - we'd backport just the part you consider a bug
fix, but not the part you deem a feature (and which I consider part
of the bug fix).

>>>>  We should
>>>> strictly accumulate for Dom0. And what we do for DomU is moot for
>>>> the moment, until PCI passthrough becomes a thing for PVH. Hence
>>>> I've opted to be restrictive there - I'd rather see things break
>>>> (and getting adjusted) when this future work actually gets carried
>>>> out, than leave things permissive for no-one to notice that it's
>>>> too permissive, leading to an XSA.

Actually I think I was missing an important aspect here: The code in
question gets used not only for PVH, but also for HVM, where pass-
through is a thing. Hence I'll restrict the "feature" part to Dom0
for now.

>>> Restricting execute permissions is something unique to virt.  It doesn't
>>> exist in a non-virtualised system, as I and D side reads are
>>> indistinguishable outside of the core.
>>>
>>> Furthermore, it is inexpressible on some systems/configurations.
>>>
>>> Introspection is the only technology which should be restricting execute
>>> permissions in the p2m, and only when it takes responsibility for
>>> dealing with the fallout.
>> IOW are you saying that the calls to set_identity_p2m_entry()
>> (pre-dating XSA-378) were wrong to use p2m_access_rw?
> 
> Yes.
> 
>>  Because that's
>> what's getting the way here.
> 
> On a real machine, you really can write some executable code into an
> E820 reserved region and jump to it.  You can also execute code from
> real BARs is you happen to know that they are prefetchable (or you're a
> glutton for UC reads...)
> 
> And there is the WPBT ACPI table which exists specifically to let
> firmware inject drivers/applications into a windows environment, and may
> come out of the SPI ROM in the first place.
> 
> 
> Is it sensible to execute an E820 reserved region, or unmarked BAR? 
> Probably not.
> 
> Should it work, because that's how real hardware behaves?  Absolutely.
> 
> Any restrictions beyond that want handling by some kind of introspection
> agent which has a policy of what to do with legal-but-dodgy-looking actions.

IOW you suggest we remove p2m_access_t parameters from various functions,
going with just default access? Of course, as pointed out in another
reply, we'll need to split p2m_mmio_direct into multiple types then, at
the very least to honor the potential r/o restriction of AMD IOMMU unity
mapped regions. (FAOD all of this isn't a short term plan anyway, at least
afaic.)

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-09-01  9:53             ` Jan Beulich
@ 2021-09-01 13:48               ` Roger Pau Monné
  2021-09-01 14:05                 ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2021-09-01 13:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Wed, Sep 01, 2021 at 11:53:03AM +0200, Jan Beulich wrote:
> On 01.09.2021 10:50, Roger Pau Monné wrote:
> > On Tue, Aug 31, 2021 at 05:38:49PM +0200, Jan Beulich wrote:
> >> On 31.08.2021 17:25, Andrew Cooper wrote:
> >>> On 31/08/2021 14:26, Jan Beulich wrote:
> >>>> On 31.08.2021 15:16, Andrew Cooper wrote:
> >>>>> On 30/08/2021 14:02, Jan Beulich wrote:
> >>>>>> Further permit "access" to differ in the "executable" attribute. While
> >>>>>> ideally only ROM regions would get mapped with X set, getting there is
> >>>>>> quite a bit of work. Therefore, as a temporary measure, permit X to
> >>>>>> vary. For Dom0 the more permissive of the types will be used, while for
> >>>>>> DomU it'll be the more restrictive one.
> >>>>> Split behaviour between dom0 and domU based on types alone cannot
> >>>>> possibly be correct.
> >>>> True, but what do you do.
> >>>>
> >>>>> DomU's need to execute ROMs too, and this looks like will malfunction if
> >>>>> a ROM ends up in the region that HVMLoader relocated RAM from.
> >>>>>
> >>>>> As this is a temporary bodge emergency bugfix, don't try to be clever -
> >>>>> just take the latest access.
> >>>> And how do we know that that's what is going to work?
> >>>
> >>> Because it's the pre-existing behaviour.
> >>
> >> Valid point. But for the DomU case there simply has not been any
> >> pre-existing behavior. Hence my desire to be restrictive initially
> >> there.
> >>
> >>>>  We should
> >>>> strictly accumulate for Dom0. And what we do for DomU is moot for
> >>>> the moment, until PCI passthrough becomes a thing for PVH. Hence
> >>>> I've opted to be restrictive there - I'd rather see things break
> >>>> (and getting adjusted) when this future work actually gets carried
> >>>> out, than leave things permissive for no-one to notice that it's
> >>>> too permissive, leading to an XSA.
> >>>
> >>> Restricting execute permissions is something unique to virt.  It doesn't
> >>> exist in a non-virtualised system, as I and D side reads are
> >>> indistinguishable outside of the core.
> >>>
> >>> Furthermore, it is inexpressible on some systems/configurations.
> >>>
> >>> Introspection is the only technology which should be restricting execute
> >>> permissions in the p2m, and only when it takes responsibility for
> >>> dealing with the fallout.
> >>
> >> IOW are you saying that the calls to set_identity_p2m_entry()
> >> (pre-dating XSA-378) were wrong to use p2m_access_rw? Because that's
> >> what's getting the way here.
> > 
> > I did wonder this before, because I saw some messages on a couple of
> > systems about mappings override, and I'm not sure why we need to use
> > p2m_access_rw. My first thought was to suggest to switch to use the
> > default access type for the domain, like set_mmio_p2m_entry does.
> > 
> > I have to admit I'm not sure I see the point of preventing execution,
> > but it's possible I'm missing something.
> 
> Well, what good can come from allowing execution from, say, the
> IO-APIC or LAPIC pages? Or other MMIO-mapped register space?

map_mmio_regions does already map BARs with execute permissions, so
it's just some MMIO regions that get mapped without execution
permissions, which makes all this confusing.

> Insn
> fetches might even trip bad hardware behavior in such a case by
> being the wrong granularity.

Normal reads could also trigger such bad hardware behavior, so I'm not
sure preventing execution provides Xen any more safety.

> It's imo really only ROM space which
> ought to have execution permitted.
> 
> The issue isn't just with execution, though, and as a result I may
> need to change the logic here to also include at least W. As of
> one of the XSA-378 changes we may now pass just p2m_access_r to
> iommu_identity_mapping(), if the ACPI tables on an AMD system were
> saying so. (We may also pass p2m_access_w, but I sincerely hope no
> firmware would specify write but no read access.)
> 
> Similarly in "IOMMU/x86: restrict IO-APIC mappings for PV Dom0" I
> now pass p2m_access_r to set_identity_p2m_entry().

Not really I think, as PVH dom0 is the only user of the
set_identity_p2m_entry call in arch_iommu_hwdom_init, and we should
never identity map the IO-APIC range in that case because a set of
emulated IO-APIC replacements are provided and those require ranges to
be unmapped so that accesses can be trapped.

> 
> I suppose an underlying issue is the mixed purpose of using
> p2m_access_*, which possibly has been against the intentions in the
> first place. We cannot, for example, express r/o access to an MMIO
> page without using p2m_access_r (or p2m_access_rx), as there's no
> suitable p2m type to express this via type alone. We may need to
> split p2m_mmio_direct into multiple types (up to 7), I guess, if
> we wanted to remove this (ab)use of p2m_access_*.

My main complaint is mostly with the fact that some MMIO ranges are
mapped without execute permissions when mapped by
set_identity_p2m_entry vs map_mmio_regions that will map them with the
default permissions and that has execution set.

If the mappings from arch_iommu_hwdom_init for example would use the
default permissions that could solve quite a lot of the issues there
AFAIC.

Roger.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-08-31  8:53 ` [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
@ 2021-09-01 13:56   ` Roger Pau Monné
  2021-09-01 14:19     ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2021-09-01 13:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Tue, Aug 31, 2021 at 10:53:59AM +0200, Jan Beulich wrote:
> On 30.08.2021 15:01, Jan Beulich wrote:
> > The code building PVH Dom0 made use of sequences of P2M changes
> > which are disallowed as of XSA-378. First of all population of the
> > first Mb of memory needs to be redone. Then, largely as a
> > workaround, checking introduced by XSA-378 needs to be slightly
> > relaxed.
> > 
> > Note that with these adjustments I get Dom0 to start booting on my
> > development system, but the Dom0 kernel then gets stuck. Since it
> > was the first time for me to try PVH Dom0 in this context (see
> > below for why I was hesitant), I cannot tell yet whether this is
> > due further fallout from the XSA, or some further unrelated
> > problem.

Iff you have some time could you check without the XSA applied? I have
to admit I haven't been testing staging, so it's possible some
breakage as slipped in (however osstest seemed fine with it).

> > Dom0's BSP is in VPF_blocked state while all APs are
> > still in VPF_down. The 'd' debug key, unhelpfully, doesn't produce
> > any output, so it's non-trivial to check whether (like PV likes to
> > do) Dom0 has panic()ed without leaving any (visible) output.

Not sure it would help much, but maybe you can post the Xen+Linux
output?

Do you have iommu debug/verbose enabled to catch iommu faults?

> Correction: I did mean '0' here, producing merely
> 
> (XEN) '0' pressed -> dumping Dom0's registers
> (XEN) *** Dumping Dom0 vcpu#0 state: ***
> (XEN) *** Dumping Dom0 vcpu#1 state: ***
> (XEN) *** Dumping Dom0 vcpu#2 state: ***
> (XEN) *** Dumping Dom0 vcpu#3 state: ***
> 
> 'd' output supports the "system is idle" that was also visible from
> 'q' output.

Can you dump the state of the VMCS and see where the IP points to in
Linux?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-09-01 13:48               ` Roger Pau Monné
@ 2021-09-01 14:05                 ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-09-01 14:05 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 01.09.2021 15:48, Roger Pau Monné wrote:
> On Wed, Sep 01, 2021 at 11:53:03AM +0200, Jan Beulich wrote:
>> The issue isn't just with execution, though, and as a result I may
>> need to change the logic here to also include at least W. As of
>> one of the XSA-378 changes we may now pass just p2m_access_r to
>> iommu_identity_mapping(), if the ACPI tables on an AMD system were
>> saying so. (We may also pass p2m_access_w, but I sincerely hope no
>> firmware would specify write but no read access.)
>>
>> Similarly in "IOMMU/x86: restrict IO-APIC mappings for PV Dom0" I
>> now pass p2m_access_r to set_identity_p2m_entry().
> 
> Not really I think, as PVH dom0 is the only user of the
> set_identity_p2m_entry call in arch_iommu_hwdom_init, and we should
> never identity map the IO-APIC range in that case because a set of
> emulated IO-APIC replacements are provided and those require ranges to
> be unmapped so that accesses can be trapped.

Correct, this would be only a latent issue for now. The code passes
that value, but that path is not going to be taken (until, perhaps,
a future change).

>> I suppose an underlying issue is the mixed purpose of using
>> p2m_access_*, which possibly has been against the intentions in the
>> first place. We cannot, for example, express r/o access to an MMIO
>> page without using p2m_access_r (or p2m_access_rx), as there's no
>> suitable p2m type to express this via type alone. We may need to
>> split p2m_mmio_direct into multiple types (up to 7), I guess, if
>> we wanted to remove this (ab)use of p2m_access_*.
> 
> My main complaint is mostly with the fact that some MMIO ranges are
> mapped without execute permissions when mapped by
> set_identity_p2m_entry vs map_mmio_regions that will map them with the
> default permissions and that has execution set.

Right - as said in a reply to Andrew, we may need to drop some
p2m_access_t parameters in favor of using default permissions. But
at least for AMD's IOMMU-unity-mapped ranges we will need to have
a way to express at least r/o.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-09-01 13:56   ` Roger Pau Monné
@ 2021-09-01 14:19     ` Jan Beulich
  2021-09-01 14:25       ` Jan Beulich
  2021-09-01 16:13       ` Roger Pau Monné
  0 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2021-09-01 14:19 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 01.09.2021 15:56, Roger Pau Monné wrote:
> On Tue, Aug 31, 2021 at 10:53:59AM +0200, Jan Beulich wrote:
>> On 30.08.2021 15:01, Jan Beulich wrote:
>>> The code building PVH Dom0 made use of sequences of P2M changes
>>> which are disallowed as of XSA-378. First of all population of the
>>> first Mb of memory needs to be redone. Then, largely as a
>>> workaround, checking introduced by XSA-378 needs to be slightly
>>> relaxed.
>>>
>>> Note that with these adjustments I get Dom0 to start booting on my
>>> development system, but the Dom0 kernel then gets stuck. Since it
>>> was the first time for me to try PVH Dom0 in this context (see
>>> below for why I was hesitant), I cannot tell yet whether this is
>>> due further fallout from the XSA, or some further unrelated
>>> problem.
> 
> Iff you have some time could you check without the XSA applied? I have
> to admit I haven't been testing staging, so it's possible some
> breakage as slipped in (however osstest seemed fine with it).

Well, I'd rather try to use the time to find the actual issue. From
osstest being fine I'm kind of inferring this might be machine
specific, or this might be due to yet some other of the overly many
patches I'm carrying. So if I can't infer anything from the stack
once I can actually dump that, I may indeed need to bisect my pile,
which would then also include the XSA-378 patches (as I didn't have
time to re-base so far).

>>> Dom0's BSP is in VPF_blocked state while all APs are
>>> still in VPF_down. The 'd' debug key, unhelpfully, doesn't produce
>>> any output, so it's non-trivial to check whether (like PV likes to
>>> do) Dom0 has panic()ed without leaving any (visible) output.
> 
> Not sure it would help much, but maybe you can post the Xen+Linux
> output?

There's no Linux output yet by that point (and either
"earlyprintk=xen" doesn't work in PVH mode, or it's even too early
for that). All Xen has to say is

(XEN) Dom0 callback via changed to Direct Vector 0xf3
(XEN) vmx.c:3265:d0v0 RDMSR 0x0000064e unimplemented
(XEN) vmx.c:3265:d0v0 RDMSR 0x00000034 unimplemented

> Do you have iommu debug/verbose enabled to catch iommu faults?

I'll try to remember to check that, but since Linux hasn't
brought up APs yet I don't think there's any device activity
just yet.

>> Correction: I did mean '0' here, producing merely
>>
>> (XEN) '0' pressed -> dumping Dom0's registers
>> (XEN) *** Dumping Dom0 vcpu#0 state: ***
>> (XEN) *** Dumping Dom0 vcpu#1 state: ***
>> (XEN) *** Dumping Dom0 vcpu#2 state: ***
>> (XEN) *** Dumping Dom0 vcpu#3 state: ***
>>
>> 'd' output supports the "system is idle" that was also visible from
>> 'q' output.
> 
> Can you dump the state of the VMCS and see where the IP points to in
> Linux?

Both that and the register dumping I have meanwhile working tell
me that it's the HLT in default_idle(). IOW Dom0 gives the impression
of also being idle, at the first glance. The stack pointer, however,
is farther away from the stack top than I would have expected, so it
may still have entered default_idle() for other reasons.

The VMCS also told me that the last VM entry was to deliver an
interrupt at vector 0xf3 (i.e. the "callback" one).

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-09-01 14:19     ` Jan Beulich
@ 2021-09-01 14:25       ` Jan Beulich
  2021-09-01 16:13       ` Roger Pau Monné
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-09-01 14:25 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 01.09.2021 16:19, Jan Beulich wrote:
> On 01.09.2021 15:56, Roger Pau Monné wrote:
>> Do you have iommu debug/verbose enabled to catch iommu faults?
> 
> I'll try to remember to check that, but since Linux hasn't
> brought up APs yet I don't think there's any device activity
> just yet.

No IOMMU faults, as expected.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-08-30 13:01 [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
                   ` (4 preceding siblings ...)
  2021-08-31  8:53 ` [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
@ 2021-09-01 15:06 ` Jan Beulich
  2021-09-01 15:24   ` Juergen Gross
  5 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2021-09-01 15:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

On 30.08.2021 15:01, Jan Beulich wrote:
> The code building PVH Dom0 made use of sequences of P2M changes
> which are disallowed as of XSA-378. First of all population of the
> first Mb of memory needs to be redone. Then, largely as a
> workaround, checking introduced by XSA-378 needs to be slightly
> relaxed.
> 
> Note that with these adjustments I get Dom0 to start booting on my
> development system, but the Dom0 kernel then gets stuck. Since it
> was the first time for me to try PVH Dom0 in this context (see
> below for why I was hesitant), I cannot tell yet whether this is
> due further fallout from the XSA, or some further unrelated
> problem. Dom0's BSP is in VPF_blocked state while all APs are
> still in VPF_down. The '0' debug key, unhelpfully, doesn't produce
> any output, so it's non-trivial to check whether (like PV likes to
> do) Dom0 has panic()ed without leaving any (visible) output.

Having made '0' work at least partly, I can now see that Dom0's
vCPU0 enters its idle loop after having gone through all normal
initialization. Clearly certain things must not have worked as
intended (no APs booted, no drivers loaded afaict), but I'm
having a hard time seeing how to find out what that might be
when there's no output at all. PV Dom0 does not require any
special command line option to do output to both the VGA console
and through hvc_xen (making its output also go to the serial
log) - is this perhaps different for PVH? I couldn't find
anything under docs/ ...

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-09-01 15:06 ` Jan Beulich
@ 2021-09-01 15:24   ` Juergen Gross
  2021-09-01 15:51     ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Juergen Gross @ 2021-09-01 15:24 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné


[-- Attachment #1.1.1: Type: text/plain, Size: 1670 bytes --]

On 01.09.21 17:06, Jan Beulich wrote:
> On 30.08.2021 15:01, Jan Beulich wrote:
>> The code building PVH Dom0 made use of sequences of P2M changes
>> which are disallowed as of XSA-378. First of all population of the
>> first Mb of memory needs to be redone. Then, largely as a
>> workaround, checking introduced by XSA-378 needs to be slightly
>> relaxed.
>>
>> Note that with these adjustments I get Dom0 to start booting on my
>> development system, but the Dom0 kernel then gets stuck. Since it
>> was the first time for me to try PVH Dom0 in this context (see
>> below for why I was hesitant), I cannot tell yet whether this is
>> due further fallout from the XSA, or some further unrelated
>> problem. Dom0's BSP is in VPF_blocked state while all APs are
>> still in VPF_down. The '0' debug key, unhelpfully, doesn't produce
>> any output, so it's non-trivial to check whether (like PV likes to
>> do) Dom0 has panic()ed without leaving any (visible) output.
> 
> Having made '0' work at least partly, I can now see that Dom0's
> vCPU0 enters its idle loop after having gone through all normal
> initialization. Clearly certain things must not have worked as
> intended (no APs booted, no drivers loaded afaict), but I'm
> having a hard time seeing how to find out what that might be
> when there's no output at all. PV Dom0 does not require any
> special command line option to do output to both the VGA console
> and through hvc_xen (making its output also go to the serial
> log) - is this perhaps different for PVH? I couldn't find
> anything under docs/ ...

Did you add earlyprintk=xen to the dom0 boot parameters?


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-09-01 15:24   ` Juergen Gross
@ 2021-09-01 15:51     ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-09-01 15:51 UTC (permalink / raw)
  To: Juergen Gross; +Cc: Andrew Cooper, xen-devel, Wei Liu, Roger Pau Monné

On 01.09.2021 17:24, Juergen Gross wrote:
> On 01.09.21 17:06, Jan Beulich wrote:
>> On 30.08.2021 15:01, Jan Beulich wrote:
>>> The code building PVH Dom0 made use of sequences of P2M changes
>>> which are disallowed as of XSA-378. First of all population of the
>>> first Mb of memory needs to be redone. Then, largely as a
>>> workaround, checking introduced by XSA-378 needs to be slightly
>>> relaxed.
>>>
>>> Note that with these adjustments I get Dom0 to start booting on my
>>> development system, but the Dom0 kernel then gets stuck. Since it
>>> was the first time for me to try PVH Dom0 in this context (see
>>> below for why I was hesitant), I cannot tell yet whether this is
>>> due further fallout from the XSA, or some further unrelated
>>> problem. Dom0's BSP is in VPF_blocked state while all APs are
>>> still in VPF_down. The '0' debug key, unhelpfully, doesn't produce
>>> any output, so it's non-trivial to check whether (like PV likes to
>>> do) Dom0 has panic()ed without leaving any (visible) output.
>>
>> Having made '0' work at least partly, I can now see that Dom0's
>> vCPU0 enters its idle loop after having gone through all normal
>> initialization. Clearly certain things must not have worked as
>> intended (no APs booted, no drivers loaded afaict), but I'm
>> having a hard time seeing how to find out what that might be
>> when there's no output at all. PV Dom0 does not require any
>> special command line option to do output to both the VGA console
>> and through hvc_xen (making its output also go to the serial
>> log) - is this perhaps different for PVH? I couldn't find
>> anything under docs/ ...
> 
> Did you add earlyprintk=xen to the dom0 boot parameters?

Yes (I did mention this before) - no difference at all. I guess I'll
try again, just in case I made a stupid mistake.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-09-01 14:19     ` Jan Beulich
  2021-09-01 14:25       ` Jan Beulich
@ 2021-09-01 16:13       ` Roger Pau Monné
  2021-09-02  6:30         ` Jan Beulich
  1 sibling, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2021-09-01 16:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Wed, Sep 01, 2021 at 04:19:40PM +0200, Jan Beulich wrote:
> On 01.09.2021 15:56, Roger Pau Monné wrote:
> > On Tue, Aug 31, 2021 at 10:53:59AM +0200, Jan Beulich wrote:
> >> On 30.08.2021 15:01, Jan Beulich wrote:
> >>> The code building PVH Dom0 made use of sequences of P2M changes
> >>> which are disallowed as of XSA-378. First of all population of the
> >>> first Mb of memory needs to be redone. Then, largely as a
> >>> workaround, checking introduced by XSA-378 needs to be slightly
> >>> relaxed.
> >>>
> >>> Note that with these adjustments I get Dom0 to start booting on my
> >>> development system, but the Dom0 kernel then gets stuck. Since it
> >>> was the first time for me to try PVH Dom0 in this context (see
> >>> below for why I was hesitant), I cannot tell yet whether this is
> >>> due further fallout from the XSA, or some further unrelated
> >>> problem.
> > 
> > Iff you have some time could you check without the XSA applied? I have
> > to admit I haven't been testing staging, so it's possible some
> > breakage as slipped in (however osstest seemed fine with it).
> 
> Well, I'd rather try to use the time to find the actual issue. From
> osstest being fine I'm kind of inferring this might be machine
> specific, or this might be due to yet some other of the overly many
> patches I'm carrying. So if I can't infer anything from the stack
> once I can actually dump that, I may indeed need to bisect my pile,
> which would then also include the XSA-378 patches (as I didn't have
> time to re-base so far).
> 
> >>> Dom0's BSP is in VPF_blocked state while all APs are
> >>> still in VPF_down. The 'd' debug key, unhelpfully, doesn't produce
> >>> any output, so it's non-trivial to check whether (like PV likes to
> >>> do) Dom0 has panic()ed without leaving any (visible) output.
> > 
> > Not sure it would help much, but maybe you can post the Xen+Linux
> > output?
> 
> There's no Linux output yet by that point (and either
> "earlyprintk=xen" doesn't work in PVH mode, or it's even too early
> for that). All Xen has to say is
> 
> (XEN) Dom0 callback via changed to Direct Vector 0xf3
> (XEN) vmx.c:3265:d0v0 RDMSR 0x0000064e unimplemented
> (XEN) vmx.c:3265:d0v0 RDMSR 0x00000034 unimplemented

Weird, I don't see why earlyprintk=xen shouldn't work in PVH mode,
unless it's not properly wired up. Certainly needs checking and
fixing, or else we won't be able to make much progress I think.

> > Do you have iommu debug/verbose enabled to catch iommu faults?
> 
> I'll try to remember to check that, but since Linux hasn't
> brought up APs yet I don't think there's any device activity
> just yet.
> 
> >> Correction: I did mean '0' here, producing merely
> >>
> >> (XEN) '0' pressed -> dumping Dom0's registers
> >> (XEN) *** Dumping Dom0 vcpu#0 state: ***
> >> (XEN) *** Dumping Dom0 vcpu#1 state: ***
> >> (XEN) *** Dumping Dom0 vcpu#2 state: ***
> >> (XEN) *** Dumping Dom0 vcpu#3 state: ***
> >>
> >> 'd' output supports the "system is idle" that was also visible from
> >> 'q' output.
> > 
> > Can you dump the state of the VMCS and see where the IP points to in
> > Linux?
> 
> Both that and the register dumping I have meanwhile working tell
> me that it's the HLT in default_idle(). IOW Dom0 gives the impression
> of also being idle, at the first glance. The stack pointer, however,
> is farther away from the stack top than I would have expected, so it
> may still have entered default_idle() for other reasons.
> 
> The VMCS also told me that the last VM entry was to deliver an
> interrupt at vector 0xf3 (i.e. the "callback" one).

That's all quite weird. Did dom0 setup the vCPU timer? What version of
Linux are you using?

It seems to get stuck very early (or either fail to output anything
while booting), which seems unlikely to be related to your specific
hardware.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
  2021-09-01 16:13       ` Roger Pau Monné
@ 2021-09-02  6:30         ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-09-02  6:30 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 01.09.2021 18:13, Roger Pau Monné wrote:
> On Wed, Sep 01, 2021 at 04:19:40PM +0200, Jan Beulich wrote:
>> On 01.09.2021 15:56, Roger Pau Monné wrote:
>>> On Tue, Aug 31, 2021 at 10:53:59AM +0200, Jan Beulich wrote:
>>>> On 30.08.2021 15:01, Jan Beulich wrote:
>>>>> The code building PVH Dom0 made use of sequences of P2M changes
>>>>> which are disallowed as of XSA-378. First of all population of the
>>>>> first Mb of memory needs to be redone. Then, largely as a
>>>>> workaround, checking introduced by XSA-378 needs to be slightly
>>>>> relaxed.
>>>>>
>>>>> Note that with these adjustments I get Dom0 to start booting on my
>>>>> development system, but the Dom0 kernel then gets stuck. Since it
>>>>> was the first time for me to try PVH Dom0 in this context (see
>>>>> below for why I was hesitant), I cannot tell yet whether this is
>>>>> due further fallout from the XSA, or some further unrelated
>>>>> problem.
>>>
>>> Iff you have some time could you check without the XSA applied? I have
>>> to admit I haven't been testing staging, so it's possible some
>>> breakage as slipped in (however osstest seemed fine with it).
>>
>> Well, I'd rather try to use the time to find the actual issue. From
>> osstest being fine I'm kind of inferring this might be machine
>> specific, or this might be due to yet some other of the overly many
>> patches I'm carrying. So if I can't infer anything from the stack
>> once I can actually dump that, I may indeed need to bisect my pile,
>> which would then also include the XSA-378 patches (as I didn't have
>> time to re-base so far).
>>
>>>>> Dom0's BSP is in VPF_blocked state while all APs are
>>>>> still in VPF_down. The 'd' debug key, unhelpfully, doesn't produce
>>>>> any output, so it's non-trivial to check whether (like PV likes to
>>>>> do) Dom0 has panic()ed without leaving any (visible) output.
>>>
>>> Not sure it would help much, but maybe you can post the Xen+Linux
>>> output?
>>
>> There's no Linux output yet by that point (and either
>> "earlyprintk=xen" doesn't work in PVH mode, or it's even too early
>> for that). All Xen has to say is
>>
>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
>> (XEN) vmx.c:3265:d0v0 RDMSR 0x0000064e unimplemented
>> (XEN) vmx.c:3265:d0v0 RDMSR 0x00000034 unimplemented
> 
> Weird, I don't see why earlyprintk=xen shouldn't work in PVH mode,
> unless it's not properly wired up. Certainly needs checking and
> fixing, or else we won't be able to make much progress I think.

Right - I'm intending to check this, including whether at least
xen_raw_console_write() would work.

>>>> Correction: I did mean '0' here, producing merely
>>>>
>>>> (XEN) '0' pressed -> dumping Dom0's registers
>>>> (XEN) *** Dumping Dom0 vcpu#0 state: ***
>>>> (XEN) *** Dumping Dom0 vcpu#1 state: ***
>>>> (XEN) *** Dumping Dom0 vcpu#2 state: ***
>>>> (XEN) *** Dumping Dom0 vcpu#3 state: ***
>>>>
>>>> 'd' output supports the "system is idle" that was also visible from
>>>> 'q' output.
>>>
>>> Can you dump the state of the VMCS and see where the IP points to in
>>> Linux?
>>
>> Both that and the register dumping I have meanwhile working tell
>> me that it's the HLT in default_idle(). IOW Dom0 gives the impression
>> of also being idle, at the first glance. The stack pointer, however,
>> is farther away from the stack top than I would have expected, so it
>> may still have entered default_idle() for other reasons.
>>
>> The VMCS also told me that the last VM entry was to deliver an
>> interrupt at vector 0xf3 (i.e. the "callback" one).
> 
> That's all quite weird. Did dom0 setup the vCPU timer?

Ah - I had meant to check active timers, but then forgot. Otoh I
thought I could observe vCPU0 waking up from HLT, as RIP in the
registers dumped has been pointing either at it or right past it.
Now that I write this I'm wondering though whether that's an
artifact rather than reflection of something that's really
happening, in particular because of this

(XEN) RSP = 0xffffffff81c03eb8 (0xffffffff81c03eb8)  RIP = 0xffffffff814be422 (0xffffffff814be423)

in the VMCS dump.

> What version of Linux are you using?

5.13.2; didn't get around to switching to 5.14 yet, but I also don't
expect this to make a difference.

> It seems to get stuck very early (or either fail to output anything
> while booting), which seems unlikely to be related to your specific
> hardware.

Well, it can't be extremely early - I see the ACPI IRQ getting set
up (from "iommu=debug" output mentioning GSI 9), and I see PCI
device BARs being played with (from debug messages I had added to
vPCI to monitor what P2M adjustments are being requested). As said
on another sub-thread, I get all the way through start_kernel()
and rest_init(), just that apparently some of the steps don't do
what they're supposed to do.

I'm meanwhile wondering whether I'm using a badly configured
kernel, i.e. whether there are any Kconfig settings which I ought
to enable, but which aren't "select"-ed nor have proper
"depends on". What I did is simply take my XEN_PV=y config,
replacing that by XEN_PVH=y. I did observe that this let XEN_DOM0
go off, but according to my checking (at the time) nothing this
crucial should have been affected by that.

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-09-01 13:08             ` Jan Beulich
@ 2021-09-06 19:53               ` Andrew Cooper
  2021-09-07  6:27                 ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2021-09-06 19:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 01/09/2021 14:08, Jan Beulich wrote:
>>>> Restricting execute permissions is something unique to virt.  It doesn't
>>>> exist in a non-virtualised system, as I and D side reads are
>>>> indistinguishable outside of the core.
>>>>
>>>> Furthermore, it is inexpressible on some systems/configurations.
>>>>
>>>> Introspection is the only technology which should be restricting execute
>>>> permissions in the p2m, and only when it takes responsibility for
>>>> dealing with the fallout.
>>> IOW are you saying that the calls to set_identity_p2m_entry()
>>> (pre-dating XSA-378) were wrong to use p2m_access_rw?
>> Yes.
>>
>>>  Because that's
>>> what's getting the way here.
>> On a real machine, you really can write some executable code into an
>> E820 reserved region and jump to it.  You can also execute code from
>> real BARs is you happen to know that they are prefetchable (or you're a
>> glutton for UC reads...)
>>
>> And there is the WPBT ACPI table which exists specifically to let
>> firmware inject drivers/applications into a windows environment, and may
>> come out of the SPI ROM in the first place.
>>
>>
>> Is it sensible to execute an E820 reserved region, or unmarked BAR? 
>> Probably not.
>>
>> Should it work, because that's how real hardware behaves?  Absolutely.
>>
>> Any restrictions beyond that want handling by some kind of introspection
>> agent which has a policy of what to do with legal-but-dodgy-looking actions.
> IOW you suggest we remove p2m_access_t parameters from various functions,
> going with just default access?

p2m_access_t was very obviously a bodge when introduced, and I doubt it
would pass today's review standards.

It is definitely a mis-design, given its ill-specified and overlapping
semantics with respect to the p2m type.

>  Of course, as pointed out in another
> reply, we'll need to split p2m_mmio_direct into multiple types then, at
> the very least to honor the potential r/o restriction of AMD IOMMU unity
> mapped regions. (FAOD all of this isn't a short term plan anyway, at least
> afaic.)

I don't think that will be necessary.

IVMDs are almost non-existent, and given how many other areas of the AMD
IOMMU spec are totally unused, I wouldn't be surprised if r/o unity
mappings were in that category too.  There's no obvious usecase for r/o,
as anything critical enough in the platform to have an IVMD in the first
place will also be non-trivial enough to require bidirectional
communication somehow.

The unity mapping only says "this device requires read-only access".  It
doesn't say "this must be mapped read-only", and it is legitimate to map
a r/o unity mapping as r/w.

If such a case actually exists, there's clearly one agent in the system
with r/w access into the r/o range, and mapping it r/w is equivalent to
the "IOMMU not enabled in the first place" case which is the default
case for most software for the past decade-and-a-bit.

In other words, I don't think the r/o unit maps on their own are a good
enough reasons to split the type.  If we gain other reasons then fine,
but this seems like chunk of complexity with no real users.

~Andrew



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/4] x86/P2M: relax guarding of MMIO entries
  2021-09-06 19:53               ` Andrew Cooper
@ 2021-09-07  6:27                 ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2021-09-07  6:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Roger Pau Monné, xen-devel

On 06.09.2021 21:53, Andrew Cooper wrote:
> On 01/09/2021 14:08, Jan Beulich wrote:
>>>>> Restricting execute permissions is something unique to virt.  It doesn't
>>>>> exist in a non-virtualised system, as I and D side reads are
>>>>> indistinguishable outside of the core.
>>>>>
>>>>> Furthermore, it is inexpressible on some systems/configurations.
>>>>>
>>>>> Introspection is the only technology which should be restricting execute
>>>>> permissions in the p2m, and only when it takes responsibility for
>>>>> dealing with the fallout.
>>>> IOW are you saying that the calls to set_identity_p2m_entry()
>>>> (pre-dating XSA-378) were wrong to use p2m_access_rw?
>>> Yes.
>>>
>>>>  Because that's
>>>> what's getting the way here.
>>> On a real machine, you really can write some executable code into an
>>> E820 reserved region and jump to it.  You can also execute code from
>>> real BARs is you happen to know that they are prefetchable (or you're a
>>> glutton for UC reads...)
>>>
>>> And there is the WPBT ACPI table which exists specifically to let
>>> firmware inject drivers/applications into a windows environment, and may
>>> come out of the SPI ROM in the first place.
>>>
>>>
>>> Is it sensible to execute an E820 reserved region, or unmarked BAR? 
>>> Probably not.
>>>
>>> Should it work, because that's how real hardware behaves?  Absolutely.
>>>
>>> Any restrictions beyond that want handling by some kind of introspection
>>> agent which has a policy of what to do with legal-but-dodgy-looking actions.
>> IOW you suggest we remove p2m_access_t parameters from various functions,
>> going with just default access?
> 
> p2m_access_t was very obviously a bodge when introduced, and I doubt it
> would pass today's review standards.
> 
> It is definitely a mis-design, given its ill-specified and overlapping
> semantics with respect to the p2m type.
> 
>>  Of course, as pointed out in another
>> reply, we'll need to split p2m_mmio_direct into multiple types then, at
>> the very least to honor the potential r/o restriction of AMD IOMMU unity
>> mapped regions. (FAOD all of this isn't a short term plan anyway, at least
>> afaic.)
> 
> I don't think that will be necessary.
> 
> IVMDs are almost non-existent, and given how many other areas of the AMD
> IOMMU spec are totally unused, I wouldn't be surprised if r/o unity
> mappings were in that category too.  There's no obvious usecase for r/o,
> as anything critical enough in the platform to have an IVMD in the first
> place will also be non-trivial enough to require bidirectional
> communication somehow.
> 
> The unity mapping only says "this device requires read-only access".  It
> doesn't say "this must be mapped read-only", and it is legitimate to map
> a r/o unity mapping as r/w.

Well, imo that's extending what "Write permission. 1b=writeable, 0b=not
writeable" and "Read permission. 1b=readable, 0b=not readable" in the
spec say. "Permission" to me doesn't mean what you say.

Nevertheless I would perhaps not insist (as I've already made clear I
don't see a strong need to support w/o mappings), if ...

> If such a case actually exists, there's clearly one agent in the system
> with r/w access into the r/o range, and mapping it r/w is equivalent to
> the "IOMMU not enabled in the first place" case which is the default
> case for most software for the past decade-and-a-bit.
> 
> In other words, I don't think the r/o unit maps on their own are a good
> enough reasons to split the type.  If we gain other reasons then fine,
> but this seems like chunk of complexity with no real users.

... there wasn't already a 2nd use for this: The IO-APIC mappings (see
my respective pending patch).

Jan



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2021-09-07  6:27 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-30 13:01 [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
2021-08-30 13:02 ` [PATCH 1/4] x86/PVH: de-duplicate mappings for first Mb of Dom0 memory Jan Beulich
2021-08-31 13:02   ` Andrew Cooper
2021-08-31 13:14     ` Jan Beulich
2021-08-31 13:19       ` Jan Beulich
2021-08-31 13:27         ` Andrew Cooper
2021-08-31 13:36           ` Jan Beulich
2021-09-01 11:49             ` Andrew Cooper
2021-09-01  8:41       ` Roger Pau Monné
2021-08-30 13:02 ` [PATCH 2/4] x86/P2M: relax guarding of MMIO entries Jan Beulich
2021-08-31 13:16   ` Jan Beulich
2021-08-31 13:16   ` Andrew Cooper
2021-08-31 13:26     ` Jan Beulich
2021-08-31 15:25       ` Andrew Cooper
2021-08-31 15:38         ` Jan Beulich
2021-09-01  8:08           ` Jan Beulich
2021-09-01  8:50           ` Roger Pau Monné
2021-09-01  9:53             ` Jan Beulich
2021-09-01 13:48               ` Roger Pau Monné
2021-09-01 14:05                 ` Jan Beulich
2021-09-01 12:47           ` Andrew Cooper
2021-09-01 13:08             ` Jan Beulich
2021-09-06 19:53               ` Andrew Cooper
2021-09-07  6:27                 ` Jan Beulich
2021-08-30 13:03 ` [PATCH 3/4] x86/PVH: improve Dom0 memory size calculation Jan Beulich
2021-08-31 14:07   ` Andrew Cooper
2021-08-31 15:30     ` Jan Beulich
2021-08-30 13:03 ` [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0 Jan Beulich
2021-08-31 13:47   ` Andrew Cooper
2021-08-31 14:25     ` Jan Beulich
2021-08-31 21:08   ` Tim Deegan
2021-08-31  8:53 ` [PATCH 0/4] x86/PVH: Dom0 building adjustments Jan Beulich
2021-09-01 13:56   ` Roger Pau Monné
2021-09-01 14:19     ` Jan Beulich
2021-09-01 14:25       ` Jan Beulich
2021-09-01 16:13       ` Roger Pau Monné
2021-09-02  6:30         ` Jan Beulich
2021-09-01 15:06 ` Jan Beulich
2021-09-01 15:24   ` Juergen Gross
2021-09-01 15:51     ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.