xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] VMX: apic access page handling adjustments
@ 2021-02-22 10:55 Jan Beulich
  2021-02-22 10:56 ` [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page Jan Beulich
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Jan Beulich @ 2021-02-22 10:55 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	Ian Jackson, George Dunlap, Kevin Tian, Jun Nakajima

The latter of the two changes was on my mental todo list for a
very long time. With Julien reporting a problem with the handling
of this page, I finally felt urged to make a patch. As it turns
out, for addressing this problem only the first of the now split
pages is needed, and the second can be further discussed and
considered for 4.16.

1: delay p2m insertion of APIC access page
2: use a single, global APIC access page

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page
  2021-02-22 10:55 [PATCH v3 0/2] VMX: apic access page handling adjustments Jan Beulich
@ 2021-02-22 10:56 ` Jan Beulich
  2021-02-22 11:25   ` Ian Jackson
                     ` (2 more replies)
  2021-02-22 10:57 ` [PATCH v3 2/2] VMX: use a single, global " Jan Beulich
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 32+ messages in thread
From: Jan Beulich @ 2021-02-22 10:56 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	Ian Jackson, George Dunlap, Kevin Tian, Jun Nakajima,
	Julien Grall

Inserting the mapping at domain creation time leads to a memory leak
when the creation fails later on and the domain uses separate CPU and
IOMMU page tables - the latter requires intermediate page tables to be
allocated, but there's no freeing of them at present in this case. Since
we don't need the p2m insertion to happen this early, avoid the problem
altogether by deferring it until the last possible point. This comes at
the price of not being able to handle an error other than by crashing
the domain.

Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New (split out).
---
Hooking p2m insertion onto arch_domain_creation_finished() isn't very
nice, but I couldn't find any better hook (nor a good place where to
introduce a new one). In particular there look to be no hvm_funcs hooks
being used on a domain-wide basis (except for init/destroy of course).
I did consider connecting this to the setting of HVM_PARAM_IDENT_PT, but
considered this no better, the more that the tool stack could be smarter
and avoid setting that param when not needed.

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1007,6 +1007,8 @@ int arch_domain_soft_reset(struct domain
 
 void arch_domain_creation_finished(struct domain *d)
 {
+    if ( is_hvm_domain(d) )
+        hvm_domain_creation_finished(d);
 }
 
 #define xen_vcpu_guest_context vcpu_guest_context
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -428,6 +428,14 @@ static void vmx_domain_relinquish_resour
     vmx_free_vlapic_mapping(d);
 }
 
+static void domain_creation_finished(struct domain *d)
+{
+    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn, _mfn(0)) &&
+         set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
+                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
+        domain_crash(d);
+}
+
 static void vmx_init_ipt(struct vcpu *v)
 {
     unsigned int size = v->domain->vmtrace_size;
@@ -2408,6 +2416,7 @@ static struct hvm_function_table __initd
     .cpu_dead             = vmx_cpu_dead,
     .domain_initialise    = vmx_domain_initialise,
     .domain_relinquish_resources = vmx_domain_relinquish_resources,
+    .domain_creation_finished = domain_creation_finished,
     .vcpu_initialise      = vmx_vcpu_initialise,
     .vcpu_destroy         = vmx_vcpu_destroy,
     .save_cpu_ctxt        = vmx_save_vmcs_ctxt,
@@ -3234,8 +3243,7 @@ static int vmx_alloc_vlapic_mapping(stru
     clear_domain_page(mfn);
     d->arch.hvm.vmx.apic_access_mfn = mfn;
 
-    return set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE), mfn,
-                              PAGE_ORDER_4K);
+    return 0;
 }
 
 static void vmx_free_vlapic_mapping(struct domain *d)
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -106,6 +106,7 @@ struct hvm_function_table {
      * Initialise/destroy HVM domain/vcpu resources
      */
     int  (*domain_initialise)(struct domain *d);
+    void (*domain_creation_finished)(struct domain *d);
     void (*domain_relinquish_resources)(struct domain *d);
     void (*domain_destroy)(struct domain *d);
     int  (*vcpu_initialise)(struct vcpu *v);
@@ -390,6 +391,12 @@ static inline bool hvm_has_set_descripto
     return hvm_funcs.set_descriptor_access_exiting;
 }
 
+static inline void hvm_domain_creation_finished(struct domain *d)
+{
+    if ( hvm_funcs.domain_creation_finished )
+        alternative_vcall(hvm_funcs.domain_creation_finished, d);
+}
+
 static inline int
 hvm_guest_x86_mode(struct vcpu *v)
 {
@@ -765,6 +772,11 @@ static inline void hvm_invlpg(const stru
 {
     ASSERT_UNREACHABLE();
 }
+
+static inline void hvm_domain_creation_finished(struct domain *d)
+{
+    ASSERT_UNREACHABLE();
+}
 
 /*
  * Shadow code needs further cleanup to eliminate some HVM-only paths. For



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 2/2] VMX: use a single, global APIC access page
  2021-02-22 10:55 [PATCH v3 0/2] VMX: apic access page handling adjustments Jan Beulich
  2021-02-22 10:56 ` [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page Jan Beulich
@ 2021-02-22 10:57 ` Jan Beulich
  2021-03-01  2:34   ` Tian, Kevin
  2021-04-12 10:40 ` [PATCH v4] " Jan Beulich
  2021-04-23 10:51 ` [PATCH v4 0/3] VMX APIC access page and shadow adjustments Jan Beulich
  3 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-02-22 10:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	Ian Jackson, George Dunlap, Kevin Tian, Jun Nakajima

The address of this page is used by the CPU only to recognize when to
access the virtual APIC page instead. No accesses would ever go to this
page. It only needs to be present in the (CPU) page tables so that
address translation will produce its address as result for respective
accesses.

By making this page global, we also eliminate the need to refcount it,
or to assign it to any domain in the first place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Split p2m insertion change to a separate patch.
v2: Avoid insertion when !has_vlapic(). Split off change to
    p2m_get_iommu_flags().
---
I did further consider not allocating any real page at all, but just
using the address of some unpopulated space (which would require
announcing this page as reserved to Dom0, so it wouldn't put any PCI
MMIO BARs there). But I thought this would be too controversial, because
of the possible risks associated with this.

--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -66,8 +66,7 @@ boolean_param("force-ept", opt_force_ept
 static void vmx_ctxt_switch_from(struct vcpu *v);
 static void vmx_ctxt_switch_to(struct vcpu *v);
 
-static int  vmx_alloc_vlapic_mapping(struct domain *d);
-static void vmx_free_vlapic_mapping(struct domain *d);
+static int alloc_vlapic_mapping(void);
 static void vmx_install_vlapic_mapping(struct vcpu *v);
 static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
                                 unsigned int flags);
@@ -78,6 +77,8 @@ static int vmx_msr_read_intercept(unsign
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg(struct vcpu *v, unsigned long linear);
 
+static mfn_t __read_mostly apic_access_mfn;
+
 /* Values for domain's ->arch.hvm_domain.pi_ops.flags. */
 #define PI_CSW_FROM (1u << 0)
 #define PI_CSW_TO   (1u << 1)
@@ -401,7 +402,6 @@ static int vmx_domain_initialise(struct
         .to   = vmx_ctxt_switch_to,
         .tail = vmx_do_resume,
     };
-    int rc;
 
     d->arch.ctxt_switch = &csw;
 
@@ -411,28 +411,14 @@ static int vmx_domain_initialise(struct
      */
     d->arch.hvm.vmx.exec_sp = is_hardware_domain(d) || opt_ept_exec_sp;
 
-    if ( !has_vlapic(d) )
-        return 0;
-
-    if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
-        return rc;
-
     return 0;
 }
 
-static void vmx_domain_relinquish_resources(struct domain *d)
-{
-    if ( !has_vlapic(d) )
-        return;
-
-    vmx_free_vlapic_mapping(d);
-}
-
 static void domain_creation_finished(struct domain *d)
 {
-    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn, _mfn(0)) &&
+    if ( has_vlapic(d) && !mfn_eq(apic_access_mfn, _mfn(0)) &&
          set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
-                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
+                            apic_access_mfn, PAGE_ORDER_4K) )
         domain_crash(d);
 }
 
@@ -2415,7 +2401,6 @@ static struct hvm_function_table __initd
     .cpu_up_prepare       = vmx_cpu_up_prepare,
     .cpu_dead             = vmx_cpu_dead,
     .domain_initialise    = vmx_domain_initialise,
-    .domain_relinquish_resources = vmx_domain_relinquish_resources,
     .domain_creation_finished = domain_creation_finished,
     .vcpu_initialise      = vmx_vcpu_initialise,
     .vcpu_destroy         = vmx_vcpu_destroy,
@@ -2662,7 +2647,7 @@ const struct hvm_function_table * __init
 {
     set_in_cr4(X86_CR4_VMXE);
 
-    if ( vmx_vmcs_init() )
+    if ( vmx_vmcs_init() || alloc_vlapic_mapping() )
     {
         printk("VMX: failed to initialise.\n");
         return NULL;
@@ -3217,7 +3202,7 @@ gp_fault:
     return X86EMUL_EXCEPTION;
 }
 
-static int vmx_alloc_vlapic_mapping(struct domain *d)
+static int __init alloc_vlapic_mapping(void)
 {
     struct page_info *pg;
     mfn_t mfn;
@@ -3225,52 +3210,28 @@ static int vmx_alloc_vlapic_mapping(stru
     if ( !cpu_has_vmx_virtualize_apic_accesses )
         return 0;
 
-    pg = alloc_domheap_page(d, MEMF_no_refcount);
+    pg = alloc_domheap_page(NULL, 0);
     if ( !pg )
         return -ENOMEM;
 
-    if ( !get_page_and_type(pg, d, PGT_writable_page) )
-    {
-        /*
-         * The domain can't possibly know about this page yet, so failure
-         * here is a clear indication of something fishy going on.
-         */
-        domain_crash(d);
-        return -ENODATA;
-    }
-
     mfn = page_to_mfn(pg);
     clear_domain_page(mfn);
-    d->arch.hvm.vmx.apic_access_mfn = mfn;
+    apic_access_mfn = mfn;
 
     return 0;
 }
 
-static void vmx_free_vlapic_mapping(struct domain *d)
-{
-    mfn_t mfn = d->arch.hvm.vmx.apic_access_mfn;
-
-    d->arch.hvm.vmx.apic_access_mfn = _mfn(0);
-    if ( !mfn_eq(mfn, _mfn(0)) )
-    {
-        struct page_info *pg = mfn_to_page(mfn);
-
-        put_page_alloc_ref(pg);
-        put_page_and_type(pg);
-    }
-}
-
 static void vmx_install_vlapic_mapping(struct vcpu *v)
 {
     paddr_t virt_page_ma, apic_page_ma;
 
-    if ( mfn_eq(v->domain->arch.hvm.vmx.apic_access_mfn, _mfn(0)) )
+    if ( !has_vlapic(v->domain) || mfn_eq(apic_access_mfn, _mfn(0)) )
         return;
 
     ASSERT(cpu_has_vmx_virtualize_apic_accesses);
 
     virt_page_ma = page_to_maddr(vcpu_vlapic(v)->regs_page);
-    apic_page_ma = mfn_to_maddr(v->domain->arch.hvm.vmx.apic_access_mfn);
+    apic_page_ma = mfn_to_maddr(apic_access_mfn);
 
     vmx_vmcs_enter(v);
     __vmwrite(VIRTUAL_APIC_PAGE_ADDR, virt_page_ma);
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -58,7 +58,6 @@ struct ept_data {
 #define _VMX_DOMAIN_PML_ENABLED    0
 #define VMX_DOMAIN_PML_ENABLED     (1ul << _VMX_DOMAIN_PML_ENABLED)
 struct vmx_domain {
-    mfn_t apic_access_mfn;
     /* VMX_DOMAIN_* */
     unsigned int status;
 



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page
  2021-02-22 10:56 ` [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page Jan Beulich
@ 2021-02-22 11:25   ` Ian Jackson
  2021-02-22 14:05     ` Jan Beulich
  2021-02-22 12:15   ` Roger Pau Monné
  2021-02-25  8:44   ` Jan Beulich
  2 siblings, 1 reply; 32+ messages in thread
From: Ian Jackson @ 2021-02-22 11:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima, Julien Grall

Jan Beulich writes ("[PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page"):
> Inserting the mapping at domain creation time leads to a memory leak
> when the creation fails later on and the domain uses separate CPU and
> IOMMU page tables - the latter requires intermediate page tables to be
> allocated, but there's no freeing of them at present in this case. Since
> we don't need the p2m insertion to happen this early, avoid the problem
> altogether by deferring it until the last possible point.

Thanks.

>   This comes at
> the price of not being able to handle an error other than by crashing
> the domain.

How worried should I be about this ?

Ian.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page
  2021-02-22 10:56 ` [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page Jan Beulich
  2021-02-22 11:25   ` Ian Jackson
@ 2021-02-22 12:15   ` Roger Pau Monné
  2021-02-25  8:44   ` Jan Beulich
  2 siblings, 0 replies; 32+ messages in thread
From: Roger Pau Monné @ 2021-02-22 12:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Ian Jackson, George Dunlap,
	Kevin Tian, Jun Nakajima, Julien Grall

On Mon, Feb 22, 2021 at 11:56:58AM +0100, Jan Beulich wrote:
> Inserting the mapping at domain creation time leads to a memory leak
> when the creation fails later on and the domain uses separate CPU and
> IOMMU page tables - the latter requires intermediate page tables to be
> allocated, but there's no freeing of them at present in this case. Since
> we don't need the p2m insertion to happen this early, avoid the problem
> altogether by deferring it until the last possible point. This comes at
> the price of not being able to handle an error other than by crashing
> the domain.
> 
> Reported-by: Julien Grall <julien@xen.org>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> v3: New (split out).
> ---
> Hooking p2m insertion onto arch_domain_creation_finished() isn't very
> nice, but I couldn't find any better hook (nor a good place where to
> introduce a new one). In particular there look to be no hvm_funcs hooks
> being used on a domain-wide basis (except for init/destroy of course).
> I did consider connecting this to the setting of HVM_PARAM_IDENT_PT, but
> considered this no better, the more that the tool stack could be smarter
> and avoid setting that param when not needed.

I'm not specially found of allocating the page in one hook, mapping it
into the p2m in another hook and finally setting up the vmcs fields in
yet another hook.

I would rather prefer to have a single place where for the BSP the
page is allocated and mapped into the p2m, while for APs just the vmcs
fields are set. It's IMO slightly difficult to follow the setup when
it's split into so many different places.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page
  2021-02-22 11:25   ` Ian Jackson
@ 2021-02-22 14:05     ` Jan Beulich
  2021-02-22 17:17       ` Ian Jackson
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-02-22 14:05 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima, Julien Grall

On 22.02.2021 12:25, Ian Jackson wrote:
> Jan Beulich writes ("[PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page"):
>> Inserting the mapping at domain creation time leads to a memory leak
>> when the creation fails later on and the domain uses separate CPU and
>> IOMMU page tables - the latter requires intermediate page tables to be
>> allocated, but there's no freeing of them at present in this case. Since
>> we don't need the p2m insertion to happen this early, avoid the problem
>> altogether by deferring it until the last possible point.
> 
> Thanks.
> 
>>   This comes at
>> the price of not being able to handle an error other than by crashing
>> the domain.
> 
> How worried should I be about this ?

Not overly much I would say. The difference is between a failure
(-ENOMEM) during domain creation vs the domain getting crashed
before it gets first scheduled. This is certainly less friendly
to the user, but lack of memory shouldn't typically happen when
creating domains. Plus the memory talked about here is such that
gets provided explicitly to the domain (the p2m pool), rather
than a system wide pool.

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page
  2021-02-22 14:05     ` Jan Beulich
@ 2021-02-22 17:17       ` Ian Jackson
  0 siblings, 0 replies; 32+ messages in thread
From: Ian Jackson @ 2021-02-22 17:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima, Julien Grall

Jan Beulich writes ("Re: [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page"):
> On 22.02.2021 12:25, Ian Jackson wrote:
> > Jan Beulich writes ("[PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page"):
> >> Inserting the mapping at domain creation time leads to a memory leak
> >> when the creation fails later on and the domain uses separate CPU and
> >> IOMMU page tables - the latter requires intermediate page tables to be
> >> allocated, but there's no freeing of them at present in this case. Since
> >> we don't need the p2m insertion to happen this early, avoid the problem
> >> altogether by deferring it until the last possible point.
> > 
> > Thanks.
> > 
> >>   This comes at
> >> the price of not being able to handle an error other than by crashing
> >> the domain.
> > 
> > How worried should I be about this ?
> 
> Not overly much I would say. The difference is between a failure
> (-ENOMEM) during domain creation vs the domain getting crashed
> before it gets first scheduled. This is certainly less friendly
> to the user, but lack of memory shouldn't typically happen when
> creating domains. Plus the memory talked about here is such that
> gets provided explicitly to the domain (the p2m pool), rather
> than a system wide pool.

OK, thanks.

Release-Acked-by: Ian Jackson <iwj@xenproject.org>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page
  2021-02-22 10:56 ` [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page Jan Beulich
  2021-02-22 11:25   ` Ian Jackson
  2021-02-22 12:15   ` Roger Pau Monné
@ 2021-02-25  8:44   ` Jan Beulich
  2021-02-26  7:06     ` Tian, Kevin
  2 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-02-25  8:44 UTC (permalink / raw)
  To: Kevin Tian, Roger Pau Monné, Andrew Cooper, Jun Nakajima
  Cc: Wei Liu, Ian Jackson, George Dunlap, Julien Grall, xen-devel

On 22.02.2021 11:56, Jan Beulich wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -428,6 +428,14 @@ static void vmx_domain_relinquish_resour
>      vmx_free_vlapic_mapping(d);
>  }
>  
> +static void domain_creation_finished(struct domain *d)
> +{
> +    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn, _mfn(0)) &&
> +         set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
> +                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
> +        domain_crash(d);
> +}

Having noticed that in patch 2 I also need to arrange for
ept_get_entry_emt() to continue to return WB for this page, I'm
inclined to add a respective assertion here. Would anyone object
to me doing so?

Kevin, Jun - I'd like this to also serve as a ping for an ack
(with or without the suggested ASSERT() addition).

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page
  2021-02-25  8:44   ` Jan Beulich
@ 2021-02-26  7:06     ` Tian, Kevin
  0 siblings, 0 replies; 32+ messages in thread
From: Tian, Kevin @ 2021-02-26  7:06 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné, Cooper, Andrew, Nakajima, Jun
  Cc: Wei Liu, Ian Jackson, George Dunlap, Julien Grall, xen-devel

> From: Jan Beulich <jbeulich@suse.com>
> Sent: Thursday, February 25, 2021 4:44 PM
> 
> On 22.02.2021 11:56, Jan Beulich wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -428,6 +428,14 @@ static void vmx_domain_relinquish_resour
> >      vmx_free_vlapic_mapping(d);
> >  }
> >
> > +static void domain_creation_finished(struct domain *d)
> > +{
> > +    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn,
> _mfn(0)) &&
> > +         set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
> > +                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
> > +        domain_crash(d);
> > +}
> 
> Having noticed that in patch 2 I also need to arrange for
> ept_get_entry_emt() to continue to return WB for this page, I'm
> inclined to add a respective assertion here. Would anyone object
> to me doing so?
> 
> Kevin, Jun - I'd like this to also serve as a ping for an ack
> (with or without the suggested ASSERT() addition).
> 

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v3 2/2] VMX: use a single, global APIC access page
  2021-02-22 10:57 ` [PATCH v3 2/2] VMX: use a single, global " Jan Beulich
@ 2021-03-01  2:34   ` Tian, Kevin
  2021-03-01  8:18     ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Tian, Kevin @ 2021-03-01  2:34 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Cooper, Andrew, Wei Liu, Roger Pau Monné,
	Ian Jackson, George Dunlap, Nakajima, Jun

> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, February 22, 2021 6:57 PM
> 
> The address of this page is used by the CPU only to recognize when to
> access the virtual APIC page instead. No accesses would ever go to this
> page. It only needs to be present in the (CPU) page tables so that
> address translation will produce its address as result for respective
> accesses.
> 
> By making this page global, we also eliminate the need to refcount it,
> or to assign it to any domain in the first place.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
> v3: Split p2m insertion change to a separate patch.
> v2: Avoid insertion when !has_vlapic(). Split off change to
>     p2m_get_iommu_flags().
> ---
> I did further consider not allocating any real page at all, but just
> using the address of some unpopulated space (which would require
> announcing this page as reserved to Dom0, so it wouldn't put any PCI
> MMIO BARs there). But I thought this would be too controversial, because
> of the possible risks associated with this.
> 
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -66,8 +66,7 @@ boolean_param("force-ept", opt_force_ept
>  static void vmx_ctxt_switch_from(struct vcpu *v);
>  static void vmx_ctxt_switch_to(struct vcpu *v);
> 
> -static int  vmx_alloc_vlapic_mapping(struct domain *d);
> -static void vmx_free_vlapic_mapping(struct domain *d);
> +static int alloc_vlapic_mapping(void);
>  static void vmx_install_vlapic_mapping(struct vcpu *v);
>  static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
>                                  unsigned int flags);
> @@ -78,6 +77,8 @@ static int vmx_msr_read_intercept(unsign
>  static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
>  static void vmx_invlpg(struct vcpu *v, unsigned long linear);
> 
> +static mfn_t __read_mostly apic_access_mfn;
> +
>  /* Values for domain's ->arch.hvm_domain.pi_ops.flags. */
>  #define PI_CSW_FROM (1u << 0)
>  #define PI_CSW_TO   (1u << 1)
> @@ -401,7 +402,6 @@ static int vmx_domain_initialise(struct
>          .to   = vmx_ctxt_switch_to,
>          .tail = vmx_do_resume,
>      };
> -    int rc;
> 
>      d->arch.ctxt_switch = &csw;
> 
> @@ -411,28 +411,14 @@ static int vmx_domain_initialise(struct
>       */
>      d->arch.hvm.vmx.exec_sp = is_hardware_domain(d) || opt_ept_exec_sp;
> 
> -    if ( !has_vlapic(d) )
> -        return 0;
> -
> -    if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
> -        return rc;
> -
>      return 0;
>  }
> 
> -static void vmx_domain_relinquish_resources(struct domain *d)
> -{
> -    if ( !has_vlapic(d) )
> -        return;
> -
> -    vmx_free_vlapic_mapping(d);
> -}
> -
>  static void domain_creation_finished(struct domain *d)
>  {
> -    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn,
> _mfn(0)) &&
> +    if ( has_vlapic(d) && !mfn_eq(apic_access_mfn, _mfn(0)) &&
>           set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
> -                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
> +                            apic_access_mfn, PAGE_ORDER_4K) )
>          domain_crash(d);
>  }
> 
> @@ -2415,7 +2401,6 @@ static struct hvm_function_table __initd
>      .cpu_up_prepare       = vmx_cpu_up_prepare,
>      .cpu_dead             = vmx_cpu_dead,
>      .domain_initialise    = vmx_domain_initialise,
> -    .domain_relinquish_resources = vmx_domain_relinquish_resources,
>      .domain_creation_finished = domain_creation_finished,
>      .vcpu_initialise      = vmx_vcpu_initialise,
>      .vcpu_destroy         = vmx_vcpu_destroy,
> @@ -2662,7 +2647,7 @@ const struct hvm_function_table * __init
>  {
>      set_in_cr4(X86_CR4_VMXE);
> 
> -    if ( vmx_vmcs_init() )
> +    if ( vmx_vmcs_init() || alloc_vlapic_mapping() )
>      {
>          printk("VMX: failed to initialise.\n");
>          return NULL;
> @@ -3217,7 +3202,7 @@ gp_fault:
>      return X86EMUL_EXCEPTION;
>  }
> 
> -static int vmx_alloc_vlapic_mapping(struct domain *d)
> +static int __init alloc_vlapic_mapping(void)
>  {
>      struct page_info *pg;
>      mfn_t mfn;
> @@ -3225,52 +3210,28 @@ static int vmx_alloc_vlapic_mapping(stru
>      if ( !cpu_has_vmx_virtualize_apic_accesses )
>          return 0;
> 
> -    pg = alloc_domheap_page(d, MEMF_no_refcount);
> +    pg = alloc_domheap_page(NULL, 0);
>      if ( !pg )
>          return -ENOMEM;
> 
> -    if ( !get_page_and_type(pg, d, PGT_writable_page) )
> -    {
> -        /*
> -         * The domain can't possibly know about this page yet, so failure
> -         * here is a clear indication of something fishy going on.
> -         */
> -        domain_crash(d);
> -        return -ENODATA;
> -    }
> -
>      mfn = page_to_mfn(pg);
>      clear_domain_page(mfn);
> -    d->arch.hvm.vmx.apic_access_mfn = mfn;
> +    apic_access_mfn = mfn;
> 
>      return 0;
>  }
> 
> -static void vmx_free_vlapic_mapping(struct domain *d)
> -{
> -    mfn_t mfn = d->arch.hvm.vmx.apic_access_mfn;
> -
> -    d->arch.hvm.vmx.apic_access_mfn = _mfn(0);
> -    if ( !mfn_eq(mfn, _mfn(0)) )
> -    {
> -        struct page_info *pg = mfn_to_page(mfn);
> -
> -        put_page_alloc_ref(pg);
> -        put_page_and_type(pg);
> -    }
> -}
> -
>  static void vmx_install_vlapic_mapping(struct vcpu *v)
>  {
>      paddr_t virt_page_ma, apic_page_ma;
> 
> -    if ( mfn_eq(v->domain->arch.hvm.vmx.apic_access_mfn, _mfn(0)) )
> +    if ( !has_vlapic(v->domain) || mfn_eq(apic_access_mfn, _mfn(0)) )
>          return;
> 
>      ASSERT(cpu_has_vmx_virtualize_apic_accesses);
> 
>      virt_page_ma = page_to_maddr(vcpu_vlapic(v)->regs_page);
> -    apic_page_ma = mfn_to_maddr(v->domain-
> >arch.hvm.vmx.apic_access_mfn);
> +    apic_page_ma = mfn_to_maddr(apic_access_mfn);
> 
>      vmx_vmcs_enter(v);
>      __vmwrite(VIRTUAL_APIC_PAGE_ADDR, virt_page_ma);
> --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> @@ -58,7 +58,6 @@ struct ept_data {
>  #define _VMX_DOMAIN_PML_ENABLED    0
>  #define VMX_DOMAIN_PML_ENABLED     (1ul <<
> _VMX_DOMAIN_PML_ENABLED)
>  struct vmx_domain {
> -    mfn_t apic_access_mfn;
>      /* VMX_DOMAIN_* */
>      unsigned int status;
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 2/2] VMX: use a single, global APIC access page
  2021-03-01  2:34   ` Tian, Kevin
@ 2021-03-01  8:18     ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2021-03-01  8:18 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Cooper, Andrew, Wei Liu, Roger Pau Monné,
	Ian Jackson, George Dunlap, Nakajima, Jun, xen-devel

On 01.03.2021 03:34, Tian, Kevin wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Monday, February 22, 2021 6:57 PM
>>
>> The address of this page is used by the CPU only to recognize when to
>> access the virtual APIC page instead. No accesses would ever go to this
>> page. It only needs to be present in the (CPU) page tables so that
>> address translation will produce its address as result for respective
>> accesses.
>>
>> By making this page global, we also eliminate the need to refcount it,
>> or to assign it to any domain in the first place.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Thanks, but for the record: As said on an unrelated thread already,
I also need to make an adjustment to shadow mode code here. Since
that doesn't affect VMX code itself, I'll retain the R-b on v4
(which I intend to submit only once we have 4.15 branched).

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v4] VMX: use a single, global APIC access page
  2021-02-22 10:55 [PATCH v3 0/2] VMX: apic access page handling adjustments Jan Beulich
  2021-02-22 10:56 ` [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page Jan Beulich
  2021-02-22 10:57 ` [PATCH v3 2/2] VMX: use a single, global " Jan Beulich
@ 2021-04-12 10:40 ` Jan Beulich
  2021-04-12 15:31   ` Roger Pau Monné
  2021-04-17 19:24   ` Tim Deegan
  2021-04-23 10:51 ` [PATCH v4 0/3] VMX APIC access page and shadow adjustments Jan Beulich
  3 siblings, 2 replies; 32+ messages in thread
From: Jan Beulich @ 2021-04-12 10:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima, Tim Deegan

The address of this page is used by the CPU only to recognize when to
access the virtual APIC page instead. No accesses would ever go to this
page. It only needs to be present in the (CPU) page tables so that
address translation will produce its address as result for respective
accesses.

By making this page global, we also eliminate the need to refcount it,
or to assign it to any domain in the first place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
v4: Set PGC_extra on the page. Make shadow mode work.
v3: Split p2m insertion change to a separate patch.
v2: Avoid insertion when !has_vlapic(). Split off change to
    p2m_get_iommu_flags().
---
I did further consider not allocating any real page at all, but just
using the address of some unpopulated space (which would require
announcing this page as reserved to Dom0, so it wouldn't put any PCI
MMIO BARs there). But I thought this would be too controversial, because
of the possible risks associated with this.

--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -66,8 +66,7 @@ boolean_param("force-ept", opt_force_ept
 static void vmx_ctxt_switch_from(struct vcpu *v);
 static void vmx_ctxt_switch_to(struct vcpu *v);
 
-static int  vmx_alloc_vlapic_mapping(struct domain *d);
-static void vmx_free_vlapic_mapping(struct domain *d);
+static int alloc_vlapic_mapping(void);
 static void vmx_install_vlapic_mapping(struct vcpu *v);
 static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
                                 unsigned int flags);
@@ -78,6 +77,8 @@ static int vmx_msr_read_intercept(unsign
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg(struct vcpu *v, unsigned long linear);
 
+static mfn_t __read_mostly apic_access_mfn;
+
 /* Values for domain's ->arch.hvm_domain.pi_ops.flags. */
 #define PI_CSW_FROM (1u << 0)
 #define PI_CSW_TO   (1u << 1)
@@ -401,7 +402,6 @@ static int vmx_domain_initialise(struct
         .to   = vmx_ctxt_switch_to,
         .tail = vmx_do_resume,
     };
-    int rc;
 
     d->arch.ctxt_switch = &csw;
 
@@ -411,28 +411,22 @@ static int vmx_domain_initialise(struct
      */
     d->arch.hvm.vmx.exec_sp = is_hardware_domain(d) || opt_ept_exec_sp;
 
-    if ( !has_vlapic(d) )
-        return 0;
-
-    if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
-        return rc;
-
     return 0;
 }
 
-static void vmx_domain_relinquish_resources(struct domain *d)
+static void domain_creation_finished(struct domain *d)
 {
-    if ( !has_vlapic(d) )
+    gfn_t gfn = gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE);
+    uint8_t ipat;
+
+    if ( !has_vlapic(d) || mfn_eq(apic_access_mfn, _mfn(0)) )
         return;
 
-    vmx_free_vlapic_mapping(d);
-}
+    ASSERT(epte_get_entry_emt(d, gfn_x(gfn), apic_access_mfn, 0, &ipat,
+                              true) == MTRR_TYPE_WRBACK);
+    ASSERT(ipat);
 
-static void domain_creation_finished(struct domain *d)
-{
-    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn, _mfn(0)) &&
-         set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
-                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
+    if ( set_mmio_p2m_entry(d, gfn, apic_access_mfn, PAGE_ORDER_4K) )
         domain_crash(d);
 }
 
@@ -2415,7 +2409,6 @@ static struct hvm_function_table __initd
     .cpu_up_prepare       = vmx_cpu_up_prepare,
     .cpu_dead             = vmx_cpu_dead,
     .domain_initialise    = vmx_domain_initialise,
-    .domain_relinquish_resources = vmx_domain_relinquish_resources,
     .domain_creation_finished = domain_creation_finished,
     .vcpu_initialise      = vmx_vcpu_initialise,
     .vcpu_destroy         = vmx_vcpu_destroy,
@@ -2662,7 +2655,7 @@ const struct hvm_function_table * __init
 {
     set_in_cr4(X86_CR4_VMXE);
 
-    if ( vmx_vmcs_init() )
+    if ( vmx_vmcs_init() || alloc_vlapic_mapping() )
     {
         printk("VMX: failed to initialise.\n");
         return NULL;
@@ -3224,7 +3217,7 @@ gp_fault:
     return X86EMUL_EXCEPTION;
 }
 
-static int vmx_alloc_vlapic_mapping(struct domain *d)
+static int __init alloc_vlapic_mapping(void)
 {
     struct page_info *pg;
     mfn_t mfn;
@@ -3232,52 +3225,31 @@ static int vmx_alloc_vlapic_mapping(stru
     if ( !cpu_has_vmx_virtualize_apic_accesses )
         return 0;
 
-    pg = alloc_domheap_page(d, MEMF_no_refcount);
+    pg = alloc_domheap_page(NULL, 0);
     if ( !pg )
         return -ENOMEM;
 
-    if ( !get_page_and_type(pg, d, PGT_writable_page) )
-    {
-        /*
-         * The domain can't possibly know about this page yet, so failure
-         * here is a clear indication of something fishy going on.
-         */
-        domain_crash(d);
-        return -ENODATA;
-    }
+    /* Arrange for epte_get_entry_emt() to recognize this page as "special". */
+    pg->count_info |= PGC_extra;
 
     mfn = page_to_mfn(pg);
     clear_domain_page(mfn);
-    d->arch.hvm.vmx.apic_access_mfn = mfn;
+    apic_access_mfn = mfn;
 
     return 0;
 }
 
-static void vmx_free_vlapic_mapping(struct domain *d)
-{
-    mfn_t mfn = d->arch.hvm.vmx.apic_access_mfn;
-
-    d->arch.hvm.vmx.apic_access_mfn = _mfn(0);
-    if ( !mfn_eq(mfn, _mfn(0)) )
-    {
-        struct page_info *pg = mfn_to_page(mfn);
-
-        put_page_alloc_ref(pg);
-        put_page_and_type(pg);
-    }
-}
-
 static void vmx_install_vlapic_mapping(struct vcpu *v)
 {
     paddr_t virt_page_ma, apic_page_ma;
 
-    if ( mfn_eq(v->domain->arch.hvm.vmx.apic_access_mfn, _mfn(0)) )
+    if ( !has_vlapic(v->domain) || mfn_eq(apic_access_mfn, _mfn(0)) )
         return;
 
     ASSERT(cpu_has_vmx_virtualize_apic_accesses);
 
     virt_page_ma = page_to_maddr(vcpu_vlapic(v)->regs_page);
-    apic_page_ma = mfn_to_maddr(v->domain->arch.hvm.vmx.apic_access_mfn);
+    apic_page_ma = mfn_to_maddr(apic_access_mfn);
 
     vmx_vmcs_enter(v);
     __vmwrite(VIRTUAL_APIC_PAGE_ADDR, virt_page_ma);
--- a/xen/arch/x86/mm/shadow/set.c
+++ b/xen/arch/x86/mm/shadow/set.c
@@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
     ASSERT(!sh_l1e_is_magic(sl1e));
     ASSERT(shadow_mode_refcounts(d));
 
+    /*
+     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
+     * get accessed, and hence there's no need to refcount it (and refcounting
+     * would fail, due to the page having no owner).
+     */
+    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
+    {
+        const struct page_info *pg = mfn_to_page(mfn);
+
+        if ( !page_get_owner(pg) && (pg->count_info & PGC_extra) )
+        {
+            ASSERT(type == p2m_mmio_direct);
+            return 0;
+        }
+    }
+
     res = get_page_from_l1e(sl1e, d, d);
 
     /*
--- a/xen/arch/x86/mm/shadow/types.h
+++ b/xen/arch/x86/mm/shadow/types.h
@@ -276,9 +276,20 @@ int shadow_set_l4e(struct domain *d, sha
 static void inline
 shadow_put_page_from_l1e(shadow_l1e_t sl1e, struct domain *d)
 {
+    mfn_t mfn;
+
     if ( !shadow_mode_refcounts(d) )
         return;
 
+    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
+    {
+        const struct page_info *pg = mfn_to_page(mfn);
+
+        /* See the respective comment in shadow_get_page_from_l1e(). */
+        if ( !page_get_owner(pg) && (pg->count_info & PGC_extra) )
+            return;
+    }
+
     put_page_from_l1e(sl1e, d);
 }
 
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -58,7 +58,6 @@ struct ept_data {
 #define _VMX_DOMAIN_PML_ENABLED    0
 #define VMX_DOMAIN_PML_ENABLED     (1ul << _VMX_DOMAIN_PML_ENABLED)
 struct vmx_domain {
-    mfn_t apic_access_mfn;
     /* VMX_DOMAIN_* */
     unsigned int status;
 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-12 10:40 ` [PATCH v4] " Jan Beulich
@ 2021-04-12 15:31   ` Roger Pau Monné
  2021-04-13  9:24     ` Jan Beulich
  2021-04-17 19:24   ` Tim Deegan
  1 sibling, 1 reply; 32+ messages in thread
From: Roger Pau Monné @ 2021-04-12 15:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima, Tim Deegan

On Mon, Apr 12, 2021 at 12:40:48PM +0200, Jan Beulich wrote:
> The address of this page is used by the CPU only to recognize when to
> access the virtual APIC page instead. No accesses would ever go to this
> page. It only needs to be present in the (CPU) page tables so that
> address translation will produce its address as result for respective
> accesses.
> 
> By making this page global, we also eliminate the need to refcount it,
> or to assign it to any domain in the first place.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
> v4: Set PGC_extra on the page. Make shadow mode work.
> v3: Split p2m insertion change to a separate patch.
> v2: Avoid insertion when !has_vlapic(). Split off change to
>     p2m_get_iommu_flags().
> ---
> I did further consider not allocating any real page at all, but just
> using the address of some unpopulated space (which would require
> announcing this page as reserved to Dom0, so it wouldn't put any PCI
> MMIO BARs there). But I thought this would be too controversial, because
> of the possible risks associated with this.

Really seems more trouble than reward. Also there are systems with
MMIO regions in holes on the memory map, like the issue I had with the
Intel pinctrl stuff that had an MMIO region in a hole on the memory
map [0], so I'm not sure Xen would be in a position to select a
suitable unpopulated page anyway.

[0] https://lore.kernel.org/xen-devel/YFx80wYt%2FKcHanC7@smile.fi.intel.com/

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -66,8 +66,7 @@ boolean_param("force-ept", opt_force_ept
>  static void vmx_ctxt_switch_from(struct vcpu *v);
>  static void vmx_ctxt_switch_to(struct vcpu *v);
>  
> -static int  vmx_alloc_vlapic_mapping(struct domain *d);
> -static void vmx_free_vlapic_mapping(struct domain *d);
> +static int alloc_vlapic_mapping(void);
>  static void vmx_install_vlapic_mapping(struct vcpu *v);
>  static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
>                                  unsigned int flags);
> @@ -78,6 +77,8 @@ static int vmx_msr_read_intercept(unsign
>  static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
>  static void vmx_invlpg(struct vcpu *v, unsigned long linear);
>  
> +static mfn_t __read_mostly apic_access_mfn;
> +
>  /* Values for domain's ->arch.hvm_domain.pi_ops.flags. */
>  #define PI_CSW_FROM (1u << 0)
>  #define PI_CSW_TO   (1u << 1)
> @@ -401,7 +402,6 @@ static int vmx_domain_initialise(struct
>          .to   = vmx_ctxt_switch_to,
>          .tail = vmx_do_resume,
>      };
> -    int rc;
>  
>      d->arch.ctxt_switch = &csw;
>  
> @@ -411,28 +411,22 @@ static int vmx_domain_initialise(struct
>       */
>      d->arch.hvm.vmx.exec_sp = is_hardware_domain(d) || opt_ept_exec_sp;
>  
> -    if ( !has_vlapic(d) )
> -        return 0;
> -
> -    if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
> -        return rc;
> -
>      return 0;
>  }
>  
> -static void vmx_domain_relinquish_resources(struct domain *d)
> +static void domain_creation_finished(struct domain *d)
>  {
> -    if ( !has_vlapic(d) )
> +    gfn_t gfn = gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE);

Worth making it const static?

> +    uint8_t ipat;
> +
> +    if ( !has_vlapic(d) || mfn_eq(apic_access_mfn, _mfn(0)) )

It would be better to use INVALID_MFN here, and init apic_access_mfn
to that value.

>          return;
>  
> -    vmx_free_vlapic_mapping(d);
> -}
> +    ASSERT(epte_get_entry_emt(d, gfn_x(gfn), apic_access_mfn, 0, &ipat,
> +                              true) == MTRR_TYPE_WRBACK);
> +    ASSERT(ipat);
>  
> -static void domain_creation_finished(struct domain *d)
> -{
> -    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn, _mfn(0)) &&
> -         set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
> -                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
> +    if ( set_mmio_p2m_entry(d, gfn, apic_access_mfn, PAGE_ORDER_4K) )
>          domain_crash(d);
>  }
>  
> @@ -2415,7 +2409,6 @@ static struct hvm_function_table __initd
>      .cpu_up_prepare       = vmx_cpu_up_prepare,
>      .cpu_dead             = vmx_cpu_dead,
>      .domain_initialise    = vmx_domain_initialise,
> -    .domain_relinquish_resources = vmx_domain_relinquish_resources,
>      .domain_creation_finished = domain_creation_finished,
>      .vcpu_initialise      = vmx_vcpu_initialise,
>      .vcpu_destroy         = vmx_vcpu_destroy,
> @@ -2662,7 +2655,7 @@ const struct hvm_function_table * __init
>  {
>      set_in_cr4(X86_CR4_VMXE);
>  
> -    if ( vmx_vmcs_init() )
> +    if ( vmx_vmcs_init() || alloc_vlapic_mapping() )
>      {
>          printk("VMX: failed to initialise.\n");
>          return NULL;
> @@ -3224,7 +3217,7 @@ gp_fault:
>      return X86EMUL_EXCEPTION;
>  }
>  
> -static int vmx_alloc_vlapic_mapping(struct domain *d)
> +static int __init alloc_vlapic_mapping(void)
>  {
>      struct page_info *pg;
>      mfn_t mfn;
> @@ -3232,52 +3225,31 @@ static int vmx_alloc_vlapic_mapping(stru
>      if ( !cpu_has_vmx_virtualize_apic_accesses )
>          return 0;
>  
> -    pg = alloc_domheap_page(d, MEMF_no_refcount);
> +    pg = alloc_domheap_page(NULL, 0);
>      if ( !pg )
>          return -ENOMEM;
>  
> -    if ( !get_page_and_type(pg, d, PGT_writable_page) )
> -    {
> -        /*
> -         * The domain can't possibly know about this page yet, so failure
> -         * here is a clear indication of something fishy going on.
> -         */
> -        domain_crash(d);
> -        return -ENODATA;
> -    }
> +    /* Arrange for epte_get_entry_emt() to recognize this page as "special". */
> +    pg->count_info |= PGC_extra;
>  
>      mfn = page_to_mfn(pg);
>      clear_domain_page(mfn);
> -    d->arch.hvm.vmx.apic_access_mfn = mfn;
> +    apic_access_mfn = mfn;
>  
>      return 0;
>  }
>  
> -static void vmx_free_vlapic_mapping(struct domain *d)
> -{
> -    mfn_t mfn = d->arch.hvm.vmx.apic_access_mfn;
> -
> -    d->arch.hvm.vmx.apic_access_mfn = _mfn(0);
> -    if ( !mfn_eq(mfn, _mfn(0)) )
> -    {
> -        struct page_info *pg = mfn_to_page(mfn);
> -
> -        put_page_alloc_ref(pg);
> -        put_page_and_type(pg);
> -    }
> -}
> -
>  static void vmx_install_vlapic_mapping(struct vcpu *v)
>  {
>      paddr_t virt_page_ma, apic_page_ma;
>  
> -    if ( mfn_eq(v->domain->arch.hvm.vmx.apic_access_mfn, _mfn(0)) )
> +    if ( !has_vlapic(v->domain) || mfn_eq(apic_access_mfn, _mfn(0)) )
>          return;
>  
>      ASSERT(cpu_has_vmx_virtualize_apic_accesses);
>  
>      virt_page_ma = page_to_maddr(vcpu_vlapic(v)->regs_page);
> -    apic_page_ma = mfn_to_maddr(v->domain->arch.hvm.vmx.apic_access_mfn);
> +    apic_page_ma = mfn_to_maddr(apic_access_mfn);
>  
>      vmx_vmcs_enter(v);
>      __vmwrite(VIRTUAL_APIC_PAGE_ADDR, virt_page_ma);
> --- a/xen/arch/x86/mm/shadow/set.c
> +++ b/xen/arch/x86/mm/shadow/set.c
> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>      ASSERT(!sh_l1e_is_magic(sl1e));
>      ASSERT(shadow_mode_refcounts(d));
>  
> +    /*
> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> +     * get accessed, and hence there's no need to refcount it (and refcounting
> +     * would fail, due to the page having no owner).
> +     */
> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )

I find this assignment inside the parameter list quite ugly, I would
rather split it on it's own line.

> +    {
> +        const struct page_info *pg = mfn_to_page(mfn);
> +
> +        if ( !page_get_owner(pg) && (pg->count_info & PGC_extra) )
> +        {
> +            ASSERT(type == p2m_mmio_direct);
> +            return 0;

Are there any other pages that could pass this check? I don't think
so, but wanted to assert.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-12 15:31   ` Roger Pau Monné
@ 2021-04-13  9:24     ` Jan Beulich
  2021-04-13 10:18       ` Roger Pau Monné
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-04-13  9:24 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima, Tim Deegan

On 12.04.2021 17:31, Roger Pau Monné wrote:
> On Mon, Apr 12, 2021 at 12:40:48PM +0200, Jan Beulich wrote:
>> The address of this page is used by the CPU only to recognize when to
>> access the virtual APIC page instead. No accesses would ever go to this
>> page. It only needs to be present in the (CPU) page tables so that
>> address translation will produce its address as result for respective
>> accesses.
>>
>> By making this page global, we also eliminate the need to refcount it,
>> or to assign it to any domain in the first place.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>> ---
>> v4: Set PGC_extra on the page. Make shadow mode work.
>> v3: Split p2m insertion change to a separate patch.
>> v2: Avoid insertion when !has_vlapic(). Split off change to
>>     p2m_get_iommu_flags().
>> ---
>> I did further consider not allocating any real page at all, but just
>> using the address of some unpopulated space (which would require
>> announcing this page as reserved to Dom0, so it wouldn't put any PCI
>> MMIO BARs there). But I thought this would be too controversial, because
>> of the possible risks associated with this.
> 
> Really seems more trouble than reward. Also there are systems with
> MMIO regions in holes on the memory map, like the issue I had with the
> Intel pinctrl stuff that had an MMIO region in a hole on the memory
> map [0], so I'm not sure Xen would be in a position to select a
> suitable unpopulated page anyway.
> 
> [0] https://lore.kernel.org/xen-devel/YFx80wYt%2FKcHanC7@smile.fi.intel.com/

Yeah, I had seen that. What I'm having trouble to understand is how the
OS will know to avoid that range for e.g. placing BARs.

>> @@ -411,28 +411,22 @@ static int vmx_domain_initialise(struct
>>       */
>>      d->arch.hvm.vmx.exec_sp = is_hardware_domain(d) || opt_ept_exec_sp;
>>  
>> -    if ( !has_vlapic(d) )
>> -        return 0;
>> -
>> -    if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
>> -        return rc;
>> -
>>      return 0;
>>  }
>>  
>> -static void vmx_domain_relinquish_resources(struct domain *d)
>> +static void domain_creation_finished(struct domain *d)
>>  {
>> -    if ( !has_vlapic(d) )
>> +    gfn_t gfn = gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE);
> 
> Worth making it const static?

The compiler ought to be able to fold this into a suitable constant
at the use site. Definitely not static imo, and I see little point
in making a local variable const, unless one really wants to
document something very special.

>> +    uint8_t ipat;
>> +
>> +    if ( !has_vlapic(d) || mfn_eq(apic_access_mfn, _mfn(0)) )
> 
> It would be better to use INVALID_MFN here, and init apic_access_mfn
> to that value.

Oh, yes, that's easier possible now that the variable is static.

>> --- a/xen/arch/x86/mm/shadow/set.c
>> +++ b/xen/arch/x86/mm/shadow/set.c
>> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>>      ASSERT(!sh_l1e_is_magic(sl1e));
>>      ASSERT(shadow_mode_refcounts(d));
>>  
>> +    /*
>> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
>> +     * get accessed, and hence there's no need to refcount it (and refcounting
>> +     * would fail, due to the page having no owner).
>> +     */
>> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
> 
> I find this assignment inside the parameter list quite ugly, I would
> rather split it on it's own line.

Well, okay. To be honest I'm not even sure why I did it this way, as
I could have expected a respective comment.

>> +    {
>> +        const struct page_info *pg = mfn_to_page(mfn);
>> +
>> +        if ( !page_get_owner(pg) && (pg->count_info & PGC_extra) )
>> +        {
>> +            ASSERT(type == p2m_mmio_direct);
>> +            return 0;
> 
> Are there any other pages that could pass this check? I don't think
> so, but wanted to assert.

"Normal" extra pages have an owner, so no, there aren't any others.
If and when any appear, this may need further customizing, albeit
generally I'd hope further pages matching this pattern would also
want similar treatment.

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-13  9:24     ` Jan Beulich
@ 2021-04-13 10:18       ` Roger Pau Monné
  2021-04-13 12:03         ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Roger Pau Monné @ 2021-04-13 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima, Tim Deegan

On Tue, Apr 13, 2021 at 11:24:09AM +0200, Jan Beulich wrote:
> On 12.04.2021 17:31, Roger Pau Monné wrote:
> > On Mon, Apr 12, 2021 at 12:40:48PM +0200, Jan Beulich wrote:
> >> The address of this page is used by the CPU only to recognize when to
> >> access the virtual APIC page instead. No accesses would ever go to this
> >> page. It only needs to be present in the (CPU) page tables so that
> >> address translation will produce its address as result for respective
> >> accesses.
> >>
> >> By making this page global, we also eliminate the need to refcount it,
> >> or to assign it to any domain in the first place.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> >> ---
> >> v4: Set PGC_extra on the page. Make shadow mode work.
> >> v3: Split p2m insertion change to a separate patch.
> >> v2: Avoid insertion when !has_vlapic(). Split off change to
> >>     p2m_get_iommu_flags().
> >> ---
> >> I did further consider not allocating any real page at all, but just
> >> using the address of some unpopulated space (which would require
> >> announcing this page as reserved to Dom0, so it wouldn't put any PCI
> >> MMIO BARs there). But I thought this would be too controversial, because
> >> of the possible risks associated with this.
> > 
> > Really seems more trouble than reward. Also there are systems with
> > MMIO regions in holes on the memory map, like the issue I had with the
> > Intel pinctrl stuff that had an MMIO region in a hole on the memory
> > map [0], so I'm not sure Xen would be in a position to select a
> > suitable unpopulated page anyway.
> > 
> > [0] https://lore.kernel.org/xen-devel/YFx80wYt%2FKcHanC7@smile.fi.intel.com/
> 
> Yeah, I had seen that. What I'm having trouble to understand is how the
> OS will know to avoid that range for e.g. placing BARs.

No idea really, I assume they expect that you parse all possible ACPI
devices from the dynamic tables before deciding on any BAR placements?

I still consider a bug that the pinctrl MMIO region is not marked as
reserved in the memory map.

> >> +    {
> >> +        const struct page_info *pg = mfn_to_page(mfn);
> >> +
> >> +        if ( !page_get_owner(pg) && (pg->count_info & PGC_extra) )
> >> +        {
> >> +            ASSERT(type == p2m_mmio_direct);
> >> +            return 0;
> > 
> > Are there any other pages that could pass this check? I don't think
> > so, but wanted to assert.
> 
> "Normal" extra pages have an owner, so no, there aren't any others.
> If and when any appear, this may need further customizing, albeit
> generally I'd hope further pages matching this pattern would also
> want similar treatment.

I wonder whether we want to add an assert here to make sure only the
APIC access page receives this special handling by the shadow code,
but maybe that's a bit too much?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-13 10:18       ` Roger Pau Monné
@ 2021-04-13 12:03         ` Jan Beulich
  2021-04-13 13:03           ` Roger Pau Monné
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-04-13 12:03 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima, Tim Deegan

On 13.04.2021 12:18, Roger Pau Monné wrote:
> On Tue, Apr 13, 2021 at 11:24:09AM +0200, Jan Beulich wrote:
>> On 12.04.2021 17:31, Roger Pau Monné wrote:
>>> On Mon, Apr 12, 2021 at 12:40:48PM +0200, Jan Beulich wrote:
>>>> +    {
>>>> +        const struct page_info *pg = mfn_to_page(mfn);
>>>> +
>>>> +        if ( !page_get_owner(pg) && (pg->count_info & PGC_extra) )
>>>> +        {
>>>> +            ASSERT(type == p2m_mmio_direct);
>>>> +            return 0;
>>>
>>> Are there any other pages that could pass this check? I don't think
>>> so, but wanted to assert.
>>
>> "Normal" extra pages have an owner, so no, there aren't any others.
>> If and when any appear, this may need further customizing, albeit
>> generally I'd hope further pages matching this pattern would also
>> want similar treatment.
> 
> I wonder whether we want to add an assert here to make sure only the
> APIC access page receives this special handling by the shadow code,
> but maybe that's a bit too much?

I think so, yes: It would require either a separate function or
making the variable global. Both feel like a layering violation.

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-13 12:03         ` Jan Beulich
@ 2021-04-13 13:03           ` Roger Pau Monné
  0 siblings, 0 replies; 32+ messages in thread
From: Roger Pau Monné @ 2021-04-13 13:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima, Tim Deegan

On Tue, Apr 13, 2021 at 02:03:52PM +0200, Jan Beulich wrote:
> On 13.04.2021 12:18, Roger Pau Monné wrote:
> > On Tue, Apr 13, 2021 at 11:24:09AM +0200, Jan Beulich wrote:
> >> On 12.04.2021 17:31, Roger Pau Monné wrote:
> >>> On Mon, Apr 12, 2021 at 12:40:48PM +0200, Jan Beulich wrote:
> >>>> +    {
> >>>> +        const struct page_info *pg = mfn_to_page(mfn);
> >>>> +
> >>>> +        if ( !page_get_owner(pg) && (pg->count_info & PGC_extra) )
> >>>> +        {
> >>>> +            ASSERT(type == p2m_mmio_direct);
> >>>> +            return 0;
> >>>
> >>> Are there any other pages that could pass this check? I don't think
> >>> so, but wanted to assert.
> >>
> >> "Normal" extra pages have an owner, so no, there aren't any others.
> >> If and when any appear, this may need further customizing, albeit
> >> generally I'd hope further pages matching this pattern would also
> >> want similar treatment.
> > 
> > I wonder whether we want to add an assert here to make sure only the
> > APIC access page receives this special handling by the shadow code,
> > but maybe that's a bit too much?
> 
> I think so, yes: It would require either a separate function or
> making the variable global. Both feel like a layering violation.

Right, with the INVALID_MFN change and the shadow mfn_valid style
adjustment:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-12 10:40 ` [PATCH v4] " Jan Beulich
  2021-04-12 15:31   ` Roger Pau Monné
@ 2021-04-17 19:24   ` Tim Deegan
  2021-04-19 11:25     ` Jan Beulich
  1 sibling, 1 reply; 32+ messages in thread
From: Tim Deegan @ 2021-04-17 19:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima

Hi,

Apologies for not sending comments before, and thanks for the ping.

At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
> The address of this page is used by the CPU only to recognize when to
> access the virtual APIC page instead. No accesses would ever go to this
> page. It only needs to be present in the (CPU) page tables so that
> address translation will produce its address as result for respective
> accesses.
> 
> By making this page global, we also eliminate the need to refcount it,
> or to assign it to any domain in the first place.

What is the aim here?  To save 4k per domain?  It seems to come out
about even for adding and removing code. 

> --- a/xen/arch/x86/mm/shadow/set.c
> +++ b/xen/arch/x86/mm/shadow/set.c
> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>      ASSERT(!sh_l1e_is_magic(sl1e));
>      ASSERT(shadow_mode_refcounts(d));
>  
> +    /*
> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> +     * get accessed, and hence there's no need to refcount it (and refcounting
> +     * would fail, due to the page having no owner).
> +     */
> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )

Would it be better to check specifically for mfn == apic_access_mfn
(and apic_access_mfn != 0, I guess)?  If we want this behaviour for
for all un-owned PGC_extra MFNs it would be good to explain that in the
comments.

Cheers,

Tim.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-17 19:24   ` Tim Deegan
@ 2021-04-19 11:25     ` Jan Beulich
  2021-04-22  7:42       ` Tim Deegan
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-04-19 11:25 UTC (permalink / raw)
  To: Tim Deegan
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima

On 17.04.2021 21:24, Tim Deegan wrote:
> At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
>> The address of this page is used by the CPU only to recognize when to
>> access the virtual APIC page instead. No accesses would ever go to this
>> page. It only needs to be present in the (CPU) page tables so that
>> address translation will produce its address as result for respective
>> accesses.
>>
>> By making this page global, we also eliminate the need to refcount it,
>> or to assign it to any domain in the first place.
> 
> What is the aim here?  To save 4k per domain?  It seems to come out
> about even for adding and removing code. 

True, but still it looks wrong to me to use a page per guest when one
her host suffices. Think about many tiny, short-lived VMs (as in
Tamas'es VM forking).

>> --- a/xen/arch/x86/mm/shadow/set.c
>> +++ b/xen/arch/x86/mm/shadow/set.c
>> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>>      ASSERT(!sh_l1e_is_magic(sl1e));
>>      ASSERT(shadow_mode_refcounts(d));
>>  
>> +    /*
>> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
>> +     * get accessed, and hence there's no need to refcount it (and refcounting
>> +     * would fail, due to the page having no owner).
>> +     */
>> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
> 
> Would it be better to check specifically for mfn == apic_access_mfn
> (and apic_access_mfn != 0, I guess)?

Roger did ask about the same - I neither want to expose apic_access_mfn
outside its CU, nor do I want to introduce an accessor function. Both
feel like layering violations to me.

>  If we want this behaviour for
> for all un-owned PGC_extra MFNs it would be good to explain that in the
> comments.

This is hard to tell without knowing which (or even if) further such
PGC_extra pages will appear. Hence any comment to that effect would be
guesswork at best. Of course I can add e.g. "Other pages with the same
properties would be treated the same", if that's what you're after?

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-19 11:25     ` Jan Beulich
@ 2021-04-22  7:42       ` Tim Deegan
  2021-04-22  9:38         ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Tim Deegan @ 2021-04-22  7:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima

At 13:25 +0200 on 19 Apr (1618838726), Jan Beulich wrote:
> On 17.04.2021 21:24, Tim Deegan wrote:
> > At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
> >> By making this page global, we also eliminate the need to refcount it,
> >> or to assign it to any domain in the first place.
> > 
> > What is the aim here?  To save 4k per domain?  It seems to come out
> > about even for adding and removing code. 
> 
> True, but still it looks wrong to me to use a page per guest when one
> her host suffices. Think about many tiny, short-lived VMs (as in
> Tamas'es VM forking).

OK, fair enough.

> >> --- a/xen/arch/x86/mm/shadow/set.c
> >> +++ b/xen/arch/x86/mm/shadow/set.c
> >> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
> >>      ASSERT(!sh_l1e_is_magic(sl1e));
> >>      ASSERT(shadow_mode_refcounts(d));
> >>  
> >> +    /*
> >> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> >> +     * get accessed, and hence there's no need to refcount it (and refcounting
> >> +     * would fail, due to the page having no owner).
> >> +     */
> >> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
> > 
> > Would it be better to check specifically for mfn == apic_access_mfn
> > (and apic_access_mfn != 0, I guess)?
> 
> Roger did ask about the same - I neither want to expose apic_access_mfn
> outside its CU, nor do I want to introduce an accessor function. Both
> feel like layering violations to me.

I think that this is even more of a layering violation: what we
actually want is to allow un-refcounted mappings of the
apic_access_mfn, but to do it we're relying on an internal
implementation detail (that it happens to be un-owned and PGC_extra)
rather than giving ourselves an API.

And so we're tangled up talking about how to write comments to warn
our future selves about the possible side-effects.

> >  If we want this behaviour for
> > for all un-owned PGC_extra MFNs it would be good to explain that in the
> > comments.
> 
> This is hard to tell without knowing which (or even if) further such
> PGC_extra pages will appear. Hence any comment to that effect would be
> guesswork at best. Of course I can add e.g. "Other pages with the same
> properties would be treated the same", if that's what you're after?

If you want to go this way there should be a comment here saying that
we're allowing this for all PGC_extra pages because we need it for
apic_access_mfn, and a comment at PGC_extra saying that it has this
effect.

Cheers,

Tim


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-22  7:42       ` Tim Deegan
@ 2021-04-22  9:38         ` Jan Beulich
  2021-04-22 15:05           ` Tim Deegan
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-04-22  9:38 UTC (permalink / raw)
  To: Tim Deegan
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima

On 22.04.2021 09:42, Tim Deegan wrote:
> At 13:25 +0200 on 19 Apr (1618838726), Jan Beulich wrote:
>> On 17.04.2021 21:24, Tim Deegan wrote:
>>> At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
>>>> --- a/xen/arch/x86/mm/shadow/set.c
>>>> +++ b/xen/arch/x86/mm/shadow/set.c
>>>> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>>>>      ASSERT(!sh_l1e_is_magic(sl1e));
>>>>      ASSERT(shadow_mode_refcounts(d));
>>>>  
>>>> +    /*
>>>> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
>>>> +     * get accessed, and hence there's no need to refcount it (and refcounting
>>>> +     * would fail, due to the page having no owner).
>>>> +     */
>>>> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
>>>
>>> Would it be better to check specifically for mfn == apic_access_mfn
>>> (and apic_access_mfn != 0, I guess)?
>>
>> Roger did ask about the same - I neither want to expose apic_access_mfn
>> outside its CU, nor do I want to introduce an accessor function. Both
>> feel like layering violations to me.
> 
> I think that this is even more of a layering violation: what we
> actually want is to allow un-refcounted mappings of the
> apic_access_mfn, but to do it we're relying on an internal
> implementation detail (that it happens to be un-owned and PGC_extra)
> rather than giving ourselves an API.
> 
> And so we're tangled up talking about how to write comments to warn
> our future selves about the possible side-effects.
> 
>>>  If we want this behaviour for
>>> for all un-owned PGC_extra MFNs it would be good to explain that in the
>>> comments.
>>
>> This is hard to tell without knowing which (or even if) further such
>> PGC_extra pages will appear. Hence any comment to that effect would be
>> guesswork at best. Of course I can add e.g. "Other pages with the same
>> properties would be treated the same", if that's what you're after?
> 
> If you want to go this way there should be a comment here saying that
> we're allowing this for all PGC_extra pages because we need it for
> apic_access_mfn, and a comment at PGC_extra saying that it has this
> effect.

So (along with a comment to this effect) how about I make
page_suppress_refcounting() and page_refcounting_suppressed() helpers?
The former would set PGC_extra on the page and assert the page has no
owner, while the latter would subsume the checks done here. The only
question then is what to do with the ASSERT(type == p2m_mmio_direct):
That's still a property of the APIC access MFN which may or may not
hold for future such pages. (It can't be part of the new helper anyway
as "put" doesn't have the type available.)

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4] VMX: use a single, global APIC access page
  2021-04-22  9:38         ` Jan Beulich
@ 2021-04-22 15:05           ` Tim Deegan
  0 siblings, 0 replies; 32+ messages in thread
From: Tim Deegan @ 2021-04-22 15:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima

At 11:38 +0200 on 22 Apr (1619091522), Jan Beulich wrote:
> On 22.04.2021 09:42, Tim Deegan wrote:
> > At 13:25 +0200 on 19 Apr (1618838726), Jan Beulich wrote:
> >> On 17.04.2021 21:24, Tim Deegan wrote:
> >>> At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
> >>>> --- a/xen/arch/x86/mm/shadow/set.c
> >>>> +++ b/xen/arch/x86/mm/shadow/set.c
> >>>> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
> >>>>      ASSERT(!sh_l1e_is_magic(sl1e));
> >>>>      ASSERT(shadow_mode_refcounts(d));
> >>>>  
> >>>> +    /*
> >>>> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> >>>> +     * get accessed, and hence there's no need to refcount it (and refcounting
> >>>> +     * would fail, due to the page having no owner).
> >>>> +     */
> >>>> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
> >>>
> >>> Would it be better to check specifically for mfn == apic_access_mfn
> >>> (and apic_access_mfn != 0, I guess)?
> >>
> >> Roger did ask about the same - I neither want to expose apic_access_mfn
> >> outside its CU, nor do I want to introduce an accessor function. Both
> >> feel like layering violations to me.
> > 
> > I think that this is even more of a layering violation: what we
> > actually want is to allow un-refcounted mappings of the
> > apic_access_mfn, but to do it we're relying on an internal
> > implementation detail (that it happens to be un-owned and PGC_extra)
> > rather than giving ourselves an API.
> > 
> > And so we're tangled up talking about how to write comments to warn
> > our future selves about the possible side-effects.
> > 
> >>>  If we want this behaviour for
> >>> for all un-owned PGC_extra MFNs it would be good to explain that in the
> >>> comments.
> >>
> >> This is hard to tell without knowing which (or even if) further such
> >> PGC_extra pages will appear. Hence any comment to that effect would be
> >> guesswork at best. Of course I can add e.g. "Other pages with the same
> >> properties would be treated the same", if that's what you're after?
> > 
> > If you want to go this way there should be a comment here saying that
> > we're allowing this for all PGC_extra pages because we need it for
> > apic_access_mfn, and a comment at PGC_extra saying that it has this
> > effect.
> 
> So (along with a comment to this effect) how about I make
> page_suppress_refcounting() and page_refcounting_suppressed() helpers?
> The former would set PGC_extra on the page and assert the page has no
> owner, while the latter would subsume the checks done here.

That sounds good to me.

> The only
> question then is what to do with the ASSERT(type == p2m_mmio_direct):
> That's still a property of the APIC access MFN which may or may not
> hold for future such pages. (It can't be part of the new helper anyway
> as "put" doesn't have the type available.)

I think we might drop that assertion, since the new mehanism would be
more general.

Cheers,

Tim.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v4 0/3] VMX APIC access page and shadow adjustments
  2021-02-22 10:55 [PATCH v3 0/2] VMX: apic access page handling adjustments Jan Beulich
                   ` (2 preceding siblings ...)
  2021-04-12 10:40 ` [PATCH v4] " Jan Beulich
@ 2021-04-23 10:51 ` Jan Beulich
  2021-04-23 10:52   ` [PATCH v4 1/3] VMX: use a single, global APIC access page Jan Beulich
                     ` (3 more replies)
  3 siblings, 4 replies; 32+ messages in thread
From: Jan Beulich @ 2021-04-23 10:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima, Tim Deegan

1: VMX: use a single, global APIC access page
2: x86/shadow: re-use variables in shadow_get_page_from_l1e()
3: x86/shadow: streamline shadow_get_page_from_l1e()

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v4 1/3] VMX: use a single, global APIC access page
  2021-04-23 10:51 ` [PATCH v4 0/3] VMX APIC access page and shadow adjustments Jan Beulich
@ 2021-04-23 10:52   ` Jan Beulich
  2021-04-23 14:17     ` Roger Pau Monné
                       ` (2 more replies)
  2021-04-23 10:53   ` [PATCH v4 2/3] x86/shadow: re-use variables in shadow_get_page_from_l1e() Jan Beulich
                     ` (2 subsequent siblings)
  3 siblings, 3 replies; 32+ messages in thread
From: Jan Beulich @ 2021-04-23 10:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima, Tim Deegan

The address of this page is used by the CPU only to recognize when to
access the virtual APIC page instead. No accesses would ever go to this
page. It only needs to be present in the (CPU) page tables so that
address translation will produce its address as result for respective
accesses.

By making this page global, we also eliminate the need to refcount it,
or to assign it to any domain in the first place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v5: Init apic_access_mfn to INVALID_MFN. Move assignment out of if()
    condition. Introduce page_suppress_refcounting() and
    page_refcounting_suppressed().
v4: Set PGC_extra on the page. Make shadow mode work.
v3: Split p2m insertion change to a separate patch.
v2: Avoid insertion when !has_vlapic(). Split off change to
    p2m_get_iommu_flags().
---
I did further consider not allocating any real page at all, but just
using the address of some unpopulated space (which would require
announcing this page as reserved to Dom0, so it wouldn't put any PCI
MMIO BARs there). But I thought this would be too controversial, because
of the possible risks associated with this.

--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -66,8 +66,7 @@ boolean_param("force-ept", opt_force_ept
 static void vmx_ctxt_switch_from(struct vcpu *v);
 static void vmx_ctxt_switch_to(struct vcpu *v);
 
-static int  vmx_alloc_vlapic_mapping(struct domain *d);
-static void vmx_free_vlapic_mapping(struct domain *d);
+static int alloc_vlapic_mapping(void);
 static void vmx_install_vlapic_mapping(struct vcpu *v);
 static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
                                 unsigned int flags);
@@ -78,6 +77,8 @@ static int vmx_msr_read_intercept(unsign
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg(struct vcpu *v, unsigned long linear);
 
+static mfn_t __read_mostly apic_access_mfn = INVALID_MFN_INITIALIZER;
+
 /* Values for domain's ->arch.hvm_domain.pi_ops.flags. */
 #define PI_CSW_FROM (1u << 0)
 #define PI_CSW_TO   (1u << 1)
@@ -401,7 +402,6 @@ static int vmx_domain_initialise(struct
         .to   = vmx_ctxt_switch_to,
         .tail = vmx_do_resume,
     };
-    int rc;
 
     d->arch.ctxt_switch = &csw;
 
@@ -411,28 +411,22 @@ static int vmx_domain_initialise(struct
      */
     d->arch.hvm.vmx.exec_sp = is_hardware_domain(d) || opt_ept_exec_sp;
 
-    if ( !has_vlapic(d) )
-        return 0;
-
-    if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
-        return rc;
-
     return 0;
 }
 
-static void vmx_domain_relinquish_resources(struct domain *d)
+static void domain_creation_finished(struct domain *d)
 {
-    if ( !has_vlapic(d) )
+    gfn_t gfn = gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE);
+    uint8_t ipat;
+
+    if ( !has_vlapic(d) || mfn_eq(apic_access_mfn, INVALID_MFN) )
         return;
 
-    vmx_free_vlapic_mapping(d);
-}
+    ASSERT(epte_get_entry_emt(d, gfn_x(gfn), apic_access_mfn, 0, &ipat,
+                              true) == MTRR_TYPE_WRBACK);
+    ASSERT(ipat);
 
-static void domain_creation_finished(struct domain *d)
-{
-    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn, _mfn(0)) &&
-         set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
-                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
+    if ( set_mmio_p2m_entry(d, gfn, apic_access_mfn, PAGE_ORDER_4K) )
         domain_crash(d);
 }
 
@@ -2415,7 +2409,6 @@ static struct hvm_function_table __initd
     .cpu_up_prepare       = vmx_cpu_up_prepare,
     .cpu_dead             = vmx_cpu_dead,
     .domain_initialise    = vmx_domain_initialise,
-    .domain_relinquish_resources = vmx_domain_relinquish_resources,
     .domain_creation_finished = domain_creation_finished,
     .vcpu_initialise      = vmx_vcpu_initialise,
     .vcpu_destroy         = vmx_vcpu_destroy,
@@ -2662,7 +2655,7 @@ const struct hvm_function_table * __init
 {
     set_in_cr4(X86_CR4_VMXE);
 
-    if ( vmx_vmcs_init() )
+    if ( vmx_vmcs_init() || alloc_vlapic_mapping() )
     {
         printk("VMX: failed to initialise.\n");
         return NULL;
@@ -3224,7 +3217,7 @@ gp_fault:
     return X86EMUL_EXCEPTION;
 }
 
-static int vmx_alloc_vlapic_mapping(struct domain *d)
+static int __init alloc_vlapic_mapping(void)
 {
     struct page_info *pg;
     mfn_t mfn;
@@ -3232,52 +3225,34 @@ static int vmx_alloc_vlapic_mapping(stru
     if ( !cpu_has_vmx_virtualize_apic_accesses )
         return 0;
 
-    pg = alloc_domheap_page(d, MEMF_no_refcount);
+    pg = alloc_domheap_page(NULL, 0);
     if ( !pg )
         return -ENOMEM;
 
-    if ( !get_page_and_type(pg, d, PGT_writable_page) )
-    {
-        /*
-         * The domain can't possibly know about this page yet, so failure
-         * here is a clear indication of something fishy going on.
-         */
-        domain_crash(d);
-        return -ENODATA;
-    }
+    /*
+     * Signal to shadow code that this page cannot be refcounted. This also
+     * makes epte_get_entry_emt() recognize this page as "special".
+     */
+    page_suppress_refcounting(pg);
 
     mfn = page_to_mfn(pg);
     clear_domain_page(mfn);
-    d->arch.hvm.vmx.apic_access_mfn = mfn;
+    apic_access_mfn = mfn;
 
     return 0;
 }
 
-static void vmx_free_vlapic_mapping(struct domain *d)
-{
-    mfn_t mfn = d->arch.hvm.vmx.apic_access_mfn;
-
-    d->arch.hvm.vmx.apic_access_mfn = _mfn(0);
-    if ( !mfn_eq(mfn, _mfn(0)) )
-    {
-        struct page_info *pg = mfn_to_page(mfn);
-
-        put_page_alloc_ref(pg);
-        put_page_and_type(pg);
-    }
-}
-
 static void vmx_install_vlapic_mapping(struct vcpu *v)
 {
     paddr_t virt_page_ma, apic_page_ma;
 
-    if ( mfn_eq(v->domain->arch.hvm.vmx.apic_access_mfn, _mfn(0)) )
+    if ( !has_vlapic(v->domain) || mfn_eq(apic_access_mfn, INVALID_MFN) )
         return;
 
     ASSERT(cpu_has_vmx_virtualize_apic_accesses);
 
     virt_page_ma = page_to_maddr(vcpu_vlapic(v)->regs_page);
-    apic_page_ma = mfn_to_maddr(v->domain->arch.hvm.vmx.apic_access_mfn);
+    apic_page_ma = mfn_to_maddr(apic_access_mfn);
 
     vmx_vmcs_enter(v);
     __vmwrite(VIRTUAL_APIC_PAGE_ADDR, virt_page_ma);
--- a/xen/arch/x86/mm/shadow/set.c
+++ b/xen/arch/x86/mm/shadow/set.c
@@ -94,6 +94,15 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
     ASSERT(!sh_l1e_is_magic(sl1e));
     ASSERT(shadow_mode_refcounts(d));
 
+    /*
+     * Check whether refcounting is suppressed on this page. For example,
+     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
+     * get accessed, and hence there's no need to refcount it.
+     */
+    mfn = shadow_l1e_get_mfn(sl1e);
+    if ( mfn_valid(mfn) && page_refcounting_suppressed(mfn_to_page(mfn)) )
+        return 0;
+
     res = get_page_from_l1e(sl1e, d, d);
 
     /*
--- a/xen/arch/x86/mm/shadow/types.h
+++ b/xen/arch/x86/mm/shadow/types.h
@@ -276,9 +276,16 @@ int shadow_set_l4e(struct domain *d, sha
 static void inline
 shadow_put_page_from_l1e(shadow_l1e_t sl1e, struct domain *d)
 {
+    mfn_t mfn;
+
     if ( !shadow_mode_refcounts(d) )
         return;
 
+    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) &&
+         /* See the respective comment in shadow_get_page_from_l1e(). */
+         page_refcounting_suppressed(mfn_to_page(mfn)) )
+        return;
+
     put_page_from_l1e(sl1e, d);
 }
 
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -58,7 +58,6 @@ struct ept_data {
 #define _VMX_DOMAIN_PML_ENABLED    0
 #define VMX_DOMAIN_PML_ENABLED     (1ul << _VMX_DOMAIN_PML_ENABLED)
 struct vmx_domain {
-    mfn_t apic_access_mfn;
     /* VMX_DOMAIN_* */
     unsigned int status;
 
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -82,7 +82,7 @@
 #define PGC_state_offlined PG_mask(2, 9)
 #define PGC_state_free    PG_mask(3, 9)
 #define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
-/* Page is not reference counted */
+/* Page is not reference counted (see below for caveats) */
 #define _PGC_extra        PG_shift(10)
 #define PGC_extra         PG_mask(1, 10)
 
@@ -374,6 +374,24 @@ void zap_ro_mpt(mfn_t mfn);
 
 bool is_iomem_page(mfn_t mfn);
 
+/*
+ * Pages with no owner which may get passed to functions wanting to
+ * refcount them can be marked PGC_extra to bypass this refcounting (which
+ * would fail due to the lack of an owner).
+ *
+ * (For pages with owner PGC_extra has different meaning.)
+ */
+static inline void page_suppress_refcounting(struct page_info *pg)
+{
+   ASSERT(!page_get_owner(pg));
+   pg->count_info |= PGC_extra;
+}
+
+static inline bool page_refcounting_suppressed(const struct page_info *pg)
+{
+    return !page_get_owner(pg) && (pg->count_info & PGC_extra);
+}
+
 struct platform_bad_page {
     unsigned long mfn;
     unsigned int order;



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v4 2/3] x86/shadow: re-use variables in shadow_get_page_from_l1e()
  2021-04-23 10:51 ` [PATCH v4 0/3] VMX APIC access page and shadow adjustments Jan Beulich
  2021-04-23 10:52   ` [PATCH v4 1/3] VMX: use a single, global APIC access page Jan Beulich
@ 2021-04-23 10:53   ` Jan Beulich
  2021-04-23 10:54   ` [PATCH v4 3/3] x86/shadow: streamline shadow_get_page_from_l1e() Jan Beulich
  2021-04-23 11:00   ` Really v5 (was: [PATCH v4 0/3] VMX APIC access page and shadow adjustments) Jan Beulich
  3 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2021-04-23 10:53 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné, George Dunlap, Tim Deegan

There's little point in doing multiple mfn_to_page() or page_get_owner()
on all the same MFN. Calculate them once at the start of the function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
---
v5: Integrate into series. Re-base.
v2: Re-base.

--- a/xen/arch/x86/mm/shadow/set.c
+++ b/xen/arch/x86/mm/shadow/set.c
@@ -88,19 +88,25 @@ static int inline
 shadow_get_page_from_l1e(shadow_l1e_t sl1e, struct domain *d, p2m_type_t type)
 {
     int res;
-    mfn_t mfn;
-    struct domain *owner;
+    mfn_t mfn = shadow_l1e_get_mfn(sl1e);
+    const struct page_info *pg = NULL;
+    struct domain *owner = NULL;
 
     ASSERT(!sh_l1e_is_magic(sl1e));
     ASSERT(shadow_mode_refcounts(d));
 
+    if ( mfn_valid(mfn) )
+    {
+        pg = mfn_to_page(mfn);
+        owner = page_get_owner(pg);
+    }
+
     /*
      * Check whether refcounting is suppressed on this page. For example,
      * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
      * get accessed, and hence there's no need to refcount it.
      */
-    mfn = shadow_l1e_get_mfn(sl1e);
-    if ( mfn_valid(mfn) && page_refcounting_suppressed(mfn_to_page(mfn)) )
+    if ( pg && page_refcounting_suppressed(pg) )
         return 0;
 
     res = get_page_from_l1e(sl1e, d, d);
@@ -111,9 +117,7 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
      */
     if ( unlikely(res < 0) &&
          !shadow_mode_translate(d) &&
-         mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) &&
-         (owner = page_get_owner(mfn_to_page(mfn))) &&
-         (d != owner) )
+         owner && (d != owner) )
     {
         res = xsm_priv_mapping(XSM_TARGET, d, owner);
         if ( !res )
@@ -136,9 +140,8 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
          * already have checked that we're supposed to have access, so
          * we can just grab a reference directly.
          */
-        mfn = shadow_l1e_get_mfn(sl1e);
-        if ( mfn_valid(mfn) )
-            res = get_page_from_l1e(sl1e, d, page_get_owner(mfn_to_page(mfn)));
+        if ( owner )
+            res = get_page_from_l1e(sl1e, d, owner);
     }
 
     if ( unlikely(res < 0) )



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v4 3/3] x86/shadow: streamline shadow_get_page_from_l1e()
  2021-04-23 10:51 ` [PATCH v4 0/3] VMX APIC access page and shadow adjustments Jan Beulich
  2021-04-23 10:52   ` [PATCH v4 1/3] VMX: use a single, global APIC access page Jan Beulich
  2021-04-23 10:53   ` [PATCH v4 2/3] x86/shadow: re-use variables in shadow_get_page_from_l1e() Jan Beulich
@ 2021-04-23 10:54   ` Jan Beulich
  2021-04-23 11:00   ` Really v5 (was: [PATCH v4 0/3] VMX APIC access page and shadow adjustments) Jan Beulich
  3 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2021-04-23 10:54 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné, George Dunlap, Tim Deegan

Trying get_page_from_l1e() up to three times isn't helpful; in debug
builds it may lead to log messages making things look as if there was a
problem somewhere. And there's no need to have more than one try: The
function can only possibly succeed for one domain passed as 3rd
argument (unless the page is an MMIO one to which both have access,
but MMIO pages should be "got" by specifying the requesting domain
anyway). Re-arrange things so just the one call gets made which has a
chance of succeeding.

The code could in principle be arranged such that there's only a single
call to get_page_from_l1e(), but the conditional would become pretty
complex then and hence hard to follow / understand / adjust.

The redundant (with shadow_mode_refcounts()) shadow_mode_translate()
gets dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
---
v5: Integrate into series. Re-base.

--- a/xen/arch/x86/mm/shadow/set.c
+++ b/xen/arch/x86/mm/shadow/set.c
@@ -109,40 +109,36 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
     if ( pg && page_refcounting_suppressed(pg) )
         return 0;
 
-    res = get_page_from_l1e(sl1e, d, d);
+    if ( owner == dom_io )
+        owner = NULL;
 
     /*
      * If a privileged domain is attempting to install a map of a page it does
      * not own, we let it succeed anyway.
      */
-    if ( unlikely(res < 0) &&
-         !shadow_mode_translate(d) &&
-         owner && (d != owner) )
+    if ( owner && (d != owner) &&
+         !(res = xsm_priv_mapping(XSM_TARGET, d, owner)) )
     {
-        res = xsm_priv_mapping(XSM_TARGET, d, owner);
-        if ( !res )
-        {
-            res = get_page_from_l1e(sl1e, d, owner);
-            SHADOW_PRINTK("privileged %pd installs map of mfn %"PRI_mfn" owned by %pd: %s\n",
-                           d, mfn_x(mfn), owner,
-                           res >= 0 ? "success" : "failed");
-        }
+        res = get_page_from_l1e(sl1e, d, owner);
+        SHADOW_PRINTK("privileged %pd installs map of %pd's mfn %"PRI_mfn": %s\n",
+                      d, owner, mfn_x(mfn),
+                      res >= 0 ? "success" : "failed");
     }
-
     /* Okay, it might still be a grant mapping PTE.  Try it. */
-    if ( unlikely(res < 0) &&
-         (type == p2m_grant_map_rw ||
-          (type == p2m_grant_map_ro &&
-           !(shadow_l1e_get_flags(sl1e) & _PAGE_RW))) )
+    else if ( owner &&
+              (type == p2m_grant_map_rw ||
+               (type == p2m_grant_map_ro &&
+                !(shadow_l1e_get_flags(sl1e) & _PAGE_RW))) )
     {
         /*
          * It's a grant mapping.  The grant table implementation will
          * already have checked that we're supposed to have access, so
          * we can just grab a reference directly.
          */
-        if ( owner )
-            res = get_page_from_l1e(sl1e, d, owner);
+        res = get_page_from_l1e(sl1e, d, owner);
     }
+    else
+        res = get_page_from_l1e(sl1e, d, d);
 
     if ( unlikely(res < 0) )
     {



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Really v5 (was: [PATCH v4 0/3] VMX APIC access page and shadow adjustments)
  2021-04-23 10:51 ` [PATCH v4 0/3] VMX APIC access page and shadow adjustments Jan Beulich
                     ` (2 preceding siblings ...)
  2021-04-23 10:54   ` [PATCH v4 3/3] x86/shadow: streamline shadow_get_page_from_l1e() Jan Beulich
@ 2021-04-23 11:00   ` Jan Beulich
  3 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2021-04-23 11:00 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima, Tim Deegan

On 23.04.2021 12:51, Jan Beulich wrote:
> 1: VMX: use a single, global APIC access page
> 2: x86/shadow: re-use variables in shadow_get_page_from_l1e()
> 3: x86/shadow: streamline shadow_get_page_from_l1e()

I'm sorry, I'm noticing only now that I've typoed the version.

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 1/3] VMX: use a single, global APIC access page
  2021-04-23 10:52   ` [PATCH v4 1/3] VMX: use a single, global APIC access page Jan Beulich
@ 2021-04-23 14:17     ` Roger Pau Monné
  2021-04-23 14:42       ` Jan Beulich
  2021-04-25  1:27     ` Tian, Kevin
  2021-04-26 17:53     ` Tim Deegan
  2 siblings, 1 reply; 32+ messages in thread
From: Roger Pau Monné @ 2021-04-23 14:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima, Tim Deegan

On Fri, Apr 23, 2021 at 12:52:57PM +0200, Jan Beulich wrote:
> --- a/xen/arch/x86/mm/shadow/set.c
> +++ b/xen/arch/x86/mm/shadow/set.c
> @@ -94,6 +94,15 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>      ASSERT(!sh_l1e_is_magic(sl1e));
>      ASSERT(shadow_mode_refcounts(d));
>  
> +    /*
> +     * Check whether refcounting is suppressed on this page. For example,
> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> +     * get accessed, and hence there's no need to refcount it.
> +     */
> +    mfn = shadow_l1e_get_mfn(sl1e);
> +    if ( mfn_valid(mfn) && page_refcounting_suppressed(mfn_to_page(mfn)) )
> +        return 0;
> +
>      res = get_page_from_l1e(sl1e, d, d);
>  
>      /*
> --- a/xen/arch/x86/mm/shadow/types.h
> +++ b/xen/arch/x86/mm/shadow/types.h
> @@ -276,9 +276,16 @@ int shadow_set_l4e(struct domain *d, sha
>  static void inline
>  shadow_put_page_from_l1e(shadow_l1e_t sl1e, struct domain *d)
>  {
> +    mfn_t mfn;
> +
>      if ( !shadow_mode_refcounts(d) )
>          return;
>  
> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) &&

Nit: I would prefer if assigned mfn outside of the condition, like
it's done in the chunk added to shadow_get_page_from_l1e. The rest
LGTM, so:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 1/3] VMX: use a single, global APIC access page
  2021-04-23 14:17     ` Roger Pau Monné
@ 2021-04-23 14:42       ` Jan Beulich
  2021-04-26 17:55         ` Tim Deegan
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2021-04-23 14:42 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima, Tim Deegan

On 23.04.2021 16:17, Roger Pau Monné wrote:
> On Fri, Apr 23, 2021 at 12:52:57PM +0200, Jan Beulich wrote:
>> --- a/xen/arch/x86/mm/shadow/set.c
>> +++ b/xen/arch/x86/mm/shadow/set.c
>> @@ -94,6 +94,15 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>>      ASSERT(!sh_l1e_is_magic(sl1e));
>>      ASSERT(shadow_mode_refcounts(d));
>>  
>> +    /*
>> +     * Check whether refcounting is suppressed on this page. For example,
>> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
>> +     * get accessed, and hence there's no need to refcount it.
>> +     */
>> +    mfn = shadow_l1e_get_mfn(sl1e);
>> +    if ( mfn_valid(mfn) && page_refcounting_suppressed(mfn_to_page(mfn)) )
>> +        return 0;
>> +
>>      res = get_page_from_l1e(sl1e, d, d);
>>  
>>      /*
>> --- a/xen/arch/x86/mm/shadow/types.h
>> +++ b/xen/arch/x86/mm/shadow/types.h
>> @@ -276,9 +276,16 @@ int shadow_set_l4e(struct domain *d, sha
>>  static void inline
>>  shadow_put_page_from_l1e(shadow_l1e_t sl1e, struct domain *d)
>>  {
>> +    mfn_t mfn;
>> +
>>      if ( !shadow_mode_refcounts(d) )
>>          return;
>>  
>> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) &&
> 
> Nit: I would prefer if assigned mfn outside of the condition, like
> it's done in the chunk added to shadow_get_page_from_l1e.

Well, I did it differently here because the variable really is
only needed inside the if(), whereas in "get" the subsequent
patches use it elsewhere as well. I'll wait what Tim says.

> The rest LGTM, so:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH v4 1/3] VMX: use a single, global APIC access page
  2021-04-23 10:52   ` [PATCH v4 1/3] VMX: use a single, global APIC access page Jan Beulich
  2021-04-23 14:17     ` Roger Pau Monné
@ 2021-04-25  1:27     ` Tian, Kevin
  2021-04-26 17:53     ` Tim Deegan
  2 siblings, 0 replies; 32+ messages in thread
From: Tian, Kevin @ 2021-04-25  1:27 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Cooper, Andrew, Wei Liu, Roger Pau Monné,
	George Dunlap, Nakajima, Jun, Tim Deegan

> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, April 23, 2021 6:53 PM
> 
> The address of this page is used by the CPU only to recognize when to
> access the virtual APIC page instead. No accesses would ever go to this
> page. It only needs to be present in the (CPU) page tables so that
> address translation will produce its address as result for respective
> accesses.
> 
> By making this page global, we also eliminate the need to refcount it,
> or to assign it to any domain in the first place.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
> v5: Init apic_access_mfn to INVALID_MFN. Move assignment out of if()
>     condition. Introduce page_suppress_refcounting() and
>     page_refcounting_suppressed().
> v4: Set PGC_extra on the page. Make shadow mode work.
> v3: Split p2m insertion change to a separate patch.
> v2: Avoid insertion when !has_vlapic(). Split off change to
>     p2m_get_iommu_flags().
> ---
> I did further consider not allocating any real page at all, but just
> using the address of some unpopulated space (which would require
> announcing this page as reserved to Dom0, so it wouldn't put any PCI
> MMIO BARs there). But I thought this would be too controversial, because
> of the possible risks associated with this.
> 
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -66,8 +66,7 @@ boolean_param("force-ept", opt_force_ept
>  static void vmx_ctxt_switch_from(struct vcpu *v);
>  static void vmx_ctxt_switch_to(struct vcpu *v);
> 
> -static int  vmx_alloc_vlapic_mapping(struct domain *d);
> -static void vmx_free_vlapic_mapping(struct domain *d);
> +static int alloc_vlapic_mapping(void);
>  static void vmx_install_vlapic_mapping(struct vcpu *v);
>  static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr,
>                                  unsigned int flags);
> @@ -78,6 +77,8 @@ static int vmx_msr_read_intercept(unsign
>  static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
>  static void vmx_invlpg(struct vcpu *v, unsigned long linear);
> 
> +static mfn_t __read_mostly apic_access_mfn = INVALID_MFN_INITIALIZER;
> +
>  /* Values for domain's ->arch.hvm_domain.pi_ops.flags. */
>  #define PI_CSW_FROM (1u << 0)
>  #define PI_CSW_TO   (1u << 1)
> @@ -401,7 +402,6 @@ static int vmx_domain_initialise(struct
>          .to   = vmx_ctxt_switch_to,
>          .tail = vmx_do_resume,
>      };
> -    int rc;
> 
>      d->arch.ctxt_switch = &csw;
> 
> @@ -411,28 +411,22 @@ static int vmx_domain_initialise(struct
>       */
>      d->arch.hvm.vmx.exec_sp = is_hardware_domain(d) || opt_ept_exec_sp;
> 
> -    if ( !has_vlapic(d) )
> -        return 0;
> -
> -    if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
> -        return rc;
> -
>      return 0;
>  }
> 
> -static void vmx_domain_relinquish_resources(struct domain *d)
> +static void domain_creation_finished(struct domain *d)
>  {
> -    if ( !has_vlapic(d) )
> +    gfn_t gfn = gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE);
> +    uint8_t ipat;
> +
> +    if ( !has_vlapic(d) || mfn_eq(apic_access_mfn, INVALID_MFN) )
>          return;
> 
> -    vmx_free_vlapic_mapping(d);
> -}
> +    ASSERT(epte_get_entry_emt(d, gfn_x(gfn), apic_access_mfn, 0, &ipat,
> +                              true) == MTRR_TYPE_WRBACK);
> +    ASSERT(ipat);
> 
> -static void domain_creation_finished(struct domain *d)
> -{
> -    if ( has_vlapic(d) && !mfn_eq(d->arch.hvm.vmx.apic_access_mfn,
> _mfn(0)) &&
> -         set_mmio_p2m_entry(d, gaddr_to_gfn(APIC_DEFAULT_PHYS_BASE),
> -                            d->arch.hvm.vmx.apic_access_mfn, PAGE_ORDER_4K) )
> +    if ( set_mmio_p2m_entry(d, gfn, apic_access_mfn, PAGE_ORDER_4K) )
>          domain_crash(d);
>  }
> 
> @@ -2415,7 +2409,6 @@ static struct hvm_function_table __initd
>      .cpu_up_prepare       = vmx_cpu_up_prepare,
>      .cpu_dead             = vmx_cpu_dead,
>      .domain_initialise    = vmx_domain_initialise,
> -    .domain_relinquish_resources = vmx_domain_relinquish_resources,
>      .domain_creation_finished = domain_creation_finished,
>      .vcpu_initialise      = vmx_vcpu_initialise,
>      .vcpu_destroy         = vmx_vcpu_destroy,
> @@ -2662,7 +2655,7 @@ const struct hvm_function_table * __init
>  {
>      set_in_cr4(X86_CR4_VMXE);
> 
> -    if ( vmx_vmcs_init() )
> +    if ( vmx_vmcs_init() || alloc_vlapic_mapping() )
>      {
>          printk("VMX: failed to initialise.\n");
>          return NULL;
> @@ -3224,7 +3217,7 @@ gp_fault:
>      return X86EMUL_EXCEPTION;
>  }
> 
> -static int vmx_alloc_vlapic_mapping(struct domain *d)
> +static int __init alloc_vlapic_mapping(void)
>  {
>      struct page_info *pg;
>      mfn_t mfn;
> @@ -3232,52 +3225,34 @@ static int vmx_alloc_vlapic_mapping(stru
>      if ( !cpu_has_vmx_virtualize_apic_accesses )
>          return 0;
> 
> -    pg = alloc_domheap_page(d, MEMF_no_refcount);
> +    pg = alloc_domheap_page(NULL, 0);
>      if ( !pg )
>          return -ENOMEM;
> 
> -    if ( !get_page_and_type(pg, d, PGT_writable_page) )
> -    {
> -        /*
> -         * The domain can't possibly know about this page yet, so failure
> -         * here is a clear indication of something fishy going on.
> -         */
> -        domain_crash(d);
> -        return -ENODATA;
> -    }
> +    /*
> +     * Signal to shadow code that this page cannot be refcounted. This also
> +     * makes epte_get_entry_emt() recognize this page as "special".
> +     */
> +    page_suppress_refcounting(pg);
> 
>      mfn = page_to_mfn(pg);
>      clear_domain_page(mfn);
> -    d->arch.hvm.vmx.apic_access_mfn = mfn;
> +    apic_access_mfn = mfn;
> 
>      return 0;
>  }
> 
> -static void vmx_free_vlapic_mapping(struct domain *d)
> -{
> -    mfn_t mfn = d->arch.hvm.vmx.apic_access_mfn;
> -
> -    d->arch.hvm.vmx.apic_access_mfn = _mfn(0);
> -    if ( !mfn_eq(mfn, _mfn(0)) )
> -    {
> -        struct page_info *pg = mfn_to_page(mfn);
> -
> -        put_page_alloc_ref(pg);
> -        put_page_and_type(pg);
> -    }
> -}
> -
>  static void vmx_install_vlapic_mapping(struct vcpu *v)
>  {
>      paddr_t virt_page_ma, apic_page_ma;
> 
> -    if ( mfn_eq(v->domain->arch.hvm.vmx.apic_access_mfn, _mfn(0)) )
> +    if ( !has_vlapic(v->domain) || mfn_eq(apic_access_mfn, INVALID_MFN) )
>          return;
> 
>      ASSERT(cpu_has_vmx_virtualize_apic_accesses);
> 
>      virt_page_ma = page_to_maddr(vcpu_vlapic(v)->regs_page);
> -    apic_page_ma = mfn_to_maddr(v->domain-
> >arch.hvm.vmx.apic_access_mfn);
> +    apic_page_ma = mfn_to_maddr(apic_access_mfn);
> 
>      vmx_vmcs_enter(v);
>      __vmwrite(VIRTUAL_APIC_PAGE_ADDR, virt_page_ma);
> --- a/xen/arch/x86/mm/shadow/set.c
> +++ b/xen/arch/x86/mm/shadow/set.c
> @@ -94,6 +94,15 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>      ASSERT(!sh_l1e_is_magic(sl1e));
>      ASSERT(shadow_mode_refcounts(d));
> 
> +    /*
> +     * Check whether refcounting is suppressed on this page. For example,
> +     * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> +     * get accessed, and hence there's no need to refcount it.
> +     */
> +    mfn = shadow_l1e_get_mfn(sl1e);
> +    if ( mfn_valid(mfn) &&
> page_refcounting_suppressed(mfn_to_page(mfn)) )
> +        return 0;
> +
>      res = get_page_from_l1e(sl1e, d, d);
> 
>      /*
> --- a/xen/arch/x86/mm/shadow/types.h
> +++ b/xen/arch/x86/mm/shadow/types.h
> @@ -276,9 +276,16 @@ int shadow_set_l4e(struct domain *d, sha
>  static void inline
>  shadow_put_page_from_l1e(shadow_l1e_t sl1e, struct domain *d)
>  {
> +    mfn_t mfn;
> +
>      if ( !shadow_mode_refcounts(d) )
>          return;
> 
> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) &&
> +         /* See the respective comment in shadow_get_page_from_l1e(). */
> +         page_refcounting_suppressed(mfn_to_page(mfn)) )
> +        return;
> +
>      put_page_from_l1e(sl1e, d);
>  }
> 
> --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> @@ -58,7 +58,6 @@ struct ept_data {
>  #define _VMX_DOMAIN_PML_ENABLED    0
>  #define VMX_DOMAIN_PML_ENABLED     (1ul <<
> _VMX_DOMAIN_PML_ENABLED)
>  struct vmx_domain {
> -    mfn_t apic_access_mfn;
>      /* VMX_DOMAIN_* */
>      unsigned int status;
> 
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -82,7 +82,7 @@
>  #define PGC_state_offlined PG_mask(2, 9)
>  #define PGC_state_free    PG_mask(3, 9)
>  #define page_state_is(pg, st) (((pg)->count_info&PGC_state) ==
> PGC_state_##st)
> -/* Page is not reference counted */
> +/* Page is not reference counted (see below for caveats) */
>  #define _PGC_extra        PG_shift(10)
>  #define PGC_extra         PG_mask(1, 10)
> 
> @@ -374,6 +374,24 @@ void zap_ro_mpt(mfn_t mfn);
> 
>  bool is_iomem_page(mfn_t mfn);
> 
> +/*
> + * Pages with no owner which may get passed to functions wanting to
> + * refcount them can be marked PGC_extra to bypass this refcounting
> (which
> + * would fail due to the lack of an owner).
> + *
> + * (For pages with owner PGC_extra has different meaning.)
> + */
> +static inline void page_suppress_refcounting(struct page_info *pg)
> +{
> +   ASSERT(!page_get_owner(pg));
> +   pg->count_info |= PGC_extra;
> +}
> +
> +static inline bool page_refcounting_suppressed(const struct page_info *pg)
> +{
> +    return !page_get_owner(pg) && (pg->count_info & PGC_extra);
> +}
> +
>  struct platform_bad_page {
>      unsigned long mfn;
>      unsigned int order;


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 1/3] VMX: use a single, global APIC access page
  2021-04-23 10:52   ` [PATCH v4 1/3] VMX: use a single, global APIC access page Jan Beulich
  2021-04-23 14:17     ` Roger Pau Monné
  2021-04-25  1:27     ` Tian, Kevin
@ 2021-04-26 17:53     ` Tim Deegan
  2 siblings, 0 replies; 32+ messages in thread
From: Tim Deegan @ 2021-04-26 17:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Roger Pau Monné,
	George Dunlap, Kevin Tian, Jun Nakajima

At 12:52 +0200 on 23 Apr (1619182377), Jan Beulich wrote:
> The address of this page is used by the CPU only to recognize when to
> access the virtual APIC page instead. No accesses would ever go to this
> page. It only needs to be present in the (CPU) page tables so that
> address translation will produce its address as result for respective
> accesses.
> 
> By making this page global, we also eliminate the need to refcount it,
> or to assign it to any domain in the first place.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Looks good, thanks for the changes!

Acked-by: Tim Deegan <tim@xen.org>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 1/3] VMX: use a single, global APIC access page
  2021-04-23 14:42       ` Jan Beulich
@ 2021-04-26 17:55         ` Tim Deegan
  0 siblings, 0 replies; 32+ messages in thread
From: Tim Deegan @ 2021-04-26 17:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Roger Pau Monné,
	xen-devel, Andrew Cooper, Wei Liu, George Dunlap, Kevin Tian,
	Jun Nakajima

At 16:42 +0200 on 23 Apr (1619196141), Jan Beulich wrote:
> On 23.04.2021 16:17, Roger Pau Monné wrote:
> > On Fri, Apr 23, 2021 at 12:52:57PM +0200, Jan Beulich wrote:
> >> +    if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) &&
> > 
> > Nit: I would prefer if assigned mfn outside of the condition, like
> > it's done in the chunk added to shadow_get_page_from_l1e.
> 
> Well, I did it differently here because the variable really is
> only needed inside the if(), whereas in "get" the subsequent
> patches use it elsewhere as well. I'll wait what Tim says.

No strong feelings on this, but since you asked me, I would also
prefer it to be outside the condition.

Cheers,

Tim.


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2021-04-26 17:55 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-22 10:55 [PATCH v3 0/2] VMX: apic access page handling adjustments Jan Beulich
2021-02-22 10:56 ` [PATCH v3 1/2][4.15] VMX: delay p2m insertion of APIC access page Jan Beulich
2021-02-22 11:25   ` Ian Jackson
2021-02-22 14:05     ` Jan Beulich
2021-02-22 17:17       ` Ian Jackson
2021-02-22 12:15   ` Roger Pau Monné
2021-02-25  8:44   ` Jan Beulich
2021-02-26  7:06     ` Tian, Kevin
2021-02-22 10:57 ` [PATCH v3 2/2] VMX: use a single, global " Jan Beulich
2021-03-01  2:34   ` Tian, Kevin
2021-03-01  8:18     ` Jan Beulich
2021-04-12 10:40 ` [PATCH v4] " Jan Beulich
2021-04-12 15:31   ` Roger Pau Monné
2021-04-13  9:24     ` Jan Beulich
2021-04-13 10:18       ` Roger Pau Monné
2021-04-13 12:03         ` Jan Beulich
2021-04-13 13:03           ` Roger Pau Monné
2021-04-17 19:24   ` Tim Deegan
2021-04-19 11:25     ` Jan Beulich
2021-04-22  7:42       ` Tim Deegan
2021-04-22  9:38         ` Jan Beulich
2021-04-22 15:05           ` Tim Deegan
2021-04-23 10:51 ` [PATCH v4 0/3] VMX APIC access page and shadow adjustments Jan Beulich
2021-04-23 10:52   ` [PATCH v4 1/3] VMX: use a single, global APIC access page Jan Beulich
2021-04-23 14:17     ` Roger Pau Monné
2021-04-23 14:42       ` Jan Beulich
2021-04-26 17:55         ` Tim Deegan
2021-04-25  1:27     ` Tian, Kevin
2021-04-26 17:53     ` Tim Deegan
2021-04-23 10:53   ` [PATCH v4 2/3] x86/shadow: re-use variables in shadow_get_page_from_l1e() Jan Beulich
2021-04-23 10:54   ` [PATCH v4 3/3] x86/shadow: streamline shadow_get_page_from_l1e() Jan Beulich
2021-04-23 11:00   ` Really v5 (was: [PATCH v4 0/3] VMX APIC access page and shadow adjustments) Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).