* [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info
@ 2020-03-09 10:22 paul
  2020-03-09 10:22 ` [Xen-devel] [PATCH v5 1/6] domain: introduce alloc/free_shared_info() helpers paul
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: paul @ 2020-03-09 10:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant

From: Paul Durrant <paul@xen.org>

Paul Durrant (6):
  domain: introduce alloc/free_shared_info() helpers...
  x86 / p2m: replace page_list check in p2m_alloc_table...
  x86 / pv: do not treat PGC_extra pages as RAM
  x86 / ioreq: use a MEMF_no_refcount allocation for server pages...
  mm: add 'is_special_page' inline function...
  domain: use PGC_extra domheap page for shared_info

 xen/arch/arm/domain.c           | 10 +++---
 xen/arch/arm/mm.c               |  2 +-
 xen/arch/x86/domain.c           | 20 +++++++-----
 xen/arch/x86/domctl.c           |  2 +-
 xen/arch/x86/hvm/ioreq.c        |  2 +-
 xen/arch/x86/mm.c               | 11 +++----
 xen/arch/x86/mm/altp2m.c        |  2 +-
 xen/arch/x86/mm/mem_sharing.c   |  3 +-
 xen/arch/x86/mm/p2m.c           |  6 ++--
 xen/arch/x86/mm/shadow/common.c | 13 +++++---
 xen/arch/x86/mm/shadow/multi.c  |  2 +-
 xen/arch/x86/numa.c             |  3 ++
 xen/arch/x86/pv/dom0_build.c    |  6 +++-
 xen/arch/x86/pv/shim.c          |  2 +-
 xen/arch/x86/tboot.c            | 11 +++++--
 xen/common/domain.c             | 56 +++++++++++++++++++++++++++++++++
 xen/common/domctl.c             |  2 +-
 xen/common/event_channel.c      |  3 ++
 xen/common/time.c               | 19 +++++++++--
 xen/include/asm-x86/shared.h    | 15 ++++-----
 xen/include/xen/domain.h        |  4 +++
 xen/include/xen/mm.h            |  5 +++
 xen/include/xen/sched.h         |  5 ++-
 xen/include/xen/shared.h        |  2 +-
 24 files changed, 155 insertions(+), 51 deletions(-)

-- 
2.20.1



* [Xen-devel] [PATCH v5 1/6] domain: introduce alloc/free_shared_info() helpers...
  2020-03-09 10:22 [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info paul
@ 2020-03-09 10:22 ` paul
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 2/6] x86 / p2m: replace page_list check in p2m_alloc_table paul
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: paul @ 2020-03-09 10:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Paul Durrant,
	Andrew Cooper, Paul Durrant, Konrad Rzeszutek Wilk, Ian Jackson,
	George Dunlap, Hongyan Xia, Jan Beulich, Volodymyr Babchuk,
	Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

... and save the MFN.

This patch modifies the 'shared_info' field of struct domain to be
a structure comprising an MFN and a virtual address. Allocations are
still done from the xenheap, so the virtual address still equates to
mfn_to_virt() applied to the MFN, but a subsequent patch will change
this. Hence the need to save the MFN.
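
To illustrate, the field changes shape as follows (this merely
summarises the xen/include/xen/sched.h hunk below):

    /* Before: only a mapped pointer; the MFN must be recomputed. */
    shared_info_t   *shared_info;     /* shared data area */

    /* After: the MFN is carried alongside the mapping. */
    struct {
        mfn_t mfn;
        shared_info_t *virt;
    } shared_info; /* shared data area */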

NOTE: Whilst defining the new helpers, virt_to_mfn() in common/domain.c
      is made type-safe.
      The definition of nmi_reason() in asm-x86/shared.h is also re-
      flowed to avoid overly long lines.

Signed-off-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <julien@xen.org>
Reviewed-by: Hongyan Xia <hongyxia@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
 xen/arch/arm/domain.c        |  8 ++------
 xen/arch/arm/mm.c            |  2 +-
 xen/arch/x86/domain.c        | 11 ++++-------
 xen/arch/x86/mm.c            |  2 +-
 xen/arch/x86/pv/dom0_build.c |  2 +-
 xen/arch/x86/pv/shim.c       |  2 +-
 xen/common/domain.c          | 26 ++++++++++++++++++++++++++
 xen/common/domctl.c          |  2 +-
 xen/common/time.c            |  4 ++--
 xen/include/asm-x86/shared.h | 15 ++++++++-------
 xen/include/xen/domain.h     |  3 +++
 xen/include/xen/sched.h      |  5 ++++-
 xen/include/xen/shared.h     |  2 +-
 13 files changed, 55 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 6627be2922..5298d80bd2 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -689,13 +689,9 @@ int arch_domain_create(struct domain *d,
     if ( (rc = p2m_init(d)) != 0 )
         goto fail;
 
-    rc = -ENOMEM;
-    if ( (d->shared_info = alloc_xenheap_pages(0, 0)) == NULL )
+    if ( (rc = alloc_shared_info(d, 0)) != 0 )
         goto fail;
 
-    clear_page(d->shared_info);
-    share_xen_page_with_guest(virt_to_page(d->shared_info), d, SHARE_rw);
-
     switch ( config->arch.gic_version )
     {
     case XEN_DOMCTL_CONFIG_GIC_V2:
@@ -766,7 +762,7 @@ void arch_domain_destroy(struct domain *d)
     p2m_teardown(d);
     domain_vgic_free(d);
     domain_vuart_free(d);
-    free_xenheap_page(d->shared_info);
+    free_shared_info(d);
 #ifdef CONFIG_ACPI
     free_xenheap_pages(d->arch.efi_acpi_table,
                        get_order_from_bytes(d->arch.efi_acpi_len));
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 727107eefa..2bb592101d 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1424,7 +1424,7 @@ int xenmem_add_to_physmap_one(
         if ( idx != 0 )
             return -EINVAL;
 
-        mfn = virt_to_mfn(d->shared_info);
+        mfn = d->shared_info.mfn;
         t = p2m_ram_rw;
 
         break;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index caf2ecad7e..bdcc0d972a 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -611,12 +611,9 @@ int arch_domain_create(struct domain *d,
      * The shared_info machine address must fit in a 32-bit field within a
      * 32-bit guest's start_info structure. Hence we specify MEMF_bits(32).
      */
-    if ( (d->shared_info = alloc_xenheap_pages(0, MEMF_bits(32))) == NULL )
+    if ( (rc = alloc_shared_info(d, MEMF_bits(32))) != 0 )
         goto fail;
 
-    clear_page(d->shared_info);
-    share_xen_page_with_guest(virt_to_page(d->shared_info), d, SHARE_rw);
-
     if ( (rc = init_domain_irq_mapping(d)) != 0 )
         goto fail;
 
@@ -664,7 +661,7 @@ int arch_domain_create(struct domain *d,
     psr_domain_free(d);
     iommu_domain_destroy(d);
     cleanup_domain_irq_mapping(d);
-    free_xenheap_page(d->shared_info);
+    free_shared_info(d);
     xfree(d->arch.cpuid);
     xfree(d->arch.msr);
     if ( paging_initialised )
@@ -693,7 +690,7 @@ void arch_domain_destroy(struct domain *d)
         pv_domain_destroy(d);
     free_perdomain_mappings(d);
 
-    free_xenheap_page(d->shared_info);
+    free_shared_info(d);
     cleanup_domain_irq_mapping(d);
 
     psr_domain_free(d);
@@ -719,7 +716,7 @@ void arch_domain_unpause(struct domain *d)
 
 int arch_domain_soft_reset(struct domain *d)
 {
-    struct page_info *page = virt_to_page(d->shared_info), *new_page;
+    struct page_info *page = mfn_to_page(d->shared_info.mfn), *new_page;
     int ret = 0;
     struct domain *owner;
     mfn_t mfn;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 62507ca651..ba7563ed3c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4540,7 +4540,7 @@ int xenmem_add_to_physmap_one(
     {
         case XENMAPSPACE_shared_info:
             if ( idx == 0 )
-                mfn = virt_to_mfn(d->shared_info);
+                mfn = d->shared_info.mfn;
             break;
         case XENMAPSPACE_grant_table:
             rc = gnttab_map_frame(d, idx, gpfn, &mfn);
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index 5678da782d..dc16ef2e79 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -743,7 +743,7 @@ int __init dom0_construct_pv(struct domain *d,
     clear_page(si);
     si->nr_pages = nr_pages;
 
-    si->shared_info = virt_to_maddr(d->shared_info);
+    si->shared_info = mfn_to_maddr(d->shared_info.mfn);
 
     if ( !pv_shim )
         si->flags    = SIF_PRIVILEGED | SIF_INITDOMAIN;
diff --git a/xen/arch/x86/pv/shim.c b/xen/arch/x86/pv/shim.c
index d86e2de118..f512809dad 100644
--- a/xen/arch/x86/pv/shim.c
+++ b/xen/arch/x86/pv/shim.c
@@ -282,7 +282,7 @@ static void write_start_info(struct domain *d)
     snprintf(si->magic, sizeof(si->magic), "xen-3.0-x86_%s",
              is_pv_32bit_domain(d) ? "32p" : "64");
     si->nr_pages = domain_tot_pages(d);
-    si->shared_info = virt_to_maddr(d->shared_info);
+    si->shared_info = mfn_to_maddr(d->shared_info.mfn);
     si->flags = 0;
     BUG_ON(xen_hypercall_hvm_get_param(HVM_PARAM_STORE_PFN, &si->store_mfn));
     BUG_ON(xen_hypercall_hvm_get_param(HVM_PARAM_STORE_EVTCHN, &param));
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 6ad458fa6b..ba7a905258 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -47,6 +47,10 @@
 #include <asm/guest.h>
 #endif
 
+/* Override macros from asm/page.h to make them work with mfn_t */
+#undef virt_to_mfn
+#define virt_to_mfn(v) _mfn(__virt_to_mfn(v))
+
 /* Linux config option: propageted to domain0 */
 /* xen_processor_pmbits: xen control Cx, Px, ... */
 unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX;
@@ -1644,6 +1648,28 @@ int continue_hypercall_on_cpu(
     return 0;
 }
 
+int alloc_shared_info(struct domain *d, unsigned int memflags)
+{
+    if ( (d->shared_info.virt = alloc_xenheap_pages(0, memflags)) == NULL )
+        return -ENOMEM;
+
+    d->shared_info.mfn = virt_to_mfn(d->shared_info.virt);
+
+    clear_page(d->shared_info.virt);
+    share_xen_page_with_guest(mfn_to_page(d->shared_info.mfn), d, SHARE_rw);
+
+    return 0;
+}
+
+void free_shared_info(struct domain *d)
+{
+    if ( !d->shared_info.virt )
+        return;
+
+    free_xenheap_page(d->shared_info.virt);
+    d->shared_info.virt = NULL;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index a69b3b59a8..81f18e63a7 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -196,7 +196,7 @@ void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info)
     info->outstanding_pages = d->outstanding_pages;
     info->shr_pages         = atomic_read(&d->shr_pages);
     info->paged_pages       = atomic_read(&d->paged_pages);
-    info->shared_info_frame = mfn_to_gmfn(d, virt_to_mfn(d->shared_info));
+    info->shared_info_frame = mfn_to_gmfn(d, mfn_x(d->shared_info.mfn));
     BUG_ON(SHARED_M2P(info->shared_info_frame));
 
     info->cpupool = cpupool_get_id(d);
diff --git a/xen/common/time.c b/xen/common/time.c
index 82336e2d5a..58fa9abc40 100644
--- a/xen/common/time.c
+++ b/xen/common/time.c
@@ -110,9 +110,9 @@ void update_domain_wallclock_time(struct domain *d)
     shared_info(d, wc_nsec)   = wc_nsec;
 #ifdef CONFIG_X86
     if ( likely(!has_32bit_shinfo(d)) )
-        d->shared_info->native.wc_sec_hi = sec >> 32;
+        d->shared_info.virt->native.wc_sec_hi = sec >> 32;
     else
-        d->shared_info->compat.arch.wc_sec_hi = sec >> 32;
+        d->shared_info.virt->compat.arch.wc_sec_hi = sec >> 32;
 #else
     shared_info(d, wc_sec_hi) = sec >> 32;
 #endif
diff --git a/xen/include/asm-x86/shared.h b/xen/include/asm-x86/shared.h
index af5d959d04..d4588e08a6 100644
--- a/xen/include/asm-x86/shared.h
+++ b/xen/include/asm-x86/shared.h
@@ -1,24 +1,25 @@
 #ifndef __XEN_X86_SHARED_H__
 #define __XEN_X86_SHARED_H__
 
-#define nmi_reason(d) (!has_32bit_shinfo(d) ?                             \
-                       (u32 *)&(d)->shared_info->native.arch.nmi_reason : \
-                       (u32 *)&(d)->shared_info->compat.arch.nmi_reason)
+#define nmi_reason(d)                                           \
+    (!has_32bit_shinfo(d) ?                                     \
+     (u32 *)&(d)->shared_info.virt->native.arch.nmi_reason :    \
+     (u32 *)&(d)->shared_info.virt->compat.arch.nmi_reason)
 
 #define GET_SET_SHARED(type, field)                             \
 static inline type arch_get_##field(const struct domain *d)     \
 {                                                               \
     return !has_32bit_shinfo(d) ?                               \
-           d->shared_info->native.arch.field :                  \
-           d->shared_info->compat.arch.field;                   \
+           d->shared_info.virt->native.arch.field :             \
+           d->shared_info.virt->compat.arch.field;              \
 }                                                               \
 static inline void arch_set_##field(struct domain *d,           \
                                     type val)                   \
 {                                                               \
     if ( !has_32bit_shinfo(d) )                                 \
-        d->shared_info->native.arch.field = val;                \
+        d->shared_info.virt->native.arch.field = val;           \
     else                                                        \
-        d->shared_info->compat.arch.field = val;                \
+        d->shared_info.virt->compat.arch.field = val;           \
 }
 
 #define GET_SET_VCPU(type, field)                               \
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 7e51d361de..740e2032ad 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -130,4 +130,7 @@ struct vnuma_info {
 
 void vnuma_destroy(struct vnuma_info *vnuma);
 
+int alloc_shared_info(struct domain *d, unsigned int memflags);
+void free_shared_info(struct domain *d);
+
 #endif /* __XEN_DOMAIN_H__ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 3a4f43098c..f41d0ad2a0 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -346,7 +346,10 @@ struct domain
     unsigned int     max_vcpus;
     struct vcpu    **vcpu;
 
-    shared_info_t   *shared_info;     /* shared data area */
+    struct {
+        mfn_t mfn;
+        shared_info_t *virt;
+    } shared_info; /* shared data area */
 
     spinlock_t       domain_lock;
 
diff --git a/xen/include/xen/shared.h b/xen/include/xen/shared.h
index a411a8a3e3..57b2ff1e34 100644
--- a/xen/include/xen/shared.h
+++ b/xen/include/xen/shared.h
@@ -43,7 +43,7 @@ typedef struct vcpu_info vcpu_info_t;
 
 extern vcpu_info_t dummy_vcpu_info;
 
-#define shared_info(d, field)      __shared_info(d, (d)->shared_info, field)
+#define shared_info(d, field)      __shared_info(d, (d)->shared_info.virt, field)
 #define vcpu_info(v, field)        __vcpu_info(v, (v)->vcpu_info, field)
 
 #endif /* __XEN_SHARED_H__ */
-- 
2.20.1



* [Xen-devel] [PATCH v5 2/6] x86 / p2m: replace page_list check in p2m_alloc_table...
  2020-03-09 10:22 [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info paul
  2020-03-09 10:22 ` [Xen-devel] [PATCH v5 1/6] domain: introduce alloc/free_shared_info() helpers paul
@ 2020-03-09 10:23 ` paul
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM paul
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: paul @ 2020-03-09 10:23 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Paul Durrant, Andrew Cooper, Paul Durrant,
	George Dunlap, Jan Beulich, Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

... with a check of domain_tot_pages().

The check of page_list wrongly rejects a domain for which PGC_extra pages
have already been allocated, whereas what the code is actually trying to
verify is that the toolstack has not already allocated RAM for the domain.

Signed-off-by: Paul Durrant <paul@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v4:
 - Re-worked so as not to completely remove the check

v2:
 - New in v2
---
 xen/arch/x86/mm/p2m.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 3719deae77..9f51370327 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -695,8 +695,7 @@ int p2m_alloc_table(struct p2m_domain *p2m)
 
     p2m_lock(p2m);
 
-    if ( p2m_is_hostp2m(p2m)
-         && !page_list_empty(&d->page_list) )
+    if ( p2m_is_hostp2m(p2m) && domain_tot_pages(d) )
     {
         P2M_ERROR("dom %d already has memory allocated\n", d->domain_id);
         p2m_unlock(p2m);
-- 
2.20.1



* [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-09 10:22 [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info paul
  2020-03-09 10:22 ` [Xen-devel] [PATCH v5 1/6] domain: introduce alloc/free_shared_info() helpers paul
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 2/6] x86 / p2m: replace page_list check in p2m_alloc_table paul
@ 2020-03-09 10:23 ` paul
  2020-03-09 13:04   ` Jan Beulich
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 4/6] x86 / ioreq: use a MEMF_no_refcount allocation for server pages paul
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: paul @ 2020-03-09 10:23 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Paul Durrant, Andrew Cooper, Paul Durrant, Jan Beulich,
	Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

This patch modifies several places walking the domain's page_list to make
them ignore PGC_extra pages:

- dump_pageframe_info() should ignore PGC_extra pages in its dump as it
  determines whether to dump using domain_tot_pages() which also ignores
  PGC_extra pages.

- arch_set_info_guest() is looking for an L4 page table which will
  definitely not be in a PGC_extra page.

- audit_p2m() should ignore PGC_extra pages as it is perfectly legitimate
  for them not to be present in the P2M.

- dump_numa() should ignore PGC_extra pages as they are essentially
  uninteresting in that context.

- dom0_construct_pv() should ignore PGC_extra pages when setting up the
  physmap as they are only created for special purposes and, if they need
  to be mapped, will be mapped explicitly for whatever purpose is relevant.

- tboot_gen_domain_integrity() should ignore PGC_extra pages as they should
  not form part of the measurement.

Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v4:
 - Expand to cover more than just dom0_construct_pv()

v2:
 - New in v2
---
 xen/arch/x86/domain.c        | 6 +++++-
 xen/arch/x86/mm/p2m.c        | 3 +++
 xen/arch/x86/numa.c          | 3 +++
 xen/arch/x86/pv/dom0_build.c | 4 ++++
 xen/arch/x86/tboot.c         | 7 ++++++-
 5 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index bdcc0d972a..f6ed25e8ee 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -231,6 +231,9 @@ void dump_pageframe_info(struct domain *d)
             unsigned int index = MASK_EXTR(page->u.inuse.type_info,
                                            PGT_type_mask);
 
+            if ( page->count_info & PGC_extra )
+                continue;
+
             if ( ++total[index] > 16 )
             {
                 switch ( page->u.inuse.type_info & PGT_type_mask )
@@ -1044,7 +1047,8 @@ int arch_set_info_guest(
             {
                 struct page_info *page = page_list_remove_head(&d->page_list);
 
-                if ( page_lock(page) )
+                if ( !(page->count_info & PGC_extra) &&
+                     page_lock(page) )
                 {
                     if ( (page->u.inuse.type_info & PGT_type_mask) ==
                          PGT_l4_page_table )
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 9f51370327..71d2fb9bbc 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2843,6 +2843,9 @@ void audit_p2m(struct domain *d,
     spin_lock(&d->page_alloc_lock);
     page_list_for_each ( page, &d->page_list )
     {
+        if ( page->count_info & PGC_extra )
+            continue;
+
         mfn = mfn_x(page_to_mfn(page));
 
         P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn);
diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index f1066c59c7..7e5aa8dc95 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -428,6 +428,9 @@ static void dump_numa(unsigned char key)
         spin_lock(&d->page_alloc_lock);
         page_list_for_each(page, &d->page_list)
         {
+            if ( page->count_info & PGC_extra )
+                break;
+
             i = phys_to_nid(page_to_maddr(page));
             page_num_node[i]++;
         }
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index dc16ef2e79..f8f1bbe2f4 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -792,6 +792,10 @@ int __init dom0_construct_pv(struct domain *d,
     {
         mfn = mfn_x(page_to_mfn(page));
         BUG_ON(SHARED_M2P(get_gpfn_from_mfn(mfn)));
+
+        if ( page->count_info & PGC_extra )
+            continue;
+
         if ( get_gpfn_from_mfn(mfn) >= count )
         {
             BUG_ON(is_pv_32bit_domain(d));
diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
index 8c232270b4..6cc020cb71 100644
--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -220,7 +220,12 @@ static void tboot_gen_domain_integrity(const uint8_t key[TB_KEY_SIZE],
         spin_lock(&d->page_alloc_lock);
         page_list_for_each(page, &d->page_list)
         {
-            void *pg = __map_domain_page(page);
+            void *pg;
+
+            if ( page->count_info & PGC_extra )
+                continue;
+
+            pg = __map_domain_page(page);
             vmac_update(pg, PAGE_SIZE, &ctx);
             unmap_domain_page(pg);
         }
-- 
2.20.1



* [Xen-devel] [PATCH v5 4/6] x86 / ioreq: use a MEMF_no_refcount allocation for server pages...
  2020-03-09 10:22 [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info paul
                   ` (2 preceding siblings ...)
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM paul
@ 2020-03-09 10:23 ` paul
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 5/6] mm: add 'is_special_page' inline function paul
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info paul
  5 siblings, 0 replies; 23+ messages in thread
From: paul @ 2020-03-09 10:23 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Paul Durrant, Andrew Cooper, Paul Durrant, Jan Beulich,
	Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

... now that it is safe to assign them.

This avoids relying on libxl (or whatever toolstack is in use) setting
max_pages up with sufficient 'slop' to allow all necessary ioreq server
pages to be allocated.

Signed-off-by: Paul Durrant <paul@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v2:
 - New in v2
---
 xen/arch/x86/hvm/ioreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index f8a5c81546..648ef9137f 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -375,7 +375,7 @@ static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
         return 0;
     }
 
-    page = alloc_domheap_page(s->target, 0);
+    page = alloc_domheap_page(s->target, MEMF_no_refcount);
 
     if ( !page )
         return -ENOMEM;
-- 
2.20.1



* [Xen-devel] [PATCH v5 5/6] mm: add 'is_special_page' inline function...
  2020-03-09 10:22 [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info paul
                   ` (3 preceding siblings ...)
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 4/6] x86 / ioreq: use a MEMF_no_refcount allocation for server pages paul
@ 2020-03-09 10:23 ` paul
  2020-03-09 13:28   ` Jan Beulich
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info paul
  5 siblings, 1 reply; 23+ messages in thread
From: paul @ 2020-03-09 10:23 UTC (permalink / raw)
  To: xen-devel
  Cc: Tamas K Lengyel, Julien Grall, Wei Liu, Paul Durrant,
	Andrew Cooper, Paul Durrant, Konrad Rzeszutek Wilk, Ian Jackson,
	George Dunlap, Tim Deegan, Stefano Stabellini, Jan Beulich,
	Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

... to cover xenheap and PGC_extra pages.

PGC_extra pages are intended to hold data structures that are associated
with a domain and may be mapped by that domain. They should not be treated
as 'normal' guest pages (i.e. RAM or page tables). Hence, in many cases
where code currently tests is_xen_heap_page() it should also check for
the PGC_extra bit in 'count_info'.

This patch therefore defines is_special_page() to cover both cases and
converts tests of is_xen_heap_page() to is_special_page() where
appropriate.

Signed-off-by: Paul Durrant <paul@xen.org>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>

v4:
 - Use inline function instead of macro
 - Add missing conversions from is_xen_heap_page()

v3:
 - Delete obsolete comment.

v2:
 - New in v2
---
 xen/arch/x86/domctl.c           |  2 +-
 xen/arch/x86/mm.c               |  9 ++++-----
 xen/arch/x86/mm/altp2m.c        |  2 +-
 xen/arch/x86/mm/mem_sharing.c   |  3 +--
 xen/arch/x86/mm/shadow/common.c | 13 ++++++++-----
 xen/arch/x86/mm/shadow/multi.c  |  2 +-
 xen/arch/x86/tboot.c            |  4 ++--
 xen/include/xen/mm.h            |  5 +++++
 8 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index ed86762fa6..add70126b9 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -394,7 +394,7 @@ long arch_do_domctl(
             page = get_page_from_gfn(d, gfn, &t, P2M_ALLOC);
 
             if ( unlikely(!page) ||
-                 unlikely(is_xen_heap_page(page)) )
+                 unlikely(is_special_page(page)) )
             {
                 if ( unlikely(p2m_is_broken(t)) )
                     type = XEN_DOMCTL_PFINFO_BROKEN;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index ba7563ed3c..353bde5c2c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1014,7 +1014,7 @@ get_page_from_l1e(
         unsigned long cacheattr = pte_flags_to_cacheattr(l1f);
         int err;
 
-        if ( is_xen_heap_page(page) )
+        if ( is_special_page(page) )
         {
             if ( write )
                 put_page_type(page);
@@ -2447,7 +2447,7 @@ static int cleanup_page_mappings(struct page_info *page)
     {
         page->count_info &= ~PGC_cacheattr_mask;
 
-        BUG_ON(is_xen_heap_page(page));
+        BUG_ON(is_special_page(page));
 
         rc = update_xen_mappings(mfn, 0);
     }
@@ -2477,7 +2477,7 @@ static int cleanup_page_mappings(struct page_info *page)
                 rc = rc2;
         }
 
-        if ( likely(!is_xen_heap_page(page)) )
+        if ( likely(!is_special_page(page)) )
         {
             ASSERT((page->u.inuse.type_info &
                     (PGT_type_mask | PGT_count_mask)) == PGT_writable_page);
@@ -4216,8 +4216,7 @@ int steal_page(
     if ( !(owner = page_get_owner_and_reference(page)) )
         goto fail;
 
-    if ( owner != d || is_xen_heap_page(page) ||
-         (page->count_info & PGC_extra) )
+    if ( owner != d || is_special_page(page) )
         goto fail_put;
 
     /*
diff --git a/xen/arch/x86/mm/altp2m.c b/xen/arch/x86/mm/altp2m.c
index 50768f2547..c091b03ea3 100644
--- a/xen/arch/x86/mm/altp2m.c
+++ b/xen/arch/x86/mm/altp2m.c
@@ -77,7 +77,7 @@ int altp2m_vcpu_enable_ve(struct vcpu *v, gfn_t gfn)
      * pageable() predicate for this, due to it having the same properties
      * that we want.
      */
-    if ( !p2m_is_pageable(p2mt) || is_xen_heap_page(pg) )
+    if ( !p2m_is_pageable(p2mt) || is_special_page(pg) )
     {
         rc = -EINVAL;
         goto err;
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 3835bc928f..f49f27a3ef 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -840,9 +840,8 @@ static int nominate_page(struct domain *d, gfn_t gfn,
     if ( !p2m_is_sharable(p2mt) )
         goto out;
 
-    /* Skip xen heap pages */
     page = mfn_to_page(mfn);
-    if ( !page || is_xen_heap_page(page) )
+    if ( !page || is_special_page(page) )
         goto out;
 
     /* Check if there are mem_access/remapped altp2m entries for this page */
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index cba3ab1eba..e835940d86 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2087,19 +2087,22 @@ static int sh_remove_all_mappings(struct domain *d, mfn_t gmfn, gfn_t gfn)
          * The qemu helper process has an untyped mapping of this dom's RAM
          * and the HVM restore program takes another.
          * Also allow one typed refcount for
-         * - Xen heap pages, to match share_xen_page_with_guest(),
-         * - ioreq server pages, to match prepare_ring_for_helper().
+         * - special pages, which are explicitly referenced and mapped by
+         *   Xen.
+         * - ioreq server pages, which may be special pages or normal
+         *   guest pages with an extra reference taken by
+         *   prepare_ring_for_helper().
          */
         if ( !(shadow_mode_external(d)
                && (page->count_info & PGC_count_mask) <= 3
                && ((page->u.inuse.type_info & PGT_count_mask)
-                   == (is_xen_heap_page(page) ||
+                   == (is_special_page(page) ||
                        (is_hvm_domain(d) && is_ioreq_server_page(d, page))))) )
             printk(XENLOG_G_ERR "can't find all mappings of mfn %"PRI_mfn
-                   " (gfn %"PRI_gfn"): c=%lx t=%lx x=%d i=%d\n",
+                   " (gfn %"PRI_gfn"): c=%lx t=%lx s=%d i=%d\n",
                    mfn_x(gmfn), gfn_x(gfn),
                    page->count_info, page->u.inuse.type_info,
-                   !!is_xen_heap_page(page),
+                   !!is_special_page(page),
                    (is_hvm_domain(d) && is_ioreq_server_page(d, page)));
     }
 
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 26798b317c..ac19d203d7 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -559,7 +559,7 @@ _sh_propagate(struct vcpu *v,
      * caching attributes in the shadows to match what was asked for.
      */
     if ( (level == 1) && is_hvm_domain(d) &&
-         !is_xen_heap_mfn(target_mfn) )
+         !is_special_page(mfn_to_page(target_mfn)) )
     {
         int type;
 
diff --git a/xen/arch/x86/tboot.c b/xen/arch/x86/tboot.c
index 6cc020cb71..2fd7ce5305 100644
--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -189,7 +189,7 @@ static void update_pagetable_mac(vmac_ctx_t *ctx)
 
         if ( !mfn_valid(_mfn(mfn)) )
             continue;
-        if ( is_page_in_use(page) && !is_xen_heap_page(page) )
+        if ( is_page_in_use(page) && !is_special_page(page) )
         {
             if ( page->count_info & PGC_page_table )
             {
@@ -294,7 +294,7 @@ static void tboot_gen_xenheap_integrity(const uint8_t key[TB_KEY_SIZE],
                               + 3 * PAGE_SIZE)) )
             continue; /* skip tboot and its page tables */
 
-        if ( is_page_in_use(page) && is_xen_heap_page(page) )
+        if ( is_page_in_use(page) && is_special_page(page) )
         {
             void *pg;
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index d0d095d9c7..373de59969 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -285,6 +285,11 @@ extern struct domain *dom_cow;
 
 #include <asm/mm.h>
 
+static inline bool is_special_page(struct page_info *page)
+{
+    return is_xen_heap_page(page) || (page->count_info & PGC_extra);
+}
+
 #ifndef page_list_entry
 struct page_list_head
 {
-- 
2.20.1



* [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
  2020-03-09 10:22 [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info paul
                   ` (4 preceding siblings ...)
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 5/6] mm: add 'is_special_page' inline function paul
@ 2020-03-09 10:23 ` paul
  2020-03-09 15:56   ` Jan Beulich
  5 siblings, 1 reply; 23+ messages in thread
From: paul @ 2020-03-09 10:23 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Paul Durrant,
	Andrew Cooper, Paul Durrant, Konrad Rzeszutek Wilk, Ian Jackson,
	George Dunlap, Jan Beulich, Volodymyr Babchuk

From: Paul Durrant <pdurrant@amazon.com>

Currently shared_info is a shared xenheap page, but shared xenheap
pages complicate future plans for live-update of Xen, so it is
desirable, where possible, not to use them [1]. This patch therefore
converts shared_info into a PGC_extra domheap page. This does entail
freeing shared_info during domain_relinquish_resources() rather than
domain_destroy(), so care is needed to avoid de-referencing a NULL
shared_info pointer; hence some extra checks of 'is_dying' are needed.

Also, because shared_info will no longer be a xenheap page, this patch
adds an extra dump function to make sure the shared_info MFN is
included in the output of dump_pageframe_info().

NOTE: For Arm, the call to free_shared_info() in arch_domain_destroy() is
      left in place since it is idempotent and called in the error path for
      arch_domain_create().

[1] See https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg02018.html

Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien@xen.org>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Wei Liu <wl@xen.org>

v5:
 - Incorporate new dump_shared_info() function

v2:
 - Addressed comments from Julien
 - Expanded the commit comment to explain why this patch is wanted
---
 xen/arch/arm/domain.c      |  2 ++
 xen/arch/x86/domain.c      |  5 ++++-
 xen/common/domain.c        | 38 ++++++++++++++++++++++++++++++++++----
 xen/common/event_channel.c |  3 +++
 xen/common/time.c          | 15 +++++++++++++++
 xen/include/xen/domain.h   |  1 +
 6 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 5298d80bd2..741f6dd444 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -1005,6 +1005,8 @@ int domain_relinquish_resources(struct domain *d)
         BUG();
     }
 
+    free_shared_info(d);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f6ed25e8ee..60623beba1 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -261,6 +261,8 @@ void dump_pageframe_info(struct domain *d)
                page->count_info, page->u.inuse.type_info);
     }
     spin_unlock(&d->page_alloc_lock);
+
+    dump_shared_info(d);
 }
 
 void update_guest_memory_policy(struct vcpu *v,
@@ -693,7 +695,6 @@ void arch_domain_destroy(struct domain *d)
         pv_domain_destroy(d);
     free_perdomain_mappings(d);
 
-    free_shared_info(d);
     cleanup_domain_irq_mapping(d);
 
     psr_domain_free(d);
@@ -2249,6 +2250,8 @@ int domain_relinquish_resources(struct domain *d)
     if ( is_hvm_domain(d) )
         hvm_domain_relinquish_resources(d);
 
+    free_shared_info(d);
+
     return 0;
 }
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ba7a905258..0b1c722708 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1650,24 +1650,54 @@ int continue_hypercall_on_cpu(
 
 int alloc_shared_info(struct domain *d, unsigned int memflags)
 {
-    if ( (d->shared_info.virt = alloc_xenheap_pages(0, memflags)) == NULL )
+    struct page_info *pg;
+
+    pg = alloc_domheap_page(d, MEMF_no_refcount | memflags);
+    if ( !pg )
         return -ENOMEM;
 
-    d->shared_info.mfn = virt_to_mfn(d->shared_info.virt);
+    if ( !get_page_and_type(pg, d, PGT_writable_page) )
+    {
+        /*
+         * The domain should not be running at this point so there is
+         * no way we should reach this error path.
+         */
+        ASSERT_UNREACHABLE();
+        return -ENODATA;
+    }
+
+    d->shared_info.mfn = page_to_mfn(pg);
+    d->shared_info.virt = __map_domain_page_global(pg);
 
     clear_page(d->shared_info.virt);
-    share_xen_page_with_guest(mfn_to_page(d->shared_info.mfn), d, SHARE_rw);
 
     return 0;
 }
 
 void free_shared_info(struct domain *d)
 {
+    struct page_info *pg;
+
     if ( !d->shared_info.virt )
         return;
 
-    free_xenheap_page(d->shared_info.virt);
+    unmap_domain_page_global(d->shared_info.virt);
     d->shared_info.virt = NULL;
+
+    pg = mfn_to_page(d->shared_info.mfn);
+
+    put_page_alloc_ref(pg);
+    put_page_and_type(pg);
+}
+
+void dump_shared_info(struct domain *d)
+{
+    domain_lock(d);
+
+    if ( d->shared_info.virt )
+        printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));
+
+    domain_unlock(d);
 }
 
 /*
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e86e2bfab0..a17422284d 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1325,6 +1325,9 @@ void evtchn_destroy(struct domain *d)
 {
     unsigned int i;
 
+    /* This must be done before shared_info is freed */
+    BUG_ON(!d->shared_info.virt);
+
     /* After this barrier no new event-channel allocations can occur. */
     BUG_ON(!d->is_dying);
     spin_barrier(&d->event_lock);
diff --git a/xen/common/time.c b/xen/common/time.c
index 58fa9abc40..ada02faf07 100644
--- a/xen/common/time.c
+++ b/xen/common/time.c
@@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
     uint32_t *wc_version;
     uint64_t sec;
 
+    if ( d != current->domain )
+    {
+        /*
+         * We need to check is_dying here as, if it is set, the
+         * shared_info may have been freed. To do this safely we need
+         * to hold the domain lock.
+         */
+        domain_lock(d);
+        if ( d->is_dying )
+            goto unlock;
+    }
+
     spin_lock(&wc_lock);
 
     wc_version = &shared_info(d, wc_version);
@@ -121,6 +133,9 @@ void update_domain_wallclock_time(struct domain *d)
     *wc_version = version_update_end(*wc_version);
 
     spin_unlock(&wc_lock);
+ unlock:
+    if ( d != current->domain )
+        domain_unlock(d);
 }
 
 /* Set clock to <secs,usecs> after 00:00:00 UTC, 1 January, 1970. */
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 740e2032ad..740c0537fa 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -132,5 +132,6 @@ void vnuma_destroy(struct vnuma_info *vnuma);
 
 int alloc_shared_info(struct domain *d, unsigned int memflags);
 void free_shared_info(struct domain *d);
+void dump_shared_info(struct domain *d);
 
 #endif /* __XEN_DOMAIN_H__ */
-- 
2.20.1



* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM paul
@ 2020-03-09 13:04   ` Jan Beulich
  2020-03-10 13:32     ` Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-09 13:04 UTC (permalink / raw)
  To: paul
  Cc: xen-devel, Paul Durrant, Roger Pau Monné, Wei Liu, Andrew Cooper

On 09.03.2020 11:23, paul@xen.org wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> This patch modifies several places walking the domain's page_list to make
> them ignore PGC_extra pages:
> 
> - dump_pageframe_info() should ignore PGC_extra pages in its dump as it
>   determines whether to dump using domain_tot_pages() which also ignores
>   PGC_extra pages.

This argument looks wrong to me: Let's take an example - a domain
almost fully cleaned up, with 8 "normal" and 3 "extra" pages left.
domain_tot_pages() returns 8 in this case, i.e. "normal" page
dumping doesn't get skipped. However, there now won't be any trace
of the "extra" pages, because they're also not on xenpage_list,
which gets all its pages dumped in all cases. Correct restoration
of original behavior would be to dump "normal" pages when there
are less than 10, and to dump all "extra" pages. (Same of course
goes for live domains, where "normal" page dumping would be
skipped in the common case, but xenheap pages would be dumped, and
hence so should be "extra" ones.) As indicated before, the removal
of the APIC assist page from xenpage_list was already slightly
regressing in this regard (as well as in at least one other way,
I'm afraid), and you're now deliberately making the regression
even bigger.
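
To illustrate (a sketch only, not a proposed patch; dump_one_page()
here stands in for the printk() dump_pageframe_info() already does,
it is not a real helper):

    page_list_for_each ( page, &d->page_list )
    {
        if ( page->count_info & PGC_extra )
        {
            dump_one_page(page);   /* always dump, like xenheap pages */
            continue;
        }

        if ( domain_tot_pages(d) < 10 )
            dump_one_page(page);   /* "normal" pages only when few left */
    }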

Jan


* Re: [Xen-devel] [PATCH v5 5/6] mm: add 'is_special_page' inline function...
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 5/6] mm: add 'is_special_page' inline function paul
@ 2020-03-09 13:28   ` Jan Beulich
  2020-03-10 16:32     ` Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-09 13:28 UTC (permalink / raw)
  To: paul
  Cc: Tamas K Lengyel, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Paul Durrant, Ian Jackson, George Dunlap,
	Tim Deegan, Stefano Stabellini, xen-devel, Roger Pau Monné

On 09.03.2020 11:23, paul@xen.org wrote:
> v4:
>  - Use inline function instead of macro
>  - Add missing conversions from is_xen_heap_page()

Among these also one conversion of is_xen_heap_mfn(). I'm still
curious why others wouldn't need converting - the description
doesn't mention there are more, see p2m_add_foreign() for an
example (may warrant introduction of is_special_mfn() then). It
would probably be beneficial if the description gave some
generic criteria for cases where conversion is (not) needed.

But there are issues beyond this, as there are also open-coded
instances of PGC_xen_heap checks, and that's the other possible
regression I notice from the APIC assist MFN page conversion:
PoD code, to avoid doing two separate checks on ->count_info [1],
uses two instances of a construct like this one

             !(pg->count_info & (PGC_page_table | PGC_xen_heap)) &&

(and again I didn't do a complete audit for further
occurrences). This means the APIC assist page right now might
be a candidate for getting converted to PoD (possibly others of
the constraints actually prohibit this, but I'm not sure).

[1] I'm unconvinced PGC_page_table pages can actually appear
there, so the open-coding may in fact be an optimization of
dead code.
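
For concreteness - a sketch, not a posted change - that construct
would presumably need to become

    !(pg->count_info & (PGC_page_table | PGC_xen_heap | PGC_extra)) &&

or be rewritten in terms of the is_special_page() helper this series
introduces.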

> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -2087,19 +2087,22 @@ static int sh_remove_all_mappings(struct domain *d, mfn_t gmfn, gfn_t gfn)
>           * The qemu helper process has an untyped mapping of this dom's RAM
>           * and the HVM restore program takes another.
>           * Also allow one typed refcount for
> -         * - Xen heap pages, to match share_xen_page_with_guest(),
> -         * - ioreq server pages, to match prepare_ring_for_helper().
> +         * - special pages, which are explicitly referenced and mapped by
> +         *   Xen.
> +         * - ioreq server pages, which may be special pages or normal
> +         *   guest pages with an extra reference taken by
> +         *   prepare_ring_for_helper().
>           */
>          if ( !(shadow_mode_external(d)
>                 && (page->count_info & PGC_count_mask) <= 3
>                 && ((page->u.inuse.type_info & PGT_count_mask)
> -                   == (is_xen_heap_page(page) ||
> +                   == (is_special_page(page) ||
>                         (is_hvm_domain(d) && is_ioreq_server_page(d, page))))) )
>              printk(XENLOG_G_ERR "can't find all mappings of mfn %"PRI_mfn
> -                   " (gfn %"PRI_gfn"): c=%lx t=%lx x=%d i=%d\n",
> +                   " (gfn %"PRI_gfn"): c=%lx t=%lx s=%d i=%d\n",
>                     mfn_x(gmfn), gfn_x(gfn),
>                     page->count_info, page->u.inuse.type_info,
> -                   !!is_xen_heap_page(page),
> +                   !!is_special_page(page),

The reason for me to ask to switch to an inline function was to
see this !! go away.

Jan


* Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
  2020-03-09 10:23 ` [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info paul
@ 2020-03-09 15:56   ` Jan Beulich
  2020-03-10 17:33     ` Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-09 15:56 UTC (permalink / raw)
  To: paul
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Paul Durrant, Ian Jackson, George Dunlap,
	xen-devel, Volodymyr Babchuk

On 09.03.2020 11:23, paul@xen.org wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> Currently shared_info is a shared xenheap page, but shared xenheap
> pages complicate future plans for live-update of Xen, so it is
> desirable, where possible, not to use them [1]. This patch therefore
> converts shared_info into a PGC_extra domheap page. This does entail
> freeing shared_info during domain_relinquish_resources() rather than
> domain_destroy(), so care is needed to avoid de-referencing a NULL
> shared_info pointer; hence some extra checks of 'is_dying' are needed.
> 
> Also, because shared_info will no longer be a xenheap page, this patch
> adds an extra dump function to make sure the shared_info MFN is
> included in the output of dump_pageframe_info().

I've voiced my objection to such a model, and hence it'll take
another REST maintainer to approve of this despite my arguments
against it. (And of course, just to re-record this here, the
APIC access page, converted by ea3daabff5f2, will want to get a
dumping function added then, too.)

I wonder whether a domain's "extra" pages couldn't be put in a
separate, singly-linked list, using the union the next_shadow
field is in as the linking field. None of the other union
members look to be applicable to "extra" pages.
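
As a rough sketch of that idea (entirely hypothetical - the names
'next_extra' and 'extra_page_list' are invented here purely for
illustration, reusing the union slot mentioned above):

    /* In struct domain: struct page_info *extra_page_list; */

    static void append_extra_page(struct domain *d, struct page_info *pg)
    {
        pg->next_extra = d->extra_page_list; /* invented union member */
        d->extra_page_list = pg;
    }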

> +void dump_shared_info(struct domain *d)
> +{
> +    domain_lock(d);
> +
> +    if ( d->shared_info.virt )
> +        printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));

count_info and type_info should be logged imo, just like
dump_pageframe_info() does. On the whole I think the actual
dumping might better be uniform, and these functions would
then only exist to "know" which page(s) to dump.
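
For instance (a sketch only, along the lines of the per-page printk()
in dump_pageframe_info()):

    if ( d->shared_info.virt )
    {
        const struct page_info *pg = mfn_to_page(d->shared_info.mfn);

        printk("Shared Info: MFN %"PRI_mfn" c=%lx t=%lx\n",
               mfn_x(d->shared_info.mfn), pg->count_info,
               pg->u.inuse.type_info);
    }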

> --- a/xen/common/time.c
> +++ b/xen/common/time.c
> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>      uint32_t *wc_version;
>      uint64_t sec;
>  
> +    if ( d != current->domain )
> +    {
> +        /*
> +         * We need to check is_dying here as, if it is set, the
> +         * shared_info may have been freed. To do this safely we need
> +         * to hold the domain lock.
> +         */
> +        domain_lock(d);
> +        if ( d->is_dying )
> +            goto unlock;
> +    }

This shouldn't happen very often, but it's pretty heavy a lock.
It's a fundamental aspect of xenheap pages that their disposal
can be delayed until almost the last moment of guest cleanup. I
continue to think it's not really a good idea to have special
purpose allocation (and mapping) accompanied by these pages
getting taken care of by the generic relinquish-resources logic
here (from a more general pov such is of course often nice to
have). Instead of freeing these pages there, couldn't they just
be taken off d->page_list, with the unmapping and freeing left
as it was?
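
As a sketch of that alternative (hypothetical, not posted code), the
relinquish path would then merely do

    spin_lock(&d->page_alloc_lock);
    page_list_del(pg, &d->page_list);
    spin_unlock(&d->page_alloc_lock);

leaving the unmapping and freeing in the destroy path, as before.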

Jan


* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-09 13:04   ` Jan Beulich
@ 2020-03-10 13:32     ` Paul Durrant
  2020-03-10 14:58       ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Paul Durrant @ 2020-03-10 13:32 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: xen-devel, 'Paul Durrant', 'Roger Pau Monné',
	'Wei Liu', 'Andrew Cooper'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 09 March 2020 13:04
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>
> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
> 
> On 09.03.2020 11:23, paul@xen.org wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > This patch modifies several places walking the domain's page_list to make
> > them ignore PGC_extra pages:
> >
> > - dump_pageframe_info() should ignore PGC_extra pages in its dump as it
> >   determines whether to dump using domain_tot_pages() which also ignores
> >   PGC_extra pages.
> 
> This argument looks wrong to me: Let's take an example - a domain
> almost fully cleaned up, with 8 "normal" and 3 "extra" pages left.
> domain_tot_pages() returns 8 in this case, i.e. "normal" page
> dumping doesn't get skipped. However, there now won't be any trace
> of the "extra" pages, because they're also not on xenpage_list,
> which gets all its pages dumped in all cases. Correct restoration
> of original behavior would be to dump "normal" pages when there
> are less than 10, and to dump all "extra" pages. (Same of course
> goes for live domains, where "normal" page dumping would be
> skipped in the common case, but xenheap pages would be dumped, and
> hence so should be "extra" ones.) As indicated before, the removal
> of the APIC assist page from xenpage_list was already slightly
> regressing in this regard (as well as in at least one other way,
> I'm afraid), and you're now deliberately making the regression
> even bigger.

I thought the idea here was that the domheap dump loop should be
dumping 'normal' pages, so it seems reasonable to me that the number
of pages dumped should match the value returned by domain_tot_pages().

Would you perhaps be happier if we put 'extra' pages on a separate
page list, which can be unconditionally dumped so that, as I transition
xenheap pages to 'extra' pages, they don't get missed? It would
also get rid of some of the other checks for PGC_extra that I
have to introduce because they currently end up on the domain's
page list.

  Paul

> 
> Jan



* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-10 13:32     ` Paul Durrant
@ 2020-03-10 14:58       ` Jan Beulich
  2020-03-10 15:05         ` Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-10 14:58 UTC (permalink / raw)
  To: paul
  Cc: xen-devel, 'Paul Durrant', 'Roger Pau Monné',
	'Wei Liu', 'Andrew Cooper'

On 10.03.2020 14:32, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 09 March 2020 13:04
>> To: paul@xen.org
>> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Andrew Cooper
>> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>
>> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
>>
>> On 09.03.2020 11:23, paul@xen.org wrote:
>>> From: Paul Durrant <pdurrant@amazon.com>
>>>
>>> This patch modifies several places walking the domain's page_list to make
>>> them ignore PGC_extra pages:
>>>
>>> - dump_pageframe_info() should ignore PGC_extra pages in its dump as it
>>>   determines whether to dump using domain_tot_pages() which also ignores
>>>   PGC_extra pages.
>>
>> This argument looks wrong to me: Let's take an example - a domain
>> almost fully cleaned up, with 8 "normal" and 3 "extra" pages left.
>> domain_tot_pages() returns 8 in this case, i.e. "normal" page
>> dumping doesn't get skipped. However, there now won't be any trace
>> of the "extra" pages, because they're also not on xenpage_list,
>> which gets all its pages dumped in all cases. Correct restoration
>> of original behavior would be to dump "normal" pages when there
>> are less than 10, and to dump all "extra" pages. (Same of course
>> goes for live domains, where "normal" page dumping would be
>> skipped in the common case, but xenheap pages would be dumped, and
>> hence so should be "extra" ones.) As indicated before, the removal
>> of the APIC assist page from xenpage_list was already slightly
>> regressing in this regard (as well as in at least one other way,
>> I'm afraid), and you're now deliberately making the regression
>> even bigger.
> 
> I thought the idea here was that the domheap dump loop should be
> dumping 'normal' pages, so it seems reasonable to me that the number
> of pages dumped should match the value returned by domain_tot_pages().

I never thought of such a connection. To me the invocation of
domain_tot_pages() there is there only to avoid overly much output.

> Would you perhaps be happier if we put 'extra' pages on a separate
> page list, which can be unconditionally dumped so that, as I transition
> xenheap pages to 'extra' pages, they don't get missed? It would
> also get rid of some of the other checks for PGC_extra that I
> have to introduce because they currently end up on the domain's
> page list.

Hmm, wasn't it an intended side effect to have all pages on one
list now? Introducing a 3rd list (even if just temporarily, until
xenpage_list can be dropped) will be somewhat ugly because of how
arch_free_heap_page() works. In reply to patch 6 I did suggest to
have a separate list, but without taking these pages off
d->page_list, such that here you would skip them in the main
domain page dumping loop, but you would then traverse that second
list and dump all of its elements, just like xenpage_list gets
handled there.

Jan


* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-10 14:58       ` Jan Beulich
@ 2020-03-10 15:05         ` Paul Durrant
  2020-03-10 15:12           ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Paul Durrant @ 2020-03-10 15:05 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: xen-devel, 'Paul Durrant', 'Roger Pau Monné',
	'Wei Liu', 'Andrew Cooper'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 10 March 2020 14:59
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper'
> <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com>
> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
> 
> On 10.03.2020 14:32, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 09 March 2020 13:04
> >> To: paul@xen.org
> >> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Andrew Cooper
> >> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com>
> >> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
> >>
> >> On 09.03.2020 11:23, paul@xen.org wrote:
> >>> From: Paul Durrant <pdurrant@amazon.com>
> >>>
> >>> This patch modifies several places walking the domain's page_list to make
> >>> them ignore PGC_extra pages:
> >>>
> >>> - dump_pageframe_info() should ignore PGC_extra pages in its dump as it
> >>>   determines whether to dump using domain_tot_pages() which also ignores
> >>>   PGC_extra pages.
> >>
> >> This argument looks wrong to me: Let's take an example - a domain
> >> almost fully cleaned up, with 8 "normal" and 3 "extra" pages left.
> >> domain_tot_pages() returns 8 in this case, i.e. "normal" page
> >> dumping doesn't get skipped. However, there now won't be any trace
> >> of the "extra" pages, because they're also not on xenpage_list,
> >> which gets all its pages dumped in all cases. Correct restoration
> >> of original behavior would be to dump "normal" pages when there
> >> are fewer than 10, and to dump all "extra" pages. (Same of course
> >> goes for live domains, where "normal" page dumping would be
> >> skipped in the common case, but xenheap pages would be dumped, and
> >> hence so should be "extra" ones.) As indicated before, the removal
> >> of the APIC assist page from xenpage_list was already slightly
> >> regressing in this regard (as well as in at least one other way,
> >> I'm afraid), and you're now deliberately making the regression
> >> even bigger.
> >
> > I thought the idea here was that the domheap dump loop should be
> > dumping 'normal' pages, so it seems reasonable to me for the number
> > of pages dumped to match the value returned by domain_tot_pages().
> 
> I never thought of such a connection. To me the invocation of
> > domain_tot_pages() there is only to avoid overly verbose output.
> 
> > Would you perhaps be happier if we put 'extra' pages on a separate
> > page list, which can be unconditionally dumped, so that they don't
> > get missed as I transition xenheap pages to 'extra' pages? It would
> > also get rid of some of the other checks for PGC_extra that I
> > have to introduce because they currently end up on the domain's
> > page list.
> 
> Hmm, wasn't it an intended side effect to have all pages on one
> list now?

That would be nice, but I cannot reconcile that with unconditionally dumping all extra pages... otherwise dump_pageframe_info() would always have to walk the entire page_list no matter how long it was.

> Introducing a 3rd list (even if just temporarily, until
> xenpage_list can be dropped) will be somewhat ugly because of how
> arch_free_heap_page() works.

Yes, but it would at least be temporary.

> In reply to patch 6 I did suggest to
> have a separate list, but without taking these pages off
> d->page_list,

How would that work without adding an extra page_list_entry into struct page_info?

> such that here you would skip them in the main
> domain page dumping loop, but you would then traverse that second
> list and dump all of its elements, just like xenpage_list gets
> handled there.
> 

Well, that's what I'm trying to achieve, yes.

  Paul



* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-10 15:05         ` Paul Durrant
@ 2020-03-10 15:12           ` Jan Beulich
  2020-03-10 15:16             ` Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-10 15:12 UTC (permalink / raw)
  To: paul
  Cc: xen-devel, 'Paul Durrant', 'Roger Pau Monné',
	'Wei Liu', 'Andrew Cooper'

On 10.03.2020 16:05, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 10 March 2020 14:59
>>
>> In reply to patch 6 I did suggest to
>> have a separate list, but without taking these pages off
>> d->page_list,
> 
> How would that work without adding an extra page_list_entry into struct page_info?

As said there, it'd be a singly linked list using a __pdx_t
field just like there already is with "next_shadow", i.e.
you'd add another union member "next_extra" or some such. Of
course the list shouldn't grow too long, or else insertion
and removal may become a bottleneck. Not sure how well this
would fit Arm, though; maybe they wouldn't need this, but
that depends on whether the list would be used for purposes
beyond dumping.
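
For illustration, such an arrangement might look roughly like this
(the next_extra name is the suggestion above; its exact placement in
struct page_info's union is an assumption):

/* Within the existing union in struct page_info (sketch only). */
union {
    /* Shadow mode: next shadow in this hash bucket's chain. */
    __pdx_t next_shadow;
    /* PGC_extra pages: next page on the domain's singly-linked
     * "extra" list. */
    __pdx_t next_extra;
};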

Jan

>> such that here you would skip them in the main
>> domain page dumping loop, but you would then traverse that second
>> list and dump all of its elements, just like xenpage_list gets
>> handled there.
>>
> 
> Well, that's what I'm trying to achieve, yes.
> 
>   Paul
> 



* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-10 15:12           ` Jan Beulich
@ 2020-03-10 15:16             ` Paul Durrant
  2020-03-10 15:33               ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Paul Durrant @ 2020-03-10 15:16 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: xen-devel, 'Paul Durrant', 'Roger Pau Monné',
	'Wei Liu', 'Andrew Cooper'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 10 March 2020 15:13
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper'
> <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com>
> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
> 
> On 10.03.2020 16:05, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 10 March 2020 14:59
> >>
> >> In reply to patch 6 I did suggest to
> >> have a separate list, but without taking these pages off
> >> d->page_list,
> >
> > How would that work without adding an extra page_list_entry into struct page_info?
> 
> As said there, it'd be a singly linked list using a __pdx_t
> field just like there already is with "next_shadow", i.e.
> you'd add another union member "next_extra" or some such. Of
> course the list shouldn't grow too long, or else insertion
> and removal may become a bottleneck. Not sure how well this
> would fit Arm, though; maybe they wouldn't need this, but
> that depends on whether the list would be used for purposes
> beyond dumping.
> 

That seems more obscure and bug-prone than an extra list head in struct domain. I'd really prefer to stick with that even if it does make things a little more ugly until xenpage_list goes away.

  Paul



* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-10 15:16             ` Paul Durrant
@ 2020-03-10 15:33               ` Jan Beulich
  2020-03-10 15:54                 ` Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-10 15:33 UTC (permalink / raw)
  To: paul
  Cc: xen-devel, 'Paul Durrant', 'Roger Pau Monné',
	'Wei Liu', 'Andrew Cooper'

On 10.03.2020 16:16, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 10 March 2020 15:13
>> To: paul@xen.org
>> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper'
>> <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com>
>> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
>>
>> On 10.03.2020 16:05, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 10 March 2020 14:59
>>>>
>>>> In reply to patch 6 I did suggest to
>>>> have a separate list, but without taking these pages off
>>>> d->page_list,
>>>
>>> How would that work without adding an extra page_list_entry into struct page_info?
>>
>> As said there, it'd be a singly linked list using a __pdx_t
>> field just like there already is with "next_shadow", i.e.
>> you'd add another union member "next_extra" or some such. Of
>> course the list shouldn't grow too long, or else insertion
>> and removal may become a bottleneck. Not sure how well this
>> would fit Arm, though; maybe they wouldn't need this, but
>> that depends on whether the list would be used for purposes
>> beyond dumping.
> 
> That seems more obscure and bug-prone than an extra list head
> in struct domain. I'd really prefer to stick with that even
> if it does make things a little more ugly until xenpage_list
> goes away.

Okay with me if there really is no property (other than
assign_pages() then needing to pick the right list, which is
not much different from needing to put the extra pages on two
lists) that you'd lose by no longer having the pages on the
same list.

Jan


* Re: [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
  2020-03-10 15:33               ` Jan Beulich
@ 2020-03-10 15:54                 ` Paul Durrant
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2020-03-10 15:54 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: xen-devel, 'Paul Durrant', 'Roger Pau Monné',
	'Wei Liu', 'Andrew Cooper'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 10 March 2020 15:33
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper'
> <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com>
> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
> 
> On 10.03.2020 16:16, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 10 March 2020 15:13
> >> To: paul@xen.org
> >> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Andrew Cooper'
> >> <andrew.cooper3@citrix.com>; 'Wei Liu' <wl@xen.org>; 'Roger Pau Monné' <roger.pau@citrix.com>
> >> Subject: Re: [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM
> >>
> >> On 10.03.2020 16:05, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 10 March 2020 14:59
> >>>>
> >>>> In reply to patch 6 I did suggest to
> >>>> have a separate list, but without taking these pages off
> >>>> d->page_list,
> >>>
> >>> How would that work without adding an extra page_list_entry into struct page_info?
> >>
> >> As said there, it'd be a singly linked list using a __pdx_t
> >> field just like there already is with "next_shadow", i.e.
> >> you'd add another union member "next_extra" or some such. Of
> >> course the list shouldn't grow too long, or else insertion
> >> and removal may become a bottleneck. Not sure how well this
> >> would fit Arm, though; maybe they wouldn't need this, but
> >> that depends on whether the list would be used for purposes
> >> beyond dumping.
> >
> > That seems more obscure and bug-prone than an extra list head
> > in struct domain. I'd really prefer to stick with that even
> > if it does make things a little more ugly until xenpage_list
> > goes away.
> 
> Okay with me if there really is no property (other than
> assign_pages() then needing to pick the right list, which is
> not much different from needing to put the extra pages on two
> lists) that you'd lose by no longer having the pages on the
> same list.

Just assign_pages() and arch_free_heap_page(), and my test patch defines:

+static inline struct page_list_head *page_to_list(
+    struct domain *d, const struct page_info *pg)
+{
+    if ( is_xen_heap_page(pg) )
+        return &d->xenpage_list;
+    else if ( pg->count_info & PGC_extra )
+        return &d->extra_page_list;
+
+    return &d->page_list;
+}

which they both use to select the right list.

Once xenpage_list goes away, this can be simplified to use a ternary operator.
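
i.e. something along these lines (a sketch only):

static inline struct page_list_head *page_to_list(
    struct domain *d, const struct page_info *pg)
{
    /* With xenpage_list gone, only "extra" pages are special. */
    return (pg->count_info & PGC_extra) ? &d->extra_page_list
                                        : &d->page_list;
}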

  Paul

> 
> Jan



* Re: [Xen-devel] [PATCH v5 5/6] mm: add 'is_special_page' inline function...
  2020-03-09 13:28   ` Jan Beulich
@ 2020-03-10 16:32     ` Paul Durrant
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2020-03-10 16:32 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Tamas K Lengyel', 'Julien Grall',
	'Wei Liu', 'Konrad Rzeszutek Wilk',
	'Andrew Cooper', 'Paul Durrant',
	'Ian Jackson', 'George Dunlap',
	'Tim Deegan', 'Stefano Stabellini',
	xen-devel, 'Roger Pau Monné'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 09 March 2020 13:28
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Tamas K Lengyel
> <tamas@tklengyel.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau
> Monné <roger.pau@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Ian Jackson
> <ian.jackson@eu.citrix.com>; Julien Grall <julien@xen.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim Deegan <tim@xen.org>
> Subject: Re: [PATCH v5 5/6] mm: add 'is_special_page' inline function...
> 
> On 09.03.2020 11:23, paul@xen.org wrote:
> > v4:
> >  - Use inline function instead of macro
> >  - Add missing conversions from is_xen_heap_page()
> 
> Among these there is also one conversion of is_xen_heap_mfn(). I'm still
> curious why others wouldn't need converting - the description
> doesn't mention there are more, see p2m_add_foreign() for an
> example (may warrant introduction of is_special_mfn() then). It
> would probably be beneficial if the description gave some
> generic criteria for cases where conversion is (not) needed.
> 

OK. Basically it’s to cover the case where is_xen_heap_page() is used to mean 'isn't normal guest memory'. I'll expand the commit comment to say that.

> But there are issues beyond this, as there are also open-coded
> instances of PGC_xen_heap checks, and that's the other possible
> regression I notice from the APIC assist MFN page conversion:
> PoD code, to avoid doing two separate checks on ->count_info [1],
> uses two instances of a construct like this one
> 
>              !(pg->count_info & (PGC_page_table | PGC_xen_heap)) &&
> 
> (and again I didn't do a complete audit for further
> occurrences). This means the APIC assist page right now might
> be a candidate for getting converted to PoD (possibly others of
> the constraints actually prohibit this, but I'm not sure).
> 
> [1] I'm unconvinced PGC_page_table pages can actually appear
> there, so the open-coding may in fact be an optimization of
> dead code.

Ok, I'll audit occurrences of PGC_xen_heap.

  Paul

> 
> > --- a/xen/arch/x86/mm/shadow/common.c
> > +++ b/xen/arch/x86/mm/shadow/common.c
> > @@ -2087,19 +2087,22 @@ static int sh_remove_all_mappings(struct domain *d, mfn_t gmfn, gfn_t gfn)
> >           * The qemu helper process has an untyped mapping of this dom's RAM
> >           * and the HVM restore program takes another.
> >           * Also allow one typed refcount for
> > -         * - Xen heap pages, to match share_xen_page_with_guest(),
> > -         * - ioreq server pages, to match prepare_ring_for_helper().
> > +         * - special pages, which are explicitly referenced and mapped by
> > +         *   Xen.
> > +         * - ioreq server pages, which may be special pages or normal
> > +         *   guest pages with an extra reference taken by
> > +         *   prepare_ring_for_helper().
> >           */
> >          if ( !(shadow_mode_external(d)
> >                 && (page->count_info & PGC_count_mask) <= 3
> >                 && ((page->u.inuse.type_info & PGT_count_mask)
> > -                   == (is_xen_heap_page(page) ||
> > +                   == (is_special_page(page) ||
> >                         (is_hvm_domain(d) && is_ioreq_server_page(d, page))))) )
> >              printk(XENLOG_G_ERR "can't find all mappings of mfn %"PRI_mfn
> > -                   " (gfn %"PRI_gfn"): c=%lx t=%lx x=%d i=%d\n",
> > +                   " (gfn %"PRI_gfn"): c=%lx t=%lx s=%d i=%d\n",
> >                     mfn_x(gmfn), gfn_x(gfn),
> >                     page->count_info, page->u.inuse.type_info,
> > -                   !!is_xen_heap_page(page),
> > +                   !!is_special_page(page),
> 
> The reason for me to ask to switch to an inline function was to
> see this !! go away.
> 
> Jan
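
(A bool-returning inline along these lines would make the !! at call
sites unnecessary; this is only a sketch guessing at the final shape:)

static inline bool is_special_page(const struct page_info *page)
{
    return is_xen_heap_page(page) || (page->count_info & PGC_extra);
}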



* Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
  2020-03-09 15:56   ` Jan Beulich
@ 2020-03-10 17:33     ` Paul Durrant
  2020-03-11  9:16       ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Paul Durrant @ 2020-03-10 17:33 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Konrad Rzeszutek Wilk',
	'Andrew Cooper', 'Paul Durrant',
	'Ian Jackson', 'George Dunlap',
	xen-devel, 'Volodymyr Babchuk'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 09 March 2020 15:56
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap
> <george.dunlap@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
> 
> On 09.03.2020 11:23, paul@xen.org wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > Currently shared_info is a shared xenheap page but shared xenheap pages
> > complicate future plans for live-update of Xen so it is desirable to,
> > where possible, not use them [1]. This patch therefore converts shared_info
> > into a PGC_extra domheap page. This does entail freeing shared_info during
> > domain_relinquish_resources() rather than domain_destroy() so care is
> > needed to avoid de-referencing a NULL shared_info pointer hence some
> > extra checks of 'is_dying' are needed.
> >
> > Also, because shared_info will no longer be a xenheap page this patch adds
> > an extra dump function to make sure the shared_info MFN is included in the
> > output of dump_pageframe_info().
> 
> I've voiced my objection to such a model, and hence it'll take
> another REST maintainer to approve of this despite my arguments
> against it. (And of course, just to re-record this here, the
> APIC access page, converted by ea3daabff5f2, will want to get a
> dumping function added then, too.)
> 
> I wonder whether a domain's "extra" pages couldn't be put in a
> separate, singly-linked list, using the union the next_shadow
> field is in as the linking field. None of the other union
> members look to be applicable to "extra" pages.
> 
> > +void dump_shared_info(struct domain *d)
> > +{
> > +    domain_lock(d);
> > +
> > +    if ( d->shared_info.virt )
> > +        printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));
> 
> count_info and type_info should be logged imo, just like
> dump_pageframe_info() does. On the whole I think the actual
> dumping might better be uniform, and these functions would
> then only exist to "know" which page(s) to dump.
> 

I think the extra page list should cover these issues.
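
(For reference, the kind of uniform dump being asked for might look
like the sketch below, reusing the fields dump_pageframe_info() logs;
the exact format string is an assumption:)

void dump_shared_info(struct domain *d)
{
    domain_lock(d);

    if ( d->shared_info.virt )
    {
        const struct page_info *pg = mfn_to_page(d->shared_info.mfn);

        printk("Shared Info: %"PRI_mfn" c=%lx t=%lx\n",
               mfn_x(d->shared_info.mfn),
               pg->count_info, pg->u.inuse.type_info);
    }

    domain_unlock(d);
}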

> > --- a/xen/common/time.c
> > +++ b/xen/common/time.c
> > @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
> >      uint32_t *wc_version;
> >      uint64_t sec;
> >
> > +    if ( d != current->domain )
> > +    {
> > +        /*
> > +         * We need to check is_dying here as, if it is set, the
> > +         * shared_info may have been freed. To do this safely we need to
> > +         * hold the domain lock.
> > +         */
> > +        domain_lock(d);
> > +        if ( d->is_dying )
> > +            goto unlock;
> > +    }
> 
> This shouldn't happen very often, but it's a pretty heavy lock.
> It's a fundamental aspect of xenheap pages that their disposal
> can be delayed until almost the last moment of guest cleanup. I
> continue to think it's not really a good idea to have special
> purpose allocation (and mapping) accompanied by these pages
> getting taken care of by the generic relinquish-resources logic
> here (from a more general pov such is of course often nice to
> have). Instead of freeing these pages there, couldn't they just
> be taken off d->page_list, with the unmapping and freeing left
> as it was?

I don't think this can be achieved without being able to de-assign pages, and I don't really want to have to invent new logic to do that (basically re-implementing what happens to xenheap pages). I really don't think it is that bad to deal with shared info and grant table pages as domheap pages. Yes, we have to be careful, but in this case the lock is only taken in a toolstack update of the wallclock and, apart from start-of-day access, I think this is limited to XEN_DOMCTL_settimeoffset and XEN_DOMCTL_setvcpucontext, neither of which I believe is particularly performance-critical.

  Paul

> 
> Jan



* Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
  2020-03-10 17:33     ` Paul Durrant
@ 2020-03-11  9:16       ` Jan Beulich
       [not found]         ` <004501d5f7b9$b00e1120$102a3360$@xen.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-11  9:16 UTC (permalink / raw)
  To: paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Konrad Rzeszutek Wilk',
	'Andrew Cooper', 'Paul Durrant',
	'Ian Jackson', 'George Dunlap',
	xen-devel, 'Volodymyr Babchuk'

On 10.03.2020 18:33, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 09 March 2020 15:56
>>
>> On 09.03.2020 11:23, paul@xen.org wrote:
>>> --- a/xen/common/time.c
>>> +++ b/xen/common/time.c
>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>>>      uint32_t *wc_version;
>>>      uint64_t sec;
>>>
>>> +    if ( d != current->domain )
>>> +    {
>>> +        /*
>>> +         * We need to check is_dying here as, if it is set, the
>>> +         * shared_info may have been freed. To do this safely we need to
>>> +         * hold the domain lock.
>>> +         */
>>> +        domain_lock(d);
>>> +        if ( d->is_dying )
>>> +            goto unlock;
>>> +    }
>>
>> This shouldn't happen very often, but it's a pretty heavy lock.
>> It's a fundamental aspect of xenheap pages that their disposal
>> can be delayed until almost the last moment of guest cleanup. I
>> continue to think it's not really a good idea to have special
>> purpose allocation (and mapping) accompanied by these pages
>> getting taken care of by the generic relinquish-resources logic
>> here (from a more general pov such is of course often nice to
>> have). Instead of freeing these pages there, couldn't they just
>> be taken off d->page_list, with the unmapping and freeing left
>> as it was?
> 
> I don't think this can be achieved without being able to de-assign
> pages and I don't really want to have to invent new logic to do
> that (basically re-implementing what happens to xenheap pages).

Where's the connection to being able to de-assign pages here?
There'll be one when the same conversion is to be done for
gnttab code, but I don't see it here - the shared info page is
never to be de-assigned. As to gnttab code, I think it was
noted before that we may be better off not "unpopulating"
status pages when switching back from v2 to v1. At which point
the de-assignment need would go away there, too.

> I really don't think it is that bad to deal with shared info
> and grant table pages as domheap pages. Yes, we have to be
> careful, but in this case the lock is only taken in a
> toolstack update of the wallclock and, apart from start-of-day
> access, I think this is limited to XEN_DOMCTL_settimeoffset
> and XEN_DOMCTL_setvcpucontext, neither of which I believe is
> particularly performance-critical.

It's not, I agree (and hence I had started my previous reply
also with "This shouldn't happen very often"). How all of this
is going to look like with the new extra_page_list I'll have
to see anyway. But for now I remain unconvinced of the want /
need to de-allocate the shared info page early.

Jan


* Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
       [not found]         ` <004501d5f7b9$b00e1120$102a3360$@xen.org>
@ 2020-03-12  8:34           ` Jan Beulich
  2020-03-12  9:09             ` [Xen-devel] [EXTERNAL][PATCH " Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2020-03-12  8:34 UTC (permalink / raw)
  To: paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Konrad Rzeszutek Wilk',
	'Andrew Cooper', 'Paul Durrant',
	'Ian Jackson', 'George Dunlap',
	xen-devel, 'Volodymyr Babchuk'

On 11.03.2020 16:28, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 11 March 2020 09:17
>> To: paul@xen.org
>> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Stefano Stabellini'
>> <sstabellini@kernel.org>; 'Julien Grall' <julien@xen.org>; 'Volodymyr Babchuk'
>> <Volodymyr_Babchuk@epam.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap'
>> <george.dunlap@citrix.com>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Konrad Rzeszutek Wilk'
>> <konrad.wilk@oracle.com>; 'Wei Liu' <wl@xen.org>
>> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>>
>> On 10.03.2020 18:33, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 09 March 2020 15:56
>>>>
>>>> On 09.03.2020 11:23, paul@xen.org wrote:
>>>>> --- a/xen/common/time.c
>>>>> +++ b/xen/common/time.c
>>>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>>>>>      uint32_t *wc_version;
>>>>>      uint64_t sec;
>>>>>
>>>>> +    if ( d != current->domain )
>>>>> +    {
>>>>> +        /*
>>>>> +         * We need to check is_dying here as, if it is set, the
>>>>> +         * shared_info may have been freed. To do this safely we need to
>>>>> +         * hold the domain lock.
>>>>> +         */
>>>>> +        domain_lock(d);
>>>>> +        if ( d->is_dying )
>>>>> +            goto unlock;
>>>>> +    }
>>>>
>>>> This shouldn't happen very often, but it's a pretty heavy lock.
>>>> It's a fundamental aspect of xenheap pages that their disposal
>>>> can be delayed until almost the last moment of guest cleanup. I
>>>> continue to think it's not really a good idea to have special
>>>> purpose allocation (and mapping) accompanied by these pages
>>>> getting taken care of by the generic relinquish-resources logic
>>>> here (from a more general pov such is of course often nice to
>>>> have). Instead of freeing these pages there, couldn't they just
>>>> be taken off d->page_list, with the unmapping and freeing left
>>>> as it was?
>>>
>>> I don't think this can be achieved without being able to de-assign
>>> pages and I don't really want to have to invent new logic to do
>>> that (basically re-implementing what happens to xenheap pages).
>>
>> Where's the connection to being able to de-assign pages here?
>> There'll be one when the same conversion is to be done for
>> gnttab code, but I don't see it here - the shared info page is
>> never to be de-assigned. As to gnttab code, I think it was
>> noted before that we may be better off not "unpopulating"
>> status pages when switching back from v2 to v1. At which point
>> the de-assignment need would go away there, too.
> 
> Ok, maybe I'm misunderstanding something then. We need to call
> free_domheap_pages() on all pages assigned to a domain so that
> the domain references get dropped. The xenpage ref is dropped
> when d->xenheap_pages == 0. The domheap ref is dropped when
> domain_adjust_tot_pages() returns zero. (This is what I meant
> by de-assigning... but that was probably a poor choice of words).
> So, because domain_adjust_tot_pages() returns d->tot_pages
> (which includes the extra_pages count) it won't fall to zero
> until the last put_page() on any PGC_extra page. So how is it
> possible to free shared_info in domain destroy? We'll never get
> that far, because the domheap ref will never get dropped.

Well, now that these pages sit on a separate list, it would
look even less problematic than before to me to also give them
special treatment here: You wouldn't even have to take them
off the list anymore, but just call domain_adjust_tot_pages()
with -d->extra_pages (and suitably deal with the return value;
perhaps for consistency then followed by also zeroing
d->extra_pages, so overall accounting still looks correct).
Actually taking these pages off the list could (for dumping
purposes) then be done alongside their actual freeing. Such a
transition would apparently imply clearing PGC_extra alongside
the new domain_adjust_tot_pages() call.
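
Roughly, i.e. (a sketch of the step described above; the helper name
is invented and the handling of the returned value is only noted):

static void drop_extra_page_accounting(struct domain *d)
{
    struct page_info *pg;

    /*
     * Take the extra pages out of the domain's accounting in one go.
     * A complete version would act on the value returned by
     * domain_adjust_tot_pages(), as free_domheap_pages() does.
     */
    domain_adjust_tot_pages(d, -(long)d->extra_pages);
    d->extra_pages = 0;

    /* Clear PGC_extra so the eventual freeing path sees normal pages. */
    page_list_for_each ( pg, &d->extra_page_list )
        pg->count_info &= ~PGC_extra;
}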

I realize though that the end result won't be much different
from the current PGC_xen_heap handling (at least as far as
domain cleanup goes, but after all that's what I'm in fact
trying to convince you of), so the question would be whether
the whole transition then is worth it. Without having seen at
least a sketch of the LU code that is affected by the current
behavior, it remains hard for me to tell what issues might
remain, despite your and David's explanations.

> I
> guess this could be fixed by having domain_adjust_tot_pages()
> return the same value as domain_tot_pages() (i.e.
> tot_pages - extra_pages). Is that what you're suggesting?

That's an option, too, but imo less desirable.

Jan


* Re: [Xen-devel] [EXTERNAL][PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
  2020-03-12  8:34           ` Jan Beulich
@ 2020-03-12  9:09             ` Paul Durrant
  2020-03-12  9:26               ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Paul Durrant @ 2020-03-12  9:09 UTC (permalink / raw)
  To: 'Jan Beulich', paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Konrad Rzeszutek Wilk',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', xen-devel, 'Volodymyr Babchuk'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 12 March 2020 08:34
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; Durrant, Paul <pdurrant@amazon.co.uk>; 'Stefano Stabellini'
> <sstabellini@kernel.org>; 'Julien Grall' <julien@xen.org>; 'Volodymyr Babchuk'
> <Volodymyr_Babchuk@epam.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap'
> <george.dunlap@citrix.com>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Konrad Rzeszutek Wilk'
> <konrad.wilk@oracle.com>; 'Wei Liu' <wl@xen.org>
> Subject: RE: [EXTERNAL][PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
> 
> On 11.03.2020 16:28, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 11 March 2020 09:17
> >> To: paul@xen.org
> >> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Stefano Stabellini'
> >> <sstabellini@kernel.org>; 'Julien Grall' <julien@xen.org>; 'Volodymyr Babchuk'
> >> <Volodymyr_Babchuk@epam.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap'
> >> <george.dunlap@citrix.com>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Konrad Rzeszutek Wilk'
> >> <konrad.wilk@oracle.com>; 'Wei Liu' <wl@xen.org>
> >> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
> >>
> >> On 10.03.2020 18:33, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 09 March 2020 15:56
> >>>>
> >>>> On 09.03.2020 11:23, paul@xen.org wrote:
> >>>>> --- a/xen/common/time.c
> >>>>> +++ b/xen/common/time.c
> >>>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
> >>>>>      uint32_t *wc_version;
> >>>>>      uint64_t sec;
> >>>>>
> >>>>> +    if ( d != current->domain )
> >>>>> +    {
> >>>>> +        /*
> >>>>> +         * We need to check is_dying here as, if it is set, the
> >>>>> +         * shared_info may have been freed. To do this safely we need to
> >>>>> +         * hold the domain lock.
> >>>>> +         */
> >>>>> +        domain_lock(d);
> >>>>> +        if ( d->is_dying )
> >>>>> +            goto unlock;
> >>>>> +    }
> >>>>
> >>>> This shouldn't happen very often, but it's a pretty heavy lock.
> >>>> It's a fundamental aspect of xenheap pages that their disposal
> >>>> can be delayed until almost the last moment of guest cleanup. I
> >>>> continue to think it's not really a good idea to have special
> >>>> purpose allocation (and mapping) accompanied by these pages
> >>>> getting taken care of by the generic relinquish-resources logic
> >>>> here (from a more general pov such is of course often nice to
> >>>> have). Instead of freeing these pages there, couldn't they just
> >>>> be taken off d->page_list, with the unmapping and freeing left
> >>>> as it was?
> >>>
> >>> I don't think this can be achieved without being able to de-assign
> >>> pages and I don't really want to have to invent new logic to do
> >>> that (basically re-implementing what happens to xenheap pages).
> >>
> >> Where's the connection to being able to de-assign pages here?
> >> There'll be one when the same conversion is to be done for
> >> gnttab code, but I don't see it here - the shared info page is
> >> never to be de-assigned. As to gnttab code, I think it was
> >> noted before that we may be better off not "unpopulating"
> >> status pages when switching back from v2 to v1. At which point
> >> the de-assignment need would go away there, too.
> >
> > Ok, maybe I'm misunderstanding something then. We need to call
> > free_domheap_pages() on all pages assigned to a domain so that
> > the domain references get dropped. The xenpage ref is dropped
> > when d->xenheap_pages == 0. The domheap ref is dropped when
> > domain_adjust_tot_pages() returns zero. (This is what I meant
> > by de-assigning... but that was probably a poor choice of words).
> > So, because domain_adjust_tot_pages() returns d->tot_pages
> > (which includes the extra_pages count) it won't fall to zero
> > until the last put_page() on any PGC_extra page. So how is it
> > possible to free shared_info in domain destroy? We'll never get
> > that far, because the domheap ref will never get dropped.
> 
> Well, now that these pages sit on a separate list, it would
> look even less problematic than before to me to also give them
> special treatment here: You wouldn't even have to take them
> off the list anymore, but just call domain_adjust_tot_pages()
> with -d->extra_pages (and suitably deal with the return value;
> perhaps for consistency then followed by also zeroing
> d->extra_pages, so overall accounting still looks correct).
> Actually taking these pages off the list could (for dumping
> purposes) then be done alongside their actual freeing. Such a
> transition would apparently imply clearing PGC_extra alongside
> the new domain_adjust_tot_pages() call.
> 
> I realize though that the end result won't be much different
> from the current PGC_xen_heap handling (at least as far as
> domain cleanup goes, but after all that's what I'm in fact
> trying to convince you of), so the question would be whether
> the whole transition then is worth it.

Yes... with such adjustments the code gets increasingly pointless from a simplification PoV... which is why I coded this patch as it is. I am happy to make the concession of using the extra page list for dumping purposes, and to avoid the need to special-case PGC_extra pages in a few places, but otherwise I want them to be treated as 'normal' domheap pages.

So, if you're not willing to ack this patch then I guess I'll have to re-send the series including the grant table changes and the removal of the xenpage list.

  Paul



* Re: [Xen-devel] [EXTERNAL][PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
  2020-03-12  9:09             ` [Xen-devel] [EXTERNAL][PATCH " Paul Durrant
@ 2020-03-12  9:26               ` Jan Beulich
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2020-03-12  9:26 UTC (permalink / raw)
  To: paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Konrad Rzeszutek Wilk',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', xen-devel, 'Volodymyr Babchuk'

On 12.03.2020 10:09, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 12 March 2020 08:34
>> To: paul@xen.org
>> Cc: xen-devel@lists.xenproject.org; Durrant, Paul <pdurrant@amazon.co.uk>; 'Stefano Stabellini'
>> <sstabellini@kernel.org>; 'Julien Grall' <julien@xen.org>; 'Volodymyr Babchuk'
>> <Volodymyr_Babchuk@epam.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap'
>> <george.dunlap@citrix.com>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Konrad Rzeszutek Wilk'
>> <konrad.wilk@oracle.com>; 'Wei Liu' <wl@xen.org>
>> Subject: RE: [EXTERNAL][PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>>
>> On 11.03.2020 16:28, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 11 March 2020 09:17
>>>> To: paul@xen.org
>>>> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant' <pdurrant@amazon.com>; 'Stefano Stabellini'
>>>> <sstabellini@kernel.org>; 'Julien Grall' <julien@xen.org>; 'Volodymyr Babchuk'
>>>> <Volodymyr_Babchuk@epam.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap'
>>>> <george.dunlap@citrix.com>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Konrad Rzeszutek Wilk'
>>>> <konrad.wilk@oracle.com>; 'Wei Liu' <wl@xen.org>
>>>> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
>>>>
>>>> On 10.03.2020 18:33, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 09 March 2020 15:56
>>>>>>
>>>>>> On 09.03.2020 11:23, paul@xen.org wrote:
>>>>>>> --- a/xen/common/time.c
>>>>>>> +++ b/xen/common/time.c
>>>>>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>>>>>>>      uint32_t *wc_version;
>>>>>>>      uint64_t sec;
>>>>>>>
>>>>>>> +    if ( d != current->domain )
>>>>>>> +    {
>>>>>>> +        /*
>>>>>>> +         * We need to check is_dying here as, if it is set, the
>>>>>>> +         * shared_info may have been freed. To do this safely we need to
>>>>>>> +         * hold the domain lock.
>>>>>>> +         */
>>>>>>> +        domain_lock(d);
>>>>>>> +        if ( d->is_dying )
>>>>>>> +            goto unlock;
>>>>>>> +    }
>>>>>>
>>>>>> This shouldn't happen very often, but it's a pretty heavy lock.
>>>>>> It's a fundamental aspect of xenheap pages that their disposal
>>>>>> can be delayed until almost the last moment of guest cleanup. I
>>>>>> continue to think it's not really a good idea to have special
>>>>>> purpose allocation (and mapping) accompanied by these pages
>>>>>> getting taken care of by the generic relinquish-resources logic
>>>>>> here (from a more general pov such is of course often nice to
>>>>>> have). Instead of freeing these pages there, couldn't they just
>>>>>> be taken off d->page_list, with the unmapping and freeing left
>>>>>> as it was?
>>>>>
>>>>> I don't think this can be achieved without being able to de-assign
>>>>> pages and I don't really want to have to invent new logic to do
>>>>> that (basically re-implementing what happens to xenheap pages).
>>>>
>>>> Where's the connection to being able to de-assign pages here?
>>>> There'll be one when the same conversion is to be done for
>>>> gnttab code, but I don't see it here - the shared info page is
>>>> never to be de-assigned. As to gnttab code, I think it was
>>>> noted before that we may be better off not "unpopulating"
>>>> status pages when switching back from v2 to v1. At which point
>>>> the de-assignment need would go away there, too.
>>>
>>> Ok, maybe I'm misunderstanding something then. We need to call
>>> free_domheap_pages() on all pages assigned to a domain so that
>>> the domain references get dropped. The xenpage ref is dropped
>>> when d->xenheap_pages == 0. The domheap ref is dropped when
>>> domain_adjust_tot_pages() returns zero. (This is what I meant
>>> by de-assigning... but that was probably a poor choice of words).
>>> So, because domain_adjust_tot_pages() returns d->tot_pages
>>> (which includes the extra_pages count) it won't fall to zero
>>> until the last put_page() on any PGC_extra page. So how is it
>>> possible to free shared_info in domain destroy? We'll never get
>>> that far, because the domheap ref will never get dropped.
>>
>> Well, now that these pages sit on a separate list, it would
>> look even less problematic than before to me to also give them
>> special treatment here: You wouldn't even have to take them
>> off the list anymore, but just call domain_adjust_tot_pages()
>> with -d->extra_pages (and suitably deal with the return value;
>> perhaps for consistency then followed by also zeroing
>> d->extra_pages, so overall accounting still looks correct).
>> Actually taking these pages off the list could (for dumping
>> purposes) then be done alongside their actual freeing. Such a
>> transition would apparently imply clearing PGC_extra alongside
>> the new domain_adjust_tot_pages() call.
>>
>> I realize though that the end result won't be much different
>> from the current PGC_xen_heap handling (at least as far as
>> domain cleanup goes, but after all that's what I'm in fact
>> trying to convince you of), so the question would be whether
>> the whole transition then is worth it.
> 
> Yes... with such adjustments the code gets increasingly pointless from a simplification PoV... which is why I coded this patch as it is. I am happy to make the concession of using the extra page list for dumping purposes, and to avoid the need to special-case PGC_extra pages in a few places, but otherwise I want them to be treated as 'normal' domheap pages.
> 
> So, if you're not willing to ack this patch

Well - I'm not the only one in a position to ack this. The
problem is that right now this is just a discussion between
the two of us.

> then I guess I'll have to re-send the series including the
> grant table changes and the removal of the xenpage list.

I'm not sure I see how this would help. My reservations
against the earlier freeing of the extra pages wouldn't go
away.

But let me first get around to look at v6.

Jan


Thread overview: 23+ messages
2020-03-09 10:22 [Xen-devel] [PATCH v5 0/6] remove one more shared xenheap page: shared_info paul
2020-03-09 10:22 ` [Xen-devel] [PATCH v5 1/6] domain: introduce alloc/free_shared_info() helpers paul
2020-03-09 10:23 ` [Xen-devel] [PATCH v5 2/6] x86 / p2m: replace page_list check in p2m_alloc_table paul
2020-03-09 10:23 ` [Xen-devel] [PATCH v5 3/6] x86 / pv: do not treat PGC_extra pages as RAM paul
2020-03-09 13:04   ` Jan Beulich
2020-03-10 13:32     ` Paul Durrant
2020-03-10 14:58       ` Jan Beulich
2020-03-10 15:05         ` Paul Durrant
2020-03-10 15:12           ` Jan Beulich
2020-03-10 15:16             ` Paul Durrant
2020-03-10 15:33               ` Jan Beulich
2020-03-10 15:54                 ` Paul Durrant
2020-03-09 10:23 ` [Xen-devel] [PATCH v5 4/6] x86 / ioreq: use a MEMF_no_refcount allocation for server pages paul
2020-03-09 10:23 ` [Xen-devel] [PATCH v5 5/6] mm: add 'is_special_page' inline function paul
2020-03-09 13:28   ` Jan Beulich
2020-03-10 16:32     ` Paul Durrant
2020-03-09 10:23 ` [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info paul
2020-03-09 15:56   ` Jan Beulich
2020-03-10 17:33     ` Paul Durrant
2020-03-11  9:16       ` Jan Beulich
     [not found]         ` <004501d5f7b9$b00e1120$102a3360$@xen.org>
2020-03-12  8:34           ` Jan Beulich
2020-03-12  9:09             ` [Xen-devel] [EXTERNAL][PATCH " Paul Durrant
2020-03-12  9:26               ` Jan Beulich
