* Fix VGA logdirty related display freezes with altp2m
From: Razvan Cojocaru @ 2018-10-18 10:07 UTC
To: xen-devel
Cc: kevin.tian, wei.liu2, jun.nakajima, george.dunlap, andrew.cooper3, jbeulich

Hello,

This series aims to prevent the display from freezing when
enabling altp2m and switching to a new view (and assorted problems
when resizing the display).

The first patch propagates ept.ad changes to all active altp2ms,
and the second one allocates a new logdirty rangeset for each
new altp2m, and propagates (under lock) changes to all p2ms.

The first patch is the same as:
[PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms
but as it is now required for the second one to apply cleanly, it
has been resent as part of this series.

[PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms
[PATCH 2/2] x86/altp2m: fix display frozen when switching to a new

Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms
From: Razvan Cojocaru @ 2018-10-18 10:07 UTC
To: xen-devel
Cc: kevin.tian, wei.liu2, jun.nakajima, Razvan Cojocaru, george.dunlap, andrew.cooper3, jbeulich

This patch is a pre-requisite for fixing the logdirty VGA issue
(display freezes when switching to a new altp2m view early in a
domain's lifetime), but sent separately for easier review.
The new ept_set_ad_sync() function has been added to update all
active altp2ms' ept.ad. New altp2ms will inherit the hostp2m's
ept.ad value.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Suggested-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/arch/x86/mm/p2m-ept.c | 57 +++++++++++++++++++++++++++++++++++++++++++----
 xen/arch/x86/mm/p2m.c     |  8 -------
 2 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 407e299..fabcd06 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -17,6 +17,7 @@
 
 #include <xen/domain_page.h>
 #include <xen/sched.h>
+#include <asm/altp2m.h>
 #include <asm/current.h>
 #include <asm/paging.h>
 #include <asm/types.h>
@@ -1222,6 +1223,34 @@ static void ept_tlb_flush(struct p2m_domain *p2m)
     ept_sync_domain_mask(p2m, p2m->domain->dirty_cpumask);
 }
 
+static void ept_set_ad_sync(struct domain *d, bool value)
+{
+    struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
+
+    ASSERT(p2m_locked_by_me(hostp2m));
+
+    hostp2m->ept.ad = value;
+
+    if ( unlikely(altp2m_active(d)) )
+    {
+        unsigned int i;
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+        {
+            struct p2m_domain *p2m;
+
+            if ( d->arch.altp2m_eptp[i] == mfn_x(INVALID_MFN) )
+                continue;
+
+            p2m = d->arch.altp2m_p2m[i];
+
+            p2m_lock(p2m);
+            p2m->ept.ad = value;
+            p2m_unlock(p2m);
+        }
+    }
+}
+
 static void ept_enable_pml(struct p2m_domain *p2m)
 {
     /* Domain must have been paused */
@@ -1236,7 +1265,7 @@ static void ept_enable_pml(struct p2m_domain *p2m)
         return;
 
     /* Enable EPT A/D bit for PML */
-    p2m->ept.ad = 1;
+    ept_set_ad_sync(p2m->domain, true);
     vmx_domain_update_eptp(p2m->domain);
 }
 
@@ -1248,10 +1277,28 @@ static void ept_disable_pml(struct p2m_domain *p2m)
     vmx_domain_disable_pml(p2m->domain);
 
     /* Disable EPT A/D bit */
-    p2m->ept.ad = 0;
+    ept_set_ad_sync(p2m->domain, false);
     vmx_domain_update_eptp(p2m->domain);
 }
 
+static void ept_enable_hardware_log_dirty(struct p2m_domain *p2m)
+{
+    struct p2m_domain *hostp2m = p2m_get_hostp2m(p2m->domain);
+
+    p2m_lock(hostp2m);
+    ept_enable_pml(hostp2m);
+    p2m_unlock(hostp2m);
+}
+
+static void ept_disable_hardware_log_dirty(struct p2m_domain *p2m)
+{
+    struct p2m_domain *hostp2m = p2m_get_hostp2m(p2m->domain);
+
+    p2m_lock(hostp2m);
+    ept_disable_pml(hostp2m);
+    p2m_unlock(hostp2m);
+}
+
 static void ept_flush_pml_buffers(struct p2m_domain *p2m)
 {
     /* Domain must have been paused */
@@ -1281,8 +1328,8 @@ int ept_p2m_init(struct p2m_domain *p2m)
 
     if ( cpu_has_vmx_pml )
     {
-        p2m->enable_hardware_log_dirty = ept_enable_pml;
-        p2m->disable_hardware_log_dirty = ept_disable_pml;
+        p2m->enable_hardware_log_dirty = ept_enable_hardware_log_dirty;
+        p2m->disable_hardware_log_dirty = ept_disable_hardware_log_dirty;
         p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
     }
 
@@ -1390,8 +1437,10 @@ void setup_ept_dump(void)
 void p2m_init_altp2m_ept(struct domain *d, unsigned int i)
 {
     struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+    struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
     struct ept_data *ept;
 
+    p2m->ept.ad = hostp2m->ept.ad;
     p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
     p2m->max_remapped_gfn = 0;
     ept = &p2m->ept;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index a00a3c1..42b9ef4 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -360,11 +360,7 @@ void p2m_enable_hardware_log_dirty(struct domain *d)
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
     if ( p2m->enable_hardware_log_dirty )
-    {
-        p2m_lock(p2m);
         p2m->enable_hardware_log_dirty(p2m);
-        p2m_unlock(p2m);
-    }
 }
 
 void p2m_disable_hardware_log_dirty(struct domain *d)
@@ -372,11 +368,7 @@ void p2m_disable_hardware_log_dirty(struct domain *d)
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
     if ( p2m->disable_hardware_log_dirty )
-    {
-        p2m_lock(p2m);
         p2m->disable_hardware_log_dirty(p2m);
-        p2m_unlock(p2m);
-    }
 }
 
 void p2m_flush_hardware_cached_dirty(struct domain *d)
-- 
2.7.4
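
The propagation rule in the patch above can be modelled outside the hypervisor. The following is a minimal sketch, not Xen code: `struct view`, `struct dom`, `set_ad_sync()`, `init_view()` and `MAX_VIEWS` are hypothetical stand-ins for struct p2m_domain, struct domain, ept_set_ad_sync(), p2m_init_altp2m_ept() and MAX_ALTP2M, and all locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VIEWS 10             /* hypothetical stand-in for MAX_ALTP2M */

struct view {                    /* hypothetical stand-in for struct p2m_domain */
    bool active;                 /* models altp2m_eptp[i] != INVALID_MFN */
    bool ad;                     /* models p2m->ept.ad */
};

struct dom {                     /* hypothetical stand-in for struct domain */
    struct view host;
    struct view alt[MAX_VIEWS];
    bool alt_enabled;            /* models altp2m_active(d) */
};

/* Set the flag on the host view and mirror it into every active alternate. */
static void set_ad_sync(struct dom *d, bool value)
{
    size_t i;

    d->host.ad = value;

    if ( !d->alt_enabled )
        return;

    for ( i = 0; i < MAX_VIEWS; i++ )
        if ( d->alt[i].active )
            d->alt[i].ad = value;
}

/* A newly activated view inherits the host's current value. */
static void init_view(struct dom *d, size_t i)
{
    assert(i < MAX_VIEWS);
    d->alt[i].ad = d->host.ad;
    d->alt[i].active = true;
}
```

With this shape a view created after the flag was set starts out consistent with the host, and a later change reaches every active view in one call; inactive slots are never touched.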

* [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new view early
From: Razvan Cojocaru @ 2018-10-18 10:07 UTC
To: xen-devel
Cc: kevin.tian, wei.liu2, jun.nakajima, Razvan Cojocaru, george.dunlap, andrew.cooper3, jbeulich

When a new altp2m view is created very early on guest boot, the
display will freeze (although the guest will run normally). This
may also happen on resizing the display. The reason is the way
Xen currently (mis)handles logdirty VGA: it intentionally
misconfigures VGA pages so that they will fault.

The problem is that it only does this in the host p2m. Once we
switch to a new altp2m, the misconfigured entries will no longer
fault, so the display will not be updated.

This patch:
* updates ept_handle_misconfig() to use the active altp2m instead
  of the hostp2m;
* allocates new logdirty ranges for each altp2m;
* has p2m_init_altp2m_ept() copy over max_mapped_pfn
  and global_logdirty, and merges the logdirty ranges of the
  hostp2m into the logdirty range of the altp2m;
* modifies p2m_change_entry_type_global(), p2m_memory_type_changed
  and p2m_change_type_range() to propagate their changes to all
  valid altp2ms.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Suggested-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/arch/x86/mm/p2m-ept.c         |  31 ++++++++++-
 xen/arch/x86/mm/p2m.c             | 112 ++++++++++++++++++++++++++++----------
 xen/drivers/passthrough/pci.c     |   2 +-
 xen/include/asm-x86/hvm/vmx/vmx.h |   3 +-
 xen/include/asm-x86/p2m.h         |  10 ++--
 5 files changed, 123 insertions(+), 35 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index fabcd06..28790bf 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -657,6 +657,9 @@ bool_t ept_handle_misconfig(uint64_t gpa)
     bool_t spurious;
     int rc;
 
+    if ( altp2m_active(curr->domain) )
+        p2m = p2m_get_altp2m(curr);
+
     p2m_lock(p2m);
 
     spurious = curr->arch.hvm.vmx.ept_spurious_misconfig;
@@ -1434,18 +1437,44 @@ void setup_ept_dump(void)
     register_keyhandler('D', ept_dump_p2m_table, "dump VT-x EPT tables", 0);
 }
 
-void p2m_init_altp2m_ept(struct domain *d, unsigned int i)
+int p2m_init_altp2m_ept(struct domain *d, unsigned int i)
 {
     struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
     struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
     struct ept_data *ept;
+    int rc;
+
+    ASSERT(!p2m->sync.logdirty_ranges);
+    p2m->sync.logdirty_ranges = rangeset_new(d, "log-dirty",
+                                             RANGESETF_prettyprint_hex);
+    if ( !p2m->sync.logdirty_ranges )
+        return -ENOMEM;
+
+    rc = rangeset_merge(p2m->sync.logdirty_ranges,
+                        hostp2m->sync.logdirty_ranges);
+    if ( !rc )
+        return rc;
 
     p2m->ept.ad = hostp2m->ept.ad;
+    p2m->max_mapped_pfn = hostp2m->max_mapped_pfn;
+    p2m->default_access = hostp2m->default_access;
+    p2m->domain = hostp2m->domain;
+
+    p2m->sync.global_logdirty = hostp2m->sync.global_logdirty;
     p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
     p2m->max_remapped_gfn = 0;
     ept = &p2m->ept;
     ept->mfn = pagetable_get_pfn(p2m_get_pagetable(p2m));
     d->arch.altp2m_eptp[i] = ept->eptp;
+
+    return 0;
+}
+
+void p2m_uninit_altp2m_ept(struct p2m_domain *p2m)
+{
+    ASSERT(p2m->sync.logdirty_ranges);
+    rangeset_destroy(p2m->sync.logdirty_ranges);
+    p2m->sync.logdirty_ranges = NULL;
 }
 
 unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 42b9ef4..e9f8385 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -28,6 +28,7 @@
 #include <xen/vm_event.h>
 #include <xen/event.h>
 #include <public/vm_event.h>
+#include <asm/altp2m.h>
 #include <asm/domain.h>
 #include <asm/page.h>
 #include <asm/paging.h>
@@ -119,9 +120,9 @@ static int p2m_init_hostp2m(struct domain *d)
 
     if ( p2m )
     {
-        p2m->logdirty_ranges = rangeset_new(d, "log-dirty",
-                                            RANGESETF_prettyprint_hex);
-        if ( p2m->logdirty_ranges )
+        p2m->sync.logdirty_ranges = rangeset_new(d, "log-dirty",
+                                                 RANGESETF_prettyprint_hex);
+        if ( p2m->sync.logdirty_ranges )
         {
             d->arch.p2m = p2m;
             return 0;
@@ -138,7 +139,7 @@ static void p2m_teardown_hostp2m(struct domain *d)
 
     if ( p2m )
     {
-        rangeset_destroy(p2m->logdirty_ranges);
+        rangeset_destroy(p2m->sync.logdirty_ranges);
         p2m_free_one(p2m);
         d->arch.p2m = NULL;
     }
@@ -193,6 +194,7 @@ static void p2m_teardown_altp2m(struct domain *d)
         if ( !d->arch.altp2m_p2m[i] )
             continue;
         p2m = d->arch.altp2m_p2m[i];
+        p2m_uninit_altp2m_ept(p2m);
         d->arch.altp2m_p2m[i] = NULL;
         p2m_free_one(p2m);
     }
@@ -255,33 +257,55 @@ int p2m_init(struct domain *d)
 int p2m_is_logdirty_range(struct p2m_domain *p2m, unsigned long start,
                           unsigned long end)
 {
-    ASSERT(p2m_is_hostp2m(p2m));
-    if ( p2m->global_logdirty ||
-         rangeset_contains_range(p2m->logdirty_ranges, start, end) )
+    if ( p2m->sync.global_logdirty ||
+         rangeset_contains_range(p2m->sync.logdirty_ranges, start, end) )
         return 1;
-    if ( rangeset_overlaps_range(p2m->logdirty_ranges, start, end) )
+    if ( rangeset_overlaps_range(p2m->sync.logdirty_ranges, start, end) )
         return -1;
     return 0;
 }
 
+static void _p2m_change_entry_type_global(struct p2m_domain *p2m,
+                                          p2m_type_t ot, p2m_type_t nt)
+{
+    p2m->change_entry_type_global(p2m, ot, nt);
+    p2m->sync.global_logdirty = (nt == p2m_ram_logdirty);
+}
+
 void p2m_change_entry_type_global(struct domain *d,
                                   p2m_type_t ot, p2m_type_t nt)
 {
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
 
     ASSERT(ot != nt);
     ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
 
-    p2m_lock(p2m);
-    p2m->change_entry_type_global(p2m, ot, nt);
-    p2m->global_logdirty = (nt == p2m_ram_logdirty);
-    p2m_unlock(p2m);
+    p2m_lock(hostp2m);
+
+    _p2m_change_entry_type_global(p2m_get_hostp2m(d), ot, nt);
+
+#ifdef CONFIG_HVM
+    if ( unlikely(altp2m_active(d)) )
+    {
+        unsigned int i;
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+            if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
+            {
+                struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+
+                p2m_lock(p2m);
+                _p2m_change_entry_type_global(p2m, ot, nt);
+                p2m_unlock(p2m);
+            }
+    }
+#endif
+
+    p2m_unlock(hostp2m);
 }
 
-void p2m_memory_type_changed(struct domain *d)
+void _p2m_memory_type_changed(struct p2m_domain *p2m)
 {
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
-
     if ( p2m->memory_type_changed )
     {
         p2m_lock(p2m);
@@ -290,6 +314,22 @@ void p2m_memory_type_changed(struct domain *d)
     }
 }
 
+void p2m_memory_type_changed(struct domain *d)
+{
+#ifdef CONFIG_HVM
+    if ( unlikely(altp2m_active(d)) )
+    {
+        unsigned int i;
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+            if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
+                _p2m_memory_type_changed(d->arch.altp2m_p2m[i]);
+    }
+#endif
+
+    _p2m_memory_type_changed(p2m_get_hostp2m(d));
+}
+
 int p2m_set_ioreq_server(struct domain *d,
                          unsigned int flags,
                          struct hvm_ioreq_server *s)
@@ -970,12 +1010,12 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn_l,
 }
 
 /* Modify the p2m type of a range of gfns from ot to nt. */
-void p2m_change_type_range(struct domain *d,
-                           unsigned long start, unsigned long end,
-                           p2m_type_t ot, p2m_type_t nt)
+static void _p2m_change_type_range(struct p2m_domain *p2m,
+                                   unsigned long start, unsigned long end,
+                                   p2m_type_t ot, p2m_type_t nt)
 {
+    struct domain *d = p2m->domain;
     unsigned long gfn = start;
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int rc = 0;
 
     ASSERT(ot != nt);
@@ -1006,11 +1046,11 @@ void p2m_change_type_range(struct domain *d,
     {
     case p2m_ram_rw:
         if ( ot == p2m_ram_logdirty )
-            rc = rangeset_remove_range(p2m->logdirty_ranges, start, end - 1);
+            rc = rangeset_remove_range(p2m->sync.logdirty_ranges, start, end - 1);
         break;
     case p2m_ram_logdirty:
         if ( ot == p2m_ram_rw )
-            rc = rangeset_add_range(p2m->logdirty_ranges, start, end - 1);
+            rc = rangeset_add_range(p2m->sync.logdirty_ranges, start, end - 1);
         break;
     default:
         break;
@@ -1028,6 +1068,25 @@ void p2m_change_type_range(struct domain *d,
     p2m_unlock(p2m);
 }
 
+void p2m_change_type_range(struct domain *d,
+                           unsigned long start, unsigned long end,
+                           p2m_type_t ot, p2m_type_t nt)
+{
+#ifdef CONFIG_HVM
+    if ( unlikely(altp2m_active(d)) )
+    {
+        unsigned int i;
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+            if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
+                _p2m_change_type_range(d->arch.altp2m_p2m[i], start, end, ot,
+                                       nt);
+    }
+#endif
+
+    _p2m_change_type_range(p2m_get_hostp2m(d), start, end, ot, nt);
+}
+
 /*
  * Finish p2m type change for gfns which are marked as need_recalc in a range.
  * Returns: 0/1 for success, negative for failure
@@ -2289,10 +2348,7 @@ int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx)
     altp2m_list_lock(d);
 
     if ( d->arch.altp2m_eptp[idx] == mfn_x(INVALID_MFN) )
-    {
-        p2m_init_altp2m_ept(d, idx);
-        rc = 0;
-    }
+        rc = p2m_init_altp2m_ept(d, idx);
 
     altp2m_list_unlock(d);
     return rc;
@@ -2310,9 +2366,8 @@ int p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
         if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
             continue;
 
-        p2m_init_altp2m_ept(d, i);
+        rc = p2m_init_altp2m_ept(d, i);
         *idx = i;
-        rc = 0;
 
         break;
     }
@@ -2341,6 +2396,7 @@ int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
     {
         p2m_flush_table(d->arch.altp2m_p2m[idx]);
         /* Uninit and reinit ept to force TLB shootdown */
+        p2m_uninit_altp2m_ept(d->arch.altp2m_p2m[idx]);
         ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
         ept_p2m_init(d->arch.altp2m_p2m[idx]);
         d->arch.altp2m_eptp[idx] = mfn_x(INVALID_MFN);
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e5b9602..390c748 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1418,7 +1418,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
      * enabled for this domain */
     if ( unlikely(d->arch.hvm.mem_sharing_enabled ||
                   vm_event_check_ring(d->vm_event_paging) ||
-                  p2m_get_hostp2m(d)->global_logdirty) )
+                  p2m_get_hostp2m(d)->sync.global_logdirty) )
         return -EXDEV;
 
     if ( !pcidevs_trylock() )
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index b110e16..b2a1094 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -598,7 +598,8 @@ void ept_p2m_uninit(struct p2m_domain *p2m);
 void ept_walk_table(struct domain *d, unsigned long gfn);
 bool_t ept_handle_misconfig(uint64_t gpa);
 void setup_ept_dump(void);
-void p2m_init_altp2m_ept(struct domain *d, unsigned int i);
+int p2m_init_altp2m_ept(struct domain *d, unsigned int i);
+void p2m_uninit_altp2m_ept(struct p2m_domain *p2m);
 /* Locate an alternate p2m by its EPTP */
 unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index d08c595..7346eeb 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -219,11 +219,13 @@ struct p2m_domain {
     struct list_head   np2m_list;
 #endif
 
-    /* Host p2m: Log-dirty ranges registered for the domain. */
-    struct rangeset   *logdirty_ranges;
+    struct {
+        /* Host p2m: Log-dirty ranges registered for the domain. */
+        struct rangeset   *logdirty_ranges;
 
-    /* Host p2m: Global log-dirty mode enabled for the domain. */
-    bool_t             global_logdirty;
+        /* Host p2m: Global log-dirty mode enabled for the domain. */
+        bool               global_logdirty;
+    } sync;
 
     /* Host p2m: when this flag is set, don't flush all the nested-p2m
      * tables on every host-p2m change.  The setter of this flag
-- 
2.7.4
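
The per-view initialisation described in the commit message can be sketched with toy stand-ins rather than Xen's types: each new view allocates its own log-dirty set, merges the host's set into it, and copies the global flag. `struct dirty_set`, `dirty_set_merge()` and `view_init()` are hypothetical names, and the sketch assumes the merge helper returns 0 on success and a negative errno value on failure. Under that assumption the early return has to trigger on a non-zero rc; the posted hunk tests `!rc`, so that check is worth comparing against rangeset_merge()'s actual return convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NPAGES 64                         /* toy address space size */

struct dirty_set {                        /* hypothetical stand-in for a rangeset */
    bool page[NPAGES];
};

struct view {                             /* hypothetical stand-in for struct p2m_domain */
    struct dirty_set *logdirty;
    bool global_logdirty;
    bool active;
};

/* Assumed convention: 0 on success, negative errno on failure. */
static int dirty_set_merge(struct dirty_set *dst, const struct dirty_set *src)
{
    size_t i;

    if ( !dst || !src )
        return -EINVAL;

    for ( i = 0; i < NPAGES; i++ )
        if ( src->page[i] )
            dst->page[i] = true;

    return 0;
}

/* Give a new view its own log-dirty set, seeded from the host's state. */
static int view_init(struct view *v, const struct view *host)
{
    int rc;

    v->logdirty = calloc(1, sizeof(*v->logdirty));
    if ( !v->logdirty )
        return -ENOMEM;

    rc = dirty_set_merge(v->logdirty, host->logdirty);
    if ( rc )                             /* non-zero means the merge failed */
    {
        free(v->logdirty);
        v->logdirty = NULL;
        return rc;
    }

    v->global_logdirty = host->global_logdirty;
    v->active = true;                     /* only reached on success */

    return 0;
}
```

The view is marked active only after the merge has succeeded, so a failed initialisation leaves it in the same state as a view that was never set up.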

* Re: [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new view early
From: Andrew Cooper @ 2018-10-18 10:57 UTC
To: Razvan Cojocaru, xen-devel
Cc: george.dunlap, kevin.tian, wei.liu2, jbeulich, jun.nakajima

On 18/10/18 11:07, Razvan Cojocaru wrote:
> When a new altp2m view is created very early on guest boot, the
> display will freeze (although the guest will run normally). This
> may also happen on resizing the display. The reason is the way
> Xen currently (mis)handles logdirty VGA: it intentionally
> misconfigures VGA pages so that they will fault.
>
> The problem is that it only does this in the host p2m. Once we
> switch to a new altp2m, the misconfigured entries will no longer
> fault, so the display will not be updated.
>
> This patch:
> * updates ept_handle_misconfig() to use the active altp2m instead
>   of the hostp2m;
> * allocates new logdirty ranges for each altp2m;
> * has p2m_init_altp2m_ept() copy over max_mapped_pfn,
>   and global_logdirty, and merges the logdirty ranges of the
>   hostp2m into the logdirty range of the altp2m;
> * modifies p2m_change_entry_type_global(), p2m_memory_type_changed
>   and p2m_change_type_range() to propagate their changes to all
>   valid altp2ms.
>
> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Suggested-by: George Dunlap <george.dunlap@citrix.com>

From the looks of this patch, it would be cleaner to split out the patch
for allocating/freeing resources from the patch which implements the
logdirty merging.

On the allocating/freeing side of things specifically, ...

> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index fabcd06..28790bf 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -1434,18 +1437,44 @@ void setup_ept_dump(void)
>      register_keyhandler('D', ept_dump_p2m_table, "dump VT-x EPT tables", 0);
>  }
>
> -void p2m_init_altp2m_ept(struct domain *d, unsigned int i)
> +int p2m_init_altp2m_ept(struct domain *d, unsigned int i)
>  {
>      struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
>      struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
>      struct ept_data *ept;
> +    int rc;
> +
> +    ASSERT(!p2m->sync.logdirty_ranges);
> +    p2m->sync.logdirty_ranges = rangeset_new(d, "log-dirty",
> +                                             RANGESETF_prettyprint_hex);
> +    if ( !p2m->sync.logdirty_ranges )
> +        return -ENOMEM;
> +
> +    rc = rangeset_merge(p2m->sync.logdirty_ranges,
> +                        hostp2m->sync.logdirty_ranges);
> +    if ( !rc )
> +        return rc;
>
>      p2m->ept.ad = hostp2m->ept.ad;
> +    p2m->max_mapped_pfn = hostp2m->max_mapped_pfn;
> +    p2m->default_access = hostp2m->default_access;
> +    p2m->domain = hostp2m->domain;
> +
> +    p2m->sync.global_logdirty = hostp2m->sync.global_logdirty;
>      p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
>      p2m->max_remapped_gfn = 0;
>      ept = &p2m->ept;
>      ept->mfn = pagetable_get_pfn(p2m_get_pagetable(p2m));
>      d->arch.altp2m_eptp[i] = ept->eptp;
> +
> +    return 0;
> +}
> +
> +void p2m_uninit_altp2m_ept(struct p2m_domain *p2m)

For naming consistency, this should be p2m_destroy_altp2m_ept()

[EDIT] It looks like the rest of the code has poor consistency.  /sigh

> +{
> +    ASSERT(p2m->sync.logdirty_ranges);
> +    rangeset_destroy(p2m->sync.logdirty_ranges);
> +    p2m->sync.logdirty_ranges = NULL;

Please make all destroy functions idempotent.  i.e.

if ( p2m->sync.logdirty_ranges )
{
    rangeset_destroy(p2m->sync.logdirty_ranges);
    p2m->sync.logdirty_ranges = NULL;
}

and use this destroy function in the cleanup path of init().

I'm currently in the process of doing this to our entire create/destroy
infrastructure, which is a necessary prerequisite to fix the "vcpu
allocation during domain creation" problem we've got.

The longterm plan is to have errors in the init() path return
immediately with no cleanup, and have the top level make one call into
the destroy() path.  For this to work, the destroy path needs to be able
to correctly clean up from any point of initialisation (including none).

>  }
>
>  unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 42b9ef4..e9f8385 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -119,9 +120,9 @@ static int p2m_init_hostp2m(struct domain *d)
>
>      if ( p2m )
>      {
> -        p2m->logdirty_ranges = rangeset_new(d, "log-dirty",
> -                                            RANGESETF_prettyprint_hex);
> -        if ( p2m->logdirty_ranges )
> +        p2m->sync.logdirty_ranges = rangeset_new(d, "log-dirty",
> +                                                 RANGESETF_prettyprint_hex);

This looks to be common with bits of p2m_init_altp2m_ept().  Why doesn't
that get reused?

> +        if ( p2m->sync.logdirty_ranges )
>          {
>              d->arch.p2m = p2m;
>              return 0;
> @@ -2341,6 +2396,7 @@ int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
>      {
>          p2m_flush_table(d->arch.altp2m_p2m[idx]);
>          /* Uninit and reinit ept to force TLB shootdown */
> +        p2m_uninit_altp2m_ept(d->arch.altp2m_p2m[idx]);

Shouldn't this ideally be called from
ept_p2m_uninit(d->arch.altp2m_p2m[idx]) instead?

~Andrew
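
The init/destroy contract described in the review above can be illustrated with a small standalone model. This is a sketch under stated assumptions, not Xen code: `struct res`, `res_init()`, `res_destroy()` and `res_create()` are hypothetical, and the `fail_at` parameter exists only to simulate an allocation failure at a chosen step.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct res {                     /* hypothetical object owning two allocations */
    int *a;
    int *b;
};

/*
 * Idempotent destroy: safe on a zeroed object, a partially initialised
 * object, and an object that has already been destroyed.
 */
static void res_destroy(struct res *r)
{
    if ( r->a )
    {
        free(r->a);
        r->a = NULL;
    }

    if ( r->b )
    {
        free(r->b);
        r->b = NULL;
    }
}

/*
 * Init returns immediately on error and performs no cleanup of its own.
 * fail_at (1 or 2) simulates a failure of the first or second allocation.
 */
static int res_init(struct res *r, int fail_at)
{
    r->a = (fail_at == 1) ? NULL : malloc(sizeof(*r->a));
    if ( !r->a )
        return -ENOMEM;

    r->b = (fail_at == 2) ? NULL : malloc(sizeof(*r->b));
    if ( !r->b )
        return -ENOMEM;

    return 0;
}

/* Top level: a single call into the destroy path covers every failure point. */
static int res_create(struct res *r, int fail_at)
{
    int rc = res_init(r, fail_at);

    if ( rc )
        res_destroy(r);

    return rc;
}
```

Because the destroy function checks each pointer before releasing it, the caller does not need to know how far initialisation got, and a teardown loop can call it on slots that were never initialised.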

* Re: [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new view early
From: Razvan Cojocaru @ 2018-10-19 8:30 UTC
To: Andrew Cooper, xen-devel
Cc: george.dunlap, kevin.tian, wei.liu2, jbeulich, jun.nakajima

On 10/18/18 1:57 PM, Andrew Cooper wrote:
> On 18/10/18 11:07, Razvan Cojocaru wrote:
>> When a new altp2m view is created very early on guest boot, the
>> display will freeze (although the guest will run normally). This
>> may also happen on resizing the display. The reason is the way
>> Xen currently (mis)handles logdirty VGA: it intentionally
>> misconfigures VGA pages so that they will fault.
>>
>> The problem is that it only does this in the host p2m. Once we
>> switch to a new altp2m, the misconfigured entries will no longer
>> fault, so the display will not be updated.
>>
>> This patch:
>> * updates ept_handle_misconfig() to use the active altp2m instead
>>   of the hostp2m;
>> * allocates new logdirty ranges for each altp2m;
>> * has p2m_init_altp2m_ept() copy over max_mapped_pfn,
>>   and global_logdirty, and merges the logdirty ranges of the
>>   hostp2m into the logdirty range of the altp2m;
>> * modifies p2m_change_entry_type_global(), p2m_memory_type_changed
>>   and p2m_change_type_range() to propagate their changes to all
>>   valid altp2ms.
>>
>> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> Suggested-by: George Dunlap <george.dunlap@citrix.com>
>
> From the looks of this patch, it would be cleaner to split out the patch
> for allocating/freeing resources from the patch which implements the
> logdirty merging.
>
> On the allocating/freeing side of things specifically, ...

Thanks for the review!

I was poking the patch around to see how I might split it as indicated,
and there's a small problem: I've moved p2m->logdirty_ranges and
p2m->global_logdirty under p2m->sync.logdirty_ranges and
p2m->sync.global_logdirty respectively. This was recommended by George,
but as it has turned out while coding, it doesn't make a huge difference
(beyond being useful to really identify all places where those two were
used, because after the move, code that wasn't updated stopped compiling).

But the point is that if I move them under "sync" and update all the
code that uses them, the allocate / free part of the patch grows to
quite a lot of the original patch.

Should I drop the move under "sync"?

>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
>> index fabcd06..28790bf 100644
>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -1434,18 +1437,44 @@ void setup_ept_dump(void)
>>      register_keyhandler('D', ept_dump_p2m_table, "dump VT-x EPT tables", 0);
>>  }
>>
>> -void p2m_init_altp2m_ept(struct domain *d, unsigned int i)
>> +int p2m_init_altp2m_ept(struct domain *d, unsigned int i)
>>  {
>>      struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
>>      struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
>>      struct ept_data *ept;
>> +    int rc;
>> +
>> +    ASSERT(!p2m->sync.logdirty_ranges);
>> +    p2m->sync.logdirty_ranges = rangeset_new(d, "log-dirty",
>> +                                             RANGESETF_prettyprint_hex);
>> +    if ( !p2m->sync.logdirty_ranges )
>> +        return -ENOMEM;
>> +
>> +    rc = rangeset_merge(p2m->sync.logdirty_ranges,
>> +                        hostp2m->sync.logdirty_ranges);
>> +    if ( !rc )
>> +        return rc;
>>
>>      p2m->ept.ad = hostp2m->ept.ad;
>> +    p2m->max_mapped_pfn = hostp2m->max_mapped_pfn;
>> +    p2m->default_access = hostp2m->default_access;
>> +    p2m->domain = hostp2m->domain;
>> +
>> +    p2m->sync.global_logdirty = hostp2m->sync.global_logdirty;
>>      p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
>>      p2m->max_remapped_gfn = 0;
>>      ept = &p2m->ept;
>>      ept->mfn = pagetable_get_pfn(p2m_get_pagetable(p2m));
>>      d->arch.altp2m_eptp[i] = ept->eptp;
>> +
>> +    return 0;
>> +}
>> +
>> +void p2m_uninit_altp2m_ept(struct p2m_domain *p2m)
>
> For naming consistency, this should be p2m_destroy_altp2m_ept()
>
> [EDIT] It looks like the rest of the code has poor consistency.  /sigh

I'll do my best to rename touched functions for consistency in the alloc
/ free patch as well then.

>> +{
>> +    ASSERT(p2m->sync.logdirty_ranges);
>> +    rangeset_destroy(p2m->sync.logdirty_ranges);
>> +    p2m->sync.logdirty_ranges = NULL;
>
> Please make all destroy functions idempotent.  i.e.
>
> if ( p2m->sync.logdirty_ranges )
> {
>     rangeset_destroy(p2m->sync.logdirty_ranges);
>     p2m->sync.logdirty_ranges = NULL;
> }
>
> and use this destroy function in the cleanup path of init().

I'll do that.

>>  unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
>> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
>> index 42b9ef4..e9f8385 100644
>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -119,9 +120,9 @@ static int p2m_init_hostp2m(struct domain *d)
>>
>>      if ( p2m )
>>      {
>> -        p2m->logdirty_ranges = rangeset_new(d, "log-dirty",
>> -                                            RANGESETF_prettyprint_hex);
>> -        if ( p2m->logdirty_ranges )
>> +        p2m->sync.logdirty_ranges = rangeset_new(d, "log-dirty",
>> +                                                 RANGESETF_prettyprint_hex);
>
> This looks to be common with bits of p2m_init_altp2m_ept().  Why doesn't
> that get reused?

I'll see about that as well.

>> +        if ( p2m->sync.logdirty_ranges )
>>          {
>>              d->arch.p2m = p2m;
>>              return 0;
>> @@ -2341,6 +2396,7 @@ int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
>>      {
>>          p2m_flush_table(d->arch.altp2m_p2m[idx]);
>>          /* Uninit and reinit ept to force TLB shootdown */
>> +        p2m_uninit_altp2m_ept(d->arch.altp2m_p2m[idx]);
>
> Shouldn't this ideally be called from
> ept_p2m_uninit(d->arch.altp2m_p2m[idx]) instead?

I'll check and update accordingly.

Thanks,
Razvan

* Re: Fix VGA logdirty related display freezes with altp2m
From: Tamas K Lengyel @ 2018-10-18 20:08 UTC
To: Razvan Cojocaru
Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel

On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
>
> Hello,
>
> This series aims to prevent the display from freezing when
> enabling altp2m and switching to a new view (and assorted problems
> when resizing the display).
>
> The first patch propagates ept.ad changes to all active altp2ms,
> and the second one allocates a new logdirty rangeset for each
> new altp2m, and propagates (under lock) changes to all p2ms.
>
> The first patch is the same as:
> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms
> but as it is now required for the second one to apply cleanly, it
> has been resent as part of this series.
>
> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms
> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new

Hi Razvan,
I would be happy to give this a spin, can you push it as a git branch
somewhere?

Thanks,
Tamas

* Re: Fix VGA logdirty related display freezes with altp2m
From: Razvan Cojocaru @ 2018-10-18 21:12 UTC
To: Tamas K Lengyel
Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel

On 10/18/18 11:08 PM, Tamas K Lengyel wrote:
> On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru
> <rcojocaru@bitdefender.com> wrote:
>>
>> Hello,
>>
>> This series aims to prevent the display from freezing when
>> enabling altp2m and switching to a new view (and assorted problems
>> when resizing the display).
>>
>> The first patch propagates ept.ad changes to all active altp2ms,
>> and the second one allocates a new logdirty rangeset for each
>> new altp2m, and propagates (under lock) changes to all p2ms.
>>
>> The first patch is the same as:
>> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms
>> but as it is now required for the second one to apply cleanly, it
>> has been resent as part of this series.
>>
>> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms
>> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new
>
> Hi Razvan,
> I would be happy to give this a spin, can you push it as a git branch
> somewhere?

Sure, here you go:

https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take1

Thanks,
Razvan
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-18 21:12 ` Razvan Cojocaru @ 2018-10-22 20:48 ` Tamas K Lengyel 2018-10-22 21:17 ` Razvan Cojocaru 0 siblings, 1 reply; 25+ messages in thread From: Tamas K Lengyel @ 2018-10-22 20:48 UTC (permalink / raw) To: Razvan Cojocaru Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On Thu, Oct 18, 2018 at 3:12 PM Razvan Cojocaru <rcojocaru@bitdefender.com> wrote: > > On 10/18/18 11:08 PM, Tamas K Lengyel wrote: > > On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru > > <rcojocaru@bitdefender.com> wrote: > >> > >> Hello, > >> > >> This series aims to prevent the display from freezing when > >> enabling altp2m and switching to a new view (and assorted problems > >> when resizing the display). > >> > >> The first patch propagates ept.ad changes to all active altp2ms, > >> and the second one allocates a new logdirty rangeset for each > >> new altp2m, and propagates (under lock) changes to all p2ms. > >> > >> The first patch is the same as: > >> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms > >> but as it is now required for the second one to apply cleanly, it > >> has been resent as part of this series. > >> > >> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms > >> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new > > > > Hi Razvan, > > I would be happy to give this a spin, can you push it as a git branch somewhere? 
> > Sure, here you go: > > https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take1 I ran into this crash when my config incorrectly pointed to a non-valid disk location: (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 4 (XEN) RIP: e008:[<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor (XEN) rax: ffff83046d27802c rbx: ffff8304558dd880 rcx: 0000000000000000 (XEN) rdx: ffff83046d277fff rsi: 00000000004680c0 rdi: 0000000000000000 (XEN) rbp: ffff83046d277d60 rsp: ffff83046d277d50 r8: ffff82d0809304a0 (XEN) r9: 0000000000455940 r10: ffff82e008d01000 r11: 0000000000000017 (XEN) r12: ffff8304558dd880 r13: ffff8304558df830 r14: ffff8304558df000 (XEN) r15: fffffffffffffff8 cr0: 000000008005003b cr4: 00000000003526e0 (XEN) cr3: 000000005da16000 cr2: ffff880456cd6e80 (XEN) fsb: 0000000000000000 gsb: ffff880467f40000 gss: 0000000000000000 (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) Xen code around <ffff82d08033f40c> (p2m_uninit_altp2m_ept+0x29/0x2b): (XEN) 00 48 83 c4 08 5b 5d c3 <0f> 0b 55 48 89 e5 41 56 41 55 41 54 53 48 8d 05 (XEN) Xen call trace: (XEN) [<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b (XEN) [<ffff82d0803305ab>] p2m.c#p2m_teardown_altp2m+0x36/0x52 (XEN) [<ffff82d0803331b5>] p2m_final_teardown+0x11/0x28 (XEN) [<ffff82d08034509c>] paging_final_teardown+0x2e/0x3c (XEN) [<ffff82d080276439>] arch_domain_destroy+0x50/0xa1 (XEN) [<ffff82d08020595c>] domain.c#complete_domain_destroy+0x86/0x159 (XEN) [<ffff82d080228f4f>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cf (XEN) [<ffff82d08023ae6b>] softirq.c#__do_softirq+0x71/0x9a (XEN) [<ffff82d08023aede>] do_softirq+0x13/0x15 (XEN) [<ffff82d080275068>] domain.c#idle_loop+0x63/0xb9 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 4: (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 (XEN) 
**************************************** With the config fixed it boots but when I run DRAKVUF on the domain I get the following crash: (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e008:[<000000007bdb630c>] 000000007bdb630c (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v5) (XEN) rax: 00000000ee138470 rbx: 0000000000000000 rcx: 000000008000b098 (XEN) rdx: 0000000000000cf8 rsi: 0000000000000000 rdi: 000000046d2ef000 (XEN) rbp: 0000000000000000 rsp: ffff83005da27a10 r8: 0000000000000cf8 (XEN) r9: 0000000000000cf8 r10: ffff83005da27ab8 r11: ffff83005da27a08 (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000065 (XEN) r15: 00000000000005a7 cr0: 0000000080050033 cr4: 0000000000372660 (XEN) cr3: 000000046d2ef000 cr2: 00000000ee138470 (XEN) fsb: 00007fe46d97bbc0 gsb: ffff880467f40000 gss: 0000000000000000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) Xen code around <000000007bdb630c> (000000007bdb630c): (XEN) 80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48 (XEN) Xen stack trace from rsp=ffff83005da27a10:(XEN) Xen stack trace from rsp=ffff83005da27a10: (XEN) 0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc (XEN) 00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90 (XEN) 0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0 (XEN) 000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8 (XEN) ffff83005da27b08 ffff82d0802a3511 ffff82d08046b028 ffff83005da27b08 (XEN) ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0 (XEN) 0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028 (XEN) ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78 (XEN) ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88 (XEN) ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8 (XEN) ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8 (XEN) 
ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc (XEN) ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969 (XEN) 0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000 (XEN) 0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7 (XEN) ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38 (XEN) ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000 (XEN) 0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000 (XEN) 0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000 (XEN) 0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282 (XEN) Xen call trace: (XEN) [<000000007bdb630c>] 000000007bdb630c (XEN) (XEN) Pagetable walk from 00000000ee138470: (XEN) L4[0x000] = 000000046d2ee063 ffffffffffffffff (XEN) L3[0x003] = 000000005da11063 ffffffffffffffff (XEN) L2[0x170] = 0000000000000000 ffffffffffffffff (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) FATAL PAGE FAULT (XEN) [error_code=0002] (XEN) Faulting linear address: 00000000ee138470 (XEN) **************************************** (XEN) (XEN) Reboot in five seconds... Tamas _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 25+ messages in thread
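For readers following the second trace: the pagetable walk and the error code can be cross-checked by hand. The sketch below assumes standard x86-64 4-level paging (nine index bits per level above a 12-bit page offset) and the architectural page-fault error-code bits. The struct, function, and macro names are made up for this example and are not Xen's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; these are not Xen's own definitions. */
struct pt_indices {
    unsigned int l4, l3, l2, l1;
};

/* Split a linear address into its four 9-bit page-table indices. */
static struct pt_indices split_linear(uint64_t va)
{
    struct pt_indices idx = {
        .l4 = (unsigned int)((va >> 39) & 0x1ff),
        .l3 = (unsigned int)((va >> 30) & 0x1ff),
        .l2 = (unsigned int)((va >> 21) & 0x1ff),
        .l1 = (unsigned int)((va >> 12) & 0x1ff),
    };

    return idx;
}

/* Architectural x86 page-fault error-code bits. */
#define EC_PRESENT    (1u << 0) /* set: protection violation; clear: not present */
#define EC_WRITE      (1u << 1) /* set: the access was a write */
#define EC_USER       (1u << 2) /* set: the access came from user mode */
#define EC_INSN_FETCH (1u << 4) /* set: the access was an instruction fetch */

/* True when the fault was a write that hit a not-present mapping. */
static bool is_write_to_not_present(uint32_t error_code)
{
    return (error_code & EC_WRITE) && !(error_code & EC_PRESENT);
}
```

Fed the faulting address 00000000ee138470, split_linear() gives L4 index 0x000, L3 index 0x003 and L2 index 0x170, which matches the walk printed in the log. error_code 0002 has only the write bit set, so it decodes as a supervisor-mode write to a not-present page.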
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-22 20:48 ` Tamas K Lengyel @ 2018-10-22 21:17 ` Razvan Cojocaru 2018-10-22 21:22 ` Andrew Cooper 0 siblings, 1 reply; 25+ messages in thread From: Razvan Cojocaru @ 2018-10-22 21:17 UTC (permalink / raw) To: Tamas K Lengyel Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On 10/22/18 11:48 PM, Tamas K Lengyel wrote: > On Thu, Oct 18, 2018 at 3:12 PM Razvan Cojocaru > <rcojocaru@bitdefender.com> wrote: >> >> On 10/18/18 11:08 PM, Tamas K Lengyel wrote: >>> On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru >>> <rcojocaru@bitdefender.com> wrote: >>>> >>>> Hello, >>>> >>>> This series aims to prevent the display from freezing when >>>> enabling altp2m and switching to a new view (and assorted problems >>>> when resizing the display). >>>> >>>> The first patch propagates ept.ad changes to all active altp2ms, >>>> and the second one allocates a new logdirty rangeset for each >>>> new altp2m, and propagates (under lock) changes to all p2ms. >>>> >>>> The first patch is the same as: >>>> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms >>>> but as it is now required for the second one to apply cleanly, it >>>> has been resent as part of this series. >>>> >>>> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms >>>> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new >>> >>> Hi Razvan, >>> I would be happy to give this a spin, can you push it as a git branch somewhere? 
>> >> Sure, here you go: >> >> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take1 > > I ran into this crash when my config incorrectly pointed to a > non-valid disk location: > > (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 > (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- > (XEN) CPU: 4 > (XEN) RIP: e008:[<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b > (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor > (XEN) rax: ffff83046d27802c rbx: ffff8304558dd880 rcx: 0000000000000000 > (XEN) rdx: ffff83046d277fff rsi: 00000000004680c0 rdi: 0000000000000000 > (XEN) rbp: ffff83046d277d60 rsp: ffff83046d277d50 r8: ffff82d0809304a0 > (XEN) r9: 0000000000455940 r10: ffff82e008d01000 r11: 0000000000000017 > (XEN) r12: ffff8304558dd880 r13: ffff8304558df830 r14: ffff8304558df000 > (XEN) r15: fffffffffffffff8 cr0: 000000008005003b cr4: 00000000003526e0 > (XEN) cr3: 000000005da16000 cr2: ffff880456cd6e80 > (XEN) fsb: 0000000000000000 gsb: ffff880467f40000 gss: 0000000000000000 > (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 > (XEN) Xen code around <ffff82d08033f40c> (p2m_uninit_altp2m_ept+0x29/0x2b): > (XEN) 00 48 83 c4 08 5b 5d c3 <0f> 0b 55 48 89 e5 41 56 41 55 41 54 53 48 8d 05 > (XEN) Xen call trace: > (XEN) [<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b > (XEN) [<ffff82d0803305ab>] p2m.c#p2m_teardown_altp2m+0x36/0x52 > (XEN) [<ffff82d0803331b5>] p2m_final_teardown+0x11/0x28 > (XEN) [<ffff82d08034509c>] paging_final_teardown+0x2e/0x3c > (XEN) [<ffff82d080276439>] arch_domain_destroy+0x50/0xa1 > (XEN) [<ffff82d08020595c>] domain.c#complete_domain_destroy+0x86/0x159 > (XEN) [<ffff82d080228f4f>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cf > (XEN) [<ffff82d08023ae6b>] softirq.c#__do_softirq+0x71/0x9a > (XEN) [<ffff82d08023aede>] do_softirq+0x13/0x15 > (XEN) [<ffff82d080275068>] domain.c#idle_loop+0x63/0xb9 > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 4: > (XEN) 
Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 > (XEN) **************************************** Right, that one I've also come across now, that will be fixed in the next series as a result of doing what Andrew has suggested, which is to say: "Please make all destroy functions idempotent. i.e. if ( p2m->sync.logdirty_ranges ) { rangeset_destroy(p2m->sync.logdirty_ranges); p2m->sync.logdirty_ranges = NULL; } and use this destroy function in the cleanup path of init()." > With the config fixed it boots but when I run DRAKVUF on the domain I > get the following crash: > > (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- > (XEN) CPU: 0 > (XEN) RIP: e008:[<000000007bdb630c>] 000000007bdb630c > (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v5) > (XEN) rax: 00000000ee138470 rbx: 0000000000000000 rcx: 000000008000b098 > (XEN) rdx: 0000000000000cf8 rsi: 0000000000000000 rdi: 000000046d2ef000 > (XEN) rbp: 0000000000000000 rsp: ffff83005da27a10 r8: 0000000000000cf8 > (XEN) r9: 0000000000000cf8 r10: ffff83005da27ab8 r11: ffff83005da27a08 > (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000065 > (XEN) r15: 00000000000005a7 cr0: 0000000080050033 cr4: 0000000000372660 > (XEN) cr3: 000000046d2ef000 cr2: 00000000ee138470 > (XEN) fsb: 00007fe46d97bbc0 gsb: ffff880467f40000 gss: 0000000000000000 > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 > (XEN) Xen code around <000000007bdb630c> (000000007bdb630c): > (XEN) 80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48 > (XEN) Xen stack trace from rsp=ffff83005da27a10:(XEN) Xen stack trace > from rsp=ffff83005da27a10: > (XEN) 0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc > (XEN) 00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90 > (XEN) 0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0 > (XEN) 000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8 > (XEN) ffff83005da27b08 
ffff82d0802a3511 ffff82d08046b028 ffff83005da27b08 > (XEN) ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0 > (XEN) 0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028 > (XEN) ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78 > (XEN) ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88 > (XEN) ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8 > (XEN) ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8 > (XEN) ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc > (XEN) ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969 > (XEN) 0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000 > (XEN) 0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7 > (XEN) ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38 > (XEN) ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000 > (XEN) 0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000 > (XEN) 0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000 > (XEN) 0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282 > (XEN) Xen call trace: > (XEN) [<000000007bdb630c>] 000000007bdb630c > (XEN) > (XEN) Pagetable walk from 00000000ee138470: > (XEN) L4[0x000] = 000000046d2ee063 ffffffffffffffff > (XEN) L3[0x003] = 000000005da11063 ffffffffffffffff > (XEN) L2[0x170] = 0000000000000000 ffffffffffffffff > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) FATAL PAGE FAULT > (XEN) [error_code=0002] > (XEN) Faulting linear address: 00000000ee138470 > (XEN) **************************************** > (XEN) > (XEN) Reboot in five seconds... This one I'm not sure about. What does your introspection agent do at that point? 
Thanks, Razvan
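The idempotent-destroy pattern quoted above can be sketched in isolation. This is a minimal stand-alone illustration, not the actual Xen patch: struct rangeset, mock_rangeset_new(), mock_rangeset_destroy(), p2m_free_logdirty(), p2m_init_logdirty() and the fail_later parameter are all hypothetical stand-ins invented for the example. Only the shape of the if-block follows the snippet from the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Xen's types; not the real definitions. */
struct rangeset {
    int unused;
};

struct p2m_domain {
    struct {
        struct rangeset *logdirty_ranges;
    } sync;
};

/* Number of mock rangesets currently allocated (for checking leaks). */
static unsigned int live_rangesets;

static struct rangeset *mock_rangeset_new(void)
{
    struct rangeset *r = calloc(1, sizeof(*r));

    if ( r )
        live_rangesets++;

    return r;
}

static void mock_rangeset_destroy(struct rangeset *r)
{
    assert(r != NULL);
    live_rangesets--;
    free(r);
}

/*
 * Idempotent destroy: safe on a p2m that was never initialised, and safe
 * to call twice, because the pointer is cleared after the first free.
 */
static void p2m_free_logdirty(struct p2m_domain *p2m)
{
    if ( p2m->sync.logdirty_ranges )
    {
        mock_rangeset_destroy(p2m->sync.logdirty_ranges);
        p2m->sync.logdirty_ranges = NULL;
    }
}

/*
 * Init reuses the destroy function on its error path. fail_later simulates
 * a failure in some later (hypothetical) initialisation step.
 */
static int p2m_init_logdirty(struct p2m_domain *p2m, int fail_later)
{
    p2m->sync.logdirty_ranges = mock_rangeset_new();

    if ( !p2m->sync.logdirty_ranges || fail_later )
    {
        p2m_free_logdirty(p2m);
        return -1;
    }

    return 0;
}
```

With this shape, a teardown that runs after a failed or skipped init finds a NULL pointer and does nothing, instead of tripping an assertion the way the first trace does.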
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-22 21:17 ` Razvan Cojocaru @ 2018-10-22 21:22 ` Andrew Cooper 2018-10-22 21:28 ` Tamas K Lengyel 0 siblings, 1 reply; 25+ messages in thread From: Andrew Cooper @ 2018-10-22 21:22 UTC (permalink / raw) To: Razvan Cojocaru, Tamas K Lengyel Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Jun Nakajima, Xen-devel On 22/10/2018 22:17, Razvan Cojocaru wrote: > On 10/22/18 11:48 PM, Tamas K Lengyel wrote: >> On Thu, Oct 18, 2018 at 3:12 PM Razvan Cojocaru >> <rcojocaru@bitdefender.com> wrote: >>> On 10/18/18 11:08 PM, Tamas K Lengyel wrote: >>>> On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru >>>> <rcojocaru@bitdefender.com> wrote: >>>>> Hello, >>>>> >>>>> This series aims to prevent the display from freezing when >>>>> enabling altp2m and switching to a new view (and assorted problems >>>>> when resizing the display). >>>>> >>>>> The first patch propagates ept.ad changes to all active altp2ms, >>>>> and the second one allocates a new logdirty rangeset for each >>>>> new altp2m, and propagates (under lock) changes to all p2ms. >>>>> >>>>> The first patch is the same as: >>>>> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms >>>>> but as it is now required for the second one to apply cleanly, it >>>>> has been resent as part of this series. >>>>> >>>>> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms >>>>> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new >>>> Hi Razvan, >>>> I would be happy to give this a spin, can you push it as a git branch somewhere? 
>>> Sure, here you go: >>> >>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take1 >> I ran into this crash when my config incorrectly pointed to a >> non-valid disk location: >> >> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 >> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- >> (XEN) CPU: 4 >> (XEN) RIP: e008:[<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b >> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor >> (XEN) rax: ffff83046d27802c rbx: ffff8304558dd880 rcx: 0000000000000000 >> (XEN) rdx: ffff83046d277fff rsi: 00000000004680c0 rdi: 0000000000000000 >> (XEN) rbp: ffff83046d277d60 rsp: ffff83046d277d50 r8: ffff82d0809304a0 >> (XEN) r9: 0000000000455940 r10: ffff82e008d01000 r11: 0000000000000017 >> (XEN) r12: ffff8304558dd880 r13: ffff8304558df830 r14: ffff8304558df000 >> (XEN) r15: fffffffffffffff8 cr0: 000000008005003b cr4: 00000000003526e0 >> (XEN) cr3: 000000005da16000 cr2: ffff880456cd6e80 >> (XEN) fsb: 0000000000000000 gsb: ffff880467f40000 gss: 0000000000000000 >> (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 >> (XEN) Xen code around <ffff82d08033f40c> (p2m_uninit_altp2m_ept+0x29/0x2b): >> (XEN) 00 48 83 c4 08 5b 5d c3 <0f> 0b 55 48 89 e5 41 56 41 55 41 54 53 48 8d 05 >> (XEN) Xen call trace: >> (XEN) [<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b >> (XEN) [<ffff82d0803305ab>] p2m.c#p2m_teardown_altp2m+0x36/0x52 >> (XEN) [<ffff82d0803331b5>] p2m_final_teardown+0x11/0x28 >> (XEN) [<ffff82d08034509c>] paging_final_teardown+0x2e/0x3c >> (XEN) [<ffff82d080276439>] arch_domain_destroy+0x50/0xa1 >> (XEN) [<ffff82d08020595c>] domain.c#complete_domain_destroy+0x86/0x159 >> (XEN) [<ffff82d080228f4f>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cf >> (XEN) [<ffff82d08023ae6b>] softirq.c#__do_softirq+0x71/0x9a >> (XEN) [<ffff82d08023aede>] do_softirq+0x13/0x15 >> (XEN) [<ffff82d080275068>] domain.c#idle_loop+0x63/0xb9 >> (XEN) >> (XEN) >> (XEN) **************************************** >> 
(XEN) Panic on CPU 4: >> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 >> (XEN) **************************************** > Right, that one I've also come across now, that will be fixed in the > next series as a result of doing what Andrew has suggested, which is to say: > > "Please make all destroy functions idempotent. i.e. > > if ( p2m->sync.logdirty_ranges ) > { > rangeset_destroy(p2m->sync.logdirty_ranges); > p2m->sync.logdirty_ranges = NULL; > } > > and use this destroy function in the cleanup path of init()." Indeed. > >> With the config fixed it boots but when I run DRAKVUF on the domain I >> get the following crash: >> >> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- >> (XEN) CPU: 0 >> (XEN) RIP: e008:[<000000007bdb630c>] 000000007bdb630c >> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v5) >> (XEN) rax: 00000000ee138470 rbx: 0000000000000000 rcx: 000000008000b098 >> (XEN) rdx: 0000000000000cf8 rsi: 0000000000000000 rdi: 000000046d2ef000 >> (XEN) rbp: 0000000000000000 rsp: ffff83005da27a10 r8: 0000000000000cf8 >> (XEN) r9: 0000000000000cf8 r10: ffff83005da27ab8 r11: ffff83005da27a08 >> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000065 >> (XEN) r15: 00000000000005a7 cr0: 0000000080050033 cr4: 0000000000372660 >> (XEN) cr3: 000000046d2ef000 cr2: 00000000ee138470 >> (XEN) fsb: 00007fe46d97bbc0 gsb: ffff880467f40000 gss: 0000000000000000 >> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 >> (XEN) Xen code around <000000007bdb630c> (000000007bdb630c): >> (XEN) 80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48 >> (XEN) Xen stack trace from rsp=ffff83005da27a10:(XEN) Xen stack trace >> from rsp=ffff83005da27a10: >> (XEN) 0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc >> (XEN) 00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90 >> (XEN) 0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0 >> (XEN) 
000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8 >> (XEN) ffff83005da27b08 ffff82d0802a3511 ffff82d08046b028 ffff83005da27b08 >> (XEN) ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0 >> (XEN) 0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028 >> (XEN) ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78 >> (XEN) ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88 >> (XEN) ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8 >> (XEN) ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8 >> (XEN) ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc >> (XEN) ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969 >> (XEN) 0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000 >> (XEN) 0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7 >> (XEN) ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38 >> (XEN) ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000 >> (XEN) 0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000 >> (XEN) 0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000 >> (XEN) 0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282 >> (XEN) Xen call trace: >> (XEN) [<000000007bdb630c>] 000000007bdb630c >> (XEN) >> (XEN) Pagetable walk from 00000000ee138470: >> (XEN) L4[0x000] = 000000046d2ee063 ffffffffffffffff >> (XEN) L3[0x003] = 000000005da11063 ffffffffffffffff >> (XEN) L2[0x170] = 0000000000000000 ffffffffffffffff >> (XEN) >> (XEN) **************************************** >> (XEN) Panic on CPU 0: >> (XEN) FATAL PAGE FAULT >> (XEN) [error_code=0002] >> (XEN) Faulting linear address: 00000000ee138470 >> (XEN) **************************************** >> (XEN) >> (XEN) Reboot in five seconds... > This one I'm not sure about. What does your introspection agent do at > that point? This crash is bizarre. 
Xen has most likely followed a corrupt function pointer, because none of Xen's .text section lives just below the 2G boundary.

~Andrew
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-22 21:22 ` Andrew Cooper @ 2018-10-22 21:28 ` Tamas K Lengyel 2018-10-22 22:15 ` Razvan Cojocaru 0 siblings, 1 reply; 25+ messages in thread From: Tamas K Lengyel @ 2018-10-22 21:28 UTC (permalink / raw) To: Andrew Cooper Cc: Kevin Tian, Wei Liu, Jun Nakajima, Razvan Cojocaru, George Dunlap, Jan Beulich, Xen-devel On Mon, Oct 22, 2018 at 3:22 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote: > > On 22/10/2018 22:17, Razvan Cojocaru wrote: > > On 10/22/18 11:48 PM, Tamas K Lengyel wrote: > >> On Thu, Oct 18, 2018 at 3:12 PM Razvan Cojocaru > >> <rcojocaru@bitdefender.com> wrote: > >>> On 10/18/18 11:08 PM, Tamas K Lengyel wrote: > >>>> On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru > >>>> <rcojocaru@bitdefender.com> wrote: > >>>>> Hello, > >>>>> > >>>>> This series aims to prevent the display from freezing when > >>>>> enabling altp2m and switching to a new view (and assorted problems > >>>>> when resizing the display). > >>>>> > >>>>> The first patch propagates ept.ad changes to all active altp2ms, > >>>>> and the second one allocates a new logdirty rangeset for each > >>>>> new altp2m, and propagates (under lock) changes to all p2ms. > >>>>> > >>>>> The first patch is the same as: > >>>>> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms > >>>>> but as it is now required for the second one to apply cleanly, it > >>>>> has been resent as part of this series. > >>>>> > >>>>> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms > >>>>> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new > >>>> Hi Razvan, > >>>> I would be happy to give this a spin, can you push it as a git branch somewhere? 
> >>> Sure, here you go: > >>> > >>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take1 > >> I ran into this crash when my config incorrectly pointed to a > >> non-valid disk location: > >> > >> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 > >> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- > >> (XEN) CPU: 4 > >> (XEN) RIP: e008:[<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b > >> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor > >> (XEN) rax: ffff83046d27802c rbx: ffff8304558dd880 rcx: 0000000000000000 > >> (XEN) rdx: ffff83046d277fff rsi: 00000000004680c0 rdi: 0000000000000000 > >> (XEN) rbp: ffff83046d277d60 rsp: ffff83046d277d50 r8: ffff82d0809304a0 > >> (XEN) r9: 0000000000455940 r10: ffff82e008d01000 r11: 0000000000000017 > >> (XEN) r12: ffff8304558dd880 r13: ffff8304558df830 r14: ffff8304558df000 > >> (XEN) r15: fffffffffffffff8 cr0: 000000008005003b cr4: 00000000003526e0 > >> (XEN) cr3: 000000005da16000 cr2: ffff880456cd6e80 > >> (XEN) fsb: 0000000000000000 gsb: ffff880467f40000 gss: 0000000000000000 > >> (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 > >> (XEN) Xen code around <ffff82d08033f40c> (p2m_uninit_altp2m_ept+0x29/0x2b): > >> (XEN) 00 48 83 c4 08 5b 5d c3 <0f> 0b 55 48 89 e5 41 56 41 55 41 54 53 48 8d 05 > >> (XEN) Xen call trace: > >> (XEN) [<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b > >> (XEN) [<ffff82d0803305ab>] p2m.c#p2m_teardown_altp2m+0x36/0x52 > >> (XEN) [<ffff82d0803331b5>] p2m_final_teardown+0x11/0x28 > >> (XEN) [<ffff82d08034509c>] paging_final_teardown+0x2e/0x3c > >> (XEN) [<ffff82d080276439>] arch_domain_destroy+0x50/0xa1 > >> (XEN) [<ffff82d08020595c>] domain.c#complete_domain_destroy+0x86/0x159 > >> (XEN) [<ffff82d080228f4f>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cf > >> (XEN) [<ffff82d08023ae6b>] softirq.c#__do_softirq+0x71/0x9a > >> (XEN) [<ffff82d08023aede>] do_softirq+0x13/0x15 > >> (XEN) [<ffff82d080275068>] domain.c#idle_loop+0x63/0xb9 > >> 
(XEN) > >> (XEN) > >> (XEN) **************************************** > >> (XEN) Panic on CPU 4: > >> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475 > >> (XEN) **************************************** > > Right, that one I've also come across now, that will be fixed in the > > next series as a result of doing what Andrew has suggested, which is to say: > > > > "Please make all destroy functions idempotent. i.e. > > > > if ( p2m->sync.logdirty_ranges ) > > { > > rangeset_destroy(p2m->sync.logdirty_ranges); > > p2m->sync.logdirty_ranges = NULL; > > } > > > > and use this destroy function in the cleanup path of init()." > > Indeed. > > > > >> With the config fixed it boots but when I run DRAKVUF on the domain I > >> get the following crash: > >> > >> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- > >> (XEN) CPU: 0 > >> (XEN) RIP: e008:[<000000007bdb630c>] 000000007bdb630c > >> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v5) > >> (XEN) rax: 00000000ee138470 rbx: 0000000000000000 rcx: 000000008000b098 > >> (XEN) rdx: 0000000000000cf8 rsi: 0000000000000000 rdi: 000000046d2ef000 > >> (XEN) rbp: 0000000000000000 rsp: ffff83005da27a10 r8: 0000000000000cf8 > >> (XEN) r9: 0000000000000cf8 r10: ffff83005da27ab8 r11: ffff83005da27a08 > >> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000065 > >> (XEN) r15: 00000000000005a7 cr0: 0000000080050033 cr4: 0000000000372660 > >> (XEN) cr3: 000000046d2ef000 cr2: 00000000ee138470 > >> (XEN) fsb: 00007fe46d97bbc0 gsb: ffff880467f40000 gss: 0000000000000000 > >> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 > >> (XEN) Xen code around <000000007bdb630c> (000000007bdb630c): > >> (XEN) 80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48 > >> (XEN) Xen stack trace from rsp=ffff83005da27a10:(XEN) Xen stack trace > >> from rsp=ffff83005da27a10: > >> (XEN) 0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc > >> (XEN) 
00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90 > >> (XEN) 0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0 > >> (XEN) 000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8 > >> (XEN) ffff83005da27b08 ffff82d0802a3511 ffff82d08046b028 ffff83005da27b08 > >> (XEN) ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0 > >> (XEN) 0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028 > >> (XEN) ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78 > >> (XEN) ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88 > >> (XEN) ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8 > >> (XEN) ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8 > >> (XEN) ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc > >> (XEN) ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969 > >> (XEN) 0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000 > >> (XEN) 0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7 > >> (XEN) ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38 > >> (XEN) ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000 > >> (XEN) 0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000 > >> (XEN) 0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000 > >> (XEN) 0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282 > >> (XEN) Xen call trace: > >> (XEN) [<000000007bdb630c>] 000000007bdb630c > >> (XEN) > >> (XEN) Pagetable walk from 00000000ee138470: > >> (XEN) L4[0x000] = 000000046d2ee063 ffffffffffffffff > >> (XEN) L3[0x003] = 000000005da11063 ffffffffffffffff > >> (XEN) L2[0x170] = 0000000000000000 ffffffffffffffff > >> (XEN) > >> (XEN) **************************************** > >> (XEN) Panic on CPU 0: > >> (XEN) FATAL PAGE FAULT > >> (XEN) [error_code=0002] > >> (XEN) Faulting linear address: 00000000ee138470 > 
>> (XEN) **************************************** > >> (XEN) > >> (XEN) Reboot in five seconds... > > This one I'm not sure about. What does your introspection agent do at > > that point? > > This crash is bizarre. Xen has most likely followed a corrupt function > pointer, because none of Xen's .text section live just below the 2G boundary > It's reproducible and happens immediately after a successful call to xc_altp2m_set_domain_state to enable altp2m. Tamas _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-22 21:28 ` Tamas K Lengyel @ 2018-10-22 22:15 ` Razvan Cojocaru 2018-10-22 22:50 ` Tamas K Lengyel 0 siblings, 1 reply; 25+ messages in thread From: Razvan Cojocaru @ 2018-10-22 22:15 UTC (permalink / raw) To: Tamas K Lengyel, Andrew Cooper Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Jun Nakajima, Xen-devel >>>> With the config fixed it boots but when I run DRAKVUF on the domain I >>>> get the following crash: >>>> >>>> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- >>>> (XEN) CPU: 0 >>>> (XEN) RIP: e008:[<000000007bdb630c>] 000000007bdb630c >>>> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v5) >>>> (XEN) rax: 00000000ee138470 rbx: 0000000000000000 rcx: 000000008000b098 >>>> (XEN) rdx: 0000000000000cf8 rsi: 0000000000000000 rdi: 000000046d2ef000 >>>> (XEN) rbp: 0000000000000000 rsp: ffff83005da27a10 r8: 0000000000000cf8 >>>> (XEN) r9: 0000000000000cf8 r10: ffff83005da27ab8 r11: ffff83005da27a08 >>>> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000065 >>>> (XEN) r15: 00000000000005a7 cr0: 0000000080050033 cr4: 0000000000372660 >>>> (XEN) cr3: 000000046d2ef000 cr2: 00000000ee138470 >>>> (XEN) fsb: 00007fe46d97bbc0 gsb: ffff880467f40000 gss: 0000000000000000 >>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 >>>> (XEN) Xen code around <000000007bdb630c> (000000007bdb630c): >>>> (XEN) 80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48 >>>> (XEN) Xen stack trace from rsp=ffff83005da27a10:(XEN) Xen stack trace >>>> from rsp=ffff83005da27a10: >>>> (XEN) 0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc >>>> (XEN) 00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90 >>>> (XEN) 0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0 >>>> (XEN) 000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8 >>>> (XEN) ffff83005da27b08 ffff82d0802a3511 ffff82d08046b028 
ffff83005da27b08 >>>> (XEN) ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0 >>>> (XEN) 0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028 >>>> (XEN) ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78 >>>> (XEN) ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88 >>>> (XEN) ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8 >>>> (XEN) ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8 >>>> (XEN) ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc >>>> (XEN) ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969 >>>> (XEN) 0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000 >>>> (XEN) 0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7 >>>> (XEN) ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38 >>>> (XEN) ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000 >>>> (XEN) 0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000 >>>> (XEN) 0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000 >>>> (XEN) 0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282 >>>> (XEN) Xen call trace: >>>> (XEN) [<000000007bdb630c>] 000000007bdb630c >>>> (XEN) >>>> (XEN) Pagetable walk from 00000000ee138470: >>>> (XEN) L4[0x000] = 000000046d2ee063 ffffffffffffffff >>>> (XEN) L3[0x003] = 000000005da11063 ffffffffffffffff >>>> (XEN) L2[0x170] = 0000000000000000 ffffffffffffffff >>>> (XEN) >>>> (XEN) **************************************** >>>> (XEN) Panic on CPU 0: >>>> (XEN) FATAL PAGE FAULT >>>> (XEN) [error_code=0002] >>>> (XEN) Faulting linear address: 00000000ee138470 >>>> (XEN) **************************************** >>>> (XEN) >>>> (XEN) Reboot in five seconds... >>> This one I'm not sure about. What does your introspection agent do at >>> that point? >> >> This crash is bizarre. 
Xen has most likely followed a corrupt function >> pointer, because none of Xen's .text section live just below the 2G boundary >> > > It's reproducible and happens immediately after a successful call to > xc_altp2m_set_domain_state to enable altp2m. That can't be all that's needed. I assure you I've tested this with much more than just calling xc_altp2m_set_domain_state() with no crashes at all. Something else must happen as well. Could you write a simple C test application that does the minimum amount of work needed to produce this crash? Thanks, Razvan
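For readers decoding the panic quoted above, here is a minimal sketch of how the reported fault details fit together. It assumes the standard x86 page-fault error code layout and a 4-level page-table walk; the helper names `pt_index` and `describe_fault` are hypothetical and are not part of Xen. With `error_code=0002` and the faulting linear address `00000000ee138470`, it should reproduce the `L4[0x000]`, `L3[0x003]`, `L2[0x170]` indices printed in the pagetable walk, and it reads the error code as a supervisor-mode write to a not-present page (which would be consistent with `rax` holding the same address as `cr2`).

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 page-fault error code bits (low three bits only). */
#define PFEC_PRESENT (1u << 0) /* 0: page not present, 1: protection violation */
#define PFEC_WRITE   (1u << 1) /* 0: read access, 1: write access */
#define PFEC_USER    (1u << 2) /* 0: supervisor mode, 1: user mode */

/* Index into the level-N page table (N = 1..4) for a 4-level walk:
 * 12 bits of page offset, then 9 bits of index per level. */
static unsigned int pt_index(uint64_t va, unsigned int level)
{
    return (unsigned int)((va >> (12 + 9 * (level - 1))) & 0x1ff);
}

/* Write a one-line summary of the fault into buf (hypothetical helper). */
static int describe_fault(char *buf, size_t len, uint64_t va, unsigned int ec)
{
    return snprintf(buf, len,
                    "%s of %s page in %s mode, L4[0x%03x] L3[0x%03x] L2[0x%03x]",
                    (ec & PFEC_WRITE) ? "write" : "read",
                    (ec & PFEC_PRESENT) ? "present" : "not-present",
                    (ec & PFEC_USER) ? "user" : "supervisor",
                    pt_index(va, 4), pt_index(va, 3), pt_index(va, 2));
}
```

Calling `describe_fault(buf, sizeof(buf), 0xee138470ULL, 0x0002)` fills `buf` with the decoded summary. This only explains what faulted, not why the instruction pointer ended up outside Xen's .text in the first place.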
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-22 22:15 ` Razvan Cojocaru @ 2018-10-22 22:50 ` Tamas K Lengyel 2018-10-23 12:37 ` Razvan Cojocaru 0 siblings, 1 reply; 25+ messages in thread From: Tamas K Lengyel @ 2018-10-22 22:50 UTC (permalink / raw) To: Razvan Cojocaru Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On Mon, Oct 22, 2018 at 4:15 PM Razvan Cojocaru <rcojocaru@bitdefender.com> wrote: > > >>>> With the config fixed it boots but when I run DRAKVUF on the domain I > >>>> get the following crash: > >>>> > >>>> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- > >>>> (XEN) CPU: 0 > >>>> (XEN) RIP: e008:[<000000007bdb630c>] 000000007bdb630c > >>>> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v5) > >>>> (XEN) rax: 00000000ee138470 rbx: 0000000000000000 rcx: 000000008000b098 > >>>> (XEN) rdx: 0000000000000cf8 rsi: 0000000000000000 rdi: 000000046d2ef000 > >>>> (XEN) rbp: 0000000000000000 rsp: ffff83005da27a10 r8: 0000000000000cf8 > >>>> (XEN) r9: 0000000000000cf8 r10: ffff83005da27ab8 r11: ffff83005da27a08 > >>>> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000065 > >>>> (XEN) r15: 00000000000005a7 cr0: 0000000080050033 cr4: 0000000000372660 > >>>> (XEN) cr3: 000000046d2ef000 cr2: 00000000ee138470 > >>>> (XEN) fsb: 00007fe46d97bbc0 gsb: ffff880467f40000 gss: 0000000000000000 > >>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 > >>>> (XEN) Xen code around <000000007bdb630c> (000000007bdb630c): > >>>> (XEN) 80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48 > >>>> (XEN) Xen stack trace from rsp=ffff83005da27a10:(XEN) Xen stack trace > >>>> from rsp=ffff83005da27a10: > >>>> (XEN) 0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc > >>>> (XEN) 00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90 > >>>> (XEN) 0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0 > >>>> (XEN) 
000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8 > >>>> (XEN) ffff83005da27b08 ffff82d0802a3511 ffff82d08046b028 ffff83005da27b08 > >>>> (XEN) ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0 > >>>> (XEN) 0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028 > >>>> (XEN) ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78 > >>>> (XEN) ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88 > >>>> (XEN) ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8 > >>>> (XEN) ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8 > >>>> (XEN) ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc > >>>> (XEN) ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969 > >>>> (XEN) 0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000 > >>>> (XEN) 0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7 > >>>> (XEN) ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38 > >>>> (XEN) ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000 > >>>> (XEN) 0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000 > >>>> (XEN) 0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000 > >>>> (XEN) 0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282 > >>>> (XEN) Xen call trace: > >>>> (XEN) [<000000007bdb630c>] 000000007bdb630c > >>>> (XEN) > >>>> (XEN) Pagetable walk from 00000000ee138470: > >>>> (XEN) L4[0x000] = 000000046d2ee063 ffffffffffffffff > >>>> (XEN) L3[0x003] = 000000005da11063 ffffffffffffffff > >>>> (XEN) L2[0x170] = 0000000000000000 ffffffffffffffff > >>>> (XEN) > >>>> (XEN) **************************************** > >>>> (XEN) Panic on CPU 0: > >>>> (XEN) FATAL PAGE FAULT > >>>> (XEN) [error_code=0002] > >>>> (XEN) Faulting linear address: 00000000ee138470 > >>>> (XEN) **************************************** > >>>> (XEN) > >>>> (XEN) Reboot in five 
seconds... > >>> This one I'm not sure about. What does your introspection agent do at > >>> that point? > >> > >> This crash is bizarre. Xen has most likely followed a corrupt function > >> pointer, because none of Xen's .text section live just below the 2G boundary > >> > > > > It's reproducible and happens immediately after a successful call to > > xc_altp2m_set_domain_state to enable altp2m. > > That can't be all that's needed. I assure you I've tested this with much > more that just calling xc_altp2m_set_domain_state() with no crashes at > all. Something else must happen as well. > > Could you write a simple C test application that does the minimum > ammount of work needed to produce this crash? Not the same error but another crash when just using xen-access with altp2m_exec: (XEN) Assertion '!p2m->sync.logdirty_ranges' failed at p2m-ept.c:1447 (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 7 (XEN) RIP: e008:[<ffff82d08033f3da>] p2m_init_altp2m_ept+0xf8/0x101 (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v1) (XEN) rax: 0000000000000000 rbx: ffff83044ff21880 rcx: 0000000000000000 (XEN) rdx: ffff830451aae000 rsi: 0000000000000000 rdi: ffff83044f500000 (XEN) rbp: ffff83046d237cc8 rsp: ffff83046d237ca8 r8: deadbeefdeadf00d (XEN) r9: deadbeefdeadf00d r10: 0000000000000000 r11: 0000000000000000 (XEN) r12: ffff83044f500830 r13: ffff83046d237d38 r14: ffff83018caf24a0 (XEN) r15: deadbeefdeadf00d cr0: 0000000080050033 cr4: 0000000000372660 (XEN) cr3: 00000003b9719000 cr2: 00007ffcf624afb0 (XEN) fsb: 00007f31c4b1a140 gsb: ffff880467e40000 gss: 0000000000000000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) Xen code around <ffff82d08033f3da> (p2m_init_altp2m_ept+0xf8/0x101): (XEN) 41 5c 41 5d 41 5e 5d c3 <0f> 0b b8 f4 ff ff ff eb ee 55 48 89 e5 53 48 83 (XEN) Xen stack trace from rsp=ffff83046d237ca8: (XEN) ffff83044f500000 ffff83044f500830 ffff83046d237d38 0000000000000000 (XEN) ffff83046d237d08 ffff82d0803380be 
ffff83046d237ce8 ffff83044f500000 (XEN) 00007f31c4b36010 00000000ffffffff ffff82d0802fb4ab deadbeefdeadf00d (XEN) ffff83046d237d98 ffff82d0802f7efb ffff83046d237e48 ffff82d0802035ba (XEN) 0000000400000001 0000000000000001 000000000003ffff 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000019 00007f31c4b36010 (XEN) ffff83005fdfb000 ffff82d0802fb4ab ffff83046d237e48 ffff82d0802fc6f2 (XEN) ffff83046d237fff ffff83005fdfb000 ffff83046d237dc8 ffff82d08036fe71 (XEN) ffff83046d237e48 ffff82d08037512a 0000000600000001 0000000000000000 (XEN) 0000000000000202 00007f31c41ff5d7 ffff82d08037d444 ffff82d08037d438 (XEN) ffff82d08037d444 ffff82d08037d438 ffff82d08037d444 ffff83046d237ef8 (XEN) 0000000000000022 ffff83005fdfb000 ffff82d0802fb4ab deadbeefdeadf00d (XEN) ffff83046d237ee8 ffff82d080374b07 02ff82d08037d444 0000000000000019 (XEN) 00007f31c4b36010 deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d (XEN) ffff82d08037d444 ffff82d08037d438 ffff82d08037d444 ffff82d08037d438 (XEN) ffff82d08037d444 ffff82d08037d438 ffff82d08037d444 ffff83005fdfb000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) 00007cfb92dc80e7 ffff82d08037d4a2 00007ffcf624d6b0 0000000000305000 (XEN) ffff880421adb400 00007ffcf624d6b0 ffffc90042c47e60 ffffffffffffffff (XEN) Xen call trace: (XEN) [<ffff82d08033f3da>] p2m_init_altp2m_ept+0xf8/0x101 (XEN) [<ffff82d0803380be>] p2m_init_next_altp2m+0x103/0x161 (XEN) [<ffff82d0802f7efb>] hvm.c#do_altp2m_op+0x413/0x779 (XEN) [<ffff82d0802fc6f2>] do_hvm_op+0x1247/0x1319 (XEN) [<ffff82d080374b07>] pv_hypercall+0x1dc/0x4bb (XEN) [<ffff82d08037d4a2>] lstar_enter+0x112/0x120 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 7: (XEN) Assertion '!p2m->sync.logdirty_ranges' failed at p2m-ept.c:1447 (XEN) **************************************** (XEN) (XEN) Reboot in five seconds... 
(XEN) APIC error on CPU0: 40(00) I had to rebase your branch on staging to get it to compile; other than that, I don't know why the crash is not happening on your side. Tamas
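As a reading aid for the call trace in this report, the following is a small sketch of how a `symbol+0xoffset/0xsize` entry such as `p2m_init_altp2m_ept+0xf8/0x101` can be split apart. The struct and function names (`sym_loc`, `parse_sym_loc`, `is_ud2`) are hypothetical helpers, not Xen code. The sketch also checks for the two-byte `0f 0b` (UD2) encoding that the code dump marks at RIP; my understanding, stated as an assumption, is that this is how a failed ASSERT traps on x86 Xen.

```c
#include <stdio.h>

/* One call-trace location: symbol name, offset into it, and symbol size. */
struct sym_loc {
    char name[64];
    unsigned long offset;
    unsigned long size;
};

/* Parse "name+0xoff/0xsize"; returns 0 on success, -1 on malformed input. */
static int parse_sym_loc(const char *s, struct sym_loc *out)
{
    /* %lx accepts an optional 0x prefix, so the trace text can be used as-is. */
    if (sscanf(s, "%63[^+]+%lx/%lx", out->name, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}

/* Nonzero if the two bytes at p are the x86 UD2 instruction (0f 0b). */
static int is_ud2(const unsigned char *p)
{
    return p[0] == 0x0f && p[1] == 0x0b;
}
```

For the entry above this yields offset 0xf8 in a function of size 0x101, i.e. the trap sits 9 bytes before the end of `p2m_init_altp2m_ept`, which fits an out-of-line assertion failure path near the tail of the function.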
* Fix VGA logdirty related display freezes with altp2m 2018-10-22 22:50 ` Tamas K Lengyel @ 2018-10-23 12:37 ` Razvan Cojocaru 2018-10-24 17:09 ` Tamas K Lengyel 0 siblings, 1 reply; 25+ messages in thread From: Razvan Cojocaru @ 2018-10-23 12:37 UTC (permalink / raw) To: Tamas K Lengyel Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel Tamas, could you please give this a spin? https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 It _should_ solve the crashes. Thanks, Razvan
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-23 12:37 ` Razvan Cojocaru @ 2018-10-24 17:09 ` Tamas K Lengyel 2018-10-24 17:20 ` Razvan Cojocaru 0 siblings, 1 reply; 25+ messages in thread From: Tamas K Lengyel @ 2018-10-24 17:09 UTC (permalink / raw) To: Razvan Cojocaru Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru <rcojocaru@bitdefender.com> wrote: > > Tamas, could you please give this a spin? > > https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 > > It _should_ solve the crashes. Indeed, I no longer see the crash. However, there might be some locking issue present because the whole system freezes up shortly after starting DRAKVUF on a domain - within a couple seconds. I mean Xen itself locks up: no response on the serial, dom0 screen frozen, etc. Tamas
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-24 17:09 ` Tamas K Lengyel @ 2018-10-24 17:20 ` Razvan Cojocaru 2018-10-24 17:31 ` Tamas K Lengyel 0 siblings, 1 reply; 25+ messages in thread From: Razvan Cojocaru @ 2018-10-24 17:20 UTC (permalink / raw) To: Tamas K Lengyel Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On 10/24/18 8:09 PM, Tamas K Lengyel wrote: > On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru > <rcojocaru@bitdefender.com> wrote: >> >> Tamas, could you please give this a spin? >> >> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 >> >> It _should_ solve the crashes. > > Indeed, I no longer see the crash. However, there might be some > locking issue present because the whole system freezes up shortly > after starting DRAKVUF on a domain - within a couple seconds. I mean > Xen itself locks up: no response on the serial, dom0 screen frozen, > etc. Do you have any type of log / backtrace / way I could reproduce it without Drakvuf? All the ways I've tested it were fine (including xen-access). Thanks, Razvan
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-24 17:20 ` Razvan Cojocaru @ 2018-10-24 17:31 ` Tamas K Lengyel 2018-10-24 17:52 ` Tamas K Lengyel 0 siblings, 1 reply; 25+ messages in thread From: Tamas K Lengyel @ 2018-10-24 17:31 UTC (permalink / raw) To: Razvan Cojocaru Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru <rcojocaru@bitdefender.com> wrote: > > On 10/24/18 8:09 PM, Tamas K Lengyel wrote: > > On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru > > <rcojocaru@bitdefender.com> wrote: > >> > >> Tamas, could you please give this a spin? > >> > >> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 > >> > >> It _should_ solve the crashes. > > > > Indeed, I no longer see the crash. However, there might be some > > locking issue present because the whole system freezes up shortly > > after starting DRAKVUF on a domain - within a couple seconds. I mean > > Xen itself locks up: no response on the serial, dom0 screen frozen, > > etc. > > Do you have any type of log / backtrace / way I could reproduce it > without Drakvuf? All the ways I've tested it were fine (including > xen-access). I don't have a standalone test that produces that error. With DRAKVUF it is easily reproducible though. If you have a Windows guest installed, setting up DRAKVUF should really not be much trouble. With xen-access it indeed doesn't lock up but since the guest is pretty much unresponsive during that test I can't verify whether the VGA issue is now resolved or not. Also the xen-access tests are fairly limited and don't use all aspects of altp2m. Tamas
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-24 17:31 ` Tamas K Lengyel @ 2018-10-24 17:52 ` Tamas K Lengyel 2018-10-24 18:05 ` Razvan Cojocaru 2018-10-25 14:24 ` Razvan Cojocaru 0 siblings, 2 replies; 25+ messages in thread From: Tamas K Lengyel @ 2018-10-24 17:52 UTC (permalink / raw) To: Razvan Cojocaru Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel <tamas.k.lengyel@gmail.com> wrote: > > On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru > <rcojocaru@bitdefender.com> wrote: > > > > On 10/24/18 8:09 PM, Tamas K Lengyel wrote: > > > On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru > > > <rcojocaru@bitdefender.com> wrote: > > >> > > >> Tamas, could you please give this a spin? > > >> > > >> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 > > >> > > >> It _should_ solve the crashes. > > > > > > Indeed, I no longer see the crash. However, there might be some > > > locking issue present because the whole system freezes up shortly > > > after starting DRAKVUF on a domain - within a couple seconds. I mean > > > Xen itself locks up: no response on the serial, dom0 screen frozen, > > > etc. > > > > Do you have any type of log / backtrace / way I could reproduce it > > without Drakvuf? All the ways I've tested it were fine (including > > xen-access). > > I don't have a standalone test that produces that error. With DRAKVUF > it is easily reproducible though. If you have a Windows guest > installed, setting up DRAKVUF should really not be much trouble. With > xen-access it indeed doesn't lock up but since the guest is pretty > much unresponsive during that test I can't verify whether the VGA > issue is now resolved or not. Also the xen-access tests are fairly > limited and don't use all aspects of altp2m. 
> What I see from the DRAKVUF log is that the last thing it prints is sending a vm_event response that both enables singlestepping and switches altp2m view. This looks to be consistent. It didn't matter if the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's definitely racy because it doesn't happen right away, the system works as expected for a couple seconds. Tamas
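To make the last logged action concrete, here is a toy model of a single response that both toggles singlestepping and requests an altp2m view switch. Everything prefixed `DEMO_`/`demo_` is a hypothetical stand-in with made-up bit values so the snippet compiles on its own; as far as I recall, the real counterparts are `VM_EVENT_FLAG_TOGGLE_SINGLESTEP`, `VM_EVENT_FLAG_ALTP2M` and the `altp2m_idx` field in Xen's public vm_event.h, and that header should be consulted for the actual definitions.

```c
#include <stdint.h>

/* Illustrative flag bits only; NOT the values used by Xen's vm_event ABI. */
#define DEMO_FLAG_TOGGLE_SINGLESTEP (1u << 0)
#define DEMO_FLAG_ALTP2M            (1u << 1)

/* Cut-down stand-in for a vm_event response. */
struct demo_response {
    uint32_t flags;       /* what the hypervisor is asked to do on resume */
    uint16_t altp2m_idx;  /* view to switch to when the altp2m flag is set */
};

/* Ask for singlestep to be toggled and the vCPU moved to `view`
 * in the same response, as described in the DRAKVUF log. */
static void demo_switch_view_and_step(struct demo_response *rsp, uint16_t view)
{
    rsp->flags |= DEMO_FLAG_TOGGLE_SINGLESTEP | DEMO_FLAG_ALTP2M;
    rsp->altp2m_idx = view;
}
```

The point of the model is only that both requests travel in one response, so the hypervisor handles the view switch and the singlestep toggle back to back when the vCPU resumes. Whether that combination is what triggers the reported lockup is not established in the thread.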
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-24 17:52 ` Tamas K Lengyel @ 2018-10-24 18:05 ` Razvan Cojocaru 2018-10-25 14:24 ` Razvan Cojocaru 1 sibling, 0 replies; 25+ messages in thread From: Razvan Cojocaru @ 2018-10-24 18:05 UTC (permalink / raw) To: Tamas K Lengyel Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On 10/24/18 8:52 PM, Tamas K Lengyel wrote: > On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel > <tamas.k.lengyel@gmail.com> wrote: >> >> On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru >> <rcojocaru@bitdefender.com> wrote: >>> >>> On 10/24/18 8:09 PM, Tamas K Lengyel wrote: >>>> On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru >>>> <rcojocaru@bitdefender.com> wrote: >>>>> >>>>> Tamas, could you please give this a spin? >>>>> >>>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 >>>>> >>>>> It _should_ solve the crashes. >>>> >>>> Indeed, I no longer see the crash. However, there might be some >>>> locking issue present because the whole system freezes up shortly >>>> after starting DRAKVUF on a domain - within a couple seconds. I mean >>>> Xen itself locks up: no response on the serial, dom0 screen frozen, >>>> etc. >>> >>> Do you have any type of log / backtrace / way I could reproduce it >>> without Drakvuf? All the ways I've tested it were fine (including >>> xen-access). >> >> I don't have a standalone test that produces that error. With DRAKVUF >> it is easily reproducible though. If you have a Windows guest >> installed, setting up DRAKVUF should really not be much trouble. With >> xen-access it indeed doesn't lock up but since the guest is pretty >> much unresponsive during that test I can't verify whether the VGA >> issue is now resolved or not. Also the xen-access tests are fairly >> limited and don't use all aspects of altp2m. 
>> > > What I see from the DRAKVUF log is that the last thing it prints is > sending a vm_event response that both enables singlestepping and > switches altp2m view. This looks to be consistent. It didn't matter if > the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's > definitely racy because it doesn't happen right away, the system > works as expected for a couple seconds. Right, I'll try to set up Drakvuf tomorrow. If it's a locking issue it should be in patch 3 - I don't see how the first two could cause a problem. I did test the patches with both xen-access and our full introspection agent with no lockups, crashes, or display freezes but it would seem that you're using it somehow differently. Thanks, Razvan
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-24 17:52 ` Tamas K Lengyel 2018-10-24 18:05 ` Razvan Cojocaru @ 2018-10-25 14:24 ` Razvan Cojocaru 2018-10-25 14:55 ` Tamas K Lengyel 1 sibling, 1 reply; 25+ messages in thread From: Razvan Cojocaru @ 2018-10-25 14:24 UTC (permalink / raw) To: Tamas K Lengyel Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On 10/24/18 8:52 PM, Tamas K Lengyel wrote: > On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel > <tamas.k.lengyel@gmail.com> wrote: >> >> On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru >> <rcojocaru@bitdefender.com> wrote: >>> >>> On 10/24/18 8:09 PM, Tamas K Lengyel wrote: >>>> On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru >>>> <rcojocaru@bitdefender.com> wrote: >>>>> >>>>> Tamas, could you please give this a spin? >>>>> >>>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 >>>>> >>>>> It _should_ solve the crashes. >>>> >>>> Indeed, I no longer see the crash. However, there might be some >>>> locking issue present because the whole system freezes up shortly >>>> after starting DRAKVUF on a domain - within a couple seconds. I mean >>>> Xen itself locks up: no response on the serial, dom0 screen frozen, >>>> etc. >>> >>> Do you have any type of log / backtrace / way I could reproduce it >>> without Drakvuf? All the ways I've tested it were fine (including >>> xen-access). >> >> I don't have a standalone test that produces that error. With DRAKVUF >> it is easily reproducible though. If you have a Windows guest >> installed, setting up DRAKVUF should really not be much trouble. With >> xen-access it indeed doesn't lock up but since the guest is pretty >> much unresponsive during that test I can't verify whether the VGA >> issue is now resolved or not. Also the xen-access tests are fairly >> limited and don't use all aspects of altp2m. 
>> > > What I see from the DRAKVUF log is that the last thing it prints is > sending a vm_event response that both enables singlestepping and > switches altp2m view. This looks to be consistent. It didn't matter if > the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's > definitely racy because it doesn't happen right away, the system > works as expected for a couple seconds. After having to install clang because my GCC couldn't build Drakvuf: ../../src/plugins/plugins.h:188:1: sorry, unimplemented: non-trivial designated initializers not supported then rekall via pip, then having to mount my Windows disk to do "rekall peinfo", I finally gave up when "rekall fetch_pdb" couldn't find the debug files on the Microsoft server. :) So if you could find a way to reproduce the issue with a simple libxc-based application alone (or at least with something libvmi-related, which I do have set up), I'd really appreciate it. Or maybe try to hack around with patch no 3 of the series (for a start, just revert it and see if the problem persists - of course the display will freeze) and see if there's an easy fix? Thanks, Razvan
* Re: Fix VGA logdirty related display freezes with altp2m 2018-10-25 14:24 ` Razvan Cojocaru @ 2018-10-25 14:55 ` Tamas K Lengyel 2018-10-25 15:02 ` Razvan Cojocaru 0 siblings, 1 reply; 25+ messages in thread From: Tamas K Lengyel @ 2018-10-25 14:55 UTC (permalink / raw) To: Razvan Cojocaru Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel On Thu, Oct 25, 2018 at 8:24 AM Razvan Cojocaru <rcojocaru@bitdefender.com> wrote: > > On 10/24/18 8:52 PM, Tamas K Lengyel wrote: > > On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel > > <tamas.k.lengyel@gmail.com> wrote: > >> > >> On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru > >> <rcojocaru@bitdefender.com> wrote: > >>> > >>> On 10/24/18 8:09 PM, Tamas K Lengyel wrote: > >>>> On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru > >>>> <rcojocaru@bitdefender.com> wrote: > >>>>> > >>>>> Tamas, could you please give this a spin? > >>>>> > >>>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2 > >>>>> > >>>>> It _should_ solve the crashes. > >>>> > >>>> Indeed, I no longer see the crash. However, there might be some > >>>> locking issue present because the whole system freezes up shortly > >>>> after starting DRAKVUF on a domain - within a couple seconds. I mean > >>>> Xen itself locks up: no response on the serial, dom0 screen frozen, > >>>> etc. > >>> > >>> Do you have any type of log / backtrace / way I could reproduce it > >>> without Drakvuf? All the ways I've tested it were fine (including > >>> xen-access). > >> > >> I don't have a standalone test that produces that error. With DRAKVUF > >> it is easily reproducible though. If you have a Windows guest > >> installed, setting up DRAKVUF should really not be much trouble. With > >> xen-access it indeed doesn't lock up but since the guest is pretty > >> much unresponsive during that test I can't verify whether the VGA > >> issue is now resolved or not. 
Also the xen-access tests are fairly > >> limited and don't use all aspects of altp2m. > >> > > > > What I see from the DRAKVUF log is that the last thing it prints is > > sending a vm_event response that both enables singlestepping and > > switches altp2m view. This looks to be consistent. It didn't matter if > > the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's > > definitely racy because it doesn't happen right away, the system > > works as expected for a couple seconds. > > After having to install clang because my GCC couldn't build Drakvuf: > > ../../src/plugins/plugins.h:188:1: sorry, unimplemented: non-trivial > designated initializers not supported Please follow the instructions for compiling it, clang is a requirement. I don't even know how you got past the ./configure stage without clang being installed. You could also just copy-paste things from the travis script directly: https://github.com/tklengyel/drakvuf/blob/master/.travis.yml#L51 > > then rekall via pip, then having to mount my Windows disk to do "rekal > peinfo", I finally gave up when "rekall fetch_pdb" couldn't find the > debug files on the Microsoft server. :) If your version of Windows is that brand new then yes, Microsoft takes a couple days to publish their debug information and you will just have to wait or use an older version of Windows. > > So if you could find a way to reproduce the issue with a simple > libxc-based application alone (or at least with something > libvmi-related, which I do have set up), I'd really appreciate it. > > Or maybe try to hack around with patch no 3 of the series (for a start, > just revert it and see if the problem persists - of course the display > will freeze) and see if there's an easy fix? Unfortunately I won't have time to do either of these any time soon. If you are having that much trouble setting it up I can perhaps send you a pre-compiled version with a version of Windows for which Microsoft already published the debug info.
Tamas
* Re: Fix VGA logdirty related display freezes with altp2m
From: Razvan Cojocaru @ 2018-10-25 15:02 UTC
To: Tamas K Lengyel
Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel

On 10/25/18 5:55 PM, Tamas K Lengyel wrote:
> On Thu, Oct 25, 2018 at 8:24 AM Razvan Cojocaru
> <rcojocaru@bitdefender.com> wrote:
>>
>> On 10/24/18 8:52 PM, Tamas K Lengyel wrote:
>>> On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel
>>> <tamas.k.lengyel@gmail.com> wrote:
>>>>
>>>> On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru
>>>> <rcojocaru@bitdefender.com> wrote:
>>>>>
>>>>> On 10/24/18 8:09 PM, Tamas K Lengyel wrote:
>>>>>> On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru
>>>>>> <rcojocaru@bitdefender.com> wrote:
>>>>>>>
>>>>>>> Tamas, could you please give this a spin?
>>>>>>>
>>>>>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2
>>>>>>>
>>>>>>> It _should_ solve the crashes.
>>>>>>
>>>>>> Indeed, I no longer see the crash. However, there might be some
>>>>>> locking issue present because the whole system freezes up shortly
>>>>>> after starting DRAKVUF on a domain - within a couple seconds. I mean
>>>>>> Xen itself locks up: no response on the serial, dom0 screen frozen,
>>>>>> etc.
>>>>>
>>>>> Do you have any type of log / backtrace / way I could reproduce it
>>>>> without Drakvuf? All the ways I've tested it were fine (including
>>>>> xen-access).
>>>>
>>>> I don't have a standalone test that produces that error. With DRAKVUF
>>>> it is easily reproducible though. If you have a Windows guest
>>>> installed, setting up DRAKVUF should really not be much trouble. With
>>>> xen-access it indeed doesn't lock up but since the guest is pretty
>>>> much unresponsive during that test I can't verify whether the VGA
>>>> issue is now resolved or not. Also the xen-access tests are fairly
>>>> limited and don't use all aspects of altp2m.
>>>>
>>>
>>> What I see from the DRAKVUF log is that the last thing it prints is
>>> sending a vm_event response that both enables singlestepping and
>>> switches altp2m view. This looks to be consistent. It didn't matter if
>>> the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's
>>> definitely racey because it doesn't happen right away, the system
>>> works as expected for a couple seconds.
>>
>> After having to install clang because my GCC couldn't build Drakvuf:
>>
>> ../../src/plugins/plugins.h:188:1: sorry, unimplemented: non-trivial
>> designated initializers not supported
>
> Please follow the instruction for compiling it, clang is a
> requirement. I don't even know how you got pass the ./configure stage
> without clang being installed. You could also just copy-paste things
> from the travis script directly:
> https://github.com/tklengyel/drakvuf/blob/master/.travis.yml#L51
>
>>
>> then rekall via pip, then having to mount my Windows disk to do "rekal
>> peinfo", I finally gave up when "rekall fetch_pdb" couldn't find the
>> debug files on the Microsoft server. :)
>
> If your version if Windows is that brand new then yes, Microsoft takes
> a couple days to publish their debug information and you will just
> have to wait or use an older version of Windows.
>
>>
>> So if you could find a way to reproduce the issue with a simple
>> libxc-based application alone (or at least with something
>> libvmi-related, which I do have set up), I'd really appreciate it.
>>
>> Or maybe try to hack around with patch no 3 of the series (for a start,
>> just revert it and see if the problem persists - of course the display
>> will freeze) and see if there's an easy fix?
>
> Unfortunately I won't have time to do either of these any time soon.
> If you are having that much trouble setting it up I can perhaps send
> you a pre-compiled version with a version of Windows for which
> Microsoft already published the debug info for.

It's a Windows 7 x64 guest. But the problem was that the right command
line is:

rekall fetch_pdb ntkrnlmp

instead of the suggested "rekall fetch_pdb ntkrpamp" on the drakvuf.com
website.

I'll try to continue - in any case should I have more trouble I'll
contact you privately so as not to spam the list. Just wanted to leave
this here in case someone else has this problem in the hope that it's
useful.

Thanks,
Razvan
* Re: Fix VGA logdirty related display freezes with altp2m
From: Tamas K Lengyel @ 2018-10-25 15:08 UTC
To: Razvan Cojocaru
Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel

On Thu, Oct 25, 2018 at 9:02 AM Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
>
> It's a Windows 7 x64 guest. But the problem was that the right command
> line is:
>
> rekall fetch_pdb ntkrnlmp
>
> instead of the suggested "rekall fetch_pdb ntkrpamp" on the drakvuf.com
> website.

The kernel filename is specific to the version of Windows you have
installed. The instructions specify _an example_ for the 32-bit
version of Windows 7 and you will need to adjust it according to the
kernel filename. For 64-bit it is ntkrnlmp. The instructions explicitly
say that you need to use the PDB filename that was printed for your
specific kernel version.

> I'll try to continue - in any case should I have more trouble I'll
> contact you privately so as not to spam the list. Just wanted to leave
> this here in case someone else has this problem in the hope that it's
> useful.

Of course, also please feel free to open an issue on github if you run
into something that's blocking you. Chances are if you run into it,
others would too :)

Thanks,
Tamas
* Re: Fix VGA logdirty related display freezes with altp2m
From: Tamas K Lengyel @ 2018-10-25 20:11 UTC
To: Razvan Cojocaru
Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel

On Thu, Oct 25, 2018 at 9:08 AM Tamas K Lengyel
<tamas.k.lengyel@gmail.com> wrote:
>
> Of course, also please feel free to open an issue on github if you run
> into something that's blocking you. Chances are if you run into it,
> others would too :)

We can chalk the freeze issue up to buggy hardware on my side. We
couldn't reproduce the issue on two other systems. The screen issue is
definitely gone now which is awesome! :) Thanks Razvan!

Tamas
* Re: Fix VGA logdirty related display freezes with altp2m
From: Razvan Cojocaru @ 2018-10-25 20:17 UTC
To: Tamas K Lengyel
Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper, Jun Nakajima, Xen-devel

On 10/25/18 11:11 PM, Tamas K Lengyel wrote:
> We can chalk the freeze issue up to buggy hardware on my side. We
> couldn't reproduce the issue on two other systems. The screen issue is
> definitely gone now which is awesome! :) Thanks Razvan!

No problem, thank you for testing! And I'm competent with Drakvuf now. :)

Thanks,
Razvan
end of thread (newest message: 2018-10-25 20:17 UTC)

Thread overview: 25+ messages
2018-10-18 10:07 Fix VGA logdirty related display freezes with altp2m Razvan Cojocaru
2018-10-18 10:07 ` [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms Razvan Cojocaru
2018-10-18 10:07 ` [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new view early Razvan Cojocaru
2018-10-18 10:57 ` Andrew Cooper
2018-10-19  8:30 ` Razvan Cojocaru
2018-10-18 20:08 ` Fix VGA logdirty related display freezes with altp2m Tamas K Lengyel
2018-10-18 21:12 ` Razvan Cojocaru
2018-10-22 20:48 ` Tamas K Lengyel
2018-10-22 21:17 ` Razvan Cojocaru
2018-10-22 21:22 ` Andrew Cooper
2018-10-22 21:28 ` Tamas K Lengyel
2018-10-22 22:15 ` Razvan Cojocaru
2018-10-22 22:50 ` Tamas K Lengyel
2018-10-23 12:37 ` Razvan Cojocaru
2018-10-24 17:09 ` Tamas K Lengyel
2018-10-24 17:20 ` Razvan Cojocaru
2018-10-24 17:31 ` Tamas K Lengyel
2018-10-24 17:52 ` Tamas K Lengyel
2018-10-24 18:05 ` Razvan Cojocaru
2018-10-25 14:24 ` Razvan Cojocaru
2018-10-25 14:55 ` Tamas K Lengyel
2018-10-25 15:02 ` Razvan Cojocaru
2018-10-25 15:08 ` Tamas K Lengyel
2018-10-25 20:11 ` Tamas K Lengyel
2018-10-25 20:17 ` Razvan Cojocaru