All of lore.kernel.org
 help / color / mirror / Atom feed
From: Razvan Cojocaru <rcojocaru@bitdefender.com>
To: George Dunlap <dunlapg@umich.edu>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	Tamas K Lengyel <tamas@tklengyel.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>,
	George Dunlap <george.dunlap@citrix.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Weird altp2m behaviour when switching early to a new view
Date: Tue, 17 Apr 2018 11:24:58 +0300	[thread overview]
Message-ID: <00b52489-51e5-6bc8-ccc2-7b44d2dec107@bitdefender.com> (raw)
In-Reply-To: <CAFLBxZbjif+AeMPrc1CkbyjEiSGHJEW3tGNmMqZtCvp+3pOGeg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3578 bytes --]

On 04/16/2018 11:21 PM, George Dunlap wrote:
> On Mon, Apr 16, 2018 at 7:46 PM, Razvan Cojocaru
> <rcojocaru@bitdefender.com> wrote:
>> On 04/16/2018 08:47 PM, George Dunlap wrote:
>>> On 04/13/2018 03:44 PM, Razvan Cojocaru wrote:
>>>> On 04/11/2018 11:04 AM, Razvan Cojocaru wrote:
>>>>> Debugging continues.
>>>>
>>>> Finally, the attached patch seems to get the display unstuck in my
>>>> scenario, although for one guest I get:
>>>>
>>>> (XEN) d2v0 Unexpected vmexit: reason 49
>>>> (XEN) domain_crash called from vmx.c:4120
>>>> (XEN) Domain 2 (vcpu#0) crashed on cpu#1:
>>>> (XEN) ----[ Xen-4.11-unstable  x86_64  debug=y   Not tainted ]----
>>>> (XEN) CPU:    1
>>>> (XEN) RIP:    0010:[<fffff96000842354>]
>>>> (XEN) RFLAGS: 0000000000010246   CONTEXT: hvm guest (d2v0)
>>>> (XEN) rax: fffff88003000000   rbx: fffff900c0083db0   rcx: 00000000aa55aa55
>>>> (XEN) rdx: fffffa80041bdc41   rsi: fffff900c00c69a0   rdi: 0000000000000001
>>>> (XEN) rbp: 0000000000000000   rsp: fffff88002ee9ef0   r8:  fffffa80041bdc40
>>>> (XEN) r9:  fffff80001810e80   r10: fffffa800342aa70   r11: fffff88002ee9e80
>>>> (XEN) r12: 0000000000000005   r13: 0000000000000001   r14: fffff900c00c08b0
>>>> (XEN) r15: 0000000000000001   cr0: 0000000080050031   cr4: 00000000000406f8
>>>> (XEN) cr3: 00000000ef771000   cr2: fffff900c00c8000
>>>> (XEN) fsb: 00000000fffde000   gsb: fffff80001810d00   gss: 000007fffffdc000
>>>> (XEN) ds: 002b   es: 002b   fs: 0053   gs: 002b   ss: 0018   cs: 0010
>>>>
>>>> i.e. EXIT_REASON_EPT_MISCONFIG - so not of the woods yet. I am hoping
>>>> somebody more familiar with the code can point to a more elegant
>>>> solution if one exists.
>>>
>>> I think I have an idea what's going on, but it's complicated. :-)
>>>
>>> Basically, the logdirty functionality isn't simple, and needs careful
>>> thought on how to integrate it.  I'll write some more tomorrow, and see
>>> if I can come up with a solution.
>>
>> I think I know why this happens for the one guest - the other guests
>> start at a certain resolution display-wise and stay that way until shutdown.
>>
>> This particular guest starts with a larger screen, then goes to roughly
>> 2/3rds of it, then tries to go back to the initial larger one - at which
>> point the above happens. I assume this corresponds to some pages being
>> removed and/or added. I'll test this theory more tomorrow - if it's
>> correct I should be able to reproduce the crash (with the patch) by
>> simply resetting the screen resolution (increasing it).
> 
> The trick is that p2m_change_type doesn't actually iterate over the
> entire p2m range, individually changing entries as it goes.  Instead
> it misconfigures the entries at the top-level, which causes the kinds
> of faults shown above.  As it gets faults for each entry, it checks
> the current type, the logdirty ranges, and the global logdirty bit to
> determine what the new types should be.
> 
> Your patch makes it so that all the altp2ms now get the
> misconfiguration when the logdirty range is changed; but clearly
> handling the misconfiguration isn't integrated properly with the
> altp2m system yet.  Doing it right may take some thought.

FWIW, the attached patch has solved the misconfig-related domain crash
for me (though I'm very likely missing some subtleties). It all seems to
work as expected when enabling altp2m and switching early to a new view.
However, now I have domUs with a frozen display when I disconnect the
introspection application (that is, after I switch back to the default
view and disable altp2m on the domain).


Thanks,
Razvan

[-- Attachment #2: altp2m_logdirty2.patch --]
[-- Type: text/x-patch, Size: 4312 bytes --]

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 14b5939..4530689 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -17,6 +17,7 @@
 
 #include <xen/domain_page.h>
 #include <xen/sched.h>
+#include <asm/altp2m.h>
 #include <asm/current.h>
 #include <asm/paging.h>
 #include <asm/types.h>
@@ -652,17 +653,38 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
 bool_t ept_handle_misconfig(uint64_t gpa)
 {
     struct vcpu *curr = current;
-    struct p2m_domain *p2m = p2m_get_hostp2m(curr->domain);
+    struct domain *d = curr->domain;
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
     bool_t spurious;
-    int rc;
-
-    p2m_lock(p2m);
+    int rc = 0;
+    unsigned int i;
 
     spurious = curr->arch.hvm_vmx.ept_spurious_misconfig;
-    rc = resolve_misconfig(p2m, PFN_DOWN(gpa));
-    curr->arch.hvm_vmx.ept_spurious_misconfig = 0;
 
-    p2m_unlock(p2m);
+    if ( altp2m_active(d) )
+    {
+       for ( i = 0; i < MAX_ALTP2M; i++ )
+            if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
+            {
+                p2m = d->arch.altp2m_p2m[i];
+
+                p2m_lock(p2m);
+
+                rc = resolve_misconfig(p2m, PFN_DOWN(gpa));
+                curr->arch.hvm_vmx.ept_spurious_misconfig = 0;
+
+                p2m_unlock(p2m);
+            }
+    }
+    else
+    {
+        p2m_lock(p2m);
+
+        rc = resolve_misconfig(p2m, PFN_DOWN(gpa));
+        curr->arch.hvm_vmx.ept_spurious_misconfig = 0;
+
+        p2m_unlock(p2m);
+    }
 
     return spurious ? (rc >= 0) : (rc > 0);
 }
@@ -1375,8 +1397,15 @@ void setup_ept_dump(void)
 void p2m_init_altp2m_ept(struct domain *d, unsigned int i)
 {
     struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+    struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
     struct ept_data *ept;
 
+    p2m->max_mapped_pfn = hostp2m->max_mapped_pfn;
+    p2m->default_access = hostp2m->default_access;
+    p2m->domain = hostp2m->domain;
+    p2m->logdirty_ranges = hostp2m->logdirty_ranges;
+    p2m->global_logdirty = hostp2m->global_logdirty;
+
     p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
     p2m->max_remapped_gfn = 0;
     ept = &p2m->ept;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c53cab4..00f85e1 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -28,6 +28,7 @@
 #include <xen/vm_event.h>
 #include <xen/event.h>
 #include <public/vm_event.h>
+#include <asm/altp2m.h>
 #include <asm/domain.h>
 #include <asm/page.h>
 #include <asm/paging.h>
@@ -248,7 +249,6 @@ int p2m_init(struct domain *d)
 int p2m_is_logdirty_range(struct p2m_domain *p2m, unsigned long start,
                           unsigned long end)
 {
-    ASSERT(p2m_is_hostp2m(p2m));
     if ( p2m->global_logdirty ||
          rangeset_contains_range(p2m->logdirty_ranges, start, end) )
         return 1;
@@ -964,12 +964,12 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn_l,
 }
 
 /* Modify the p2m type of a range of gfns from ot to nt. */
-void p2m_change_type_range(struct domain *d, 
-                           unsigned long start, unsigned long end,
-                           p2m_type_t ot, p2m_type_t nt)
+static void _p2m_change_type_range(struct p2m_domain *p2m,
+                                   unsigned long start, unsigned long end,
+                                   p2m_type_t ot, p2m_type_t nt)
 {
+    struct domain *d = p2m->domain;
     unsigned long gfn = start;
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int rc = 0;
 
     ASSERT(ot != nt);
@@ -1022,6 +1022,23 @@ void p2m_change_type_range(struct domain *d,
     p2m_unlock(p2m);
 }
 
+void p2m_change_type_range(struct domain *d,
+                           unsigned long start, unsigned long end,
+                           p2m_type_t ot, p2m_type_t nt)
+{
+    unsigned int i;
+
+    if ( !altp2m_active(d) )
+    {
+        _p2m_change_type_range(p2m_get_hostp2m(d), start, end, ot, nt);
+        return;
+    }
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+        if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
+            _p2m_change_type_range(d->arch.altp2m_p2m[i], start, end, ot, nt);
+}
+
 /*
  * Finish p2m type change for gfns which are marked as need_recalc in a range.
  * Returns: 0/1 for success, negative for failure

[-- Attachment #3: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  parent reply	other threads:[~2018-04-17  8:25 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-08 20:38 Weird altp2m behaviour when switching early to a new view Razvan Cojocaru
2018-04-09 14:12 ` George Dunlap
2018-04-11  6:39   ` Razvan Cojocaru
2018-04-11  8:04     ` Razvan Cojocaru
2018-04-12 16:15       ` Razvan Cojocaru
2018-04-13 14:44       ` Razvan Cojocaru
2018-04-13 16:38         ` Tamas K Lengyel
2018-04-13 17:04           ` Razvan Cojocaru
2018-04-16 17:47         ` George Dunlap
2018-04-16 18:46           ` Razvan Cojocaru
2018-04-16 20:21             ` George Dunlap
2018-04-17  7:24               ` Razvan Cojocaru
2018-04-17  8:24               ` Razvan Cojocaru [this message]
2018-04-17 10:49                 ` Razvan Cojocaru
2018-04-17 10:50                   ` Razvan Cojocaru
2018-04-17 13:53                     ` George Dunlap
2018-04-17 14:21                       ` Razvan Cojocaru
2018-04-17 14:58                         ` George Dunlap
2018-04-17 15:13                           ` Razvan Cojocaru
2018-04-17 17:07                             ` Tamas K Lengyel
2018-04-18  8:20                       ` Razvan Cojocaru
2018-04-18 10:45                         ` George Dunlap
2018-04-18 10:49                           ` Razvan Cojocaru
2018-04-11 20:17     ` Tamas K Lengyel
2018-04-12  7:19       ` Razvan Cojocaru
2018-04-09 15:40 ` Alexey G
2018-10-02 16:29 Сергей
2018-10-02 17:59 ` Razvan Cojocaru
2018-10-03 10:56   ` Сергей
2018-10-03 11:02     ` Razvan Cojocaru
2018-10-04  9:17     ` Razvan Cojocaru

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=00b52489-51e5-6bc8-ccc2-7b44d2dec107@bitdefender.com \
    --to=rcojocaru@bitdefender.com \
    --cc=Andrew.Cooper3@citrix.com \
    --cc=dunlapg@umich.edu \
    --cc=george.dunlap@citrix.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=tamas@tklengyel.com \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.