From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: <xen-devel@lists.xenproject.org>
Subject: Re: dom0 PV looping on search_pre_exception_table()
Date: Wed, 9 Dec 2020 13:28:54 +0000
Message-ID: <3f7e50bb-24ad-1e32-9ea1-ba87007d3796@citrix.com>
In-Reply-To: <20201209101512.GA1299@antioche.eu.org>

[-- Attachment #1: Type: text/plain, Size: 5677 bytes --]

On 09/12/2020 10:15, Manuel Bouyer wrote:
> On Tue, Dec 08, 2020 at 06:13:46PM +0000, Andrew Cooper wrote:
>> On 08/12/2020 17:57, Manuel Bouyer wrote:
>>> Hello,
>>> for the first time I tried to boot a xen kernel from devel with
>>> a NetBSD PV dom0. The kernel boots, but when the first userland process
>>> is launched, it seems to enter a loop involving search_pre_exception_table()
>>> (I see an endless stream from the dprintk() at arch/x86/extable.c:202)
>>>
>>> With xen 4.13 I see it, but exactly once:
>>> (XEN) extable.c:202: Pre-exception: ffff82d08038c304 -> ffff82d08038c8c8
>>>
>>> with devel:
>>> (XEN) extable.c:202: Pre-exception: ffff82d040393309 -> ffff82d0403938c8        
>>> (XEN) extable.c:202: Pre-exception: ffff82d040393309 -> ffff82d0403938c8        
>>> (XEN) extable.c:202: Pre-exception: ffff82d040393309 -> ffff82d0403938c8        
>>> (XEN) extable.c:202: Pre-exception: ffff82d040393309 -> ffff82d0403938c8        
>>> (XEN) extable.c:202: Pre-exception: ffff82d040393309 -> ffff82d0403938c8        
>>> [...]
>>>
>>> the dom0 kernel is the same.
>>>
>>> At first glance it looks like a fault in the guest is not handled as it should,
>>> and the userland process keeps faulting on the same address.
>>>
>>> Any idea what to look at ?
>> That is a reoccurring fault on IRET back to guest context, and is
>> probably caused by some unwise-in-hindsight cleanup which doesn't
>> escalate the failure to the failsafe callback.
>>
>> This ought to give something useful to debug with:
> thanks, I got:
> (XEN) IRET fault: #PF[0000]                                                 
> (XEN) domain_crash called from extable.c:209                                
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:                                   
> (XEN) ----[ Xen-4.15-unstable  x86_64  debug=y   Tainted:   C   ]----       
> (XEN) CPU:    0                                                             
> (XEN) RIP:    0047:[<00007f7e184007d0>]                                     
> (XEN) RFLAGS: 0000000000000202   EM: 0   CONTEXT: pv guest (d0v0)           
> (XEN) rax: ffff82d04038c309   rbx: 0000000000000000   rcx: 000000000000e008 
> (XEN) rdx: 0000000000010086   rsi: ffff83007fcb7f78   rdi: 000000000000e010 
> (XEN) rbp: 0000000000000000   rsp: 00007f7fff53e3e0   r8:  0000000e00000000 
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000 
> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000 
> (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 0000000000002660 
> (XEN) cr3: 0000000079cdb000   cr2: 00007f7fff53e3e0                         
> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: ffffffff80cf2dc0    
> (XEN) ds: 0023   es: 0023   fs: 0000   gs: 0000   ss: 003f   cs: 0047       
> (XEN) Guest stack trace from rsp=00007f7fff53e3e0:          
> (XEN)    0000000000000001 00007f7fff53e8f8 0000000000000000 0000000000000000
> (XEN)    0000000000000003 000000004b600040 0000000000000004 0000000000000038
> (XEN)    0000000000000005 0000000000000008 0000000000000006 0000000000001000
> (XEN)    0000000000000007 00007f7e18400000 0000000000000008 0000000000000000
> (XEN)    0000000000000009 000000004b601cd0 00000000000007d0 0000000000000000
> (XEN)    00000000000007d1 0000000000000000 00000000000007d2 0000000000000000
> (XEN)    00000000000007d3 0000000000000000 000000000000000d 00007f7fff53f000
> (XEN)    00000000000007de 00007f7fff53e4e0 0000000000000000 0000000000000000
> (XEN)    6e692f6e6962732f 0000000000007469 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.

Pagefaults on IRET come either from stack accesses for operands (not the
case here as Xen is otherwise working fine), or from segment selector
loads for %cs and %ss.

In this example, %ss is in the LDT, which specifically does use
pagefaults to promote the frame to PGT_segdesc.
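
As a reading aid, here is a minimal sketch of how the selector values in
the register dump decode under the architectural x86 selector layout.
The struct and function names are made up for this illustration and are
not Xen identifiers.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (architectural layout). */
struct selector_fields {
    unsigned int rpl;    /* bits 1:0, requested privilege level */
    unsigned int in_ldt; /* bit 2, table indicator: 0 = GDT, 1 = LDT */
    unsigned int index;  /* bits 15:3, descriptor index within the table */
    unsigned int offset; /* byte offset of the 8-byte descriptor */
};

/* Hypothetical helper: split a 16-bit selector into its fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.rpl    = sel & 3;
    f.in_ldt = (sel >> 2) & 1;
    f.index  = sel >> 3;
    f.offset = f.index * 8; /* each descriptor slot is 8 bytes */

    return f;
}
```

With the values from the dump, decode_selector(0x003f) gives an LDT
selector with index 7 (byte offset 0x38) and RPL 3, while ds = 0x0023
decodes as a GDT selector with index 4.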

I suspect that what is happening is that handle_ldt_mapping_fault() is
failing to promote the page (for some reason), and we're taking the "In
hypervisor mode? Leave it to the #PF handler to fix up." path due to the
confusion in context, and Xen's #PF handler is concluding "nothing else
to do".
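
For reference when reading the "#PF[0000]" line in the quoted log: the
value in brackets is the pagefault error code.  A rough sketch of
decoding its architectural low bits follows.  The EX_PFEC_* names and
decode_pf_error_code() are invented for this example and are not the
names Xen uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 pagefault error code bits (illustrative names). */
#define EX_PFEC_PRESENT  (1u << 0) /* 0 = not-present page, 1 = protection */
#define EX_PFEC_WRITE    (1u << 1) /* 0 = read access, 1 = write access */
#define EX_PFEC_USER     (1u << 2) /* 0 = supervisor access, 1 = user access */
#define EX_PFEC_RESERVED (1u << 3) /* reserved bit set in a paging entry */
#define EX_PFEC_INSN     (1u << 4) /* fault was an instruction fetch */

struct pf_info {
    bool present, write, user, reserved_bit, insn_fetch;
};

/* Hypothetical helper: expand an error code into individual flags. */
static struct pf_info decode_pf_error_code(uint32_t ec)
{
    struct pf_info info;

    info.present      = ec & EX_PFEC_PRESENT;
    info.write        = ec & EX_PFEC_WRITE;
    info.user         = ec & EX_PFEC_USER;
    info.reserved_bit = ec & EX_PFEC_RESERVED;
    info.insn_fetch   = ec & EX_PFEC_INSN;

    return info;
}
```

An error code of 0 leaves every flag clear, i.e. a supervisor-mode read
of a not-present page.  That would be consistent with a descriptor fetch
(an implicit supervisor access) hitting an LDT page which isn't mapped.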

The older behaviour of escalating to the failsafe callback would have
broken this cycle by rewriting %ss and re-entering the kernel.


Please try the attached debugging patch, which is an extension of what I
gave you yesterday.  First, it ought to print %cr2, which I expect will
point to Xen's virtual mapping of the vcpu's LDT.  The logic ought to
loop a few times so we can inspect the hypervisor codepaths which are
effectively livelocked in this state, and I've also instrumented
check_descriptor() failures because I've got a gut feeling that is the
root cause of the problem.
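
To make "loop a few times" concrete, here is a standalone sketch of the
"count++ > 2" test used in the patch.  It is a simplified model with a
caller-supplied counter instead of the patch's function-local static,
and the function names are hypothetical.

```c
#include <stdbool.h>

/*
 * Model of the patch's check: compare the counter against 2 first, then
 * increment it (post-increment).
 */
static bool fault_limit_reached(int *count)
{
    return (*count)++ > 2;
}

/* Number of the fault (1-based) on which the limit is first reached. */
static int faults_until_crash(void)
{
    int count = 0;
    int fault = 1;

    while ( !fault_limit_reached(&count) )
        fault++;

    return fault;
}
```

In this model faults_until_crash() returns 4: the first three faults are
logged and fixed up as before, and the fourth is logged and then takes
the domain_crash() path.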

~Andrew

[-- Attachment #2: 0001-extable-dbg.patch --]
[-- Type: text/x-patch, Size: 2272 bytes --]

From 841a6950fec5b43b370653e0c833a54fed64882e Mon Sep 17 00:00:00 2001
From: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Wed, 9 Dec 2020 12:50:38 +0000
Subject: extable-dbg


diff --git a/xen/arch/x86/extable.c b/xen/arch/x86/extable.c
index 70972f1085..88b05bef38 100644
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -191,6 +191,10 @@ static int __init stub_selftest(void)
 __initcall(stub_selftest);
 #endif
 
+#include <xen/sched.h>
+#include <xen/softirq.h>
+const char *vec_name(unsigned int vec);
+
 unsigned long
 search_pre_exception_table(struct cpu_user_regs *regs)
 {
@@ -199,7 +203,21 @@ search_pre_exception_table(struct cpu_user_regs *regs)
         __start___pre_ex_table, __stop___pre_ex_table-1, addr);
     if ( fixup )
     {
-        dprintk(XENLOG_INFO, "Pre-exception: %p -> %p\n", _p(addr), _p(fixup));
+        static int count;
+
+        printk(XENLOG_ERR "IRET fault: %s[%04x]\n",
+               vec_name(regs->entry_vector), regs->error_code);
+
+        if ( regs->entry_vector == X86_EXC_PF )
+            printk(XENLOG_ERR "%%cr2 %016lx\n", read_cr2());
+
+        if ( count++ > 2 )
+        {
+            domain_crash(current->domain);
+            for ( ;; )
+                do_softirq();
+        }
+
         perfc_incr(exception_fixed);
     }
     return fixup;
diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
index 39c1a2311a..6bc58bba67 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -282,6 +282,10 @@ int validate_segdesc_page(struct page_info *page)
 
     unmap_domain_page(descs);
 
+    if ( i != 512 )
+        printk_once("Check Descriptor failed: idx %u, a: %08x, b: %08x\n",
+                    i, descs[i].a, descs[i].b);
+
     return i == 512 ? 0 : -EINVAL;
 }
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 0459cee9fb..1059f3ce66 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -687,7 +687,7 @@ const char *trapstr(unsigned int trapnr)
     return trapnr < ARRAY_SIZE(strings) ? strings[trapnr] : "???";
 }
 
-static const char *vec_name(unsigned int vec)
+const char *vec_name(unsigned int vec)
 {
     static const char names[][4] = {
 #define P(x) [X86_EXC_ ## x] = "#" #x
