All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jan Beulich" <JBeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Paul Durrant <paul.durrant@citrix.com>
Subject: Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
Date: Mon, 08 Oct 2018 05:53:29 -0600	[thread overview]
Message-ID: <5BBB453902000078001EF88D@prv1-mh.provo.novell.com> (raw)
In-Reply-To: <b1bf6cb0-f3d3-28fc-2062-e8bf3411c35e@citrix.com>

>>> On 25.09.18 at 17:30, <andrew.cooper3@citrix.com> wrote:
> On 25/09/18 13:41, Jan Beulich wrote:
>>>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
>>> On 13/09/18 11:12, Jan Beulich wrote:
>>>> The function does two translations in one go for a single guest access.
>>>> Any failure of the first translation step (guest linear -> guest
>>>> physical), resulting in #PF, ought to take precedence over any failure
>>>> of the second step (guest physical -> host physical).
>>> Why?  What is the basis of this presumption?
>>>
>>> As far as what real hardware does...
>>>
>>> This test sets up a ballooned page and a read-only page.  I.e. a second
>>> stage fault on the first part of a misaligned access, and a first stage
>>> fault on the second part of the access.
>>>
>>> (d1) --- Xen Test Framework ---
>>> (d1) Environment: HVM 64bit (Long mode 4 levels)
>>> (d1) Test splitfault
>>> (d1) About to read
>>> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
>>> (d1) Reading PTR: got 00000000ffffffff
>>> (d1) About to write
>>> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
>>> (d1) ******************************
>>> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
>>> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
>>> (d1) ******************************
>>>
>>> The second stage fault is recognised first, which is contrary to your
>>> presumption, i.e. the code in its current form appears to be correct.
>> Coming back to this example of yours: As a first step, are we in
>> agreement that with the exception of very complex instructions
>> (FSAVE, FXSAVE, XSAVE etc) instructions are supposed to work in an
>> all-or-nothing manner when it comes to updating of architectural
>> state (be it registers or memory)?
> 
> No.  Read Chapter Intel Vol3 8.1 and 8.2, which makes it quite clear
> that misaligned accesses may be split access, and observably so to other
> processors in the system.
> 
> I've even found a new bit in it which says that >quadword SSE accesses
> may even result in a partial write being completed before #PF is raised.

As said before, I'm all with you with the "may" part for FPU, SSE, and
alike more complex accesses. Short of being able to think of a way to
compare (in a more direct way than via EPT behavior, as your test
did) native and emulated behavior (I had coded something up until I
realized that it doesn't test what I want to test), I've written a small
test for real hardware. Would this debugging patch

--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -87,7 +87,7 @@ GLOBAL(boot_cpu_compat_gdt_table)
  * of physical memory. In any case the VGA hole should be mapped with type UC.
  * Uses 1x 4k page.
  */
-l1_identmap:
+GLOBAL(l1_identmap)
         pfn = 0
         .rept L1_PAGETABLE_ENTRIES
         /* VGA hole (0xa0000-0xc0000) should be mapped UC-. */
--- a/xen/arch/x86/ioport_emulate.c
+++ b/xen/arch/x86/ioport_emulate.c
@@ -112,8 +112,48 @@ static struct dmi_system_id __initdata i
     { }
 };
 
+extern unsigned long l1_identmap[L1_PAGETABLE_ENTRIES];//temp
 static int __init ioport_quirks_init(void)
 {
+ {//temp
+  unsigned char*hole = __va(0xb0000);
+  unsigned long l1e = l1_identmap[0xb0];
+  int i;
+
+  show_page_walk((long)hole);
+
+  l1_identmap[0xb0] &= ~_PAGE_RW;
+  l1_identmap[0xb0] |= _PAGE_PWT;
+  for(;;) {
+   flush_area_all(hole, FLUSH_TLB|FLUSH_TLB_GLOBAL);
+   show_page_walk((long)hole);
+
+   for(i = 3; i > 0; --i) {
+    unsigned long cr2 = 0;
+
+    memset(hole - 4, 0, 4);
+    asm("1: addl $-1, %0; 2:\n\t"
+        ".pushsection .fixup,\"ax\"\n"
+        "3: mov %%cr2, %1; jmp 2b\n\t"
+        ".popsection\n\t"
+        _ASM_EXTABLE(1b, 3b)
+        : "+m" (hole[-i]), "+r" (cr2) :: "memory");
+
+    printk("CR2=%lx data: %02x %02x %02x %02x\n", cr2,
+           hole[-4], hole[-3], hole[-2], hole[-1]);
+   }
+
+   if(l1_identmap[0xb0] & (1UL << (paddr_bits - 1)))
+    break;
+
+   l1_identmap[0xb0] |= ((1UL << paddr_bits) - 1) & PAGE_MASK;
+  }
+
+  l1_identmap[0xb0] = l1e;
+  flush_area_all(hole, FLUSH_TLB|FLUSH_TLB_GLOBAL);
+  show_page_walk((long)hole);
+ }
+
     dmi_check_system(ioport_quirks_tbl);
     return 0;
 }

producing this output

(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 00000000000b0373 ffffffffffffffff
(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 00000000000b0379 ffffffffffffffff
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 0000000ffffff379 ffffffffffffffff
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 00000000000b0373 ffffffffffffffff

convince you that simple (ALU) accesses at the very least do not get
split, and hence our emulation shouldn't do so either. (I admit that I
think I see another change, beyond the patch here, to be necessary
for this to work consistently in all cases: In the "known_gla() returns
true" case we'd also have to perform the second page walk _before_
trying to actually write anything. But I had questioned that
conditional anyway, first in
https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg00155.html
and then with a shortened version left in the description of commit
3bdec530a5 ["x86/HVM: split page straddling emulated accesses in
more cases"]. Paul wanted that behavior to be mirrored to
hvmemul_rmw() instead.)

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  parent reply	other threads:[~2018-10-08 11:53 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-12  9:09 [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case Jan Beulich
2018-09-12 11:51 ` Paul Durrant
2018-09-12 12:13   ` Jan Beulich
2018-09-13 10:12 ` [PATCH v2] " Jan Beulich
2018-09-13 11:06   ` Paul Durrant
2018-09-13 11:39     ` Jan Beulich
2018-09-13 11:41       ` Paul Durrant
2018-09-20 12:41   ` Andrew Cooper
2018-09-20 13:39     ` Jan Beulich
2018-09-20 14:13       ` Andrew Cooper
2018-09-20 14:51         ` Jan Beulich
2018-09-25 12:41     ` Jan Beulich
2018-09-25 15:30       ` Andrew Cooper
2018-09-26  9:27         ` Jan Beulich
2018-10-08 11:53         ` Jan Beulich [this message]
2019-07-31 11:26   ` [Xen-devel] " Alexandru Stefan ISAILA
2023-08-30 14:30 ` [Xen-devel] [PATCH] " Roger Pau Monné
2023-08-30 18:09   ` Andrew Cooper
2023-08-31  7:03     ` Jan Beulich
2023-08-31  8:59       ` Roger Pau Monné
2023-08-31  7:14   ` [Xen-devel] " Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5BBB453902000078001EF88D@prv1-mh.provo.novell.com \
    --to=jbeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=paul.durrant@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.