* [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
@ 2018-09-12  9:09 Jan Beulich
  2018-09-12 11:51 ` Paul Durrant
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Jan Beulich @ 2018-09-12  9:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant

The function does two translations in one go for a single guest access.
Any failure of the first translation step (guest linear -> guest
physical), resulting in #PF, ought to take precedence over any failure
of the second step (guest physical -> host physical). Bail out of the
loop early solely when translation produces HVMTRANS_bad_linear_to_gfn,
and record the most relevant of perhaps multiple different errors
otherwise. (The choice of ZERO_BLOCK_PTR as sentinel is arbitrary.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -531,6 +531,20 @@ static int hvmemul_do_mmio_addr(paddr_t
     return hvmemul_do_io_addr(1, mmio_gpa, reps, size, dir, df, ram_gpa);
 }
 
+static void *update_map_err(void *err, void *new)
+{
+    if ( err == ZERO_BLOCK_PTR || err == ERR_PTR(~X86EMUL_OKAY) )
+        return new;
+
+    if ( new == ERR_PTR(~X86EMUL_OKAY) )
+        return err;
+
+    if ( err == ERR_PTR(~X86EMUL_RETRY) )
+        return new;
+
+    return err;
+}
+
 /*
  * Map the frame(s) covering an individual linear access, for writeable
  * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
@@ -544,7 +558,7 @@ static void *hvmemul_map_linear_addr(
     struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
-    void *err, *mapping;
+    void *err = ZERO_BLOCK_PTR, *mapping;
     unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
         (linear >> PAGE_SHIFT) + 1;
     unsigned int i;
@@ -600,27 +614,28 @@ static void *hvmemul_map_linear_addr(
             goto out;
 
         case HVMTRANS_bad_gfn_to_mfn:
-            err = NULL;
-            goto out;
+            err = update_map_err(err, NULL);
+            continue;
 
         case HVMTRANS_gfn_paged_out:
         case HVMTRANS_gfn_shared:
-            err = ERR_PTR(~X86EMUL_RETRY);
-            goto out;
+            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
+            continue;
 
         default:
-            goto unhandleable;
+            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
+            continue;
         }
 
         *mfn++ = page_to_mfn(page);
 
         if ( p2m_is_discard_write(p2mt) )
-        {
-            err = ERR_PTR(~X86EMUL_OKAY);
-            goto out;
-        }
+            err = update_map_err(err, ERR_PTR(~X86EMUL_OKAY));
     }
 
+    if ( err != ZERO_BLOCK_PTR )
+        goto out;
+
     /* Entire access within a single frame? */
     if ( nr_frames == 1 )
         mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
@@ -639,6 +654,7 @@ static void *hvmemul_map_linear_addr(
     return mapping + (linear & ~PAGE_MASK);
 
  unhandleable:
+    ASSERT(err == ZERO_BLOCK_PTR);
     err = ERR_PTR(~X86EMUL_UNHANDLEABLE);
 
  out:
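
For reference, the nr_frames expression visible in the context above counts the
page frames touched by the linear range [linear, linear + bytes). A minimal
standalone sketch of that arithmetic (PAGE_SHIFT of 12 assumed; the addresses
below are purely illustrative, not taken from any particular guest):

#include <assert.h>
#include <stdio.h>

#define PAGE_SHIFT 12

/* Page frames covered by the linear range [linear, linear + bytes). */
static unsigned int nr_frames(unsigned long linear, unsigned int bytes)
{
    return ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
           (linear >> PAGE_SHIFT) + 1;
}

int main(void)
{
    assert(nr_frames(0x11c000, 4096) == 1); /* exactly one full page */
    assert(nr_frames(0x11cffc, 4) == 1);    /* ends at the page boundary */
    assert(nr_frames(0x11cffc, 8) == 2);    /* straddles into the next page */
    assert(nr_frames(0x11cfff, 0) == 1);    /* zero-byte access still counts one */
    printf("nr_frames arithmetic checks out\n");
    return 0;
}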






* Re: [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-12  9:09 [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case Jan Beulich
@ 2018-09-12 11:51 ` Paul Durrant
  2018-09-12 12:13   ` Jan Beulich
  2018-09-13 10:12 ` [PATCH v2] " Jan Beulich
  2023-08-30 14:30 ` [Xen-devel] [PATCH] " Roger Pau Monné
  2 siblings, 1 reply; 21+ messages in thread
From: Paul Durrant @ 2018-09-12 11:51 UTC (permalink / raw)
  To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 10:10
> To: xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Subject: [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-
> page case
> 
> The function does two translations in one go for a single guest access.
> Any failure of the first translation step (guest linear -> guest
> physical), resulting in #PF, ought to take precedence over any failure
> of the second step (guest physical -> host physical). Bail out of the
> loop early solely when translation produces HVMTRANS_bad_linear_to_gfn,
> and record the most relevant of perhaps multiple different errors
> otherwise. (The choice of ZERO_BLOCK_PTR as sentinel is arbitrary.)

Could we have a comment perhaps saying what the order of relevance of the errors is? The logic in update_map_err() below is a little hard to follow.

  Paul

> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -531,6 +531,20 @@ static int hvmemul_do_mmio_addr(paddr_t
>      return hvmemul_do_io_addr(1, mmio_gpa, reps, size, dir, df, ram_gpa);
>  }
> 
> +static void *update_map_err(void *err, void *new)
> +{
> +    if ( err == ZERO_BLOCK_PTR || err == ERR_PTR(~X86EMUL_OKAY) )
> +        return new;
> +
> +    if ( new == ERR_PTR(~X86EMUL_OKAY) )
> +        return err;
> +
> +    if ( err == ERR_PTR(~X86EMUL_RETRY) )
> +        return new;
> +
> +    return err;
> +}
> +
>  /*
>   * Map the frame(s) covering an individual linear access, for writeable
>   * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other
> errors
> @@ -544,7 +558,7 @@ static void *hvmemul_map_linear_addr(
>      struct hvm_emulate_ctxt *hvmemul_ctxt)
>  {
>      struct vcpu *curr = current;
> -    void *err, *mapping;
> +    void *err = ZERO_BLOCK_PTR, *mapping;
>      unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
>          (linear >> PAGE_SHIFT) + 1;
>      unsigned int i;
> @@ -600,27 +614,28 @@ static void *hvmemul_map_linear_addr(
>              goto out;
> 
>          case HVMTRANS_bad_gfn_to_mfn:
> -            err = NULL;
> -            goto out;
> +            err = update_map_err(err, NULL);
> +            continue;
> 
>          case HVMTRANS_gfn_paged_out:
>          case HVMTRANS_gfn_shared:
> -            err = ERR_PTR(~X86EMUL_RETRY);
> -            goto out;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
> +            continue;
> 
>          default:
> -            goto unhandleable;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
> +            continue;
>          }
> 
>          *mfn++ = page_to_mfn(page);
> 
>          if ( p2m_is_discard_write(p2mt) )
> -        {
> -            err = ERR_PTR(~X86EMUL_OKAY);
> -            goto out;
> -        }
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_OKAY));
>      }
> 
> +    if ( err != ZERO_BLOCK_PTR )
> +        goto out;
> +
>      /* Entire access within a single frame? */
>      if ( nr_frames == 1 )
>          mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
> @@ -639,6 +654,7 @@ static void *hvmemul_map_linear_addr(
>      return mapping + (linear & ~PAGE_MASK);
> 
>   unhandleable:
> +    ASSERT(err == ZERO_BLOCK_PTR);
>      err = ERR_PTR(~X86EMUL_UNHANDLEABLE);
> 
>   out:
> 
> 
> 



* Re: [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-12 11:51 ` Paul Durrant
@ 2018-09-12 12:13   ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2018-09-12 12:13 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 12.09.18 at 13:51, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 12 September 2018 10:10
>> To: xen-devel <xen-devel@lists.xenproject.org>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
>> <Paul.Durrant@citrix.com>
>> Subject: [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-
>> page case
>> 
>> The function does two translations in one go for a single guest access.
>> Any failure of the first translation step (guest linear -> guest
>> physical), resulting in #PF, ought to take precedence over any failure
>> of the second step (guest physical -> host physical). Bail out of the
>> loop early solely when translation produces HVMTRANS_bad_linear_to_gfn,
>> and record the most relevant of perhaps multiple different errors
>> otherwise. (The choice of ZERO_BLOCK_PTR as sentinel is arbitrary.)
> 
> Could we have comment perhaps saying what the order of relevance of the 
> errors are? The logic in update_map_err() below is a little hard to follow.

Yeah, I was thinking about this meanwhile, and I'm also no longer
certain I've chosen the most sensible "ordering".

Jan




* [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-12  9:09 [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case Jan Beulich
  2018-09-12 11:51 ` Paul Durrant
@ 2018-09-13 10:12 ` Jan Beulich
  2018-09-13 11:06   ` Paul Durrant
                     ` (2 more replies)
  2023-08-30 14:30 ` [Xen-devel] [PATCH] " Roger Pau Monné
  2 siblings, 3 replies; 21+ messages in thread
From: Jan Beulich @ 2018-09-13 10:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant

The function does two translations in one go for a single guest access.
Any failure of the first translation step (guest linear -> guest
physical), resulting in #PF, ought to take precedence over any failure
of the second step (guest physical -> host physical). Bail out of the
loop early solely when translation produces HVMTRANS_bad_linear_to_gfn,
and record the most relevant of perhaps multiple different errors
otherwise. (The choice of ZERO_BLOCK_PTR as sentinel is arbitrary.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Add comment (mapping table) and adjust update_map_err()
    accordingly.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -532,6 +532,36 @@ static int hvmemul_do_mmio_addr(paddr_t
 }
 
 /*
+ * Intended mapping, implemented without table lookup:
+ *
+ * -----------------------------------------
+ * | \ new |       |       |       |       |
+ * |   \   | OKAY  | NULL  | RETRY | UNHND |
+ * | err \ |       |       |       |       |
+ * -----------------------------------------
+ * | OKAY  | OKAY  | NULL  | RETRY | UNHND |
+ * -----------------------------------------
+ * | NULL  | NULL  | NULL  | RETRY | UNHND |
+ * -----------------------------------------
+ * | RETRY | RETRY | RETRY | RETRY | UNHND |
+ * -----------------------------------------
+ * | UNHND | UNHND | UNHND | UNHND | UNHND |
+ * -----------------------------------------
+ */
+static void *update_map_err(void *err, void *new)
+{
+    if ( err == ZERO_BLOCK_PTR || err == ERR_PTR(~X86EMUL_OKAY) ||
+         new == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
+        return new;
+
+    if ( new == ERR_PTR(~X86EMUL_OKAY) ||
+         err == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
+        return err;
+
+    return err ?: new;
+}
+
+/*
  * Map the frame(s) covering an individual linear access, for writeable
  * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
  * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
@@ -544,7 +574,7 @@ static void *hvmemul_map_linear_addr(
     struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
-    void *err, *mapping;
+    void *err = ZERO_BLOCK_PTR, *mapping;
     unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
         (linear >> PAGE_SHIFT) + 1;
     unsigned int i;
@@ -600,27 +630,28 @@ static void *hvmemul_map_linear_addr(
             goto out;
 
         case HVMTRANS_bad_gfn_to_mfn:
-            err = NULL;
-            goto out;
+            err = update_map_err(err, NULL);
+            continue;
 
         case HVMTRANS_gfn_paged_out:
         case HVMTRANS_gfn_shared:
-            err = ERR_PTR(~X86EMUL_RETRY);
-            goto out;
+            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
+            continue;
 
         default:
-            goto unhandleable;
+            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
+            continue;
         }
 
         *mfn++ = page_to_mfn(page);
 
         if ( p2m_is_discard_write(p2mt) )
-        {
-            err = ERR_PTR(~X86EMUL_OKAY);
-            goto out;
-        }
+            err = update_map_err(err, ERR_PTR(~X86EMUL_OKAY));
     }
 
+    if ( err != ZERO_BLOCK_PTR )
+        goto out;
+
     /* Entire access within a single frame? */
     if ( nr_frames == 1 )
         mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
@@ -639,6 +670,7 @@ static void *hvmemul_map_linear_addr(
     return mapping + (linear & ~PAGE_MASK);
 
  unhandleable:
+    ASSERT(err == ZERO_BLOCK_PTR);
     err = ERR_PTR(~X86EMUL_UNHANDLEABLE);
 
  out:
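
To cross-check update_map_err() against the table above, a minimal standalone
harness can be built outside of Xen. The X86EMUL_* values and the
ZERO_BLOCK_PTR definition below are stand-ins chosen only for distinctness,
not Xen's actual definitions; build with gcc, since the GNU ?: extension used
in the patch is kept:

#include <assert.h>
#include <stdio.h>

#define X86EMUL_OKAY          0
#define X86EMUL_UNHANDLEABLE  1
#define X86EMUL_RETRY         3

#define ERR_PTR(v)      ((void *)(long)(v))
#define ZERO_BLOCK_PTR  ((void *)0x10)  /* arbitrary non-NULL, non-ERR_PTR sentinel */

static void *update_map_err(void *err, void *new)
{
    if ( err == ZERO_BLOCK_PTR || err == ERR_PTR(~X86EMUL_OKAY) ||
         new == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
        return new;

    if ( new == ERR_PTR(~X86EMUL_OKAY) ||
         err == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
        return err;

    return err ?: new;
}

int main(void)
{
    void *okay  = ERR_PTR(~X86EMUL_OKAY);          /* write-discard page seen */
    void *null  = NULL;                            /* bad gfn->mfn (e.g. MMIO) */
    void *retry = ERR_PTR(~X86EMUL_RETRY);         /* paged-out / shared gfn */
    void *unhnd = ERR_PTR(~X86EMUL_UNHANDLEABLE);
    void *vals[4] = { okay, null, retry, unhnd };
    unsigned int i;

    /* Sentinel row of the table: the first error seen is simply recorded. */
    for ( i = 0; i < 4; i++ )
        assert(update_map_err(ZERO_BLOCK_PTR, vals[i]) == vals[i]);

    /* Precedence OKAY < NULL < RETRY < UNHND, independent of arrival order. */
    assert(update_map_err(okay, null)   == null);
    assert(update_map_err(null, okay)   == null);
    assert(update_map_err(null, retry)  == retry);
    assert(update_map_err(retry, null)  == retry);
    assert(update_map_err(retry, unhnd) == unhnd);
    assert(update_map_err(unhnd, retry) == unhnd);

    printf("update_map_err() matches the table\n");
    return 0;
}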






* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-13 10:12 ` [PATCH v2] " Jan Beulich
@ 2018-09-13 11:06   ` Paul Durrant
  2018-09-13 11:39     ` Jan Beulich
  2018-09-20 12:41   ` Andrew Cooper
  2019-07-31 11:26   ` [Xen-devel] " Alexandru Stefan ISAILA
  2 siblings, 1 reply; 21+ messages in thread
From: Paul Durrant @ 2018-09-13 11:06 UTC (permalink / raw)
  To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper


> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 13 September 2018 11:13
> To: xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Subject: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-
> page case
> 
> The function does two translations in one go for a single guest access.
> Any failure of the first translation step (guest linear -> guest
> physical), resulting in #PF, ought to take precedence over any failure
> of the second step (guest physical -> host physical). Bail out of the
> loop early solely when translation produces HVMTRANS_bad_linear_to_gfn,
> and record the most relevant of perhaps multiple different errors
> otherwise. (The choice of ZERO_BLOCK_PTR as sentinel is arbitrary.)
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Add comment (mapping table) and adjust update_map_err()
>     accordingly.
> 
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -532,6 +532,36 @@ static int hvmemul_do_mmio_addr(paddr_t
>  }
> 
>  /*
> + * Intended mapping, implemented without table lookup:
> + *
> + * -----------------------------------------
> + * | \ new |       |       |       |       |
> + * |   \   | OKAY  | NULL  | RETRY | UNHND |
> + * | err \ |       |       |       |       |
> + * -----------------------------------------
> + * | OKAY  | OKAY  | NULL  | RETRY | UNHND |
> + * -----------------------------------------
> + * | NULL  | NULL  | NULL  | RETRY | UNHND |
> + * -----------------------------------------
> + * | RETRY | RETRY | RETRY | RETRY | UNHND |
> + * -----------------------------------------
> + * | UNHND | UNHND | UNHND | UNHND | UNHND |
> + * -----------------------------------------
> + */
> +static void *update_map_err(void *err, void *new)
> +{
> +    if ( err == ZERO_BLOCK_PTR || err == ERR_PTR(~X86EMUL_OKAY) ||
> +         new == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
> +        return new;
> +
> +    if ( new == ERR_PTR(~X86EMUL_OKAY) ||
> +         err == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
> +        return err;
> +
> +    return err ?: new;

Took a while to check but that logic does match the table :-)

> +}
> +
> +/*
>   * Map the frame(s) covering an individual linear access, for writeable
>   * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other
> errors
>   * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
> @@ -544,7 +574,7 @@ static void *hvmemul_map_linear_addr(
>      struct hvm_emulate_ctxt *hvmemul_ctxt)
>  {
>      struct vcpu *curr = current;
> -    void *err, *mapping;
> +    void *err = ZERO_BLOCK_PTR, *mapping;

Given the revised logic, can't you just start *err with the value ERR_PTR(~X86EMUL_OKAY) now? You can then lose the extra test in the first if of update_map_err().

  Paul

>      unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
>          (linear >> PAGE_SHIFT) + 1;
>      unsigned int i;
> @@ -600,27 +630,28 @@ static void *hvmemul_map_linear_addr(
>              goto out;
> 
>          case HVMTRANS_bad_gfn_to_mfn:
> -            err = NULL;
> -            goto out;
> +            err = update_map_err(err, NULL);
> +            continue;
> 
>          case HVMTRANS_gfn_paged_out:
>          case HVMTRANS_gfn_shared:
> -            err = ERR_PTR(~X86EMUL_RETRY);
> -            goto out;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
> +            continue;
> 
>          default:
> -            goto unhandleable;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
> +            continue;
>          }
> 
>          *mfn++ = page_to_mfn(page);
> 
>          if ( p2m_is_discard_write(p2mt) )
> -        {
> -            err = ERR_PTR(~X86EMUL_OKAY);
> -            goto out;
> -        }
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_OKAY));
>      }
> 
> +    if ( err != ZERO_BLOCK_PTR )
> +        goto out;
> +
>      /* Entire access within a single frame? */
>      if ( nr_frames == 1 )
>          mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
> @@ -639,6 +670,7 @@ static void *hvmemul_map_linear_addr(
>      return mapping + (linear & ~PAGE_MASK);
> 
>   unhandleable:
> +    ASSERT(err == ZERO_BLOCK_PTR);
>      err = ERR_PTR(~X86EMUL_UNHANDLEABLE);
> 
>   out:
> 
> 
> 



* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-13 11:06   ` Paul Durrant
@ 2018-09-13 11:39     ` Jan Beulich
  2018-09-13 11:41       ` Paul Durrant
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2018-09-13 11:39 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 13.09.18 at 13:06, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 13 September 2018 11:13
>> 
>> @@ -544,7 +574,7 @@ static void *hvmemul_map_linear_addr(
>>      struct hvm_emulate_ctxt *hvmemul_ctxt)
>>  {
>>      struct vcpu *curr = current;
>> -    void *err, *mapping;
>> +    void *err = ZERO_BLOCK_PTR, *mapping;
> 
> Given the revised logic, can't you just start *err with the value 
> ERR_PTR(~X86EMUL_OKAY) now? You can then lose the extra test in the first if 
> of update_map_err().

No, I don't think I can, because of ...

>>      unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
>>          (linear >> PAGE_SHIFT) + 1;
>>      unsigned int i;
>> @@ -600,27 +630,28 @@ static void *hvmemul_map_linear_addr(
>>              goto out;
>> 
>>          case HVMTRANS_bad_gfn_to_mfn:
>> -            err = NULL;
>> -            goto out;
>> +            err = update_map_err(err, NULL);
>> +            continue;
>> 
>>          case HVMTRANS_gfn_paged_out:
>>          case HVMTRANS_gfn_shared:
>> -            err = ERR_PTR(~X86EMUL_RETRY);
>> -            goto out;
>> +            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
>> +            continue;
>> 
>>          default:
>> -            goto unhandleable;
>> +            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
>> +            continue;
>>          }
>> 
>>          *mfn++ = page_to_mfn(page);
>> 
>>          if ( p2m_is_discard_write(p2mt) )
>> -        {
>> -            err = ERR_PTR(~X86EMUL_OKAY);
>> -            goto out;
>> -        }
>> +            err = update_map_err(err, ERR_PTR(~X86EMUL_OKAY));

... this and ...

>>      }
>> 
>> +    if ( err != ZERO_BLOCK_PTR )
>> +        goto out;

... this.

Jan




* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-13 11:39     ` Jan Beulich
@ 2018-09-13 11:41       ` Paul Durrant
  0 siblings, 0 replies; 21+ messages in thread
From: Paul Durrant @ 2018-09-13 11:41 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 13 September 2018 12:39
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for
> multi-page case
> 
> >>> On 13.09.18 at 13:06, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 13 September 2018 11:13
> >>
> >> @@ -544,7 +574,7 @@ static void *hvmemul_map_linear_addr(
> >>      struct hvm_emulate_ctxt *hvmemul_ctxt)
> >>  {
> >>      struct vcpu *curr = current;
> >> -    void *err, *mapping;
> >> +    void *err = ZERO_BLOCK_PTR, *mapping;
> >
> > Given the revised logic, can't you just start *err with the value
> > ERR_PTR(~X86EMUL_OKAY) now? You can then lose the extra test in the
> first if
> > of update_map_err().
> 
> No, I don't think I can, because of ...
> 
> >>      unsigned int nr_frames = ((linear + bytes - !!bytes) >>
> PAGE_SHIFT) -
> >>          (linear >> PAGE_SHIFT) + 1;
> >>      unsigned int i;
> >> @@ -600,27 +630,28 @@ static void *hvmemul_map_linear_addr(
> >>              goto out;
> >>
> >>          case HVMTRANS_bad_gfn_to_mfn:
> >> -            err = NULL;
> >> -            goto out;
> >> +            err = update_map_err(err, NULL);
> >> +            continue;
> >>
> >>          case HVMTRANS_gfn_paged_out:
> >>          case HVMTRANS_gfn_shared:
> >> -            err = ERR_PTR(~X86EMUL_RETRY);
> >> -            goto out;
> >> +            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
> >> +            continue;
> >>
> >>          default:
> >> -            goto unhandleable;
> >> +            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
> >> +            continue;
> >>          }
> >>
> >>          *mfn++ = page_to_mfn(page);
> >>
> >>          if ( p2m_is_discard_write(p2mt) )
> >> -        {
> >> -            err = ERR_PTR(~X86EMUL_OKAY);
> >> -            goto out;
> >> -        }
> >> +            err = update_map_err(err, ERR_PTR(~X86EMUL_OKAY));
> 
> ... this and ...
> 
> >>      }
> >>
> >> +    if ( err != ZERO_BLOCK_PTR )
> >> +        goto out;
> 
> ... this.

Ah yes. In which case...

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> 
> Jan
> 



* [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-13 10:12 ` [PATCH v2] " Jan Beulich
  2018-09-13 11:06   ` Paul Durrant
@ 2018-09-20 12:41   ` Andrew Cooper
  2018-09-20 13:39     ` Jan Beulich
  2018-09-25 12:41     ` Jan Beulich
  2019-07-31 11:26   ` [Xen-devel] " Alexandru Stefan ISAILA
  2 siblings, 2 replies; 21+ messages in thread
From: Andrew Cooper @ 2018-09-20 12:41 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Paul Durrant

On 13/09/18 11:12, Jan Beulich wrote:
> The function does two translations in one go for a single guest access.
> Any failure of the first translation step (guest linear -> guest
> physical), resulting in #PF, ought to take precedence over any failure
> of the second step (guest physical -> host physical).

Why?  What is the basis of this presumption?

As far as what real hardware does...

This test sets up a ballooned page and a read-only page.  I.e. a second
stage fault on the first part of a misaligned access, and a first stage
fault on the second part of the access.

(d1) --- Xen Test Framework ---
(d1) Environment: HVM 64bit (Long mode 4 levels)
(d1) Test splitfault
(d1) About to read
(XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
(d1) Reading PTR: got 00000000ffffffff
(d1) About to write
(XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
(d1) ******************************
(d1) PANIC: Unhandled exception at 0008:00000000001047e0
(d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
(d1) ******************************

The second stage fault is recognised first, which is contrary to your
presumption, i.e. the code in its current form appears to be correct.

~Andrew


* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-20 12:41   ` Andrew Cooper
@ 2018-09-20 13:39     ` Jan Beulich
  2018-09-20 14:13       ` Andrew Cooper
  2018-09-25 12:41     ` Jan Beulich
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2018-09-20 13:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant

>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
> On 13/09/18 11:12, Jan Beulich wrote:
>> The function does two translations in one go for a single guest access.
>> Any failure of the first translation step (guest linear -> guest
>> physical), resulting in #PF, ought to take precedence over any failure
>> of the second step (guest physical -> host physical).
> 
> Why?  What is the basis of this presumption?
> 
> As far as what real hardware does...
> 
> This test sets up a ballooned page and a read-only page.  I.e. a second
> stage fault on the first part of a misaligned access, and a first stage
> fault on the second part of the access.
> 
> (d1) --- Xen Test Framework ---
> (d1) Environment: HVM 64bit (Long mode 4 levels)
> (d1) Test splitfault
> (d1) About to read
> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
> (d1) Reading PTR: got 00000000ffffffff
> (d1) About to write
> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
> (d1) ******************************
> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
> (d1) ******************************
> 
> The second stage fault is recognised first, which is contrary to your
> presumption, i.e. the code in its current form appears to be correct.

But the guest doesn't know about 2nd stage translation. In the
absence of it, the (1st stage / only) fault ought to occur before
any bus level actions would be taken. Otherwise the code paths
using the mapping function don't match the paths using recently
introduced linear_{read,write}() in this regard.

Jan




* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-20 13:39     ` Jan Beulich
@ 2018-09-20 14:13       ` Andrew Cooper
  2018-09-20 14:51         ` Jan Beulich
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2018-09-20 14:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant

On 20/09/18 14:39, Jan Beulich wrote:
>>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
>> On 13/09/18 11:12, Jan Beulich wrote:
>>> The function does two translations in one go for a single guest access.
>>> Any failure of the first translation step (guest linear -> guest
>>> physical), resulting in #PF, ought to take precedence over any failure
>>> of the second step (guest physical -> host physical).
>> Why?  What is the basis of this presumption?
>>
>> As far as what real hardware does...
>>
>> This test sets up a ballooned page and a read-only page.  I.e. a second
>> stage fault on the first part of a misaligned access, and a first stage
>> fault on the second part of the access.
>>
>> (d1) --- Xen Test Framework ---
>> (d1) Environment: HVM 64bit (Long mode 4 levels)
>> (d1) Test splitfault
>> (d1) About to read
>> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
>> (d1) Reading PTR: got 00000000ffffffff
>> (d1) About to write
>> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
>> (d1) ******************************
>> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
>> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
>> (d1) ******************************
>>
>> The second stage fault is recognised first, which is contrary to your
>> presumption, i.e. the code in its current form appears to be correct.
> But the guest doesn't know about 2nd stage translation. In the
> absence of it, the (1st stage / only) fault ought to occur before
> any bus level actions would be taken.

You have not answered my question.

Why?  On what basis do you conclude that the behaviour you describe is
"correct", especially now given evidence to the contrary?

~Andrew


* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-20 14:13       ` Andrew Cooper
@ 2018-09-20 14:51         ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2018-09-20 14:51 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant

>>> On 20.09.18 at 16:13, <andrew.cooper3@citrix.com> wrote:
> On 20/09/18 14:39, Jan Beulich wrote:
>>>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
>>> On 13/09/18 11:12, Jan Beulich wrote:
>>>> The function does two translations in one go for a single guest access.
>>>> Any failure of the first translation step (guest linear -> guest
>>>> physical), resulting in #PF, ought to take precedence over any failure
>>>> of the second step (guest physical -> host physical).
>>> Why?  What is the basis of this presumption?
>>>
>>> As far as what real hardware does...
>>>
>>> This test sets up a ballooned page and a read-only page.  I.e. a second
>>> stage fault on the first part of a misaligned access, and a first stage
>>> fault on the second part of the access.
>>>
>>> (d1) --- Xen Test Framework ---
>>> (d1) Environment: HVM 64bit (Long mode 4 levels)
>>> (d1) Test splitfault
>>> (d1) About to read
>>> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
>>> (d1) Reading PTR: got 00000000ffffffff
>>> (d1) About to write
>>> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
>>> (d1) ******************************
>>> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
>>> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
>>> (d1) ******************************
>>>
>>> The second stage fault is recognised first, which is contrary to your
>>> presumption, i.e. the code in its current form appears to be correct.
>> But the guest doesn't know about 2nd stage translation. In the
>> absence of it, the (1st stage / only) fault ought to occur before
>> any bus level actions would be taken.
> 
> You have not answered my question.
> 
> Why?  On what basis do you conclude that the behaviour you describe is
> "correct", especially now given evidence to the contrary?

As to the basis I'm taking: Without it spelled out anywhere, any
sensible behavior can be considered "correct". But let's look at the
steps unpatched code takes:

hvm_translate_get_page() for the tail of the first page produces
HVMTRANS_bad_gfn_to_mfn, so we bail from the loop, returning
NULL. The caller takes this as an indication to write the range in
pieces. Hence a write to the last bytes of the first page occurs (if
it was MMIO instead of a ballooned page) before we raise #PF.

Now let's look at patched code behavior:

hvm_translate_get_page() for the tail of the first page produces
HVMTRANS_bad_gfn_to_mfn again, but we continue the loop.
hvm_translate_get_page() for the start of the second page
produces HVMTRANS_bad_linear_to_gfn, so we raise #PF without
first doing a partial write.

I continue to think that this is the less surprising behavior. Without
it being mandated that the partial write _has_ to occur, I'd much
prefer this changed behavior, no matter how the specific piece of
hardware behaves that you ran your test on.

Jan




* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-20 12:41   ` Andrew Cooper
  2018-09-20 13:39     ` Jan Beulich
@ 2018-09-25 12:41     ` Jan Beulich
  2018-09-25 15:30       ` Andrew Cooper
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2018-09-25 12:41 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant

>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
> On 13/09/18 11:12, Jan Beulich wrote:
>> The function does two translations in one go for a single guest access.
>> Any failure of the first translation step (guest linear -> guest
>> physical), resulting in #PF, ought to take precedence over any failure
>> of the second step (guest physical -> host physical).
> 
> Why?  What is the basis of this presumption?
> 
> As far as what real hardware does...
> 
> This test sets up a ballooned page and a read-only page.  I.e. a second
> stage fault on the first part of a misaligned access, and a first stage
> fault on the second part of the access.
> 
> (d1) --- Xen Test Framework ---
> (d1) Environment: HVM 64bit (Long mode 4 levels)
> (d1) Test splitfault
> (d1) About to read
> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
> (d1) Reading PTR: got 00000000ffffffff
> (d1) About to write
> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
> (d1) ******************************
> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
> (d1) ******************************
> 
> The second stage fault is recognised first, which is contrary to your
> presumption, i.e. the code in its current form appears to be correct.

Coming back to this example of yours: As a first step, are we in
agreement that with the exception of very complex instructions
(FSAVE, FXSAVE, XSAVE etc) instructions are supposed to work in an
all-or-nothing manner when it comes to updating of architectural
state (be it registers or memory)? If you agree, then I cannot see
how avoiding to raise #PF on the second half of a split access can be
correct in your opinion, irrespective of the first vs second level
translation ordering for the two parts. In fact I think there are further
cases where we would need to change code for this to be guaranteed,
but hvmemul_map_linear_addr() can be quite helpful here (once
patched as suggested).

Jan




* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-25 12:41     ` Jan Beulich
@ 2018-09-25 15:30       ` Andrew Cooper
  2018-09-26  9:27         ` Jan Beulich
  2018-10-08 11:53         ` Jan Beulich
  0 siblings, 2 replies; 21+ messages in thread
From: Andrew Cooper @ 2018-09-25 15:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant

On 25/09/18 13:41, Jan Beulich wrote:
>>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
>> On 13/09/18 11:12, Jan Beulich wrote:
>>> The function does two translations in one go for a single guest access.
>>> Any failure of the first translation step (guest linear -> guest
>>> physical), resulting in #PF, ought to take precedence over any failure
>>> of the second step (guest physical -> host physical).
>> Why?  What is the basis of this presumption?
>>
>> As far as what real hardware does...
>>
>> This test sets up a ballooned page and a read-only page.  I.e. a second
>> stage fault on the first part of a misaligned access, and a first stage
>> fault on the second part of the access.
>>
>> (d1) --- Xen Test Framework ---
>> (d1) Environment: HVM 64bit (Long mode 4 levels)
>> (d1) Test splitfault
>> (d1) About to read
>> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
>> (d1) Reading PTR: got 00000000ffffffff
>> (d1) About to write
>> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
>> (d1) ******************************
>> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
>> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
>> (d1) ******************************
>>
>> The second stage fault is recognised first, which is contrary to your
>> presumption, i.e. the code in its current form appears to be correct.
> Coming back to this example of yours: As a first step, are we in
> agreement that with the exception of very complex instructions
> (FSAVE, FXSAVE, XSAVE etc) instructions are supposed to work in an
> all-or-nothing manner when it comes to updating of architectural
> state (be it registers or memory)?

No.  Read Intel Vol3 Chapters 8.1 and 8.2, which make it quite clear
that misaligned accesses may be split accesses, and observably so to other
processors in the system.

I've even found a new bit in it which says that >quadword SSE accesses
may even result in a partial write being completed before #PF is raised.

~Andrew


* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-25 15:30       ` Andrew Cooper
@ 2018-09-26  9:27         ` Jan Beulich
  2018-10-08 11:53         ` Jan Beulich
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2018-09-26  9:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant

>>> On 25.09.18 at 17:30, <andrew.cooper3@citrix.com> wrote:
> On 25/09/18 13:41, Jan Beulich wrote:
>>>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
>>> On 13/09/18 11:12, Jan Beulich wrote:
>>>> The function does two translations in one go for a single guest access.
>>>> Any failure of the first translation step (guest linear -> guest
>>>> physical), resulting in #PF, ought to take precedence over any failure
>>>> of the second step (guest physical -> host physical).
>>> Why?  What is the basis of this presumption?
>>>
>>> As far as what real hardware does...
>>>
>>> This test sets up a ballooned page and a read-only page.  I.e. a second
>>> stage fault on the first part of a misaligned access, and a first stage
>>> fault on the second part of the access.
>>>
>>> (d1) --- Xen Test Framework ---
>>> (d1) Environment: HVM 64bit (Long mode 4 levels)
>>> (d1) Test splitfault
>>> (d1) About to read
>>> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
>>> (d1) Reading PTR: got 00000000ffffffff
>>> (d1) About to write
>>> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
>>> (d1) ******************************
>>> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
>>> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
>>> (d1) ******************************
>>>
>>> The second stage fault is recognised first, which is contrary to your
>>> presumption, i.e. the code in its current form appears to be correct.
>> Coming back to this example of yours: As a first step, are we in
>> agreement that with the exception of very complex instructions
>> (FSAVE, FXSAVE, XSAVE etc) instructions are supposed to work in an
>> all-or-nothing manner when it comes to updating of architectural
>> state (be it registers or memory)?
> 
> No.  Read Chapter Intel Vol3 8.1 and 8.2, which makes it quite clear
> that misaligned accesses may be split access, and observably so to other
> processors in the system.
> 
> I've even found a new bit in it which says that >quadword SSE accesses
> may even result in a partial write being completed before #PF is raised.

But note that this is indeed limited to x87 / SSE insns. And there's
nothing said that this behavior is mandatory. Hence if we emulated
things such that (a) we meet the requirements for MOV and ALU
insns and (b) we make x87 / SSE ones match (a), all would still be
within spec.

Furthermore both sections individually state that LOCKed insns
perform their accesses atomically, regardless of alignment. To me
this implies no #PF when part of the write has already happened
(presumably achieved by the walks needed for the reads already
done as write-access walks). I think hvmemul_rmw() matches
this already, yet even there a possible #PF on the second part of
a split access could be detected and reported without doing two
walks, by way of the change proposed here.

Jan




* Re: [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-25 15:30       ` Andrew Cooper
  2018-09-26  9:27         ` Jan Beulich
@ 2018-10-08 11:53         ` Jan Beulich
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2018-10-08 11:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant

>>> On 25.09.18 at 17:30, <andrew.cooper3@citrix.com> wrote:
> On 25/09/18 13:41, Jan Beulich wrote:
>>>>> On 20.09.18 at 14:41, <andrew.cooper3@citrix.com> wrote:
>>> On 13/09/18 11:12, Jan Beulich wrote:
>>>> The function does two translations in one go for a single guest access.
>>>> Any failure of the first translation step (guest linear -> guest
>>>> physical), resulting in #PF, ought to take precedence over any failure
>>>> of the second step (guest physical -> host physical).
>>> Why?  What is the basis of this presumption?
>>>
>>> As far as what real hardware does...
>>>
>>> This test sets up a ballooned page and a read-only page.  I.e. a second
>>> stage fault on the first part of a misaligned access, and a first stage
>>> fault on the second part of the access.
>>>
>>> (d1) --- Xen Test Framework ---
>>> (d1) Environment: HVM 64bit (Long mode 4 levels)
>>> (d1) Test splitfault
>>> (d1) About to read
>>> (XEN) *** EPT qual 0000000000000181, gpa 000000000011cffc
>>> (d1) Reading PTR: got 00000000ffffffff
>>> (d1) About to write
>>> (XEN) *** EPT qual 0000000000000182, gpa 000000000011cffc
>>> (d1) ******************************
>>> (d1) PANIC: Unhandled exception at 0008:00000000001047e0
>>> (d1) Vec 14 #PF[-d-sWP] %cr2 000000000011d000
>>> (d1) ******************************
>>>
>>> The second stage fault is recognised first, which is contrary to your
>>> presumption, i.e. the code in its current form appears to be correct.
>> Coming back to this example of yours: As a first step, are we in
>> agreement that with the exception of very complex instructions
>> (FSAVE, FXSAVE, XSAVE etc) instructions are supposed to work in an
>> all-or-nothing manner when it comes to updating of architectural
>> state (be it registers or memory)?
> 
> No.  Read Chapter Intel Vol3 8.1 and 8.2, which makes it quite clear
> that misaligned accesses may be split access, and observably so to other
> processors in the system.
> 
> I've even found a new bit in it which says that >quadword SSE accesses
> may even result in a partial write being completed before #PF is raised.

As said before, I'm all with you on the "may" part for FPU, SSE, and
similar more complex accesses. Short of being able to think of a way to
compare (in a more direct way than via EPT behavior, as your test
did) native and emulated behavior (I had coded something up until I
realized that it doesn't test what I want to test), I've written a small
test for real hardware. Would this debugging patch

--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -87,7 +87,7 @@ GLOBAL(boot_cpu_compat_gdt_table)
  * of physical memory. In any case the VGA hole should be mapped with type UC.
  * Uses 1x 4k page.
  */
-l1_identmap:
+GLOBAL(l1_identmap)
         pfn = 0
         .rept L1_PAGETABLE_ENTRIES
         /* VGA hole (0xa0000-0xc0000) should be mapped UC-. */
--- a/xen/arch/x86/ioport_emulate.c
+++ b/xen/arch/x86/ioport_emulate.c
@@ -112,8 +112,48 @@ static struct dmi_system_id __initdata i
     { }
 };
 
+extern unsigned long l1_identmap[L1_PAGETABLE_ENTRIES];//temp
 static int __init ioport_quirks_init(void)
 {
+ {//temp
+  unsigned char*hole = __va(0xb0000);
+  unsigned long l1e = l1_identmap[0xb0];
+  int i;
+
+  show_page_walk((long)hole);
+
+  l1_identmap[0xb0] &= ~_PAGE_RW;
+  l1_identmap[0xb0] |= _PAGE_PWT;
+  for(;;) {
+   flush_area_all(hole, FLUSH_TLB|FLUSH_TLB_GLOBAL);
+   show_page_walk((long)hole);
+
+   for(i = 3; i > 0; --i) {
+    unsigned long cr2 = 0;
+
+    memset(hole - 4, 0, 4);
+    asm("1: addl $-1, %0; 2:\n\t"
+        ".pushsection .fixup,\"ax\"\n"
+        "3: mov %%cr2, %1; jmp 2b\n\t"
+        ".popsection\n\t"
+        _ASM_EXTABLE(1b, 3b)
+        : "+m" (hole[-i]), "+r" (cr2) :: "memory");
+
+    printk("CR2=%lx data: %02x %02x %02x %02x\n", cr2,
+           hole[-4], hole[-3], hole[-2], hole[-1]);
+   }
+
+   if(l1_identmap[0xb0] & (1UL << (paddr_bits - 1)))
+    break;
+
+   l1_identmap[0xb0] |= ((1UL << paddr_bits) - 1) & PAGE_MASK;
+  }
+
+  l1_identmap[0xb0] = l1e;
+  flush_area_all(hole, FLUSH_TLB|FLUSH_TLB_GLOBAL);
+  show_page_walk((long)hole);
+ }
+
     dmi_check_system(ioport_quirks_tbl);
     return 0;
 }

producing this output

(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 00000000000b0373 ffffffffffffffff
(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 00000000000b0379 ffffffffffffffff
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 0000000ffffff379 ffffffffffffffff
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) CR2=ffff8300000b0000 data: 00 00 00 00
(XEN) Pagetable walk from ffff8300000b0000:
(XEN)  L4[0x106] = 80000000cf0ba063 ffffffffffffffff
(XEN)  L3[0x000] = 00000000cf0b4063 ffffffffffffffff
(XEN)  L2[0x000] = 00000000cf0b3063 ffffffffffffffff
(XEN)  L1[0x0b0] = 00000000000b0373 ffffffffffffffff

convince you that simple (ALU) accesses at the very least do not get
split, and hence our emulation shouldn't do so either. (I admit that I
think I see another change, beyond the patch here, to be necessary
for this to work consistently in all cases: In the "known_gla() returns
true" case we'd also have to perform the second page walk _before_
trying to actually write anything. But I had questioned that
conditional anyway, first in
https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg00155.html
and then with a shortened version left in the description of commit
3bdec530a5 ["x86/HVM: split page straddling emulated accesses in
more cases"]. Paul wanted that behavior to be mirrored to
hvmemul_rmw() instead.)

Jan



* Re: [Xen-devel] [PATCH v2] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-13 10:12 ` [PATCH v2] " Jan Beulich
  2018-09-13 11:06   ` Paul Durrant
  2018-09-20 12:41   ` Andrew Cooper
@ 2019-07-31 11:26   ` Alexandru Stefan ISAILA
  2 siblings, 0 replies; 21+ messages in thread
From: Alexandru Stefan ISAILA @ 2019-07-31 11:26 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Andrew Cooper, Paul Durrant



On 13.09.2018 13:12, Jan Beulich wrote:
> The function does two translations in one go for a single guest access.
> Any failure of the first translation step (guest linear -> guest
> physical), resulting in #PF, ought to take precedence over any failure
> of the second step (guest physical -> host physical). Bail out of the
> loop early solely when translation produces HVMTRANS_bad_linear_to_gfn,
> and record the most relevant of perhaps multiple different errors
> otherwise. (The choice of ZERO_BLOCK_PTR as sentinel is arbitrary.)
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

This is useful for adding new functionality to hvmemul_map_linear_addr()

Reviewed-by: Alexandru Isaila <aisaila@bitdefender.com>

> ---
> v2: Add comment (mapping table) and adjust update_map_err()
>      accordingly.
> 
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -532,6 +532,36 @@ static int hvmemul_do_mmio_addr(paddr_t
>   }
>   
>   /*
> + * Intended mapping, implemented without table lookup:
> + *
> + * -----------------------------------------
> + * | \ new |       |       |       |       |
> + * |   \   | OKAY  | NULL  | RETRY | UNHND |
> + * | err \ |       |       |       |       |
> + * -----------------------------------------
> + * | OKAY  | OKAY  | NULL  | RETRY | UNHND |
> + * -----------------------------------------
> + * | NULL  | NULL  | NULL  | RETRY | UNHND |
> + * -----------------------------------------
> + * | RETRY | RETRY | RETRY | RETRY | UNHND |
> + * -----------------------------------------
> + * | UNHND | UNHND | UNHND | UNHND | UNHND |
> + * -----------------------------------------
> + */
> +static void *update_map_err(void *err, void *new)
> +{
> +    if ( err == ZERO_BLOCK_PTR || err == ERR_PTR(~X86EMUL_OKAY) ||
> +         new == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
> +        return new;
> +
> +    if ( new == ERR_PTR(~X86EMUL_OKAY) ||
> +         err == ERR_PTR(~X86EMUL_UNHANDLEABLE) )
> +        return err;
> +
> +    return err ?: new;
> +}
> +
> +/*
>    * Map the frame(s) covering an individual linear access, for writeable
>    * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
>    * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
> @@ -544,7 +574,7 @@ static void *hvmemul_map_linear_addr(
>       struct hvm_emulate_ctxt *hvmemul_ctxt)
>   {
>       struct vcpu *curr = current;
> -    void *err, *mapping;
> +    void *err = ZERO_BLOCK_PTR, *mapping;
>       unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
>           (linear >> PAGE_SHIFT) + 1;
>       unsigned int i;
> @@ -600,27 +630,28 @@ static void *hvmemul_map_linear_addr(
>               goto out;
>   
>           case HVMTRANS_bad_gfn_to_mfn:
> -            err = NULL;
> -            goto out;
> +            err = update_map_err(err, NULL);
> +            continue;
>   
>           case HVMTRANS_gfn_paged_out:
>           case HVMTRANS_gfn_shared:
> -            err = ERR_PTR(~X86EMUL_RETRY);
> -            goto out;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
> +            continue;
>   
>           default:
> -            goto unhandleable;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
> +            continue;
>           }
>   
>           *mfn++ = page_to_mfn(page);
>   
>           if ( p2m_is_discard_write(p2mt) )
> -        {
> -            err = ERR_PTR(~X86EMUL_OKAY);
> -            goto out;
> -        }
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_OKAY));
>       }
>   
> +    if ( err != ZERO_BLOCK_PTR )
> +        goto out;
> +
>       /* Entire access within a single frame? */
>       if ( nr_frames == 1 )
>           mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
> @@ -639,6 +670,7 @@ static void *hvmemul_map_linear_addr(
>       return mapping + (linear & ~PAGE_MASK);
>   
>    unhandleable:
> +    ASSERT(err == ZERO_BLOCK_PTR);
>       err = ERR_PTR(~X86EMUL_UNHANDLEABLE);
>   
>    out:
> 
> 
> 
> 
> 

* Re: [Xen-devel] [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2018-09-12  9:09 [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case Jan Beulich
  2018-09-12 11:51 ` Paul Durrant
  2018-09-13 10:12 ` [PATCH v2] " Jan Beulich
@ 2023-08-30 14:30 ` Roger Pau Monné
  2023-08-30 18:09   ` Andrew Cooper
  2023-08-31  7:14   ` [Xen-devel] " Jan Beulich
  2 siblings, 2 replies; 21+ messages in thread
From: Roger Pau Monné @ 2023-08-30 14:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant

On Wed, Sep 12, 2018 at 03:09:35AM -0600, Jan Beulich wrote:
> The function does two translations in one go for a single guest access.
> Any failure of the first translation step (guest linear -> guest
> physical), resulting in #PF, ought to take precedence over any failure
> of the second step (guest physical -> host physical).

If my understanding is correct, this is done so that any #PF that
would arise from the access is injected to the guest, regardless of
whether there might also be gfn -> mfn errors that would otherwise
prevent the #PF from being detected.

> Bail out of the
> loop early solely when translation produces HVMTRANS_bad_linear_to_gfn,
> and record the most relevant of perhaps multiple different errors
> otherwise. (The choice of ZERO_BLOCK_PTR as sentinel is arbitrary.)
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -531,6 +531,20 @@ static int hvmemul_do_mmio_addr(paddr_t
>      return hvmemul_do_io_addr(1, mmio_gpa, reps, size, dir, df, ram_gpa);
>  }
>  
> +static void *update_map_err(void *err, void *new)
> +{
> +    if ( err == ZERO_BLOCK_PTR || err == ERR_PTR(~X86EMUL_OKAY) )
> +        return new;
> +
> +    if ( new == ERR_PTR(~X86EMUL_OKAY) )
> +        return err;
> +
> +    if ( err == ERR_PTR(~X86EMUL_RETRY) )
> +        return new;
> +
> +    return err;
> +}
> +
>  /*
>   * Map the frame(s) covering an individual linear access, for writeable
>   * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
> @@ -544,7 +558,7 @@ static void *hvmemul_map_linear_addr(
>      struct hvm_emulate_ctxt *hvmemul_ctxt)
>  {
>      struct vcpu *curr = current;
> -    void *err, *mapping;
> +    void *err = ZERO_BLOCK_PTR, *mapping;
>      unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
>          (linear >> PAGE_SHIFT) + 1;
>      unsigned int i;
> @@ -600,27 +614,28 @@ static void *hvmemul_map_linear_addr(
>              goto out;
>  
>          case HVMTRANS_bad_gfn_to_mfn:
> -            err = NULL;
> -            goto out;
> +            err = update_map_err(err, NULL);
> +            continue;
>  
>          case HVMTRANS_gfn_paged_out:
>          case HVMTRANS_gfn_shared:
> -            err = ERR_PTR(~X86EMUL_RETRY);
> -            goto out;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
> +            continue;
>  
>          default:
> -            goto unhandleable;
> +            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
> +            continue;
>          }
>  
>          *mfn++ = page_to_mfn(page);

I have to admit I find it weird that, since we now don't exit the loop
when HVMTRANS_bad_gfn_to_mfn is returned, the item at mfn[0] might
point to the gfn -> mfn translation for the second half of the access.
AFAICT that would happen if the first half of the access fails to
translate with an error other than HVMTRANS_bad_linear_to_gfn and the
second half is successful.

I guess it doesn't matter much, because the vmap below will be skipped,
but it might still be worth adding a comment.

In fact, if the first translation fails, the following ones could use
the cheaper paging_gva_to_gfn(), as we no longer care about getting the
underlying page and just need to figure out whether the access would
trigger a #PF?
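
Something along these lines is what I had in mind; just a rough,
untested sketch, and I'm writing the paging_gva_to_gfn() call and the
local variable names (addr, pfec) from memory, so the details are
likely off:

    /*
     * Sketch only: once a non-#PF error has been recorded for an
     * earlier frame, don't bother fetching the page for the remaining
     * frames; just check whether they would fault on bare metal.
     */
    if ( err != ZERO_BLOCK_PTR )
    {
        uint32_t pfec2 = pfec;

        if ( paging_gva_to_gfn(curr, addr, &pfec2) == gfn_x(INVALID_GFN) &&
             !(pfec2 & (PFEC_page_paged | PFEC_page_shared)) )
        {
            /* Mirror the HVMTRANS_bad_linear_to_gfn handling above. */
            x86_emul_pagefault(pfec2, addr, &hvmemul_ctxt->ctxt);
            err = ERR_PTR(~X86EMUL_EXCEPTION);
            goto out;
        }
        continue;
    }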

Thanks, Roger.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Xen-devel] [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2023-08-30 14:30 ` [Xen-devel] [PATCH] " Roger Pau Monné
@ 2023-08-30 18:09   ` Andrew Cooper
  2023-08-31  7:03     ` Jan Beulich
  2023-08-31  7:14   ` [Xen-devel] " Jan Beulich
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2023-08-30 18:09 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich; +Cc: xen-devel, Paul Durrant

On 30/08/2023 3:30 pm, Roger Pau Monné wrote:
> On Wed, Sep 12, 2018 at 03:09:35AM -0600, Jan Beulich wrote:
>> The function does two translations in one go for a single guest access.
>> Any failure of the first translation step (guest linear -> guest
>> physical), resulting in #PF, ought to take precedence over any failure
>> of the second step (guest physical -> host physical).

Erm... No?

There are up to 25 translation steps, assuming a memory operand
contained entirely within a cache-line.
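
(For reference, my accounting of where the 25 comes from, assuming
4-level guest paging and 4-level EPT:

      5 guest-physical addresses to resolve
          (4 guest paging-structure entries + the data access itself)
    x 5 host accesses per guest-physical address
          (4 EPT entries + the access to the entry or data itself)
    = 25 memory accesses for a single, non-split operand.)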

They intermix between gla->gpa and gpa->spa in a strict order.

There is no point where the error is ambiguous, nor is there ever a
point where a pagewalk continues beyond a faulting condition.

Hardware certainly isn't wasting transistors holding state just to see
whether it could progress further in order to hand back a different
error...


When the pipeline needs to split an access, it has to generate multiple
adjacent memory accesses, because the unit of memory access is a cache line.

There is a total order of accesses in the memory queue, so any fault
on the first byte of the access will be delivered before any fault on
the first byte that crosses into the next cache line.


I'm not necessarily saying that Xen's behaviour in
hvmemul_map_linear_addr() is correct in all cases, but it looks a hell
of a lot more correct in its current form than what this patch presents.

Or do you have a concrete example where you think
hvmemul_map_linear_addr() behaves incorrectly?

~Andrew


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2023-08-30 18:09   ` Andrew Cooper
@ 2023-08-31  7:03     ` Jan Beulich
  2023-08-31  8:59       ` Roger Pau Monné
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2023-08-31  7:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Roger Pau Monné

On 30.08.2023 20:09, Andrew Cooper wrote:
> On 30/08/2023 3:30 pm, Roger Pau Monné wrote:
>> On Wed, Sep 12, 2018 at 03:09:35AM -0600, Jan Beulich wrote:
>>> The function does two translations in one go for a single guest access.
>>> Any failure of the first translation step (guest linear -> guest
>>> physical), resulting in #PF, ought to take precedence over any failure
>>> of the second step (guest physical -> host physical).
> 
> Erm... No?
> 
> There are up to 25 translation steps, assuming a memory operand
> contained entirely within a cache-line.
> 
> They intermix between gla->gpa and gpa->spa in a strict order.

But we're talking about an access crossing a page boundary here.

> There is no point where the error is ambiguous, nor is there ever a
> point where a pagewalk continues beyond a faulting condition.
> 
> Hardware certainly isn't wasting transistors holding state just to see
> whether it could progress further in order to hand back a different
> error...
> 
> 
> When the pipeline needs to split an access, it has to generate multiple
> adjacent memory accesses, because the unit of memory access is a cache line.
> 
> There is a total order of accesses in the memory queue, so any fault
> on the first byte of the access will be delivered before any fault on
> the first byte that crosses into the next cache line.

Looks like we're fundamentally disagreeing on what we try to emulate in
Xen. My view is that the goal ought to be to match, as closely as
possible, how code would behave on bare metal. IOW no consideration of
the GPA -> MA translation steps. Of course in a fully virtualized
environment these necessarily have to occur for the page table accesses
themselves, before the actual memory access can be carried out. But
that's different for the leaf access itself. (In fact I'm not even sure
the architecture guarantees that the two split accesses, or their
associated page walks, always occur in [address] order.)

I'd also like to expand on the "we're": the two R-b I had already got
at the time suggest both reviewers agreed with my way of looking at
things. With Roger's reply that you've responded to here, I'm getting
the impression that he also shares that view.

Of course that still doesn't mean we're right and you're wrong, but if
you think that's the case, it'll take you actually supplying arguments
supporting your view. And since we're talking of an abstract concept
here, resorting to how CPUs actually deal with the same situation
isn't enough. It wouldn't be the first time that they got things
wrong. Plus it may also require you accepting that different views are
possible, without one being strictly wrong and the other strictly
right.

> I'm not necessarily saying that Xen's behaviour in
> hvmemul_map_linear_addr() is correct in all cases, but it looks a hell
> of a lot more correct in its current form than what this patch presents.
> 
> Or do you have a concrete example where you think
> hvmemul_map_linear_addr() behaves incorrectly?

I may not have observed one (the patch has been pending for too long
now for me to still recall in the context of what unrelated work I
noticed the issue; certainly it was a case where I at least suspected
this as the possible cause, and I do recall it was related to some
specific observation with Windows guests), but the description makes
clear enough that any split access crossing a (guest view)
non-faulting/faulting boundary is going to be affected, if the earlier
part of the access would instead cause some 2nd-stage translation issue
on the leaf access.
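
To make that concrete, the kind of (made-up, purely illustrative)
scenario I have in mind is:

  - a write crossing a page boundary, with the last couple of bytes
    landing in the following page;
  - the first page translates fine at the linear -> guest-physical
    level, but its gfn -> mfn lookup fails (say, a paged-out or an
    emulated-MMIO frame);
  - the second page is not present in the guest page tables at all.

On bare metal the guest would simply take a #PF for the second page.
Without the patch the loop bails on the first page's gfn -> mfn
failure, so this function never even looks at the second page and the
#PF can at best surface later via whatever fallback path the caller
takes; with the patch the second iteration still runs and the #PF is
raised right away, taking precedence.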

In fact I think one can even see a guest security aspect here (not an
active issue, but defense-in-depth like): If there's any chance to have
the guest kernel take corrective action, that should be preferred over
Xen potentially taking fatal (to the guest) action (because of whatever
2nd stage translation issue on the lower part of the access). From
that angle the change may not even go far enough yet (thinking e.g.
of a PoD out-of-memory condition on the first part of the access; in
such an event hvm_translate_get_page() unconditionally using
P2M_UNSHARE, and hence implicitly P2M_ALLOC, is also getting in the
way).

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Xen-devel] [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2023-08-30 14:30 ` [Xen-devel] [PATCH] " Roger Pau Monné
  2023-08-30 18:09   ` Andrew Cooper
@ 2023-08-31  7:14   ` Jan Beulich
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2023-08-31  7:14 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant

On 30.08.2023 16:30, Roger Pau Monné wrote:
> On Wed, Sep 12, 2018 at 03:09:35AM -0600, Jan Beulich wrote:
>> The function does two translations in one go for a single guest access.
>> Any failure of the first translation step (guest linear -> guest
>> physical), resulting in #PF, ought to take precedence over any failure
>> of the second step (guest physical -> host physical).
> 
> If my understanding is correct, this is done so that any #PF that
> would arise from the access is injected to the guest, regardless of
> whether there might also be gfn -> mfn errors that would otherwise
> prevent the #PF from being detected.

Yes.

>> @@ -600,27 +614,28 @@ static void *hvmemul_map_linear_addr(
>>              goto out;
>>  
>>          case HVMTRANS_bad_gfn_to_mfn:
>> -            err = NULL;
>> -            goto out;
>> +            err = update_map_err(err, NULL);
>> +            continue;
>>  
>>          case HVMTRANS_gfn_paged_out:
>>          case HVMTRANS_gfn_shared:
>> -            err = ERR_PTR(~X86EMUL_RETRY);
>> -            goto out;
>> +            err = update_map_err(err, ERR_PTR(~X86EMUL_RETRY));
>> +            continue;
>>  
>>          default:
>> -            goto unhandleable;
>> +            err = update_map_err(err, ERR_PTR(~X86EMUL_UNHANDLEABLE));
>> +            continue;
>>          }
>>  
>>          *mfn++ = page_to_mfn(page);
> 
> I have to admit I find it weird that, since we now don't exit the loop
> when HVMTRANS_bad_gfn_to_mfn is returned, the item at mfn[0] might
> point to the gfn -> mfn translation for the second half of the access.
> AFAICT that would happen if the first half of the access fails to
> translate with an error !HVMTRANS_bad_linear_to_gfn and the second
> half is successful.
> 
> I guess it doesn't matter much, because the vmap below will be
> skipped, but it might still be worth adding a comment.

I could add one, but as you say it doesn't matter much, plus - I don't
see a good place where such a comment would go.

> In fact, if the first translation fails the following ones could use
> the cheaper paging_gva_to_gfn(), as we no longer care to get the
> underlying page, and just need to figure out whether the access would
> trigger a #PF?

That would be an option, yes, at the expense of (slightly) more
complicated logic. Of course going that route would then also address
your mfn[0] remark above, without the need for any comment.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case
  2023-08-31  7:03     ` Jan Beulich
@ 2023-08-31  8:59       ` Roger Pau Monné
  0 siblings, 0 replies; 21+ messages in thread
From: Roger Pau Monné @ 2023-08-31  8:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, xen-devel, Paul Durrant

On Thu, Aug 31, 2023 at 09:03:18AM +0200, Jan Beulich wrote:
> On 30.08.2023 20:09, Andrew Cooper wrote:
> > On 30/08/2023 3:30 pm, Roger Pau Monné wrote:
> >> On Wed, Sep 12, 2018 at 03:09:35AM -0600, Jan Beulich wrote:
> >>> The function does two translations in one go for a single guest access.
> >>> Any failure of the first translation step (guest linear -> guest
> >>> physical), resulting in #PF, ought to take precedence over any failure
> >>> of the second step (guest physical -> host physical).
> > 
> > Erm... No?
> > 
> > There are up to 25 translation steps, assuming a memory operand
> > contained entirely within a cache-line.
> > 
> > They intermix between gla->gpa and gpa->spa in a strict order.
> 
> But we're talking about an access crossing a page boundary here.
> 
> > There is no point where the error is ambiguous, nor is there ever a
> > point where a pagewalk continues beyond a faulting condition.
> > 
> > Hardware certainly isn't wasting transistors holding state just to see
> > whether it could progress further in order to hand back a different
> > error...
> > 
> > 
> > When the pipeline needs to split an access, it has to generate multiple
> > adjacent memory accesses, because the unit of memory access is a cache line.
> > 
> > There is a total order of accesses in the memory queue, so any fault
> > on the first byte of the access will be delivered before any fault on
> > the first byte that crosses into the next cache line.
> 
> Looks like we're fundamentally disagreeing on what we try to emulate in
> Xen. My view is that the goal ought to be to match, as closely as
> possible, how code would behave on bare metal. IOW no consideration of
> the GPA -> MA translation steps. Of course in a fully virtualized
> environment these necessarily have to occur for the page table accesses
> themselves, before the actual memory access can be carried out. But
> that's different for the leaf access itself. (In fact I'm not even sure
> the architecture guarantees that the two split accesses, or their
> associated page walks, always occur in [address] order.)
> 
> I'd also like to expand on the "we're": Considering the two R-b I got
> already back at the time, both apparently agreed with my way of looking
> at things. With Roger's reply that you've responded to here, I'm
> getting the impression that he also shares that view.

Ideally the emulator should attempt to replicate the behavior a guest
gets when running with second-stage translation, so that it's not
possible to differentiate emulating an instruction from executing it in
non-root mode. IOW: not only take the ordering of #PF into account,
but also that of EPT_VIOLATION vmexits.

> Of course that still doesn't mean we're right and you're wrong, but if
> you think that's the case, it'll take you actually supplying arguments
> supporting your view. And since we're talking of an abstract concept
> here, resorting to how CPUs actually deal with the same situation
> isn't enough. It wouldn't be the first time that they got things
> wrong. Plus it may also require you potentially accepting that
> different views are possible, without either being strictly wrong and
> the other strictly right.

I don't really have an answer here; in the absence of a written-down
specification from the vendors, I think we should just go with whatever
is easier for us to handle in the hypervisor.

Also, this is such a corner case that I would think any guest
attempting it is likely hitting a BUG or attempting something fishy.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-08-31  9:00 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-12  9:09 [PATCH] x86/HVM: correct hvmemul_map_linear_addr() for multi-page case Jan Beulich
2018-09-12 11:51 ` Paul Durrant
2018-09-12 12:13   ` Jan Beulich
2018-09-13 10:12 ` [PATCH v2] " Jan Beulich
2018-09-13 11:06   ` Paul Durrant
2018-09-13 11:39     ` Jan Beulich
2018-09-13 11:41       ` Paul Durrant
2018-09-20 12:41   ` Andrew Cooper
2018-09-20 13:39     ` Jan Beulich
2018-09-20 14:13       ` Andrew Cooper
2018-09-20 14:51         ` Jan Beulich
2018-09-25 12:41     ` Jan Beulich
2018-09-25 15:30       ` Andrew Cooper
2018-09-26  9:27         ` Jan Beulich
2018-10-08 11:53         ` Jan Beulich
2019-07-31 11:26   ` [Xen-devel] " Alexandru Stefan ISAILA
2023-08-30 14:30 ` [Xen-devel] [PATCH] " Roger Pau Monné
2023-08-30 18:09   ` Andrew Cooper
2023-08-31  7:03     ` Jan Beulich
2023-08-31  8:59       ` Roger Pau Monné
2023-08-31  7:14   ` [Xen-devel] " Jan Beulich
