* [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 10:37 UTC
  To: xen-devel; +Cc: Paul Durrant

These are probably both candidates for back-port.

Paul Durrant (2):
  x86/hvm/ioreq: MMIO range checking completely ignores direction flag
  x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
    boundaries

 xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
 xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
 2 files changed, 26 insertions(+), 6 deletions(-)

-- 
2.11.0



* [PATCH 1/2] x86/hvm/ioreq: MMIO range checking completely ignores direction flag
From: Paul Durrant @ 2018-08-10 10:37 UTC
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

hvm_select_ioreq_server() is used to route an ioreq to the appropriate
ioreq server. For MMIO this is done by comparing the range of the ioreq
to the ranges registered by the device models of each ioreq server.
Unfortunately the calculation of the range of the ioreq completely ignores
the direction flag and may thus yield the wrong range for comparison.
The ioreq may then either be routed to the wrong server or erroneously
terminated by null_ops.
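
For illustration, a sketch of how the range calculation should honour
the direction flag (modelled on the hvm_mmio_first_byte()/
hvm_mmio_last_byte() helpers the fix below uses; this shows the
intended semantics only, not necessarily the verbatim source):

    /* Lowest byte touched by a (possibly rep) MMIO ioreq. */
    static inline paddr_t mmio_first_byte(const ioreq_t *p)
    {
        /* With df set, p->addr is the highest-addressed element. */
        return p->df ? p->addr - ((paddr_t)p->count - 1) * p->size
                     : p->addr;
    }

    /* Highest byte touched by a (possibly rep) MMIO ioreq. */
    static inline paddr_t mmio_last_byte(const ioreq_t *p)
    {
        /* With df set, the first element is also the highest one. */
        return p->df ? p->addr + p->size - 1
                     : p->addr + (paddr_t)p->count * p->size - 1;
    }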

NOTE: The patch also fixes whitespace in the switch statement to make it
      style compliant.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/ioreq.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 7c515b3ef7..940a2c9728 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1353,20 +1353,25 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
 
         switch ( type )
         {
-            unsigned long end;
+            unsigned long start, end;
 
         case XEN_DMOP_IO_RANGE_PORT:
-            end = addr + p->size - 1;
-            if ( rangeset_contains_range(r, addr, end) )
+            start = addr;
+            end = start + p->size - 1;
+            if ( rangeset_contains_range(r, start, end) )
                 return s;
 
             break;
+
         case XEN_DMOP_IO_RANGE_MEMORY:
-            end = addr + (p->size * p->count) - 1;
-            if ( rangeset_contains_range(r, addr, end) )
+            start = hvm_mmio_first_byte(p);
+            end = hvm_mmio_last_byte(p);
+
+            if ( rangeset_contains_range(r, start, end) )
                 return s;
 
             break;
+
         case XEN_DMOP_IO_RANGE_PCI:
             if ( rangeset_contains_singleton(r, addr >> 32) )
             {
-- 
2.11.0



* [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Paul Durrant @ 2018-08-10 10:37 UTC
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

When emulating a rep I/O operation it is possible that the ioreq will
describe a single operation that spans multiple GFNs. This is fine as long
as all those GFNs fall within an MMIO region covered by a single device
model, but unfortunately the higher levels of the emulation code do not
guarantee that. This is something that should almost certainly be fixed,
but in the meantime this patch makes sure that MMIO is truncated at GFN
boundaries and hence the appropriate device model is re-evaluated for each
target GFN.
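
As illustrative arithmetic (example values, not from a real trace): a
forwards (df clear) rep op at p.addr = 0x10ff8 with p.size = 4 has
page offset 0xff8, so at most (PAGE_SIZE - 0xff8) / 4 = 2 reps fit
before the GFN boundary; a backwards (df set) op at the same address
is instead limited to (0xff8 + 4) / 4 reps, i.e. however many whole
elements lie at or below the first one within its page.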

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8385c62145..d6a81ec4d1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -184,8 +184,23 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }
 
-    vio->io_req = p;
+    /*
+     * Make sure that we truncate rep MMIO at any GFN boundary. This is
+     * necessary to ensure that the correct device model is targetted
+     * or that we correctly handle a rep op spanning MMIO and RAM.
+     */
+    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
+    {
+        unsigned long off = p.addr & ~PAGE_MASK;
 
+        p.count = min_t(unsigned long,
+                        p.count,
+                        p.df ?
+                        (off + p.size) / p.size :
+                        (PAGE_SIZE - off) / p.size);
+    }
+
+    vio->io_req = p;
     rc = hvm_io_intercept(&p);
 
     /*
-- 
2.11.0



* Re: [PATCH 1/2] x86/hvm/ioreq: MMIO range checking completely ignores direction flag
From: Andrew Cooper @ 2018-08-10 11:11 UTC
  To: Paul Durrant, xen-devel; +Cc: Jan Beulich

On 10/08/18 11:37, Paul Durrant wrote:
> hvm_select_ioreq_server() is used to route an ioreq to the appropriate
> ioreq server. For MMIO this is done by comparing the range of the ioreq
> to the ranges registered by the device models of each ioreq server.
> Unfortunately the calculation of the range of the ioreq completely ignores
> the direction flag and may thus yield the wrong range for comparison.
> The ioreq may then either be routed to the wrong server or erroneously
> terminated by null_ops.
>
> NOTE: The patch also fixes whitespace in the switch statement to make it
>       style compliant.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Andrew Cooper @ 2018-08-10 11:14 UTC
  To: Paul Durrant, xen-devel; +Cc: Jan Beulich

On 10/08/18 11:37, Paul Durrant wrote:
> When emulating a rep I/O operation it is possible that the ioreq will
> describe a single operation that spans multiple GFNs. This is fine as long
> as all those GFNs fall within an MMIO region covered by a single device
> model, but unfortunately the higher levels of the emulation code do not
> guarantee that. This is something that should almost certainly be fixed,
> but in the meantime this patch makes sure that MMIO is truncated at GFN
> boundaries and hence the appropriate device model is re-evaluated for each
> target GFN.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 8385c62145..d6a81ec4d1 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -184,8 +184,23 @@ static int hvmemul_do_io(
>          hvmtrace_io_assist(&p);
>      }
>  
> -    vio->io_req = p;
> +    /*
> +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> +     * necessary to ensure that the correct device model is targeted
> +     * or that we correctly handle a rep op spanning MMIO and RAM.
> +     */
> +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> +    {
> +        unsigned long off = p.addr & ~PAGE_MASK;
>  
> +        p.count = min_t(unsigned long,
> +                        p.count,
> +                        p.df ?
> +                        (off + p.size) / p.size :
> +                        (PAGE_SIZE - off) / p.size);

(p.df ? (off + p.size) : (PAGE_SIZE - off)) / p.size

?

> +    }
> +
> +    vio->io_req = p;

You'll get a cleaner patch if you retain the newline between these two.

Both can be fixed up on commit.  From a functional point of view, this
looks fine.

~Andrew


* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Paul Durrant @ 2018-08-10 11:50 UTC
  To: Andrew Cooper, xen-devel; +Cc: Jan Beulich

> -----Original Message-----
> From: Andrew Cooper
> Sent: 10 August 2018 12:14
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Jan Beulich <jbeulich@suse.com>
> Subject: Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation
> does not cross GFN boundaries
> 
> On 10/08/18 11:37, Paul Durrant wrote:
> > When emulating a rep I/O operation it is possible that the ioreq will
> > describe a single operation that spans multiple GFNs. This is fine as long
> > as all those GFNs fall within an MMIO region covered by a single device
> > model, but unfortunately the higher levels of the emulation code do not
> > guarantee that. This is something that should almost certainly be fixed,
> > but in the meantime this patch makes sure that MMIO is truncated at GFN
> > boundaries and hence the appropriate device model is re-evaluated for
> each
> > target GFN.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > ---
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > ---
> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> > index 8385c62145..d6a81ec4d1 100644
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -184,8 +184,23 @@ static int hvmemul_do_io(
> >          hvmtrace_io_assist(&p);
> >      }
> >
> > -    vio->io_req = p;
> > +    /*
> > +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> > +     * necessary to ensure that the correct device model is targeted
> > +     * or that we correctly handle a rep op spanning MMIO and RAM.
> > +     */
> > +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> > +    {
> > +        unsigned long off = p.addr & ~PAGE_MASK;
> >
> > +        p.count = min_t(unsigned long,
> > +                        p.count,
> > +                        p.df ?
> > +                        (off + p.size) / p.size :
> > +                        (PAGE_SIZE - off) / p.size);
> 
> (p.df ? (off + p.size) : (PAGE_SIZE - off)) / p.size
> 

Yes, I suppose so.

> ?
> 
> > +    }
> > +
> > +    vio->io_req = p;
> 
> You'll get a cleaner patch if you retain the newline between these two.
> 
> Both can be fixed up on commit.  From a functional point of view, this
> looks fine.
> 

Ok. I'll leave it to you then :-)

  Paul

> ~Andrew

* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Andrew Cooper @ 2018-08-10 11:50 UTC
  To: Paul Durrant, xen-devel; +Cc: Jan Beulich

On 10/08/18 12:50, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper
>> Sent: 10 August 2018 12:14
>> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Subject: Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation
>> does not cross GFN boundaries
>>
>> On 10/08/18 11:37, Paul Durrant wrote:
>>> When emulating a rep I/O operation it is possible that the ioreq will
>>> describe a single operation that spans multiple GFNs. This is fine as long
>>> as all those GFNs fall within an MMIO region covered by a single device
>>> model, but unfortunately the higher levels of the emulation code do not
>>> guarantee that. This is something that should almost certainly be fixed,
>>> but in the meantime this patch makes sure that MMIO is truncated at GFN
>>> boundaries and hence the appropriate device model is re-evaluated for
>> each
>>> target GFN.
>>>
>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>> ---
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>>> index 8385c62145..d6a81ec4d1 100644
>>> --- a/xen/arch/x86/hvm/emulate.c
>>> +++ b/xen/arch/x86/hvm/emulate.c
>>> @@ -184,8 +184,23 @@ static int hvmemul_do_io(
>>>          hvmtrace_io_assist(&p);
>>>      }
>>>
>>> -    vio->io_req = p;
>>> +    /*
>>> +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
>>> +     * necessary to ensure that the correct device model is targeted
>>> +     * or that we correctly handle a rep op spanning MMIO and RAM.
>>> +     */
>>> +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
>>> +    {
>>> +        unsigned long off = p.addr & ~PAGE_MASK;
>>>
>>> +        p.count = min_t(unsigned long,
>>> +                        p.count,
>>> +                        p.df ?
>>> +                        (off + p.size) / p.size :
>>> +                        (PAGE_SIZE - off) / p.size);
>> (p.df ? (off + p.size) : (PAGE_SIZE - off)) / p.size
>>
> Yes, I suppose so.
>
>> ?
>>
>>> +    }
>>> +
>>> +    vio->io_req = p;
>> You'll get a cleaner patch if you retain the newline between these two.
>>
>> Both can be fixed up on commit.  From a functional point of view, this
>> looks fine.
>>
> Ok. I'll leave it to you then :-)

Sure.

~Andrew


* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Jan Beulich @ 2018-08-10 11:59 UTC
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -184,8 +184,23 @@ static int hvmemul_do_io(
>          hvmtrace_io_assist(&p);
>      }
>  
> -    vio->io_req = p;
> +    /*
> +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> +     * necessary to ensure that the correct device model is targeted
> +     * or that we correctly handle a rep op spanning MMIO and RAM.
> +     */
> +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> +    {
> +        unsigned long off = p.addr & ~PAGE_MASK;
>  
> +        p.count = min_t(unsigned long,
> +                        p.count,
> +                        p.df ?
> +                        (off + p.size) / p.size :
> +                        (PAGE_SIZE - off) / p.size);

For misaligned requests you need to make sure p.count doesn't end
up as zero (which can now happen in the forwards case). Or do you
rely on callers (hvmemul_do_io_addr() in particular) splitting such
requests already? Yet in that case it's not clear to me whether
anything needs changing here in the first place. (Similarly in the
backwards case I think the first iteration risks crossing a page
boundary, and then the batch should be clipped to count 1.)
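
(Concrete numbers, for illustration: forwards with an offset of 0xffd
into the page and p.size = 4, (PAGE_SIZE - 0xffd) / 4 = 0, i.e. a zero
count; backwards with an offset of 0xffe and p.size = 4, the first
element covers page offsets 0xffe-0x1001 and so crosses into the next
page even though (off + p.size) / p.size is far above 1.)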

Jan




* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 12:01 UTC
  To: Paul Durrant; +Cc: xen-devel

>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> These are probably both candidates for back-port.
> 
> Paul Durrant (2):
>   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>     boundaries
> 
>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>  2 files changed, 26 insertions(+), 6 deletions(-)

I take it this isn't yet what we've talked about yesterday on irc?

Jan




* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 12:08 UTC
  To: 'Jan Beulich'; +Cc: xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 13:02
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> 
> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> > These are probably both candidates for back-port.
> >
> > Paul Durrant (2):
> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
> >     boundaries
> >
> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
> >  2 files changed, 26 insertions(+), 6 deletions(-)
> 
> I take it this isn't yet what we've talked about yesterday on irc?
> 

This is the band-aid fix. I can now show correct handling of a rep mov walking off MMIO into RAM.

  Paul

> Jan
> 



* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Paul Durrant @ 2018-08-10 12:10 UTC
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 12:59
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation
> does not cross GFN boundaries
> 
> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -184,8 +184,23 @@ static int hvmemul_do_io(
> >          hvmtrace_io_assist(&p);
> >      }
> >
> > -    vio->io_req = p;
> > +    /*
> > +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> > +     * necessary to ensure that the correct device model is targeted
> > +     * or that we correctly handle a rep op spanning MMIO and RAM.
> > +     */
> > +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> > +    {
> > +        unsigned long off = p.addr & ~PAGE_MASK;
> >
> > +        p.count = min_t(unsigned long,
> > +                        p.count,
> > +                        p.df ?
> > +                        (off + p.size) / p.size :
> > +                        (PAGE_SIZE - off) / p.size);
> 
> For misaligned requests you need to make sure p.count doesn't end
> up as zero (which can now happen in the forwards case). Or do you
> rely on callers (hvmemul_do_io_addr() in particular) splitting such
> requests already?

Well, I have a test case where that split is not happening, so a safety check for p.count == 0 should indeed be added at this point.

> Yet in that case it's not clear to me whether
> anything needs changing here in the first place. (Similarly in the
> backwards case I think the first iteration risks crossing a page
> boundary, and then the batch should be clipped to count 1.)
> 

Ok. Sounds like clipping to 1 rep in both circumstances would be best.
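
Something like this, perhaps (just a sketch, using the same variables
as the patch; the tail check covering the single-rep-spans-GFN case is
the new part):

    unsigned int off = p.addr & ~PAGE_MASK;
    unsigned int tail = PAGE_SIZE - off; /* bytes left in the GFN */

    if ( tail < p.size )                 /* single rep spans GFN */
        p.count = 1;
    else
        p.count = min(p.count,
                      (p.df ? (off + p.size) : tail) / p.size);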

  Paul

> Jan
> 



* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 12:13 UTC
  To: Paul Durrant; +Cc: xen-devel

>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 10 August 2018 13:02
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> 
>> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>> > These are probably both candidates for back-port.
>> >
>> > Paul Durrant (2):
>> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>> >     boundaries
>> >
>> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>> >  2 files changed, 26 insertions(+), 6 deletions(-)
>> 
>> I take it this isn't yet what we've talked about yesterday on irc?
>> 
> 
> This is the band-aid fix. I can now show correct handling of a rep mov 
> walking off MMIO into RAM.

But that's not the problem we're having. In our case the bad behavior
is with a single MOV. That's why I had assumed that your plan to fiddle
with null_handler would help in our case as well, while this series clearly
won't (afaict).

Jan




* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 12:22 UTC
  To: 'Jan Beulich'; +Cc: xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 13:13
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> 
> >>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 10 August 2018 13:02
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>
> >> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> >> > These are probably both candidates for back-port.
> >> >
> >> > Paul Durrant (2):
> >> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
> >> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
> >> >     boundaries
> >> >
> >> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
> >> >  2 files changed, 26 insertions(+), 6 deletions(-)
> >>
> >> I take it this isn't yet what we've talked about yesterday on irc?
> >>
> >
> > This is the band-aid fix. I can now show correct handling of a rep mov
> > walking off MMIO into RAM.
> 
> But that's not the problem we're having. In our case the bad behavior
> is with a single MOV. That's why I had assumed that your plan to fiddle
> with null_handler would help in our case as well, while this series clearly
> won't (afaict).
> 

Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour though as I understand it. Am I incorrect?

  Paul

> Jan
> 



* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 12:37 UTC
  To: Paul Durrant; +Cc: xen-devel

>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 10 August 2018 13:13
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> 
>> >>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>> >>  -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 10 August 2018 13:02
>> >> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >>
>> >> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>> >> > These are probably both candidates for back-port.
>> >> >
>> >> > Paul Durrant (2):
>> >> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>> >> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>> >> >     boundaries
>> >> >
>> >> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>> >> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>> >> >  2 files changed, 26 insertions(+), 6 deletions(-)
>> >>
>> >> I take it this isn't yet what we've talked about yesterday on irc?
>> >>
>> >
>> > This is the band-aid fix. I can now show correct handling of a rep mov
>> > walking off MMIO into RAM.
>> 
>> But that's not the problem we're having. In our case the bad behavior
>> is with a single MOV. That's why I had assumed that your plan to fiddle
>> with null_handler would help in our case as well, while this series clearly
>> won't (afaict).
>> 
> 
> Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour though 
> as I understand it. Am I incorrect?

I'm not aware of SDM or PM saying anything like this. Anyway, the
specific case where this is being observed as an issue is when
accessing the last few bytes of a normal RAM page followed by a
ballooned out one. The balloon driver doesn't remove the virtual
mapping of such pages (presumably in order to not shatter super
pages); observation is with the old XenoLinux one, but from code
inspection the upstream one behaves the same.

Unless we want to change the balloon driver's behavior, at least
this specific case needs to be considered having defined behavior,
I think.
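
(To make that concrete, with made-up numbers: an 8-byte read starting
at the last 3 bytes of RAM gfn N, i.e. page offset 0xffd, pulls 3
bytes from the valid page and 5 bytes from the ballooned-out gfn N+1,
which no longer has a backing page; that is the single-MOV spanning
access being talked about here.)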

Jan




* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 12:43 UTC
  To: 'Jan Beulich'; +Cc: xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 13:37
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> 
> >>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 10 August 2018 13:13
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>
> >> >>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
> >> >>  -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 10 August 2018 13:02
> >> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >> >>
> >> >> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> >> >> > These are probably both candidates for back-port.
> >> >> >
> >> >> > Paul Durrant (2):
> >> >> >   x86/hvm/ioreq: MMIO range checking completely ignores direction
> flag
> >> >> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
> >> >> >     boundaries
> >> >> >
> >> >> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >> >> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
> >> >> >  2 files changed, 26 insertions(+), 6 deletions(-)
> >> >>
> >> >> I take it this isn't yet what we've talked about yesterday on irc?
> >> >>
> >> >
> >> > This is the band-aid fix. I can now show correct handling of a rep mov
> >> > walking off MMIO into RAM.
> >>
> >> But that's not the problem we're having. In our case the bad behavior
> >> is with a single MOV. That's why I had assumed that your plan to fiddle
> >> with null_handler would help in our case as well, while this series clearly
> >> won't (afaict).
> >>
> >
> > Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour
> though
> > as I understand it. Am I incorrect?
> 
> I'm not aware of SDM or PM saying anything like this. Anyway, the
> specific case where this is being observed as an issue is when
> accessing the last few bytes of a normal RAM page followed by a
> ballooned out one. The balloon driver doesn't remove the virtual
> mapping of such pages (presumably in order to not shatter super
> pages); observation is with the old XenoLinux one, but from code
> inspection the upstream one behaves the same.
> 
> Unless we want to change the balloon driver's behavior, at least
> this specific case needs to be considered having defined behavior,
> I think.
> 

Ok. I'll see what I can do.

  Paul

> Jan
> 



* Re: [PATCH 0/2] MMIO emulation fixes
From: Andrew Cooper @ 2018-08-10 12:55 UTC
  To: Paul Durrant, 'Jan Beulich'; +Cc: xen-devel

On 10/08/18 13:43, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 10 August 2018 13:37
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>
>>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>>>>  -----Original Message-----
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>> Sent: 10 August 2018 13:13
>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>
>>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>>>>>>  -----Original Message-----
>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>> Sent: 10 August 2018 13:02
>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>
>>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>>>>>>> These are probably both candidates for back-port.
>>>>>>>
>>>>>>> Paul Durrant (2):
>>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores direction
>> flag
>>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>>>>>>>     boundaries
>>>>>>>
>>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>> I take it this isn't yet what we've talked about yesterday on irc?
>>>>>>
>>>>> This is the band-aid fix. I can now show correct handling of a rep mov
>>>>> walking off MMIO into RAM.
>>>> But that's not the problem we're having. In our case the bad behavior
>>>> is with a single MOV. That's why I had assumed that your plan to fiddle
>>>> with null_handler would help in our case as well, while this series clearly
>>>> won't (afaict).
>>>>
>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour
>> though
>>> as I understand it. Am I incorrect?
>> I'm not aware of SDM or PM saying anything like this. Anyway, the
>> specific case where this is being observed as an issue is when
>> accessing the last few bytes of a normal RAM page followed by a
>> ballooned out one. The balloon driver doesn't remove the virtual
>> mapping of such pages (presumably in order to not shatter super
>> pages); observation is with the old XenoLinux one, but from code
>> inspection the upstream one behaves the same.
>>
>> Unless we want to change the balloon driver's behavior, at least
>> this specific case needs to be considered having defined behavior,
>> I think.
>>
> Ok. I'll see what I can do.

It is a software error to try and cross boundaries.  Modern processors
do their best to try and cause the correct behaviour to occur, albeit
with a massive disclaimer about the performance hit.  Older processors
didn't cope.

As far as I'm concerned, it's fine to terminate an emulation which crosses
a boundary with the null ops.

~Andrew


* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 15:08 UTC
  To: Andrew Cooper, 'Jan Beulich'; +Cc: xen-devel

> -----Original Message-----
> From: Andrew Cooper
> Sent: 10 August 2018 13:56
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> <JBeulich@suse.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> 
> On 10/08/18 13:43, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 10 August 2018 13:37
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>
> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
> >>>>  -----Original Message-----
> >>>> From: Jan Beulich [mailto:JBeulich@suse.com]
> >>>> Sent: 10 August 2018 13:13
> >>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>>>
> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
> >>>>>>  -----Original Message-----
> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
> >>>>>> Sent: 10 August 2018 13:02
> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>>>>>
> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> >>>>>>> These are probably both candidates for back-port.
> >>>>>>>
> >>>>>>> Paul Durrant (2):
> >>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores
> direction
> >> flag
> >>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross
> GFN
> >>>>>>>     boundaries
> >>>>>>>
> >>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
> >>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
> >>>>>> I take it this isn't yet what we've talked about yesterday on irc?
> >>>>>>
> >>>>> This is the band-aid fix. I can now show correct handling of a rep mov
> >>>>> walking off MMIO into RAM.
> >>>> But that's not the problem we're having. In our case the bad behavior
> >>>> is with a single MOV. That's why I had assumed that your plan to fiddle
> >>>> with null_handler would help in our case as well, while this series
> clearly
> >>>> won't (afaict).
> >>>>
> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined
> behaviour
> >> though
> >>> as I understand it. Am I incorrect?
> >> I'm not aware of SDM or PM saying anything like this. Anyway, the
> >> specific case where this is being observed as an issue is when
> >> accessing the last few bytes of a normal RAM page followed by a
> >> ballooned out one. The balloon driver doesn't remove the virtual
> >> mapping of such pages (presumably in order to not shatter super
> >> pages); observation is with the old XenoLinux one, but from code
> >> inspection the upstream one behaves the same.
> >>
> >> Unless we want to change the balloon driver's behavior, at least
> >> this specific case needs to be considered having defined behavior,
> >> I think.
> >>
> > Ok. I'll see what I can do.
> 
> It is a software error to try and cross boundaries.  Modern processors
> do their best to try and cause the correct behaviour to occur, albeit
> with a massive disclaimer about the performance hit.  Older processors
> didn't cope.
> 
> As far as I'm concerned, it's fine to terminate an emulation which crosses
> a boundary with the null ops.

Alas we never even get as far as the I/O handlers in some circumstances...

I just set up a variant of an XTF test doing a backwards rep movsd into a well aligned stack buffer where the source buffer starts 1 byte before a boundary between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) detects that both the source and dest of the initial rep are RAM, skips over the I/O emulation calls, and then fails when the hvm_copy_from_guest_phys() (unsurprisingly) fails to grab the 8 bytes for the initial rep.
So, any logic we add to deal with handling page spanning ops is going to have to go in at the top level of instruction emulation... which I fear is going to be quite a major change and not something that's going to be easy to back-port.
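
(Layout of that failing test, with assumed addresses: gfn N is RAM,
gfn N+1 is MMIO, and the source buffer's first element starts at page
offset 0xfff of gfn N, so the very first rep covers 1 byte of RAM and
the rest MMIO, and hvm_copy_from_guest_phys() on the whole element
fails.)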

  Paul

> 
> ~Andrew

* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 15:30 UTC
  To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Andrew Cooper
>> Sent: 10 August 2018 13:56
>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>> <JBeulich@suse.com>
>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> 
>> On 10/08/18 13:43, Paul Durrant wrote:
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 10 August 2018 13:37
>> >> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >>
>> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>> >>>>  -----Original Message-----
>> >>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >>>> Sent: 10 August 2018 13:13
>> >>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >>>>
>> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>> >>>>>>  -----Original Message-----
>> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >>>>>> Sent: 10 August 2018 13:02
>> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >>>>>>
>> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>> >>>>>>> These are probably both candidates for back-port.
>> >>>>>>>
>> >>>>>>> Paul Durrant (2):
>> >>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores
>> direction
>> >> flag
>> >>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross
>> GFN
>> >>>>>>>     boundaries
>> >>>>>>>
>> >>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>> >>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>> >>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
>> >>>>>> I take it this isn't yet what we've talked about yesterday on irc?
>> >>>>>>
>> >>>>> This is the band-aid fix. I can now show correct handling of a rep mov
>> >>>>> walking off MMIO into RAM.
>> >>>> But that's not the problem we're having. In our case the bad behavior
>> >>>> is with a single MOV. That's why I had assumed that your plan to fiddle
>> >>>> with null_handler would help in our case as well, while this series
>> clearly
>> >>>> won't (afaict).
>> >>>>
>> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined
>> behaviour
>> >> though
>> >>> as I understand it. Am I incorrect?
>> >> I'm not aware of SDM or PM saying anything like this. Anyway, the
>> >> specific case where this is being observed as an issue is when
>> >> accessing the last few bytes of a normal RAM page followed by a
>> >> ballooned out one. The balloon driver doesn't remove the virtual
>> >> mapping of such pages (presumably in order to not shatter super
>> >> pages); observation is with the old XenoLinux one, but from code
>> >> inspection the upstream one behaves the same.
>> >>
>> >> Unless we want to change the balloon driver's behavior, at least
>> >> this specific case needs to be considered having defined behavior,
>> >> I think.
>> >>
>> > Ok. I'll see what I can do.
>> 
>> It is a software error to try and cross boundaries.  Modern processors
>> do their best to try and cause the correct behaviour to occur, albeit
>> with a massive disclaimer about the performance hit.  Older processors
>> didn't cope.
>> 
>> As far as I'm concerned, it's fine to terminate an emulation which crosses
>> a boundary with the null ops.
> 
> Alas we never even get as far as the I/O handlers in some circumstances...
> 
> I just set up a variant of an XTF test doing a backwards rep movsd into a 
> well aligned stack buffer where the source buffer starts 1 byte before a boundary 
> between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) detects that 
> both the source and dest of the initial rep are RAM, skips over the I/O 
> emulation calls, and then fails when the hvm_copy_from_guest_phys() 
> (unsurprisingly) fails to grab the 8 bytes for the initial rep.
> So, any logic we add to deal with handling page spanning ops is going to 
> have to go in at the top level of instruction emulation... which I fear is 
> going to be quite a major change and not something that's going to be easy to 
> back-port.

Well, wasn't it clear from the beginning that a proper fix would be too
invasive to backport? And wasn't it for that reason that you intended
to add a small hack first, to deal with just the case(s) that we currently
have issues with?

Jan



* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 15:35 UTC
  To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 16:31
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> 
> >>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Andrew Cooper
> >> Sent: 10 August 2018 13:56
> >> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> >> <JBeulich@suse.com>
> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>
> >> On 10/08/18 13:43, Paul Durrant wrote:
> >> >> -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 10 August 2018 13:37
> >> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >> >>
> >> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
> >> >>>>  -----Original Message-----
> >> >>>> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >>>> Sent: 10 August 2018 13:13
> >> >>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >> >>>>
> >> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
> >> >>>>>>  -----Original Message-----
> >> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >>>>>> Sent: 10 August 2018 13:02
> >> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >> >>>>>>
> >> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> >> >>>>>>> These are probably both candidates for back-port.
> >> >>>>>>>
> >> >>>>>>> Paul Durrant (2):
> >> >>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores
> >> direction
> >> >> flag
> >> >>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross
> >> GFN
> >> >>>>>>>     boundaries
> >> >>>>>>>
> >> >>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >> >>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
> >> >>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
> >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc?
> >> >>>>>>
> >> >>>>> This is the band-aid fix. I can now show correct handling of a rep
> mov
> >> >>>>> walking off MMIO into RAM.
> >> >>>> But that's not the problem we're having. In our case the bad
> behavior
> >> >>>> is with a single MOV. That's why I had assumed that your plan to
> fiddle
> >> >>>> with null_handler would help in our case as well, while this series
> >> clearly
> >> >>>> won't (afaict).
> >> >>>>
> >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined
> >> behaviour
> >> >> though
> >> >>> as I understand it. Am I incorrect?
> >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the
> >> >> specific case where this is being observed as an issue is when
> >> >> accessing the last few bytes of a normal RAM page followed by a
> >> >> ballooned out one. The balloon driver doesn't remove the virtual
> >> >> mapping of such pages (presumably in order to not shatter super
> >> >> pages); observation is with the old XenoLinux one, but from code
> >> >> inspection the upstream one behaves the same.
> >> >>
> >> >> Unless we want to change the balloon driver's behavior, at least
> >> >> this specific case needs to be considered having defined behavior,
> >> >> I think.
> >> >>
> >> > Ok. I'll see what I can do.
> >>
> >> It is a software error to try and cross boundaries.  Modern processors
> >> do their best to try and cause the correct behaviour to occur, albeit
> >> with a massive disclaimer about the performance hit.  Older processors
> >> didn't cope.
> >>
> >> As far as I'm concerned, it's fine to terminate an emulation which crosses
> >> a boundary with the null ops.
> >
> > Alas we never even get as far as the I/O handlers in some circumstances...
> >
> > I just set up a variant of an XTF test doing a backwards rep movsd into a
> > well aligned stack buffer where the source buffer starts 1 byte before a
> boundary
> > between RAM and MMIO. The code in hvmemul_rep_movs() (rightly)
> detects that
> > both the source and dest of the initial rep are RAM, skips over the I/O
> > emulation calls, and then fails when the hvm_copy_from_guest_phys()
> > (unsurprisingly) fails to grab the 8 bytes for the initial rep.
> > So, any logic we add to deal with handling page spanning ops is going to
> > have to go in at the top level of instruction emulation... which I fear is
> > going to be quite a major change and not something that's going to be easy
> to
> > back-port.
> 
> Well, wasn't it clear from the beginning that a proper fix would be too
> invasive to backport? And wasn't it for that reason that you intended
> to add a small hack first, to deal with just the case(s) that we currently
> have issues with?

Well, given that I mistakenly understood you were hitting the same rep issue that I was, I thought I could deal with it in a reasonably straightforward way. Maybe I can still do a point fix for what you are hitting though. What precisely are you hitting? Always a single MOV? And always from a page spanning source to a well aligned dest? Or more combinations than that?

  Paul

> 
> Jan



* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 16:00 UTC
  To: Paul Durrant, George Dunlap; +Cc: Andrew Cooper, xen-devel

>>> On 10.08.18 at 17:35, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 10 August 2018 16:31
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen-
>> devel@lists.xenproject.org>
>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> 
>> >>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote:
>> >>  -----Original Message-----
>> >> From: Andrew Cooper
>> >> Sent: 10 August 2018 13:56
>> >> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>> >> <JBeulich@suse.com>
>> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >>
>> >> On 10/08/18 13:43, Paul Durrant wrote:
>> >> >> -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: 10 August 2018 13:37
>> >> >> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >> >>
>> >> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>> >> >>>>  -----Original Message-----
>> >> >>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >>>> Sent: 10 August 2018 13:13
>> >> >>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >> >>>>
>> >> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>> >> >>>>>>  -----Original Message-----
>> >> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >>>>>> Sent: 10 August 2018 13:02
>> >> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >> >>>>>>
>> >> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>> >> >>>>>>> These are probably both candidates for back-port.
>> >> >>>>>>>
>> >> >>>>>>> Paul Durrant (2):
>> >> >>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores
>> >> direction
>> >> >> flag
>> >> >>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross
>> >> GFN
>> >> >>>>>>>     boundaries
>> >> >>>>>>>
>> >> >>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>> >> >>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>> >> >>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
>> >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc?
>> >> >>>>>>
>> >> >>>>> This is the band-aid fix. I can now show correct handling of a rep
>> mov
>> >> >>>>> walking off MMIO into RAM.
>> >> >>>> But that's not the problem we're having. In our case the bad
>> behavior
>> >> >>>> is with a single MOV. That's why I had assumed that your plan to
>> fiddle
>> >> >>>> with null_handler would help in our case as well, while this series
>> >> clearly
>> >> >>>> won't (afaict).
>> >> >>>>
>> >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined
>> >> behaviour
>> >> >> though
>> >> >>> as I understand it. Am I incorrect?
>> >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the
>> >> >> specific case where this is being observed as an issue is when
>> >> >> accessing the last few bytes of a normal RAM page followed by a
>> >> >> ballooned out one. The balloon driver doesn't remove the virtual
>> >> >> mapping of such pages (presumably in order to not shatter super
>> >> >> pages); observation is with the old XenoLinux one, but from code
>> >> >> inspection the upstream one behaves the same.
>> >> >>
>> >> >> Unless we want to change the balloon driver's behavior, at least
>> >> >> this specific case needs to be considered having defined behavior,
>> >> >> I think.
>> >> >>
>> >> > Ok. I'll see what I can do.
>> >>
>> >> It is a software error to try and cross boundaries.  Modern processors
>> >> do their best to try and cause the correct behaviour to occur, albeit
>> >> with a massive disclaimer about the performance hit.  Older processors
>> >> didn't cope.
>> >>
>> >> As far as I'm concerned, it's fine to terminate an emulation which crosses
>> >> a boundary with the null ops.
>> >
>> > Alas we never even get as far as the I/O handlers in some circumstances...
>> >
>> > I just set up a variant of an XTF test doing a backwards rep movsd into a
>> > well-aligned stack buffer where the source buffer starts 1 byte before a
>> boundary
>> > between RAM and MMIO. The code in hvmemul_rep_movs() (rightly)
>> detects that
>> > both the source and dest of the initial rep are RAM, skips over the I/O
>> > emulation calls, and then fails when the hvm_copy_from_guest_phys()
>> > (unsurprisingly) fails to grab the 8 bytes for the initial rep.
>> > So, any logic we add to deal with handling page-spanning ops is going to
>> > have to go in at the top level of instruction emulation... which I fear is
>> > going to be quite a major change and not something that's going to be easy
>> to
>> > back-port.
>> 
>> Well, wasn't it clear from the beginning that a proper fix would be too
>> invasive to backport? And wasn't it for that reason that you intended
>> to add a small hack first, to deal with just the case(s) that we currently
>> have issues with?
> 
> Well, given that I mistakenly understood you were hitting the same rep issue 
> that I was, I thought I could deal with it in a reasonably straightforward 
> way. Maybe I can still do a point fix for what you are hitting though. What 
> precisely are you hitting? Always a single MOV? And always from a
> page-spanning source to a well-aligned dest? Or more combinations than that?

Always a single, misaligned MOV spanning the boundary from a valid
RAM page to a ballooned out one (Linux's load_unaligned_zeropad()).
I meanwhile wonder though whether it might not be better to address
this at the p2m level, by inserting a r/o mapping of an all-zeros page.
George, do you have any opinion one way or the other?
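
For reference, the semantics load_unaligned_zeropad() relies on can be
sketched in plain C (an illustration only; the actual Linux
implementation does the full-word read with an asm exception fixup):

/* Illustration: little-endian fallback run when the word read spills
 * into a faulting page. Re-read the aligned word containing 'addr'
 * (an aligned word cannot cross a page) and shift it down, so the
 * caller sees the valid low bytes and zeros beyond the boundary. */
static unsigned long zeropad_fallback(const void *addr)
{
    unsigned long offset = (unsigned long)addr & (sizeof(long) - 1);
    unsigned long aligned = (unsigned long)addr & ~(sizeof(long) - 1);
    unsigned long word = *(const unsigned long *)aligned;

    return word >> (8 * offset);
}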

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-10 16:00                     ` Jan Beulich
@ 2018-08-10 16:30                       ` George Dunlap
  2018-08-10 16:37                         ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: George Dunlap @ 2018-08-10 16:30 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant, George Dunlap; +Cc: Andrew Cooper, xen-devel

On 08/10/2018 05:00 PM, Jan Beulich wrote:
>>>> On 10.08.18 at 17:35, <Paul.Durrant@citrix.com> wrote:
>>>  -----Original Message-----
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>> Sent: 10 August 2018 16:31
>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen-
>>> devel@lists.xenproject.org>
>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>
>>>>>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote:
>>>>>  -----Original Message-----
>>>>> From: Andrew Cooper
>>>>> Sent: 10 August 2018 13:56
>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>>>>> <JBeulich@suse.com>
>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>
>>>>> On 10/08/18 13:43, Paul Durrant wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>> Sent: 10 August 2018 13:37
>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>
>>>>>>>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>>>>>>>>>  -----Original Message-----
>>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>>> Sent: 10 August 2018 13:13
>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>>>
>>>>>>>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>>>>>>>>>>>  -----Original Message-----
>>>>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>>>>> Sent: 10 August 2018 13:02
>>>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>>>>>
>>>>>>>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>>>>>>>>>>>> These are probably both candidates for back-port.
>>>>>>>>>>>>
>>>>>>>>>>>> Paul Durrant (2):
>>>>>>>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores
>>>>> direction
>>>>>>> flag
>>>>>>>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross
>>>>> GFN
>>>>>>>>>>>>     boundaries
>>>>>>>>>>>>
>>>>>>>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>>>>>>>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>>>>>>>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc?
>>>>>>>>>>>
>>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep
>>> mov
>>>>>>>>>> walking off MMIO into RAM.
>>>>>>>>> But that's not the problem we're having. In our case the bad
>>> behavior
>>>>>>>>> is with a single MOV. That's why I had assumed that your plan to
>>> fiddle
>>>>>>>>> with null_handler would help in our case as well, while this series
>>>>> clearly
>>>>>>>>> won't (afaict).
>>>>>>>>>
>>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined
>>>>> behaviour
>>>>>>> though
>>>>>>>> as I understand it. Am I incorrect?
>>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the
>>>>>>> specific case where this is being observed as an issue is when
>>>>>>> accessing the last few bytes of a normal RAM page followed by a
>>>>>>> ballooned out one. The balloon driver doesn't remove the virtual
>>>>>>> mapping of such pages (presumably in order to not shatter super
>>>>>>> pages); observation is with the old XenoLinux one, but from code
>>>>>>> inspection the upstream one behaves the same.
>>>>>>>
>>>>>>> Unless we want to change the balloon driver's behavior, at least
>>>>>>> this specific case needs to be considered having defined behavior,
>>>>>>> I think.
>>>>>>>
>>>>>> Ok. I'll see what I can do.
>>>>>
>>>>> It is a software error to try and cross boundaries.  Modern processors
>>>>> do their best to try and cause the correct behaviour to occur, albeit
>>>>> with a massive disclaimer about the performance hit.  Older processors
>>>>> didn't cope.
>>>>>
>>>>> As far as I'm concerned, it's fine to terminate an emulation which crosses
>>>>> a boundary with the null ops.
>>>>
>>>> Alas we never even get as far as the I/O handlers in some circumstances...
>>>>
>>>> I just set up a variant of an XTF test doing a backwards rep movsd into a
>>>> well-aligned stack buffer where the source buffer starts 1 byte before a
>>> boundary
>>>> between RAM and MMIO. The code in hvmemul_rep_movs() (rightly)
>>> detects that
>>>> both the source and dest of the initial rep are RAM, skips over the I/O
>>>> emulation calls, and then fails when the hvm_copy_from_guest_phys()
>>>> (unsurprisingly) fails to grab the 8 bytes for the initial rep.
>>>> So, any logic we add to deal with handling page-spanning ops is going to
>>>> have to go in at the top level of instruction emulation... which I fear is
>>>> going to be quite a major change and not something that's going to be easy
>>> to
>>>> back-port.
>>>
>>> Well, wasn't it clear from the beginning that a proper fix would be too
>>> invasive to backport? And wasn't it for that reason that you intended
>>> to add a small hack first, to deal with just the case(s) that we currently
>>> have issues with?
>>
>> Well, given that I mistakenly understood you were hitting the same rep issue 
>> that I was, I thought I could deal with it in a reasonably straightforward 
>> way. Maybe I can still do a point fix for what you are hitting though. What 
>> precisely are you hitting? Always a single MOV? And always from a
>> page-spanning source to a well-aligned dest? Or more combinations than that?
> 
> Always a single, misaligned MOV spanning the boundary from a valid
> RAM page to a ballooned out one (Linux's load_unaligned_zeropad()).
> I meanwhile wonder though whether it might not be better to address
> this at the p2m level, by inserting a r/o mapping of an all-zeros page.
> George, do you have any opinion one way or the other?

Sorry, what exactly is the issue here?  Linux has a function called
load_unaligned_zeropad() which is reading into a ballooned region?

Fundamentally, a ballooned page is one which has been allocated to a
device driver.  I'm having a hard time coming up with a justification
for having code which reads memory owned by B in the process of reading
memory owned by A.  Or is there some weird architectural reason that I'm
not aware of?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-10 16:30                       ` George Dunlap
@ 2018-08-10 16:37                         ` Andrew Cooper
  2018-08-13  6:50                           ` Jan Beulich
  2018-08-16 10:29                           ` Jan Beulich
  0 siblings, 2 replies; 41+ messages in thread
From: Andrew Cooper @ 2018-08-10 16:37 UTC (permalink / raw)
  To: George Dunlap, Jan Beulich, Paul Durrant, George Dunlap; +Cc: xen-devel

On 10/08/18 17:30, George Dunlap wrote:
> On 08/10/2018 05:00 PM, Jan Beulich wrote:
>>>>> On 10.08.18 at 17:35, <Paul.Durrant@citrix.com> wrote:
>>>>  -----Original Message-----
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>> Sent: 10 August 2018 16:31
>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen-
>>>> devel@lists.xenproject.org>
>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>
>>>>>>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote:
>>>>>>  -----Original Message-----
>>>>>> From: Andrew Cooper
>>>>>> Sent: 10 August 2018 13:56
>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
>>>>>> <JBeulich@suse.com>
>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>
>>>>>> On 10/08/18 13:43, Paul Durrant wrote:
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>> Sent: 10 August 2018 13:37
>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>>
>>>>>>>>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>>>>>>>>>>  -----Original Message-----
>>>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>>>> Sent: 10 August 2018 13:13
>>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>>>>
>>>>>>>>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>>>>>>>>>>>>  -----Original Message-----
>>>>>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>>>>>> Sent: 10 August 2018 13:02
>>>>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>>>>>>>>>>>>> These are probably both candidates for back-port.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Paul Durrant (2):
>>>>>>>>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores
>>>>>> direction
>>>>>>>> flag
>>>>>>>>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross
>>>>>> GFN
>>>>>>>>>>>>>     boundaries
>>>>>>>>>>>>>
>>>>>>>>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>>>>>>>>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>>>>>>>>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc?
>>>>>>>>>>>>
>>>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep
>>>> mov
>>>>>>>>>>> walking off MMIO into RAM.
>>>>>>>>>> But that's not the problem we're having. In our case the bad
>>>> behavior
>>>>>>>>>> is with a single MOV. That's why I had assumed that your plan to
>>>> fiddle
>>>>>>>>>> with null_handler would help in our case as well, while this series
>>>>>> clearly
>>>>>>>>>> won't (afaict).
>>>>>>>>>>
>>>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined
>>>>>> behaviour
>>>>>>>> though
>>>>>>>>> as I understand it. Am I incorrect?
>>>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the
>>>>>>>> specific case where this is being observed as an issue is when
>>>>>>>> accessing the last few bytes of a normal RAM page followed by a
>>>>>>>> ballooned out one. The balloon driver doesn't remove the virtual
>>>>>>>> mapping of such pages (presumably in order to not shatter super
>>>>>>>> pages); observation is with the old XenoLinux one, but from code
>>>>>>>> inspection the upstream one behaves the same.
>>>>>>>>
>>>>>>>> Unless we want to change the balloon driver's behavior, at least
>>>>>>>> this specific case needs to be considered having defined behavior,
>>>>>>>> I think.
>>>>>>>>
>>>>>>> Ok. I'll see what I can do.
>>>>>> It is a software error to try and cross boundaries.  Modern processors
>>>>>> do their best to try and cause the correct behaviour to occur, albeit
>>>>>> with a massive disclaimer about the performance hit.  Older processors
>>>>>> didn't cope.
>>>>>>
>>>>>> As far as I'm concerned, it's fine to terminate an emulation which crosses
>>>>>> a boundary with the null ops.
>>>>> Alas we never even get as far as the I/O handlers in some circumstances...
>>>>>
>>>>> I just set up a variant of an XTF test doing a backwards rep movsd into a
>>>>> well-aligned stack buffer where the source buffer starts 1 byte before a
>>>> boundary
>>>>> between RAM and MMIO. The code in hvmemul_rep_movs() (rightly)
>>>> detects that
>>>>> both the source and dest of the initial rep are RAM, skips over the I/O
>>>>> emulation calls, and then fails when the hvm_copy_from_guest_phys()
>>>>> (unsurprisingly) fails to grab the 8 bytes for the initial rep.
>>>>> So, any logic we add to deal with handling page-spanning ops is going to
>>>>> have to go in at the top level of instruction emulation... which I fear is
>>>>> going to be quite a major change and not something that's going to be easy
>>>> to
>>>>> back-port.
>>>> Well, wasn't it clear from the beginning that a proper fix would be too
>>>> invasive to backport? And wasn't it for that reason that you intended
>>>> to add a small hack first, to deal with just the case(s) that we currently
>>>> have issues with?
>>> Well, given that I mistakenly understood you were hitting the same rep issue 
>>> that I was, I thought I could deal with it in a reasonably straightforward 
>>> way. Maybe I can still do a point fix for what you are hitting though. What 
>>> precisely are you hitting? Always a single MOV? And always from a
>>> page-spanning source to a well-aligned dest? Or more combinations than that?
>> Always a single, misaligned MOV spanning the boundary from a valid
>> RAM page to a ballooned out one (Linux's load_unaligned_zeropad()).
>> I meanwhile wonder though whether it might not be better to address
>> this at the p2m level, by inserting a r/o mapping of an all-zeros page.
>> George, do you have any opinion one way or the other?
> Sorry, what exactly is the issue here?  Linux has a function called
> load_unaligned_zeropad() which is reading into a ballooned region?
>
> Fundamentally, a ballooned page is one which has been allocated to a
> device driver.  I'm having a hard time coming up with a justification
> for having code which reads memory owned by B in the process of reading
> memory owned by A.  Or is there some weird architectural reason that I'm
> not aware of?

The underlying issue is that the emulator can't cope with a single
misaligned access which crosses RAM and MMIO.  It gives up and
presumably throws #UD back.

One longstanding Xen bug is that simply ballooning a page out shouldn't
be able to trigger MMIO emulation to begin with.  It is a side effect of
mixed p2m types, and the fix for this is to have Xen understand the guest
physmap layout.

However, the real bug is Linux making such a misaligned access into a
ballooned out page in the first place.  This is a Linux kernel bug which
(presumably) manifests in a very obvious way due to shortcomings in
Xen's emulation handling.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-10 16:37                         ` Andrew Cooper
@ 2018-08-13  6:50                           ` Jan Beulich
  2018-08-16 11:08                             ` Andrew Cooper
  2018-08-29 10:36                             ` Olaf Hering
  2018-08-16 10:29                           ` Jan Beulich
  1 sibling, 2 replies; 41+ messages in thread
From: Jan Beulich @ 2018-08-13  6:50 UTC (permalink / raw)
  To: Andrew Cooper, George Dunlap; +Cc: xen-devel, Paul Durrant, george.dunlap

>>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote:
> On 10/08/18 17:30, George Dunlap wrote:
>> Sorry, what exactly is the issue here?  Linux has a function called
>> load_unaligned_zeropad() which is reading into a ballooned region?

Yes.

>> Fundamentally, a ballooned page is one which has been allocated to a
>> device driver.  I'm having a hard time coming up with a justification
>> for having code which reads memory owned by B in the process of reading
>> memory owned by A.  Or is there some weird architectural reason that I'm
>> not aware of?

Well, they do this no matter who owns the successive page (or
perhaps at a smaller granularity also the successive allocation).
I guess their goal is to have just a single MOV in the common
case (with the caller ignoring the high bytes it has no interest in),
while recovering gracefully from #PF should one occur.

> The underlying issue is that the emulator can't cope with a single
> misaligned access which crosses RAM and MMIO.  It gives up and
> presumably throws #UD back.

We wouldn't have observed any problem if there was #UD in
such a case, as Linux's fault recovery code doesn't care what
kind of fault has occurred. We're getting back a result of all
ones, even for the part of the read that has actually hit the
last few bytes of the present page.

> One longstanding Xen bug is that simply ballooning a page out shouldn't
> be able to trigger MMIO emulation to begin with.  It is a side effect of
> mixed p2m types, and the fix for this is to have Xen understand the guest
> physmap layout.

And hence the consideration of mapping in an all zeros page
instead. This is because of the way __hvmemul_read() /
__hvm_copy() work: The latter doesn't tell its caller how many
bytes it was able to read, and hence the former considers the
entire range MMIO (and forwards the request for emulation).
Of course all of this is an issue only because
hvmemul_virtual_to_linear() sees no need to split the request
at the page boundary, due to the balloon driver having left in
place the mapping of the ballooned out page.
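
To illustrate what such a split would amount to, a minimal sketch of a
page-boundary clamp (a hypothetical helper, not current Xen code):

/* Hypothetical: clamp an access to the bytes left on its first page,
 * so the RAM part and the (possibly MMIO) remainder could be handled
 * as separate accesses. */
static unsigned int bytes_on_first_page(unsigned long gla,
                                        unsigned int bytes)
{
    unsigned int left = PAGE_SIZE - (gla & (PAGE_SIZE - 1));

    return bytes < left ? bytes : left;
}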

Obviously the opposite case (access starting in a ballooned
out page and crossing into an "ordinary" one) would have a
similar issue, which is presumably even harder to fix without
going the map-a-zero-page route (or Paul's suggested
null_handler hack).

> However, the real bug is Linux making such a misaligned access into a
> ballooned out page in the first place.  This is a Linux kernel bug which
> (presumably) manifests in a very obvious way due to shortcomings in
> Xen's emulation handling.

I wouldn't dare to judge whether this is a bug, especially in
light of the fact that they recover gracefully from the #PF that
might result in the native case. Arguably the caller has to have
some knowledge about what might live in the following page, so as
not to inadvertently hit an MMIO page rather than a non-present
mapping. But I'd leave such judgment to them; our business is to
get a case working that works without Xen underneath.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-10 16:37                         ` Andrew Cooper
  2018-08-13  6:50                           ` Jan Beulich
@ 2018-08-16 10:29                           ` Jan Beulich
  2018-08-16 10:56                             ` Andrew Cooper
  1 sibling, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2018-08-16 10:29 UTC (permalink / raw)
  To: Boris Ostrovsky, Juergen Gross
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel

>>> On 13.08.18 at 08:50,  wrote:
>>>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote:
> > On 10/08/18 17:30, George Dunlap wrote:
> >> Sorry, what exactly is the issue here?  Linux has a function called
> >> load_unaligned_zeropad() which is reading into a ballooned region?
> 
> Yes.
> 
> >> Fundamentally, a ballooned page is one which has been allocated to a
> >> device driver.  I'm having a hard time coming up with a justification
> >> for having code which reads memory owned by B in the process of reading
> >> memory owned by A.  Or is there some weird architectural reason that I'm
> >> not aware of?
> 
> Well, they do this no matter who owns the successive page (or
> perhaps at a smaller granularity also the successive allocation).
> I guess their goal is to have just a single MOV in the common
> case (with the caller ignoring the high bytes it has no interest in),
> while recovering gracefully from #PF should one occur.
> 
> > The underlying issue is that the emulator can't cope with a single
> > misaligned access which crosses RAM and MMIO.  It gives up and
> > presumably throws #UD back.
> 
> We wouldn't have observed any problem if there was #UD in
> such a case, as Linux's fault recovery code doesn't care what
> kind of fault has occurred. We're getting back a result of all
> ones, even for the part of the read that has actually hit the
> last few bytes of the present page.
> 
> > One longstanding Xen bug is that simply ballooning a page out shouldn't
> > be able to trigger MMIO emulation to begin with.  It is a side effect of
> > mixed p2m types, and the fix for this is to have Xen understand the guest
> > physmap layout.
> 
> And hence the consideration of mapping in an all zeros page
> instead. This is because of the way __hvmemul_read() /
> __hvm_copy() work: The latter doesn't tell its caller how many
> bytes it was able to read, and hence the former considers the
> entire range MMIO (and forwards the request for emulation).
> Of course all of this is an issue only because
> hvmemul_virtual_to_linear() sees no need to split the request
> at the page boundary, due to the balloon driver having left in
> place the mapping of the ballooned out page.
> 
> Obviously the opposite case (access starting in a ballooned
> out page and crossing into an "ordinary" one) would have a
> similar issue, which is presumably even harder to fix without
> going the map-a-zero-page route (or Paul's suggested
> null_handler hack).
> 
> > However, the real bug is Linux making such a misaligned access into a
> > ballooned out page in the first place.  This is a Linux kernel bug which
> > (presumably) manifests in a very obvious way due to shortcomings in
> > Xen's emulation handling.
> 
> I wouldn't dare to judge whether this is a bug, especially in
> light of the fact that they recover gracefully from the #PF that
> might result in the native case. Arguably the caller has to have
> some knowledge about what might live in the following page, so as
> not to inadvertently hit an MMIO page rather than a non-present
> mapping. But I'd leave such judgment to them; our business is to
> get a case working that works without Xen underneath.

Following some further discussion with Andrew, he looks to be
convinced that the issue is to be fixed in the balloon driver,
which so far (intentionally afaict) does not remove the direct
map entries for ballooned out pages in the HVM case. I'm not
convinced of this, but I'd nevertheless like to inquire whether
such a change (resulting in shattered super page mappings)
would be acceptable in the first place.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-16 10:29                           ` Jan Beulich
@ 2018-08-16 10:56                             ` Andrew Cooper
  2018-08-16 11:27                               ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2018-08-16 10:56 UTC (permalink / raw)
  To: Jan Beulich, Boris Ostrovsky, Juergen Gross
  Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap

On 16/08/18 11:29, Jan Beulich wrote:
>>>> On 13.08.18 at 08:50,  wrote:
>>>>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote:
>>> On 10/08/18 17:30, George Dunlap wrote:
>>>> Sorry, what exactly is the issue here?  Linux has a function called
>>>> load_unaligned_zeropad() which is reading into a ballooned region?
>> Yes.
>>
>>>> Fundamentally, a ballooned page is one which has been allocated to a
>>>> device driver.  I'm having a hard time coming up with a justification
>>>> for having code which reads memory owned by B in the process of reading
>>>> memory owned by A.  Or is there some weird architectural reason that I'm
>>>> not aware of?
>> Well, they do this no matter who owns the successive page (or
>> perhaps at a smaller granularity also the successive allocation).
>> I guess their goal is to have just a single MOV in the common
>> case (with the caller ignoring the high bytes it has no interest in),
>> while recovering gracefully from #PF should one occur.
>>
>>> The underlying issue is that the emulator can't cope with a single
>>> misaligned access which crosses RAM and MMIO.  It gives up and
>>> presumably throws #UD back.
>> We wouldn't have observed any problem if there was #UD in
>> such a case, as Linux's fault recovery code doesn't care what
>> kind of fault has occurred. We're getting back a result of all
>> ones, even for the part of the read that has actually hit the
>> last few bytes of the present page.
>>
>>> One longstanding Xen bug is that simply ballooning a page out shouldn't
>>> be able to trigger MMIO emulation to begin with.  It is a side effect of
>>> mixed p2m types, and the fix for this is to have Xen understand the guest
>>> physmap layout.
>> And hence the consideration of mapping in an all zeros page
>> instead. This is because of the way __hvmemul_read() /
>> __hvm_copy() work: The latter doesn't tell its caller how many
>> bytes it was able to read, and hence the former considers the
>> entire range MMIO (and forwards the request for emulation).
>> Of course all of this is an issue only because
>> hvmemul_virtual_to_linear() sees no need to split the request
>> at the page boundary, due to the balloon driver having left in
>> place the mapping of the ballooned out page.
>>
>> Obviously the opposite case (access starting in a ballooned
>> out page and crossing into an "ordinary" one) would have a
>> similar issue, which is presumably even harder to fix without
>> going the map-a-zero-page route (or Paul's suggested
>> null_handler hack).
>>
>>> However, the real bug is Linux making such a misaligned access into a
>>> ballooned out page in the first place.  This is a Linux kernel bug which
>>> (presumably) manifests in a very obvious way due to shortcomings in
>>> Xen's emulation handling.
>> I wouldn't dare to judge whether this is a bug, especially in
>> light of the fact that they recover gracefully from the #PF that
>> might result in the native case. Arguably the caller has to have
>> some knowledge about what might live in the following page, so as
>> not to inadvertently hit an MMIO page rather than a non-present
>> mapping. But I'd leave such judgment to them; our business is to
>> get a case working that works without Xen underneath.
> Following some further discussion with Andrew, he looks to be
> convinced that the issue is to be fixed in the balloon driver,
> which so far (intentionally afaict) does not remove the direct
> map entries for ballooned out pages in the HVM case. I'm not
> convinced of this, but I'd nevertheless like to inquire whether
> such a change (resulting in shattered super page mappings)
> would be acceptable in the first place.

We don't tolerate anything else in the directmap pointing to
invalid/unimplemented frames.  Why should ballooning be any different?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-13  6:50                           ` Jan Beulich
@ 2018-08-16 11:08                             ` Andrew Cooper
  2018-08-29 10:36                             ` Olaf Hering
  1 sibling, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2018-08-16 11:08 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap; +Cc: xen-devel, Paul Durrant, george.dunlap

On 13/08/18 07:50, Jan Beulich wrote:
>>>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote:
>> On 10/08/18 17:30, George Dunlap wrote:
>>> Sorry, what exactly is the issue here?  Linux has a function called
>>> load_unaligned_zeropad() which is reading into a ballooned region?
> Yes.
>
>>> Fundamentally, a ballooned page is one which has been allocated to a
>>> device driver.  I'm having a hard time coming up with a justification
>>> for having code which reads memory owned by B in the process of reading
>>> memory owned by A.  Or is there some weird architectural reason that I'm
>>> not aware of?
> Well, they do this no matter who owns the successive page (or
> perhaps at a smaller granularity also the successive allocation).
> I guess their goal is to have just a single MOV in the common
> case (with the caller ignoring the high bytes it has no interest in),
> while recovering gracefully from #PF should one occur.
>
>> The underlying issue is that the emulator can't cope with a single
>> misaligned access which crosses RAM and MMIO.  It gives up and
>> presumably throws #UD back.
> We wouldn't have observed any problem if there was #UD in
> such a case, as Linux's fault recovery code doesn't care what
> kind of fault has occurred. We're getting back a result of all
> ones, even for the part of the read that has actually hit the
> last few bytes of the present page.
>
>> One longstanding Xen bug is that simply ballooning a page out shouldn't
>> be able to trigger MMIO emulation to begin with.  It is a side effect of
>> mixed p2m types, and the fix for this is to have Xen understand the guest
>> physmap layout.
> And hence the consideration of mapping in an all zeros page
> instead. This is because of the way __hvmemul_read() /
> __hvm_copy() work: The latter doesn't tell its caller how many
> bytes it was able to read, and hence the former considers the
> entire range MMIO (and forwards the request for emulation).
> Of course all of this is an issue only because
> hvmemul_virtual_to_linear() sees no need to split the request
> at the page boundary, due to the balloon driver having left in
> place the mapping of the ballooned out page.

Actually, the more I think about this, the more emulating a zero page
looks like a bad idea.

It gives the illusion of a working piece of zeroed RAM, except that
writes definitely can't take effect.  It's going to make bugs even more
subtle.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-16 10:56                             ` Andrew Cooper
@ 2018-08-16 11:27                               ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2018-08-16 11:27 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, George Dunlap, george.dunlap, Paul Durrant,
	xen-devel, Boris Ostrovsky

>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote:
> On 16/08/18 11:29, Jan Beulich wrote:
>> Following some further discussion with Andrew, he looks to be
>> convinced that the issue is to be fixed in the balloon driver,
>> which so far (intentionally afaict) does not remove the direct
>> map entries for ballooned out pages in the HVM case. I'm not
>> convinced of this, but I'd nevertheless like to inquire whether
>> such a change (resulting in shattered super page mappings)
>> would be acceptable in the first place.
> 
> We don't tolerate anything else in the directmap pointing to
> invalid/unimplemented frames.  Why should ballooning be any different?

Because ballooning is something virtualization-specific, which
doesn't have any equivalent on bare hardware (memory hot
unplug doesn't come quite close enough imo, not least
because that doesn't work on a page-granular basis). Hence
we're to define the exact behavior here, and hence such a
definition could as well include special behavior of accesses
to the involved guest-physical addresses.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
       [not found]                   ` <5B755FBC02000078001DEDBF@suse.com>
@ 2018-08-16 12:52                     ` Juergen Gross
  2018-09-04 16:11                     ` Juergen Gross
  1 sibling, 0 replies; 41+ messages in thread
From: Juergen Gross @ 2018-08-16 12:52 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: George Dunlap, xen-devel, Boris Ostrovsky, Paul Durrant, George Dunlap

On 16/08/18 13:27, Jan Beulich wrote:
>>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote:
>> On 16/08/18 11:29, Jan Beulich wrote:
>>> Following some further discussion with Andrew, he looks to be
>>> convinced that the issue is to be fixed in the balloon driver,
>>> which so far (intentionally afaict) does not remove the direct
>>> map entries for ballooned out pages in the HVM case. I'm not
>>> convinced of this, but I'd nevertheless like to inquire whether
>>> such a change (resulting in shattered super page mappings)
>>> would be acceptable in the first place.
>>
>> We don't tolerate anything else in the directmap pointing to
>> invalid/unimplemented frames.  Why should ballooning be any different?
> 
> Because ballooning is something virtualization-specific, which
> doesn't have any equivalent on bare hardware (memory hot
> unplug doesn't come quite close enough imo, not least
> because that doesn't work on a page-granular basis). Hence
> we're to define the exact behavior here, and hence such a
> definition could as well include special behavior of accesses
> to the involved guest-physical addresses.

If I read drivers/virtio/virtio_balloon.c correctly, KVM does the same
as Xen: ballooned pages are _not_ removed from the direct mappings.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-13  6:50                           ` Jan Beulich
  2018-08-16 11:08                             ` Andrew Cooper
@ 2018-08-29 10:36                             ` Olaf Hering
  2018-08-29 10:53                               ` Andrew Cooper
  2018-08-30  8:10                               ` Olaf Hering
  1 sibling, 2 replies; 41+ messages in thread
From: Olaf Hering @ 2018-08-29 10:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 749 bytes --]

On Mon, Aug 13, Jan Beulich wrote:

> And hence the consideration of mapping in an all zeros page
> instead. This is because of the way __hvmemul_read() /
> __hvm_copy() work: The latter doesn't tell its caller how many
> bytes it was able to read, and hence the former considers the
> entire range MMIO (and forwards the request for emulation).
> Of course all of this is an issue only because
> hvmemul_virtual_to_linear() sees no need to split the request
> at the page boundary, due to the balloon driver having left in
> place the mapping of the ballooned out page.

Should perhaps __hvm_copy detect the fault and copy 0xff for the
unavailable page into 'buf', and finally return success?

Clearly something must be done at the Xen level.
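
As a sketch, the suggestion amounts to something like the following in
the copy routine (a hypothetical helper, not current code):

/* Hypothetical: 'done' bytes were copied before the unavailable page
 * was hit; pad the remainder and report success to the caller. */
static int pad_unavailable_tail(uint8_t *buf, unsigned int done,
                                unsigned int bytes)
{
    memset(buf + done, 0xff, bytes - done);
    return HVMTRANS_okay;
}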

Olaf

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-29 10:36                             ` Olaf Hering
@ 2018-08-29 10:53                               ` Andrew Cooper
  2018-08-29 11:00                                 ` Olaf Hering
  2018-08-30  8:10                               ` Olaf Hering
  1 sibling, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2018-08-29 10:53 UTC (permalink / raw)
  To: Olaf Hering, Jan Beulich
  Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap

On 29/08/18 11:36, Olaf Hering wrote:
> On Mon, Aug 13, Jan Beulich wrote:
>
>> And hence the consideration of mapping in an all zeros page
>> instead. This is because of the way __hvmemul_read() /
>> __hvm_copy() work: The latter doesn't tell its caller how many
>> bytes it was able to read, and hence the former considers the
>> entire range MMIO (and forwards the request for emulation).
>> Of course all of this is an issue only because
>> hvmemul_virtual_to_linear() sees no need to split the request
>> at the page boundary, due to the balloon driver having left in
>> place the mapping of the ballooned out page.
> Should perhaps __hvm_copy detect the fault and copy 0xff for the
> unavailable page into 'buf', and finally return success?
>
> Clearly something must be done at the Xen level.

This is first and foremost a Linux bug.  No amount of fixing Xen is going
to alter that.

Architecturally speaking, handing #MC back is probably the closest we
can get to sensible behaviour, but it is still a bug that Linux is
touching the ballooned out page in the first place.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-29 10:53                               ` Andrew Cooper
@ 2018-08-29 11:00                                 ` Olaf Hering
  2018-08-29 11:09                                   ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Olaf Hering @ 2018-08-29 11:00 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 662 bytes --]

On Wed, Aug 29, Andrew Cooper wrote:

> Architecturally speaking, handing #MC back is probably the closest we
> can get to sensible behaviour, but it is still a bug that Linux is
> touching the ballooned out page in the first place.

Well, the issue is that a read crosses a page boundary. If that were
forbidden, load_unaligned_zeropad() would not exist. It cannot know
what is in the following page. And such page crossing happens also in
the unballooned case. Sadly I cannot trigger the reported NFS bug
myself. But it can be enforced by ballooning enough pages so that an
allocated readdir reply eventually is right in front of a ballooned
page.

Olaf

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-29 11:00                                 ` Olaf Hering
@ 2018-08-29 11:09                                   ` Andrew Cooper
  2018-08-29 11:14                                     ` Andrew Cooper
                                                       ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Andrew Cooper @ 2018-08-29 11:09 UTC (permalink / raw)
  To: Olaf Hering
  Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich

On 29/08/18 12:00, Olaf Hering wrote:
> On Wed, Aug 29, Andrew Cooper wrote:
>
>> Architecturally speaking, handing #MC back is probably the closest we
>> can get to sensible behaviour, but it is still a bug that Linux is
>> touching the ballooned out page in the first place.
> Well, the issue is that a read crosses a page boundary. If that were
> forbidden, load_unaligned_zeropad() would not exist. It cannot know
> what is in the following page. And such page crossing happens also in
> the unballooned case. Sadly I cannot trigger the reported NFS bug
> myself. But it can be enforced by ballooning enough pages so that an
> allocated readdir reply eventually is right in front of a ballooned
> page.

The Linux bug is that it doesn't shoot the ballooned page out of the
directmap.  Linux should be taking a fatal #PF for that read, because
it's a virtual mapping for a frame which Linux has voluntarily elected
to make invalid.

As Xen can't prevent Linux from making/maintaining such an invalid
mapping, throwing #MC back is the next best thing, because terminating
the access with ~0 is just going to hide the bug, and run at a glacial
pace while doing so.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-29 11:09                                   ` Andrew Cooper
@ 2018-08-29 11:14                                     ` Andrew Cooper
  2018-08-29 11:26                                     ` Juergen Gross
       [not found]                                     ` <5B86773A0200004903F324A0@prv1-mh.provo.novell.com>
  2 siblings, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2018-08-29 11:14 UTC (permalink / raw)
  To: Olaf Hering
  Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich

On 29/08/18 12:09, Andrew Cooper wrote:
> On 29/08/18 12:00, Olaf Hering wrote:
>> On Wed, Aug 29, Andrew Cooper wrote:
>>
>>> Architecturally speaking, handing #MC back is probably the closest we
>>> can get to sensible behaviour, but it is still a bug that Linux is
>>> touching the ballooned out page in the first place.
>> Well, the issue is that a read crosses a page boundary. If that were
>> forbidden, load_unaligned_zeropad() would not exist. It cannot know
>> what is in the following page. And such page crossing happens also in
>> the unballooned case. Sadly I cannot trigger the reported NFS bug
>> myself. But it can be enforced by ballooning enough pages so that an
>> allocated readdir reply eventually is right in front of a ballooned
>> page.
> The Linux bug is that it doesn't shoot the ballooned page out of the
> directmap.  Linux should be taking a fatal #PF for that read, because
> it's a virtual mapping for a frame which Linux has voluntarily elected
> to make invalid.
>
> As Xen can't prevent Linux from making/maintaining such an invalid
> mapping, throwing #MC back is the next best thing, because terminating
> the access with ~0 is just going to hide the bug, and run at a glacial
> pace while doing so.

Yes - having looked at load_unaligned_zeropad(), it exists explicitly to
cope with #PF occurring from the page crossing.

This reinforces my opinion that the underlying bug is failing to shoot
ballooned-out pages out of the directmap.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-29 11:09                                   ` Andrew Cooper
  2018-08-29 11:14                                     ` Andrew Cooper
@ 2018-08-29 11:26                                     ` Juergen Gross
       [not found]                                     ` <5B86773A0200004903F324A0@prv1-mh.provo.novell.com>
  2 siblings, 0 replies; 41+ messages in thread
From: Juergen Gross @ 2018-08-29 11:26 UTC (permalink / raw)
  To: Andrew Cooper, Olaf Hering
  Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich

On 29/08/18 13:09, Andrew Cooper wrote:
> On 29/08/18 12:00, Olaf Hering wrote:
>> On Wed, Aug 29, Andrew Cooper wrote:
>>
>>> Architecturally speaking, handing #MC back is probably the closest we
>>> can get to sensible behaviour, but it is still a bug that Linux is
>>> touching the ballooned out page in the first place.
>> Well, the issue is that a read crosses a page boundary. If that were
>> forbidden, load_unaligned_zeropad() would not exist. It cannot know
>> what is in the following page. And such page crossing happens also in
>> the unballooned case. Sadly I cannot trigger the reported NFS bug
>> myself. But it can be enforced by ballooning enough pages so that an
>> allocated readdir reply eventually is right in front of a ballooned
>> page.
> 
> The Linux bug is that it doesn't shoot the ballooned page out of the
> directmap.  Linux should be taking a fatal #PF for that read, because
> it's a virtual mapping for a frame which Linux has voluntarily elected
> to make invalid.
> 
> As Xen can't prevent Linux from making/maintaining such an invalid
> mapping, throwing #MC back is the next best thing, because terminating
> the access with ~0 is just going to hide the bug, and run at a glacial
> pace while doing so.

I think you are right: the kernel should in no case access a random page
without knowing it is RAM.

Hitting a ballooned page is just much more probable than hitting an MMIO
page this way. There are _no_ guard pages around MMIO areas, so it could
in theory happen that load_unaligned_zeropad() would access an MMIO
area, triggering random behavior.

So removing ballooned pages from the directmap just hides an underlying
problem.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
       [not found]                                           ` <5B867F020200009E04E46402@prv1-mh.provo.novell.com>
@ 2018-08-29 12:06                                             ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2018-08-29 12:06 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, xen-devel, Olaf Hering, Paul Durrant, george.dunlap

>>> On 29.08.18 at 13:09, <andrew.cooper3@citrix.com> wrote:
> On 29/08/18 12:00, Olaf Hering wrote:
>> On Wed, Aug 29, Andrew Cooper wrote:
>>
>>> Architecturally speaking, handing #MC back is probably the closest we
>>> can get to sensible behaviour, but it is still a bug that Linux is
>>> touching the ballooned out page in the first place.
>> Well, the issue is that a read crosses a page boundary. If that were
>> forbidden, load_unaligned_zeropad() would not exist. It cannot know
>> what is in the following page. And such page crossing happens also in
>> the unballooned case. Sadly I cannot trigger the reported NFS bug
>> myself. But it can be enforced by ballooning enough pages so that an
>> allocated readdir reply eventually is right in front of a ballooned
>> page.
> 
> The Linux bug is that it doesn't shoot the ballooned page out of the
> directmap.  Linux should be taking a fatal #PF for that read, because
> it's a virtual mapping for a frame which Linux has voluntarily elected
> to make invalid.
> 
> As Xen can't prevent Linux from making/maintaining such an invalid
> mapping, throwing #MC back is the next best thing, because terminating
> the access with ~0 is just going to hide the bug, and run at a glacial
> pace while doing so.

I still do not understand why you think so: Handing back ~0 is far more
correct than raising #MC imo. The x86 architecture is bound to its
history, and in pre-Pentium days there was no #MC to be raised in the
first place. Furthermore, while I can see that _some other_ bug may
be hidden this way, there's no bug at all to be hidden in
load_unaligned_zeropad() (leaving aside the balloon driver behavior).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-29 10:36                             ` Olaf Hering
  2018-08-29 10:53                               ` Andrew Cooper
@ 2018-08-30  8:10                               ` Olaf Hering
  1 sibling, 0 replies; 41+ messages in thread
From: Olaf Hering @ 2018-08-30  8:10 UTC (permalink / raw)
  To: Paul Durrant
  Cc: George Dunlap, Andrew Cooper, george.dunlap, Jan Beulich, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 1244 bytes --]

On Wed, Aug 29, Olaf Hering wrote:

> On Mon, Aug 13, Jan Beulich wrote:
> 
> > And hence the consideration of mapping in an all zeros page
> > instead. This is because of the way __hvmemul_read() /
> > __hvm_copy() work: The latter doesn't tell its caller how many
> > bytes it was able to read, and hence the former considers the
> > entire range MMIO (and forwards the request for emulation).
> > Of course all of this is an issue only because
> > hvmemul_virtual_to_linear() sees no need to split the request
> > at the page boundary, due to the balloon driver having left in
> > place the mapping of the ballooned out page.

So how is this bug supposed to be fixed?

What I see in my tracing is that __hvmemul_read gets called with
gla==ffff880223bffff9/bytes==8. Then hvm_copy_from_guest_linear fills
the buffer from gpa 223bffff9 with data, but finally it returns
HVMTRANS_bad_gfn_to_mfn, which it got from a failed get_page_from_gfn
for the second page.

Now things go downhill. hvmemul_linear_mmio_read is called, which calls
hvmemul_do_io/hvm_io_intercept. That returns X86EMUL_UNHANDLEABLE. As a
result hvm_process_io_intercept(null_handler) is called, which
overwrites the return buffer with 0xff.
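
For the traced access the split is easy to compute:

/* Worked example from the trace above (4k pages):
 * gla = 0xffff880223bffff9, bytes = 8. */
unsigned int first = 0x1000 - (0x223bffff9UL & 0xfff); /* = 7 */
/* 7 bytes sit on the present RAM page; only the 8th falls into the
 * ballooned-out page, yet the whole read is handed to null_handler. */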

Olaf

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
       [not found]                                       ` <5B87A68A0200001C04E5493A@prv1-mh.provo.novell.com>
@ 2018-08-30  8:23                                         ` Jan Beulich
  2018-08-30 10:42                                           ` Olaf Hering
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2018-08-30  8:23 UTC (permalink / raw)
  To: Olaf Hering
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel

>>> On 30.08.18 at 10:10, <olaf@aepfle.de> wrote:
> On Wed, Aug 29, Olaf Hering wrote:
> 
>> On Mon, Aug 13, Jan Beulich wrote:
>> 
>> > And hence the consideration of mapping in an all zeros page
>> > instead. This is because of the way __hvmemul_read() /
>> > __hvm_copy() work: The latter doesn't tell its caller how many
>> > bytes it was able to read, and hence the former considers the
>> > entire range MMIO (and forwards the request for emulation).
>> > Of course all of this is an issue only because
>> > hvmemul_virtual_to_linear() sees no need to split the request
>> > at the page boundary, due to the balloon driver having left in
>> > place the mapping of the ballooned out page.
> 
> So how is this bug supposed to be fixed?
> 
> What I see in my tracing is that __hvmemul_read gets called with
> gla==ffff880223bffff9/bytes==8. Then hvm_copy_from_guest_linear fills
> the buffer from gpa 223bffff9 with data, but finally it returns
> HVMTRANS_bad_gfn_to_mfn, which it got from a failed get_page_from_gfn
> for the second page.
> 
> Now things go downhill. hvmemul_linear_mmio_read is called, which calls
> hvmemul_do_io/hvm_io_intercept. That returns X86EMUL_UNHANDLEABLE. As a
> result hvm_process_io_intercept(null_handler) is called, which
> overwrites the return buffer with 0xff.

There are a number of options (besides fixing the issue on the Linux
side, which I'm still not entirely convinced is the best approach):
One is Paul's idea of making null_handler actually retrieve RAM
contents when (part of) the access touches RAM. Another might be to
make __hvm_copy() report back what parts of the access could be
read/written (so that MMIO emulation would only be triggered for the
missing piece). A third might be to make the splitting of accesses
more intelligent in __hvmemul_read().
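
For the second option, the interface change could look something like
this (a hypothetical signature only, not current code):

/* Hypothetical: as __hvm_copy(), but report how many bytes were
 * actually transferred, so the caller can forward just the remainder
 * for MMIO emulation. */
int hvm_copy_from_guest_linear_partial(void *buf, unsigned long gla,
                                       unsigned int bytes, uint32_t pfec,
                                       unsigned int *bytes_done);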

I'm meaning to look into this in some more detail later today, unless
a patch has appeared by then from e.g. Paul.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] MMIO emulation fixes
  2018-08-30  8:23                                         ` Jan Beulich
@ 2018-08-30 10:42                                           ` Olaf Hering
  0 siblings, 0 replies; 41+ messages in thread
From: Olaf Hering @ 2018-08-30 10:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel



On Thu, Aug 30, Jan Beulich wrote:

> approach): One is Paul's idea of making null_handler actually retrieve
> RAM contents when (part of) the access touches RAM. Another might

This works for me:

static int null_read(const struct hvm_io_handler *io_handler,
                     uint64_t addr,
                     uint32_t size,
                     uint64_t *data)
{
    struct vcpu *curr = current;
    struct domain *currd = curr->domain;
    p2m_type_t p2mt = p2m_invalid;
    unsigned long gmfn = paddr_to_pfn(addr);
    struct page_info *page;
    char *p;

    /* Peek at the p2m type without holding the gfn lock. */
    get_gfn_query_unlocked(currd, gmfn, &p2mt);
    if ( p2mt != p2m_ram_rw )
    {
        /* Not ordinary RAM: keep the usual all-ones behaviour. */
        *data = ~0ul;
    }
    else
    {
        page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
        if ( !page )
        {
            /* RAM type, but no usable page: fill with a sentinel. */
            memset(data, 0xee, size);
        }
        else
        {
            /* Satisfy the read from the still-present RAM contents. */
            p = (char *)__map_domain_page(page) + (addr & ~PAGE_MASK);
            memcpy(data, p, size);
            unmap_domain_page(p);
            put_page(page);
        }
    }

    return X86EMUL_OKAY;
}
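
(Note the 0xee fill pattern differs from the ~0ul returned for non-RAM
types, so the two paths can be told apart by looking at what the guest
actually read.)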

Olaf


* Re: [PATCH 0/2] MMIO emulation fixes
       [not found]                   ` <5B755FBC02000078001DEDBF@suse.com>
  2018-08-16 12:52                     ` [PATCH 0/2] MMIO emulation fixes Juergen Gross
@ 2018-09-04 16:11                     ` Juergen Gross
  1 sibling, 0 replies; 41+ messages in thread
From: Juergen Gross @ 2018-09-04 16:11 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: George Dunlap, xen-devel, Boris Ostrovsky, Paul Durrant, George Dunlap

On 16/08/18 13:27, Jan Beulich wrote:
>>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote:
>> On 16/08/18 11:29, Jan Beulich wrote:
>>> Following some further discussion with Andrew, he looks to be
>>> convinced that the issue is to be fixed in the balloon driver,
>>> which so far (intentionally afaict) does not remove the direct
>>> map entries for ballooned out pages in the HVM case. I'm not
>>> convinced of this, but I'd nevertheless like to inquire whether
>>> such a change (resulting in shattered super page mappings)
>>> would be acceptable in the first place.
>>
>> We don't tolerate anything else in the directmap pointing to
>> invalid/unimplemented frames.  Why should ballooning be any different?
> 
> Because ballooning is something virtualization specific, which
> doesn't have any equivalent on bare hardware (memory hot
> unplug doesn't come quite close enough imo, not the least
> because that doesn't work on a page granular basis). Hence
> we're to define the exact behavior here, and hence such a
> definition could as well include special behavior of accesses
> to the involved guest-physical addresses.

After discussing the issue with some KVM guys I still think it would be
better to leave the ballooned pages mapped in the direct map. KVM does
it the same way. They return "something" in case the guest tries to
read from such a page (might be the real data, 0's or all 1's).

So we should either map an all 0's or 1's page via EPT, or we should
return 0's or 1's via emulation of the read instruction.

Performance shouldn't be a major issue, as such reads should be really
rare.


Juergen


* Re: [PATCH 0/2] MMIO emulation fixes
       [not found]               ` <dd3c99c2-67e3-faf1-4219-85651b891adc@suse.com>
@ 2018-09-04 16:24                 ` Andrew Cooper
       [not found]                   ` <651CBD680200008737554D14@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2018-09-04 16:24 UTC (permalink / raw)
  To: Juergen Gross, Jan Beulich
  Cc: George Dunlap, xen-devel, Boris Ostrovsky, Paul Durrant, George Dunlap

On 04/09/18 17:11, Juergen Gross wrote:
> On 16/08/18 13:27, Jan Beulich wrote:
>>>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote:
>>> On 16/08/18 11:29, Jan Beulich wrote:
>>>> Following some further discussion with Andrew, he looks to be
>>>> convinced that the issue is to be fixed in the balloon driver,
>>>> which so far (intentionally afaict) does not remove the direct
>>>> map entries for ballooned out pages in the HVM case. I'm not
>>>> convinced of this, but I'd nevertheless like to inquire whether
>>>> such a change (resulting in shattered super page mappings)
>>>> would be acceptable in the first place.
>>> We don't tolerate anything else in the directmap pointing to
>>> invalid/unimplemented frames.  Why should ballooning be any different?
>> Because ballooning is something virtualization specific, which
>> doesn't have any equivalent on bare hardware (memory hot
>> unplug doesn't come quite close enough imo, not the least
>> because that doesn't work on a page granular basis). Hence
>> we're to define the exact behavior here, and hence such a
>> definition could as well include special behavior of accesses
>> to the involved guest-physical addresses.
> After discussing the issue with some KVM guys I still think it would be
> better to leave the ballooned pages mapped in the direct map. KVM does
> it the same way. They return "something" in case the guest tries to
> read from such a page (might be the real data, 0's or all 1's).
>
> So we should either map an all 0's or 1's page via EPT, or we should
> return 0's or 1's via emulation of the read instruction.
>
> Performance shouldn't be a major issue, as such reads should be really
> rare.

Such reads should be non-existent.  One way or another, there's still a
bug to fix in the kernel, because it isn't keeping suitable track of the
pfns.

As for how Xen could do things better...

We could map a page of all-ones (all zeroes would definitely be wrong),
but you've still got the problem of what happens if a write occurs.  We
absolutely can't sacrifice enough RAM to fill in the ballooned-out
frames with read/write frames.

I'd prefer not to see any emulation here, but that is more from an
attack surface limitation point of view.  x86 still offers us the
option of not tolerating misaligned accesses and of terminating writes
early (discarding them) when hitting one of these pages.

~Andrew


* Re: [PATCH 0/2] MMIO emulation fixes
       [not found]                         ` <A283E656020000808E2C01CD@prv1-mh.provo.novell.com>
@ 2018-09-05  7:10                           ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2018-09-05  7:10 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, George Dunlap, george.dunlap, Paul Durrant,
	xen-devel, Boris Ostrovsky

>>> On 04.09.18 at 18:24, <andrew.cooper3@citrix.com> wrote:
> On 04/09/18 17:11, Juergen Gross wrote:
>> On 16/08/18 13:27, Jan Beulich wrote:
>>>>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote:
>>>> On 16/08/18 11:29, Jan Beulich wrote:
>>>>> Following some further discussion with Andrew, he looks to be
>>>>> convinced that the issue is to be fixed in the balloon driver,
>>>>> which so far (intentionally afaict) does not remove the direct
>>>>> map entries for ballooned out pages in the HVM case. I'm not
>>>>> convinced of this, but I'd nevertheless like to inquire whether
>>>>> such a change (resulting in shattered super page mappings)
>>>>> would be acceptable in the first place.
>>>> We don't tolerate anything else in the directmap pointing to
>>>> invalid/unimplemented frames.  Why should ballooning be any different?
>>> Because ballooning is something virtualization specific, which
>>> doesn't have any equivalent on bare hardware (memory hot
>>> unplug doesn't come quite close enough imo, not the least
>>> because that doesn't work on a page granular basis). Hence
>>> we're to define the exact behavior here, and hence such a
>>> definition could as well include special behavior of accesses
>>> to the involved guest-physical addresses.
>> After discussing the issue with some KVM guys I still think it would be
>> better to leave the ballooned pages mapped in the direct map. KVM does
>> it the same way. They return "something" in case the guest tries to
>> read from such a page (might be the real data, 0's or all 1's).
>>
>> So we should either map an all 0's or 1's page via EPT, or we should
>> return 0's or 1's via emulation of the read instruction.
>>
>> Performance shouldn't be a major issue, as such reads should be really
>> rare.
> 
> Such reads should be non-existent.  One way or another, there's still a
> bug to fix in the kernel, because it isn't keeping suitable track of the
> pfns.

So you put yourself in opposition to what both KVM and Xen do in
their Linux implementations. I can only reiterate: we're talking
about a PV extension here, whose behavior is entirely defined by
us. Hence it is not a given that "such reads should be non-existent".

> As for how Xen could do things better...
> 
> We could map a page of all-ones (all zeroes would definitely be wrong),
> but you've still got the problem of what happens if a write occurs.  We
> absolutely can't sacrifice enough RAM to fill in the ballooned-out
> frames with read/write frames.

Of course, or else the ballooning effect would be nullified. However,
besides a page full of 0s or 1s, a simple "sink" page could also be
used, where reads return undefined data (i.e. whatever has last
been written into it through one of its perhaps very many aliases).

Another possibility for the sink page would be a (hardware) MMIO
one we know has no actual device backing it, thus allowing writes
to be terminated (discarded) by hardware, and reads to return all
ones (again due to hardware behavior). The question is how we
would universally find such a page (accesses to which must
obviously not have any other side effects).
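
As a rough sketch of the RAM-backed variant (refcounting and error
handling omitted; how the single shared sink MFN gets allocated is
left open, and the helper name is illustrative only):

static int balloon_sink_gfn(struct domain *d, gfn_t gfn, mfn_t sink_mfn)
{
    /*
     * Alias the same MFN at every ballooned-out GFN: writes are
     * absorbed by the shared page, and reads return whatever was
     * last written through any of the aliases.
     */
    return guest_physmap_add_page(d, gfn, sink_mfn, 0);
}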

> I'd prefer not to see any emulation here, but that is more from an
> attack surface limitation point of view.  x86 still offers us the
> option of not tolerating misaligned accesses and of terminating writes
> early (discarding them) when hitting one of these pages.

Well - for now we have the series which hopefully fixes the emulation
misbehavior here (and elsewhere at the same time). But I certainly
appreciate your desire for there not to be emulation here in the
first place, which I think leaves the sink page described above as
the only option.

Jan




end of thread, other threads:[~2018-09-05  7:10 UTC | newest]

Thread overview: 41+ messages
     [not found]                   ` <5B755FBC02000078001DEDBF@suse.com>
2018-08-16 12:52                     ` [PATCH 0/2] MMIO emulation fixes Juergen Gross
2018-09-04 16:11                     ` Juergen Gross
     [not found]               ` <dd3c99c2-67e3-faf1-4219-85651b891adc@suse.com>
2018-09-04 16:24                 ` Andrew Cooper
     [not found]                   ` <651CBD680200008737554D14@prv1-mh.provo.novell.com>
     [not found]                     ` <21554C83020000C537554D14@prv1-mh.provo.novell.com>
     [not found]                       ` <06D73C83020000C037554D14@prv1-mh.provo.novell.com>
     [not found]                         ` <A283E656020000808E2C01CD@prv1-mh.provo.novell.com>
2018-09-05  7:10                           ` Jan Beulich
2018-08-10 10:37 Paul Durrant
2018-08-10 10:37 ` [PATCH 1/2] x86/hvm/ioreq: MMIO range checking completely ignores direction flag Paul Durrant
2018-08-10 11:11   ` Andrew Cooper
2018-08-10 10:37 ` [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries Paul Durrant
2018-08-10 11:14   ` Andrew Cooper
2018-08-10 11:50     ` Paul Durrant
2018-08-10 11:50       ` Andrew Cooper
2018-08-10 11:59   ` Jan Beulich
2018-08-10 12:10     ` Paul Durrant
2018-08-10 12:01 ` [PATCH 0/2] MMIO emulation fixes Jan Beulich
2018-08-10 12:08   ` Paul Durrant
2018-08-10 12:13     ` Jan Beulich
2018-08-10 12:22       ` Paul Durrant
2018-08-10 12:37         ` Jan Beulich
2018-08-10 12:43           ` Paul Durrant
2018-08-10 12:55             ` Andrew Cooper
2018-08-10 15:08               ` Paul Durrant
2018-08-10 15:30                 ` Jan Beulich
2018-08-10 15:35                   ` Paul Durrant
     [not found]                     ` <5B6DB69D02000078001DD06A@prv1-mh.provo.novell.com>
     [not found]                       ` <eaab5a73-2910-7fb6-e1fc-08537e63088c@citrix.com>
     [not found]                         ` <92ca69e5-98b1-61e4-817a-3868f829471a@citrix.com>
2018-08-10 16:00                     ` Jan Beulich
2018-08-10 16:30                       ` George Dunlap
2018-08-10 16:37                         ` Andrew Cooper
2018-08-13  6:50                           ` Jan Beulich
2018-08-16 11:08                             ` Andrew Cooper
2018-08-29 10:36                             ` Olaf Hering
2018-08-29 10:53                               ` Andrew Cooper
2018-08-29 11:00                                 ` Olaf Hering
2018-08-29 11:09                                   ` Andrew Cooper
2018-08-29 11:14                                     ` Andrew Cooper
2018-08-29 11:26                                     ` Juergen Gross
     [not found]                                     ` <5B86773A0200004903F324A0@prv1-mh.provo.novell.com>
     [not found]                                       ` <5B867B1A0200006D03F3278E@prv1-mh.provo.novell.com>
     [not found]                                         ` <5B867D000200009103F328E2@prv1-mh.provo.novell.com>
     [not found]                                           ` <5B867F020200009E04E46402@prv1-mh.provo.novell.com>
2018-08-29 12:06                                             ` Jan Beulich
     [not found]                                       ` <5B87A68A0200001C04E5493A@prv1-mh.provo.novell.com>
2018-08-30  8:23                                         ` Jan Beulich
2018-08-30 10:42                                           ` Olaf Hering
2018-08-30  8:10                               ` Olaf Hering
2018-08-16 10:29                           ` Jan Beulich
2018-08-16 10:56                             ` Andrew Cooper
2018-08-16 11:27                               ` Jan Beulich
