* [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 10:37 UTC
To: xen-devel; +Cc: Paul Durrant

These are probably both candidates for back-port.

Paul Durrant (2):
  x86/hvm/ioreq: MMIO range checking completely ignores direction flag
  x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
    boundaries

 xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
 xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
 2 files changed, 26 insertions(+), 6 deletions(-)

-- 
2.11.0

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
* [PATCH 1/2] x86/hvm/ioreq: MMIO range checking completely ignores direction flag
From: Paul Durrant @ 2018-08-10 10:37 UTC
To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

hvm_select_ioreq_server() is used to route an ioreq to the appropriate
ioreq server. For MMIO this is done by comparing the range of the ioreq
to the ranges registered by the device models of each ioreq server.
Unfortunately the calculation of the range of the ioreq completely ignores
the direction flag and thus may calculate the wrong range for comparison.
Thus the ioreq may either be routed to the wrong server or erroneously
terminated by null_ops.

NOTE: The patch also fixes whitespace in the switch statement to make it
      style compliant.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/ioreq.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 7c515b3ef7..940a2c9728 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1353,20 +1353,25 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,

     switch ( type )
     {
-        unsigned long end;
+        unsigned long start, end;

     case XEN_DMOP_IO_RANGE_PORT:
-        end = addr + p->size - 1;
-        if ( rangeset_contains_range(r, addr, end) )
+        start = addr;
+        end = start + p->size - 1;
+        if ( rangeset_contains_range(r, start, end) )
             return s;
         break;
+
     case XEN_DMOP_IO_RANGE_MEMORY:
-        end = addr + (p->size * p->count) - 1;
-        if ( rangeset_contains_range(r, addr, end) )
+        start = hvm_mmio_first_byte(p);
+        end = hvm_mmio_last_byte(p);
+
+        if ( rangeset_contains_range(r, start, end) )
             return s;
         break;
+
     case XEN_DMOP_IO_RANGE_PCI:
         if ( rangeset_contains_singleton(r, addr >> 32) )
         {
-- 
2.11.0
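[Editorial note] The helpers hvm_mmio_first_byte() and hvm_mmio_last_byte() that the patch switches to are not shown in the thread. The sketch below is a standalone model of what a direction-aware range calculation has to do, with field names borrowed from ioreq_t; the backward-direction semantics (addr names the first element accessed, subsequent reps move to lower addresses) are an assumption of this sketch, not quoted from the patch.

```c
#include <stdint.h>

/*
 * Simplified model of an ioreq's MMIO footprint. This is an
 * illustration, not the hypervisor's actual implementation.
 */
struct ioreq_model {
    uint64_t addr;   /* address of the first element accessed */
    uint32_t size;   /* bytes per element */
    uint32_t count;  /* number of repetitions */
    int df;          /* direction flag: 0 = ascending, 1 = descending */
};

/* Lowest byte address the whole operation touches. */
uint64_t mmio_first_byte(const struct ioreq_model *p)
{
    return p->df ? p->addr - (uint64_t)p->size * (p->count - 1)
                 : p->addr;
}

/* Highest byte address the whole operation touches. */
uint64_t mmio_last_byte(const struct ioreq_model *p)
{
    return p->df ? p->addr + p->size - 1
                 : p->addr + (uint64_t)p->size * p->count - 1;
}
```

The bug described above corresponds to always computing the range as [addr, addr + size * count - 1]: for a descending (df = 1) rep op that range lies entirely on the wrong side of addr, so the rangeset comparison can match the wrong ioreq server.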
* Re: [PATCH 1/2] x86/hvm/ioreq: MMIO range checking completely ignores direction flag
From: Andrew Cooper @ 2018-08-10 11:11 UTC
To: Paul Durrant, xen-devel; +Cc: Jan Beulich

On 10/08/18 11:37, Paul Durrant wrote:
> hvm_select_ioreq_server() is used to route an ioreq to the appropriate
> ioreq server. For MMIO this is done by comparing the range of the ioreq
> to the ranges registered by the device models of each ioreq server.
> Unfortunately the calculation of the range of the ioreq completely ignores
> the direction flag and thus may calculate the wrong range for comparison.
> Thus the ioreq may either be routed to the wrong server or erroneously
> terminated by null_ops.
>
> NOTE: The patch also fixes whitespace in the switch statement to make it
>       style compliant.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
* [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Paul Durrant @ 2018-08-10 10:37 UTC
To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

When emulating a rep I/O operation it is possible that the ioreq will
describe a single operation that spans multiple GFNs. This is fine as long
as all those GFNs fall within an MMIO region covered by a single device
model, but unfortunately the higher levels of the emulation code do not
guarantee that. This is something that should almost certainly be fixed,
but in the meantime this patch makes sure that MMIO is truncated at GFN
boundaries and hence the appropriate device model is re-evaluated for each
target GFN.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8385c62145..d6a81ec4d1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -184,8 +184,23 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(&p);
     }

-    vio->io_req = p;
+    /*
+     * Make sure that we truncate rep MMIO at any GFN boundary. This is
+     * necessary to ensure that the correct device model is targeted
+     * or that we correctly handle a rep op spanning MMIO and RAM.
+     */
+    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
+    {
+        unsigned long off = p.addr & ~PAGE_MASK;

+        p.count = min_t(unsigned long,
+                        p.count,
+                        p.df ?
+                        (off + p.size) / p.size :
+                        (PAGE_SIZE - off) / p.size);
+    }
+
+    vio->io_req = p;
     rc = hvm_io_intercept(&p);

     /*
-- 
2.11.0
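[Editorial note] The clamp arithmetic in the hunk above can be checked in isolation. The following standalone sketch transcribes the min_t() expression from the patch (PAGE_SIZE hard-coded to 4 KiB here); it is an illustration, not Xen code.

```c
#define MODEL_PAGE_SIZE 4096UL

/*
 * Clamp a rep count so that all repetitions stay within the page
 * (GFN) containing 'addr'. 'off' is the offset into the page, 'df'
 * the direction flag. Transcribed from the patch's min_t() call.
 */
unsigned long clamp_rep_count(unsigned long addr, unsigned long size,
                              unsigned long count, int df)
{
    unsigned long off = addr & (MODEL_PAGE_SIZE - 1);
    unsigned long limit = df ? (off + size) / size
                             : (MODEL_PAGE_SIZE - off) / size;

    return count < limit ? count : limit;
}
```

For example, a forward 4-byte rep starting 8 bytes before the end of a page is clamped to 2 reps. Note that a forward access misaligned so that the first rep itself straddles the page end (e.g. offset 4094, size 4) yields a limit of 0 here, which is exactly the edge case Jan raises later in this thread.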
* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Andrew Cooper @ 2018-08-10 11:14 UTC
To: Paul Durrant, xen-devel; +Cc: Jan Beulich

On 10/08/18 11:37, Paul Durrant wrote:
> When emulating a rep I/O operation it is possible that the ioreq will
> describe a single operation that spans multiple GFNs. This is fine as long
> as all those GFNs fall within an MMIO region covered by a single device
> model, but unfortunately the higher levels of the emulation code do not
> guarantee that. This is something that should almost certainly be fixed,
> but in the meantime this patch makes sure that MMIO is truncated at GFN
> boundaries and hence the appropriate device model is re-evaluated for each
> target GFN.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 8385c62145..d6a81ec4d1 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -184,8 +184,23 @@ static int hvmemul_do_io(
>          hvmtrace_io_assist(&p);
>      }
>
> -    vio->io_req = p;
> +    /*
> +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> +     * necessary to ensure that the correct device model is targeted
> +     * or that we correctly handle a rep op spanning MMIO and RAM.
> +     */
> +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> +    {
> +        unsigned long off = p.addr & ~PAGE_MASK;
>
> +        p.count = min_t(unsigned long,
> +                        p.count,
> +                        p.df ?
> +                        (off + p.size) / p.size :
> +                        (PAGE_SIZE - off) / p.size);

(p.df ? (off + p.size) : (PAGE_SIZE - off)) / p.size

?

> +    }
> +
> +    vio->io_req = p;

You'll get a cleaner patch if you retain the newline between these two.

Both can be fixed up on commit. From a functional point of view, this
looks fine.

~Andrew
* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Paul Durrant @ 2018-08-10 11:50 UTC
To: Andrew Cooper, xen-devel; +Cc: Jan Beulich

> -----Original Message-----
> From: Andrew Cooper
> Sent: 10 August 2018 12:14
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Jan Beulich <jbeulich@suse.com>
> Subject: Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation
> does not cross GFN boundaries
>
> On 10/08/18 11:37, Paul Durrant wrote:
> > When emulating a rep I/O operation it is possible that the ioreq will
> > describe a single operation that spans multiple GFNs. This is fine as long
> > as all those GFNs fall within an MMIO region covered by a single device
> > model, but unfortunately the higher levels of the emulation code do not
> > guarantee that. This is something that should almost certainly be fixed,
> > but in the meantime this patch makes sure that MMIO is truncated at GFN
> > boundaries and hence the appropriate device model is re-evaluated for
> > each target GFN.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > ---
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > ---
> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> > index 8385c62145..d6a81ec4d1 100644
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -184,8 +184,23 @@ static int hvmemul_do_io(
> >          hvmtrace_io_assist(&p);
> >      }
> >
> > -    vio->io_req = p;
> > +    /*
> > +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> > +     * necessary to ensure that the correct device model is targeted
> > +     * or that we correctly handle a rep op spanning MMIO and RAM.
> > +     */
> > +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> > +    {
> > +        unsigned long off = p.addr & ~PAGE_MASK;
> >
> > +        p.count = min_t(unsigned long,
> > +                        p.count,
> > +                        p.df ?
> > +                        (off + p.size) / p.size :
> > +                        (PAGE_SIZE - off) / p.size);
>
> (p.df ? (off + p.size) : (PAGE_SIZE - off)) / p.size
>

Yes, I suppose so.

> ?
>
> > +    }
> > +
> > +    vio->io_req = p;
>
> You'll get a cleaner patch if you retain the newline between these two.
>
> Both can be fixed up on commit. From a functional point of view, this
> looks fine.
>

Ok. I'll leave it to you then :-)

  Paul

> ~Andrew
* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Andrew Cooper @ 2018-08-10 11:50 UTC
To: Paul Durrant, xen-devel; +Cc: Jan Beulich

On 10/08/18 12:50, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper
>> Sent: 10 August 2018 12:14
>> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Subject: Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation
>> does not cross GFN boundaries
>>
>> On 10/08/18 11:37, Paul Durrant wrote:
>>> When emulating a rep I/O operation it is possible that the ioreq will
>>> describe a single operation that spans multiple GFNs. This is fine as long
>>> as all those GFNs fall within an MMIO region covered by a single device
>>> model, but unfortunately the higher levels of the emulation code do not
>>> guarantee that. This is something that should almost certainly be fixed,
>>> but in the meantime this patch makes sure that MMIO is truncated at GFN
>>> boundaries and hence the appropriate device model is re-evaluated for
>>> each target GFN.
>>>
>>> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>>> ---
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>>> index 8385c62145..d6a81ec4d1 100644
>>> --- a/xen/arch/x86/hvm/emulate.c
>>> +++ b/xen/arch/x86/hvm/emulate.c
>>> @@ -184,8 +184,23 @@ static int hvmemul_do_io(
>>>          hvmtrace_io_assist(&p);
>>>      }
>>>
>>> -    vio->io_req = p;
>>> +    /*
>>> +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
>>> +     * necessary to ensure that the correct device model is targeted
>>> +     * or that we correctly handle a rep op spanning MMIO and RAM.
>>> +     */
>>> +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
>>> +    {
>>> +        unsigned long off = p.addr & ~PAGE_MASK;
>>>
>>> +        p.count = min_t(unsigned long,
>>> +                        p.count,
>>> +                        p.df ?
>>> +                        (off + p.size) / p.size :
>>> +                        (PAGE_SIZE - off) / p.size);
>> (p.df ? (off + p.size) : (PAGE_SIZE - off)) / p.size
>>
> Yes, I suppose so.
>
>> ?
>>
>>> +    }
>>> +
>>> +    vio->io_req = p;
>> You'll get a cleaner patch if you retain the newline between these two.
>>
>> Both can be fixed up on commit. From a functional point of view, this
>> looks fine.
>>
> Ok. I'll leave it to you then :-)

Sure.

~Andrew
* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Jan Beulich @ 2018-08-10 11:59 UTC
To: Paul Durrant; +Cc: Andrew Cooper, xen-devel

>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -184,8 +184,23 @@ static int hvmemul_do_io(
>          hvmtrace_io_assist(&p);
>      }
>
> -    vio->io_req = p;
> +    /*
> +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> +     * necessary to ensure that the correct device model is targeted
> +     * or that we correctly handle a rep op spanning MMIO and RAM.
> +     */
> +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> +    {
> +        unsigned long off = p.addr & ~PAGE_MASK;
>
> +        p.count = min_t(unsigned long,
> +                        p.count,
> +                        p.df ?
> +                        (off + p.size) / p.size :
> +                        (PAGE_SIZE - off) / p.size);

For misaligned requests you need to make sure p.count doesn't end
up as zero (which can now happen in the forwards case). Or do you
rely on callers (hvmemul_do_io_addr() in particular) splitting such
requests already? Yet in that case it's not clear to me whether
anything needs changing here in the first place. (Similarly in the
backwards case I think the first iteration risks crossing a page
boundary, and then the batch should be clipped to count 1.)

Jan
* Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
From: Paul Durrant @ 2018-08-10 12:10 UTC
To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 12:59
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel
> <xen-devel@lists.xenproject.org>
> Subject: Re: [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation
> does not cross GFN boundaries
>
> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -184,8 +184,23 @@ static int hvmemul_do_io(
> >          hvmtrace_io_assist(&p);
> >      }
> >
> > -    vio->io_req = p;
> > +    /*
> > +     * Make sure that we truncate rep MMIO at any GFN boundary. This is
> > +     * necessary to ensure that the correct device model is targeted
> > +     * or that we correctly handle a rep op spanning MMIO and RAM.
> > +     */
> > +    if ( unlikely(p.count > 1) && p.type == IOREQ_TYPE_COPY )
> > +    {
> > +        unsigned long off = p.addr & ~PAGE_MASK;
> >
> > +        p.count = min_t(unsigned long,
> > +                        p.count,
> > +                        p.df ?
> > +                        (off + p.size) / p.size :
> > +                        (PAGE_SIZE - off) / p.size);
>
> For misaligned requests you need to make sure p.count doesn't end
> up as zero (which can now happen in the forwards case). Or do you
> rely on callers (hvmemul_do_io_addr() in particular) splitting such
> requests already?

Well, I have a test case where that split is not happening, so adding a
safety check for p.count == 0 at this point should be done.

> Yet in that case it's not clear to me whether
> anything needs changing here in the first place. (Similarly in the
> backwards case I think the first iteration risks crossing a page
> boundary, and then the batch should be clipped to count 1.)
>

Ok. Sounds like clipping to 1 rep in both circumstances would be best.

  Paul

> Jan
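[Editorial note] The "clip to 1 rep" behaviour agreed in the exchange above could look like the following standalone sketch. This is an illustration of the agreed fix (PAGE_SIZE hard-coded to 4 KiB), not the follow-up patch itself.

```c
#define MODEL_PAGE_SIZE 4096UL

/*
 * Clamp a rep count to the page (GFN) containing 'addr', but never
 * return 0: a misaligned op whose first repetition itself crosses
 * the page boundary is clipped to a single rep, matching the
 * behaviour agreed in the discussion above. Illustrative only.
 */
unsigned long clamp_rep_count_nonzero(unsigned long addr,
                                      unsigned long size,
                                      unsigned long count, int df)
{
    unsigned long off = addr & (MODEL_PAGE_SIZE - 1);
    unsigned long limit = df ? (off + size) / size
                             : (MODEL_PAGE_SIZE - off) / size;

    if ( limit == 0 )  /* first rep straddles the boundary */
        limit = 1;

    return count < limit ? count : limit;
}
```

With this check, the forwards misaligned case Jan describes (e.g. a 4-byte access at offset 4094) produces a single rep rather than zero, and the single-rep ioreq is then handled on its own.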
* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 12:01 UTC
To: Paul Durrant; +Cc: xen-devel

>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> These are probably both candidates for back-port.
>
> Paul Durrant (2):
>   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>     boundaries
>
>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>  2 files changed, 26 insertions(+), 6 deletions(-)

I take it this isn't yet what we've talked about yesterday on irc?

Jan
* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 12:08 UTC
To: 'Jan Beulich'; +Cc: xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 13:02
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>
> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> > These are probably both candidates for back-port.
> >
> > Paul Durrant (2):
> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
> >     boundaries
> >
> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
> >  2 files changed, 26 insertions(+), 6 deletions(-)
>
> I take it this isn't yet what we've talked about yesterday on irc?
>

This is the band-aid fix. I can now show correct handling of a rep mov
walking off MMIO into RAM.

  Paul

> Jan
* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 12:13 UTC
To: Paul Durrant; +Cc: xen-devel

>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 10 August 2018 13:02
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>
>> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>> > These are probably both candidates for back-port.
>> >
>> > Paul Durrant (2):
>> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>> >     boundaries
>> >
>> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>> >  2 files changed, 26 insertions(+), 6 deletions(-)
>>
>> I take it this isn't yet what we've talked about yesterday on irc?
>>
>
> This is the band-aid fix. I can now show correct handling of a rep mov
> walking off MMIO into RAM.

But that's not the problem we're having. In our case the bad behavior
is with a single MOV. That's why I had assumed that your plan to fiddle
with null_handler would help in our case as well, while this series clearly
won't (afaict).

Jan
* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 12:22 UTC
To: 'Jan Beulich'; +Cc: xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 13:13
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>
> >>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 10 August 2018 13:02
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
> >>
> >> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> >> > These are probably both candidates for back-port.
> >> >
> >> > Paul Durrant (2):
> >> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
> >> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
> >> >     boundaries
> >> >
> >> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
> >> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
> >> >  2 files changed, 26 insertions(+), 6 deletions(-)
> >>
> >> I take it this isn't yet what we've talked about yesterday on irc?
> >>
> >
> > This is the band-aid fix. I can now show correct handling of a rep mov
> > walking off MMIO into RAM.
>
> But that's not the problem we're having. In our case the bad behavior
> is with a single MOV. That's why I had assumed that your plan to fiddle
> with null_handler would help in our case as well, while this series clearly
> won't (afaict).
>

Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour
though, as I understand it. Am I incorrect?

  Paul

> Jan
* Re: [PATCH 0/2] MMIO emulation fixes
From: Jan Beulich @ 2018-08-10 12:37 UTC
To: Paul Durrant; +Cc: xen-devel

>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 10 August 2018 13:13
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>
>> >>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote:
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 10 August 2018 13:02
>> >> To: Paul Durrant <Paul.Durrant@citrix.com>
>> >> Cc: xen-devel <xen-devel@lists.xenproject.org>
>> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>> >>
>> >> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
>> >> > These are probably both candidates for back-port.
>> >> >
>> >> > Paul Durrant (2):
>> >> >   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>> >> >   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN
>> >> >     boundaries
>> >> >
>> >> >  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>> >> >  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>> >> >  2 files changed, 26 insertions(+), 6 deletions(-)
>> >>
>> >> I take it this isn't yet what we've talked about yesterday on irc?
>> >>
>> >
>> > This is the band-aid fix. I can now show correct handling of a rep mov
>> > walking off MMIO into RAM.
>>
>> But that's not the problem we're having. In our case the bad behavior
>> is with a single MOV. That's why I had assumed that your plan to fiddle
>> with null_handler would help in our case as well, while this series clearly
>> won't (afaict).
>>
>
> Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour
> though, as I understand it. Am I incorrect?

I'm not aware of the SDM or PM saying anything like this. Anyway, the
specific case where this is being observed as an issue is when
accessing the last few bytes of a normal RAM page followed by a
ballooned out one. The balloon driver doesn't remove the virtual
mapping of such pages (presumably in order not to shatter super
pages); the observation is with the old XenoLinux one, but from code
inspection the upstream one behaves the same.

Unless we want to change the balloon driver's behavior, at least
this specific case needs to be considered as having defined behavior,
I think.

Jan
* Re: [PATCH 0/2] MMIO emulation fixes
From: Paul Durrant @ 2018-08-10 12:43 UTC
To: 'Jan Beulich'; +Cc: xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 10 August 2018 13:37
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>
> >>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
> >> >> >>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote:
> >> >> > These are probably both candidates for back-port.
> >> >>
> >> >> I take it this isn't yet what we've talked about yesterday on irc?
> >> >>
> >> >
> >> > This is the band-aid fix. I can now show correct handling of a rep mov
> >> > walking off MMIO into RAM.
> >>
> >> But that's not the problem we're having. In our case the bad behavior
> >> is with a single MOV. That's why I had assumed that your plan to fiddle
> >> with null_handler would help in our case as well, while this series clearly
> >> won't (afaict).
> >>
> >
> > Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour
> > though, as I understand it. Am I incorrect?
>
> I'm not aware of the SDM or PM saying anything like this. Anyway, the
> specific case where this is being observed as an issue is when
> accessing the last few bytes of a normal RAM page followed by a
> ballooned out one. The balloon driver doesn't remove the virtual
> mapping of such pages (presumably in order not to shatter super
> pages); the observation is with the old XenoLinux one, but from code
> inspection the upstream one behaves the same.
>
> Unless we want to change the balloon driver's behavior, at least
> this specific case needs to be considered as having defined behavior,
> I think.
>

Ok. I'll see what I can do.

  Paul

> Jan
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 12:43 ` Paul Durrant @ 2018-08-10 12:55 ` Andrew Cooper 2018-08-10 15:08 ` Paul Durrant 0 siblings, 1 reply; 41+ messages in thread From: Andrew Cooper @ 2018-08-10 12:55 UTC (permalink / raw) To: Paul Durrant, 'Jan Beulich'; +Cc: xen-devel On 10/08/18 13:43, Paul Durrant wrote: >> -----Original Message----- >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: 10 August 2018 13:37 >> To: Paul Durrant <Paul.Durrant@citrix.com> >> Cc: xen-devel <xen-devel@lists.xenproject.org> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote: >>>> -----Original Message----- >>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>> Sent: 10 August 2018 13:13 >>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote: >>>>>> -----Original Message----- >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>>> Sent: 10 August 2018 13:02 >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote: >>>>>>> These are probably both candidates for back-port. >>>>>>> >>>>>>> Paul Durrant (2): >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores direction >> flag >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross GFN >>>>>>> boundaries >>>>>>> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >>>>>> I take it this isn't yet what we've talked about yesterday on irc? >>>>>> >>>>> This is the band-aid fix. I can now show correct handling of a rep mov >>>>> walking off MMIO into RAM. 
>>>> But that's not the problem we're having. In our case the bad behavior >>>> is with a single MOV. That's why I had assumed that your plan to fiddle >>>> with null_handler would help in our case as well, while this series clearly >>>> won't (afaict). >>>> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour >> though >>> as I understand it. Am I incorrect? >> I'm not aware of SDM or PM saying anything like this. Anyway, the >> specific case where this is being observed as an issue is when >> accessing the last few bytes of a normal RAM page followed by a >> ballooned out one. The balloon driver doesn't remove the virtual >> mapping of such pages (presumably in order to not shatter super >> pages); observation is with the old XenoLinux one, but from code >> inspection the upstream one behaves the same. >> >> Unless we want to change the balloon driver's behavior, at least >> this specific case needs to be considered having defined behavior, >> I think. >> > Ok. I'll see what I can do. It is a software error to try and cross boundaries. Modern processors do their best to try and cause the correct behaviour to occur, albeit with a massive disclaimer about the performance hit. Older processors didn't cope. As far as I'm concerned, it's fine to terminate an emulation which crosses a boundary with the null ops. ~Andrew
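[Editorial note: the "terminate with the null ops" behaviour Andrew mentions can be modelled in a few lines. The sketch below is a standalone illustration, not Xen's actual code; the function names, signatures, and the convention that an unclaimed MMIO read completes as all-ones while an unclaimed write is silently discarded are assumptions made for this example.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standalone model of "null" MMIO ops: an access that no ioreq server
 * claims is completed locally.  Reads are assumed to return all-ones
 * (truncated to the access size); writes are assumed to be dropped.
 * Names and exact semantics are illustrative, not Xen's internals.
 */
static int null_read(uint64_t addr, uint32_t size, uint64_t *data)
{
    (void)addr;
    /* All-ones, truncated to the access size (1, 2, 4 or 8 bytes). */
    *data = (size >= 8) ? ~0ULL : ((1ULL << (size * 8)) - 1);
    return 0; /* success, by analogy with X86EMUL_OKAY */
}

static int null_write(uint64_t addr, uint32_t size, uint64_t data)
{
    (void)addr; (void)size; (void)data;
    return 0; /* write silently dropped */
}
```

With handlers like these, an emulation that wanders past a registered range simply reads back all-ones and has its writes discarded, rather than being routed to the wrong device model.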
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 12:55 ` Andrew Cooper @ 2018-08-10 15:08 ` Paul Durrant 2018-08-10 15:30 ` Jan Beulich 0 siblings, 1 reply; 41+ messages in thread From: Paul Durrant @ 2018-08-10 15:08 UTC (permalink / raw) To: Andrew Cooper, 'Jan Beulich'; +Cc: xen-devel > -----Original Message----- > From: Andrew Cooper > Sent: 10 August 2018 13:56 > To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' > <JBeulich@suse.com> > Cc: xen-devel <xen-devel@lists.xenproject.org> > Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > > On 10/08/18 13:43, Paul Durrant wrote: > >> -----Original Message----- > >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> Sent: 10 August 2018 13:37 > >> To: Paul Durrant <Paul.Durrant@citrix.com> > >> Cc: xen-devel <xen-devel@lists.xenproject.org> > >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> > >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote: > >>>> -----Original Message----- > >>>> From: Jan Beulich [mailto:JBeulich@suse.com] > >>>> Sent: 10 August 2018 13:13 > >>>> To: Paul Durrant <Paul.Durrant@citrix.com> > >>>> Cc: xen-devel <xen-devel@lists.xenproject.org> > >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >>>> > >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote: > >>>>>> -----Original Message----- > >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] > >>>>>> Sent: 10 August 2018 13:02 > >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> > >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> > >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >>>>>> > >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote: > >>>>>>> These are probably both candidates for back-port. 
> >>>>>>> > >>>>>>> Paul Durrant (2): > >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores > direction > >> flag > >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross > GFN > >>>>>>> boundaries > >>>>>>> > >>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- > >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- > >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) > >>>>>> I take it this isn't yet what we've talked about yesterday on irc? > >>>>>> > >>>>> This is the band-aid fix. I can now show correct handling of a rep mov > >>>>> walking off MMIO into RAM. > >>>> But that's not the problem we're having. In our case the bad behavior > >>>> is with a single MOV. That's why I had assumed that your plan to fiddle > >>>> with null_handler would help in our case as well, while this series > clearly > >>>> won't (afaict). > >>>> > >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined > behaviour > >> though > >>> as I understand it. Am I incorrect? > >> I'm not aware of SDM or PM saying anything like this. Anyway, the > >> specific case where this is being observed as an issue is when > >> accessing the last few bytes of a normal RAM page followed by a > >> ballooned out one. The balloon driver doesn't remove the virtual > >> mapping of such pages (presumably in order to not shatter super > >> pages); observation is with the old XenoLinux one, but from code > >> inspection the upstream one behaves the same. > >> > >> Unless we want to change the balloon driver's behavior, at least > >> this specific case needs to be considered having defined behavior, > >> I think. > >> > > Ok. I'll see what I can do. > > It is a software error to try and cross boundaries. Modern processors > do their best to try and cause the correct behaviour to occur, albeit > with a massive disclaimer about the performance hit. Older processors > didn't cope. 
> > As far as I'm concerned, its fine to terminate a emulation which crosses > a boundary with the null ops. Alas we never even get as far as the I/O handlers in some circumstances... I just set up a variant of an XTF test doing a backwards rep movsd into a well-aligned stack buffer where the source buffer starts 1 byte before a boundary between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) detects that both the source and dest of the initial rep are RAM, skips over the I/O emulation calls, and then fails when the hvm_copy_from_guest_phys() (unsurprisingly) fails to grab the 8 bytes for the initial rep. So, any logic we add to deal with handling page-spanning ops is going to have to go in at the top level of instruction emulation... which I fear is going to be quite a major change and not something that's going to be easy to back-port. Paul > > ~Andrew
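[Editorial note: the GFN-boundary clamping that patch 2 aims for — limiting a rep I/O operation so it never crosses a guest page boundary — can be sketched as below. The helper name, interface, and handling are invented for illustration and do not mirror Xen's hvmemul code; it also assumes each element itself lies within one page.]

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/*
 * Illustrative helper: clamp the repeat count of a rep I/O operation so
 * the access never crosses a GFN (guest page) boundary.  'backwards'
 * models EFLAGS.DF=1, where successive reps move to lower addresses.
 */
static uint64_t clamp_reps_to_gfn(uint64_t gpa, unsigned int bytes_per_rep,
                                  uint64_t reps, int backwards)
{
    uint64_t offset = gpa & (PAGE_SIZE - 1);
    uint64_t fit;

    if (backwards)
        /* Elements at gpa, gpa - bytes, ... down to the page start. */
        fit = offset / bytes_per_rep + 1;
    else
        /* Elements at gpa, gpa + bytes, ... up to the end of the page. */
        fit = (PAGE_SIZE - offset) / bytes_per_rep;

    return reps < fit ? reps : fit;
}
```

Clamped this way, each chunk of a rep op is routed (or failed) against a single page, which keeps per-server range comparison honest; as the thread goes on to establish, it does not help the single-MOV case.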
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 15:08 ` Paul Durrant @ 2018-08-10 15:30 ` Jan Beulich 2018-08-10 15:35 ` Paul Durrant 0 siblings, 1 reply; 41+ messages in thread From: Jan Beulich @ 2018-08-10 15:30 UTC (permalink / raw) To: Paul Durrant; +Cc: Andrew Cooper, xen-devel >>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote: >> -----Original Message----- >> From: Andrew Cooper >> Sent: 10 August 2018 13:56 >> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' >> <JBeulich@suse.com> >> Cc: xen-devel <xen-devel@lists.xenproject.org> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> On 10/08/18 13:43, Paul Durrant wrote: >> >> -----Original Message----- >> >> From: Jan Beulich [mailto:JBeulich@suse.com] >> >> Sent: 10 August 2018 13:37 >> >> To: Paul Durrant <Paul.Durrant@citrix.com> >> >> Cc: xen-devel <xen-devel@lists.xenproject.org> >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote: >> >>>> -----Original Message----- >> >>>> From: Jan Beulich [mailto:JBeulich@suse.com] >> >>>> Sent: 10 August 2018 13:13 >> >>>> To: Paul Durrant <Paul.Durrant@citrix.com> >> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >>>> >> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote: >> >>>>>> -----Original Message----- >> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >> >>>>>> Sent: 10 August 2018 13:02 >> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >>>>>> >> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote: >> >>>>>>> These are probably both candidates for back-port. 
>> >>>>>>> >> >>>>>>> Paul Durrant (2): >> >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >> direction >> >> flag >> >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >> GFN >> >>>>>>> boundaries >> >>>>>>> >> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- >> >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- >> >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc? >> >>>>>> >> >>>>> This is the band-aid fix. I can now show correct handling of a rep mov >> >>>>> walking off MMIO into RAM. >> >>>> But that's not the problem we're having. In our case the bad behavior >> >>>> is with a single MOV. That's why I had assumed that your plan to fiddle >> >>>> with null_handler would help in our case as well, while this series >> clearly >> >>>> won't (afaict). >> >>>> >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >> behaviour >> >> though >> >>> as I understand it. Am I incorrect? >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the >> >> specific case where this is being observed as an issue is when >> >> accessing the last few bytes of a normal RAM page followed by a >> >> ballooned out one. The balloon driver doesn't remove the virtual >> >> mapping of such pages (presumably in order to not shatter super >> >> pages); observation is with the old XenoLinux one, but from code >> >> inspection the upstream one behaves the same. >> >> >> >> Unless we want to change the balloon driver's behavior, at least >> >> this specific case needs to be considered having defined behavior, >> >> I think. >> >> >> > Ok. I'll see what I can do. >> >> It is a software error to try and cross boundaries. Modern processors >> do their best to try and cause the correct behaviour to occur, albeit >> with a massive disclaimer about the performance hit. Older processors >> didn't cope. 
>> >> As far as I'm concerned, its fine to terminate a emulation which crosses >> a boundary with the null ops. > > Alas we never even get as far as the I/O handlers in some circumstances... > > I just set up a variant of an XTF test doing a backwards rep movsd into a > well aligned stack buffer where source buffer starts 1 byte before a boundary > between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) detects that > both the source and dest of the initial rep are RAM, skips over the I/O > emulation calls, and then fails when the hvm_copy_from_guest_phys() > (unsurprisingly) fails to grab the 8 bytes for the initial rep. > So, any logic we add to deal with handling page spanning ops is going to > have to go in at the top level of instruction emulation... which I fear is > going to be quite a major change and not something that's going to be easy to > back-port. Well, wasn't it clear from the beginning that a proper fix would be too invasive to backport? And wasn't it for that reason that you intended to add a small hack first, to deal with just the case(s) that we currently have issues with? Jan
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 15:30 ` Jan Beulich @ 2018-08-10 15:35 ` Paul Durrant [not found] ` <5B6DB69D02000078001DD06A@prv1*mh.provo.novell.com> 2018-08-10 16:00 ` Jan Beulich 0 siblings, 2 replies; 41+ messages in thread From: Paul Durrant @ 2018-08-10 15:35 UTC (permalink / raw) To: 'Jan Beulich'; +Cc: Andrew Cooper, xen-devel > -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 10 August 2018 16:31 > To: Paul Durrant <Paul.Durrant@citrix.com> > Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen- > devel@lists.xenproject.org> > Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > > >>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote: > >> -----Original Message----- > >> From: Andrew Cooper > >> Sent: 10 August 2018 13:56 > >> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' > >> <JBeulich@suse.com> > >> Cc: xen-devel <xen-devel@lists.xenproject.org> > >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> > >> On 10/08/18 13:43, Paul Durrant wrote: > >> >> -----Original Message----- > >> >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> >> Sent: 10 August 2018 13:37 > >> >> To: Paul Durrant <Paul.Durrant@citrix.com> > >> >> Cc: xen-devel <xen-devel@lists.xenproject.org> > >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> >> > >> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote: > >> >>>> -----Original Message----- > >> >>>> From: Jan Beulich [mailto:JBeulich@suse.com] > >> >>>> Sent: 10 August 2018 13:13 > >> >>>> To: Paul Durrant <Paul.Durrant@citrix.com> > >> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org> > >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> >>>> > >> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote: > >> >>>>>> -----Original Message----- > >> >>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] > >> >>>>>> Sent: 10 August 2018 13:02 > >> >>>>>> To: Paul Durrant 
<Paul.Durrant@citrix.com> > >> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> > >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes > >> >>>>>> > >> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote: > >> >>>>>>> These are probably both candidates for back-port. > >> >>>>>>> > >> >>>>>>> Paul Durrant (2): > >> >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores > >> direction > >> >> flag > >> >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross > >> GFN > >> >>>>>>> boundaries > >> >>>>>>> > >> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- > >> >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- > >> >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) > >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc? > >> >>>>>> > >> >>>>> This is the band-aid fix. I can now show correct handling of a rep > mov > >> >>>>> walking off MMIO into RAM. > >> >>>> But that's not the problem we're having. In our case the bad > behavior > >> >>>> is with a single MOV. That's why I had assumed that your plan to > fiddle > >> >>>> with null_handler would help in our case as well, while this series > >> clearly > >> >>>> won't (afaict). > >> >>>> > >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined > >> behaviour > >> >> though > >> >>> as I understand it. Am I incorrect? > >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the > >> >> specific case where this is being observed as an issue is when > >> >> accessing the last few bytes of a normal RAM page followed by a > >> >> ballooned out one. The balloon driver doesn't remove the virtual > >> >> mapping of such pages (presumably in order to not shatter super > >> >> pages); observation is with the old XenoLinux one, but from code > >> >> inspection the upstream one behaves the same. 
> >> >> > >> >> Unless we want to change the balloon driver's behavior, at least > >> >> this specific case needs to be considered having defined behavior, > >> >> I think. > >> >> > >> > Ok. I'll see what I can do. > >> > >> It is a software error to try and cross boundaries. Modern processors > >> do their best to try and cause the correct behaviour to occur, albeit > >> with a massive disclaimer about the performance hit. Older processors > >> didn't cope. > >> > >> As far as I'm concerned, its fine to terminate a emulation which crosses > >> a boundary with the null ops. > > > > Alas we never even get as far as the I/O handlers in some circumstances... > > > > I just set up a variant of an XTF test doing a backwards rep movsd into a > > well aligned stack buffer where source buffer starts 1 byte before a > boundary > > between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) > detects that > > both the source and dest of the initial rep are RAM, skips over the I/O > > emulation calls, and then fails when the hvm_copy_from_guest_phys() > > (unsurprisingly) fails to grab the 8 bytes for the initial rep. > > So, any logic we add to deal with handling page spanning ops is going to > > have to go in at the top level of instruction emulation... which I fear is > > going to be quite a major change and not something that's going to be easy > to > > back-port. > > Well, wasn't it clear from the beginning that a proper fix would be too > invasive to backport? And wasn't it for that reason that you intended > to add a small hack first, to deal with just the case(s) that we currently > have issues with? Well, given that I mistakenly understood you were hitting the same rep issue that I was, I thought I could deal with it in a reasonably straightforward way. Maybe I can still do a point fix for what you are hitting though. What precisely are you hitting? Always a single MOV? And always from a page spanning source to a well aligned dest? Or more combinations than that? 
Paul > > Jan
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 15:35 ` Paul Durrant [not found] ` <5B6DB69D02000078001DD06A@prv1*mh.provo.novell.com> @ 2018-08-10 16:00 ` Jan Beulich 2018-08-10 16:30 ` George Dunlap 1 sibling, 1 reply; 41+ messages in thread From: Jan Beulich @ 2018-08-10 16:00 UTC (permalink / raw) To: Paul Durrant, George Dunlap; +Cc: Andrew Cooper, xen-devel >>> On 10.08.18 at 17:35, <Paul.Durrant@citrix.com> wrote: >> -----Original Message----- >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: 10 August 2018 16:31 >> To: Paul Durrant <Paul.Durrant@citrix.com> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen- >> devel@lists.xenproject.org> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote: >> >> -----Original Message----- >> >> From: Andrew Cooper >> >> Sent: 10 August 2018 13:56 >> >> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' >> >> <JBeulich@suse.com> >> >> Cc: xen-devel <xen-devel@lists.xenproject.org> >> >> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >> >> On 10/08/18 13:43, Paul Durrant wrote: >> >> >> -----Original Message----- >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com] >> >> >> Sent: 10 August 2018 13:37 >> >> >> To: Paul Durrant <Paul.Durrant@citrix.com> >> >> >> Cc: xen-devel <xen-devel@lists.xenproject.org> >> >> >> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >> >> >> >>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote: >> >> >>>> -----Original Message----- >> >> >>>> From: Jan Beulich [mailto:JBeulich@suse.com] >> >> >>>> Sent: 10 August 2018 13:13 >> >> >>>> To: Paul Durrant <Paul.Durrant@citrix.com> >> >> >>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >> >> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >>>> >> >> >>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote: >> >> >>>>>> -----Original Message----- >> >> >>>>>> From: Jan Beulich 
[mailto:JBeulich@suse.com] >> >> >>>>>> Sent: 10 August 2018 13:02 >> >> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >> >> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >> >> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >> >> >>>>>> >> >> >>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote: >> >> >>>>>>> These are probably both candidates for back-port. >> >> >>>>>>> >> >> >>>>>>> Paul Durrant (2): >> >> >>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >> >> direction >> >> >> flag >> >> >>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >> >> GFN >> >> >>>>>>> boundaries >> >> >>>>>>> >> >> >>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- >> >> >>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- >> >> >>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >> >> >>>>>> I take it this isn't yet what we've talked about yesterday on irc? >> >> >>>>>> >> >> >>>>> This is the band-aid fix. I can now show correct handling of a rep >> mov >> >> >>>>> walking off MMIO into RAM. >> >> >>>> But that's not the problem we're having. In our case the bad >> behavior >> >> >>>> is with a single MOV. That's why I had assumed that your plan to >> fiddle >> >> >>>> with null_handler would help in our case as well, while this series >> >> clearly >> >> >>>> won't (afaict). >> >> >>>> >> >> >>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >> >> behaviour >> >> >> though >> >> >>> as I understand it. Am I incorrect? >> >> >> I'm not aware of SDM or PM saying anything like this. Anyway, the >> >> >> specific case where this is being observed as an issue is when >> >> >> accessing the last few bytes of a normal RAM page followed by a >> >> >> ballooned out one. 
The balloon driver doesn't remove the virtual >> >> >> mapping of such pages (presumably in order to not shatter super >> >> >> pages); observation is with the old XenoLinux one, but from code >> >> >> inspection the upstream one behaves the same. >> >> >> >> >> >> Unless we want to change the balloon driver's behavior, at least >> >> >> this specific case needs to be considered having defined behavior, >> >> >> I think. >> >> >> >> >> > Ok. I'll see what I can do. >> >> >> >> It is a software error to try and cross boundaries. Modern processors >> >> do their best to try and cause the correct behaviour to occur, albeit >> >> with a massive disclaimer about the performance hit. Older processors >> >> didn't cope. >> >> >> >> As far as I'm concerned, its fine to terminate a emulation which crosses >> >> a boundary with the null ops. >> > >> > Alas we never even get as far as the I/O handlers in some circumstances... >> > >> > I just set up a variant of an XTF test doing a backwards rep movsd into a >> > well aligned stack buffer where source buffer starts 1 byte before a >> boundary >> > between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) >> detects that >> > both the source and dest of the initial rep are RAM, skips over the I/O >> > emulation calls, and then fails when the hvm_copy_from_guest_phys() >> > (unsurprisingly) fails to grab the 8 bytes for the initial rep. >> > So, any logic we add to deal with handling page spanning ops is going to >> > have to go in at the top level of instruction emulation... which I fear is >> > going to be quite a major change and not something that's going to be easy >> to >> > back-port. >> >> Well, wasn't it clear from the beginning that a proper fix would be too >> invasive to backport? And wasn't it for that reason that you intended >> to add a small hack first, to deal with just the case(s) that we currently >> have issues with? 
> > Well, given that I mistakenly understood you were hitting the same rep issue > that I was, I thought I could deal with it in a reasonably straightforward > way. Maybe I can still do a point fix for what you are hitting though. What > precisely are you hitting? Always a single MOV? And always from a page > spanning source to a well aligned dest? Or more combinations than that? Always a single, misaligned MOV spanning the boundary from a valid RAM page to a ballooned out one (Linux's load_unaligned_zeropad()). I meanwhile wonder though whether it might not be better to address this at the p2m level, by inserting an r/o mapping of an all-zeros page. George, do you have any opinion one way or the other? Jan
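[Editorial note: the guarantee Jan is relying on from load_unaligned_zeropad() can be modelled portably — a word load that starts in valid memory but runs off the end must yield the valid bytes with the remainder reading as zero. The real kernel performs a single (possibly faulting) word load and patches the result up in the fault handler; an r/o all-zeros page in the p2m would give the same observable result. The sketch below only models the outcome, assuming little-endian byte order.]

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/*
 * Portable model of the result load_unaligned_zeropad() must produce:
 * bytes beyond the valid region read as zero.  We compute the fixed-up
 * value directly instead of taking and repairing a fault.
 * Little-endian byte order is assumed.
 */
static uint64_t zeropad_load(const uint8_t *buf, size_t valid_len, size_t off)
{
    uint64_t word = 0;
    size_t avail = (off < valid_len) ? valid_len - off : 0;

    if (avail > sizeof(word))
        avail = sizeof(word);
    memcpy(&word, buf + off, avail); /* missing tail bytes stay zero */
    return word;
}
```

This is exactly the pattern of a misaligned MOV reading the last few bytes of a valid RAM page with the following (ballooned) page contributing zeros.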
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 16:00 ` Jan Beulich @ 2018-08-10 16:30 ` George Dunlap 2018-08-10 16:37 ` Andrew Cooper 0 siblings, 1 reply; 41+ messages in thread From: George Dunlap @ 2018-08-10 16:30 UTC (permalink / raw) To: Jan Beulich, Paul Durrant, George Dunlap; +Cc: Andrew Cooper, xen-devel On 08/10/2018 05:00 PM, Jan Beulich wrote: >>>> On 10.08.18 at 17:35, <Paul.Durrant@citrix.com> wrote: >>> -----Original Message----- >>> From: Jan Beulich [mailto:JBeulich@suse.com] >>> Sent: 10 August 2018 16:31 >>> To: Paul Durrant <Paul.Durrant@citrix.com> >>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen- >>> devel@lists.xenproject.org> >>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>> >>>>>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote: >>>>> -----Original Message----- >>>>> From: Andrew Cooper >>>>> Sent: 10 August 2018 13:56 >>>>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' >>>>> <JBeulich@suse.com> >>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>> >>>>> On 10/08/18 13:43, Paul Durrant wrote: >>>>>>> -----Original Message----- >>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>>>> Sent: 10 August 2018 13:37 >>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>> >>>>>>>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote: >>>>>>>>> -----Original Message----- >>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>>>>>> Sent: 10 August 2018 13:13 >>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>> >>>>>>>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@citrix.com> wrote: >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: Jan Beulich 
[mailto:JBeulich@suse.com] >>>>>>>>>>> Sent: 10 August 2018 13:02 >>>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>>> >>>>>>>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote: >>>>>>>>>>>> These are probably both candidates for back-port. >>>>>>>>>>>> >>>>>>>>>>>> Paul Durrant (2): >>>>>>>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >>>>> direction >>>>>>> flag >>>>>>>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >>>>> GFN >>>>>>>>>>>> boundaries >>>>>>>>>>>> >>>>>>>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- >>>>>>>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- >>>>>>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc? >>>>>>>>>>> >>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep >>> mov >>>>>>>>>> walking off MMIO into RAM. >>>>>>>>> But that's not the problem we're having. In our case the bad >>> behavior >>>>>>>>> is with a single MOV. That's why I had assumed that your plan to >>> fiddle >>>>>>>>> with null_handler would help in our case as well, while this series >>>>> clearly >>>>>>>>> won't (afaict). >>>>>>>>> >>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >>>>> behaviour >>>>>>> though >>>>>>>> as I understand it. Am I incorrect? >>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the >>>>>>> specific case where this is being observed as an issue is when >>>>>>> accessing the last few bytes of a normal RAM page followed by a >>>>>>> ballooned out one. The balloon driver doesn't remove the virtual >>>>>>> mapping of such pages (presumably in order to not shatter super >>>>>>> pages); observation is with the old XenoLinux one, but from code >>>>>>> inspection the upstream one behaves the same. 
>>>>>>> >>>>>>> Unless we want to change the balloon driver's behavior, at least >>>>>>> this specific case needs to be considered having defined behavior, >>>>>>> I think. >>>>>>> >>>>>> Ok. I'll see what I can do. >>>>> >>>>> It is a software error to try and cross boundaries. Modern processors >>>>> do their best to try and cause the correct behaviour to occur, albeit >>>>> with a massive disclaimer about the performance hit. Older processors >>>>> didn't cope. >>>>> >>>>> As far as I'm concerned, its fine to terminate a emulation which crosses >>>>> a boundary with the null ops. >>>> >>>> Alas we never even get as far as the I/O handlers in some circumstances... >>>> >>>> I just set up a variant of an XTF test doing a backwards rep movsd into a >>>> well aligned stack buffer where source buffer starts 1 byte before a >>> boundary >>>> between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) >>> detects that >>>> both the source and dest of the initial rep are RAM, skips over the I/O >>>> emulation calls, and then fails when the hvm_copy_from_guest_phys() >>>> (unsurprisingly) fails to grab the 8 bytes for the initial rep. >>>> So, any logic we add to deal with handling page spanning ops is going to >>>> have to go in at the top level of instruction emulation... which I fear is >>>> going to be quite a major change and not something that's going to be easy >>> to >>>> back-port. >>> >>> Well, wasn't it clear from the beginning that a proper fix would be too >>> invasive to backport? And wasn't it for that reason that you intended >>> to add a small hack first, to deal with just the case(s) that we currently >>> have issues with? >> >> Well, given that I mistakenly understood you were hitting the same rep issue >> that I was, I thought I could deal with it in a reasonably straightforward >> way. Maybe I can still do a point fix for what you are hitting though. What >> precisely are you hitting? Always a single MOV? 
And always from a page >> spanning source to a well aligned dest? Or more combinations than that? > > Always a single, misaligned MOV spanning the boundary from a valid > RAM page to a ballooned out one (Linux'es load_unaligned_zeropad()). > I meanwhile wonder though whether it might not be better to address > this at the p2m level, by inserting a r/o mapping of an all-zeros page. > George, do you have any opinion one way or the other? Sorry, what exactly is the issue here? Linux has a function called load_unaligned_zeropad() which is reading into a ballooned region? Fundamentally, a ballooned page is one which has been allocated to a device driver. I'm having a hard time coming up with a justification for having code which reads memory owned by B in the process of reading memory owned by A. Or is there some weird architectural reason that I'm not aware of? -George
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 16:30 ` George Dunlap @ 2018-08-10 16:37 ` Andrew Cooper 2018-08-13 6:50 ` Jan Beulich 2018-08-16 10:29 ` Jan Beulich 0 siblings, 2 replies; 41+ messages in thread From: Andrew Cooper @ 2018-08-10 16:37 UTC (permalink / raw) To: George Dunlap, Jan Beulich, Paul Durrant, George Dunlap; +Cc: xen-devel On 10/08/18 17:30, George Dunlap wrote: > On 08/10/2018 05:00 PM, Jan Beulich wrote: >>>>> On 10.08.18 at 17:35, <Paul.Durrant@citrix.com> wrote: >>>> -----Original Message----- >>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>> Sent: 10 August 2018 16:31 >>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; xen-devel <xen- >>>> devel@lists.xenproject.org> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>> >>>>>>> On 10.08.18 at 17:08, <Paul.Durrant@citrix.com> wrote: >>>>>> -----Original Message----- >>>>>> From: Andrew Cooper >>>>>> Sent: 10 August 2018 13:56 >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' >>>>>> <JBeulich@suse.com> >>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>> >>>>>> On 10/08/18 13:43, Paul Durrant wrote: >>>>>>>> -----Original Message----- >>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>>>>> Sent: 10 August 2018 13:37 >>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>> >>>>>>>>>>> On 10.08.18 at 14:22, <Paul.Durrant@citrix.com> wrote: >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>>>>>>> Sent: 10 August 2018 13:13 >>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>> >>>>>>>>>>>>> On 10.08.18 at 14:08, 
<Paul.Durrant@citrix.com> wrote: >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com] >>>>>>>>>>>> Sent: 10 August 2018 13:02 >>>>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com> >>>>>>>>>>>> Cc: xen-devel <xen-devel@lists.xenproject.org> >>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>>>> >>>>>>>>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@citrix.com> wrote: >>>>>>>>>>>>> These are probably both candidates for back-port. >>>>>>>>>>>>> >>>>>>>>>>>>> Paul Durrant (2): >>>>>>>>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >>>>>> direction >>>>>>>> flag >>>>>>>>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >>>>>> GFN >>>>>>>>>>>>> boundaries >>>>>>>>>>>>> >>>>>>>>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- >>>>>>>>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- >>>>>>>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >>>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc? >>>>>>>>>>>> >>>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep >>>> mov >>>>>>>>>>> walking off MMIO into RAM. >>>>>>>>>> But that's not the problem we're having. In our case the bad >>>> behavior >>>>>>>>>> is with a single MOV. That's why I had assumed that your plan to >>>> fiddle >>>>>>>>>> with null_handler would help in our case as well, while this series >>>>>> clearly >>>>>>>>>> won't (afaict). >>>>>>>>>> >>>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >>>>>> behaviour >>>>>>>> though >>>>>>>>> as I understand it. Am I incorrect? >>>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the >>>>>>>> specific case where this is being observed as an issue is when >>>>>>>> accessing the last few bytes of a normal RAM page followed by a >>>>>>>> ballooned out one. 
The balloon driver doesn't remove the virtual >>>>>>>> mapping of such pages (presumably in order to not shatter super >>>>>>>> pages); observation is with the old XenoLinux one, but from code >>>>>>>> inspection the upstream one behaves the same. >>>>>>>> >>>>>>>> Unless we want to change the balloon driver's behavior, at least >>>>>>>> this specific case needs to be considered having defined behavior, >>>>>>>> I think. >>>>>>>> >>>>>>> Ok. I'll see what I can do. >>>>>> It is a software error to try and cross boundaries. Modern processors >>>>>> do their best to try and cause the correct behaviour to occur, albeit >>>>>> with a massive disclaimer about the performance hit. Older processors >>>>>> didn't cope. >>>>>> >>>>>> As far as I'm concerned, its fine to terminate a emulation which crosses >>>>>> a boundary with the null ops. >>>>> Alas we never even get as far as the I/O handlers in some circumstances... >>>>> >>>>> I just set up a variant of an XTF test doing a backwards rep movsd into a >>>>> well aligned stack buffer where source buffer starts 1 byte before a >>>> boundary >>>>> between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) >>>> detects that >>>>> both the source and dest of the initial rep are RAM, skips over the I/O >>>>> emulation calls, and then fails when the hvm_copy_from_guest_phys() >>>>> (unsurprisingly) fails to grab the 8 bytes for the initial rep. >>>>> So, any logic we add to deal with handling page spanning ops is going to >>>>> have to go in at the top level of instruction emulation... which I fear is >>>>> going to be quite a major change and not something that's going to be easy >>>> to >>>>> back-port. >>>> Well, wasn't it clear from the beginning that a proper fix would be too >>>> invasive to backport? And wasn't it for that reason that you intended >>>> to add a small hack first, to deal with just the case(s) that we currently >>>> have issues with? 
>>> Well, given that I mistakenly understood you were hitting the same rep issue >>> that I was, I thought I could deal with it in a reasonably straightforward >>> way. Maybe I can still do a point fix for what you are hitting though. What >>> precisely are you hitting? Always a single MOV? And always from a page >>> spanning source to a well aligned dest? Or more combinations than that? >> Always a single, misaligned MOV spanning the boundary from a valid >> RAM page to a ballooned out one (Linux'es load_unaligned_zeropad()). >> I meanwhile wonder though whether it might not be better to address >> this at the p2m level, by inserting a r/o mapping of an all-zeros page. >> George, do you have any opinion one way or the other? > Sorry, what exactly is the issue here? Linux has a function called > load_unaligned_zeropad() which is reading into a ballooned region? > > Fundamentally, a ballooned page is one which has been allocated to a > device driver. I'm having a hard time coming up with a justification > for having code which reads memory owned by B in the process of reading > memory owned by A. Or is there some weird architectural reason that I'm > not aware of? The underlying issue is that the emulator can't cope with a single misaligned access which crosses RAM and MMIO. It gives up and presumably throws #UD back. One longstanding Xen bug is that simply ballooning a page out shouldn't be able to trigger MMIO emulation to begin with. It is a side effect of mixed p2m types, and the fix for this to have Xen understand the guest physmap layout. However, the real bug is Linux making such a misaligned access into a ballooned out page in the first place. This is a Linux kernel bug which (presumably) manifests in a very obvious way due to shortcomings in Xen's emulation handling. 
~Andrew
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 16:37 ` Andrew Cooper @ 2018-08-13 6:50 ` Jan Beulich 2018-08-16 11:08 ` Andrew Cooper 2018-08-29 10:36 ` Olaf Hering 2018-08-16 10:29 ` Jan Beulich 1 sibling, 2 replies; 41+ messages in thread From: Jan Beulich @ 2018-08-13 6:50 UTC (permalink / raw) To: Andrew Cooper, George Dunlap; +Cc: xen-devel, Paul Durrant, george.dunlap >>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote: > On 10/08/18 17:30, George Dunlap wrote: >> Sorry, what exactly is the issue here? Linux has a function called >> load_unaligned_zeropad() which is reading into a ballooned region? Yes. >> Fundamentally, a ballooned page is one which has been allocated to a >> device driver. I'm having a hard time coming up with a justification >> for having code which reads memory owned by B in the process of reading >> memory owned by A. Or is there some weird architectural reason that I'm >> not aware of? Well, they do this no matter who owns the successive page (or perhaps at a smaller granularity also the successive allocation). I guess their goal is to have just a single MOV in the common case (with the caller ignoring the uninteresting to it high bytes), while recovering gracefully from #PF should one occur. > The underlying issue is that the emulator can't cope with a single > misaligned access which crosses RAM and MMIO. It gives up and > presumably throws #UD back. We wouldn't have observed any problem if there was #UD in such a case, as Linux'es fault recovery code doesn't care what kind of fault has occurred. We're getting back a result of all ones, even for the part of the read that has actually hit the last few bytes of the present page. > One longstanding Xen bug is that simply ballooning a page out shouldn't > be able to trigger MMIO emulation to begin with. It is a side effect of > mixed p2m types, and the fix for this to have Xen understand the guest > physmap layout. 
And hence the consideration of mapping in an all zeros page instead. This is because of the way __hvmemul_read() / __hvm_copy() work: The latter doesn't tell its caller how many bytes it was able to read, and hence the former considers the entire range MMIO (and forwards the request for emulation). Of course all of this is an issue only because hvmemul_virtual_to_linear() sees no need to split the request at the page boundary, due to the balloon driver having left in place the mapping of the ballooned out page. Obviously the opposite case (access starting in a ballooned out page and crossing into an "ordinary" one) would have a similar issue, which is presumably even harder to fix without going the map-a-zero-page route (or Paul's suggested null_handler hack). > However, the real bug is Linux making such a misaligned access into a > ballooned out page in the first place. This is a Linux kernel bug which > (presumably) manifests in a very obvious way due to shortcomings in > Xen's emulation handling. I wouldn't dare to judge whether this is a bug, especially in light of the fact that they recover gracefully from the #PF that might result in the native case. Arguably the caller has to have some knowledge about what might live in the following page, so as to not inadvertently hit an MMIO page rather than a non-present mapping. But I'd leave such judgment to them; our business is to get working a case that works without Xen underneath. Jan
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-13 6:50 ` Jan Beulich @ 2018-08-16 11:08 ` Andrew Cooper 2018-08-29 10:36 ` Olaf Hering 1 sibling, 0 replies; 41+ messages in thread From: Andrew Cooper @ 2018-08-16 11:08 UTC (permalink / raw) To: Jan Beulich, George Dunlap; +Cc: xen-devel, Paul Durrant, george.dunlap On 13/08/18 07:50, Jan Beulich wrote: >>>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote: >> On 10/08/18 17:30, George Dunlap wrote: >>> Sorry, what exactly is the issue here? Linux has a function called >>> load_unaligned_zeropad() which is reading into a ballooned region? > Yes. > >>> Fundamentally, a ballooned page is one which has been allocated to a >>> device driver. I'm having a hard time coming up with a justification >>> for having code which reads memory owned by B in the process of reading >>> memory owned by A. Or is there some weird architectural reason that I'm >>> not aware of? > Well, they do this no matter who owns the successive page (or > perhaps at a smaller granularity also the successive allocation). > I guess their goal is to have just a single MOV in the common > case (with the caller ignoring the uninteresting to it high bytes), > while recovering gracefully from #PF should one occur. > >> The underlying issue is that the emulator can't cope with a single >> misaligned access which crosses RAM and MMIO. It gives up and >> presumably throws #UD back. > We wouldn't have observed any problem if there was #UD in > such a case, as Linux'es fault recovery code doesn't care what > kind of fault has occurred. We're getting back a result of all > ones, even for the part of the read that has actually hit the > last few bytes of the present page. > >> One longstanding Xen bug is that simply ballooning a page out shouldn't >> be able to trigger MMIO emulation to begin with. It is a side effect of >> mixed p2m types, and the fix for this to have Xen understand the guest >> physmap layout. 
> And hence the consideration of mapping in an all zeros page > instead. This is because of the way __hvmemul_read() / > __hvm_copy() work: The latter doesn't tell its caller how many > bytes it was able to read, and hence the former considers the > entire range MMIO (and forwards the request for emulation). > Of course all of this is an issue only because > hvmemul_virtual_to_linear() sees no need to split the request > at the page boundary, due to the balloon driver having left in > place the mapping of the ballooned out page. Actually, the more I think about this, the more of a bad idea emulating a zero page is. It gives the illusion of a working piece of zeroed RAM, except that writes definitely can't take effect. It's going to make bugs even more subtle. ~Andrew
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-13 6:50 ` Jan Beulich 2018-08-16 11:08 ` Andrew Cooper @ 2018-08-29 10:36 ` Olaf Hering 2018-08-29 10:53 ` Andrew Cooper 2018-08-30 8:10 ` Olaf Hering 1 sibling, 2 replies; 41+ messages in thread From: Olaf Hering @ 2018-08-29 10:36 UTC (permalink / raw) To: Jan Beulich Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel On Mon, Aug 13, Jan Beulich wrote: > And hence the consideration of mapping in an all zeros page > instead. This is because of the way __hvmemul_read() / > __hvm_copy() work: The latter doesn't tell its caller how many > bytes it was able to read, and hence the former considers the > entire range MMIO (and forwards the request for emulation). > Of course all of this is an issue only because > hvmemul_virtual_to_linear() sees no need to split the request > at the page boundary, due to the balloon driver having left in > place the mapping of the ballooned out page. Should perhaps __hvm_copy detect the fault and copy 0xf for the unavailable page into 'buf', and finally return success? Clearly something must be done at the Xen level. Olaf
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-29 10:36 ` Olaf Hering @ 2018-08-29 10:53 ` Andrew Cooper 2018-08-29 11:00 ` Olaf Hering 0 siblings, 1 reply; 41+ messages in thread From: Andrew Cooper @ 2018-08-29 10:53 UTC (permalink / raw) To: Olaf Hering, Jan Beulich Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap On 29/08/18 11:36, Olaf Hering wrote: > On Mon, Aug 13, Jan Beulich wrote: > >> And hence the consideration of mapping in an all zeros page >> instead. This is because of the way __hvmemul_read() / >> __hvm_copy() work: The latter doesn't tell its caller how many >> bytes it was able to read, and hence the former considers the >> entire range MMIO (and forwards the request for emulation). >> Of course all of this is an issue only because >> hvmemul_virtual_to_linear() sees no need to split the request >> at the page boundary, due to the balloon driver having left in >> place the mapping of the ballooned out page. > Should perhaps __hvm_copy detect the fault and copy 0xf for the > unavailable page into 'buf', and finally return success? > > Clearly something must be done at the Xen level. This is first and foremost a Linux bug. No amount of fixing Xen is going to alter that. Architecturally speaking, handing #MC back is probably the closest we can get to sensible behaviour, but it is still a bug that Linux is touching the ballooned out page in the first place. ~Andrew
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-29 10:53 ` Andrew Cooper @ 2018-08-29 11:00 ` Olaf Hering 2018-08-29 11:09 ` Andrew Cooper 0 siblings, 1 reply; 41+ messages in thread From: Olaf Hering @ 2018-08-29 11:00 UTC (permalink / raw) To: Andrew Cooper Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich On Wed, Aug 29, Andrew Cooper wrote: > Architecturally speaking, handing #MC back is probably the closest we > can get to sensible behaviour, but it is still a bug that Linux is > touching the ballooned out page in the first place. Well, the issue is that a read crosses a page boundary. If that would be forbidden, load_unaligned_zeropad() would not exist. It can not know what is in the following page. And such page crossing happens also in the unballooned case. Sadly I can not trigger the reported NFS bug myself. But it can be enforced by ballooning enough pages so that an allocated readdir reply eventually is right in front of a ballooned page. Olaf
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-29 11:00 ` Olaf Hering @ 2018-08-29 11:09 ` Andrew Cooper 2018-08-29 11:14 ` Andrew Cooper ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Andrew Cooper @ 2018-08-29 11:09 UTC (permalink / raw) To: Olaf Hering Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich On 29/08/18 12:00, Olaf Hering wrote: > On Wed, Aug 29, Andrew Cooper wrote: > >> Architecturally speaking, handing #MC back is probably the closest we >> can get to sensible behaviour, but it is still a bug that Linux is >> touching the ballooned out page in the first place. > Well, the issue is that a read crosses a page boundary. If that would be > forbidden, load_unaligned_zeropad() would not exist. It can not know > what is in the following page. And such page crossing happens also in > the unballooned case. Sadly I can not trigger the reported NFS bug > myself. But it can be enforced by ballooning enough pages so that an > allocated readdir reply eventually is right in front of a ballooned > page. The Linux bug is not shooting the ballooned page out of the directmap. Linux should be taking a fatal #PF for that read, because it's a virtual mapping for a frame which Linux has voluntarily elected to make invalid. As Xen can't prevent Linux from making/maintaining such an invalid mapping, throwing #MC back is the next best thing, because terminating the access with ~0 is just going to hide the bug, and run at a glacial pace while doing so. ~Andrew
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-29 11:09 ` Andrew Cooper @ 2018-08-29 11:14 ` Andrew Cooper 2018-08-29 11:26 ` Juergen Gross [not found] ` <5B86773A0200004903F324A0@prv1-mh.provo.novell.com> 2 siblings, 0 replies; 41+ messages in thread From: Andrew Cooper @ 2018-08-29 11:14 UTC (permalink / raw) To: Olaf Hering Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich On 29/08/18 12:09, Andrew Cooper wrote: > On 29/08/18 12:00, Olaf Hering wrote: >> On Wed, Aug 29, Andrew Cooper wrote: >> >>> Architecturally speaking, handing #MC back is probably the closest we >>> can get to sensible behaviour, but it is still a bug that Linux is >>> touching the ballooned out page in the first place. >> Well, the issue is that a read crosses a page boundary. If that would be >> forbidden, load_unaligned_zeropad() would not exist. It can not know >> what is in the following page. And such page crossing happens also in >> the unballooned case. Sadly I can not trigger the reported NFS bug >> myself. But it can be enforced by ballooning enough pages so that an >> allocated readdir reply eventually is right in front of a ballooned >> page. > The Linux bug is not shooting the ballooned page out of the directmap. > Linux should be taking a fatal #PF for that read, because its a virtual > mapping for a frame which Linux has voluntarily elected to make invalid. > > As Xen can't prevent Linux from making/maintaining such an invalid > mapping, throwing #MC back is the next best thing, because terminating > the access with ~0 is just going to hide the bug, and run at a glacial > pace while doing so. Yes - having looked at load_unaligned_zeropad(), it exists explicitly to cope with #PF occurring from the page crossing. This re-enforces my opinion that the underlying bug is not shooting ballooned-out pages out of the directmap. 
~Andrew
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-29 11:09 ` Andrew Cooper 2018-08-29 11:14 ` Andrew Cooper @ 2018-08-29 11:26 ` Juergen Gross [not found] ` <5B86773A0200004903F324A0@prv1-mh.provo.novell.com> 2 siblings, 0 replies; 41+ messages in thread From: Juergen Gross @ 2018-08-29 11:26 UTC (permalink / raw) To: Andrew Cooper, Olaf Hering Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap, Jan Beulich On 29/08/18 13:09, Andrew Cooper wrote: > On 29/08/18 12:00, Olaf Hering wrote: >> On Wed, Aug 29, Andrew Cooper wrote: >> >>> Architecturally speaking, handing #MC back is probably the closest we >>> can get to sensible behaviour, but it is still a bug that Linux is >>> touching the ballooned out page in the first place. >> Well, the issue is that a read crosses a page boundary. If that would be >> forbidden, load_unaligned_zeropad() would not exist. It can not know >> what is in the following page. And such page crossing happens also in >> the unballooned case. Sadly I can not trigger the reported NFS bug >> myself. But it can be enforced by ballooning enough pages so that an >> allocated readdir reply eventually is right in front of a ballooned >> page. > > The Linux bug is not shooting the ballooned page out of the directmap. > Linux should be taking a fatal #PF for that read, because its a virtual > mapping for a frame which Linux has voluntarily elected to make invalid. > > As Xen can't prevent Linux from making/maintaining such an invalid > mapping, throwing #MC back is the next best thing, because terminating > the access with ~0 is just going to hide the bug, and run at a glacial > pace while doing so. I think you are right: the kernel should in no case access a random page without knowing it is RAM. Hitting a ballooned page is just much more probable than hitting a MMIO page this way. There are _no_ guard pages around MMIO areas, so it could in theory happen that load_unaligned_zeropad() would access MMIO area triggering random behavior. 
So removing ballooned pages from the directmap just hides an underlying problem. Juergen
* Re: [PATCH 0/2] MMIO emulation fixes [not found] ` <5B867F020200009E04E46402@prv1-mh.provo.novell.com> @ 2018-08-29 12:06 ` Jan Beulich 0 siblings, 0 replies; 41+ messages in thread From: Jan Beulich @ 2018-08-29 12:06 UTC (permalink / raw) To: Andrew Cooper Cc: George Dunlap, xen-devel, Olaf Hering, Paul Durrant, george.dunlap >>> On 29.08.18 at 13:09, <andrew.cooper3@citrix.com> wrote: > On 29/08/18 12:00, Olaf Hering wrote: >> On Wed, Aug 29, Andrew Cooper wrote: >> >>> Architecturally speaking, handing #MC back is probably the closest we >>> can get to sensible behaviour, but it is still a bug that Linux is >>> touching the ballooned out page in the first place. >> Well, the issue is that a read crosses a page boundary. If that would be >> forbidden, load_unaligned_zeropad() would not exist. It can not know >> what is in the following page. And such page crossing happens also in >> the unballooned case. Sadly I can not trigger the reported NFS bug >> myself. But it can be enforced by ballooning enough pages so that an >> allocated readdir reply eventually is right in front of a ballooned >> page. > > The Linux bug is not shooting the ballooned page out of the directmap. > Linux should be taking a fatal #PF for that read, because its a virtual > mapping for a frame which Linux has voluntarily elected to make invalid. > > As Xen can't prevent Linux from making/maintaining such an invalid > mapping, throwing #MC back is the next best thing, because terminating > the access with ~0 is just going to hide the bug, and run at a glacial > pace while doing so. I still do not understand why you think so: Handing back ~0 is far more correct than raising #MC imo. The x86 architecture is bound to its history, and in pre-Pentium days there was no #MC to be raised in the first place. Furthermore, while I can see that _some other_ bug may be hidden this way, there's no bug at all to be hidden in load_unaligned_zeropad() (leaving aside the balloon driver behavior).
Jan
* Re: [PATCH 0/2] MMIO emulation fixes [not found] ` <5B87A68A0200001C04E5493A@prv1-mh.provo.novell.com> @ 2018-08-30 8:23 ` Jan Beulich 2018-08-30 10:42 ` Olaf Hering 0 siblings, 1 reply; 41+ messages in thread From: Jan Beulich @ 2018-08-30 8:23 UTC (permalink / raw) To: Olaf Hering Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel >>> On 30.08.18 at 10:10, <olaf@aepfle.de> wrote: > On Wed, Aug 29, Olaf Hering wrote: > >> On Mon, Aug 13, Jan Beulich wrote: >> >> > And hence the consideration of mapping in an all zeros page >> > instead. This is because of the way __hvmemul_read() / >> > __hvm_copy() work: The latter doesn't tell its caller how many >> > bytes it was able to read, and hence the former considers the >> > entire range MMIO (and forwards the request for emulation). >> > Of course all of this is an issue only because >> > hvmemul_virtual_to_linear() sees no need to split the request >> > at the page boundary, due to the balloon driver having left in >> > place the mapping of the ballooned out page. > > So how is this bug supposed to be fixed? > > What I see in my tracing is that __hvmemul_read gets called with > gla==ffff880223bffff9/bytes==8. Then hvm_copy_from_guest_linear fills > the buffer from gpa 223bffff9 with data, but finally it returns > HVMTRANS_bad_gfn_to_mfn, which it got from a failed get_page_from_gfn > for the second page. > > Now things go downhill. hvmemul_linear_mmio_read is called, which calls > hvmemul_do_io/hvm_io_intercept. That returns X86EMUL_UNHANDLEABLE. As a > result hvm_process_io_intercept(null_handler) is called, which > overwrites the return buffer with 0xff. There are a number of options (besides fixing the issue on the Linux side, which I continue to not be entirely convinced of being the best approach): One is Paul's idea of making null_handler actually retrieve RAM contents when (part of) the access touches RAM. 
Another might be to make __hvm_copy() return back what parts of the access could be read/written (so that MMIO emulation would only be triggered for the missing piece). A third might be to make the splitting of accesses more intelligent in __hvmemul_read(). I'm meaning to look into this in some more detail later today, unless a patch has appeared by then from e.g. Paul. Jan
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-30 8:23 ` Jan Beulich @ 2018-08-30 10:42 ` Olaf Hering 0 siblings, 0 replies; 41+ messages in thread From: Olaf Hering @ 2018-08-30 10:42 UTC (permalink / raw) To: Jan Beulich Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel On Thu, Aug 30, Jan Beulich wrote: > approach): One is Paul's idea of making null_handler actually retrieve > RAM contents when (part of) the access touches RAM. Another might This works for me:

static int null_read(const struct hvm_io_handler *io_handler,
                     uint64_t addr, uint32_t size, uint64_t *data)
{
    struct vcpu *curr = current;
    struct domain *currd = curr->domain;
    p2m_type_t p2mt = p2m_invalid;
    unsigned long gmfn = paddr_to_pfn(addr);
    struct page_info *page;
    char *p;

    get_gfn_query_unlocked(currd, gmfn, &p2mt);

    if ( p2mt != p2m_ram_rw )
    {
        *data = ~0ul;
    }
    else
    {
        page = get_page_from_gfn(currd, gmfn, NULL, P2M_UNSHARE);
        if ( !page )
        {
            memset(data, 0xee, size);
        }
        else
        {
            p = (char *)__map_domain_page(page) + (addr & ~PAGE_MASK);
            memcpy(data, p, size);
            unmap_domain_page(p);
            put_page(page);
        }
    }

    return X86EMUL_OKAY;
}

Olaf
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-29 10:36 ` Olaf Hering 2018-08-29 10:53 ` Andrew Cooper @ 2018-08-30 8:10 ` Olaf Hering 1 sibling, 0 replies; 41+ messages in thread From: Olaf Hering @ 2018-08-30 8:10 UTC (permalink / raw) To: Paul Durrant Cc: George Dunlap, Andrew Cooper, george.dunlap, Jan Beulich, xen-devel On Wed, Aug 29, Olaf Hering wrote: > On Mon, Aug 13, Jan Beulich wrote: > > > And hence the consideration of mapping in an all zeros page > > instead. This is because of the way __hvmemul_read() / > > __hvm_copy() work: The latter doesn't tell its caller how many > > bytes it was able to read, and hence the former considers the > > entire range MMIO (and forwards the request for emulation). > > Of course all of this is an issue only because > > hvmemul_virtual_to_linear() sees no need to split the request > > at the page boundary, due to the balloon driver having left in > > place the mapping of the ballooned out page. So how is this bug supposed to be fixed? What I see in my tracing is that __hvmemul_read gets called with gla==ffff880223bffff9/bytes==8. Then hvm_copy_from_guest_linear fills the buffer from gpa 223bffff9 with data, but finally it returns HVMTRANS_bad_gfn_to_mfn, which it got from a failed get_page_from_gfn for the second page. Now things go downhill. hvmemul_linear_mmio_read is called, which calls hvmemul_do_io/hvm_io_intercept. That returns X86EMUL_UNHANDLEABLE. As a result hvm_process_io_intercept(null_handler) is called, which overwrites the return buffer with 0xff. Olaf
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-10 16:37 ` Andrew Cooper 2018-08-13 6:50 ` Jan Beulich @ 2018-08-16 10:29 ` Jan Beulich 2018-08-16 10:56 ` Andrew Cooper 1 sibling, 1 reply; 41+ messages in thread From: Jan Beulich @ 2018-08-16 10:29 UTC (permalink / raw) To: Boris Ostrovsky, Juergen Gross Cc: George Dunlap, Andrew Cooper, Paul Durrant, george.dunlap, xen-devel >>> On 13.08.18 at 08:50, wrote: >>>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote: > > On 10/08/18 17:30, George Dunlap wrote: > >> Sorry, what exactly is the issue here? Linux has a function called > >> load_unaligned_zeropad() which is reading into a ballooned region? > > Yes. > > >> Fundamentally, a ballooned page is one which has been allocated to a > >> device driver. I'm having a hard time coming up with a justification > >> for having code which reads memory owned by B in the process of reading > >> memory owned by A. Or is there some weird architectural reason that I'm > >> not aware of? > > Well, they do this no matter who owns the successive page (or > perhaps at a smaller granularity also the successive allocation). > I guess their goal is to have just a single MOV in the common > case (with the caller ignoring the uninteresting to it high bytes), > while recovering gracefully from #PF should one occur. > > > The underlying issue is that the emulator can't cope with a single > > misaligned access which crosses RAM and MMIO. It gives up and > > presumably throws #UD back. > > We wouldn't have observed any problem if there was #UD in > such a case, as Linux'es fault recovery code doesn't care what > kind of fault has occurred. We're getting back a result of all > ones, even for the part of the read that has actually hit the > last few bytes of the present page. > > > One longstanding Xen bug is that simply ballooning a page out shouldn't > > be able to trigger MMIO emulation to begin with. 
It is a side effect of > > mixed p2m types, and the fix for this is to have Xen understand the guest > > physmap layout. > > And hence the consideration of mapping in an all zeros page > instead. This is because of the way __hvmemul_read() / > __hvm_copy() work: The latter doesn't tell its caller how many > bytes it was able to read, and hence the former considers the > entire range MMIO (and forwards the request for emulation). > Of course all of this is an issue only because > hvmemul_virtual_to_linear() sees no need to split the request > at the page boundary, due to the balloon driver having left in > place the mapping of the ballooned out page. > > Obviously the opposite case (access starting in a ballooned > out page and crossing into an "ordinary" one) would have a > similar issue, which is presumably even harder to fix without > going the map-a-zero-page route (or Paul's suggested > null_handler hack). > > > However, the real bug is Linux making such a misaligned access into a > > ballooned out page in the first place. This is a Linux kernel bug which > > (presumably) manifests in a very obvious way due to shortcomings in > > Xen's emulation handling. > > I wouldn't dare to judge whether this is a bug, especially in > light that they recover gracefully from the #PF that might result in > the native case. Arguably the caller has to have some knowledge > about what might live in the following page, as to not inadvertently > hit an MMIO page rather than a non-present mapping. But I'd > leave such judgment to them; our business is to get working a case > that is working without Xen underneath. Following some further discussion with Andrew, he looks to be convinced that the issue is to be fixed in the balloon driver, which so far (intentionally afaict) does not remove the direct map entries for ballooned out pages in the HVM case. 
I'm not convinced of this, but I'd nevertheless like to inquire whether such a change (resulting in shattered super page mappings) would be acceptable in the first place. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-16 10:29 ` Jan Beulich @ 2018-08-16 10:56 ` Andrew Cooper 2018-08-16 11:27 ` Jan Beulich 0 siblings, 1 reply; 41+ messages in thread From: Andrew Cooper @ 2018-08-16 10:56 UTC (permalink / raw) To: Jan Beulich, Boris Ostrovsky, Juergen Gross Cc: George Dunlap, xen-devel, Paul Durrant, george.dunlap On 16/08/18 11:29, Jan Beulich wrote: >>>> On 13.08.18 at 08:50, wrote: >>>>> On 10.08.18 at 18:37, <andrew.cooper3@citrix.com> wrote: >>> On 10/08/18 17:30, George Dunlap wrote: >>>> Sorry, what exactly is the issue here? Linux has a function called >>>> load_unaligned_zeropad() which is reading into a ballooned region? >> Yes. >> >>>> Fundamentally, a ballooned page is one which has been allocated to a >>>> device driver. I'm having a hard time coming up with a justification >>>> for having code which reads memory owned by B in the process of reading >>>> memory owned by A. Or is there some weird architectural reason that I'm >>>> not aware of? >> Well, they do this no matter who owns the successive page (or >> perhaps at a smaller granularity also the successive allocation). >> I guess their goal is to have just a single MOV in the common >> case (with the caller ignoring the uninteresting to it high bytes), >> while recovering gracefully from #PF should one occur. >> >>> The underlying issue is that the emulator can't cope with a single >>> misaligned access which crosses RAM and MMIO. It gives up and >>> presumably throws #UD back. >> We wouldn't have observed any problem if there was #UD in >> such a case, as Linux'es fault recovery code doesn't care what >> kind of fault has occurred. We're getting back a result of all >> ones, even for the part of the read that has actually hit the >> last few bytes of the present page. >> >>> One longstanding Xen bug is that simply ballooning a page out shouldn't >>> be able to trigger MMIO emulation to begin with. 
It is a side effect of >>> mixed p2m types, and the fix for this is to have Xen understand the guest >>> physmap layout. >> And hence the consideration of mapping in an all zeros page >> instead. This is because of the way __hvmemul_read() / >> __hvm_copy() work: The latter doesn't tell its caller how many >> bytes it was able to read, and hence the former considers the >> entire range MMIO (and forwards the request for emulation). >> Of course all of this is an issue only because >> hvmemul_virtual_to_linear() sees no need to split the request >> at the page boundary, due to the balloon driver having left in >> place the mapping of the ballooned out page. >> >> Obviously the opposite case (access starting in a ballooned >> out page and crossing into an "ordinary" one) would have a >> similar issue, which is presumably even harder to fix without >> going the map-a-zero-page route (or Paul's suggested >> null_handler hack). >> >>> However, the real bug is Linux making such a misaligned access into a >>> ballooned out page in the first place. This is a Linux kernel bug which >>> (presumably) manifests in a very obvious way due to shortcomings in >>> Xen's emulation handling. >> I wouldn't dare to judge whether this is a bug, especially in >> light that they recover gracefully from the #PF that might result in >> the native case. Arguably the caller has to have some knowledge >> about what might live in the following page, as to not inadvertently >> hit an MMIO page rather than a non-present mapping. But I'd >> leave such judgment to them; our business is to get working a case >> that is working without Xen underneath. > Following some further discussion with Andrew, he looks to be > convinced that the issue is to be fixed in the balloon driver, > which so far (intentionally afaict) does not remove the direct > map entries for ballooned out pages in the HVM case. 
I'm not > convinced of this, but I'd nevertheless like to inquire whether > such a change (resulting in shattered super page mappings) > would be acceptable in the first place. We don't tolerate anything else in the directmap pointing to invalid/unimplemented frames. Why should ballooning be any different? ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/2] MMIO emulation fixes 2018-08-16 10:56 ` Andrew Cooper @ 2018-08-16 11:27 ` Jan Beulich 0 siblings, 0 replies; 41+ messages in thread From: Jan Beulich @ 2018-08-16 11:27 UTC (permalink / raw) To: Andrew Cooper Cc: Juergen Gross, George Dunlap, george.dunlap, Paul Durrant, xen-devel, Boris Ostrovsky >>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote: > On 16/08/18 11:29, Jan Beulich wrote: >> Following some further discussion with Andrew, he looks to be >> convinced that the issue is to be fixed in the balloon driver, >> which so far (intentionally afaict) does not remove the direct >> map entries for ballooned out pages in the HVM case. I'm not >> convinced of this, but I'd nevertheless like to inquire whether >> such a change (resulting in shattered super page mappings) >> would be acceptable in the first place. > > We don't tolerate anything else in the directmap pointing to > invalid/unimplemented frames. Why should ballooning be any different? Because ballooning is something virtualization specific, which doesn't have any equivalent on bare hardware (memory hot unplug doesn't come quite close enough imo, not the least because that doesn't work on a page granular basis). Hence we're to define the exact behavior here, and hence such a definition could as well include special behavior of accesses to the involved guest-physical addresses. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/2] MMIO emulation fixes [not found] ` <dd3c99c2-67e3-faf1-4219-85651b89?= =?UTF-8?Q?1adc@suse.com> @ 2018-09-04 16:24 ` Andrew Cooper [not found] ` <651CBD680200008737554D14@prv1-mh.provo.novell.com> 0 siblings, 1 reply; 41+ messages in thread From: Andrew Cooper @ 2018-09-04 16:24 UTC (permalink / raw) To: Juergen Gross, Jan Beulich Cc: George Dunlap, xen-devel, Boris Ostrovsky, Paul Durrant, George Dunlap On 04/09/18 17:11, Juergen Gross wrote: > On 16/08/18 13:27, Jan Beulich wrote: >>>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote: >>> On 16/08/18 11:29, Jan Beulich wrote: >>>> Following some further discussion with Andrew, he looks to be >>>> convinced that the issue is to be fixed in the balloon driver, >>>> which so far (intentionally afaict) does not remove the direct >>>> map entries for ballooned out pages in the HVM case. I'm not >>>> convinced of this, but I'd nevertheless like to inquire whether >>>> such a change (resulting in shattered super page mappings) >>>> would be acceptable in the first place. >>> We don't tolerate anything else in the directmap pointing to >>> invalid/unimplemented frames. Why should ballooning be any different? >> Because ballooning is something virtualization specific, which >> doesn't have any equivalent on bare hardware (memory hot >> unplug doesn't come quite close enough imo, not the least >> because that doesn't work on a page granular basis). Hence >> we're to define the exact behavior here, and hence such a >> definition could as well include special behavior of accesses >> to the involved guest-physical addresses. > After discussing the issue with some KVM guys I still think it would be > better to leave the ballooned pages mapped in the direct map. KVM does > it the same way. They return "something" in case the guest tries to > read from such a page (might be the real data, 0's or all 1's). 
> > So we should either map an all 0's or 1's page via EPT, or we should > return 0's or 1's via emulation of the read instruction. > > Performance shouldn't be a major issue, as such reads should be really > rare. Such reads should be non-existent. One way or another, there's still a bug to fix in the kernel, because it isn't keeping suitable track of the pfns. As for how Xen could do things better... We could map a page of all-ones (all zeroes would definitely be wrong), but you've still got the problem of what happens if a write occurs. We absolutely can't sacrifice enough RAM to fill in the ballooned-out frames with read/write frames. I'd prefer not to see any emulation here, but that is more for an attack surface limitation point of view. x86 still offers us the option to not tolerate misaligned accesses and terminate early write-discard when hitting one of these pages. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/2] MMIO emulation fixes [not found] ` <A283E656020000808E2C01CD@prv1-mh.provo.novell.com> @ 2018-09-05 7:10 ` Jan Beulich 0 siblings, 0 replies; 41+ messages in thread From: Jan Beulich @ 2018-09-05 7:10 UTC (permalink / raw) To: Andrew Cooper Cc: Juergen Gross, George Dunlap, george.dunlap, Paul Durrant, xen-devel, Boris Ostrovsky >>> On 04.09.18 at 18:24, <andrew.cooper3@citrix.com> wrote: > On 04/09/18 17:11, Juergen Gross wrote: >> On 16/08/18 13:27, Jan Beulich wrote: >>>>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote: >>>> On 16/08/18 11:29, Jan Beulich wrote: >>>>> Following some further discussion with Andrew, he looks to be >>>>> convinced that the issue is to be fixed in the balloon driver, >>>>> which so far (intentionally afaict) does not remove the direct >>>>> map entries for ballooned out pages in the HVM case. I'm not >>>>> convinced of this, but I'd nevertheless like to inquire whether >>>>> such a change (resulting in shattered super page mappings) >>>>> would be acceptable in the first place. >>>> We don't tolerate anything else in the directmap pointing to >>>> invalid/unimplemented frames. Why should ballooning be any different? >>> Because ballooning is something virtualization specific, which >>> doesn't have any equivalent on bare hardware (memory hot >>> unplug doesn't come quite close enough imo, not the least >>> because that doesn't work on a page granular basis). Hence >>> we're to define the exact behavior here, and hence such a >>> definition could as well include special behavior of accesses >>> to the involved guest-physical addresses. >> After discussing the issue with some KVM guys I still think it would be >> better to leave the ballooned pages mapped in the direct map. KVM does >> it the same way. They return "something" in case the guest tries to >> read from such a page (might be the real data, 0's or all 1's). 
>> >> So we should either map an all 0's or 1's page via EPT, or we should >> return 0's or 1's via emulation of the read instruction. >> >> Performance shouldn't be a major issue, as such reads should be really >> rare. > > Such reads should be non-existent. One way or another, there's still a > bug to fix in the kernel, because it isn't keeping suitable track of the > pfns. So you put yourself in opposition to both what KVM and Xen do in their Linux implementations. I can only re-iterate: We're talking about a PV extension here. Behavior of this is entirely defined by us. Hence it is not a given that "such reads should be non-existent". > As for how Xen could do things better... > > We could map a page of all-ones (all zeroes would definitely be wrong), > but you've still got the problem of what happens if a write occurs. We > absolutely can't sacrifice enough RAM to fill in the ballooned-out > frames with read/write frames. Of course, or else the ballooning effect would be nullified. However, besides a page full of 0s or 1s, a simple "sink" page could also be used, where reads return undefined data (i.e. whatever has last been written into it through one of its perhaps very many aliases). Another possibility for the sink page would be a (hardware) MMIO one we know has no actual device backing it, thus allowing writes to be terminated (discarded) by hardware, and reads to return all ones (again due to hardware behavior). The question is how we would universally find such a page (accesses to which must obviously not have any other side effects). > I'd prefer not to see any emulation here, but that is more for an attack > surface limitation point of view. x86 still offers us the option to not > tolerate misaligned accesses and terminate early write-discard when > hitting one of these pages. Well - for now we have the series hopefully fixing the emulation misbehavior here (and elsewhere at the same time). 
But I certainly appreciate your desire for there not being emulation here in the first place. Which I think leaves as the only option the sink page described above. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/2] MMIO emulation fixes [not found] ` <5B755FBC0200007?= =?UTF-8?Q?8001DEDBF@suse.com> @ 2018-08-16 12:52 ` Juergen Gross 2018-09-04 16:11 ` Juergen Gross 1 sibling, 0 replies; 41+ messages in thread From: Juergen Gross @ 2018-08-16 12:52 UTC (permalink / raw) To: Jan Beulich, Andrew Cooper Cc: George Dunlap, xen-devel, Boris Ostrovsky, Paul Durrant, George Dunlap On 16/08/18 13:27, Jan Beulich wrote: >>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote: >> On 16/08/18 11:29, Jan Beulich wrote: >>> Following some further discussion with Andrew, he looks to be >>> convinced that the issue is to be fixed in the balloon driver, >>> which so far (intentionally afaict) does not remove the direct >>> map entries for ballooned out pages in the HVM case. I'm not >>> convinced of this, but I'd nevertheless like to inquire whether >>> such a change (resulting in shattered super page mappings) >>> would be acceptable in the first place. >> >> We don't tolerate anything else in the directmap pointing to >> invalid/unimplemented frames. Why should ballooning be any different? > > Because ballooning is something virtualization specific, which > doesn't have any equivalent on bare hardware (memory hot > unplug doesn't come quite close enough imo, not the least > because that doesn't work on a page granular basis). Hence > we're to define the exact behavior here, and hence such a > definition could as well include special behavior of accesses > to the involved guest-physical addresses. If I read drivers/virtio/virtio_balloon.c correctly KVM does the same as Xen: ballooned pages are _not_ removed from the direct mappings. Juergen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/2] MMIO emulation fixes [not found] ` <5B755FBC0200007?= =?UTF-8?Q?8001DEDBF@suse.com> 2018-08-16 12:52 ` Juergen Gross @ 2018-09-04 16:11 ` Juergen Gross 1 sibling, 0 replies; 41+ messages in thread From: Juergen Gross @ 2018-09-04 16:11 UTC (permalink / raw) To: Jan Beulich, Andrew Cooper Cc: George Dunlap, xen-devel, Boris Ostrovsky, Paul Durrant, George Dunlap On 16/08/18 13:27, Jan Beulich wrote: >>>> On 16.08.18 at 12:56, <andrew.cooper3@citrix.com> wrote: >> On 16/08/18 11:29, Jan Beulich wrote: >>> Following some further discussion with Andrew, he looks to be >>> convinced that the issue is to be fixed in the balloon driver, >>> which so far (intentionally afaict) does not remove the direct >>> map entries for ballooned out pages in the HVM case. I'm not >>> convinced of this, but I'd nevertheless like to inquire whether >>> such a change (resulting in shattered super page mappings) >>> would be acceptable in the first place. >> >> We don't tolerate anything else in the directmap pointing to >> invalid/unimplemented frames. Why should ballooning be any different? > > Because ballooning is something virtualization specific, which > doesn't have any equivalent on bare hardware (memory hot > unplug doesn't come quite close enough imo, not the least > because that doesn't work on a page granular basis). Hence > we're to define the exact behavior here, and hence such a > definition could as well include special behavior of accesses > to the involved guest-physical addresses. After discussing the issue with some KVM guys I still think it would be better to leave the ballooned pages mapped in the direct map. KVM does it the same way. They return "something" in case the guest tries to read from such a page (might be the real data, 0's or all 1's). So we should either map an all 0's or 1's page via EPT, or we should return 0's or 1's via emulation of the read instruction. Performance shouldn't be a major issue, as such reads should be really rare. 
Juergen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2018-09-05 7:10 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-10 10:37 [PATCH 0/2] MMIO emulation fixes Paul Durrant
2018-08-10 10:37 ` [PATCH 1/2] x86/hvm/ioreq: MMIO range checking completely ignores direction flag Paul Durrant
2018-08-10 11:11 ` Andrew Cooper
2018-08-10 10:37 ` [PATCH 2/2] x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries Paul Durrant
2018-08-10 11:14 ` Andrew Cooper
2018-08-10 11:50 ` Paul Durrant
2018-08-10 11:50 ` Andrew Cooper
2018-08-10 11:59 ` Jan Beulich
2018-08-10 12:10 ` Paul Durrant
2018-08-10 12:01 ` [PATCH 0/2] MMIO emulation fixes Jan Beulich
2018-08-10 12:08 ` Paul Durrant
2018-08-10 12:13 ` Jan Beulich
2018-08-10 12:22 ` Paul Durrant
2018-08-10 12:37 ` Jan Beulich
2018-08-10 12:43 ` Paul Durrant
2018-08-10 12:55 ` Andrew Cooper
2018-08-10 15:08 ` Paul Durrant
2018-08-10 15:30 ` Jan Beulich
2018-08-10 15:35 ` Paul Durrant
2018-08-10 16:00 ` Jan Beulich
2018-08-10 16:30 ` George Dunlap
2018-08-10 16:37 ` Andrew Cooper
2018-08-13  6:50 ` Jan Beulich
2018-08-16 11:08 ` Andrew Cooper
2018-08-29 10:36 ` Olaf Hering
2018-08-29 10:53 ` Andrew Cooper
2018-08-29 11:00 ` Olaf Hering
2018-08-29 11:09 ` Andrew Cooper
2018-08-29 11:14 ` Andrew Cooper
2018-08-29 11:26 ` Juergen Gross
2018-08-29 12:06 ` Jan Beulich
2018-08-30  8:23 ` Jan Beulich
2018-08-30 10:42 ` Olaf Hering
2018-08-30  8:10 ` Olaf Hering
2018-08-16 10:29 ` Jan Beulich
2018-08-16 10:56 ` Andrew Cooper
2018-08-16 11:27 ` Jan Beulich
2018-09-04 16:24 ` Andrew Cooper
2018-09-05  7:10 ` Jan Beulich
2018-08-16 12:52 ` Juergen Gross
2018-09-04 16:11 ` Juergen Gross