All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
@ 2018-10-16 14:41 Paul Durrant
  2018-10-16 16:07 ` George Dunlap
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Paul Durrant @ 2018-10-16 14:41 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, Wei Liu, Jan Beulich

The P2M common code currently restricts the MMIO mapping order of any
domain with IOMMU mappings, but that is not using shared tables, to 4k.
This has been shown to have a huge performance cost when passing through
a PCI device with a very large BAR (e.g. NVIDIA P40), increasing the guest
boot time from ~20s to several minutes when iommu=no-sharept is specified
on the Xen command line.

The limitation was added by commit c3c756bd "x86/p2m: use large pages
for MMIO mappings" however the underlying implementations of p2m->set_entry
for Intel and AMD are coded to cope with mapping orders higher than 4k,
even though the IOMMU mapping function is itself currently limited to 4k,
so there is no real need to limit the order passed into the method.

With this patch applied the VM boot time is restored to something
reasonable. Not as fast as without iommu=no-sharept, but within a few
seconds of it.

NOTE: This patch takes the opportunity to shorten a couple of > 80
      character lines in the code.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm/p2m.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index a00a3c1bff..f1df1debc7 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2081,14 +2081,11 @@ static unsigned int mmio_order(const struct domain *d,
                                unsigned long start_fn, unsigned long nr)
 {
     /*
-     * Note that the !iommu_use_hap_pt() here has three effects:
-     * - cover iommu_{,un}map_page() not having an "order" input yet,
-     * - exclude shadow mode (which doesn't support large MMIO mappings),
-     * - exclude PV guests, should execution reach this code for such.
-     * So be careful when altering this.
+     * PV guests or shadow-mode HVM guests must be restricted to 4k
+     * mappings.
      */
-    if ( !iommu_use_hap_pt(d) ||
-         (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) || !(nr >> PAGE_ORDER_2M) )
+    if ( !hap_enabled(d) || (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) ||
+         !(nr >> PAGE_ORDER_2M) )
         return PAGE_ORDER_4K;
 
     if ( 0 /*
@@ -2096,7 +2093,8 @@ static unsigned int mmio_order(const struct domain *d,
             * set_typed_p2m_entry() when it needs to zap M2P entries
             * for a RAM range.
             */ &&
-         !(start_fn & ((1UL << PAGE_ORDER_1G) - 1)) && (nr >> PAGE_ORDER_1G) &&
+         !(start_fn & ((1UL << PAGE_ORDER_1G) - 1)) &&
+         (nr >> PAGE_ORDER_1G) &&
          hap_has_1gb )
         return PAGE_ORDER_1G;
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
  2018-10-16 14:41 [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k Paul Durrant
@ 2018-10-16 16:07 ` George Dunlap
  2018-10-16 16:26 ` Roger Pau Monné
  2018-10-17 13:23 ` Andrew Cooper
  2 siblings, 0 replies; 6+ messages in thread
From: George Dunlap @ 2018-10-16 16:07 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich

On 10/16/2018 03:41 PM, Paul Durrant wrote:
> The P2M common code currently restricts the MMIO mapping order of any
> domain with IOMMU mappings, but that is not using shared tables, to 4k.
> This has been shown to have a huge performance cost when passing through
> a PCI device with a very large BAR (e.g. NVIDIA P40), increasing the guest
> boot time from ~20s to several minutes when iommu=no-sharept is specified
> on the Xen command line.
> 
> The limitation was added by commit c3c756bd "x86/p2m: use large pages
> for MMIO mappings" however the underlying implementations of p2m->set_entry
> for Intel and AMD are coded to cope with mapping orders higher than 4k,
> even though the IOMMU mapping function is itself currently limited to 4k,
> so there is no real need to limit the order passed into the method.
> 
> With this patch applied the VM boot time is restored to something
> reasonable. Not as fast as without iommu=no-sharept, but within a few
> seconds of it.
> 
> NOTE: This patch takes the opportunity to shorten a couple of > 80
>       character lines in the code.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

This looks correct to me:

Acked-by: George Dunlap <george.dunlap@citrix.com>

...but I'd like to hear Jan's opinion (since he was the author of
c3c756bd) before it gets checked in if possible.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
  2018-10-16 14:41 [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k Paul Durrant
  2018-10-16 16:07 ` George Dunlap
@ 2018-10-16 16:26 ` Roger Pau Monné
  2018-10-17  7:50   ` Paul Durrant
  2018-10-17 13:23 ` Andrew Cooper
  2 siblings, 1 reply; 6+ messages in thread
From: Roger Pau Monné @ 2018-10-16 16:26 UTC (permalink / raw)
  To: Paul Durrant
  Cc: George Dunlap, xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Oct 16, 2018 at 03:41:55PM +0100, Paul Durrant wrote:
> The P2M common code currently restricts the MMIO mapping order of any
> domain with IOMMU mappings, but that is not using shared tables, to 4k.
> This has been shown to have a huge performance cost when passing through
> a PCI device with a very large BAR (e.g. NVIDIA P40), increasing the guest
> boot time from ~20s to several minutes when iommu=no-sharept is specified
> on the Xen command line.
> 
> The limitation was added by commit c3c756bd "x86/p2m: use large pages
> for MMIO mappings" however the underlying implementations of p2m->set_entry
> for Intel and AMD are coded to cope with mapping orders higher than 4k,
> even though the IOMMU mapping function is itself currently limited to 4k,
> so there is no real need to limit the order passed into the method.
> 
> With this patch applied the VM boot time is restored to something
> reasonable. Not as fast as without iommu=no-sharept, but within a few
> seconds of it.

I guess the worry was that the loop in for example ept_set_entry to
perform the iommu mappings would take too long and trigger the
watchdog if for example a 1GB page is mapped?

In any case we already allow to map higher order non-MMIO pages, which
should also trigger such issue?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
  2018-10-16 16:26 ` Roger Pau Monné
@ 2018-10-17  7:50   ` Paul Durrant
  0 siblings, 0 replies; 6+ messages in thread
From: Paul Durrant @ 2018-10-17  7:50 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: xen-devel, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

> -----Original Message-----
> From: Roger Pau Monne
> Sent: 16 October 2018 17:27
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; George Dunlap
> <George.Dunlap@citrix.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei
> Liu <wei.liu2@citrix.com>; Jan Beulich <jbeulich@suse.com>
> Subject: Re: [Xen-devel] [PATCH] x86/mm/p2m: don't needlessly limit MMIO
> mapping order to 4k
> 
> On Tue, Oct 16, 2018 at 03:41:55PM +0100, Paul Durrant wrote:
> > The P2M common code currently restricts the MMIO mapping order of any
> > domain with IOMMU mappings, but that is not using shared tables, to 4k.
> > This has been shown to have a huge performance cost when passing through
> > a PCI device with a very large BAR (e.g. NVIDIA P40), increasing the
> guest
> > boot time from ~20s to several minutes when iommu=no-sharept is
> specified
> > on the Xen command line.
> >
> > The limitation was added by commit c3c756bd "x86/p2m: use large pages
> > for MMIO mappings" however the underlying implementations of p2m-
> >set_entry
> > for Intel and AMD are coded to cope with mapping orders higher than 4k,
> > even though the IOMMU mapping function is itself currently limited to
> 4k,
> > so there is no real need to limit the order passed into the method.
> >
> > With this patch applied the VM boot time is restored to something
> > reasonable. Not as fast as without iommu=no-sharept, but within a few
> > seconds of it.
> 
> I guess the worry was that the loop in for example ept_set_entry to
> perform the iommu mappings would take too long and trigger the
> watchdog if for example a 1GB page is mapped?
> 
> In any case we already allow to map higher order non-MMIO pages, which
> should also trigger such issue?

Indeed. It's no different at this level. AFAICT we map an extent of whatever order is handed to XENMEM_populate_physmap with no pre-empt checks so, if there is a problem, limiting the MMIO order does not seem to be the right way to deal with it.

  Paul

> 
> Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
  2018-10-16 14:41 [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k Paul Durrant
  2018-10-16 16:07 ` George Dunlap
  2018-10-16 16:26 ` Roger Pau Monné
@ 2018-10-17 13:23 ` Andrew Cooper
  2018-10-17 14:02   ` Paul Durrant
  2 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2018-10-17 13:23 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: George Dunlap, Wei Liu, Jan Beulich

On 16/10/18 15:41, Paul Durrant wrote:
> The P2M common code currently restricts the MMIO mapping order of any
> domain with IOMMU mappings, but that is not using shared tables, to 4k.
> This has been shown to have a huge performance cost when passing through
> a PCI device with a very large BAR (e.g. NVIDIA P40), increasing the guest
> boot time from ~20s to several minutes when iommu=no-sharept is specified
> on the Xen command line.
>
> The limitation was added by commit c3c756bd "x86/p2m: use large pages
> for MMIO mappings" however the underlying implementations of p2m->set_entry
> for Intel and AMD are coded to cope with mapping orders higher than 4k,
> even though the IOMMU mapping function is itself currently limited to 4k,
> so there is no real need to limit the order passed into the method.
>
> With this patch applied the VM boot time is restored to something
> reasonable. Not as fast as without iommu=no-sharept, but within a few
> seconds of it.
>
> NOTE: This patch takes the opportunity to shorten a couple of > 80
>       character lines in the code.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

I'm afraid that it isn't needless.

The fact that the underlying IOMMU functions can only work with 4k
pages, in combination with no -ERestart support, is precisely why you
must only ever pass a 4k order here.  Otherwise we will genuinely wait
for the IOMMU to map/unmap 1G worth of 4k pages at a time, which the
security team has previously deemed is too long to wait for a single
operation.

I'm afraid that the only safe option I see here is to make the p2m
operations support properly preemptible.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
  2018-10-17 13:23 ` Andrew Cooper
@ 2018-10-17 14:02   ` Paul Durrant
  0 siblings, 0 replies; 6+ messages in thread
From: Paul Durrant @ 2018-10-17 14:02 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Wei Liu, George Dunlap, Jan Beulich

> -----Original Message-----
> From: Andrew Cooper
> Sent: 17 October 2018 14:24
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: George Dunlap <George.Dunlap@citrix.com>; Jan Beulich
> <jbeulich@suse.com>; Wei Liu <wei.liu2@citrix.com>
> Subject: Re: [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order
> to 4k
> 
> On 16/10/18 15:41, Paul Durrant wrote:
> > The P2M common code currently restricts the MMIO mapping order of any
> > domain with IOMMU mappings, but that is not using shared tables, to 4k.
> > This has been shown to have a huge performance cost when passing through
> > a PCI device with a very large BAR (e.g. NVIDIA P40), increasing the
> guest
> > boot time from ~20s to several minutes when iommu=no-sharept is
> specified
> > on the Xen command line.
> >
> > The limitation was added by commit c3c756bd "x86/p2m: use large pages
> > for MMIO mappings" however the underlying implementations of p2m-
> >set_entry
> > for Intel and AMD are coded to cope with mapping orders higher than 4k,
> > even though the IOMMU mapping function is itself currently limited to
> 4k,
> > so there is no real need to limit the order passed into the method.
> >
> > With this patch applied the VM boot time is restored to something
> > reasonable. Not as fast as without iommu=no-sharept, but within a few
> > seconds of it.
> >
> > NOTE: This patch takes the opportunity to shorten a couple of > 80
> >       character lines in the code.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> 
> I'm afraid that it isn't needless.
> 
> The fact that the underlying IOMMU functions can only work with 4k
> pages, in combination with no -ERestart support, is precisely why you
> must only ever pass a 4k order here.  Otherwise we will genuinely wait
> for the IOMMU to map/unmap 1G worth of 4k pages at a time, which the
> security team has previously deemed is too long to wait for a single
> operation.
> 

Ok, I agree that there may be an issue with 1G but not for 2M. I have tested the patch and, as I said in my comment, the total cost of the all the domctls was barely different from when page sharing was turned on.

> I'm afraid that the only safe option I see here is to make the p2m
> operations support properly preemptible.
> 

That's a lot of re-work. I agree it is the correct solution but without a patch to allow at least 2M mapping, a system without shared tables is unusable so I think such a patch is reasonable stop-gap.

  Paul

> ~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-17 14:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-16 14:41 [PATCH] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k Paul Durrant
2018-10-16 16:07 ` George Dunlap
2018-10-16 16:26 ` Roger Pau Monné
2018-10-17  7:50   ` Paul Durrant
2018-10-17 13:23 ` Andrew Cooper
2018-10-17 14:02   ` Paul Durrant

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.