* [PATCH RFC] libxl: set 1GB MMIO hole for PVH
@ 2018-05-09 16:07 Roger Pau Monne
  2018-05-09 16:12 ` Juergen Gross
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Roger Pau Monne @ 2018-05-09 16:07 UTC
  To: xen-devel
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Ian Jackson, Jan Beulich,
	Boris Ostrovsky, Roger Pau Monne

This prevents page-shattering, by being able to populate the RAM
regions below 4GB using 1GB pages, provided the guest memory size is
set to a multiple of a GB.

Note that there are some special and ACPI pages in the MMIO hole that
will be populated using smaller order pages, but those shouldn't be
accessed as often as RAM regions.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
Not 4.11 material, Ccing Boris and Juergen for their opinion as Linux
maintainers.
---
 tools/libxl/libxl_dom.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index f0fd5fd3a3..1ae0e8ef33 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1230,16 +1230,21 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     else if (dom->mmio_size == 0 && !device_model) {
 #if defined(__i386__) || defined(__x86_64__)
         /*
+         * Set MMIO hole size to 1GB, so that the whole 3-4GB region is not
+         * populated. This prevents page shattering, since there are MMIO areas
+         * in that region that cannot be populated.
+         *
          * Make sure the local APIC page, the ACPI tables and the special pages
          * are inside the MMIO hole.
          */
-        xen_paddr_t start =
-            (X86_HVM_END_SPECIAL_REGION - X86_HVM_NR_SPECIAL_PAGES) <<
-            XC_PAGE_SHIFT;
-
-        start = min_t(xen_paddr_t, start, LAPIC_BASE_ADDRESS);
-        start = min_t(xen_paddr_t, start, ACPI_INFO_PHYSICAL_ADDRESS);
-        dom->mmio_size = GB(4) - start;
+        dom->mmio_size = GB(1);
+#define ASSERT_ADDR_MMIO(addr) assert((addr) >= (GB(4) - dom->mmio_size) && \
+                                      (addr) < GB(4))
+        ASSERT_ADDR_MMIO((X86_HVM_END_SPECIAL_REGION - X86_HVM_NR_SPECIAL_PAGES)
+                         << XC_PAGE_SHIFT);
+        ASSERT_ADDR_MMIO(LAPIC_BASE_ADDRESS);
+        ASSERT_ADDR_MMIO(ACPI_INFO_PHYSICAL_ADDRESS);
+#undef ASSERT_ADDR_MMIO
 #else
         assert(1);
 #endif
-- 
2.17.0
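
The assertions added in the hunk above boil down to a range check
against the hole [4GB - mmio_size, 4GB). A minimal standalone sketch of
that check follows, assuming a 1GB hole. The helper name
addr_in_mmio_hole and the local GB() macro are illustrative and are not
taken from libxl.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the GB() macro used in the patch (assumption). */
#define GB(x) ((uint64_t)(x) << 30)

/*
 * Hypothetical helper, not part of libxl: returns true if addr lies
 * inside the MMIO hole, i.e. in the range [4GB - mmio_size, 4GB).
 */
static bool addr_in_mmio_hole(uint64_t addr, uint64_t mmio_size)
{
    return addr >= GB(4) - mmio_size && addr < GB(4);
}
```

With mmio_size set to GB(1), the hole starts at 0xc0000000, so the
default local APIC base 0xfee00000 passes the check while an address in
the last RAM page below 3GB does not.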


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-09 16:07 [PATCH RFC] libxl: set 1GB MMIO hole for PVH Roger Pau Monne
@ 2018-05-09 16:12 ` Juergen Gross
  2018-05-09 16:13   ` Juergen Gross
  2018-05-10  8:33   ` Roger Pau Monné
  2018-05-10  9:43 ` George Dunlap
  2018-05-10 11:23 ` Wei Liu
  2 siblings, 2 replies; 9+ messages in thread
From: Juergen Gross @ 2018-05-09 16:12 UTC
  To: Roger Pau Monne, xen-devel
  Cc: Wei Liu, Boris Ostrovsky, Ian Jackson, Jan Beulich, Andrew Cooper

On 09/05/18 18:07, Roger Pau Monne wrote:
> This prevents page-shattering, by being able to populate the RAM
> regions below 4GB using 1GB pages, provided the guest memory size is
> set to a multiple of a GB.
> 
> Note that there are some special and ACPI pages in the MMIO hole that
> will be populated using smaller order pages, but those shouldn't be
> accessed as often as RAM regions.

Would it be possible somehow to put a potential firmware into that
1GB region, too, if it needs any memory in high memory? Seabios e.g.
is taking the last RAM page of the guest for its hypercall page, which
will again shatter GB mappings.


Juergen

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-09 16:12 ` Juergen Gross
@ 2018-05-09 16:13   ` Juergen Gross
  2018-05-10  8:33   ` Roger Pau Monné
  1 sibling, 0 replies; 9+ messages in thread
From: Juergen Gross @ 2018-05-09 16:13 UTC
  To: Roger Pau Monne, xen-devel
  Cc: Wei Liu, Boris Ostrovsky, Ian Jackson, Jan Beulich, Andrew Cooper

On 09/05/18 18:12, Juergen Gross wrote:
> On 09/05/18 18:07, Roger Pau Monne wrote:
>> This prevents page-shattering, by being able to populate the RAM
>> regions below 4GB using 1GB pages, provided the guest memory size is
>> set to a multiple of a GB.
>>
>> Note that there are some special and ACPI pages in the MMIO hole that
>> will be populated using smaller order pages, but those shouldn't be
>> accessed as often as RAM regions.
> 
> Would it be possible somehow to put a potential firmware into that
> 1GB region, too, if it needs any memory in high memory? Seabios e.g.
> is taking the last RAM page of the guest for its hypercall page, which
> will again shatter GB mappings.

Clearly out of coffee, sorry. I managed to read HVM instead of PVH.

Sorry for the noise.


Juergen

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-09 16:12 ` Juergen Gross
  2018-05-09 16:13   ` Juergen Gross
@ 2018-05-10  8:33   ` Roger Pau Monné
  2018-05-11  3:55     ` Juergen Gross
  1 sibling, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2018-05-10  8:33 UTC
  To: Juergen Gross
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

On Wed, May 09, 2018 at 06:12:28PM +0200, Juergen Gross wrote:
> On 09/05/18 18:07, Roger Pau Monne wrote:
> > This prevents page-shattering, by being able to populate the RAM
> > regions below 4GB using 1GB pages, provided the guest memory size is
> > set to a multiple of a GB.
> > 
> > Note that there are some special and ACPI pages in the MMIO hole that
> > will be populated using smaller order pages, but those shouldn't be
> > accessed as often as RAM regions.
> 
> Would it be possible somehow to put a potential firmware into that
> 1GB region, too, if it needs any memory in high memory? Seabios e.g.
> is taking the last RAM page of the guest for its hypercall page, which
> will again shatter GB mappings.

I know this comment is related to HVM guests, but I'm not sure I see
how setting the hypercall page shatters GB mappings. Setting the
hypercall page doesn't involve changing any p2m mappings, but just
filling a guest RAM page with some data.

Roger.

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-09 16:07 [PATCH RFC] libxl: set 1GB MMIO hole for PVH Roger Pau Monne
  2018-05-09 16:12 ` Juergen Gross
@ 2018-05-10  9:43 ` George Dunlap
  2018-05-10 10:01   ` Roger Pau Monné
  2018-05-10 11:23 ` Wei Liu
  2 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2018-05-10  9:43 UTC
  To: Roger Pau Monne
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Ian Jackson, Jan Beulich,
	xen-devel, Boris Ostrovsky

On Wed, May 9, 2018 at 5:07 PM, Roger Pau Monne <roger.pau@citrix.com> wrote:
> This prevents page-shattering, by being able to populate the RAM
> regions below 4GB using 1GB pages, provided the guest memory size is
> set to a multiple of a GB.
>
> Note that there are some special and ACPI pages in the MMIO hole that
> will be populated using smaller order pages, but those shouldn't be
> accessed as often as RAM regions.

Is it possible to run PVH in pure 32-bit mode (as opposed to 32-bit
PAE)?  If so, such guests would be limited to 3GiB of total memory
(instead of 4GiB).

But I suppose there's no particular reason to run PVH in pure 32-bit
mode instead of 32-bit PAE.  (I don't *think* TLB misses are slower on
3-level paging than 2-level paging, because the L3 entries are
essentially loaded on CR3 switch.)

So at the moment this seems OK to me.  If someone decides they want to
run PVH 2-level paging with more than 3GiB of RAM, we can easily add
an option to turn it on.
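
The 3GiB figure follows from simple arithmetic: a guest restricted to
32-bit physical addresses can only reach RAM placed below the start of
the hole. A rough sketch, assuming RAM that does not fit below the hole
is placed above 4GiB (the function names and the GiB() macro are made
up for this illustration):

```c
#include <stdint.h>

#define GiB(x) ((uint64_t)(x) << 30)

/*
 * RAM reachable with 32-bit physical addresses: everything that fits
 * below the start of the MMIO hole at 4GiB - mmio_size.
 */
static uint64_t ram_below_4g(uint64_t total_ram, uint64_t mmio_size)
{
    uint64_t hole_start = GiB(4) - mmio_size;

    return total_ram < hole_start ? total_ram : hole_start;
}

/* RAM that ends up above 4GiB, out of reach for such a guest. */
static uint64_t ram_above_4g(uint64_t total_ram, uint64_t mmio_size)
{
    return total_ram - ram_below_4g(total_ram, mmio_size);
}
```

For a 4GiB guest with a 1GiB hole, ram_below_4g() gives 3GiB and
ram_above_4g() gives 1GiB.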

(Haven't reviewed the code.)

 -George

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-10  9:43 ` George Dunlap
@ 2018-05-10 10:01   ` Roger Pau Monné
  2018-05-10 10:11     ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2018-05-10 10:01 UTC
  To: George Dunlap
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Ian Jackson, Jan Beulich,
	xen-devel, Boris Ostrovsky

On Thu, May 10, 2018 at 10:43:26AM +0100, George Dunlap wrote:
> On Wed, May 9, 2018 at 5:07 PM, Roger Pau Monne <roger.pau@citrix.com> wrote:
> > This prevents page-shattering, by being able to populate the RAM
> > regions below 4GB using 1GB pages, provided the guest memory size is
> > set to a multiple of a GB.
> >
> > Note that there are some special and ACPI pages in the MMIO hole that
> > will be populated using smaller order pages, but those shouldn't be
> > accessed as often as RAM regions.
> 
> Is it possible to run PVH in pure 32-bit mode (as opposed to 32-bit
> PAE)?  If so, such guests would be limited to 3GiB of total memory
> (instead of 4GiB).

Yes, that's correct. PVH guests are not limited to any mode, you could
even run them in protected or real mode.

> But I suppose there's no particular reason to run PVH in pure 32-bit
> mode instead of 32-bit PAE.  (I don't *think* TLB misses are slower on
> 3-level paging than 2-level paging, because the L3 entries are
> essentially loaded on CR3 switch.)
> 
> So at the moment this seems OK to me.  If someone decides they want to
> run PVH 2-level paging with more than 3GiB of RAM, we can easily add
> an option to turn it on.

That's my opinion. HVM guests already have an mmio_hole option; it
would be almost trivial to make that option also available to PVH if
there's a need for it.

Thanks, Roger.

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-10 10:01   ` Roger Pau Monné
@ 2018-05-10 10:11     ` Andrew Cooper
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-05-10 10:11 UTC
  To: Roger Pau Monné, George Dunlap
  Cc: Juergen Gross, Wei Liu, Ian Jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

On 10/05/18 11:01, Roger Pau Monné wrote:
> On Thu, May 10, 2018 at 10:43:26AM +0100, George Dunlap wrote:
>> On Wed, May 9, 2018 at 5:07 PM, Roger Pau Monne <roger.pau@citrix.com> wrote:
>>> This prevents page-shattering, by being able to populate the RAM
>>> regions below 4GB using 1GB pages, provided the guest memory size is
>>> set to a multiple of a GB.
>>>
>>> Note that there are some special and ACPI pages in the MMIO hole that
>>> will be populated using smaller order pages, but those shouldn't be
>>> accessed as often as RAM regions.
>> Is it possible to run PVH in pure 32-bit mode (as opposed to 32-bit
>> PAE)?  If so, such guests would be limited to 3GiB of total memory
>> (instead of 4GiB).
> Yes, that's correct. PVH guests are not limited to any mode, you could
> even run them in protected or real mode.
>
>> But I suppose there's no particular reason to run PVH in pure 32-bit
>> mode instead of 32-bit PAE.  (I don't *think* TLB misses are slower on
>> 3-level paging than 2-level paging, because the L3 entries are
>> essentially loaded on CR3 switch.)
>>
>> So at the moment this seems OK to me.  If someone decides they want to
>> run PVH 2-level paging with more than 3GiB of RAM, we can easily add
>> an option to turn it on.
> That's my opinion. HVM guests already have a mmio_hole option, it
> would be almost trivial to make that option also available to PVH if
> there's a need for it.

Let's optimise for the common case.  These days, this is 64-bit OSes.

The purpose of making the MMIO hole like this is to allow us to use 1G
host superpages for all of guest RAM, and avoid all cases which would
cause them to be shattered.  Avoiding shattering is going to require
some care, like not turning on legacy MTRRs, and ensuring that we
provide some empty gfn space for the guest to make mappings into.  There
probably needs to be nGB + a little extra on small mappings to cover
things like the ACPI tables, vram etc.
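
To make the alignment argument concrete, here is a small sketch that
counts how many naturally aligned 1GB frames fit in a RAM range and how
much is left over for smaller mappings. It is only an illustration of
the arithmetic, not Xen's p2m population logic, and the helper names
are hypothetical.

```c
#include <stdint.h>

#define GB_SHIFT 30
#define GB(x) ((uint64_t)(x) << GB_SHIFT)

/* Number of naturally aligned 1GB frames lying entirely in [start, end). */
static uint64_t full_1g_frames(uint64_t start, uint64_t end)
{
    uint64_t first = (start + GB(1) - 1) >> GB_SHIFT; /* round start up */
    uint64_t last = end >> GB_SHIFT;                  /* round end down */

    return last > first ? last - first : 0;
}

/* Bytes of [start, end) that would need 2MB or 4KB mappings instead. */
static uint64_t small_mapping_bytes(uint64_t start, uint64_t end)
{
    return (end - start) - GB(1) * full_1g_frames(start, end);
}
```

With low RAM ending at 3GB (a 1GB hole), [0, GB(3)) is covered by three
1GB frames with nothing left over. If the hole instead started at an
unaligned address such as 0xfc000000 (picked purely for illustration),
the same three frames fit, but the remaining 0x3c000000 bytes (960MB)
would have to use smaller mappings.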

~Andrew

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-09 16:07 [PATCH RFC] libxl: set 1GB MMIO hole for PVH Roger Pau Monne
  2018-05-09 16:12 ` Juergen Gross
  2018-05-10  9:43 ` George Dunlap
@ 2018-05-10 11:23 ` Wei Liu
  2 siblings, 0 replies; 9+ messages in thread
From: Wei Liu @ 2018-05-10 11:23 UTC
  To: Roger Pau Monne
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Ian Jackson, Jan Beulich,
	xen-devel, Boris Ostrovsky

On Wed, May 09, 2018 at 05:07:12PM +0100, Roger Pau Monne wrote:
> This prevents page-shattering, by being able to populate the RAM
> regions below 4GB using 1GB pages, provided the guest memory size is
> set to a multiple of a GB.
> 
> Note that there are some special and ACPI pages in the MMIO hole that
> will be populated using smaller order pages, but those shouldn't be
> accessed as often as RAM regions.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

This idea sounds fine to me.

Wei.

* Re: [PATCH RFC] libxl: set 1GB MMIO hole for PVH
  2018-05-10  8:33   ` Roger Pau Monné
@ 2018-05-11  3:55     ` Juergen Gross
  0 siblings, 0 replies; 9+ messages in thread
From: Juergen Gross @ 2018-05-11  3:55 UTC
  To: Roger Pau Monné
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

On 10/05/18 10:33, Roger Pau Monné wrote:
> On Wed, May 09, 2018 at 06:12:28PM +0200, Juergen Gross wrote:
>> On 09/05/18 18:07, Roger Pau Monne wrote:
>>> This prevents page-shattering, by being able to populate the RAM
>>> regions below 4GB using 1GB pages, provided the guest memory size is
>>> set to a multiple of a GB.
>>>
>>> Note that there are some special and ACPI pages in the MMIO hole that
>>> will be populated using smaller order pages, but those shouldn't be
>>> accessed as often as RAM regions.
>>
>> Would it be possible somehow to put a potential firmware into that
>> 1GB region, too, if it needs any memory in high memory? Seabios e.g.
>> is taking the last RAM page of the guest for its hypercall page, which
>> will again shatter GB mappings.
> 
> I know this comment is related to HVM guests, but I'm not sure I see
> how setting the hypercall page shatters GB mappings. Setting the
> hypercall page doesn't involve changing any p2m mappings, but just
> filling a guest RAM page with some data.

The problem is that any memory reserved by firmware will be added as
"Reserved" in the E820 map. This will in turn result in the OS mapping
it read-only or not at all, so it can no longer use a GB mapping, even
for the physical memory mapping. Seabios e.g. is using the last
memory page of the guest below 4GB for that purpose. Linux tends to
put memory management structures that are accessed very often (e.g.
struct page or numa node data) at the end of a memory region, so
performance is degraded. With memory management intensive workloads
I've seen performance going up about 3% in an HVM guest with a small
Xen patch adding a single additional page to the guest which was used
by Seabios for the hypercall page, resulting in the guest using a GB
mapping for the last GB of its memory.
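
The effect of a single reserved page can be sketched with a simplified
E820 model: a 1GB frame is only eligible for a 1GB mapping if it is
entirely RAM. The struct layout, the helper names and the
single-entry-coverage rule below are simplifying assumptions for
illustration, not the Linux or Seabios implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GB(x) ((uint64_t)(x) << 30)

/* E820 type codes for usable RAM and reserved memory. */
enum e820_type { E820_RAM = 1, E820_RESERVED = 2 };

/* Simplified entry; the real table uses a packed layout. */
struct e820_entry {
    uint64_t addr;
    uint64_t size;
    enum e820_type type;
};

/*
 * True if the 1GB frame starting at base is covered by one RAM entry.
 * Adjacent RAM entries are not merged in this sketch.
 */
static bool gb_frame_is_all_ram(const struct e820_entry *map, size_t n,
                                uint64_t base)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].type == E820_RAM &&
            map[i].addr <= base &&
            map[i].addr + map[i].size >= base + GB(1))
            return true;
    }
    return false;
}

/*
 * Hypothetical scenario: RAM from 0 up to ram_end (GB aligned, at
 * least 1GB), with the last reserved_bytes marked as reserved.
 */
static bool last_gb_is_all_ram(uint64_t ram_end, uint64_t reserved_bytes)
{
    const struct e820_entry map[] = {
        { 0, ram_end - reserved_bytes, E820_RAM },
        { ram_end - reserved_bytes, reserved_bytes, E820_RESERVED },
    };

    return gb_frame_is_all_ram(map, 2, ram_end - GB(1));
}
```

In this model last_gb_is_all_ram(GB(3), 4096) is false, since one
reserved 4KB page at the top is enough to disqualify the whole last GB,
while last_gb_is_all_ram(GB(3), 0) is true.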


Juergen
