All of lore.kernel.org
 help / color / mirror / Atom feed
* xen: memory initialization/balloon fixes (#2)
@ 2011-08-19 14:57 David Vrabel
  2011-08-19 14:57 ` [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM David Vrabel
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: David Vrabel @ 2011-08-19 14:57 UTC (permalink / raw)
  To: xen-devel; +Cc: Konrad Rzeszutek Wilk

This set of patches fixes some bugs in the memory initialization under
Xen and in Xen's memory balloon driver.  They can make 100s of MB of
additional RAM available (depending on the system/configuration).

Patches 1 & 2 are fixes and should be queued for 3.1 and possibly
queued for the 3.0 stable tree.

Patches 3 & 4 increase the amount of low memory in 32 bit domains
started with < 1 GiB of RAM.  Please queue for 3.2

Patch 5 releases all pages in the initial allocation with PFNs that
lie within a 1-1 mapping.  This seems correct to me as I think that
once the 1-1 mapping is set the MFN of the original page is lost so
it's no longer accessible by the kernel (and it cannot be used by
another domain as the Xen still thinks its used by the original
domain).

Changes since #1

- Reordered patches to put "xen: use maximum reservation to limit
  amount of usable RAM" first.
- Check maximum reservation for domU as well.
- New patch "xen/balloon: account for pages released during memory
  setup"
- Added explicit check of size when adding regions in the balloon
  driver.
- New patch "xen: release all pages within 1-1 p2m mappings"

David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM
  2011-08-19 14:57 xen: memory initialization/balloon fixes (#2) David Vrabel
@ 2011-08-19 14:57 ` David Vrabel
  2011-08-31 20:40   ` Konrad Rzeszutek Wilk
  2011-08-19 14:57 ` [PATCH 2/5] xen/balloon: account for pages released during memory setup David Vrabel
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: David Vrabel @ 2011-08-19 14:57 UTC (permalink / raw)
  To: xen-devel; +Cc: David Vrabel, Konrad Rzeszutek Wilk

Use the domain's maximum reservation to limit the amount of extra RAM
for the memory balloon. This reduces the size of the pages tables and
the amount of reserved low memory (which defaults to about 1/32 of the
total RAM).

On a system with 8 GiB of RAM with the domain limited to 1 GiB the
kernel reports:

Before:

Memory: 627792k/4472000k available

After:

Memory: 549740k/11132224k available

A increase of about 76 MiB (~1.5% of the unused 7 GiB).  The reserved
low memory is also reduced from 253 MiB to 32 MiB.  The total
additional usable RAM is 329 MiB.

For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit
the number of pages for dom0')[1].

[1] http://lists.xensource.com/archives/html/xen-devel/2011-08/msg00567.html

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index df118a8..c3b8d44 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -184,6 +184,19 @@ static unsigned long __init xen_set_identity(const struct e820entry *list,
 					PFN_UP(start_pci), PFN_DOWN(last));
 	return identity;
 }
+
+static unsigned long __init xen_get_max_pages(void)
+{
+	unsigned long max_pages = MAX_DOMAIN_PAGES;
+	domid_t domid = DOMID_SELF;
+	int ret;
+
+	ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);
+	if (ret > 0)
+		max_pages = ret;
+	return min(max_pages, MAX_DOMAIN_PAGES);
+}
+
 /**
  * machine_specific_memory_setup - Hook for machine specific memory setup.
  **/
@@ -292,6 +305,12 @@ char * __init xen_memory_setup(void)
 
 	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
 
+	extra_limit = xen_get_max_pages();
+	if (extra_limit >= max_pfn)
+		extra_pages = extra_limit - max_pfn;
+	else
+		extra_pages = 0;
+
 	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
 
 	/*
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/5] xen/balloon: account for pages released during memory setup
  2011-08-19 14:57 xen: memory initialization/balloon fixes (#2) David Vrabel
  2011-08-19 14:57 ` [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM David Vrabel
@ 2011-08-19 14:57 ` David Vrabel
  2011-09-06 21:31   ` Konrad Rzeszutek Wilk
  2011-08-19 14:57 ` [PATCH 3/5] xen: allow balloon driver to use more than one memory region David Vrabel
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: David Vrabel @ 2011-08-19 14:57 UTC (permalink / raw)
  To: xen-devel; +Cc: David Vrabel, Konrad Rzeszutek Wilk

In xen_memory_setup() pages that occur in gaps in the memory map are
released back to Xen.  This reduces the domain's current page count.
The Xen balloon driver does not correctly decrease its initial
current_pages count to reflect this.  If 'delta' pages are released
and the target is adjusted the resulting reservation is always 'delta'
less than the requested target.

This affects dom0 if the initial allocation of pages overlaps the PCI
memory region but won't affect most domU guests that have been setup
with pseudo-physical memory maps that don't have gaps.

Fix this by asking the hypervisor what the current reservation is when
starting the balloon driver.

If the domain's targets are managed by xapi, the domain may eventually
run out of memory and die because xapi currently gets its target
calculations wrong and whenever it is restarted it always reduces the
target by 'delta'.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/balloon.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 5dfd8f8..5814022 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -557,15 +557,20 @@ EXPORT_SYMBOL(free_xenballooned_pages);
 
 static int __init balloon_init(void)
 {
+	domid_t domid = DOMID_SELF;
 	unsigned long pfn, extra_pfn_end;
 	struct page *page;
+	int ret;
 
 	if (!xen_domain())
 		return -ENODEV;
 
 	pr_info("xen/balloon: Initialising balloon driver.\n");
 
-	balloon_stats.current_pages = xen_pv_domain() ? min(xen_start_info->nr_pages, max_pfn) : max_pfn;
+	ret = HYPERVISOR_memory_op(XENMEM_current_reservation, &domid);
+	if (ret < 0)
+		return ret;
+	balloon_stats.current_pages = ret;
 	balloon_stats.target_pages  = balloon_stats.current_pages;
 	balloon_stats.balloon_low   = 0;
 	balloon_stats.balloon_high  = 0;
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/5] xen: allow balloon driver to use more than one memory region
  2011-08-19 14:57 xen: memory initialization/balloon fixes (#2) David Vrabel
  2011-08-19 14:57 ` [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM David Vrabel
  2011-08-19 14:57 ` [PATCH 2/5] xen/balloon: account for pages released during memory setup David Vrabel
@ 2011-08-19 14:57 ` David Vrabel
  2011-09-06 21:57   ` Konrad Rzeszutek Wilk
  2011-08-19 14:57 ` [PATCH 4/5] xen: allow extra memory to be in multiple regions David Vrabel
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: David Vrabel @ 2011-08-19 14:57 UTC (permalink / raw)
  To: xen-devel; +Cc: David Vrabel, Konrad Rzeszutek Wilk

Allow the xen balloon driver to populate its list of extra pages from
more than one region of memory.  This will allow platforms to provide
(for example) a region of low memory and a region of high memory.

The maximum possible number of extra regions is 128 (== E820MAX) which
is quite large so xen_extra_mem is placed in __initdata.  This is safe
as both xen_memeory_setup() and balloon_init() are in __init.

The balloon regions themselves are not altered (i.e., there is still
only the one region).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c  |   20 +++++++++---------
 drivers/xen/balloon.c |   51 ++++++++++++++++++++++++++++--------------------
 include/xen/page.h    |    9 +++++++-
 3 files changed, 48 insertions(+), 32 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index c3b8d44..4720c2d 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -37,7 +37,7 @@ extern void xen_syscall_target(void);
 extern void xen_syscall32_target(void);
 
 /* Amount of extra memory space we add to the e820 ranges */
-phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
+struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
 
 /* 
  * The maximum amount of extra memory compared to the base size.  The
@@ -56,7 +56,7 @@ static void __init xen_add_extra_mem(unsigned long pages)
 	unsigned long pfn;
 
 	u64 size = (u64)pages * PAGE_SIZE;
-	u64 extra_start = xen_extra_mem_start + xen_extra_mem_size;
+	u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size;
 
 	if (!pages)
 		return;
@@ -66,7 +66,7 @@ static void __init xen_add_extra_mem(unsigned long pages)
 
 	memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA");
 
-	xen_extra_mem_size += size;
+	xen_extra_mem[0].size += size;
 
 	xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
 
@@ -239,7 +239,7 @@ char * __init xen_memory_setup(void)
 
 	memcpy(map_raw, map, sizeof(map));
 	e820.nr_map = 0;
-	xen_extra_mem_start = mem_end;
+	xen_extra_mem[0].start = mem_end;
 	for (i = 0; i < memmap.nr_entries; i++) {
 		unsigned long long end;
 
@@ -267,8 +267,8 @@ char * __init xen_memory_setup(void)
 				e820_add_region(end, delta, E820_UNUSABLE);
 		}
 
-		if (map[i].size > 0 && end > xen_extra_mem_start)
-			xen_extra_mem_start = end;
+		if (map[i].size > 0 && end > xen_extra_mem[0].start)
+			xen_extra_mem[0].start = end;
 
 		/* Add region if any remains */
 		if (map[i].size > 0)
@@ -276,10 +276,10 @@ char * __init xen_memory_setup(void)
 	}
 	/* Align the balloon area so that max_low_pfn does not get set
 	 * to be at the _end_ of the PCI gap at the far end (fee01000).
-	 * Note that xen_extra_mem_start gets set in the loop above to be
-	 * past the last E820 region. */
-	if (xen_initial_domain() && (xen_extra_mem_start < (1ULL<<32)))
-		xen_extra_mem_start = (1ULL<<32);
+	 * Note that the start of balloon area gets set in the loop above
+	 * to be past the last E820 region. */
+	if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32)))
+		xen_extra_mem[0].start = (1ULL<<32);
 
 	/*
 	 * In domU, the ISA region is normal, usable memory, but we
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 5814022..996cf65 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -555,11 +555,35 @@ void free_xenballooned_pages(int nr_pages, struct page** pages)
 }
 EXPORT_SYMBOL(free_xenballooned_pages);
 
-static int __init balloon_init(void)
+static void __init balloon_add_memory_region(unsigned long start_pfn, unsigned long pages)
 {
-	domid_t domid = DOMID_SELF;
 	unsigned long pfn, extra_pfn_end;
 	struct page *page;
+
+	/*
+	 * Initialise the balloon with excess memory space.  We need
+	 * to make sure we don't add memory which doesn't exist or
+	 * logically exist.  The E820 map can be trimmed to be smaller
+	 * than the amount of physical memory due to the mem= command
+	 * line parameter.  And if this is a 32-bit non-HIGHMEM kernel
+	 * on a system with memory which requires highmem to access,
+	 * don't try to use it.
+	 */
+	extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
+			    start_pfn + pages);
+	for (pfn = start_pfn; pfn < extra_pfn_end; pfn++) {
+		page = pfn_to_page(pfn);
+		/* totalram_pages and totalhigh_pages do not
+		   include the boot-time balloon extension, so
+		   don't subtract from it. */
+		__balloon_append(page);
+	}
+}
+
+static int __init balloon_init(void)
+{
+	domid_t domid = DOMID_SELF;
+	int i;
 	int ret;
 
 	if (!xen_domain())
@@ -588,25 +612,10 @@ static int __init balloon_init(void)
 	register_memory_notifier(&xen_memory_nb);
 #endif
 
-	/*
-	 * Initialise the balloon with excess memory space.  We need
-	 * to make sure we don't add memory which doesn't exist or
-	 * logically exist.  The E820 map can be trimmed to be smaller
-	 * than the amount of physical memory due to the mem= command
-	 * line parameter.  And if this is a 32-bit non-HIGHMEM kernel
-	 * on a system with memory which requires highmem to access,
-	 * don't try to use it.
-	 */
-	extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
-			    (unsigned long)PFN_DOWN(xen_extra_mem_start + xen_extra_mem_size));
-	for (pfn = PFN_UP(xen_extra_mem_start);
-	     pfn < extra_pfn_end;
-	     pfn++) {
-		page = pfn_to_page(pfn);
-		/* totalram_pages and totalhigh_pages do not include the boot-time
-		   balloon extension, so don't subtract from it. */
-		__balloon_append(page);
-	}
+	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++)
+		if (xen_extra_mem[i].size)
+			balloon_add_memory_region(PFN_UP(xen_extra_mem[i].start),
+						  PFN_DOWN(xen_extra_mem[i].size));
 
 	return 0;
 }
diff --git a/include/xen/page.h b/include/xen/page.h
index 0be36b9..b432c03 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -3,6 +3,13 @@
 
 #include <asm/xen/page.h>
 
-extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
+struct xen_memory_region {
+	phys_addr_t start;
+	phys_addr_t size;
+};
+
+#define XEN_EXTRA_MEM_MAX_REGIONS 128 /* == E820MAX */
+
+extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
 
 #endif	/* _XEN_PAGE_H */
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/5] xen: allow extra memory to be in multiple regions
  2011-08-19 14:57 xen: memory initialization/balloon fixes (#2) David Vrabel
                   ` (2 preceding siblings ...)
  2011-08-19 14:57 ` [PATCH 3/5] xen: allow balloon driver to use more than one memory region David Vrabel
@ 2011-08-19 14:57 ` David Vrabel
  2011-09-07 12:28   ` Konrad Rzeszutek Wilk
  2011-08-19 14:57 ` [PATCH 5/5] xen: release all pages within 1-1 p2m mappings David Vrabel
  2011-08-22 14:49 ` xen: memory initialization/balloon fixes (#2) Konrad Rzeszutek Wilk
  5 siblings, 1 reply; 20+ messages in thread
From: David Vrabel @ 2011-08-19 14:57 UTC (permalink / raw)
  To: xen-devel; +Cc: David Vrabel, Konrad Rzeszutek Wilk

Allow the extra memory (used by the balloon driver) to be in multiple
regions regions (typically two regions, one for low memory and one for
high memory).  This allows the balloon driver to increase the number
of available low pages (if the initial number if pages is small).

As a side effect, the algorithm for building the e820 memory map is
simpler and more obviously correct as the map supplied by the
hypervisor is (almost) used as is (in particular, all reserved regions
and gaps are preserved).  Only RAM regions are altered and RAM regions
above max_pfn + extra_pages are marked as unused (the region is split
in two if necessary).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/setup.c |  155 ++++++++++++++++++++++----------------------------
 1 files changed, 67 insertions(+), 88 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 4720c2d..93e4542 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -51,23 +51,29 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
  */
 #define EXTRA_MEM_RATIO		(10)
 
-static void __init xen_add_extra_mem(unsigned long pages)
+static void __init xen_add_extra_mem(u64 extra_start, u64 size)
 {
 	unsigned long pfn;
+	int i;
 
-	u64 size = (u64)pages * PAGE_SIZE;
-	u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size;
-
-	if (!pages)
-		return;
-
-	e820_add_region(extra_start, size, E820_RAM);
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++) {
+		/* Add new region. */
+		if (xen_extra_mem[i].size == 0) {
+			xen_extra_mem[i].start = extra_start;
+			xen_extra_mem[i].size  = size;
+			break;
+		}
+		/* Append to existing region. */
+		if (xen_extra_mem[i].start + xen_extra_mem[i].size == extra_start) {
+			xen_extra_mem[i].size += size;
+			break;
+		}
+	}
+	if (i == XEN_EXTRA_MEM_MAX_REGIONS)
+		printk(KERN_WARNING "Warning: not enough extra memory regions\n");
 
 	memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA");
 
-	xen_extra_mem[0].size += size;
-
 	xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
 
 	for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++)
@@ -118,7 +124,8 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
 }
 
 static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
-						     const struct e820map *e820)
+						     const struct e820entry *map,
+						     int nr_map)
 {
 	phys_addr_t max_addr = PFN_PHYS(max_pfn);
 	phys_addr_t last_end = ISA_END_ADDRESS;
@@ -126,13 +133,13 @@ static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
 	int i;
 
 	/* Free any unused memory above the low 1Mbyte. */
-	for (i = 0; i < e820->nr_map && last_end < max_addr; i++) {
-		phys_addr_t end = e820->map[i].addr;
+	for (i = 0; i < nr_map && last_end < max_addr; i++) {
+		phys_addr_t end = map[i].addr;
 		end = min(max_addr, end);
 
 		if (last_end < end)
 			released += xen_release_chunk(last_end, end);
-		last_end = max(last_end, e820->map[i].addr + e820->map[i].size);
+		last_end = max(last_end, map[i].addr + map[i].size);
 	}
 
 	if (last_end < max_addr)
@@ -203,14 +210,13 @@ static unsigned long __init xen_get_max_pages(void)
 char * __init xen_memory_setup(void)
 {
 	static struct e820entry map[E820MAX] __initdata;
-	static struct e820entry map_raw[E820MAX] __initdata;
 
 	unsigned long max_pfn = xen_start_info->nr_pages;
 	unsigned long long mem_end;
 	int rc;
 	struct xen_memory_map memmap;
+	unsigned long max_pages;
 	unsigned long extra_pages = 0;
-	unsigned long extra_limit;
 	unsigned long identity_pages = 0;
 	int i;
 	int op;
@@ -237,49 +243,52 @@ char * __init xen_memory_setup(void)
 	}
 	BUG_ON(rc);
 
-	memcpy(map_raw, map, sizeof(map));
-	e820.nr_map = 0;
-	xen_extra_mem[0].start = mem_end;
-	for (i = 0; i < memmap.nr_entries; i++) {
-		unsigned long long end;
-
-		/* Guard against non-page aligned E820 entries. */
-		if (map[i].type == E820_RAM)
-			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
-
-		end = map[i].addr + map[i].size;
-		if (map[i].type == E820_RAM && end > mem_end) {
-			/* RAM off the end - may be partially included */
-			u64 delta = min(map[i].size, end - mem_end);
-
-			map[i].size -= delta;
-			end -= delta;
-
-			extra_pages += PFN_DOWN(delta);
-			/*
-			 * Set RAM below 4GB that is not for us to be unusable.
-			 * This prevents "System RAM" address space from being
-			 * used as potential resource for I/O address (happens
-			 * when 'allocate_resource' is called).
-			 */
-			if (delta &&
-				(xen_initial_domain() && end < 0x100000000ULL))
-				e820_add_region(end, delta, E820_UNUSABLE);
+	/* Make sure the Xen-supplied memory map is well-ordered. */
+	sanitize_e820_map(map, memmap.nr_entries, &memmap.nr_entries);
+
+	max_pages = xen_get_max_pages();
+	if (max_pages > max_pfn)
+		extra_pages += max_pages - max_pfn;
+
+	extra_pages += xen_return_unused_memory(max_pfn, map, memmap.nr_entries);
+
+	/*
+	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
+	 * factor the base size.  On non-highmem systems, the base
+	 * size is the full initial memory allocation; on highmem it
+	 * is limited to the max size of lowmem, so that it doesn't
+	 * get completely filled.
+	 *
+	 * In principle there could be a problem in lowmem systems if
+	 * the initial memory is also very large with respect to
+	 * lowmem, but we won't try to deal with that here.
+	 */
+	extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), extra_pages);
+
+	i = 0;
+	while (i < memmap.nr_entries) {
+		u64 addr = map[i].addr;
+		u64 size = map[i].size;
+		u32 type = map[i].type;
+
+		if (type == E820_RAM) {
+			if (addr < mem_end) {
+				size = min(size, mem_end - addr);
+			} else if (extra_pages) {
+				size = min(size, (u64)extra_pages * PAGE_SIZE);
+				extra_pages -= size / PAGE_SIZE;
+				xen_add_extra_mem(addr, size);
+			} else
+				type = E820_UNUSABLE;
 		}
 
-		if (map[i].size > 0 && end > xen_extra_mem[0].start)
-			xen_extra_mem[0].start = end;
+		e820_add_region(addr, size, type);
 
-		/* Add region if any remains */
-		if (map[i].size > 0)
-			e820_add_region(map[i].addr, map[i].size, map[i].type);
+		map[i].addr += size;
+		map[i].size -= size;
+		if (map[i].size == 0)
+			i++;
 	}
-	/* Align the balloon area so that max_low_pfn does not get set
-	 * to be at the _end_ of the PCI gap at the far end (fee01000).
-	 * Note that the start of balloon area gets set in the loop above
-	 * to be past the last E820 region. */
-	if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32)))
-		xen_extra_mem[0].start = (1ULL<<32);
 
 	/*
 	 * In domU, the ISA region is normal, usable memory, but we
@@ -305,41 +314,11 @@ char * __init xen_memory_setup(void)
 
 	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
 
-	extra_limit = xen_get_max_pages();
-	if (extra_limit >= max_pfn)
-		extra_pages = extra_limit - max_pfn;
-	else
-		extra_pages = 0;
-
-	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
-
-	/*
-	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
-	 * factor the base size.  On non-highmem systems, the base
-	 * size is the full initial memory allocation; on highmem it
-	 * is limited to the max size of lowmem, so that it doesn't
-	 * get completely filled.
-	 *
-	 * In principle there could be a problem in lowmem systems if
-	 * the initial memory is also very large with respect to
-	 * lowmem, but we won't try to deal with that here.
-	 */
-	extra_limit = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
-			  max_pfn + extra_pages);
-
-	if (extra_limit >= max_pfn)
-		extra_pages = extra_limit - max_pfn;
-	else
-		extra_pages = 0;
-
-	xen_add_extra_mem(extra_pages);
-
 	/*
 	 * Set P2M for all non-RAM pages and E820 gaps to be identity
-	 * type PFNs. We supply it with the non-sanitized version
-	 * of the E820.
+	 * type PFNs.
 	 */
-	identity_pages = xen_set_identity(map_raw, memmap.nr_entries);
+	identity_pages = xen_set_identity(e820.map, e820.nr_map);
 	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages);
 	return "Xen";
 }
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/5] xen: release all pages within 1-1 p2m mappings
  2011-08-19 14:57 xen: memory initialization/balloon fixes (#2) David Vrabel
                   ` (3 preceding siblings ...)
  2011-08-19 14:57 ` [PATCH 4/5] xen: allow extra memory to be in multiple regions David Vrabel
@ 2011-08-19 14:57 ` David Vrabel
  2011-08-19 15:05   ` David Vrabel
  2011-09-06 21:20   ` Konrad Rzeszutek Wilk
  2011-08-22 14:49 ` xen: memory initialization/balloon fixes (#2) Konrad Rzeszutek Wilk
  5 siblings, 2 replies; 20+ messages in thread
From: David Vrabel @ 2011-08-19 14:57 UTC (permalink / raw)
  To: xen-devel; +Cc: David Vrabel, David Vrabel, Konrad Rzeszutek Wilk

In xen_memory_setup() all reserved regions and gaps are set to an
identity (1-1) p2m mapping.  If an available page has a PFN within one
of these 1-1 mappings it will become accessible (as it MFN is lost) so
release them before setting up the mapping.

This can make an additional 256 MiB or more of RAM available
(depending on the size of the reserved regions in the memory map).

Signed-off-by: David Vrabel <david.vrabel@csr.com>
---
 arch/x86/xen/setup.c |   88 ++++++++++++--------------------------------------
 1 files changed, 21 insertions(+), 67 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 93e4542..0f1cd69 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -123,73 +123,33 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
 	return len;
 }
 
-static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
-						     const struct e820entry *map,
-						     int nr_map)
+static unsigned long __init xen_set_identity_and_release(const struct e820entry *list,
+							 ssize_t map_size,
+							 unsigned long nr_pages)
 {
-	phys_addr_t max_addr = PFN_PHYS(max_pfn);
-	phys_addr_t last_end = ISA_END_ADDRESS;
+	phys_addr_t avail_end = PFN_PHYS(nr_pages);
+	phys_addr_t last_end = 0;
 	unsigned long released = 0;
-	int i;
-
-	/* Free any unused memory above the low 1Mbyte. */
-	for (i = 0; i < nr_map && last_end < max_addr; i++) {
-		phys_addr_t end = map[i].addr;
-		end = min(max_addr, end);
-
-		if (last_end < end)
-			released += xen_release_chunk(last_end, end);
-		last_end = max(last_end, map[i].addr + map[i].size);
-	}
-
-	if (last_end < max_addr)
-		released += xen_release_chunk(last_end, max_addr);
-
-	printk(KERN_INFO "released %lu pages of unused memory\n", released);
-	return released;
-}
-
-static unsigned long __init xen_set_identity(const struct e820entry *list,
-					     ssize_t map_size)
-{
-	phys_addr_t last = xen_initial_domain() ? 0 : ISA_END_ADDRESS;
-	phys_addr_t start_pci = last;
 	const struct e820entry *entry;
-	unsigned long identity = 0;
 	int i;
 
 	for (i = 0, entry = list; i < map_size; i++, entry++) {
-		phys_addr_t start = entry->addr;
-		phys_addr_t end = start + entry->size;
-
-		if (start < last)
-			start = last;
+		phys_addr_t begin = last_end;
+		phys_addr_t end = entry->addr + entry->size;
 
-		if (end <= start)
-			continue;
+		last_end = end;
 
-		/* Skip over the 1MB region. */
-		if (last > end)
-			continue;
+		if (entry->type == E820_RAM || entry->type == E820_UNUSABLE)
+			end = entry->addr;
 
-		if ((entry->type == E820_RAM) || (entry->type == E820_UNUSABLE)) {
-			if (start > start_pci)
-				identity += set_phys_range_identity(
-						PFN_UP(start_pci), PFN_DOWN(start));
+		if (begin < end) {
+			if (begin < avail_end)
+				released += xen_release_chunk(begin, min(end, avail_end));
 
-			/* Without saving 'last' we would gooble RAM too
-			 * at the end of the loop. */
-			last = end;
-			start_pci = end;
-			continue;
+			set_phys_range_identity(PFN_UP(begin), PFN_DOWN(end));
 		}
-		start_pci = min(start, start_pci);
-		last = end;
 	}
-	if (last > start_pci)
-		identity += set_phys_range_identity(
-					PFN_UP(start_pci), PFN_DOWN(last));
-	return identity;
+	return released;
 }
 
 static unsigned long __init xen_get_max_pages(void)
@@ -217,7 +177,6 @@ char * __init xen_memory_setup(void)
 	struct xen_memory_map memmap;
 	unsigned long max_pages;
 	unsigned long extra_pages = 0;
-	unsigned long identity_pages = 0;
 	int i;
 	int op;
 
@@ -250,7 +209,12 @@ char * __init xen_memory_setup(void)
 	if (max_pages > max_pfn)
 		extra_pages += max_pages - max_pfn;
 
-	extra_pages += xen_return_unused_memory(max_pfn, map, memmap.nr_entries);
+	/*
+	 * Set P2M for all non-RAM pages and E820 gaps to be identity
+	 * type PFNs.  Any RAM pages that would be made inaccesible by
+	 * this are first released.
+	 */
+	extra_pages += xen_set_identity_and_release(map, memmap.nr_entries, max_pfn);
 
 	/*
 	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
@@ -294,10 +258,6 @@ char * __init xen_memory_setup(void)
 	 * In domU, the ISA region is normal, usable memory, but we
 	 * reserve ISA memory anyway because too many things poke
 	 * about in there.
-	 *
-	 * In Dom0, the host E820 information can leave gaps in the
-	 * ISA range, which would cause us to release those pages.  To
-	 * avoid this, we unconditionally reserve them here.
 	 */
 	e820_add_region(ISA_START_ADDRESS, ISA_END_ADDRESS - ISA_START_ADDRESS,
 			E820_RESERVED);
@@ -314,12 +274,6 @@ char * __init xen_memory_setup(void)
 
 	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
 
-	/*
-	 * Set P2M for all non-RAM pages and E820 gaps to be identity
-	 * type PFNs.
-	 */
-	identity_pages = xen_set_identity(e820.map, e820.nr_map);
-	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages);
 	return "Xen";
 }
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/5] xen: release all pages within 1-1 p2m mappings
  2011-08-19 14:57 ` [PATCH 5/5] xen: release all pages within 1-1 p2m mappings David Vrabel
@ 2011-08-19 15:05   ` David Vrabel
  2011-09-06 21:20   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 20+ messages in thread
From: David Vrabel @ 2011-08-19 15:05 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, Rzeszutek Wilk, Konrad

On 19/08/11 15:57, David Vrabel wrote:

> 
> Signed-off-by: David Vrabel <david.vrabel@csr.com>

Oops.  Wrong email address.  It should have been citrix.com. Let me know
if you want me to resend this patch with the correct signed-off-by.

Sorry.

David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: xen: memory initialization/balloon fixes (#2)
  2011-08-19 14:57 xen: memory initialization/balloon fixes (#2) David Vrabel
                   ` (4 preceding siblings ...)
  2011-08-19 14:57 ` [PATCH 5/5] xen: release all pages within 1-1 p2m mappings David Vrabel
@ 2011-08-22 14:49 ` Konrad Rzeszutek Wilk
  5 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-08-22 14:49 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

On Fri, Aug 19, 2011 at 03:57:15PM +0100, David Vrabel wrote:
> This set of patches fixes some bugs in the memory initialization under
> Xen and in Xen's memory balloon driver.  They can make 100s of MB of
> additional RAM available (depending on the system/configuration).
> 
Ah, please ignore my previous ping email. Seems that my mail server
was a bit stuffed.

> Patches 1 & 2 are fixes and should be queued for 3.1 and possibly
> queued for the 3.0 stable tree.

Excellent.
> 
> Patches 3 & 4 increase the amount of low memory in 32 bit domains
> started with < 1 GiB of RAM.  Please queue for 3.2

K, let me look at them this week.
> 
> Patch 5 releases all pages in the initial allocation with PFNs that
> lie within a 1-1 mapping.  This seems correct to me as I think that
> once the 1-1 mapping is set the MFN of the original page is lost so
> it's no longer accessible by the kernel (and it cannot be used by
> another domain as the Xen still thinks its used by the original
> domain).
> 
> Changes since #1
> 
> - Reordered patches to put "xen: use maximum reservation to limit
>   amount of usable RAM" first.
> - Check maximum reservation for domU as well.
> - New patch "xen/balloon: account for pages released during memory
>   setup"
> - Added explicit check of size when adding regions in the balloon
>   driver.
> - New patch "xen: release all pages within 1-1 p2m mappings"
> 
> David
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM
  2011-08-19 14:57 ` [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM David Vrabel
@ 2011-08-31 20:40   ` Konrad Rzeszutek Wilk
  2011-09-01 12:12     ` David Vrabel
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-08-31 20:40 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/setup.c |   19 +++++++++++++++++++
>  1 files changed, 19 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index df118a8..c3b8d44 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -184,6 +184,19 @@ static unsigned long __init xen_set_identity(const struct e820entry *list,
>  					PFN_UP(start_pci), PFN_DOWN(last));
>  	return identity;
>  }
> +
> +static unsigned long __init xen_get_max_pages(void)
> +{
> +	unsigned long max_pages = MAX_DOMAIN_PAGES;
> +	domid_t domid = DOMID_SELF;
> +	int ret;
> +
> +	ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);
> +	if (ret > 0)
> +		max_pages = ret;
> +	return min(max_pages, MAX_DOMAIN_PAGES);
> +}
> +
>  /**
>   * machine_specific_memory_setup - Hook for machine specific memory setup.
>   **/
> @@ -292,6 +305,12 @@ char * __init xen_memory_setup(void)
>  
>  	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
>  
> +	extra_limit = xen_get_max_pages();
> +	if (extra_limit >= max_pfn)
> +		extra_pages = extra_limit - max_pfn;
> +	else
> +		extra_pages = 0;
> +
>  	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);

I ran this on three setups:

1) PV (domU)
2) PV+PCI (dom0)
3) PV+PCI+e820_hole=1 (domU)

and then the same without this patch.

Both the 2) and 3) worked correctly - the E820 had the same non-RAM regions and
gaps - and the last RAM E820 entry was properly truncated. However, when it
came to pure PV it was truncated more than it should:

domU:								domU:
0000000000000000 - 00000000000a0000 (usable)			0000000000000000 - 00000000000a0000 (usable)
00000000000a0000 - 0000000000100000 (reserved)			00000000000a0000 - 0000000000100000 (reserved)
0000000000100000 - 0000000040800000 (usable)		      |	0000000000100000 - 0000000040100000 (usable)

(left has the old PV - without your patch). Which makes me think that there is something
amiss in the toolstack? I used 'xl' (latest xen-unstable from today).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM
  2011-08-31 20:40   ` Konrad Rzeszutek Wilk
@ 2011-09-01 12:12     ` David Vrabel
  2011-09-01 13:14       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: David Vrabel @ 2011-09-01 12:12 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On 31/08/11 21:40, Konrad Rzeszutek Wilk wrote:
>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
>> ---
>>  arch/x86/xen/setup.c |   19 +++++++++++++++++++
>>  1 files changed, 19 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
>> index df118a8..c3b8d44 100644
>> --- a/arch/x86/xen/setup.c
>> +++ b/arch/x86/xen/setup.c
>> @@ -184,6 +184,19 @@ static unsigned long __init xen_set_identity(const struct e820entry *list,
>>  					PFN_UP(start_pci), PFN_DOWN(last));
>>  	return identity;
>>  }
>> +
>> +static unsigned long __init xen_get_max_pages(void)
>> +{
>> +	unsigned long max_pages = MAX_DOMAIN_PAGES;
>> +	domid_t domid = DOMID_SELF;
>> +	int ret;
>> +
>> +	ret = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);
>> +	if (ret > 0)
>> +		max_pages = ret;
>> +	return min(max_pages, MAX_DOMAIN_PAGES);
>> +}
>> +
>>  /**
>>   * machine_specific_memory_setup - Hook for machine specific memory setup.
>>   **/
>> @@ -292,6 +305,12 @@ char * __init xen_memory_setup(void)
>>  
>>  	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
>>  
>> +	extra_limit = xen_get_max_pages();
>> +	if (extra_limit >= max_pfn)
>> +		extra_pages = extra_limit - max_pfn;
>> +	else
>> +		extra_pages = 0;
>> +
>>  	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
> 
> I ran this on three setups:
> 
> 1) PV (domU)
> 2) PV+PCI (dom0)
> 3) PV+PCI+e820_hole=1 (domU)
> 
> and then the same without this patch.
> 
> Both the 2) and 3) worked correctly - the E820 had the same non-RAM regions and
> gaps - and the last RAM E820 entry was properly truncated. However, when it
> came to pure PV it was truncated more than it should:
> 
> domU:								domU:
> 0000000000000000 - 00000000000a0000 (usable)		0000000000000000 - 00000000000a0000 (usable)
> 00000000000a0000 - 0000000000100000 (reserved)	00000000000a0000 - 0000000000100000 (reserved)
> 0000000000100000 - 0000000040800000 (usable)	      |	0000000000100000 - 0000000040100000 (usable)
> 
> (left has the old PV - without your patch). Which makes me think that there is something
> amiss in the toolstack? I used 'xl' (latest xen-unstable from today).

What were you expecting? It looks like xl is either: specifying a memory
map that is larger than it should be or b) setting the maximum
reservation as too low.  And if you asked for 1 GiB neither looks right.

David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM
  2011-09-01 12:12     ` David Vrabel
@ 2011-09-01 13:14       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-01 13:14 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

> > I ran this on three setups:
> > 
> > 1) PV (domU)
> > 2) PV+PCI (dom0)
> > 3) PV+PCI+e820_hole=1 (domU)
> > 
> > and then the same without this patch.
> > 
> > Both the 2) and 3) worked correctly - the E820 had the same non-RAM regions and
> > gaps - and the last RAM E820 entry was properly truncated. However, when it
> > came to pure PV it was truncated more than it should:
> > 
> > domU:								domU:
> > 0000000000000000 - 00000000000a0000 (usable)		0000000000000000 - 00000000000a0000 (usable)
> > 00000000000a0000 - 0000000000100000 (reserved)	00000000000a0000 - 0000000000100000 (reserved)
> > 0000000000100000 - 0000000040800000 (usable)	      |	0000000000100000 - 0000000040100000 (usable)
> > 
> > (left has the old PV - without your patch). Which makes me think that there is something
> > amiss in the toolstack? I used 'xl' (latest xen-unstable from today).
> 
> What were you expecting? It looks like xl is either: specifying a memory
> map that is larger than it should be or b) setting the maximum
> reservation as too low.  And if you asked for 1 GiB neither looks right.

'xm' is even worst. It ends up truncating it to 40000000 exactly.

Anyhow, I chatted with Ian about it and also the thread
"difference between xen hypervisor and common kernel on handling BIOS's e820 map"
nails the coffin to this - there is no need anymore for that 8MB of extra space.

So - off to look at your next set of patches :-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/5] xen: release all pages within 1-1 p2m mappings
  2011-08-19 14:57 ` [PATCH 5/5] xen: release all pages within 1-1 p2m mappings David Vrabel
  2011-08-19 15:05   ` David Vrabel
@ 2011-09-06 21:20   ` Konrad Rzeszutek Wilk
  2011-09-07 11:03     ` David Vrabel
  1 sibling, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-06 21:20 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, David Vrabel

On Fri, Aug 19, 2011 at 03:57:20PM +0100, David Vrabel wrote:
> In xen_memory_setup() all reserved regions and gaps are set to an
> identity (1-1) p2m mapping.  If an available page has a PFN within one
> of these 1-1 mappings it will become accessible (as it MFN is lost) so
> release them before setting up the mapping.
> 
> This can make an additional 256 MiB or more of RAM available
> (depending on the size of the reserved regions in the memory map).

.. if the xen_start_info->nr_pages overlaps the reserved region.

> 
> Signed-off-by: David Vrabel <david.vrabel@csr.com>
> ---
>  arch/x86/xen/setup.c |   88 ++++++++++++--------------------------------------
>  1 files changed, 21 insertions(+), 67 deletions(-)
> 
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 93e4542..0f1cd69 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -123,73 +123,33 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
>  	return len;
>  }
>  
> -static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
> -						     const struct e820entry *map,
> -						     int nr_map)
> +static unsigned long __init xen_set_identity_and_release(const struct e820entry *list,
> +							 ssize_t map_size,
> +							 unsigned long nr_pages)
>  {
> -	phys_addr_t max_addr = PFN_PHYS(max_pfn);
> -	phys_addr_t last_end = ISA_END_ADDRESS;
> +	phys_addr_t avail_end = PFN_PHYS(nr_pages);
> +	phys_addr_t last_end = 0;
>  	unsigned long released = 0;
> -	int i;
> -
> -	/* Free any unused memory above the low 1Mbyte. */
> -	for (i = 0; i < nr_map && last_end < max_addr; i++) {
> -		phys_addr_t end = map[i].addr;
> -		end = min(max_addr, end);
> -
> -		if (last_end < end)
> -			released += xen_release_chunk(last_end, end);
> -		last_end = max(last_end, map[i].addr + map[i].size);
> -	}
> -
> -	if (last_end < max_addr)
> -		released += xen_release_chunk(last_end, max_addr);
> -
> -	printk(KERN_INFO "released %lu pages of unused memory\n", released);
> -	return released;
> -}
> -
> -static unsigned long __init xen_set_identity(const struct e820entry *list,
> -					     ssize_t map_size)
> -{
> -	phys_addr_t last = xen_initial_domain() ? 0 : ISA_END_ADDRESS;
> -	phys_addr_t start_pci = last;
>  	const struct e820entry *entry;
> -	unsigned long identity = 0;
>  	int i;
>  
>  	for (i = 0, entry = list; i < map_size; i++, entry++) {
> -		phys_addr_t start = entry->addr;
> -		phys_addr_t end = start + entry->size;
> -
> -		if (start < last)
> -			start = last;
> +		phys_addr_t begin = last_end;

The "begin" is a bit confusing. You are using the previous E820 entry's end - not
the beginning of this E820 entry. Doing a s/begin/last_end/ makes
the code a bit easier to understand.

> +		phys_addr_t end = entry->addr + entry->size;
>  
> -		if (end <= start)
> -			continue;
> +		last_end = end;

Please include the comment:
/* This entry end. */

>  
> -		/* Skip over the 1MB region. */
> -		if (last > end)
> -			continue;
> +		if (entry->type == E820_RAM || entry->type == E820_UNUSABLE)
> +			end = entry->addr;

And:
/* Should encapsulate the gap between prev_end and this E820
entry's starting address. */
>  
> -		if ((entry->type == E820_RAM) || (entry->type == E820_UNUSABLE)) {
> -			if (start > start_pci)
> -				identity += set_phys_range_identity(
> -						PFN_UP(start_pci), PFN_DOWN(start));
> +		if (begin < end) {
> +			if (begin < avail_end)
> +				released += xen_release_chunk(begin, min(end, avail_end));
>  
> -			/* Without saving 'last' we would gooble RAM too
> -			 * at the end of the loop. */
> -			last = end;
> -			start_pci = end;
> -			continue;
> +			set_phys_range_identity(PFN_UP(begin), PFN_DOWN(end));

identity += set_phys_range ..
>  		}
> -		start_pci = min(start, start_pci);
> -		last = end;
>  	}
> -	if (last > start_pci)
> -		identity += set_phys_range_identity(
> -					PFN_UP(start_pci), PFN_DOWN(last));
> -	return identity;

OK, but you have ripped out the nice printk's that existed before. So add them
back in:


	printk(KERN_INFO "released %lu pages of unused memory\n", released);
	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity);

as they are quite useful in the field.

> +	return released;
>  }
>  
>  static unsigned long __init xen_get_max_pages(void)
> @@ -217,7 +177,6 @@ char * __init xen_memory_setup(void)
>  	struct xen_memory_map memmap;
>  	unsigned long max_pages;
>  	unsigned long extra_pages = 0;
> -	unsigned long identity_pages = 0;
>  	int i;
>  	int op;
>  
> @@ -250,7 +209,12 @@ char * __init xen_memory_setup(void)
>  	if (max_pages > max_pfn)
>  		extra_pages += max_pages - max_pfn;
>  
> -	extra_pages += xen_return_unused_memory(max_pfn, map, memmap.nr_entries);
> +	/*
> +	 * Set P2M for all non-RAM pages and E820 gaps to be identity
> +	 * type PFNs.  Any RAM pages that would be made inaccesible by
> +	 * this are first released.
> +	 */
> +	extra_pages += xen_set_identity_and_release(map, memmap.nr_entries, max_pfn);
>  
>  	/*
>  	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
> @@ -294,10 +258,6 @@ char * __init xen_memory_setup(void)
>  	 * In domU, the ISA region is normal, usable memory, but we
>  	 * reserve ISA memory anyway because too many things poke
>  	 * about in there.
> -	 *
> -	 * In Dom0, the host E820 information can leave gaps in the
> -	 * ISA range, which would cause us to release those pages.  To
> -	 * avoid this, we unconditionally reserve them here.
>  	 */
>  	e820_add_region(ISA_START_ADDRESS, ISA_END_ADDRESS - ISA_START_ADDRESS,
>  			E820_RESERVED);
> @@ -314,12 +274,6 @@ char * __init xen_memory_setup(void)
>  
>  	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
>  
> -	/*
> -	 * Set P2M for all non-RAM pages and E820 gaps to be identity
> -	 * type PFNs.
> -	 */
> -	identity_pages = xen_set_identity(e820.map, e820.nr_map);
> -	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages);
>  	return "Xen";
>  }
>  
> -- 
> 1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/5] xen/balloon: account for pages released during memory setup
  2011-08-19 14:57 ` [PATCH 2/5] xen/balloon: account for pages released during memory setup David Vrabel
@ 2011-09-06 21:31   ` Konrad Rzeszutek Wilk
  2011-09-08 15:01     ` David Vrabel
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-06 21:31 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

On Fri, Aug 19, 2011 at 03:57:17PM +0100, David Vrabel wrote:
> In xen_memory_setup() pages that occur in gaps in the memory map are
> released back to Xen.  This reduces the domain's current page count.

You might want to add: "in the hypervisor."

> The Xen balloon driver does not correctly decrease its initial
> current_pages count to reflect this.  If 'delta' pages are released
> and the target is adjusted the resulting reservation is always 'delta'
> less than the requested target.

Might want to add:

Wherein delta is reported as (for example):
[    0.000000] released 261886 pages of unused memory

> 
> This affects dom0 if the initial allocation of pages overlaps the PCI
> memory region but won't affect most domU guests that have been setup
> with pseudo-physical memory maps that don't have gaps.
> 
> Fix this by asking the hypervisor what the current reservation is when
> starting the balloon driver.
> 
> If the domain's targets are managed by xapi, the domain may eventually
> run out of memory and die because xapi currently gets its target
> calculations wrong and whenever it is restarted it always reduces the
> target by 'delta'.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  drivers/xen/balloon.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 5dfd8f8..5814022 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -557,15 +557,20 @@ EXPORT_SYMBOL(free_xenballooned_pages);
>  
>  static int __init balloon_init(void)
>  {
> +	domid_t domid = DOMID_SELF;
>  	unsigned long pfn, extra_pfn_end;
>  	struct page *page;
> +	int ret;

long int?
>  
>  	if (!xen_domain())
>  		return -ENODEV;
>  
>  	pr_info("xen/balloon: Initialising balloon driver.\n");
>  
> -	balloon_stats.current_pages = xen_pv_domain() ? min(xen_start_info->nr_pages, max_pfn) : max_pfn;
> +	ret = HYPERVISOR_memory_op(XENMEM_current_reservation, &domid);
> +	if (ret < 0)
> +		return ret;
> +	balloon_stats.current_pages = ret;
>  	balloon_stats.target_pages  = balloon_stats.current_pages;
>  	balloon_stats.balloon_low   = 0;
>  	balloon_stats.balloon_high  = 0;
> -- 
> 1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/5] xen: allow balloon driver to use more than one memory region
  2011-08-19 14:57 ` [PATCH 3/5] xen: allow balloon driver to use more than one memory region David Vrabel
@ 2011-09-06 21:57   ` Konrad Rzeszutek Wilk
  2011-09-07 10:44     ` David Vrabel
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-06 21:57 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

On Fri, Aug 19, 2011 at 03:57:18PM +0100, David Vrabel wrote:
> Allow the xen balloon driver to populate its list of extra pages from
> more than one region of memory.  This will allow platforms to provide
> (for example) a region of low memory and a region of high memory.
> 
> The maximum possible number of extra regions is 128 (== E820MAX) which
> is quite large so xen_extra_mem is placed in __initdata.  This is safe
> as both xen_memeory_setup() and balloon_init() are in __init.
              ^^^^^^ memory.

> 
> The balloon regions themselves are not altered (i.e., there is still
> only the one region).
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/setup.c  |   20 +++++++++---------
>  drivers/xen/balloon.c |   51 ++++++++++++++++++++++++++++--------------------
>  include/xen/page.h    |    9 +++++++-
>  3 files changed, 48 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index c3b8d44..4720c2d 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -37,7 +37,7 @@ extern void xen_syscall_target(void);
>  extern void xen_syscall32_target(void);
>  
>  /* Amount of extra memory space we add to the e820 ranges */
> -phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
> +struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
>  
>  /* 
>   * The maximum amount of extra memory compared to the base size.  The
> @@ -56,7 +56,7 @@ static void __init xen_add_extra_mem(unsigned long pages)
>  	unsigned long pfn;
>  
>  	u64 size = (u64)pages * PAGE_SIZE;
> -	u64 extra_start = xen_extra_mem_start + xen_extra_mem_size;
> +	u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size;
>  
>  	if (!pages)
>  		return;
> @@ -66,7 +66,7 @@ static void __init xen_add_extra_mem(unsigned long pages)
>  
>  	memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA");
>  
> -	xen_extra_mem_size += size;
> +	xen_extra_mem[0].size += size;
>  
>  	xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
>  
> @@ -239,7 +239,7 @@ char * __init xen_memory_setup(void)
>  
>  	memcpy(map_raw, map, sizeof(map));
>  	e820.nr_map = 0;
> -	xen_extra_mem_start = mem_end;
> +	xen_extra_mem[0].start = mem_end;
>  	for (i = 0; i < memmap.nr_entries; i++) {
>  		unsigned long long end;
>  
> @@ -267,8 +267,8 @@ char * __init xen_memory_setup(void)
>  				e820_add_region(end, delta, E820_UNUSABLE);
>  		}
>  
> -		if (map[i].size > 0 && end > xen_extra_mem_start)
> -			xen_extra_mem_start = end;
> +		if (map[i].size > 0 && end > xen_extra_mem[0].start)
> +			xen_extra_mem[0].start = end;
>  
>  		/* Add region if any remains */
>  		if (map[i].size > 0)
> @@ -276,10 +276,10 @@ char * __init xen_memory_setup(void)
>  	}
>  	/* Align the balloon area so that max_low_pfn does not get set
>  	 * to be at the _end_ of the PCI gap at the far end (fee01000).
> -	 * Note that xen_extra_mem_start gets set in the loop above to be
> -	 * past the last E820 region. */
> -	if (xen_initial_domain() && (xen_extra_mem_start < (1ULL<<32)))
> -		xen_extra_mem_start = (1ULL<<32);
> +	 * Note that the start of balloon area gets set in the loop above
> +	 * to be past the last E820 region. */
> +	if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32)))
> +		xen_extra_mem[0].start = (1ULL<<32);
>  
>  	/*
>  	 * In domU, the ISA region is normal, usable memory, but we
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 5814022..996cf65 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -555,11 +555,35 @@ void free_xenballooned_pages(int nr_pages, struct page** pages)
>  }
>  EXPORT_SYMBOL(free_xenballooned_pages);
>  
> -static int __init balloon_init(void)
> +static void __init balloon_add_memory_region(unsigned long start_pfn, unsigned long pages)

That looks suspiciously like it has more than 80 lines. You ran the
_all_ of the patches through scripts/checkpath.pl right?

>  {
> -	domid_t domid = DOMID_SELF;
>  	unsigned long pfn, extra_pfn_end;
>  	struct page *page;
> +
> +	/*
> +	 * Initialise the balloon with excess memory space.  We need
> +	 * to make sure we don't add memory which doesn't exist or
> +	 * logically exist.  The E820 map can be trimmed to be smaller
> +	 * than the amount of physical memory due to the mem= command
> +	 * line parameter.  And if this is a 32-bit non-HIGHMEM kernel
> +	 * on a system with memory which requires highmem to access,
> +	 * don't try to use it.


That whole comment is just confusing.. But I do know that you
just moved it, but I wonder if it makes sense to remove it or
alter it in the future.

> +	 */
> +	extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
> +			    start_pfn + pages);
> +	for (pfn = start_pfn; pfn < extra_pfn_end; pfn++) {
> +		page = pfn_to_page(pfn);
> +		/* totalram_pages and totalhigh_pages do not
> +		   include the boot-time balloon extension, so
> +		   don't subtract from it. */
> +		__balloon_append(page);
> +	}
> +}
> +
> +static int __init balloon_init(void)
> +{
> +	domid_t domid = DOMID_SELF;
> +	int i;
>  	int ret;
>  
>  	if (!xen_domain())
> @@ -588,25 +612,10 @@ static int __init balloon_init(void)
>  	register_memory_notifier(&xen_memory_nb);
>  #endif
>  
> -	/*
> -	 * Initialise the balloon with excess memory space.  We need
> -	 * to make sure we don't add memory which doesn't exist or
> -	 * logically exist.  The E820 map can be trimmed to be smaller
> -	 * than the amount of physical memory due to the mem= command
> -	 * line parameter.  And if this is a 32-bit non-HIGHMEM kernel
> -	 * on a system with memory which requires highmem to access,
> -	 * don't try to use it.
> -	 */
> -	extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
> -			    (unsigned long)PFN_DOWN(xen_extra_mem_start + xen_extra_mem_size));
> -	for (pfn = PFN_UP(xen_extra_mem_start);
> -	     pfn < extra_pfn_end;
> -	     pfn++) {
> -		page = pfn_to_page(pfn);
> -		/* totalram_pages and totalhigh_pages do not include the boot-time
> -		   balloon extension, so don't subtract from it. */
> -		__balloon_append(page);
> -	}
> +	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++)
> +		if (xen_extra_mem[i].size)
> +			balloon_add_memory_region(PFN_UP(xen_extra_mem[i].start),
> +						  PFN_DOWN(xen_extra_mem[i].size));
>  
>  	return 0;
>  }
> diff --git a/include/xen/page.h b/include/xen/page.h
> index 0be36b9..b432c03 100644
> --- a/include/xen/page.h
> +++ b/include/xen/page.h
> @@ -3,6 +3,13 @@
>  
>  #include <asm/xen/page.h>
>  
> -extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
> +struct xen_memory_region {
> +	phys_addr_t start;
> +	phys_addr_t size;
> +};
> +
> +#define XEN_EXTRA_MEM_MAX_REGIONS 128 /* == E820MAX */
> +
> +extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];

__initdata

>  
>  #endif	/* _XEN_PAGE_H */
> -- 
> 1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/5] xen: allow balloon driver to use more than one memory region
  2011-09-06 21:57   ` Konrad Rzeszutek Wilk
@ 2011-09-07 10:44     ` David Vrabel
  2011-09-07 18:09       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: David Vrabel @ 2011-09-07 10:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On 06/09/11 22:57, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 19, 2011 at 03:57:18PM +0100, David Vrabel wrote:
>> 
>> +static void __init balloon_add_memory_region(unsigned long start_pfn, unsigned long pages)
> 
> That looks suspiciously like it has more than 80 lines. You ran the
> _all_ of the patches through scripts/checkpath.pl right?

Yes, I used checkpatch.pl but I tend to treat the 80 character limit as
a guideline rather than a hard limit and only pay attention to it if
breaking a line improves readability.

I assume your preference is for an 80 character hard limit?

>>  {
>> -	domid_t domid = DOMID_SELF;
>>  	unsigned long pfn, extra_pfn_end;
>>  	struct page *page;
>> +
>> +	/*
>> +	 * Initialise the balloon with excess memory space.  We need
>> +	 * to make sure we don't add memory which doesn't exist or
>> +	 * logically exist.  The E820 map can be trimmed to be smaller
>> +	 * than the amount of physical memory due to the mem= command
>> +	 * line parameter.  And if this is a 32-bit non-HIGHMEM kernel
>> +	 * on a system with memory which requires highmem to access,
>> +	 * don't try to use it.
> 
> 
> That whole comment is just confusing.. But I do know that you
> just moved it, but I wonder if it makes sense to remove it or
> alter it in the future.

I think the comment is valid.

>> +	 */
>> +	extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
>> +			    start_pfn + pages);

It's this line that it's documenting.

>> --- a/include/xen/page.h
>> +++ b/include/xen/page.h
>> @@ -3,6 +3,13 @@
>>  
>>  #include <asm/xen/page.h>
>>  
>> -extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
>> +struct xen_memory_region {
>> +	phys_addr_t start;
>> +	phys_addr_t size;
>> +};
>> +
>> +#define XEN_EXTRA_MEM_MAX_REGIONS 128 /* == E820MAX */
>> +
>> +extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
> 
> __initdata

Only the definition (in arch/x86/xen/setup.c) needs the __initdata
attribute, right?

I just checked and it does end up in the .init.data section.

David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/5] xen: release all pages within 1-1 p2m mappings
  2011-09-06 21:20   ` Konrad Rzeszutek Wilk
@ 2011-09-07 11:03     ` David Vrabel
  2011-09-07 18:23       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: David Vrabel @ 2011-09-07 11:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Vrabel, xen-devel, David

On 06/09/11 22:20, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 19, 2011 at 03:57:20PM +0100, David Vrabel wrote:
>> 
>> --- a/arch/x86/xen/setup.c
>> +++ b/arch/x86/xen/setup.c
>> @@ -123,73 +123,33 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
>>  	return len;
>>  }
>>  
>> -static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
>> -						     const struct e820entry *map,
>> -						     int nr_map)
>> +static unsigned long __init xen_set_identity_and_release(const struct e820entry *list,
>> +							 ssize_t map_size,
>> +							 unsigned long nr_pages)
>>  {
>> -	phys_addr_t max_addr = PFN_PHYS(max_pfn);
>> -	phys_addr_t last_end = ISA_END_ADDRESS;
>> +	phys_addr_t avail_end = PFN_PHYS(nr_pages);
>> +	phys_addr_t last_end = 0;
>>  	unsigned long released = 0;
>> -	int i;
>> -
>> -	/* Free any unused memory above the low 1Mbyte. */
>> -	for (i = 0; i < nr_map && last_end < max_addr; i++) {
>> -		phys_addr_t end = map[i].addr;
>> -		end = min(max_addr, end);
>> -
>> -		if (last_end < end)
>> -			released += xen_release_chunk(last_end, end);
>> -		last_end = max(last_end, map[i].addr + map[i].size);
>> -	}
>> -
>> -	if (last_end < max_addr)
>> -		released += xen_release_chunk(last_end, max_addr);
>> -
>> -	printk(KERN_INFO "released %lu pages of unused memory\n", released);
>> -	return released;
>> -}
>> -
>> -static unsigned long __init xen_set_identity(const struct e820entry *list,
>> -					     ssize_t map_size)
>> -{
>> -	phys_addr_t last = xen_initial_domain() ? 0 : ISA_END_ADDRESS;
>> -	phys_addr_t start_pci = last;
>>  	const struct e820entry *entry;
>> -	unsigned long identity = 0;
>>  	int i;
>>  
>>  	for (i = 0, entry = list; i < map_size; i++, entry++) {
>> -		phys_addr_t start = entry->addr;
>> -		phys_addr_t end = start + entry->size;
>> -
>> -		if (start < last)
>> -			start = last;
>> +		phys_addr_t begin = last_end;
> 
> The "begin" is a bit confusing. You are using the previous E820 entry's end - not
> the beginning of this E820 entry. Doing a s/begin/last_end/ makes
> the code a bit easier to understand.

Really?  It seems pretty clear to me that they're the beginning and end
of the memory range we're considering to release or map.

That loop went through a number of variations and what's there is what I
think is the most readable.

>> +		phys_addr_t end = entry->addr + entry->size;
>>  
>> -		if (end <= start)
>> -			continue;
>> +		last_end = end;
> 
> Please include the comment:
> /* This entry end. */

Not really in favour of little comments like this.  I'll put a comment
above the loop.

/*
 * For each memory region consider whether to release and map the
 * region and the preceeding gap (if any).
 */

> OK, but you have ripped out the nice printk's that existed before. So add them
> back in:
> 
> 
> 	printk(KERN_INFO "released %lu pages of unused memory\n", released);
> 	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity);
> 
> as they are quite useful in the field.

What problem are these useful for diagnosing that the remaining messages
(and the e820 map) don't tell you already?

 xen_release_chunk: looking at area pfn 9e-100: 98 pages freed
 1-1 mapping on 9e->100
 1-1 mapping on bf699->bf6af
 1-1 mapping on bf6af->bf6ce
 1-1 mapping on bf6ce->c0000
 1-1 mapping on c0000->f0000
 1-1 mapping on f0000->100000

The total count just doesn't seem useful.

David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/5] xen: allow extra memory to be in multiple regions
  2011-08-19 14:57 ` [PATCH 4/5] xen: allow extra memory to be in multiple regions David Vrabel
@ 2011-09-07 12:28   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-07 12:28 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

On Fri, Aug 19, 2011 at 03:57:19PM +0100, David Vrabel wrote:
> Allow the extra memory (used by the balloon driver) to be in multiple
> regions regions (typically two regions, one for low memory and one for
> high memory).  This allows the balloon driver to increase the number
> of available low pages (if the initial number if pages is small).
> 
> As a side effect, the algorithm for building the e820 memory map is
> simpler and more obviously correct as the map supplied by the
> hypervisor is (almost) used as is (in particular, all reserved regions
> and gaps are preserved).  Only RAM regions are altered and RAM regions
> above max_pfn + extra_pages are marked as unused (the region is split
> in two if necessary).
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/setup.c |  155 ++++++++++++++++++++++----------------------------
>  1 files changed, 67 insertions(+), 88 deletions(-)
> 
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 4720c2d..93e4542 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -51,23 +51,29 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
>   */
>  #define EXTRA_MEM_RATIO		(10)
>  
> -static void __init xen_add_extra_mem(unsigned long pages)
> +static void __init xen_add_extra_mem(u64 extra_start, u64 size)
>  {
>  	unsigned long pfn;
> +	int i;
>  
> -	u64 size = (u64)pages * PAGE_SIZE;
> -	u64 extra_start = xen_extra_mem[0].start + xen_extra_mem[0].size;
> -
> -	if (!pages)
> -		return;
> -
> -	e820_add_region(extra_start, size, E820_RAM);
> -	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
> +	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++) {
> +		/* Add new region. */
> +		if (xen_extra_mem[i].size == 0) {
> +			xen_extra_mem[i].start = extra_start;
> +			xen_extra_mem[i].size  = size;
> +			break;
> +		}
> +		/* Append to existing region. */
> +		if (xen_extra_mem[i].start + xen_extra_mem[i].size == extra_start) {
> +			xen_extra_mem[i].size += size;
> +			break;
> +		}
> +	}
> +	if (i == XEN_EXTRA_MEM_MAX_REGIONS)
> +		printk(KERN_WARNING "Warning: not enough extra memory regions\n");
>  
>  	memblock_x86_reserve_range(extra_start, extra_start + size, "XEN EXTRA");
>  
> -	xen_extra_mem[0].size += size;
> -
>  	xen_max_p2m_pfn = PFN_DOWN(extra_start + size);
>  
>  	for (pfn = PFN_DOWN(extra_start); pfn <= xen_max_p2m_pfn; pfn++)
> @@ -118,7 +124,8 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
>  }
>  
>  static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
> -						     const struct e820map *e820)
> +						     const struct e820entry *map,
> +						     int nr_map)
>  {
>  	phys_addr_t max_addr = PFN_PHYS(max_pfn);
>  	phys_addr_t last_end = ISA_END_ADDRESS;
> @@ -126,13 +133,13 @@ static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
>  	int i;
>  
>  	/* Free any unused memory above the low 1Mbyte. */
> -	for (i = 0; i < e820->nr_map && last_end < max_addr; i++) {
> -		phys_addr_t end = e820->map[i].addr;
> +	for (i = 0; i < nr_map && last_end < max_addr; i++) {
> +		phys_addr_t end = map[i].addr;
>  		end = min(max_addr, end);
>  
>  		if (last_end < end)
>  			released += xen_release_chunk(last_end, end);
> -		last_end = max(last_end, e820->map[i].addr + e820->map[i].size);
> +		last_end = max(last_end, map[i].addr + map[i].size);
>  	}
>  
>  	if (last_end < max_addr)
> @@ -203,14 +210,13 @@ static unsigned long __init xen_get_max_pages(void)
>  char * __init xen_memory_setup(void)
>  {
>  	static struct e820entry map[E820MAX] __initdata;
> -	static struct e820entry map_raw[E820MAX] __initdata;
>  
>  	unsigned long max_pfn = xen_start_info->nr_pages;
>  	unsigned long long mem_end;
>  	int rc;
>  	struct xen_memory_map memmap;
> +	unsigned long max_pages;
>  	unsigned long extra_pages = 0;
> -	unsigned long extra_limit;
>  	unsigned long identity_pages = 0;
>  	int i;
>  	int op;
> @@ -237,49 +243,52 @@ char * __init xen_memory_setup(void)
>  	}
>  	BUG_ON(rc);
>  
> -	memcpy(map_raw, map, sizeof(map));
> -	e820.nr_map = 0;
> -	xen_extra_mem[0].start = mem_end;
> -	for (i = 0; i < memmap.nr_entries; i++) {
> -		unsigned long long end;
> -
> -		/* Guard against non-page aligned E820 entries. */
> -		if (map[i].type == E820_RAM)
> -			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
> -
> -		end = map[i].addr + map[i].size;
> -		if (map[i].type == E820_RAM && end > mem_end) {
> -			/* RAM off the end - may be partially included */
> -			u64 delta = min(map[i].size, end - mem_end);
> -
> -			map[i].size -= delta;
> -			end -= delta;
> -
> -			extra_pages += PFN_DOWN(delta);
> -			/*
> -			 * Set RAM below 4GB that is not for us to be unusable.
> -			 * This prevents "System RAM" address space from being
> -			 * used as potential resource for I/O address (happens
> -			 * when 'allocate_resource' is called).
> -			 */
> -			if (delta &&
> -				(xen_initial_domain() && end < 0x100000000ULL))
> -				e820_add_region(end, delta, E820_UNUSABLE);
> +	/* Make sure the Xen-supplied memory map is well-ordered. */
> +	sanitize_e820_map(map, memmap.nr_entries, &memmap.nr_entries);
> +
> +	max_pages = xen_get_max_pages();
> +	if (max_pages > max_pfn)
> +		extra_pages += max_pages - max_pfn;
> +
> +	extra_pages += xen_return_unused_memory(max_pfn, map, memmap.nr_entries);
> +
> +	/*
> +	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
> +	 * factor the base size.  On non-highmem systems, the base
> +	 * size is the full initial memory allocation; on highmem it
> +	 * is limited to the max size of lowmem, so that it doesn't
> +	 * get completely filled.
> +	 *
> +	 * In principle there could be a problem in lowmem systems if
> +	 * the initial memory is also very large with respect to
> +	 * lowmem, but we won't try to deal with that here.
> +	 */
> +	extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), extra_pages);
> +
> +	i = 0;
> +	while (i < memmap.nr_entries) {
> +		u64 addr = map[i].addr;
> +		u64 size = map[i].size;
> +		u32 type = map[i].type;
> +
> +		if (type == E820_RAM) {
> +			if (addr < mem_end) {
> +				size = min(size, mem_end - addr);

You removed this little alignmend issue we found on Dell Inspiron laptops:

		/* Guard against non-page aligned E820 entries. */
		if (map[i].type == E820_RAM)
			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;

		end = map[i].addr + map[i].size;

Without it we end up dying later on when NUMA was turned on as it tried
to use two pages and "half" of the last page was not RAM and it ended up
going belly up. We need to make sure that the "end" (and the "beginning")
of the E820 entry is page aligned.

> +			} else if (extra_pages) {
> +				size = min(size, (u64)extra_pages * PAGE_SIZE);
> +				extra_pages -= size / PAGE_SIZE;

Those operations can be done using << and >> (using PAGE_SHIFT) respectivly.

> +				xen_add_extra_mem(addr, size);
> +			} else
> +				type = E820_UNUSABLE;
>  		}
>  
> -		if (map[i].size > 0 && end > xen_extra_mem[0].start)
> -			xen_extra_mem[0].start = end;
> +		e820_add_region(addr, size, type);
>  
> -		/* Add region if any remains */
> -		if (map[i].size > 0)
> -			e820_add_region(map[i].addr, map[i].size, map[i].type);
> +		map[i].addr += size;
> +		map[i].size -= size;
> +		if (map[i].size == 0)
> +			i++;
>  	}
> -	/* Align the balloon area so that max_low_pfn does not get set
> -	 * to be at the _end_ of the PCI gap at the far end (fee01000).
> -	 * Note that the start of balloon area gets set in the loop above
> -	 * to be past the last E820 region. */
> -	if (xen_initial_domain() && (xen_extra_mem[0].start < (1ULL<<32)))
> -		xen_extra_mem[0].start = (1ULL<<32);

I think this issue is not present with your patch - but just to be
on the safe side you might want to ask Stefano to use his box where
he found the issue originally.

>  
>  	/*
>  	 * In domU, the ISA region is normal, usable memory, but we
> @@ -305,41 +314,11 @@ char * __init xen_memory_setup(void)
>  
>  	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
>  
> -	extra_limit = xen_get_max_pages();
> -	if (extra_limit >= max_pfn)
> -		extra_pages = extra_limit - max_pfn;
> -	else
> -		extra_pages = 0;
> -
> -	extra_pages += xen_return_unused_memory(xen_start_info->nr_pages, &e820);
> -
> -	/*
> -	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
> -	 * factor the base size.  On non-highmem systems, the base
> -	 * size is the full initial memory allocation; on highmem it
> -	 * is limited to the max size of lowmem, so that it doesn't
> -	 * get completely filled.
> -	 *
> -	 * In principle there could be a problem in lowmem systems if
> -	 * the initial memory is also very large with respect to
> -	 * lowmem, but we won't try to deal with that here.
> -	 */
> -	extra_limit = min(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
> -			  max_pfn + extra_pages);
> -
> -	if (extra_limit >= max_pfn)
> -		extra_pages = extra_limit - max_pfn;
> -	else
> -		extra_pages = 0;
> -
> -	xen_add_extra_mem(extra_pages);
> -
>  	/*
>  	 * Set P2M for all non-RAM pages and E820 gaps to be identity
> -	 * type PFNs. We supply it with the non-sanitized version
> -	 * of the E820.
> +	 * type PFNs.
>  	 */
> -	identity_pages = xen_set_identity(map_raw, memmap.nr_entries);
> +	identity_pages = xen_set_identity(e820.map, e820.nr_map);
>  	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity_pages);
>  	return "Xen";
>  }
> -- 
> 1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/5] xen: allow balloon driver to use more than one memory region
  2011-09-07 10:44     ` David Vrabel
@ 2011-09-07 18:09       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-07 18:09 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel

On Wed, Sep 07, 2011 at 11:44:36AM +0100, David Vrabel wrote:
> On 06/09/11 22:57, Konrad Rzeszutek Wilk wrote:
> > On Fri, Aug 19, 2011 at 03:57:18PM +0100, David Vrabel wrote:
> >> 
> >> +static void __init balloon_add_memory_region(unsigned long start_pfn, unsigned long pages)
> > 
> > That looks suspiciously like it has more than 80 lines. You ran the
> > _all_ of the patches through scripts/checkpath.pl right?
> 
> Yes, I used checkpatch.pl but I tend to treat the 80 character limit as
> a guideline rather than a hard limit and only pay attention to it if
> breaking a line improves readability.
> 
> I assume your preference is for an 80 character hard limit?

Please.
> 
> >>  {
> >> -	domid_t domid = DOMID_SELF;
> >>  	unsigned long pfn, extra_pfn_end;
> >>  	struct page *page;
> >> +
> >> +	/*
> >> +	 * Initialise the balloon with excess memory space.  We need
> >> +	 * to make sure we don't add memory which doesn't exist or
> >> +	 * logically exist.  The E820 map can be trimmed to be smaller
> >> +	 * than the amount of physical memory due to the mem= command
> >> +	 * line parameter.  And if this is a 32-bit non-HIGHMEM kernel
> >> +	 * on a system with memory which requires highmem to access,
> >> +	 * don't try to use it.
> > 
> > 
> > That whole comment is just confusing.. But I do know that you
> > just moved it, but I wonder if it makes sense to remove it or
> > alter it in the future.
> 
> I think the comment is valid.

Can define "logically exist" ?

or "non-HIGHMEM kernel .. don't try to use it' - but we do
use it.
> 
> >> +	 */
> >> +	extra_pfn_end = min(min(max_pfn, e820_end_of_ram_pfn()),
> >> +			    start_pfn + pages);
> 
> It's this line that it's documenting.
> 
> >> --- a/include/xen/page.h
> >> +++ b/include/xen/page.h
> >> @@ -3,6 +3,13 @@
> >>  
> >>  #include <asm/xen/page.h>
> >>  
> >> -extern phys_addr_t xen_extra_mem_start, xen_extra_mem_size;
> >> +struct xen_memory_region {
> >> +	phys_addr_t start;
> >> +	phys_addr_t size;
> >> +};
> >> +
> >> +#define XEN_EXTRA_MEM_MAX_REGIONS 128 /* == E820MAX */
> >> +
> >> +extern struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS];
> > 
> > __initdata
> 
> Only the definition (in arch/x86/xen/setup.c) needs the __initdata
> attribute, right?

Right, but you might wnat to put in here too so folks won't try to
use past __init. Or just stick a comment in there warning folks
about it.

> 
> I just checked and it does end up in the .init.data section.
> 
> David

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/5] xen: release all pages within 1-1 p2m mappings
  2011-09-07 11:03     ` David Vrabel
@ 2011-09-07 18:23       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-07 18:23 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, David Vrabel

On Wed, Sep 07, 2011 at 12:03:30PM +0100, David Vrabel wrote:
> On 06/09/11 22:20, Konrad Rzeszutek Wilk wrote:
> > On Fri, Aug 19, 2011 at 03:57:20PM +0100, David Vrabel wrote:
> >> 
> >> --- a/arch/x86/xen/setup.c
> >> +++ b/arch/x86/xen/setup.c
> >> @@ -123,73 +123,33 @@ static unsigned long __init xen_release_chunk(phys_addr_t start_addr,
> >>  	return len;
> >>  }
> >>  
> >> -static unsigned long __init xen_return_unused_memory(unsigned long max_pfn,
> >> -						     const struct e820entry *map,
> >> -						     int nr_map)
> >> +static unsigned long __init xen_set_identity_and_release(const struct e820entry *list,
> >> +							 ssize_t map_size,
> >> +							 unsigned long nr_pages)
> >>  {
> >> -	phys_addr_t max_addr = PFN_PHYS(max_pfn);
> >> -	phys_addr_t last_end = ISA_END_ADDRESS;
> >> +	phys_addr_t avail_end = PFN_PHYS(nr_pages);
> >> +	phys_addr_t last_end = 0;
> >>  	unsigned long released = 0;
> >> -	int i;
> >> -
> >> -	/* Free any unused memory above the low 1Mbyte. */
> >> -	for (i = 0; i < nr_map && last_end < max_addr; i++) {
> >> -		phys_addr_t end = map[i].addr;
> >> -		end = min(max_addr, end);
> >> -
> >> -		if (last_end < end)
> >> -			released += xen_release_chunk(last_end, end);
> >> -		last_end = max(last_end, map[i].addr + map[i].size);
> >> -	}
> >> -
> >> -	if (last_end < max_addr)
> >> -		released += xen_release_chunk(last_end, max_addr);
> >> -
> >> -	printk(KERN_INFO "released %lu pages of unused memory\n", released);
> >> -	return released;
> >> -}
> >> -
> >> -static unsigned long __init xen_set_identity(const struct e820entry *list,
> >> -					     ssize_t map_size)
> >> -{
> >> -	phys_addr_t last = xen_initial_domain() ? 0 : ISA_END_ADDRESS;
> >> -	phys_addr_t start_pci = last;
> >>  	const struct e820entry *entry;
> >> -	unsigned long identity = 0;
> >>  	int i;
> >>  
> >>  	for (i = 0, entry = list; i < map_size; i++, entry++) {
> >> -		phys_addr_t start = entry->addr;
> >> -		phys_addr_t end = start + entry->size;
> >> -
> >> -		if (start < last)
> >> -			start = last;
> >> +		phys_addr_t begin = last_end;
> > 
> > The "begin" is a bit confusing. You are using the previous E820 entry's end - not
> > the beginning of this E820 entry. Doing a s/begin/last_end/ makes
> > the code a bit easier to understand.
> 
> Really?  It seems pretty clear to me that they're the beginning and end
> of the memory range we're considering to release or map.
> 
> That loop went through a number of variations and what's there is what I
> think is the most readable.

Please add a comment describing that.

> 
> >> +		phys_addr_t end = entry->addr + entry->size;
> >>  
> >> -		if (end <= start)
> >> -			continue;
> >> +		last_end = end;
> > 
> > Please include the comment:
> > /* This entry end. */
> 
> Not really in favour of little comments like this.  I'll put a comment
> above the loop.
> 
> /*
>  * For each memory region consider whether to release and map the
>  * region and the preceeding gap (if any).
>  */

OK, can you expand it please to mention that you are evaluating the
beginning and end of the memory range. Thanks!

> 
> > OK, but you have ripped out the nice printk's that existed before. So add them
> > back in:
> > 
> > 
> > 	printk(KERN_INFO "released %lu pages of unused memory\n", released);
> > 	printk(KERN_INFO "Set %ld page(s) to 1-1 mapping.\n", identity);
> > 
> > as they are quite useful in the field.
> 
> What problem are these useful for diagnosing that the remaining messages
> (and the e820 map) don't tell you already?

You get an idea of which are released and which ones are identity pages.

The E820 won't tell you how many total got released. Well, you can
figure out if you look at the E820, at the mem=X and do some decimal
to hex conversations.

Much easier just to look at the end result.

> 
>  xen_release_chunk: looking at area pfn 9e-100: 98 pages freed
>  1-1 mapping on 9e->100
>  1-1 mapping on bf699->bf6af
>  1-1 mapping on bf6af->bf6ce
>  1-1 mapping on bf6ce->c0000
>  1-1 mapping on c0000->f0000
>  1-1 mapping on f0000->100000

Ok, but those are 'debug' version. The totals are for 'info' level

Also, considering that the Red Hat guys posted patches to improve
the look and feel of those printk's I don't want to them rip out.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/5] xen/balloon: account for pages released during memory setup
  2011-09-06 21:31   ` Konrad Rzeszutek Wilk
@ 2011-09-08 15:01     ` David Vrabel
  0 siblings, 0 replies; 20+ messages in thread
From: David Vrabel @ 2011-09-08 15:01 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On 06/09/11 22:31, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 19, 2011 at 03:57:17PM +0100, David Vrabel wrote:
>> 
>> --- a/drivers/xen/balloon.c
>> +++ b/drivers/xen/balloon.c
>> @@ -557,15 +557,20 @@ EXPORT_SYMBOL(free_xenballooned_pages);
>>  
>>  static int __init balloon_init(void)
>>  {
>> +	domid_t domid = DOMID_SELF;
>>  	unsigned long pfn, extra_pfn_end;
>>  	struct page *page;
>> +	int ret;
> 
> long int?

int looks correct to me.  From arch/x86/include/asm/xen/hypercall.h:

static inline int
HYPERVISOR_memory_op(unsigned int cmd, void *arg)
{
        return _hypercall2(int, memory_op, cmd, arg);
}

>>  
>>  	if (!xen_domain())
>>  		return -ENODEV;
>>  
>>  	pr_info("xen/balloon: Initialising balloon driver.\n");
>>  
>> -	balloon_stats.current_pages = xen_pv_domain() ? min(xen_start_info->nr_pages, max_pfn) : max_pfn;
>> +	ret = HYPERVISOR_memory_op(XENMEM_current_reservation, &domid);
>> +	if (ret < 0)
>> +		return ret;
>> +	balloon_stats.current_pages = ret;
>>  	balloon_stats.target_pages  = balloon_stats.current_pages;
>>  	balloon_stats.balloon_low   = 0;
>>  	balloon_stats.balloon_high  = 0;
>> -- 
>> 1.7.2.5

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2011-09-08 15:01 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-19 14:57 xen: memory initialization/balloon fixes (#2) David Vrabel
2011-08-19 14:57 ` [PATCH 1/5] xen: use maximum reservation to limit amount of usable RAM David Vrabel
2011-08-31 20:40   ` Konrad Rzeszutek Wilk
2011-09-01 12:12     ` David Vrabel
2011-09-01 13:14       ` Konrad Rzeszutek Wilk
2011-08-19 14:57 ` [PATCH 2/5] xen/balloon: account for pages released during memory setup David Vrabel
2011-09-06 21:31   ` Konrad Rzeszutek Wilk
2011-09-08 15:01     ` David Vrabel
2011-08-19 14:57 ` [PATCH 3/5] xen: allow balloon driver to use more than one memory region David Vrabel
2011-09-06 21:57   ` Konrad Rzeszutek Wilk
2011-09-07 10:44     ` David Vrabel
2011-09-07 18:09       ` Konrad Rzeszutek Wilk
2011-08-19 14:57 ` [PATCH 4/5] xen: allow extra memory to be in multiple regions David Vrabel
2011-09-07 12:28   ` Konrad Rzeszutek Wilk
2011-08-19 14:57 ` [PATCH 5/5] xen: release all pages within 1-1 p2m mappings David Vrabel
2011-08-19 15:05   ` David Vrabel
2011-09-06 21:20   ` Konrad Rzeszutek Wilk
2011-09-07 11:03     ` David Vrabel
2011-09-07 18:23       ` Konrad Rzeszutek Wilk
2011-08-22 14:49 ` xen: memory initialization/balloon fixes (#2) Konrad Rzeszutek Wilk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.