linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k
@ 2016-09-09  7:58 Denys Vlasenko
  2016-09-15  7:04 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Denys Vlasenko @ 2016-09-09  7:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Andy Lutomirski, H. Peter Anvin, Borislav Petkov,
	Brian Gerst, x86, linux-kernel

The maximum size of e820 map array for EFI systems is defined as
E820_X_MAX (E820MAX + 3 * MAX_NUMNODES).

In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved
are 6404 bytes each.

With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820 and e820_saved
are 64004 bytes each. Most of this space is wasted. Typical machines have some 20-30
e820 areas at most.

This patch turns e820 and e820_saved to pointers which initially point to __initdata
tables, of the same size as before.

At the very end of setup_arch(), when we are done fiddling with these maps,
allocate smaller alloc_bootmem blocks, copy maps there, and change pointers.

Run-tested.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Ingo Molnar <mingo@kernel.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Brian Gerst <brgerst@gmail.com>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---
Changes since v1: added __refdata annotations.

 arch/x86/include/asm/e820.h       |   6 +-
 arch/x86/kernel/e820.c            | 123 +++++++++++++++++++++++---------------
 arch/x86/kernel/early-quirks.c    |   2 +-
 arch/x86/kernel/kexec-bzimage64.c |   4 +-
 arch/x86/kernel/resource.c        |   4 +-
 arch/x86/kernel/setup.c           |  10 ++--
 arch/x86/kernel/tboot.c           |   8 +--
 arch/x86/platform/efi/efi.c       |   2 +-
 arch/x86/xen/setup.c              |   2 +-
 9 files changed, 96 insertions(+), 65 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 3ab0537..476b574 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -10,8 +10,8 @@
 #include <uapi/asm/e820.h>
 #ifndef __ASSEMBLY__
 /* see comment in arch/x86/kernel/e820.c */
-extern struct e820map e820;
-extern struct e820map e820_saved;
+extern struct e820map *e820;
+extern struct e820map *e820_saved;
 
 extern unsigned long pci_mem_start;
 extern int e820_any_mapped(u64 start, u64 end, unsigned type);
@@ -53,6 +53,8 @@ extern void e820_reserve_resources_late(void);
 extern void setup_memory_map(void);
 extern char *default_machine_specific_memory_setup(void);
 
+extern void e820_reallocate_tables(void);
+
 /*
  * Returns true iff the specified range [s,e) is completely contained inside
  * the ISA region.
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index a6d4f10..4e52859 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -40,8 +40,14 @@
  * user can e.g. boot the original kernel with mem=1G while still booting the
  * next kernel with full memory.
  */
-struct e820map e820;
-struct e820map e820_saved;
+static struct e820map initial_e820 __initdata;
+static struct e820map initial_e820_saved __initdata;
+/*
+ * After we are done fiddling with maps at boot, we reallocate (shrink) them.
+ * Otherwise each map uses up to ~64k, but typically only ~1k is needed.
+ */
+struct e820map *e820 __refdata = &initial_e820;
+struct e820map *e820_saved __refdata = &initial_e820_saved;
 
 /* For PCI or other memory-mapped resources */
 unsigned long pci_mem_start = 0xaeedbabe;
@@ -58,8 +64,8 @@ e820_any_mapped(u64 start, u64 end, unsigned type)
 {
 	int i;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *ei = &e820->map[i];
 
 		if (type && ei->type != type)
 			continue;
@@ -81,8 +87,8 @@ int __init e820_all_mapped(u64 start, u64 end, unsigned type)
 {
 	int i;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *ei = &e820->map[i];
 
 		if (type && ei->type != type)
 			continue;
@@ -128,7 +134,7 @@ static void __init __e820_add_region(struct e820map *e820x, u64 start, u64 size,
 
 void __init e820_add_region(u64 start, u64 size, int type)
 {
-	__e820_add_region(&e820, start, size, type);
+	__e820_add_region(e820, start, size, type);
 }
 
 static void __init e820_print_type(u32 type)
@@ -164,12 +170,12 @@ void __init e820_print_map(char *who)
 {
 	int i;
 
-	for (i = 0; i < e820.nr_map; i++) {
+	for (i = 0; i < e820->nr_map; i++) {
 		printk(KERN_INFO "%s: [mem %#018Lx-%#018Lx] ", who,
-		       (unsigned long long) e820.map[i].addr,
+		       (unsigned long long) e820->map[i].addr,
 		       (unsigned long long)
-		       (e820.map[i].addr + e820.map[i].size - 1));
-		e820_print_type(e820.map[i].type);
+		       (e820->map[i].addr + e820->map[i].size - 1));
+		e820_print_type(e820->map[i].type);
 		printk(KERN_CONT "\n");
 	}
 }
@@ -493,13 +499,13 @@ static u64 __init __e820_update_range(struct e820map *e820x, u64 start,
 u64 __init e820_update_range(u64 start, u64 size, unsigned old_type,
 			     unsigned new_type)
 {
-	return __e820_update_range(&e820, start, size, old_type, new_type);
+	return __e820_update_range(e820, start, size, old_type, new_type);
 }
 
 static u64 __init e820_update_range_saved(u64 start, u64 size,
 					  unsigned old_type, unsigned new_type)
 {
-	return __e820_update_range(&e820_saved, start, size, old_type,
+	return __e820_update_range(e820_saved, start, size, old_type,
 				     new_type);
 }
 
@@ -521,8 +527,8 @@ u64 __init e820_remove_range(u64 start, u64 size, unsigned old_type,
 		e820_print_type(old_type);
 	printk(KERN_CONT "\n");
 
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *ei = &e820->map[i];
 		u64 final_start, final_end;
 		u64 ei_end;
 
@@ -566,15 +572,15 @@ u64 __init e820_remove_range(u64 start, u64 size, unsigned old_type,
 
 void __init update_e820(void)
 {
-	if (sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map))
+	if (sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map))
 		return;
 	printk(KERN_INFO "e820: modified physical RAM map:\n");
 	e820_print_map("modified");
 }
 static void __init update_e820_saved(void)
 {
-	sanitize_e820_map(e820_saved.map, ARRAY_SIZE(e820_saved.map),
-				&e820_saved.nr_map);
+	sanitize_e820_map(e820_saved->map, ARRAY_SIZE(e820_saved->map),
+				&e820_saved->nr_map);
 }
 #define MAX_GAP_END 0x100000000ull
 /*
@@ -584,14 +590,14 @@ __init int e820_search_gap(unsigned long *gapstart, unsigned long *gapsize,
 		unsigned long start_addr, unsigned long long end_addr)
 {
 	unsigned long long last;
-	int i = e820.nr_map;
+	int i = e820->nr_map;
 	int found = 0;
 
 	last = (end_addr && end_addr < MAX_GAP_END) ? end_addr : MAX_GAP_END;
 
 	while (--i >= 0) {
-		unsigned long long start = e820.map[i].addr;
-		unsigned long long end = start + e820.map[i].size;
+		unsigned long long start = e820->map[i].addr;
+		unsigned long long end = start + e820->map[i].size;
 
 		if (end < start_addr)
 			continue;
@@ -649,6 +655,27 @@ __init void e820_setup_gap(void)
 	       gapstart, gapstart + gapsize - 1);
 }
 
+/*
+ * Initial e820 and e820_saved are largish __initdata arrays.
+ * Copy them to (usually much smaller) dynamically allocated area.
+ * This is done after all tweaks we ever do to them.
+ */
+__init void e820_reallocate_tables(void)
+{
+	struct e820map *n;
+	int size;
+
+	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820->nr_map;
+	n = alloc_bootmem(size);
+	memcpy(n, e820, size);
+	e820 = n;
+
+	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820_saved->nr_map;
+	n = alloc_bootmem(size);
+	memcpy(n, e820_saved, size);
+	e820_saved = n;
+}
+
 /**
  * Because of the size limitation of struct boot_params, only first
  * 128 E820 memory entries are passed to kernel via
@@ -665,7 +692,7 @@ void __init parse_e820_ext(u64 phys_addr, u32 data_len)
 	entries = sdata->len / sizeof(struct e820entry);
 	extmap = (struct e820entry *)(sdata->data);
 	__append_e820_map(extmap, entries);
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map);
 	early_memunmap(sdata, data_len);
 	printk(KERN_INFO "e820: extended physical RAM map:\n");
 	e820_print_map("extended");
@@ -686,8 +713,8 @@ void __init e820_mark_nosave_regions(unsigned long limit_pfn)
 	int i;
 	unsigned long pfn = 0;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *ei = &e820->map[i];
 
 		if (pfn < PFN_UP(ei->addr))
 			register_nosave_region(pfn, PFN_UP(ei->addr));
@@ -712,8 +739,8 @@ static int __init e820_mark_nvs_memory(void)
 {
 	int i;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *ei = &e820->map[i];
 
 		if (ei->type == E820_NVS)
 			acpi_nvs_register(ei->addr, ei->size);
@@ -760,8 +787,8 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
 	unsigned long last_pfn = 0;
 	unsigned long max_arch_pfn = MAX_ARCH_PFN;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *ei = &e820->map[i];
 		unsigned long start_pfn;
 		unsigned long end_pfn;
 
@@ -856,7 +883,7 @@ static int __init parse_memmap_one(char *p)
 		 */
 		saved_max_pfn = e820_end_of_ram_pfn();
 #endif
-		e820.nr_map = 0;
+		e820->nr_map = 0;
 		userdef = 1;
 		return 0;
 	}
@@ -903,8 +930,8 @@ early_param("memmap", parse_memmap_opt);
 void __init finish_e820_parsing(void)
 {
 	if (userdef) {
-		if (sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map),
-					&e820.nr_map) < 0)
+		if (sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map),
+					&e820->nr_map) < 0)
 			early_panic("Invalid user supplied memory map");
 
 		printk(KERN_INFO "e820: user-defined physical RAM map:\n");
@@ -991,35 +1018,35 @@ void __init e820_reserve_resources(void)
 	struct resource *res;
 	u64 end;
 
-	res = alloc_bootmem(sizeof(struct resource) * e820.nr_map);
+	res = alloc_bootmem(sizeof(struct resource) * e820->nr_map);
 	e820_res = res;
-	for (i = 0; i < e820.nr_map; i++) {
-		end = e820.map[i].addr + e820.map[i].size - 1;
+	for (i = 0; i < e820->nr_map; i++) {
+		end = e820->map[i].addr + e820->map[i].size - 1;
 		if (end != (resource_size_t)end) {
 			res++;
 			continue;
 		}
-		res->name = e820_type_to_string(e820.map[i].type);
-		res->start = e820.map[i].addr;
+		res->name = e820_type_to_string(e820->map[i].type);
+		res->start = e820->map[i].addr;
 		res->end = end;
 
-		res->flags = e820_type_to_iomem_type(e820.map[i].type);
-		res->desc = e820_type_to_iores_desc(e820.map[i].type);
+		res->flags = e820_type_to_iomem_type(e820->map[i].type);
+		res->desc = e820_type_to_iores_desc(e820->map[i].type);
 
 		/*
 		 * don't register the region that could be conflicted with
 		 * pci device BAR resource and insert them later in
 		 * pcibios_resource_survey()
 		 */
-		if (do_mark_busy(e820.map[i].type, res)) {
+		if (do_mark_busy(e820->map[i].type, res)) {
 			res->flags |= IORESOURCE_BUSY;
 			insert_resource(&iomem_resource, res);
 		}
 		res++;
 	}
 
-	for (i = 0; i < e820_saved.nr_map; i++) {
-		struct e820entry *entry = &e820_saved.map[i];
+	for (i = 0; i < e820_saved->nr_map; i++) {
+		struct e820entry *entry = &e820_saved->map[i];
 		firmware_map_add_early(entry->addr,
 			entry->addr + entry->size,
 			e820_type_to_string(entry->type));
@@ -1051,7 +1078,7 @@ void __init e820_reserve_resources_late(void)
 	struct resource *res;
 
 	res = e820_res;
-	for (i = 0; i < e820.nr_map; i++) {
+	for (i = 0; i < e820->nr_map; i++) {
 		if (!res->parent && res->end)
 			insert_resource_expand_to_fit(&iomem_resource, res);
 		res++;
@@ -1061,8 +1088,8 @@ void __init e820_reserve_resources_late(void)
 	 * Try to bump up RAM regions to reasonable boundaries to
 	 * avoid stolen RAM:
 	 */
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *entry = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *entry = &e820->map[i];
 		u64 start, end;
 
 		if (entry->type != E820_RAM)
@@ -1110,7 +1137,7 @@ char *__init default_machine_specific_memory_setup(void)
 			who = "BIOS-e801";
 		}
 
-		e820.nr_map = 0;
+		e820->nr_map = 0;
 		e820_add_region(0, LOWMEMSIZE(), E820_RAM);
 		e820_add_region(HIGH_MEMORY, mem_size << 10, E820_RAM);
 	}
@@ -1124,7 +1151,7 @@ void __init setup_memory_map(void)
 	char *who;
 
 	who = x86_init.resources.memory_setup();
-	memcpy(&e820_saved, &e820, sizeof(struct e820map));
+	memcpy(e820_saved, e820, sizeof(struct e820map));
 	printk(KERN_INFO "e820: BIOS-provided physical RAM map:\n");
 	e820_print_map(who);
 }
@@ -1141,8 +1168,8 @@ void __init memblock_x86_fill(void)
 	 */
 	memblock_allow_resize();
 
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		struct e820entry *ei = &e820->map[i];
 
 		end = ei->addr + ei->size;
 		if (end != (resource_size_t)end)
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index de7501e..18bb3a6 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -555,7 +555,7 @@ intel_graphics_stolen(int num, int slot, int func,
 
 	/* Mark this space as reserved */
 	e820_add_region(base, size, E820_RESERVED);
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map);
 }
 
 static void __init intel_graphics_quirks(int num, int slot, int func)
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index f2356bd..3407b14 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -99,14 +99,14 @@ static int setup_e820_entries(struct boot_params *params)
 {
 	unsigned int nr_e820_entries;
 
-	nr_e820_entries = e820_saved.nr_map;
+	nr_e820_entries = e820_saved->nr_map;
 
 	/* TODO: Pass entries more than E820MAX in bootparams setup data */
 	if (nr_e820_entries > E820MAX)
 		nr_e820_entries = E820MAX;
 
 	params->e820_entries = nr_e820_entries;
-	memcpy(&params->e820_map, &e820_saved.map,
+	memcpy(&params->e820_map, &e820_saved->map,
 	       nr_e820_entries * sizeof(struct e820entry));
 
 	return 0;
diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c
index 80eab01..2408c16 100644
--- a/arch/x86/kernel/resource.c
+++ b/arch/x86/kernel/resource.c
@@ -27,8 +27,8 @@ static void remove_e820_regions(struct resource *avail)
 	int i;
 	struct e820entry *entry;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		entry = &e820.map[i];
+	for (i = 0; i < e820->nr_map; i++) {
+		entry = &e820->map[i];
 
 		resource_clip(avail, entry->addr,
 			      entry->addr + entry->size - 1);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 0fa60f5..b11517e 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -458,8 +458,8 @@ static void __init e820_reserve_setup_data(void)
 		early_memunmap(data, sizeof(*data));
 	}
 
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
-	memcpy(&e820_saved, &e820, sizeof(struct e820map));
+	sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map);
+	memcpy(e820_saved, e820, sizeof(struct e820map));
 	printk(KERN_INFO "extended physical RAM map:\n");
 	e820_print_map("reserve setup_data");
 }
@@ -763,7 +763,7 @@ static void __init trim_bios_range(void)
 	 */
 	e820_remove_range(BIOS_BEGIN, BIOS_END - BIOS_BEGIN, E820_RAM, 1);
 
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map);
 }
 
 /* called before trim_bios_range() to spare extra sanitize */
@@ -1032,7 +1032,7 @@ void __init setup_arch(char **cmdline_p)
 	if (ppro_with_ram_bug()) {
 		e820_update_range(0x70000000ULL, 0x40000ULL, E820_RAM,
 				  E820_RESERVED);
-		sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+		sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map);
 		printk(KERN_INFO "fixed physical RAM map:\n");
 		e820_print_map("bad_ppro");
 	}
@@ -1262,6 +1262,8 @@ void __init setup_arch(char **cmdline_p)
 	if (efi_enabled(EFI_BOOT))
 		efi_apply_memmap_quirks();
 #endif
+
+	e820_reallocate_tables();
 }
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 654f6c6..8402907 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -188,12 +188,12 @@ static int tboot_setup_sleep(void)
 
 	tboot->num_mac_regions = 0;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		if ((e820.map[i].type != E820_RAM)
-		 && (e820.map[i].type != E820_RESERVED_KERN))
+	for (i = 0; i < e820->nr_map; i++) {
+		if ((e820->map[i].type != E820_RAM)
+		 && (e820->map[i].type != E820_RESERVED_KERN))
 			continue;
 
-		add_mac_region(e820.map[i].addr, e820.map[i].size);
+		add_mac_region(e820->map[i].addr, e820->map[i].size);
 	}
 
 	tboot->acpi_sinfo.kernel_s3_resume_vector =
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 1fbb408..2e34334 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -166,7 +166,7 @@ static void __init do_add_efi_memmap(void)
 		}
 		e820_add_region(start, size, e820_type);
 	}
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map);
 }
 
 int __init efi_memblock_x86_reserve_range(void)
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 1764252..f8960fc 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -861,7 +861,7 @@ char * __init xen_memory_setup(void)
 	e820_add_region(ISA_START_ADDRESS, ISA_END_ADDRESS - ISA_START_ADDRESS,
 			E820_RESERVED);
 
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map(e820->map, ARRAY_SIZE(e820->map), &e820->nr_map);
 
 	/*
 	 * Check whether the kernel itself conflicts with the target E820 map.
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k
  2016-09-09  7:58 [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k Denys Vlasenko
@ 2016-09-15  7:04 ` Ingo Molnar
  2016-09-17 21:09   ` Denys Vlasenko
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2016-09-15  7:04 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, H. Peter Anvin, Borislav Petkov, Brian Gerst,
	x86, linux-kernel


* Denys Vlasenko <dvlasenk@redhat.com> wrote:

> The maximum size of e820 map array for EFI systems is defined as
> E820_X_MAX (E820MAX + 3 * MAX_NUMNODES).
> 
> In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved
> are 6404 bytes each.
> 
> With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820 and e820_saved
> are 64004 bytes each. Most of this space is wasted. Typical machines have some 20-30
> e820 areas at most.
> 
> This patch turns e820 and e820_saved to pointers which initially point to __initdata
> tables, of the same size as before.
> 
> At the very end of setup_arch(), when we are done fiddling with these maps,
> allocate smaller alloc_bootmem blocks, copy maps there, and change pointers.
> 
> Run-tested.

> +/*
> + * Initial e820 and e820_saved are largish __initdata arrays.
> + * Copy them to (usually much smaller) dynamically allocated area.
> + * This is done after all tweaks we ever do to them.
> + */
> +__init void e820_reallocate_tables(void)
> +{
> +	struct e820map *n;
> +	int size;
> +
> +	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820->nr_map;
> +	n = alloc_bootmem(size);
> +	memcpy(n, e820, size);
> +	e820 = n;
> +
> +	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820_saved->nr_map;
> +	n = alloc_bootmem(size);
> +	memcpy(n, e820_saved, size);
> +	e820_saved = n;
> +}

Ok, this makes me quite nervous, could you please split this into two patches so 
that any fails can be nicely bisected to?

First patch only does the pointerization changes with a trivial placeholder 
structure (full size, static allocated), second patch does all the dangerous bits 
such as changing it to __initdata, allocating and copying over bits.

Also, could we please also add some minimal debugging facility to make sure the 
memory table does not get extended after it's been reallocated?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k
  2016-09-15  7:04 ` Ingo Molnar
@ 2016-09-17 21:09   ` Denys Vlasenko
  2016-09-18  8:31     ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Denys Vlasenko @ 2016-09-17 21:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, H. Peter Anvin, Borislav Petkov, Brian Gerst,
	x86, linux-kernel

On 09/15/2016 09:04 AM, Ingo Molnar wrote:
>
> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>
>> The maximum size of e820 map array for EFI systems is defined as
>> E820_X_MAX (E820MAX + 3 * MAX_NUMNODES).
>>
>> In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved
>> are 6404 bytes each.
>>
>> With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820 and e820_saved
>> are 64004 bytes each. Most of this space is wasted. Typical machines have some 20-30
>> e820 areas at most.
>>
>> This patch turns e820 and e820_saved to pointers which initially point to __initdata
>> tables, of the same size as before.
>>
>> At the very end of setup_arch(), when we are done fiddling with these maps,
>> allocate smaller alloc_bootmem blocks, copy maps there, and change pointers.
>>
>> Run-tested.
>
>> +/*
>> + * Initial e820 and e820_saved are largish __initdata arrays.
>> + * Copy them to (usually much smaller) dynamically allocated area.
>> + * This is done after all tweaks we ever do to them.
>> + */
>> +__init void e820_reallocate_tables(void)
>> +{
>> +	struct e820map *n;
>> +	int size;
>> +
>> +	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820->nr_map;
>> +	n = alloc_bootmem(size);
>> +	memcpy(n, e820, size);
>> +	e820 = n;
>> +
>> +	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820_saved->nr_map;
>> +	n = alloc_bootmem(size);
>> +	memcpy(n, e820_saved, size);
>> +	e820_saved = n;
>> +}
>
> Ok, this makes me quite nervous, could you please split this into two patches so
> that any fails can be nicely bisected to?

No problem.

> First patch only does the pointerization changes with a trivial placeholder
> structure (full size, static allocated), second patch does all the dangerous bits
> such as changing it to __initdata, allocating and copying over bits.
>
> Also, could we please also add some minimal debugging facility to make sure the
> memory table does not get extended after it's been reallocated?

I have another idea: run e820_reallocate_tables() later, just before
we free __init and __initdata. Then e820 tables _can't_ be_ changed -
all functions which do that are __init functions.

Will test this now, and send a patchset.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k
  2016-09-17 21:09   ` Denys Vlasenko
@ 2016-09-18  8:31     ` Ingo Molnar
  2016-09-18 18:11       ` Denys Vlasenko
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2016-09-18  8:31 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, H. Peter Anvin, Borislav Petkov, Brian Gerst,
	x86, linux-kernel


* Denys Vlasenko <dvlasenk@redhat.com> wrote:

> On 09/15/2016 09:04 AM, Ingo Molnar wrote:
> >
> >* Denys Vlasenko <dvlasenk@redhat.com> wrote:
> >
> >>The maximum size of e820 map array for EFI systems is defined as
> >>E820_X_MAX (E820MAX + 3 * MAX_NUMNODES).
> >>
> >>In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved
> >>are 6404 bytes each.
> >>
> >>With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820 and e820_saved
> >>are 64004 bytes each. Most of this space is wasted. Typical machines have some 20-30
> >>e820 areas at most.
> >>
> >>This patch turns e820 and e820_saved to pointers which initially point to __initdata
> >>tables, of the same size as before.
> >>
> >>At the very end of setup_arch(), when we are done fiddling with these maps,
> >>allocate smaller alloc_bootmem blocks, copy maps there, and change pointers.
> >>
> >>Run-tested.
> >
> >>+/*
> >>+ * Initial e820 and e820_saved are largish __initdata arrays.
> >>+ * Copy them to (usually much smaller) dynamically allocated area.
> >>+ * This is done after all tweaks we ever do to them.
> >>+ */
> >>+__init void e820_reallocate_tables(void)
> >>+{
> >>+	struct e820map *n;
> >>+	int size;
> >>+
> >>+	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820->nr_map;
> >>+	n = alloc_bootmem(size);
> >>+	memcpy(n, e820, size);
> >>+	e820 = n;
> >>+
> >>+	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820_saved->nr_map;
> >>+	n = alloc_bootmem(size);
> >>+	memcpy(n, e820_saved, size);
> >>+	e820_saved = n;
> >>+}
> >
> >Ok, this makes me quite nervous, could you please split this into two patches so
> >that any fails can be nicely bisected to?
> 
> No problem.
> 
> >First patch only does the pointerization changes with a trivial placeholder
> >structure (full size, static allocated), second patch does all the dangerous bits
> >such as changing it to __initdata, allocating and copying over bits.
> >
> >Also, could we please also add some minimal debugging facility to make sure the
> >memory table does not get extended after it's been reallocated?
> 
> I have another idea: run e820_reallocate_tables() later, just before
> we free __init and __initdata. Then e820 tables _can't_ be_ changed -
> all functions which do that are __init functions.
> 
> Will test this now, and send a patchset.

Could we also mark it __ro_after_init?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k
  2016-09-18  8:31     ` Ingo Molnar
@ 2016-09-18 18:11       ` Denys Vlasenko
  2016-09-18 19:58         ` Andy Lutomirski
  0 siblings, 1 reply; 6+ messages in thread
From: Denys Vlasenko @ 2016-09-18 18:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, H. Peter Anvin, Borislav Petkov, Brian Gerst,
	x86, linux-kernel

On 09/18/2016 10:31 AM, Ingo Molnar wrote:
> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> On 09/15/2016 09:04 AM, Ingo Molnar wrote:
>>> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>>
>>>> The maximum size of e820 map array for EFI systems is defined as
>>>> E820_X_MAX (E820MAX + 3 * MAX_NUMNODES).
>>>>
>>>> In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved
>>>> are 6404 bytes each.
>>>>
>>>> With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820 and e820_saved
>>>> are 64004 bytes each. Most of this space is wasted. Typical machines have some 20-30
>>>> e820 areas at most.
>>>>
>>>> This patch turns e820 and e820_saved to pointers which initially point to __initdata
>>>> tables, of the same size as before.
>>>>
>>>> At the very end of setup_arch(), when we are done fiddling with these maps,
>>>> allocate smaller alloc_bootmem blocks, copy maps there, and change pointers.
>>>>
>>>> Run-tested.
>>>
>>>> +/*
>>>> + * Initial e820 and e820_saved are largish __initdata arrays.
>>>> + * Copy them to (usually much smaller) dynamically allocated area.
>>>> + * This is done after all tweaks we ever do to them.
>>>> + */
>>>> +__init void e820_reallocate_tables(void)
>>>> +{
>>>> +	struct e820map *n;
>>>> +	int size;
>>>> +
>>>> +	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820->nr_map;
>>>> +	n = alloc_bootmem(size);
>>>> +	memcpy(n, e820, size);
>>>> +	e820 = n;
>>>> +
>>>> +	size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820_saved->nr_map;
>>>> +	n = alloc_bootmem(size);
>>>> +	memcpy(n, e820_saved, size);
>>>> +	e820_saved = n;
>>>> +}
>>>
>>> Ok, this makes me quite nervous, could you please split this into two patches so
>>> that any fails can be nicely bisected to?
>>
>> No problem.
>>
>>> First patch only does the pointerization changes with a trivial placeholder
>>> structure (full size, static allocated), second patch does all the dangerous bits
>>> such as changing it to __initdata, allocating and copying over bits.
>>>
>>> Also, could we please also add some minimal debugging facility to make sure the
>>> memory table does not get extended after it's been reallocated?
>>
>> I have another idea: run e820_reallocate_tables() later, just before
>> we free __init and __initdata. Then e820 tables _can't_ be_ changed -
>> all functions which do that are __init functions.
>>
>> Will test this now, and send a patchset.
>
> Could we also mark it __ro_after_init?

__ro_after_init makes sense only for statically allocated objects.
e820_reallocate_tables() copies e280 maps to kmalloced memory.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k
  2016-09-18 18:11       ` Denys Vlasenko
@ 2016-09-18 19:58         ` Andy Lutomirski
  0 siblings, 0 replies; 6+ messages in thread
From: Andy Lutomirski @ 2016-09-18 19:58 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Ingo Molnar, Borislav Petkov, linux-kernel, Brian Gerst, X86 ML,
	H. Peter Anvin

On Sep 18, 2016 8:11 AM, "Denys Vlasenko" <dvlasenk@redhat.com> wrote:
>
> On 09/18/2016 10:31 AM, Ingo Molnar wrote:
>>
>> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>>
>>> On 09/15/2016 09:04 AM, Ingo Molnar wrote:
>>>>
>>>> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>>>
>>>>> The maximum size of e820 map array for EFI systems is defined as
>>>>> E820_X_MAX (E820MAX + 3 * MAX_NUMNODES).
>>>>>
>>>>> In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved
>>>>> are 6404 bytes each.
>>>>>
>>>>> With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820 and e820_saved
>>>>> are 64004 bytes each. Most of this space is wasted. Typical machines have some 20-30
>>>>> e820 areas at most.
>>>>>
>>>>> This patch turns e820 and e820_saved to pointers which initially point to __initdata
>>>>> tables, of the same size as before.
>>>>>
>>>>> At the very end of setup_arch(), when we are done fiddling with these maps,
>>>>> allocate smaller alloc_bootmem blocks, copy maps there, and change pointers.
>>>>>
>>>>> Run-tested.
>>>>
>>>>
>>>>> +/*
>>>>> + * Initial e820 and e820_saved are largish __initdata arrays.
>>>>> + * Copy them to (usually much smaller) dynamically allocated area.
>>>>> + * This is done after all tweaks we ever do to them.
>>>>> + */
>>>>> +__init void e820_reallocate_tables(void)
>>>>> +{
>>>>> +       struct e820map *n;
>>>>> +       int size;
>>>>> +
>>>>> +       size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820->nr_map;
>>>>> +       n = alloc_bootmem(size);
>>>>> +       memcpy(n, e820, size);
>>>>> +       e820 = n;
>>>>> +
>>>>> +       size = offsetof(struct e820map, map) + sizeof(struct e820entry) * e820_saved->nr_map;
>>>>> +       n = alloc_bootmem(size);
>>>>> +       memcpy(n, e820_saved, size);
>>>>> +       e820_saved = n;
>>>>> +}
>>>>
>>>>
>>>> Ok, this makes me quite nervous, could you please split this into two patches so
>>>> that any fails can be nicely bisected to?
>>>
>>>
>>> No problem.
>>>
>>>> First patch only does the pointerization changes with a trivial placeholder
>>>> structure (full size, static allocated), second patch does all the dangerous bits
>>>> such as changing it to __initdata, allocating and copying over bits.
>>>>
>>>> Also, could we please also add some minimal debugging facility to make sure the
>>>> memory table does not get extended after it's been reallocated?
>>>
>>>
>>> I have another idea: run e820_reallocate_tables() later, just before
>>> we free __init and __initdata. Then e820 tables _can't_ be_ changed -
>>> all functions which do that are __init functions.
>>>
>>> Will test this now, and send a patchset.
>>
>>
>> Could we also mark it __ro_after_init?
>
>
> __ro_after_init makes sense only for statically allocated objects.
> e820_reallocate_tables() copies e280 maps to kmalloced memory.

The pointer can be __ro_after_init, though.  You could also
set_memory_ro() the thing if it's big enough that page-aligning it
wouldn't be a problem.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-09-18 19:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-09  7:58 [PATCH 2/2 v2] x86/e820: Use much less memory for e820/e820_saved, save up to 120k Denys Vlasenko
2016-09-15  7:04 ` Ingo Molnar
2016-09-17 21:09   ` Denys Vlasenko
2016-09-18  8:31     ` Ingo Molnar
2016-09-18 18:11       ` Denys Vlasenko
2016-09-18 19:58         ` Andy Lutomirski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).