linux-kernel.vger.kernel.org archive mirror
* kexec based crash dumping
@ 2004-09-15 12:50 Hariprasad Nellitheertha
  2004-09-15 12:51 ` [PATCH][1/6]Documentation Hariprasad Nellitheertha
  2004-09-15 17:33 ` [Fastboot] kexec based crash dumping Eric W. Biederman
  0 siblings, 2 replies; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-15 12:50 UTC
  To: akpm, linux-kernel, fastboot
  Cc: Suparna Bhattacharya, mbligh, ebiederm, litke

Hi Andrew,

The patches that follow contain the kexec based crash dumping implementation.
Based on feedback received last time, we have made several changes. Some of
them are:

- The dumping kernel now boots from a non-default location. This is possible
  due to Eric's patch which allows i386 kernels to boot from a non-default
  location. This change means that we need two different kernels to get this
  setup. The documentation patch has complete details on how to do this.
- We can now choose whether or not to dump from panic. The documentation
  patch has details on this as well.
- The linear view is now called oldmem.
- Changes as per the code review comments from the previous posting.

The patches correspond to 2.6.9-rc1-mm5.

Kindly review these patches and let me know your thoughts.

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore


* [PATCH][1/6]Documentation
  2004-09-15 12:50 kexec based crash dumping Hariprasad Nellitheertha
@ 2004-09-15 12:51 ` Hariprasad Nellitheertha
  2004-09-15 12:53   ` [PATCH][2/6]Memory preserving reboot using kexec Hariprasad Nellitheertha
  2004-09-15 17:33 ` [Fastboot] kexec based crash dumping Eric W. Biederman
  1 sibling, 1 reply; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-15 12:51 UTC
  To: akpm, linux-kernel, fastboot
  Cc: Suparna Bhattacharya, mbligh, ebiederm, litke

[-- Attachment #1: Type: text/plain, Size: 108 bytes --]

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

[-- Attachment #2: kd-doc-269rc1-mm5.patch --]
[-- Type: text/plain, Size: 5299 bytes --]



This patch contains the documentation for the kexec based crash dump tool.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>



---

 linux-2.6.9-rc1-hari/Documentation/kdump.txt |  133 +++++++++++++++++++++++++++
 1 files changed, 133 insertions(+)

diff -puN /dev/null Documentation/kdump.txt
--- /dev/null	2003-01-30 15:54:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/Documentation/kdump.txt	2004-09-15 17:36:25.000000000 +0530
@@ -0,0 +1,133 @@
+Documentation for kdump - the kexec based crash dumping solution
+================================================================
+
+DESIGN
+======
+
+We use kexec to reboot to a second kernel whenever a dump needs to be taken.
+This second kernel is booted with very little memory (configurable
+at compile time). The first kernel reserves the section of memory that the
+second kernel uses. This ensures that on-going DMA from the first kernel
+does not corrupt the second kernel. The first 640k of physical memory is
+needed irrespective of where the kernel loads. Hence, this region is
+backed up before reboot.
+
+In the second kernel, "old memory" can be accessed in two ways. The
+first one is through a device interface. We can create a /dev/oldmem or
+whatever and write out the memory in raw format. The second interface is
+through /proc/vmcore. This exports the dump as an ELF format file which
+can be written out using any file copy command (cp, scp, etc). Further, gdb
+can be used to perform some minimal debugging on the dump file. Both these
+methods ensure that there is correct ordering of the dump pages (corresponding
+to the first 640k that has been relocated).
+
+Note that the two approaches are independent and the patches
+can be used depending on the functionality needed. More details on the
+patches below.
+
+PATCHES
+=======
+
+We currently have 6 patches.
+
+1) kd-doc-<version>.patch - Contains basic documentation (this document!!)
+2) kd-reb-<version>.patch - This patch ensures we do a kexec reboot upon panic
+   and also saves the necessary regions of memory into a backup area.
+3) kd-copy-<version>.patch - This contains the code for reading the dump pages
+   in the second kernel.
+4) kd-reg-<version>.patch - This patch is for snapshotting the register contents
+   of all processors on to the backup area before rebooting.
+5) kd-elf-<version>.patch - This patch provides an ELF format interface to
+   the dump, post-reboot.
+6) kd-oldmem-<version>.patch - This patch contains the code to access the dump
+   through the /dev/oldmem device.
+
+SETUP
+=====
+
+1) Apply the appropriate -mm patch on to the vanilla kernel tree. The -mm
+   tree has the kexec patches included.
+
+2) In order to enable the kernel to boot from a non-default location, the
+   following patches (by Eric Biederman) need to be applied.
+
+   http://www.xmission.com/~ebiederm/files/kexec/2.6.8.1-kexec3/
+	broken-out/highbzImage.i386.patch
+   http://www.xmission.com/~ebiederm/files/kexec/2.6.8.1-kexec3/
+	broken-out/vmlinux-lds.i386.patch
+
+3) Apply the crash dump patches.
+
+4) Two kernels need to be built in order to get this feature working.
+
+   For the first kernel, choose the default values for the following options.
+
+   a) Physical address where the kernel expects to be loaded
+   b) kexec system call
+   c) kernel crash dumps
+
+   All the options are under "Processor type and features"
+
+   For the second kernel, change (a) to 16MB. If you want to choose another
+   value here, ensure "location of the crash dumps backup region" under (c)
+   reflects the same value.
+
+   Also ensure you have CONFIG_HIGHMEM on.
+
+5) Boot into the first kernel. You are now ready to try out kexec based crash
+   dumps.
+
+6) Load the second kernel to be booted using
+
+   kexec -l <second-kernel> --args-linux --append="root=<root-dev> dump
+   init 1 memmap=exactmap memmap=640k@0 memmap=32M@16M"
+
+   Note that <second-kernel> has to be a vmlinux image. bzImage will not
+   work, as of now.
+
+7) Enable kexec based dumping by
+
+   echo 1 > /proc/kexec-dump
+
+8) The system reboots into the second kernel when a panic occurs.
+   You could write a module to call panic, for testing purposes.
+
+9) Write out the dump file using
+
+   cp /proc/vmcore <dump-file>
+
+You can also access the dump as a device for a linear/raw view. To do this,
+you will need the kd-oldmem-<version>.patch built into the kernel. To create
+the device, type
+
+  mknod /dev/oldmem c 1 12
+
+Use "dd" with suitable options for count, bs and skip to access specific
+portions of the dump.
+
+ANALYSIS
+========
+
+You can run gdb on the dump file copied out of /proc/vmcore. Use vmlinux built
+with -g and run
+
+  gdb vmlinux <dump-file>
+
+Stack trace for the task on processor 0, register display, memory display
+work fine.
+
+TODO
+====
+
+1) Provide a kernel-pages only view for the dump. This could possibly turn up
+   as /proc/vmcore-kern.
+2) Provide register contents of all processors (similar to what multi-threaded
+   core dumps do).
+3) Modify "crash" to make it recognize this dump.
+4) Make the i386 kernel boot from any location so we can run the second kernel
+   from the reserved location instead of the current approach.
+
+CONTACT
+=======
+
+Hariprasad Nellitheertha - hari at in dot ibm dot com

_


* Re: [PATCH][2/6]Memory preserving reboot using kexec
  2004-09-15 12:51 ` [PATCH][1/6]Documentation Hariprasad Nellitheertha
@ 2004-09-15 12:53   ` Hariprasad Nellitheertha
  2004-09-15 12:54     ` [PATCH][3/6]Routines for copying the dump pages Hariprasad Nellitheertha
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-15 12:53 UTC
  To: akpm, linux-kernel, fastboot
  Cc: Suparna Bhattacharya, mbligh, ebiederm, litke

[-- Attachment #1: Type: text/plain, Size: 108 bytes --]

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

[-- Attachment #2: kd-reb-269rc1-mm5.patch --]
[-- Type: text/plain, Size: 10483 bytes --]



This patch contains the code that does the memory preserving reboot. It 
copies over the first 640k into a backup region before handing over to 
kexec. The second kernel will boot using only the backup region.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>
Signed off by Adam Litke <litke@us.ibm.com>


---

 linux-2.6.9-rc1-hari/arch/i386/Kconfig                |   20 ++++++++
 linux-2.6.9-rc1-hari/arch/i386/kernel/machine_kexec.c |   31 ++++++++++++
 linux-2.6.9-rc1-hari/arch/i386/kernel/setup.c         |   13 +++++
 linux-2.6.9-rc1-hari/fs/proc/proc_misc.c              |   25 ++++++++++
 linux-2.6.9-rc1-hari/include/asm-i386/crash_dump.h    |   44 ++++++++++++++++++
 linux-2.6.9-rc1-hari/include/linux/bootmem.h          |    1 
 linux-2.6.9-rc1-hari/include/linux/crash_dump.h       |   28 +++++++++++
 linux-2.6.9-rc1-hari/kernel/panic.c                   |    7 ++
 linux-2.6.9-rc1-hari/mm/bootmem.c                     |    5 ++
 9 files changed, 174 insertions(+)

diff -puN arch/i386/Kconfig~kd-reb-269rc1-mm5 arch/i386/Kconfig
--- linux-2.6.9-rc1/arch/i386/Kconfig~kd-reb-269rc1-mm5	2004-09-15 17:36:30.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/Kconfig	2004-09-15 17:36:30.000000000 +0530
@@ -894,6 +894,26 @@ config KEXEC
 	  support.  As of this writing the exact hardware interface is
 	  strongly in flux, so no good recommendation can be made.
 
+config CRASH_DUMP
+	bool "kernel crash dumps (EXPERIMENTAL)"
+	depends on KEXEC
+	help
+	  Generate crash dump using kexec.
+
+config BACKUP_BASE
+	int "location of the crash dumps backup region"
+	depends on CRASH_DUMP
+	default 16
+	help
+	This is the location where the second kernel will boot from.
+
+config BACKUP_SIZE
+	int "Size of the crash dumps backup region"
+	depends on CRASH_DUMP
+	range 16 64
+	default 32
+	help
+	The size of the second kernel's memory.
 endmenu
 
 
diff -puN arch/i386/kernel/setup.c~kd-reb-269rc1-mm5 arch/i386/kernel/setup.c
--- linux-2.6.9-rc1/arch/i386/kernel/setup.c~kd-reb-269rc1-mm5	2004-09-15 17:36:30.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/kernel/setup.c	2004-09-15 17:36:30.000000000 +0530
@@ -39,6 +39,7 @@
 #include <linux/efi.h>
 #include <linux/init.h>
 #include <linux/edd.h>
+#include <linux/crash_dump.h>
 #include <video/edid.h>
 #include <asm/e820.h>
 #include <asm/mpspec.h>
@@ -57,6 +58,7 @@
 unsigned long init_pg_tables_end __initdata = ~0UL;
 
 int disable_pse __initdata = 0;
+unsigned int dump_enabled;
 
 /*
  * Machine setup..
@@ -708,6 +710,11 @@ static void __init parse_cmdline_early (
 			if (to != command_line)
 				to--;
 			if (!memcmp(from+7, "exactmap", 8)) {
+				/* If we are doing a crash dump, we
+				 * still need to know the real mem
+				 * size.
+				 */
+				set_saved_max_pfn();
 				from += 8+7;
 				e820.nr_map = 0;
 				userdef = 1;
@@ -815,6 +822,9 @@ static void __init parse_cmdline_early (
 		if (c == ' ' && !memcmp(from, "highmem=", 8))
 			highmem_pages = memparse(from+8, &from) >> PAGE_SHIFT;
 	
+		if (c == ' ' && !memcmp(from, "dump", 4))
+			dump_enabled = 1;
+
 		c = *(from++);
 		if (!c)
 			break;
@@ -1102,6 +1112,9 @@ static unsigned long __init setup_memory
 		}
 	}
 #endif
+
+	crash_reserve_bootmem();
+
 	return max_low_pfn;
 }
 #else
diff -puN /dev/null include/asm-i386/crash_dump.h
--- /dev/null	2003-01-30 15:54:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/asm-i386/crash_dump.h	2004-09-15 17:36:30.000000000 +0530
@@ -0,0 +1,44 @@
+/* asm-i386/crash_dump.h */
+#include <linux/bootmem.h>
+
+extern unsigned int dump_enabled;
+
+void __crash_relocate_mem(unsigned long, unsigned long);
+unsigned long __init find_max_low_pfn(void);
+void __init find_max_pfn(void);
+
+extern unsigned int crashed;
+
+#ifdef CONFIG_CRASH_DUMP
+#define CRASH_BACKUP_BASE ((unsigned long)CONFIG_BACKUP_BASE * 0x100000)
+#define CRASH_BACKUP_SIZE ((unsigned long)CONFIG_BACKUP_SIZE * 0x100000)
+#define CRASH_RELOCATE_SIZE 0xa0000
+
+static inline void crash_relocate_mem(void)
+{
+	if (crashed)
+		__crash_relocate_mem(CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE,
+					CRASH_RELOCATE_SIZE);
+}
+
+static inline void set_saved_max_pfn(void)
+{
+	find_max_pfn();
+	saved_max_pfn = find_max_low_pfn();
+}
+
+static inline void crash_reserve_bootmem(void)
+{
+	if (!dump_enabled) {
+		reserve_bootmem(0, CRASH_RELOCATE_SIZE);
+		reserve_bootmem(CRASH_BACKUP_BASE,
+			CRASH_BACKUP_SIZE + CRASH_RELOCATE_SIZE);
+	}
+}
+#else
+#define CRASH_BACKUP_BASE 0x6000000
+#define CRASH_BACKUP_SIZE 0x1000000
+#define crash_relocate_mem() do { } while(0)
+#define set_saved_max_pfn() do { } while(0)
+#define crash_reserve_bootmem() do { } while(0)
+#endif
diff -puN include/linux/bootmem.h~kd-reb-269rc1-mm5 include/linux/bootmem.h
--- linux-2.6.9-rc1/include/linux/bootmem.h~kd-reb-269rc1-mm5	2004-09-15 17:36:30.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/linux/bootmem.h	2004-09-15 17:36:30.000000000 +0530
@@ -21,6 +21,7 @@ extern unsigned long min_low_pfn;
  * highest page
  */
 extern unsigned long max_pfn;
+extern unsigned long saved_max_pfn;
 
 /*
  * node_bootmem_map is a map pointer - the bits represent all physical 
diff -puN /dev/null include/linux/crash_dump.h
--- /dev/null	2003-01-30 15:54:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/linux/crash_dump.h	2004-09-15 17:36:30.000000000 +0530
@@ -0,0 +1,28 @@
+#include <linux/kexec.h>
+#include <linux/smp_lock.h>
+#include <linux/device.h>
+#include <asm/crash_dump.h>
+
+#ifdef CONFIG_CRASH_DUMP
+extern int crash_dump_on;
+static inline void crash_machine_kexec(void)
+{
+	struct kimage *image;
+
+	if ((!crash_dump_on) || (crashed))
+		return;
+
+	image = xchg(&kexec_image, 0);
+	if (image) {
+		crashed = 1;
+		device_shutdown();
+		printk(KERN_EMERG "kexec: opening parachute\n");
+		machine_kexec(image);
+		while (1);
+	} else {
+		printk(KERN_EMERG "kexec: No kernel image loaded!\n");
+	}
+}
+#else
+#define crash_machine_kexec()	do { } while(0)
+#endif
diff -puN kernel/panic.c~kd-reb-269rc1-mm5 kernel/panic.c
--- linux-2.6.9-rc1/kernel/panic.c~kd-reb-269rc1-mm5	2004-09-15 17:36:30.000000000 +0530
+++ linux-2.6.9-rc1-hari/kernel/panic.c	2004-09-15 17:36:30.000000000 +0530
@@ -19,10 +19,14 @@
 #include <linux/syscalls.h>
 #include <linux/interrupt.h>
 #include <linux/nmi.h>
+#include <linux/kexec.h>
+#include <linux/crash_dump.h>
 
 int panic_timeout;
 int panic_on_oops;
 int tainted;
+unsigned int crashed;
+int crash_dump_on;
 
 EXPORT_SYMBOL(panic_timeout);
 
@@ -62,6 +66,9 @@ NORET_TYPE void panic(const char * fmt, 
 	printk(KERN_EMERG "Kernel panic - not syncing: %s\n",buf);
 	bust_spinlocks(0);
 
+	/* If we have crashed, perform a kexec reboot, for dump write-out */
+	crash_machine_kexec();
+
 #ifdef CONFIG_SMP
 	smp_send_stop();
 #endif
diff -puN mm/bootmem.c~kd-reb-269rc1-mm5 mm/bootmem.c
--- linux-2.6.9-rc1/mm/bootmem.c~kd-reb-269rc1-mm5	2004-09-15 17:36:30.000000000 +0530
+++ linux-2.6.9-rc1-hari/mm/bootmem.c	2004-09-15 17:36:30.000000000 +0530
@@ -27,6 +27,11 @@
 unsigned long max_low_pfn;
 unsigned long min_low_pfn;
 unsigned long max_pfn;
+/*
+ * If we have booted due to a crash, max_pfn will be a very low value. We need
+ * to know the amount of memory that the previous kernel used.
+ */
+unsigned long saved_max_pfn;
 
 EXPORT_SYMBOL(max_pfn);		/* This is exported so
 				 * dma_get_required_mask(), which uses
diff -puN fs/proc/proc_misc.c~kd-reb-269rc1-mm5 fs/proc/proc_misc.c
--- linux-2.6.9-rc1/fs/proc/proc_misc.c~kd-reb-269rc1-mm5	2004-09-15 17:36:30.000000000 +0530
+++ linux-2.6.9-rc1-hari/fs/proc/proc_misc.c	2004-09-15 17:36:30.000000000 +0530
@@ -44,6 +44,7 @@
 #include <linux/jiffies.h>
 #include <linux/sysrq.h>
 #include <linux/vmalloc.h>
+#include <linux/crash_dump.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
 #include <asm/io.h>
@@ -563,6 +564,25 @@ static struct file_operations proc_sysrq
 };
 #endif
 
+#ifdef CONFIG_CRASH_DUMP
+/*
+ * Enable kexec reboot upon panic; for dumping
+ */
+static ssize_t write_crash_dump_on(struct file *file, const char __user *buf,
+					size_t count, loff_t *ppos)
+{
+	if (count) {
+		if (get_user(crash_dump_on, buf))
+			return -EFAULT;
+	}
+	return count;
+}
+
+static struct file_operations proc_crash_dump_on_operations = {
+	.write		= write_crash_dump_on,
+};
+#endif
+
 struct proc_dir_entry *proc_root_kcore;
 
 static void create_seq_entry(char *name, mode_t mode, struct file_operations *f)
@@ -663,6 +683,11 @@ void __init proc_misc_init(void)
 	if (entry)
 		entry->proc_fops = &proc_sysrq_trigger_operations;
 #endif
+#ifdef CONFIG_CRASH_DUMP
+	entry = create_proc_entry("kexec-dump", S_IWUSR, NULL);
+	if (entry)
+		entry->proc_fops = &proc_crash_dump_on_operations;
+#endif
 #ifdef CONFIG_LOCKMETER
 	entry = create_proc_entry("lockmeter", S_IWUSR | S_IRUGO, NULL);
 	if (entry) {
diff -puN arch/i386/kernel/machine_kexec.c~kd-reb-269rc1-mm5 arch/i386/kernel/machine_kexec.c
--- linux-2.6.9-rc1/arch/i386/kernel/machine_kexec.c~kd-reb-269rc1-mm5	2004-09-15 17:36:30.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/kernel/machine_kexec.c	2004-09-15 17:36:30.000000000 +0530
@@ -161,6 +161,30 @@ void machine_kexec_cleanup(struct kimage
 }
 
 /*
+ * We are going to do a memory preserving reboot. So, we copy over the
+ * first 640k of memory into a backup location. Though the second kernel
+ * boots from a different location, it still requires the first 640k.
+ * Hence this backup.
+ */
+void __crash_relocate_mem(unsigned long backup_addr, unsigned long backup_size)
+{
+	unsigned long pfn, pfn_max;
+	void *src_addr, *dest_addr;
+	struct page *page;
+
+	pfn_max = backup_size >> PAGE_SHIFT;
+	for (pfn = 0; pfn < pfn_max; pfn++) {
+		src_addr = phys_to_virt(pfn << PAGE_SHIFT);
+		dest_addr = backup_addr + src_addr;
+		if (!pfn_valid(pfn))
+			continue;
+		page = pfn_to_page(pfn);
+		if (PageReserved(page))
+			copy_page(dest_addr, src_addr);
+	}
+}
+
+/*
  * Do not allocate memory (or fail in any way) in machine_kexec().
  * We are past the point of no return, committed to rebooting now.
  */
@@ -180,6 +204,13 @@ void machine_kexec(struct kimage *image)
 	/* Set up an identity mapping for the reboot_code_buffer */
 	identity_map_page(reboot_code_buffer);
 
+	/*
+	 * If we are here to do a crash dump, save the memory from
+	 * 0-640k before we copy over the kexec kernel image.  Otherwise
+	 * our dump will show the wrong kernel entirely.
+	 */
+	crash_relocate_mem();
+
 	/* copy it out */
 	memcpy((void *)reboot_code_buffer, relocate_new_kernel, relocate_new_kernel_size);
 

_


* Re: [PATCH][3/6]Routines for copying the dump pages
  2004-09-15 12:53   ` [PATCH][2/6]Memory preserving reboot using kexec Hariprasad Nellitheertha
@ 2004-09-15 12:54     ` Hariprasad Nellitheertha
  2004-09-15 12:55       ` [PATCH][4/6]Register snapshotting before kexec boot Hariprasad Nellitheertha
  2004-09-15 21:23       ` [PATCH][3/6]Routines for copying the dump pages Andrew Morton
  2004-09-15 21:22     ` [PATCH][2/6]Memory preserving reboot using kexec Andrew Morton
  2004-09-19 20:37     ` [Fastboot] " Eric W. Biederman
  2 siblings, 2 replies; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-15 12:54 UTC
  To: akpm, linux-kernel, fastboot
  Cc: Suparna Bhattacharya, mbligh, ebiederm, litke

[-- Attachment #1: Type: text/plain, Size: 108 bytes --]

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

[-- Attachment #2: kd-copy-269rc1-mm5.patch --]
[-- Type: text/plain, Size: 3622 bytes --]



This patch provides the interfaces necessary to read the dump contents,
treating it as a high memory device.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>


---

 linux-2.6.9-rc1-hari/arch/i386/mm/highmem.c     |   18 +++++++++++++
 linux-2.6.9-rc1-hari/include/asm-i386/highmem.h |    1 
 linux-2.6.9-rc1-hari/include/linux/highmem.h    |   31 ++++++++++++++++++++++++
 3 files changed, 50 insertions(+)

diff -puN arch/i386/mm/highmem.c~kd-copy-269rc1-mm5 arch/i386/mm/highmem.c
--- linux-2.6.9-rc1/arch/i386/mm/highmem.c~kd-copy-269rc1-mm5	2004-09-15 17:36:33.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/mm/highmem.c	2004-09-15 17:36:33.000000000 +0530
@@ -74,6 +74,24 @@ void kunmap_atomic(void *kvaddr, enum km
 	preempt_check_resched();
 }
 
+/* This is the same as kmap_atomic() but can map memory that doesn't
+ * have a struct page associated with it.
+ */
+void *kmap_atomic_pfn(unsigned long pfn, enum km_type type)
+{
+        enum fixed_addresses idx;
+        unsigned long vaddr;
+
+        inc_preempt_count();
+
+        idx = type + KM_TYPE_NR*smp_processor_id();
+        vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+        set_pte(kmap_pte-idx, pfn_pte(pfn, kmap_prot));
+        __flush_tlb_one(vaddr);
+
+        return (void*) vaddr;
+}
+
 struct page *kmap_atomic_to_page(void *ptr)
 {
 	unsigned long idx, vaddr = (unsigned long)ptr;
diff -puN include/asm-i386/highmem.h~kd-copy-269rc1-mm5 include/asm-i386/highmem.h
--- linux-2.6.9-rc1/include/asm-i386/highmem.h~kd-copy-269rc1-mm5	2004-09-15 17:36:33.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/asm-i386/highmem.h	2004-09-15 17:36:33.000000000 +0530
@@ -61,6 +61,7 @@ void *kmap(struct page *page);
 void kunmap(struct page *page);
 void *kmap_atomic(struct page *page, enum km_type type);
 void kunmap_atomic(void *kvaddr, enum km_type type);
+void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
 struct page *kmap_atomic_to_page(void *ptr);
 
 #define flush_cache_kmaps()	do { } while (0)
diff -puN include/linux/highmem.h~kd-copy-269rc1-mm5 include/linux/highmem.h
--- linux-2.6.9-rc1/include/linux/highmem.h~kd-copy-269rc1-mm5	2004-09-15 17:36:33.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/linux/highmem.h	2004-09-15 17:36:33.000000000 +0530
@@ -6,6 +6,7 @@
 #include <linux/mm.h>
 
 #include <asm/cacheflush.h>
+#include <asm/uaccess.h>
 
 #ifdef CONFIG_HIGHMEM
 
@@ -30,6 +31,7 @@ static inline void *kmap(struct page *pa
 
 #define kmap_atomic(page, idx)		page_address(page)
 #define kunmap_atomic(addr, idx)	do { } while (0)
+#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
 #define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
 
 #endif /* CONFIG_HIGHMEM */
@@ -86,4 +88,33 @@ static inline void copy_highpage(struct 
 	kunmap_atomic(vto, KM_USER1);
 }
 
+/*
+ * Copy a page from "oldmem". For this page, there is no pte mapped
+ * in the current kernel. We stitch up a pte, similar to kmap_atomic.
+ */
+static inline ssize_t copy_oldmem_page(unsigned long pfn,
+			char *buf, size_t csize, int userbuf)
+{
+        void *page, *vaddr;
+
+        if (!csize)
+                return 0;
+        page = kmalloc(PAGE_SIZE, GFP_KERNEL);
+        if (!page)
+                return -ENOMEM;
+        vaddr = kmap_atomic_pfn(pfn, KM_PTE0);
+        copy_page(page, vaddr);
+        kunmap_atomic(vaddr, KM_PTE0);
+
+        if (userbuf) {
+                if (copy_to_user(buf, page, csize)) {
+                        kfree(page);
+                        return -EFAULT;
+                }
+        } else
+                memcpy(buf, page, csize);
+        kfree(page);
+
+        return 0;
+}
 #endif /* _LINUX_HIGHMEM_H */

_


* Re: [PATCH][4/6]Register snapshotting before kexec boot
  2004-09-15 12:54     ` [PATCH][3/6]Routines for copying the dump pages Hariprasad Nellitheertha
@ 2004-09-15 12:55       ` Hariprasad Nellitheertha
  2004-09-15 12:56         ` [PATCH][5/6]ELF format dump file access Hariprasad Nellitheertha
  2004-09-15 21:27         ` [PATCH][4/6]Register snapshotting before kexec boot Andrew Morton
  2004-09-15 21:23       ` [PATCH][3/6]Routines for copying the dump pages Andrew Morton
  1 sibling, 2 replies; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-15 12:55 UTC
  To: akpm, linux-kernel, fastboot
  Cc: Suparna Bhattacharya, mbligh, ebiederm, litke

[-- Attachment #1: Type: text/plain, Size: 108 bytes --]

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

[-- Attachment #2: kd-reg-269rc1-mm5.patch --]
[-- Type: text/plain, Size: 9803 bytes --]



This patch contains the code for stopping the other cpus and snapshotting
their register values before doing the kexec reboot.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>


---

 linux-2.6.9-rc1-hari/arch/i386/kernel/Makefile                   |    1 
 linux-2.6.9-rc1-hari/arch/i386/kernel/crash_dump.c               |  101 ++++++++++
 linux-2.6.9-rc1-hari/arch/i386/kernel/machine_kexec.c            |    4 
 linux-2.6.9-rc1-hari/arch/i386/kernel/smp.c                      |   13 +
 linux-2.6.9-rc1-hari/include/asm-i386/crash_dump.h               |   41 +++-
 linux-2.6.9-rc1-hari/include/asm-i386/mach-default/irq_vectors.h |    1 
 linux-2.6.9-rc1-hari/include/asm-i386/smp.h                      |    1 
 7 files changed, 160 insertions(+), 2 deletions(-)

diff -puN arch/i386/kernel/Makefile~kd-reg-269rc1-mm5 arch/i386/kernel/Makefile
--- linux-2.6.9-rc1/arch/i386/kernel/Makefile~kd-reg-269rc1-mm5	2004-09-15 17:36:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/kernel/Makefile	2004-09-15 17:36:37.000000000 +0530
@@ -25,6 +25,7 @@ obj-$(CONFIG_X86_MPPARSE)	+= mpparse.o
 obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o nmi.o
 obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
 obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o
+obj-$(CONFIG_CRASH_DUMP)	+= crash_dump.o
 obj-$(CONFIG_X86_NUMAQ)		+= numaq.o
 obj-$(CONFIG_X86_SUMMIT_NUMA)	+= summit.o
 obj-$(CONFIG_KPROBES)		+= kprobes.o
diff -puN /dev/null arch/i386/kernel/crash_dump.c
--- /dev/null	2003-01-30 15:54:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/kernel/crash_dump.c	2004-09-15 17:36:37.000000000 +0530
@@ -0,0 +1,101 @@
+/*
+ * Architecture specific (i386) functions for kexec based crash dumps.
+ *
+ * Created by: Hariprasad Nellitheertha (hari@in.ibm.com)
+ *
+ * Copyright (C) IBM Corporation, 2004. All rights reserved.
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/smp.h>
+#include <linux/irq.h>
+
+#include <asm/crash_dump.h>
+#include <asm/processor.h>
+#include <asm/hardirq.h>
+#include <asm/nmi.h>
+#include <asm/hw_irq.h>
+
+struct pt_regs crash_smp_regs[NR_CPUS];
+long crash_smp_current_task[NR_CPUS];
+
+#ifdef CONFIG_SMP
+static atomic_t waiting_for_dump_ipi;
+static int crash_dump_expect_ipi[NR_CPUS];
+extern void crash_dump_send_ipi(void);
+extern void stop_this_cpu(void *);
+
+static int crash_dump_nmi_callback(struct pt_regs *regs, int cpu)
+{
+	if (!crash_dump_expect_ipi[cpu])
+		return 0;
+
+	crash_dump_expect_ipi[cpu] = 0;
+	crash_dump_save_this_cpu(regs, cpu);
+	atomic_dec(&waiting_for_dump_ipi);
+
+	stop_this_cpu(NULL);
+
+	return 1;
+}
+
+void __crash_dump_stop_cpus(void)
+{
+	int i, cpu = smp_processor_id();
+	int other_cpus = num_online_cpus()-1;
+
+	if (other_cpus > 0) {
+		atomic_set(&waiting_for_dump_ipi, other_cpus);
+
+		for (i = 0; i < NR_CPUS; i++)
+			crash_dump_expect_ipi[i] = (i != cpu && cpu_online(i));
+
+		set_nmi_callback(crash_dump_nmi_callback);
+		/* Ensure the new callback function is set before sending
+		 * out the IPI
+		 */
+		wmb();
+
+		crash_dump_send_ipi();
+		while (atomic_read(&waiting_for_dump_ipi) > 0)
+			cpu_relax();
+
+		unset_nmi_callback();
+	} else {
+		local_irq_disable();
+		disable_local_APIC();
+		local_irq_enable();
+	}
+}
+#else
+void __crash_dump_stop_cpus(void) {}
+#endif
+
+void crash_get_current_regs(struct pt_regs *regs)
+{
+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
+
+	regs->eip = (unsigned long)current_text_addr();
+}
+
+void crash_dump_save_this_cpu(struct pt_regs *regs, int cpu)
+{
+	crash_smp_current_task[cpu] = (long)current;
+	crash_smp_regs[cpu] = *regs;
+}
+
diff -puN arch/i386/kernel/machine_kexec.c~kd-reg-269rc1-mm5 arch/i386/kernel/machine_kexec.c
--- linux-2.6.9-rc1/arch/i386/kernel/machine_kexec.c~kd-reg-269rc1-mm5	2004-09-15 17:36:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/kernel/machine_kexec.c	2004-09-15 17:36:37.000000000 +0530
@@ -9,6 +9,7 @@
 #include <linux/mm.h>
 #include <linux/kexec.h>
 #include <linux/delay.h>
+#include <linux/crash_dump.h>
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
@@ -194,6 +195,9 @@ void machine_kexec(struct kimage *image)
 	unsigned long reboot_code_buffer;
 	relocate_new_kernel_t rnk;
 
+	crash_dump_stop_cpus();
+	crash_dump_save_registers();
+
 	/* Interrupts aren't acceptable while we reboot */
 	local_irq_disable();
 
diff -puN arch/i386/kernel/smp.c~kd-reg-269rc1-mm5 arch/i386/kernel/smp.c
--- linux-2.6.9-rc1/arch/i386/kernel/smp.c~kd-reg-269rc1-mm5	2004-09-15 17:36:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/arch/i386/kernel/smp.c	2004-09-15 17:36:37.000000000 +0530
@@ -143,6 +143,9 @@ void __send_IPI_shortcut(unsigned int sh
 	 */
 	cfg = __prepare_ICR(shortcut, vector);
 
+	if (vector == CRASH_DUMP_VECTOR)
+		cfg = (cfg&~APIC_VECTOR_MASK)|APIC_DM_NMI;
+
 	/*
 	 * Send the IPI. The write to APIC_ICR fires this off.
 	 */
@@ -221,6 +224,9 @@ inline void send_IPI_mask_sequence(cpuma
 			 */
 			cfg = __prepare_ICR(0, vector);
 			
+			if (vector == CRASH_DUMP_VECTOR)
+				cfg = (cfg&~APIC_VECTOR_MASK)|APIC_DM_NMI;
+
 			/*
 			 * Send the IPI. The write to APIC_ICR fires this off.
 			 */
@@ -489,6 +495,11 @@ void smp_send_reschedule(int cpu)
 	send_IPI_mask(cpumask_of_cpu(cpu), RESCHEDULE_VECTOR);
 }
 
+void crash_dump_send_ipi(void)
+{
+	send_IPI_allbutself(CRASH_DUMP_VECTOR);
+}
+
 /*
  * Structure and data for smp_call_function(). This is designed to minimise
  * static memory requirements. It also looks cleaner.
@@ -565,7 +576,7 @@ int smp_call_function (void (*func) (voi
 	return 0;
 }
 
-static void stop_this_cpu (void * dummy)
+void stop_this_cpu (void * dummy)
 {
 	/*
 	 * Remove this CPU:
diff -puN include/asm-i386/crash_dump.h~kd-reg-269rc1-mm5 include/asm-i386/crash_dump.h
--- linux-2.6.9-rc1/include/asm-i386/crash_dump.h~kd-reg-269rc1-mm5	2004-09-15 17:36:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/asm-i386/crash_dump.h	2004-09-15 17:36:37.000000000 +0530
@@ -1,5 +1,7 @@
 /* asm-i386/crash_dump.h */
 #include <linux/bootmem.h>
+#include <asm/hw_irq.h>
+#include <asm/apic.h>
 
 extern unsigned int dump_enabled;
 
@@ -8,6 +10,11 @@ unsigned long __init find_max_low_pfn(vo
 void __init find_max_pfn(void);
 
 extern unsigned int crashed;
+extern struct pt_regs crash_smp_regs[NR_CPUS];
+extern long crash_smp_current_task[NR_CPUS];
+extern void crash_dump_save_this_cpu(struct pt_regs *, int);
+extern void __crash_dump_stop_cpus(void);
+extern void crash_get_current_regs(struct pt_regs *regs);
 
 #ifdef CONFIG_CRASH_DUMP
 #define CRASH_BACKUP_BASE ((unsigned long)CONFIG_BACKUP_BASE * 0x100000)
@@ -32,13 +39,45 @@ static inline void crash_reserve_bootmem
 	if (!dump_enabled) {
 		reserve_bootmem(0, CRASH_RELOCATE_SIZE);
 		reserve_bootmem(CRASH_BACKUP_BASE,
-			CRASH_BACKUP_SIZE + CRASH_RELOCATE_SIZE);
+			CRASH_BACKUP_SIZE + CRASH_RELOCATE_SIZE + PAGE_SIZE);
 	}
 }
+
+static inline void crash_dump_stop_cpus(void)
+{
+	if (!crashed)
+		return;
+
+	int cpu = smp_processor_id();
+
+	crash_smp_current_task[cpu] = (long)current;
+	crash_get_current_regs(&crash_smp_regs[cpu]);
+
+	/* This also captures the register states of the other cpus */
+	__crash_dump_stop_cpus();
+#if defined(CONFIG_X86_IO_APIC)
+	disable_IO_APIC();
+#endif
+#if defined(CONFIG_X86_LOCAL_APIC)
+	disconnect_bsp_APIC();
+#endif
+}
+
+static inline void crash_dump_save_registers(void)
+{
+	void *addr;
+
+	addr = __va(CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE + CRASH_RELOCATE_SIZE);
+	memcpy(addr, crash_smp_regs, (sizeof(struct pt_regs)*NR_CPUS));
+	addr += sizeof(struct pt_regs)*NR_CPUS;
+	memcpy(addr, crash_smp_current_task, (sizeof(long)*NR_CPUS));
+}
 #else
 #define CRASH_BACKUP_BASE 0x6000000
 #define CRASH_BACKUP_SIZE 0x1000000
 #define crash_relocate_mem() do { } while(0)
 #define set_saved_max_pfn() do { } while(0)
 #define crash_reserve_bootmem() do { } while(0)
+#define crash_dump_stop_cpus() do { } while(0)
+#define crash_dump_save_registers() do { } while(0)
 #endif
diff -puN include/asm-i386/mach-default/irq_vectors.h~kd-reg-269rc1-mm5 include/asm-i386/mach-default/irq_vectors.h
--- linux-2.6.9-rc1/include/asm-i386/mach-default/irq_vectors.h~kd-reg-269rc1-mm5	2004-09-15 17:36:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/asm-i386/mach-default/irq_vectors.h	2004-09-15 17:36:37.000000000 +0530
@@ -48,6 +48,7 @@
 #define INVALIDATE_TLB_VECTOR	0xfd
 #define RESCHEDULE_VECTOR	0xfc
 #define CALL_FUNCTION_VECTOR	0xfb
+#define CRASH_DUMP_VECTOR	0xfa
 
 #define THERMAL_APIC_VECTOR	0xf0
 /*
diff -puN include/asm-i386/smp.h~kd-reg-269rc1-mm5 include/asm-i386/smp.h
--- linux-2.6.9-rc1/include/asm-i386/smp.h~kd-reg-269rc1-mm5	2004-09-15 17:36:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/asm-i386/smp.h	2004-09-15 17:36:37.000000000 +0530
@@ -41,6 +41,7 @@ extern void smp_message_irq(int cpl, voi
 extern void smp_invalidate_rcv(void);		/* Process an NMI */
 extern void (*mtrr_hook) (void);
 extern void zap_low_mappings (void);
+extern void stop_this_cpu(void *);
 
 #define MAX_APICID 256
 extern u8 x86_cpu_to_apicid[];

_
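
For illustration, the ICR adjustment made in __send_IPI_shortcut() and
send_IPI_mask_sequence() above can be sketched in userspace as follows. This
is only a sketch: crash_fixup_icr() is a hypothetical helper name, and the
APIC_* values are assumed to match include/asm-i386/apicdef.h. It also assumes
the incoming ICR value uses fixed delivery mode, which is what makes a plain
OR with APIC_DM_NMI sufficient.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, believed to match include/asm-i386/apicdef.h. */
#define APIC_VECTOR_MASK   0x000FFu
#define APIC_MODE_MASK     0x00700u
#define APIC_DM_NMI        0x00400u
#define CRASH_DUMP_VECTOR  0xfau	/* from irq_vectors.h in this patch */

/*
 * Hypothetical helper: given the ICR value prepared for 'vector', return
 * the value that would be written to APIC_ICR. For the crash dump vector
 * the vector field is cleared and the delivery mode is switched to NMI.
 */
uint32_t crash_fixup_icr(uint32_t cfg, uint32_t vector)
{
	if (vector == CRASH_DUMP_VECTOR) {
		/* Assumption: cfg was built with fixed delivery mode (0). */
		assert((cfg & APIC_MODE_MASK) == 0);
		cfg = (cfg & ~APIC_VECTOR_MASK) | APIC_DM_NMI;
	}
	return cfg;
}
```

Any other vector passes through unchanged, so the normal IPI paths are not
affected.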


* Re: [PATCH][5/6]ELF format dump file access
  2004-09-15 12:55       ` [PATCH][4/6]Register snapshotting before kexec boot Hariprasad Nellitheertha
@ 2004-09-15 12:56         ` Hariprasad Nellitheertha
  2004-09-15 12:57           ` [PATCH][6/6]Linear/raw " Hariprasad Nellitheertha
                             ` (3 more replies)
  2004-09-15 21:27         ` [PATCH][4/6]Register snapshotting before kexec boot Andrew Morton
  1 sibling, 4 replies; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-15 12:56 UTC (permalink / raw)
  To: akpm, linux-kernel, fastboot
  Cc: Suparna Bhattacharya, mbligh, ebiederm, litke

[-- Attachment #1: Type: text/plain, Size: 108 bytes --]

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

[-- Attachment #2: kd-elf-269rc1-mm5.patch --]
[-- Type: text/plain, Size: 11637 bytes --]



This patch contains the code that provides an ELF format interface to the
previous kernel's memory post kexec reboot.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>


---

 linux-2.6.9-rc1-hari/fs/proc/Makefile           |    1 
 linux-2.6.9-rc1-hari/fs/proc/kcore.c            |   10 -
 linux-2.6.9-rc1-hari/fs/proc/proc_misc.c        |   12 +
 linux-2.6.9-rc1-hari/fs/proc/vmcore.c           |  238 ++++++++++++++++++++++++
 linux-2.6.9-rc1-hari/include/linux/crash_dump.h |    8 
 linux-2.6.9-rc1-hari/include/linux/proc_fs.h    |    2 
 6 files changed, 266 insertions(+), 5 deletions(-)

diff -puN fs/proc/Makefile~kd-elf-269rc1-mm5 fs/proc/Makefile
--- linux-2.6.9-rc1/fs/proc/Makefile~kd-elf-269rc1-mm5	2004-09-15 17:36:53.000000000 +0530
+++ linux-2.6.9-rc1-hari/fs/proc/Makefile	2004-09-15 17:36:53.000000000 +0530
@@ -11,4 +11,5 @@ proc-y       += inode.o root.o base.o ge
 		kmsg.o proc_tty.o proc_misc.o
 
 proc-$(CONFIG_PROC_KCORE)	+= kcore.o
+proc-$(CONFIG_CRASH_DUMP)	+= vmcore.o
 proc-$(CONFIG_PROC_DEVICETREE)	+= proc_devtree.o
diff -puN fs/proc/kcore.c~kd-elf-269rc1-mm5 fs/proc/kcore.c
--- linux-2.6.9-rc1/fs/proc/kcore.c~kd-elf-269rc1-mm5	2004-09-15 17:36:53.000000000 +0530
+++ linux-2.6.9-rc1-hari/fs/proc/kcore.c	2004-09-15 17:36:53.000000000 +0530
@@ -114,7 +114,7 @@ static size_t get_kcore_size(int *nphdr,
 /*
  * determine size of ELF note
  */
-static int notesize(struct memelfnote *en)
+int notesize(struct memelfnote *en)
 {
 	int sz;
 
@@ -129,7 +129,7 @@ static int notesize(struct memelfnote *e
 /*
  * store a note in the header buffer
  */
-static char *storenote(struct memelfnote *men, char *bufp)
+char *storenote(struct memelfnote *men, char *bufp)
 {
 	struct elf_note en;
 
@@ -156,7 +156,7 @@ static char *storenote(struct memelfnote
  * store an ELF coredump header in the supplied buffer
  * nphdr is the number of elf_phdr to insert
  */
-static void elf_kcore_store_hdr(char *bufp, int nphdr, int dataoff)
+void elf_kcore_store_hdr(char *bufp, int nphdr, int dataoff, struct kcore_list *clist)
 {
 	struct elf_prstatus prstatus;	/* NT_PRSTATUS */
 	struct elf_prpsinfo prpsinfo;	/* NT_PRPSINFO */
@@ -208,7 +208,7 @@ static void elf_kcore_store_hdr(char *bu
 	nhdr->p_align	= 0;
 
 	/* setup ELF PT_LOAD program header for every area */
-	for (m=kclist; m; m=m->next) {
+	for (m=clist; m; m=m->next) {
 		phdr = (struct elf_phdr *) bufp;
 		bufp += sizeof(struct elf_phdr);
 		offset += sizeof(struct elf_phdr);
@@ -305,7 +305,7 @@ read_kcore(struct file *file, char __use
 			return -ENOMEM;
 		}
 		memset(elf_buf, 0, elf_buflen);
-		elf_kcore_store_hdr(elf_buf, nphdr, elf_buflen);
+		elf_kcore_store_hdr(elf_buf, nphdr, elf_buflen, kclist);
 		read_unlock(&kclist_lock);
 		if (copy_to_user(buffer, elf_buf + *fpos, tsz)) {
 			kfree(elf_buf);
diff -puN fs/proc/proc_misc.c~kd-elf-269rc1-mm5 fs/proc/proc_misc.c
--- linux-2.6.9-rc1/fs/proc/proc_misc.c~kd-elf-269rc1-mm5	2004-09-15 17:36:53.000000000 +0530
+++ linux-2.6.9-rc1-hari/fs/proc/proc_misc.c	2004-09-15 17:36:53.000000000 +0530
@@ -44,6 +44,7 @@
 #include <linux/jiffies.h>
 #include <linux/sysrq.h>
 #include <linux/vmalloc.h>
+#include <linux/bootmem.h>
 #include <linux/crash_dump.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -584,6 +585,7 @@ static struct file_operations proc_crash
 #endif
 
 struct proc_dir_entry *proc_root_kcore;
+struct proc_dir_entry *proc_vmcore;
 
 static void create_seq_entry(char *name, mode_t mode, struct file_operations *f)
 {
@@ -678,6 +680,16 @@ void __init proc_misc_init(void)
 				(size_t)high_memory - PAGE_OFFSET + PAGE_SIZE;
 	}
 #endif
+#ifdef CONFIG_CRASH_DUMP
+	if (dump_enabled) {
+		proc_vmcore = create_proc_entry("vmcore", S_IRUSR, NULL);
+		if (proc_vmcore) {
+			proc_vmcore->proc_fops = &proc_vmcore_operations;
+			proc_vmcore->size =
+			(size_t)(saved_max_pfn << PAGE_SHIFT);
+		}
+	}
+#endif
 #ifdef CONFIG_MAGIC_SYSRQ
 	entry = create_proc_entry("sysrq-trigger", S_IWUSR, NULL);
 	if (entry)
diff -puN /dev/null fs/proc/vmcore.c
--- /dev/null	2003-01-30 15:54:37.000000000 +0530
+++ linux-2.6.9-rc1-hari/fs/proc/vmcore.c	2004-09-15 17:36:53.000000000 +0530
@@ -0,0 +1,238 @@
+/*
+ *	fs/proc/vmcore.c Interface for accessing the crash
+ * 				 dump from the system's previous life.
+ * 	Heavily borrowed from fs/proc/kcore.c
+ *	Created by: Hariprasad Nellitheertha (hari@in.ibm.com)
+ *	Copyright (C) IBM Corporation, 2004. All rights reserved
+ */
+
+#include <linux/config.h>
+#include <linux/mm.h>
+#include <linux/proc_fs.h>
+#include <linux/user.h>
+#include <linux/a.out.h>
+#include <linux/elf.h>
+#include <linux/elfcore.h>
+#include <linux/vmalloc.h>
+#include <linux/proc_fs.h>
+#include <linux/highmem.h>
+#include <linux/bootmem.h>
+#include <linux/init.h>
+#include <linux/crash_dump.h>
+#include <asm/uaccess.h>
+#include <asm/io.h>
+
+/* This is to re-use the kcore header creation code */
+static struct kcore_list vmcore_mem;
+
+static int open_vmcore(struct inode * inode, struct file * filp)
+{
+	return 0;
+}
+
+static ssize_t read_vmcore(struct file *,char __user *,size_t, loff_t *);
+
+struct file_operations proc_vmcore_operations = {
+	.read		= read_vmcore,
+	.open		= open_vmcore,
+};
+
+#define BACKUP_START CRASH_BACKUP_BASE
+#define BACKUP_END (CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE)
+#define REG_SIZE sizeof(elf_gregset_t)
+
+struct memelfnote
+{
+	const char *name;
+	int type;
+	unsigned int datasz;
+	void *data;
+};
+
+static size_t get_vmcore_size(int *nphdr, size_t *elf_buflen)
+{
+	size_t size;
+
+	/* We need one PT_LOAD segment header.
+	 * In addition, we need one PT_NOTE header.
+	 */
+	*nphdr = 2;
+	size = (size_t)(saved_max_pfn << PAGE_SHIFT);
+
+	*elf_buflen =	sizeof(struct elfhdr) +
+			(*nphdr + 2)*sizeof(struct elf_phdr) +
+			3 * sizeof(struct memelfnote) +
+			sizeof(struct elf_prstatus) +
+			sizeof(struct elf_prpsinfo) +
+			sizeof(struct task_struct);
+	*elf_buflen = PAGE_ALIGN(*elf_buflen);
+	return size + *elf_buflen;
+}
+
+/*
+ * Reads from the oldmem device from given offset till
+ * given count
+ */
+static ssize_t read_from_oldmem(char *buf, size_t count,
+			     loff_t *ppos, int userbuf)
+{
+	unsigned long pfn, p = *ppos;
+	size_t read = 0;
+
+	pfn = p / PAGE_SIZE;
+	p = p % PAGE_SIZE;
+
+	if (pfn > saved_max_pfn) {
+		read = -EINVAL;
+		goto done;
+	}
+
+	if (count > PAGE_SIZE - p)
+		count = PAGE_SIZE - p;
+
+	if (copy_oldmem_page(pfn, buf, count, userbuf)) {
+		read = -EFAULT;
+		goto done;
+	}
+
+	*ppos += count;
+done:
+	return read;
+}
+
+/*
+ * store an ELF crash dump header in the supplied buffer
+ * nphdr is the number of elf_phdr to insert
+ */
+static void elf_vmcore_store_hdr(char *bufp, int nphdr, int dataoff)
+{
+	struct elf_prstatus prstatus;	/* NT_PRSTATUS */
+	struct memelfnote notes[1];
+	char reg_buf[REG_SIZE];
+	loff_t reg_ppos;
+	char *buf = bufp;
+
+	vmcore_mem.addr = (unsigned long)__va(0);
+	vmcore_mem.size = saved_max_pfn << PAGE_SHIFT;
+	vmcore_mem.next = NULL;
+
+	/* Re-use the kcore code */
+	elf_kcore_store_hdr(bufp, nphdr, dataoff, &vmcore_mem);
+	buf += sizeof(struct elfhdr) + 2*sizeof(struct elf_phdr);
+
+	/* set up the process status */
+	notes[0].name = "CORE";
+	notes[0].type = NT_PRSTATUS;
+	notes[0].datasz = sizeof(struct elf_prstatus);
+	notes[0].data = &prstatus;
+
+	memset(&prstatus, 0, sizeof(struct elf_prstatus));
+
+	/* 1 - Get the registers from the reserved memory area */
+	reg_ppos = BACKUP_END + CRASH_RELOCATE_SIZE;
+	read_from_oldmem(reg_buf, REG_SIZE, &reg_ppos, 0);
+	elf_core_copy_regs(&prstatus.pr_reg, (struct pt_regs *)reg_buf);
+	buf = storenote(&notes[0], buf);
+}
+
+/*
+ * read from the ELF header and then the crash dump
+ */
+static ssize_t read_vmcore(
+struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
+{
+	ssize_t acc = 0;
+	size_t size, tsz;
+	size_t elf_buflen;
+	int nphdr;
+	unsigned long start;
+
+	tsz =  get_vmcore_size(&nphdr, &elf_buflen);
+	proc_vmcore->size = size = tsz + elf_buflen;
+	if (buflen == 0 || *fpos >= size) {
+		goto done;
+	}
+
+	/* trim buflen to not go beyond EOF */
+	if (buflen > size - *fpos)
+		buflen = size - *fpos;
+
+	/* construct an ELF core header if we'll need some of it */
+	if (*fpos < elf_buflen) {
+		char * elf_buf;
+
+		tsz = elf_buflen - *fpos;
+		if (buflen < tsz)
+			tsz = buflen;
+		elf_buf = kmalloc(elf_buflen, GFP_ATOMIC);
+		if (!elf_buf) {
+			acc = -ENOMEM;
+			goto done;
+		}
+		memset(elf_buf, 0, elf_buflen);
+		elf_vmcore_store_hdr(elf_buf, nphdr, elf_buflen);
+		if (copy_to_user(buffer, elf_buf + *fpos, tsz)) {
+			kfree(elf_buf);
+			acc = -EFAULT;
+			goto done;
+		}
+		kfree(elf_buf);
+		buflen -= tsz;
+		*fpos += tsz;
+		buffer += tsz;
+		acc += tsz;
+
+		/* leave now if filled buffer already */
+		if (buflen == 0) {
+			goto done;
+		}
+	}
+
+	start = *fpos - elf_buflen;
+	if ((tsz = (PAGE_SIZE - (start & ~PAGE_MASK))) > buflen)
+		tsz = buflen;
+
+	while (buflen) {
+		unsigned long p;
+
+		if ((start < 0) || (start >= size))
+			if (clear_user(buffer, tsz)) {
+				acc = -EFAULT;
+				goto done;
+			}
+
+		/* tsz contains actual len of dump to be read.
+		 * buflen is the total len that was requested.
+		 * This may contain part of ELF header. start
+		 * is the fpos for the oldmem region
+		 * If the file position corresponds to the second
+		 * kernel's memory, we just return zeroes
+		 */
+		p = start;
+		if ((p >= BACKUP_START) && (p < BACKUP_END)) {
+			if (clear_user(buffer, tsz)) {
+				acc = -EFAULT;
+				goto done;
+			}
+
+			goto read_done;
+		} else if (p < CRASH_RELOCATE_SIZE)
+			p += BACKUP_END;
+
+		if (read_from_oldmem(buffer, tsz, (loff_t *)&p, 1)) {
+			acc = -EINVAL;
+			goto done;
+		}
+
+read_done:
+		buflen -= tsz;
+		*fpos += tsz;
+		buffer += tsz;
+		acc += tsz;
+		start += tsz;
+		tsz = (buflen > PAGE_SIZE ? PAGE_SIZE : buflen);
+	}
+
+done:
+	return acc;
+}
diff -puN include/linux/crash_dump.h~kd-elf-269rc1-mm5 include/linux/crash_dump.h
--- linux-2.6.9-rc1/include/linux/crash_dump.h~kd-elf-269rc1-mm5	2004-09-15 17:36:53.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/linux/crash_dump.h	2004-09-15 17:36:53.000000000 +0530
@@ -1,8 +1,16 @@
 #include <linux/kexec.h>
 #include <linux/smp_lock.h>
 #include <linux/device.h>
+#include <linux/proc_fs.h>
 #include <asm/crash_dump.h>
 
+extern unsigned long saved_max_pfn;
+extern struct memelfnote memelfnote;
+extern int notesize(struct memelfnote *);
+extern char *storenote(struct memelfnote *, char *);
+extern ssize_t copy_oldmem_page(unsigned long, char *, size_t, int);
+extern void elf_kcore_store_hdr(char *, int, int, struct kcore_list *);
+
 #ifdef CONFIG_CRASH_DUMP
 extern int crash_dump_on;
 static inline void crash_machine_kexec(void)
diff -puN include/linux/proc_fs.h~kd-elf-269rc1-mm5 include/linux/proc_fs.h
--- linux-2.6.9-rc1/include/linux/proc_fs.h~kd-elf-269rc1-mm5	2004-09-15 17:36:53.000000000 +0530
+++ linux-2.6.9-rc1-hari/include/linux/proc_fs.h	2004-09-15 17:36:53.000000000 +0530
@@ -82,6 +82,7 @@ extern struct proc_dir_entry *proc_net;
 extern struct proc_dir_entry *proc_bus;
 extern struct proc_dir_entry *proc_root_driver;
 extern struct proc_dir_entry *proc_root_kcore;
+extern struct proc_dir_entry *proc_vmcore;
 
 extern void proc_root_init(void);
 extern void proc_misc_init(void);
@@ -117,6 +118,7 @@ extern int proc_readdir(struct file *, v
 extern struct dentry *proc_lookup(struct inode *, struct dentry *, struct nameidata *);
 
 extern struct file_operations proc_kcore_operations;
+extern struct file_operations proc_vmcore_operations;
 extern struct file_operations proc_kmsg_operations;
 extern struct file_operations ppc_htab_operations;
 

_
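
For illustration, the way read_vmcore() above walks the old memory one page
at a time can be sketched in userspace as follows. This is a sketch under
stated assumptions, not the kernel code: next_chunk() and count_chunks() are
hypothetical names, and a 4 KiB page size is assumed. The first chunk runs
only to the end of the current page; later chunks are whole pages until the
request is used up.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed 4 KiB pages, as on i386 */

/*
 * Length of the next chunk for a read of 'remaining' bytes starting at
 * byte offset 'start': never more than what is left in the current page,
 * and never more than what was requested.
 */
size_t next_chunk(unsigned long start, size_t remaining)
{
	size_t in_page = SKETCH_PAGE_SIZE - (start & (SKETCH_PAGE_SIZE - 1));

	return remaining < in_page ? remaining : in_page;
}

/* Count how many per-page copies a read of 'len' bytes is split into. */
unsigned int count_chunks(unsigned long start, size_t len)
{
	unsigned int n = 0;

	while (len) {
		size_t tsz = next_chunk(start, len);

		assert(tsz > 0 && tsz <= len);
		start += tsz;
		len -= tsz;
		n++;
	}
	return n;
}
```

Keeping every copy within a single page matters here because each old-memory
page has to be mapped individually before it can be copied.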


* Re: [PATCH][6/6]Linear/raw format dump file access
  2004-09-15 12:56         ` [PATCH][5/6]ELF format dump file access Hariprasad Nellitheertha
@ 2004-09-15 12:57           ` Hariprasad Nellitheertha
  2004-09-15 21:28           ` [PATCH][5/6]ELF " Andrew Morton
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-15 12:57 UTC (permalink / raw)
  To: akpm, linux-kernel, fastboot
  Cc: Suparna Bhattacharya, mbligh, ebiederm, litke

[-- Attachment #1: Type: text/plain, Size: 108 bytes --]

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

[-- Attachment #2: kd-oldmem-269rc1-mm5.patch --]
[-- Type: text/plain, Size: 4150 bytes --]



This patch contains the code that enables us to access the previous kernel's
memory as /dev/oldmem.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>
Signed off by Adam Litke <litke@us.ibm.com>


---

 linux-2.6.9-rc1-hari/Documentation/devices.txt |    1 
 linux-2.6.9-rc1-hari/drivers/char/mem.c        |   66 +++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff -puN Documentation/devices.txt~kd-oldmem-269rc1-mm5 Documentation/devices.txt
--- linux-2.6.9-rc1/Documentation/devices.txt~kd-oldmem-269rc1-mm5	2004-09-15 17:36:57.000000000 +0530
+++ linux-2.6.9-rc1-hari/Documentation/devices.txt	2004-09-15 17:36:57.000000000 +0530
@@ -100,6 +100,7 @@ Your cooperation is appreciated.
 		  9 = /dev/urandom	Faster, less secure random number gen.
 		 10 = /dev/aio		Asyncronous I/O notification interface
 		 11 = /dev/kmsg		Writes to this come out as printk's
+		 12 = /dev/oldmem		Access to kexec-ed crash dump
   1 block	RAM disk
 		  0 = /dev/ram0		First RAM disk
 		  1 = /dev/ram1		Second RAM disk
diff -puN drivers/char/mem.c~kd-oldmem-269rc1-mm5 drivers/char/mem.c
--- linux-2.6.9-rc1/drivers/char/mem.c~kd-oldmem-269rc1-mm5	2004-09-15 17:36:57.000000000 +0530
+++ linux-2.6.9-rc1-hari/drivers/char/mem.c	2004-09-15 17:36:57.000000000 +0530
@@ -23,6 +23,8 @@
 #include <linux/devfs_fs_kernel.h>
 #include <linux/ptrace.h>
 #include <linux/device.h>
+#include <linux/highmem.h>
+#include <linux/crash_dump.h>
 
 #include <asm/uaccess.h>
 #include <asm/io.h>
@@ -231,6 +233,60 @@ static int mmap_mem(struct file *file, s
 	return 0;
 }
 
+/*
+ * Read memory corresponding to the old kernel.
+ * If we are reading from the reserved section, which is
+ * actually used by the current kernel, we just return zeroes.
+ * Or if we are reading from the first 640k, we return from the
+ * backed up area.
+ */
+static ssize_t read_oldmem(struct file * file, char * buf,
+                        size_t count, loff_t *ppos)
+{
+	unsigned long pfn;
+	unsigned backup_start, backup_end, relocate_start;
+	size_t read=0, csize;
+
+	backup_start = CRASH_BACKUP_BASE / PAGE_SIZE;
+	backup_end = backup_start + (CRASH_BACKUP_SIZE / PAGE_SIZE);
+	relocate_start = (CRASH_BACKUP_BASE + CRASH_BACKUP_SIZE) / PAGE_SIZE;
+
+	while(count) {
+		pfn = *ppos / PAGE_SIZE;
+
+		csize = (count > PAGE_SIZE) ? PAGE_SIZE : count;
+
+		/* Perform translation (see comment above) */
+		if ((pfn >= backup_start) && (pfn < backup_end)) {
+			if (clear_user(buf, csize)) {
+				read = -EFAULT;
+				goto done;
+			}
+
+			goto copy_done;
+		} else if (pfn < (CRASH_RELOCATE_SIZE / PAGE_SIZE))
+			pfn += relocate_start;
+
+		if (pfn > saved_max_pfn) {
+			read = 0;
+			goto done;
+		}
+
+		if (copy_oldmem_page(pfn, buf, csize, 1)) {
+			read = -EFAULT;
+			goto done;
+		}
+
+copy_done:
+		buf += csize;
+		*ppos += csize;
+		read += csize;
+		count -= csize;
+	}
+done:
+	return read;
+}
+
 extern long vread(char *buf, char *addr, unsigned long count);
 extern long vwrite(char *buf, char *addr, unsigned long count);
 
@@ -535,6 +591,7 @@ static int open_port(struct inode * inod
 #define read_full       read_zero
 #define open_mem	open_port
 #define open_kmem	open_mem
+#define open_oldmem	open_mem
 
 static struct file_operations mem_fops = {
 	.llseek		= memory_lseek,
@@ -579,6 +636,11 @@ static struct file_operations full_fops 
 	.write		= write_full,
 };
 
+static struct file_operations oldmem_fops = {
+        .read           = read_oldmem,
+        .open           = open_oldmem,
+};
+
 static ssize_t kmsg_write(struct file * file, const char __user * buf,
 			  size_t count, loff_t *ppos)
 {
@@ -633,6 +695,9 @@ static int memory_open(struct inode * in
 		case 11:
 			filp->f_op = &kmsg_fops;
 			break;
+		case 12:
+			filp->f_op = &oldmem_fops;
+			break;
 		default:
 			return -ENXIO;
 	}
@@ -662,6 +727,7 @@ static const struct {
 	{8, "random",  S_IRUGO | S_IWUSR,           &random_fops},
 	{9, "urandom", S_IRUGO | S_IWUSR,           &urandom_fops},
 	{11,"kmsg",    S_IRUGO | S_IWUSR,           &kmsg_fops},
+	{12,"oldmem",    S_IRUSR | S_IWUSR | S_IRGRP, &oldmem_fops},
 };
 
 static struct class_simple *mem_class;

_
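
For illustration, the page frame translation performed by read_oldmem() above
can be sketched in userspace as follows. This is only a sketch:
oldmem_translate() is a hypothetical name, the base and size use the Kconfig
defaults (16 MB and 32 MB), and the relocate size is assumed to be the first
640k mentioned in the comment.

```c
#include <assert.h>

#define PAGE_SZ        4096UL
#define BACKUP_BASE    (16UL << 20)	/* assumed: default CONFIG_BACKUP_BASE */
#define BACKUP_SIZE    (32UL << 20)	/* assumed: default CONFIG_BACKUP_SIZE */
#define RELOCATE_SIZE  0xa0000UL	/* assumed: the first 640k */

enum oldmem_kind { OLDMEM_ZERO, OLDMEM_COPY };

/*
 * Decide what a read of page frame 'pfn' of the old kernel should return.
 * Pages inside the reserved region belong to the dump kernel, so they read
 * as zeroes. Pages in the first 640k were relocated to just past the
 * reserved region, so the read is redirected there. Everything else is
 * read in place. On OLDMEM_COPY the frame to copy from is stored in *src.
 */
enum oldmem_kind oldmem_translate(unsigned long pfn, unsigned long *src)
{
	const unsigned long backup_start = BACKUP_BASE / PAGE_SZ;
	const unsigned long backup_end = backup_start + BACKUP_SIZE / PAGE_SZ;
	const unsigned long relocate_start = backup_end;

	assert(src != 0);
	if (pfn >= backup_start && pfn < backup_end)
		return OLDMEM_ZERO;
	if (pfn < RELOCATE_SIZE / PAGE_SZ)
		pfn += relocate_start;
	*src = pfn;
	return OLDMEM_COPY;
}
```

The check against saved_max_pfn is left out of the sketch; in the patch it is
applied to the translated pfn.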


* Re: [Fastboot] kexec based crash dumping
  2004-09-15 12:50 kexec based crash dumping Hariprasad Nellitheertha
  2004-09-15 12:51 ` [PATCH][1/6]Documentation Hariprasad Nellitheertha
@ 2004-09-15 17:33 ` Eric W. Biederman
  1 sibling, 0 replies; 22+ messages in thread
From: Eric W. Biederman @ 2004-09-15 17:33 UTC (permalink / raw)
  To: hari; +Cc: akpm, linux-kernel, fastboot, Suparna Bhattacharya, mbligh, litke

Hariprasad Nellitheertha <hari@in.ibm.com> writes:

> Hi Andrew,
> 
> The patches that follow contain the kexec based crash dumping implementation.
> Based on feedback received last time, we have made several changes. Some of
> them are:
> 
> - The dumping kernel now boots from a non-default location. This is possible
>   due to Eric's patch which allows i386 kernels to boot from a non-default
>   location. This change means that we need two different kernels to get this
>   setup. The documentation patch has complete details on how to do this.

Cool.  I am glad you could get this going; this should make things
easier.

> - We can now choose whether or not to dump from panic. The documentation
>   patch has details on this as well.
> - The linear view is now called oldmem.
> - Changes as per the code review comments from the previous posting.
> 
> The patches correspond to 2.6.9-rc1-mm5.
> 
> Kindly review these patches and let me know your thoughts.

I will start with my standard objections about it still being
incompatible with other uses of kexec.

More later when I get a chance.  Things seem to slowly be going
in the right direction.

Eric


* Re: [PATCH][2/6]Memory preserving reboot using kexec
  2004-09-15 12:53   ` [PATCH][2/6]Memory preserving reboot using kexec Hariprasad Nellitheertha
  2004-09-15 12:54     ` [PATCH][3/6]Routines for copying the dump pages Hariprasad Nellitheertha
@ 2004-09-15 21:22     ` Andrew Morton
  2004-09-19 20:37     ` [Fastboot] " Eric W. Biederman
  2 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2004-09-15 21:22 UTC (permalink / raw)
  To: hari; +Cc: linux-kernel, fastboot, suparna, mbligh, ebiederm, litke


I'll add these to -mm.  Minor things:


> +config BACKUP_BASE
> +	int "location of the crash dumps backup region"
> +	depends on CRASH_DUMP
> +	default 16
> +	help
> +	This is the location where the second kernel will boot from.
> +
> +config BACKUP_SIZE
> +	int "Size of the crash dumps backup region"
> +	depends on CRASH_DUMP
> +	range 16 64
> +	default 32
> +	help
> +	The size of the second kernel's memory.

You should tell the user the units of the parameter.  So

	"location of the crash dumps backup region (MB)"

would be nice.
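
To illustrate why the units matter: asm-i386/crash_dump.h multiplies
CONFIG_BACKUP_BASE by 0x100000, so the value entered is taken as megabytes.
A minimal userspace sketch of that conversion follows; mb_to_bytes() is a
hypothetical name, and treating BACKUP_SIZE the same way is an assumption.

```c
#include <assert.h>

#define MB 0x100000UL

/* Convert a Kconfig value given in megabytes into a byte address or size. */
unsigned long mb_to_bytes(unsigned long mb)
{
	/* Guard against overflowing unsigned long (32 bits on i386). */
	assert(mb <= (unsigned long)-1 / MB);
	return mb * MB;
}
```

With the defaults above, a base of 16 gives 0x1000000 and a size of 32 gives
0x2000000, so the region would end at 0x3000000.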

> +++ linux-2.6.9-rc1-hari/include/asm-i386/crash_dump.h	2004-09-15 17:36:30.000000000 +0530
> @@ -0,0 +1,44 @@
> +/* asm-i386/crash_dump.h */
> +#include <linux/bootmem.h>
> +
> +extern unsigned int dump_enabled;
> +
> +void __crash_relocate_mem(unsigned long, unsigned long);
> +unsigned long __init find_max_low_pfn(void);
> +void __init find_max_pfn(void);
> +
> +extern unsigned int crashed;

Should the above declarations be inside CONFIG_CRASH_DUMP?  We don't want
to leave them in scope if they don't exist.

> +static inline void crash_machine_kexec(void)
> +{
> +	struct kimage *image;
> +
> +	if ((!crash_dump_on) || (crashed))
> +		return;
> +
> +	image = xchg(&kexec_image, 0);
> +	if (image) {
> +		crashed = 1;
> +		device_shutdown();
> +		printk(KERN_EMERG "kexec: opening parachute\n");
> +		machine_kexec(image);
> +		while (1);
> +	} else {
> +		printk(KERN_EMERG "kexec: No kernel image loaded!\n");
> +	}
> +}

I don't see why this is inlined??

> +#ifdef CONFIG_CRASH_DUMP
> +/*
> + * Enable kexec reboot upon panic; for dumping
> + */
> +static ssize_t write_crash_dump_on(struct file *file, const char __user *buf,
> +					size_t count, loff_t *ppos)
> +{
> +	if (count) {
> +		if (get_user(crash_dump_on, buf))
> +			return -EFAULT;
> +	}
> +	return count;
> +}
> +
> +static struct file_operations proc_crash_dump_on_operations = {
> +	.write		= write_crash_dump_on,
> +};
> +#endif
> +
>  struct proc_dir_entry *proc_root_kcore;
>  
>  static void create_seq_entry(char *name, mode_t mode, struct file_operations *f)
> @@ -663,6 +683,11 @@ void __init proc_misc_init(void)
>  	if (entry)
>  		entry->proc_fops = &proc_sysrq_trigger_operations;
>  #endif
> +#ifdef CONFIG_CRASH_DUMP
> +	entry = create_proc_entry("kexec-dump", S_IWUSR, NULL);
> +	if (entry)
> +		entry->proc_fops = &proc_crash_dump_on_operations;
> +#endif
>  #ifdef CONFIG_LOCKMETER
>  	entry = create_proc_entry("lockmeter", S_IWUSR | S_IRUGO, NULL);
>  	if (entry) {

Please do all the above in a crashdump-specific file, not inside proc_misc.c




* Re: [PATCH][3/6]Routines for copying the dump pages
  2004-09-15 12:54     ` [PATCH][3/6]Routines for copying the dump pages Hariprasad Nellitheertha
  2004-09-15 12:55       ` [PATCH][4/6]Register snapshotting before kexec boot Hariprasad Nellitheertha
@ 2004-09-15 21:23       ` Andrew Morton
  1 sibling, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2004-09-15 21:23 UTC (permalink / raw)
  To: hari; +Cc: linux-kernel, fastboot, suparna, mbligh, ebiederm, litke

Hariprasad Nellitheertha <hari@in.ibm.com> wrote:
>
> +/*
> + * Copy a page from "oldmem". For this page, there is no pte mapped
> + * in the current kernel. We stitch up a pte, similar to kmap_atomic.
> + */
> +static inline ssize_t copy_oldmem_page(unsigned long pfn,
> +			char *buf, size_t csize, int userbuf)

Again, why inline this function?


* Re: [PATCH][4/6]Register snapshotting before kexec boot
  2004-09-15 12:55       ` [PATCH][4/6]Register snapshotting before kexec boot Hariprasad Nellitheertha
  2004-09-15 12:56         ` [PATCH][5/6]ELF format dump file access Hariprasad Nellitheertha
@ 2004-09-15 21:27         ` Andrew Morton
  2004-09-16  8:11           ` [Fastboot] " Dipankar Sarma
  1 sibling, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2004-09-15 21:27 UTC (permalink / raw)
  To: hari, Rusty Russell
  Cc: linux-kernel, fastboot, suparna, mbligh, ebiederm, litke

Hariprasad Nellitheertha <hari@in.ibm.com> wrote:
>
> +void __crash_dump_stop_cpus(void)
> +{
> +	int i, cpu = smp_processor_id();
> +	int other_cpus = num_online_cpus()-1;
> +
> +	if (other_cpus > 0) {
> +		atomic_set(&waiting_for_dump_ipi, other_cpus);
> +
> +		for (i = 0; i < NR_CPUS; i++)
> +			crash_dump_expect_ipi[i] = (i != cpu && cpu_online(i));
> +
> +		set_nmi_callback(crash_dump_nmi_callback);
> +		/* Ensure the new callback function is set before sending
> +		 * out the IPI
> +		 */
> +		wmb();
> +
> +		crash_dump_send_ipi();
> +		while (atomic_read(&waiting_for_dump_ipi) > 0)
> +			cpu_relax();
> +
> +		unset_nmi_callback();
> +	} else {
> +		local_irq_disable();
> +		disable_local_APIC();
> +		local_irq_enable();
> +	}
> +}

Is dodgy wrt CPU hotplug, but there's not a lot we can do about that
in this context, I expect.  Which is a shame, given that CPU hotplug
is a likely time at which to be taking a crashdump ;)

Rusty, you may like to review these patches...


* Re: [PATCH][5/6]ELF format dump file access
  2004-09-15 12:56         ` [PATCH][5/6]ELF format dump file access Hariprasad Nellitheertha
  2004-09-15 12:57           ` [PATCH][6/6]Linear/raw " Hariprasad Nellitheertha
@ 2004-09-15 21:28           ` Andrew Morton
  2004-09-15 21:29           ` Andrew Morton
  2004-09-15 21:31           ` Andrew Morton
  3 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2004-09-15 21:28 UTC (permalink / raw)
  To: hari; +Cc: linux-kernel, fastboot, suparna, mbligh, ebiederm, litke

Hariprasad Nellitheertha <hari@in.ibm.com> wrote:
>
> -static int notesize(struct memelfnote *en)
> +int notesize(struct memelfnote *en)
>  {
>  	int sz;
>  
> @@ -129,7 +129,7 @@ static int notesize(struct memelfnote *e
>  /*
>   * store a note in the header buffer
>   */
> -static char *storenote(struct memelfnote *men, char *bufp)
> +char *storenote(struct memelfnote *men, char *bufp)

As you're giving these kernel-wide scope, please also rename them
to elf_notesize() and elf_storenote().
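
For reference, a userspace sketch of what a renamed elf_notesize() might
compute, assuming the usual ELF note layout (a three-word header followed by
the name and the descriptor, each padded to 4 bytes). The struct definitions
below are stand-ins for the kernel's types, and round_up4() is a hypothetical
helper; none of this is copied from fs/proc/kcore.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout of an ELF note header: three 32-bit words (cf. Elf32_Nhdr). */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Userspace stand-in for the kernel's struct memelfnote. */
struct memelfnote {
	const char *name;
	int type;
	unsigned int datasz;
	void *data;
};

/* Round n up to the next multiple of 4. */
size_t round_up4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Bytes needed to store one note: header, padded name, padded data. */
size_t elf_notesize(const struct memelfnote *en)
{
	assert(en && en->name);
	return sizeof(struct note_hdr)
		+ round_up4(strlen(en->name) + 1)
		+ round_up4(en->datasz);
}
```

With the elf_ prefix the helpers no longer risk clashing with unrelated
symbols once they are visible outside kcore.c.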


* Re: [PATCH][5/6]ELF format dump file access
  2004-09-15 12:56         ` [PATCH][5/6]ELF format dump file access Hariprasad Nellitheertha
  2004-09-15 12:57           ` [PATCH][6/6]Linear/raw " Hariprasad Nellitheertha
  2004-09-15 21:28           ` [PATCH][5/6]ELF " Andrew Morton
@ 2004-09-15 21:29           ` Andrew Morton
  2004-09-15 21:31           ` Andrew Morton
  3 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2004-09-15 21:29 UTC (permalink / raw)
  To: hari; +Cc: linux-kernel, fastboot, suparna, mbligh, ebiederm, litke

Hariprasad Nellitheertha <hari@in.ibm.com> wrote:
>
> +#ifdef CONFIG_CRASH_DUMP
> +	if (dump_enabled) {
> +		proc_vmcore = create_proc_entry("vmcore", S_IRUSR, NULL);
> +		if (proc_vmcore) {
> +			proc_vmcore->proc_fops = &proc_vmcore_operations;
> +			proc_vmcore->size =
> +			(size_t)(saved_max_pfn << PAGE_SHIFT);
> +		}
> +	}
> +#endif

Again, please try to move this out of procfs and into a crashdump-specific file.


* Re: [PATCH][5/6]ELF format dump file access
  2004-09-15 12:56         ` [PATCH][5/6]ELF format dump file access Hariprasad Nellitheertha
                             ` (2 preceding siblings ...)
  2004-09-15 21:29           ` Andrew Morton
@ 2004-09-15 21:31           ` Andrew Morton
  3 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2004-09-15 21:31 UTC (permalink / raw)
  To: hari; +Cc: linux-kernel, fastboot, suparna, mbligh, ebiederm, litke

Hariprasad Nellitheertha <hari@in.ibm.com> wrote:
>
> +/*
> + * Reads from the oldmem device from given offset till
> + * given count
> + */
> +static ssize_t read_from_oldmem(char *buf, size_t count,
> +			     loff_t *ppos, int userbuf)
> +{
> +	unsigned long pfn, p = *ppos;
> +	size_t read = 0;
> +
> +	pfn = p / PAGE_SIZE;
> +	p = p % PAGE_SIZE;
> +
> +	if (pfn > saved_max_pfn) {
> +		read = -EINVAL;
> +		goto done;
> +	}
> +
> +	if (count > PAGE_SIZE - p)
> +		count = PAGE_SIZE - p;
> +
> +	if (copy_oldmem_page(pfn, buf, count, userbuf)) {
> +		read = -EFAULT;
> +		goto done;
> +	}
> +
> +	*ppos += count;

hm, what's going on here?  *ppos is a loff_t but you've copied it
into a 32-bit local prior to calculating the pfn.
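
To illustrate the concern: on i386, unsigned long is 32 bits wide while loff_t
is 64 bits, so any offset at or above 4 GB loses its high bits before the
division. A small userspace sketch follows, using fixed-width types to stand
in for the two; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ 4096u

/* pfn as computed when the offset is first stored in a 32-bit variable. */
uint64_t pfn_via_32bit(uint64_t pos)
{
	uint32_t p = (uint32_t)pos;	/* the high 32 bits are lost here */

	return p / PAGE_SZ;
}

/* pfn computed on the full 64-bit offset. */
uint64_t pfn_via_64bit(uint64_t pos)
{
	return pos / PAGE_SZ;
}

/* Offset within the page; this part always fits in 32 bits. */
uint32_t offset_in_page(uint64_t pos)
{
	uint32_t off = (uint32_t)(pos % PAGE_SZ);

	assert(off < PAGE_SZ);
	return off;
}
```

For an offset of 0x100001234 the 32-bit path yields pfn 1, whereas the full
64-bit computation yields pfn 0x100001. Below 4 GB the two agree.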


* Re: [Fastboot] Re: [PATCH][4/6]Register snapshotting before kexec boot
  2004-09-15 21:27         ` [PATCH][4/6]Register snapshotting before kexec boot Andrew Morton
@ 2004-09-16  8:11           ` Dipankar Sarma
  2004-09-17 14:53             ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 22+ messages in thread
From: Dipankar Sarma @ 2004-09-16  8:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: hari, Rusty Russell, suparna, fastboot, ebiederm, litke,
	linux-kernel, mbligh

On Wed, Sep 15, 2004 at 02:27:22PM -0700, Andrew Morton wrote:
> Hariprasad Nellitheertha <hari@in.ibm.com> wrote:
> > +void __crash_dump_stop_cpus(void)
> > +{
> > +	int i, cpu = smp_processor_id();
> > +	int other_cpus = num_online_cpus()-1;
> > +
> > +	if (other_cpus > 0) {
> > +		atomic_set(&waiting_for_dump_ipi, other_cpus);
> > +
> > +		for (i = 0; i < NR_CPUS; i++)
> > +			crash_dump_expect_ipi[i] = (i != cpu && cpu_online(i));
> > +
> > +		set_nmi_callback(crash_dump_nmi_callback);
> > +		/* Ensure the new callback function is set before sending
> > +		 * out the IPI
> > +		 */
> > +		wmb();
> > +
> > +		crash_dump_send_ipi();
> > +		while (atomic_read(&waiting_for_dump_ipi) > 0)
> > +			cpu_relax();
> > +
> > +		unset_nmi_callback();
> > +	} else {
> > +		local_irq_disable();
> > +		disable_local_APIC();
> > +		local_irq_enable();
> > +	}
> > +}
> 
> Is dodgy wrt CPU hotplug, but there's not a lot we can do about that
> in this context, I expect.  Which is a shame, given that CPU hotplug
> is a likely time at which to be taking a crashdump ;)

If Hari disables preemption during this entire section of code,
he should be safe from CPU hotplug, AFAICS. The stop machine
threads will never get to run on that CPU.

Thanks
Dipankar


* Re: [Fastboot] Re: [PATCH][4/6]Register snapshotting before kexec boot
  2004-09-16  8:11           ` [Fastboot] " Dipankar Sarma
@ 2004-09-17 14:53             ` Srivatsa Vaddagiri
  2004-09-19 20:17               ` Eric W. Biederman
  0 siblings, 1 reply; 22+ messages in thread
From: Srivatsa Vaddagiri @ 2004-09-17 14:53 UTC (permalink / raw)
  To: Dipankar Sarma
  Cc: Andrew Morton, hari, Rusty Russell, suparna, fastboot, ebiederm,
	litke, linux-kernel, mbligh

On Thu, Sep 16, 2004 at 08:41:13AM +0000, Dipankar Sarma wrote:
> On Wed, Sep 15, 2004 at 02:27:22PM -0700, Andrew Morton wrote:
> > Is dodgy wrt CPU hotplug, but there's not a lot we can do about that
> > in this context, I expect.  Which is a shame, given that CPU hotplug
> > is a likely time at which to be taking a crashdump ;)
> 
> If Hari disables preemption during this entire section of code,
> he should be safe from CPU hotplug, AFAICS. The stop machine
> threads will never get to run on that CPU.

This will work for CPU remove, not CPU add, since the latter
is not atomic (yet).

Rusty, do you think it would be worthwhile making CPU add atomic?
I can give it a shot :)

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017


* Re: [Fastboot] Re: [PATCH][4/6]Register snapshotting before kexec boot
  2004-09-17 14:53             ` Srivatsa Vaddagiri
@ 2004-09-19 20:17               ` Eric W. Biederman
  0 siblings, 0 replies; 22+ messages in thread
From: Eric W. Biederman @ 2004-09-19 20:17 UTC (permalink / raw)
  To: vatsa
  Cc: Dipankar Sarma, Andrew Morton, suparna, ebiederm, mbligh,
	fastboot, litke, Rusty Russell, linux-kernel

Srivatsa Vaddagiri <vatsa@in.ibm.com> writes:

> On Thu, Sep 16, 2004 at 08:41:13AM +0000, Dipankar Sarma wrote:
> > On Wed, Sep 15, 2004 at 02:27:22PM -0700, Andrew Morton wrote:
> > > Is dodgy wrt CPU hotplug, but there's not a lot we can do about that
> > > in this context, I expect.  Which is a shame, given that CPU hotplug
> > > is a likely time at which to be taking a crashdump ;)
> > 
> > If Hari disables preemption during this entire section of code,
> > he should be safe from CPU hotplug, AFAICS. The stop machine
> > threads will never get to run on that CPU.
> 
> This will work for CPU remove, not CPU add, since the latter
> is not atomic (yet).
> 
> Rusty, do you think it would be worthwhile making CPU add atomic?
> I can give it a shot :)

The whole section should be run with interrupts disabled,
so disabling preemption should be trivial.

Beyond this, the code needs a bogomips-based timeout (we can't use external
time sources, just the delay loop).  Any of the other CPUs could have
crashed at this point, or our accounting data structures could be
corrupt.
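
As a rough illustration of such a bounded wait, here is a userspace
sketch using C11 atomics. sketch_waiting and sketch_wait_for_cpus() are
hypothetical names standing in for the patch's waiting_for_dump_ipi
counter and its spin loop; a real version would pace each iteration with
the calibrated delay loop rather than a bare spin count:

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the patch's waiting_for_dump_ipi. */
static atomic_int sketch_waiting;

/*
 * Spin until the counter drops to zero or max_spins iterations elapse.
 * Returns 0 if every CPU checked in, -1 if we gave up waiting.
 */
static int sketch_wait_for_cpus(atomic_int *waiting, unsigned long max_spins)
{
	unsigned long spins;

	for (spins = 0; spins < max_spins; spins++) {
		if (atomic_load(waiting) <= 0)
			return 0;
		/* In the kernel this would be a calibrated delay, e.g. udelay(). */
	}

	/* One last look, so a zero budget still reports an idle counter. */
	return atomic_load(waiting) <= 0 ? 0 : -1;
}
```

The point is only that the caller proceeds with the dump after the
timeout instead of hanging forever on a CPU that never answers.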

Putting the initial call inside of machine_kexec is also bogus.
crash_machine_kexec needs to call crash_stop_cpus and then
machine_kexec.  machine_kexec is currently factored as a perfectly
reusable piece of code; let's not mess that up.

Eric


* Re: [Fastboot] Re: [PATCH][2/6]Memory preserving reboot using kexec
  2004-09-15 12:53   ` [PATCH][2/6]Memory preserving reboot using kexec Hariprasad Nellitheertha
  2004-09-15 12:54     ` [PATCH][3/6]Routines for copying the dump pages Hariprasad Nellitheertha
  2004-09-15 21:22     ` [PATCH][2/6]Memory preserving reboot using kexec Andrew Morton
@ 2004-09-19 20:37     ` Eric W. Biederman
  2004-09-20 13:49       ` Hariprasad Nellitheertha
  2 siblings, 1 reply; 22+ messages in thread
From: Eric W. Biederman @ 2004-09-19 20:37 UTC (permalink / raw)
  To: hari
  Cc: akpm, linux-kernel, fastboot, Suparna Bhattacharya, mbligh,
	ebiederm, litke

Hariprasad Nellitheertha <hari@in.ibm.com> writes:

> Regards, Hari
> -- 
> Hariprasad Nellitheertha
> Linux Technology Center
> India Software Labs
> IBM India, Bangalore
> 
> 
> 
> This patch contains the code that does the memory preserving reboot. It 
> copies over the first 640k into a backup region before handing over to 
> kexec. The second kernel will boot using only the backup region.

Do you know what the kernel does with the low 1M?

Nothing in the hardware architecture requires us to use the
low 1M.  So I think we would be safer if we could track down
and remove this dependency.

In general I agree that we need to be prepared to save some of the
original machine state, because some architectures give special
meaning to addresses in memory.  But x86 is not one of those.

Perhaps the proper abstraction is to add a use_mem= variable
that simply tells us which memory addresses we can use.

By still doing some copying we run into the problem of
potentially running from memory areas where DMA transfers
may still be in progress.  So this is worth tracking down.

> diff -puN /dev/null include/linux/crash_dump.h
> --- /dev/null	2003-01-30 15:54:37.000000000 +0530
> +++ linux-2.6.9-rc1-hari/include/linux/crash_dump.h 2004-09-15
> 17:36:30.000000000 +0530
> 
> @@ -0,0 +1,28 @@
> +#include <linux/kexec.h>
> +#include <linux/smp_lock.h>
> +#include <linux/device.h>
> +#include <asm/crash_dump.h>
> +
> +#ifdef CONFIG_CRASH_DUMP

Why is this function an inline in a header file?
I know we only call it once, but unnecessary code
in a header file still seems silly.

> +extern int crash_dump_on;
> +static inline void crash_machine_kexec(void)
> +{
> +	struct kimage *image;
> +
> +	if ((!crash_dump_on) || (crashed))
> +		return;
> +
> +	image = xchg(&kexec_image, 0);

You are still not using a different global variable here.

With a separate kexec_crash_image variable if the image is present
we do a crash dump.  That should allow you to remove the
crash_dump_on variable and the test above.

> +	if (image) {
> +		crashed = 1;
We should not need the magic global variable crashed.  We can
just call the one or two functions needed from crash_machine_kexec.

> +		device_shutdown();
Why are you calling device_shutdown here?
> +		printk(KERN_EMERG "kexec: opening parachute\n");

To wrap things prettily we should probably add a machine_crash_shutdown()
and place in it whatever architecture-specific magic needs to happen.

> +		machine_kexec(image);
> +		while (1);

The while loop is unnecessary; machine_kexec cannot
return.

> +	} else {
> +		printk(KERN_EMERG "kexec: No kernel image loaded!\n");
> +	}
> +}
> +#else
> +#define crash_machine_kexec()	do { } while(0)
> +#endif
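
Taken together, the comments above might lead to something like the
following. This is only a userspace mock to show the control flow: every
sketch_* name is hypothetical, atomic_exchange() stands in for xchg(),
and the stubs merely record that they were called. The presence of a
separate crash image replaces both crash_dump_on and crashed:

```c
#include <stdatomic.h>
#include <stddef.h>

struct sketch_kimage { int id; };

/* Hypothetical: a crash-only image pointer, separate from kexec_image. */
static _Atomic(struct sketch_kimage *) sketch_kexec_crash_image;

/* Call log so the ordering of the steps can be observed. */
static int sketch_calls[4];
static int sketch_ncalls;

static void sketch_crash_stop_cpus(void)
{
	sketch_calls[sketch_ncalls++] = 1;
}

static void sketch_machine_kexec(struct sketch_kimage *image)
{
	(void)image;	/* the real machine_kexec() does not return */
	sketch_calls[sketch_ncalls++] = 2;
}

static int sketch_crash_machine_kexec(void)
{
	/* Claim the image atomically; a second caller sees NULL and backs off. */
	struct sketch_kimage *image =
		atomic_exchange(&sketch_kexec_crash_image,
				(struct sketch_kimage *)NULL);

	if (!image)
		return -1;	/* no crash image loaded, nothing to do */

	sketch_crash_stop_cpus();
	sketch_machine_kexec(image);
	return 0;
}
```

Because the exchange hands the image to exactly one caller, a nested
panic finds NULL and returns, which is the job the crashed flag was
doing.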


* Re: [Fastboot] Re: [PATCH][2/6]Memory preserving reboot using kexec
  2004-09-19 20:37     ` [Fastboot] " Eric W. Biederman
@ 2004-09-20 13:49       ` Hariprasad Nellitheertha
  2004-09-20 19:53         ` Eric W. Biederman
  0 siblings, 1 reply; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-09-20 13:49 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: akpm, linux-kernel, fastboot, Suparna Bhattacharya, mbligh, agl

Hi Eric,

On Sun, Sep 19, 2004 at 02:37:18PM -0600, Eric W. Biederman wrote:
> Hariprasad Nellitheertha <hari@in.ibm.com> writes:
> 
> > This patch contains the code that does the memory preserving reboot. It 
> > copies over the first 640k into a backup region before handing over to 
> > kexec. The second kernel will boot using only the backup region.
> 
> Do you know what the kernel does with the low 1M?
> 
> Nothing in the hardware architecture requires us to use the
> low 1M.  So I think we would be safer if we could track down
> and remove this dependency.
> 
> In general I agree that we need to be prepared to save some of the
> original machine state, because some architectures give special
> meaning to addresses in memory.  But x86 is not one of those.
> 
> Perhaps the proper abstraction is to add a use_mem= variable
> that simply tells us which memory addresses we can use.
> 
> By still doing some copying we run into the problem, of
> potentially running out of memory areas where ongoing DMA
> transfers may be happening.  So this is worth
> tracking down.

I am trying to track this down. I tried moving the first segment of vmlinux
into the reserved section by modifying kexec-tools. This is the command line
argument segment. It still seems to need the first few kilobytes, though. 

Eliminating this is definitely needed so we can avoid using the first 
kernel's region completely.

Also, I will make the changes in the rest of the patch as per your review
comments.

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore


* Re: [Fastboot] Re: [PATCH][2/6]Memory preserving reboot using kexec
  2004-09-20 13:49       ` Hariprasad Nellitheertha
@ 2004-09-20 19:53         ` Eric W. Biederman
  0 siblings, 0 replies; 22+ messages in thread
From: Eric W. Biederman @ 2004-09-20 19:53 UTC (permalink / raw)
  To: hari; +Cc: akpm, linux-kernel, fastboot, Suparna Bhattacharya, mbligh, agl

Hariprasad Nellitheertha <hari@in.ibm.com> writes:

> Hi Eric,
> 
> On Sun, Sep 19, 2004 at 02:37:18PM -0600, Eric W. Biederman wrote:
> > Hariprasad Nellitheertha <hari@in.ibm.com> writes:
> > 
> > > This patch contains the code that does the memory preserving reboot. It 
> > > copies over the first 640k into a backup region before handing over to 
> > > kexec. The second kernel will boot using only the backup region.
> > 
> > Do you know what the kernel does with the low 1M?
> > 
> > Nothing in the hardware architecture requires us to use the
> > low 1M.  So I think we would be safer if we could track down
> > and remove this dependency.
> > 
> > In general I agree that we need to be prepared to save some of the
> > original machine state, because some architectures give special
> > meaning to addresses in memory.  But x86 is not one of those.
> > 
> > Perhaps the proper abstraction is to add a use_mem= variable
> > that simply tells us which memory addresses we can use.
> > 
> > By still doing some copying we run into the problem, of
> > potentially running out of memory areas where ongoing DMA
> > transfers may be happening.  So this is worth
> > tracking down.
> 
> I am trying to track this down. I tried moving the first segment of vmlinux
> into the reserved section by modifying kexec-tools. This is the command line
> argument segment. It still seems to need the first few kilobytes, though. 

Right, that is being automatically placed there.
For testing it should not be too hard to hard-code it someplace
appropriate.

> Eliminating this is definitely needed so we can avoid using the first 
> kernel's region completely.
> 
> Also, I will make the changes in the rest of the patch as per your review
> comments.

Thanks.


* Re: [PATCH][2/6]Memory preserving reboot using kexec
  2004-08-17 12:07   ` [PATCH][2/6]Memory preserving reboot using kexec Hariprasad Nellitheertha
@ 2004-08-17 16:01     ` Dave Hansen
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Hansen @ 2004-08-17 16:01 UTC (permalink / raw)
  To: Hariprasad Nellitheertha [imap]
  Cc: Linux Kernel Mailing List, fastboot, Andrew Morton,
	Suparna Bhattacharya [imap],
	Martin J. Bligh, litke, Eric W. Biederman

On Tue, 2004-08-17 at 05:07, Hariprasad Nellitheertha wrote:
> Regards, Hari


> +void __relocate_base_mem(unsigned long backup_addr, unsigned long backup_size)
> +{
> +       unsigned long pfn, pfn_max;
> +       void *src_addr, *dest_addr;
> +       struct page *page;
> +
> +       pfn_max = backup_size >> PAGE_SHIFT;
> +       for (pfn = 0; pfn < pfn_max; pfn++) {
> +               src_addr = phys_to_virt(pfn << PAGE_SHIFT);
> +               dest_addr = backup_addr + src_addr;
> +               if (!pfn_valid(pfn))
> +                       continue;
> +               page = pfn_to_page(pfn);
> +               if (PageReserved(page))
> +                       copy_page(dest_addr, src_addr);
> +       }
> +}

You're getting a little sloppy with your types in there.  I know you
probably aren't getting warnings for passing unsigned longs to
copy_page(), but you should probably still be passing pointers to it.  

I think the general convention is to keep physical addresses in unsigned
longs and virtual addresses in pointers.  Just keep that in mind.  
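
A userspace sketch of that convention, purely for illustration: the
sketch_* names are made up, sketch_ram plays the role of physical
memory, and the pfn_valid()/PageReserved() checks from the patch are
left out. Physical addresses stay in unsigned longs until the moment
they are turned into pointers for the copy:

```c
#include <string.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Stand-in for physical memory: 8 pages, indexed by physical address. */
static unsigned char sketch_ram[8 * SKETCH_PAGE_SIZE];

/* Hypothetical analogue of phys_to_virt(): physical address in, pointer out. */
static void *sketch_phys_to_virt(unsigned long phys)
{
	return sketch_ram + phys;
}

static void sketch_relocate_base_mem(unsigned long backup_phys,
				     unsigned long backup_size)
{
	unsigned long pfn, pfn_max = backup_size >> SKETCH_PAGE_SHIFT;

	for (pfn = 0; pfn < pfn_max; pfn++) {
		/* All address arithmetic is done on physical addresses... */
		unsigned long src_phys = pfn << SKETCH_PAGE_SHIFT;
		unsigned long dest_phys = backup_phys + src_phys;

		/* ...and only the final copy works on pointers. */
		memcpy(sketch_phys_to_virt(dest_phys),
		       sketch_phys_to_virt(src_phys), SKETCH_PAGE_SIZE);
	}
}
```

Keeping the two kinds of address in different types means a mix-up like
adding a physical offset to a virtual pointer stands out at the point
where it happens.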

> +#define CRASH_BACKUP_BASE 0x08000000
> +#define CRASH_BACKUP_SIZE 0x01000000

What are these numbers?  Why do you need to define them when your config
option is off?

> +/*
> + * If we have booted due to a crash, max_pfn will be a very low value. We need
> + * to know the amount of memory that the previous kernel used.
> + */
> +unsigned long saved_max_pfn;

I'd probably put that comment next to the place where saved_max_pfn is
used instead of where it is declared.  


-- Dave



* Re: [PATCH][2/6]Memory preserving reboot using kexec
  2004-08-17 12:05 ` [PATCH][1/6]Documentation Hariprasad Nellitheertha
@ 2004-08-17 12:07   ` Hariprasad Nellitheertha
  2004-08-17 16:01     ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Hariprasad Nellitheertha @ 2004-08-17 12:07 UTC (permalink / raw)
  To: linux-kernel, fastboot
  Cc: akpm, Suparna Bhattacharya, mbligh, litke, ebiederm

[-- Attachment #1: Type: text/plain, Size: 108 bytes --]

Regards, Hari
-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

[-- Attachment #2: kd-reb-268.patch --]
[-- Type: text/plain, Size: 8996 bytes --]



This patch contains the code that does the memory preserving reboot. It 
copies over the first CRASH_BACKUP_SIZE amount of memory into a backup
region before handing over to kexec.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>
Signed off by Adam Litke <litke@us.ibm.com>


---

 linux-2.6.8.1-hari/arch/i386/Kconfig                |   22 +++++++++++
 linux-2.6.8.1-hari/arch/i386/kernel/machine_kexec.c |   31 +++++++++++++++
 linux-2.6.8.1-hari/arch/i386/kernel/setup.c         |   11 +++++
 linux-2.6.8.1-hari/include/asm-i386/crash_dump.h    |   39 ++++++++++++++++++++
 linux-2.6.8.1-hari/include/linux/bootmem.h          |    1 
 linux-2.6.8.1-hari/include/linux/crash_dump.h       |   27 +++++++++++++
 linux-2.6.8.1-hari/kernel/panic.c                   |    6 +++
 linux-2.6.8.1-hari/mm/bootmem.c                     |    5 ++
 8 files changed, 142 insertions(+)

diff -puN arch/i386/Kconfig~kd-reb-268 arch/i386/Kconfig
--- linux-2.6.8.1/arch/i386/Kconfig~kd-reb-268	2004-08-17 17:04:35.000000000 +0530
+++ linux-2.6.8.1-hari/arch/i386/Kconfig	2004-08-17 17:05:03.000000000 +0530
@@ -865,6 +865,28 @@ config REGPARM
 	generate incorrect output with certain kernel constructs when
 	-mregparm=3 is used.
 
+config CRASH_DUMP
+	bool "kernel crash dumps (EXPERIMENTAL)"
+	depends on KEXEC
+	help
+	  Generate crash dump using kexec.
+
+config BACKUP_BASE
+	int "location of the crash dumps backup region"
+	depends on CRASH_DUMP
+	default 128
+	help
+	Offset of backup region in terms of MB. Change this if you want
+	to modify the location of the crash dumps backup region.
+
+config BACKUP_SIZE
+	int "Size of the crash dumps backup region"
+	depends on CRASH_DUMP
+	range 16 64
+	default 16
+	help
+	The size of the backup region, in MB. This is also the size of the
+	memory that will be used to boot the second kernel after a crash.
 endmenu
 
 
diff -puN arch/i386/kernel/machine_kexec.c~kd-reb-268 arch/i386/kernel/machine_kexec.c
--- linux-2.6.8.1/arch/i386/kernel/machine_kexec.c~kd-reb-268	2004-08-17 17:04:35.000000000 +0530
+++ linux-2.6.8.1-hari/arch/i386/kernel/machine_kexec.c	2004-08-17 17:05:03.000000000 +0530
@@ -9,6 +9,7 @@
 #include <linux/mm.h>
 #include <linux/kexec.h>
 #include <linux/delay.h>
+#include <linux/crash_dump.h>
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
@@ -16,6 +17,7 @@
 #include <asm/io.h>
 #include <asm/apic.h>
 #include <asm/cpufeature.h>
+#include <asm/hw_irq.h>
 
 
 static void set_idt(void *newidt, __u16 limit)
@@ -76,6 +78,28 @@ const extern unsigned int relocate_new_k
 extern void use_mm(struct mm_struct *mm);
 
 /*
+ * We are going to do a memory preserving reboot. So, we copy over the
+ * first xxMB of memory into a backup location.
+ */
+void __relocate_base_mem(unsigned long backup_addr, unsigned long backup_size)
+{
+	unsigned long pfn, pfn_max;
+	void *src_addr, *dest_addr;
+	struct page *page;
+
+	pfn_max = backup_size >> PAGE_SHIFT;
+	for (pfn = 0; pfn < pfn_max; pfn++) {
+		src_addr = phys_to_virt(pfn << PAGE_SHIFT);
+		dest_addr = backup_addr + src_addr;
+		if (!pfn_valid(pfn))
+			continue;
+		page = pfn_to_page(pfn);
+		if (PageReserved(page))
+			copy_page(dest_addr, src_addr);
+	}
+}
+
+/*
  * Do not allocate memory (or fail in any way) in machine_kexec().
  * We are past the point of no return, committed to rebooting now.
  */
@@ -94,6 +118,13 @@ void machine_kexec(struct kimage *image)
 	reboot_code_buffer = page_to_pfn(image->reboot_code_pages) << PAGE_SHIFT;
 	indirection_page = image->head & PAGE_MASK;
 
+	/*
+	 * If we are here to do a crash dump, save the memory from
+	 * 0-16MB before we copy over the kexec kernel image.  Otherwise
+	 * our dump will show the wrong kernel entirely.
+	 */
+	relocate_base_mem();
+
 	/* copy it out */
 	memcpy((void *)reboot_code_buffer, relocate_new_kernel, relocate_new_kernel_size);
 
diff -puN arch/i386/kernel/setup.c~kd-reb-268 arch/i386/kernel/setup.c
--- linux-2.6.8.1/arch/i386/kernel/setup.c~kd-reb-268	2004-08-17 17:04:35.000000000 +0530
+++ linux-2.6.8.1-hari/arch/i386/kernel/setup.c	2004-08-17 17:05:03.000000000 +0530
@@ -39,6 +39,7 @@
 #include <linux/efi.h>
 #include <linux/init.h>
 #include <linux/edd.h>
+#include <linux/crash_dump.h>
 #include <video/edid.h>
 #include <asm/e820.h>
 #include <asm/mpspec.h>
@@ -56,6 +57,7 @@
 unsigned long init_pg_tables_end __initdata = ~0UL;
 
 int disable_pse __initdata = 0;
+unsigned int dump_enabled;
 
 /*
  * Machine setup..
@@ -347,6 +349,9 @@ static void __init limit_regions(unsigne
 			}
 		}
 	}
+	/* If we are doing a crash dump, we still need to know the real mem size*/
+	set_saved_max_pfn();
+
 	for (i = 0; i < e820.nr_map; i++) {
 		if (e820.map[i].type == E820_RAM) {
 			current_addr = e820.map[i].addr + e820.map[i].size;
@@ -809,6 +814,9 @@ static void __init parse_cmdline_early (
 		if (c == ' ' && !memcmp(from, "highmem=", 8))
 			highmem_pages = memparse(from+8, &from) >> PAGE_SHIFT;
 	
+		if (!memcmp(from, "dump", 4))
+			dump_enabled = 1;
+
 		c = *(from++);
 		if (!c)
 			break;
@@ -1082,6 +1090,9 @@ static unsigned long __init setup_memory
 		}
 	}
 #endif
+
+	crash_reserve_bootmem();
+
 	return max_low_pfn;
 }
 #else
diff -puN /dev/null include/asm-i386/crash_dump.h
--- /dev/null	2003-01-30 15:54:37.000000000 +0530
+++ linux-2.6.8.1-hari/include/asm-i386/crash_dump.h	2004-08-17 17:05:03.000000000 +0530
@@ -0,0 +1,39 @@
+/* asm-i386/crash_dump.h */
+#include <linux/bootmem.h>
+
+extern unsigned int dump_enabled;
+
+unsigned long __init find_max_low_pfn(void);
+void __init find_max_pfn(void);
+void __relocate_base_mem(unsigned long, unsigned long);
+
+extern unsigned int crashed;
+
+#ifdef CONFIG_CRASH_DUMP
+#define CRASH_BACKUP_BASE ((unsigned long)CONFIG_BACKUP_BASE * 0x100000)
+#define CRASH_BACKUP_SIZE ((unsigned long)CONFIG_BACKUP_SIZE * 0x100000)
+
+static inline void relocate_base_mem(void)
+{
+	if (crashed)
+		__relocate_base_mem(CRASH_BACKUP_BASE, CRASH_BACKUP_SIZE);
+}
+
+static inline void set_saved_max_pfn(void)
+{
+	find_max_pfn();
+	saved_max_pfn = find_max_low_pfn();
+}
+
+static inline void crash_reserve_bootmem(void)
+{
+	if (!dump_enabled)
+		reserve_bootmem(CRASH_BACKUP_BASE, CRASH_BACKUP_SIZE);
+}
+#else
+#define CRASH_BACKUP_BASE 0x08000000
+#define CRASH_BACKUP_SIZE 0x01000000
+static inline void relocate_base_mem(void) {}
+static inline void set_saved_max_pfn(void) {}
+static inline void crash_reserve_bootmem(void) {}
+#endif
diff -puN include/linux/bootmem.h~kd-reb-268 include/linux/bootmem.h
--- linux-2.6.8.1/include/linux/bootmem.h~kd-reb-268	2004-08-17 17:04:36.000000000 +0530
+++ linux-2.6.8.1-hari/include/linux/bootmem.h	2004-08-17 17:05:03.000000000 +0530
@@ -21,6 +21,7 @@ extern unsigned long min_low_pfn;
  * highest page
  */
 extern unsigned long max_pfn;
+extern unsigned long saved_max_pfn;
 
 /*
  * node_bootmem_map is a map pointer - the bits represent all physical 
diff -puN /dev/null include/linux/crash_dump.h
--- /dev/null	2003-01-30 15:54:37.000000000 +0530
+++ linux-2.6.8.1-hari/include/linux/crash_dump.h	2004-08-17 17:05:03.000000000 +0530
@@ -0,0 +1,27 @@
+#include <linux/kexec.h>
+#include <linux/smp_lock.h>
+#include <linux/device.h>
+#include <asm/crash_dump.h>
+
+#ifdef CONFIG_CRASH_DUMP
+static inline void crash_machine_kexec(void)
+{
+	struct kimage *image;
+
+	if (crashed)
+		return;
+
+        image = xchg(&kexec_image, 0);
+        if (image) {
+		crashed = 1;
+		device_shutdown();
+		printk(KERN_EMERG "kexec: opening parachute\n");
+		machine_kexec(image);
+		while (1);
+        } else {
+		printk(KERN_EMERG "kexec: No kernel image loaded!\n");
+        }
+}
+#else
+static inline void crash_machine_kexec(void) {}
+#endif
diff -puN kernel/panic.c~kd-reb-268 kernel/panic.c
--- linux-2.6.8.1/kernel/panic.c~kd-reb-268	2004-08-17 17:04:36.000000000 +0530
+++ linux-2.6.8.1-hari/kernel/panic.c	2004-08-17 17:05:03.000000000 +0530
@@ -19,10 +19,13 @@
 #include <linux/syscalls.h>
 #include <linux/interrupt.h>
 #include <linux/nmi.h>
+#include <linux/kexec.h>
+#include <linux/crash_dump.h>
 
 int panic_timeout;
 int panic_on_oops;
 int tainted;
+unsigned int crashed;
 
 EXPORT_SYMBOL(panic_timeout);
 
@@ -68,6 +71,9 @@ NORET_TYPE void panic(const char * fmt, 
 		sys_sync();
 	bust_spinlocks(0);
 
+	/* If we have crashed, perform a kexec reboot, for dump write-out */
+	crash_machine_kexec();
+
 #ifdef CONFIG_SMP
 	smp_send_stop();
 #endif
diff -puN mm/bootmem.c~kd-reb-268 mm/bootmem.c
--- linux-2.6.8.1/mm/bootmem.c~kd-reb-268	2004-08-17 17:04:36.000000000 +0530
+++ linux-2.6.8.1-hari/mm/bootmem.c	2004-08-17 17:05:03.000000000 +0530
@@ -27,6 +27,11 @@
 unsigned long max_low_pfn;
 unsigned long min_low_pfn;
 unsigned long max_pfn;
+/*
+ * If we have booted due to a crash, max_pfn will be a very low value. We need
+ * to know the amount of memory that the previous kernel used.
+ */
+unsigned long saved_max_pfn;
 
 EXPORT_SYMBOL(max_pfn);		/* This is exported so
 				 * dma_get_required_mask(), which uses

_


end of thread, other threads:[~2004-09-20 19:57 UTC | newest]

Thread overview: 22+ messages
2004-09-15 12:50 kexec based crash dumping Hariprasad Nellitheertha
2004-09-15 12:51 ` [PATCH][1/6]Documentation Hariprasad Nellitheertha
2004-09-15 12:53   ` [PATCH][2/6]Memory preserving reboot using kexec Hariprasad Nellitheertha
2004-09-15 12:54     ` [PATCH][3/6]Routines for copying the dump pages Hariprasad Nellitheertha
2004-09-15 12:55       ` [PATCH][4/6]Register snapshotting before kexec boot Hariprasad Nellitheertha
2004-09-15 12:56         ` [PATCH][5/6]ELF format dump file access Hariprasad Nellitheertha
2004-09-15 12:57           ` [PATCH][6/6]Linear/raw " Hariprasad Nellitheertha
2004-09-15 21:28           ` [PATCH][5/6]ELF " Andrew Morton
2004-09-15 21:29           ` Andrew Morton
2004-09-15 21:31           ` Andrew Morton
2004-09-15 21:27         ` [PATCH][4/6]Register snapshotting before kexec boot Andrew Morton
2004-09-16  8:11           ` [Fastboot] " Dipankar Sarma
2004-09-17 14:53             ` Srivatsa Vaddagiri
2004-09-19 20:17               ` Eric W. Biederman
2004-09-15 21:23       ` [PATCH][3/6]Routines for copying the dump pages Andrew Morton
2004-09-15 21:22     ` [PATCH][2/6]Memory preserving reboot using kexec Andrew Morton
2004-09-19 20:37     ` [Fastboot] " Eric W. Biederman
2004-09-20 13:49       ` Hariprasad Nellitheertha
2004-09-20 19:53         ` Eric W. Biederman
2004-09-15 17:33 ` [Fastboot] kexec based crash dumping Eric W. Biederman
  -- strict thread matches above, loose matches on Subject: below --
2004-08-17 12:04 [RFC]Kexec " Hariprasad Nellitheertha
2004-08-17 12:05 ` [PATCH][1/6]Documentation Hariprasad Nellitheertha
2004-08-17 12:07   ` [PATCH][2/6]Memory preserving reboot using kexec Hariprasad Nellitheertha
2004-08-17 16:01     ` Dave Hansen
