All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel
@ 2014-07-28  8:19 Baoquan He
  2014-07-28  8:20 ` [Patch v3 1/7] initialize pfn_memhole in get_num_dumpable_cyclic Baoquan He
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:19 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

Recently people complained that they don't know how to decide how
much disk size need be reserved for kdump. E.g there are lots of
machines with different memory size, if the memory usage information
of current system can be shown, that can help them to make an estimate
how much storage space need be reserved.
    
In this patchset, a new interface is added into makedumpfile. By the
help of this, people can know the page number of memory in different
use. The implementation is analyzing the "System Ram" and "kernel text"
program segment of /proc/kcore excluding the crashkernel range, then
calculating the page number of different kind per vmcoreinfo.

Previouly, patchset v1 was posted. And patch 7/7 has a change in v2.
So several changes are made in this v3 post per comments from Vivek
and Atsushi.

[patch 3/7] preparation functions for parsing vmcoreinfo
v1->v3: 
    Since get_kernel_version need be called to get page_offset
    before initial() in mem_usage code flow, and surely it will be called
    inside initial() again. Add a static variable to avoid this duplicate
    calling.

[patch 5/7] prepare the dump loads for kcore analysis
v1->v3:
    Fix the compiler warnings.

[patch 6/7] implement a function to print the memory usage
v1->v3:
    Adjust the printing content and format of dumpable page numbers per Vivek's
    comments.

[patch 7/7]
v1->v2:
    Set info->dump_level=MAX_DUMP_LEVEL, with MAX_DUMP_LEVEL all kinds of
    memory can be calculated. 
v2->v3:
    Add the description of this feature into help message and man page.

Baoquan He (7):
  initialize pfn_memhole in get_num_dumpable_cyclic
  functions to get crashkernel memory range
  preparation functions for parsing vmcoreinfo
  set vmcoreinfo for kcore
  prepare the dump loads for kcore analysis
  implement a function to print the memory usage
  add a new interface to show the memory usage of 1st kernel

 elf_info.c     | 231 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 elf_info.h     |   3 +
 makedumpfile.8 |  17 ++++
 makedumpfile.c | 247 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 makedumpfile.h |  10 +++
 print_info.c   |   8 ++
 6 files changed, 516 insertions(+)

-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Patch v3 1/7] initialize pfn_memhole in get_num_dumpable_cyclic
  2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
@ 2014-07-28  8:20 ` Baoquan He
  2014-07-28  8:20 ` [Patch v3 2/7] functions to get crashkernel memory range Baoquan He
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:20 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

This is a code bug. In initialize_2nd_bitmap_cyclic pfn_memhole is
calculated, however it's not initialized before that. If an available
pfn_memhole is wanted after get_num_dumpable_cyclic invocation,
initializing pfn_memhole in get_num_dumpable_cyclic is necessary.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 makedumpfile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/makedumpfile.c b/makedumpfile.c
index 3884aa5..760bfd1 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -5588,6 +5588,8 @@ get_num_dumpable_cyclic(void)
 	mdf_pfn_t pfn, num_dumpable=0;
 	struct cycle cycle = {0};
 
+	pfn_memhole = info->max_mapnr;
+
 	for_each_cycle(0, info->max_mapnr, &cycle)
 	{
 		if (!exclude_unnecessary_pages_cyclic(&cycle))
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Patch v3 2/7] functions to get crashkernel memory range
  2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
  2014-07-28  8:20 ` [Patch v3 1/7] initialize pfn_memhole in get_num_dumpable_cyclic Baoquan He
@ 2014-07-28  8:20 ` Baoquan He
  2014-08-01  7:32   ` Atsushi Kumagai
  2014-07-28  8:20 ` [Patch v3 3/7] preparation functions for parsing vmcoreinfo Baoquan He
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:20 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

These functions are used to parse /proc/iomem code and get memory
ranges of specific type. They are implemented in kexec-tools and
borrowed here to get the crashkernel memory range. Since crashkernel
memory range should be excluded from dumpable memory ranges.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 makedumpfile.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 makedumpfile.h |  7 +++++
 2 files changed, 89 insertions(+)

diff --git a/makedumpfile.c b/makedumpfile.c
index 760bfd1..220570e 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -8980,6 +8980,88 @@ calculate_cyclic_buffer_size(void) {
 	return TRUE;
 }
 
+
+
+//#define CRASH_RESERVED_MEM_NR   8
+struct memory_range crash_reserved_mem[CRASH_RESERVED_MEM_NR];
+int crash_reserved_mem_nr;
+
+/*
+ * iomem_for_each_line()
+ *
+ * Iterate over each line in the file returned by proc_iomem(). If match is
+ * NULL or if the line matches with our match-pattern then call the
+ * callback if non-NULL.
+ *
+ * Return the number of lines matched.
+ */
+int iomem_for_each_line(char *match,
+			      int (*callback)(void *data,
+					      int nr,
+					      char *str,
+					      unsigned long base,
+					      unsigned long length),
+			      void *data)
+{
+	const char iomem[] = "/proc/iomem";
+	char line[BUFSIZE_FGETS];
+	FILE *fp;
+	unsigned long long start, end, size;
+	char *str;
+	int consumed;
+	int count;
+	int nr = 0;
+
+	fp = fopen(iomem, "r");
+	if (!fp) {
+		ERRMSG("Cannot open %s\n", iomem);
+		exit(1);
+	}
+
+	while(fgets(line, sizeof(line), fp) != 0) {
+		count = sscanf(line, "%Lx-%Lx : %n", &start, &end, &consumed);
+		if (count != 2)
+			continue;
+		str = line + consumed;
+		size = end - start + 1;
+		if (!match || memcmp(str, match, strlen(match)) == 0) {
+			if (callback
+			    && callback(data, nr, str, start, size) < 0) {
+				break;
+			}
+			nr++;
+		}
+	}
+
+	fclose(fp);
+
+	return nr;
+}
+
+static int crashkernel_mem_callback(void *data, int nr,
+                                          char *str,
+                                          unsigned long base,
+                                          unsigned long length)
+{
+        if (nr >= CRASH_RESERVED_MEM_NR)
+                return 1;
+
+        crash_reserved_mem[nr].start = base;
+        crash_reserved_mem[nr].end   = base + length - 1;
+        return 0;
+}
+
+int is_crashkernel_mem_reserved(void)
+{
+        int ret;
+
+        ret = iomem_for_each_line("Crash kernel\n",
+                                        crashkernel_mem_callback, NULL);
+        crash_reserved_mem_nr = ret;
+
+        return !!crash_reserved_mem_nr;
+}
+
 static struct option longopts[] = {
 	{"split", no_argument, NULL, OPT_SPLIT},
 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
diff --git a/makedumpfile.h b/makedumpfile.h
index 9402f05..7ffa1ee 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1452,6 +1452,13 @@ extern struct array_table	array_table;
 extern struct number_table	number_table;
 extern struct srcfile_table	srcfile_table;
 
+struct memory_range {
+        unsigned long long start, end;
+};
+
+#define CRASH_RESERVED_MEM_NR   8
+struct memory_range crash_reserved_mem[CRASH_RESERVED_MEM_NR];
+int crash_reserved_mem_nr;
 
 int readmem(int type_addr, unsigned long long addr, void *bufptr, size_t size);
 int get_str_osrelease_from_vmlinux(void);
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Patch v3 3/7] preparation functions for parsing vmcoreinfo
  2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
  2014-07-28  8:20 ` [Patch v3 1/7] initialize pfn_memhole in get_num_dumpable_cyclic Baoquan He
  2014-07-28  8:20 ` [Patch v3 2/7] functions to get crashkernel memory range Baoquan He
@ 2014-07-28  8:20 ` Baoquan He
  2014-08-01  7:12   ` Atsushi Kumagai
  2014-07-28  8:20 ` [Patch v3 4/7] set vmcoreinfo for kcore Baoquan He
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:20 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

Add 2 preparation functions get_elf_loads and get_page_offset, later
they will be needed for parsing vmcoreinfo.

Meanwhile since get_kernel_version need be called to get page_offset
before initial() in mem_usage code flow, and surely it will be called
inside initial() again. Add a static variable to avoid this duplicate
calling.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 elf_info.c     | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 elf_info.h     |  1 +
 makedumpfile.c | 23 +++++++++++++++++++++++
 3 files changed, 74 insertions(+)

diff --git a/elf_info.c b/elf_info.c
index b277f69..69d3fdb 100644
--- a/elf_info.c
+++ b/elf_info.c
@@ -681,6 +681,56 @@ get_elf32_ehdr(int fd, char *filename, Elf32_Ehdr *ehdr)
 	return TRUE;
 }
 
+int
+get_elf_loads(int fd, char *filename)
+{
+	int i, j, phnum, elf_format;
+	Elf64_Phdr phdr;
+
+	/*
+	 * Check ELF64 or ELF32.
+	 */
+	elf_format = check_elf_format(fd, filename, &phnum, &num_pt_loads);
+	if (elf_format == ELF64)
+		flags_memory |= MEMORY_ELF64;
+	else if (elf_format != ELF32)
+		return FALSE;
+
+	if (!num_pt_loads) {
+		ERRMSG("Can't get the number of PT_LOAD.\n");
+		return FALSE;
+	}
+
+	/*
+	 * The below file information will be used as /proc/vmcore.
+	 */
+	fd_memory   = fd;
+	name_memory = filename;
+
+	pt_loads = calloc(sizeof(struct pt_load_segment), num_pt_loads);
+	if (pt_loads == NULL) {
+		ERRMSG("Can't allocate memory for the PT_LOAD. %s\n",
+		    strerror(errno));
+		return FALSE;
+	}
+	for (i = 0, j = 0; i < phnum; i++) {
+		if (!get_phdr_memory(i, &phdr))
+			return FALSE;
+
+		if (phdr.p_type != PT_LOAD)
+			continue;
+
+		if (j >= num_pt_loads)
+			return FALSE;
+		if(!dump_Elf_load(&phdr, j))
+			return FALSE;
+		j++;
+	}
+
+	return TRUE;
+}
+
+
 /*
  * Get ELF information about /proc/vmcore.
  */
diff --git a/elf_info.h b/elf_info.h
index 801faff..263d993 100644
--- a/elf_info.h
+++ b/elf_info.h
@@ -44,6 +44,7 @@ int get_elf64_ehdr(int fd, char *filename, Elf64_Ehdr *ehdr);
 int get_elf32_ehdr(int fd, char *filename, Elf32_Ehdr *ehdr);
 int get_elf_info(int fd, char *filename);
 void free_elf_info(void);
+int get_elf_loads(int fd, char *filename);
 
 int is_elf64_memory(void);
 int is_xen_memory(void);
diff --git a/makedumpfile.c b/makedumpfile.c
index 220570e..78aa7a5 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -681,6 +681,10 @@ get_kernel_version(char *release)
 	int32_t version;
 	long maj, min, rel;
 	char *start, *end;
+	static int done = 0;
+
+	if (done)
+		return info->kernel_version;
 
 	/*
 	 * This method checks that vmlinux and vmcore are same kernel version.
@@ -706,6 +710,9 @@ get_kernel_version(char *release)
 		MSG("The kernel version is not supported.\n");
 		MSG("The created dumpfile may be incomplete.\n");
 	}
+
+	done = 1;
+
 	return version;
 }
 
@@ -9062,6 +9069,22 @@ int is_crashkernel_mem_reserved(void)
         return !!crash_reserved_mem_nr;
 }
 
+static int get_page_offset()
+{
+#ifdef __x86_64__
+	struct utsname utsname;
+	if (uname(&utsname)) {
+		ERRMSG("Cannot get name and information about current kernel : %s", strerror(errno));
+		return FALSE;
+	}
+
+	info->kernel_version = get_kernel_version(utsname.release);
+	get_versiondep_info_x86_64();
+#endif /* x86_64 */
+
+	return TRUE;
+}
+
 static struct option longopts[] = {
 	{"split", no_argument, NULL, OPT_SPLIT},
 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Patch v3 4/7] set vmcoreinfo for kcore
  2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
                   ` (2 preceding siblings ...)
  2014-07-28  8:20 ` [Patch v3 3/7] preparation functions for parsing vmcoreinfo Baoquan He
@ 2014-07-28  8:20 ` Baoquan He
  2014-08-01  7:12   ` Atsushi Kumagai
  2014-07-28  8:20 ` [Patch v3 5/7] prepare the dump loads for kcore analysis Baoquan He
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:20 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

In vmcore dumping, note program of vmcoreinfo is set in elf header
of /proc/vmcore. In 1st kernel, the vmcoreinfo is also needed for
kcore analyzing. So in this patch information of vmcoreinfo is
parsed and set in offset_vmcoreinfo and size_vmcoreinfo.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 elf_info.c     | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 elf_info.h     |  1 +
 makedumpfile.c | 29 +++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+)

diff --git a/elf_info.c b/elf_info.c
index 69d3fdb..edbfc97 100644
--- a/elf_info.c
+++ b/elf_info.c
@@ -395,6 +395,53 @@ get_pt_note_info(void)
 	return TRUE;
 }
 
+#define UNINITIALIZED  ((ulong)(-1))
+int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len)
+{
+	int i;
+	ulong kvaddr;
+	off_t offset;
+	char note[MAX_SIZE_NHDR];
+	int size_desc;
+	off_t offset_desc;
+
+	offset = UNINITIALIZED;
+	kvaddr = (ulong)vmcoreinfo_addr | PAGE_OFFSET;
+
+	for (i = 0; i < num_pt_loads; ++i) {
+		struct pt_load_segment *p = &pt_loads[i];
+		if ((kvaddr >= p->virt_start) && (kvaddr < p->virt_end)) {
+			offset = (off_t)(kvaddr - p->virt_start) +
+			(off_t)p->file_offset;
+			break;
+		}
+	}
+
+	if (offset == UNINITIALIZED){
+		ERRMSG("Can't seek the dump memory(%s). %s\n",
+		    name_memory, strerror(errno));
+		return FALSE;
+	}
+
+        if (lseek(fd_memory, offset, SEEK_SET) != offset){
+		ERRMSG("Can't seek the dump memory(%s). %s\n",
+		    name_memory, strerror(errno));
+		return FALSE;
+	}
+
+	if (read(fd_memory, note, MAX_SIZE_NHDR) != MAX_SIZE_NHDR){
+		ERRMSG("Can't read the dump memory(%s). %s\n",
+		    name_memory, strerror(errno));
+		return FALSE;
+	}
+
+	size_desc   = note_descsz(note);
+	offset_desc = offset + offset_note_desc(note);
+
+	set_vmcoreinfo(offset_desc, size_desc);
+
+	return TRUE;
+}
 
 /*
  * External functions.
diff --git a/elf_info.h b/elf_info.h
index 263d993..3ce0138 100644
--- a/elf_info.h
+++ b/elf_info.h
@@ -45,6 +45,7 @@ int get_elf32_ehdr(int fd, char *filename, Elf32_Ehdr *ehdr);
 int get_elf_info(int fd, char *filename);
 void free_elf_info(void);
 int get_elf_loads(int fd, char *filename);
+int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len);
 
 int is_elf64_memory(void);
 int is_xen_memory(void);
diff --git a/makedumpfile.c b/makedumpfile.c
index 78aa7a5..84857e0 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -9085,6 +9085,35 @@ static int get_page_offset()
 	return TRUE;
 }
 
+
+/* Returns the physical address of start of crash notes buffer for a kernel. */
+static int get_sys_kernel_vmcoreinfo(uint64_t *addr, uint64_t *len)
+{
+	char line[BUFSIZE_FGETS];
+	int count;
+	FILE *fp;
+	unsigned long long temp, temp2;
+
+	*addr = 0;
+	*len = 0;
+
+	if (!(fp = fopen("/sys/kernel/vmcoreinfo", "r")))
+		return -1;
+
+	if (!fgets(line, sizeof(line), fp))
+		ERRMSG("Cannot parse %s: %s\n", "/sys/kernel/vmcoreinfo", strerror(errno));
+	count = sscanf(line, "%Lx %Lx", &temp, &temp2);
+	if (count != 2)
+		ERRMSG("Cannot parse %s: %s\n", "/sys/kernel/vmcoreinfo", strerror(errno));
+
+	*addr = (uint64_t) temp;
+	*len = (uint64_t) temp2;
+
+	fclose(fp);
+	return 0;
+}
+
+
 static struct option longopts[] = {
 	{"split", no_argument, NULL, OPT_SPLIT},
 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Patch v3 5/7] prepare the dump loads for kcore analysis
  2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
                   ` (3 preceding siblings ...)
  2014-07-28  8:20 ` [Patch v3 4/7] set vmcoreinfo for kcore Baoquan He
@ 2014-07-28  8:20 ` Baoquan He
  2014-08-01  7:12   ` Atsushi Kumagai
  2014-07-28  8:20 ` [Patch v3 6/7] implement a function to print the memory usage Baoquan He
  2014-07-28  8:20 ` [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel Baoquan He
  6 siblings, 1 reply; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:20 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

In kcore, only  "System Ram" and "kernel text" program segments
are needed. And to be more precise, exclude the crashkernel
memory range.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 elf_info.c     | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 elf_info.h     |   1 +
 makedumpfile.h |   1 +
 3 files changed, 136 insertions(+)

diff --git a/elf_info.c b/elf_info.c
index edbfc97..6216f51 100644
--- a/elf_info.c
+++ b/elf_info.c
@@ -777,6 +777,140 @@ get_elf_loads(int fd, char *filename)
 	return TRUE;
 }
 
+static int exclude_segment(struct pt_load_segment **pt_loads, unsigned int	*num_pt_loads, uint64_t start, uint64_t end)
+{
+        int i, j, tidx = -1;
+	unsigned long long	vstart, vend, kvstart, kvend;
+        struct pt_load_segment temp_seg = {0};
+	kvstart = (ulong)start | PAGE_OFFSET;
+	kvend = (ulong)end | PAGE_OFFSET;
+	unsigned long size;
+
+        for (i = 0; i < (*num_pt_loads); i++) {
+                vstart = (*pt_loads)[i].virt_start;
+                vend = (*pt_loads)[i].virt_end;
+                if (kvstart <  vend && kvend > vstart) {
+                        if (kvstart != vstart && kvend != vend) {
+				/* Split load segment */
+				temp_seg.phys_start = end +1;
+				temp_seg.phys_end = (*pt_loads)[i].phys_end;
+				temp_seg.virt_start = kvend + 1;
+				temp_seg.virt_end = vend;
+				temp_seg.file_offset = (*pt_loads)[i].file_offset + temp_seg.virt_start - (*pt_loads)[i].virt_start;
+
+				(*pt_loads)[i].virt_end = kvstart - 1;
+				(*pt_loads)[i].phys_end =  start -1;
+
+				tidx = i+1;
+                        } else if (kvstart != vstart) {
+				(*pt_loads)[i].phys_end = start - 1;
+				(*pt_loads)[i].virt_end = kvstart - 1;
+                        } else {
+				(*pt_loads)[i].phys_start = end + 1;
+				(*pt_loads)[i].virt_start = kvend + 1;
+                        }
+                }
+        }
+        /* Insert split load segment, if any. */
+	if (tidx >= 0) {
+		size = (*num_pt_loads + 1) * sizeof((*pt_loads)[0]);
+		(*pt_loads) = realloc((*pt_loads), size);
+		if  (!(*pt_loads) ) {
+		    ERRMSG("Cannot realloc %ld bytes: %s\n",
+		            size + 0UL, strerror(errno));
+			exit(1);
+		}
+		for (j = (*num_pt_loads - 1); j >= tidx; j--)
+		        (*pt_loads)[j+1] = (*pt_loads)[j];
+		(*pt_loads)[tidx] = temp_seg;
+		(*num_pt_loads)++;
+        }
+        return 0;
+}
+
+static int
+process_dump_load(struct pt_load_segment	*pls)
+{
+	unsigned long long paddr;
+
+	paddr = vaddr_to_paddr(pls->virt_start);
+	pls->phys_start  = paddr;
+	pls->phys_end    = paddr + (pls->virt_end - pls->virt_start);
+	DEBUG_MSG("process_dump_load\n");
+	DEBUG_MSG("  phys_start : %llx\n", pls->phys_start);
+	DEBUG_MSG("  phys_end   : %llx\n", pls->phys_end);
+	DEBUG_MSG("  virt_start : %llx\n", pls->virt_start);
+	DEBUG_MSG("  virt_end   : %llx\n", pls->virt_end);
+
+	return TRUE;
+}
+
+int get_kcore_dump_loads()
+{
+	struct pt_load_segment	*pls;
+	int i, j, loads=0;
+
+	for (i = 0; i < num_pt_loads; ++i) {
+		struct pt_load_segment *p = &pt_loads[i];
+		if (is_vmalloc_addr(p->virt_start))
+			continue;
+		loads++;
+	}
+
+	pls = calloc(sizeof(struct pt_load_segment), loads);
+	if (pls == NULL) {
+		ERRMSG("Can't allocate memory for the PT_LOAD. %s\n",
+		    strerror(errno));
+		return FALSE;
+	}
+
+	for (i = 0, j=0; i < num_pt_loads; ++i) {
+		struct pt_load_segment *p = &pt_loads[i];
+		if (is_vmalloc_addr(p->virt_start))
+			continue;
+		if (j >= loads)
+			return FALSE;
+
+		if (j == 0) {
+			offset_pt_load_memory = p->file_offset;
+			if (offset_pt_load_memory == 0) {
+				ERRMSG("Can't get the offset of page data.\n");
+				return FALSE;
+			}
+		}
+
+		pls[j] = *p;
+		process_dump_load(&pls[j]);
+		j++;
+	}
+
+	free(pt_loads);
+	pt_loads = pls;
+	num_pt_loads = loads;
+
+	for (i=0; i<crash_reserved_mem_nr; i++)
+	{
+		exclude_segment(&pt_loads, &num_pt_loads, crash_reserved_mem[i].start, crash_reserved_mem[i].end);
+	}
+
+	max_file_offset = 0;
+	for (i = 0; i < num_pt_loads; ++i) {
+		struct pt_load_segment *p = &pt_loads[i];
+		max_file_offset = MAX(max_file_offset,
+				      p->file_offset + p->phys_end - p->phys_start);
+	}
+
+	for (i = 0; i < num_pt_loads; ++i) {
+		struct pt_load_segment *p = &pt_loads[i];
+		DEBUG_MSG("LOAD (%d)\n", i);
+		DEBUG_MSG("  phys_start : %llx\n", p->phys_start);
+		DEBUG_MSG("  phys_end   : %llx\n", p->phys_end);
+		DEBUG_MSG("  virt_start : %llx\n", p->virt_start);
+		DEBUG_MSG("  virt_end   : %llx\n", p->virt_end);
+	}
+
+	return TRUE;
+}
 
 /*
  * Get ELF information about /proc/vmcore.
diff --git a/elf_info.h b/elf_info.h
index 3ce0138..ba27fdf 100644
--- a/elf_info.h
+++ b/elf_info.h
@@ -46,6 +46,7 @@ int get_elf_info(int fd, char *filename);
 void free_elf_info(void);
 int get_elf_loads(int fd, char *filename);
 int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len);
+int get_kcore_dump_loads();
 
 int is_elf64_memory(void);
 int is_xen_memory(void);
diff --git a/makedumpfile.h b/makedumpfile.h
index 7ffa1ee..8881c76 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -719,6 +719,7 @@ unsigned long long vaddr_to_paddr_x86(unsigned long vaddr);
 #endif /* x86 */
 
 #ifdef __x86_64__
+int is_vmalloc_addr(ulong vaddr);
 int get_phys_base_x86_64(void);
 int get_machdep_info_x86_64(void);
 int get_versiondep_info_x86_64(void);
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Patch v3 6/7] implement a function to print the memory usage
  2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
                   ` (4 preceding siblings ...)
  2014-07-28  8:20 ` [Patch v3 5/7] prepare the dump loads for kcore analysis Baoquan He
@ 2014-07-28  8:20 ` Baoquan He
  2014-07-28  8:20 ` [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel Baoquan He
  6 siblings, 0 replies; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:20 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

Introduce print_mem_usage to print the result of analysis of /proc/kcore.
The page number of memory in different use are printed.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 makedumpfile.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/makedumpfile.c b/makedumpfile.c
index 84857e0..b5e920d 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -7837,6 +7837,41 @@ print_report(void)
 	REPORT_MSG("\n");
 }
 
+static void
+print_mem_usage(void)
+{
+	mdf_pfn_t pfn_original, pfn_excluded, shrinking;
+
+	/*
+	* /proc/vmcore doesn't contain the memory hole area.
+	*/
+	pfn_original = info->max_mapnr - pfn_memhole;
+
+	pfn_excluded = pfn_zero + pfn_cache + pfn_cache_private
+	    + pfn_user + pfn_free + pfn_hwpoison;
+	shrinking = (pfn_original - pfn_excluded) * 100;
+	shrinking = shrinking / pfn_original;
+
+	MSG("\n");
+	MSG("Page number of memory in different use\n");
+	MSG("--------------------------------------------------\n");
+	MSG("TYPE		PAGES			EXCLUDABLE	DESCRIPTION\n");
+
+	MSG("ZERO		%-16llu	yes		Pages filled with zero\n", pfn_zero);
+	MSG("CACHE		%-16llu	yes		Cache pages\n", pfn_cache);
+	MSG("CACHE_PRIVATE	%-16llu	yes		Cache pages + private\n",
+	    pfn_cache_private);
+	MSG("USER		%-16llu	yes		User process pages\n", pfn_user);
+	MSG("FREE		%-16llu	yes		Free pages\n", pfn_free);
+	MSG("KERN_DATA	%-16llu	no		Dumpable kernel data \n",
+	    pfn_original - pfn_excluded);
+
+	MSG("\n");
+
+	MSG("Total pages on system:	%-16llu\n", pfn_original);
+	MSG("\n");
+}
+
 int
 writeout_dumpfile(void)
 {
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
                   ` (5 preceding siblings ...)
  2014-07-28  8:20 ` [Patch v3 6/7] implement a function to print the memory usage Baoquan He
@ 2014-07-28  8:20 ` Baoquan He
  2014-07-29 12:43   ` Vivek Goyal
  2014-08-01  7:12   ` Atsushi Kumagai
  6 siblings, 2 replies; 29+ messages in thread
From: Baoquan He @ 2014-07-28  8:20 UTC (permalink / raw)
  To: kexec; +Cc: kumagai-atsushi, Baoquan He, vgoyal

Recently people complained that they don't know how to decide how
much disk size need be reserved for kdump. E.g there are lots of
machines with different memory size, if the memory usage information
of current system can be shown, that can help them to make an estimate
how much storage space need be reserved.

In this patch, a new interface is added into makedumpfile. By the
help of this, people can know the page number of memory in different
use. The implementation is analyzing the "System Ram" and "kernel text"
program segment of /proc/kcore excluding the crashkernel range, then
calculating the page number of different kind per vmcoreinfo.

The print is like below:
->$ ./makedumpfile  --mem-usage  /proc/kcore
Excluding unnecessary pages        : [100.0 %] |

Page number of memory in different use
--------------------------------------------------
TYPE		PAGES			EXCLUDABLE	DESCRIPTION
ZERO		0               	yes		Pages filled with zero
CACHE		562006          	yes		Cache pages
CACHE_PRIVATE	353502          	yes		Cache pages + private
USER		225780          	yes		User process pages
FREE		2761884         	yes		Free pages
KERN_DATA	235873          	no		Dumpable kernel data

Total pages on system:	4139045

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 makedumpfile.8 | 17 +++++++++++++
 makedumpfile.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 makedumpfile.h |  2 ++
 print_info.c   |  8 +++++++
 4 files changed, 103 insertions(+)

diff --git a/makedumpfile.8 b/makedumpfile.8
index 25fe74e..64abbc7 100644
--- a/makedumpfile.8
+++ b/makedumpfile.8
@@ -532,6 +532,23 @@ it is necessary to specfiy [\-x \fIVMLINUX\fR] or [\-i \fIVMCOREINFO\fR].
 # makedumpfile \-\-dump-dmesg -x vmlinux /proc/vmcore dmesgfile
 .br
 
+
+.TP
+\fB\-\-mem-usage\fR
+This option is used to show the page numbers of current system in different
+use. It should be executed in 1st kernel. By the help of this, user can know
+how many pages is dumpable when different dump_level is specified. It analyzes
+the 'System Ram' and 'kernel text' program segment of /proc/kcore excluding
+the crashkernel range, then calculates the page number of different kind per
+vmcoreinfo. So currently /proc/kcore need be specified explicitly.
+
+.br
+.B Example:
+.br
+# makedumpfile \-\-mem-usage /proc/kcore
+.br
+
+
 .TP
 \fB\-\-diskset=VMCORE\fR
 Specify multiple \fIVMCORE\fRs created on sadump diskset configuration
diff --git a/makedumpfile.c b/makedumpfile.c
index b5e920d..6bbf324 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -7853,6 +7853,7 @@ print_mem_usage(void)
 	shrinking = shrinking / pfn_original;
 
 	MSG("\n");
+	MSG("\n");
 	MSG("Page number of memory in different use\n");
 	MSG("--------------------------------------------------\n");
 	MSG("TYPE		PAGES			EXCLUDABLE	DESCRIPTION\n");
@@ -8906,6 +8907,13 @@ check_param_for_creating_dumpfile(int argc, char *argv[])
 		 */
 		info->name_memory   = argv[optind];
 
+	} else if ((argc == optind + 1) && info->flag_mem_usage) {
+		/*
+		* Parameter for showing the page number of memory
+		* in different use from.
+		*/
+		info->name_memory   = argv[optind];
+
 	} else
 		return FALSE;
 
@@ -9148,6 +9156,58 @@ static int get_sys_kernel_vmcoreinfo(uint64_t *addr, uint64_t *len)
 	return 0;
 }
 
+int show_mem_usage(void)
+{
+        uint64_t vmcoreinfo_addr, vmcoreinfo_len;
+
+        if (!is_crashkernel_mem_reserved()) {
+                ERRMSG("No memory is reserved for crashkenrel!\n");
+                return FALSE;
+        }
+
+
+        if (!info->flag_cyclic)
+                info->flag_cyclic = TRUE;
+
+	info->dump_level = MAX_DUMP_LEVEL;
+
+        if (!get_page_offset())
+                return FALSE;
+
+        if (!open_dump_memory())
+                return FALSE;
+
+        if (!get_elf_loads(info->fd_memory, info->name_memory))
+                return FALSE;
+
+        if (get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
+                return FALSE;
+
+        if (!set_kcore_vmcoreinfo(vmcoreinfo_addr, vmcoreinfo_len))
+                return FALSE;
+
+        if (!get_kcore_dump_loads())
+                return FALSE;
+
+        if (!initial())
+                return FALSE;
+
+
+        if (!prepare_bitmap2_buffer_cyclic())
+                return FALSE;
+
+        info->num_dumpable = get_num_dumpable_cyclic();
+
+	free_bitmap2_buffer_cyclic();
+
+        print_mem_usage();
+
+        if (!close_files_for_creating_dumpfile())
+                return FALSE;
+
+        return TRUE;
+}
+
 
 static struct option longopts[] = {
 	{"split", no_argument, NULL, OPT_SPLIT},
@@ -9165,6 +9225,7 @@ static struct option longopts[] = {
 	{"cyclic-buffer", required_argument, NULL, OPT_CYCLIC_BUFFER},
 	{"eppic", required_argument, NULL, OPT_EPPIC},
 	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
+	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
 	{0, 0, 0, 0}
 };
 
@@ -9256,6 +9317,9 @@ main(int argc, char *argv[])
 		case OPT_DUMP_DMESG:
 			info->flag_dmesg = 1;
 			break;
+		case OPT_MEM_USAGE:
+                       info->flag_mem_usage = 1;
+                       break;
 		case OPT_COMPRESS_SNAPPY:
 			info->flag_compress = DUMP_DH_COMPRESSED_SNAPPY;
 			break;
@@ -9396,6 +9460,18 @@ main(int argc, char *argv[])
 
 		MSG("\n");
 		MSG("The dmesg log is saved to %s.\n", info->name_dumpfile);
+	} else if (info->flag_mem_usage) {
+		if (!check_param_for_creating_dumpfile(argc, argv)) {
+			MSG("Commandline parameter is invalid.\n");
+			MSG("Try `makedumpfile --help' for more information.\n");
+			goto out;
+		}
+
+		if (!show_mem_usage())
+			goto out;
+
+		MSG("\n");
+		MSG("Showing page number of memory in different use successfully.\n");
 	} else {
 		if (!check_param_for_creating_dumpfile(argc, argv)) {
 			MSG("Commandline parameter is invalid.\n");
diff --git a/makedumpfile.h b/makedumpfile.h
index 8881c76..ba8c0f9 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -911,6 +911,7 @@ struct DumpInfo {
 	int		flag_force;	     /* overwrite existing stuff */
 	int		flag_exclude_xen_dom;/* exclude Domain-U from xen-kdump */
 	int             flag_dmesg;          /* dump the dmesg log out of the vmcore file */
+	int             flag_mem_usage;  /*show the page number of memory in different use*/
 	int		flag_use_printk_log; /* did we read printk_log symbol name? */
 	int		flag_nospace;	     /* the flag of "No space on device" error */
 	int		flag_vmemmap;        /* kernel supports vmemmap address space */
@@ -1772,6 +1773,7 @@ struct elf_prstatus {
 #define OPT_CYCLIC_BUFFER       OPT_START+11
 #define OPT_EPPIC               OPT_START+12
 #define OPT_NON_MMAP            OPT_START+13
+#define OPT_MEM_USAGE            OPT_START+14
 
 /*
  * Function Prototype.
diff --git a/print_info.c b/print_info.c
index 7592690..29db918 100644
--- a/print_info.c
+++ b/print_info.c
@@ -264,6 +264,14 @@ print_usage(void)
 	MSG("      LOGFILE. If a VMCORE does not contain VMCOREINFO for dmesg, it is\n");
 	MSG("      necessary to specfiy [-x VMLINUX] or [-i VMCOREINFO].\n");
 	MSG("\n");
+	MSG("  [--mem-usage]:\n");
+	MSG("      This option is used to show the page numbers of current system in different\n");
+	MSG("      use. It should be executed in 1st kernel. By the help of this, user can know\n");
+	MSG("      how many pages is dumpable when different dump_level is specified. It analyzes\n");
+	MSG("      the 'System Ram' and 'kernel text' program segment of /proc/kcore excluding\n");
+	MSG("      the crashkernel range, then calculates the page number of different kind per\n");
+	MSG("      vmcoreinfo. So currently /proc/kcore need be specified explicitly.\n");
+	MSG("\n");
 	MSG("  [-D]:\n");
 	MSG("      Print debugging message.\n");
 	MSG("\n");
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-07-28  8:20 ` [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel Baoquan He
@ 2014-07-29 12:43   ` Vivek Goyal
  2014-07-31  2:32     ` Baoquan He
  2014-08-01  7:12   ` Atsushi Kumagai
  1 sibling, 1 reply; 29+ messages in thread
From: Vivek Goyal @ 2014-07-29 12:43 UTC (permalink / raw)
  To: Baoquan He; +Cc: kumagai-atsushi, kexec

On Mon, Jul 28, 2014 at 04:20:06PM +0800, Baoquan He wrote:
> Recently people complained that they don't know how to decide how
> much disk size need be reserved for kdump. E.g there are lots of
> machines with different memory size, if the memory usage information
> of current system can be shown, that can help them to make an estimate
> how much storage space need be reserved.
> 
> In this patch, a new interface is added into makedumpfile. By the
> help of this, people can know the page number of memory in different
> use. The implementation is analyzing the "System Ram" and "kernel text"
> program segment of /proc/kcore excluding the crashkernel range, then
> calculating the page number of different kind per vmcoreinfo.
> 
> The print is like below:
> ->$ ./makedumpfile  --mem-usage  /proc/kcore
> Excluding unnecessary pages        : [100.0 %] |

I think above message is now unnecessary. In fact we are not excluding
any pages.

> 
> Page number of memory in different use
> --------------------------------------------------

Above is not required.


> TYPE		PAGES			EXCLUDABLE	DESCRIPTION

We probably should put dashes under these headers

TYPE		PAGES			EXCLUDABLE	DESCRIPTION
====		=====			==========	===========

> ZERO		0               	yes		Pages filled with zero
> CACHE		562006          	yes		Cache pages
> CACHE_PRIVATE	353502          	yes		Cache pages + private
> USER		225780          	yes		User process pages
> FREE		2761884         	yes		Free pages
> KERN_DATA	235873          	no		Dumpable kernel data

What's "Dumpable kernel data" ? Are we saying they are kernel pages which
can't be filtered?

Why not simply call them "kernel data" or "kernel pages" 


> 
> Total pages on system:	4139045

How about "Total number of pages".

Otherwise this output looks much better than previous version. Thanks for
the changes.

Vivek

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-07-29 12:43   ` Vivek Goyal
@ 2014-07-31  2:32     ` Baoquan He
  0 siblings, 0 replies; 29+ messages in thread
From: Baoquan He @ 2014-07-31  2:32 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: kexec, kumagai-atsushi

On 07/29/14 at 08:43am, Vivek Goyal wrote:
> On Mon, Jul 28, 2014 at 04:20:06PM +0800, Baoquan He wrote:
> > Recently people complained that they don't know how to decide how
> > much disk size need be reserved for kdump. E.g there are lots of
> > machines with different memory size, if the memory usage information
> > of current system can be shown, that can help them to make an estimate
> > how much storage space need be reserved.
> > 
> > In this patch, a new interface is added into makedumpfile. By the
> > help of this, people can know the page number of memory in different
> > use. The implementation is analyzing the "System Ram" and "kernel text"
> > program segment of /proc/kcore excluding the crashkernel range, then
> > calculating the page number of different kind per vmcoreinfo.
> > 
> > The print is like below:
> > ->$ ./makedumpfile  --mem-usage  /proc/kcore
> > Excluding unnecessary pages        : [100.0 %] |
> 
> I think above message is now unnecessary. In fact we are not excluding
> any pages.

I reused the function get_num_dumpable_cyclic(). It will iterate each
pages to check whether it's dumpable or not based on the dump_level
specified. While above message is printed inside a internal function. I
don't think it's very necessary, but also I don't think it will misguide
people to think of too much about it. It just tell the progress of
filtering, and could be a little helpful if it's working on a large
memory, say more than 1T memory.

> 
> > 
> > Page number of memory in different use
> > --------------------------------------------------
> 
> Above is not required.

OK, will remove it.

> 
> 
> > TYPE		PAGES			EXCLUDABLE	DESCRIPTION
> 
> We probably should put dashes under these headers
> 
> TYPE		PAGES			EXCLUDABLE	DESCRIPTION
> ====		=====			==========	===========

I am fine with this, will add dashes to decorate it.


> 
> > ZERO		0               	yes		Pages filled with zero
> > CACHE		562006          	yes		Cache pages
> > CACHE_PRIVATE	353502          	yes		Cache pages + private
> > USER		225780          	yes		User process pages
> > FREE		2761884         	yes		Free pages
> > KERN_DATA	235873          	no		Dumpable kernel data
> 
> What's "Dumpable kernel data" ? Are we saying they are kernel pages which
> can't be filtered?
> 
> Why not simply call them "kernel data" or "kernel pages" 

Yes, it's misled. WIll change.


> 
> 
> > 
> > Total pages on system:	4139045
> 
> How about "Total number of pages".

Will change.

> 
> Otherwise this output looks much better than previous version. Thanks for
> the changes.
> 
> Vivek
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Patch v3 5/7] prepare the dump loads for kcore analysis
  2014-07-28  8:20 ` [Patch v3 5/7] prepare the dump loads for kcore analysis Baoquan He
@ 2014-08-01  7:12   ` Atsushi Kumagai
  2014-08-12 10:10     ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: Atsushi Kumagai @ 2014-08-01  7:12 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

>In kcore, only  "System Ram" and "kernel text" program segments
>are needed. And to be more precise, exclude the crashkernel
>memory range.
>
>Signed-off-by: Baoquan He <bhe@redhat.com>
>---
> elf_info.c     | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> elf_info.h     |   1 +
> makedumpfile.h |   1 +
> 3 files changed, 136 insertions(+)
>
>diff --git a/elf_info.c b/elf_info.c
>index edbfc97..6216f51 100644
>--- a/elf_info.c
>+++ b/elf_info.c
>@@ -777,6 +777,140 @@ get_elf_loads(int fd, char *filename)
> 	return TRUE;
> }
>
>+static int exclude_segment(struct pt_load_segment **pt_loads, unsigned int	*num_pt_loads, uint64_t start,
>uint64_t end)
>+{
>+        int i, j, tidx = -1;
>+	unsigned long long	vstart, vend, kvstart, kvend;
>+        struct pt_load_segment temp_seg = {0};
>+	kvstart = (ulong)start | PAGE_OFFSET;
>+	kvend = (ulong)end | PAGE_OFFSET;
>+	unsigned long size;
>+
>+        for (i = 0; i < (*num_pt_loads); i++) {
>+                vstart = (*pt_loads)[i].virt_start;
>+                vend = (*pt_loads)[i].virt_end;
>+                if (kvstart <  vend && kvend > vstart) {
>+                        if (kvstart != vstart && kvend != vend) {
>+				/* Split load segment */
>+				temp_seg.phys_start = end +1;
>+				temp_seg.phys_end = (*pt_loads)[i].phys_end;
>+				temp_seg.virt_start = kvend + 1;
>+				temp_seg.virt_end = vend;
>+				temp_seg.file_offset = (*pt_loads)[i].file_offset + temp_seg.virt_start -
>(*pt_loads)[i].virt_start;
>+
>+				(*pt_loads)[i].virt_end = kvstart - 1;
>+				(*pt_loads)[i].phys_end =  start -1;
>+
>+				tidx = i+1;
>+                        } else if (kvstart != vstart) {
>+				(*pt_loads)[i].phys_end = start - 1;
>+				(*pt_loads)[i].virt_end = kvstart - 1;
>+                        } else {
>+				(*pt_loads)[i].phys_start = end + 1;
>+				(*pt_loads)[i].virt_start = kvend + 1;
>+                        }
>+                }
>+        }
>+        /* Insert split load segment, if any. */
>+	if (tidx >= 0) {
>+		size = (*num_pt_loads + 1) * sizeof((*pt_loads)[0]);
>+		(*pt_loads) = realloc((*pt_loads), size);
>+		if  (!(*pt_loads) ) {
>+		    ERRMSG("Cannot realloc %ld bytes: %s\n",
>+		            size + 0UL, strerror(errno));
>+			exit(1);
>+		}
>+		for (j = (*num_pt_loads - 1); j >= tidx; j--)
>+		        (*pt_loads)[j+1] = (*pt_loads)[j];
>+		(*pt_loads)[tidx] = temp_seg;
>+		(*num_pt_loads)++;
>+        }
>+        return 0;
>+}
>+
>+static int
>+process_dump_load(struct pt_load_segment	*pls)
>+{
>+	unsigned long long paddr;
>+
>+	paddr = vaddr_to_paddr(pls->virt_start);
>+	pls->phys_start  = paddr;
>+	pls->phys_end    = paddr + (pls->virt_end - pls->virt_start);
>+	DEBUG_MSG("process_dump_load\n");
>+	DEBUG_MSG("  phys_start : %llx\n", pls->phys_start);
>+	DEBUG_MSG("  phys_end   : %llx\n", pls->phys_end);
>+	DEBUG_MSG("  virt_start : %llx\n", pls->virt_start);
>+	DEBUG_MSG("  virt_end   : %llx\n", pls->virt_end);
>+
>+	return TRUE;
>+}
>+
>+int get_kcore_dump_loads()
>+{
>+	struct pt_load_segment	*pls;
>+	int i, j, loads=0;
>+
>+	for (i = 0; i < num_pt_loads; ++i) {
>+		struct pt_load_segment *p = &pt_loads[i];
>+		if (is_vmalloc_addr(p->virt_start))
>+			continue;
>+		loads++;
>+	}
>+
>+	pls = calloc(sizeof(struct pt_load_segment), loads);
>+	if (pls == NULL) {
>+		ERRMSG("Can't allocate memory for the PT_LOAD. %s\n",
>+		    strerror(errno));
>+		return FALSE;
>+	}
>+
>+	for (i = 0, j=0; i < num_pt_loads; ++i) {
>+		struct pt_load_segment *p = &pt_loads[i];
>+		if (is_vmalloc_addr(p->virt_start))
>+			continue;
>+		if (j >= loads)
>+			return FALSE;
>+
>+		if (j == 0) {
>+			offset_pt_load_memory = p->file_offset;
>+			if (offset_pt_load_memory == 0) {
>+				ERRMSG("Can't get the offset of page data.\n");
>+				return FALSE;
>+			}
>+		}
>+
>+		pls[j] = *p;
>+		process_dump_load(&pls[j]);
>+		j++;
>+	}
>+
>+	free(pt_loads);
>+	pt_loads = pls;
>+	num_pt_loads = loads;
>+
>+	for (i=0; i<crash_reserved_mem_nr; i++)
>+	{
>+		exclude_segment(&pt_loads, &num_pt_loads, crash_reserved_mem[i].start, crash_reserved_mem[i].end);
>+	}
>+
>+	max_file_offset = 0;
>+	for (i = 0; i < num_pt_loads; ++i) {
>+		struct pt_load_segment *p = &pt_loads[i];
>+		max_file_offset = MAX(max_file_offset,
>+				      p->file_offset + p->phys_end - p->phys_start);
>+	}
>+
>+	for (i = 0; i < num_pt_loads; ++i) {
>+		struct pt_load_segment *p = &pt_loads[i];
>+		DEBUG_MSG("LOAD (%d)\n", i);
>+		DEBUG_MSG("  phys_start : %llx\n", p->phys_start);
>+		DEBUG_MSG("  phys_end   : %llx\n", p->phys_end);
>+		DEBUG_MSG("  virt_start : %llx\n", p->virt_start);
>+		DEBUG_MSG("  virt_end   : %llx\n", p->virt_end);
>+	}
>+
>+	return TRUE;
>+}
>
> /*
>  * Get ELF information about /proc/vmcore.
>diff --git a/elf_info.h b/elf_info.h
>index 3ce0138..ba27fdf 100644
>--- a/elf_info.h
>+++ b/elf_info.h
>@@ -46,6 +46,7 @@ int get_elf_info(int fd, char *filename);
> void free_elf_info(void);
> int get_elf_loads(int fd, char *filename);
> int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len);
>+int get_kcore_dump_loads();
>
> int is_elf64_memory(void);
> int is_xen_memory(void);
>diff --git a/makedumpfile.h b/makedumpfile.h
>index 7ffa1ee..8881c76 100644
>--- a/makedumpfile.h
>+++ b/makedumpfile.h
>@@ -719,6 +719,7 @@ unsigned long long vaddr_to_paddr_x86(unsigned long vaddr);
> #endif /* x86 */
>
> #ifdef __x86_64__
>+int is_vmalloc_addr(ulong vaddr);
> int get_phys_base_x86_64(void);
> int get_machdep_info_x86_64(void);
> int get_versiondep_info_x86_64(void);

It will fail to build due to undefined is_vmalloc_addr() except on
x86_64, let's define it also for the other architectures like below:

#ifdef __x86__
 int get_machdep_info_x86(void);
 int get_versiondep_info_x86(void);
+int is_vmalloc_addr_x86(ulong vaddr);
 unsigned long long vaddr_to_paddr_x86(unsigned long vaddr);
 #define get_phys_base()                TRUE
 #define get_machdep_info()     get_machdep_info_x86()
 #define get_versiondep_info()  get_versiondep_info_x86()
 #define vaddr_to_paddr(X)      vaddr_to_paddr_x86(X)
+#define is_vmalloc_addr(X)      is_vmalloc_addr_x86(X)
 #endif /* x86 */

Besides, I think it's better to rename the is_vmalloc_addr() in
arch/x86_64.c to is_vmalloc_addr_x86_64().


Thanks
Atsushi Kumagai

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Patch v3 4/7] set vmcoreinfo for kcore
  2014-07-28  8:20 ` [Patch v3 4/7] set vmcoreinfo for kcore Baoquan He
@ 2014-08-01  7:12   ` Atsushi Kumagai
  2014-08-12 10:08     ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: Atsushi Kumagai @ 2014-08-01  7:12 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

>In vmcore dumping, note program of vmcoreinfo is set in elf header
>of /proc/vmcore. In 1st kernel, the vmcoreinfo is also needed for
>kcore analyzing. So in this patch information of vmcoreinfo is
>parsed and set in offset_vmcoreinfo and size_vmcoreinfo.
>
>Signed-off-by: Baoquan He <bhe@redhat.com>
>---
> elf_info.c     | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> elf_info.h     |  1 +
> makedumpfile.c | 29 +++++++++++++++++++++++++++++
> 3 files changed, 77 insertions(+)
>
>diff --git a/elf_info.c b/elf_info.c
>index 69d3fdb..edbfc97 100644
>--- a/elf_info.c
>+++ b/elf_info.c
>@@ -395,6 +395,53 @@ get_pt_note_info(void)
> 	return TRUE;
> }
>
>+#define UNINITIALIZED  ((ulong)(-1))
>+int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len)
>+{
>+	int i;
>+	ulong kvaddr;
>+	off_t offset;
>+	char note[MAX_SIZE_NHDR];
>+	int size_desc;
>+	off_t offset_desc;
>+
>+	offset = UNINITIALIZED;
>+	kvaddr = (ulong)vmcoreinfo_addr | PAGE_OFFSET;
>+
>+	for (i = 0; i < num_pt_loads; ++i) {
>+		struct pt_load_segment *p = &pt_loads[i];
>+		if ((kvaddr >= p->virt_start) && (kvaddr < p->virt_end)) {
>+			offset = (off_t)(kvaddr - p->virt_start) +
>+			(off_t)p->file_offset;
>+			break;
>+		}
>+	}
>+
>+	if (offset == UNINITIALIZED){
>+		ERRMSG("Can't seek the dump memory(%s). %s\n",
>+		    name_memory, strerror(errno));
>+		return FALSE;
>+	}
>+
>+        if (lseek(fd_memory, offset, SEEK_SET) != offset){
>+		ERRMSG("Can't seek the dump memory(%s). %s\n",
>+		    name_memory, strerror(errno));
>+		return FALSE;
>+	}

These two error messages are the same, they aren't helpful for debugging.
I think the former should be like "Can't get the offset of VMCOREINFO".

>+
>+	if (read(fd_memory, note, MAX_SIZE_NHDR) != MAX_SIZE_NHDR){
>+		ERRMSG("Can't read the dump memory(%s). %s\n",
>+		    name_memory, strerror(errno));
>+		return FALSE;
>+	}
>+
>+	size_desc   = note_descsz(note);
>+	offset_desc = offset + offset_note_desc(note);
>+
>+	set_vmcoreinfo(offset_desc, size_desc);
>+
>+	return TRUE;
>+}
>
> /*
>  * External functions.
>diff --git a/elf_info.h b/elf_info.h
>index 263d993..3ce0138 100644
>--- a/elf_info.h
>+++ b/elf_info.h
>@@ -45,6 +45,7 @@ int get_elf32_ehdr(int fd, char *filename, Elf32_Ehdr *ehdr);
> int get_elf_info(int fd, char *filename);
> void free_elf_info(void);
> int get_elf_loads(int fd, char *filename);
>+int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len);
>
> int is_elf64_memory(void);
> int is_xen_memory(void);
>diff --git a/makedumpfile.c b/makedumpfile.c
>index 78aa7a5..84857e0 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -9085,6 +9085,35 @@ static int get_page_offset()
> 	return TRUE;
> }
>
>+
>+/* Returns the physical address of start of crash notes buffer for a kernel. */
>+static int get_sys_kernel_vmcoreinfo(uint64_t *addr, uint64_t *len)
>+{

This function just return the result status, so please use TRUE or FALSE
as the return value instead of 0 or -1.

>+	char line[BUFSIZE_FGETS];
>+	int count;
>+	FILE *fp;
>+	unsigned long long temp, temp2;
>+
>+	*addr = 0;
>+	*len = 0;
>+
>+	if (!(fp = fopen("/sys/kernel/vmcoreinfo", "r")))
>+		return -1;
>+
>+	if (!fgets(line, sizeof(line), fp))
>+		ERRMSG("Cannot parse %s: %s\n", "/sys/kernel/vmcoreinfo", strerror(errno));
>+	count = sscanf(line, "%Lx %Lx", &temp, &temp2);
>+	if (count != 2)
>+		ERRMSG("Cannot parse %s: %s\n", "/sys/kernel/vmcoreinfo", strerror(errno));

The messages are the same, too.


Thanks
Atsushi Kumagai

>+
>+	*addr = (uint64_t) temp;
>+	*len = (uint64_t) temp2;
>+
>+	fclose(fp);
>+	return 0;
>+}
>+
>+
> static struct option longopts[] = {
> 	{"split", no_argument, NULL, OPT_SPLIT},
> 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
>--
>1.8.5.3

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Patch v3 3/7] preparation functions for parsing vmcoreinfo
  2014-07-28  8:20 ` [Patch v3 3/7] preparation functions for parsing vmcoreinfo Baoquan He
@ 2014-08-01  7:12   ` Atsushi Kumagai
  2014-08-12  9:46     ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: Atsushi Kumagai @ 2014-08-01  7:12 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

>Add 2 preparation functions get_elf_loads and get_page_offset, later
>they will be needed for parsing vmcoreinfo.
>
>Meanwhile since get_kernel_version need be called to get page_offset
>before initial() in mem_usage code flow, and surely it will be called
>inside initial() again. Add a static variable to avoid this duplicate
>calling.
>
>Signed-off-by: Baoquan He <bhe@redhat.com>
>---
> elf_info.c     | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
> elf_info.h     |  1 +
> makedumpfile.c | 23 +++++++++++++++++++++++
> 3 files changed, 74 insertions(+)
>
>diff --git a/elf_info.c b/elf_info.c
>index b277f69..69d3fdb 100644
>--- a/elf_info.c
>+++ b/elf_info.c
>@@ -681,6 +681,56 @@ get_elf32_ehdr(int fd, char *filename, Elf32_Ehdr *ehdr)
> 	return TRUE;
> }
>
>+int
>+get_elf_loads(int fd, char *filename)
>+{
>+	int i, j, phnum, elf_format;
>+	Elf64_Phdr phdr;
>+
>+	/*
>+	 * Check ELF64 or ELF32.
>+	 */
>+	elf_format = check_elf_format(fd, filename, &phnum, &num_pt_loads);
>+	if (elf_format == ELF64)
>+		flags_memory |= MEMORY_ELF64;
>+	else if (elf_format != ELF32)
>+		return FALSE;
>+
>+	if (!num_pt_loads) {
>+		ERRMSG("Can't get the number of PT_LOAD.\n");
>+		return FALSE;
>+	}
>+
>+	/*
>+	 * The below file information will be used as /proc/vmcore.
>+	 */
>+	fd_memory   = fd;
>+	name_memory = filename;
>+
>+	pt_loads = calloc(sizeof(struct pt_load_segment), num_pt_loads);
>+	if (pt_loads == NULL) {
>+		ERRMSG("Can't allocate memory for the PT_LOAD. %s\n",
>+		    strerror(errno));
>+		return FALSE;
>+	}
>+	for (i = 0, j = 0; i < phnum; i++) {
>+		if (!get_phdr_memory(i, &phdr))
>+			return FALSE;
>+
>+		if (phdr.p_type != PT_LOAD)
>+			continue;
>+
>+		if (j >= num_pt_loads)
>+			return FALSE;
>+		if(!dump_Elf_load(&phdr, j))
>+			return FALSE;
>+		j++;
>+	}
>+
>+	return TRUE;
>+}
>+
>+
> /*
>  * Get ELF information about /proc/vmcore.
>  */
>diff --git a/elf_info.h b/elf_info.h
>index 801faff..263d993 100644
>--- a/elf_info.h
>+++ b/elf_info.h
>@@ -44,6 +44,7 @@ int get_elf64_ehdr(int fd, char *filename, Elf64_Ehdr *ehdr);
> int get_elf32_ehdr(int fd, char *filename, Elf32_Ehdr *ehdr);
> int get_elf_info(int fd, char *filename);
> void free_elf_info(void);
>+int get_elf_loads(int fd, char *filename);
>
> int is_elf64_memory(void);
> int is_xen_memory(void);
>diff --git a/makedumpfile.c b/makedumpfile.c
>index 220570e..78aa7a5 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -681,6 +681,10 @@ get_kernel_version(char *release)
> 	int32_t version;
> 	long maj, min, rel;
> 	char *start, *end;
>+	static int done = 0;
>+
>+	if (done)
>+		return info->kernel_version;

This function just convert the argument as string into
a number, it shouldn't be affected by external factors.

You should use info->kernel_version in the caller side if
you want to avoid duplicate calling of this function, but
I think it's unnecessary since this function is small.

>
> 	/*
> 	 * This method checks that vmlinux and vmcore are same kernel version.
>@@ -706,6 +710,9 @@ get_kernel_version(char *release)
> 		MSG("The kernel version is not supported.\n");
> 		MSG("The created dumpfile may be incomplete.\n");
> 	}
>+
>+	done = 1;
>+
> 	return version;
> }
>
>@@ -9062,6 +9069,22 @@ int is_crashkernel_mem_reserved(void)
>         return !!crash_reserved_mem_nr;
> }
>
>+static int get_page_offset()
>+{
>+#ifdef __x86_64__
>+	struct utsname utsname;
>+	if (uname(&utsname)) {
>+		ERRMSG("Cannot get name and information about current kernel : %s", strerror(errno));
>+		return FALSE;
>+	}
>+
>+	info->kernel_version = get_kernel_version(utsname.release);
>+	get_versiondep_info_x86_64();
>+#endif /* x86_64 */

You should replace get_versiondep_info_x86_64() with get_versiondep_info()
to get rid of #ifdef.
#ifdef is messy, I don't want to use it if possible.


Thanks
Atsushi Kumagai.

>+
>+	return TRUE;
>+}
>+
> static struct option longopts[] = {
> 	{"split", no_argument, NULL, OPT_SPLIT},
> 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
>--
>1.8.5.3

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-07-28  8:20 ` [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel Baoquan He
  2014-07-29 12:43   ` Vivek Goyal
@ 2014-08-01  7:12   ` Atsushi Kumagai
  2014-08-12 10:14     ` bhe
  2014-08-21 10:31     ` bhe
  1 sibling, 2 replies; 29+ messages in thread
From: Atsushi Kumagai @ 2014-08-01  7:12 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

>Recently people complained that they don't know how to decide how
>much disk size need be reserved for kdump. E.g there are lots of
>machines with different memory size, if the memory usage information
>of current system can be shown, that can help them to make an estimate
>how much storage space need be reserved.
>
>In this patch, a new interface is added into makedumpfile. By the
>help of this, people can know the page number of memory in different
>use. The implementation is analyzing the "System Ram" and "kernel text"
>program segment of /proc/kcore excluding the crashkernel range, then
>calculating the page number of different kind per vmcoreinfo.
>
>The print is like below:
>->$ ./makedumpfile  --mem-usage  /proc/kcore
>Excluding unnecessary pages        : [100.0 %] |
>
>Page number of memory in different use
>--------------------------------------------------
>TYPE		PAGES			EXCLUDABLE	DESCRIPTION
>ZERO		0               	yes		Pages filled with zero

The number of zero pages is always 0 since it isn't counted during
get_num_dumpable_cyclic(). To count it up, we have to read all of the
pages like exclude_zero_pages(), so we need "exclude_zero_pages_cyclic()".
My idea is to call it in get_num_dumpable_cyclic() like:

		for_each_cycle(0, info->max_mapnr, &cycle)
		{
				if (!exclude_unnecessary_pages_cyclic(&cycle))
					return FALSE;

+				if (info->flag_mem_usage)
+					exclude_zero_pages_cyclic(&cycle);
+
				for(pfn=cycle.start_pfn; pfn<cycle.end_pfn; pfn++)


BTW, what is the target kernel version of this feature?
It works well on 3.12 but fails on 2.6.32 like:

# ./makedumpfile --mem-usage /proc/kcore
read_device: Can't read a file(/proc/kcore). Success
set_kcore_vmcoreinfo: Can't read the dump memory(/proc/kcore). Success

makedumpfile Failed.
#

This error means reading VMCOREINFO from /proc/kcore was failed.
Of course, there is a VMCOREINFO on the memory,

# cat /sys/kernel/vmcoreinfo
1e01b80 1000
#

It seems like old /proc/kcore's issue, but I'm still investigating.
Any comments are helpful.


Thanks
Atsushi Kumagai

>CACHE		562006          	yes		Cache pages
>CACHE_PRIVATE	353502          	yes		Cache pages + private
>USER		225780          	yes		User process pages
>FREE		2761884         	yes		Free pages
>KERN_DATA	235873          	no		Dumpable kernel data
>
>Total pages on system:	4139045
>
>Signed-off-by: Baoquan He <bhe@redhat.com>
>---
> makedumpfile.8 | 17 +++++++++++++
> makedumpfile.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> makedumpfile.h |  2 ++
> print_info.c   |  8 +++++++
> 4 files changed, 103 insertions(+)
>
>diff --git a/makedumpfile.8 b/makedumpfile.8
>index 25fe74e..64abbc7 100644
>--- a/makedumpfile.8
>+++ b/makedumpfile.8
>@@ -532,6 +532,23 @@ it is necessary to specfiy [\-x \fIVMLINUX\fR] or [\-i \fIVMCOREINFO\fR].
> # makedumpfile \-\-dump-dmesg -x vmlinux /proc/vmcore dmesgfile
> .br
>
>+
>+.TP
>+\fB\-\-mem-usage\fR
>+This option is used to show the page numbers of current system in different
>+use. It should be executed in 1st kernel. By the help of this, user can know
>+how many pages is dumpable when different dump_level is specified. It analyzes
>+the 'System Ram' and 'kernel text' program segment of /proc/kcore excluding
>+the crashkernel range, then calculates the page number of different kind per
>+vmcoreinfo. So currently /proc/kcore need be specified explicitly.
>+
>+.br
>+.B Example:
>+.br
>+# makedumpfile \-\-mem-usage /proc/kcore
>+.br
>+
>+
> .TP
> \fB\-\-diskset=VMCORE\fR
> Specify multiple \fIVMCORE\fRs created on sadump diskset configuration
>diff --git a/makedumpfile.c b/makedumpfile.c
>index b5e920d..6bbf324 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -7853,6 +7853,7 @@ print_mem_usage(void)
> 	shrinking = shrinking / pfn_original;
>
> 	MSG("\n");
>+	MSG("\n");
> 	MSG("Page number of memory in different use\n");
> 	MSG("--------------------------------------------------\n");
> 	MSG("TYPE		PAGES			EXCLUDABLE	DESCRIPTION\n");
>@@ -8906,6 +8907,13 @@ check_param_for_creating_dumpfile(int argc, char *argv[])
> 		 */
> 		info->name_memory   = argv[optind];
>
>+	} else if ((argc == optind + 1) && info->flag_mem_usage) {
>+		/*
>+		* Parameter for showing the page number of memory
>+		* in different use from.
>+		*/
>+		info->name_memory   = argv[optind];
>+
> 	} else
> 		return FALSE;
>
>@@ -9148,6 +9156,58 @@ static int get_sys_kernel_vmcoreinfo(uint64_t *addr, uint64_t *len)
> 	return 0;
> }
>
>+int show_mem_usage(void)
>+{
>+        uint64_t vmcoreinfo_addr, vmcoreinfo_len;
>+
>+        if (!is_crashkernel_mem_reserved()) {
>+                ERRMSG("No memory is reserved for crashkenrel!\n");
>+                return FALSE;
>+        }
>+
>+
>+        if (!info->flag_cyclic)
>+                info->flag_cyclic = TRUE;
>+
>+	info->dump_level = MAX_DUMP_LEVEL;
>+
>+        if (!get_page_offset())
>+                return FALSE;
>+
>+        if (!open_dump_memory())
>+                return FALSE;
>+
>+        if (!get_elf_loads(info->fd_memory, info->name_memory))
>+                return FALSE;
>+
>+        if (get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
>+                return FALSE;
>+
>+        if (!set_kcore_vmcoreinfo(vmcoreinfo_addr, vmcoreinfo_len))
>+                return FALSE;
>+
>+        if (!get_kcore_dump_loads())
>+                return FALSE;
>+
>+        if (!initial())
>+                return FALSE;
>+
>+
>+        if (!prepare_bitmap2_buffer_cyclic())
>+                return FALSE;
>+
>+        info->num_dumpable = get_num_dumpable_cyclic();
>+
>+	free_bitmap2_buffer_cyclic();
>+
>+        print_mem_usage();
>+
>+        if (!close_files_for_creating_dumpfile())
>+                return FALSE;
>+
>+        return TRUE;
>+}
>+
>
> static struct option longopts[] = {
> 	{"split", no_argument, NULL, OPT_SPLIT},
>@@ -9165,6 +9225,7 @@ static struct option longopts[] = {
> 	{"cyclic-buffer", required_argument, NULL, OPT_CYCLIC_BUFFER},
> 	{"eppic", required_argument, NULL, OPT_EPPIC},
> 	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
>+	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
> 	{0, 0, 0, 0}
> };
>
>@@ -9256,6 +9317,9 @@ main(int argc, char *argv[])
> 		case OPT_DUMP_DMESG:
> 			info->flag_dmesg = 1;
> 			break;
>+		case OPT_MEM_USAGE:
>+                       info->flag_mem_usage = 1;
>+                       break;
> 		case OPT_COMPRESS_SNAPPY:
> 			info->flag_compress = DUMP_DH_COMPRESSED_SNAPPY;
> 			break;
>@@ -9396,6 +9460,18 @@ main(int argc, char *argv[])
>
> 		MSG("\n");
> 		MSG("The dmesg log is saved to %s.\n", info->name_dumpfile);
>+	} else if (info->flag_mem_usage) {
>+		if (!check_param_for_creating_dumpfile(argc, argv)) {
>+			MSG("Commandline parameter is invalid.\n");
>+			MSG("Try `makedumpfile --help' for more information.\n");
>+			goto out;
>+		}
>+
>+		if (!show_mem_usage())
>+			goto out;
>+
>+		MSG("\n");
>+		MSG("Showing page number of memory in different use successfully.\n");
> 	} else {
> 		if (!check_param_for_creating_dumpfile(argc, argv)) {
> 			MSG("Commandline parameter is invalid.\n");
>diff --git a/makedumpfile.h b/makedumpfile.h
>index 8881c76..ba8c0f9 100644
>--- a/makedumpfile.h
>+++ b/makedumpfile.h
>@@ -911,6 +911,7 @@ struct DumpInfo {
> 	int		flag_force;	     /* overwrite existing stuff */
> 	int		flag_exclude_xen_dom;/* exclude Domain-U from xen-kdump */
> 	int             flag_dmesg;          /* dump the dmesg log out of the vmcore file */
>+	int             flag_mem_usage;  /*show the page number of memory in different use*/
> 	int		flag_use_printk_log; /* did we read printk_log symbol name? */
> 	int		flag_nospace;	     /* the flag of "No space on device" error */
> 	int		flag_vmemmap;        /* kernel supports vmemmap address space */
>@@ -1772,6 +1773,7 @@ struct elf_prstatus {
> #define OPT_CYCLIC_BUFFER       OPT_START+11
> #define OPT_EPPIC               OPT_START+12
> #define OPT_NON_MMAP            OPT_START+13
>+#define OPT_MEM_USAGE            OPT_START+14
>
> /*
>  * Function Prototype.
>diff --git a/print_info.c b/print_info.c
>index 7592690..29db918 100644
>--- a/print_info.c
>+++ b/print_info.c
>@@ -264,6 +264,14 @@ print_usage(void)
> 	MSG("      LOGFILE. If a VMCORE does not contain VMCOREINFO for dmesg, it is\n");
> 	MSG("      necessary to specfiy [-x VMLINUX] or [-i VMCOREINFO].\n");
> 	MSG("\n");
>+	MSG("  [--mem-usage]:\n");
>+	MSG("      This option is used to show the page numbers of current system in different\n");
>+	MSG("      use. It should be executed in 1st kernel. By the help of this, user can know\n");
>+	MSG("      how many pages is dumpable when different dump_level is specified. It analyzes\n");
>+	MSG("      the 'System Ram' and 'kernel text' program segment of /proc/kcore excluding\n");
>+	MSG("      the crashkernel range, then calculates the page number of different kind per\n");
>+	MSG("      vmcoreinfo. So currently /proc/kcore need be specified explicitly.\n");
>+	MSG("\n");
> 	MSG("  [-D]:\n");
> 	MSG("      Print debugging message.\n");
> 	MSG("\n");
>--
>1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Patch v3 2/7] functions to get crashkernel memory range
  2014-07-28  8:20 ` [Patch v3 2/7] functions to get crashkernel memory range Baoquan He
@ 2014-08-01  7:32   ` Atsushi Kumagai
  2014-08-12  9:25     ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: Atsushi Kumagai @ 2014-08-01  7:32 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

>These functions are used to parse /proc/iomem code and get memory
>ranges of specific type. They are implemented in kexec-tools and
>borrowed here to get the crashkernel memory range. Since crashkernel
>memory range should be excluded from dumpable memory ranges.
>
>Signed-off-by: Baoquan He <bhe@redhat.com>
>---
> makedumpfile.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> makedumpfile.h |  7 +++++
> 2 files changed, 89 insertions(+)
>
>diff --git a/makedumpfile.c b/makedumpfile.c
>index 760bfd1..220570e 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -8980,6 +8980,88 @@ calculate_cyclic_buffer_size(void) {
> 	return TRUE;
> }
>
>+
>+
>+//#define CRASH_RESERVED_MEM_NR   8
>+struct memory_range crash_reserved_mem[CRASH_RESERVED_MEM_NR];
>+int crash_reserved_mem_nr;
>+
>+/*
>+ * iomem_for_each_line()
>+ *
>+ * Iterate over each line in the file returned by proc_iomem(). If match is
>+ * NULL or if the line matches with our match-pattern then call the
>+ * callback if non-NULL.
>+ *
>+ * Return the number of lines matched.
>+ */
>+int iomem_for_each_line(char *match,
>+			      int (*callback)(void *data,
>+					      int nr,
>+					      char *str,
>+					      unsigned long base,
>+					      unsigned long length),
>+			      void *data)
>+{
>+	const char iomem[] = "/proc/iomem";
>+	char line[BUFSIZE_FGETS];
>+	FILE *fp;
>+	unsigned long long start, end, size;
>+	char *str;
>+	int consumed;
>+	int count;
>+	int nr = 0;
>+
>+	fp = fopen(iomem, "r");
>+	if (!fp) {
>+		ERRMSG("Cannot open %s\n", iomem);
>+		exit(1);
>+	}

Could you change this to return ERROR and handle it in the
caller side? It's a coding style of makedumpfile.


Thanks
Atsushi Kumagai

>+
>+	while(fgets(line, sizeof(line), fp) != 0) {
>+		count = sscanf(line, "%Lx-%Lx : %n", &start, &end, &consumed);
>+		if (count != 2)
>+			continue;
>+		str = line + consumed;
>+		size = end - start + 1;
>+		if (!match || memcmp(str, match, strlen(match)) == 0) {
>+			if (callback
>+			    && callback(data, nr, str, start, size) < 0) {
>+				break;
>+			}
>+			nr++;
>+		}
>+	}
>+
>+	fclose(fp);
>+
>+	return nr;
>+}
>+
>+static int crashkernel_mem_callback(void *data, int nr,
>+                                          char *str,
>+                                          unsigned long base,
>+                                          unsigned long length)
>+{
>+        if (nr >= CRASH_RESERVED_MEM_NR)
>+                return 1;
>+
>+        crash_reserved_mem[nr].start = base;
>+        crash_reserved_mem[nr].end   = base + length - 1;
>+        return 0;
>+}
>+
>+int is_crashkernel_mem_reserved(void)
>+{
>+        int ret;
>+
>+        ret = iomem_for_each_line("Crash kernel\n",
>+                                        crashkernel_mem_callback, NULL);
>+        crash_reserved_mem_nr = ret;
>+
>+        return !!crash_reserved_mem_nr;
>+}
>+
> static struct option longopts[] = {
> 	{"split", no_argument, NULL, OPT_SPLIT},
> 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
>diff --git a/makedumpfile.h b/makedumpfile.h
>index 9402f05..7ffa1ee 100644
>--- a/makedumpfile.h
>+++ b/makedumpfile.h
>@@ -1452,6 +1452,13 @@ extern struct array_table	array_table;
> extern struct number_table	number_table;
> extern struct srcfile_table	srcfile_table;
>
>+struct memory_range {
>+        unsigned long long start, end;
>+};
>+
>+#define CRASH_RESERVED_MEM_NR   8
>+struct memory_range crash_reserved_mem[CRASH_RESERVED_MEM_NR];
>+int crash_reserved_mem_nr;
>
> int readmem(int type_addr, unsigned long long addr, void *bufptr, size_t size);
> int get_str_osrelease_from_vmlinux(void);
>--
>1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 2/7] functions to get crashkernel memory range
  2014-08-01  7:32   ` Atsushi Kumagai
@ 2014-08-12  9:25     ` bhe
  0 siblings, 0 replies; 29+ messages in thread
From: bhe @ 2014-08-12  9:25 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/01/14 at 07:32am, Atsushi Kumagai wrote:
> >+ */
> >+int iomem_for_each_line(char *match,
> >+			      int (*callback)(void *data,
> >+					      int nr,
> >+					      char *str,
> >+					      unsigned long base,
> >+					      unsigned long length),
> >+			      void *data)
> >+{
> >+	const char iomem[] = "/proc/iomem";
> >+	char line[BUFSIZE_FGETS];
> >+	FILE *fp;
> >+	unsigned long long start, end, size;
> >+	char *str;
> >+	int consumed;
> >+	int count;
> >+	int nr = 0;
> >+
> >+	fp = fopen(iomem, "r");
> >+	if (!fp) {
> >+		ERRMSG("Cannot open %s\n", iomem);
> >+		exit(1);
> >+	}
> 
> Could you change this to return ERROR and handle it in the
> caller side? It's a coding style of makedumpfile.

Yes, sure. I plan to return nr since nr is initialized to 0. Then it
will cause show_mem_usage() to return FALSE and print the related
failure message.

	fp = fopen(iomem, "r");
        if (!fp) {
                ERRMSG("Cannot open %s\n", iomem);
                return nr;
        } 

> > static struct option longopts[] = {
> > 	{"split", no_argument, NULL, OPT_SPLIT},
> > 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
> >diff --git a/makedumpfile.h b/makedumpfile.h
> >index 9402f05..7ffa1ee 100644
> >--- a/makedumpfile.h
> >+++ b/makedumpfile.h
> >@@ -1452,6 +1452,13 @@ extern struct array_table	array_table;
> > extern struct number_table	number_table;
> > extern struct srcfile_table	srcfile_table;
> >
> >+struct memory_range {
> >+        unsigned long long start, end;
> >+};
> >+
> >+#define CRASH_RESERVED_MEM_NR   8
> >+struct memory_range crash_reserved_mem[CRASH_RESERVED_MEM_NR];
> >+int crash_reserved_mem_nr;
> >
> > int readmem(int type_addr, unsigned long long addr, void *bufptr, size_t size);
> > int get_str_osrelease_from_vmlinux(void);
> >--
> >1.8.5.3
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 
> 
> Thanks
> Atsushi Kumagai
> 
> >+
> >+	while(fgets(line, sizeof(line), fp) != 0) {
> >+		count = sscanf(line, "%Lx-%Lx : %n", &start, &end, &consumed);
> >+		if (count != 2)
> >+			continue;
> >+		str = line + consumed;
> >+		size = end - start + 1;
> >+		if (!match || memcmp(str, match, strlen(match)) == 0) {
> >+			if (callback
> >+			    && callback(data, nr, str, start, size) < 0) {
> >+				break;
> >+			}
> >+			nr++;
> >+		}
> >+	}
> >+
> >+	fclose(fp);
> >+
> >+	return nr;
> >+}
> >+
> >+static int crashkernel_mem_callback(void *data, int nr,
> >+                                          char *str,
> >+                                          unsigned long base,
> >+                                          unsigned long length)
> >+{
> >+        if (nr >= CRASH_RESERVED_MEM_NR)
> >+                return 1;
> >+
> >+        crash_reserved_mem[nr].start = base;
> >+        crash_reserved_mem[nr].end   = base + length - 1;
> >+        return 0;
> >+}
> >+
> >+int is_crashkernel_mem_reserved(void)
> >+{
> >+        int ret;
> >+
> >+        ret = iomem_for_each_line("Crash kernel\n",
> >+                                        crashkernel_mem_callback, NULL);
> >+        crash_reserved_mem_nr = ret;
> >+
> >+        return !!crash_reserved_mem_nr;
> >+}
> >+

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 3/7] preparation functions for parsing vmcoreinfo
  2014-08-01  7:12   ` Atsushi Kumagai
@ 2014-08-12  9:46     ` bhe
  2014-08-12 10:01       ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: bhe @ 2014-08-12  9:46 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/01/14 at 07:12am, Atsushi Kumagai wrote:
> >diff --git a/makedumpfile.c b/makedumpfile.c
> >index 220570e..78aa7a5 100644
> >--- a/makedumpfile.c
> >+++ b/makedumpfile.c
> >@@ -681,6 +681,10 @@ get_kernel_version(char *release)
> > 	int32_t version;
> > 	long maj, min, rel;
> > 	char *start, *end;
> >+	static int done = 0;
> >+
> >+	if (done)
> >+		return info->kernel_version;
> 
> This function just convert the argument as string into
> a number, it shouldn't be affected by external factors.
> 
> You should use info->kernel_version in the caller side if
> you want to avoid duplicate calling of this function, but
> I think it's unnecessary since this function is small.

In show_mem_usage() implementaion, the page_offset is needed before
initial() calling because the dumpable elf program loads have ot be
prepared before that. However in current commited code, the page_offset
is got in initial() when call check_release().

So I have to get it in advance by this way. Then the
get_kernel_version() can be reused in this way. Anyway, by this I
needn't change the code in initial().

If use info->kernel_version directly before initial() calling, it's
still zero.

> 
> >
> > 	/*
> > 	 * This method checks that vmlinux and vmcore are same kernel version.
> >@@ -706,6 +710,9 @@ get_kernel_version(char *release)
> > 		MSG("The kernel version is not supported.\n");
> > 		MSG("The created dumpfile may be incomplete.\n");
> > 	}
> >+
> >+	done = 1;
> >+
> > 	return version;
> > }
> >
> >@@ -9062,6 +9069,22 @@ int is_crashkernel_mem_reserved(void)
> >         return !!crash_reserved_mem_nr;
> > }
> >
> >+static int get_page_offset()
> >+{
> >+#ifdef __x86_64__
> >+	struct utsname utsname;
> >+	if (uname(&utsname)) {
> >+		ERRMSG("Cannot get name and information about current kernel : %s", strerror(errno));
> >+		return FALSE;
> >+	}
> >+
> >+	info->kernel_version = get_kernel_version(utsname.release);
> >+	get_versiondep_info_x86_64();
> >+#endif /* x86_64 */
> 
> You should replace get_versiondep_info_x86_64() with get_versiondep_info()
> to get rid of #ifdef.
> #ifdef is messy, I don't want to use it if possible.

Sure, will do.

> 
> 
> Thanks
> Atsushi Kumagai.
> 
> >+
> >+	return TRUE;
> >+}
> >+
> > static struct option longopts[] = {
> > 	{"split", no_argument, NULL, OPT_SPLIT},
> > 	{"reassemble", no_argument, NULL, OPT_REASSEMBLE},
> >--
> >1.8.5.3
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 3/7] preparation functions for parsing vmcoreinfo
  2014-08-12  9:46     ` bhe
@ 2014-08-12 10:01       ` bhe
  2014-08-14  7:37         ` Atsushi Kumagai
  0 siblings, 1 reply; 29+ messages in thread
From: bhe @ 2014-08-12 10:01 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/12/14 at 05:46pm, Baoquan He wrote:
> On 08/01/14 at 07:12am, Atsushi Kumagai wrote:
> > >diff --git a/makedumpfile.c b/makedumpfile.c
> > >index 220570e..78aa7a5 100644
> > >--- a/makedumpfile.c
> > >+++ b/makedumpfile.c
> > >@@ -681,6 +681,10 @@ get_kernel_version(char *release)
> > > 	int32_t version;
> > > 	long maj, min, rel;
> > > 	char *start, *end;
> > >+	static int done = 0;
> > >+
> > >+	if (done)
> > >+		return info->kernel_version;
> > 
> > This function just convert the argument as string into
> > a number, it shouldn't be affected by external factors.
> > 
> > You should use info->kernel_version in the caller side if
> > you want to avoid duplicate calling of this function, but
> > I think it's unnecessary since this function is small.
> 
> In show_mem_usage() implementaion, the page_offset is needed before
> initial() calling because the dumpable elf program loads have ot be
> prepared before that. However in current commited code, the page_offset
> is got in initial() when call check_release().
> 
> So I have to get it in advance by this way. Then the
> get_kernel_version() can be reused in this way. Anyway, by this I
> needn't change the code in initial().
> 
> If use info->kernel_version directly before initial() calling, it's
> still zero.
> 

I add the static variable "done" just because it always print below
warning message twice, this makes me uncomfortable. Otherwise
get_kernel_version() can be called any time no matter how many times it
has been called if it's only a converting utility function.

The kernel version is not supported.
The created dumpfile may be incomplete.


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 4/7] set vmcoreinfo for kcore
  2014-08-01  7:12   ` Atsushi Kumagai
@ 2014-08-12 10:08     ` bhe
  0 siblings, 0 replies; 29+ messages in thread
From: bhe @ 2014-08-12 10:08 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/01/14 at 07:12am, Atsushi Kumagai wrote:

> >+#define UNINITIALIZED  ((ulong)(-1))
> >+int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len)
> >+{
> >+	int i;
> >+	ulong kvaddr;
> >+	off_t offset;
> >+	char note[MAX_SIZE_NHDR];
> >+	int size_desc;
> >+	off_t offset_desc;
> >+
......
> >+
> >+	if (offset == UNINITIALIZED){
> >+		ERRMSG("Can't seek the dump memory(%s). %s\n",
> >+		    name_memory, strerror(errno));
> >+		return FALSE;
> >+	}
> >+
> >+        if (lseek(fd_memory, offset, SEEK_SET) != offset){
> >+		ERRMSG("Can't seek the dump memory(%s). %s\n",
> >+		    name_memory, strerror(errno));
> >+		return FALSE;
> >+	}
> 
> These two error messages are the same, they aren't helpful for debugging.
> I think the former should be like "Can't get the offset of VMCOREINFO".

Yeah, great idea, will change.


> >+/* Returns the physical address of start of crash notes buffer for a kernel. */
> >+static int get_sys_kernel_vmcoreinfo(uint64_t *addr, uint64_t *len)
> >+{
> 
> This function just return the result status, so please use TRUE or FALSE
> as the return value instead of 0 or -1.

Will do.

> 

> >+	if (!(fp = fopen("/sys/kernel/vmcoreinfo", "r")))
> >+		return -1;
> >+
> >+	if (!fgets(line, sizeof(line), fp))
> >+		ERRMSG("Cannot parse %s: %s\n", "/sys/kernel/vmcoreinfo", strerror(errno));
> >+	count = sscanf(line, "%Lx %Lx", &temp, &temp2);
> >+	if (count != 2)
> >+		ERRMSG("Cannot parse %s: %s\n", "/sys/kernel/vmcoreinfo", strerror(errno));
> 
> The messages are the same, too.

Will change.

> 
> 
> Thanks
> Atsushi Kumagai
> 

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 5/7] prepare the dump loads for kcore analysis
  2014-08-01  7:12   ` Atsushi Kumagai
@ 2014-08-12 10:10     ` bhe
  0 siblings, 0 replies; 29+ messages in thread
From: bhe @ 2014-08-12 10:10 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/01/14 at 07:12am, Atsushi Kumagai wrote:

> >  * Get ELF information about /proc/vmcore.
> >diff --git a/elf_info.h b/elf_info.h
> >index 3ce0138..ba27fdf 100644
> >--- a/elf_info.h
> >+++ b/elf_info.h
> >@@ -46,6 +46,7 @@ int get_elf_info(int fd, char *filename);
> > void free_elf_info(void);
> > int get_elf_loads(int fd, char *filename);
> > int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len);
> >+int get_kcore_dump_loads();
> >
> > int is_elf64_memory(void);
> > int is_xen_memory(void);
> >diff --git a/makedumpfile.h b/makedumpfile.h
> >index 7ffa1ee..8881c76 100644
> >--- a/makedumpfile.h
> >+++ b/makedumpfile.h
> >@@ -719,6 +719,7 @@ unsigned long long vaddr_to_paddr_x86(unsigned long vaddr);
> > #endif /* x86 */
> >
> > #ifdef __x86_64__
> >+int is_vmalloc_addr(ulong vaddr);
> > int get_phys_base_x86_64(void);
> > int get_machdep_info_x86_64(void);
> > int get_versiondep_info_x86_64(void);
> 
> It will fail to build due to undefined is_vmalloc_addr() except on
> x86_64, let's define it also for the other architectures like below:
> 
> #ifdef __x86__
>  int get_machdep_info_x86(void);
>  int get_versiondep_info_x86(void);
> +int is_vmalloc_addr_x86(ulong vaddr);
>  unsigned long long vaddr_to_paddr_x86(unsigned long vaddr);
>  #define get_phys_base()                TRUE
>  #define get_machdep_info()     get_machdep_info_x86()
>  #define get_versiondep_info()  get_versiondep_info_x86()
>  #define vaddr_to_paddr(X)      vaddr_to_paddr_x86(X)
> +#define is_vmalloc_addr(X)      is_vmalloc_addr_x86(X)
>  #endif /* x86 */
> 
> Besides, I think it's better to rename the is_vmalloc_addr() in
> arch/x86_64.c to is_vmalloc_addr_x86_64().

Great idea, will change.


> 
> 
> Thanks
> Atsushi Kumagai
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-08-01  7:12   ` Atsushi Kumagai
@ 2014-08-12 10:14     ` bhe
  2014-08-21 10:31     ` bhe
  1 sibling, 0 replies; 29+ messages in thread
From: bhe @ 2014-08-12 10:14 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/01/14 at 07:12am, Atsushi Kumagai wrote:
> >The print is like below:
> >->$ ./makedumpfile  --mem-usage  /proc/kcore
> >Excluding unnecessary pages        : [100.0 %] |
> >
> >Page number of memory in different use
> >--------------------------------------------------
> >TYPE		PAGES			EXCLUDABLE	DESCRIPTION
> >ZERO		0               	yes		Pages filled with zero
> 
> The number of zero pages is always 0 since it isn't counted during
> get_num_dumpable_cyclic(). To count it up, we have to read all of the
> pages like exclude_zero_pages(), so we need "exclude_zero_pages_cyclic()".
> My idea is to call it in get_num_dumpable_cyclic() like:

Yeah, I didn't notice it. Thanks for pointing it out and great idea.
Will change.

> 
> 		for_each_cycle(0, info->max_mapnr, &cycle)
> 		{
> 				if (!exclude_unnecessary_pages_cyclic(&cycle))
> 					return FALSE;
> 
> +				if (info->flag_mem_usage)
> +					exclude_zero_pages_cyclic(&cycle);
> +
> 				for(pfn=cycle.start_pfn; pfn<cycle.end_pfn; pfn++)
> 
> 
> BTW, what is the target kernel version of this feature?
> It works well on 3.12 but fails on 2.6.32 like:
> 
> # ./makedumpfile --mem-usage /proc/kcore
> read_device: Can't read a file(/proc/kcore). Success
> set_kcore_vmcoreinfo: Can't read the dump memory(/proc/kcore). Success
> 
> makedumpfile Failed.
> #
> 
> This error means reading VMCOREINFO from /proc/kcore was failed.
> Of course, there is a VMCOREINFO on the memory,
> 
> # cat /sys/kernel/vmcoreinfo
> 1e01b80 1000
> #
> 
> It seems like old /proc/kcore's issue, but I'm still investigating.
> Any comments are helpful.

OK, I just tested this on Latest kernel of Linus's tree. That means I
need installed an old kernel to check it.

Will paste the result after analyzing. 

Thanks so much for these helpful comments and great ideas. Will post a
new patchset soon.


> 
> 
> Thanks
> Atsushi Kumagai
> 
> >CACHE		562006          	yes		Cache pages
> >CACHE_PRIVATE	353502          	yes		Cache pages + private
> >USER		225780          	yes		User process pages
> >FREE		2761884         	yes		Free pages
> >KERN_DATA	235873          	no		Dumpable kernel data
> >
> >Total pages on system:	4139045
> >
> >Signed-off-by: Baoquan He <bhe@redhat.com>
> >---
> > makedumpfile.8 | 17 +++++++++++++
> > makedumpfile.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > makedumpfile.h |  2 ++
> > print_info.c   |  8 +++++++
> > 4 files changed, 103 insertions(+)
> >
> >diff --git a/makedumpfile.8 b/makedumpfile.8
> >index 25fe74e..64abbc7 100644
> >--- a/makedumpfile.8
> >+++ b/makedumpfile.8
> >@@ -532,6 +532,23 @@ it is necessary to specfiy [\-x \fIVMLINUX\fR] or [\-i \fIVMCOREINFO\fR].
> > # makedumpfile \-\-dump-dmesg -x vmlinux /proc/vmcore dmesgfile
> > .br
> >
> >+
> >+.TP
> >+\fB\-\-mem-usage\fR
> >+This option is used to show the page numbers of current system in different
> >+use. It should be executed in 1st kernel. By the help of this, user can know
> >+how many pages is dumpable when different dump_level is specified. It analyzes
> >+the 'System Ram' and 'kernel text' program segment of /proc/kcore excluding
> >+the crashkernel range, then calculates the page number of different kind per
> >+vmcoreinfo. So currently /proc/kcore need be specified explicitly.
> >+
> >+.br
> >+.B Example:
> >+.br
> >+# makedumpfile \-\-mem-usage /proc/kcore
> >+.br
> >+
> >+
> > .TP
> > \fB\-\-diskset=VMCORE\fR
> > Specify multiple \fIVMCORE\fRs created on sadump diskset configuration
> >diff --git a/makedumpfile.c b/makedumpfile.c
> >index b5e920d..6bbf324 100644
> >--- a/makedumpfile.c
> >+++ b/makedumpfile.c
> >@@ -7853,6 +7853,7 @@ print_mem_usage(void)
> > 	shrinking = shrinking / pfn_original;
> >
> > 	MSG("\n");
> >+	MSG("\n");
> > 	MSG("Page number of memory in different use\n");
> > 	MSG("--------------------------------------------------\n");
> > 	MSG("TYPE		PAGES			EXCLUDABLE	DESCRIPTION\n");
> >@@ -8906,6 +8907,13 @@ check_param_for_creating_dumpfile(int argc, char *argv[])
> > 		 */
> > 		info->name_memory   = argv[optind];
> >
> >+	} else if ((argc == optind + 1) && info->flag_mem_usage) {
> >+		/*
> >+		* Parameter for showing the page number of memory
> >+		* in different use from.
> >+		*/
> >+		info->name_memory   = argv[optind];
> >+
> > 	} else
> > 		return FALSE;
> >
> >@@ -9148,6 +9156,58 @@ static int get_sys_kernel_vmcoreinfo(uint64_t *addr, uint64_t *len)
> > 	return 0;
> > }
> >
> >+int show_mem_usage(void)
> >+{
> >+        uint64_t vmcoreinfo_addr, vmcoreinfo_len;
> >+
> >+        if (!is_crashkernel_mem_reserved()) {
> >+                ERRMSG("No memory is reserved for crashkenrel!\n");
> >+                return FALSE;
> >+        }
> >+
> >+
> >+        if (!info->flag_cyclic)
> >+                info->flag_cyclic = TRUE;
> >+
> >+	info->dump_level = MAX_DUMP_LEVEL;
> >+
> >+        if (!get_page_offset())
> >+                return FALSE;
> >+
> >+        if (!open_dump_memory())
> >+                return FALSE;
> >+
> >+        if (!get_elf_loads(info->fd_memory, info->name_memory))
> >+                return FALSE;
> >+
> >+        if (get_sys_kernel_vmcoreinfo(&vmcoreinfo_addr, &vmcoreinfo_len))
> >+                return FALSE;
> >+
> >+        if (!set_kcore_vmcoreinfo(vmcoreinfo_addr, vmcoreinfo_len))
> >+                return FALSE;
> >+
> >+        if (!get_kcore_dump_loads())
> >+                return FALSE;
> >+
> >+        if (!initial())
> >+                return FALSE;
> >+
> >+
> >+        if (!prepare_bitmap2_buffer_cyclic())
> >+                return FALSE;
> >+
> >+        info->num_dumpable = get_num_dumpable_cyclic();
> >+
> >+	free_bitmap2_buffer_cyclic();
> >+
> >+        print_mem_usage();
> >+
> >+        if (!close_files_for_creating_dumpfile())
> >+                return FALSE;
> >+
> >+        return TRUE;
> >+}
> >+
> >
> > static struct option longopts[] = {
> > 	{"split", no_argument, NULL, OPT_SPLIT},
> >@@ -9165,6 +9225,7 @@ static struct option longopts[] = {
> > 	{"cyclic-buffer", required_argument, NULL, OPT_CYCLIC_BUFFER},
> > 	{"eppic", required_argument, NULL, OPT_EPPIC},
> > 	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
> >+	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
> > 	{0, 0, 0, 0}
> > };
> >
> >@@ -9256,6 +9317,9 @@ main(int argc, char *argv[])
> > 		case OPT_DUMP_DMESG:
> > 			info->flag_dmesg = 1;
> > 			break;
> >+		case OPT_MEM_USAGE:
> >+                       info->flag_mem_usage = 1;
> >+                       break;
> > 		case OPT_COMPRESS_SNAPPY:
> > 			info->flag_compress = DUMP_DH_COMPRESSED_SNAPPY;
> > 			break;
> >@@ -9396,6 +9460,18 @@ main(int argc, char *argv[])
> >
> > 		MSG("\n");
> > 		MSG("The dmesg log is saved to %s.\n", info->name_dumpfile);
> >+	} else if (info->flag_mem_usage) {
> >+		if (!check_param_for_creating_dumpfile(argc, argv)) {
> >+			MSG("Commandline parameter is invalid.\n");
> >+			MSG("Try `makedumpfile --help' for more information.\n");
> >+			goto out;
> >+		}
> >+
> >+		if (!show_mem_usage())
> >+			goto out;
> >+
> >+		MSG("\n");
> >+		MSG("Showing page number of memory in different use successfully.\n");
> > 	} else {
> > 		if (!check_param_for_creating_dumpfile(argc, argv)) {
> > 			MSG("Commandline parameter is invalid.\n");
> >diff --git a/makedumpfile.h b/makedumpfile.h
> >index 8881c76..ba8c0f9 100644
> >--- a/makedumpfile.h
> >+++ b/makedumpfile.h
> >@@ -911,6 +911,7 @@ struct DumpInfo {
> > 	int		flag_force;	     /* overwrite existing stuff */
> > 	int		flag_exclude_xen_dom;/* exclude Domain-U from xen-kdump */
> > 	int             flag_dmesg;          /* dump the dmesg log out of the vmcore file */
> >+	int             flag_mem_usage;  /*show the page number of memory in different use*/
> > 	int		flag_use_printk_log; /* did we read printk_log symbol name? */
> > 	int		flag_nospace;	     /* the flag of "No space on device" error */
> > 	int		flag_vmemmap;        /* kernel supports vmemmap address space */
> >@@ -1772,6 +1773,7 @@ struct elf_prstatus {
> > #define OPT_CYCLIC_BUFFER       OPT_START+11
> > #define OPT_EPPIC               OPT_START+12
> > #define OPT_NON_MMAP            OPT_START+13
> >+#define OPT_MEM_USAGE            OPT_START+14
> >
> > /*
> >  * Function Prototype.
> >diff --git a/print_info.c b/print_info.c
> >index 7592690..29db918 100644
> >--- a/print_info.c
> >+++ b/print_info.c
> >@@ -264,6 +264,14 @@ print_usage(void)
> > 	MSG("      LOGFILE. If a VMCORE does not contain VMCOREINFO for dmesg, it is\n");
> > 	MSG("      necessary to specfiy [-x VMLINUX] or [-i VMCOREINFO].\n");
> > 	MSG("\n");
> >+	MSG("  [--mem-usage]:\n");
> >+	MSG("      This option is used to show the page numbers of current system in different\n");
> >+	MSG("      use. It should be executed in 1st kernel. By the help of this, user can know\n");
> >+	MSG("      how many pages is dumpable when different dump_level is specified. It analyzes\n");
> >+	MSG("      the 'System Ram' and 'kernel text' program segment of /proc/kcore excluding\n");
> >+	MSG("      the crashkernel range, then calculates the page number of different kind per\n");
> >+	MSG("      vmcoreinfo. So currently /proc/kcore need be specified explicitly.\n");
> >+	MSG("\n");
> > 	MSG("  [-D]:\n");
> > 	MSG("      Print debugging message.\n");
> > 	MSG("\n");
> >--
> >1.8.5.3
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Patch v3 3/7] preparation functions for parsing vmcoreinfo
  2014-08-12 10:01       ` bhe
@ 2014-08-14  7:37         ` Atsushi Kumagai
  2014-08-14  8:15           ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: Atsushi Kumagai @ 2014-08-14  7:37 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

>On 08/12/14 at 05:46pm, Baoquan He wrote:
>> On 08/01/14 at 07:12am, Atsushi Kumagai wrote:
>> > >diff --git a/makedumpfile.c b/makedumpfile.c
>> > >index 220570e..78aa7a5 100644
>> > >--- a/makedumpfile.c
>> > >+++ b/makedumpfile.c
>> > >@@ -681,6 +681,10 @@ get_kernel_version(char *release)
>> > > 	int32_t version;
>> > > 	long maj, min, rel;
>> > > 	char *start, *end;
>> > >+	static int done = 0;
>> > >+
>> > >+	if (done)
>> > >+		return info->kernel_version;
>> >
>> > This function just convert the argument as string into
>> > a number, it shouldn't be affected by external factors.
>> >
>> > You should use info->kernel_version in the caller side if
>> > you want to avoid duplicate calling of this function, but
>> > I think it's unnecessary since this function is small.
>>
>> In show_mem_usage() implementaion, the page_offset is needed before
>> initial() calling because the dumpable elf program loads have ot be
>> prepared before that. However in current commited code, the page_offset
>> is got in initial() when call check_release().
>>
>> So I have to get it in advance by this way. Then the
>> get_kernel_version() can be reused in this way. Anyway, by this I
>> needn't change the code in initial().
>>
>> If use info->kernel_version directly before initial() calling, it's
>> still zero.
>>
>
>I add the static variable "done" just because it always print below
>warning message twice, this makes me uncomfortable. Otherwise
>get_kernel_version() can be called any time no matter how many times it
>has been called if it's only a converting utility function.
>
>The kernel version is not supported.
>The created dumpfile may be incomplete.

Yeah, I understand the reason.
I agree with your idea, but I think it's better to check the
kernel_version directly like:

	if (info->kernel_version)
		return info->kernel_version;


Thanks
Atsushi Kumagai

>
>_______________________________________________
>kexec mailing list
>kexec@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/kexec

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 3/7] preparation functions for parsing vmcoreinfo
  2014-08-14  7:37         ` Atsushi Kumagai
@ 2014-08-14  8:15           ` bhe
  0 siblings, 0 replies; 29+ messages in thread
From: bhe @ 2014-08-14  8:15 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/14/14 at 07:37am, Atsushi Kumagai wrote:
> >On 08/12/14 at 05:46pm, Baoquan He wrote:
 >>
> >
> >I add the static variable "done" just because it always print below
> >warning message twice, this makes me uncomfortable. Otherwise
> >get_kernel_version() can be called any time no matter how many times it
> >has been called if it's only a converting utility function.
> >
> >The kernel version is not supported.
> >The created dumpfile may be incomplete.
> 
> Yeah, I understand the reason.
> I agree with your idea, but I think it's better to check the
> kernel_version directly like:
> 
> 	if (info->kernel_version)
> 		return info->kernel_version;

Well, I got what you meant. I didn't get it when you said at first
time. Yeah, this works and is better. Will change like this. Thanks a
lot.


> 
> 
> Thanks
> Atsushi Kumagai
> 
> >
> >_______________________________________________
> >kexec mailing list
> >kexec@lists.infradead.org
> >http://lists.infradead.org/mailman/listinfo/kexec

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-08-01  7:12   ` Atsushi Kumagai
  2014-08-12 10:14     ` bhe
@ 2014-08-21 10:31     ` bhe
  2014-08-26  2:28       ` Atsushi Kumagai
  1 sibling, 1 reply; 29+ messages in thread
From: bhe @ 2014-08-21 10:31 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/01/14 at 07:12am, Atsushi Kumagai wrote:
> >Page number of memory in different use
> >--------------------------------------------------
> >TYPE		PAGES			EXCLUDABLE	DESCRIPTION
> >ZERO		0               	yes		Pages filled with zero
> 
> The number of zero pages is always 0 since it isn't counted during
> get_num_dumpable_cyclic(). To count it up, we have to read all of the
> pages like exclude_zero_pages(), so we need "exclude_zero_pages_cyclic()".
> My idea is to call it in get_num_dumpable_cyclic() like:
> 
> 		for_each_cycle(0, info->max_mapnr, &cycle)
> 		{
> 				if (!exclude_unnecessary_pages_cyclic(&cycle))
> 					return FALSE;
> 
> +				if (info->flag_mem_usage)
> +					exclude_zero_pages_cyclic(&cycle);
> +
> 				for(pfn=cycle.start_pfn; pfn<cycle.end_pfn; pfn++)


Hi Atsushi,

I just introduced a new function exclude_zero_pages_cyclic as you
suggested. But it always exited with below message. I don't know what's
wrong with this function. Could you help have a look at it?

"Program terminated with signal SIGKILL"


From: Baoquan He <bhe@redhat.com>
Date: Thu, 21 Aug 2014 13:29:31 +0800
Subject: [PATCH] introduce a function exclude_zero_pages_cyclic()

Introduced a new function exclude_zero_pages_cyclic(), this will
exclude and counting zero pages. Calling it in get_num_dumpable_cyclic
can get the number of zero pages.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 makedumpfile.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/makedumpfile.c b/makedumpfile.c
index d43d02d..a511179 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -4582,6 +4582,45 @@ exclude_zero_pages(void)
 	return TRUE;
 }
 
+int
+exclude_zero_pages_cyclic(struct cycle *cycle)
+{
+	mdf_pfn_t pfn;
+	unsigned long long paddr;
+	unsigned char buf[info->page_size];
+
+	for (pfn = cycle->start_pfn, paddr = pfn_to_paddr(pfn); pfn < cycle->end_pfn;
+	    pfn++, paddr += info->page_size) {
+
+		if (!is_in_segs(paddr))
+			continue;
+
+		if (!is_dumpable_cyclic(info->partial_bitmap2, pfn, cycle))
+			continue;
+
+		if (is_xen_memory()) {
+			if (!readmem(MADDR_XEN, paddr, buf, info->page_size)) {
+				ERRMSG("Can't get the page data(pfn:%llx, max_mapnr:%llx).\n",
+				    pfn, info->max_mapnr);
+				return FALSE;
+			}
+		} else {
+			if (!readmem(PADDR, paddr, buf, info->page_size)) {
+				ERRMSG("Can't get the page data(pfn:%llx, max_mapnr:%llx).\n",
+				    pfn, info->max_mapnr);
+				return FALSE;
+			}
+		}
+		if (is_zero_page(buf, info->page_size)) {
+			if (clear_bit_on_2nd_bitmap(pfn, cycle))
+				pfn_zero++;
+		}
+	}
+
+	return TRUE;
+}
+
+
 static int
 initialize_2nd_bitmap_cyclic(struct cycle *cycle)
 {
@@ -5662,6 +5701,9 @@ get_num_dumpable_cyclic(void)
 		if (!exclude_unnecessary_pages_cyclic(&cycle))
 			return FALSE;
 
+		if (info->flag_mem_usage)
+			exclude_zero_pages_cyclic(&cycle);
+
 		for(pfn=cycle.start_pfn; pfn<cycle.end_pfn; pfn++)
 			if (is_dumpable_cyclic(info->partial_bitmap2, pfn, &cycle))
 				num_dumpable++;
-- 
1.8.5.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* RE: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-08-21 10:31     ` bhe
@ 2014-08-26  2:28       ` Atsushi Kumagai
  2014-08-26  3:22         ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: Atsushi Kumagai @ 2014-08-26  2:28 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

>On 08/01/14 at 07:12am, Atsushi Kumagai wrote:
>> >Page number of memory in different use
>> >--------------------------------------------------
>> >TYPE		PAGES			EXCLUDABLE	DESCRIPTION
>> >ZERO		0               	yes		Pages filled with zero
>>
>> The number of zero pages is always 0 since it isn't counted during
>> get_num_dumpable_cyclic(). To count it up, we have to read all of the
>> pages like exclude_zero_pages(), so we need "exclude_zero_pages_cyclic()".
>> My idea is to call it in get_num_dumpable_cyclic() like:
>>
>> 		for_each_cycle(0, info->max_mapnr, &cycle)
>> 		{
>> 				if (!exclude_unnecessary_pages_cyclic(&cycle))
>> 					return FALSE;
>>
>> +				if (info->flag_mem_usage)
>> +					exclude_zero_pages_cyclic(&cycle);
>> +
>> 				for(pfn=cycle.start_pfn; pfn<cycle.end_pfn; pfn++)
>
>
>Hi Atsushi,
>
>I just introduced a new function exclude_zero_pages_cyclic as you
>suggested. But it always exited with below message. I don't know what's
>wrong with this function. Could you help have a look at it?
>
>"Program terminated with signal SIGKILL"

Umm, the code looks no problem and it works well at least on my
machine (x86_64 on KVM), so I have no idea for now.

Can strace and audit help your investigation? They may provide
some hints (e.g. Who send SIGKILL) for us.


Thanks
Atsushi Kumagai

>From: Baoquan He <bhe@redhat.com>
>Date: Thu, 21 Aug 2014 13:29:31 +0800
>Subject: [PATCH] introduce a function exclude_zero_pages_cyclic()
>
>Introduced a new function exclude_zero_pages_cyclic(), this will
>exclude and counting zero pages. Calling it in get_num_dumpable_cyclic
>can get the number of zero pages.
>
>Signed-off-by: Baoquan He <bhe@redhat.com>
>---
> makedumpfile.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 42 insertions(+)
>
>diff --git a/makedumpfile.c b/makedumpfile.c
>index d43d02d..a511179 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -4582,6 +4582,45 @@ exclude_zero_pages(void)
> 	return TRUE;
> }
>
>+int
>+exclude_zero_pages_cyclic(struct cycle *cycle)
>+{
>+	mdf_pfn_t pfn;
>+	unsigned long long paddr;
>+	unsigned char buf[info->page_size];
>+
>+	for (pfn = cycle->start_pfn, paddr = pfn_to_paddr(pfn); pfn < cycle->end_pfn;
>+	    pfn++, paddr += info->page_size) {
>+
>+		if (!is_in_segs(paddr))
>+			continue;
>+
>+		if (!is_dumpable_cyclic(info->partial_bitmap2, pfn, cycle))
>+			continue;
>+
>+		if (is_xen_memory()) {
>+			if (!readmem(MADDR_XEN, paddr, buf, info->page_size)) {
>+				ERRMSG("Can't get the page data(pfn:%llx, max_mapnr:%llx).\n",
>+				    pfn, info->max_mapnr);
>+				return FALSE;
>+			}
>+		} else {
>+			if (!readmem(PADDR, paddr, buf, info->page_size)) {
>+				ERRMSG("Can't get the page data(pfn:%llx, max_mapnr:%llx).\n",
>+				    pfn, info->max_mapnr);
>+				return FALSE;
>+			}
>+		}
>+		if (is_zero_page(buf, info->page_size)) {
>+			if (clear_bit_on_2nd_bitmap(pfn, cycle))
>+				pfn_zero++;
>+		}
>+	}
>+
>+	return TRUE;
>+}
>+
>+
> static int
> initialize_2nd_bitmap_cyclic(struct cycle *cycle)
> {
>@@ -5662,6 +5701,9 @@ get_num_dumpable_cyclic(void)
> 		if (!exclude_unnecessary_pages_cyclic(&cycle))
> 			return FALSE;
>
>+		if (info->flag_mem_usage)
>+			exclude_zero_pages_cyclic(&cycle);
>+
> 		for(pfn=cycle.start_pfn; pfn<cycle.end_pfn; pfn++)
> 			if (is_dumpable_cyclic(info->partial_bitmap2, pfn, &cycle))
> 				num_dumpable++;
>--
>1.8.5.3

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-08-26  2:28       ` Atsushi Kumagai
@ 2014-08-26  3:22         ` bhe
  2014-08-26  6:25           ` Petr Tesarik
  0 siblings, 1 reply; 29+ messages in thread
From: bhe @ 2014-08-26  3:22 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec, vgoyal

On 08/26/14 at 02:28am, Atsushi Kumagai wrote:
> >On 08/01/14 at 07:12am, Atsushi Kumagai wrote:
> >> >Page number of memory in different use
> >> >--------------------------------------------------
> >> >TYPE		PAGES			EXCLUDABLE	DESCRIPTION
> >> >ZERO		0               	yes		Pages filled with zero
> >>
> >> The number of zero pages is always 0 since it isn't counted during
> >> get_num_dumpable_cyclic(). To count it up, we have to read all of the
> >> pages like exclude_zero_pages(), so we need "exclude_zero_pages_cyclic()".
> >> My idea is to call it in get_num_dumpable_cyclic() like:
> >>
> >> 		for_each_cycle(0, info->max_mapnr, &cycle)
> >> 		{
> >> 				if (!exclude_unnecessary_pages_cyclic(&cycle))
> >> 					return FALSE;
> >>
> >> +				if (info->flag_mem_usage)
> >> +					exclude_zero_pages_cyclic(&cycle);
> >> +
> >> 				for(pfn=cycle.start_pfn; pfn<cycle.end_pfn; pfn++)
> >
> >
> >Hi Atsushi,
> >
> >I just introduced a new function exclude_zero_pages_cyclic as you
> >suggested. But it always exited with below message. I don't know what's
> >wrong with this function. Could you help have a look at it?
> >
> >"Program terminated with signal SIGKILL"
> 
> Umm, the code looks no problem and it works well at least on my
> machine (x86_64 on KVM), so I have no idea for now.
> 
> Can strace and audit help your investigation? They may provide
> some hints (e.g. Who send SIGKILL) for us.

It only happened on a AMD machine with Quad-Core AMD Opteron(tm)
Processor 1352. I tested on my other 2 intel machines, both of them are
OK.

Just now I used strace to check it, and found it's caused by a reading.
It's weird since that page should be inside the System RAM and can be
read. And before this handling hwpoison has been checked. I am wondering
why it happened.


[ ~]$ sudo readelf -l /proc/kcore                                                                                                                                        

Elf file type is CORE (Core file)
Entry point 0x0
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
...

This is the load segment where the page reading error happened.
  LOAD           0x0000080080001000 0xffff880080000000
0x0000000000000000
                 0x000000004fee0000 0x000000004fee0000  RWE    1000
...

  LOAD           0x00006a0002001000 0xffffea0002000000
0x0000000000000000
                 0x00000000013fc000 0x00000000013fc000  RWE    1000
  LOAD           0x0000080100001000 0xffff880100000000
0x0000000000000000
                 0x0000000130000000 0x0000000130000000  RWE    1000

read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
4096) = 4096
lseek(3, 8799351988224, SEEK_SET)       = 8799351988224
read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
4096) = 4096
lseek(3, 8799351992320, SEEK_SET)       = 8799351992320
read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
4096) = 4096
lseek(3, 8799351996416, SEEK_SET)       = 8799351996416
read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
4096) = 4096
lseek(3, 8799365541888, SEEK_SET)       = 8799365541888
read(3,
"\340\216\274\226f\177\0\0PCD\224f\177\0\0\265\0\0\0\0\0\0\0p\217\274\226f\177\0\0"...,
4096) = 4096
-----------------------------------------
Here it use lseek to position, then try to read, and then reading failed
and raised a SIGKILL.

lseek(3, 8799381360640, SEEK_SET)       = 8799381360640
read(3,  <unfinished ...>
+++ killed by SIGKILL +++
Killed
> 
> 
> Thanks
> Atsushi Kumagai
> 

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-08-26  3:22         ` bhe
@ 2014-08-26  6:25           ` Petr Tesarik
  2014-08-26 14:12             ` bhe
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Tesarik @ 2014-08-26  6:25 UTC (permalink / raw)
  To: bhe; +Cc: kexec, Atsushi Kumagai, vgoyal

On Tue, 26 Aug 2014 11:22:47 +0800
"bhe@redhat.com" <bhe@redhat.com> wrote:

>[...]
> Here it use lseek to position, then try to read, and then reading failed
> and raised a SIGKILL.
> 
> lseek(3, 8799381360640, SEEK_SET)       = 8799381360640
> read(3,  <unfinished ...>
> +++ killed by SIGKILL +++
> Killed

This smells like killed by OOM Killer. Can you check the kernel log if
there's anything?

Petr Tesarik

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-08-26  6:25           ` Petr Tesarik
@ 2014-08-26 14:12             ` bhe
  2014-09-02  6:20               ` Atsushi Kumagai
  0 siblings, 1 reply; 29+ messages in thread
From: bhe @ 2014-08-26 14:12 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: kexec, Atsushi Kumagai, vgoyal

On 08/26/14 at 08:25am, Petr Tesarik wrote:
> On Tue, 26 Aug 2014 11:22:47 +0800
> "bhe@redhat.com" <bhe@redhat.com> wrote:
> 
> >[...]
> > Here it use lseek to position, then try to read, and then reading failed
> > and raised a SIGKILL.
> > 
> > lseek(3, 8799381360640, SEEK_SET)       = 8799381360640
> > read(3,  <unfinished ...>
> > +++ killed by SIGKILL +++
> > Killed
> 
> This smells like killed by OOM Killer. Can you check the kernel log if
> there's anything?

Thanks for notice, it is caused by kernel addr validation check in
read_kcore. No idea why it happened.

[ +35.288439] BUG: unable to handle kernel paging request at
ffff8800c4000000
[  +0.011559] IP: [<ffffffff8105870b>] kern_addr_valid+0x15b/0x1b0
[  +0.010586] PGD 220f067 PUD 2213067 PMD 80000000c4000062 
[  +0.009994] Oops: 0000 [#1] SMP 
[  +0.007782] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns
nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack cfg80211
rfki
[  +0.084041] CPU: 2 PID: 1679 Comm: makedumpfile Not tainted 3.17.0-rc2
#15
[  +0.012249] Hardware name: Dell Inc. PowerEdge T105 /0J001K, BIOS
1.4.4 07/30/2009
[  +0.012991] task: ffff8800c010ae40 ti: ffff880214830000 task.ti:
ffff880214830000
[  +0.012886] RIP: 0010:[<ffffffff8105870b>]  [<ffffffff8105870b>]
kern_addr_valid+0x15b/0x1b0
[  +0.013860] RSP: 0018:ffff880214833e88  EFLAGS: 00010206
[  +0.010710] RAX: 00000000c4000000 RBX: 0000000000001000 RCX:
0000000000000000
[  +0.012445] RDX: 00000000c4000000 RSI: ffff880000000000 RDI:
80000000c4000062
[  +0.012298] RBP: ffff880214833e88 R08: 000000000000000e R09:
00007fffffffffff
[  +0.012322] R10: 0000000000000001 R11: 0000000000000246 R12:
0000000000001000
[  +0.012316] R13: 00000000022beb10 R14: ffff880214833f50 R15:
ffff8800c4000000
[  +0.012308] FS:  00007f0c91ffa740(0000) GS:ffff88022fd00000(0000)
knlGS:0000000000000000
[  +0.013315] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.010985] CR2: ffff8800c4000000 CR3: 000000021a632000 CR4:
00000000000007e0
[  +0.012266] Stack:
[  +0.006976]  ffff880214833ee0 ffffffff81255c38 ffff880214929410
0000000000000000
[  +0.012471]  0000000d14833ed8 0000000000001000 ffff880223c7c480
00000000022beb10
[  +0.012465]  ffff880214833f50 0000000000001000 00007ffffe1840c0
ffff880214833f00
[  +0.012456] Call Trace:
[  +0.007459]  [<ffffffff81255c38>] read_kcore+0x228/0x300
[  +0.010392]  [<ffffffff8124962d>] proc_reg_read+0x3d/0x80
[  +0.010491]  [<ffffffff811e58f8>] vfs_read+0x98/0x170
[  +0.010153]  [<ffffffff811e6576>] SyS_read+0x46/0xb0
[  +0.010034]  [<ffffffff8111d5f6>] ? __audit_syscall_exit+0x1f6/0x2a0
[  +0.011425]  [<ffffffff8181dea9>] system_call_fastpath+0x16/0x1b
[  +0.011072] Code: 48 89 f8 66 66 66 90 48 be 00 f0 ff ff ff 3f 00 00
48 c1 ea 09 48 21 f0 81 e2 f8 0f 00 00 48 be 00 00 00 00 00 88 ff ff 48
01
[  +0.030537] RIP  [<ffffffff8105870b>] kern_addr_valid+0x15b/0x1b0
[  +0.011666]  RSP <ffff880214833e88>
[  +0.009043] CR2: ffff8800c4000000
[  +0.042194] ---[ end trace 97512601fec12186 ]---

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel
  2014-08-26 14:12             ` bhe
@ 2014-09-02  6:20               ` Atsushi Kumagai
  0 siblings, 0 replies; 29+ messages in thread
From: Atsushi Kumagai @ 2014-09-02  6:20 UTC (permalink / raw)
  To: bhe; +Cc: kexec, vgoyal

Hello,

>On 08/26/14 at 08:25am, Petr Tesarik wrote:
>> On Tue, 26 Aug 2014 11:22:47 +0800
>> "bhe@redhat.com" <bhe@redhat.com> wrote:
>>
>> >[...]
>> > Here it use lseek to position, then try to read, and then reading failed
>> > and raised a SIGKILL.
>> >
>> > lseek(3, 8799381360640, SEEK_SET)       = 8799381360640
>> > read(3,  <unfinished ...>
>> > +++ killed by SIGKILL +++
>> > Killed
>>
>> This smells like killed by OOM Killer. Can you check the kernel log if
>> there's anything?
>
>Thanks for notice, it is caused by kernel addr validation check in
>read_kcore. No idea why it happened.

I found a similar bug as below. This bug must be fixed in your environment
since it looks that you use linux 3.17-rc2, but it might help your
investigation.


commit 0ee364eb316348ddf3e0dfcd986f5f13f528f821
Author: Mel Gorman <mgorman@suse.de>
Date:   Mon Feb 11 14:52:36 2013 +0000

    x86/mm: Check if PUD is large when validating a kernel address

    A user reported the following oops when a backup process reads
    /proc/kcore:

     BUG: unable to handle kernel paging request at ffffbb00ff33b000
     IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
     [...]

     Call Trace:
      [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
      [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
      [<ffffffff81151687>] vfs_read+0xc7/0x130
      [<ffffffff811517f3>] sys_read+0x53/0xa0
      [<ffffffff81449692>] system_call_fastpath+0x16/0x1b


Thanks
Atsushi Kumagai


>[ +35.288439] BUG: unable to handle kernel paging request at
>ffff8800c4000000
>[  +0.011559] IP: [<ffffffff8105870b>] kern_addr_valid+0x15b/0x1b0
>[  +0.010586] PGD 220f067 PUD 2213067 PMD 80000000c4000062
>[  +0.009994] Oops: 0000 [#1] SMP
>[  +0.007782] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns
>nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack cfg80211
>rfki
>[  +0.084041] CPU: 2 PID: 1679 Comm: makedumpfile Not tainted 3.17.0-rc2
>#15
>[  +0.012249] Hardware name: Dell Inc. PowerEdge T105 /0J001K, BIOS
>1.4.4 07/30/2009
>[  +0.012991] task: ffff8800c010ae40 ti: ffff880214830000 task.ti:
>ffff880214830000
>[  +0.012886] RIP: 0010:[<ffffffff8105870b>]  [<ffffffff8105870b>]
>kern_addr_valid+0x15b/0x1b0
>[  +0.013860] RSP: 0018:ffff880214833e88  EFLAGS: 00010206
>[  +0.010710] RAX: 00000000c4000000 RBX: 0000000000001000 RCX:
>0000000000000000
>[  +0.012445] RDX: 00000000c4000000 RSI: ffff880000000000 RDI:
>80000000c4000062
>[  +0.012298] RBP: ffff880214833e88 R08: 000000000000000e R09:
>00007fffffffffff
>[  +0.012322] R10: 0000000000000001 R11: 0000000000000246 R12:
>0000000000001000
>[  +0.012316] R13: 00000000022beb10 R14: ffff880214833f50 R15:
>ffff8800c4000000
>[  +0.012308] FS:  00007f0c91ffa740(0000) GS:ffff88022fd00000(0000)
>knlGS:0000000000000000
>[  +0.013315] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[  +0.010985] CR2: ffff8800c4000000 CR3: 000000021a632000 CR4:
>00000000000007e0
>[  +0.012266] Stack:
>[  +0.006976]  ffff880214833ee0 ffffffff81255c38 ffff880214929410
>0000000000000000
>[  +0.012471]  0000000d14833ed8 0000000000001000 ffff880223c7c480
>00000000022beb10
>[  +0.012465]  ffff880214833f50 0000000000001000 00007ffffe1840c0
>ffff880214833f00
>[  +0.012456] Call Trace:
>[  +0.007459]  [<ffffffff81255c38>] read_kcore+0x228/0x300
>[  +0.010392]  [<ffffffff8124962d>] proc_reg_read+0x3d/0x80
>[  +0.010491]  [<ffffffff811e58f8>] vfs_read+0x98/0x170
>[  +0.010153]  [<ffffffff811e6576>] SyS_read+0x46/0xb0
>[  +0.010034]  [<ffffffff8111d5f6>] ? __audit_syscall_exit+0x1f6/0x2a0
>[  +0.011425]  [<ffffffff8181dea9>] system_call_fastpath+0x16/0x1b
>[  +0.011072] Code: 48 89 f8 66 66 66 90 48 be 00 f0 ff ff ff 3f 00 00
>48 c1 ea 09 48 21 f0 81 e2 f8 0f 00 00 48 be 00 00 00 00 00 88 ff ff 48
>01
>[  +0.030537] RIP  [<ffffffff8105870b>] kern_addr_valid+0x15b/0x1b0
>[  +0.011666]  RSP <ffff880214833e88>
>[  +0.009043] CR2: ffff8800c4000000
>[  +0.042194] ---[ end trace 97512601fec12186 ]---
>
>_______________________________________________
>kexec mailing list
>kexec@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/kexec

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-09-02  6:29 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-28  8:19 [Patch v3 0/7] add a new interface to show the memory usage of 1st kernel Baoquan He
2014-07-28  8:20 ` [Patch v3 1/7] initialize pfn_memhole in get_num_dumpable_cyclic Baoquan He
2014-07-28  8:20 ` [Patch v3 2/7] functions to get crashkernel memory range Baoquan He
2014-08-01  7:32   ` Atsushi Kumagai
2014-08-12  9:25     ` bhe
2014-07-28  8:20 ` [Patch v3 3/7] preparation functions for parsing vmcoreinfo Baoquan He
2014-08-01  7:12   ` Atsushi Kumagai
2014-08-12  9:46     ` bhe
2014-08-12 10:01       ` bhe
2014-08-14  7:37         ` Atsushi Kumagai
2014-08-14  8:15           ` bhe
2014-07-28  8:20 ` [Patch v3 4/7] set vmcoreinfo for kcore Baoquan He
2014-08-01  7:12   ` Atsushi Kumagai
2014-08-12 10:08     ` bhe
2014-07-28  8:20 ` [Patch v3 5/7] prepare the dump loads for kcore analysis Baoquan He
2014-08-01  7:12   ` Atsushi Kumagai
2014-08-12 10:10     ` bhe
2014-07-28  8:20 ` [Patch v3 6/7] implement a function to print the memory usage Baoquan He
2014-07-28  8:20 ` [Patch v3 7/7] add a new interface to show the memory usage of 1st kernel Baoquan He
2014-07-29 12:43   ` Vivek Goyal
2014-07-31  2:32     ` Baoquan He
2014-08-01  7:12   ` Atsushi Kumagai
2014-08-12 10:14     ` bhe
2014-08-21 10:31     ` bhe
2014-08-26  2:28       ` Atsushi Kumagai
2014-08-26  3:22         ` bhe
2014-08-26  6:25           ` Petr Tesarik
2014-08-26 14:12             ` bhe
2014-09-02  6:20               ` Atsushi Kumagai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.