* [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
@ 2014-09-29  7:06 Zhou Wenjian
  2014-09-29  7:06 ` [PATCH v1 1/5] makedumpfile: Add support for block Zhou Wenjian
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Zhou Wenjian @ 2014-09-29  7:06 UTC (permalink / raw)
  To: kexec

The issue is discussed at http://lists.infradead.org/pipermail/kexec/2014-March/011289.html

This patch set implements the idea of the 2-pass algorithm, using a small amount of memory to manage a block table.
Strictly speaking, the algorithm is still 3-pass, but the second pass takes much less time.
The tables below show the performance with different cyclic-buffer and block sizes.
The test was run on a machine with 128G of memory.

Each value is the total time (including the first pass and the second pass).
The value in brackets is the time of the second pass.
															      sec
	cyclic-buffer	1		2		4		8		16		32		64
block-size
1M			4.74(0.00)	4.22(0.01)	3.94(0.01)	3.78(0.02)	3.71(0.03)	3.73(0.07)	3.74(0.10)	
2M			4.74(0.00)	4.19(0.00)	3.94(0.01)	3.80(0.03)	3.71(0.03)	3.72(0.07)	3.72(0.09)	
4M			4.73(0.00)	4.21(0.01)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.73(0.08)	3.73(0.10)	
8M			4.73(0.00)	4.19(0.00)	3.94(0.01)	3.83(0.02)	3.73(0.03)	3.72(0.07)	3.74(0.10)	
16M			4.74(0.01)	4.21(0.00)	3.94(0.01)	3.76(0.01)	3.73(0.03)	3.73(0.08)	3.74(0.10)	
32M			4.72(0.00)	4.20(0.02)	3.92(0.01)	3.77(0.02)	3.71(0.02)	3.70(0.06)	3.74(0.10)	
64M			4.74(0.01)	4.20(0.00)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.71(0.07)	3.72(0.09)	
128M			4.73(0.01)	4.20(0.00)	3.94(0.01)	3.78(0.02)	3.76(0.03)	3.72(0.08)	3.74(0.09)	
256M			4.75(0.02)	4.22(0.02)	3.96(0.03)	3.78(0.02)	3.70(0.03)	3.70(0.07)	3.74(0.11)	
512M			4.77(0.04)	4.21(0.03)	3.97(0.04)	3.79(0.03)	3.73(0.04)	3.75(0.09)	3.82(0.13)	
1G			4.82(0.09)	4.26(0.07)	4.00(0.08)	3.83(0.07)	3.76(0.08)	3.73(0.08)	3.76(0.12)	
2G			8.26(3.54)	7.34(3.14)	6.86(2.93)	6.56(2.80)	6.44(2.76)	6.45(2.79)	6.42(2.80)

Performance of the original 3-pass algorithm:
origin			8.25(3.54)	7.26(3.11)	6.80(2.91)	6.52(2.80)	6.39(2.76)	6.40(2.78)	6.45(2.85)

															       sec
	cyclic-buffer	128		256		512		1024		2048		4096		8192	
block-size
1M			3.83(0.21)	3.94(0.33)	4.16(0.54)	4.61(0.99)	7.03(3.41)	8.73(5.11)	8.69(5.08)
2M			3.86(0.21)	3.92(0.32)	4.16(0.54)	4.64(0.98)	7.02(3.41)	8.71(5.09)	8.72(5.09)
4M			3.82(0.21)	3.95(0.32)	4.18(0.55)	4.62(0.99)	7.05(3.44)	8.70(5.09)	8.68(5.07)
8M			3.82(0.21)	3.95(0.33)	4.17(0.54)	4.58(0.97)	7.03(3.41)	8.79(5.16)	8.71(5.09)
16M			3.83(0.21)	3.93(0.31)	4.15(0.54)	4.60(0.98)	7.06(3.43)	8.76(5.13)	8.73(5.10)
32M			3.84(0.22)	3.93(0.32)	4.15(0.54)	4.61(0.98)	7.00(3.40)	8.69(5.08)	8.75(5.13)
64M			3.84(0.21)	3.94(0.33)	4.15(0.54)	4.60(0.98)	7.04(3.42)	8.74(5.10)	8.80(5.16)
128M			3.85(0.22)	3.97(0.33)	4.16(0.54)	4.60(0.98)	7.07(3.44)	8.68(5.07)	8.69(5.07)
256M			3.84(0.21)	3.94(0.33)	4.16(0.55)	4.64(1.00)	7.02(3.41)	8.74(5.11)	8.73(5.11)
512M			3.85(0.24)	3.97(0.34)	4.17(0.56)	4.61(0.99)	7.05(3.44)	8.73(5.11)	8.75(5.13)
1G			3.85(0.22)	3.96(0.35)	4.18(0.56)	4.65(1.00)	7.06(3.44)	8.76(5.12)	8.72(5.11)
2G			6.53(2.91)	6.86(3.25)	7.54(3.92)	8.95(5.31)	10.60(6.97)	14.08(10.47)	14.32(10.60)

Performance of the original 3-pass algorithm:
origin			6.64(3.05)	6.81(3.24)	7.51(3.93)	8.86(5.30)	10.51(6.94)	13.92(10.36)	14.11(10.55)

Zhou Wenjian (5):
  Add support for block
  Add tools for reading and writing from block table
  Add module of generating table
  Add module of calculating start_pfn and end_pfn in each dumpfile
  Add support for --block-size

 makedumpfile.8 |   16 ++++
 makedumpfile.c |  245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 makedumpfile.h |   15 ++++
 3 files changed, 271 insertions(+), 5 deletions(-)

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* [PATCH v1 1/5] makedumpfile: Add support for block
  2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
@ 2014-09-29  7:06 ` Zhou Wenjian
  2014-10-10  8:11   ` Atsushi Kumagai
  2014-09-29  7:06 ` [PATCH v1 2/5] makedumpfile: Add tools for reading and writing from block table Zhou Wenjian
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Zhou Wenjian @ 2014-09-29  7:06 UTC (permalink / raw)
  To: kexec

When the --split option is specified, a fair I/O workload should be assigned
to each process. So the start and end pfn of each dumpfile should be
calculated with unnecessary pages excluded. However, running the exclusion
over the whole memory takes a lot of time. That is why struct Block
exists. Struct Block is designed to manage memory, mainly by recording
the number of dumpable pages. We can use those counts to calculate the
start and end pfn instead of running the exclusion over the whole
memory.

The char array *table in struct Block records the number of dumpable
pages in each block.
The table entry size is calculated as
			divideup(log2(block_size / page_size), 8) bytes
Sizing the entries this way keeps the space taken by the table small.
The code also performs well when the number of pages in one block is
large.
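
As a rough illustration of the entry-size rule above, the following standalone C sketch counts the bits needed for a per-block page count and rounds that up to whole bytes. The helper name entry_size_for() and its parameter are invented for this example and are not part of makedumpfile. It follows the loop in calculate_entry_size() from this patch, which for a power-of-two block reserves one bit more than log2(block_size / page_size), so that a fully dumpable block still fits in one entry.

```c
#include <assert.h>

#define BITS_PER_BYTE 8

/*
 * Hypothetical helper (not part of makedumpfile): bytes needed for one
 * block-table entry, mirroring the loop in calculate_entry_size().
 */
static int entry_size_for(long long pages_per_block)
{
	long long span = 1;	/* doubles until it covers pages_per_block */
	int bits = 1;

	assert(pages_per_block > 0);
	while (span < pages_per_block) {
		span <<= 1;
		bits++;
	}
	/* round the bit count up to whole bytes */
	return (bits + BITS_PER_BYTE - 1) / BITS_PER_BYTE;
}
```

Assuming 4 KB pages and a 1 GB block, pages_per_block is 262144 and entry_size_for() returns 3, so a table covering 128 such blocks would take about 384 bytes.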

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |   23 +++++++++++++++++++++++
 makedumpfile.h |   14 ++++++++++++++
 2 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index b4d43d8..2feda01 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -34,6 +34,7 @@ struct srcfile_table	srcfile_table;
 
 struct vm_table		vt = { 0 };
 struct DumpInfo		*info = NULL;
+struct Block		*block = NULL;
 
 char filename_stdout[] = FILENAME_STDOUT;
 
@@ -5685,6 +5686,28 @@ out:
 	return ret;
 }
 
+/*
+ * cyclic_split mode:
+ *	manage memory by blocks,
+ *	divide memory into blocks
+ *	use block_table to record numbers of dumpable pages in each block
+ */
+
+//calculate entry size based on the amount of pages in one block
+int
+calculate_entry_size(void){
+	int entry_num = 1, count = 1;
+	int entry_size;
+	while (entry_num < block->page_per_block){
+		entry_num = entry_num << 1;
+		count++;
+	}
+	entry_size = count/BITPERBYTE;
+	if (count %BITPERBYTE)
+		entry_size++;
+	return entry_size;
+}
+
 mdf_pfn_t
 get_num_dumpable(void)
 {
diff --git a/makedumpfile.h b/makedumpfile.h
index 96830b0..ed4f799 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1168,10 +1168,24 @@ struct DumpInfo {
 	 */
 	int (*page_is_buddy)(unsigned long flags, unsigned int _mapcount,
 			     unsigned long private, unsigned int _count);
+	/*
+	 *for cyclic_splitting mode, setup block_size
+	 */
+	long long block_size;
 };
 extern struct DumpInfo		*info;
 
 /*
+ *for cyclic_splitting mode,Manage memory by block
+ */
+struct Block{
+        char *table;
+        long long num;
+        long long page_per_block;
+        int entry_size;                 //counted by byte
+};
+
+/*
  * kernel VM-related data
  */
 struct vm_table {
-- 
1.7.1



* [PATCH v1 2/5] makedumpfile: Add tools for reading and writing from block table
  2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
  2014-09-29  7:06 ` [PATCH v1 1/5] makedumpfile: Add support for block Zhou Wenjian
@ 2014-09-29  7:06 ` Zhou Wenjian
  2014-09-29  7:06 ` [PATCH v1 3/5] makedumpfile: Add module of generating table Zhou Wenjian
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Zhou Wenjian @ 2014-09-29  7:06 UTC (permalink / raw)
  To: kexec

The functions added in this patch write values to, and read values from,
the char array in struct Block.
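
The byte layout used by these helpers can be sketched as follows: each entry stores the low entry_size bytes of the count, least significant byte first. The names put_entry() and get_entry() are hypothetical, and the sketch uses unsigned char and an explicit entry_size argument for clarity, whereas the patch uses a char array (masking with 0xff on read) and takes the size from the global block.

```c
#include <assert.h>

#define BITS_PER_BYTE 8

/* Hypothetical helper: store the low entry_size bytes of value, LSB first. */
static void put_entry(unsigned char *entry, int entry_size,
		      unsigned long long value)
{
	int i;

	assert(entry_size > 0 && entry_size <= (int)sizeof(value));
	for (i = 0; i < entry_size; i++) {
		entry[i] = value & 0xff;
		value >>= BITS_PER_BYTE;
	}
}

/* Hypothetical helper: rebuild the value, most significant byte first. */
static unsigned long long get_entry(const unsigned char *entry, int entry_size)
{
	unsigned long long value = 0;
	int i;

	for (i = entry_size - 1; i >= 0; i--)
		value = (value << BITS_PER_BYTE) | entry[i];
	return value;
}
```

A value wider than the entry is silently truncated to its low bytes, which is why the entry size has to be derived from the largest possible per-block count.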

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 2feda01..a4cb9b6 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -5708,6 +5708,29 @@ calculate_entry_size(void){
 	return entry_size;
 }
 
+void
+write_value_into_block_table(char *block_inner, unsigned long long content)
+{
+	char temp;
+	int i=0;
+	while (i++ < block->entry_size) {
+		temp = content & 0xff;
+		content = content >> BITPERBYTE;
+		*block_inner++ = temp;
+	}
+}
+unsigned long long
+read_value_from_block_table(char *block_inner)
+{
+	unsigned long long ret = 0;
+	int i;
+	for (i = block->entry_size; i > 0; i--) {
+		ret = ret << BITPERBYTE;
+		ret += *(block_inner + i - 1) & 0xff;
+	}
+	return ret;
+}
+
 mdf_pfn_t
 get_num_dumpable(void)
 {
-- 
1.7.1



* [PATCH v1 3/5] makedumpfile: Add module of generating table
  2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
  2014-09-29  7:06 ` [PATCH v1 1/5] makedumpfile: Add support for block Zhou Wenjian
  2014-09-29  7:06 ` [PATCH v1 2/5] makedumpfile: Add tools for reading and writing from block table Zhou Wenjian
@ 2014-09-29  7:06 ` Zhou Wenjian
  2014-10-10  8:12   ` Atsushi Kumagai
  2014-09-29  7:06 ` [PATCH v1 4/5] makedumpfile: Add module of calculating start_pfn and end_pfn in each dumpfile Zhou Wenjian
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Zhou Wenjian @ 2014-09-29  7:06 UTC (permalink / raw)
  To: kexec

Set the block size and generate the basic information of the block table.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 86 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index a4cb9b6..c6ea635 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -5731,6 +5731,52 @@ read_value_from_block_table(char *block_inner)
 	return ret;
 }
 
+/*
+ * The block size is specified as Kbyte with --block-size <size> option.
+ * if not specified ,set default value
+ */
+int
+check_block_size(void)
+{
+	if (info->block_size){
+		info->block_size <<= 10;
+		if (info->block_size < info->page_size) {
+			ERRMSG("The block size could not be smaller than page_size. %s.\n",
+									strerror(errno));
+			return FALSE;
+		}
+	}
+	else{
+		// set default 1GB
+		info->block_size = 1 << 30;
+	}
+	return TRUE;
+}
+
+int
+prepare_block_table(void)
+{
+	check_block_size();
+	if ((block = calloc(1, sizeof(struct Block))) == NULL) {
+		ERRMSG("Can't allocate memory for the block. %s.\n", strerror(errno));
+		return FALSE;
+	}
+	block->page_per_block = info->block_size/info->page_size;
+	/*
+	 *divide memory into blocks.
+	 *if there is a remainder, called it memory not managed by block
+	 *and it will be also dealt with in function calculate_end_pfn_by_block()
+	 */
+	block->num = info->max_mapnr/block->page_per_block;
+	block->entry_size = calculate_entry_size();
+	if ((block->table = (char *)calloc(sizeof(char),(block->entry_size * block->num)))
+										== NULL) {
+		ERRMSG("Can't allocate memory for the block_table. %s.\n", strerror(errno));
+		return FALSE;
+	}
+	return TRUE;
+}
+
 mdf_pfn_t
 get_num_dumpable(void)
 {
@@ -5746,9 +5792,43 @@ get_num_dumpable(void)
 	return num_dumpable;
 }
 
+/*
+ * generate block_table
+ * modified from function get_num_dumpable_cyclic
+ */
+mdf_pfn_t
+get_num_dumpable_cyclic_withsplit(void)
+{
+	mdf_pfn_t pfn, num_dumpable = 0;
+	mdf_pfn_t dumpable_pfn_num = 0, pfn_num = 0;
+	struct cycle cycle = {0};
+	int pos = 0;
+	prepare_block_table();
+	for_each_cycle(0, info->max_mapnr, &cycle) {
+		if (!exclude_unnecessary_pages_cyclic(&cycle))
+			return FALSE;
+		for (pfn = cycle.start_pfn; pfn < cycle.end_pfn; pfn++) {
+			if (is_dumpable_cyclic(info->partial_bitmap2, pfn, &cycle)) {
+				num_dumpable++;
+				dumpable_pfn_num++;
+			}
+			if (++pfn_num >= block->page_per_block) {
+				write_value_into_block_table(block->table + pos, dumpable_pfn_num);
+				pos += block->entry_size;
+				pfn_num = 0;
+				dumpable_pfn_num = 0;
+			}
+		}
+	}
+	return num_dumpable;
+}
+
 mdf_pfn_t
 get_num_dumpable_cyclic(void)
 {
+	if(info->flag_split)
+		return get_num_dumpable_cyclic_withsplit();
+
 	mdf_pfn_t pfn, num_dumpable=0;
 	struct cycle cycle = {0};
 
@@ -9703,6 +9783,12 @@ out:
 		if (info->page_buf != NULL)
 			free(info->page_buf);
 		free(info);
+
+		if (block) {
+			if (block->table)
+				free(block->table);
+			free(block);
+		}
 	}
 	free_elf_info();
 
-- 
1.7.1



* [PATCH v1 4/5] makedumpfile: Add module of calculating start_pfn and end_pfn in each dumpfile
  2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
                   ` (2 preceding siblings ...)
  2014-09-29  7:06 ` [PATCH v1 3/5] makedumpfile: Add module of generating table Zhou Wenjian
@ 2014-09-29  7:06 ` Zhou Wenjian
  2014-09-29  7:06 ` [PATCH v1 5/5] makedumpfile: Add support for --block-size Zhou Wenjian
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Zhou Wenjian @ 2014-09-29  7:06 UTC (permalink / raw)
  To: kexec

When --split is specified in cyclic mode, start_pfn and end_pfn of each dumpfile
will be calculated to make each dumpfile have the same size.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |  109 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 104 insertions(+), 5 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index c6ea635..3e66346 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -8183,6 +8183,103 @@ out:
 		return ret;
 }
 
+/*
+ * calculate end pfn in incomplete block or memory not managed by block
+ */
+mdf_pfn_t
+calculate_end_pfn_in_cycle(mdf_pfn_t start, mdf_pfn_t max,
+			    mdf_pfn_t end_pfn, long long pfn_needed_by_per_dumpfile)
+{
+	struct cycle cycle;
+	for_each_cycle(start,max,&cycle) {
+		if (!exclude_unnecessary_pages_cyclic(&cycle))
+			return FALSE;
+		while (end_pfn < cycle.end_pfn) {
+			end_pfn++;
+			if (is_dumpable_cyclic(info->partial_bitmap2, end_pfn, &cycle)){
+				if (--pfn_needed_by_per_dumpfile <= 0)
+					return ++end_pfn;
+			}
+		}
+	}
+	return ++end_pfn;
+}
+
+/*
+ * calculate end_pfn of one dumpfile.
+ * try to make every output file have the same size.
+ * block_table is used to reduce calculate time.
+ */
+
+#define CURRENT_BLOCK_PFN_NUM (*current_block * block->page_per_block)
+mdf_pfn_t
+calculate_end_pfn_by_block(mdf_pfn_t start_pfn,
+			   int *current_block,
+			   long long *current_block_pfns){
+	mdf_pfn_t end_pfn;
+	long long pfn_needed_by_per_dumpfile,offset;
+	pfn_needed_by_per_dumpfile = info->num_dumpable / info->num_dumpfile;
+	offset = *current_block * block->entry_size;
+	end_pfn = start_pfn;
+	char *block_inner = block->table + offset;
+	//calculate the part containing complete block
+	while (*current_block < block->num && pfn_needed_by_per_dumpfile > 0) {
+		if (*current_block_pfns > 0) {
+			pfn_needed_by_per_dumpfile -= *current_block_pfns ;
+			*current_block_pfns = 0 ;
+		}
+		else
+			pfn_needed_by_per_dumpfile -= read_value_from_block_table(block_inner);
+		block_inner += block->entry_size;
+		++*current_block;
+	}
+	//deal with complete block
+	if (pfn_needed_by_per_dumpfile == 0)
+		end_pfn = CURRENT_BLOCK_PFN_NUM;
+	//deal with incomplete block
+	if (pfn_needed_by_per_dumpfile < 0) {
+		--*current_block;
+		block_inner -= block->entry_size;
+		end_pfn = CURRENT_BLOCK_PFN_NUM;
+		*current_block_pfns = (-1) * pfn_needed_by_per_dumpfile;
+		pfn_needed_by_per_dumpfile += read_value_from_block_table(block_inner);
+		end_pfn = calculate_end_pfn_in_cycle(CURRENT_BLOCK_PFN_NUM,
+						     CURRENT_BLOCK_PFN_NUM+block->page_per_block,
+						     end_pfn,pfn_needed_by_per_dumpfile);
+	}
+	//deal with memory not managed by block
+	if (pfn_needed_by_per_dumpfile > 0 && *current_block >= block->num) {
+		mdf_pfn_t cycle_start_pfn = MAX(CURRENT_BLOCK_PFN_NUM,end_pfn);
+		end_pfn=calculate_end_pfn_in_cycle(cycle_start_pfn,
+						   info->max_mapnr,
+						   end_pfn,
+						   pfn_needed_by_per_dumpfile);
+	}
+	return end_pfn;
+}
+/*
+ * calculate start_pfn and end_pfn in each output file.
+ */
+static int setup_splitting_cyclic(void)
+{
+	int i;
+	mdf_pfn_t start_pfn, end_pfn;
+	long long current_block_pfns = 0;
+	int current_block = 0;
+	start_pfn = end_pfn = 0;
+	for (i = 0; i < info->num_dumpfile - 1; i++) {
+		start_pfn = end_pfn;
+		end_pfn = calculate_end_pfn_by_block(start_pfn,
+						     &current_block,
+						     &current_block_pfns);
+		SPLITTING_START_PFN(i) = start_pfn;
+		SPLITTING_END_PFN(i) = end_pfn;
+	}
+	SPLITTING_START_PFN(info->num_dumpfile - 1) = end_pfn;
+	SPLITTING_END_PFN(info->num_dumpfile - 1) = info->max_mapnr;
+	return TRUE;
+}
+
 int
 setup_splitting(void)
 {
@@ -8196,12 +8293,14 @@ setup_splitting(void)
 		return FALSE;
 
 	if (info->flag_cyclic) {
-		for (i = 0; i < info->num_dumpfile; i++) {
-			SPLITTING_START_PFN(i) = divideup(info->max_mapnr, info->num_dumpfile) * i;
-			SPLITTING_END_PFN(i)   = divideup(info->max_mapnr, info->num_dumpfile) * (i + 1);
+		int ret = FALSE;
+		if(!prepare_bitmap2_buffer_cyclic()){
+			free_bitmap_buffer();
+			return ret;
 		}
-		if (SPLITTING_END_PFN(i-1) > info->max_mapnr)
-			SPLITTING_END_PFN(i-1) = info->max_mapnr;
+		ret = setup_splitting_cyclic();
+		free_bitmap2_buffer_cyclic();
+		return ret;
         } else {
 		initialize_2nd_bitmap(&bitmap2);
 
-- 
1.7.1



* [PATCH v1 5/5] makedumpfile: Add support for --block-size
  2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
                   ` (3 preceding siblings ...)
  2014-09-29  7:06 ` [PATCH v1 4/5] makedumpfile: Add module of calculating start_pfn and end_pfn in each dumpfile Zhou Wenjian
@ 2014-09-29  7:06 ` Zhou Wenjian
  2014-10-10  8:11   ` Atsushi Kumagai
  2014-10-07  2:49 ` [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time "Zhou, Wenjian/周文剑"
  2014-10-10  4:12 ` "Zhou, Wenjian/周文剑"
  6 siblings, 1 reply; 11+ messages in thread
From: Zhou Wenjian @ 2014-09-29  7:06 UTC (permalink / raw)
  To: kexec

Use --block-size to specify the block size (in KB).
When --split is specified in cyclic mode, the block table will be
generated in get_num_dumpable_cyclic.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.8 |   16 ++++++++++++++++
 makedumpfile.c |    4 ++++
 makedumpfile.h |    1 +
 3 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.8 b/makedumpfile.8
index 9cb12c0..a384213 100644
--- a/makedumpfile.8
+++ b/makedumpfile.8
@@ -386,6 +386,22 @@ size, so ordinary users don't need to specify this option.
 # makedumpfile \-\-cyclic\-buffer 1024 \-d 31 \-x vmlinux /proc/vmcore dumpfile
 
 .TP
+\fB\-\-block\-size\fR \fIblock_size\fR
+Specify the block size in kilobytes for analysis in the cyclic mode with --split.
+In the cyclic split mode, the number of blocks is calculated as:
+
+    num_of_blocks = system_memory / (\fIblock_size\fR * 1KB)
+
+The larger the number of blocks, the faster the expected working speed, but the more memory
+is used. By default, \fIblock_size\fR is set to 1GB, so ordinary users don't need to
+specify this option.
+
+.br
+.B Example:
+.br
+# makedumpfile \-\-block\-size 10240 \-d 31 \-x vmlinux \-\-split /proc/vmcore dumpfile1 dumpfile2
+
+.TP
 \fB\-\-non\-cyclic\fR
 Running in the non-cyclic mode, this mode uses the old filtering logic same as v1.4.4 or before.
 If you feel the cyclic mode is too slow, please try this mode.
diff --git a/makedumpfile.c b/makedumpfile.c
index 3e66346..405d935 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -9571,6 +9571,7 @@ static struct option longopts[] = {
 	{"eppic", required_argument, NULL, OPT_EPPIC},
 	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
 	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
+	{"block-size", required_argument, NULL, OPT_BLOCK_SIZE},
 	{0, 0, 0, 0}
 };
 
@@ -9711,6 +9712,9 @@ main(int argc, char *argv[])
 		case OPT_CYCLIC_BUFFER:
 			info->bufsize_cyclic = atoi(optarg);
 			break;
+		case OPT_BLOCK_SIZE:
+			info->block_size = atoi(optarg);
+			break;
 		case '?':
 			MSG("Commandline parameter is invalid.\n");
 			MSG("Try `makedumpfile --help' for more information.\n");
diff --git a/makedumpfile.h b/makedumpfile.h
index ed4f799..56d2c79 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1883,6 +1883,7 @@ struct elf_prstatus {
 #define OPT_EPPIC               OPT_START+12
 #define OPT_NON_MMAP            OPT_START+13
 #define OPT_MEM_USAGE            OPT_START+14
+#define OPT_BLOCK_SIZE		OPT_START+15
 
 /*
  * Function Prototype.
-- 
1.7.1



* Re: [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
                   ` (4 preceding siblings ...)
  2014-09-29  7:06 ` [PATCH v1 5/5] makedumpfile: Add support for --block-size Zhou Wenjian
@ 2014-10-07  2:49 ` "Zhou, Wenjian/周文剑"
  2014-10-10  4:12 ` "Zhou, Wenjian/周文剑"
  6 siblings, 0 replies; 11+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2014-10-07  2:49 UTC (permalink / raw)
  To: kexec

ping ...


* Re: [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
                   ` (5 preceding siblings ...)
  2014-10-07  2:49 ` [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time "Zhou, Wenjian/周文剑"
@ 2014-10-10  4:12 ` "Zhou, Wenjian/周文剑"
  6 siblings, 0 replies; 11+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2014-10-10  4:12 UTC (permalink / raw)
  To: kexec

Maybe I should give more information about the issue.

When the --split option is specified, a fair I/O workload should be assigned to each process
to get the most benefit from parallel processing.

However, the current implementation of setup_splitting() in cyclic mode does not take
filtering into account at all, so the dumpfiles can differ greatly in size.

To solve the problem, we should count dumpable pfns instead of all pfns. That means
the start and end pfn of each dumpfile must be calculated with filtering applied.

So HATAYAMA Daisuke proposed the 3-pass algorithm. It deals with the issue
by doing the complete filtering in setup_splitting_cyclic().
(For an implementation of the 3-pass algorithm, see
http://lists.infradead.org/pipermail/kexec/2014-March/011339.html)

However, with the 3-pass algorithm, if --split is specified in cyclic mode, we do filtering three times:
in get_dumpable_pages_cyclic(), in setup_splitting_cyclic() and in writeout_dumpfile().
According to past benchmarks, filtering takes a long time on systems with huge memory,
so it needs to be optimized.


Then came the 2-pass algorithm. We remove the filtering in setup_splitting_cyclic(). Since we
only need to count the dumpable pfns, we can record the number of dumpable pfns during the first filtering
and calculate the start and end pfn from those numbers.

We divide memory into several parts (we call each part a block; the default block size is 1GB). The number
of dumpable pages in each block is recorded during the first filtering. When calculating the split points,
these per-block counts mean we don't need to filter the whole memory again.
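
The per-block counting described above might look roughly like the sketch below. It is only an illustration: a plain array of flags (dumpable[pfn]) stands in for the result of the real filtering pass, and count_per_block() and all of its parameters are hypothetical names, not makedumpfile functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the first filtering pass: dumpable[pfn] is
 * nonzero when the page would be kept. Fills counts[] with one total per
 * complete block and returns the overall number of dumpable pages.
 */
static long long count_per_block(const unsigned char *dumpable, size_t max_pfn,
				 size_t pages_per_block,
				 long long *counts, size_t num_blocks)
{
	long long total = 0;
	size_t pfn, i;

	assert(pages_per_block > 0);
	for (i = 0; i < num_blocks; i++)
		counts[i] = 0;
	for (pfn = 0; pfn < max_pfn; pfn++) {
		size_t blk = pfn / pages_per_block;

		if (!dumpable[pfn])
			continue;
		total++;
		if (blk < num_blocks)	/* the tail past the last full block has no entry */
			counts[blk]++;
	}
	return total;
}
```

The point of the design is that counts[] is filled during a pass that already has to happen, so the later split calculation can read these totals instead of filtering again.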

These algorithms can be described as follows:

	current:
		get_dumpable_pages_cyclic():
						do filtering
						count all dumpable pages
		setup_splitting():
						calculate start-end pfn without counting dumpable pages

		writeout_dumpfile():
						do filtering
						write data

	3-pass:
		get_dumpable_pages_cyclic():
						do filtering
						count all dumpable pages
		setup_splitting_cyclic():
						do filtering
						count dumpable pages of each dumpfile
						calculate start-end pfn of each dumpfile
		writeout_dumpfile():
						do filtering
						write data

	2-pass:
		get_dumpable_pages_cyclic():
						do filtering
						count dumpable pages of each block
						count all dumpable pages
		setup_splitting_cyclic():
						calculate start-end pfn of each dumpfile with the help of block

		writeout_dumpfile():
						do filtering
						write data
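
The setup_splitting_cyclic() step of the 2-pass outline can be sketched in C as below. This is a deliberately simplified illustration with hypothetical names (split_by_blocks() and its parameters): it places boundaries on block edges only, whereas the patch additionally re-filters the single block in which the target count is crossed to find the exact pfn.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified splitter: choose an end pfn for each output file
 * from per-block dumpable counts. Boundaries fall on block edges only.
 */
static void split_by_blocks(const long long *counts, size_t num_blocks,
			    size_t pages_per_block, size_t max_pfn,
			    long long total_dumpable, int num_files,
			    size_t *end_pfn)
{
	long long target;
	size_t blk = 0;
	int i;

	assert(num_files > 0);
	target = total_dumpable / num_files;
	for (i = 0; i < num_files - 1; i++) {
		long long got = 0;

		/* take whole blocks until this file holds its share */
		while (blk < num_blocks && got < target)
			got += counts[blk++];
		end_pfn[i] = blk * pages_per_block;
	}
	/* the last file takes everything that is left */
	end_pfn[num_files - 1] = max_pfn;
}
```

Each file starts where the previous one ended, so only the end pfns need to be stored. With block-granular boundaries the files can still differ by up to one block's worth of pages, which is the gap the in-block refinement in the patch is meant to close.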

The performance of the two algorithms (2-pass and 3-pass) was tested. The results can be found in
the cover letter.


On 09/29/2014 03:06 PM, Zhou Wenjian wrote:
> The issue is discussed at http://lists.infradead.org/pipermail/kexec/2014-March/011289.html
>
> This patch implements the idea of 2-pass algorhythm with smaller memory to manage block table.
> Exactly the algorhythm is still 3-pass,but the time of second pass is much shorter.
> The tables below show the performence with different size of cyclic-buffer and block.
> The test is executed on the machine having 128G memory.
>
> the value is total time (including first pass and second pass).
> the value in brackets is the time of second pass.
> 															      sec
> 	cyclic-buffer	1		2		4		8		16		32		64
> block-size
> 1M			4.74(0.00)	4.22(0.01)	3.94(0.01)	3.78(0.02)	3.71(0.03)	3.73(0.07)	3.74(0.10)	
> 2M			4.74(0.00)	4.19(0.00)	3.94(0.01)	3.80(0.03)	3.71(0.03)	3.72(0.07)	3.72(0.09)	
> 4M			4.73(0.00)	4.21(0.01)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.73(0.08)	3.73(0.10)	
> 8M			4.73(0.00)	4.19(0.00)	3.94(0.01)	3.83(0.02)	3.73(0.03)	3.72(0.07)	3.74(0.10)	
> 16M			4.74(0.01)	4.21(0.00)	3.94(0.01)	3.76(0.01)	3.73(0.03)	3.73(0.08)	3.74(0.10)	
> 32M			4.72(0.00)	4.20(0.02)	3.92(0.01)	3.77(0.02)	3.71(0.02)	3.70(0.06)	3.74(0.10)	
> 64M			4.74(0.01)	4.20(0.00)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.71(0.07)	3.72(0.09)	
> 128M			4.73(0.01)	4.20(0.00)	3.94(0.01)	3.78(0.02)	3.76(0.03)	3.72(0.08)	3.74(0.09)	
> 256M			4.75(0.02)	4.22(0.02)	3.96(0.03)	3.78(0.02)	3.70(0.03)	3.70(0.07)	3.74(0.11)	
> 512M			4.77(0.04)	4.21(0.03)	3.97(0.04)	3.79(0.03)	3.73(0.04)	3.75(0.09)	3.82(0.13)	
> 1G			4.82(0.09)	4.26(0.07)	4.00(0.08)	3.83(0.07)	3.76(0.08)	3.73(0.08)	3.76(0.12)	
> 2G			8.26(3.54)	7.34(3.14)	6.86(2.93)	6.56(2.80)	6.44(2.76)	6.45(2.79)	6.42(2.80)
>
> the performence of 3-pass algorhythm
> origin			8.25(3.54)	7.26(3.11)	6.80(2.91)	6.52(2.80)	6.39(2.76)	6.40(2.78)	6.45(2.85)
>
> 															       sec
> 	cyclic-buffer	128		256		512		1024		2048		4096		8192	
> block-size
> 1M			3.83(0.21)	3.94(0.33)	4.16(0.54)	4.61(0.99)	7.03(3.41)	8.73(5.11)	8.69(5.08)
> 2M			3.86(0.21)	3.92(0.32)	4.16(0.54)	4.64(0.98)	7.02(3.41)	8.71(5.09)	8.72(5.09)
> 4M			3.82(0.21)	3.95(0.32)	4.18(0.55)	4.62(0.99)	7.05(3.44)	8.70(5.09)	8.68(5.07)
> 8M			3.82(0.21)	3.95(0.33)	4.17(0.54)	4.58(0.97)	7.03(3.41)	8.79(5.16)	8.71(5.09)
> 16M			3.83(0.21)	3.93(0.31)	4.15(0.54)	4.60(0.98)	7.06(3.43)	8.76(5.13)	8.73(5.10)
> 32M			3.84(0.22)	3.93(0.32)	4.15(0.54)	4.61(0.98)	7.00(3.40)	8.69(5.08)	8.75(5.13)
> 64M			3.84(0.21)	3.94(0.33)	4.15(0.54)	4.60(0.98)	7.04(3.42)	8.74(5.10)	8.80(5.16)
> 128M			3.85(0.22)	3.97(0.33)	4.16(0.54)	4.60(0.98)	7.07(3.44)	8.68(5.07)	8.69(5.07)
> 256M			3.84(0.21)	3.94(0.33)	4.16(0.55)	4.64(1.00)	7.02(3.41)	8.74(5.11)	8.73(5.11)
> 512M			3.85(0.24)	3.97(0.34)	4.17(0.56)	4.61(0.99)	7.05(3.44)	8.73(5.11)	8.75(5.13)
> 1G			3.85(0.22)	3.96(0.35)	4.18(0.56)	4.65(1.00)	7.06(3.44)	8.76(5.12)	8.72(5.11)
> 2G			6.53(2.91)	6.86(3.25)	7.54(3.92)	8.95(5.31)	10.60(6.97)	14.08(10.47)	14.32(10.60)
>
> the performance of the original 3-pass algorithm
> origin			6.64(3.05)	6.81(3.24)	7.51(3.93)	8.86(5.30)	10.51(6.94)	13.92(10.36)	14.11(10.55)
>
> Zhou Wenjian (5):
>    Add support for block
>    Add tools for reading and writing from block table
>    Add module of generating table
>    Add module of calculating start_pfn and end_pfn in each dumpfile
>    Add support for --block-size
>
>   makedumpfile.8 |   16 ++++
>   makedumpfile.c |  245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>   makedumpfile.h |   15 ++++
>   3 files changed, 271 insertions(+), 5 deletions(-)
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec



* RE: [PATCH v1 1/5] makedumpfile: Add support for block
  2014-09-29  7:06 ` [PATCH v1 1/5] makedumpfile: Add support for block Zhou Wenjian
@ 2014-10-10  8:11   ` Atsushi Kumagai
  0 siblings, 0 replies; 11+ messages in thread
From: Atsushi Kumagai @ 2014-10-10  8:11 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

>When the --split option is specified, fair I/O workloads should be assigned
>to each process. So the start and end pfn of each dumpfile should be
>calculated with unnecessary pages excluded. However, it takes a lot of
>time to run the exclusion over the whole memory. That is why struct Block
>exists. Struct Block is designed to manage memory, mainly by recording
>the number of dumpable pages. We can use the number of dumpable pages to
>calculate the start and end pfn instead of running the exclusion over the
>whole memory.

*Block* is a general word; it may suggest other things (e.g. Disk Block).
Also, --block-size is confusing since the -b option uses *block_order*.
I prefer a more specific name such as SplitBlock.


Thanks,
Atsushi Kumagai

>The char array *table in struct Block is used to record the number of
>dumpable pages in each block.
>The table entry size is calculated as
>			divideup(log2(block_size / page_size), 8) bytes
>The entry size is computed this way so that the table takes as little
>space as possible. The code will also perform well when the number of
>pages in one block is large enough.
>
>Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
>Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
>---
> makedumpfile.c |   23 +++++++++++++++++++++++
> makedumpfile.h |   14 ++++++++++++++
> 2 files changed, 37 insertions(+), 0 deletions(-)
>
>diff --git a/makedumpfile.c b/makedumpfile.c
>index b4d43d8..2feda01 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -34,6 +34,7 @@ struct srcfile_table	srcfile_table;
>
> struct vm_table		vt = { 0 };
> struct DumpInfo		*info = NULL;
>+struct Block		*block = NULL;
>
> char filename_stdout[] = FILENAME_STDOUT;
>
>@@ -5685,6 +5686,28 @@ out:
> 	return ret;
> }
>
>+/*
>+ * cyclic_split mode:
>+ *	manage memory by blocks,
>+ *	divide memory into blocks
>+ *	use block_table to record numbers of dumpable pages in each block
>+ */
>+
>+//calculate entry size based on the amount of pages in one block
>+int
>+calculate_entry_size(void){
>+	int entry_num = 1, count = 1;
>+	int entry_size;
>+	while (entry_num < block->page_per_block){
>+		entry_num = entry_num << 1;
>+		count++;
>+	}
>+	entry_size = count/BITPERBYTE;
>+	if (count %BITPERBYTE)
>+		entry_size++;
>+	return entry_size;
>+}
>+
> mdf_pfn_t
> get_num_dumpable(void)
> {
>diff --git a/makedumpfile.h b/makedumpfile.h
>index 96830b0..ed4f799 100644
>--- a/makedumpfile.h
>+++ b/makedumpfile.h
>@@ -1168,10 +1168,24 @@ struct DumpInfo {
> 	 */
> 	int (*page_is_buddy)(unsigned long flags, unsigned int _mapcount,
> 			     unsigned long private, unsigned int _count);
>+	/*
>+	 *for cyclic_splitting mode, setup block_size
>+	 */
>+	long long block_size;
> };
> extern struct DumpInfo		*info;
>
> /*
>+ *for cyclic_splitting mode,Manage memory by block
>+ */
>+struct Block{
>+        char *table;
>+        long long num;
>+        long long page_per_block;
>+        int entry_size;                 //counted by byte
>+};
>+
>+/*
>  * kernel VM-related data
>  */
> struct vm_table {
>--
>1.7.1
>
>
>_______________________________________________
>kexec mailing list
>kexec@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/kexec
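
A minimal sketch of the entry-size rule from the commit message above, mirroring the loop in calculate_entry_size(). The helper name entry_size_for() is hypothetical and not part of the patch. Note that a full block can hold a count equal to pages_per_block itself, so the loop allots one bit more than log2(pages_per_block): with 256 pages per block the count 256 needs 9 bits, hence 2 bytes.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */

/*
 * Hypothetical helper (not from the patch): bytes needed for one block
 * table entry, following the same counting loop as calculate_entry_size().
 */
static int entry_size_for(long long pages_per_block)
{
	long long capacity = 1;  /* grows to the first power of two >= pages_per_block */
	int bits = 1;            /* starts at 1, so one bit beyond log2 is allotted */

	assert(pages_per_block > 0);

	while (capacity < pages_per_block) {
		capacity <<= 1;
		bits++;
	}
	/* round the bit count up to whole bytes */
	return (bits + CHAR_BIT - 1) / CHAR_BIT;
}
```

With the default 1GB block and 4KB pages (262144 pages per block) this yields 3 bytes per entry.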


* RE: [PATCH v1 5/5] makedumpfile: Add support for --block-size
  2014-09-29  7:06 ` [PATCH v1 5/5] makedumpfile: Add support for --block-size Zhou Wenjian
@ 2014-10-10  8:11   ` Atsushi Kumagai
  0 siblings, 0 replies; 11+ messages in thread
From: Atsushi Kumagai @ 2014-10-10  8:11 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

>Use --block-size to specify the block size (in KB).
>When --split is specified in cyclic mode, the block table will be
>generated in get_num_dumpable_cyclic.
>
>Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
>Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
>---
> makedumpfile.8 |   16 ++++++++++++++++
> makedumpfile.c |    4 ++++
> makedumpfile.h |    1 +
> 3 files changed, 21 insertions(+), 0 deletions(-)
>
>diff --git a/makedumpfile.8 b/makedumpfile.8
>index 9cb12c0..a384213 100644
>--- a/makedumpfile.8
>+++ b/makedumpfile.8
>@@ -386,6 +386,22 @@ size, so ordinary users don't need to specify this option.
> # makedumpfile \-\-cyclic\-buffer 1024 \-d 31 \-x vmlinux /proc/vmcore dumpfile
>
> .TP
>+\fB\-\-block\-size\fR \fIblock_size\fR
>+Specify the block size in kilo bytes for analysis in the cyclic mode with --split.
>+In the cyclic split mode, the number of blocks is represented as:
>+
>+    num_of_blocks = system_memory / (\fIblock_size\fR * 1KB )
>+
>+The larger number of block, the faster working speed is expected, but the more memory will
>+be taken. By default, \fIblock_size\fR will be set as 1GB, so ordinary users don't need to
>+specify this option.
>+
>+.br
>+.B Example:
>+.br
>+# makedumpfile \-\-block\-size 10240 \-d 31 \-x vmlinux \-\-split /proc/vmcore dumpfile1 dumpfile2
>+
>+.TP

Also print_usage() in print_info.c should be modified.


Thanks,
Atsushi Kumagai

> \fB\-\-non\-cyclic\fR
> Running in the non-cyclic mode, this mode uses the old filtering logic same as v1.4.4 or before.
> If you feel the cyclic mode is too slow, please try this mode.
>diff --git a/makedumpfile.c b/makedumpfile.c
>index 3e66346..405d935 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -9571,6 +9571,7 @@ static struct option longopts[] = {
> 	{"eppic", required_argument, NULL, OPT_EPPIC},
> 	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
> 	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
>+	{"block-size", required_argument, NULL, OPT_BLOCK_SIZE},
> 	{0, 0, 0, 0}
> };
>
>@@ -9711,6 +9712,9 @@ main(int argc, char *argv[])
> 		case OPT_CYCLIC_BUFFER:
> 			info->bufsize_cyclic = atoi(optarg);
> 			break;
>+		case OPT_BLOCK_SIZE:
>+			info->block_size = atoi(optarg);
>+			break;
> 		case '?':
> 			MSG("Commandline parameter is invalid.\n");
> 			MSG("Try `makedumpfile --help' for more information.\n");
>diff --git a/makedumpfile.h b/makedumpfile.h
>index ed4f799..56d2c79 100644
>--- a/makedumpfile.h
>+++ b/makedumpfile.h
>@@ -1883,6 +1883,7 @@ struct elf_prstatus {
> #define OPT_EPPIC               OPT_START+12
> #define OPT_NON_MMAP            OPT_START+13
> #define OPT_MEM_USAGE            OPT_START+14
>+#define OPT_BLOCK_SIZE		OPT_START+15
>
> /*
>  * Function Prototype.
>--
>1.7.1
>
>
>_______________________________________________
>kexec mailing list
>kexec@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/kexec
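
The num_of_blocks formula in the man page text above can be worked through as follows. This is only a sketch: struct table_layout and layout_for() are hypothetical names, and it assumes max_mapnr is roughly system memory divided by the page size (memory holes are ignored). The entry size is passed in rather than derived.

```c
#include <assert.h>

/* Hypothetical summary of a block table layout; not a structure from the patch. */
struct table_layout {
	long long pages_per_block;
	long long num_blocks;       /* whole blocks only */
	long long remainder_pages;  /* pages past the last whole block */
	long long table_bytes;      /* num_blocks * entry_size */
};

static struct table_layout
layout_for(long long max_mapnr, long long block_size_kb,
	   long long page_size, int entry_size)
{
	struct table_layout l;
	long long block_bytes = block_size_kb << 10;   /* --block-size is given in KB */

	assert(page_size > 0 && block_bytes >= page_size);

	l.pages_per_block = block_bytes / page_size;
	l.num_blocks = max_mapnr / l.pages_per_block;
	l.remainder_pages = max_mapnr % l.pages_per_block;
	l.table_bytes = l.num_blocks * entry_size;
	return l;
}
```

For a 128G machine with 4KB pages (33554432 page frames), a 1GB block gives 128 blocks and a 384-byte table with 3-byte entries, while a 1MB block gives 131072 blocks and a 256KB table with 2-byte entries.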


* RE: [PATCH v1 3/5] makedumpfile: Add module of generating table
  2014-09-29  7:06 ` [PATCH v1 3/5] makedumpfile: Add module of generating table Zhou Wenjian
@ 2014-10-10  8:12   ` Atsushi Kumagai
  0 siblings, 0 replies; 11+ messages in thread
From: Atsushi Kumagai @ 2014-10-10  8:12 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

>set block size and generate basic information of block table
>
>Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
>Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
>Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
>---
> makedumpfile.c |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 86 insertions(+), 0 deletions(-)
>
>diff --git a/makedumpfile.c b/makedumpfile.c
>index a4cb9b6..c6ea635 100644
>--- a/makedumpfile.c
>+++ b/makedumpfile.c
>@@ -5731,6 +5731,52 @@ read_value_from_block_table(char *block_inner)
> 	return ret;
> }
>
>+/*
>+ * The block size is specified as Kbyte with --block-size <size> option.
>+ * if not specified ,set default value
>+ */
>+int
>+check_block_size(void)
>+{
>+	if (info->block_size){
>+		info->block_size <<= 10;
>+		if (info->block_size < info->page_size) {
>+			ERRMSG("The block size could not be smaller than page_size. %s.\n",
>+									strerror(errno));
>+		return FALSE;
>+		}

Isn't it necessary to align the block size to the page size ?
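
One possible way to do such an alignment, as a sketch only: round the byte size down to a multiple of the page size. The helper align_block_size() is hypothetical, and rounding down (rather than up) is an assumption, not something decided in this thread.

```c
#include <assert.h>

/*
 * Hypothetical helper: round block_size (bytes) down to a multiple of
 * page_size. Returns 0 when block_size is smaller than one page, which
 * the caller should treat as an error.
 */
static long long align_block_size(long long block_size, long long page_size)
{
	assert(page_size > 0);

	if (block_size < page_size)
		return 0;
	return block_size - (block_size % page_size);
}
```

For example, a 6KB request with 4KB pages becomes 4KB, so every block covers a whole number of pages.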

>+	}
>+	else{
>+		// set default 1GB
>+		info->block_size = 1 << 30;
>+	}
>+	return TRUE;
>+}
>+
>+int
>+prepare_block_table(void)
>+{
>+	check_block_size();

The return code should be checked; otherwise this check is useless.


Thanks,
Atsushi Kumagai
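
A sketch of what propagating that return code could look like. The *_sketch functions and the SKETCH_* constants are hypothetical stand-ins for the patch's functions and TRUE/FALSE. Without the check, a request such as --block-size 2 with 4KB pages would leave page_per_block at zero, and the later division by it would be undefined.

```c
#include <stdio.h>

enum { SKETCH_FALSE = 0, SKETCH_TRUE = 1 };

/*
 * Stand-in for check_block_size(): *block_size arrives in KB (0 means
 * "not specified") and leaves in bytes.
 */
static int check_block_size_sketch(long long *block_size, long long page_size)
{
	if (*block_size == 0) {
		*block_size = 1LL << 30;   /* default: 1GB, as in the patch */
		return SKETCH_TRUE;
	}
	*block_size <<= 10;                /* KB -> bytes */
	if (*block_size < page_size) {
		fprintf(stderr, "The block size must not be smaller than page_size.\n");
		return SKETCH_FALSE;
	}
	return SKETCH_TRUE;
}

/* Stand-in for the start of prepare_block_table(): stop on a bad size. */
static int prepare_block_table_sketch(long long *block_size, long long page_size,
				      long long *pages_per_block)
{
	if (!check_block_size_sketch(block_size, page_size))
		return SKETCH_FALSE;       /* pages_per_block is left untouched */

	*pages_per_block = *block_size / page_size;
	return SKETCH_TRUE;
}
```

The caller of prepare_block_table() would need the same treatment, since the quoted get_num_dumpable_cyclic_withsplit() also ignores its result.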

>+	if ((block = calloc(1, sizeof(struct Block))) == NULL) {
>+		ERRMSG("Can't allocate memory for the block. %s.\n", strerror(errno));
>+		return FALSE;
>+	}
>+	block->page_per_block = info->block_size/info->page_size;
>+	/*
>+	 *divide memory into blocks.
>+	 *if there is a remainder, called it memory not managed by block
>+	 *and it will be also dealt with in function calculate_end_pfn_by_block()
>+	 */
>+	block->num = info->max_mapnr/block->page_per_block;
>+	block->entry_size = calculate_entry_size();
>+	if ((block->table = (char *)calloc(sizeof(char),(block->entry_size * block->num)))
>+										== NULL) {
>+		ERRMSG("Can't allocate memory for the block_table. %s.\n", strerror(errno));
>+		return FALSE;
>+	}
>+	return TRUE;
>+}
>+
> mdf_pfn_t
> get_num_dumpable(void)
> {
>@@ -5746,9 +5792,43 @@ get_num_dumpable(void)
> 	return num_dumpable;
> }
>
>+/*
>+ * generate block_table
>+ * modified from function get_num_dumpable_cyclic
>+ */
>+mdf_pfn_t
>+get_num_dumpable_cyclic_withsplit(void)
>+{
>+	mdf_pfn_t pfn, num_dumpable = 0;
>+	mdf_pfn_t dumpable_pfn_num = 0, pfn_num = 0;
>+	struct cycle cycle = {0};
>+	int pos = 0;
>+	prepare_block_table();
>+	for_each_cycle(0, info->max_mapnr, &cycle) {
>+		if (!exclude_unnecessary_pages_cyclic(&cycle))
>+			return FALSE;
>+		for (pfn = cycle.start_pfn; pfn < cycle.end_pfn; pfn++) {
>+			if (is_dumpable_cyclic(info->partial_bitmap2, pfn, &cycle)) {
>+				num_dumpable++;
>+				dumpable_pfn_num++;
>+			}
>+			if (++pfn_num >= block->page_per_block) {
>+				write_value_into_block_table(block->table + pos, dumpable_pfn_num);
>+				pos += block->entry_size;
>+				pfn_num = 0;
>+				dumpable_pfn_num = 0;
>+			}
>+		}
>+	}
>+	return num_dumpable;
>+}
>+
> mdf_pfn_t
> get_num_dumpable_cyclic(void)
> {
>+	if(info->flag_split)
>+		return get_num_dumpable_cyclic_withsplit();
>+
> 	mdf_pfn_t pfn, num_dumpable=0;
> 	struct cycle cycle = {0};
>
>@@ -9703,6 +9783,12 @@ out:
> 		if (info->page_buf != NULL)
> 			free(info->page_buf);
> 		free(info);
>+
>+		if (block) {
>+			if (block->table)
>+				free(block->table);
>+		free(block);
>+		}
> 	}
> 	free_elf_info();
>
>--
>1.7.1
>
>
>_______________________________________________
>kexec mailing list
>kexec@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/kexec
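
The per-block counting done by get_num_dumpable_cyclic_withsplit() above can be illustrated on a plain array. This is a simplified sketch, not the patch's code: count_dumpable_per_block() is a hypothetical name, the cyclic bitmap is replaced by one byte per page frame, and counts are stored as size_t rather than packed entries. As in the patch, pages after the last whole block are not written to the table.

```c
#include <stddef.h>

/*
 * dumpable[pfn] is nonzero when page frame pfn would be dumped.
 * counts[] must have room for num_pfns / pages_per_block entries.
 * Returns the total number of dumpable pages; *tail_count receives the
 * dumpable pages in the trailing partial block, if any.
 */
static size_t
count_dumpable_per_block(const unsigned char *dumpable, size_t num_pfns,
			 size_t pages_per_block, size_t *counts,
			 size_t *tail_count)
{
	size_t total = 0, in_block = 0, seen = 0, block = 0;

	for (size_t pfn = 0; pfn < num_pfns; pfn++) {
		if (dumpable[pfn]) {
			total++;
			in_block++;
		}
		if (++seen >= pages_per_block) {
			counts[block++] = in_block;   /* block is complete */
			seen = 0;
			in_block = 0;
		}
	}
	*tail_count = in_block;
	return total;
}
```

With these per-block counts available, the split boundaries can be chosen by summing table entries instead of re-running the page exclusion over all of memory.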


end of thread, other threads:[~2014-10-10  8:24 UTC | newest]

Thread overview: 11+ messages
2014-09-29  7:06 [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
2014-09-29  7:06 ` [PATCH v1 1/5] makedumpfile: Add support for block Zhou Wenjian
2014-10-10  8:11   ` Atsushi Kumagai
2014-09-29  7:06 ` [PATCH v1 2/5] makedumpfile: Add tools for reading and writing from block table Zhou Wenjian
2014-09-29  7:06 ` [PATCH v1 3/5] makedumpfile: Add module of generating table Zhou Wenjian
2014-10-10  8:12   ` Atsushi Kumagai
2014-09-29  7:06 ` [PATCH v1 4/5] makedumpfile: Add module of calculating start_pfn and end_pfn in each dumpfile Zhou Wenjian
2014-09-29  7:06 ` [PATCH v1 5/5] makedumpfile: Add support for --block-size Zhou Wenjian
2014-10-10  8:11   ` Atsushi Kumagai
2014-10-07  2:49 ` [PATCH v1 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time "Zhou, Wenjian/周文剑"
2014-10-10  4:12 ` "Zhou, Wenjian/周文剑"