* [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file.
@ 2015-05-11 6:13 Atsushi Kumagai
2015-05-12 8:20 ` HATAYAMA Daisuke
0 siblings, 1 reply; 5+ messages in thread
From: Atsushi Kumagai @ 2015-05-11 6:13 UTC (permalink / raw)
To: kexec
Hello,
This is the patch set to avoid two-pass filtering; it is the
finished version of the previous patch set below:

http://lists.infradead.org/pipermail/kexec/2015-March/013497.html

Cyclic mode has to take a two-pass approach to filtering to save
memory; this is a disadvantage of cyclic mode and is basically
unavoidable. However, even cyclic mode can avoid two-pass filtering if free
memory space is enough to store the whole 1st and 2nd bitmaps, but the current
version doesn't do that.
The main purpose of this patch set is to avoid that useless extra filtering,
but before that, I merged non-cyclic mode into cyclic mode as a code cleanup
because the code paths are almost the same. Instead, I introduce another way to
guarantee one-pass filtering by using disk space.
MAJOR CHANGES
1. Introduce --work-dir option instead of --non-cyclic
The --non-cyclic option was used to choose non-cyclic mode; it realizes
one-pass filtering by creating a temporary bitmap file in TMPDIR.
The new option --work-dir is used to specify a working directory
in which to store the bitmap file, so it is the alternative to the
combination of --non-cyclic and TMPDIR.
2. Remove extra page filtering
If free memory space is enough to store the whole 1st and 2nd bitmaps
(or --work-dir is specified), page filtering and bitmap creation are
done only once, before the writing process. Otherwise the bitmaps are
created twice: once to decide the offset of the page header region, and
once for the actual page writing process, as before.
INTERNAL CHANGE
info->flag_cyclic used to indicate whether the internal mode was cyclic
or non-cyclic. Now that the two modes are merged, the flag means that the
filtering process will take multiple cycles, i.e. that it has to take a
two-pass approach.
Atsushi Kumagai (13):
Organize bitmap structure for cyclic logic.
Add option to specify working directory for the bitmap.
Integrate the entry point of is_dumpable().
Integrate the main logic of writing kdump file.
Communalize the function for creating 1st bitmap.
Remove the old logic of writing kdump pages.
Integrate filtering process for ELF path.
Remove the old logic of writing ELF pages.
Adjust --mem-usage path to the new code.
Adjust --split/--reassemble path to the new code.
Adjust refiltering path to the new code.
Adjust sadump path to the new code.
Remove --non-cyclic option.
README | 16 -
makedumpfile.8 | 32 +-
makedumpfile.c | 1624 ++++++++++++++++++--------------------------------------
makedumpfile.h | 41 +-
print_info.c | 23 +-
sadump_info.c | 8 +-
6 files changed, 588 insertions(+), 1156 deletions(-)
--
1.9.0
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* Re: [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file.
2015-05-11 6:13 [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file Atsushi Kumagai
@ 2015-05-12 8:20 ` HATAYAMA Daisuke
2015-05-13 8:04 ` Atsushi Kumagai
0 siblings, 1 reply; 5+ messages in thread
From: HATAYAMA Daisuke @ 2015-05-12 8:20 UTC (permalink / raw)
To: ats-kumagai; +Cc: kexec
Hello Kumagai-san,
From: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Subject: [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file.
Date: Mon, 11 May 2015 06:13:51 +0000
> Hello,
>
> This is the patch set to avoid two-pass filtering; it is the
> finished version of the previous patch set below:
>
> http://lists.infradead.org/pipermail/kexec/2015-March/013497.html
>
>
> Cyclic mode has to take a two-pass approach to filtering to save
> memory; this is a disadvantage of cyclic mode and is basically
> unavoidable. However, even cyclic mode can avoid two-pass filtering if free
> memory space is enough to store the whole 1st and 2nd bitmaps, but the current
> version doesn't do that.
> The main purpose of this patch set is to avoid that useless extra filtering,
> but before that, I merged non-cyclic mode into cyclic mode as a code cleanup
> because the code paths are almost the same. Instead, I introduce another way to
> guarantee one-pass filtering by using disk space.
>
How about compromising progress information to some extent? The first
pass is intended to count up the exact number of dumpable pages just
to provide precise progress information. Is such precision really
needed?
For example, how about another, simpler progress indicator:

    pfn / max_mapnr

where pfn is the number of the page frame that is currently being
processed. We know max_mapnr from the beginning, so this is possible
within one pass. It's less precise but might be precise enough.
--
Thanks.
HATAYAMA, Daisuke
* RE: [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file.
2015-05-12 8:20 ` HATAYAMA Daisuke
@ 2015-05-13 8:04 ` Atsushi Kumagai
2015-05-14 1:08 ` HATAYAMA Daisuke
0 siblings, 1 reply; 5+ messages in thread
From: Atsushi Kumagai @ 2015-05-13 8:04 UTC (permalink / raw)
To: d.hatayama; +Cc: kexec
>> Cyclic mode has to take a two-pass approach to filtering to save
>> memory; this is a disadvantage of cyclic mode and is basically
>> unavoidable. However, even cyclic mode can avoid two-pass filtering if free
>> memory space is enough to store the whole 1st and 2nd bitmaps, but the current
>> version doesn't do that.
>> The main purpose of this patch set is to avoid that useless extra filtering,
>> but before that, I merged non-cyclic mode into cyclic mode as a code cleanup
>> because the code paths are almost the same. Instead, I introduce another way to
>> guarantee one-pass filtering by using disk space.
>>
>
>How about compromising progress information to some extent? The first
>pass is intended to count up the exact number of dumpable pages just
>to provide precise progress information. Is such precision really
>needed?
The first pass counts up num_dumpable *to calculate the offset of
the starting page data region in advance*; otherwise makedumpfile can't
start writing page data unless it creates a sparse file.
7330 write_kdump_pages_and_bitmap_cyclic(struct cache_data *cd_header, struct cache_data *cd_page)
7331 {
7332         struct page_desc pd_zero;
7333         off_t offset_data = 0;
7334         struct disk_dump_header *dh = info->dump_header;
7335         unsigned char buf[info->page_size];
7336         struct timeval tv_start;
7337
7338         /*
7339          * Reset counter for debug message.
7340          */
7341         pfn_zero = pfn_cache = pfn_cache_private = 0;
7342         pfn_user = pfn_free = pfn_hwpoison = 0;
7343         pfn_memhole = info->max_mapnr;
7344
7345         cd_header->offset
7346                 = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size + dh->bitmap_blocks)
7347                 * dh->block_size;
7348         cd_page->offset = cd_header->offset + sizeof(page_desc_t)*info->num_dumpable;
                                                                             ^^^^^^^^^^^^
7349         offset_data = cd_page->offset;
>For example, how about another simple progress information:
>
> pfn / max_mapnr
>
>where pfn is the number of a page frame that is currently
>processed. We know max_mapnr from the beginning, so this is possible
>within one pass. It's less precise but might be precise enough.
I also think it's enough for progress information, but the 1st pass
is necessary anyway, as shown above.
Thanks
Atsushi Kumagai
* Re: [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file.
2015-05-13 8:04 ` Atsushi Kumagai
@ 2015-05-14 1:08 ` HATAYAMA Daisuke
2015-05-15 4:51 ` Atsushi Kumagai
0 siblings, 1 reply; 5+ messages in thread
From: HATAYAMA Daisuke @ 2015-05-14 1:08 UTC (permalink / raw)
To: ats-kumagai; +Cc: kexec
From: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Subject: RE: [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file.
Date: Wed, 13 May 2015 08:04:27 +0000
>>> Cyclic mode has to take a two-pass approach to filtering to save
>>> memory; this is a disadvantage of cyclic mode and is basically
>>> unavoidable. However, even cyclic mode can avoid two-pass filtering if free
>>> memory space is enough to store the whole 1st and 2nd bitmaps, but the current
>>> version doesn't do that.
>>> The main purpose of this patch set is to avoid that useless extra filtering,
>>> but before that, I merged non-cyclic mode into cyclic mode as a code cleanup
>>> because the code paths are almost the same. Instead, I introduce another way to
>>> guarantee one-pass filtering by using disk space.
>>>
>>
>>How about compromising progress information to some extent? The first
>>pass is intended to count up the exact number of dumpable pages just
>>to provide precise progress information. Is such precision really
>>needed?
>
> The first pass counts up num_dumpable *to calculate the offset of
> the starting page data region in advance*; otherwise makedumpfile can't
> start writing page data unless it creates a sparse file.
>
> 7330 write_kdump_pages_and_bitmap_cyclic(struct cache_data *cd_header, struct cache_data *cd_page)
> 7331 {
> 7332         struct page_desc pd_zero;
> 7333         off_t offset_data = 0;
> 7334         struct disk_dump_header *dh = info->dump_header;
> 7335         unsigned char buf[info->page_size];
> 7336         struct timeval tv_start;
> 7337
> 7338         /*
> 7339          * Reset counter for debug message.
> 7340          */
> 7341         pfn_zero = pfn_cache = pfn_cache_private = 0;
> 7342         pfn_user = pfn_free = pfn_hwpoison = 0;
> 7343         pfn_memhole = info->max_mapnr;
> 7344
> 7345         cd_header->offset
> 7346                 = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size + dh->bitmap_blocks)
> 7347                 * dh->block_size;
> 7348         cd_page->offset = cd_header->offset + sizeof(page_desc_t)*info->num_dumpable;
>                                                                              ^^^^^^^^^^^^
> 7349         offset_data = cd_page->offset;
>
>
I overlooked this, sorry.
The size of a page description header is 24 bytes. This corresponds to 6 GB
per 1 TB. Can this become a big problem? Of course, I think it odd
that the page description table could be larger than the memory data part.
There's another approach: construct the page description table at each
cycle separately over the dump file and connect the pieces by a linked list.
This changes the dump format and needs added crash utility support; it is
not compatible with the current crash utility.
>>For example, how about another, simpler progress indicator:
>>
>> pfn / max_mapnr
>>
>>where pfn is the number of the page frame that is currently being
>>processed. We know max_mapnr from the beginning, so this is possible
>>within one pass. It's less precise but might be precise enough.
>
> I also think it's enough for progress information, but the 1st pass
> is necessary anyway, as shown above.
>
>
> Thanks
> Atsushi Kumagai
--
Thanks.
HATAYAMA, Daisuke
* RE: [PATCH 0/13] makedumpfile: Avoid two pass filtering by using bitmap file.
2015-05-14 1:08 ` HATAYAMA Daisuke
@ 2015-05-15 4:51 ` Atsushi Kumagai
0 siblings, 0 replies; 5+ messages in thread
From: Atsushi Kumagai @ 2015-05-15 4:51 UTC (permalink / raw)
To: d.hatayama; +Cc: kexec
[-- Attachment #1: Type: text/plain, Size: 4253 bytes --]
>>>How about compromising progress information to some extent? The first
>>>pass is intended to count up the exact number of dumpable pages just
>>>to provide precise progress information. Is such precision really
>>>needed?
>>
>> The first pass counts up num_dumpable *to calculate the offset of
>> the starting page data region in advance*; otherwise makedumpfile can't
>> start writing page data unless it creates a sparse file.
>>
>> 7330 write_kdump_pages_and_bitmap_cyclic(struct cache_data *cd_header, struct cache_data *cd_page)
>> 7331 {
>> 7332         struct page_desc pd_zero;
>> 7333         off_t offset_data = 0;
>> 7334         struct disk_dump_header *dh = info->dump_header;
>> 7335         unsigned char buf[info->page_size];
>> 7336         struct timeval tv_start;
>> 7337
>> 7338         /*
>> 7339          * Reset counter for debug message.
>> 7340          */
>> 7341         pfn_zero = pfn_cache = pfn_cache_private = 0;
>> 7342         pfn_user = pfn_free = pfn_hwpoison = 0;
>> 7343         pfn_memhole = info->max_mapnr;
>> 7344
>> 7345         cd_header->offset
>> 7346                 = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size + dh->bitmap_blocks)
>> 7347                 * dh->block_size;
>> 7348         cd_page->offset = cd_header->offset + sizeof(page_desc_t)*info->num_dumpable;
>>                                                                              ^^^^^^^^^^^^
>> 7349         offset_data = cd_page->offset;
>>
>>
>
>I overlooked this, sorry.
>
>The size of a page description header is 24 bytes. This corresponds to 6 GB
>per 1 TB. Can this become a big problem? Of course, I think it odd
>that the page description table could be larger than the memory data part.
At least, it looks like the member "page_flags" can be removed, since
makedumpfile always just sets it to 0 and crash doesn't refer to it.
typedef struct page_desc {
        off_t offset;                   /* the offset of the page data */
        unsigned int size;              /* the size of this dump page  */
        unsigned int flags;             /* flags */
        unsigned long long page_flags;  /* page flags */ <--- always 0; these 8 bytes are useless.
} page_desc_t;
(Sorry for getting off track here.)
Further, I have another idea that would reduce the total size of the
page descriptor table: assign one page descriptor to a number of pages,
so that multiple pages are managed as a single data block.
The original purpose of the idea is to improve compression performance
by compressing several pages in a lump.
We know that compression with zlib is too slow. I suspect that one of
the causes is the buffer size passed to compress2().
When compressing a 100MB file, I expect that compressing a 10MB block 10
times will be faster than compressing a 1MB block 100 times.
Actually, I did a simple verification with the attached program:
# ./zlib_compress 1024 testdata
TOTAL COMPRESSION TIME: 18.478064
# ./zlib_compress 10240 testdata
TOTAL COMPRESSION TIME: 5.940524
# ./zlib_compress 102400 testdata
TOTAL COMPRESSION TIME: 2.088867
#
Unfortunately I haven't had a chance to work on it for a long time,
but I think it would be better to consider it together if we design
a new dump format.
>There's another approach: construct the page description table at each
>cycle separately over the dump file and connect the pieces by a linked list.
>
>This changes the dump format and needs added crash utility support; it is
>not compatible with the current crash utility.
It's interesting. I think we should improve the format if there is a
good reason; the format shouldn't be an obstacle.
Of course, the new format should be optional at first, but it would be
great to have a choice that gives better performance.
Thanks
Atsushi Kumagai
>>>For example, how about another, simpler progress indicator:
>>>
>>> pfn / max_mapnr
>>>
>>>where pfn is the number of the page frame that is currently being
>>>processed. We know max_mapnr from the beginning, so this is possible
>>>within one pass. It's less precise but might be precise enough.
>>
>> I also think it's enough for progress information, but the 1st pass
>> is necessary anyway, as shown above.
>>
>>
>> Thanks
>> Atsushi Kumagai
>--
>Thanks.
>HATAYAMA, Daisuke
[-- Attachment #2: zlib_compress.c --]
[-- Type: text/plain, Size: 1243 bytes --]
#include <zlib.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <sys/time.h>

static inline double getdtime(void)
{
	struct timeval tv;
	gettimeofday(&tv, NULL);
	return (double)tv.tv_sec + (double)tv.tv_usec * 0.001 * 0.001;
}

int main(int argc, char *argv[])
{
	char *buf_in, *buf_out;
	unsigned long size_in, size_out, bound;
	int fd;
	double d_start, d_end, d_compress;

	if (argc != 3) {
		printf("./command <size_in> <input file>\n");
		exit(1);
	}

	d_compress = 0;

	/* buffer for input data */
	size_in = atoi(argv[1]);
	buf_in = (char *)malloc(size_in);
	if (!buf_in) {
		printf("malloc failed. %lu bytes\n", size_in);
		exit(1);
	}

	/* buffer for output data */
	bound = compressBound(size_in);
	buf_out = (char *)malloc(bound);
	if (!buf_out) {
		printf("malloc failed. %lu bytes\n", bound);
		exit(1);
	}

	if ((fd = open(argv[2], O_RDONLY)) < 0) {
		printf("file open failed. %s\n", argv[2]);
		exit(1);
	}

	while (read(fd, buf_in, size_in) > 0) {
		/* compress2() updates size_out on each call, so reset it. */
		size_out = bound;
		d_start = getdtime();
		compress2((Bytef *)buf_out, &size_out,
			  (Bytef *)buf_in, size_in, Z_BEST_SPEED);
		d_end = getdtime();
		d_compress += d_end - d_start;
	}
	printf("TOTAL COMPRESSION TIME: %lf\n", d_compress);
	return 0;
}
[-- Attachment #3: Type: text/plain, Size: 143 bytes --]