* hugepage compaction causes performance drop
From: Aaron Lu @ 2015-11-19 9:29 UTC (permalink / raw)
To: linux-mm; +Cc: Huang Ying, Dave Hansen, Tim Chen, lkp
[-- Attachment #1: Type: text/plain, Size: 1635 bytes --]
Hi,
One VM-related test case run by LKP on a Haswell EP with 128GiB memory
showed that the compaction code causes a performance drop of about 30%.
To illustrate the problem, I've simplified the test with a program
called usemem (see attached). The test goes like this:
1 Boot up the server;
2 modprobe scsi_debug (a module that can expose memory as a SCSI
device) with dev_size set to 4/5 of free memory, i.e. about 100GiB,
and use it as swap;
3 run the usemem test, which uses mmap to map a MAP_PRIVATE | MAP_ANON
region sized to 3/4 of (remaining_free_memory + swap), then writes to
that region sequentially to trigger page faults and swap-out.
The above test is run with two configurations of the following two
sysfs files:
/sys/kernel/mm/transparent_hugepage/enabled
/sys/kernel/mm/transparent_hugepage/defrag
1 both enabled and defrag set to always; let's call it the
always-always case;
2 enabled set to always while defrag is set to never; let's call it
the always-never case.
The output from the always-always case is:
Setting up swapspace version 1, size = 104627196 KiB
no label, UUID=aafa53ae-af9e-46c9-acb9-8b3d4f57f610
cmdline: /lkp/aaron/src/bin/usemem 99994672128
99994672128 transferred in 95 seconds, throughput: 1003 MB/s
And the output from the always-never case is:
Setting up swapspace version 1, size = 104629244 KiB
no label, UUID=60563c82-d1c6-4d86-b9fa-b52f208097e9
cmdline: /lkp/aaron/src/bin/usemem 99995965440
99995965440 transferred in 67 seconds, throughput: 1423 MB/s
The vmstat and perf-profile outputs are also attached; please let me
know if you need any more information, thanks.
[-- Attachment #2: swap_test.tar.xz --]
[-- Type: application/x-xz, Size: 297576 bytes --]
* Re: hugepage compaction causes performance drop
From: Vlastimil Babka @ 2015-11-19 13:29 UTC (permalink / raw)
To: Aaron Lu, linux-mm
Cc: Huang Ying, Dave Hansen, Tim Chen, lkp, Andrea Arcangeli,
David Rientjes, Joonsoo Kim
+CC Andrea, David, Joonsoo
On 11/19/2015 10:29 AM, Aaron Lu wrote:
> Hi,
>
> One vm related test case run by LKP on a Haswell EP with 128GiB memory
> showed that compaction code would cause performance drop about 30%. To
> illustrate the problem, I've simplified the test with a program called
> usemem(see attached). The test goes like this:
> 1 Boot up the server;
> 2 modprobe scsi_debug(a module that could use memory as SCSI device),
> dev_size set to 4/5 free memory, i.e. about 100GiB. Use it as swap.
> 3 run the usemem test, which use mmap to map a MAP_PRIVATE | MAP_ANON
> region with the size set to 3/4 of (remaining_free_memory + swap), and
> then write to that region sequentially to trigger page fault and swap
> out.
>
> The above test runs with two configs regarding the below two sysfs files:
> /sys/kernel/mm/transparent_hugepage/enabled
> /sys/kernel/mm/transparent_hugepage/defrag
> 1 transparent hugepage and defrag are both set to always, let's call it
> always-always case;
> 2 transparent hugepage is set to always while defrag is set to never,
> let's call it always-never case.
>
> The output from the always-always case is:
> Setting up swapspace version 1, size = 104627196 KiB
> no label, UUID=aafa53ae-af9e-46c9-acb9-8b3d4f57f610
> cmdline: /lkp/aaron/src/bin/usemem 99994672128
> 99994672128 transferred in 95 seconds, throughput: 1003 MB/s
>
> And the output from the always-never case is:
> Setting up swapspace version 1, size = 104629244 KiB
> no label, UUID=60563c82-d1c6-4d86-b9fa-b52f208097e9
> cmdline: /lkp/aaron/src/bin/usemem 99995965440
> 99995965440 transferred in 67 seconds, throughput: 1423 MB/s
So yeah, this is an example of a workload that gets no benefit from
THPs but pays all the cost. Fixing that is non-trivial, and I admit I
haven't pushed my prior efforts there much lately...
But it's also possible there are still actual compaction bugs making
the issue worse.
> The vmstat and perf-profile are also attached, please let me know if you
> need any more information, thanks.
Output from vmstat (the tool) isn't very useful here; a periodic "cat
/proc/vmstat" would be much better.
The perf profiles are somewhat weirdly sorted by children cost (?), but
I noticed a very high cost (46%) in pageblock_pfn_to_page(). This could
be due to a very large but sparsely populated zone. Could you provide
/proc/zoneinfo?
If the compaction scanners behave strangely due to a bug, enabling the
ftrace compaction tracepoints should help find the cause. That might
produce a very large output, but maybe it would be enough to see some
parts of it (i.e. towards beginning, middle, end of the experiment).
Vlastimil
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org
* Re: hugepage compaction causes performance drop
From: Aaron Lu @ 2015-11-20 8:55 UTC (permalink / raw)
To: Vlastimil Babka, linux-mm
Cc: Huang Ying, Dave Hansen, Tim Chen, lkp, Andrea Arcangeli,
David Rientjes, Joonsoo Kim
On 11/19/2015 09:29 PM, Vlastimil Babka wrote:
> +CC Andrea, David, Joonsoo
>
> On 11/19/2015 10:29 AM, Aaron Lu wrote:
>> The vmstat and perf-profile are also attached, please let me know if you
>> need any more information, thanks.
>
> Output from vmstat (the tool) isn't much useful here, a periodic "cat
> /proc/vmstat" would be much better.
No problem.
> The perf profiles are somewhat weirdly sorted by children cost (?), but
> I noticed a very high cost (46%) in pageblock_pfn_to_page(). This could
> be due to a very large but sparsely populated zone. Could you provide
> /proc/zoneinfo?
Is a one-time /proc/zoneinfo enough, or do you also want a periodic one?
> If the compaction scanners behave strangely due to a bug, enabling the
> ftrace compaction tracepoints should help find the cause. That might
> produce a very large output, but maybe it would be enough to see some
> parts of it (i.e. towards beginning, middle, end of the experiment).
I'll see how to do this; I've never used ftrace before.
Thanks for the quick response.
Regards,
Aaron
* Re: hugepage compaction causes performance drop
From: Aaron Lu @ 2015-11-20 9:33 UTC (permalink / raw)
To: Vlastimil Babka, linux-mm
Cc: Huang Ying, Dave Hansen, Tim Chen, lkp, Andrea Arcangeli,
David Rientjes, Joonsoo Kim
[-- Attachment #1: Type: text/plain, Size: 835 bytes --]
On 11/20/2015 04:55 PM, Aaron Lu wrote:
> On 11/19/2015 09:29 PM, Vlastimil Babka wrote:
>> +CC Andrea, David, Joonsoo
>>
>> On 11/19/2015 10:29 AM, Aaron Lu wrote:
>>> The vmstat and perf-profile are also attached, please let me know if you
>>> need any more information, thanks.
>>
>> Output from vmstat (the tool) isn't much useful here, a periodic "cat
>> /proc/vmstat" would be much better.
>
> No problem.
>
>> The perf profiles are somewhat weirdly sorted by children cost (?), but
>> I noticed a very high cost (46%) in pageblock_pfn_to_page(). This could
>> be due to a very large but sparsely populated zone. Could you provide
>> /proc/zoneinfo?
>
> Is a one time /proc/zoneinfo enough or also a periodic one?
Please see attached; note that this is a new run, so the perf profile
is a little different.
Thanks,
Aaron
[-- Attachment #2: zoneinfo --]
[-- Type: text/plain, Size: 36523 bytes --]
/proc/zoneinfo
Node 0, zone DMA
pages free 3950
min 2
low 2
high 3
scanned 0
spanned 4095
present 3994
managed 3973
nr_free_pages 3950
nr_alloc_batch 1
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 21
nr_active_file 1
nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 22
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 1
nr_page_table_pages 0
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 0
nr_written 0
nr_pages_scanned 0
numa_hit 23
numa_miss 0
numa_foreign 0
numa_interleave 0
numa_local 1
numa_other 22
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 1873, 64327, 64327)
pagesets
cpu: 0
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 2
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 3
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 4
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 5
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 6
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 7
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 8
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 9
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 10
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 11
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 12
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 13
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 14
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 15
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 16
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 17
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 18
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 19
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 20
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 21
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 22
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 23
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 24
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 25
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 26
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 27
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 28
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 29
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 30
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 31
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 32
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 33
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 34
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 35
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 36
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 37
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 38
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 39
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 40
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 41
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 42
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 43
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 44
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 45
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 46
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 47
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 48
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 49
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 50
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 51
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 52
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 53
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 54
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 55
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 56
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 57
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 58
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 59
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 60
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 61
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 62
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 63
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 64
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 65
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 66
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 67
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 68
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 69
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 70
count: 0
high: 0
batch: 1
vm stats threshold: 14
cpu: 71
count: 0
high: 0
batch: 1
vm stats threshold: 14
all_unreclaimable: 0
start_pfn: 1
inactive_ratio: 1
Node 0, zone DMA32
pages free 62829
min 327
low 408
high 490
scanned 0
spanned 1044480
present 495951
managed 479559
nr_free_pages 62829
nr_alloc_batch 3
nr_inactive_anon 12
nr_active_anon 50
nr_inactive_file 1440
nr_active_file 316
nr_unevictable 0
nr_mlock 0
nr_anon_pages 40
nr_mapped 39
nr_file_pages 1778
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 238
nr_slab_unreclaimable 246
nr_page_table_pages 15
nr_kernel_stack 9
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 22
nr_dirtied 0
nr_written 0
nr_pages_scanned 0
numa_hit 416524
numa_miss 0
numa_foreign 0
numa_interleave 0
numa_local 414721
numa_other 1803
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 62453, 62453)
pagesets
cpu: 0
count: 33
high: 186
batch: 31
vm stats threshold: 70
cpu: 1
count: 104
high: 186
batch: 31
vm stats threshold: 70
cpu: 2
count: 79
high: 186
batch: 31
vm stats threshold: 70
cpu: 3
count: 109
high: 186
batch: 31
vm stats threshold: 70
cpu: 4
count: 53
high: 186
batch: 31
vm stats threshold: 70
cpu: 5
count: 43
high: 186
batch: 31
vm stats threshold: 70
cpu: 6
count: 126
high: 186
batch: 31
vm stats threshold: 70
cpu: 7
count: 38
high: 186
batch: 31
vm stats threshold: 70
cpu: 8
count: 63
high: 186
batch: 31
vm stats threshold: 70
cpu: 9
count: 63
high: 186
batch: 31
vm stats threshold: 70
cpu: 10
count: 144
high: 186
batch: 31
vm stats threshold: 70
cpu: 11
count: 59
high: 186
batch: 31
vm stats threshold: 70
cpu: 12
count: 43
high: 186
batch: 31
vm stats threshold: 70
cpu: 13
count: 52
high: 186
batch: 31
vm stats threshold: 70
cpu: 14
count: 111
high: 186
batch: 31
vm stats threshold: 70
cpu: 15
count: 112
high: 186
batch: 31
vm stats threshold: 70
cpu: 16
count: 118
high: 186
batch: 31
vm stats threshold: 70
cpu: 17
count: 41
high: 186
batch: 31
vm stats threshold: 70
cpu: 18
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 19
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 20
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 21
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 22
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 23
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 24
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 25
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 26
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 27
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 28
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 29
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 30
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 31
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 32
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 33
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 34
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 35
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 36
count: 44
high: 186
batch: 31
vm stats threshold: 70
cpu: 37
count: 53
high: 186
batch: 31
vm stats threshold: 70
cpu: 38
count: 109
high: 186
batch: 31
vm stats threshold: 70
cpu: 39
count: 40
high: 186
batch: 31
vm stats threshold: 70
cpu: 40
count: 85
high: 186
batch: 31
vm stats threshold: 70
cpu: 41
count: 30
high: 186
batch: 31
vm stats threshold: 70
cpu: 42
count: 48
high: 186
batch: 31
vm stats threshold: 70
cpu: 43
count: 59
high: 186
batch: 31
vm stats threshold: 70
cpu: 44
count: 96
high: 186
batch: 31
vm stats threshold: 70
cpu: 45
count: 55
high: 186
batch: 31
vm stats threshold: 70
cpu: 46
count: 93
high: 186
batch: 31
vm stats threshold: 70
cpu: 47
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 48
count: 75
high: 186
batch: 31
vm stats threshold: 70
cpu: 49
count: 63
high: 186
batch: 31
vm stats threshold: 70
cpu: 50
count: 87
high: 186
batch: 31
vm stats threshold: 70
cpu: 51
count: 124
high: 186
batch: 31
vm stats threshold: 70
cpu: 52
count: 68
high: 186
batch: 31
vm stats threshold: 70
cpu: 53
count: 57
high: 186
batch: 31
vm stats threshold: 70
cpu: 54
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 55
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 56
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 57
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 58
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 59
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 60
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 61
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 62
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 63
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 64
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 65
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 66
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 67
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 68
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 69
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 70
count: 0
high: 186
batch: 31
vm stats threshold: 70
cpu: 71
count: 0
high: 186
batch: 31
vm stats threshold: 70
all_unreclaimable: 0
start_pfn: 4096
inactive_ratio: 3
Node 0, zone Normal
pages free 13732
min 10921
low 13651
high 16381
scanned 0
spanned 16252928
present 16252928
managed 15988216
nr_free_pages 13732
nr_alloc_batch 1009
nr_inactive_anon 80
nr_active_anon 630
nr_inactive_file 44444
nr_active_file 11926
nr_unevictable 0
nr_mlock 0
nr_anon_pages 633
nr_mapped 1613
nr_file_pages 56462
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 6196
nr_slab_unreclaimable 12143
nr_page_table_pages 104
nr_kernel_stack 590
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 92
nr_dirtied 29
nr_written 29
nr_pages_scanned 0
numa_hit 16004783
numa_miss 0
numa_foreign 10066439
numa_interleave 87113
numa_local 15944244
numa_other 60539
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 77
high: 186
batch: 31
vm stats threshold: 125
cpu: 1
count: 25
high: 186
batch: 31
vm stats threshold: 125
cpu: 2
count: 155
high: 186
batch: 31
vm stats threshold: 125
cpu: 3
count: 56
high: 186
batch: 31
vm stats threshold: 125
cpu: 4
count: 176
high: 186
batch: 31
vm stats threshold: 125
cpu: 5
count: 157
high: 186
batch: 31
vm stats threshold: 125
cpu: 6
count: 109
high: 186
batch: 31
vm stats threshold: 125
cpu: 7
count: 156
high: 186
batch: 31
vm stats threshold: 125
cpu: 8
count: 66
high: 186
batch: 31
vm stats threshold: 125
cpu: 9
count: 164
high: 186
batch: 31
vm stats threshold: 125
cpu: 10
count: 167
high: 186
batch: 31
vm stats threshold: 125
cpu: 11
count: 161
high: 186
batch: 31
vm stats threshold: 125
cpu: 12
count: 76
high: 186
batch: 31
vm stats threshold: 125
cpu: 13
count: 73
high: 186
batch: 31
vm stats threshold: 125
cpu: 14
count: 184
high: 186
batch: 31
vm stats threshold: 125
cpu: 15
count: 170
high: 186
batch: 31
vm stats threshold: 125
cpu: 16
count: 81
high: 186
batch: 31
vm stats threshold: 125
cpu: 17
count: 181
high: 186
batch: 31
vm stats threshold: 125
cpu: 18
count: 136
high: 186
batch: 31
vm stats threshold: 125
cpu: 19
count: 105
high: 186
batch: 31
vm stats threshold: 125
cpu: 20
count: 98
high: 186
batch: 31
vm stats threshold: 125
cpu: 21
count: 134
high: 186
batch: 31
vm stats threshold: 125
cpu: 22
count: 163
high: 186
batch: 31
vm stats threshold: 125
cpu: 23
count: 46
high: 186
batch: 31
vm stats threshold: 125
cpu: 24
count: 181
high: 186
batch: 31
vm stats threshold: 125
cpu: 25
count: 138
high: 186
batch: 31
vm stats threshold: 125
cpu: 26
count: 127
high: 186
batch: 31
vm stats threshold: 125
cpu: 27
count: 104
high: 186
batch: 31
vm stats threshold: 125
cpu: 28
count: 54
high: 186
batch: 31
vm stats threshold: 125
cpu: 29
count: 105
high: 186
batch: 31
vm stats threshold: 125
cpu: 30
count: 95
high: 186
batch: 31
vm stats threshold: 125
cpu: 31
count: 150
high: 186
batch: 31
vm stats threshold: 125
cpu: 32
count: 166
high: 186
batch: 31
vm stats threshold: 125
cpu: 33
count: 137
high: 186
batch: 31
vm stats threshold: 125
cpu: 34
count: 152
high: 186
batch: 31
vm stats threshold: 125
cpu: 35
count: 108
high: 186
batch: 31
vm stats threshold: 125
cpu: 36
count: 176
high: 186
batch: 31
vm stats threshold: 125
cpu: 37
count: 163
high: 186
batch: 31
vm stats threshold: 125
cpu: 38
count: 124
high: 186
batch: 31
vm stats threshold: 125
cpu: 39
count: 132
high: 186
batch: 31
vm stats threshold: 125
cpu: 40
count: 108
high: 186
batch: 31
vm stats threshold: 125
cpu: 41
count: 91
high: 186
batch: 31
vm stats threshold: 125
cpu: 42
count: 172
high: 186
batch: 31
vm stats threshold: 125
cpu: 43
count: 165
high: 186
batch: 31
vm stats threshold: 125
cpu: 44
count: 182
high: 186
batch: 31
vm stats threshold: 125
cpu: 45
count: 163
high: 186
batch: 31
vm stats threshold: 125
cpu: 46
count: 122
high: 186
batch: 31
vm stats threshold: 125
cpu: 47
count: 127
high: 186
batch: 31
vm stats threshold: 125
cpu: 48
count: 151
high: 186
batch: 31
vm stats threshold: 125
cpu: 49
count: 170
high: 186
batch: 31
vm stats threshold: 125
cpu: 50
count: 145
high: 186
batch: 31
vm stats threshold: 125
cpu: 51
count: 138
high: 186
batch: 31
vm stats threshold: 125
cpu: 52
count: 176
high: 186
batch: 31
vm stats threshold: 125
cpu: 53
count: 183
high: 186
batch: 31
vm stats threshold: 125
cpu: 54
count: 112
high: 186
batch: 31
vm stats threshold: 125
cpu: 55
count: 144
high: 186
batch: 31
vm stats threshold: 125
cpu: 56
count: 49
high: 186
batch: 31
vm stats threshold: 125
cpu: 57
count: 57
high: 186
batch: 31
vm stats threshold: 125
cpu: 58
count: 110
high: 186
batch: 31
vm stats threshold: 125
cpu: 59
count: 124
high: 186
batch: 31
vm stats threshold: 125
cpu: 60
count: 0
high: 186
batch: 31
vm stats threshold: 125
cpu: 61
count: 184
high: 186
batch: 31
vm stats threshold: 125
cpu: 62
count: 126
high: 186
batch: 31
vm stats threshold: 125
cpu: 63
count: 75
high: 186
batch: 31
vm stats threshold: 125
cpu: 64
count: 108
high: 186
batch: 31
vm stats threshold: 125
cpu: 65
count: 10
high: 186
batch: 31
vm stats threshold: 125
cpu: 66
count: 152
high: 186
batch: 31
vm stats threshold: 125
cpu: 67
count: 94
high: 186
batch: 31
vm stats threshold: 125
cpu: 68
count: 9
high: 186
batch: 31
vm stats threshold: 125
cpu: 69
count: 66
high: 186
batch: 31
vm stats threshold: 125
cpu: 70
count: 60
high: 186
batch: 31
vm stats threshold: 125
cpu: 71
count: 73
high: 186
batch: 31
vm stats threshold: 125
all_unreclaimable: 0
start_pfn: 1048576
inactive_ratio: 24
Node 1, zone Normal
pages free 6322599
min 11276
low 14095
high 16914
scanned 0
spanned 16777216
present 16777216
managed 16507772
nr_free_pages 6322599
nr_alloc_batch 2797
nr_inactive_anon 2202
nr_active_anon 5117
nr_inactive_file 46700
nr_active_file 11418
nr_unevictable 0
nr_mlock 0
nr_anon_pages 4967
nr_mapped 3009
nr_file_pages 60363
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 5328
nr_slab_unreclaimable 14512
nr_page_table_pages 458
nr_kernel_stack 274
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 2245
nr_dirtied 44
nr_written 44
nr_pages_scanned 0
numa_hit 219272
numa_miss 10066439
numa_foreign 0
numa_interleave 88755
numa_local 195291
numa_other 10090608
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_transparent_hugepages 2
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 158
high: 186
batch: 31
vm stats threshold: 125
cpu: 1
count: 140
high: 186
batch: 31
vm stats threshold: 125
cpu: 2
count: 73
high: 186
batch: 31
vm stats threshold: 125
cpu: 3
count: 153
high: 186
batch: 31
vm stats threshold: 125
cpu: 4
count: 179
high: 186
batch: 31
vm stats threshold: 125
cpu: 5
count: 70
high: 186
batch: 31
vm stats threshold: 125
cpu: 6
count: 143
high: 186
batch: 31
vm stats threshold: 125
cpu: 7
count: 93
high: 186
batch: 31
vm stats threshold: 125
cpu: 8
count: 68
high: 186
batch: 31
vm stats threshold: 125
cpu: 9
count: 84
high: 186
batch: 31
vm stats threshold: 125
cpu: 10
count: 153
high: 186
batch: 31
vm stats threshold: 125
cpu: 11
count: 89
high: 186
batch: 31
vm stats threshold: 125
cpu: 12
count: 164
high: 186
batch: 31
vm stats threshold: 125
cpu: 13
count: 88
high: 186
batch: 31
vm stats threshold: 125
cpu: 14
count: 177
high: 186
batch: 31
vm stats threshold: 125
cpu: 15
count: 66
high: 186
batch: 31
vm stats threshold: 125
cpu: 16
count: 51
high: 186
batch: 31
vm stats threshold: 125
cpu: 17
count: 141
high: 186
batch: 31
vm stats threshold: 125
cpu: 18
count: 55
high: 186
batch: 31
vm stats threshold: 125
cpu: 19
count: 132
high: 186
batch: 31
vm stats threshold: 125
cpu: 20
count: 170
high: 186
batch: 31
vm stats threshold: 125
cpu: 21
count: 145
high: 186
batch: 31
vm stats threshold: 125
cpu: 22
count: 163
high: 186
batch: 31
vm stats threshold: 125
cpu: 23
count: 100
high: 186
batch: 31
vm stats threshold: 125
cpu: 24
count: 17
high: 186
batch: 31
vm stats threshold: 125
cpu: 25
count: 87
high: 186
batch: 31
vm stats threshold: 125
cpu: 26
count: 152
high: 186
batch: 31
vm stats threshold: 125
cpu: 27
count: 50
high: 186
batch: 31
vm stats threshold: 125
cpu: 28
count: 165
high: 186
batch: 31
vm stats threshold: 125
cpu: 29
count: 145
high: 186
batch: 31
vm stats threshold: 125
cpu: 30
count: 114
high: 186
batch: 31
vm stats threshold: 125
cpu: 31
count: 26
high: 186
batch: 31
vm stats threshold: 125
cpu: 32
count: 168
high: 186
batch: 31
vm stats threshold: 125
cpu: 33
count: 46
high: 186
batch: 31
vm stats threshold: 125
cpu: 34
count: 171
high: 186
batch: 31
vm stats threshold: 125
cpu: 35
count: 144
high: 186
batch: 31
vm stats threshold: 125
cpu: 36
count: 79
high: 186
batch: 31
vm stats threshold: 125
cpu: 37
count: 130
high: 186
batch: 31
vm stats threshold: 125
cpu: 38
count: 40
high: 186
batch: 31
vm stats threshold: 125
cpu: 39
count: 58
high: 186
batch: 31
vm stats threshold: 125
cpu: 40
count: 166
high: 186
batch: 31
vm stats threshold: 125
cpu: 41
count: 185
high: 186
batch: 31
vm stats threshold: 125
cpu: 42
count: 150
high: 186
batch: 31
vm stats threshold: 125
cpu: 43
count: 110
high: 186
batch: 31
vm stats threshold: 125
cpu: 44
count: 56
high: 186
batch: 31
vm stats threshold: 125
cpu: 45
count: 83
high: 186
batch: 31
vm stats threshold: 125
cpu: 46
count: 165
high: 186
batch: 31
vm stats threshold: 125
cpu: 47
count: 136
high: 186
batch: 31
vm stats threshold: 125
cpu: 48
count: 93
high: 186
batch: 31
vm stats threshold: 125
cpu: 49
count: 101
high: 186
batch: 31
vm stats threshold: 125
cpu: 50
count: 165
high: 186
batch: 31
vm stats threshold: 125
cpu: 51
count: 84
high: 186
batch: 31
vm stats threshold: 125
cpu: 52
count: 164
high: 186
batch: 31
vm stats threshold: 125
cpu: 53
count: 181
high: 186
batch: 31
vm stats threshold: 125
cpu: 54
count: 148
high: 186
batch: 31
vm stats threshold: 125
cpu: 55
count: 181
high: 186
batch: 31
vm stats threshold: 125
cpu: 56
count: 145
high: 186
batch: 31
vm stats threshold: 125
cpu: 57
count: 159
high: 186
batch: 31
vm stats threshold: 125
cpu: 58
count: 163
high: 186
batch: 31
vm stats threshold: 125
cpu: 59
count: 75
high: 186
batch: 31
vm stats threshold: 125
cpu: 60
count: 68
high: 186
batch: 31
vm stats threshold: 125
cpu: 61
count: 127
high: 186
batch: 31
vm stats threshold: 125
cpu: 62
count: 106
high: 186
batch: 31
vm stats threshold: 125
cpu: 63
count: 170
high: 186
batch: 31
vm stats threshold: 125
cpu: 64
count: 161
high: 186
batch: 31
vm stats threshold: 125
cpu: 65
count: 176
high: 186
batch: 31
vm stats threshold: 125
cpu: 66
count: 136
high: 186
batch: 31
vm stats threshold: 125
cpu: 67
count: 162
high: 186
batch: 31
vm stats threshold: 125
cpu: 68
count: 151
high: 186
batch: 31
vm stats threshold: 125
cpu: 69
count: 150
high: 186
batch: 31
vm stats threshold: 125
cpu: 70
count: 116
high: 186
batch: 31
vm stats threshold: 125
cpu: 71
count: 137
high: 186
batch: 31
vm stats threshold: 125
all_unreclaimable: 0
start_pfn: 17301504
inactive_ratio: 24
[-- Attachment #3: proc-vmstat.gz --]
[-- Type: application/gzip, Size: 22205 bytes --]
[-- Attachment #4: perf-profile.xz --]
[-- Type: application/x-xz, Size: 116760 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: hugepage compaction causes performance drop
2015-11-20 9:33 ` Aaron Lu
@ 2015-11-20 10:06 ` Vlastimil Babka
-1 siblings, 0 replies; 30+ messages in thread
From: Vlastimil Babka @ 2015-11-20 10:06 UTC (permalink / raw)
To: Aaron Lu, linux-mm
Cc: Huang Ying, Dave Hansen, Tim Chen, lkp, Andrea Arcangeli,
David Rientjes, Joonsoo Kim
On 11/20/2015 10:33 AM, Aaron Lu wrote:
> On 11/20/2015 04:55 PM, Aaron Lu wrote:
>> On 11/19/2015 09:29 PM, Vlastimil Babka wrote:
>>> +CC Andrea, David, Joonsoo
>>>
>>> On 11/19/2015 10:29 AM, Aaron Lu wrote:
>>>> The vmstat and perf-profile are also attached, please let me know if you
>>>> need any more information, thanks.
>>>
>>> Output from vmstat (the tool) isn't much useful here, a periodic "cat
>>> /proc/vmstat" would be much better.
>>
>> No problem.
>>
>>> The perf profiles are somewhat weirdly sorted by children cost (?), but
>>> I noticed a very high cost (46%) in pageblock_pfn_to_page(). This could
>>> be due to a very large but sparsely populated zone. Could you provide
>>> /proc/zoneinfo?
>>
>> Is a one time /proc/zoneinfo enough or also a periodic one?
>
> Please see attached, note that this is a new run so the perf profile is
> a little different.
>
> Thanks,
> Aaron
Thanks.
DMA32 is a bit sparse:
Node 0, zone DMA32
pages free 62829
min 327
low 408
high 490
scanned 0
spanned 1044480
present 495951
managed 479559
Since the other zones are much larger, this is probably not the culprit,
but tracepoints should tell us more. My theory is that the free scanner's
cached pfn isn't updated when the scanner aborts due to need_resched()
during isolate_freepages() before hitting a valid pageblock, in a zone
that has a large hole in it. But zoneinfo doesn't tell us whether the
large difference between "spanned" and "present"/"managed" comes from one
large hole or from many smaller ones...
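For what it's worth, a quick check on the quoted DMA32 figures shows just
under half the span is actually populated (arithmetic on the numbers
above only):

```python
# Fraction of the DMA32 zone's PFN span actually backed by memory,
# using the spanned/present values quoted from /proc/zoneinfo above.
spanned = 1044480
present = 495951
populated = present / spanned
print(f"DMA32 populated: {populated:.1%}")  # ~47.5% of the span
```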
compact_migrate_scanned 1982396
compact_free_scanned 40576943
compact_isolated 2096602
compact_stall 9070
compact_fail 6025
compact_success 3045
So the free scanner is struggling to find free pages, no surprise there.
I'm working on a series that should hopefully help here, and Joonsoo is
as well.
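As a rough sanity check on those counters (pure arithmetic on the values
quoted above), the free scanner covered about 20x more pages than the
migration scanner, and only about a third of the compaction stalls
succeeded:

```python
# Ratios derived from the compact_* counters quoted above.
compact_migrate_scanned = 1982396
compact_free_scanned = 40576943
compact_stall = 9070
compact_success = 3045

free_to_migrate = compact_free_scanned / compact_migrate_scanned
success_rate = compact_success / compact_stall
print(f"free/migrate scan ratio: {free_to_migrate:.1f}")  # ~20.5
print(f"stall success rate: {success_rate:.1%}")          # ~33.6%
```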
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: hugepage compaction causes performance drop
2015-11-20 10:06 ` Vlastimil Babka
@ 2015-11-23 8:16 ` Joonsoo Kim
-1 siblings, 0 replies; 30+ messages in thread
From: Joonsoo Kim @ 2015-11-23 8:16 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Aaron Lu, linux-mm, Huang Ying, Dave Hansen, Tim Chen, lkp,
Andrea Arcangeli, David Rientjes
On Fri, Nov 20, 2015 at 11:06:46AM +0100, Vlastimil Babka wrote:
> On 11/20/2015 10:33 AM, Aaron Lu wrote:
> >On 11/20/2015 04:55 PM, Aaron Lu wrote:
> >>On 11/19/2015 09:29 PM, Vlastimil Babka wrote:
> >>>+CC Andrea, David, Joonsoo
> >>>
> >>>On 11/19/2015 10:29 AM, Aaron Lu wrote:
> >>>>The vmstat and perf-profile are also attached, please let me know if you
> >>>>need any more information, thanks.
> >>>
> >>>Output from vmstat (the tool) isn't much useful here, a periodic "cat
> >>>/proc/vmstat" would be much better.
> >>
> >>No problem.
> >>
> >>>The perf profiles are somewhat weirdly sorted by children cost (?), but
> >>>I noticed a very high cost (46%) in pageblock_pfn_to_page(). This could
> >>>be due to a very large but sparsely populated zone. Could you provide
> >>>/proc/zoneinfo?
> >>
> >>Is a one time /proc/zoneinfo enough or also a periodic one?
> >
> >Please see attached, note that this is a new run so the perf profile is
> >a little different.
> >
> >Thanks,
> >Aaron
>
> Thanks.
>
> DMA32 is a bit sparse:
>
> Node 0, zone DMA32
> pages free 62829
> min 327
> low 408
> high 490
> scanned 0
> spanned 1044480
> present 495951
> managed 479559
>
> Since the other zones are much larger, probably this is not the
> culprit. But tracepoints should tell us more. I have a theory that
> updating free scanner's cached pfn doesn't happen if it aborts due
> to need_resched() during isolate_freepages(), before hitting a valid
> pageblock, if the zone has a large hole in it. But zoneinfo doesn't
> tell us if the large difference between "spanned" and
> "present"/"managed" is due to a large hole, or many smaller holes...
>
> compact_migrate_scanned 1982396
> compact_free_scanned 40576943
> compact_isolated 2096602
> compact_stall 9070
> compact_fail 6025
> compact_success 3045
>
> So it's struggling to find free pages, no wonder about that. I'm
Numbers look fine to me. I guess this performance degradation is
caused by the COMPACT_CLUSTER_MAX change (from 32 to 256). THP allocation
is async, so it should be aborted quickly. But after isolating 256
migratable pages, it can't be aborted and will finish migrating all 256
pages (at least in the current implementation).
Aaron, could you test again with COMPACT_CLUSTER_MAX set to 32
(in swap.h)?
And please attach the always-always vmstat numbers, too.
Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: hugepage compaction causes performance drop
2015-11-23 8:16 ` Joonsoo Kim
@ 2015-11-23 8:33 ` Aaron Lu
-1 siblings, 0 replies; 30+ messages in thread
From: Aaron Lu @ 2015-11-23 8:33 UTC (permalink / raw)
To: Joonsoo Kim, Vlastimil Babka
Cc: linux-mm, Huang Ying, Dave Hansen, Tim Chen, lkp,
Andrea Arcangeli, David Rientjes
[-- Attachment #1: Type: text/plain, Size: 881 bytes --]
On 11/23/2015 04:16 PM, Joonsoo Kim wrote:
> Numbers looks fine to me. I guess this performance degradation is
> caused by COMPACT_CLUSTER_MAX change (from 32 to 256). THP allocation
> is async so should be aborted quickly. But, after isolating 256
> migratable pages, it can't be aborted and will finish 256 pages
> migration (at least, current implementation).
>
> Aaron, please test again with setting COMPACT_CLUSTER_MAX to 32
> (in swap.h)?
This is what I found in include/linux/swap.h:
#define SWAP_CLUSTER_MAX 32UL
#define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
Looks like it is already 32, or am I looking at the wrong place?
BTW, I'm using v4.3 for all these tests, and I just checked v4.4-rc2,
the above definition doesn't change.
>
> And, please attach always-always's vmstat numbers, too.
Sure, attached the vmstat tool output, taken every second.
Thanks,
Aaron
[-- Attachment #2: vmstat --]
[-- Type: text/plain, Size: 9298 bytes --]
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
r b swpd free buff cache si so bi bo in cs us sy id wa st CST
6 1 0 25647504 580 626540 0 0 0 0 66 19 0 1 99 0 0 2015-11-20 02:19:37
1 0 0 25563796 580 638000 0 0 0 0 769 6085 0 1 99 0 0 2015-11-20 02:19:38
1 0 0 22010336 580 638168 0 0 0 0 1698 930 0 1 99 0 0 2015-11-20 02:19:39
1 0 0 18589868 580 638084 0 0 0 0 1198 793 0 1 99 0 0 2015-11-20 02:19:40
1 0 0 15173252 580 638104 0 0 0 0 1234 738 0 1 99 0 0 2015-11-20 02:19:41
1 0 0 11751756 580 638120 0 0 0 0 1224 679 0 1 99 0 0 2015-11-20 02:19:42
1 0 0 8322416 580 638156 0 0 0 0 1213 726 0 1 99 0 0 2015-11-20 02:19:43
1 0 0 4877336 580 638232 0 0 0 0 1171 726 0 1 99 0 0 2015-11-20 02:19:44
1 0 0 1437496 580 638300 0 0 0 0 1203 641 0 1 99 0 0 2015-11-20 02:19:45
1 0 460904 439088 284 631300 1020 465260 1020 465260 7392 6468 0 2 98 0 0 2015-11-20 02:19:46
3 1 1656704 371028 148 633716 2216 1203792 2216 1203792 253072 5293 0 4 95 1 0 2015-11-20 02:19:47
2 0 2989412 385264 140 631940 1772 1325968 1772 1325968 291189 4987 0 4 95 0 0 2015-11-20 02:19:48
1 0 4271348 396588 140 634024 604 1281156 604 1281156 114622 5095 0 2 97 0 0 2015-11-20 02:19:49
1 0 5590260 391532 140 634208 324 1318916 324 1318916 1550 5516 0 1 99 0 0 2015-11-20 02:19:50
3 0 6735960 373428 140 634744 20 1147804 20 1147804 106941 4821 0 2 98 0 0 2015-11-20 02:19:51
3 0 7933896 374244 140 636020 632 1197576 632 1197572 240440 4690 0 4 96 0 0 2015-11-20 02:19:52
3 0 9262464 366936 140 638332 128 1327512 128 1327516 291280 4277 0 4 96 0 0 2015-11-20 02:19:53
1 0 10465268 400632 140 637240 56 1204884 56 1204884 119208 4982 0 2 97 0 0 2015-11-20 02:19:54
1 0 11487212 401092 140 636896 24 1019904 24 1019904 1579 5249 0 1 99 0 0 2015-11-20 02:19:55
1 0 12398600 400644 140 637240 8 911396 8 911396 1434 4825 0 1 99 0 0 2015-11-20 02:19:56
1 1 13407712 396808 140 636480 108 1010376 108 1010376 1741 9335 0 1 98 0 0 2015-11-20 02:19:57
1 0 14212948 397452 140 637192 120 804160 120 804160 1414 4490 0 1 99 0 0 2015-11-20 02:19:58
1 0 14976844 399148 140 636696 0 763904 0 763904 1473 4379 0 1 99 0 0 2015-11-20 02:19:59
1 0 15765336 401612 140 636656 12 788508 12 788508 1387 4378 0 1 99 0 0 2015-11-20 02:20:00
1 0 16737876 403216 140 636468 80 975368 80 975364 1469 4950 0 1 99 0 0 2015-11-20 02:20:01
1 0 17532708 403472 140 637256 0 792056 0 792060 1375 4558 0 1 99 0 0 2015-11-20 02:20:02
1 0 18263000 402296 140 637784 784 733184 784 733184 15557 4555 0 1 98 0 0 2015-11-20 02:20:03
1 0 19246408 404008 140 639284 0 981040 0 981040 15169 4835 0 1 99 0 0 2015-11-20 02:20:04
1 0 19713820 407392 140 638924 0 467420 0 467420 15464 3788 0 1 99 0 0 2015-11-20 02:20:05
1 0 20326740 401112 140 639860 60 612936 60 612936 15072 4204 0 1 99 0 0 2015-11-20 02:20:06
1 0 21001060 402152 140 640008 0 676376 0 676376 15018 4148 0 1 99 0 0 2015-11-20 02:20:07
1 0 21563284 406060 140 639804 20 560188 20 560188 17919 8419 0 2 98 0 0 2015-11-20 02:20:08
1 0 22077856 403296 140 640604 0 514576 0 514576 15618 3734 0 1 99 0 0 2015-11-20 02:20:09
1 0 22578344 402016 140 640896 32 500516 32 500516 15288 3848 0 1 99 0 0 2015-11-20 02:20:10
1 0 23054368 401156 140 641000 0 476024 0 476024 15534 3896 0 1 99 0 0 2015-11-20 02:20:11
1 0 23678064 403060 140 640980 0 623700 0 623700 15184 4009 0 1 99 0 0 2015-11-20 02:20:12
1 0 24152136 424660 140 646608 7564 483848 7564 483848 3544 4709 0 1 98 0 0 2015-11-20 02:20:13
1 0 24631332 402948 140 646572 124 479232 124 479232 1475 4037 0 1 99 0 0 2015-11-20 02:20:14
1 0 25137188 399836 140 646496 0 505856 0 505856 1546 3745 0 1 99 0 0 2015-11-20 02:20:15
1 0 25809492 399544 140 639732 0 672304 0 672304 1500 4242 0 1 99 0 0 2015-11-20 02:20:16
1 0 26839604 397088 140 639816 100 1030144 100 1030144 1476 5131 0 1 99 0 0 2015-11-20 02:20:17
1 0 27873840 392212 140 640160 0 1034240 0 1034240 1387 5104 0 1 99 0 0 2015-11-20 02:20:18
1 0 28866508 423100 140 634052 40 992692 40 992692 43633 8766 0 2 98 0 0 2015-11-20 02:20:19
1 0 29953544 384884 140 634228 1020 1087856 1020 1087856 244003 2850 0 2 97 0 0 2015-11-20 02:20:20
1 0 30991516 388644 140 634544 928 1038824 928 1038824 104550 4616 0 2 98 0 0 2015-11-20 02:20:21
1 0 32099728 393432 140 634540 40 1108220 40 1108220 36817 5281 0 2 98 0 0 2015-11-20 02:20:22
1 0 33346816 398860 140 634820 864 1248384 864 1248384 203863 4811 0 3 96 0 0 2015-11-20 02:20:23
3 0 34256800 396392 92 636452 376 912708 376 912708 106741 4122 0 2 98 0 0 2015-11-20 02:20:24
3 0 35305064 360224 92 637684 756 1047104 756 1047104 215548 3407 0 4 96 0 0 2015-11-20 02:20:25
1 0 36096908 399740 92 636628 180 791124 180 791124 91507 4034 0 2 98 0 0 2015-11-20 02:20:26
3 0 37168508 388644 92 637652 596 1072448 596 1072444 33876 5317 0 2 98 0 0 2015-11-20 02:20:27
1 0 38356828 383224 92 635984 764 1189040 764 1189044 15618 5839 0 1 98 0 0 2015-11-20 02:20:28
1 0 39697584 383288 92 636848 8 1342204 8 1342204 1466 5839 0 1 99 0 0 2015-11-20 02:20:29
1 0 40936988 393532 92 636784 196 1239952 196 1239952 9621 10147 0 2 98 0 0 2015-11-20 02:20:30
1 0 42314612 393596 92 636908 0 1375824 0 1375824 1513 5957 0 1 99 0 0 2015-11-20 02:20:31
1 0 43648364 388076 92 637308 4 1333860 4 1333860 1403 5806 0 1 99 0 0 2015-11-20 02:20:32
1 0 44909932 395256 92 637168 0 1261472 0 1261472 1407 5562 0 1 99 0 0 2015-11-20 02:20:33
1 0 46161428 387000 92 637192 4 1253376 4 1253376 1341 5563 0 1 99 0 0 2015-11-20 02:20:34
1 0 47420772 389516 92 637100 0 1257472 0 1257472 1490 5671 0 1 99 0 0 2015-11-20 02:20:35
1 0 48688468 389928 92 637712 24 1267712 24 1267712 1406 5615 0 1 99 0 0 2015-11-20 02:20:36
1 1 50005344 385328 92 636720 60 1316908 60 1316908 1472 5744 0 1 99 0 0 2015-11-20 02:20:37
1 0 51232704 383852 92 636824 464 1230888 464 1230888 1564 5727 0 1 98 0 0 2015-11-20 02:20:38
1 0 52472728 383456 92 637680 4 1236964 4 1236964 1419 5537 0 1 99 0 0 2015-11-20 02:20:39
1 0 53671788 381408 92 637304 4 1201744 4 1201744 1368 5411 0 1 99 0 0 2015-11-20 02:20:40
2 0 54444956 401360 92 636396 952 771872 952 771872 55875 9172 0 2 97 0 0 2015-11-20 02:20:41
1 0 55317332 391568 92 637668 852 875836 852 875836 85727 4794 0 2 97 0 0 2015-11-20 02:20:42
1 0 56218404 409888 92 634484 764 900928 764 900928 89499 4867 0 2 97 0 0 2015-11-20 02:20:43
1 0 56989380 392256 92 633052 2016 773700 2016 773700 50196 5161 0 2 98 1 0 2015-11-20 02:20:44
3 0 57633636 378960 92 633920 944 642984 944 642984 32478 4148 0 2 98 0 0 2015-11-20 02:20:45
1 0 58693956 392100 92 633332 792 1061092 792 1061092 92713 4976 0 2 97 0 0 2015-11-20 02:20:46
1 0 59583820 407380 92 633848 912 890916 912 890916 98123 4765 0 2 98 0 0 2015-11-20 02:20:47
1 0 60493136 376532 92 633184 1128 912444 1128 912444 50279 5276 0 2 98 0 0 2015-11-20 02:20:48
1 0 61168796 398636 92 634220 920 674796 920 674796 40032 4843 0 2 98 0 0 2015-11-20 02:20:49
1 0 61952912 387680 92 633732 840 784944 840 784944 10903 4938 0 1 98 0 0 2015-11-20 02:20:50
1 0 63190048 387784 92 634132 564 1237880 564 1237880 1759 5928 0 1 98 0 0 2015-11-20 02:20:51
1 1 64455336 383276 92 633660 76 1265664 76 1265664 1541 5668 0 1 99 0 0 2015-11-20 02:20:52
1 0 65825828 386788 92 633696 84 1370948 84 1370948 2145 9935 0 1 98 0 0 2015-11-20 02:20:53
1 0 66505700 386764 92 634404 876 679936 876 679936 1574 4153 0 1 99 0 0 2015-11-20 02:20:54
1 0 67357744 384320 92 634760 8 852056 8 852056 1470 4443 0 1 99 0 0 2015-11-20 02:20:55
1 0 68500536 386720 92 634268 12 1142800 12 1142800 1367 5366 0 1 99 0 0 2015-11-20 02:20:56
1 0 69670512 385312 92 634572 0 1171456 0 1171456 1403 5325 0 1 99 0 0 2015-11-20 02:20:57
1 0 70771360 378756 92 634888 484 1099776 484 1099776 1459 5365 0 1 99 0 0 2015-11-20 02:20:58
1 0 71882880 384664 92 635176 0 1114192 0 1114192 1454 5193 0 1 99 0 0 2015-11-20 02:20:59
1 0 43315772 379880 92 634952 40 634800 40 634800 1382 3564 0 1 99 0 0 2015-11-20 02:21:00
0 0 33600 25734156 92 634884 4536 0 4536 0 2215 4864 0 1 98 1 0 2015-11-20 02:21:01
* Re: hugepage compaction causes performance drop
2015-11-23 8:33 ` Aaron Lu
@ 2015-11-23 9:24 ` Joonsoo Kim
-1 siblings, 0 replies; 30+ messages in thread
From: Joonsoo Kim @ 2015-11-23 9:24 UTC (permalink / raw)
To: Aaron Lu
Cc: Joonsoo Kim, Vlastimil Babka, Linux Memory Management List,
Huang Ying, Dave Hansen, Tim Chen, lkp, Andrea Arcangeli,
David Rientjes
2015-11-23 17:33 GMT+09:00 Aaron Lu <aaron.lu@intel.com>:
> On 11/23/2015 04:16 PM, Joonsoo Kim wrote:
>> Numbers looks fine to me. I guess this performance degradation is
>> caused by COMPACT_CLUSTER_MAX change (from 32 to 256). THP allocation
>> is async so should be aborted quickly. But, after isolating 256
>> migratable pages, it can't be aborted and will finish 256 pages
>> migration (at least, current implementation).
Let me correct the above comment: it can be aborted after some retries.
>> Aaron, please test again with setting COMPACT_CLUSTER_MAX to 32
>> (in swap.h)?
>
> This is what I found in include/linux/swap.h:
>
> #define SWAP_CLUSTER_MAX 32UL
> #define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
>
> Looks like it is already 32, or am I looking at the wrong place?
>
> BTW, I'm using v4.3 for all these tests, and I just checked v4.4-rc2,
> the above definition doesn't change.
Sorry, I was looking at the linux-next tree, where it is 128.
Please ignore my comment! :)
>>
>> And, please attach always-always's vmstat numbers, too.
>
> Sure, attached the vmstat tool output, taken every second.
Oops... I'd like to see '1 sec interval cat /proc/vmstat' for always-never.
Thanks.
* Re: hugepage compaction causes performance drop
2015-11-20 10:06 ` Vlastimil Babka
@ 2015-11-24 2:45 ` Joonsoo Kim
-1 siblings, 0 replies; 30+ messages in thread
From: Joonsoo Kim @ 2015-11-24 2:45 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Aaron Lu, linux-mm, Huang Ying, Dave Hansen, Tim Chen, lkp,
Andrea Arcangeli, David Rientjes
On Fri, Nov 20, 2015 at 11:06:46AM +0100, Vlastimil Babka wrote:
> On 11/20/2015 10:33 AM, Aaron Lu wrote:
> >On 11/20/2015 04:55 PM, Aaron Lu wrote:
> >>On 11/19/2015 09:29 PM, Vlastimil Babka wrote:
> >>>+CC Andrea, David, Joonsoo
> >>>
> >>>On 11/19/2015 10:29 AM, Aaron Lu wrote:
> >>>>The vmstat and perf-profile are also attached, please let me know if you
> >>>>need any more information, thanks.
> >>>
> >>>Output from vmstat (the tool) isn't much useful here, a periodic "cat
> >>>/proc/vmstat" would be much better.
> >>
> >>No problem.
> >>
> >>>The perf profiles are somewhat weirdly sorted by children cost (?), but
> >>>I noticed a very high cost (46%) in pageblock_pfn_to_page(). This could
> >>>be due to a very large but sparsely populated zone. Could you provide
> >>>/proc/zoneinfo?
> >>
> >>Is a one time /proc/zoneinfo enough or also a periodic one?
> >
> >Please see attached, note that this is a new run so the perf profile is
> >a little different.
> >
> >Thanks,
> >Aaron
>
> Thanks.
>
> DMA32 is a bit sparse:
>
> Node 0, zone DMA32
> pages free 62829
> min 327
> low 408
> high 490
> scanned 0
> spanned 1044480
> present 495951
> managed 479559
>
> Since the other zones are much larger, probably this is not the
> culprit. But tracepoints should tell us more. I have a theory that
> updating free scanner's cached pfn doesn't happen if it aborts due
> to need_resched() during isolate_freepages(), before hitting a valid
> pageblock, if the zone has a large hole in it. But zoneinfo doesn't
Today I revisited this issue and yes, I think your theory is
right. isolate_freepages() will not update the cached pfn until it calls
isolate_freepages_block(). So if there are many holes, many
unmovable pageblocks, or !isolation_suitable() pageblocks, the cached pfn
will not be updated when compaction aborts due to need_resched(). zoneinfo
shows that there aren't many holes, so I guess this problem is caused
by the latter two cases.
It would be better to update the cached pfn in these cases. Although I
haven't seen your solution yet, I guess it will help here.
Thanks.
* Re: hugepage compaction causes performance drop
2015-11-23 9:24 ` Joonsoo Kim
@ 2015-11-24 3:40 ` Aaron Lu
-1 siblings, 0 replies; 30+ messages in thread
From: Aaron Lu @ 2015-11-24 3:40 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Joonsoo Kim, Vlastimil Babka, Linux Memory Management List,
Huang Ying, Dave Hansen, Tim Chen, lkp, Andrea Arcangeli,
David Rientjes
[-- Attachment #1: Type: text/plain, Size: 509 bytes --]
On 11/23/2015 05:24 PM, Joonsoo Kim wrote:
> 2015-11-23 17:33 GMT+09:00 Aaron Lu <aaron.lu@intel.com>:
>> On 11/23/2015 04:16 PM, Joonsoo Kim wrote:
>>>
>>> And, please attach always-always's vmstat numbers, too.
>>
>> Sure, attached the vmstat tool output, taken every second.
>
> Oops... I'd like to see '1 sec interval cat /proc/vmstat' for always-never.
Here it is, the proc-vmstat for always-never.
BTW, I'm still learning how to do proper ftrace for this case and it may
take a while.
Thanks,
Aaron
[-- Attachment #2: proc-vmstat.gz --]
[-- Type: application/gzip, Size: 17041 bytes --]
* Re: hugepage compaction causes performance drop
2015-11-24 3:40 ` Aaron Lu
@ 2015-11-24 4:55 ` Joonsoo Kim
-1 siblings, 0 replies; 30+ messages in thread
From: Joonsoo Kim @ 2015-11-24 4:55 UTC (permalink / raw)
To: Aaron Lu
Cc: Vlastimil Babka, Linux Memory Management List, Huang Ying,
Dave Hansen, Tim Chen, lkp, Andrea Arcangeli, David Rientjes
On Tue, Nov 24, 2015 at 11:40:28AM +0800, Aaron Lu wrote:
> On 11/23/2015 05:24 PM, Joonsoo Kim wrote:
> > 2015-11-23 17:33 GMT+09:00 Aaron Lu <aaron.lu@intel.com>:
> >> On 11/23/2015 04:16 PM, Joonsoo Kim wrote:
> >>>
> >>> And, please attach always-always's vmstat numbers, too.
> >>
> >> Sure, attached the vmstat tool output, taken every second.
> >
> > Oops... I'd like to see '1 sec interval cat /proc/vmstat' for always-never.
>
> Here it is, the proc-vmstat for always-never.
Okay. In this case, compaction never happens.
Could you show 1 sec interval cat /proc/pagetypeinfo for
always-always?
> BTW, I'm still learning how to do proper ftrace for this case and it may
> take a while.
You can do it simply with trace-cmd.
sudo trace-cmd record -e compaction &
run test program
fg
Ctrl + c
sudo trace-cmd report
Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: hugepage compaction causes performance drop
2015-11-24 4:55 ` Joonsoo Kim
@ 2015-11-24 7:27 ` Aaron Lu
-1 siblings, 0 replies; 30+ messages in thread
From: Aaron Lu @ 2015-11-24 7:27 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Vlastimil Babka, Linux Memory Management List, Huang Ying,
Dave Hansen, Tim Chen, lkp, Andrea Arcangeli, David Rientjes
On 11/24/2015 12:55 PM, Joonsoo Kim wrote:
> On Tue, Nov 24, 2015 at 11:40:28AM +0800, Aaron Lu wrote:
>> BTW, I'm still learning how to do proper ftrace for this case and it may
>> take a while.
>
> You can do it simply with trace-cmd.
>
> sudo trace-cmd record -e compaction &
> run test program
> fg
> Ctrl + c
>
> sudo trace-cmd report
Thanks for the tip, I just recorded it like this:
trace-cmd record -e compaction ./usemem xxx
Due to the big size of trace.out (6MB after compression), I've uploaded it:
https://drive.google.com/open?id=0B49uX3igf4K4UkJBOGt3cHhOU00
The pagetypeinfo, perf and proc-vmstat files are also there.
Regards,
Aaron
* Re: hugepage compaction causes performance drop
2015-11-24 7:27 ` Aaron Lu
@ 2015-11-24 8:29 ` Joonsoo Kim
-1 siblings, 0 replies; 30+ messages in thread
From: Joonsoo Kim @ 2015-11-24 8:29 UTC (permalink / raw)
To: Aaron Lu
Cc: Vlastimil Babka, Linux Memory Management List, Huang Ying,
Dave Hansen, Tim Chen, lkp, Andrea Arcangeli, David Rientjes
On Tue, Nov 24, 2015 at 03:27:43PM +0800, Aaron Lu wrote:
> On 11/24/2015 12:55 PM, Joonsoo Kim wrote:
> > On Tue, Nov 24, 2015 at 11:40:28AM +0800, Aaron Lu wrote:
> >> BTW, I'm still learning how to do proper ftrace for this case and it may
> >> take a while.
> >
> > You can do it simply with trace-cmd.
> >
> > sudo trace-cmd record -e compaction &
> > run test program
> > fg
> > Ctrl + c
> >
> > sudo trace-cmd report
>
> Thanks for the tip, I just recorded it like this:
> trace-cmd record -e compaction ./usemem xxx
>
> Due to the big size of trace.out(6MB after compress), I've uploaed it:
> https://drive.google.com/open?id=0B49uX3igf4K4UkJBOGt3cHhOU00
>
> The pagetypeinfo, perf and proc-vmstat is also there.
>
Thanks.
Okay. The output proves the theory. pagetypeinfo shows that there are
too many unmovable pageblocks. isolate_freepages() should skip these,
so it's not easy to reach a suitable pageblock before need_resched(). Hence,
the cached pfn never gets updated. (You can see the unchanged free_pfn
with 'grep compaction_begin' on the tracepoint output.)
But I don't think that updating the cached pfn is enough to solve your problem.
A more complex change would be needed, I guess.
Thanks.
* Re: hugepage compaction causes performance drop
2015-11-24 8:29 ` Joonsoo Kim
@ 2015-11-25 12:44 ` Vlastimil Babka
-1 siblings, 0 replies; 30+ messages in thread
From: Vlastimil Babka @ 2015-11-25 12:44 UTC (permalink / raw)
To: Joonsoo Kim, Aaron Lu
Cc: Linux Memory Management List, Huang Ying, Dave Hansen, Tim Chen,
lkp, Andrea Arcangeli, David Rientjes
On 11/24/2015 09:29 AM, Joonsoo Kim wrote:
> On Tue, Nov 24, 2015 at 03:27:43PM +0800, Aaron Lu wrote:
>
> Thanks.
>
> Okay. Output proves the theory. pagetypeinfo shows that there are
> too many unmovable pageblocks. isolate_freepages() should skip these
> so it's not easy to meet proper pageblock until need_resched(). Hence,
> updating cached pfn doesn't happen. (You can see unchanged free_pfn
> with 'grep compaction_begin tracepoint-output')
Hm, to me it seems that the scanners meet a lot, so they restart at the zone
boundaries, and that's fine. There's nothing to cache.
> But, I don't think that updating cached pfn is enough to solve your problem.
> More complex change would be needed, I guess.
One factor is probably that THP only uses async compaction, and async compaction
doesn't result in deferred compaction, which would otherwise help here. It also means
that the pageblock_skip bits are not being reset except by kswapd...
Oh, and pageblock_pfn_to_page() is done before checking the pageblock skip bits, so
that's why it's prominent in the profiles. Although it was less prominent (9% vs
46% before) in the last data... was perf collected while tracing, thus
generating extra noise?
> Thanks.
* Re: hugepage compaction causes performance drop
2015-11-25 12:44 ` Vlastimil Babka
@ 2015-11-26 5:47 ` Aaron Lu
-1 siblings, 0 replies; 30+ messages in thread
From: Aaron Lu @ 2015-11-26 5:47 UTC (permalink / raw)
To: Vlastimil Babka, Joonsoo Kim
Cc: Linux Memory Management List, Huang Ying, Dave Hansen, Tim Chen,
lkp, Andrea Arcangeli, David Rientjes
On 11/25/2015 08:44 PM, Vlastimil Babka wrote:
> On 11/24/2015 09:29 AM, Joonsoo Kim wrote:
>> On Tue, Nov 24, 2015 at 03:27:43PM +0800, Aaron Lu wrote:
>>
>> Thanks.
>>
>> Okay. Output proves the theory. pagetypeinfo shows that there are
>> too many unmovable pageblocks. isolate_freepages() should skip these
>> so it's not easy to meet proper pageblock until need_resched(). Hence,
>> updating cached pfn doesn't happen. (You can see unchanged free_pfn
>> with 'grep compaction_begin tracepoint-output')
>
> Hm to me it seems that the scanners meet a lot, so they restart at zone
> boundaries and that's fine. There's nothing to cache.
>
>> But, I don't think that updating cached pfn is enough to solve your problem.
>> More complex change would be needed, I guess.
>
> One factor is probably that THP only use async compaction and those don't result
> in deferred compaction, which should help here. It also means that
> pageblock_skip bits are not being reset except by kswapd...
>
> Oh and pageblock_pfn_to_page is done before checking the pageblock skip bits, so
> that's why it's prominent in the profiles. Although it was less prominent (9% vs
> 46% before) in the last data... was perf collected while tracing, thus
> generating extra noise?
perf is always run during these test runs; it starts 25 seconds
after the test starts, to give the test some time to eat the remaining
free memory, so that when perf starts collecting data, swap out has
already begun. The perf data is collected for 10 seconds.
I guess the test run under trace-cmd is slower than before, so
perf is collecting data in a different time window.
Regards,
Aaron
end of thread, other threads:[~2015-11-26 5:47 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-19 9:29 hugepage compaction causes performance drop Aaron Lu
2015-11-19 13:29 ` Vlastimil Babka
2015-11-20 8:55 ` Aaron Lu
2015-11-20 9:33 ` Aaron Lu
2015-11-20 10:06 ` Vlastimil Babka
2015-11-23 8:16 ` Joonsoo Kim
2015-11-23 8:33 ` Aaron Lu
2015-11-23 9:24 ` Joonsoo Kim
2015-11-24 3:40 ` Aaron Lu
2015-11-24 4:55 ` Joonsoo Kim
2015-11-24 7:27 ` Aaron Lu
2015-11-24 8:29 ` Joonsoo Kim
2015-11-25 12:44 ` Vlastimil Babka
2015-11-26 5:47 ` Aaron Lu
2015-11-24 2:45 ` Joonsoo Kim