* [PATCH 0/1] mm: FAULT_AROUND_ORDER patchset performance data for powerpc
@ 2014-03-25 6:50 ` Madhavan Srinivasan
0 siblings, 0 replies; 21+ messages in thread
From: Madhavan Srinivasan @ 2014-03-25 6:50 UTC (permalink / raw)
To: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86
Cc: benh, paulus, kirill.shutemov, rusty, akpm, riel, mgorman, ak,
peterz, mingo, Madhavan Srinivasan
Performance data for different FAULT_AROUND_ORDER values from a 4-socket
Power7 system (128 threads and 128GB memory) is below. A fault around order
(FAO) value of 3 looks more advantageous than the current default of 4.

FAULT_AROUND_ORDER        Baseline             1             3             4             5             7

Linux build (make -j64)
minor-faults               7184385       5874015       4567289       4318518       4193815       4159193
times in seconds      61.433776136  60.865935292  59.245368038  60.630675011   60.56587624  59.828271924

Linux rebuild (make -j64)
minor-faults                303018        226392        146170        132480        126878        126236
times in seconds       5.659819172   5.723996942   5.591238319   5.622533357   5.878811995   5.550133096

Two synthetic tests: access every word in a file in sequential/random order.
Marginal performance gains are seen for an FAO value of 3 when compared to a
value of 4.

Sequential access 16GiB file
FAULT_AROUND_ORDER        Baseline             1             3             4             5             7

1 thread
minor-faults                262302        131192         32873         16486          8291          2351
times in seconds      53.071497352  52.945826882  52.931417302  52.928577184  52.859285439  53.116800539

8 threads
minor-faults               2097314       1051046        263336        131715         66098         16653
times in seconds      54.385698561  54.603652339  54.771282004  54.488565674  54.496701531  54.962142189

32 threads
minor-faults               8389267       4218595       1059961        531319        266463         67271
times in seconds       60.61715047  60.827964038   60.46412673  60.266045885  60.492398315   60.24531921

64 threads
minor-faults              16777455       8485998       2178582       1092106        544302        137693
times in seconds      86.471334554  84.412415735  85.208303832  84.331473392  85.598793479  84.695469266

128 threads
minor-faults              33555267      17734522       4710107       2380821       1182707        292077
times in seconds     117.535385569 114.291359037 112.593908276 113.081807611 114.358686588 114.491043011


Random access 1GiB file
FAULT_AROUND_ORDER        Baseline             1             3             4             5             7

1 thread
minor-faults                 16503          8664          2149          1126           610           437
times in seconds      43.843573808  48.042069805  50.580779682  54.282884593  52.641739876  51.803302129

8 threads
minor-faults                131201         70916         17760          8665          4250          1149
times in seconds      46.262626804  55.942851041  56.629191584   57.97044714  55.417557594  56.019709166

32 threads
minor-faults                524959        265980         67282         33601         16930          4316
times in seconds      67.754175928   69.85012331  71.750338061  71.053074643   68.90728294  71.250103217

64 threads
minor-faults               1048831        528829        133256         66700         33428          8776
times in seconds      96.674025305  93.109961822  87.441777715  91.986332028  88.686748472  93.101434306

128 threads
minor-faults               2098043       1053224        266271        133702         66966         17276
times in seconds     156.525792044 152.117971403 147.523673243 148.560226602 148.596575663 149.389288429

Worst case scenario: we touch one page every 16M to demonstrate overhead.

Touch only one page in page table in 16GiB file
FAULT_AROUND_ORDER        Baseline             1             3             4             5             7

1 thread
minor-faults                  1077          1064          1051          1048          1046          1045
times in seconds        0.00615347   0.008327379   0.019775282   0.034444003    0.05905971   0.220863339

8 threads
minor-faults                  8252          8239          8226          8223          8220          8224
times in seconds        0.04387392   0.059859294   0.113897648   0.199707764   0.361585762   1.343366843

32 threads
minor-faults                 32852         32841         32825         32826         32824         32828
times in seconds       0.191404544    0.21907773   0.433207123    0.72430447   1.334983196    4.97727449

64 threads
minor-faults                 65652         65642         65629         65622         65623         65634
times in seconds       0.402140429   0.510806718   0.854288645   1.412329805   2.556707704   8.711074863

128 threads
minor-faults                131255        131239        131228        131228        131229        131243
times in seconds       0.817782148   1.124631348   2.023730928   3.184792382   5.331392072  17.309524609


Madhavan Srinivasan (1):
  mm: move FAULT_AROUND_ORDER to arch/

 arch/powerpc/include/asm/pgtable.h |    6 ++++++
 arch/x86/include/asm/pgtable.h     |    5 +++++
 include/asm-generic/pgtable.h      |   10 ++++++++++
 mm/memory.c                        |    2 --
 4 files changed, 21 insertions(+), 2 deletions(-)
--
1.7.10.4
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/1] mm: move FAULT_AROUND_ORDER to arch/
2014-03-25 6:50 ` Madhavan Srinivasan
(?)
@ 2014-03-25 6:50 ` Madhavan Srinivasan
-1 siblings, 0 replies; 21+ messages in thread
From: Madhavan Srinivasan @ 2014-03-25 6:50 UTC (permalink / raw)
To: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86
Cc: benh, paulus, kirill.shutemov, rusty, akpm, riel, mgorman, ak,
peterz, mingo, Madhavan Srinivasan
Kirill A. Shutemov, with commit 96bacfe542, introduced
vm_ops->map_pages() for mapping easily accessible pages around the
fault address, in the hope of reducing the number of minor page faults.
Based on his workload runs, the suggested FAULT_AROUND_ORDER
(the knob that controls the number of pages to map) is 4.

This patch moves the FAULT_AROUND_ORDER macro to arch/ so that
architecture maintainers can decide on a suitable FAULT_AROUND_ORDER
value based on performance data for their architecture.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgtable.h |    6 ++++++
 arch/x86/include/asm/pgtable.h     |    5 +++++
 include/asm-generic/pgtable.h      |   10 ++++++++++
 mm/memory.c                        |    2 --
 4 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 3ebb188..9fcbd48 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -19,6 +19,12 @@ struct mm_struct;
 #endif
 
 /*
+ * With a few real world workloads that were run,
+ * the performance data showed that a value of 3 is more advantageous.
+ */
+#define FAULT_AROUND_ORDER 3
+
+/*
  * We save the slot number & secondary bit in the second half of the
  * PTE page. We use the 8 bytes per each pte entry.
  */
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 938ef1d..8387a65 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -7,6 +7,11 @@
 #include <asm/pgtable_types.h>
 
 /*
+ * Based on Kirill's test results, fault around order is set to 4
+ */
+#define FAULT_AROUND_ORDER 4
+
+/*
  * Macro to mark a page protection value as UC-
  */
 #define pgprot_noncached(prot) \
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 1ec08c1..62f7f07 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -7,6 +7,16 @@
 #include <linux/mm_types.h>
 #include <linux/bug.h>
 
+
+/*
+ * Fault around order is a control knob to decide the fault around pages.
+ * Default value is set to 0UL (disabled), but the arch can override it as
+ * desired.
+ */
+#ifndef FAULT_AROUND_ORDER
+#define FAULT_AROUND_ORDER 0UL
+#endif
+
 /*
  * On almost all architectures and configurations, 0 can be used as the
  * upper ceiling to free_pgtables(): on many architectures it has the same
diff --git a/mm/memory.c b/mm/memory.c
index b02c584..fd79ffc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3358,8 +3358,6 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 	update_mmu_cache(vma, address, pte);
 }
 
-#define FAULT_AROUND_ORDER 4
-
 #ifdef CONFIG_DEBUG_FS
 static unsigned int fault_around_order = FAULT_AROUND_ORDER;
 
--
1.7.10.4
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 0/1] mm: FAULT_AROUND_ORDER patchset performance data for powerpc
2014-03-25 6:50 ` Madhavan Srinivasan
(?)
@ 2014-03-25 8:11 ` Ingo Molnar
-1 siblings, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2014-03-25 8:11 UTC (permalink / raw)
To: Madhavan Srinivasan
Cc: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86, benh,
paulus, kirill.shutemov, rusty, akpm, riel, mgorman, ak, peterz,
Linus Torvalds
* Madhavan Srinivasan <maddy@linux.vnet.ibm.com> wrote:
> Performance data for different FAULT_AROUND_ORDER values from 4 socket
> Power7 system (128 Threads and 128GB memory) is below. Fault around order (FAO)
> value of 3 looks more advantageous.
>
> FAULT_AROUND_ORDER Baseline 1 3 4 5 7
>
> Linux build (make -j64)
> minor-faults 7184385 5874015 4567289 4318518 4193815 4159193
> times in seconds 61.433776136 60.865935292 59.245368038 60.630675011 60.56587624 59.828271924

Hm, I have one general observation: it's hard to tell how
(statistically) significant the time differences are without standard
deviation numbers.

You can get stddev very easily via 'perf stat --null --repeat N'.

You can use --pre <script> and --post <script> for pre/post
measurement cleanup hooks (such as 'make clean'). So for example:

   perf stat --null --repeat 3 --pre 'make defconfig; make clean >/dev/null 2>&1' make -j64 kernel/

This runs the workload 3 times and outputs something like:

   9.013717158 seconds time elapsed ( +- 0.99% )

where the +- column shows the stddev in relative percentage units.

The --null option ensures that only the time measurement is done, with no
added overhead for the workload: no other performance metrics are taken.

The overhead of the --pre stage is not added to the measured time.
Thus you can also add really expensive steps to the --pre stage, such
as dropping all caches via /proc/sys/vm/drop_caches, to measure cache-cold
results.

In the example above, the stddev value shows that the result is
significant to about the first fractional digit.

Thanks,
Ingo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/1] mm: move FAULT_AROUND_ORDER to arch/
2014-03-25 6:50 ` Madhavan Srinivasan
(?)
@ 2014-03-25 17:36 ` Kirill A. Shutemov
-1 siblings, 0 replies; 21+ messages in thread
From: Kirill A. Shutemov @ 2014-03-25 17:36 UTC (permalink / raw)
To: Madhavan Srinivasan
Cc: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86, benh,
paulus, kirill.shutemov, rusty, akpm, riel, mgorman, ak, peterz,
mingo
On Tue, Mar 25, 2014 at 12:20:15PM +0530, Madhavan Srinivasan wrote:
> Kirill A. Shutemov with the commit 96bacfe542 introduced
> vm_ops->map_pages() for mapping easy accessible pages around
> fault address in hope to reduce number of minor page faults.
> Based on his workload runs, suggested FAULT_AROUND_ORDER
> (knob to control the numbers of pages to map) is 4.
>
> This patch moves the FAULT_AROUND_ORDER macro to arch/ for
> architecture maintainers to decide on suitable FAULT_AROUND_ORDER
> value based on performance data for that architecture.
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/pgtable.h | 6 ++++++
> arch/x86/include/asm/pgtable.h | 5 +++++
> include/asm-generic/pgtable.h | 10 ++++++++++
> mm/memory.c | 2 --
> 4 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index 3ebb188..9fcbd48 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -19,6 +19,12 @@ struct mm_struct;
> #endif
>
> /*
> + * With a few real world workloads that were run,
> + * the performance data showed that a value of 3 is more advantageous.
> + */
> +#define FAULT_AROUND_ORDER 3
> +
> +/*
> * We save the slot number & secondary bit in the second half of the
> * PTE page. We use the 8 bytes per each pte entry.
> */
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 938ef1d..8387a65 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -7,6 +7,11 @@
> #include <asm/pgtable_types.h>
>
> /*
> + * Based on Kirill's test results, fault around order is set to 4
> + */
> +#define FAULT_AROUND_ORDER 4
> +
> +/*
> * Macro to mark a page protection value as UC-
> */
> #define pgprot_noncached(prot) \
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 1ec08c1..62f7f07 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -7,6 +7,16 @@
> #include <linux/mm_types.h>
> #include <linux/bug.h>
>
> +
> +/*
> + * Fault around order is a control knob to decide the fault around pages.
> + * Default value is set to 0UL (disabled), but the arch can override it as
> + * desired.
> + */
> +#ifndef FAULT_AROUND_ORDER
> +#define FAULT_AROUND_ORDER 0UL
> +#endif

The FAULT_AROUND_ORDER == 0 case should be handled separately in
do_read_fault(): there is no reason to go to do_fault_around() if we are
going to fault in only one page.

--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/1] mm: move FAULT_AROUND_ORDER to arch/
2014-03-25 17:36 ` Kirill A. Shutemov
(?)
@ 2014-03-25 17:50 ` Dave Hansen
-1 siblings, 0 replies; 21+ messages in thread
From: Dave Hansen @ 2014-03-25 17:50 UTC (permalink / raw)
To: Kirill A. Shutemov, Madhavan Srinivasan
Cc: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86, benh,
paulus, kirill.shutemov, rusty, akpm, riel, mgorman, ak, peterz,
mingo
On 03/25/2014 10:36 AM, Kirill A. Shutemov wrote:
>> > +/*
>> > + * Fault around order is a control knob to decide the fault around pages.
>> > + * Default value is set to 0UL (disabled), but the arch can override it as
>> > + * desired.
>> > + */
>> > +#ifndef FAULT_AROUND_ORDER
>> > +#define FAULT_AROUND_ORDER 0UL
>> > +#endif
> FAULT_AROUND_ORDER == 0 case should be handled separately in
> do_read_fault(): no reason to go to do_fault_around() if we are going to
> fault in only one page.
Isn't this the kind of thing we want to do in Kconfig?
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/1] mm: move FAULT_AROUND_ORDER to arch/
2014-03-25 17:36 ` Kirill A. Shutemov
(?)
@ 2014-03-27 6:20 ` Madhavan Srinivasan
-1 siblings, 0 replies; 21+ messages in thread
From: Madhavan Srinivasan @ 2014-03-27 6:20 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86, benh,
paulus, kirill.shutemov, rusty, akpm, riel, mgorman, ak, peterz,
mingo
On Tuesday 25 March 2014 11:06 PM, Kirill A. Shutemov wrote:
> On Tue, Mar 25, 2014 at 12:20:15PM +0530, Madhavan Srinivasan wrote:
>> Kirill A. Shutemov with the commit 96bacfe542 introduced
>> vm_ops->map_pages() for mapping easy accessible pages around
>> fault address in hope to reduce number of minor page faults.
>> Based on his workload runs, suggested FAULT_AROUND_ORDER
>> (knob to control the numbers of pages to map) is 4.
>>
>> This patch moves the FAULT_AROUND_ORDER macro to arch/ for
>> architecture maintainers to decide on suitable FAULT_AROUND_ORDER
>> value based on performance data for that architecture.
>>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/include/asm/pgtable.h | 6 ++++++
>> arch/x86/include/asm/pgtable.h | 5 +++++
>> include/asm-generic/pgtable.h | 10 ++++++++++
>> mm/memory.c | 2 --
>> 4 files changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>> index 3ebb188..9fcbd48 100644
>> --- a/arch/powerpc/include/asm/pgtable.h
>> +++ b/arch/powerpc/include/asm/pgtable.h
>> @@ -19,6 +19,12 @@ struct mm_struct;
>> #endif
>>
>> /*
>> + * With a few real world workloads that were run,
>> + * the performance data showed that a value of 3 is more advantageous.
>> + */
>> +#define FAULT_AROUND_ORDER 3
>> +
>> +/*
>> * We save the slot number & secondary bit in the second half of the
>> * PTE page. We use the 8 bytes per each pte entry.
>> */
>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>> index 938ef1d..8387a65 100644
>> --- a/arch/x86/include/asm/pgtable.h
>> +++ b/arch/x86/include/asm/pgtable.h
>> @@ -7,6 +7,11 @@
>> #include <asm/pgtable_types.h>
>>
>> /*
>> + * Based on Kirill's test results, fault around order is set to 4
>> + */
>> +#define FAULT_AROUND_ORDER 4
>> +
>> +/*
>> * Macro to mark a page protection value as UC-
>> */
>> #define pgprot_noncached(prot) \
>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>> index 1ec08c1..62f7f07 100644
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -7,6 +7,16 @@
>> #include <linux/mm_types.h>
>> #include <linux/bug.h>
>>
>> +
>> +/*
>> + * Fault around order is a control knob to decide the fault around pages.
>> + * Default value is set to 0UL (disabled), but the arch can override it as
>> + * desired.
>> + */
>> +#ifndef FAULT_AROUND_ORDER
>> +#define FAULT_AROUND_ORDER 0UL
>> +#endif
>
> FAULT_AROUND_ORDER == 0 case should be handled separately in
> do_read_fault(): no reason to go to do_fault_around() if we are going to
> fault in only one page.
>
OK, agreed. I am thinking of adding a FAULT_AROUND_ORDER check alongside the
map_pages check in do_read_fault(). Kindly share your thoughts.
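
A rough userspace sketch of that combined check, only to illustrate the idea.
The struct, the no-op callback and should_fault_around() are hypothetical
stand-ins, not the kernel's definitions or the final patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vm_ops table; only the member named here. */
struct example_vm_ops {
	void (*map_pages)(void);
};

/* Hypothetical no-op callback, used only to populate the table. */
static void example_noop_map_pages(void)
{
}

/*
 * True only when the mapping provides map_pages() and the order asks
 * for more than one page; otherwise the caller should take the plain
 * single-page path and never enter the fault-around code.
 */
static bool should_fault_around(const struct example_vm_ops *ops,
				unsigned long order)
{
	return ops != NULL && ops->map_pages != NULL && order > 0;
}
```

With the generic default of 0UL the helper returns false even when map_pages
is set, so the do_fault_around() call would be skipped entirely.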
With regards
Maddy
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/1] mm: move FAULT_AROUND_ORDER to arch/
2014-03-25 17:50 ` Dave Hansen
(?)
@ 2014-04-02 4:45 ` Madhavan Srinivasan
-1 siblings, 0 replies; 21+ messages in thread
From: Madhavan Srinivasan @ 2014-04-02 4:45 UTC (permalink / raw)
To: Dave Hansen, Kirill A. Shutemov
Cc: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86, benh,
paulus, kirill.shutemov, rusty, akpm, riel, mgorman, ak, peterz,
mingo
On Tuesday 25 March 2014 11:20 PM, Dave Hansen wrote:
> On 03/25/2014 10:36 AM, Kirill A. Shutemov wrote:
>>>> +/*
>>>> + * Fault around order is a control knob to decide the fault around pages.
>>>> + * Default value is set to 0UL (disabled), but the arch can override it as
>>>> + * desired.
>>>> + */
>>>> +#ifndef FAULT_AROUND_ORDER
>>>> +#define FAULT_AROUND_ORDER 0UL
>>>> +#endif
>> FAULT_AROUND_ORDER == 0 case should be handled separately in
>> do_read_fault(): no reason to go to do_fault_around() if we are going to
>> fault in only one page.
>
> Isn't this the kind of thing we want to do in Kconfig?
>
>
I am still investigating this option, since it looks better, but it is
taking time; my bad. I will get back to you on this.
With Regards
Maddy
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2014-04-02 5:03 UTC | newest]
Thread overview: 21+ messages
2014-03-25  6:50 [PATCH 0/1] mm: FAULT_AROUND_ORDER patchset performance data for powerpc Madhavan Srinivasan
2014-03-25  6:50 ` [PATCH 1/1] mm: move FAULT_AROUND_ORDER to arch/ Madhavan Srinivasan
2014-03-25 17:36   ` Kirill A. Shutemov
2014-03-25 17:50     ` Dave Hansen
2014-04-02  4:45       ` Madhavan Srinivasan
2014-03-27  6:20     ` Madhavan Srinivasan
2014-03-25  8:11 ` [PATCH 0/1] mm: FAULT_AROUND_ORDER patchset performance data for powerpc Ingo Molnar