linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] Automatic NUMA balancing and PROT_NONE handling followup v2r8
@ 2015-03-07 15:20 Mel Gorman
  2015-03-07 15:20 ` [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd Mel Gorman
                   ` (3 more replies)
  0 siblings, 4 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-07 15:20 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev,
	Mel Gorman

Dave Chinner reported a problem due to excessive NUMA balancing activity
and bisected it. The first patch in this series corrects a major problem
that is unlikely to affect Dave but is still serious. Patch 2 is a minor
cleanup that was spotted while looking at scan rate control. Patch 3 is
minor and unlikely to make a difference but is still an inconsistentcy
between base and THP handling. Patch 4 is the important one, it slows
PTE scan updates if migrations are failing or throttled. Details of the
performance impact on local tests is included in the patch.

 include/linux/migrate.h |  5 -----
 include/linux/sched.h   |  9 +++++----
 kernel/sched/fair.c     |  8 ++++++--
 mm/huge_memory.c        |  8 +++++---
 mm/memory.c             |  3 ++-
 mm/migrate.c            | 20 --------------------
 6 files changed, 18 insertions(+), 35 deletions(-)

-- 
2.1.2


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd
  2015-03-07 15:20 [RFC PATCH 0/4] Automatic NUMA balancing and PROT_NONE handling followup v2r8 Mel Gorman
@ 2015-03-07 15:20 ` Mel Gorman
  2015-03-07 20:13   ` Linus Torvalds
  2015-03-07 20:31   ` Linus Torvalds
  2015-03-07 15:20 ` [PATCH 2/4] mm: numa: Remove migrate_ratelimited Mel Gorman
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-07 15:20 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev,
	Mel Gorman

The wrong value is being returned by change_huge_pmd since commit
10c1045f28e8 ("mm: numa: avoid unnecessary TLB flushes when setting
NUMA hinting entries") which allows a fallthrough that tries to adjust
non-existent PTEs. This patch corrects it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/huge_memory.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index fc00c8cb5a82..194c0f019774 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1482,6 +1482,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 
 	if (__pmd_trans_huge_lock(pmd, vma, &ptl) == 1) {
 		pmd_t entry;
+		ret = 1;
 
 		/*
 		 * Avoid trapping faults against the zero page. The read-only
@@ -1490,11 +1491,10 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		 */
 		if (prot_numa && is_huge_zero_pmd(*pmd)) {
 			spin_unlock(ptl);
-			return 0;
+			return ret;
 		}
 
 		if (!prot_numa || !pmd_protnone(*pmd)) {
-			ret = 1;
 			entry = pmdp_get_and_clear_notify(mm, addr, pmd);
 			entry = pmd_modify(entry, newprot);
 			ret = HPAGE_PMD_NR;
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 2/4] mm: numa: Remove migrate_ratelimited
  2015-03-07 15:20 [RFC PATCH 0/4] Automatic NUMA balancing and PROT_NONE handling followup v2r8 Mel Gorman
  2015-03-07 15:20 ` [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd Mel Gorman
@ 2015-03-07 15:20 ` Mel Gorman
  2015-03-07 15:20 ` [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults Mel Gorman
  2015-03-07 15:20 ` [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur Mel Gorman
  3 siblings, 0 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-07 15:20 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev,
	Mel Gorman

This code is dead since commit 9e645ab6d089 ("sched/numa: Continue PTE
scanning even if migrate rate limited") so remove it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/migrate.h |  5 -----
 mm/migrate.c            | 20 --------------------
 2 files changed, 25 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 78baed5f2952..cac1c0904d5f 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -69,7 +69,6 @@ static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
 extern bool pmd_trans_migrating(pmd_t pmd);
 extern int migrate_misplaced_page(struct page *page,
 				  struct vm_area_struct *vma, int node);
-extern bool migrate_ratelimited(int node);
 #else
 static inline bool pmd_trans_migrating(pmd_t pmd)
 {
@@ -80,10 +79,6 @@ static inline int migrate_misplaced_page(struct page *page,
 {
 	return -EAGAIN; /* can't migrate now */
 }
-static inline bool migrate_ratelimited(int node)
-{
-	return false;
-}
 #endif /* CONFIG_NUMA_BALANCING */
 
 #if defined(CONFIG_NUMA_BALANCING) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
diff --git a/mm/migrate.c b/mm/migrate.c
index 85e042686031..6aa9a4222ea9 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1554,30 +1554,10 @@ static struct page *alloc_misplaced_dst_page(struct page *page,
  * page migration rate limiting control.
  * Do not migrate more than @pages_to_migrate in a @migrate_interval_millisecs
  * window of time. Default here says do not migrate more than 1280M per second.
- * If a node is rate-limited then PTE NUMA updates are also rate-limited. However
- * as it is faults that reset the window, pte updates will happen unconditionally
- * if there has not been a fault since @pteupdate_interval_millisecs after the
- * throttle window closed.
  */
 static unsigned int migrate_interval_millisecs __read_mostly = 100;
-static unsigned int pteupdate_interval_millisecs __read_mostly = 1000;
 static unsigned int ratelimit_pages __read_mostly = 128 << (20 - PAGE_SHIFT);
 
-/* Returns true if NUMA migration is currently rate limited */
-bool migrate_ratelimited(int node)
-{
-	pg_data_t *pgdat = NODE_DATA(node);
-
-	if (time_after(jiffies, pgdat->numabalancing_migrate_next_window +
-				msecs_to_jiffies(pteupdate_interval_millisecs)))
-		return false;
-
-	if (pgdat->numabalancing_migrate_nr_pages < ratelimit_pages)
-		return false;
-
-	return true;
-}
-
 /* Returns true if the node is migrate rate-limited after the update */
 static bool numamigrate_update_ratelimit(pg_data_t *pgdat,
 					unsigned long nr_pages)
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults
  2015-03-07 15:20 [RFC PATCH 0/4] Automatic NUMA balancing and PROT_NONE handling followup v2r8 Mel Gorman
  2015-03-07 15:20 ` [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd Mel Gorman
  2015-03-07 15:20 ` [PATCH 2/4] mm: numa: Remove migrate_ratelimited Mel Gorman
@ 2015-03-07 15:20 ` Mel Gorman
  2015-03-07 18:33   ` Linus Torvalds
  2015-03-07 15:20 ` [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur Mel Gorman
  3 siblings, 1 reply; 49+ messages in thread
From: Mel Gorman @ 2015-03-07 15:20 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev,
	Mel Gorman

Base PTEs are marked young when the NUMA hinting information is cleared
but the same does not happen for huge pages which this patch addresses.
Note that migrated pages are not marked young as the base page migration
code does not assume that migrated pages have been referenced. This could
be addressed but beyond the scope of this series which is aimed at Dave
Chinners shrink workload that is unlikely to be affected by this issue.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/huge_memory.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 194c0f019774..ae13ad31e113 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1359,6 +1359,7 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 clear_pmdnuma:
 	BUG_ON(!PageLocked(page));
 	pmd = pmd_modify(pmd, vma->vm_page_prot);
+	pmd = pmd_mkyoung(pmd);
 	set_pmd_at(mm, haddr, pmdp, pmd);
 	update_mmu_cache_pmd(vma, addr, pmdp);
 	unlock_page(page);
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-07 15:20 [RFC PATCH 0/4] Automatic NUMA balancing and PROT_NONE handling followup v2r8 Mel Gorman
                   ` (2 preceding siblings ...)
  2015-03-07 15:20 ` [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults Mel Gorman
@ 2015-03-07 15:20 ` Mel Gorman
  2015-03-07 16:36   ` Ingo Molnar
  2015-03-08  9:41   ` Ingo Molnar
  3 siblings, 2 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-07 15:20 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev,
	Mel Gorman

Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226

Across the board the 4.0-rc1 numbers are much slower, and the degradation
is far worse when using the large memory footprint configs. Perf points
straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config:

   -   56.07%    56.07%  [kernel]            [k] default_send_IPI_mask_sequence_phys
      - default_send_IPI_mask_sequence_phys
         - 99.99% physflat_send_IPI_mask
            - 99.37% native_send_call_func_ipi
                 smp_call_function_many
               - native_flush_tlb_others
                  - 99.85% flush_tlb_page
                       ptep_clear_flush
                       try_to_unmap_one
                       rmap_walk
                       try_to_unmap
                       migrate_pages
                       migrate_misplaced_page
                     - handle_mm_fault
                        - 99.73% __do_page_fault
                             trace_do_page_fault
                             do_async_page_fault
                           + async_page_fault
              0.63% native_send_call_func_single_ipi
                 generic_exec_single
                 smp_call_function_single

This is showing excessive migration activity even though excessive migrations
are meant to get throttled. Normally, the scan rate is tuned on a per-task
basis depending on the locality of faults.  However, if migrations fail
for any reason then the PTE scanner may scan faster if the faults continue
to be remote. This means there is higher system CPU overhead and fault
trapping at exactly the time we know that migrations cannot happen. This
patch tracks when migration failures occur and slows the PTE scanner.

This was tested on a 4 socket bare-metal machine with 48 cores. The results
compare 4.0-rc1, the patches applied and 3.19-vanilla which was the last
known good kernel. This is the standard autonuma benchmark

                                           4.0.0-rc1             4.0.0-rc1                3.19.0
                                             vanilla           slowscan-v2               vanilla
Time System-NUMA01                  602.44 (  0.00%)      209.42 ( 65.24%)      194.70 ( 67.68%)
Time System-NUMA01_THEADLOCAL        78.10 (  0.00%)       92.70 (-18.69%)       98.52 (-26.15%)
Time System-NUMA02                    6.47 (  0.00%)        6.06 (  6.34%)        9.28 (-43.43%)
Time System-NUMA02_SMT                5.06 (  0.00%)        3.39 ( 33.00%)        3.79 ( 25.10%)
Time Elapsed-NUMA01                 755.96 (  0.00%)      833.63 (-10.27%)      558.84 ( 26.08%)
Time Elapsed-NUMA01_THEADLOCAL      382.22 (  0.00%)      395.45 ( -3.46%)      382.54 ( -0.08%)
Time Elapsed-NUMA02                  49.38 (  0.00%)       50.21 ( -1.68%)       49.83 ( -0.91%)
Time Elapsed-NUMA02_SMT              47.70 (  0.00%)       48.55 ( -1.78%)       46.59 (  2.33%)

There is a performance drop as a result of this patch although in the
case of NUMA01 it is not a major concern as it's an adverse workload. The
important point is that in most cases system CPU usage is much lower. Here
are the totals

           4.0.0-rc1   4.0.0-rc1      3.19.0
             vanilla  slowscan-v2     vanilla
User        53384.29    56093.11    46119.12
System        692.14      311.64      306.41
Elapsed      1236.87     1328.61     1039.88

Note that the system CPU usage is now similar to 3.19-vanilla.

I also tested with a workload very similar to Dave's. The machine
configuration and storage is completely different so it's not an equivalent
test unfortunately. It's reporting the elapsed time and CPU time while
fsmark is running to create the inodes and when runnig xfsrepair afterwards

xfsrepair
                                    4.0.0-rc1             4.0.0-rc1                3.19.0
                                      vanilla           slowscan-v2               vanilla
Min      real-fsmark        1157.41 (  0.00%)     1150.38 (  0.61%)     1164.44 ( -0.61%)
Min      syst-fsmark        3998.06 (  0.00%)     3988.42 (  0.24%)     4016.12 ( -0.45%)
Min      real-xfsrepair      497.64 (  0.00%)      456.87 (  8.19%)      442.64 ( 11.05%)
Min      syst-xfsrepair      500.61 (  0.00%)      263.41 ( 47.38%)      194.97 ( 61.05%)
Amean    real-fsmark        1166.63 (  0.00%)     1155.97 (  0.91%)     1166.28 (  0.03%)
Amean    syst-fsmark        4020.94 (  0.00%)     4004.19 (  0.42%)     4025.87 ( -0.12%)
Amean    real-xfsrepair      507.85 (  0.00%)      459.58 (  9.50%)      447.66 ( 11.85%)
Amean    syst-xfsrepair      519.88 (  0.00%)      281.63 ( 45.83%)      202.93 ( 60.97%)
Stddev   real-fsmark           6.55 (  0.00%)        3.97 ( 39.30%)        1.44 ( 77.98%)
Stddev   syst-fsmark          16.22 (  0.00%)       15.09 (  6.96%)        9.76 ( 39.86%)
Stddev   real-xfsrepair       11.17 (  0.00%)        3.41 ( 69.43%)        5.57 ( 50.17%)
Stddev   syst-xfsrepair       13.98 (  0.00%)       19.94 (-42.60%)        5.69 ( 59.31%)
CoeffVar real-fsmark           0.56 (  0.00%)        0.34 ( 38.74%)        0.12 ( 77.97%)
CoeffVar syst-fsmark           0.40 (  0.00%)        0.38 (  6.57%)        0.24 ( 39.93%)
CoeffVar real-xfsrepair        2.20 (  0.00%)        0.74 ( 66.22%)        1.24 ( 43.47%)
CoeffVar syst-xfsrepair        2.69 (  0.00%)        7.08 (-163.23%)        2.80 ( -4.23%)
Max      real-fsmark        1171.98 (  0.00%)     1159.25 (  1.09%)     1167.96 (  0.34%)
Max      syst-fsmark        4033.84 (  0.00%)     4024.53 (  0.23%)     4039.20 ( -0.13%)
Max      real-xfsrepair      523.40 (  0.00%)      464.40 ( 11.27%)      455.42 ( 12.99%)
Max      syst-xfsrepair      533.37 (  0.00%)      309.38 ( 42.00%)      207.94 ( 61.01%)

The key point is that system CPU usage for xfsrepair (syst-xfsrepair)
is almost cut in half. It's still not as low as 3.19-vanilla but it's
much closer

                             4.0.0-rc1   4.0.0-rc1      3.19.0
                               vanilla  slowscan-v2     vanilla
NUMA alloc hit               146138883   121929782   104019526
NUMA alloc miss               13146328    11456356     7806370
NUMA interleave hit                  0           0           0
NUMA alloc local             146060848   121865921   103953085
NUMA base PTE updates        242201535   117237258   216624143
NUMA huge PMD updates           113270       52121      127782
NUMA page range updates      300195775   143923210   282048527
NUMA hint faults             180388025    87299060   147235021
NUMA hint local faults        72784532    32939258    61866265
NUMA hint local percent             40          37          42
NUMA pages migrated           71175262    41395302    23237799

Note the big differences in faults trapped and pages migrated. 3.19-vanilla
still migrated fewer pages but if necessary the threshold at which we
start throttling migrations can be lowered.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/sched.h | 9 +++++----
 kernel/sched/fair.c   | 8 ++++++--
 mm/huge_memory.c      | 3 ++-
 mm/memory.c           | 3 ++-
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6d77432e14ff..a419b65770d6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1625,11 +1625,11 @@ struct task_struct {
 
 	/*
 	 * numa_faults_locality tracks if faults recorded during the last
-	 * scan window were remote/local. The task scan period is adapted
-	 * based on the locality of the faults with different weights
-	 * depending on whether they were shared or private faults
+	 * scan window were remote/local or failed to migrate. The task scan
+	 * period is adapted based on the locality of the faults with different
+	 * weights depending on whether they were shared or private faults
 	 */
-	unsigned long numa_faults_locality[2];
+	unsigned long numa_faults_locality[3];
 
 	unsigned long numa_pages_migrated;
 #endif /* CONFIG_NUMA_BALANCING */
@@ -1719,6 +1719,7 @@ struct task_struct {
 #define TNF_NO_GROUP	0x02
 #define TNF_SHARED	0x04
 #define TNF_FAULT_LOCAL	0x08
+#define TNF_MIGRATE_FAIL 0x10
 
 #ifdef CONFIG_NUMA_BALANCING
 extern void task_numa_fault(int last_node, int node, int pages, int flags);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7ce18f3c097a..bcfe32088b37 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1609,9 +1609,11 @@ static void update_task_scan_period(struct task_struct *p,
 	/*
 	 * If there were no record hinting faults then either the task is
 	 * completely idle or all activity is areas that are not of interest
-	 * to automatic numa balancing. Scan slower
+	 * to automatic numa balancing. Related to that, if there were failed
+	 * migration then it implies we are migrating too quickly or the local
+	 * node is overloaded. In either case, scan slower
 	 */
-	if (local + shared == 0) {
+	if (local + shared == 0 || p->numa_faults_locality[2]) {
 		p->numa_scan_period = min(p->numa_scan_period_max,
 			p->numa_scan_period << 1);
 
@@ -2080,6 +2082,8 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
 
 	if (migrated)
 		p->numa_pages_migrated += pages;
+	if (flags & TNF_MIGRATE_FAIL)
+		p->numa_faults_locality[2] += pages;
 
 	p->numa_faults[task_faults_idx(NUMA_MEMBUF, mem_node, priv)] += pages;
 	p->numa_faults[task_faults_idx(NUMA_CPUBUF, cpu_node, priv)] += pages;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ae13ad31e113..f508fda07d34 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1353,7 +1353,8 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (migrated) {
 		flags |= TNF_MIGRATED;
 		page_nid = target_nid;
-	}
+	} else
+		flags |= TNF_MIGRATE_FAIL;
 
 	goto out;
 clear_pmdnuma:
diff --git a/mm/memory.c b/mm/memory.c
index 8068893697bb..187daf695f88 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3097,7 +3097,8 @@ static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (migrated) {
 		page_nid = target_nid;
 		flags |= TNF_MIGRATED;
-	}
+	} else
+		flags |= TNF_MIGRATE_FAIL;
 
 out:
 	if (page_nid != -1)
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-07 15:20 ` [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur Mel Gorman
@ 2015-03-07 16:36   ` Ingo Molnar
  2015-03-07 17:37     ` Mel Gorman
  2015-03-07 19:12     ` Linus Torvalds
  2015-03-08  9:41   ` Ingo Molnar
  1 sibling, 2 replies; 49+ messages in thread
From: Ingo Molnar @ 2015-03-07 16:36 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Andrew Morton, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev


* Mel Gorman <mgorman@suse.de> wrote:

> Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226
> 
> Across the board the 4.0-rc1 numbers are much slower, and the 
> degradation is far worse when using the large memory footprint 
> configs. Perf points straight at the cause - this is from 4.0-rc1 on 
> the "-o bhash=101073" config:
> 
> [...]

>            4.0.0-rc1   4.0.0-rc1      3.19.0
>              vanilla  slowscan-v2     vanilla
> User        53384.29    56093.11    46119.12
> System        692.14      311.64      306.41
> Elapsed      1236.87     1328.61     1039.88
> 
> Note that the system CPU usage is now similar to 3.19-vanilla.

Similar, but still worse, and also the elapsed time is still much 
worse. User time is much higher, although it's the same amount of work 
done on every kernel, right?

> I also tested with a workload very similar to Dave's. The machine 
> configuration and storage is completely different so it's not an 
> equivalent test unfortunately. It's reporting the elapsed time and 
> CPU time while fsmark is running to create the inodes and when 
> runnig xfsrepair afterwards
> 
> xfsrepair
>                                     4.0.0-rc1             4.0.0-rc1                3.19.0
>                                       vanilla           slowscan-v2               vanilla
> Min      real-fsmark        1157.41 (  0.00%)     1150.38 (  0.61%)     1164.44 ( -0.61%)
> Min      syst-fsmark        3998.06 (  0.00%)     3988.42 (  0.24%)     4016.12 ( -0.45%)
> Min      real-xfsrepair      497.64 (  0.00%)      456.87 (  8.19%)      442.64 ( 11.05%)
> Min      syst-xfsrepair      500.61 (  0.00%)      263.41 ( 47.38%)      194.97 ( 61.05%)
> Amean    real-fsmark        1166.63 (  0.00%)     1155.97 (  0.91%)     1166.28 (  0.03%)
> Amean    syst-fsmark        4020.94 (  0.00%)     4004.19 (  0.42%)     4025.87 ( -0.12%)
> Amean    real-xfsrepair      507.85 (  0.00%)      459.58 (  9.50%)      447.66 ( 11.85%)
> Amean    syst-xfsrepair      519.88 (  0.00%)      281.63 ( 45.83%)      202.93 ( 60.97%)
> Stddev   real-fsmark           6.55 (  0.00%)        3.97 ( 39.30%)        1.44 ( 77.98%)
> Stddev   syst-fsmark          16.22 (  0.00%)       15.09 (  6.96%)        9.76 ( 39.86%)
> Stddev   real-xfsrepair       11.17 (  0.00%)        3.41 ( 69.43%)        5.57 ( 50.17%)
> Stddev   syst-xfsrepair       13.98 (  0.00%)       19.94 (-42.60%)        5.69 ( 59.31%)
> CoeffVar real-fsmark           0.56 (  0.00%)        0.34 ( 38.74%)        0.12 ( 77.97%)
> CoeffVar syst-fsmark           0.40 (  0.00%)        0.38 (  6.57%)        0.24 ( 39.93%)
> CoeffVar real-xfsrepair        2.20 (  0.00%)        0.74 ( 66.22%)        1.24 ( 43.47%)
> CoeffVar syst-xfsrepair        2.69 (  0.00%)        7.08 (-163.23%)        2.80 ( -4.23%)
> Max      real-fsmark        1171.98 (  0.00%)     1159.25 (  1.09%)     1167.96 (  0.34%)
> Max      syst-fsmark        4033.84 (  0.00%)     4024.53 (  0.23%)     4039.20 ( -0.13%)
> Max      real-xfsrepair      523.40 (  0.00%)      464.40 ( 11.27%)      455.42 ( 12.99%)
> Max      syst-xfsrepair      533.37 (  0.00%)      309.38 ( 42.00%)      207.94 ( 61.01%)
> 
> The key point is that system CPU usage for xfsrepair (syst-xfsrepair)
> is almost cut in half. It's still not as low as 3.19-vanilla but it's
> much closer
> 
>                              4.0.0-rc1   4.0.0-rc1      3.19.0
>                                vanilla  slowscan-v2     vanilla
> NUMA alloc hit               146138883   121929782   104019526
> NUMA alloc miss               13146328    11456356     7806370
> NUMA interleave hit                  0           0           0
> NUMA alloc local             146060848   121865921   103953085
> NUMA base PTE updates        242201535   117237258   216624143
> NUMA huge PMD updates           113270       52121      127782
> NUMA page range updates      300195775   143923210   282048527
> NUMA hint faults             180388025    87299060   147235021
> NUMA hint local faults        72784532    32939258    61866265
> NUMA hint local percent             40          37          42
> NUMA pages migrated           71175262    41395302    23237799
> 
> Note the big differences in faults trapped and pages migrated. 
> 3.19-vanilla still migrated fewer pages but if necessary the 
> threshold at which we start throttling migrations can be lowered.

This too is still worse than what v3.19 had.

So what worries me is that Dave bisected the regression to:

  4d9424669946 ("mm: convert p[te|md]_mknonnuma and remaining page table manipulations")

And clearly your patch #4 just tunes balancing/migration intensity - 
is that a workaround for the real problem/bug?

And the patch Dave bisected to is a relatively simple patch.
Why not simply revert it to see whether that cures much of the 
problem?

Am I missing something fundamental?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-07 16:36   ` Ingo Molnar
@ 2015-03-07 17:37     ` Mel Gorman
  2015-03-08  9:54       ` Ingo Molnar
  2015-03-07 19:12     ` Linus Torvalds
  1 sibling, 1 reply; 49+ messages in thread
From: Mel Gorman @ 2015-03-07 17:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dave Chinner, Andrew Morton, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev

On Sat, Mar 07, 2015 at 05:36:58PM +0100, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@suse.de> wrote:
> 
> > Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226
> > 
> > Across the board the 4.0-rc1 numbers are much slower, and the 
> > degradation is far worse when using the large memory footprint 
> > configs. Perf points straight at the cause - this is from 4.0-rc1 on 
> > the "-o bhash=101073" config:
> > 
> > [...]
> 
> >            4.0.0-rc1   4.0.0-rc1      3.19.0
> >              vanilla  slowscan-v2     vanilla
> > User        53384.29    56093.11    46119.12
> > System        692.14      311.64      306.41
> > Elapsed      1236.87     1328.61     1039.88
> > 
> > Note that the system CPU usage is now similar to 3.19-vanilla.
> 
> Similar, but still worse, and also the elapsed time is still much 
> worse. User time is much higher, although it's the same amount of work 
> done on every kernel, right?
> 

Elapsed time is primarily worse on one benchmark -- numa01 which is an
adverse workload. The user time differences are also dominated by that
benchmark

                                           4.0.0-rc1             4.0.0-rc1                3.19.0
                                             vanilla         slowscan-v2r7               vanilla
Time User-NUMA01                  32883.59 (  0.00%)    35288.00 ( -7.31%)    25695.96 ( 21.86%)
Time User-NUMA01_THEADLOCAL       17453.20 (  0.00%)    17765.79 ( -1.79%)    17404.36 (  0.28%)
Time User-NUMA02                   2063.70 (  0.00%)     2063.22 (  0.02%)     2037.65 (  1.26%)
Time User-NUMA02_SMT                983.70 (  0.00%)      976.01 (  0.78%)      981.02 (  0.27%)


> > I also tested with a workload very similar to Dave's. The machine 
> > configuration and storage is completely different so it's not an 
> > equivalent test unfortunately. It's reporting the elapsed time and 
> > CPU time while fsmark is running to create the inodes and when 
> > runnig xfsrepair afterwards
> > 
> > xfsrepair
> >                                     4.0.0-rc1             4.0.0-rc1                3.19.0
> >                                       vanilla           slowscan-v2               vanilla
> > Min      real-fsmark        1157.41 (  0.00%)     1150.38 (  0.61%)     1164.44 ( -0.61%)
> > Min      syst-fsmark        3998.06 (  0.00%)     3988.42 (  0.24%)     4016.12 ( -0.45%)
> > Min      real-xfsrepair      497.64 (  0.00%)      456.87 (  8.19%)      442.64 ( 11.05%)
> > Min      syst-xfsrepair      500.61 (  0.00%)      263.41 ( 47.38%)      194.97 ( 61.05%)
> > Amean    real-fsmark        1166.63 (  0.00%)     1155.97 (  0.91%)     1166.28 (  0.03%)
> > Amean    syst-fsmark        4020.94 (  0.00%)     4004.19 (  0.42%)     4025.87 ( -0.12%)
> > Amean    real-xfsrepair      507.85 (  0.00%)      459.58 (  9.50%)      447.66 ( 11.85%)
> > Amean    syst-xfsrepair      519.88 (  0.00%)      281.63 ( 45.83%)      202.93 ( 60.97%)
> > Stddev   real-fsmark           6.55 (  0.00%)        3.97 ( 39.30%)        1.44 ( 77.98%)
> > Stddev   syst-fsmark          16.22 (  0.00%)       15.09 (  6.96%)        9.76 ( 39.86%)
> > Stddev   real-xfsrepair       11.17 (  0.00%)        3.41 ( 69.43%)        5.57 ( 50.17%)
> > Stddev   syst-xfsrepair       13.98 (  0.00%)       19.94 (-42.60%)        5.69 ( 59.31%)
> > CoeffVar real-fsmark           0.56 (  0.00%)        0.34 ( 38.74%)        0.12 ( 77.97%)
> > CoeffVar syst-fsmark           0.40 (  0.00%)        0.38 (  6.57%)        0.24 ( 39.93%)
> > CoeffVar real-xfsrepair        2.20 (  0.00%)        0.74 ( 66.22%)        1.24 ( 43.47%)
> > CoeffVar syst-xfsrepair        2.69 (  0.00%)        7.08 (-163.23%)        2.80 ( -4.23%)
> > Max      real-fsmark        1171.98 (  0.00%)     1159.25 (  1.09%)     1167.96 (  0.34%)
> > Max      syst-fsmark        4033.84 (  0.00%)     4024.53 (  0.23%)     4039.20 ( -0.13%)
> > Max      real-xfsrepair      523.40 (  0.00%)      464.40 ( 11.27%)      455.42 ( 12.99%)
> > Max      syst-xfsrepair      533.37 (  0.00%)      309.38 ( 42.00%)      207.94 ( 61.01%)
> > 
> > The key point is that system CPU usage for xfsrepair (syst-xfsrepair)
> > is almost cut in half. It's still not as low as 3.19-vanilla but it's
> > much closer
> > 
> >                              4.0.0-rc1   4.0.0-rc1      3.19.0
> >                                vanilla  slowscan-v2     vanilla
> > NUMA alloc hit               146138883   121929782   104019526
> > NUMA alloc miss               13146328    11456356     7806370
> > NUMA interleave hit                  0           0           0
> > NUMA alloc local             146060848   121865921   103953085
> > NUMA base PTE updates        242201535   117237258   216624143
> > NUMA huge PMD updates           113270       52121      127782
> > NUMA page range updates      300195775   143923210   282048527
> > NUMA hint faults             180388025    87299060   147235021
> > NUMA hint local faults        72784532    32939258    61866265
> > NUMA hint local percent             40          37          42
> > NUMA pages migrated           71175262    41395302    23237799
> > 
> > Note the big differences in faults trapped and pages migrated. 
> > 3.19-vanilla still migrated fewer pages but if necessary the 
> > threshold at which we start throttling migrations can be lowered.
> 
> This too is still worse than what v3.19 had.
> 

Yes.

> So what worries me is that Dave bisected the regression to:
> 
>   4d9424669946 ("mm: convert p[te|md]_mknonnuma and remaining page table manipulations")
> 
> And clearly your patch #4 just tunes balancing/migration intensity - 
> is that a workaround for the real problem/bug?
> 

The patch makes NUMA hinting faults use standard page table handling routines
and protections to trap the faults. Fundamentally it's safer even though
it appears to cause more traps to be handled. I've been assuming this is
related to the different permissions PTEs get and when they are visible on
all CPUs. This path is addressing the symptom that more faults are being
handled and that it needs to be less aggressive.

I've gone through that patch and didn't spot anything else that is doing
wrong that is not already handled in this series. Did you spot anything
obviously wrong in that patch that isn't addressed in this series?

> And the patch Dave bisected to is a relatively simple patch.
> Why not simply revert it to see whether that cures much of the 
> problem?
> 

Because it also means reverting all the PROT_NONE handling and going back
to _PAGE_NUMA tricks which I expect would be naked by Linus.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults
  2015-03-07 15:20 ` [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults Mel Gorman
@ 2015-03-07 18:33   ` Linus Torvalds
  2015-03-07 18:42     ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-07 18:33 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Andrew Morton, Ingo Molnar, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sat, Mar 7, 2015 at 7:20 AM, Mel Gorman <mgorman@suse.de> wrote:
>         pmd = pmd_modify(pmd, vma->vm_page_prot);
> +       pmd = pmd_mkyoung(pmd);

Hmm. I *thought* this should be unnecessary. vm_page_prot alreadty has
the accessed bit set, and we kind of depend on the initial page table
setup and mk_pte() and friends (ie all new pages are installed
"young").

But it looks like I am wrong - the way we use _[H]PAGE_CHG_MASK means
that we always take the accessed and dirty bits from the old entry,
ignoring the bit in vm_page_prot.

I wonder if we should just make pte/pmd_modify() work the way I
*thought* they worked (remove the masking of vm_page_prot bits).

So the patch isn't wrong. It's just that we *migth* instead just do
something like this:

    arch/x86/include/asm/pgtable.h | 4 ++--
    1 file changed, 2 insertions(+), 2 deletions(-)

   diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
   index a0c35bf6cb92..79b898bb9e18 100644
   --- a/arch/x86/include/asm/pgtable.h
   +++ b/arch/x86/include/asm/pgtable.h
   @@ -355,7 +355,7 @@ static inline pte_t pte_modify(pte_t pte,
pgprot_t newprot)
            * the newprot (if present):
            */
           val &= _PAGE_CHG_MASK;
   -       val |= massage_pgprot(newprot) & ~_PAGE_CHG_MASK;
   +       val |= massage_pgprot(newprot);

           return __pte(val);
    }
   @@ -365,7 +365,7 @@ static inline pmd_t pmd_modify(pmd_t pmd,
pgprot_t newprot)
           pmdval_t val = pmd_val(pmd);

           val &= _HPAGE_CHG_MASK;
   -       val |= massage_pgprot(newprot) & ~_HPAGE_CHG_MASK;
   +       val |= massage_pgprot(newprot);

           return __pmd(val);
    }

instead, and remove the mkyoung. Completely untested, but that "just
or in the new protection bits" is what pnf_pte() does just a few lines
above this.

                        Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults
  2015-03-07 18:33   ` Linus Torvalds
@ 2015-03-07 18:42     ` Linus Torvalds
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-07 18:42 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Andrew Morton, Ingo Molnar, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sat, Mar 7, 2015 at 10:33 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>             Completely untested, but that "just
> or in the new protection bits" is what pnf_pte() does just a few lines
> above this.

Hmm. Looking at this, we do *not* want to set _PAGE_ACCESSED when we
turn a page into PROT_NONE or mark it for numa faulting. Nor do we
want to set it for mprotect for random pages that we haven't actually
accessed, just changed the protections for.

So my patch was obviously wrong, and I should feel bad for suggesting
it. I'm a moron, and my expectations that "pte_modify()" would just
take the accessed bit from the vm_page_prot field was stupid and
wrong.

Mel's patch is the right thing to do.

                                Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-07 16:36   ` Ingo Molnar
  2015-03-07 17:37     ` Mel Gorman
@ 2015-03-07 19:12     ` Linus Torvalds
  2015-03-08 10:02       ` Ingo Molnar
  1 sibling, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-07 19:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mel Gorman, Dave Chinner, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sat, Mar 7, 2015 at 8:36 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> And the patch Dave bisected to is a relatively simple patch.
> Why not simply revert it to see whether that cures much of the
> problem?

So the problem with that is that "pmd_set_numa()" and friends simply
no longer exist. So we can't just revert that one patch, it's the
whole series, and the whole point of the series.

What confuses me is that the only real change that I can see in that
patch is the change to "change_huge_pmd()". Everything else is pretty
much a 100% equivalent transformation, afaik. Of course, I may be
wrong about that, and missing something silly.

And the changes to "change_huge_pmd()" were basically re-done
differently by subsequent patches anyway.

The *only* change I see remaining is that change_huge_pmd() now does

   entry = pmdp_get_and_clear_notify(mm, addr, pmd);
   entry = pmd_modify(entry, newprot);
   set_pmd_at(mm, addr, pmd, entry);

for all changes. It used to do that "pmdp_set_numa()" for the
prot_numa case, which did just

   pmd_t pmd = *pmdp;
   pmd = pmd_mknuma(pmd);
   set_pmd_at(mm, addr, pmdp, pmd);

instead.

I don't like the old pmdp_set_numa() because it can drop dirty bits,
so I think the old code was actively buggy.

But I do *not* see why the new code would cause more migrations to happen.

There's probably something really stupid I'm missing.

                           Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd
  2015-03-07 15:20 ` [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd Mel Gorman
@ 2015-03-07 20:13   ` Linus Torvalds
  2015-03-07 20:31   ` Linus Torvalds
  1 sibling, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-07 20:13 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Andrew Morton, Ingo Molnar, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

Looks obviously correct. The old code was just very wrong.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>

                     Linus


On Sat, Mar 7, 2015 at 7:20 AM, Mel Gorman <mgorman@suse.de> wrote:
> The wrong value is being returned by change_huge_pmd since commit
> 10c1045f28e8 ("mm: numa: avoid unnecessary TLB flushes when setting
> NUMA hinting entries") which allows a fallthrough that tries to adjust
> non-existent PTEs. This patch corrects it.
>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> ---
>  mm/huge_memory.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index fc00c8cb5a82..194c0f019774 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1482,6 +1482,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>
>         if (__pmd_trans_huge_lock(pmd, vma, &ptl) == 1) {
>                 pmd_t entry;
> +               ret = 1;
>
>                 /*
>                  * Avoid trapping faults against the zero page. The read-only
> @@ -1490,11 +1491,10 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>                  */
>                 if (prot_numa && is_huge_zero_pmd(*pmd)) {
>                         spin_unlock(ptl);
> -                       return 0;
> +                       return ret;
>                 }
>
>                 if (!prot_numa || !pmd_protnone(*pmd)) {
> -                       ret = 1;
>                         entry = pmdp_get_and_clear_notify(mm, addr, pmd);
>                         entry = pmd_modify(entry, newprot);
>                         ret = HPAGE_PMD_NR;
> --
> 2.1.2
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd
  2015-03-07 15:20 ` [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd Mel Gorman
  2015-03-07 20:13   ` Linus Torvalds
@ 2015-03-07 20:31   ` Linus Torvalds
  2015-03-07 20:56     ` Mel Gorman
  1 sibling, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-07 20:31 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Andrew Morton, Ingo Molnar, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sat, Mar 7, 2015 at 7:20 AM, Mel Gorman <mgorman@suse.de> wrote:
>
>                 if (!prot_numa || !pmd_protnone(*pmd)) {
> -                       ret = 1;
>                         entry = pmdp_get_and_clear_notify(mm, addr, pmd);
>                         entry = pmd_modify(entry, newprot);
>                         ret = HPAGE_PMD_NR;

Hmm. I know I acked this already, but the return value - which correct
- is still potentially something we could improve upon.

In particular, we don't need to flush the TLB's if the old entry was
not present. Sadly, we don't have a helper function for that.

But the code *could* do something like

    entry = pmdp_get_and_clear_notify(mm, addr, pmd);
    ret = pmd_tlb_cacheable(entry) ? HPAGE_PMD_NR : 1;
    entry = pmd_modify(entry, newprot);

where pmd_tlb_cacheable() on x86 would test if _PAGE_PRESENT (bit #0) is set.

In particular, that would mean that as we change *from* a protnone
(whether NUMA or really protnone) we wouldn't need to flush the TLB.

In fact, we could make it even more aggressive: it's not just an old
non-present TLB entry that doesn't need flushing - we can avoid the
flushing whenever we strictly increase the access rigths. So we could
have something that takes the old entry _and_ the new protections into
account, and avoids the TLB flush if the new entry is strictly more
permissive.

This doesn't explain the extra TLB flushes Dave sees, though, because
the old code didn't make those kinds of optimizations either. But
maybe something like this is worth doing.

                            Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd
  2015-03-07 20:31   ` Linus Torvalds
@ 2015-03-07 20:56     ` Mel Gorman
  0 siblings, 0 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-07 20:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Andrew Morton, Ingo Molnar, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sat, Mar 07, 2015 at 12:31:03PM -0800, Linus Torvalds wrote:
> On Sat, Mar 7, 2015 at 7:20 AM, Mel Gorman <mgorman@suse.de> wrote:
> >
> >                 if (!prot_numa || !pmd_protnone(*pmd)) {
> > -                       ret = 1;
> >                         entry = pmdp_get_and_clear_notify(mm, addr, pmd);
> >                         entry = pmd_modify(entry, newprot);
> >                         ret = HPAGE_PMD_NR;
> 
> Hmm. I know I acked this already, but the return value - which correct
> - is still potentially something we could improve upon.
> 
> In particular, we don't need to flush the TLB's if the old entry was
> not present. Sadly, we don't have a helper function for that.
> 
> But the code *could* do something like
> 
>     entry = pmdp_get_and_clear_notify(mm, addr, pmd);
>     ret = pmd_tlb_cacheable(entry) ? HPAGE_PMD_NR : 1;
>     entry = pmd_modify(entry, newprot);
> 
> where pmd_tlb_cacheable() on x86 would test if _PAGE_PRESENT (bit #0) is set.
> 

I agree with you in principle. pmd_tlb_cacheable looks and sounds very
similar to pte_accessible().

> In particular, that would mean that as we change *from* a protnone
> (whether NUMA or really protnone) we wouldn't need to flush the TLB.
> 
> In fact, we could make it even more aggressive: it's not just an old
> non-present TLB entry that doesn't need flushing - we can avoid the
> flushing whenever we strictly increase the access rigths. So we could
> have something that takes the old entry _and_ the new protections into
> account, and avoids the TLB flush if the new entry is strictly more
> permissive.
> 
> This doesn't explain the extra TLB flushes Dave sees, though, because
> the old code didn't make those kinds of optimizations either. But
> maybe something like this is worth doing.
> 

I think it is worth doing although it'll be after LSF/MM before I do it. I
severely doubt this is what Dave is seeing because the vmstats indicated
there was no THP activity but it's still a good idea.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-07 15:20 ` [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur Mel Gorman
  2015-03-07 16:36   ` Ingo Molnar
@ 2015-03-08  9:41   ` Ingo Molnar
  1 sibling, 0 replies; 49+ messages in thread
From: Ingo Molnar @ 2015-03-08  9:41 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Andrew Morton, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev


* Mel Gorman <mgorman@suse.de> wrote:

> xfsrepair
>                                     4.0.0-rc1             4.0.0-rc1                3.19.0
>                                       vanilla           slowscan-v2               vanilla
> Min      real-fsmark        1157.41 (  0.00%)     1150.38 (  0.61%)     1164.44 ( -0.61%)
> Min      syst-fsmark        3998.06 (  0.00%)     3988.42 (  0.24%)     4016.12 ( -0.45%)
> Min      real-xfsrepair      497.64 (  0.00%)      456.87 (  8.19%)      442.64 ( 11.05%)
> Min      syst-xfsrepair      500.61 (  0.00%)      263.41 ( 47.38%)      194.97 ( 61.05%)
> Amean    real-fsmark        1166.63 (  0.00%)     1155.97 (  0.91%)     1166.28 (  0.03%)
> Amean    syst-fsmark        4020.94 (  0.00%)     4004.19 (  0.42%)     4025.87 ( -0.12%)
> Amean    real-xfsrepair      507.85 (  0.00%)      459.58 (  9.50%)      447.66 ( 11.85%)
> Amean    syst-xfsrepair      519.88 (  0.00%)      281.63 ( 45.83%)      202.93 ( 60.97%)
> Stddev   real-fsmark           6.55 (  0.00%)        3.97 ( 39.30%)        1.44 ( 77.98%)
> Stddev   syst-fsmark          16.22 (  0.00%)       15.09 (  6.96%)        9.76 ( 39.86%)
> Stddev   real-xfsrepair       11.17 (  0.00%)        3.41 ( 69.43%)        5.57 ( 50.17%)
> Stddev   syst-xfsrepair       13.98 (  0.00%)       19.94 (-42.60%)        5.69 ( 59.31%)
> CoeffVar real-fsmark           0.56 (  0.00%)        0.34 ( 38.74%)        0.12 ( 77.97%)
> CoeffVar syst-fsmark           0.40 (  0.00%)        0.38 (  6.57%)        0.24 ( 39.93%)
> CoeffVar real-xfsrepair        2.20 (  0.00%)        0.74 ( 66.22%)        1.24 ( 43.47%)
> CoeffVar syst-xfsrepair        2.69 (  0.00%)        7.08 (-163.23%)        2.80 ( -4.23%)
> Max      real-fsmark        1171.98 (  0.00%)     1159.25 (  1.09%)     1167.96 (  0.34%)
> Max      syst-fsmark        4033.84 (  0.00%)     4024.53 (  0.23%)     4039.20 ( -0.13%)
> Max      real-xfsrepair      523.40 (  0.00%)      464.40 ( 11.27%)      455.42 ( 12.99%)
> Max      syst-xfsrepair      533.37 (  0.00%)      309.38 ( 42.00%)      207.94 ( 61.01%)

Btw., I think it would be nice if these numbers listed v3.19 
performance in the first column, to make it clear at a glance
how much regression we still have?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-07 17:37     ` Mel Gorman
@ 2015-03-08  9:54       ` Ingo Molnar
  0 siblings, 0 replies; 49+ messages in thread
From: Ingo Molnar @ 2015-03-08  9:54 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Andrew Morton, Linus Torvalds, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, linuxppc-dev


* Mel Gorman <mgorman@suse.de> wrote:

> Elapsed time is primarily worse on one benchmark -- numa01 which is 
> an adverse workload. The user time differences are also dominated by 
> that benchmark
> 
>                                            4.0.0-rc1             4.0.0-rc1                3.19.0
>                                              vanilla         slowscan-v2r7               vanilla
> Time User-NUMA01                  32883.59 (  0.00%)    35288.00 ( -7.31%)    25695.96 ( 21.86%)
> Time User-NUMA01_THEADLOCAL       17453.20 (  0.00%)    17765.79 ( -1.79%)    17404.36 (  0.28%)
> Time User-NUMA02                   2063.70 (  0.00%)     2063.22 (  0.02%)     2037.65 (  1.26%)
> Time User-NUMA02_SMT                983.70 (  0.00%)      976.01 (  0.78%)      981.02 (  0.27%)

But even for 'numa02', the simplest of the workloads, there appears to 
be some of a regression relative to v3.19, which ought to be beyond 
the noise of the measurement (which would be below 1% I suspect), and 
as such relevant, right?

And the XFS numbers still show significant regression compared to 
v3.19 - and that cannot be ignored as artificial, 'adversarial' 
workload, right?

For example, from your numbers:

xfsrepair
                                    4.0.0-rc1             4.0.0-rc1                3.19.0
                                      vanilla           slowscan-v2               vanilla
...
Amean    real-xfsrepair      507.85 (  0.00%)      459.58 (  9.50%)      447.66 ( 11.85%)
Amean    syst-xfsrepair      519.88 (  0.00%)      281.63 ( 45.83%)      202.93 ( 60.97%)

if I interpret the numbers correctly, it shows that compared to v3.19, 
system time increased by 38% - which is rather significant!

> > So what worries me is that Dave bisected the regression to:
> > 
> >   4d9424669946 ("mm: convert p[te|md]_mknonnuma and remaining page table manipulations")
> > 
> > And clearly your patch #4 just tunes balancing/migration intensity 
> > - is that a workaround for the real problem/bug?
> 
> The patch makes NUMA hinting faults use standard page table handling 
> routines and protections to trap the faults. Fundamentally it's 
> safer even though it appears to cause more traps to be handled. I've 
> been assuming this is related to the different permissions PTEs get 
> and when they are visible on all CPUs. This path is addressing the 
> symptom that more faults are being handled and that it needs to be 
> less aggressive.

But the whole cleanup ought to have been close to an identity 
transformation from the CPU's point of view - and your measurements 
seem to confirm Dave's findings.

And your measurement was on bare metal, while Dave's is on a VM, and 
both show a significant slowdown on the xfs tests even with your 
slow-tuning patch applied, so it's unlikely to be a measurement fluke 
or some weird platform property.

> I've gone through that patch and didn't spot anything else that is 
> doing wrong that is not already handled in this series. Did you spot 
> anything obviously wrong in that patch that isn't addressed in this 
> series?

I didn't spot anything wrong, but is that a basis to go forward and 
work around the regression, in a way that doesn't even recover lost 
performance?

> > And the patch Dave bisected to is a relatively simple patch. Why 
> > not simply revert it to see whether that cures much of the 
> > problem?
> 
> Because it also means reverting all the PROT_NONE handling and going 
> back to _PAGE_NUMA tricks which I expect would be naked by Linus.

Yeah, I realize that (and obviously I support the PROT_NONE direction 
that Peter Zijlstra prototyped with the original sched/numa series), 
but can we leave this much of a regression on the table?

I hate to be such a pain in the neck, but especially the 'down tuning' 
of the scanning intensity will make an apples to apples comparison 
harder!

I'd rather not do the slow-tuning part and leave sucky performance in 
place for now and have an easy method plus the motivation to find and 
fix the real cause of the regression, than to partially hide it this 
way ...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-07 19:12     ` Linus Torvalds
@ 2015-03-08 10:02       ` Ingo Molnar
  2015-03-08 18:35         ` Linus Torvalds
  2015-03-08 20:40         ` Mel Gorman
  0 siblings, 2 replies; 49+ messages in thread
From: Ingo Molnar @ 2015-03-08 10:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Dave Chinner, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sat, Mar 7, 2015 at 8:36 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > And the patch Dave bisected to is a relatively simple patch. Why 
> > not simply revert it to see whether that cures much of the 
> > problem?
> 
> So the problem with that is that "pmd_set_numa()" and friends simply 
> no longer exist. So we can't just revert that one patch, it's the 
> whole series, and the whole point of the series.

Yeah.

> What confuses me is that the only real change that I can see in that 
> patch is the change to "change_huge_pmd()". Everything else is 
> pretty much a 100% equivalent transformation, afaik. Of course, I 
> may be wrong about that, and missing something silly.

Well, there's a difference in what we write to the pte:

 #define _PAGE_BIT_NUMA          (_PAGE_BIT_GLOBAL+1)
 #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL

and our expectation was that the two should be equivalent methods from 
the POV of the NUMA balancing code, right?

> And the changes to "change_huge_pmd()" were basically re-done
> differently by subsequent patches anyway.
> 
> The *only* change I see remaining is that change_huge_pmd() now does
> 
>    entry = pmdp_get_and_clear_notify(mm, addr, pmd);
>    entry = pmd_modify(entry, newprot);
>    set_pmd_at(mm, addr, pmd, entry);
> 
> for all changes. It used to do that "pmdp_set_numa()" for the
> prot_numa case, which did just
> 
>    pmd_t pmd = *pmdp;
>    pmd = pmd_mknuma(pmd);
>    set_pmd_at(mm, addr, pmdp, pmd);
> 
> instead.
> 
> I don't like the old pmdp_set_numa() because it can drop dirty bits,
> so I think the old code was actively buggy.

Could we, as a silly testing hack not to be applied, write a 
hack-patch that re-introduces the racy way of setting the NUMA bit, to 
confirm that it is indeed this difference that changes pte visibility 
across CPUs enough to create so many more faults?

Because if the answer is 'yes', then we can safely say: 'we regressed 
performance because correctness [not dropping dirty bits] comes before 
performance'.

If the answer is 'no', then we still have a mystery (and a regression) 
to track down.

As a second hack (not to be applied), could we change:

 #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL

to:

 #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)

to double check that the position of the bit does not matter?

I don't think we've exhaused all avenues of analysis here.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-08 10:02       ` Ingo Molnar
@ 2015-03-08 18:35         ` Linus Torvalds
  2015-03-08 18:46           ` Linus Torvalds
  2015-03-09 11:29           ` Dave Chinner
  2015-03-08 20:40         ` Mel Gorman
  1 sibling, 2 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-08 18:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mel Gorman, Dave Chinner, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sun, Mar 8, 2015 at 3:02 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> Well, there's a difference in what we write to the pte:
>
>  #define _PAGE_BIT_NUMA          (_PAGE_BIT_GLOBAL+1)
>  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
>
> and our expectation was that the two should be equivalent methods from
> the POV of the NUMA balancing code, right?

Right.

But yes, we might have screwed something up. In particular, there
might be something that thinks it cares about the global bit, but
doesn't notice that the present bit isn't set, so it considers the
protnone mappings to be global and causes lots more tlb flushes etc.

>> I don't like the old pmdp_set_numa() because it can drop dirty bits,
>> so I think the old code was actively buggy.
>
> Could we, as a silly testing hack not to be applied, write a
> hack-patch that re-introduces the racy way of setting the NUMA bit, to
> confirm that it is indeed this difference that changes pte visibility
> across CPUs enough to create so many more faults?

So one of Mel's patches did that, but I don't know if Dave tested it.

And thinking about it, it *may* be safe for huge-pages, if they always
already have the dirty bit set to begin with. And I don't see how we
could have a clean hugepage (apart from the special case of the
zeropage, which is read-only, so races on teh dirty bit aren't an
issue).

So it might actually be that the non-atomic version is safe for
hpages. And we could possibly get rid of the "atomic read-and-clear"
even for the non-numa case.

I'd rather do it for both cases than for just one of them.

But:

> As a second hack (not to be applied), could we change:
>
>  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
>
> to:
>
>  #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)
>
> to double check that the position of the bit does not matter?

Agreed. We should definitely try that.

Dave?

Also, is there some sane way for me to actually see this behavior on a
regular machine with just a single socket? Dave is apparently running
in some fake-numa setup, I'm wondering if this is easy enough to
reproduce that I could see it myself.

                          Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-08 18:35         ` Linus Torvalds
@ 2015-03-08 18:46           ` Linus Torvalds
  2015-03-09 11:29           ` Dave Chinner
  1 sibling, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-08 18:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mel Gorman, Dave Chinner, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sun, Mar 8, 2015 at 11:35 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>> As a second hack (not to be applied), could we change:
>>
>>  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
>>
>> to:
>>
>>  #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)
>>
>> to double check that the position of the bit does not matter?
>
> Agreed. We should definitely try that.

There's a second reason to do that, actually: the __supported_pte_mask
thing, _and_ the pageattr stuff in __split_large_page() etc play games
with _PAGE_GLOBAL. As does drivers/lguest for some reason.

So looking at this all, there's a lot of room for confusion with _PAGE_GLOBAL.

That kind of confusion would certainly explain the whole "the changes
_look_ like they do the same thing, but don't" - because of silly
semantic conflicts with PROTNONE vs GLOBAL.

                                Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-08 10:02       ` Ingo Molnar
  2015-03-08 18:35         ` Linus Torvalds
@ 2015-03-08 20:40         ` Mel Gorman
  2015-03-09 21:02           ` Mel Gorman
  1 sibling, 1 reply; 49+ messages in thread
From: Mel Gorman @ 2015-03-08 20:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dave Chinner, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sun, Mar 08, 2015 at 11:02:23AM +0100, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > On Sat, Mar 7, 2015 at 8:36 AM, Ingo Molnar <mingo@kernel.org> wrote:
> > >
> > > And the patch Dave bisected to is a relatively simple patch. Why 
> > > not simply revert it to see whether that cures much of the 
> > > problem?
> > 
> > So the problem with that is that "pmd_set_numa()" and friends simply 
> > no longer exist. So we can't just revert that one patch, it's the 
> > whole series, and the whole point of the series.
> 
> Yeah.
> 
> > What confuses me is that the only real change that I can see in that 
> > patch is the change to "change_huge_pmd()". Everything else is 
> > pretty much a 100% equivalent transformation, afaik. Of course, I 
> > may be wrong about that, and missing something silly.
> 
> Well, there's a difference in what we write to the pte:
> 
>  #define _PAGE_BIT_NUMA          (_PAGE_BIT_GLOBAL+1)
>  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
> 
> and our expectation was that the two should be equivalent methods from 
> the POV of the NUMA balancing code, right?
> 

Functionally yes but performance-wise no. We are now using the global bit
for NUMA faults at the very least.

> > And the changes to "change_huge_pmd()" were basically re-done
> > differently by subsequent patches anyway.
> > 
> > The *only* change I see remaining is that change_huge_pmd() now does
> > 
> >    entry = pmdp_get_and_clear_notify(mm, addr, pmd);
> >    entry = pmd_modify(entry, newprot);
> >    set_pmd_at(mm, addr, pmd, entry);
> > 
> > for all changes. It used to do that "pmdp_set_numa()" for the
> > prot_numa case, which did just
> > 
> >    pmd_t pmd = *pmdp;
> >    pmd = pmd_mknuma(pmd);
> >    set_pmd_at(mm, addr, pmdp, pmd);
> > 
> > instead.
> > 
> > I don't like the old pmdp_set_numa() because it can drop dirty bits,
> > so I think the old code was actively buggy.
> 
> Could we, as a silly testing hack not to be applied, write a 
> hack-patch that re-introduces the racy way of setting the NUMA bit, to 
> confirm that it is indeed this difference that changes pte visibility 
> across CPUs enough to create so many more faults?
> 

This was already done and tested by Dave but while it helped, it was
not enough.  As the approach was inherently unsafe it was dropped and the
throttling approach taken. However, the fact it made little difference
may indicate that this is somehow related to the global bit being used.

> Because if the answer is 'yes', then we can safely say: 'we regressed 
> performance because correctness [not dropping dirty bits] comes before 
> performance'.
> 
> If the answer is 'no', then we still have a mystery (and a regression) 
> to track down.
> 
> As a second hack (not to be applied), could we change:
> 
>  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
> 
> to:
> 
>  #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)
> 

In itself, that's not enough. The SWP_OFFSET_SHIFT would also need updating
as a partial revert of 21d9ee3eda7792c45880b2f11bff8e95c9a061fb but it
can be done.

> to double check that the position of the bit does not matter?
> 

It's worth checking in case it's a case of how the global bit is
treated. However, note that Dave is currently travelling for LSF/MM in
Boston and there is a chance he cannot test this week at all. I'm just
after landing in the hotel myself. I'll try find time during during one
of the breaks tomorrow but if the wireless is too crap then accessing the
test machine remotely might be an issue.

> I don't think we've exhaused all avenues of analysis here.
> 

True.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-08 18:35         ` Linus Torvalds
  2015-03-08 18:46           ` Linus Torvalds
@ 2015-03-09 11:29           ` Dave Chinner
  2015-03-09 16:52             ` Linus Torvalds
  1 sibling, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-09 11:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Mel Gorman, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sun, Mar 08, 2015 at 11:35:59AM -0700, Linus Torvalds wrote:
> On Sun, Mar 8, 2015 at 3:02 AM, Ingo Molnar <mingo@kernel.org> wrote:
> But:
> 
> > As a second hack (not to be applied), could we change:
> >
> >  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
> >
> > to:
> >
> >  #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)
> >
> > to double check that the position of the bit does not matter?
> 
> Agreed. We should definitely try that.
> 
> Dave?

As Mel has already mentioned, I'm in Boston for LSFMM and don't have
access to the test rig I've used to generate this.

> Also, is there some sane way for me to actually see this behavior on a
> regular machine with just a single socket? Dave is apparently running
> in some fake-numa setup, I'm wondering if this is easy enough to
> reproduce that I could see it myself.

Should be - I don't actually use 500TB of storage to generate this -
50GB on an SSD is all you need from the storage side. I just use a
sparse backing file to make it look like a 500TB device. :P

i.e. create an XFS filesystem on a 500TB sparse file with "mkfs.xfs
-d size=500t,file=1 /path/to/file.img", mount it on loopback or as a
virtio,cache=none device for the guest vm and then use fsmark to
generate several million files spread across many, many directories
such as:

$  fs_mark -D 10000 -S0 -n 100000 -s 1 -L 32 -d \
/mnt/scratch/0 -d /mnt/scratch/1 -d /mnt/scratch/2 -d \
/mnt/scratch/3 -d /mnt/scratch/4 -d /mnt/scratch/5 -d \
/mnt/scratch/6 -d /mnt/scratch/7

That should only take a few minutes to run - if you throw 8p at it
then it should run at >100k files/s being created.

Then unmount and run "xfs_repair -o bhash=101703 /path/to/file.img"
on the resultant image file.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-09 11:29           ` Dave Chinner
@ 2015-03-09 16:52             ` Linus Torvalds
  2015-03-09 19:19               ` Dave Chinner
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-09 16:52 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Ingo Molnar, Mel Gorman, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Mon, Mar 9, 2015 at 4:29 AM, Dave Chinner <david@fromorbit.com> wrote:
>
>> Also, is there some sane way for me to actually see this behavior on a
>> regular machine with just a single socket? Dave is apparently running
>> in some fake-numa setup, I'm wondering if this is easy enough to
>> reproduce that I could see it myself.
>
> Should be - I don't actually use 500TB of storage to generate this -
> 50GB on an SSD is all you need from the storage side. I just use a
> sparse backing file to make it look like a 500TB device. :P

What's your virtual environment setup? Kernel config, and
virtualization environment to actually get that odd fake NUMA thing
happening?

                          Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-09 16:52             ` Linus Torvalds
@ 2015-03-09 19:19               ` Dave Chinner
  2015-03-10 23:55                 ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-09 19:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Mel Gorman, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

[-- Attachment #1: Type: text/plain, Size: 2242 bytes --]

On Mon, Mar 09, 2015 at 09:52:18AM -0700, Linus Torvalds wrote:
> On Mon, Mar 9, 2015 at 4:29 AM, Dave Chinner <david@fromorbit.com> wrote:
> >
> >> Also, is there some sane way for me to actually see this behavior on a
> >> regular machine with just a single socket? Dave is apparently running
> >> in some fake-numa setup, I'm wondering if this is easy enough to
> >> reproduce that I could see it myself.
> >
> > Should be - I don't actually use 500TB of storage to generate this -
> > 50GB on an SSD is all you need from the storage side. I just use a
> > sparse backing file to make it look like a 500TB device. :P
> 
> What's your virtual environment setup? Kernel config, and
> virtualization environment to actually get that odd fake NUMA thing
> happening?

I don't have the exact .config with me (test machines at home
are shut down because I'm half a world away), but it's pretty much
this (copied and munged from a similar test vm on my laptop):

$ cat run-vm-4.sh
sudo qemu-system-x86_64 \
        -machine accel=kvm \
        -no-fd-bootchk \
        -localtime \
        -boot c \
        -serial pty \
        -nographic \
        -alt-grab \
        -smp 16 -m 16384 \
        -hda /data/vm-2/root.img \
        -drive file=/vm/vm-4/vm-4-test.img,if=virtio,cache=none \
        -drive file=/vm/vm-4/vm-4-scratch.img,if=virtio,cache=none \
        -drive file=/vm/vm-4/vm-4-500TB.img,if=virtio,cache=none \
        -kernel /vm/vm-4/vmlinuz \
        -append "console=ttyS0,115200 root=/dev/sda1,numa=fake=4"
$

And on the host I have /vm on a ssd that is an XFS filesystem, and
I've created /vm/vm-4/vm-4-500TB.img by doing:

$ xfs_io -f -c "truncate 500t" -c "extsize 1m" /vm/vm-4/vm-4-500TB.img

and in the guest the filesystem is created with:

# mkfs.xfs -f -mcrc=1,finobt=1 /dev/vdc

And that will create a 500TB filesystem that you can then mount and
run fsmark on it, then unmount and run xfs_repair on it.

the .config I have on my laptop is from 3.18-rc something, but it
should work just with a make oldconfig update. I'ts attached below.

Hopefully this will be sufficient for you, otherwise it'll have to
wait until I get home to get the exact configs for you.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

[-- Attachment #2: .config --]
[-- Type: text/plain, Size: 93243 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 3.18.0-rc1 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
# CONFIG_FHANDLE is not set
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_LEGACY_ALLOC_HWIRQ=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
# CONFIG_TICK_CPU_ACCOUNTING is not set
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
CONFIG_IRQ_TIME_ACCOUNTING=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
# CONFIG_RCU_USER_QS is not set
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_RCU_FAST_NO_HZ is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_NOCB_CPU is not set
# CONFIG_BUILD_BIN2C is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=18
CONFIG_LOG_CPU_MAX_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_SUPPORTS_INT128=y
# CONFIG_NUMA_BALANCING is not set
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_CGROUP_FREEZER=y
# CONFIG_CGROUP_DEVICE is not set
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_RESOURCE_COUNTERS=y
# CONFIG_MEMCG is not set
# CONFIG_CGROUP_HUGETLB is not set
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_BLK_CGROUP is not set
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
# CONFIG_SCHED_AUTOGROUP is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_SYSFS_DEPRECATED_V2 is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
# CONFIG_EXPERT is not set
CONFIG_UID16=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_PCI_QUIRKS=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_SYSTEM_TRUSTED_KEYRING is not set
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
# CONFIG_OPROFILE is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_OPTPROBES=y
# CONFIG_UPROBES is not set
# CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_ATTRS=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR is not set
CONFIG_CC_STACKPROTECTOR_NONE=y
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_CC_STACKPROTECTOR_STRONG is not set
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_MODULE_SIG is not set
# CONFIG_MODULE_COMPRESS is not set
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_BSG=y
# CONFIG_BLK_DEV_BSGLIB is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
# CONFIG_BLK_CMDLINE_PARSER is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
CONFIG_DEFAULT_NOOP=y
CONFIG_DEFAULT_IOSCHED="noop"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUE_RWLOCK=y
CONFIG_QUEUE_RWLOCK=y
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_ZONE_DMA=y
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_MPPARSE=y
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_VSMP is not set
# CONFIG_X86_GOLDFISH is not set
# CONFIG_X86_INTEL_LPSS is not set
CONFIG_IOSF_MBI=m
# CONFIG_IOSF_MBI_DEBUG is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
# CONFIG_HYPERVISOR_GUEST is not set
CONFIG_NO_BOOTMEM=y
# CONFIG_MEMTEST is not set
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
CONFIG_MCORE2=y
# CONFIG_MATOM is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_P6_NOP=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS=16
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_INTEL=y
# CONFIG_X86_MCE_AMD is not set
CONFIG_X86_MCE_THRESHOLD=y
# CONFIG_X86_MCE_INJECT is not set
CONFIG_X86_THERMAL_VECTOR=y
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
# CONFIG_MICROCODE_AMD is not set
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_MICROCODE_INTEL_EARLY=y
# CONFIG_MICROCODE_AMD_EARLY is not set
CONFIG_MICROCODE_EARLY=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_DIRECT_GBPAGES=y
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NODES_SPAN_OTHER_NODES=y
# CONFIG_NUMA_EMU is not set
CONFIG_NODES_SHIFT=6
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
# CONFIG_MOVABLE_NODE is not set
# CONFIG_HAVE_BOOTMEM_INFO_NODE is not set
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
CONFIG_BALLOON_COMPACTION=y
CONFIG_COMPACTION=y
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MMU_NOTIFIER=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
# CONFIG_MEMORY_FAILURE is not set
# CONFIG_TRANSPARENT_HUGEPAGE is not set
# CONFIG_CLEANCACHE is not set
# CONFIG_FRONTSWAP is not set
# CONFIG_CMA is not set
# CONFIG_ZPOOL is not set
# CONFIG_ZBUD is not set
# CONFIG_ZSMALLOC is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
CONFIG_X86_RESERVE_LOW=64
CONFIG_MTRR=y
# CONFIG_MTRR_SANITIZER is not set
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_ARCH_RANDOM=y
CONFIG_X86_SMAP=y
CONFIG_EFI=y
# CONFIG_EFI_STUB is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
# CONFIG_KEXEC_JUMP is not set
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
# CONFIG_COMPAT_VDSO is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM_RUNTIME=y
CONFIG_PM=y
CONFIG_PM_DEBUG=y
CONFIG_PM_ADVANCED_DEBUG=y
# CONFIG_PM_TEST_SUSPEND is not set
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_PM_TRACE=y
CONFIG_PM_TRACE_RTC=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_PROCFS_POWER=y
# CONFIG_ACPI_EC_DEBUGFS is not set
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
# CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_NUMA=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=""
# CONFIG_ACPI_CUSTOM_DSDT is not set
# CONFIG_ACPI_INITRD_TABLE_OVERRIDE is not set
# CONFIG_ACPI_DEBUG is not set
# CONFIG_ACPI_PCI_SLOT is not set
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
# CONFIG_ACPI_SBS is not set
# CONFIG_ACPI_HED is not set
# CONFIG_ACPI_CUSTOM_METHOD is not set
# CONFIG_ACPI_BGRT is not set
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
# CONFIG_ACPI_APEI is not set
# CONFIG_ACPI_EXTLOG is not set
# CONFIG_SFI is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y

#
# x86 CPU frequency scaling drivers
#
# CONFIG_X86_INTEL_PSTATE is not set
CONFIG_X86_PCC_CPUFREQ=y
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_ACPI_CPUFREQ_CPB=y
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_AMD_FREQ_SENSITIVITY is not set
CONFIG_X86_SPEEDSTEP_CENTRINO=y
CONFIG_X86_P4_CLOCKMOD=y

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=y

#
# CPU Idle
#
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
CONFIG_INTEL_IDLE=y

#
# Memory power savings
#
CONFIG_I7300_IDLE_IOAT_CHANNEL=y
CONFIG_I7300_IDLE=y

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCIEAER=y
# CONFIG_PCIE_ECRC is not set
# CONFIG_PCIEAER_INJECT is not set
CONFIG_PCIEASPM=y
# CONFIG_PCIEASPM_DEBUG is not set
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
CONFIG_PCIE_PME=y
CONFIG_PCI_MSI=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
# CONFIG_PCI_STUB is not set
CONFIG_HT_IRQ=y
CONFIG_PCI_ATS=y
CONFIG_PCI_IOV=y
# CONFIG_PCI_PRI is not set
# CONFIG_PCI_PASID is not set
CONFIG_PCI_IOAPIC=y
CONFIG_PCI_LABEL=y

#
# PCI host controller drivers
#
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
# CONFIG_PCCARD is not set
# CONFIG_HOTPLUG_PCI is not set
# CONFIG_RAPIDIO is not set
# CONFIG_X86_SYSFB is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=y
CONFIG_COREDUMP=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
# CONFIG_X86_X32 is not set
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_KEYS_COMPAT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_PMC_ATOM=y
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_DIAG is not set
CONFIG_UNIX=y
# CONFIG_UNIX_DIAG is not set
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
# CONFIG_XFRM_STATISTICS is not set
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
# CONFIG_IP_FIB_TRIE_STATS is not set
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE_DEMUX is not set
CONFIG_NET_IP_TUNNEL=y
CONFIG_IP_MROUTE=y
# CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
# CONFIG_NET_UDP_TUNNEL is not set
# CONFIG_NET_FOU is not set
# CONFIG_GENEVE is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
CONFIG_INET_TUNNEL=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
CONFIG_INET_LRO=y
# CONFIG_INET_DIAG is not set
CONFIG_TCP_CONG_ADVANCED=y
# CONFIG_TCP_CONG_BIC is not set
CONFIG_TCP_CONG_CUBIC=y
# CONFIG_TCP_CONG_WESTWOOD is not set
# CONFIG_TCP_CONG_HTCP is not set
# CONFIG_TCP_CONG_HSTCP is not set
# CONFIG_TCP_CONG_HYBLA is not set
# CONFIG_TCP_CONG_VEGAS is not set
# CONFIG_TCP_CONG_SCALABLE is not set
# CONFIG_TCP_CONG_LP is not set
# CONFIG_TCP_CONG_VENO is not set
# CONFIG_TCP_CONG_YEAH is not set
# CONFIG_TCP_CONG_ILLINOIS is not set
# CONFIG_TCP_CONG_DCTCP is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
# CONFIG_IPV6_ROUTER_PREF is not set
# CONFIG_IPV6_OPTIMISTIC_DAD is not set
CONFIG_INET6_AH=y
CONFIG_INET6_ESP=y
# CONFIG_INET6_IPCOMP is not set
# CONFIG_IPV6_MIP6 is not set
# CONFIG_INET6_XFRM_TUNNEL is not set
# CONFIG_INET6_TUNNEL is not set
CONFIG_INET6_XFRM_MODE_TRANSPORT=y
CONFIG_INET6_XFRM_MODE_TUNNEL=y
CONFIG_INET6_XFRM_MODE_BEET=y
# CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION is not set
# CONFIG_IPV6_VTI is not set
CONFIG_IPV6_SIT=y
# CONFIG_IPV6_SIT_6RD is not set
CONFIG_IPV6_NDISC_NODETYPE=y
# CONFIG_IPV6_TUNNEL is not set
# CONFIG_IPV6_GRE is not set
# CONFIG_IPV6_MULTIPLE_TABLES is not set
# CONFIG_IPV6_MROUTE is not set
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
# CONFIG_NET_PTP_CLASSIFY is not set
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
# CONFIG_NETFILTER_ADVANCED is not set

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_NETLINK_LOG=y
CONFIG_NF_CONNTRACK=y
CONFIG_NF_LOG_COMMON=m
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_FTP=y
CONFIG_NF_CONNTRACK_IRC=y
# CONFIG_NF_CONNTRACK_NETBIOS_NS is not set
CONFIG_NF_CONNTRACK_SIP=y
CONFIG_NF_CT_NETLINK=y
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
# CONFIG_NF_NAT_AMANDA is not set
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_SIP=m
# CONFIG_NF_NAT_TFTP is not set
# CONFIG_NF_TABLES is not set
CONFIG_NETFILTER_XTABLES=y

#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=m

#
# Xtables targets
#
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=y
CONFIG_NETFILTER_XT_TARGET_LOG=m
CONFIG_NETFILTER_XT_NAT=m
# CONFIG_NETFILTER_XT_TARGET_NETMAP is not set
CONFIG_NETFILTER_XT_TARGET_NFLOG=y
# CONFIG_NETFILTER_XT_TARGET_REDIRECT is not set
CONFIG_NETFILTER_XT_TARGET_SECMARK=y
CONFIG_NETFILTER_XT_TARGET_TCPMSS=y

#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
CONFIG_NETFILTER_XT_MATCH_POLICY=y
CONFIG_NETFILTER_XT_MATCH_STATE=y
# CONFIG_IP_SET is not set
# CONFIG_IP_VS is not set

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=y
CONFIG_NF_CONNTRACK_IPV4=y
CONFIG_NF_CONNTRACK_PROC_COMPAT=y
CONFIG_NF_LOG_ARP=m
CONFIG_NF_LOG_IPV4=m
CONFIG_NF_REJECT_IPV4=y
CONFIG_NF_NAT_IPV4=m
CONFIG_NF_NAT_MASQUERADE_IPV4=m
# CONFIG_NF_NAT_PPTP is not set
# CONFIG_NF_NAT_H323 is not set
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_MANGLE=y
# CONFIG_IP_NF_RAW is not set

#
# IPv6: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV6=y
CONFIG_NF_CONNTRACK_IPV6=y
CONFIG_NF_REJECT_IPV6=y
CONFIG_NF_LOG_IPV6=m
CONFIG_IP6_NF_IPTABLES=y
CONFIG_IP6_NF_MATCH_IPV6HEADER=y
CONFIG_IP6_NF_FILTER=y
CONFIG_IP6_NF_TARGET_REJECT=y
CONFIG_IP6_NF_MANGLE=y
# CONFIG_IP6_NF_RAW is not set
# CONFIG_BRIDGE_NF_EBTABLES is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_L2TP is not set
CONFIG_STP=y
CONFIG_BRIDGE=y
CONFIG_BRIDGE_IGMP_SNOOPING=y
# CONFIG_BRIDGE_VLAN_FILTERING is not set
CONFIG_HAVE_NET_DSA=y
CONFIG_VLAN_8021Q=y
# CONFIG_VLAN_8021Q_GVRP is not set
# CONFIG_VLAN_8021Q_MVRP is not set
# CONFIG_DECNET is not set
CONFIG_LLC=y
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
# CONFIG_IEEE802154 is not set
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
# CONFIG_NET_SCH_CBQ is not set
# CONFIG_NET_SCH_HTB is not set
# CONFIG_NET_SCH_HFSC is not set
# CONFIG_NET_SCH_PRIO is not set
# CONFIG_NET_SCH_MULTIQ is not set
# CONFIG_NET_SCH_RED is not set
# CONFIG_NET_SCH_SFB is not set
# CONFIG_NET_SCH_SFQ is not set
# CONFIG_NET_SCH_TEQL is not set
# CONFIG_NET_SCH_TBF is not set
# CONFIG_NET_SCH_GRED is not set
# CONFIG_NET_SCH_DSMARK is not set
# CONFIG_NET_SCH_NETEM is not set
# CONFIG_NET_SCH_DRR is not set
# CONFIG_NET_SCH_MQPRIO is not set
# CONFIG_NET_SCH_CHOKE is not set
# CONFIG_NET_SCH_QFQ is not set
# CONFIG_NET_SCH_CODEL is not set
# CONFIG_NET_SCH_FQ_CODEL is not set
# CONFIG_NET_SCH_FQ is not set
# CONFIG_NET_SCH_HHF is not set
# CONFIG_NET_SCH_PIE is not set
# CONFIG_NET_SCH_INGRESS is not set
# CONFIG_NET_SCH_PLUG is not set

#
# Classification
#
CONFIG_NET_CLS=y
# CONFIG_NET_CLS_BASIC is not set
# CONFIG_NET_CLS_TCINDEX is not set
# CONFIG_NET_CLS_ROUTE4 is not set
# CONFIG_NET_CLS_FW is not set
# CONFIG_NET_CLS_U32 is not set
# CONFIG_NET_CLS_RSVP is not set
# CONFIG_NET_CLS_RSVP6 is not set
# CONFIG_NET_CLS_FLOW is not set
# CONFIG_NET_CLS_CGROUP is not set
# CONFIG_NET_CLS_BPF is not set
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
# CONFIG_NET_EMATCH_CMP is not set
# CONFIG_NET_EMATCH_NBYTE is not set
# CONFIG_NET_EMATCH_U32 is not set
# CONFIG_NET_EMATCH_META is not set
# CONFIG_NET_EMATCH_TEXT is not set
CONFIG_NET_CLS_ACT=y
# CONFIG_NET_ACT_POLICE is not set
# CONFIG_NET_ACT_GACT is not set
# CONFIG_NET_ACT_MIRRED is not set
# CONFIG_NET_ACT_IPT is not set
# CONFIG_NET_ACT_NAT is not set
# CONFIG_NET_ACT_PEDIT is not set
# CONFIG_NET_ACT_SIMP is not set
# CONFIG_NET_ACT_SKBEDIT is not set
# CONFIG_NET_ACT_CSUM is not set
CONFIG_NET_SCH_FIFO=y
# CONFIG_DCB is not set
CONFIG_DNS_RESOLVER=y
# CONFIG_BATMAN_ADV is not set
# CONFIG_OPENVSWITCH is not set
# CONFIG_VSOCKETS is not set
# CONFIG_NETLINK_MMAP is not set
# CONFIG_NETLINK_DIAG is not set
# CONFIG_NET_MPLS_GSO is not set
# CONFIG_HSR is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
# CONFIG_CGROUP_NET_PRIO is not set
# CONFIG_CGROUP_NET_CLASSID is not set
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
# CONFIG_BPF_JIT is not set
CONFIG_NET_FLOW_LIMIT=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_TCPPROBE is not set
# CONFIG_NET_DROP_MONITOR is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
CONFIG_FIB_RULES=y
CONFIG_WIRELESS=y
# CONFIG_CFG80211 is not set
# CONFIG_LIB80211 is not set

#
# CFG80211 needs to be enabled for MAC80211
#
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set
# CONFIG_CAIF is not set
# CONFIG_CEPH_LIB is not set
# CONFIG_NFC is not set
CONFIG_HAVE_BPF_JIT=y

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
# CONFIG_DEVTMPFS is not set
# CONFIG_STANDALONE is not set
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
# CONFIG_DEBUG_DRIVER is not set
CONFIG_DEBUG_DEVRES=y
# CONFIG_SYS_HYPERVISOR is not set
# CONFIG_GENERIC_CPU_DEVICES is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_FENCE_TRACE is not set

#
# Bus devices
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
# CONFIG_MTD is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
CONFIG_PNP=y
CONFIG_PNP_DEBUG_MESSAGES=y

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_NULL_BLK is not set
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_NVME is not set
# CONFIG_BLK_DEV_SKD is not set
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
# CONFIG_BLK_DEV_XIP is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_VIRTIO_BLK=y
# CONFIG_BLK_DEV_HD is not set
# CONFIG_BLK_DEV_RBD is not set
# CONFIG_BLK_DEV_RSXX is not set

#
# Misc devices
#
# CONFIG_SENSORS_LIS3LV02D is not set
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ICS932S401 is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_APDS9802ALS is not set
# CONFIG_ISL29003 is not set
# CONFIG_ISL29020 is not set
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_SENSORS_BH1780 is not set
# CONFIG_SENSORS_BH1770 is not set
# CONFIG_SENSORS_APDS990X is not set
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
# CONFIG_BMP085_I2C is not set
# CONFIG_USB_SWITCH_FSA9480 is not set
# CONFIG_SRAM is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_AT24 is not set
# CONFIG_EEPROM_LEGACY is not set
# CONFIG_EEPROM_MAX6875 is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_CB710_CORE is not set

#
# Texas Instruments shared transport line discipline
#
# CONFIG_SENSORS_LIS3_I2C is not set

#
# Altera FPGA firmware download module
#
# CONFIG_ALTERA_STAPL is not set
# CONFIG_VMWARE_VMCI is not set

#
# Intel MIC Bus Driver
#
# CONFIG_INTEL_MIC_BUS is not set

#
# Intel MIC Host Driver
#

#
# Intel MIC Card Driver
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_CXL_BASE is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_NETLINK is not set
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
# CONFIG_CHR_DEV_SCH is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_SCAN_ASYNC is not set

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=y
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_ATTRS is not set
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
# CONFIG_SCSI_LOWLEVEL is not set
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_ACPI=y
# CONFIG_SATA_ZPODD is not set
CONFIG_SATA_PMP=y

#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=y
CONFIG_SATA_AHCI_PLATFORM=y
# CONFIG_SATA_INIC162X is not set
# CONFIG_SATA_ACARD_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y

#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_SX4 is not set
CONFIG_ATA_BMDMA=y

#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=y
CONFIG_SATA_MV=y
# CONFIG_SATA_NV is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set

#
# PATA SFF controllers with BMDMA
#
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_ATP867X is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
# CONFIG_PATA_SCH is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_TOSHIBA is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set

#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_RZ1000 is not set

#
# Generic fallback / legacy drivers
#
# CONFIG_PATA_ACPI is not set
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID10=y
CONFIG_MD_RAID456=y
# CONFIG_MD_MULTIPATH is not set
# CONFIG_MD_FAULTY is not set
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=y
# CONFIG_DM_DEBUG is not set
CONFIG_DM_BUFIO=y
CONFIG_DM_CRYPT=y
CONFIG_DM_SNAPSHOT=y
# CONFIG_DM_THIN_PROVISIONING is not set
# CONFIG_DM_CACHE is not set
# CONFIG_DM_ERA is not set
CONFIG_DM_MIRROR=y
# CONFIG_DM_LOG_USERSPACE is not set
# CONFIG_DM_RAID is not set
CONFIG_DM_ZERO=y
# CONFIG_DM_MULTIPATH is not set
# CONFIG_DM_DELAY is not set
# CONFIG_DM_UEVENT is not set
# CONFIG_DM_FLAKEY is not set
# CONFIG_DM_VERITY is not set
# CONFIG_DM_SWITCH is not set
# CONFIG_TARGET_CORE is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_FIREWIRE_NOSY is not set
# CONFIG_I2O is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
# CONFIG_BONDING is not set
# CONFIG_DUMMY is not set
# CONFIG_EQUALIZER is not set
# CONFIG_NET_FC is not set
# CONFIG_IFB is not set
# CONFIG_NET_TEAM is not set
# CONFIG_MACVLAN is not set
# CONFIG_VXLAN is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
CONFIG_TUN=y
CONFIG_VETH=y
CONFIG_VIRTIO_NET=y
# CONFIG_NLMON is not set
# CONFIG_ARCNET is not set

#
# CAIF transport drivers
#
CONFIG_VHOST_NET=y
CONFIG_VHOST_RING=y
CONFIG_VHOST=y

#
# Distributed Switch Architecture drivers
#
# CONFIG_NET_DSA_MV88E6XXX is not set
# CONFIG_NET_DSA_MV88E6060 is not set
# CONFIG_NET_DSA_MV88E6XXX_NEED_PPU is not set
# CONFIG_NET_DSA_MV88E6131 is not set
# CONFIG_NET_DSA_MV88E6123_61_65 is not set
# CONFIG_NET_DSA_MV88E6171 is not set
# CONFIG_NET_DSA_BCM_SF2 is not set
CONFIG_ETHERNET=y
CONFIG_NET_VENDOR_3COM=y
# CONFIG_VORTEX is not set
# CONFIG_TYPHOON is not set
CONFIG_NET_VENDOR_ADAPTEC=y
# CONFIG_ADAPTEC_STARFIRE is not set
CONFIG_NET_VENDOR_AGERE=y
# CONFIG_ET131X is not set
CONFIG_NET_VENDOR_ALTEON=y
# CONFIG_ACENIC is not set
# CONFIG_ALTERA_TSE is not set
CONFIG_NET_VENDOR_AMD=y
# CONFIG_AMD8111_ETH is not set
# CONFIG_PCNET32 is not set
# CONFIG_NET_XGENE is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ATHEROS=y
# CONFIG_ATL2 is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_ALX is not set
CONFIG_NET_VENDOR_BROADCOM=y
# CONFIG_B44 is not set
# CONFIG_BNX2 is not set
# CONFIG_CNIC is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2X is not set
CONFIG_NET_VENDOR_BROCADE=y
# CONFIG_BNA is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
# CONFIG_CHELSIO_T3 is not set
# CONFIG_CHELSIO_T4 is not set
# CONFIG_CHELSIO_T4VF is not set
CONFIG_NET_VENDOR_CISCO=y
# CONFIG_ENIC is not set
# CONFIG_CX_ECAT is not set
# CONFIG_DNET is not set
CONFIG_NET_VENDOR_DEC=y
# CONFIG_NET_TULIP is not set
CONFIG_NET_VENDOR_DLINK=y
# CONFIG_DL2K is not set
# CONFIG_SUNDANCE is not set
CONFIG_NET_VENDOR_EMULEX=y
# CONFIG_BE2NET is not set
CONFIG_NET_VENDOR_EXAR=y
# CONFIG_S2IO is not set
# CONFIG_VXGE is not set
CONFIG_NET_VENDOR_HP=y
# CONFIG_HP100 is not set
CONFIG_NET_VENDOR_INTEL=y
# CONFIG_E100 is not set
# CONFIG_E1000 is not set
# CONFIG_E1000E is not set
# CONFIG_IGB is not set
# CONFIG_IGBVF is not set
# CONFIG_IXGB is not set
# CONFIG_IXGBE is not set
# CONFIG_IXGBEVF is not set
# CONFIG_I40E is not set
# CONFIG_I40EVF is not set
# CONFIG_FM10K is not set
CONFIG_NET_VENDOR_I825XX=y
# CONFIG_IP1000 is not set
# CONFIG_JME is not set
CONFIG_NET_VENDOR_MARVELL=y
# CONFIG_MVMDIO is not set
# CONFIG_SKGE is not set
CONFIG_SKY2=y
# CONFIG_SKY2_DEBUG is not set
CONFIG_NET_VENDOR_MELLANOX=y
# CONFIG_MLX4_EN is not set
# CONFIG_MLX4_CORE is not set
# CONFIG_MLX5_CORE is not set
CONFIG_NET_VENDOR_MICREL=y
# CONFIG_KS8851_MLL is not set
# CONFIG_KSZ884X_PCI is not set
CONFIG_NET_VENDOR_MYRI=y
# CONFIG_MYRI10GE is not set
# CONFIG_FEALNX is not set
CONFIG_NET_VENDOR_NATSEMI=y
# CONFIG_NATSEMI is not set
# CONFIG_NS83820 is not set
CONFIG_NET_VENDOR_8390=y
# CONFIG_NE2K_PCI is not set
CONFIG_NET_VENDOR_NVIDIA=y
# CONFIG_FORCEDETH is not set
CONFIG_NET_VENDOR_OKI=y
# CONFIG_ETHOC is not set
CONFIG_NET_PACKET_ENGINE=y
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
CONFIG_NET_VENDOR_QLOGIC=y
# CONFIG_QLA3XXX is not set
# CONFIG_QLCNIC is not set
# CONFIG_QLGE is not set
# CONFIG_NETXEN_NIC is not set
CONFIG_NET_VENDOR_REALTEK=y
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
CONFIG_R8169=y
CONFIG_NET_VENDOR_RDC=y
# CONFIG_R6040 is not set
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
CONFIG_NET_VENDOR_SEEQ=y
CONFIG_NET_VENDOR_SILAN=y
# CONFIG_SC92031 is not set
CONFIG_NET_VENDOR_SIS=y
# CONFIG_SIS900 is not set
# CONFIG_SIS190 is not set
# CONFIG_SFC is not set
CONFIG_NET_VENDOR_SMSC=y
# CONFIG_EPIC100 is not set
# CONFIG_SMSC911X is not set
# CONFIG_SMSC9420 is not set
CONFIG_NET_VENDOR_STMICRO=y
# CONFIG_STMMAC_ETH is not set
CONFIG_NET_VENDOR_SUN=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NIU is not set
CONFIG_NET_VENDOR_TEHUTI=y
# CONFIG_TEHUTI is not set
CONFIG_NET_VENDOR_TI=y
# CONFIG_TLAN is not set
CONFIG_NET_VENDOR_VIA=y
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_VELOCITY is not set
CONFIG_NET_VENDOR_WIZNET=y
# CONFIG_WIZNET_W5100 is not set
# CONFIG_WIZNET_W5300 is not set
CONFIG_FDDI=y
# CONFIG_DEFXX is not set
# CONFIG_SKFP is not set
# CONFIG_HIPPI is not set
# CONFIG_NET_SB1000 is not set
CONFIG_PHYLIB=y

#
# MII PHY device drivers
#
# CONFIG_AT803X_PHY is not set
# CONFIG_AMD_PHY is not set
# CONFIG_MARVELL_PHY is not set
# CONFIG_DAVICOM_PHY is not set
# CONFIG_QSEMI_PHY is not set
# CONFIG_LXT_PHY is not set
# CONFIG_CICADA_PHY is not set
# CONFIG_VITESSE_PHY is not set
# CONFIG_SMSC_PHY is not set
# CONFIG_BROADCOM_PHY is not set
# CONFIG_BCM7XXX_PHY is not set
# CONFIG_BCM87XX_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_REALTEK_PHY is not set
# CONFIG_NATIONAL_PHY is not set
# CONFIG_STE10XP is not set
# CONFIG_LSI_ET1011C_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_FIXED_PHY is not set
# CONFIG_MDIO_BITBANG is not set
# CONFIG_MDIO_BCM_UNIMAC is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
CONFIG_USB_NET_DRIVERS=y
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_RTL8152 is not set
# CONFIG_USB_USBNET is not set
# CONFIG_USB_IPHETH is not set
# CONFIG_WLAN is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
# CONFIG_WAN is not set
# CONFIG_VMXNET3 is not set
# CONFIG_ISDN is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_POLLDEV=y
CONFIG_INPUT_SPARSEKMAP=y
# CONFIG_INPUT_MATRIXKMAP is not set

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
# CONFIG_MOUSE_PS2_SENTELIC is not set
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_CYAPA is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_MOUSE_SYNAPTICS_I2C is not set
# CONFIG_MOUSE_SYNAPTICS_USB is not set
CONFIG_INPUT_JOYSTICK=y
# CONFIG_JOYSTICK_ANALOG is not set
# CONFIG_JOYSTICK_A3D is not set
# CONFIG_JOYSTICK_ADI is not set
# CONFIG_JOYSTICK_COBRA is not set
# CONFIG_JOYSTICK_GF2K is not set
# CONFIG_JOYSTICK_GRIP is not set
# CONFIG_JOYSTICK_GRIP_MP is not set
# CONFIG_JOYSTICK_GUILLEMOT is not set
# CONFIG_JOYSTICK_INTERACT is not set
# CONFIG_JOYSTICK_SIDEWINDER is not set
# CONFIG_JOYSTICK_TMDC is not set
# CONFIG_JOYSTICK_IFORCE is not set
# CONFIG_JOYSTICK_WARRIOR is not set
# CONFIG_JOYSTICK_MAGELLAN is not set
# CONFIG_JOYSTICK_SPACEORB is not set
# CONFIG_JOYSTICK_SPACEBALL is not set
# CONFIG_JOYSTICK_STINGER is not set
# CONFIG_JOYSTICK_TWIDJOY is not set
# CONFIG_JOYSTICK_ZHENHUA is not set
# CONFIG_JOYSTICK_AS5011 is not set
# CONFIG_JOYSTICK_JOYDUMP is not set
# CONFIG_JOYSTICK_XPAD is not set
CONFIG_INPUT_TABLET=y
# CONFIG_TABLET_USB_ACECAD is not set
# CONFIG_TABLET_USB_AIPTEK is not set
# CONFIG_TABLET_USB_GTCO is not set
# CONFIG_TABLET_USB_HANWANG is not set
# CONFIG_TABLET_USB_KBTAB is not set
# CONFIG_TABLET_SERIAL_WACOM4 is not set
CONFIG_INPUT_TOUCHSCREEN=y
# CONFIG_TOUCHSCREEN_AD7879 is not set
# CONFIG_TOUCHSCREEN_ATMEL_MXT is not set
# CONFIG_TOUCHSCREEN_BU21013 is not set
# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set
# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set
# CONFIG_TOUCHSCREEN_DYNAPRO is not set
# CONFIG_TOUCHSCREEN_HAMPSHIRE is not set
# CONFIG_TOUCHSCREEN_EETI is not set
# CONFIG_TOUCHSCREEN_FUJITSU is not set
# CONFIG_TOUCHSCREEN_ILI210X is not set
# CONFIG_TOUCHSCREEN_GUNZE is not set
# CONFIG_TOUCHSCREEN_ELO is not set
# CONFIG_TOUCHSCREEN_WACOM_W8001 is not set
# CONFIG_TOUCHSCREEN_WACOM_I2C is not set
# CONFIG_TOUCHSCREEN_MAX11801 is not set
# CONFIG_TOUCHSCREEN_MCS5000 is not set
# CONFIG_TOUCHSCREEN_MMS114 is not set
# CONFIG_TOUCHSCREEN_MTOUCH is not set
# CONFIG_TOUCHSCREEN_INEXIO is not set
# CONFIG_TOUCHSCREEN_MK712 is not set
# CONFIG_TOUCHSCREEN_PENMOUNT is not set
# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set
# CONFIG_TOUCHSCREEN_TOUCHRIGHT is not set
# CONFIG_TOUCHSCREEN_TOUCHWIN is not set
# CONFIG_TOUCHSCREEN_PIXCIR is not set
# CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
# CONFIG_TOUCHSCREEN_TOUCHIT213 is not set
# CONFIG_TOUCHSCREEN_TSC_SERIO is not set
# CONFIG_TOUCHSCREEN_TSC2007 is not set
# CONFIG_TOUCHSCREEN_ST1232 is not set
# CONFIG_TOUCHSCREEN_SUR40 is not set
# CONFIG_TOUCHSCREEN_TPS6507X is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_PCSPKR is not set
# CONFIG_INPUT_MMA8450 is not set
# CONFIG_INPUT_MPU3050 is not set
# CONFIG_INPUT_APANEL is not set
# CONFIG_INPUT_ATLAS_BTNS is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_KXTJ9 is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
# CONFIG_INPUT_CM109 is not set
# CONFIG_INPUT_UINPUT is not set
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_ADXL34X is not set
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_CMA3000 is not set
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_SERIO_PS2MULT is not set
# CONFIG_SERIO_ARC_PS2 is not set
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
# CONFIG_LEGACY_PTYS is not set
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_ROCKETPORT is not set
# CONFIG_CYCLADES is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_SYNCLINK is not set
# CONFIG_SYNCLINKMP is not set
# CONFIG_SYNCLINK_GT is not set
# CONFIG_NOZOMI is not set
# CONFIG_ISI is not set
# CONFIG_N_HDLC is not set
# CONFIG_N_GSM is not set
# CONFIG_TRACE_SINK is not set
CONFIG_DEVKMEM=y

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_DEPRECATED_OPTIONS=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_DMA=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_FINTEK is not set

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_MFD_HSU is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_VIRTIO_CONSOLE is not set
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
# CONFIG_HW_RANDOM_TIMERIOMEM is not set
# CONFIG_HW_RANDOM_INTEL is not set
# CONFIG_HW_RANDOM_AMD is not set
CONFIG_HW_RANDOM_VIA=y
# CONFIG_HW_RANDOM_VIRTIO is not set
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
# CONFIG_HPET_MMAP is not set
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
CONFIG_HMC_DRV=m
# CONFIG_XILLYBUS is not set

#
# I2C support
#
CONFIG_I2C=y
CONFIG_ACPI_I2C_OPREGION=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_COMPAT=y
# CONFIG_I2C_CHARDEV is not set
# CONFIG_I2C_MUX is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_ALGOBIT=y

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
CONFIG_I2C_I801=y
# CONFIG_I2C_ISCH is not set
# CONFIG_I2C_ISMT is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set

#
# ACPI drivers
#
# CONFIG_I2C_SCMI is not set

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_OCORES is not set
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_PXA_PCI is not set
# CONFIG_I2C_SIMTEC is not set
# CONFIG_I2C_XILINX is not set

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_DIOLAN_U2C is not set
# CONFIG_I2C_PARPORT_LIGHT is not set
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_STUB is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set

#
# PPS support
#
# CONFIG_PPS is not set

#
# PPS generators support
#

#
# PTP clock support
#
# CONFIG_PTP_1588_CLOCK is not set

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_BATTERY_BQ27x00 is not set
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_BQ2415X is not set
# CONFIG_CHARGER_SMB347 is not set
# CONFIG_POWER_RESET is not set
# CONFIG_POWER_AVS is not set
CONFIG_HWMON=y
# CONFIG_HWMON_VID is not set
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Native drivers
#
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7414 is not set
# CONFIG_SENSORS_AD7418 is not set
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1026 is not set
# CONFIG_SENSORS_ADM1029 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM9240 is not set
# CONFIG_SENSORS_ADT7410 is not set
# CONFIG_SENSORS_ADT7411 is not set
# CONFIG_SENSORS_ADT7462 is not set
# CONFIG_SENSORS_ADT7470 is not set
# CONFIG_SENSORS_ADT7475 is not set
# CONFIG_SENSORS_ASC7621 is not set
# CONFIG_SENSORS_K8TEMP is not set
# CONFIG_SENSORS_K10TEMP is not set
# CONFIG_SENSORS_FAM15H_POWER is not set
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_ATXP1 is not set
# CONFIG_SENSORS_DS620 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_I5K_AMB is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
# CONFIG_SENSORS_FSCHMD is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_GL520SM is not set
# CONFIG_SENSORS_G760A is not set
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_HIH6130 is not set
# CONFIG_SENSORS_CORETEMP is not set
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWR1220 is not set
# CONFIG_SENSORS_LINEAGE is not set
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC4151 is not set
# CONFIG_SENSORS_LTC4215 is not set
# CONFIG_SENSORS_LTC4222 is not set
# CONFIG_SENSORS_LTC4245 is not set
# CONFIG_SENSORS_LTC4260 is not set
# CONFIG_SENSORS_LTC4261 is not set
# CONFIG_SENSORS_MAX16065 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_MAX1668 is not set
# CONFIG_SENSORS_MAX197 is not set
# CONFIG_SENSORS_MAX6639 is not set
# CONFIG_SENSORS_MAX6642 is not set
# CONFIG_SENSORS_MAX6650 is not set
# CONFIG_SENSORS_MAX6697 is not set
# CONFIG_SENSORS_HTU21 is not set
# CONFIG_SENSORS_MCP3021 is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM73 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_LM92 is not set
# CONFIG_SENSORS_LM93 is not set
# CONFIG_SENSORS_LM95234 is not set
# CONFIG_SENSORS_LM95241 is not set
# CONFIG_SENSORS_LM95245 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_NTC_THERMISTOR is not set
# CONFIG_SENSORS_NCT6683 is not set
# CONFIG_SENSORS_NCT6775 is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_PMBUS is not set
# CONFIG_SENSORS_SHT21 is not set
# CONFIG_SENSORS_SHTC1 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_DME1737 is not set
# CONFIG_SENSORS_EMC1403 is not set
# CONFIG_SENSORS_EMC2103 is not set
# CONFIG_SENSORS_EMC6W201 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47M192 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_SCH56XX_COMMON is not set
# CONFIG_SENSORS_SCH5627 is not set
# CONFIG_SENSORS_SCH5636 is not set
# CONFIG_SENSORS_SMM665 is not set
# CONFIG_SENSORS_ADC128D818 is not set
# CONFIG_SENSORS_ADS1015 is not set
# CONFIG_SENSORS_ADS7828 is not set
# CONFIG_SENSORS_AMC6821 is not set
# CONFIG_SENSORS_INA209 is not set
# CONFIG_SENSORS_INA2XX is not set
# CONFIG_SENSORS_THMC50 is not set
# CONFIG_SENSORS_TMP102 is not set
# CONFIG_SENSORS_TMP103 is not set
# CONFIG_SENSORS_TMP401 is not set
# CONFIG_SENSORS_TMP421 is not set
# CONFIG_SENSORS_VIA_CPUTEMP is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83791D is not set
# CONFIG_SENSORS_W83792D is not set
# CONFIG_SENSORS_W83793 is not set
# CONFIG_SENSORS_W83795 is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83L786NG is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set

#
# ACPI drivers
#
# CONFIG_SENSORS_ACPI_POWER is not set
# CONFIG_SENSORS_ATK0110 is not set
CONFIG_THERMAL=y
# CONFIG_THERMAL_HWMON is not set
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
CONFIG_THERMAL_GOV_STEP_WISE=y
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_EMULATION is not set
# CONFIG_INTEL_POWERCLAMP is not set
CONFIG_X86_PKG_TEMP_THERMAL=m
# CONFIG_ACPI_INT3403_THERMAL is not set
# CONFIG_INTEL_SOC_DTS_THERMAL is not set

#
# Texas Instruments thermal drivers
#
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_CORE is not set
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
# CONFIG_SOFT_WATCHDOG is not set
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
# CONFIG_ALIM1535_WDT is not set
# CONFIG_ALIM7101_WDT is not set
# CONFIG_F71808E_WDT is not set
# CONFIG_SP5100_TCO is not set
# CONFIG_SBC_FITPC2_WATCHDOG is not set
# CONFIG_EUROTECH_WDT is not set
# CONFIG_IB700_WDT is not set
# CONFIG_IBMASR is not set
# CONFIG_WAFER_WDT is not set
# CONFIG_I6300ESB_WDT is not set
# CONFIG_IE6XX_WDT is not set
# CONFIG_ITCO_WDT is not set
# CONFIG_IT8712F_WDT is not set
# CONFIG_IT87_WDT is not set
# CONFIG_HP_WATCHDOG is not set
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
# CONFIG_NV_TCO is not set
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_SMSC_SCH311X_WDT is not set
# CONFIG_SMSC37B787_WDT is not set
# CONFIG_VIA_WDT is not set
# CONFIG_W83627HF_WDT is not set
# CONFIG_W83877F_WDT is not set
# CONFIG_W83977F_WDT is not set
# CONFIG_MACHZ_WDT is not set
# CONFIG_SBC_EPX_C3_WATCHDOG is not set

#
# PCI-based Watchdog Cards
#
# CONFIG_PCIPCWATCHDOG is not set
# CONFIG_WDTPCI is not set

#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y

#
# Broadcom specific AMBA
#
# CONFIG_BCMA is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_CORE is not set
# CONFIG_MFD_AS3711 is not set
# CONFIG_PMIC_ADP5520 is not set
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_AXP20X is not set
# CONFIG_MFD_CROS_EC is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_DA9052_I2C is not set
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_LPC_ICH is not set
# CONFIG_LPC_SCH is not set
# CONFIG_INTEL_SOC_PMIC is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_88PM860X is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77686 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MENF21BMC is not set
# CONFIG_MFD_VIPERBOARD is not set
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_RTSX_PCI is not set
# CONFIG_MFD_RTSX_USB is not set
# CONFIG_MFD_RC5T583 is not set
# CONFIG_MFD_RN5T618 is not set
# CONFIG_MFD_SEC_CORE is not set
# CONFIG_MFD_SI476X_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SMSC is not set
# CONFIG_ABX500_CORE is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_MFD_LP8788 is not set
# CONFIG_MFD_PALMAS is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65090 is not set
# CONFIG_MFD_TPS65217 is not set
# CONFIG_MFD_TPS65218 is not set
# CONFIG_MFD_TPS6586X is not set
# CONFIG_MFD_TPS80031 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_TWL6040_CORE is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_TC3589X is not set
# CONFIG_MFD_TMIO is not set
# CONFIG_MFD_VX855 is not set
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_REGULATOR is not set
# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_VIA is not set
CONFIG_INTEL_GTT=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
# CONFIG_VGA_SWITCHEROO is not set

#
# Direct Rendering Manager
#
CONFIG_DRM=y
CONFIG_DRM_KMS_HELPER=y
CONFIG_DRM_KMS_FB_HELPER=y
# CONFIG_DRM_LOAD_EDID_FIRMWARE is not set
CONFIG_DRM_TTM=y

#
# I2C encoder or helper chips
#
# CONFIG_DRM_I2C_CH7006 is not set
# CONFIG_DRM_I2C_SIL164 is not set
# CONFIG_DRM_I2C_NXP_TDA998X is not set
# CONFIG_DRM_PTN3460 is not set
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
CONFIG_DRM_RADEON=y
# CONFIG_DRM_RADEON_UMS is not set
# CONFIG_DRM_NOUVEAU is not set
# CONFIG_DRM_I810 is not set
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
# CONFIG_DRM_VMWGFX is not set
# CONFIG_DRM_GMA500 is not set
# CONFIG_DRM_UDL is not set
# CONFIG_DRM_AST is not set
# CONFIG_DRM_MGAG200 is not set
# CONFIG_DRM_CIRRUS_QEMU is not set
# CONFIG_DRM_QXL is not set
# CONFIG_DRM_BOCHS is not set

#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CMDLINE=y
CONFIG_FB_DDC=y
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
# CONFIG_FB_FOREIGN_ENDIAN is not set
# CONFIG_FB_SYS_FOPS is not set
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
CONFIG_FB_BACKLIGHT=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_MATROX is not set
CONFIG_FB_RADEON=y
CONFIG_FB_RADEON_I2C=y
CONFIG_FB_RADEON_BACKLIGHT=y
# CONFIG_FB_RADEON_DEBUG is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_FB_AUO_K190X is not set
# CONFIG_FB_SIMPLE is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
# CONFIG_LCD_CLASS_DEVICE is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_GENERIC=y
# CONFIG_BACKLIGHT_APPLE is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3639 is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_VGASTATE is not set
CONFIG_HDMI=y

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
# CONFIG_FRAMEBUFFER_CONSOLE_ROTATION is not set
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
CONFIG_SOUND=y
CONFIG_SOUND_OSS_CORE=y
CONFIG_SOUND_OSS_CORE_PRECLAIM=y
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_HWDEP=y
CONFIG_SND_SEQUENCER=y
CONFIG_SND_SEQ_DUMMY=y
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=y
CONFIG_SND_PCM_OSS=y
CONFIG_SND_PCM_OSS_PLUGINS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_HRTIMER=y
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=32
CONFIG_SND_SUPPORT_OLD_API=y
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_VMASTER=y
CONFIG_SND_KCTL_JACK=y
CONFIG_SND_DMA_SGBUF=y
# CONFIG_SND_RAWMIDI_SEQ is not set
# CONFIG_SND_OPL3_LIB_SEQ is not set
# CONFIG_SND_OPL4_LIB_SEQ is not set
# CONFIG_SND_SBAWE_SEQ is not set
# CONFIG_SND_EMU10K1_SEQ is not set
CONFIG_SND_DRIVERS=y
# CONFIG_SND_PCSP is not set
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_ALOOP is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
CONFIG_SND_PCI=y
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ASIHPI is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_OXYGEN is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CTXFI is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_INDIGOIOX is not set
# CONFIG_SND_INDIGODJX is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_LOLA is not set
# CONFIG_SND_LX6464ES is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set

#
# HD-Audio
#
CONFIG_SND_HDA=y
CONFIG_SND_HDA_INTEL=y
CONFIG_SND_HDA_PREALLOC_SIZE=64
CONFIG_SND_HDA_HWDEP=y
# CONFIG_SND_HDA_RECONFIG is not set
# CONFIG_SND_HDA_INPUT_BEEP is not set
# CONFIG_SND_HDA_INPUT_JACK is not set
# CONFIG_SND_HDA_PATCH_LOADER is not set
CONFIG_SND_HDA_CODEC_REALTEK=y
CONFIG_SND_HDA_CODEC_ANALOG=y
CONFIG_SND_HDA_CODEC_SIGMATEL=y
CONFIG_SND_HDA_CODEC_VIA=y
CONFIG_SND_HDA_CODEC_HDMI=y
CONFIG_SND_HDA_CODEC_CIRRUS=y
CONFIG_SND_HDA_CODEC_CONEXANT=y
CONFIG_SND_HDA_CODEC_CA0110=y
# CONFIG_SND_HDA_CODEC_CA0132 is not set
CONFIG_SND_HDA_CODEC_CMEDIA=y
CONFIG_SND_HDA_CODEC_SI3054=y
CONFIG_SND_HDA_GENERIC=y
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
CONFIG_SND_USB=y
# CONFIG_SND_USB_AUDIO is not set
# CONFIG_SND_USB_UA101 is not set
# CONFIG_SND_USB_USX2Y is not set
# CONFIG_SND_USB_CAIAQ is not set
# CONFIG_SND_USB_US122L is not set
# CONFIG_SND_USB_6FIRE is not set
# CONFIG_SND_USB_HIFACE is not set
# CONFIG_SND_BCD2000 is not set
# CONFIG_SND_SOC is not set
# CONFIG_SOUND_PRIME is not set

#
# HID support
#
CONFIG_HID=y
# CONFIG_HID_BATTERY_STRENGTH is not set
CONFIG_HIDRAW=y
# CONFIG_UHID is not set
CONFIG_HID_GENERIC=y

#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
# CONFIG_HID_ACRUX is not set
CONFIG_HID_APPLE=y
# CONFIG_HID_APPLEIR is not set
# CONFIG_HID_AUREAL is not set
CONFIG_HID_BELKIN=y
CONFIG_HID_CHERRY=y
CONFIG_HID_CHICONY=y
# CONFIG_HID_PRODIKEYS is not set
CONFIG_HID_CYPRESS=y
# CONFIG_HID_DRAGONRISE is not set
# CONFIG_HID_EMS_FF is not set
# CONFIG_HID_ELECOM is not set
# CONFIG_HID_ELO is not set
CONFIG_HID_EZKEY=y
# CONFIG_HID_HOLTEK is not set
# CONFIG_HID_GT683R is not set
# CONFIG_HID_HUION is not set
# CONFIG_HID_KEYTOUCH is not set
CONFIG_HID_KYE=y
# CONFIG_HID_UCLOGIC is not set
# CONFIG_HID_WALTOP is not set
CONFIG_HID_GYRATION=y
# CONFIG_HID_ICADE is not set
# CONFIG_HID_TWINHAN is not set
CONFIG_HID_KENSINGTON=y
# CONFIG_HID_LCPOWER is not set
# CONFIG_HID_LENOVO is not set
CONFIG_HID_LOGITECH=y
# CONFIG_HID_LOGITECH_DJ is not set
CONFIG_LOGITECH_FF=y
# CONFIG_LOGIRUMBLEPAD2_FF is not set
# CONFIG_LOGIG940_FF is not set
CONFIG_LOGIWHEELS_FF=y
# CONFIG_HID_MAGICMOUSE is not set
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
# CONFIG_HID_MULTITOUCH is not set
CONFIG_HID_NTRIG=y
# CONFIG_HID_ORTEK is not set
CONFIG_HID_PANTHERLORD=y
CONFIG_PANTHERLORD_FF=y
# CONFIG_HID_PENMOUNT is not set
CONFIG_HID_PETALYNX=y
# CONFIG_HID_PICOLCD is not set
# CONFIG_HID_PRIMAX is not set
# CONFIG_HID_ROCCAT is not set
# CONFIG_HID_SAITEK is not set
CONFIG_HID_SAMSUNG=y
CONFIG_HID_SONY=y
# CONFIG_SONY_FF is not set
# CONFIG_HID_SPEEDLINK is not set
# CONFIG_HID_STEELSERIES is not set
CONFIG_HID_SUNPLUS=y
# CONFIG_HID_RMI is not set
# CONFIG_HID_GREENASIA is not set
# CONFIG_HID_SMARTJOYPLUS is not set
# CONFIG_HID_TIVO is not set
CONFIG_HID_TOPSEED=y
# CONFIG_HID_THINGM is not set
# CONFIG_HID_THRUSTMASTER is not set
# CONFIG_HID_WACOM is not set
# CONFIG_HID_WIIMOTE is not set
# CONFIG_HID_XINMO is not set
# CONFIG_HID_ZEROPLUS is not set
# CONFIG_HID_ZYDACRON is not set
# CONFIG_HID_SENSOR_HUB is not set

#
# USB HID support
#
CONFIG_USB_HID=y
CONFIG_HID_PID=y
CONFIG_USB_HIDDEV=y

#
# I2C HID support
#
# CONFIG_I2C_HID is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_OTG_FSM is not set
CONFIG_USB_MON=y
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
# CONFIG_USB_XHCI_HCD is not set
CONFIG_USB_EHCI_HCD=y
# CONFIG_USB_EHCI_ROOT_HUB_TT is not set
# CONFIG_USB_EHCI_TT_NEWSCHED is not set
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1760_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
# CONFIG_USB_FUSBH200_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_HCD_TEST_MODE is not set

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
CONFIG_USB_PRINTER=y
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_REALTEK is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_USBAT is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
# CONFIG_USB_STORAGE_ALAUDA is not set
# CONFIG_USB_STORAGE_ONETOUCH is not set
# CONFIG_USB_STORAGE_KARMA is not set
# CONFIG_USB_STORAGE_CYPRESS_ATACB is not set
# CONFIG_USB_STORAGE_ENE_UB6250 is not set
# CONFIG_USB_UAS is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
# CONFIG_USB_DWC3 is not set
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set

#
# USB port drivers
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_TEST is not set
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_YUREX is not set
# CONFIG_USB_EZUSB_FX2 is not set
# CONFIG_USB_HSIC_USB3503 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set

#
# USB Physical Layer drivers
#
# CONFIG_USB_PHY is not set
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_USB_ISP1301 is not set
# CONFIG_USB_GADGET is not set
# CONFIG_USB_LED_TRIG is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y

#
# LED drivers
#
# CONFIG_LEDS_LM3530 is not set
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_LP3944 is not set
# CONFIG_LEDS_LP5521 is not set
# CONFIG_LEDS_LP5523 is not set
# CONFIG_LEDS_LP5562 is not set
# CONFIG_LEDS_LP8501 is not set
# CONFIG_LEDS_CLEVO_MAIL is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_BD2802 is not set
# CONFIG_LEDS_INTEL_SS4200 is not set
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_LM355x is not set

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
# CONFIG_LEDS_BLINKM is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
# CONFIG_LEDS_TRIGGER_TIMER is not set
# CONFIG_LEDS_TRIGGER_ONESHOT is not set
# CONFIG_LEDS_TRIGGER_HEARTBEAT is not set
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_LEDS_TRIGGER_TRANSIENT is not set
# CONFIG_LEDS_TRIGGER_CAMERA is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
# CONFIG_EDAC_DEBUG is not set
# CONFIG_EDAC_MM_EDAC is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
# CONFIG_RTC_HCTOSYS is not set
CONFIG_RTC_SYSTOHC=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_DS1307 is not set
# CONFIG_RTC_DRV_DS1374 is not set
# CONFIG_RTC_DRV_DS1672 is not set
# CONFIG_RTC_DRV_DS3232 is not set
# CONFIG_RTC_DRV_MAX6900 is not set
# CONFIG_RTC_DRV_RS5C372 is not set
# CONFIG_RTC_DRV_ISL1208 is not set
# CONFIG_RTC_DRV_ISL12022 is not set
# CONFIG_RTC_DRV_ISL12057 is not set
# CONFIG_RTC_DRV_X1205 is not set
# CONFIG_RTC_DRV_PCF2127 is not set
# CONFIG_RTC_DRV_PCF8523 is not set
# CONFIG_RTC_DRV_PCF8563 is not set
# CONFIG_RTC_DRV_PCF85063 is not set
# CONFIG_RTC_DRV_PCF8583 is not set
# CONFIG_RTC_DRV_M41T80 is not set
# CONFIG_RTC_DRV_BQ32K is not set
# CONFIG_RTC_DRV_S35390A is not set
# CONFIG_RTC_DRV_FM3130 is not set
# CONFIG_RTC_DRV_RX8581 is not set
# CONFIG_RTC_DRV_RX8025 is not set
# CONFIG_RTC_DRV_EM3027 is not set
# CONFIG_RTC_DRV_RV3029C2 is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_DS2404 is not set
# CONFIG_RTC_DRV_EFI is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_MSM6242 is not set
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_RP5C01 is not set
# CONFIG_RTC_DRV_V3020 is not set

#
# on-CPU RTC drivers
#
# CONFIG_RTC_DRV_XGENE is not set

#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set

#
# DMA Devices
#
# CONFIG_INTEL_MID_DMAC is not set
# CONFIG_INTEL_IOATDMA is not set
# CONFIG_DW_DMAC_CORE is not set
# CONFIG_DW_DMAC is not set
# CONFIG_DW_DMAC_PCI is not set
CONFIG_DMA_ACPI=y
# CONFIG_AUXDISPLAY is not set
# CONFIG_UIO is not set
# CONFIG_VIRT_DRIVERS is not set
CONFIG_VIRTIO=y

#
# Virtio drivers
#
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BALLOON=y
# CONFIG_VIRTIO_MMIO is not set

#
# Microsoft Hyper-V guest support
#
# CONFIG_STAGING is not set
CONFIG_X86_PLATFORM_DEVICES=y
# CONFIG_ACERHDF is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_DELL_SMO8800 is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_FUJITSU_TABLET is not set
# CONFIG_HP_ACCEL is not set
# CONFIG_HP_WIRELESS is not set
# CONFIG_PANASONIC_LAPTOP is not set
# CONFIG_THINKPAD_ACPI is not set
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_INTEL_MENLOW is not set
# CONFIG_ACPI_WMI is not set
# CONFIG_TOPSTAR_LAPTOP is not set
# CONFIG_TOSHIBA_BT_RFKILL is not set
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_ACPI_CMPC is not set
# CONFIG_INTEL_IPS is not set
# CONFIG_IBM_RTL is not set
# CONFIG_SAMSUNG_LAPTOP is not set
# CONFIG_SAMSUNG_Q10 is not set
# CONFIG_APPLE_GMUX is not set
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
# CONFIG_PVPANIC is not set
# CONFIG_CHROME_PLATFORMS is not set

#
# SOC (System On Chip) specific Drivers
#
# CONFIG_SOC_TI is not set

#
# Hardware Spinlock drivers
#

#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_ATMEL_PIT is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_SH_TIMER_TMU is not set
# CONFIG_EM_TIMER_STI is not set
# CONFIG_MAILBOX is not set
CONFIG_IOMMU_SUPPORT=y
# CONFIG_AMD_IOMMU is not set
# CONFIG_INTEL_IOMMU is not set
# CONFIG_IRQ_REMAP is not set

#
# Remoteproc drivers
#
# CONFIG_STE_MODEM_RPROC is not set

#
# Rpmsg drivers
#

#
# SOC (System On Chip) specific Drivers
#
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_VME_BUS is not set
# CONFIG_PWM is not set
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
# CONFIG_FMC is not set

#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
# CONFIG_BCM_KONA_USB2_PHY is not set
# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set
CONFIG_RAS=y
# CONFIG_THUNDERBOLT is not set

#
# Firmware Drivers
#
CONFIG_EDD=y
# CONFIG_EDD_OFF is not set
CONFIG_FIRMWARE_MEMMAP=y
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_DMI_SYSFS is not set
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
# CONFIG_ISCSI_IBFT_FIND is not set
# CONFIG_GOOGLE_FIRMWARE is not set

#
# EFI (Extensible Firmware Interface) Support
#
CONFIG_EFI_VARS=y
CONFIG_EFI_RUNTIME_MAP=y
CONFIG_EFI_RUNTIME_WRAPPERS=y

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_DEFAULTS_TO_ORDERED=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=y
CONFIG_XFS_QUOTA=y
CONFIG_XFS_POSIX_ACL=y
CONFIG_XFS_RT=y
CONFIG_XFS_DEBUG=y
# CONFIG_GFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_FANOTIFY is not set
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
# CONFIG_PRINT_QUOTA_WARNING is not set
# CONFIG_QUOTA_DEBUG is not set
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_QUOTACTL_COMPAT=y
CONFIG_AUTOFS4_FS=y
# CONFIG_FUSE_FS is not set

#
# Caches
#
# CONFIG_FSCACHE is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=y
# CONFIG_NTFS_DEBUG is not set
CONFIG_NTFS_RW=y

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
# CONFIG_CONFIGFS_FS is not set
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_LOGFS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_PSTORE is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_F2FS_FS is not set
# CONFIG_EFIVAR_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V2=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
# CONFIG_NFS_SWAP is not set
# CONFIG_NFS_V4_1 is not set
CONFIG_ROOT_NFS=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFSD=y
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
# CONFIG_NFSD_V4_SECURITY_LABEL is not set
# CONFIG_NFSD_FAULT_INJECTION is not set
CONFIG_GRACE_PERIOD=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
# CONFIG_SUNRPC_DEBUG is not set
# CONFIG_CEPH_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_MAC_ROMAN is not set
# CONFIG_NLS_MAC_CELTIC is not set
# CONFIG_NLS_MAC_CENTEURO is not set
# CONFIG_NLS_MAC_CROATIAN is not set
# CONFIG_NLS_MAC_CYRILLIC is not set
# CONFIG_NLS_MAC_GAELIC is not set
# CONFIG_NLS_MAC_GREEK is not set
# CONFIG_NLS_MAC_ICELAND is not set
# CONFIG_NLS_MAC_INUIT is not set
# CONFIG_NLS_MAC_ROMANIAN is not set
# CONFIG_NLS_MAC_TURKISH is not set
CONFIG_NLS_UTF8=y

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_DYNAMIC_DEBUG is not set

#
# Compile-time checks and compiler options
#
# CONFIG_DEBUG_INFO is not set
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
# CONFIG_DEBUG_SECTION_MISMATCH is not set
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_DEBUG_KERNEL=y

#
# Memory Debugging
#
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
CONFIG_DEBUG_STACK_USAGE=y
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_HAVE_ARCH_KMEMCHECK=y
# CONFIG_KMEMCHECK is not set
# CONFIG_DEBUG_SHIRQ is not set

#
# Debug Lockups and Hangs
#
# CONFIG_LOCKUP_DETECTOR is not set
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
# CONFIG_SCHED_DEBUG is not set
CONFIG_SCHEDSTATS=y
# CONFIG_SCHED_STACK_END_CHECK is not set
CONFIG_TIMER_STATS=y

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_ATOMIC_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_PI_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set

#
# RCU Debugging
#
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_TORTURE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# CONFIG_RCU_CPU_STALL_INFO is not set
# CONFIG_RCU_TRACE is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
# CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
# CONFIG_FUNCTION_TRACER is not set
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_SCHED_TRACER is not set
# CONFIG_FTRACE_SYSCALLS is not set
# CONFIG_TRACER_SNAPSHOT is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
# CONFIG_STACK_TRACER is not set
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENT=y
# CONFIG_UPROBE_EVENT is not set
CONFIG_PROBE_EVENTS=y
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_MMIOTRACE is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
# CONFIG_RING_BUFFER_BENCHMARK is not set
# CONFIG_RING_BUFFER_STARTUP_TEST is not set

#
# Runtime Testing
#
# CONFIG_LKDTM is not set
# CONFIG_TEST_LIST_SORT is not set
# CONFIG_KPROBES_SANITY_TEST is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_INTERVAL_TREE_TEST is not set
# CONFIG_PERCPU_TEST is not set
# CONFIG_ATOMIC64_SELFTEST is not set
# CONFIG_ASYNC_RAID6_TEST is not set
# CONFIG_TEST_STRING_HELPERS is not set
# CONFIG_TEST_KSTRTOX is not set
# CONFIG_TEST_RHASHTABLE is not set
CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_TEST_LKM is not set
# CONFIG_TEST_USER_COPY is not set
# CONFIG_TEST_BPF is not set
# CONFIG_TEST_FIRMWARE is not set
# CONFIG_TEST_UDELAY is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
# CONFIG_EARLY_PRINTK_EFI is not set
# CONFIG_X86_PTDUMP is not set
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_RODATA_TEST is not set
# CONFIG_DEBUG_SET_MODULE_RONX is not set
# CONFIG_DEBUG_NX_TEST is not set
CONFIG_DOUBLEFAULT=y
# CONFIG_DEBUG_TLBFLUSH is not set
# CONFIG_IOMMU_DEBUG is not set
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
# CONFIG_X86_DECODER_SELFTEST is not set
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
CONFIG_DEBUG_BOOT_PARAMS=y
# CONFIG_CPA_DEBUG is not set
CONFIG_OPTIMIZE_INLINING=y
# CONFIG_DEBUG_NMI_SELFTEST is not set
# CONFIG_X86_DEBUG_STATIC_CPU_HAS is not set

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_PERSISTENT_KEYRINGS is not set
# CONFIG_BIG_KEYS is not set
# CONFIG_ENCRYPTED_KEYS is not set
CONFIG_KEYS_DEBUG_PROC_KEYS=y
# CONFIG_SECURITY_DMESG_RESTRICT is not set
CONFIG_SECURITY=y
# CONFIG_SECURITYFS is not set
CONFIG_SECURITY_NETWORK=y
# CONFIG_SECURITY_NETWORK_XFRM is not set
# CONFIG_SECURITY_PATH is not set
# CONFIG_SECURITY_SELINUX is not set
# CONFIG_SECURITY_SMACK is not set
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_SECURITY_APPARMOR is not set
# CONFIG_SECURITY_YAMA is not set
CONFIG_INTEGRITY=y
# CONFIG_INTEGRITY_SIGNATURE is not set
CONFIG_INTEGRITY_AUDIT=y
# CONFIG_IMA is not set
# CONFIG_EVM is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_DEFAULT_SECURITY=""
CONFIG_XOR_BLOCKS=y
CONFIG_ASYNC_CORE=y
CONFIG_ASYNC_MEMCPY=y
CONFIG_ASYNC_XOR=y
CONFIG_ASYNC_PQ=y
CONFIG_ASYNC_RAID6_RECOV=y
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_USER is not set
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
# CONFIG_CRYPTO_GF128MUL is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_PCRYPT is not set
CONFIG_CRYPTO_WORKQUEUE=y
# CONFIG_CRYPTO_CRYPTD is not set
# CONFIG_CRYPTO_MCRYPTD is not set
CONFIG_CRYPTO_AUTHENC=y
# CONFIG_CRYPTO_TEST is not set

#
# Authenticated Encryption with Associated Data
#
# CONFIG_CRYPTO_CCM is not set
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_SEQIV is not set

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
# CONFIG_CRYPTO_CTR is not set
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=y
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_PCBC is not set
# CONFIG_CRYPTO_XTS is not set

#
# Hash modes
#
# CONFIG_CRYPTO_CMAC is not set
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
# CONFIG_CRYPTO_VMAC is not set

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32C_INTEL is not set
# CONFIG_CRYPTO_CRC32 is not set
# CONFIG_CRYPTO_CRC32_PCLMUL is not set
CONFIG_CRYPTO_CRCT10DIF=y
# CONFIG_CRYPTO_CRCT10DIF_PCLMUL is not set
# CONFIG_CRYPTO_GHASH is not set
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_RMD128 is not set
# CONFIG_CRYPTO_RMD160 is not set
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
CONFIG_CRYPTO_SHA1=y
# CONFIG_CRYPTO_SHA1_SSSE3 is not set
# CONFIG_CRYPTO_SHA256_SSSE3 is not set
# CONFIG_CRYPTO_SHA512_SSSE3 is not set
# CONFIG_CRYPTO_SHA1_MB is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_TGR192 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_X86_64 is not set
# CONFIG_CRYPTO_AES_NI_INTEL is not set
# CONFIG_CRYPTO_ANUBIS is not set
CONFIG_CRYPTO_ARC4=y
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_BLOWFISH_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA is not set
# CONFIG_CRYPTO_CAMELLIA_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
# CONFIG_CRYPTO_FCRYPT is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SALSA20_X86_64 is not set
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set

#
# Compression
#
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_ZLIB is not set
# CONFIG_CRYPTO_LZO is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
# CONFIG_CRYPTO_DRBG_MENU is not set
# CONFIG_CRYPTO_USER_API_HASH is not set
# CONFIG_CRYPTO_USER_API_SKCIPHER is not set
CONFIG_CRYPTO_HW=y
# CONFIG_CRYPTO_DEV_PADLOCK is not set
# CONFIG_CRYPTO_DEV_CCP is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_ASYMMETRIC_KEY_TYPE is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=y
CONFIG_KVM_INTEL=y
# CONFIG_KVM_AMD is not set
# CONFIG_KVM_MMU_AUDIT is not set
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_RAID6_PQ=y
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_IO=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
# CONFIG_CRC_CCITT is not set
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=y
# CONFIG_CRC8 is not set
# CONFIG_AUDIT_ARCH_COMPAT_GENERIC is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_INTERVAL_TREE=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE=y
CONFIG_AVERAGE=y
# CONFIG_CORDIC is not set
# CONFIG_DDR is not set
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_FONT_SUPPORT=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_ARCH_HAS_SG_CHAIN=y

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-08 20:40         ` Mel Gorman
@ 2015-03-09 21:02           ` Mel Gorman
  2015-03-10 13:08             ` Mel Gorman
  0 siblings, 1 reply; 49+ messages in thread
From: Mel Gorman @ 2015-03-09 21:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dave Chinner, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Sun, Mar 08, 2015 at 08:40:25PM +0000, Mel Gorman wrote:
> > Because if the answer is 'yes', then we can safely say: 'we regressed 
> > performance because correctness [not dropping dirty bits] comes before 
> > performance'.
> > 
> > If the answer is 'no', then we still have a mystery (and a regression) 
> > to track down.
> > 
> > As a second hack (not to be applied), could we change:
> > 
> >  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
> > 
> > to:
> > 
> >  #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)
> > 
> 
> In itself, that's not enough. The SWP_OFFSET_SHIFT would also need updating
> as a partial revert of 21d9ee3eda7792c45880b2f11bff8e95c9a061fb but it
> can be done.
> 

More importantily, _PAGE_BIT_GLOBAL+1 == the special PTE bit so just
updating the value should crash. For the purposes of testing the idea, I
thought the straight-forward option was to break soft dirty page tracking
and steal their bit for testing (patch below). Took most of the day to
get access to the test machine so tests are not long running and only
the autonuma one has completed;

autonumabench
                                              3.19.0             4.0.0-rc1             4.0.0-rc1             4.0.0-rc1
                                             vanilla               vanilla         slowscan-v2r7        protnone-v3
Time User-NUMA01                  25695.96 (  0.00%)    32883.59 (-27.97%)    35288.00 (-37.33%)    35236.21 (-37.13%)
Time User-NUMA01_THEADLOCAL       17404.36 (  0.00%)    17453.20 ( -0.28%)    17765.79 ( -2.08%)    17590.10 ( -1.07%)
Time User-NUMA02                   2037.65 (  0.00%)     2063.70 ( -1.28%)     2063.22 ( -1.25%)     2072.95 ( -1.73%)
Time User-NUMA02_SMT                981.02 (  0.00%)      983.70 ( -0.27%)      976.01 (  0.51%)      983.42 ( -0.24%)
Time System-NUMA01                  194.70 (  0.00%)      602.44 (-209.42%)      209.42 ( -7.56%)      737.36 (-278.72%)
Time System-NUMA01_THEADLOCAL        98.52 (  0.00%)       78.10 ( 20.73%)       92.70 (  5.91%)       80.69 ( 18.10%)
Time System-NUMA02                    9.28 (  0.00%)        6.47 ( 30.28%)        6.06 ( 34.70%)        6.63 ( 28.56%)
Time System-NUMA02_SMT                3.79 (  0.00%)        5.06 (-33.51%)        3.39 ( 10.55%)        3.60 (  5.01%)
Time Elapsed-NUMA01                 558.84 (  0.00%)      755.96 (-35.27%)      833.63 (-49.17%)      804.50 (-43.96%)
Time Elapsed-NUMA01_THEADLOCAL      382.54 (  0.00%)      382.22 (  0.08%)      395.45 ( -3.37%)      388.12 ( -1.46%)
Time Elapsed-NUMA02                  49.83 (  0.00%)       49.38 (  0.90%)       50.21 ( -0.76%)       48.99 (  1.69%)
Time Elapsed-NUMA02_SMT              46.59 (  0.00%)       47.70 ( -2.38%)       48.55 ( -4.21%)       49.50 ( -6.25%)
Time CPU-NUMA01                    4632.00 (  0.00%)     4429.00 (  4.38%)     4258.00 (  8.07%)     4471.00 (  3.48%)
Time CPU-NUMA01_THEADLOCAL         4575.00 (  0.00%)     4586.00 ( -0.24%)     4515.00 (  1.31%)     4552.00 (  0.50%)
Time CPU-NUMA02                    4107.00 (  0.00%)     4191.00 ( -2.05%)     4120.00 ( -0.32%)     4244.00 ( -3.34%)
Time CPU-NUMA02_SMT                2113.00 (  0.00%)     2072.00 (  1.94%)     2017.00 (  4.54%)     1993.00 (  5.68%)

              3.19.0   4.0.0-rc1   4.0.0-rc1   4.0.0-rc1
             vanilla     vanillaslowscan-v2r7protnone-v3
User        46119.12    53384.29    56093.11    55882.82
System        306.41      692.14      311.64      828.36
Elapsed      1039.88     1236.87     1328.61     1292.92

So just using a different bit doesn't seem to be it either

                                3.19.0   4.0.0-rc1   4.0.0-rc1   4.0.0-rc1
                               vanilla     vanillaslowscan-v2r7protnone-v3
NUMA alloc hit                 1202922     1437560     1472578     1499274
NUMA alloc miss                      0           0           0           0
NUMA interleave hit                  0           0           0           0
NUMA alloc local               1200683     1436781     1472226     1498680
NUMA base PTE updates        222840103   304513172   121532313   337431414
NUMA huge PMD updates           434894      594467      237170      658715
NUMA page range updates      445505831   608880276   242963353   674693494
NUMA hint faults                601358      733491      334334      820793
NUMA hint local faults          371571      511530      227171      565003
NUMA hint local percent             61          69          67          68
NUMA pages migrated            7073177    26366701     8607082    31288355

Patch to use a bit other than the global bit for prot none is below.

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 8c7c10802e9c..1f243323693c 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -20,16 +20,16 @@
 #define _PAGE_BIT_SOFTW2	10	/* " */
 #define _PAGE_BIT_SOFTW3	11	/* " */
 #define _PAGE_BIT_PAT_LARGE	12	/* On 2MB or 1GB pages */
-#define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW1
-#define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW1
+#define _PAGE_BIT_SPECIAL	_PAGE_BIT_SOFTW3
+#define _PAGE_BIT_CPA_TEST	_PAGE_BIT_SOFTW3
 #define _PAGE_BIT_SPLITTING	_PAGE_BIT_SOFTW2 /* only valid on a PSE pmd */
-#define _PAGE_BIT_HIDDEN	_PAGE_BIT_SOFTW3 /* hidden by kmemcheck */
-#define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
+#define _PAGE_BIT_HIDDEN	_PAGE_BIT_SOFTW1 /* hidden by kmemcheck */
+#define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW1 /* software dirty tracking */
 #define _PAGE_BIT_NX           63       /* No execute: only valid after cpuid check */
 
 /* If _PAGE_BIT_PRESENT is clear, we use these: */
 /* - if the user mapped it with PROT_NONE; pte_present gives true */
-#define _PAGE_BIT_PROTNONE	_PAGE_BIT_GLOBAL
+#define _PAGE_BIT_PROTNONE	_PAGE_BIT_SOFTW1
 
 #define _PAGE_PRESENT	(_AT(pteval_t, 1) << _PAGE_BIT_PRESENT)
 #define _PAGE_RW	(_AT(pteval_t, 1) << _PAGE_BIT_RW)
@@ -98,8 +98,7 @@
 
 /* Set of bits not changed in pte_modify */
 #define _PAGE_CHG_MASK	(PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT |		\
-			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY |	\
-			 _PAGE_SOFT_DIRTY)
+			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY)
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
 
 /*

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-09 21:02           ` Mel Gorman
@ 2015-03-10 13:08             ` Mel Gorman
  0 siblings, 0 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-10 13:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dave Chinner, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Mon, Mar 09, 2015 at 09:02:19PM +0000, Mel Gorman wrote:
> On Sun, Mar 08, 2015 at 08:40:25PM +0000, Mel Gorman wrote:
> > > Because if the answer is 'yes', then we can safely say: 'we regressed 
> > > performance because correctness [not dropping dirty bits] comes before 
> > > performance'.
> > > 
> > > If the answer is 'no', then we still have a mystery (and a regression) 
> > > to track down.
> > > 
> > > As a second hack (not to be applied), could we change:
> > > 
> > >  #define _PAGE_BIT_PROTNONE      _PAGE_BIT_GLOBAL
> > > 
> > > to:
> > > 
> > >  #define _PAGE_BIT_PROTNONE      (_PAGE_BIT_GLOBAL+1)
> > > 
> > 
> > In itself, that's not enough. The SWP_OFFSET_SHIFT would also need updating
> > as a partial revert of 21d9ee3eda7792c45880b2f11bff8e95c9a061fb but it
> > can be done.
> > 
> 
> More importantily, _PAGE_BIT_GLOBAL+1 == the special PTE bit so just
> updating the value should crash. For the purposes of testing the idea, I
> thought the straight-forward option was to break soft dirty page tracking
> and steal their bit for testing (patch below). Took most of the day to
> get access to the test machine so tests are not long running and only
> the autonuma one has completed;
> 

And the xfsrepair workload also does not show any benefit from using a
different bit either

                                       3.19.0             4.0.0-rc1             4.0.0-rc1             4.0.0-rc1
                                      vanilla               vanilla         slowscan-v2r7        protnone-v3r17
Min      real-fsmark        1164.44 (  0.00%)     1157.41 (  0.60%)     1150.38 (  1.21%)     1173.22 ( -0.75%)
Min      syst-fsmark        4016.12 (  0.00%)     3998.06 (  0.45%)     3988.42 (  0.69%)     4037.90 ( -0.54%)
Min      real-xfsrepair      442.64 (  0.00%)      497.64 (-12.43%)      456.87 ( -3.21%)      489.60 (-10.61%)
Min      syst-xfsrepair      194.97 (  0.00%)      500.61 (-156.76%)      263.41 (-35.10%)      544.56 (-179.30%)
Amean    real-fsmark        1166.28 (  0.00%)     1166.63 ( -0.03%)     1155.97 (  0.88%)     1183.19 ( -1.45%)
Amean    syst-fsmark        4025.87 (  0.00%)     4020.94 (  0.12%)     4004.19 (  0.54%)     4061.64 ( -0.89%)
Amean    real-xfsrepair      447.66 (  0.00%)      507.85 (-13.45%)      459.58 ( -2.66%)      498.71 (-11.40%)
Amean    syst-xfsrepair      202.93 (  0.00%)      519.88 (-156.19%)      281.63 (-38.78%)      569.21 (-180.50%)
Stddev   real-fsmark           1.44 (  0.00%)        6.55 (-354.10%)        3.97 (-175.65%)        9.20 (-537.90%)
Stddev   syst-fsmark           9.76 (  0.00%)       16.22 (-66.27%)       15.09 (-54.69%)       17.47 (-79.13%)
Stddev   real-xfsrepair        5.57 (  0.00%)       11.17 (-100.68%)        3.41 ( 38.66%)        6.77 (-21.63%)
Stddev   syst-xfsrepair        5.69 (  0.00%)       13.98 (-145.78%)       19.94 (-250.49%)       20.03 (-252.05%)
CoeffVar real-fsmark           0.12 (  0.00%)        0.56 (-353.96%)        0.34 (-178.11%)        0.78 (-528.79%)
CoeffVar syst-fsmark           0.24 (  0.00%)        0.40 (-66.48%)        0.38 (-55.53%)        0.43 (-77.55%)
CoeffVar real-xfsrepair        1.24 (  0.00%)        2.20 (-76.89%)        0.74 ( 40.25%)        1.36 ( -9.17%)
CoeffVar syst-xfsrepair        2.80 (  0.00%)        2.69 (  4.06%)        7.08 (-152.54%)        3.52 (-25.51%)
Max      real-fsmark        1167.96 (  0.00%)     1171.98 ( -0.34%)     1159.25 (  0.75%)     1195.41 ( -2.35%)
Max      syst-fsmark        4039.20 (  0.00%)     4033.84 (  0.13%)     4024.53 (  0.36%)     4079.45 ( -1.00%)
Max      real-xfsrepair      455.42 (  0.00%)      523.40 (-14.93%)      464.40 ( -1.97%)      505.82 (-11.07%)
Max      syst-xfsrepair      207.94 (  0.00%)      533.37 (-156.50%)      309.38 (-48.78%)      593.62 (-185.48%)


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-09 19:19               ` Dave Chinner
@ 2015-03-10 23:55                 ` Linus Torvalds
  2015-03-12 13:10                   ` Mel Gorman
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-10 23:55 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Ingo Molnar, Mel Gorman, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Mon, Mar 9, 2015 at 12:19 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Mar 09, 2015 at 09:52:18AM -0700, Linus Torvalds wrote:
>>
>> What's your virtual environment setup? Kernel config, and
>> virtualization environment to actually get that odd fake NUMA thing
>> happening?
>
> I don't have the exact .config with me (test machines at home
> are shut down because I'm half a world away), but it's pretty much
> this (copied and munged from a similar test vm on my laptop):

[ snip snip ]

Ok, I hate debugging by symptoms anyway, so I didn't do any of this,
and went back to actually *thinking* about the code instead of trying
to reproduce this and figure things out by trial and error.

And I think I figured it out. Of course, since I didn't actually test
anything, what do I know, but I feel good about it, because I think I
can explain why that patch that on the face of it shouldn't change
anything actually did.

So, the old code just did all those manual page table changes,
clearing the present bit and setting the NUMA bit instead.

The new code _ostensibly_ does the same, except it clears the present
bit and sets the PROTNONE bit instead.

However, rather than playing special games with just those two bits,
it uses the normal pte accessor functions, and in particular uses
vma->vm_page_prot to reset the protections back. Which is a nice
cleanup and really makes the code look saner, and does the same thing.

Except it really isn't the same thing at all.

Why?

The protection bits in the page tables are *not* the same as
vma->vm_page_prot. Yes, they start out that way, but they don't stay
that way. And no, I'm not talking about dirty and accessed bits.

The difference? COW. Any private mapping is marked read-only in
vma->vm_page_prot, and then the COW (or the initial write) makes it
read-write.

And so, when we did

-       pte = pte_mknonnuma(pte);
+       /* Make it present again */
+       pte = pte_modify(pte, vma->vm_page_prot);
+       pte = pte_mkyoung(pte);

that isn't equivalent at all - it makes the page read-only, because it
restores it to its original state.

Now, that isn't actually what hurts most, I suspect. Judging by the
profiles, we don't suddenly take a lot of new COW faults. No, what
hurts most is that the NUMA balancing code does this:

        /*
         * Avoid grouping on DSO/COW pages in specific and RO pages
         * in general, RO pages shouldn't hurt as much anyway since
         * they can be in shared cache state.
         */
        if (!pte_write(pte))
                flags |= TNF_NO_GROUP;

and that "!pte_write(pte)" is basically now *always* true for private
mappings (which is 99% of all mappings).

In other words, I think the patch unintentionally made the NUMA code
basically always do the TNF_NO_GROUP case.

I think that a quick hack for testing might be to just replace that
"!pte_write()" with "!pte_dirty()", and seeing how that acts.

Comments?

                      Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-10 23:55                 ` Linus Torvalds
@ 2015-03-12 13:10                   ` Mel Gorman
  2015-03-12 16:20                     ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Mel Gorman @ 2015-03-12 13:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Tue, Mar 10, 2015 at 04:55:52PM -0700, Linus Torvalds wrote:
> On Mon, Mar 9, 2015 at 12:19 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Mon, Mar 09, 2015 at 09:52:18AM -0700, Linus Torvalds wrote:
> >>
> >> What's your virtual environment setup? Kernel config, and
> >> virtualization environment to actually get that odd fake NUMA thing
> >> happening?
> >
> > I don't have the exact .config with me (test machines at home
> > are shut down because I'm half a world away), but it's pretty much
> > this (copied and munged from a similar test vm on my laptop):
> 
> [ snip snip ]
> 
> Ok, I hate debugging by symptoms anyway, so I didn't do any of this,
> and went back to actually *thinking* about the code instead of trying
> to reproduce this and figure things out by trial and error.
> 
> And I think I figured it out.
> <SNIP>

I believe you're correct and it matches what was observed. I'm still
travelling and wireless is dirt but managed to queue a test using pmd_dirty

                                              3.19.0             4.0.0-rc1             4.0.0-rc1
                                             vanilla               vanilla        ptewrite-v1r20
Time User-NUMA01                  25695.96 (  0.00%)    32883.59 (-27.97%)    24012.80 (  6.55%)
Time User-NUMA01_THEADLOCAL       17404.36 (  0.00%)    17453.20 ( -0.28%)    17950.54 ( -3.14%)
Time User-NUMA02                   2037.65 (  0.00%)     2063.70 ( -1.28%)     2046.88 ( -0.45%)
Time User-NUMA02_SMT                981.02 (  0.00%)      983.70 ( -0.27%)      983.68 ( -0.27%)
Time System-NUMA01                  194.70 (  0.00%)      602.44 (-209.42%)      158.90 ( 18.39%)
Time System-NUMA01_THEADLOCAL        98.52 (  0.00%)       78.10 ( 20.73%)      107.66 ( -9.28%)
Time System-NUMA02                    9.28 (  0.00%)        6.47 ( 30.28%)        9.25 (  0.32%)
Time System-NUMA02_SMT                3.79 (  0.00%)        5.06 (-33.51%)        3.92 ( -3.43%)
Time Elapsed-NUMA01                 558.84 (  0.00%)      755.96 (-35.27%)      532.41 (  4.73%)
Time Elapsed-NUMA01_THEADLOCAL      382.54 (  0.00%)      382.22 (  0.08%)      390.48 ( -2.08%)
Time Elapsed-NUMA02                  49.83 (  0.00%)       49.38 (  0.90%)       49.79 (  0.08%)
Time Elapsed-NUMA02_SMT              46.59 (  0.00%)       47.70 ( -2.38%)       47.77 ( -2.53%)
Time CPU-NUMA01                    4632.00 (  0.00%)     4429.00 (  4.38%)     4539.00 (  2.01%)
Time CPU-NUMA01_THEADLOCAL         4575.00 (  0.00%)     4586.00 ( -0.24%)     4624.00 ( -1.07%)
Time CPU-NUMA02                    4107.00 (  0.00%)     4191.00 ( -2.05%)     4129.00 ( -0.54%)
Time CPU-NUMA02_SMT                2113.00 (  0.00%)     2072.00 (  1.94%)     2067.00 (  2.18%)

              3.19.0   4.0.0-rc1   4.0.0-rc1
             vanilla     vanillaptewrite-v1r20
User        46119.12    53384.29    44994.10
System        306.41      692.14      279.78
Elapsed      1039.88     1236.87     1022.92

There are still some difference but it's much closer to what it was.
The balancing stats are almost looking similar to 3.19

NUMA base PTE updates        222840103   304513172   230724075
NUMA huge PMD updates           434894      594467      450274
NUMA page range updates      445505831   608880276   461264363
NUMA hint faults                601358      733491      626176
NUMA hint local faults          371571      511530      359215
NUMA hint local percent             61          69          57
NUMA pages migrated            7073177    26366701     6829196

XFS repair on the same machine is not fully restore either but a big
enough move in the right direction to indicate this was the relevant
change.

xfsrepair
                                       3.19.0             4.0.0-rc1             4.0.0-rc1
                                      vanilla               vanilla        ptewrite-v1r20
Amean    real-fsmark        1166.28 (  0.00%)     1166.63 ( -0.03%)     1184.97 ( -1.60%)
Amean    syst-fsmark        4025.87 (  0.00%)     4020.94 (  0.12%)     4071.10 ( -1.12%)
Amean    real-xfsrepair      447.66 (  0.00%)      507.85 (-13.45%)      460.94 ( -2.97%)
Amean    syst-xfsrepair      202.93 (  0.00%)      519.88 (-156.19%)      282.45 (-39.19%)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-12 13:10                   ` Mel Gorman
@ 2015-03-12 16:20                     ` Linus Torvalds
  2015-03-12 18:49                       ` Mel Gorman
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-12 16:20 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 12, 2015 at 6:10 AM, Mel Gorman <mgorman@suse.de> wrote:
>
> I believe you're correct and it matches what was observed. I'm still
> travelling and wireless is dirt but managed to queue a test using pmd_dirty

Ok, thanks.

I'm not entirely happy with that change, and I suspect the whole
heuristic should be looked at much more (maybe it should also look at
whether it's executable, for example), but it's a step in the right
direction.

So I committed it and added a comment, and wrote a commit log about
it. I suspect any further work is post-4.0-release, unless somebody
comes up with something small and simple and obviously better.

                                 Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-12 16:20                     ` Linus Torvalds
@ 2015-03-12 18:49                       ` Mel Gorman
  2015-03-17  7:06                         ` Dave Chinner
  0 siblings, 1 reply; 49+ messages in thread
From: Mel Gorman @ 2015-03-12 18:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 12, 2015 at 09:20:36AM -0700, Linus Torvalds wrote:
> On Thu, Mar 12, 2015 at 6:10 AM, Mel Gorman <mgorman@suse.de> wrote:
> >
> > I believe you're correct and it matches what was observed. I'm still
> > travelling and wireless is dirt but managed to queue a test using pmd_dirty
> 
> Ok, thanks.
> 
> I'm not entirely happy with that change, and I suspect the whole
> heuristic should be looked at much more (maybe it should also look at
> whether it's executable, for example), but it's a step in the right
> direction.
> 

I can follow up when I'm back in work properly. As you have already pulled
this in directly, can you also consider pulling in "mm: thp: return the
correct value for change_huge_pmd" please? The other two patches were very
minor can be resent through the normal paths later.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-12 18:49                       ` Mel Gorman
@ 2015-03-17  7:06                         ` Dave Chinner
  2015-03-17 16:53                           ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-17  7:06 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 12, 2015 at 06:49:26PM +0000, Mel Gorman wrote:
> On Thu, Mar 12, 2015 at 09:20:36AM -0700, Linus Torvalds wrote:
> > On Thu, Mar 12, 2015 at 6:10 AM, Mel Gorman <mgorman@suse.de> wrote:
> > >
> > > I believe you're correct and it matches what was observed. I'm still
> > > travelling and wireless is dirt but managed to queue a test using pmd_dirty
> > 
> > Ok, thanks.
> > 
> > I'm not entirely happy with that change, and I suspect the whole
> > heuristic should be looked at much more (maybe it should also look at
> > whether it's executable, for example), but it's a step in the right
> > direction.
> > 
> 
> I can follow up when I'm back in work properly. As you have already pulled
> this in directly, can you also consider pulling in "mm: thp: return the
> correct value for change_huge_pmd" please? The other two patches were very
> minor can be resent through the normal paths later.

TO close the loop here, now I'm back home and can run tests:

config                            3.19      4.0-rc1     4.0-rc4
defaults                         8m08s        9m34s       9m14s
-o ag_stride=-1                  4m04s        4m38s       4m11s
-o bhash=101073                  6m04s       17m43s       7m35s
-o ag_stride=-1,bhash=101073     4m54s        9m58s       7m50s

It's better but there are still significant regressions, especially
for the large memory footprint cases. I haven't had a chance to look
at any stats or profiles yet, so I don't know yet whether this is
still page fault related or some other problem....

Cheers,

Dave
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-17  7:06                         ` Dave Chinner
@ 2015-03-17 16:53                           ` Linus Torvalds
  2015-03-17 20:51                             ` Dave Chinner
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-17 16:53 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Tue, Mar 17, 2015 at 12:06 AM, Dave Chinner <david@fromorbit.com> wrote:
>
> TO close the loop here, now I'm back home and can run tests:
>
> config                            3.19      4.0-rc1     4.0-rc4
> defaults                         8m08s        9m34s       9m14s
> -o ag_stride=-1                  4m04s        4m38s       4m11s
> -o bhash=101073                  6m04s       17m43s       7m35s
> -o ag_stride=-1,bhash=101073     4m54s        9m58s       7m50s
>
> It's better but there are still significant regressions, especially
> for the large memory footprint cases. I haven't had a chance to look
> at any stats or profiles yet, so I don't know yet whether this is
> still page fault related or some other problem....

Ok. I'd love to see some data on what changed between 3.19 and rc4 in
the profiles, just to see whether it's "more page faults due to extra
COW", or whether it's due to "more TLB flushes because of the
pte_write() vs pte_dirty()" differences. I'm *guessing*  lot of the
remaining issues are due to extra page fault overhead because I'd
expect write/dirty to be fairly 1:1, but there could be differences
due to shared memory use and/or just writebacks of dirty pages that
become clean.

I guess you can also see in vmstat.mm_migrate_pages whether it's
because of excessive migration (because of bad grouping) or not. So
not just profiles data.

At the same time, I feel fairly happy about the situation - we at
least understand what is going on, and the "3x worse performance" case
is at least gone.  Even if that last case still looks horrible.

So it's still a bad performance regression, but at the same time I
think your test setup (big 500 TB filesystem, but then a fake-numa
thing with just 4GB per node) is specialized and unrealistic enough
that I don't feel it's all that relevant from a *real-world*
standpoint, and so I wouldn't be uncomfortable saying "ok, the page
table handling cleanup caused some issues, but we know about them and
how to fix them longer-term".  So I don't consider this a 4.0
showstopper or a "we need to revert for now" issue.

If it's a case of "we take a lot more page faults because we handle
the NUMA fault and then have a COW fault almost immediately", then the
fix is likely to do the same early-cow that the normal non-numa-fault
case does. In fact, my gut feel is that we should try to unify that
numa/regula fault handling path a bit more, but that would be a pretty
invasive patch.

                     Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-17 16:53                           ` Linus Torvalds
@ 2015-03-17 20:51                             ` Dave Chinner
  2015-03-17 21:30                               ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-17 20:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Tue, Mar 17, 2015 at 09:53:57AM -0700, Linus Torvalds wrote:
> On Tue, Mar 17, 2015 at 12:06 AM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > TO close the loop here, now I'm back home and can run tests:
> >
> > config                            3.19      4.0-rc1     4.0-rc4
> > defaults                         8m08s        9m34s       9m14s
> > -o ag_stride=-1                  4m04s        4m38s       4m11s
> > -o bhash=101073                  6m04s       17m43s       7m35s
> > -o ag_stride=-1,bhash=101073     4m54s        9m58s       7m50s
> >
> > It's better but there are still significant regressions, especially
> > for the large memory footprint cases. I haven't had a chance to look
> > at any stats or profiles yet, so I don't know yet whether this is
> > still page fault related or some other problem....
> 
> Ok. I'd love to see some data on what changed between 3.19 and rc4 in
> the profiles, just to see whether it's "more page faults due to extra
> COW", or whether it's due to "more TLB flushes because of the
> pte_write() vs pte_dirty()" differences. I'm *guessing*  lot of the
> remaining issues are due to extra page fault overhead because I'd
> expect write/dirty to be fairly 1:1, but there could be differences
> due to shared memory use and/or just writebacks of dirty pages that
> become clean.
> 
> I guess you can also see in vmstat.mm_migrate_pages whether it's
> because of excessive migration (because of bad grouping) or not. So
> not just profiles data.

On the -o ag_stride=-1 -o bhash=101073 config, the 60s perf stat I
was using during steady state shows:

     471,752      migrate:mm_migrate_pages ( +-  7.38% )

The migrate pages rate is even higher than in 4.0-rc1 (~360,000)
and 3.19 (~55,000), so that looks like even more of a problem than
before.

And the profile looks like:

-   43.73%     0.05%  [kernel]            [k] native_flush_tlb_others
   - native_flush_tlb_others
      - 99.87% flush_tlb_page
           ptep_clear_flush
           try_to_unmap_one
           rmap_walk
           try_to_unmap
           migrate_pages
           migrate_misplaced_page
         - handle_mm_fault
            - 99.84% __do_page_fault
                 trace_do_page_fault
                 do_async_page_fault
               + async_page_fault

(grrrr - running perf with call stack profiling for long enough
oom-kills xfs_repair)

And the vmstats are:

3.19:

numa_hit 5163221
numa_miss 121274
numa_foreign 121274
numa_interleave 12116
numa_local 5153127
numa_other 131368
numa_pte_updates 36482466
numa_huge_pte_updates 0
numa_hint_faults 34816515
numa_hint_faults_local 9197961
numa_pages_migrated 1228114
pgmigrate_success 1228114
pgmigrate_fail 0

4.0-rc1:

numa_hit 36952043
numa_miss 92471
numa_foreign 92471
numa_interleave 10964
numa_local 36927384
numa_other 117130
numa_pte_updates 84010995
numa_huge_pte_updates 0
numa_hint_faults 81697505
numa_hint_faults_local 21765799
numa_pages_migrated 32916316
pgmigrate_success 32916316
pgmigrate_fail 0

4.0-rc4:

numa_hit 23447345
numa_miss 47471
numa_foreign 47471
numa_interleave 10877
numa_local 23438564
numa_other 56252
numa_pte_updates 60901329
numa_huge_pte_updates 0
numa_hint_faults 58777092
numa_hint_faults_local 16478674
numa_pages_migrated 20075156
pgmigrate_success 20075156
pgmigrate_fail 0

Page migrations are still up by a factor of ~20 on 3.19.


> At the same time, I feel fairly happy about the situation - we at
> least understand what is going on, and the "3x worse performance" case
> is at least gone.  Even if that last case still looks horrible.
> 
> So it's still a bad performance regression, but at the same time I
> think your test setup (big 500 TB filesystem, but then a fake-numa
> thing with just 4GB per node) is specialized and unrealistic enough
> that I don't feel it's all that relevant from a *real-world*
> standpoint,

I don't buy it.

The regression is triggered by the search algorithm that
xfs_repair uses, and it's fairly common. It just uses a large hash
table which has at least a 50% miss rate (every I/O misses on the
initial lookup). The page faults are triggered by searching all
these pointer chasing misses.

IOWs the filesystem size under test is irrelevant - the amount of
metadata in the FS determines the xfs_repair memory footprint. The
fs size only determines concurrency, but this particular test case
(ag_stride=-1) turns off the concurrency.  Hence we'll see the same
problem with a 1TB filesystem with 50 million inodes in it, and
there's *lots* of those around.

Also, to address the "fake-numa" setup - the problem is node-local
allocation policy, not the size of the nodes.  The repair threads
wander all over the machine, even when there are only 8 threads
running (ag_stride=-1):

$ ps wauxH |grep  xfs_repair |wc -l
8
$

top - 07:31:08 up 24 min,  2 users,  load average: 1.75, 0.62, 0.69
Tasks: 240 total,   1 running, 239 sleeping,   0 stopped,   0 zombie
%Cpu0  :  3.2 us, 13.4 sy,  0.0 ni, 63.6 id, 19.8 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni, 99.7 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  1.4 us,  8.5 sy,  0.0 ni, 74.7 id, 15.4 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  1.0 us,  1.7 sy,  0.0 ni, 83.2 id, 14.1 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  0.0 us,  0.7 sy,  0.0 ni, 98.7 id,  0.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  4.0 us, 21.7 sy,  0.0 ni, 56.5 id, 17.8 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  1.0 us,  2.3 sy,  0.0 ni, 92.4 id,  4.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  2.0 us, 13.5 sy,  0.0 ni, 80.7 id,  3.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  : 18.8 us,  2.7 sy,  0.0 ni, 78.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  :  1.4 us, 10.2 sy,  0.0 ni, 87.4 id,  1.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :  2.4 us, 13.2 sy,  0.0 ni, 84.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :  1.0 us,  6.1 sy,  0.0 ni, 88.6 id,  4.4 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 :  0.7 us,  5.4 sy,  0.0 ni, 79.3 id, 14.6 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 :  0.3 us,  0.7 sy,  0.0 ni, 93.3 id,  5.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :  2.7 us,  9.3 sy,  0.0 ni, 87.7 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st

So, 8 threads doing work, 16 cpu cores, and only one fully idle
processor core in the machine over a 5s sample period.  Hence it
seems to me that process memory is getting sprayed over all nodes
because of the way the scheduler processes move around, not because
of the small memory size of the numa nodes.

> and so I wouldn't be uncomfortable saying "ok, the page
> table handling cleanup caused some issues, but we know about them and
> how to fix them longer-term".  So I don't consider this a 4.0
> showstopper or a "we need to revert for now" issue.

I don't consider it necessary of a revert, either, but I don't want
it swept under the table because of "workload not relevant"
arguments.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-17 20:51                             ` Dave Chinner
@ 2015-03-17 21:30                               ` Linus Torvalds
  2015-03-17 22:08                                 ` Dave Chinner
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-17 21:30 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Tue, Mar 17, 2015 at 1:51 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> On the -o ag_stride=-1 -o bhash=101073 config, the 60s perf stat I
> was using during steady state shows:
>
>      471,752      migrate:mm_migrate_pages ( +-  7.38% )
>
> The migrate pages rate is even higher than in 4.0-rc1 (~360,000)
> and 3.19 (~55,000), so that looks like even more of a problem than
> before.

Hmm. How stable are those numbers boot-to-boot?

That kind of extreme spread makes me suspicious. It's also interesting
that if the numbers really go up even more (and by that big amount),
then why does there seem to be almost no correlation with performance
(which apparently went up since rc1, despite migrate_pages getting
even _worse_).

> And the profile looks like:
>
> -   43.73%     0.05%  [kernel]            [k] native_flush_tlb_others

Ok, that's down from rc1 (67%), but still hugely up from 3.19 (13.7%).
And flush_tlb_page() does seem to be called about ten times more
(flush_tlb_mm_range used to be 1.4% of the callers, now it's invisible
at 0.13%)

Damn. From a performance number standpoint, it looked like we zoomed
in on the right thing. But now it's migrating even more pages than
before. Odd.

> And the vmstats are:
>
> 3.19:
>
> numa_hit 5163221
> numa_local 5153127

> 4.0-rc1:
>
> numa_hit 36952043
> numa_local 36927384
>
> 4.0-rc4:
>
> numa_hit 23447345
> numa_local 23438564
>
> Page migrations are still up by a factor of ~20 on 3.19.

The thing is, those "numa_hit" things come from the zone_statistics()
call in buffered_rmqueue(), which in turn is simple from the memory
allocator. That has *nothing* to do with virtual memory, and
everything to do with actual physical memory allocations.  So the load
is simply allocating a lot more pages, presumably for those stupid
migration events.

But then it doesn't correlate with performance anyway..

Can you do a simple stupid test? Apply that commit 53da3bc2ba9e ("mm:
fix up numa read-only thread grouping logic") to 3.19, so that it uses
the same "pte_dirty()" logic as 4.0-rc4. That *should* make the 3.19
and 4.0-rc4 numbers comparable.

It does make me wonder if your load is "chaotic" wrt scheduling. The
load presumably wants to spread out across all cpu's, but then the
numa code tries to group things together for numa accesses, but
depending on just random allocation patterns and layout in the hash
tables, there either are patters with page access or there aren't.

Which is kind of why I wonder how stable those numbers are boot to
boot. Maybe this is at least partly about lucky allocation patterns.

                              Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-17 21:30                               ` Linus Torvalds
@ 2015-03-17 22:08                                 ` Dave Chinner
  2015-03-18 16:08                                   ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-17 22:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Tue, Mar 17, 2015 at 02:30:57PM -0700, Linus Torvalds wrote:
> On Tue, Mar 17, 2015 at 1:51 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > On the -o ag_stride=-1 -o bhash=101073 config, the 60s perf stat I
> > was using during steady state shows:
> >
> >      471,752      migrate:mm_migrate_pages ( +-  7.38% )
> >
> > The migrate pages rate is even higher than in 4.0-rc1 (~360,000)
> > and 3.19 (~55,000), so that looks like even more of a problem than
> > before.
> 
> Hmm. How stable are those numbers boot-to-boot?

I've run the test several times but only profiles once so far.
runtimes were 7m45, 7m50, 7m44s, 8m2s, and the profiles came from
the 8m2s run.

reboot, run again:

$ sudo perf stat -a -r 6 -e migrate:mm_migrate_pages sleep 10

 Performance counter stats for 'system wide' (6 runs):

           572,839      migrate:mm_migrate_pages    ( +-  3.15% )

      10.001664694 seconds time elapsed             ( +-  0.00% )
$

And just to confirm, a minute later, still in phase 3:

	590,974      migrate:mm_migrate_pages       ( +-  2.86% )

Reboot, run again:

	575,344      migrate:mm_migrate_pages       ( +-  0.70% )

So there is boot-to-boot variation, but it doesn't look like it
gets any better....

> That kind of extreme spread makes me suspicious. It's also interesting
> that if the numbers really go up even more (and by that big amount),
> then why does there seem to be almost no correlation with performance
> (which apparently went up since rc1, despite migrate_pages getting
> even _worse_).
> 
> > And the profile looks like:
> >
> > -   43.73%     0.05%  [kernel]            [k] native_flush_tlb_others
> 
> Ok, that's down from rc1 (67%), but still hugely up from 3.19 (13.7%).
> And flush_tlb_page() does seem to be called about ten times more
> (flush_tlb_mm_range used to be 1.4% of the callers, now it's invisible
> at 0.13%)
> 
> Damn. From a performance number standpoint, it looked like we zoomed
> in on the right thing. But now it's migrating even more pages than
> before. Odd.

Throttling problem, like Mel originally suspected?

> > And the vmstats are:
> >
> > 3.19:
> >
> > numa_hit 5163221
> > numa_local 5153127
> 
> > 4.0-rc1:
> >
> > numa_hit 36952043
> > numa_local 36927384
> >
> > 4.0-rc4:
> >
> > numa_hit 23447345
> > numa_local 23438564
> >
> > Page migrations are still up by a factor of ~20 on 3.19.
> 
> The thing is, those "numa_hit" things come from the zone_statistics()
> call in buffered_rmqueue(), which in turn is simple from the memory
> allocator. That has *nothing* to do with virtual memory, and
> everything to do with actual physical memory allocations.  So the load
> is simply allocating a lot more pages, presumably for those stupid
> migration events.
> 
> But then it doesn't correlate with performance anyway..
>
> Can you do a simple stupid test? Apply that commit 53da3bc2ba9e ("mm:
> fix up numa read-only thread grouping logic") to 3.19, so that it uses
> the same "pte_dirty()" logic as 4.0-rc4. That *should* make the 3.19
> and 4.0-rc4 numbers comparable.

patched 3.19 numbers on this test are slightly worse than stock
3.19, but nowhere near as bad as 4.0-rc4:

	241,718      migrate:mm_migrate_pages		( +-  5.17% )

So that pte_write->pte_dirty change makes this go from ~55k to 240k,
and runtime go from 4m54s to 5m20s. vmstats:

numa_hit 9162476
numa_miss 0
numa_foreign 0
numa_interleave 10685
numa_local 9153740
numa_other 8736
numa_pte_updates 49582103
numa_huge_pte_updates 0
numa_hint_faults 48075098
numa_hint_faults_local 12974704
numa_pages_migrated 5748256
pgmigrate_success 5748256
pgmigrate_fail 0

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-17 22:08                                 ` Dave Chinner
@ 2015-03-18 16:08                                   ` Linus Torvalds
  2015-03-18 17:31                                     ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-18 16:08 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Tue, Mar 17, 2015 at 3:08 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> Damn. From a performance number standpoint, it looked like we zoomed
>> in on the right thing. But now it's migrating even more pages than
>> before. Odd.
>
> Throttling problem, like Mel originally suspected?

That doesn't much make sense for the original bisect you did, though.

Although if there are two different issues, maybe that bisect was
wrong. Or rather, incomplete.

>> Can you do a simple stupid test? Apply that commit 53da3bc2ba9e ("mm:
>> fix up numa read-only thread grouping logic") to 3.19, so that it uses
>> the same "pte_dirty()" logic as 4.0-rc4. That *should* make the 3.19
>> and 4.0-rc4 numbers comparable.
>
> patched 3.19 numbers on this test are slightly worse than stock
> 3.19, but nowhere near as bad as 4.0-rc4:
>
>         241,718      migrate:mm_migrate_pages           ( +-  5.17% )

Ok, that's still much worse than plain 3.19, which was ~55,000.
Assuming your memory/measurements were the same.

So apparently the pte_write() -> pte_dirty() check isn't equivalent at
all. My thinking that for the common case (ie private mappings) it
would be *exactly* the same, because all normal COW pages turn dirty
at the same time they turn writable (and, in page_mkclean_one(), turn
clean and read-only again at the same time). But if the numbers change
that much, then clearly my simplistic "they are the same in practice"
is just complete BS.

So why am I wrong? Why is testing for dirty not the same as testing
for writable?

I can see a few cases:

 - your load has lots of writable (but not written-to) shared memory,
and maybe the test should be something like

      pte_dirty(pte) || (vma->vm_flags & (VM_WRITE|VM_SHARED) ==
(VM_WRITE|VM_SHARED))

   and we really should have some helper function for this logic.

 - something completely different that I am entirely missing

What am I missing?

                          Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-18 16:08                                   ` Linus Torvalds
@ 2015-03-18 17:31                                     ` Linus Torvalds
  2015-03-18 22:23                                       ` Dave Chinner
                                                         ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-18 17:31 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Wed, Mar 18, 2015 at 9:08 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So why am I wrong? Why is testing for dirty not the same as testing
> for writable?
>
> I can see a few cases:
>
>  - your load has lots of writable (but not written-to) shared memory

Hmm. I tried to look at the xfsprog sources, and I don't see any
MAP_SHARED activity.  It looks like it's just using pread64/pwrite64,
and the only MAP_SHARED is for the xfsio mmap test thing, not for
xfsrepair.

So I don't see any shared mappings, but I don't know the code-base.

>  - something completely different that I am entirely missing

So I think there's something I'm missing. For non-shared mappings, I
still have the idea that pte_dirty should be the same as pte_write.
And yet, your testing of 3.19 shows that it's a big difference.
There's clearly something I'm completely missing.

                          Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-18 17:31                                     ` Linus Torvalds
@ 2015-03-18 22:23                                       ` Dave Chinner
  2015-03-19 14:10                                       ` Mel Gorman
  2015-03-19 21:41                                       ` Linus Torvalds
  2 siblings, 0 replies; 49+ messages in thread
From: Dave Chinner @ 2015-03-18 22:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Wed, Mar 18, 2015 at 10:31:28AM -0700, Linus Torvalds wrote:
> On Wed, Mar 18, 2015 at 9:08 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So why am I wrong? Why is testing for dirty not the same as testing
> > for writable?
> >
> > I can see a few cases:
> >
> >  - your load has lots of writable (but not written-to) shared memory
> 
> Hmm. I tried to look at the xfsprog sources, and I don't see any
> MAP_SHARED activity.  It looks like it's just using pread64/pwrite64,
> and the only MAP_SHARED is for the xfsio mmap test thing, not for
> xfsrepair.
> 
> So I don't see any shared mappings, but I don't know the code-base.

Right - all the mmap activity in the xfs_repair test is coming from
memory allocation through glibc - we don't use mmap() directly
anywhere in xfs_repair. FWIW, all the IO into these pages that are
allocated is being done via direct IO, if that makes any
difference...

> >  - something completely different that I am entirely missing
> 
> So I think there's something I'm missing. For non-shared mappings, I
> still have the idea that pte_dirty should be the same as pte_write.
> And yet, your testing of 3.19 shows that it's a big difference.
> There's clearly something I'm completely missing.

This level of pte interactions is beyond my level of knowledge, so
I'm afraid at this point I'm not going to be much help other than to
test patches and report the result.

FWIW, here's the distribution of the hash table we are iterating
over. There are a lot of search misses, which means we are doing a
lot of pointer chasing, but the distribution is centred directly
around the goal of 8 entries per chain and there is no long tail:

libxfs_bcache: 0x67e110
Max supported entries = 808584
Max utilized entries = 808584
Active entries = 808583
Hash table size = 101073
Hits = 9789987
Misses = 8224234
Hit ratio = 54.35
MRU 0 entries =   4667 (  0%)
MRU 1 entries =      0 (  0%)
MRU 2 entries =      4 (  0%)
MRU 3 entries = 797447 ( 98%)
MRU 4 entries =    653 (  0%)
MRU 5 entries =      0 (  0%)
MRU 6 entries =   2755 (  0%)
MRU 7 entries =   1518 (  0%)
MRU 8 entries =   1518 (  0%)
MRU 9 entries =      0 (  0%)
MRU 10 entries =     21 (  0%)
MRU 11 entries =      0 (  0%)
MRU 12 entries =      0 (  0%)
MRU 13 entries =      0 (  0%)
MRU 14 entries =      0 (  0%)
MRU 15 entries =      0 (  0%)
Hash buckets with   0 entries     30 (  0%)
Hash buckets with   1 entries    241 (  0%)
Hash buckets with   2 entries   1019 (  0%)
Hash buckets with   3 entries   2787 (  1%)
Hash buckets with   4 entries   5838 (  2%)
Hash buckets with   5 entries   9144 (  5%)
Hash buckets with   6 entries  12165 (  9%)
Hash buckets with   7 entries  14194 ( 12%)
Hash buckets with   8 entries  14387 ( 14%)
Hash buckets with   9 entries  12742 ( 14%)
Hash buckets with  10 entries  10253 ( 12%)
Hash buckets with  11 entries   7308 (  9%)
Hash buckets with  12 entries   4872 (  7%)
Hash buckets with  13 entries   2869 (  4%)
Hash buckets with  14 entries   1578 (  2%)
Hash buckets with  15 entries    894 (  1%)
Hash buckets with  16 entries    430 (  0%)
Hash buckets with  17 entries    188 (  0%)
Hash buckets with  18 entries     88 (  0%)
Hash buckets with  19 entries     24 (  0%)
Hash buckets with  20 entries     11 (  0%)
Hash buckets with  21 entries     10 (  0%)
Hash buckets with  22 entries      1 (  0%)


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-18 17:31                                     ` Linus Torvalds
  2015-03-18 22:23                                       ` Dave Chinner
@ 2015-03-19 14:10                                       ` Mel Gorman
  2015-03-19 18:09                                         ` Linus Torvalds
  2015-03-19 21:41                                       ` Linus Torvalds
  2 siblings, 1 reply; 49+ messages in thread
From: Mel Gorman @ 2015-03-19 14:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Wed, Mar 18, 2015 at 10:31:28AM -0700, Linus Torvalds wrote:
> >  - something completely different that I am entirely missing
> 
> So I think there's something I'm missing. For non-shared mappings, I
> still have the idea that pte_dirty should be the same as pte_write.
> And yet, your testing of 3.19 shows that it's a big difference.
> There's clearly something I'm completely missing.
> 

Minimally, there is still the window where we clear the PTE to set the
protections. During that window, a fault can occur. In the old code which
was inherently racy and unsafe, the fault might still go ahead deferring
a potential migration for a short period. In the current code, it'll stall
on the lock, notice the PTE is changed and refault so the overhead is very
different but functionally correct.

In the old code, pte_write had complex interactions with background
cleaning and sync in the case of file mappings (not applicable to Dave's
case but still it's unpredictable behaviour). pte_dirty is close but there
are interactions with the application as the timing of writes vs the PTE
scanner matter.

Even if we restored the original behaviour, it would still be very difficult
to understand all the interactions between userspace and kernel.  The patch
below should be tested because it's clearer what the intent is. Using
the VMA flags is coarse but it's not vulnerable to timing artifacts that
behave differently depending on the machine. My preliminary testing shows
it helps but not by much. It does not restore performance to where it was
but it's easier to understand which is important if there are changes in
the scheduler later.

In combination, I also think that slowing PTE scanning when migration fails
is the correct action even if it is unrelated to the patch Dave bisected
to. It's stupid to increase scanning rates and incurs more faults when
migrations are failing so I'll be testing that next.

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 626e93db28ba..2f12e9fcf1a2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1291,17 +1291,8 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		flags |= TNF_FAULT_LOCAL;
 	}
 
-	/*
-	 * Avoid grouping on DSO/COW pages in specific and RO pages
-	 * in general, RO pages shouldn't hurt as much anyway since
-	 * they can be in shared cache state.
-	 *
-	 * FIXME! This checks "pmd_dirty()" as an approximation of
-	 * "is this a read-only page", since checking "pmd_write()"
-	 * is even more broken. We haven't actually turned this into
-	 * a writable page, so pmd_write() will always be false.
-	 */
-	if (!pmd_dirty(pmd))
+	/* See similar comment in do_numa_page for explanation */
+	if (!(vma->vm_flags & VM_WRITE))
 		flags |= TNF_NO_GROUP;
 
 	/*
diff --git a/mm/memory.c b/mm/memory.c
index 411144f977b1..20beb6647dba 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3069,16 +3069,19 @@ static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	}
 
 	/*
-	 * Avoid grouping on DSO/COW pages in specific and RO pages
-	 * in general, RO pages shouldn't hurt as much anyway since
-	 * they can be in shared cache state.
+	 * Avoid grouping on RO pages in general. RO pages shouldn't hurt as
+	 * much anyway since they can be in shared cache state. This misses
+	 * the case where a mapping is writable but the process never writes
+	 * to it but pte_write gets cleared during protection updates and
+	 * pte_dirty has unpredictable behaviour between PTE scan updates,
+	 * background writeback, dirty balancing and application behaviour.
 	 *
-	 * FIXME! This checks "pmd_dirty()" as an approximation of
-	 * "is this a read-only page", since checking "pmd_write()"
-	 * is even more broken. We haven't actually turned this into
-	 * a writable page, so pmd_write() will always be false.
+	 * TODO: Note that the ideal here would be to avoid a situation where a
+	 * NUMA fault is taken immediately followed by a write fault in
+	 * some cases which would have lower overhead overall but would be
+	 * invasive as the fault paths would need to be unified.
 	 */
-	if (!pte_dirty(pte))
+	if (!(vma->vm_flags & VM_WRITE))
 		flags |= TNF_NO_GROUP;
 
 	/*

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-19 14:10                                       ` Mel Gorman
@ 2015-03-19 18:09                                         ` Linus Torvalds
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-19 18:09 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 7:10 AM, Mel Gorman <mgorman@suse.de> wrote:
> -       if (!pmd_dirty(pmd))
> +       /* See similar comment in do_numa_page for explanation */
> +       if (!(vma->vm_flags & VM_WRITE))

Yeah, that would certainly be a whole lot more obvious than all the
"if this particular pte/pmd looks like X" tests.

So that, together with scanning rate improvements (this *does* seem to
be somewhat chaotic, so it's quite possible that the current scanning
rate thing is just fairly unstable) is likely the right thing. I'd
just like to _understand_ why that write/dirty bit makes such a
difference. I thought I understood what was going on, and was happy,
and then Dave come with his crazy numbers.

Damn you Dave, and damn your numbers and "facts" and stuff. Sometimes
I much prefer ignorant bliss.

                           Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-18 17:31                                     ` Linus Torvalds
  2015-03-18 22:23                                       ` Dave Chinner
  2015-03-19 14:10                                       ` Mel Gorman
@ 2015-03-19 21:41                                       ` Linus Torvalds
  2015-03-19 22:41                                         ` Dave Chinner
  2 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-19 21:41 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Wed, Mar 18, 2015 at 10:31 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So I think there's something I'm missing. For non-shared mappings, I
> still have the idea that pte_dirty should be the same as pte_write.
> And yet, your testing of 3.19 shows that it's a big difference.
> There's clearly something I'm completely missing.

Ahh. The normal page table scanning and page fault handling both clear
and set the dirty bit together with the writable one. But "fork()"
will clear the writable bit without clearing dirty. For some reason I
thought it moved the dirty bit into the struct page like the VM
scanning does, but that was just me having a brainfart. So yeah,
pte_dirty doesn't have to match pte_write even under perfectly normal
circumstances. Maybe there are other cases.

Not that I see a lot of forking in the xfs repair case either, so..

Dave, mind re-running the plain 3.19 numbers to really verify that the
pte_dirty/pte_write change really made that big of a difference. Maybe
your recollection of ~55,000 migrate_pages events was faulty. If the
pte_write ->pte_dirty change is the *only* difference, it's still very
odd how that one difference would make migrate_rate go from ~55k to
471k. That's an order of magnitude difference, for what really
shouldn't be a big change.

I'm running a kernel right now with a hacky "update_mmu_cache()" that
warns if pte_dirty is ever different from pte_write().

+void update_mmu_cache(struct vm_area_struct *vma,
+               unsigned long addr, pte_t *ptep)
+{
+       if (!(vma->vm_flags & VM_SHARED)) {
+               pte_t now = READ_ONCE(*ptep);
+               if (!pte_write(now) != !pte_dirty(now)) {
+                       static int count = 20;
+                       static unsigned int prev = 0;
+                       unsigned int val = pte_val(now) & 0xfff;
+                       if (prev != val && count) {
+                               prev = val;
+                               count--;
+                               WARN(1, "pte value %x", val);
+                       }
+               }
+       }
+}

I haven't seen a single warning so far (and there I wrote all that
code to limit repeated warnings), although admittedly
update_mu_cache() isn't called for all cases where we change a pte
(not for the fork case, for example). But it *is* called for the page
faulting cases

Maybe a system update has changed libraries and memory allocation
patterns, and there is something bigger than that one-liner
pte_dirty/write change going on?

                             Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-19 21:41                                       ` Linus Torvalds
@ 2015-03-19 22:41                                         ` Dave Chinner
  2015-03-19 23:05                                           ` Linus Torvalds
  0 siblings, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-19 22:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 02:41:48PM -0700, Linus Torvalds wrote:
> On Wed, Mar 18, 2015 at 10:31 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So I think there's something I'm missing. For non-shared mappings, I
> > still have the idea that pte_dirty should be the same as pte_write.
> > And yet, your testing of 3.19 shows that it's a big difference.
> > There's clearly something I'm completely missing.
> 
> Ahh. The normal page table scanning and page fault handling both clear
> and set the dirty bit together with the writable one. But "fork()"
> will clear the writable bit without clearing dirty. For some reason I
> thought it moved the dirty bit into the struct page like the VM
> scanning does, but that was just me having a brainfart. So yeah,
> pte_dirty doesn't have to match pte_write even under perfectly normal
> circumstances. Maybe there are other cases.
> 
> Not that I see a lot of forking in the xfs repair case either, so..
> 
> Dave, mind re-running the plain 3.19 numbers to really verify that the
> pte_dirty/pte_write change really made that big of a difference. Maybe
> your recollection of ~55,000 migrate_pages events was faulty. If the
> pte_write ->pte_dirty change is the *only* difference, it's still very
> odd how that one difference would make migrate_rate go from ~55k to
> 471k. That's an order of magnitude difference, for what really
> shouldn't be a big change.

My recollection wasn't faulty - I pulled it from an earlier email.
That said, the original measurement might have been faulty. I ran
the numbers again on the 3.19 kernel I saved away from the original
testing. That came up at 235k, which is pretty much the same as
yesterday's test. The runtime,however, is unchanged from my original
measurements of 4m54s (pte_hack came in at 5m20s).

Wondering where the 55k number came from, I played around with when
I started the measurement - all the numbers since I did the bisect
have come from starting it at roughly 130AGs into phase 3 where the
memory footprint stabilises and the tlb flush overhead kicks in.

However, if I start the measurement at the same time as the repair
test, I get something much closer to the 55k number. I also note
that my original 4.0-rc1 numbers were much lower than the more
recent steady state measurements (360k vs 470k), so I'd say the
original numbers weren't representative of the steady state
behaviour and so can be ignored...

> Maybe a system update has changed libraries and memory allocation
> patterns, and there is something bigger than that one-liner
> pte_dirty/write change going on?

Possibly. The xfs_repair binary has definitely been rebuilt (testing
unrelated bug fixes that only affect phase 6/7 behaviour), but
otherwise the system libraries are unchanged.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-19 22:41                                         ` Dave Chinner
@ 2015-03-19 23:05                                           ` Linus Torvalds
  2015-03-19 23:23                                             ` Dave Chinner
                                                               ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-19 23:05 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 3:41 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> My recollection wasn't faulty - I pulled it from an earlier email.
> That said, the original measurement might have been faulty. I ran
> the numbers again on the 3.19 kernel I saved away from the original
> testing. That came up at 235k, which is pretty much the same as
> yesterday's test. The runtime,however, is unchanged from my original
> measurements of 4m54s (pte_hack came in at 5m20s).

Ok. Good. So the "more than an order of magnitude difference" was
really about measurement differences, not quite as real. Looks like
more a "factor of two" than a factor of 20.

Did you do the profiles the same way? Because that would explain the
differences in the TLB flush percentages too (the "1.4% from
tlb_invalidate_range()" vs "pretty much everything from migration").

The runtime variation does show that there's some *big* subtle
difference for the numa balancing in the exact TNF_NO_GROUP details.
It must be *very* unstable for it to make that big of a difference.
But I feel at least a *bit* better about "unstable algorithm changes a
small varioation into a factor-of-two" vs that crazy factor-of-20.

Can you try Mel's change to make it use

        if (!(vma->vm_flags & VM_WRITE))

instead of the pte details? Again, on otherwise plain 3.19, just so
that we have a baseline. I'd be *so* much happer with checking the vma
details over per-pte details, especially ones that change over the
lifetime of the pte entry, and the NUMA code explicitly mucks with.

                           Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-19 23:05                                           ` Linus Torvalds
@ 2015-03-19 23:23                                             ` Dave Chinner
  2015-03-20  0:23                                             ` Dave Chinner
  2015-03-20  9:56                                             ` Mel Gorman
  2 siblings, 0 replies; 49+ messages in thread
From: Dave Chinner @ 2015-03-19 23:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 04:05:46PM -0700, Linus Torvalds wrote:
> On Thu, Mar 19, 2015 at 3:41 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > My recollection wasn't faulty - I pulled it from an earlier email.
> > That said, the original measurement might have been faulty. I ran
> > the numbers again on the 3.19 kernel I saved away from the original
> > testing. That came up at 235k, which is pretty much the same as
> > yesterday's test. The runtime,however, is unchanged from my original
> > measurements of 4m54s (pte_hack came in at 5m20s).
> 
> Ok. Good. So the "more than an order of magnitude difference" was
> really about measurement differences, not quite as real. Looks like
> more a "factor of two" than a factor of 20.
> 
> Did you do the profiles the same way? Because that would explain the
> differences in the TLB flush percentages too (the "1.4% from
> tlb_invalidate_range()" vs "pretty much everything from migration").

No, the profiles all came from steady state. The profiles from the
initial startup phase hammer the mmap_sem because of page fault vs
mprotect contention (glibc runs mprotect() on every chunk of
memory it allocates). It's not until the cache reaches "full" and it
starts recycling old buffers rather than allocating new ones that
the tlb flush problem dominates the profiles.

> The runtime variation does show that there's some *big* subtle
> difference for the numa balancing in the exact TNF_NO_GROUP details.
> It must be *very* unstable for it to make that big of a difference.
> But I feel at least a *bit* better about "unstable algorithm changes a
> small varioation into a factor-of-two" vs that crazy factor-of-20.
> 
> Can you try Mel's change to make it use
> 
>         if (!(vma->vm_flags & VM_WRITE))
> 
> instead of the pte details? Again, on otherwise plain 3.19, just so
> that we have a baseline. I'd be *so* much happer with checking the vma
> details over per-pte details, especially ones that change over the
> lifetime of the pte entry, and the NUMA code explicitly mucks with.

Yup, will do. might take an hour or two before I get to it, though...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-19 23:05                                           ` Linus Torvalds
  2015-03-19 23:23                                             ` Dave Chinner
@ 2015-03-20  0:23                                             ` Dave Chinner
  2015-03-20  1:29                                               ` Linus Torvalds
  2015-03-20  9:56                                             ` Mel Gorman
  2 siblings, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-20  0:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 04:05:46PM -0700, Linus Torvalds wrote:
> Can you try Mel's change to make it use
> 
>         if (!(vma->vm_flags & VM_WRITE))
> 
> instead of the pte details? Again, on otherwise plain 3.19, just so
> that we have a baseline. I'd be *so* much happer with checking the vma
> details over per-pte details, especially ones that change over the
> lifetime of the pte entry, and the NUMA code explicitly mucks with.

$ sudo perf_3.18 stat -a -r 6 -e migrate:mm_migrate_pages sleep 10

 Performance counter stats for 'system wide' (6 runs):

    266,750      migrate:mm_migrate_pages ( +-  7.43% )

	  10.002032292 seconds time elapsed ( +-  0.00% )

Bit more variance there than the pte checking, but runtime
difference is in the noise - 5m4s vs 4m54s - and profiles are
identical to the pte checking version.

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-20  0:23                                             ` Dave Chinner
@ 2015-03-20  1:29                                               ` Linus Torvalds
  2015-03-20  4:13                                                 ` Dave Chinner
  2015-03-20 10:12                                                 ` Mel Gorman
  0 siblings, 2 replies; 49+ messages in thread
From: Linus Torvalds @ 2015-03-20  1:29 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 5:23 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> Bit more variance there than the pte checking, but runtime
> difference is in the noise - 5m4s vs 4m54s - and profiles are
> identical to the pte checking version.

Ahh, so that "!(vma->vm_flags & VM_WRITE)" test works _almost_ as well
as the original !pte_write() test.

Now, can you check that on top of rc4? If I've gotten everything
right, we now have:

 - plain 3.19 (pte_write): 4m54s
 - 3.19 with vm_flags & VM_WRITE: 5m4s
 - 3.19 with pte_dirty: 5m20s

so the pte_dirty version seems to have been a bad choice indeed.

For 4.0-rc4, (which uses pte_dirty) you had 7m50s, so it's still
_much_ worse, but I'm wondering whether that VM_WRITE test will at
least shrink the difference like it does for 3.19.

And the VM_WRITE test should be stable and not have any subtle
interaction with the other changes that the numa pte things
introduced. It would be good to see if the profiles then pop something
*else* up as the performance difference (which I'm sure will remain,
since the 7m50s was so far off).

                        Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-20  1:29                                               ` Linus Torvalds
@ 2015-03-20  4:13                                                 ` Dave Chinner
  2015-03-20 17:02                                                   ` Linus Torvalds
  2015-03-20 10:12                                                 ` Mel Gorman
  1 sibling, 1 reply; 49+ messages in thread
From: Dave Chinner @ 2015-03-20  4:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 06:29:47PM -0700, Linus Torvalds wrote:
> On Thu, Mar 19, 2015 at 5:23 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > Bit more variance there than the pte checking, but runtime
> > difference is in the noise - 5m4s vs 4m54s - and profiles are
> > identical to the pte checking version.
> 
> Ahh, so that "!(vma->vm_flags & VM_WRITE)" test works _almost_ as well
> as the original !pte_write() test.
> 
> Now, can you check that on top of rc4? If I've gotten everything
> right, we now have:
> 
>  - plain 3.19 (pte_write): 4m54s
>  - 3.19 with vm_flags & VM_WRITE: 5m4s
>  - 3.19 with pte_dirty: 5m20s

*nod*

> so the pte_dirty version seems to have been a bad choice indeed.
> 
> For 4.0-rc4, (which uses pte_dirty) you had 7m50s, so it's still
> _much_ worse, but I'm wondering whether that VM_WRITE test will at
> least shrink the difference like it does for 3.19.

Testing now. It's a bit faster - three runs gave 7m35s, 7m20s and
7m36s. IOWs's a bit better, but not significantly. page migrations
are pretty much unchanged, too:

	   558,632      migrate:mm_migrate_pages ( +-  6.38% )

> And the VM_WRITE test should be stable and not have any subtle
> interaction with the other changes that the numa pte things
> introduced. It would be good to see if the profiles then pop something
> *else* up as the performance difference (which I'm sure will remain,
> since the 7m50s was so far off).

No, nothing new pops up in the kernel profiles. All the system CPU
time is still being spent sending IPIs on the tlb flush path.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-19 23:05                                           ` Linus Torvalds
  2015-03-19 23:23                                             ` Dave Chinner
  2015-03-20  0:23                                             ` Dave Chinner
@ 2015-03-20  9:56                                             ` Mel Gorman
  2 siblings, 0 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-20  9:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 04:05:46PM -0700, Linus Torvalds wrote:
> On Thu, Mar 19, 2015 at 3:41 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > My recollection wasn't faulty - I pulled it from an earlier email.
> > That said, the original measurement might have been faulty. I ran
> > the numbers again on the 3.19 kernel I saved away from the original
> > testing. That came up at 235k, which is pretty much the same as
> > yesterday's test. The runtime,however, is unchanged from my original
> > measurements of 4m54s (pte_hack came in at 5m20s).
> 
> Ok. Good. So the "more than an order of magnitude difference" was
> really about measurement differences, not quite as real. Looks like
> more a "factor of two" than a factor of 20.
> 
> Did you do the profiles the same way? Because that would explain the
> differences in the TLB flush percentages too (the "1.4% from
> tlb_invalidate_range()" vs "pretty much everything from migration").
> 
> The runtime variation does show that there's some *big* subtle
> difference for the numa balancing in the exact TNF_NO_GROUP details.

TNF_NO_GROUP affects whether the scheduler tries to group related processes
together. Whether migration occurs depends on what node a process is
scheduled on. If processes are aggressively grouped inappropriately then it
is possible there is a bug that causes the load balancer to move processes
off a node (possible migration) with NUMA balancing trying to pull it back
(another possible migration). Small bugs there can result in excessive
migration.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-20  1:29                                               ` Linus Torvalds
  2015-03-20  4:13                                                 ` Dave Chinner
@ 2015-03-20 10:12                                                 ` Mel Gorman
  1 sibling, 0 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-20 10:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 06:29:47PM -0700, Linus Torvalds wrote:
> And the VM_WRITE test should be stable and not have any subtle
> interaction with the other changes that the numa pte things
> introduced. It would be good to see if the profiles then pop something
> *else* up as the performance difference (which I'm sure will remain,
> since the 7m50s was so far off).
> 

As a side-note, I did test a patch that checked pte_write and preserved
it across both faults and setting the protections. It did not alter
migration activity much but there was a  drop in minor faults - 20% drop in
autonumabench, 58% drop in xfsrepair workload. I'm assuming this is due to
refaults to mark pages writable.  The patch looks and is hacky so I won't
post it to save people bleaching their eyes. I'll spend some time soon
(hopefully today) at a smooth way of falling through to WP checks after
trapping a NUMA fault.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-20  4:13                                                 ` Dave Chinner
@ 2015-03-20 17:02                                                   ` Linus Torvalds
  2015-03-23 12:01                                                     ` Mel Gorman
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Torvalds @ 2015-03-20 17:02 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Thu, Mar 19, 2015 at 9:13 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> Testing now. It's a bit faster - three runs gave 7m35s, 7m20s and
> 7m36s. IOWs's a bit better, but not significantly. page migrations
> are pretty much unchanged, too:
>
>            558,632      migrate:mm_migrate_pages ( +-  6.38% )

Ok. That was kind of the expected thing.

I don't really know the NUMA fault rate limiting code, but one thing
that strikes me is that if it tries to balance the NUMA faults against
the *regular* faults, then maybe just the fact that we end up taking
more COW faults after a NUMA fault then means that the NUMA rate
limiting code now gets over-eager (because it sees all those extra
non-numa faults).

Mel, does that sound at all possible? I really have never looked at
the magic automatic rate handling..

                         Linus

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
  2015-03-20 17:02                                                   ` Linus Torvalds
@ 2015-03-23 12:01                                                     ` Mel Gorman
  0 siblings, 0 replies; 49+ messages in thread
From: Mel Gorman @ 2015-03-23 12:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Ingo Molnar, Andrew Morton, Aneesh Kumar,
	Linux Kernel Mailing List, Linux-MM, xfs, ppc-dev

On Fri, Mar 20, 2015 at 10:02:23AM -0700, Linus Torvalds wrote:
> On Thu, Mar 19, 2015 at 9:13 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > Testing now. It's a bit faster - three runs gave 7m35s, 7m20s and
> > 7m36s. IOWs's a bit better, but not significantly. page migrations
> > are pretty much unchanged, too:
> >
> >            558,632      migrate:mm_migrate_pages ( +-  6.38% )
> 
> Ok. That was kind of the expected thing.
> 
> I don't really know the NUMA fault rate limiting code, but one thing
> that strikes me is that if it tries to balance the NUMA faults against
> the *regular* faults, then maybe just the fact that we end up taking
> more COW faults after a NUMA fault then means that the NUMA rate
> limiting code now gets over-eager (because it sees all those extra
> non-numa faults).
> 
> Mel, does that sound at all possible? I really have never looked at
> the magic automatic rate handling..
> 

It should not be trying to balance against regular faults as it has no
information on it. The trapping of additional faults to mark the PTE
writable will alter timing so it indirectly affects how many migration
faults there but this is only a side-effect IMO.

There is more overhead now due to losing the writable information and
that should be reduced so I tried a few approaches.  Ultimately, the one
that performed the best and was easiest to understand simply preserved
the writable bit across the protection update and page fault. I'll post
it later when I stick a changelog on it.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2015-03-23 12:01 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-07 15:20 [RFC PATCH 0/4] Automatic NUMA balancing and PROT_NONE handling followup v2r8 Mel Gorman
2015-03-07 15:20 ` [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd Mel Gorman
2015-03-07 20:13   ` Linus Torvalds
2015-03-07 20:31   ` Linus Torvalds
2015-03-07 20:56     ` Mel Gorman
2015-03-07 15:20 ` [PATCH 2/4] mm: numa: Remove migrate_ratelimited Mel Gorman
2015-03-07 15:20 ` [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults Mel Gorman
2015-03-07 18:33   ` Linus Torvalds
2015-03-07 18:42     ` Linus Torvalds
2015-03-07 15:20 ` [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur Mel Gorman
2015-03-07 16:36   ` Ingo Molnar
2015-03-07 17:37     ` Mel Gorman
2015-03-08  9:54       ` Ingo Molnar
2015-03-07 19:12     ` Linus Torvalds
2015-03-08 10:02       ` Ingo Molnar
2015-03-08 18:35         ` Linus Torvalds
2015-03-08 18:46           ` Linus Torvalds
2015-03-09 11:29           ` Dave Chinner
2015-03-09 16:52             ` Linus Torvalds
2015-03-09 19:19               ` Dave Chinner
2015-03-10 23:55                 ` Linus Torvalds
2015-03-12 13:10                   ` Mel Gorman
2015-03-12 16:20                     ` Linus Torvalds
2015-03-12 18:49                       ` Mel Gorman
2015-03-17  7:06                         ` Dave Chinner
2015-03-17 16:53                           ` Linus Torvalds
2015-03-17 20:51                             ` Dave Chinner
2015-03-17 21:30                               ` Linus Torvalds
2015-03-17 22:08                                 ` Dave Chinner
2015-03-18 16:08                                   ` Linus Torvalds
2015-03-18 17:31                                     ` Linus Torvalds
2015-03-18 22:23                                       ` Dave Chinner
2015-03-19 14:10                                       ` Mel Gorman
2015-03-19 18:09                                         ` Linus Torvalds
2015-03-19 21:41                                       ` Linus Torvalds
2015-03-19 22:41                                         ` Dave Chinner
2015-03-19 23:05                                           ` Linus Torvalds
2015-03-19 23:23                                             ` Dave Chinner
2015-03-20  0:23                                             ` Dave Chinner
2015-03-20  1:29                                               ` Linus Torvalds
2015-03-20  4:13                                                 ` Dave Chinner
2015-03-20 17:02                                                   ` Linus Torvalds
2015-03-23 12:01                                                     ` Mel Gorman
2015-03-20 10:12                                                 ` Mel Gorman
2015-03-20  9:56                                             ` Mel Gorman
2015-03-08 20:40         ` Mel Gorman
2015-03-09 21:02           ` Mel Gorman
2015-03-10 13:08             ` Mel Gorman
2015-03-08  9:41   ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).