linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 00/24] Speculative page faults
@ 2018-03-13 17:59 Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
                   ` (26 more replies)
  0 siblings, 27 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

This is a port on kernel 4.16 of the work done by Peter Zijlstra to
handle page fault without holding the mm semaphore [1].

The idea is to try to handle user space page faults without holding the
mmap_sem. This should allow better concurrency for massively threaded
process since the page fault handler will not wait for other threads memory
layout change to be done, assuming that this change is done in another part
of the process's memory space. This type page fault is named speculative
page fault. If the speculative page fault fails because of a concurrency is
detected or because underlying PMD or PTE tables are not yet allocating, it
is failing its processing and a classic page fault is then tried.

The speculative page fault (SPF) has to look for the VMA matching the fault
address without holding the mmap_sem, this is done by introducing a rwlock
which protects the access to the mm_rb tree. Previously this was done using
SRCU but it was introducing a lot of scheduling to process the VMA's
freeing
operation which was hitting the performance by 20% as reported by Kemi Wang
[2].Using a rwlock to protect access to the mm_rb tree is limiting the
locking contention to these operations which are expected to be in a O(log
n)
order. In addition to ensure that the VMA is not freed in our back a
reference count is added and 2 services (get_vma() and put_vma()) are
introduced to handle the reference count. When a VMA is fetch from the RB
tree using get_vma() is must be later freeed using put_vma(). Furthermore,
to allow the VMA to be used again by the classic page fault handler a
service is introduced can_reuse_spf_vma(). This service is expected to be
called with the mmap_sem hold. It checked that the VMA is still matching
the specified address and is releasing its reference count as the mmap_sem
is hold it is ensure that it will not be freed in our back. In general, the
VMA's reference count could be decremented when holding the mmap_sem but it
should not be increased as holding the mmap_sem is ensuring that the VMA is
stable. I can't see anymore the overhead I got while will-it-scale
benchmark anymore.

The VMA's attributes checked during the speculative page fault processing
have to be protected against parallel changes. This is done by using a per
VMA sequence lock. This sequence lock allows the speculative page fault
handler to fast check for parallel changes in progress and to abort the
speculative page fault in that case.

Once the VMA is found, the speculative page fault handler would check for
the VMA's attributes to verify that the page fault has to be handled
correctly or not. Thus the VMA is protected through a sequence lock which
allows fast detection of concurrent VMA changes. If such a change is
detected, the speculative page fault is aborted and a *classic* page fault
is tried.  VMA sequence lockings are added when VMA attributes which are
checked during the page fault are modified.

When the PTE is fetched, the VMA is checked to see if it has been changed,
so once the page table is locked, the VMA is valid, so any other changes
leading to touching this PTE will need to lock the page table, so no
parallel change is possible at this time.

The locking of the PTE is done with interrupts disabled, this allows to
check for the PMD to ensure that there is not an ongoing collapsing
operation. Since khugepaged is firstly set the PMD to pmd_none and then is
waiting for the other CPU to have catch the IPI interrupt, if the pmd is
valid at the time the PTE is locked, we have the guarantee that the
collapsing opertion will have to wait on the PTE lock to move foward. This
allows the SPF handler to map the PTE safely. If the PMD value is different
than the one recorded at the beginning of the SPF operation, the classic
page fault handler will be called to handle the operation while holding the
mmap_sem. As the PTE lock is done with the interrupts disabled, the lock is
done using spin_trylock() to avoid dead lock when handling a page fault
while a TLB invalidate is requested by an other CPU holding the PTE.

Support for THP is not done because when checking for the PMD, we can be
confused by an in progress collapsing operation done by khugepaged. The
issue is that pmd_none() could be true either if the PMD is not already
populated or if the underlying PTE are in the way to be collapsed. So we
cannot safely allocate a PMD if pmd_none() is true.

This series a new software performance event named 'speculative-faults' or
'spf'. It counts the number of successful page fault event handled in a
speculative way. When recording 'faults,spf' events, the faults one is
counting the total number of page fault events while 'spf' is only counting
the part of the faults processed in a speculative way.

There are some trace events introduced by this series. They allow to
identify why the page faults where not processed in a speculative way. This
doesn't take in account the faults generated by a monothreaded process
which directly processed while holding the mmap_sem. This trace events are
grouped in a system named 'pagefault', they are:
 - pagefault:spf_pte_lock : if the pte was already locked by another thread
 - pagefault:spf_vma_changed : if the VMA has been changed in our back
 - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set.
 - pagefault:spf_vma_notsup : the VMA's type is not supported
 - pagefault:spf_vma_access : the VMA's access right are not respected
 - pagefault:spf_pmd_changed : the upper PMD pointer has changed in our
 back.

To record all the related events, the easier is to run perf with the
following arguments :
$ perf stat -e 'faults,spf,pagefault:*' <command>

This series builds on top of v4.16-rc2-mmotm-2018-02-21-14-48 and is
functional on x86 and PowerPC.

---------------------
Real Workload results

As mentioned in previous email, we did non official runs using a "popular
in memory multithreaded database product" on 176 cores SMT8 Power system
which showed a 30% improvements in the number of transaction processed per
second. This run has been done on the v6 series, but changes introduced in
this new verion should not impact the performance boost seen.

Here are the perf data captured during 2 of these runs on top of the v8
series:
		vanilla		spf
faults		89.418		101.364		
spf                n/a		 97.989

With the SPF kernel, most of the page fault were processed in a speculative
way.

------------------
Benchmarks results

Base kernel is v4.16-rc4-mmotm-2018-03-09-16-34
SPF is BASE + this series

Kernbench:
----------
Here are the results on a 16 CPUs X86 guest using kernbench on a 4.13-rc4
kernel (kernel is build 5 times):

Average	Half load -j 8
		 Run	(std deviation)
		 BASE			SPF
Elapsed	Time	 151.36	 (1.40139)	151.748	(1.09716)	0.26%
User	Time	 1023.19 (3.58972)	1027.35	(2.30396)	0.41%
System	Time	 125.026 (1.8547)	124.504	(0.980015)	-0.42%
Percent	CPU	 758.2	 (5.54076)	758.6	(3.97492)	0.05%
Context	Switches 54924	 (453.634)	54851	(382.293)	-0.13%
Sleeps		 105589	 (704.581)	105282	(435.502)	-0.29%
						
Average	Optimal	load -j	16
		 Run	(std deviation)
		 BASE			SPF
Elapsed	Time	 74.804	 (1.25139)	74.368	(0.406288)	-0.58%
User	Time	 962.033 (64.5125)	963.93	(66.8797)	0.20%
System	Time	 110.771 (15.0817)	110.387	(14.8989)	-0.35%
Percent	CPU	 1045.7	 (303.387)	1049.1	(306.255)	0.33%
Context	Switches 76201.8 (22433.1)	76170.4	(22482.9)	-0.04%
Sleeps		 110289	 (5024.05)	110220	(5248.58)	-0.06%

During a run on the SPF, perf events were captured:
 Performance counter stats for '../kernbench -M':
         510334017      faults
               200      spf
                 0      pagefault:spf_pte_lock
                 0      pagefault:spf_vma_changed
                 0      pagefault:spf_vma_noanon
              2174      pagefault:spf_vma_notsup
                 0      pagefault:spf_vma_access
                 0      pagefault:spf_pmd_changed

Very few speculative page fault were recorded as most of the processes
involved are monothreaded (sounds that on this architecture some threads
were created during the kernel build processing).

Here are the kerbench results on a 80 CPUs Power8 system:

Average	Half load -j 40
		 Run	(std deviation)
		 BASE			SPF
Elapsed	Time	 116.958 (0.73401)	117.43	(0.927497)	0.40%
User	Time	 4472.35 (7.85792)	4480.16	(19.4909)	0.17%
System	Time	 136.248 (0.587639)	136.922	(1.09058)	0.49%
Percent	CPU	 3939.8	 (20.6567)	3931.2	(17.2829)	-0.22%
Context	Switches 92445.8 (236.672)	92720.8	(270.118)	0.30%
Sleeps		 318475	 (1412.6)	317996	(1819.07)	-0.15%
						
Average	Optimal	load -j	80
		 Run	(std deviation)
		 BASE			SPF
Elapsed	Time	 106.976 (0.406731)	107.72	(0.329014)	0.70%
User	Time	 5863.47 (1466.45)	5865.38	(1460.27)	0.03%
System	Time	 159.995 (25.0393)	160.329	(24.6921)	0.21%
Percent	CPU	 5446.2	 (1588.23)	5416	(1565.34)	-0.55%
Context	Switches 223018	 (137637)	224867	(139305)	0.83%
Sleeps		 330846	 (13127.3)	332348	(15556.9)	0.45%

During a run on the SPF, perf events were captured:
 Performance counter stats for '../kernbench -M':
         116612488      faults
                 0      spf
                 0      pagefault:spf_pte_lock
                 0      pagefault:spf_vma_changed
                 0      pagefault:spf_vma_noanon
	       473      pagefault:spf_vma_notsup
               	 0      pagefault:spf_vma_access
                 0      pagefault:spf_pmd_changed

Most of the processes involved are monothreaded so SPF is not activated but
there is no impact on the performance.

Ebizzy:
-------
The test is counting the number of records per second it can manage, the
higher is the best. I run it like this 'ebizzy -mTRp'. To get consistent
result I repeated the test 100 times and measure the average result. The
number is the record processes per second, the higher is the best.

  		BASE		SPF		delta	
16 CPUs x86 VM	14902.6		95905.16	543.55%
80 CPUs P8 node	37240.24	78185.67	109.95%

Here are the performance counter read during a run on a 16 CPUs x86 VM:
 Performance counter stats for './ebizzy -mRTp':
            888157      faults
            884773      spf
                92      pagefault:spf_pte_lock
              2379      pagefault:spf_vma_changed
                 0      pagefault:spf_vma_noanon
                80      pagefault:spf_vma_notsup
                 0      pagefault:spf_vma_access
                 0      pagefault:spf_pmd_changed

And the ones captured during a run on a 80 CPUs Power node:
 Performance counter stats for './ebizzy -mRTp':
            762134      faults
            728663      spf
             19101      pagefault:spf_pte_lock
             13969      pagefault:spf_vma_changed
                 0      pagefault:spf_vma_noanon
               272      pagefault:spf_vma_notsup
                 0      pagefault:spf_vma_access
                 0      pagefault:spf_pmd_changed

In ebizzy's case most of the page fault were handled in a speculative way,
leading the ebizzy performance boost.

------------------
Changes since v8:
 - Don't check PMD when locking the pte when THP is disabled
   Thanks to Daniel Jordan for reporting this.
 - Rebase on 4.16
Changes since v7:
 - move pte_map_lock() and pte_spinlock() upper in mm/memory.c (patch 4 &
   5)
 - make pte_unmap_same() compatible with the speculative page fault (patch
   6)
Changes since v6:
 - Rename config variable to CONFIG_SPECULATIVE_PAGE_FAULT (patch 1)
 - Review the way the config variable is set (patch 1 to 3)
 - Introduce mm_rb_write_*lock() in mm/mmap.c (patch 18)
 - Merge patch introducing pte try locking in the patch 18.
Changes since v5:
 - use rwlock agains the mm RB tree in place of SRCU
 - add a VMA's reference count to protect VMA while using it without
   holding the mmap_sem.
 - check PMD value to detect collapsing operation
 - don't try speculative page fault for mono threaded processes
 - try to reuse the fetched VMA if VM_RETRY is returned
 - go directly to the error path if an error is detected during the SPF
   path
 - fix race window when moving VMA in move_vma()
Changes since v4:
 - As requested by Andrew Morton, use CONFIG_SPF and define it earlier in
 the series to ease bisection.
Changes since v3:
 - Don't build when CONFIG_SMP is not set
 - Fixed a lock dependency warning in __vma_adjust()
 - Use READ_ONCE to access p*d values in handle_speculative_fault()
 - Call memcp_oom() service in handle_speculative_fault()
Changes since v2:
 - Perf event is renamed in PERF_COUNT_SW_SPF
 - On Power handle do_page_fault()'s cleaning
 - On Power if the VM_FAULT_ERROR is returned by
 handle_speculative_fault(), do not retry but jump to the error path
 - If VMA's flags are not matching the fault, directly returns
 VM_FAULT_SIGSEGV and not VM_FAULT_RETRY
 - Check for pud_trans_huge() to avoid speculative path
 - Handles _vm_normal_page()'s introduced by 6f16211df3bf
 ("mm/device-public-memory: device memory cache coherent with CPU")
 - add and review few comments in the code
Changes since v1:
 - Remove PERF_COUNT_SW_SPF_FAILED perf event.
 - Add tracing events to details speculative page fault failures.
 - Cache VMA fields values which are used once the PTE is unlocked at the
 end of the page fault events.
 - Ensure that fields read during the speculative path are written and read
 using WRITE_ONCE and READ_ONCE.
 - Add checks at the beginning of the speculative path to abort it if the
 VMA is known to not be supported.
Changes since RFC V5 [5]
 - Port to 4.13 kernel
 - Merging patch fixing lock dependency into the original patch
 - Replace the 2 parameters of vma_has_changed() with the vmf pointer
 - In patch 7, don't call __do_fault() in the speculative path as it may
 want to unlock the mmap_sem.
 - In patch 11-12, don't check for vma boundaries when
 page_add_new_anon_rmap() is called during the spf path and protect against
 anon_vma pointer's update.
 - In patch 13-16, add performance events to report number of successful
 and failed speculative events.

[1]
http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none
[2] https://patchwork.kernel.org/patch/9999687/


Laurent Dufour (20):
  mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
  x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
  powerpc/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
  mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
  mm: make pte_unmap_same compatible with SPF
  mm: Protect VMA modifications using VMA sequence count
  mm: protect mremap() against SPF hanlder
  mm: Protect SPF handler against anon_vma changes
  mm: Cache some VMA fields in the vm_fault structure
  mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()
  mm: Introduce __lru_cache_add_active_or_unevictable
  mm: Introduce __maybe_mkwrite()
  mm: Introduce __vm_normal_page()
  mm: Introduce __page_add_new_anon_rmap()
  mm: Protect mm_rb tree with a rwlock
  mm: Adding speculative page fault failure trace events
  perf: Add a speculative page fault sw event
  perf tools: Add support for the SPF perf event
  mm: Speculative page fault handler return VMA
  powerpc/mm: Add speculative page fault

Peter Zijlstra (4):
  mm: Prepare for FAULT_FLAG_SPECULATIVE
  mm: VMA sequence count
  mm: Provide speculative fault infrastructure
  x86/mm: Add speculative pagefault handling

 arch/powerpc/Kconfig                  |   1 +
 arch/powerpc/mm/fault.c               |  31 +-
 arch/x86/Kconfig                      |   1 +
 arch/x86/mm/fault.c                   |  38 ++-
 fs/proc/task_mmu.c                    |   5 +-
 fs/userfaultfd.c                      |  17 +-
 include/linux/hugetlb_inline.h        |   2 +-
 include/linux/migrate.h               |   4 +-
 include/linux/mm.h                    |  92 +++++-
 include/linux/mm_types.h              |   7 +
 include/linux/pagemap.h               |   4 +-
 include/linux/rmap.h                  |  12 +-
 include/linux/swap.h                  |  10 +-
 include/trace/events/pagefault.h      |  87 +++++
 include/uapi/linux/perf_event.h       |   1 +
 kernel/fork.c                         |   3 +
 mm/Kconfig                            |   3 +
 mm/hugetlb.c                          |   2 +
 mm/init-mm.c                          |   3 +
 mm/internal.h                         |  20 ++
 mm/khugepaged.c                       |   5 +
 mm/madvise.c                          |   6 +-
 mm/memory.c                           | 594 ++++++++++++++++++++++++++++++----
 mm/mempolicy.c                        |  51 ++-
 mm/migrate.c                          |   4 +-
 mm/mlock.c                            |  13 +-
 mm/mmap.c                             | 211 +++++++++---
 mm/mprotect.c                         |   4 +-
 mm/mremap.c                           |  13 +
 mm/rmap.c                             |   5 +-
 mm/swap.c                             |   6 +-
 mm/swap_state.c                       |   8 +-
 tools/include/uapi/linux/perf_event.h |   1 +
 tools/perf/util/evsel.c               |   1 +
 tools/perf/util/parse-events.c        |   4 +
 tools/perf/util/parse-events.l        |   1 +
 tools/perf/util/python.c              |   1 +
 37 files changed, 1097 insertions(+), 174 deletions(-)
 create mode 100644 include/trace/events/pagefault.h

-- 
2.7.4

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-25 21:50   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 02/24] x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

This configuration variable will be used to build the code needed to
handle speculative page fault.

By default it is turned off, and activated depending on architecture
support.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 mm/Kconfig | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index abefa573bcd8..07c566c88faf 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -759,3 +759,6 @@ config GUP_BENCHMARK
 	  performance of get_user_pages_fast().
 
 	  See tools/testing/selftests/vm/gup_benchmark.c
+
+config SPECULATIVE_PAGE_FAULT
+       bool
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 02/24] x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 03/24] powerpc/mm: " Laurent Dufour
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

Introduce CONFIG_SPECULATIVE_PAGE_FAULT which turns on the Speculative Page
Fault handler when building for 64bits with SMP.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a0a777ce4c7c..4c018c48d414 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -29,6 +29,7 @@ config X86_64
 	select HAVE_ARCH_SOFT_DIRTY
 	select MODULES_USE_ELF_RELA
 	select X86_DEV_DMA_OPS
+	select SPECULATIVE_PAGE_FAULT if SMP
 
 #
 # Arch settings
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 03/24] powerpc/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 02/24] x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 05/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE Laurent Dufour
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

Define CONFIG_SPECULATIVE_PAGE_FAULT for BOOK3S_64 and SMP. This enables
the Speculative Page Fault handler.

Support is only provide for BOOK3S_64 currently because:
- require CONFIG_PPC_STD_MMU because checks done in
  set_access_flags_filter()
- require BOOK3S because we can't support for book3e_hugetlb_preload()
  called by update_mmu_cache()

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 73ce5dd07642..acf2696a6505 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -233,6 +233,7 @@ config PPC
 	select OLD_SIGACTION			if PPC32
 	select OLD_SIGSUSPEND
 	select SPARSE_IRQ
+	select SPECULATIVE_PAGE_FAULT		if PPC_BOOK3S_64 && SMP
 	select SYSCTL_EXCEPTION_TRACE
 	select VIRT_TO_BUS			if !PPC64
 	#
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 05/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (2 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 03/24] powerpc/mm: " Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-25 21:50   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF Laurent Dufour
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

When handling page fault without holding the mmap_sem the fetch of the
pte lock pointer and the locking will have to be done while ensuring
that the VMA is not touched in our back.

So move the fetch and locking operations in a dedicated function.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 mm/memory.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 8ac241b9f370..21b1212a0892 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2288,6 +2288,13 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL_GPL(apply_to_page_range);
 
+static bool pte_spinlock(struct vm_fault *vmf)
+{
+	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+	spin_lock(vmf->ptl);
+	return true;
+}
+
 static bool pte_map_lock(struct vm_fault *vmf)
 {
 	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
@@ -3798,8 +3805,8 @@ static int do_numa_page(struct vm_fault *vmf)
 	 * validation through pte_unmap_same(). It's of NUMA type but
 	 * the pfn may be screwed if the read is non atomic.
 	 */
-	vmf->ptl = pte_lockptr(vma->vm_mm, vmf->pmd);
-	spin_lock(vmf->ptl);
+	if (!pte_spinlock(vmf))
+		return VM_FAULT_RETRY;
 	if (unlikely(!pte_same(*vmf->pte, vmf->orig_pte))) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		goto out;
@@ -3992,8 +3999,8 @@ static int handle_pte_fault(struct vm_fault *vmf)
 	if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma))
 		return do_numa_page(vmf);
 
-	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
-	spin_lock(vmf->ptl);
+	if (!pte_spinlock(vmf))
+		return VM_FAULT_RETRY;
 	entry = vmf->orig_pte;
 	if (unlikely(!pte_same(*vmf->pte, entry)))
 		goto unlock;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (3 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 05/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-27 21:18   ` David Rientjes
  2018-04-03 19:10   ` Jerome Glisse
  2018-03-13 17:59 ` [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count Laurent Dufour
                   ` (21 subsequent siblings)
  26 siblings, 2 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

pte_unmap_same() is making the assumption that the page table are still
around because the mmap_sem is held.
This is no more the case when running a speculative page fault and
additional check must be made to ensure that the final page table are still
there.

This is now done by calling pte_spinlock() to check for the VMA's
consistency while locking for the page tables.

This is requiring passing a vm_fault structure to pte_unmap_same() which is
containing all the needed parameters.

As pte_spinlock() may fail in the case of a speculative page fault, if the
VMA has been touched in our back, pte_unmap_same() should now return 3
cases :
	1. pte are the same (0)
	2. pte are different (VM_FAULT_PTNOTSAME)
	3. a VMA's changes has been detected (VM_FAULT_RETRY)

The case 2 is handled by the introduction of a new VM_FAULT flag named
VM_FAULT_PTNOTSAME which is then trapped in cow_user_page().
If VM_FAULT_RETRY is returned, it is passed up to the callers to retry the
page fault while holding the mmap_sem.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/mm.h |  1 +
 mm/memory.c        | 29 +++++++++++++++++++----------
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2f3e98edc94a..b6432a261e63 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1199,6 +1199,7 @@ static inline void clear_page_pfmemalloc(struct page *page)
 #define VM_FAULT_NEEDDSYNC  0x2000	/* ->fault did not modify page tables
 					 * and needs fsync() to complete (for
 					 * synchronous page faults in DAX) */
+#define VM_FAULT_PTNOTSAME 0x4000	/* Page table entries have changed */
 
 #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
 			 VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE | \
diff --git a/mm/memory.c b/mm/memory.c
index 21b1212a0892..4bc7b0bdcb40 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
  * parts, do_swap_page must check under lock before unmapping the pte and
  * proceeding (but do_wp_page is only called after already making such a check;
  * and do_anonymous_page can safely check later on).
+ *
+ * pte_unmap_same() returns:
+ *	0			if the PTE are the same
+ *	VM_FAULT_PTNOTSAME	if the PTE are different
+ *	VM_FAULT_RETRY		if the VMA has changed in our back during
+ *				a speculative page fault handling.
  */
-static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
-				pte_t *page_table, pte_t orig_pte)
+static inline int pte_unmap_same(struct vm_fault *vmf)
 {
-	int same = 1;
+	int ret = 0;
+
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
 	if (sizeof(pte_t) > sizeof(unsigned long)) {
-		spinlock_t *ptl = pte_lockptr(mm, pmd);
-		spin_lock(ptl);
-		same = pte_same(*page_table, orig_pte);
-		spin_unlock(ptl);
+		if (pte_spinlock(vmf)) {
+			if (!pte_same(*vmf->pte, vmf->orig_pte))
+				ret = VM_FAULT_PTNOTSAME;
+			spin_unlock(vmf->ptl);
+		} else
+			ret = VM_FAULT_RETRY;
 	}
 #endif
-	pte_unmap(page_table);
-	return same;
+	pte_unmap(vmf->pte);
+	return ret;
 }
 
 static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
@@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
 	int exclusive = 0;
 	int ret = 0;
 
-	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
+	ret = pte_unmap_same(vmf);
+	if (ret)
 		goto out;
 
 	entry = pte_to_swp_entry(vmf->orig_pte);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (4 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-27 21:45   ` David Rientjes
  2018-03-27 21:57   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 09/24] mm: protect mremap() against SPF hanlder Laurent Dufour
                   ` (20 subsequent siblings)
  26 siblings, 2 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

The VMA sequence count has been introduced to allow fast detection of
VMA modification when running a page fault handler without holding
the mmap_sem.

This patch provides protection against the VMA modification done in :
	- madvise()
	- mpol_rebind_policy()
	- vma_replace_policy()
	- change_prot_numa()
	- mlock(), munlock()
	- mprotect()
	- mmap_region()
	- collapse_huge_page()
	- userfaultd registering services

In addition, VMA fields which will be read during the speculative fault
path needs to be written using WRITE_ONCE to prevent write to be split
and intermediate values to be pushed to other CPUs.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 fs/proc/task_mmu.c |  5 ++++-
 fs/userfaultfd.c   | 17 +++++++++++++----
 mm/khugepaged.c    |  3 +++
 mm/madvise.c       |  6 +++++-
 mm/mempolicy.c     | 51 ++++++++++++++++++++++++++++++++++-----------------
 mm/mlock.c         | 13 ++++++++-----
 mm/mmap.c          | 17 ++++++++++-------
 mm/mprotect.c      |  4 +++-
 mm/swap_state.c    |  8 ++++++--
 9 files changed, 86 insertions(+), 38 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 65ae54659833..a2d9c87b7b0b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1136,8 +1136,11 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 					goto out_mm;
 				}
 				for (vma = mm->mmap; vma; vma = vma->vm_next) {
-					vma->vm_flags &= ~VM_SOFTDIRTY;
+					vm_write_begin(vma);
+					WRITE_ONCE(vma->vm_flags,
+						   vma->vm_flags & ~VM_SOFTDIRTY);
 					vma_set_page_prot(vma);
+					vm_write_end(vma);
 				}
 				downgrade_write(&mm->mmap_sem);
 				break;
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index cec550c8468f..b8212ba17695 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -659,8 +659,11 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)
 
 	octx = vma->vm_userfaultfd_ctx.ctx;
 	if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) {
+		vm_write_begin(vma);
 		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
-		vma->vm_flags &= ~(VM_UFFD_WP | VM_UFFD_MISSING);
+		WRITE_ONCE(vma->vm_flags,
+			   vma->vm_flags & ~(VM_UFFD_WP | VM_UFFD_MISSING));
+		vm_write_end(vma);
 		return 0;
 	}
 
@@ -885,8 +888,10 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
 			vma = prev;
 		else
 			prev = vma;
-		vma->vm_flags = new_flags;
+		vm_write_begin(vma);
+		WRITE_ONCE(vma->vm_flags, new_flags);
 		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
+		vm_write_end(vma);
 	}
 	up_write(&mm->mmap_sem);
 	mmput(mm);
@@ -1434,8 +1439,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		 * the next vma was merged into the current one and
 		 * the current one has not been updated yet.
 		 */
-		vma->vm_flags = new_flags;
+		vm_write_begin(vma);
+		WRITE_ONCE(vma->vm_flags, new_flags);
 		vma->vm_userfaultfd_ctx.ctx = ctx;
+		vm_write_end(vma);
 
 	skip:
 		prev = vma;
@@ -1592,8 +1599,10 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * the next vma was merged into the current one and
 		 * the current one has not been updated yet.
 		 */
-		vma->vm_flags = new_flags;
+		vm_write_begin(vma);
+		WRITE_ONCE(vma->vm_flags, new_flags);
 		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
+		vm_write_end(vma);
 
 	skip:
 		prev = vma;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b7e2268dfc9a..32314e9e48dd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1006,6 +1006,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	if (mm_find_pmd(mm, address) != pmd)
 		goto out;
 
+	vm_write_begin(vma);
 	anon_vma_lock_write(vma->anon_vma);
 
 	pte = pte_offset_map(pmd, address);
@@ -1041,6 +1042,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 		pmd_populate(mm, pmd, pmd_pgtable(_pmd));
 		spin_unlock(pmd_ptl);
 		anon_vma_unlock_write(vma->anon_vma);
+		vm_write_end(vma);
 		result = SCAN_FAIL;
 		goto out;
 	}
@@ -1075,6 +1077,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
 	spin_unlock(pmd_ptl);
+	vm_write_end(vma);
 
 	*hpage = NULL;
 
diff --git a/mm/madvise.c b/mm/madvise.c
index 4d3c922ea1a1..e328f7ab5942 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -184,7 +184,9 @@ static long madvise_behavior(struct vm_area_struct *vma,
 	/*
 	 * vm_flags is protected by the mmap_sem held in write mode.
 	 */
-	vma->vm_flags = new_flags;
+	vm_write_begin(vma);
+	WRITE_ONCE(vma->vm_flags, new_flags);
+	vm_write_end(vma);
 out:
 	return error;
 }
@@ -450,9 +452,11 @@ static void madvise_free_page_range(struct mmu_gather *tlb,
 		.private = tlb,
 	};
 
+	vm_write_begin(vma);
 	tlb_start_vma(tlb, vma);
 	walk_page_range(addr, end, &free_walk);
 	tlb_end_vma(tlb, vma);
+	vm_write_end(vma);
 }
 
 static int madvise_free_single_vma(struct vm_area_struct *vma,
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e0e706f0b34e..2632c6f93b63 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -380,8 +380,11 @@ void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
 	struct vm_area_struct *vma;
 
 	down_write(&mm->mmap_sem);
-	for (vma = mm->mmap; vma; vma = vma->vm_next)
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		vm_write_begin(vma);
 		mpol_rebind_policy(vma->vm_policy, new);
+		vm_write_end(vma);
+	}
 	up_write(&mm->mmap_sem);
 }
 
@@ -554,9 +557,11 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 {
 	int nr_updated;
 
+	vm_write_begin(vma);
 	nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1);
 	if (nr_updated)
 		count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated);
+	vm_write_end(vma);
 
 	return nr_updated;
 }
@@ -657,6 +662,7 @@ static int vma_replace_policy(struct vm_area_struct *vma,
 	if (IS_ERR(new))
 		return PTR_ERR(new);
 
+	vm_write_begin(vma);
 	if (vma->vm_ops && vma->vm_ops->set_policy) {
 		err = vma->vm_ops->set_policy(vma, new);
 		if (err)
@@ -664,11 +670,17 @@ static int vma_replace_policy(struct vm_area_struct *vma,
 	}
 
 	old = vma->vm_policy;
-	vma->vm_policy = new; /* protected by mmap_sem */
+	/*
+	 * The speculative page fault handler access this field without
+	 * hodling the mmap_sem.
+	 */
+	WRITE_ONCE(vma->vm_policy,  new);
+	vm_write_end(vma);
 	mpol_put(old);
 
 	return 0;
  err_out:
+	vm_write_end(vma);
 	mpol_put(new);
 	return err;
 }
@@ -1552,23 +1564,28 @@ COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
 struct mempolicy *__get_vma_policy(struct vm_area_struct *vma,
 						unsigned long addr)
 {
-	struct mempolicy *pol = NULL;
+	struct mempolicy *pol;
 
-	if (vma) {
-		if (vma->vm_ops && vma->vm_ops->get_policy) {
-			pol = vma->vm_ops->get_policy(vma, addr);
-		} else if (vma->vm_policy) {
-			pol = vma->vm_policy;
+	if (!vma)
+		return NULL;
 
-			/*
-			 * shmem_alloc_page() passes MPOL_F_SHARED policy with
-			 * a pseudo vma whose vma->vm_ops=NULL. Take a reference
-			 * count on these policies which will be dropped by
-			 * mpol_cond_put() later
-			 */
-			if (mpol_needs_cond_ref(pol))
-				mpol_get(pol);
-		}
+	if (vma->vm_ops && vma->vm_ops->get_policy)
+		return vma->vm_ops->get_policy(vma, addr);
+
+	/*
+	 * This could be called without holding the mmap_sem in the
+	 * speculative page fault handler's path.
+	 */
+	pol = READ_ONCE(vma->vm_policy);
+	if (pol) {
+		/*
+		 * shmem_alloc_page() passes MPOL_F_SHARED policy with
+		 * a pseudo vma whose vma->vm_ops=NULL. Take a reference
+		 * count on these policies which will be dropped by
+		 * mpol_cond_put() later
+		 */
+		if (mpol_needs_cond_ref(pol))
+			mpol_get(pol);
 	}
 
 	return pol;
diff --git a/mm/mlock.c b/mm/mlock.c
index 74e5a6547c3d..c40285c94ced 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -445,7 +445,9 @@ static unsigned long __munlock_pagevec_fill(struct pagevec *pvec,
 void munlock_vma_pages_range(struct vm_area_struct *vma,
 			     unsigned long start, unsigned long end)
 {
-	vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
+	vm_write_begin(vma);
+	WRITE_ONCE(vma->vm_flags, vma->vm_flags & VM_LOCKED_CLEAR_MASK);
+	vm_write_end(vma);
 
 	while (start < end) {
 		struct page *page;
@@ -568,10 +570,11 @@ static int mlock_fixup(struct vm_area_struct *vma, struct vm_area_struct **prev,
 	 * It's okay if try_to_unmap_one unmaps a page just after we
 	 * set VM_LOCKED, populate_vma_page_range will bring it back.
 	 */
-
-	if (lock)
-		vma->vm_flags = newflags;
-	else
+	if (lock) {
+		vm_write_begin(vma);
+		WRITE_ONCE(vma->vm_flags, newflags);
+		vm_write_end(vma);
+	} else
 		munlock_vma_pages_range(vma, start, end);
 
 out:
diff --git a/mm/mmap.c b/mm/mmap.c
index 5898255d0aeb..d6533cb85213 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -847,17 +847,18 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
 	}
 
 	if (start != vma->vm_start) {
-		vma->vm_start = start;
+		WRITE_ONCE(vma->vm_start, start);
 		start_changed = true;
 	}
 	if (end != vma->vm_end) {
-		vma->vm_end = end;
+		WRITE_ONCE(vma->vm_end, end);
 		end_changed = true;
 	}
-	vma->vm_pgoff = pgoff;
+	WRITE_ONCE(vma->vm_pgoff, pgoff);
 	if (adjust_next) {
-		next->vm_start += adjust_next << PAGE_SHIFT;
-		next->vm_pgoff += adjust_next;
+		WRITE_ONCE(next->vm_start,
+			   next->vm_start + (adjust_next << PAGE_SHIFT));
+		WRITE_ONCE(next->vm_pgoff, next->vm_pgoff + adjust_next);
 	}
 
 	if (root) {
@@ -1781,6 +1782,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 out:
 	perf_event_mmap(vma);
 
+	vm_write_begin(vma);
 	vm_stat_account(mm, vm_flags, len >> PAGE_SHIFT);
 	if (vm_flags & VM_LOCKED) {
 		if (!((vm_flags & VM_SPECIAL) || is_vm_hugetlb_page(vma) ||
@@ -1803,6 +1805,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	vma->vm_flags |= VM_SOFTDIRTY;
 
 	vma_set_page_prot(vma);
+	vm_write_end(vma);
 
 	return addr;
 
@@ -2431,8 +2434,8 @@ int expand_downwards(struct vm_area_struct *vma,
 					mm->locked_vm += grow;
 				vm_stat_account(mm, vma->vm_flags, grow);
 				anon_vma_interval_tree_pre_update_vma(vma);
-				vma->vm_start = address;
-				vma->vm_pgoff -= grow;
+				WRITE_ONCE(vma->vm_start, address);
+				WRITE_ONCE(vma->vm_pgoff, vma->vm_pgoff - grow);
 				anon_vma_interval_tree_post_update_vma(vma);
 				vma_gap_update(vma);
 				spin_unlock(&mm->page_table_lock);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index e3309fcf586b..9b7a71c30287 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -366,7 +366,8 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
 	 * vm_flags and vm_page_prot are protected by the mmap_sem
 	 * held in write mode.
 	 */
-	vma->vm_flags = newflags;
+	vm_write_begin(vma);
+	WRITE_ONCE(vma->vm_flags, newflags);
 	dirty_accountable = vma_wants_writenotify(vma, vma->vm_page_prot);
 	vma_set_page_prot(vma);
 
@@ -381,6 +382,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
 			(newflags & VM_WRITE)) {
 		populate_vma_page_range(vma, start, end, NULL);
 	}
+	vm_write_end(vma);
 
 	vm_stat_account(mm, oldflags, -nrpages);
 	vm_stat_account(mm, newflags, nrpages);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 486f7c26386e..691bd6e06967 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -575,6 +575,10 @@ static unsigned long swapin_nr_pages(unsigned long offset)
  * the readahead.
  *
  * Caller must hold down_read on the vma->vm_mm if vmf->vma is not NULL.
+ * This is needed to ensure the VMA will not be freed in our back. In the case
+ * of the speculative page fault handler, this cannot happen, even if we don't
+ * hold the mmap_sem. Callees are assumed to take care of reading VMA's fields
+ * using READ_ONCE() to read consistent values.
  */
 struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 				struct vm_fault *vmf)
@@ -669,9 +673,9 @@ static inline void swap_ra_clamp_pfn(struct vm_area_struct *vma,
 				     unsigned long *start,
 				     unsigned long *end)
 {
-	*start = max3(lpfn, PFN_DOWN(vma->vm_start),
+	*start = max3(lpfn, PFN_DOWN(READ_ONCE(vma->vm_start)),
 		      PFN_DOWN(faddr & PMD_MASK));
-	*end = min3(rpfn, PFN_DOWN(vma->vm_end),
+	*end = min3(rpfn, PFN_DOWN(READ_ONCE(vma->vm_end)),
 		    PFN_DOWN((faddr & PMD_MASK) + PMD_SIZE));
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 09/24] mm: protect mremap() against SPF hanlder
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (5 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-27 22:12   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 10/24] mm: Protect SPF handler against anon_vma changes Laurent Dufour
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

If a thread is remapping an area while another one is faulting on the
destination area, the SPF handler may fetch the vma from the RB tree before
the pte has been moved by the other thread. This means that the moved ptes
will overwrite those create by the page fault handler leading to page
leaked.

	CPU 1				CPU2
	enter mremap()
	unmap the dest area
	copy_vma()			Enter speculative page fault handler
	   >> at this time the dest area is present in the RB tree
					fetch the vma matching dest area
					create a pte as the VMA matched
					Exit the SPF handler
					<data written in the new page>
	move_ptes()
	  > it is assumed that the dest area is empty,
 	  > the move ptes overwrite the page mapped by the CPU2.

To prevent that, when the VMA matching the dest area is extended or created
by copy_vma(), it should be marked as non available to the SPF handler.
The usual way to so is to rely on vm_write_begin()/end().
This is already in __vma_adjust() called by copy_vma() (through
vma_merge()). But __vma_adjust() is calling vm_write_end() before returning
which create a window for another thread.
This patch adds a new parameter to vma_merge() which is passed down to
vma_adjust().
The assumption is that copy_vma() is returning a vma which should be
released by calling vm_raw_write_end() by the callee once the ptes have
been moved.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/mm.h | 16 ++++++++++++----
 mm/mmap.c          | 47 ++++++++++++++++++++++++++++++++++++-----------
 mm/mremap.c        | 13 +++++++++++++
 3 files changed, 61 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 88042d843668..ef6ef0627090 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2189,16 +2189,24 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
 extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
 extern int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
 	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert,
-	struct vm_area_struct *expand);
+	struct vm_area_struct *expand, bool keep_locked);
 static inline int vma_adjust(struct vm_area_struct *vma, unsigned long start,
 	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert)
 {
-	return __vma_adjust(vma, start, end, pgoff, insert, NULL);
+	return __vma_adjust(vma, start, end, pgoff, insert, NULL, false);
 }
-extern struct vm_area_struct *vma_merge(struct mm_struct *,
+extern struct vm_area_struct *__vma_merge(struct mm_struct *,
 	struct vm_area_struct *prev, unsigned long addr, unsigned long end,
 	unsigned long vm_flags, struct anon_vma *, struct file *, pgoff_t,
-	struct mempolicy *, struct vm_userfaultfd_ctx);
+	struct mempolicy *, struct vm_userfaultfd_ctx, bool keep_locked);
+static inline struct vm_area_struct *vma_merge(struct mm_struct *vma,
+	struct vm_area_struct *prev, unsigned long addr, unsigned long end,
+	unsigned long vm_flags, struct anon_vma *anon, struct file *file,
+	pgoff_t off, struct mempolicy *pol, struct vm_userfaultfd_ctx uff)
+{
+	return __vma_merge(vma, prev, addr, end, vm_flags, anon, file, off,
+			   pol, uff, false);
+}
 extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
 extern int __split_vma(struct mm_struct *, struct vm_area_struct *,
 	unsigned long addr, int new_below);
diff --git a/mm/mmap.c b/mm/mmap.c
index d6533cb85213..ac32b577a0c9 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -684,7 +684,7 @@ static inline void __vma_unlink_prev(struct mm_struct *mm,
  */
 int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
 	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert,
-	struct vm_area_struct *expand)
+	struct vm_area_struct *expand, bool keep_locked)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *next = vma->vm_next, *orig_vma = vma;
@@ -996,7 +996,8 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
 
 	if (next && next != vma)
 		vm_raw_write_end(next);
-	vm_raw_write_end(vma);
+	if (!keep_locked)
+		vm_raw_write_end(vma);
 
 	validate_mm(mm);
 
@@ -1132,12 +1133,13 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
  * parameter) may establish ptes with the wrong permissions of NNNN
  * instead of the right permissions of XXXX.
  */
-struct vm_area_struct *vma_merge(struct mm_struct *mm,
+struct vm_area_struct *__vma_merge(struct mm_struct *mm,
 			struct vm_area_struct *prev, unsigned long addr,
 			unsigned long end, unsigned long vm_flags,
 			struct anon_vma *anon_vma, struct file *file,
 			pgoff_t pgoff, struct mempolicy *policy,
-			struct vm_userfaultfd_ctx vm_userfaultfd_ctx)
+			struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
+			bool keep_locked)
 {
 	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
 	struct vm_area_struct *area, *next;
@@ -1185,10 +1187,11 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 							/* cases 1, 6 */
 			err = __vma_adjust(prev, prev->vm_start,
 					 next->vm_end, prev->vm_pgoff, NULL,
-					 prev);
+					 prev, keep_locked);
 		} else					/* cases 2, 5, 7 */
 			err = __vma_adjust(prev, prev->vm_start,
-					 end, prev->vm_pgoff, NULL, prev);
+					   end, prev->vm_pgoff, NULL, prev,
+					   keep_locked);
 		if (err)
 			return NULL;
 		khugepaged_enter_vma_merge(prev, vm_flags);
@@ -1205,10 +1208,12 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
 					     vm_userfaultfd_ctx)) {
 		if (prev && addr < prev->vm_end)	/* case 4 */
 			err = __vma_adjust(prev, prev->vm_start,
-					 addr, prev->vm_pgoff, NULL, next);
+					 addr, prev->vm_pgoff, NULL, next,
+					 keep_locked);
 		else {					/* cases 3, 8 */
 			err = __vma_adjust(area, addr, next->vm_end,
-					 next->vm_pgoff - pglen, NULL, next);
+					 next->vm_pgoff - pglen, NULL, next,
+					 keep_locked);
 			/*
 			 * In case 3 area is already equal to next and
 			 * this is a noop, but in case 8 "area" has
@@ -3163,9 +3168,20 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 
 	if (find_vma_links(mm, addr, addr + len, &prev, &rb_link, &rb_parent))
 		return NULL;	/* should never get here */
-	new_vma = vma_merge(mm, prev, addr, addr + len, vma->vm_flags,
-			    vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
-			    vma->vm_userfaultfd_ctx);
+
+	/* There is 3 cases to manage here in
+	 *     AAAA            AAAA              AAAA              AAAA
+	 * PPPP....      PPPP......NNNN      PPPP....NNNN      PP........NN
+	 * PPPPPPPP(A)   PPPP..NNNNNNNN(B)   PPPPPPPPPPPP(1)       NULL
+	 *                                   PPPPPPPPNNNN(2)
+	 *				     PPPPNNNNNNNN(3)
+	 *
+	 * new_vma == prev in case A,1,2
+	 * new_vma == next in case B,3
+	 */
+	new_vma = __vma_merge(mm, prev, addr, addr + len, vma->vm_flags,
+			      vma->anon_vma, vma->vm_file, pgoff,
+			      vma_policy(vma), vma->vm_userfaultfd_ctx, true);
 	if (new_vma) {
 		/*
 		 * Source vma may have been merged into new_vma
@@ -3205,6 +3221,15 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 			get_file(new_vma->vm_file);
 		if (new_vma->vm_ops && new_vma->vm_ops->open)
 			new_vma->vm_ops->open(new_vma);
+		/*
+		 * As the VMA is linked right now, it may be hit by the
+		 * speculative page fault handler. But we don't want it to
+		 * to start mapping page in this area until the caller has
+		 * potentially move the pte from the moved VMA. To prevent
+		 * that we protect it right now, and let the caller unprotect
+		 * it once the move is done.
+		 */
+		vm_raw_write_begin(new_vma);
 		vma_link(mm, new_vma, prev, rb_link, rb_parent);
 		*need_rmap_locks = false;
 	}
diff --git a/mm/mremap.c b/mm/mremap.c
index 049470aa1e3e..8ed1a1d6eaed 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -302,6 +302,14 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 	if (!new_vma)
 		return -ENOMEM;
 
+	/* new_vma is returned protected by copy_vma, to prevent speculative
+	 * page fault to be done in the destination area before we move the pte.
+	 * Now, we must also protect the source VMA since we don't want pages
+	 * to be mapped in our back while we are copying the PTEs.
+	 */
+	if (vma != new_vma)
+		vm_raw_write_begin(vma);
+
 	moved_len = move_page_tables(vma, old_addr, new_vma, new_addr, old_len,
 				     need_rmap_locks);
 	if (moved_len < old_len) {
@@ -318,6 +326,8 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 		 */
 		move_page_tables(new_vma, new_addr, vma, old_addr, moved_len,
 				 true);
+		if (vma != new_vma)
+			vm_raw_write_end(vma);
 		vma = new_vma;
 		old_len = new_len;
 		old_addr = new_addr;
@@ -326,7 +336,10 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 		mremap_userfaultfd_prep(new_vma, uf);
 		arch_remap(mm, old_addr, old_addr + old_len,
 			   new_addr, new_addr + new_len);
+		if (vma != new_vma)
+			vm_raw_write_end(vma);
 	}
+	vm_raw_write_end(new_vma);
 
 	/* Conceal VM_ACCOUNT so old reservation is not undone */
 	if (vm_flags & VM_ACCOUNT) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 10/24] mm: Protect SPF handler against anon_vma changes
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (6 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 09/24] mm: protect mremap() against SPF hanlder Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 11/24] mm: Cache some VMA fields in the vm_fault structure Laurent Dufour
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

The speculative page fault handler must be protected against anon_vma
changes. This is because page_add_new_anon_rmap() is called during the
speculative path.

In addition, don't try speculative page fault if the VMA don't have an
anon_vma structure allocated because its allocation should be
protected by the mmap_sem.

In __vma_adjust() when importer->anon_vma is set, there is no need to
protect against speculative page faults since speculative page fault
is aborted if the vma->anon_vma is not set.

When calling page_add_new_anon_rmap() vma->anon_vma is necessarily
valid since we checked for it when locking the pte and the anon_vma is
removed once the pte is unlocked. So even if the speculative page
fault handler is running concurrently with do_unmap(), as the pte is
locked in unmap_region() - through unmap_vmas() - and the anon_vma
unlinked later, because we check for the vma sequence counter which is
updated in unmap_page_range() before locking the pte, and then in
free_pgtables() so when locking the pte the change will be detected.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 mm/memory.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index d57749966fb8..0200340ef089 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -624,7 +624,9 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		 * Hide vma from rmap and truncate_pagecache before freeing
 		 * pgtables
 		 */
+		vm_write_begin(vma);
 		unlink_anon_vmas(vma);
+		vm_write_end(vma);
 		unlink_file_vma(vma);
 
 		if (is_vm_hugetlb_page(vma)) {
@@ -638,7 +640,9 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			       && !is_vm_hugetlb_page(next)) {
 				vma = next;
 				next = vma->vm_next;
+				vm_write_begin(vma);
 				unlink_anon_vmas(vma);
+				vm_write_end(vma);
 				unlink_file_vma(vma);
 			}
 			free_pgd_range(tlb, addr, vma->vm_end,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 11/24] mm: Cache some VMA fields in the vm_fault structure
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (7 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 10/24] mm: Protect SPF handler against anon_vma changes Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-04-02 22:24   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 12/24] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page() Laurent Dufour
                   ` (17 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

When handling speculative page fault, the vma->vm_flags and
vma->vm_page_prot fields are read once the page table lock is released. So
there is no more guarantee that these fields would not change in our back.
They will be saved in the vm_fault structure before the VMA is checked for
changes.

This patch also set the fields in hugetlb_no_page() and
__collapse_huge_page_swapin even if it is not need for the callee.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/mm.h |  6 ++++++
 mm/hugetlb.c       |  2 ++
 mm/khugepaged.c    |  2 ++
 mm/memory.c        | 38 ++++++++++++++++++++------------------
 4 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef6ef0627090..dfa81a638b7c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -359,6 +359,12 @@ struct vm_fault {
 					 * page table to avoid allocation from
 					 * atomic context.
 					 */
+	/*
+	 * These entries are required when handling speculative page fault.
+	 * This way the page handling is done using consistent field values.
+	 */
+	unsigned long vma_flags;
+	pgprot_t vma_page_prot;
 };
 
 /* page entry size for vm->huge_fault() */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 446427cafa19..f71db2b42b30 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3717,6 +3717,8 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				.vma = vma,
 				.address = address,
 				.flags = flags,
+				.vma_flags = vma->vm_flags,
+				.vma_page_prot = vma->vm_page_prot,
 				/*
 				 * Hard to debug if it ends up being
 				 * used by a callee that assumes
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 32314e9e48dd..a946d5306160 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -882,6 +882,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 		.flags = FAULT_FLAG_ALLOW_RETRY,
 		.pmd = pmd,
 		.pgoff = linear_page_index(vma, address),
+		.vma_flags = vma->vm_flags,
+		.vma_page_prot = vma->vm_page_prot,
 	};
 
 	/* we only decide to swapin, if there is enough young ptes */
diff --git a/mm/memory.c b/mm/memory.c
index 0200340ef089..46fe92b93682 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2615,7 +2615,7 @@ static int wp_page_copy(struct vm_fault *vmf)
 		 * Don't let another task, with possibly unlocked vma,
 		 * keep the mlocked page.
 		 */
-		if (page_copied && (vma->vm_flags & VM_LOCKED)) {
+		if (page_copied && (vmf->vma_flags & VM_LOCKED)) {
 			lock_page(old_page);	/* LRU manipulation */
 			if (PageMlocked(old_page))
 				munlock_vma_page(old_page);
@@ -2649,7 +2649,7 @@ static int wp_page_copy(struct vm_fault *vmf)
  */
 int finish_mkwrite_fault(struct vm_fault *vmf)
 {
-	WARN_ON_ONCE(!(vmf->vma->vm_flags & VM_SHARED));
+	WARN_ON_ONCE(!(vmf->vma_flags & VM_SHARED));
 	if (!pte_map_lock(vmf))
 		return VM_FAULT_RETRY;
 	/*
@@ -2751,7 +2751,7 @@ static int do_wp_page(struct vm_fault *vmf)
 		 * We should not cow pages in a shared writeable mapping.
 		 * Just mark the pages writable and/or call ops->pfn_mkwrite.
 		 */
-		if ((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
+		if ((vmf->vma_flags & (VM_WRITE|VM_SHARED)) ==
 				     (VM_WRITE|VM_SHARED))
 			return wp_pfn_shared(vmf);
 
@@ -2798,7 +2798,7 @@ static int do_wp_page(struct vm_fault *vmf)
 			return VM_FAULT_WRITE;
 		}
 		unlock_page(vmf->page);
-	} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
+	} else if (unlikely((vmf->vma_flags & (VM_WRITE|VM_SHARED)) ==
 					(VM_WRITE|VM_SHARED))) {
 		return wp_page_shared(vmf);
 	}
@@ -3067,7 +3067,7 @@ int do_swap_page(struct vm_fault *vmf)
 
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
-	pte = mk_pte(page, vma->vm_page_prot);
+	pte = mk_pte(page, vmf->vma_page_prot);
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
@@ -3093,7 +3093,7 @@ int do_swap_page(struct vm_fault *vmf)
 
 	swap_free(entry);
 	if (mem_cgroup_swap_full(page) ||
-	    (vma->vm_flags & VM_LOCKED) || PageMlocked(page))
+	    (vmf->vma_flags & VM_LOCKED) || PageMlocked(page))
 		try_to_free_swap(page);
 	unlock_page(page);
 	if (page != swapcache && swapcache) {
@@ -3150,7 +3150,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
 	pte_t entry;
 
 	/* File mapping without ->vm_ops ? */
-	if (vma->vm_flags & VM_SHARED)
+	if (vmf->vma_flags & VM_SHARED)
 		return VM_FAULT_SIGBUS;
 
 	/*
@@ -3174,7 +3174,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
 	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
 			!mm_forbids_zeropage(vma->vm_mm)) {
 		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
-						vma->vm_page_prot));
+						vmf->vma_page_prot));
 		if (!pte_map_lock(vmf))
 			return VM_FAULT_RETRY;
 		if (!pte_none(*vmf->pte))
@@ -3207,8 +3207,8 @@ static int do_anonymous_page(struct vm_fault *vmf)
 	 */
 	__SetPageUptodate(page);
 
-	entry = mk_pte(page, vma->vm_page_prot);
-	if (vma->vm_flags & VM_WRITE)
+	entry = mk_pte(page, vmf->vma_page_prot);
+	if (vmf->vma_flags & VM_WRITE)
 		entry = pte_mkwrite(pte_mkdirty(entry));
 
 	if (!pte_map_lock(vmf)) {
@@ -3404,7 +3404,7 @@ static int do_set_pmd(struct vm_fault *vmf, struct page *page)
 	for (i = 0; i < HPAGE_PMD_NR; i++)
 		flush_icache_page(vma, page + i);
 
-	entry = mk_huge_pmd(page, vma->vm_page_prot);
+	entry = mk_huge_pmd(page, vmf->vma_page_prot);
 	if (write)
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 
@@ -3478,11 +3478,11 @@ int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 		return VM_FAULT_NOPAGE;
 
 	flush_icache_page(vma, page);
-	entry = mk_pte(page, vma->vm_page_prot);
+	entry = mk_pte(page, vmf->vma_page_prot);
 	if (write)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/* copy-on-write page */
-	if (write && !(vma->vm_flags & VM_SHARED)) {
+	if (write && !(vmf->vma_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
@@ -3521,7 +3521,7 @@ int finish_fault(struct vm_fault *vmf)
 
 	/* Did we COW the page? */
 	if ((vmf->flags & FAULT_FLAG_WRITE) &&
-	    !(vmf->vma->vm_flags & VM_SHARED))
+	    !(vmf->vma_flags & VM_SHARED))
 		page = vmf->cow_page;
 	else
 		page = vmf->page;
@@ -3775,7 +3775,7 @@ static int do_fault(struct vm_fault *vmf)
 		ret = VM_FAULT_SIGBUS;
 	else if (!(vmf->flags & FAULT_FLAG_WRITE))
 		ret = do_read_fault(vmf);
-	else if (!(vma->vm_flags & VM_SHARED))
+	else if (!(vmf->vma_flags & VM_SHARED))
 		ret = do_cow_fault(vmf);
 	else
 		ret = do_shared_fault(vmf);
@@ -3832,7 +3832,7 @@ static int do_numa_page(struct vm_fault *vmf)
 	 * accessible ptes, some can allow access by kernel mode.
 	 */
 	pte = ptep_modify_prot_start(vma->vm_mm, vmf->address, vmf->pte);
-	pte = pte_modify(pte, vma->vm_page_prot);
+	pte = pte_modify(pte, vmf->vma_page_prot);
 	pte = pte_mkyoung(pte);
 	if (was_writable)
 		pte = pte_mkwrite(pte);
@@ -3866,7 +3866,7 @@ static int do_numa_page(struct vm_fault *vmf)
 	 * Flag if the page is shared between multiple address spaces. This
 	 * is later used when determining whether to group tasks together
 	 */
-	if (page_mapcount(page) > 1 && (vma->vm_flags & VM_SHARED))
+	if (page_mapcount(page) > 1 && (vmf->vma_flags & VM_SHARED))
 		flags |= TNF_SHARED;
 
 	last_cpupid = page_cpupid_last(page);
@@ -3911,7 +3911,7 @@ static inline int wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
 
 	/* COW handled on pte level: split pmd */
-	VM_BUG_ON_VMA(vmf->vma->vm_flags & VM_SHARED, vmf->vma);
+	VM_BUG_ON_VMA(vmf->vma_flags & VM_SHARED, vmf->vma);
 	__split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
 
 	return VM_FAULT_FALLBACK;
@@ -4058,6 +4058,8 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 		.flags = flags,
 		.pgoff = linear_page_index(vma, address),
 		.gfp_mask = __get_fault_gfp_mask(vma),
+		.vma_flags = vma->vm_flags,
+		.vma_page_prot = vma->vm_page_prot,
 	};
 	unsigned int dirty = flags & FAULT_FLAG_WRITE;
 	struct mm_struct *mm = vma->vm_mm;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 12/24] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (8 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 11/24] mm: Cache some VMA fields in the vm_fault structure Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-04-02 23:00   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 13/24] mm: Introduce __lru_cache_add_active_or_unevictable Laurent Dufour
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

migrate_misplaced_page() is only called during the page fault handling so
it's better to pass the pointer to the struct vm_fault instead of the vma.

This way during the speculative page fault path the saved vma->vm_flags
could be used.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/migrate.h | 4 ++--
 mm/memory.c             | 2 +-
 mm/migrate.c            | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index f2b4abbca55e..fd4c3ab7bd9c 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -126,14 +126,14 @@ static inline void __ClearPageMovable(struct page *page)
 #ifdef CONFIG_NUMA_BALANCING
 extern bool pmd_trans_migrating(pmd_t pmd);
 extern int migrate_misplaced_page(struct page *page,
-				  struct vm_area_struct *vma, int node);
+				  struct vm_fault *vmf, int node);
 #else
 static inline bool pmd_trans_migrating(pmd_t pmd)
 {
 	return false;
 }
 static inline int migrate_misplaced_page(struct page *page,
-					 struct vm_area_struct *vma, int node)
+					 struct vm_fault *vmf, int node)
 {
 	return -EAGAIN; /* can't migrate now */
 }
diff --git a/mm/memory.c b/mm/memory.c
index 46fe92b93682..412014d5785b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3880,7 +3880,7 @@ static int do_numa_page(struct vm_fault *vmf)
 	}
 
 	/* Migrate to the requested node */
-	migrated = migrate_misplaced_page(page, vma, target_nid);
+	migrated = migrate_misplaced_page(page, vmf, target_nid);
 	if (migrated) {
 		page_nid = target_nid;
 		flags |= TNF_MIGRATED;
diff --git a/mm/migrate.c b/mm/migrate.c
index 5d0dc7b85f90..ad8692ca6a4f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1900,7 +1900,7 @@ bool pmd_trans_migrating(pmd_t pmd)
  * node. Caller is expected to have an elevated reference count on
  * the page that will be dropped by this function before returning.
  */
-int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma,
+int migrate_misplaced_page(struct page *page, struct vm_fault *vmf,
 			   int node)
 {
 	pg_data_t *pgdat = NODE_DATA(node);
@@ -1913,7 +1913,7 @@ int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma,
 	 * with execute permissions as they are probably shared libraries.
 	 */
 	if (page_mapcount(page) != 1 && page_is_file_cache(page) &&
-	    (vma->vm_flags & VM_EXEC))
+	    (vmf->vma_flags & VM_EXEC))
 		goto out;
 
 	/*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 13/24] mm: Introduce __lru_cache_add_active_or_unevictable
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (9 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 12/24] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page() Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-04-02 23:11   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 14/24] mm: Introduce __maybe_mkwrite() Laurent Dufour
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

The speculative page fault handler which is run without holding the
mmap_sem is calling lru_cache_add_active_or_unevictable() but the vm_flags
is not guaranteed to remain constant.
Introducing __lru_cache_add_active_or_unevictable() which has the vma flags
value parameter instead of the vma pointer.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/swap.h | 10 ++++++++--
 mm/memory.c          |  8 ++++----
 mm/swap.c            |  6 +++---
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1985940af479..a7dc37e0e405 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -338,8 +338,14 @@ extern void deactivate_file_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
 extern void swap_setup(void);
 
-extern void lru_cache_add_active_or_unevictable(struct page *page,
-						struct vm_area_struct *vma);
+extern void __lru_cache_add_active_or_unevictable(struct page *page,
+						unsigned long vma_flags);
+
+static inline void lru_cache_add_active_or_unevictable(struct page *page,
+						struct vm_area_struct *vma)
+{
+	return __lru_cache_add_active_or_unevictable(page, vma->vm_flags);
+}
 
 /* linux/mm/vmscan.c */
 extern unsigned long zone_reclaimable_pages(struct zone *zone);
diff --git a/mm/memory.c b/mm/memory.c
index 412014d5785b..0a0a483d9a65 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2560,7 +2560,7 @@ static int wp_page_copy(struct vm_fault *vmf)
 		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
 		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(new_page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(new_page, vma);
+		__lru_cache_add_active_or_unevictable(new_page, vmf->vma_flags);
 		/*
 		 * We call the notify macro here because, when using secondary
 		 * mmu page tables (such as kvm shadow page tables), we want the
@@ -3083,7 +3083,7 @@ int do_swap_page(struct vm_fault *vmf)
 	if (unlikely(page != swapcache && swapcache)) {
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
 	} else {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
 		mem_cgroup_commit_charge(page, memcg, true, false);
@@ -3234,7 +3234,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, vmf->address, false);
 	mem_cgroup_commit_charge(page, memcg, false, false);
-	lru_cache_add_active_or_unevictable(page, vma);
+	__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
 
@@ -3486,7 +3486,7 @@ int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
 		page_add_file_rmap(page, false);
diff --git a/mm/swap.c b/mm/swap.c
index 3dd518832096..f2f9c587246f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -455,12 +455,12 @@ void lru_cache_add(struct page *page)
  * directly back onto it's zone's unevictable list, it does NOT use a
  * per cpu pagevec.
  */
-void lru_cache_add_active_or_unevictable(struct page *page,
-					 struct vm_area_struct *vma)
+void __lru_cache_add_active_or_unevictable(struct page *page,
+					   unsigned long vma_flags)
 {
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
-	if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
+	if (likely((vma_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
 		SetPageActive(page);
 	else if (!TestSetPageMlocked(page)) {
 		/*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 14/24] mm: Introduce __maybe_mkwrite()
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (10 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 13/24] mm: Introduce __lru_cache_add_active_or_unevictable Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-04-02 23:12   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 15/24] mm: Introduce __vm_normal_page() Laurent Dufour
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

The current maybe_mkwrite() is getting passed the pointer to the vma
structure to fetch the vm_flags field.

When dealing with the speculative page fault handler, it will be better to
rely on the cached vm_flags value stored in the vm_fault structure.

This patch introduce a __maybe_mkwrite() service which can be called by
passing the value of the vm_flags field.

There is no change functional changes expected for the other callers of
maybe_mkwrite().

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/mm.h | 9 +++++++--
 mm/memory.c        | 6 +++---
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dfa81a638b7c..a84ddc218bbd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -684,13 +684,18 @@ void free_compound_page(struct page *page);
  * pte_mkwrite.  But get_user_pages can cause write faults for mappings
  * that do not have writing enabled, when used by access_process_vm.
  */
-static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
+static inline pte_t __maybe_mkwrite(pte_t pte, unsigned long vma_flags)
 {
-	if (likely(vma->vm_flags & VM_WRITE))
+	if (likely(vma_flags & VM_WRITE))
 		pte = pte_mkwrite(pte);
 	return pte;
 }
 
+static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
+{
+	return __maybe_mkwrite(pte, vma->vm_flags);
+}
+
 int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 		struct page *page);
 int finish_fault(struct vm_fault *vmf);
diff --git a/mm/memory.c b/mm/memory.c
index 0a0a483d9a65..af0338fbc34d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2472,7 +2472,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
 
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = pte_mkyoung(vmf->orig_pte);
-	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	entry = __maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
 	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
 		update_mmu_cache(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -2549,8 +2549,8 @@ static int wp_page_copy(struct vm_fault *vmf)
 			inc_mm_counter_fast(mm, MM_ANONPAGES);
 		}
 		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
-		entry = mk_pte(new_page, vma->vm_page_prot);
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		entry = mk_pte(new_page, vmf->vma_page_prot);
+		entry = __maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
 		/*
 		 * Clear the pte entry and flush it first, before updating the
 		 * pte with the new entry. This will avoid a race condition
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (11 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 14/24] mm: Introduce __maybe_mkwrite() Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-04-02 23:18   ` David Rientjes
  2018-04-03 19:39   ` Jerome Glisse
  2018-03-13 17:59 ` [PATCH v9 16/24] mm: Introduce __page_add_new_anon_rmap() Laurent Dufour
                   ` (13 subsequent siblings)
  26 siblings, 2 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

When dealing with the speculative fault path we should use the VMA's field
cached value stored in the vm_fault structure.

Currently vm_normal_page() is using the pointer to the VMA to fetch the
vm_flags value. This patch provides a new __vm_normal_page() which is
receiving the vm_flags flags value as parameter.

Note: The speculative path is turned on for architecture providing support
for special PTE flag. So only the first block of vm_normal_page is used
during the speculative path.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/mm.h |  7 +++++--
 mm/memory.c        | 18 ++++++++++--------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a84ddc218bbd..73b8b99f482b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1263,8 +1263,11 @@ struct zap_details {
 	pgoff_t last_index;			/* Highest page->index to unmap */
 };
 
-struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
-			     pte_t pte, bool with_public_device);
+struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
+			      pte_t pte, bool with_public_device,
+			      unsigned long vma_flags);
+#define _vm_normal_page(vma, addr, pte, with_public_device) \
+	__vm_normal_page(vma, addr, pte, with_public_device, (vma)->vm_flags)
 #define vm_normal_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, false)
 
 struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index af0338fbc34d..184a0d663a76 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -826,8 +826,9 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
 #else
 # define HAVE_PTE_SPECIAL 0
 #endif
-struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
-			     pte_t pte, bool with_public_device)
+struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
+			      pte_t pte, bool with_public_device,
+			      unsigned long vma_flags)
 {
 	unsigned long pfn = pte_pfn(pte);
 
@@ -836,7 +837,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 			goto check_pfn;
 		if (vma->vm_ops && vma->vm_ops->find_special_page)
 			return vma->vm_ops->find_special_page(vma, addr);
-		if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
+		if (vma_flags & (VM_PFNMAP | VM_MIXEDMAP))
 			return NULL;
 		if (is_zero_pfn(pfn))
 			return NULL;
@@ -868,8 +869,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 
 	/* !HAVE_PTE_SPECIAL case follows: */
 
-	if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
-		if (vma->vm_flags & VM_MIXEDMAP) {
+	if (unlikely(vma_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
+		if (vma_flags & VM_MIXEDMAP) {
 			if (!pfn_valid(pfn))
 				return NULL;
 			goto out;
@@ -878,7 +879,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 			off = (addr - vma->vm_start) >> PAGE_SHIFT;
 			if (pfn == vma->vm_pgoff + off)
 				return NULL;
-			if (!is_cow_mapping(vma->vm_flags))
+			if (!is_cow_mapping(vma_flags))
 				return NULL;
 		}
 	}
@@ -2742,7 +2743,8 @@ static int do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 
-	vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte);
+	vmf->page = __vm_normal_page(vma, vmf->address, vmf->orig_pte, false,
+				     vmf->vma_flags);
 	if (!vmf->page) {
 		/*
 		 * VM_MIXEDMAP !pfn_valid() case, or VM_SOFTDIRTY clear on a
@@ -3839,7 +3841,7 @@ static int do_numa_page(struct vm_fault *vmf)
 	ptep_modify_prot_commit(vma->vm_mm, vmf->address, vmf->pte, pte);
 	update_mmu_cache(vma, vmf->address, vmf->pte);
 
-	page = vm_normal_page(vma, vmf->address, pte);
+	page = __vm_normal_page(vma, vmf->address, pte, false, vmf->vma_flags);
 	if (!page) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 16/24] mm: Introduce __page_add_new_anon_rmap()
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (12 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 15/24] mm: Introduce __vm_normal_page() Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-04-02 23:57   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock Laurent Dufour
                   ` (12 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

When dealing with speculative page fault handler, we may race with VMA
being split or merged. In this case the vma->vm_start and vm->vm_end
fields may not match the address the page fault is occurring.

This can only happens when the VMA is split but in that case, the
anon_vma pointer of the new VMA will be the same as the original one,
because in __split_vma the new->anon_vma is set to src->anon_vma when
*new = *vma.

So even if the VMA boundaries are not correct, the anon_vma pointer is
still valid.

If the VMA has been merged, then the VMA in which it has been merged
must have the same anon_vma pointer otherwise the merge can't be done.

So in all the case we know that the anon_vma is valid, since we have
checked before starting the speculative page fault that the anon_vma
pointer is valid for this VMA and since there is an anon_vma this
means that at one time a page has been backed and that before the VMA
is cleaned, the page table lock would have to be grab to clean the
PTE, and the anon_vma field is checked once the PTE is locked.

This patch introduce a new __page_add_new_anon_rmap() service which
doesn't check for the VMA boundaries, and create a new inline one
which do the check.

When called from a page fault handler, if this is not a speculative one,
there is a guarantee that vm_start and vm_end match the faulting address,
so this check is useless. In the context of the speculative page fault
handler, this check may be wrong but anon_vma is still valid as explained
above.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/rmap.h | 12 ++++++++++--
 mm/memory.c          |  8 ++++----
 mm/rmap.c            |  5 ++---
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 988d176472df..a5d282573093 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -174,8 +174,16 @@ void page_add_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long, bool);
 void do_page_add_anon_rmap(struct page *, struct vm_area_struct *,
 			   unsigned long, int);
-void page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
-		unsigned long, bool);
+void __page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
+			      unsigned long, bool);
+static inline void page_add_new_anon_rmap(struct page *page,
+					  struct vm_area_struct *vma,
+					  unsigned long address, bool compound)
+{
+	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
+	__page_add_new_anon_rmap(page, vma, address, compound);
+}
+
 void page_add_file_rmap(struct page *, bool);
 void page_remove_rmap(struct page *, bool);
 
diff --git a/mm/memory.c b/mm/memory.c
index 184a0d663a76..66517535514b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2559,7 +2559,7 @@ static int wp_page_copy(struct vm_fault *vmf)
 		 * thread doing COW.
 		 */
 		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
-		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
+		__page_add_new_anon_rmap(new_page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(new_page, memcg, false, false);
 		__lru_cache_add_active_or_unevictable(new_page, vmf->vma_flags);
 		/*
@@ -3083,7 +3083,7 @@ int do_swap_page(struct vm_fault *vmf)
 
 	/* ksm created a completely new copy */
 	if (unlikely(page != swapcache && swapcache)) {
-		page_add_new_anon_rmap(page, vma, vmf->address, false);
+		__page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
 	} else {
@@ -3234,7 +3234,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
 	}
 
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
-	page_add_new_anon_rmap(page, vma, vmf->address, false);
+	__page_add_new_anon_rmap(page, vma, vmf->address, false);
 	mem_cgroup_commit_charge(page, memcg, false, false);
 	__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
 setpte:
@@ -3486,7 +3486,7 @@ int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 	/* copy-on-write page */
 	if (write && !(vmf->vma_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
-		page_add_new_anon_rmap(page, vma, vmf->address, false);
+		__page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
 		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
 	} else {
diff --git a/mm/rmap.c b/mm/rmap.c
index 9eaa6354fe70..e028d660c304 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1136,7 +1136,7 @@ void do_page_add_anon_rmap(struct page *page,
 }
 
 /**
- * page_add_new_anon_rmap - add pte mapping to a new anonymous page
+ * __page_add_new_anon_rmap - add pte mapping to a new anonymous page
  * @page:	the page to add the mapping to
  * @vma:	the vm area in which the mapping is added
  * @address:	the user virtual address mapped
@@ -1146,12 +1146,11 @@ void do_page_add_anon_rmap(struct page *page,
  * This means the inc-and-test can be bypassed.
  * Page does not have to be locked.
  */
-void page_add_new_anon_rmap(struct page *page,
+void __page_add_new_anon_rmap(struct page *page,
 	struct vm_area_struct *vma, unsigned long address, bool compound)
 {
 	int nr = compound ? hpage_nr_pages(page) : 1;
 
-	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
 	__SetPageSwapBacked(page);
 	if (compound) {
 		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (13 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 16/24] mm: Introduce __page_add_new_anon_rmap() Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-14  8:48   ` Peter Zijlstra
                     ` (2 more replies)
  2018-03-13 17:59 ` [PATCH v9 18/24] mm: Provide speculative fault infrastructure Laurent Dufour
                   ` (11 subsequent siblings)
  26 siblings, 3 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

This change is inspired by the Peter's proposal patch [1] which was
protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
that particular case, and it is introducing major performance degradation
due to excessive scheduling operations.

To allow access to the mm_rb tree without grabbing the mmap_sem, this patch
is protecting it access using a rwlock.  As the mm_rb tree is a O(log n)
search it is safe to protect it using such a lock.  The VMA cache is not
protected by the new rwlock and it should not be used without holding the
mmap_sem.

To allow the picked VMA structure to be used once the rwlock is released, a
use count is added to the VMA structure. When the VMA is allocated it is
set to 1.  Each time the VMA is picked with the rwlock held its use count
is incremented. Each time the VMA is released it is decremented. When the
use count hits zero, this means that the VMA is no more used and should be
freed.

This patch is preparing for 2 kind of VMA access :
 - as usual, under the control of the mmap_sem,
 - without holding the mmap_sem for the speculative page fault handler.

Access done under the control the mmap_sem doesn't require to grab the
rwlock to protect read access to the mm_rb tree, but access in write must
be done under the protection of the rwlock too. This affects inserting and
removing of elements in the RB tree.

The patch is introducing 2 new functions:
 - vma_get() to find a VMA based on an address by holding the new rwlock.
 - vma_put() to release the VMA when its no more used.
These services are designed to be used when access are made to the RB tree
without holding the mmap_sem.

When a VMA is removed from the RB tree, its vma->vm_rb field is cleared and
we rely on the WMB done when releasing the rwlock to serialize the write
with the RMB done in a later patch to check for the VMA's validity.

When free_vma is called, the file associated with the VMA is closed
immediately, but the policy and the file structure remained in used until
the VMA's use count reach 0, which may happens later when exiting an
in progress speculative page fault.

[1] https://patchwork.kernel.org/patch/5108281/

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/mm_types.h |   4 ++
 kernel/fork.c            |   3 ++
 mm/init-mm.c             |   3 ++
 mm/internal.h            |   6 +++
 mm/mmap.c                | 122 ++++++++++++++++++++++++++++++++++-------------
 5 files changed, 106 insertions(+), 32 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 34fde7111e88..28c763ea1036 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -335,6 +335,7 @@ struct vm_area_struct {
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
 #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
 	seqcount_t vm_sequence;
+	atomic_t vm_ref_count;		/* see vma_get(), vma_put() */
 #endif
 } __randomize_layout;
 
@@ -353,6 +354,9 @@ struct kioctx_table;
 struct mm_struct {
 	struct vm_area_struct *mmap;		/* list of VMAs */
 	struct rb_root mm_rb;
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	rwlock_t mm_rb_lock;
+#endif
 	u32 vmacache_seqnum;                   /* per-thread vmacache */
 #ifdef CONFIG_MMU
 	unsigned long (*get_unmapped_area) (struct file *filp,
diff --git a/kernel/fork.c b/kernel/fork.c
index a32e1c4311b2..9ecac4f725b9 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -889,6 +889,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm->mmap = NULL;
 	mm->mm_rb = RB_ROOT;
 	mm->vmacache_seqnum = 0;
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	rwlock_init(&mm->mm_rb_lock);
+#endif
 	atomic_set(&mm->mm_users, 1);
 	atomic_set(&mm->mm_count, 1);
 	init_rwsem(&mm->mmap_sem);
diff --git a/mm/init-mm.c b/mm/init-mm.c
index f94d5d15ebc0..e71ac37a98c4 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -17,6 +17,9 @@
 
 struct mm_struct init_mm = {
 	.mm_rb		= RB_ROOT,
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	.mm_rb_lock	= __RW_LOCK_UNLOCKED(init_mm.mm_rb_lock),
+#endif
 	.pgd		= swapper_pg_dir,
 	.mm_users	= ATOMIC_INIT(2),
 	.mm_count	= ATOMIC_INIT(1),
diff --git a/mm/internal.h b/mm/internal.h
index 62d8c34e63d5..fb2667b20f0a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -40,6 +40,12 @@ void page_writeback_init(void);
 
 int do_swap_page(struct vm_fault *vmf);
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+extern struct vm_area_struct *get_vma(struct mm_struct *mm,
+				      unsigned long addr);
+extern void put_vma(struct vm_area_struct *vma);
+#endif
+
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 		unsigned long floor, unsigned long ceiling);
 
diff --git a/mm/mmap.c b/mm/mmap.c
index ac32b577a0c9..182359a5445c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -160,6 +160,27 @@ void unlink_file_vma(struct vm_area_struct *vma)
 	}
 }
 
+static void __free_vma(struct vm_area_struct *vma)
+{
+	if (vma->vm_file)
+		fput(vma->vm_file);
+	mpol_put(vma_policy(vma));
+	kmem_cache_free(vm_area_cachep, vma);
+}
+
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+void put_vma(struct vm_area_struct *vma)
+{
+	if (atomic_dec_and_test(&vma->vm_ref_count))
+		__free_vma(vma);
+}
+#else
+static inline void put_vma(struct vm_area_struct *vma)
+{
+	return __free_vma(vma);
+}
+#endif
+
 /*
  * Close a vm structure and free it, returning the next.
  */
@@ -170,10 +191,7 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 	might_sleep();
 	if (vma->vm_ops && vma->vm_ops->close)
 		vma->vm_ops->close(vma);
-	if (vma->vm_file)
-		fput(vma->vm_file);
-	mpol_put(vma_policy(vma));
-	kmem_cache_free(vm_area_cachep, vma);
+	put_vma(vma);
 	return next;
 }
 
@@ -393,6 +411,14 @@ static void validate_mm(struct mm_struct *mm)
 #define validate_mm(mm) do { } while (0)
 #endif
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+#define mm_rb_write_lock(mm)	write_lock(&(mm)->mm_rb_lock)
+#define mm_rb_write_unlock(mm)	write_unlock(&(mm)->mm_rb_lock)
+#else
+#define mm_rb_write_lock(mm)	do { } while (0)
+#define mm_rb_write_unlock(mm)	do { } while (0)
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
+
 RB_DECLARE_CALLBACKS(static, vma_gap_callbacks, struct vm_area_struct, vm_rb,
 		     unsigned long, rb_subtree_gap, vma_compute_subtree_gap)
 
@@ -411,26 +437,37 @@ static void vma_gap_update(struct vm_area_struct *vma)
 }
 
 static inline void vma_rb_insert(struct vm_area_struct *vma,
-				 struct rb_root *root)
+				 struct mm_struct *mm)
 {
+	struct rb_root *root = &mm->mm_rb;
+
 	/* All rb_subtree_gap values must be consistent prior to insertion */
 	validate_mm_rb(root, NULL);
 
 	rb_insert_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
 }
 
-static void __vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
+static void __vma_rb_erase(struct vm_area_struct *vma, struct mm_struct *mm)
 {
+	struct rb_root *root = &mm->mm_rb;
 	/*
 	 * Note rb_erase_augmented is a fairly large inline function,
 	 * so make sure we instantiate it only once with our desired
 	 * augmented rbtree callbacks.
 	 */
+	mm_rb_write_lock(mm);
 	rb_erase_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
+	mm_rb_write_unlock(mm); /* wmb */
+
+	/*
+	 * Ensure the removal is complete before clearing the node.
+	 * Matched by vma_has_changed()/handle_speculative_fault().
+	 */
+	RB_CLEAR_NODE(&vma->vm_rb);
 }
 
 static __always_inline void vma_rb_erase_ignore(struct vm_area_struct *vma,
-						struct rb_root *root,
+						struct mm_struct *mm,
 						struct vm_area_struct *ignore)
 {
 	/*
@@ -438,21 +475,21 @@ static __always_inline void vma_rb_erase_ignore(struct vm_area_struct *vma,
 	 * with the possible exception of the "next" vma being erased if
 	 * next->vm_start was reduced.
 	 */
-	validate_mm_rb(root, ignore);
+	validate_mm_rb(&mm->mm_rb, ignore);
 
-	__vma_rb_erase(vma, root);
+	__vma_rb_erase(vma, mm);
 }
 
 static __always_inline void vma_rb_erase(struct vm_area_struct *vma,
-					 struct rb_root *root)
+					 struct mm_struct *mm)
 {
 	/*
 	 * All rb_subtree_gap values must be consistent prior to erase,
 	 * with the possible exception of the vma being erased.
 	 */
-	validate_mm_rb(root, vma);
+	validate_mm_rb(&mm->mm_rb, vma);
 
-	__vma_rb_erase(vma, root);
+	__vma_rb_erase(vma, mm);
 }
 
 /*
@@ -558,10 +595,6 @@ void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
 	else
 		mm->highest_vm_end = vm_end_gap(vma);
 
-#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
-	seqcount_init(&vma->vm_sequence);
-#endif
-
 	/*
 	 * vma->vm_prev wasn't known when we followed the rbtree to find the
 	 * correct insertion point for that vma. As a result, we could not
@@ -571,10 +604,15 @@ void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * immediately update the gap to the correct value. Finally we
 	 * rebalance the rbtree after all augmented values have been set.
 	 */
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	atomic_set(&vma->vm_ref_count, 1);
+#endif
+	mm_rb_write_lock(mm);
 	rb_link_node(&vma->vm_rb, rb_parent, rb_link);
 	vma->rb_subtree_gap = 0;
 	vma_gap_update(vma);
-	vma_rb_insert(vma, &mm->mm_rb);
+	vma_rb_insert(vma, mm);
+	mm_rb_write_unlock(mm);
 }
 
 static void __vma_link_file(struct vm_area_struct *vma)
@@ -650,7 +688,7 @@ static __always_inline void __vma_unlink_common(struct mm_struct *mm,
 {
 	struct vm_area_struct *next;
 
-	vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
+	vma_rb_erase_ignore(vma, mm, ignore);
 	next = vma->vm_next;
 	if (has_prev)
 		prev->vm_next = next;
@@ -923,16 +961,13 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
 	}
 
 	if (remove_next) {
-		if (file) {
+		if (file)
 			uprobe_munmap(next, next->vm_start, next->vm_end);
-			fput(file);
-		}
 		if (next->anon_vma)
 			anon_vma_merge(vma, next);
 		mm->map_count--;
-		mpol_put(vma_policy(next));
 		vm_raw_write_end(next);
-		kmem_cache_free(vm_area_cachep, next);
+		put_vma(next);
 		/*
 		 * In mprotect's case 6 (see comments on vma_merge),
 		 * we must remove another next too. It would clutter
@@ -2182,15 +2217,11 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 EXPORT_SYMBOL(get_unmapped_area);
 
 /* Look up the first VMA which satisfies  addr < vm_end,  NULL if none. */
-struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
+static struct vm_area_struct *__find_vma(struct mm_struct *mm,
+					 unsigned long addr)
 {
 	struct rb_node *rb_node;
-	struct vm_area_struct *vma;
-
-	/* Check the cache first. */
-	vma = vmacache_find(mm, addr);
-	if (likely(vma))
-		return vma;
+	struct vm_area_struct *vma = NULL;
 
 	rb_node = mm->mm_rb.rb_node;
 
@@ -2208,13 +2239,40 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 			rb_node = rb_node->rb_right;
 	}
 
+	return vma;
+}
+
+struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
+{
+	struct vm_area_struct *vma;
+
+	/* Check the cache first. */
+	vma = vmacache_find(mm, addr);
+	if (likely(vma))
+		return vma;
+
+	vma = __find_vma(mm, addr);
 	if (vma)
 		vmacache_update(addr, vma);
 	return vma;
 }
-
 EXPORT_SYMBOL(find_vma);
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+struct vm_area_struct *get_vma(struct mm_struct *mm, unsigned long addr)
+{
+	struct vm_area_struct *vma = NULL;
+
+	read_lock(&mm->mm_rb_lock);
+	vma = __find_vma(mm, addr);
+	if (vma)
+		atomic_inc(&vma->vm_ref_count);
+	read_unlock(&mm->mm_rb_lock);
+
+	return vma;
+}
+#endif
+
 /*
  * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
  */
@@ -2582,7 +2640,7 @@ detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
 	insertion_point = (prev ? &prev->vm_next : &mm->mmap);
 	vma->vm_prev = NULL;
 	do {
-		vma_rb_erase(vma, &mm->mm_rb);
+		vma_rb_erase(vma, mm);
 		mm->map_count--;
 		tail_vma = vma;
 		vma = vma->vm_next;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 18/24] mm: Provide speculative fault infrastructure
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (14 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 19/24] mm: Adding speculative page fault failure trace events Laurent Dufour
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

From: Peter Zijlstra <peterz@infradead.org>

Provide infrastructure to do a speculative fault (not holding
mmap_sem).

The not holding of mmap_sem means we can race against VMA
change/removal and page-table destruction. We use the SRCU VMA freeing
to keep the VMA around. We use the VMA seqcount to detect change
(including umapping / page-table deletion) and we use gup_fast() style
page-table walking to deal with page-table races.

Once we've obtained the page and are ready to update the PTE, we
validate if the state we started the fault with is still valid, if
not, we'll fail the fault with VM_FAULT_RETRY, otherwise we update the
PTE and we're done.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

[Manage the newly introduced pte_spinlock() for speculative page
 fault to fail if the VMA is touched in our back]
[Rename vma_is_dead() to vma_has_changed() and declare it here]
[Fetch p4d and pud]
[Set vmd.sequence in __handle_mm_fault()]
[Abort speculative path when handle_userfault() has to be called]
[Add additional VMA's flags checks in handle_speculative_fault()]
[Clear FAULT_FLAG_ALLOW_RETRY in handle_speculative_fault()]
[Don't set vmf->pte and vmf->ptl if pte_map_lock() failed]
[Remove warning comment about waiting for !seq&1 since we don't want
 to wait]
[Remove warning about no huge page support, mention it explictly]
[Don't call do_fault() in the speculative path as __do_fault() calls
 vma->vm_ops->fault() which may want to release mmap_sem]
[Only vm_fault pointer argument for vma_has_changed()]
[Fix check against huge page, calling pmd_trans_huge()]
[Use READ_ONCE() when reading VMA's fields in the speculative path]
[Explicitly check for __HAVE_ARCH_PTE_SPECIAL as we can't support for
 processing done in vm_normal_page()]
[Check that vma->anon_vma is already set when starting the speculative
 path]
[Check for memory policy as we can't support MPOL_INTERLEAVE case due to
 the processing done in mpol_misplaced()]
[Don't support VMA growing up or down]
[Move check on vm_sequence just before calling handle_pte_fault()]
[Don't build SPF services if !CONFIG_SPECULATIVE_PAGE_FAULT]
[Add mem cgroup oom check]
[Use READ_ONCE to access p*d entries]
[Replace deprecated ACCESS_ONCE() by READ_ONCE() in vma_has_changed()]
[Don't fetch pte again in handle_pte_fault() when running the speculative
 path]
[Check PMD against concurrent collapsing operation]
[Try spin lock the pte during the speculative path to avoid deadlock with
 other CPU's invalidating the TLB and requiring this CPU to catch the
 inter processor's interrupt]
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/hugetlb_inline.h |   2 +-
 include/linux/mm.h             |   8 +
 include/linux/pagemap.h        |   4 +-
 mm/internal.h                  |  16 +-
 mm/memory.c                    | 342 ++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 364 insertions(+), 8 deletions(-)

diff --git a/include/linux/hugetlb_inline.h b/include/linux/hugetlb_inline.h
index 0660a03d37d9..9e25283d6fc9 100644
--- a/include/linux/hugetlb_inline.h
+++ b/include/linux/hugetlb_inline.h
@@ -8,7 +8,7 @@
 
 static inline bool is_vm_hugetlb_page(struct vm_area_struct *vma)
 {
-	return !!(vma->vm_flags & VM_HUGETLB);
+	return !!(READ_ONCE(vma->vm_flags) & VM_HUGETLB);
 }
 
 #else
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 73b8b99f482b..1acc3f4e07d1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -329,6 +329,10 @@ struct vm_fault {
 	gfp_t gfp_mask;			/* gfp mask to be used for allocations */
 	pgoff_t pgoff;			/* Logical page offset based on vma */
 	unsigned long address;		/* Faulting virtual address */
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	unsigned int sequence;
+	pmd_t orig_pmd;			/* value of PMD at the time of fault */
+#endif
 	pmd_t *pmd;			/* Pointer to pmd entry matching
 					 * the 'address' */
 	pud_t *pud;			/* Pointer to pud entry matching
@@ -1351,6 +1355,10 @@ int invalidate_inode_page(struct page *page);
 #ifdef CONFIG_MMU
 extern int handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 		unsigned int flags);
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+extern int handle_speculative_fault(struct mm_struct *mm,
+				    unsigned long address, unsigned int flags);
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
 extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 			    unsigned long address, unsigned int fault_flags,
 			    bool *unlocked);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 34ce3ebf97d5..70e4d2688e7b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -456,8 +456,8 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
 	pgoff_t pgoff;
 	if (unlikely(is_vm_hugetlb_page(vma)))
 		return linear_hugepage_index(vma, address);
-	pgoff = (address - vma->vm_start) >> PAGE_SHIFT;
-	pgoff += vma->vm_pgoff;
+	pgoff = (address - READ_ONCE(vma->vm_start)) >> PAGE_SHIFT;
+	pgoff += READ_ONCE(vma->vm_pgoff);
 	return pgoff;
 }
 
diff --git a/mm/internal.h b/mm/internal.h
index fb2667b20f0a..10b188c87fa4 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -44,7 +44,21 @@ int do_swap_page(struct vm_fault *vmf);
 extern struct vm_area_struct *get_vma(struct mm_struct *mm,
 				      unsigned long addr);
 extern void put_vma(struct vm_area_struct *vma);
-#endif
+
+static inline bool vma_has_changed(struct vm_fault *vmf)
+{
+	int ret = RB_EMPTY_NODE(&vmf->vma->vm_rb);
+	unsigned int seq = READ_ONCE(vmf->vma->vm_sequence.sequence);
+
+	/*
+	 * Matches both the wmb in write_seqlock_{begin,end}() and
+	 * the wmb in vma_rb_erase().
+	 */
+	smp_rmb();
+
+	return ret || seq != vmf->sequence;
+}
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 		unsigned long floor, unsigned long ceiling);
diff --git a/mm/memory.c b/mm/memory.c
index 66517535514b..f0f2caa11282 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -769,7 +769,8 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
 	if (page)
 		dump_page(page, "bad pte");
 	pr_alert("addr:%p vm_flags:%08lx anon_vma:%p mapping:%p index:%lx\n",
-		 (void *)addr, vma->vm_flags, vma->anon_vma, mapping, index);
+		 (void *)addr, READ_ONCE(vma->vm_flags), vma->anon_vma,
+		 mapping, index);
 	pr_alert("file:%pD fault:%pf mmap:%pf readpage:%pf\n",
 		 vma->vm_file,
 		 vma->vm_ops ? vma->vm_ops->fault : NULL,
@@ -2295,19 +2296,127 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL_GPL(apply_to_page_range);
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
 static bool pte_spinlock(struct vm_fault *vmf)
 {
+	bool ret = false;
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	pmd_t pmdval;
+#endif
+
+	/* Check if vma is still valid */
+	if (!(vmf->flags & FAULT_FLAG_SPECULATIVE)) {
+		vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+		spin_lock(vmf->ptl);
+		return true;
+	}
+
+	local_irq_disable();
+	if (vma_has_changed(vmf))
+		goto out;
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	/*
+	 * We check if the pmd value is still the same to ensure that there
+	 * is not a huge collapse operation in progress in our back.
+	 */
+	pmdval = READ_ONCE(*vmf->pmd);
+	if (!pmd_same(pmdval, vmf->orig_pmd))
+		goto out;
+#endif
+
+	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+	if (unlikely(!spin_trylock(vmf->ptl)))
+		goto out;
+
+	if (vma_has_changed(vmf)) {
+		spin_unlock(vmf->ptl);
+		goto out;
+	}
+
+	ret = true;
+out:
+	local_irq_enable();
+	return ret;
+}
+
+static bool pte_map_lock(struct vm_fault *vmf)
+{
+	bool ret = false;
+	pte_t *pte;
+	spinlock_t *ptl;
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	pmd_t pmdval;
+#endif
+
+	if (!(vmf->flags & FAULT_FLAG_SPECULATIVE)) {
+		vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
+					       vmf->address, &vmf->ptl);
+		return true;
+	}
+
+	/*
+	 * The first vma_has_changed() guarantees the page-tables are still
+	 * valid, having IRQs disabled ensures they stay around, hence the
+	 * second vma_has_changed() to make sure they are still valid once
+	 * we've got the lock. After that a concurrent zap_pte_range() will
+	 * block on the PTL and thus we're safe.
+	 */
+	local_irq_disable();
+	if (vma_has_changed(vmf))
+		goto out;
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	/*
+	 * We check if the pmd value is still the same to ensure that there
+	 * is not a huge collapse operation in progress in our back.
+	 */
+	pmdval = READ_ONCE(*vmf->pmd);
+	if (!pmd_same(pmdval, vmf->orig_pmd))
+		goto out;
+#endif
+
+	/*
+	 * Same as pte_offset_map_lock() except that we call
+	 * spin_trylock() in place of spin_lock() to avoid race with
+	 * unmap path which may have the lock and wait for this CPU
+	 * to invalidate TLB but this CPU has irq disabled.
+	 * Since we are in a speculative patch, accept it could fail
+	 */
+	ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+	pte = pte_offset_map(vmf->pmd, vmf->address);
+	if (unlikely(!spin_trylock(ptl))) {
+		pte_unmap(pte);
+		goto out;
+	}
+
+	if (vma_has_changed(vmf)) {
+		pte_unmap_unlock(pte, ptl);
+		goto out;
+	}
+
+	vmf->pte = pte;
+	vmf->ptl = ptl;
+	ret = true;
+out:
+	local_irq_enable();
+	return ret;
+}
+#else
+static inline bool pte_spinlock(struct vm_fault *vmf)
+{
 	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
 	spin_lock(vmf->ptl);
 	return true;
 }
 
-static bool pte_map_lock(struct vm_fault *vmf)
+static inline bool pte_map_lock(struct vm_fault *vmf)
 {
 	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
 				       vmf->address, &vmf->ptl);
 	return true;
 }
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
 
 /*
  * handle_pte_fault chooses page fault handler according to an entry which was
@@ -3184,6 +3293,14 @@ static int do_anonymous_page(struct vm_fault *vmf)
 		ret = check_stable_address_space(vma->vm_mm);
 		if (ret)
 			goto unlock;
+		/*
+		 * Don't call the userfaultfd during the speculative path.
+		 * We already checked for the VMA to not be managed through
+		 * userfaultfd, but it may be set in our back once we have lock
+		 * the pte. In such a case we can ignore it this time.
+		 */
+		if (vmf->flags & FAULT_FLAG_SPECULATIVE)
+			goto setpte;
 		/* Deliver the page fault to userland, check inside PT lock */
 		if (userfaultfd_missing(vma)) {
 			pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -3226,7 +3343,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
 		goto release;
 
 	/* Deliver the page fault to userland, check inside PT lock */
-	if (userfaultfd_missing(vma)) {
+	if (!(vmf->flags & FAULT_FLAG_SPECULATIVE) && userfaultfd_missing(vma)) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		mem_cgroup_cancel_charge(page, memcg, false);
 		put_page(page);
@@ -3969,13 +4086,22 @@ static int handle_pte_fault(struct vm_fault *vmf)
 
 	if (unlikely(pmd_none(*vmf->pmd))) {
 		/*
+		 * In the case of the speculative page fault handler we abort
+		 * the speculative path immediately as the pmd is probably
+		 * in the way to be converted in a huge one. We will try
+		 * again holding the mmap_sem (which implies that the collapse
+		 * operation is done).
+		 */
+		if (vmf->flags & FAULT_FLAG_SPECULATIVE)
+			return VM_FAULT_RETRY;
+		/*
 		 * Leave __pte_alloc() until later: because vm_ops->fault may
 		 * want to allocate huge page, and if we expose page table
 		 * for an instant, it will be difficult to retract from
 		 * concurrent faults and from rmap lookups.
 		 */
 		vmf->pte = NULL;
-	} else {
+	} else if (!(vmf->flags & FAULT_FLAG_SPECULATIVE)) {
 		/* See comment in pte_alloc_one_map() */
 		if (pmd_devmap_trans_unstable(vmf->pmd))
 			return 0;
@@ -3984,6 +4110,9 @@ static int handle_pte_fault(struct vm_fault *vmf)
 		 * pmd from under us anymore at this point because we hold the
 		 * mmap_sem read mode and khugepaged takes it in write mode.
 		 * So now it's safe to run pte_offset_map().
+		 * This is not applicable to the speculative page fault handler
+		 * but in that case, the pte is fetched earlier in
+		 * handle_speculative_fault().
 		 */
 		vmf->pte = pte_offset_map(vmf->pmd, vmf->address);
 		vmf->orig_pte = *vmf->pte;
@@ -4006,6 +4135,8 @@ static int handle_pte_fault(struct vm_fault *vmf)
 	if (!vmf->pte) {
 		if (vma_is_anonymous(vmf->vma))
 			return do_anonymous_page(vmf);
+		else if (vmf->flags & FAULT_FLAG_SPECULATIVE)
+			return VM_FAULT_RETRY;
 		else
 			return do_fault(vmf);
 	}
@@ -4103,6 +4234,9 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 	vmf.pmd = pmd_alloc(mm, vmf.pud, address);
 	if (!vmf.pmd)
 		return VM_FAULT_OOM;
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	vmf.sequence = raw_read_seqcount(&vma->vm_sequence);
+#endif
 	if (pmd_none(*vmf.pmd) && transparent_hugepage_enabled(vma)) {
 		ret = create_huge_pmd(&vmf);
 		if (!(ret & VM_FAULT_FALLBACK))
@@ -4136,6 +4270,206 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 	return handle_pte_fault(&vmf);
 }
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+
+#ifndef __HAVE_ARCH_PTE_SPECIAL
+/* This is required by vm_normal_page() */
+#error "Speculative page fault handler requires __HAVE_ARCH_PTE_SPECIAL"
+#endif
+
+/*
+ * vm_normal_page() adds some processing which should be done while
+ * hodling the mmap_sem.
+ */
+int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
+			     unsigned int flags)
+{
+	struct vm_fault vmf = {
+		.address = address,
+	};
+	pgd_t *pgd, pgdval;
+	p4d_t *p4d, p4dval;
+	pud_t pudval;
+	int seq, ret = VM_FAULT_RETRY;
+	struct vm_area_struct *vma;
+#ifdef CONFIG_NUMA
+	struct mempolicy *pol;
+#endif
+
+	/* Clear flags that may lead to release the mmap_sem to retry */
+	flags &= ~(FAULT_FLAG_ALLOW_RETRY|FAULT_FLAG_KILLABLE);
+	flags |= FAULT_FLAG_SPECULATIVE;
+
+	vma = get_vma(mm, address);
+	if (!vma)
+		return ret;
+
+	seq = raw_read_seqcount(&vma->vm_sequence); /* rmb <-> seqlock,vma_rb_erase() */
+	if (seq & 1)
+		goto out_put;
+
+	/*
+	 * Can't call vm_ops service has we don't know what they would do
+	 * with the VMA.
+	 * This include huge page from hugetlbfs.
+	 */
+	if (vma->vm_ops)
+		goto out_put;
+
+	/*
+	 * __anon_vma_prepare() requires the mmap_sem to be held
+	 * because vm_next and vm_prev must be safe. This can't be guaranteed
+	 * in the speculative path.
+	 */
+	if (unlikely(!vma->anon_vma))
+		goto out_put;
+
+	vmf.vma_flags = READ_ONCE(vma->vm_flags);
+	vmf.vma_page_prot = READ_ONCE(vma->vm_page_prot);
+
+	/* Can't call userland page fault handler in the speculative path */
+	if (unlikely(vmf.vma_flags & VM_UFFD_MISSING))
+		goto out_put;
+
+	if (vmf.vma_flags & VM_GROWSDOWN || vmf.vma_flags & VM_GROWSUP)
+		/*
+		 * This could be detected by the check address against VMA's
+		 * boundaries but we want to trace it as not supported instead
+		 * of changed.
+		 */
+		goto out_put;
+
+	if (address < READ_ONCE(vma->vm_start)
+	    || READ_ONCE(vma->vm_end) <= address)
+		goto out_put;
+
+	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
+				       flags & FAULT_FLAG_INSTRUCTION,
+				       flags & FAULT_FLAG_REMOTE)) {
+		ret = VM_FAULT_SIGSEGV;
+		goto out_put;
+	}
+
+	/* This is one is required to check that the VMA has write access set */
+	if (flags & FAULT_FLAG_WRITE) {
+		if (unlikely(!(vmf.vma_flags & VM_WRITE))) {
+			ret = VM_FAULT_SIGSEGV;
+			goto out_put;
+		}
+	} else if (unlikely(!(vmf.vma_flags & (VM_READ|VM_EXEC|VM_WRITE)))) {
+		ret = VM_FAULT_SIGSEGV;
+		goto out_put;
+	}
+
+#ifdef CONFIG_NUMA
+	/*
+	 * MPOL_INTERLEAVE implies additional check in mpol_misplaced() which
+	 * are not compatible with the speculative page fault processing.
+	 */
+	pol = __get_vma_policy(vma, address);
+	if (!pol)
+		pol = get_task_policy(current);
+	if (pol && pol->mode == MPOL_INTERLEAVE)
+		goto out_put;
+#endif
+
+	/*
+	 * Do a speculative lookup of the PTE entry.
+	 */
+	local_irq_disable();
+	pgd = pgd_offset(mm, address);
+	pgdval = READ_ONCE(*pgd);
+	if (pgd_none(pgdval) || unlikely(pgd_bad(pgdval)))
+		goto out_walk;
+
+	p4d = p4d_offset(pgd, address);
+	p4dval = READ_ONCE(*p4d);
+	if (p4d_none(p4dval) || unlikely(p4d_bad(p4dval)))
+		goto out_walk;
+
+	vmf.pud = pud_offset(p4d, address);
+	pudval = READ_ONCE(*vmf.pud);
+	if (pud_none(pudval) || unlikely(pud_bad(pudval)))
+		goto out_walk;
+
+	/* Huge pages at PUD level are not supported. */
+	if (unlikely(pud_trans_huge(pudval)))
+		goto out_walk;
+
+	vmf.pmd = pmd_offset(vmf.pud, address);
+	vmf.orig_pmd = READ_ONCE(*vmf.pmd);
+	/*
+	 * pmd_none could mean that a hugepage collapse is in progress
+	 * in our back as collapse_huge_page() mark it before
+	 * invalidating the pte (which is done once the IPI is catched
+	 * by all CPU and we have interrupt disabled).
+	 * For this reason we cannot handle THP in a speculative way since we
+	 * can't safely indentify an in progress collapse operation done in our
+	 * back on that PMD.
+	 * Regarding the order of the following checks, see comment in
+	 * pmd_devmap_trans_unstable()
+	 */
+	if (unlikely(pmd_devmap(vmf.orig_pmd) ||
+		     pmd_none(vmf.orig_pmd) || pmd_trans_huge(vmf.orig_pmd) ||
+		     is_swap_pmd(vmf.orig_pmd)))
+		goto out_walk;
+
+	/*
+	 * The above does not allocate/instantiate page-tables because doing so
+	 * would lead to the possibility of instantiating page-tables after
+	 * free_pgtables() -- and consequently leaking them.
+	 *
+	 * The result is that we take at least one !speculative fault per PMD
+	 * in order to instantiate it.
+	 */
+
+	vmf.pte = pte_offset_map(vmf.pmd, address);
+	vmf.orig_pte = READ_ONCE(*vmf.pte);
+	barrier(); /* See comment in handle_pte_fault() */
+	if (pte_none(vmf.orig_pte)) {
+		pte_unmap(vmf.pte);
+		vmf.pte = NULL;
+	}
+
+	vmf.vma = vma;
+	vmf.pgoff = linear_page_index(vma, address);
+	vmf.gfp_mask = __get_fault_gfp_mask(vma);
+	vmf.sequence = seq;
+	vmf.flags = flags;
+
+	local_irq_enable();
+
+	/*
+	 * We need to re-validate the VMA after checking the bounds, otherwise
+	 * we might have a false positive on the bounds.
+	 */
+	if (read_seqcount_retry(&vma->vm_sequence, seq))
+		goto out_put;
+
+	mem_cgroup_oom_enable();
+	ret = handle_pte_fault(&vmf);
+	mem_cgroup_oom_disable();
+
+	put_vma(vma);
+
+	/*
+	 * The task may have entered a memcg OOM situation but
+	 * if the allocation error was handled gracefully (no
+	 * VM_FAULT_OOM), there is no need to kill anything.
+	 * Just clean up the OOM state peacefully.
+	 */
+	if (task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))
+		mem_cgroup_oom_synchronize(false);
+	return ret;
+
+out_walk:
+	local_irq_enable();
+out_put:
+	put_vma(vma);
+	return ret;
+}
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
+
 /*
  * By the time we get here, we already hold the mm semaphore
  *
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 19/24] mm: Adding speculative page fault failure trace events
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (15 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 18/24] mm: Provide speculative fault infrastructure Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 20/24] perf: Add a speculative page fault sw event Laurent Dufour
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

This patch a set of new trace events to collect the speculative page fault
event failures.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/trace/events/pagefault.h | 87 ++++++++++++++++++++++++++++++++++++++++
 mm/memory.c                      | 62 ++++++++++++++++++++++------
 2 files changed, 136 insertions(+), 13 deletions(-)
 create mode 100644 include/trace/events/pagefault.h

diff --git a/include/trace/events/pagefault.h b/include/trace/events/pagefault.h
new file mode 100644
index 000000000000..1d793f8c739b
--- /dev/null
+++ b/include/trace/events/pagefault.h
@@ -0,0 +1,87 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM pagefault
+
+#if !defined(_TRACE_PAGEFAULT_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_PAGEFAULT_H
+
+#include <linux/tracepoint.h>
+#include <linux/mm.h>
+
+DECLARE_EVENT_CLASS(spf,
+
+	TP_PROTO(unsigned long caller,
+		 struct vm_area_struct *vma, unsigned long address),
+
+	TP_ARGS(caller, vma, address),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, caller)
+		__field(unsigned long, vm_start)
+		__field(unsigned long, vm_end)
+		__field(unsigned long, address)
+	),
+
+	TP_fast_assign(
+		__entry->caller		= caller;
+		__entry->vm_start	= vma->vm_start;
+		__entry->vm_end		= vma->vm_end;
+		__entry->address	= address;
+	),
+
+	TP_printk("ip:%lx vma:%lx-%lx address:%lx",
+		  __entry->caller, __entry->vm_start, __entry->vm_end,
+		  __entry->address)
+);
+
+DEFINE_EVENT(spf, spf_pte_lock,
+
+	TP_PROTO(unsigned long caller,
+		 struct vm_area_struct *vma, unsigned long address),
+
+	TP_ARGS(caller, vma, address)
+);
+
+DEFINE_EVENT(spf, spf_vma_changed,
+
+	TP_PROTO(unsigned long caller,
+		 struct vm_area_struct *vma, unsigned long address),
+
+	TP_ARGS(caller, vma, address)
+);
+
+DEFINE_EVENT(spf, spf_vma_noanon,
+
+	TP_PROTO(unsigned long caller,
+		 struct vm_area_struct *vma, unsigned long address),
+
+	TP_ARGS(caller, vma, address)
+);
+
+DEFINE_EVENT(spf, spf_vma_notsup,
+
+	TP_PROTO(unsigned long caller,
+		 struct vm_area_struct *vma, unsigned long address),
+
+	TP_ARGS(caller, vma, address)
+);
+
+DEFINE_EVENT(spf, spf_vma_access,
+
+	TP_PROTO(unsigned long caller,
+		 struct vm_area_struct *vma, unsigned long address),
+
+	TP_ARGS(caller, vma, address)
+);
+
+DEFINE_EVENT(spf, spf_pmd_changed,
+
+	TP_PROTO(unsigned long caller,
+		 struct vm_area_struct *vma, unsigned long address),
+
+	TP_ARGS(caller, vma, address)
+);
+
+#endif /* _TRACE_PAGEFAULT_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/mm/memory.c b/mm/memory.c
index f0f2caa11282..f39c4a4df703 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -80,6 +80,9 @@
 
 #include "internal.h"
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/pagefault.h>
+
 #if defined(LAST_CPUPID_NOT_IN_PAGE_FLAGS) && !defined(CONFIG_COMPILE_TEST)
 #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid.
 #endif
@@ -2312,8 +2315,10 @@ static bool pte_spinlock(struct vm_fault *vmf)
 	}
 
 	local_irq_disable();
-	if (vma_has_changed(vmf))
+	if (vma_has_changed(vmf)) {
+		trace_spf_vma_changed(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
+	}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	/*
@@ -2321,16 +2326,21 @@ static bool pte_spinlock(struct vm_fault *vmf)
 	 * is not a huge collapse operation in progress in our back.
 	 */
 	pmdval = READ_ONCE(*vmf->pmd);
-	if (!pmd_same(pmdval, vmf->orig_pmd))
+	if (!pmd_same(pmdval, vmf->orig_pmd)) {
+		trace_spf_pmd_changed(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
+	}
 #endif
 
 	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
-	if (unlikely(!spin_trylock(vmf->ptl)))
+	if (unlikely(!spin_trylock(vmf->ptl))) {
+		trace_spf_pte_lock(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
+	}
 
 	if (vma_has_changed(vmf)) {
 		spin_unlock(vmf->ptl);
+		trace_spf_vma_changed(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
 	}
 
@@ -2363,8 +2373,10 @@ static bool pte_map_lock(struct vm_fault *vmf)
 	 * block on the PTL and thus we're safe.
 	 */
 	local_irq_disable();
-	if (vma_has_changed(vmf))
+	if (vma_has_changed(vmf)) {
+		trace_spf_vma_changed(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
+	}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	/*
@@ -2372,8 +2384,10 @@ static bool pte_map_lock(struct vm_fault *vmf)
 	 * is not a huge collapse operation in progress in our back.
 	 */
 	pmdval = READ_ONCE(*vmf->pmd);
-	if (!pmd_same(pmdval, vmf->orig_pmd))
+	if (!pmd_same(pmdval, vmf->orig_pmd)) {
+		trace_spf_pmd_changed(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
+	}
 #endif
 
 	/*
@@ -2387,11 +2401,13 @@ static bool pte_map_lock(struct vm_fault *vmf)
 	pte = pte_offset_map(vmf->pmd, vmf->address);
 	if (unlikely(!spin_trylock(ptl))) {
 		pte_unmap(pte);
+		trace_spf_pte_lock(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
 	}
 
 	if (vma_has_changed(vmf)) {
 		pte_unmap_unlock(pte, ptl);
+		trace_spf_vma_changed(_RET_IP_, vmf->vma, vmf->address);
 		goto out;
 	}
 
@@ -4305,47 +4321,60 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 		return ret;
 
 	seq = raw_read_seqcount(&vma->vm_sequence); /* rmb <-> seqlock,vma_rb_erase() */
-	if (seq & 1)
+	if (seq & 1) {
+		trace_spf_vma_changed(_RET_IP_, vma, address);
 		goto out_put;
+	}
 
 	/*
 	 * Can't call vm_ops service has we don't know what they would do
 	 * with the VMA.
 	 * This include huge page from hugetlbfs.
 	 */
-	if (vma->vm_ops)
+	if (vma->vm_ops) {
+		trace_spf_vma_notsup(_RET_IP_, vma, address);
 		goto out_put;
+	}
 
 	/*
 	 * __anon_vma_prepare() requires the mmap_sem to be held
 	 * because vm_next and vm_prev must be safe. This can't be guaranteed
 	 * in the speculative path.
 	 */
-	if (unlikely(!vma->anon_vma))
+	if (unlikely(!vma->anon_vma)) {
+		trace_spf_vma_notsup(_RET_IP_, vma, address);
 		goto out_put;
+	}
 
 	vmf.vma_flags = READ_ONCE(vma->vm_flags);
 	vmf.vma_page_prot = READ_ONCE(vma->vm_page_prot);
 
 	/* Can't call userland page fault handler in the speculative path */
-	if (unlikely(vmf.vma_flags & VM_UFFD_MISSING))
+	if (unlikely(vmf.vma_flags & VM_UFFD_MISSING)) {
+		trace_spf_vma_notsup(_RET_IP_, vma, address);
 		goto out_put;
+	}
 
-	if (vmf.vma_flags & VM_GROWSDOWN || vmf.vma_flags & VM_GROWSUP)
+	if (vmf.vma_flags & VM_GROWSDOWN || vmf.vma_flags & VM_GROWSUP) {
 		/*
 		 * This could be detected by the check address against VMA's
 		 * boundaries but we want to trace it as not supported instead
 		 * of changed.
 		 */
+		trace_spf_vma_notsup(_RET_IP_, vma, address);
 		goto out_put;
+	}
 
 	if (address < READ_ONCE(vma->vm_start)
-	    || READ_ONCE(vma->vm_end) <= address)
+	    || READ_ONCE(vma->vm_end) <= address) {
+		trace_spf_vma_changed(_RET_IP_, vma, address);
 		goto out_put;
+	}
 
 	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
 				       flags & FAULT_FLAG_INSTRUCTION,
 				       flags & FAULT_FLAG_REMOTE)) {
+		trace_spf_vma_access(_RET_IP_, vma, address);
 		ret = VM_FAULT_SIGSEGV;
 		goto out_put;
 	}
@@ -4353,10 +4382,12 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	/* This is one is required to check that the VMA has write access set */
 	if (flags & FAULT_FLAG_WRITE) {
 		if (unlikely(!(vmf.vma_flags & VM_WRITE))) {
+			trace_spf_vma_access(_RET_IP_, vma, address);
 			ret = VM_FAULT_SIGSEGV;
 			goto out_put;
 		}
 	} else if (unlikely(!(vmf.vma_flags & (VM_READ|VM_EXEC|VM_WRITE)))) {
+		trace_spf_vma_access(_RET_IP_, vma, address);
 		ret = VM_FAULT_SIGSEGV;
 		goto out_put;
 	}
@@ -4369,8 +4400,10 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	pol = __get_vma_policy(vma, address);
 	if (!pol)
 		pol = get_task_policy(current);
-	if (pol && pol->mode == MPOL_INTERLEAVE)
+	if (pol && pol->mode == MPOL_INTERLEAVE) {
+		trace_spf_vma_notsup(_RET_IP_, vma, address);
 		goto out_put;
+	}
 #endif
 
 	/*
@@ -4443,8 +4476,10 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	 * We need to re-validate the VMA after checking the bounds, otherwise
 	 * we might have a false positive on the bounds.
 	 */
-	if (read_seqcount_retry(&vma->vm_sequence, seq))
+	if (read_seqcount_retry(&vma->vm_sequence, seq)) {
+		trace_spf_vma_changed(_RET_IP_, vma, address);
 		goto out_put;
+	}
 
 	mem_cgroup_oom_enable();
 	ret = handle_pte_fault(&vmf);
@@ -4463,6 +4498,7 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	return ret;
 
 out_walk:
+	trace_spf_vma_notsup(_RET_IP_, vma, address);
 	local_irq_enable();
 out_put:
 	put_vma(vma);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 20/24] perf: Add a speculative page fault sw event
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (16 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 19/24] mm: Adding speculative page fault failure trace events Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-26 21:43   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 21/24] perf tools: Add support for the SPF perf event Laurent Dufour
                   ` (8 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

Add a new software event to count succeeded speculative page faults.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/uapi/linux/perf_event.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 6f873503552d..a6ddab9edeec 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -112,6 +112,7 @@ enum perf_sw_ids {
 	PERF_COUNT_SW_EMULATION_FAULTS		= 8,
 	PERF_COUNT_SW_DUMMY			= 9,
 	PERF_COUNT_SW_BPF_OUTPUT		= 10,
+	PERF_COUNT_SW_SPF			= 11,
 
 	PERF_COUNT_SW_MAX,			/* non-ABI */
 };
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 21/24] perf tools: Add support for the SPF perf event
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (17 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 20/24] perf: Add a speculative page fault sw event Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-26 21:44   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 22/24] mm: Speculative page fault handler return VMA Laurent Dufour
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

Add support for the new speculative faults event.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 tools/include/uapi/linux/perf_event.h | 1 +
 tools/perf/util/evsel.c               | 1 +
 tools/perf/util/parse-events.c        | 4 ++++
 tools/perf/util/parse-events.l        | 1 +
 tools/perf/util/python.c              | 1 +
 5 files changed, 8 insertions(+)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 6f873503552d..a6ddab9edeec 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -112,6 +112,7 @@ enum perf_sw_ids {
 	PERF_COUNT_SW_EMULATION_FAULTS		= 8,
 	PERF_COUNT_SW_DUMMY			= 9,
 	PERF_COUNT_SW_BPF_OUTPUT		= 10,
+	PERF_COUNT_SW_SPF			= 11,
 
 	PERF_COUNT_SW_MAX,			/* non-ABI */
 };
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ef351688b797..45b954019118 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -428,6 +428,7 @@ const char *perf_evsel__sw_names[PERF_COUNT_SW_MAX] = {
 	"alignment-faults",
 	"emulation-faults",
 	"dummy",
+	"speculative-faults",
 };
 
 static const char *__perf_evsel__sw_name(u64 config)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 34589c427e52..2a8189c6d5fc 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -140,6 +140,10 @@ struct event_symbol event_symbols_sw[PERF_COUNT_SW_MAX] = {
 		.symbol = "bpf-output",
 		.alias  = "",
 	},
+	[PERF_COUNT_SW_SPF] = {
+		.symbol = "speculative-faults",
+		.alias	= "spf",
+	},
 };
 
 #define __PERF_EVENT_FIELD(config, name) \
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 655ecff636a8..5d6782426b30 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -308,6 +308,7 @@ emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EM
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
 duration_time					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
 bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
+speculative-faults|spf				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_SPF); }
 
 	/*
 	 * We have to handle the kernel PMU event cycles-ct/cycles-t/mem-loads/mem-stores separately.
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 2918cac7a142..00dd227959e6 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -1174,6 +1174,7 @@ static struct {
 	PERF_CONST(COUNT_SW_ALIGNMENT_FAULTS),
 	PERF_CONST(COUNT_SW_EMULATION_FAULTS),
 	PERF_CONST(COUNT_SW_DUMMY),
+	PERF_CONST(COUNT_SW_SPF),
 
 	PERF_CONST(SAMPLE_IP),
 	PERF_CONST(SAMPLE_TID),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 22/24] mm: Speculative page fault handler return VMA
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (18 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 21/24] perf tools: Add support for the SPF perf event Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-13 17:59 ` [PATCH v9 23/24] x86/mm: Add speculative pagefault handling Laurent Dufour
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

When the speculative page fault handler is returning VM_RETRY, there is a
chance that VMA fetched without grabbing the mmap_sem can be reused by the
legacy page fault handler.  By reusing it, we avoid calling find_vma()
again. To achieve, that we must ensure that the VMA structure will not be
freed in our back. This is done by getting the reference on it (get_vma())
and by assuming that the caller will call the new service
can_reuse_spf_vma() once it has grabbed the mmap_sem.

can_reuse_spf_vma() is first checking that the VMA is still in the RB tree
, and then that the VMA's boundaries matched the passed address and release
the reference on the VMA so that it can be freed if needed.

In the case the VMA is freed, can_reuse_spf_vma() will have returned false
as the VMA is no more in the RB tree.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 include/linux/mm.h |   5 +-
 mm/memory.c        | 136 +++++++++++++++++++++++++++++++++--------------------
 2 files changed, 88 insertions(+), 53 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1acc3f4e07d1..38a8c0041fd0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1357,7 +1357,10 @@ extern int handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 		unsigned int flags);
 #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
 extern int handle_speculative_fault(struct mm_struct *mm,
-				    unsigned long address, unsigned int flags);
+				    unsigned long address, unsigned int flags,
+				    struct vm_area_struct **vma);
+extern bool can_reuse_spf_vma(struct vm_area_struct *vma,
+			      unsigned long address);
 #endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
 extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 			    unsigned long address, unsigned int fault_flags,
diff --git a/mm/memory.c b/mm/memory.c
index f39c4a4df703..16d3f5f4ffdd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4292,13 +4292,22 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 /* This is required by vm_normal_page() */
 #error "Speculative page fault handler requires __HAVE_ARCH_PTE_SPECIAL"
 #endif
-
 /*
  * vm_normal_page() adds some processing which should be done while
  * hodling the mmap_sem.
  */
+
+/*
+ * Tries to handle the page fault in a speculative way, without grabbing the
+ * mmap_sem.
+ * When VM_FAULT_RETRY is returned, the vma pointer is valid and this vma must
+ * be checked later when the mmap_sem has been grabbed by calling
+ * can_reuse_spf_vma().
+ * This is needed as the returned vma is kept in memory until the call to
+ * can_reuse_spf_vma() is made.
+ */
 int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
-			     unsigned int flags)
+			     unsigned int flags, struct vm_area_struct **vma)
 {
 	struct vm_fault vmf = {
 		.address = address,
@@ -4307,7 +4316,6 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	p4d_t *p4d, p4dval;
 	pud_t pudval;
 	int seq, ret = VM_FAULT_RETRY;
-	struct vm_area_struct *vma;
 #ifdef CONFIG_NUMA
 	struct mempolicy *pol;
 #endif
@@ -4316,14 +4324,16 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	flags &= ~(FAULT_FLAG_ALLOW_RETRY|FAULT_FLAG_KILLABLE);
 	flags |= FAULT_FLAG_SPECULATIVE;
 
-	vma = get_vma(mm, address);
-	if (!vma)
+	*vma = get_vma(mm, address);
+	if (!*vma)
 		return ret;
+	vmf.vma = *vma;
 
-	seq = raw_read_seqcount(&vma->vm_sequence); /* rmb <-> seqlock,vma_rb_erase() */
+	/* rmb <-> seqlock,vma_rb_erase() */
+	seq = raw_read_seqcount(&vmf.vma->vm_sequence);
 	if (seq & 1) {
-		trace_spf_vma_changed(_RET_IP_, vma, address);
-		goto out_put;
+		trace_spf_vma_changed(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 
 	/*
@@ -4331,9 +4341,9 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	 * with the VMA.
 	 * This include huge page from hugetlbfs.
 	 */
-	if (vma->vm_ops) {
-		trace_spf_vma_notsup(_RET_IP_, vma, address);
-		goto out_put;
+	if (vmf.vma->vm_ops) {
+		trace_spf_vma_notsup(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 
 	/*
@@ -4341,18 +4351,18 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	 * because vm_next and vm_prev must be safe. This can't be guaranteed
 	 * in the speculative path.
 	 */
-	if (unlikely(!vma->anon_vma)) {
-		trace_spf_vma_notsup(_RET_IP_, vma, address);
-		goto out_put;
+	if (unlikely(!vmf.vma->anon_vma)) {
+		trace_spf_vma_notsup(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 
-	vmf.vma_flags = READ_ONCE(vma->vm_flags);
-	vmf.vma_page_prot = READ_ONCE(vma->vm_page_prot);
+	vmf.vma_flags = READ_ONCE(vmf.vma->vm_flags);
+	vmf.vma_page_prot = READ_ONCE(vmf.vma->vm_page_prot);
 
 	/* Can't call userland page fault handler in the speculative path */
 	if (unlikely(vmf.vma_flags & VM_UFFD_MISSING)) {
-		trace_spf_vma_notsup(_RET_IP_, vma, address);
-		goto out_put;
+		trace_spf_vma_notsup(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 
 	if (vmf.vma_flags & VM_GROWSDOWN || vmf.vma_flags & VM_GROWSUP) {
@@ -4361,48 +4371,39 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 		 * boundaries but we want to trace it as not supported instead
 		 * of changed.
 		 */
-		trace_spf_vma_notsup(_RET_IP_, vma, address);
-		goto out_put;
+		trace_spf_vma_notsup(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 
-	if (address < READ_ONCE(vma->vm_start)
-	    || READ_ONCE(vma->vm_end) <= address) {
-		trace_spf_vma_changed(_RET_IP_, vma, address);
-		goto out_put;
+	if (address < READ_ONCE(vmf.vma->vm_start)
+	    || READ_ONCE(vmf.vma->vm_end) <= address) {
+		trace_spf_vma_changed(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 
-	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
+	if (!arch_vma_access_permitted(vmf.vma, flags & FAULT_FLAG_WRITE,
 				       flags & FAULT_FLAG_INSTRUCTION,
-				       flags & FAULT_FLAG_REMOTE)) {
-		trace_spf_vma_access(_RET_IP_, vma, address);
-		ret = VM_FAULT_SIGSEGV;
-		goto out_put;
-	}
+				       flags & FAULT_FLAG_REMOTE))
+		goto out_segv;
 
 	/* This is one is required to check that the VMA has write access set */
 	if (flags & FAULT_FLAG_WRITE) {
-		if (unlikely(!(vmf.vma_flags & VM_WRITE))) {
-			trace_spf_vma_access(_RET_IP_, vma, address);
-			ret = VM_FAULT_SIGSEGV;
-			goto out_put;
-		}
-	} else if (unlikely(!(vmf.vma_flags & (VM_READ|VM_EXEC|VM_WRITE)))) {
-		trace_spf_vma_access(_RET_IP_, vma, address);
-		ret = VM_FAULT_SIGSEGV;
-		goto out_put;
-	}
+		if (unlikely(!(vmf.vma_flags & VM_WRITE)))
+			goto out_segv;
+	} else if (unlikely(!(vmf.vma_flags & (VM_READ|VM_EXEC|VM_WRITE))))
+		goto out_segv;
 
 #ifdef CONFIG_NUMA
 	/*
 	 * MPOL_INTERLEAVE implies additional check in mpol_misplaced() which
 	 * are not compatible with the speculative page fault processing.
 	 */
-	pol = __get_vma_policy(vma, address);
+	pol = __get_vma_policy(vmf.vma, address);
 	if (!pol)
 		pol = get_task_policy(current);
 	if (pol && pol->mode == MPOL_INTERLEAVE) {
-		trace_spf_vma_notsup(_RET_IP_, vma, address);
-		goto out_put;
+		trace_spf_vma_notsup(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 #endif
 
@@ -4464,9 +4465,8 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 		vmf.pte = NULL;
 	}
 
-	vmf.vma = vma;
-	vmf.pgoff = linear_page_index(vma, address);
-	vmf.gfp_mask = __get_fault_gfp_mask(vma);
+	vmf.pgoff = linear_page_index(vmf.vma, address);
+	vmf.gfp_mask = __get_fault_gfp_mask(vmf.vma);
 	vmf.sequence = seq;
 	vmf.flags = flags;
 
@@ -4476,16 +4476,22 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	 * We need to re-validate the VMA after checking the bounds, otherwise
 	 * we might have a false positive on the bounds.
 	 */
-	if (read_seqcount_retry(&vma->vm_sequence, seq)) {
-		trace_spf_vma_changed(_RET_IP_, vma, address);
-		goto out_put;
+	if (read_seqcount_retry(&vmf.vma->vm_sequence, seq)) {
+		trace_spf_vma_changed(_RET_IP_, vmf.vma, address);
+		return ret;
 	}
 
 	mem_cgroup_oom_enable();
 	ret = handle_pte_fault(&vmf);
 	mem_cgroup_oom_disable();
 
-	put_vma(vma);
+	/*
+	 * If there is no need to retry, don't return the vma to the caller.
+	 */
+	if (!(ret & VM_FAULT_RETRY)) {
+		put_vma(vmf.vma);
+		*vma = NULL;
+	}
 
 	/*
 	 * The task may have entered a memcg OOM situation but
@@ -4498,9 +4504,35 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
 	return ret;
 
 out_walk:
-	trace_spf_vma_notsup(_RET_IP_, vma, address);
+	trace_spf_vma_notsup(_RET_IP_, vmf.vma, address);
 	local_irq_enable();
-out_put:
+	return ret;
+
+out_segv:
+	trace_spf_vma_access(_RET_IP_, vmf.vma, address);
+	/*
+	 * We don't return VM_FAULT_RETRY so the caller is not expected to
+	 * retrieve the fetched VMA.
+	 */
+	put_vma(vmf.vma);
+	*vma = NULL;
+	return VM_FAULT_SIGSEGV;
+}
+
+/*
+ * This is used to know if the vma fetch in the speculative page fault handler
+ * is still valid when trying the regular fault path while holding the
+ * mmap_sem.
+ * The call to put_vma(vma) must be made after checking the vma's fields, as
+ * the vma may be freed by put_vma(). In such a case it is expected that false
+ * is returned.
+ */
+bool can_reuse_spf_vma(struct vm_area_struct *vma, unsigned long address)
+{
+	bool ret;
+
+	ret = !RB_EMPTY_NODE(&vma->vm_rb) &&
+		vma->vm_start <= address && address < vma->vm_end;
 	put_vma(vma);
 	return ret;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 23/24] x86/mm: Add speculative pagefault handling
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (19 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 22/24] mm: Speculative page fault handler return VMA Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-26 21:41   ` David Rientjes
  2018-03-13 17:59 ` [PATCH v9 24/24] powerpc/mm: Add speculative page fault Laurent Dufour
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

From: Peter Zijlstra <peterz@infradead.org>

Try a speculative fault before acquiring mmap_sem, if it returns with
VM_FAULT_RETRY continue with the mmap_sem acquisition and do the
traditional fault.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

[Clearing of FAULT_FLAG_ALLOW_RETRY is now done in
 handle_speculative_fault()]
[Retry with usual fault path in the case VM_ERROR is returned by
 handle_speculative_fault(). This allows signal to be delivered]
[Don't build SPF call if !CONFIG_SPECULATIVE_PAGE_FAULT]
[Try speculative fault path only for multi threaded processes]
[Try to the VMA fetch during the speculative path in case of retry]
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 arch/x86/mm/fault.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e6af2b464c3d..a73cf227edd6 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1239,6 +1239,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 		unsigned long address)
 {
 	struct vm_area_struct *vma;
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	struct vm_area_struct *spf_vma = NULL;
+#endif
 	struct task_struct *tsk;
 	struct mm_struct *mm;
 	int fault, major = 0;
@@ -1332,6 +1335,27 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 	if (error_code & X86_PF_INSTR)
 		flags |= FAULT_FLAG_INSTRUCTION;
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	if ((error_code & X86_PF_USER) && (atomic_read(&mm->mm_users) > 1)) {
+		fault = handle_speculative_fault(mm, address, flags,
+						 &spf_vma);
+
+		if (!(fault & VM_FAULT_RETRY)) {
+			if (!(fault & VM_FAULT_ERROR)) {
+				perf_sw_event(PERF_COUNT_SW_SPF, 1,
+					      regs, address);
+				goto done;
+			}
+			/*
+			 * In case of error we need the pkey value, but
+			 * can't get it from the spf_vma as it is only returned
+			 * when VM_FAULT_RETRY is returned. So we have to
+			 * retry the page fault with the mmap_sem grabbed.
+			 */
+		}
+	}
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
+
 	/*
 	 * When running in the kernel we expect faults to occur only to
 	 * addresses in user space.  All other faults represent errors in
@@ -1365,7 +1389,16 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 		might_sleep();
 	}
 
-	vma = find_vma(mm, address);
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	if (spf_vma) {
+		if (can_reuse_spf_vma(spf_vma, address))
+			vma = spf_vma;
+		else
+			vma = find_vma(mm, address);
+		spf_vma = NULL;
+	} else
+#endif
+		vma = find_vma(mm, address);
 	if (unlikely(!vma)) {
 		bad_area(regs, error_code, address);
 		return;
@@ -1451,6 +1484,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
 		return;
 	}
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+done:
+#endif
 	/*
 	 * Major/minor page fault accounting. If any of the events
 	 * returned VM_FAULT_MAJOR, we account it as a major fault.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH v9 24/24] powerpc/mm: Add speculative page fault
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (20 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 23/24] x86/mm: Add speculative pagefault handling Laurent Dufour
@ 2018-03-13 17:59 ` Laurent Dufour
  2018-03-26 21:39   ` David Rientjes
  2018-03-14 13:11 ` [PATCH v9 00/24] Speculative page faults Michal Hocko
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-13 17:59 UTC (permalink / raw)
  To: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan
  Cc: linux-kernel, linux-mm, haren, khandual, npiggin, bsingharora,
	Tim Chen, linuxppc-dev, x86

This patch enable the speculative page fault on the PowerPC
architecture.

This will try a speculative page fault without holding the mmap_sem,
if it returns with VM_FAULT_RETRY, the mmap_sem is acquired and the
traditional page fault processing is done.

The speculative path is only tried for multithreaded process as there is no
risk of contention on the mmap_sem otherwise.

Build on if CONFIG_SPECULATIVE_PAGE_FAULT is defined (currently for
BOOK3S_64 && SMP).

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 arch/powerpc/mm/fault.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 866446cf2d9a..104f3cc86b51 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -392,6 +392,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 			   unsigned long error_code)
 {
 	struct vm_area_struct * vma;
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	struct vm_area_struct *spf_vma = NULL;
+#endif
 	struct mm_struct *mm = current->mm;
 	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
  	int is_exec = TRAP(regs) == 0x400;
@@ -459,6 +462,20 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (is_exec)
 		flags |= FAULT_FLAG_INSTRUCTION;
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	if (is_user && (atomic_read(&mm->mm_users) > 1)) {
+		/* let's try a speculative page fault without grabbing the
+		 * mmap_sem.
+		 */
+		fault = handle_speculative_fault(mm, address, flags, &spf_vma);
+		if (!(fault & VM_FAULT_RETRY)) {
+			perf_sw_event(PERF_COUNT_SW_SPF, 1,
+				      regs, address);
+			goto done;
+		}
+	}
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
+
 	/* When running in the kernel we expect faults to occur only to
 	 * addresses in user space.  All other faults represent errors in the
 	 * kernel and should generate an OOPS.  Unfortunately, in the case of an
@@ -489,7 +506,16 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		might_sleep();
 	}
 
-	vma = find_vma(mm, address);
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	if (spf_vma) {
+		if (can_reuse_spf_vma(spf_vma, address))
+			vma = spf_vma;
+		else
+			vma =  find_vma(mm, address);
+		spf_vma = NULL;
+	} else
+#endif
+		vma = find_vma(mm, address);
 	if (unlikely(!vma))
 		return bad_area(regs, address);
 	if (likely(vma->vm_start <= address))
@@ -568,6 +594,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	up_read(&current->mm->mmap_sem);
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+done:
+#endif
 	if (unlikely(fault & VM_FAULT_ERROR))
 		return mm_fault_error(regs, address, fault);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
  2018-03-13 17:59 ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock Laurent Dufour
@ 2018-03-14  8:48   ` Peter Zijlstra
  2018-03-14 16:25     ` Laurent Dufour
  2018-03-16 10:23   ` [mm] b33ddf50eb: INFO:trying_to_register_non-static_key kernel test robot
  2018-04-03  0:11   ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock David Rientjes
  2 siblings, 1 reply; 98+ messages in thread
From: Peter Zijlstra @ 2018-03-14  8:48 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, akpm, kirill, ak, mhocko, dave, jack, Matthew Wilcox,
	benh, mpe, paulus, Thomas Gleixner, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, Mar 13, 2018 at 06:59:47PM +0100, Laurent Dufour wrote:
> This change is inspired by the Peter's proposal patch [1] which was
> protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
> that particular case, and it is introducing major performance degradation
> due to excessive scheduling operations.

Do you happen to have a little more detail on that?

> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 34fde7111e88..28c763ea1036 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -335,6 +335,7 @@ struct vm_area_struct {
>  	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
>  #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>  	seqcount_t vm_sequence;
> +	atomic_t vm_ref_count;		/* see vma_get(), vma_put() */
>  #endif
>  } __randomize_layout;
>  
> @@ -353,6 +354,9 @@ struct kioctx_table;
>  struct mm_struct {
>  	struct vm_area_struct *mmap;		/* list of VMAs */
>  	struct rb_root mm_rb;
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	rwlock_t mm_rb_lock;
> +#endif
>  	u32 vmacache_seqnum;                   /* per-thread vmacache */
>  #ifdef CONFIG_MMU
>  	unsigned long (*get_unmapped_area) (struct file *filp,

When I tried this, it simply traded contention on mmap_sem for
contention on these two cachelines.

This was for the concurrent fault benchmark, where mmap_sem is only ever
acquired for reading (so no blocking ever happens) and the bottle-neck
was really pure cacheline access.

Only by using RCU can you avoid that thrashing.

Also note that if your database allocates the one giant mapping, it'll
be _one_ VMA and that vm_ref_count gets _very_ hot indeed.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 00/24] Speculative page faults
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (21 preceding siblings ...)
  2018-03-13 17:59 ` [PATCH v9 24/24] powerpc/mm: Add speculative page fault Laurent Dufour
@ 2018-03-14 13:11 ` Michal Hocko
  2018-03-14 13:33   ` Laurent Dufour
  2018-04-13 13:34   ` Laurent Dufour
  2018-03-22  1:21 ` Ganesh Mahendran
                   ` (3 subsequent siblings)
  26 siblings, 2 replies; 98+ messages in thread
From: Michal Hocko @ 2018-03-14 13:11 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, dave, jack, Matthew Wilcox,
	benh, mpe, paulus, Thomas Gleixner, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue 13-03-18 18:59:30, Laurent Dufour wrote:
> Changes since v8:
>  - Don't check PMD when locking the pte when THP is disabled
>    Thanks to Daniel Jordan for reporting this.
>  - Rebase on 4.16

Is this really worth reposting the whole pile? I mean this is at v9,
each doing little changes. It is quite tiresome to barely get to a
bookmarked version just to find out that there are 2 new versions out.

I am sorry to be grumpy and I can understand some frustration it doesn't
move forward that easilly but this is a _big_ change. We should start
with a real high level review rather than doing small changes here and
there and reach v20 quickly.

I am planning to find some time to look at it but the spare cycles are
so rare these days...
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 00/24] Speculative page faults
  2018-03-14 13:11 ` [PATCH v9 00/24] Speculative page faults Michal Hocko
@ 2018-03-14 13:33   ` Laurent Dufour
  2018-04-13 13:34   ` Laurent Dufour
  1 sibling, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-14 13:33 UTC (permalink / raw)
  To: Michal Hocko
  Cc: paulmck, peterz, akpm, kirill, ak, dave, jack, Matthew Wilcox,
	benh, mpe, paulus, Thomas Gleixner, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 14/03/2018 14:11, Michal Hocko wrote:
> On Tue 13-03-18 18:59:30, Laurent Dufour wrote:
>> Changes since v8:
>>  - Don't check PMD when locking the pte when THP is disabled
>>    Thanks to Daniel Jordan for reporting this.
>>  - Rebase on 4.16
> 
> Is this really worth reposting the whole pile? I mean this is at v9,
> each doing little changes. It is quite tiresome to barely get to a
> bookmarked version just to find out that there are 2 new versions out.

I agree, I could have sent only a change for the concerned patch. But the
previous series has been sent a month ago and this one is rebased on the 4.16
kernel.

> I am sorry to be grumpy and I can understand some frustration it doesn't
> move forward that easilly but this is a _big_ change. We should start
> with a real high level review rather than doing small changes here and
> there and reach v20 quickly.
> 
> I am planning to find some time to look at it but the spare cycles are
> so rare these days...

I understand that this is a big change and I'll try to not post a new series
until I get more feedback from this one.

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
  2018-03-14  8:48   ` Peter Zijlstra
@ 2018-03-14 16:25     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-14 16:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: paulmck, akpm, kirill, ak, mhocko, dave, jack, Matthew Wilcox,
	benh, mpe, paulus, Thomas Gleixner, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 14/03/2018 09:48, Peter Zijlstra wrote:
> On Tue, Mar 13, 2018 at 06:59:47PM +0100, Laurent Dufour wrote:
>> This change is inspired by the Peter's proposal patch [1] which was
>> protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
>> that particular case, and it is introducing major performance degradation
>> due to excessive scheduling operations.
> 
> Do you happen to have a little more detail on that?

This has been reported by kemi who find bad performance when running some
benchmarks on top of the v5 series:
https://patchwork.kernel.org/patch/9999687/

It appears that SRCU is generating a lot of additional scheduling to manage the
freeing of the VMA structure. SRCU is dealing through per cpu ressources but
the SRCU callback is And since we are handling this way a per process
ressource (VMA) through a global resource (SRCU) this leads to a lot of
overhead when scheduling the SRCU callback.

>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> index 34fde7111e88..28c763ea1036 100644
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -335,6 +335,7 @@ struct vm_area_struct {
>>  	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
>>  #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>>  	seqcount_t vm_sequence;
>> +	atomic_t vm_ref_count;		/* see vma_get(), vma_put() */
>>  #endif
>>  } __randomize_layout;
>>   
>> @@ -353,6 +354,9 @@ struct kioctx_table;
>>  struct mm_struct {
>>  	struct vm_area_struct *mmap;		/* list of VMAs */
>>  	struct rb_root mm_rb;
>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>> +	rwlock_t mm_rb_lock;
>> +#endif
>>  	u32 vmacache_seqnum;                   /* per-thread vmacache */
>>  #ifdef CONFIG_MMU
>>  	unsigned long (*get_unmapped_area) (struct file *filp,
> 
> When I tried this, it simply traded contention on mmap_sem for
> contention on these two cachelines.
> 
> This was for the concurrent fault benchmark, where mmap_sem is only ever
> acquired for reading (so no blocking ever happens) and the bottle-neck
> was really pure cacheline access.

I'd say that this expected if multiple threads are dealing on the same VMA, but
if the VMA differ then this contention is disappearing while it is remaining
when using the mmap_sem.

This being said, test I did on PowerPC using will-it-scale/page_fault1_threads
showed that the number of caches-misses generated in get_vma() are very low
(less than 5%). Am I missing something ?

> Only by using RCU can you avoid that thrashing.

I agree, but this kind of test is the best use case for SRCU because there are
not so many updates, so not a lot of call to the SRCU asynchronous callback.

Honestly, I can't see an ideal solution here, RCU is not optimal when there is
a high number of updates, and using a rwlock may introduced a bottleneck there.
I get better results when using the rwlock than using SRCU in that case, but if
you have another proposal, please advise, I'll give it a try.

> Also note that if your database allocates the one giant mapping, it'll
> be _one_ VMA and that vm_ref_count gets _very_ hot indeed.

In the case of the database product I mentioned in the series header, that's
the opposite, the VMA number is very high so this doesn't happen. But in the
case of one VMA, it's clear that there will be a contention on vm_ref_count,
but this would be better than blocking on the mmap_sem.

Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [mm]  b33ddf50eb: INFO:trying_to_register_non-static_key
  2018-03-13 17:59 ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock Laurent Dufour
  2018-03-14  8:48   ` Peter Zijlstra
@ 2018-03-16 10:23   ` kernel test robot
  2018-03-16 16:38     ` Laurent Dufour
  2018-04-03  0:11   ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock David Rientjes
  2 siblings, 1 reply; 98+ messages in thread
From: kernel test robot @ 2018-03-16 10:23 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86, lkp

[-- Attachment #1: Type: text/plain, Size: 6498 bytes --]

FYI, we noticed the following commit (built with gcc-7):

commit: b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e ("mm: Protect mm_rb tree with a rwlock")
url: https://github.com/0day-ci/linux/commits/Laurent-Dufour/Speculative-page-faults/20180316-151833


in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+----------------------------------------+------------+------------+
|                                        | 7f3f7b4e80 | b33ddf50eb |
+----------------------------------------+------------+------------+
| boot_successes                         | 8          | 0          |
| boot_failures                          | 0          | 6          |
| INFO:trying_to_register_non-static_key | 0          | 6          |
+----------------------------------------+------------+------------+



[   22.218186] INFO: trying to register non-static key.
[   22.220252] the code is fine but needs lockdep annotation.
[   22.222471] turning off the locking correctness validator.
[   22.224839] CPU: 0 PID: 1 Comm: init Not tainted 4.16.0-rc4-next-20180309-00017-gb33ddf5 #1
[   22.228528] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   22.232443] Call Trace:
[   22.234234]  dump_stack+0x85/0xbc
[   22.236085]  register_lock_class+0x237/0x477
[   22.238057]  __lock_acquire+0xd0/0xf15
[   22.240032]  lock_acquire+0x19c/0x1ce
[   22.241927]  ? do_mmap+0x3aa/0x3ff
[   22.243749]  mmap_region+0x37a/0x4c0
[   22.245619]  ? do_mmap+0x3aa/0x3ff
[   22.247425]  do_mmap+0x3aa/0x3ff
[   22.249175]  vm_mmap_pgoff+0xa1/0xea
[   22.251083]  elf_map+0x6d/0x134
[   22.252873]  load_elf_binary+0x56f/0xe07
[   22.254853]  search_binary_handler+0x75/0x1f8
[   22.256934]  do_execveat_common+0x661/0x92b
[   22.259164]  ? rest_init+0x22e/0x22e
[   22.261082]  do_execve+0x1f/0x21
[   22.262884]  kernel_init+0x5a/0xf0
[   22.264722]  ret_from_fork+0x3a/0x50
[   22.303240] systemd[1]: RTC configured in localtime, applying delta of 480 minutes to system time.
[   22.313544] systemd[1]: Failed to insert module 'autofs4': No such file or directory


[   22.513663] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
         Mounting Debug File System...
         Starting Create list of required st... nodes for the current kernel...
         Starting Remount Root and Kernel File Systems...
         Mounting Huge Pages File System...
         Starting Journal Service...
         Starting Load Kernel Modules...
         Mounting POSIX Message Queue File System...
         Mounting RPC Pipe File System...
         Mounting Configuration File System...
         Starting Apply Kernel Variables...
         Mounting FUSE Control File System...
         Starting Flush Journal to Persistent Storage...
         Starting Load/Save Random Seed...
         Starting udev Coldplug all Devices...
         Starting Create Static Device Nodes in /dev...
         Starting udev Kernel Device Manager...
         Starting Preprocess NFS configuration...
         Starting Create Volatile Files and Directories...
         Starting Network Time Synchronization...
         Starting RPC bind portmap service...
         Starting Update UTMP about System Boot/Shutdown...
[   24.727672] input: PC Speaker as /devices/platform/pcspkr/input/input4
         Starting OpenBSD Secure Shell server...
         Starting /etc/rc.local Compatibility...
         Starting Permit User Sessions...
         Starting LKP bootstrap...
[   24.923463] rc.local[3526]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/lkp/lkp/src/bin
         Starting Login Service...
LKP: HOSTNAME vm-lkp-nex04-4G-18, MAC 52:54:00:12:34:56, kernel 4.16.0-rc4-next-20180309-00017-gb33ddf5 1, serial console /dev/ttyS0
[   25.479682] Kernel tests: Boot OK!
[   25.479710] 
[   25.644452] install debs round one: dpkg -i --force-confdef --force-depends /opt/deb/gawk_1%3a4.1.4+dfsg-1_amd64.deb
[   25.644475] 
[   25.668363] /opt/deb/sysstat_11.6.0-1_amd64.deb
[   25.668384] 
[   25.722852] (Reading database ... 2202 files and directories currently installed.)
[   25.722894] 
[   25.744726] Preparing to unpack .../gawk_1%3a4.1.4+dfsg-1_amd64.deb ...
[   25.744756] 
[   25.783727] Unpacking gawk (1:4.1.4+dfsg-1) over (1:4.1.1+dfsg-1) ...
[   25.783756] 
[   26.409737] Selecting previously unselected package sysstat.
[   26.409765] 
[   26.447612] Preparing to unpack .../deb/sysstat_11.6.0-1_amd64.deb ...
[   26.447643] 
[   26.489959] Unpacking sysstat (11.6.0-1) ...
[   26.489988] 
[   26.995116] Setting up gawk (1:4.1.4+dfsg-1) ...
[   26.995146] 
[   27.102542] Setting up sysstat (11.6.0-1) ...
[   27.102573] 
[   30.773854] Processing triggers for systemd (231-5) ...
[   30.773873] 
[   30.851184] /lkp/lkp/src/bin/run-lkp
[   30.851205] 
[   30.917864] RESULT_ROOT=/result/boot/1/vm-lkp-nex04-4G/debian-x86_64-2016-08-31.cgz/x86_64-nfsroot/gcc-7/b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e/0
[   30.917885] 
[   30.949653] job=/lkp/scheduled/vm-lkp-nex04-4G-18/boot-1-debian-x86_64-2016-08-31.cgz-b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e-20180316-52130-1wxdvik-0.yaml
[   30.949673] 
[   30.958745] mount.nfs: try 1 time...
[   30.958760] 
[   31.190880] mount.nfs (3761) used greatest stack depth: 12424 bytes left
[   31.264066] run-job /lkp/scheduled/vm-lkp-nex04-4G-18/boot-1-debian-x86_64-2016-08-31.cgz-b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e-20180316-52130-1wxdvik-0.yaml
[   31.264087] 
[   31.338975] /usr/bin/curl -sSf http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/scheduled/vm-lkp-nex04-4G-18/boot-1-debian-x86_64-2016-08-31.cgz-b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e-20180316-52130-1wxdvik-0.yaml&job_state=running -o /dev/null
[   31.339035] 
[   34.814078] skip microcode check for virtual machine
[   34.814099] 
[   35.520694] random: crng init done
[   38.000898] /usr/bin/curl -sSf http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/scheduled/vm-lkp-nex04-4G-18/boot-1-debian-x86_64-2016-08-31.cgz-b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e-20180316-52130-1wxdvik-0.yaml&job_state=post_run -o /dev/null
[   38.000920] 
[   40.029932] kill 3803 dmesg --follow --decode 
[   40.029959] 


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> job-script  # job-script is attached in this email



Thanks,
lkp

[-- Attachment #2: config-4.16.0-rc4-next-20180309-00017-gb33ddf5 --]
[-- Type: text/plain, Size: 117782 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.16.0-rc4 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_CPU_ISOLATION=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=20
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_ARCH_SUPPORTS_INT128=y
# CONFIG_NUMA_BALANCING is not set
CONFIG_CGROUPS=y
# CONFIG_MEMCG is not set
CONFIG_BLK_CGROUP=y
CONFIG_DEBUG_BLK_CGROUP=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
CONFIG_CGROUP_FREEZER=y
# CONFIG_CGROUP_HUGETLB is not set
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_DEVICE=y
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_DEBUG=y
# CONFIG_NAMESPACES is not set
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
CONFIG_RD_LZ4=y
# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
CONFIG_EXPERT=y
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_PRINTK_NMI=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
# CONFIG_BPF_SYSCALL is not set
# CONFIG_USERFAULTFD is not set
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_EMBEDDED=y
CONFIG_HAVE_PERF_EVENTS=y
# CONFIG_PC104 is not set

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
CONFIG_COMPAT_BRK=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_SLAB_MERGE_DEFAULT=y
# CONFIG_SLAB_FREELIST_RANDOM is not set
# CONFIG_SLAB_FREELIST_HARDENED is not set
CONFIG_SLUB_CPU_PARTIAL=y
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_OPROFILE=y
# CONFIG_OPROFILE_EVENT_MULTIPLEX is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_KPROBES=y
# CONFIG_JUMP_LABEL is not set
CONFIG_OPTPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_UPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_RCU_TABLE_FREE=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_GCC_PLUGINS=y
# CONFIG_GCC_PLUGINS is not set
CONFIG_HAVE_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR_NONE is not set
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_CC_STACKPROTECTOR_STRONG is not set
CONFIG_CC_STACKPROTECTOR_AUTO=y
CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y
CONFIG_HAVE_COPY_THREAD_TLS=y
CONFIG_HAVE_STACK_VALIDATION=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y
CONFIG_HAVE_ARCH_VMAP_STACK=y
CONFIG_VMAP_STACK=y
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_STRICT_MODULE_RWX=y
CONFIG_ARCH_HAS_PHYS_TO_DMA=y
CONFIG_ARCH_HAS_REFCOUNT=y
# CONFIG_REFCOUNT_FULL is not set

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_MODULE_SIG is not set
# CONFIG_MODULE_COMPRESS is not set
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLK_SCSI_REQUEST=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
# CONFIG_BLK_DEV_ZONED is not set
CONFIG_BLK_DEV_THROTTLING=y
# CONFIG_BLK_DEV_THROTTLING_LOW is not set
# CONFIG_BLK_CMDLINE_PARSER is not set
# CONFIG_BLK_WBT is not set
CONFIG_BLK_DEBUG_FS=y
# CONFIG_BLK_SED_OPAL is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_EFI_PARTITION=y
CONFIG_BLOCK_COMPAT=y
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_CFQ_GROUP_IOSCHED=y
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_MQ_IOSCHED_KYBER=y
# CONFIG_IOSCHED_BFQ is not set
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_ZONE_DMA=y
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_FAST_FEATURE_TESTS=y
# CONFIG_X86_X2APIC is not set
CONFIG_X86_MPPARSE=y
# CONFIG_GOLDFISH is not set
CONFIG_RETPOLINE=y
# CONFIG_INTEL_RDT is not set
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_VSMP is not set
# CONFIG_X86_GOLDFISH is not set
# CONFIG_X86_INTEL_MID is not set
# CONFIG_X86_INTEL_LPSS is not set
# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
CONFIG_IOSF_MBI=y
# CONFIG_IOSF_MBI_DEBUG is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
# CONFIG_PARAVIRT_SPINLOCKS is not set
# CONFIG_XEN is not set
CONFIG_KVM_GUEST=y
# CONFIG_KVM_DEBUG_FS is not set
# CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
CONFIG_PARAVIRT_CLOCK=y
# CONFIG_JAILHOUSE_GUEST is not set
CONFIG_NO_BOOTMEM=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
CONFIG_MCORE2=y
# CONFIG_MATOM is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_P6_NOP=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PROCESSOR_SELECT=y
CONFIG_CPU_SUP_INTEL=y
# CONFIG_CPU_SUP_AMD is not set
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
# CONFIG_CALGARY_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS_RANGE_BEGIN=2
CONFIG_NR_CPUS_RANGE_END=512
CONFIG_NR_CPUS_DEFAULT=64
CONFIG_NR_CPUS=64
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
CONFIG_SCHED_MC_PRIO=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
CONFIG_X86_MCE=y
# CONFIG_X86_MCELOG_LEGACY is not set
CONFIG_X86_MCE_INTEL=y
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_MCE_INJECT=y
CONFIG_X86_THERMAL_VECTOR=y

#
# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_PERF_EVENTS_INTEL_RAPL=y
CONFIG_PERF_EVENTS_INTEL_CSTATE=y
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
# CONFIG_MICROCODE_AMD is not set
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
# CONFIG_X86_5LEVEL is not set
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_X86_DIRECT_GBPAGES=y
CONFIG_ARCH_HAS_MEM_ENCRYPT=y
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NODES_SPAN_OTHER_NODES=y
CONFIG_NUMA_EMU=y
CONFIG_NODES_SHIFT=6
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_MEMORY_PROBE=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_HAVE_GENERIC_GUP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_HAVE_BOOTMEM_INFO_NODE=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
# CONFIG_COMPACTION is not set
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
CONFIG_HWPOISON_INJECT=y
# CONFIG_TRANSPARENT_HUGEPAGE is not set
CONFIG_ARCH_WANTS_THP_SWAP=y
# CONFIG_CLEANCACHE is not set
# CONFIG_FRONTSWAP is not set
# CONFIG_CMA is not set
# CONFIG_ZPOOL is not set
# CONFIG_ZBUD is not set
# CONFIG_ZSMALLOC is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ARCH_HAS_ZONE_DEVICE=y
# CONFIG_ZONE_DEVICE is not set
CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
CONFIG_ARCH_HAS_PKEYS=y
# CONFIG_PERCPU_STATS is not set
# CONFIG_GUP_BENCHMARK is not set
CONFIG_SPECULATIVE_PAGE_FAULT=y
# CONFIG_X86_PMEM_LEGACY is not set
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
CONFIG_X86_RESERVE_LOW=64
CONFIG_MTRR=y
# CONFIG_MTRR_SANITIZER is not set
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_ARCH_RANDOM=y
CONFIG_X86_SMAP=y
CONFIG_X86_INTEL_UMIP=y
# CONFIG_X86_INTEL_MPX is not set
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_EFI=y
# CONFIG_EFI_STUB is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
# CONFIG_KEXEC_SIG is not set
CONFIG_CRASH_DUMP=y
# CONFIG_KEXEC_JUMP is not set
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_X86_NEED_RELOCS=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_DYNAMIC_MEMORY_LAYOUT=y
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
CONFIG_HOTPLUG_CPU=y
# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
CONFIG_COMPAT_VDSO=y
# CONFIG_LEGACY_VSYSCALL_NATIVE is not set
CONFIG_LEGACY_VSYSCALL_EMULATE=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
CONFIG_HAVE_LIVEPATCH=y
# CONFIG_LIVEPATCH is not set
CONFIG_ARCH_HAS_ADD_PAGES=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
# CONFIG_SUSPEND_SKIP_SYNC is not set
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
CONFIG_PM_DEBUG=y
CONFIG_PM_ADVANCED_DEBUG=y
CONFIG_PM_TEST_SUSPEND=y
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_PM_TRACE=y
CONFIG_PM_TRACE_RTC=y
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SPCR_TABLE=y
CONFIG_ACPI_LPIT=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_PROCFS_POWER=y
CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
CONFIG_ACPI_EC_DEBUGFS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_CSTATE=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_CPPC_LIB=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
# CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_NUMA=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=""
CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_DEBUG=y
CONFIG_ACPI_PCI_SLOT=y
CONFIG_ACPI_CONTAINER=y
# CONFIG_ACPI_HOTPLUG_MEMORY is not set
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_ACPI_SBS=y
# CONFIG_ACPI_HED is not set
# CONFIG_ACPI_CUSTOM_METHOD is not set
# CONFIG_ACPI_BGRT is not set
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
# CONFIG_ACPI_NFIT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
# CONFIG_ACPI_APEI is not set
# CONFIG_DPTF_POWER is not set
# CONFIG_PMIC_OPREGION is not set
# CONFIG_ACPI_CONFIGFS is not set
CONFIG_X86_PM_TIMER=y
# CONFIG_SFI is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set

#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
# CONFIG_X86_PCC_CPUFREQ is not set
CONFIG_X86_ACPI_CPUFREQ=y
# CONFIG_X86_POWERNOW_K8 is not set
CONFIG_X86_SPEEDSTEP_CENTRINO=y
# CONFIG_X86_P4_CLOCKMOD is not set

#
# shared options
#

#
# CPU Idle
#
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_INTEL_IDLE is not set

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
# CONFIG_PCI_CNB20LE_QUIRK is not set
CONFIG_PCIEPORTBUS=y
# CONFIG_HOTPLUG_PCI_PCIE is not set
CONFIG_PCIEAER=y
# CONFIG_PCIE_ECRC is not set
# CONFIG_PCIEAER_INJECT is not set
# CONFIG_PCIEASPM is not set
CONFIG_PCIE_PME=y
# CONFIG_PCIE_DPC is not set
# CONFIG_PCIE_PTM is not set
CONFIG_PCI_BUS_ADDR_T_64BIT=y
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
CONFIG_PCI_QUIRKS=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
CONFIG_PCI_LOCKLESS_CONFIG=y
# CONFIG_PCI_IOV is not set
# CONFIG_PCI_PRI is not set
# CONFIG_PCI_PASID is not set
CONFIG_PCI_LABEL=y
CONFIG_HOTPLUG_PCI=y
# CONFIG_HOTPLUG_PCI_ACPI is not set
# CONFIG_HOTPLUG_PCI_CPCI is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set

#
# Cadence PCIe controllers support
#

#
# DesignWare PCI Core Support
#
# CONFIG_PCIE_DW_PLAT is not set

#
# PCI host controller drivers
#
# CONFIG_VMD is not set

#
# PCI Endpoint
#
# CONFIG_PCI_ENDPOINT is not set

#
# PCI switch controller drivers
#
# CONFIG_PCI_SW_SWITCHTEC is not set
# CONFIG_ISA_BUS is not set
CONFIG_ISA_DMA_API=y
# CONFIG_PCCARD is not set
# CONFIG_RAPIDIO is not set
# CONFIG_X86_SYSFB is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ELFCORE=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_BINFMT_SCRIPT=y
# CONFIG_BINFMT_MISC is not set
CONFIG_COREDUMP=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
# CONFIG_X86_X32 is not set
CONFIG_COMPAT_32=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_NET=y
CONFIG_NET_INGRESS=y

#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_DIAG is not set
CONFIG_UNIX=y
# CONFIG_UNIX_DIAG is not set
# CONFIG_TLS is not set
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
# CONFIG_XFRM_STATISTICS is not set
CONFIG_XFRM_IPCOMP=y
CONFIG_NET_KEY=y
# CONFIG_NET_KEY_MIGRATE is not set
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_ROUTE_CLASSID=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
# CONFIG_IP_PNP_BOOTP is not set
# CONFIG_IP_PNP_RARP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE_DEMUX is not set
CONFIG_NET_IP_TUNNEL=y
# CONFIG_SYN_COOKIES is not set
# CONFIG_NET_FOU is not set
# CONFIG_NET_FOU_IP_TUNNELS is not set
CONFIG_INET_AH=y
CONFIG_INET_ESP=y
# CONFIG_INET_ESP_OFFLOAD is not set
CONFIG_INET_IPCOMP=y
CONFIG_INET_XFRM_TUNNEL=y
CONFIG_INET_TUNNEL=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_INET_UDP_DIAG is not set
# CONFIG_INET_RAW_DIAG is not set
# CONFIG_INET_DIAG_DESTROY is not set
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=y
CONFIG_TCP_CONG_HTCP=y
CONFIG_TCP_CONG_HSTCP=y
CONFIG_TCP_CONG_HYBLA=y
CONFIG_TCP_CONG_VEGAS=y
# CONFIG_TCP_CONG_NV is not set
CONFIG_TCP_CONG_SCALABLE=y
CONFIG_TCP_CONG_LP=y
CONFIG_TCP_CONG_VENO=y
CONFIG_TCP_CONG_YEAH=y
CONFIG_TCP_CONG_ILLINOIS=y
# CONFIG_TCP_CONG_DCTCP is not set
# CONFIG_TCP_CONG_CDG is not set
# CONFIG_TCP_CONG_BBR is not set
# CONFIG_DEFAULT_BIC is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_HTCP is not set
# CONFIG_DEFAULT_HYBLA is not set
# CONFIG_DEFAULT_VEGAS is not set
# CONFIG_DEFAULT_VENO is not set
# CONFIG_DEFAULT_WESTWOOD is not set
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
CONFIG_IPV6=y
# CONFIG_IPV6_ROUTER_PREF is not set
# CONFIG_IPV6_OPTIMISTIC_DAD is not set
# CONFIG_INET6_AH is not set
# CONFIG_INET6_ESP is not set
# CONFIG_INET6_IPCOMP is not set
# CONFIG_IPV6_MIP6 is not set
# CONFIG_IPV6_ILA is not set
CONFIG_INET6_XFRM_MODE_TRANSPORT=y
CONFIG_INET6_XFRM_MODE_TUNNEL=y
CONFIG_INET6_XFRM_MODE_BEET=y
# CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION is not set
# CONFIG_IPV6_VTI is not set
CONFIG_IPV6_SIT=y
# CONFIG_IPV6_SIT_6RD is not set
CONFIG_IPV6_NDISC_NODETYPE=y
# CONFIG_IPV6_TUNNEL is not set
# CONFIG_IPV6_MULTIPLE_TABLES is not set
# CONFIG_IPV6_MROUTE is not set
# CONFIG_IPV6_SEG6_LWTUNNEL is not set
# CONFIG_IPV6_SEG6_HMAC is not set
# CONFIG_NETWORK_SECMARK is not set
CONFIG_NET_PTP_CLASSIFY=y
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_FAMILY_BRIDGE=y
CONFIG_NETFILTER_FAMILY_ARP=y
# CONFIG_NETFILTER_NETLINK_ACCT is not set
CONFIG_NETFILTER_NETLINK_QUEUE=y
CONFIG_NETFILTER_NETLINK_LOG=y
CONFIG_NF_CONNTRACK=y
# CONFIG_NF_LOG_NETDEV is not set
CONFIG_NETFILTER_CONNCOUNT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_ZONES=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
# CONFIG_NF_CONNTRACK_TIMEOUT is not set
# CONFIG_NF_CONNTRACK_TIMESTAMP is not set
CONFIG_NF_CT_PROTO_DCCP=y
CONFIG_NF_CT_PROTO_GRE=y
CONFIG_NF_CT_PROTO_SCTP=y
CONFIG_NF_CT_PROTO_UDPLITE=y
CONFIG_NF_CONNTRACK_AMANDA=y
CONFIG_NF_CONNTRACK_FTP=y
CONFIG_NF_CONNTRACK_H323=y
CONFIG_NF_CONNTRACK_IRC=y
CONFIG_NF_CONNTRACK_BROADCAST=y
CONFIG_NF_CONNTRACK_NETBIOS_NS=y
# CONFIG_NF_CONNTRACK_SNMP is not set
CONFIG_NF_CONNTRACK_PPTP=y
CONFIG_NF_CONNTRACK_SANE=y
CONFIG_NF_CONNTRACK_SIP=y
CONFIG_NF_CONNTRACK_TFTP=y
CONFIG_NF_CT_NETLINK=y
# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
# CONFIG_NETFILTER_NETLINK_GLUE_CT is not set
# CONFIG_NF_TABLES is not set
CONFIG_NETFILTER_XTABLES=y

#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=y
CONFIG_NETFILTER_XT_CONNMARK=y

#
# Xtables targets
#
# CONFIG_NETFILTER_XT_TARGET_AUDIT is not set
CONFIG_NETFILTER_XT_TARGET_CHECKSUM=y
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=y
CONFIG_NETFILTER_XT_TARGET_CONNMARK=y
CONFIG_NETFILTER_XT_TARGET_CT=y
CONFIG_NETFILTER_XT_TARGET_DSCP=y
CONFIG_NETFILTER_XT_TARGET_HL=y
# CONFIG_NETFILTER_XT_TARGET_HMARK is not set
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=y
CONFIG_NETFILTER_XT_TARGET_LED=y
# CONFIG_NETFILTER_XT_TARGET_LOG is not set
CONFIG_NETFILTER_XT_TARGET_MARK=y
CONFIG_NETFILTER_XT_TARGET_NFLOG=y
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
CONFIG_NETFILTER_XT_TARGET_NOTRACK=y
CONFIG_NETFILTER_XT_TARGET_RATEEST=y
CONFIG_NETFILTER_XT_TARGET_TEE=y
CONFIG_NETFILTER_XT_TARGET_TPROXY=y
CONFIG_NETFILTER_XT_TARGET_TRACE=y
CONFIG_NETFILTER_XT_TARGET_TCPMSS=y
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=y

#
# Xtables matches
#
# CONFIG_NETFILTER_XT_MATCH_ADDRTYPE is not set
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
CONFIG_NETFILTER_XT_MATCH_CLUSTER=y
CONFIG_NETFILTER_XT_MATCH_COMMENT=y
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=y
# CONFIG_NETFILTER_XT_MATCH_CONNLABEL is not set
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=y
CONFIG_NETFILTER_XT_MATCH_CONNMARK=y
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
CONFIG_NETFILTER_XT_MATCH_CPU=y
CONFIG_NETFILTER_XT_MATCH_DCCP=y
# CONFIG_NETFILTER_XT_MATCH_DEVGROUP is not set
CONFIG_NETFILTER_XT_MATCH_DSCP=y
CONFIG_NETFILTER_XT_MATCH_ECN=y
CONFIG_NETFILTER_XT_MATCH_ESP=y
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=y
CONFIG_NETFILTER_XT_MATCH_HELPER=y
CONFIG_NETFILTER_XT_MATCH_HL=y
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
CONFIG_NETFILTER_XT_MATCH_IPRANGE=y
# CONFIG_NETFILTER_XT_MATCH_L2TP is not set
CONFIG_NETFILTER_XT_MATCH_LENGTH=y
CONFIG_NETFILTER_XT_MATCH_LIMIT=y
CONFIG_NETFILTER_XT_MATCH_MAC=y
CONFIG_NETFILTER_XT_MATCH_MARK=y
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=y
# CONFIG_NETFILTER_XT_MATCH_NFACCT is not set
CONFIG_NETFILTER_XT_MATCH_OSF=y
CONFIG_NETFILTER_XT_MATCH_OWNER=y
CONFIG_NETFILTER_XT_MATCH_POLICY=y
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=y
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=y
CONFIG_NETFILTER_XT_MATCH_QUOTA=y
CONFIG_NETFILTER_XT_MATCH_RATEEST=y
CONFIG_NETFILTER_XT_MATCH_REALM=y
CONFIG_NETFILTER_XT_MATCH_RECENT=y
CONFIG_NETFILTER_XT_MATCH_SCTP=y
CONFIG_NETFILTER_XT_MATCH_STATE=y
CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
CONFIG_NETFILTER_XT_MATCH_STRING=y
CONFIG_NETFILTER_XT_MATCH_TCPMSS=y
CONFIG_NETFILTER_XT_MATCH_TIME=y
CONFIG_NETFILTER_XT_MATCH_U32=y
# CONFIG_IP_SET is not set
# CONFIG_IP_VS is not set

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=y
CONFIG_NF_CONNTRACK_IPV4=y
# CONFIG_NF_SOCKET_IPV4 is not set
CONFIG_NF_DUP_IPV4=y
# CONFIG_NF_LOG_ARP is not set
# CONFIG_NF_LOG_IPV4 is not set
CONFIG_NF_REJECT_IPV4=y
# CONFIG_NF_NAT_IPV4 is not set
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_AH=y
CONFIG_IP_NF_MATCH_ECN=y
# CONFIG_IP_NF_MATCH_RPFILTER is not set
CONFIG_IP_NF_MATCH_TTL=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
# CONFIG_IP_NF_TARGET_SYNPROXY is not set
# CONFIG_IP_NF_NAT is not set
CONFIG_IP_NF_MANGLE=y
CONFIG_IP_NF_TARGET_CLUSTERIP=y
CONFIG_IP_NF_TARGET_ECN=y
CONFIG_IP_NF_TARGET_TTL=y
CONFIG_IP_NF_RAW=y
CONFIG_IP_NF_ARPTABLES=y
CONFIG_IP_NF_ARPFILTER=y
CONFIG_IP_NF_ARP_MANGLE=y

#
# IPv6: Netfilter Configuration
#
# CONFIG_NF_CONNTRACK_IPV6 is not set
# CONFIG_NF_SOCKET_IPV6 is not set
CONFIG_NF_DUP_IPV6=y
# CONFIG_NF_REJECT_IPV6 is not set
# CONFIG_NF_LOG_IPV6 is not set
# CONFIG_IP6_NF_IPTABLES is not set
CONFIG_BRIDGE_NF_EBTABLES=y
CONFIG_BRIDGE_EBT_BROUTE=y
CONFIG_BRIDGE_EBT_T_FILTER=y
CONFIG_BRIDGE_EBT_T_NAT=y
CONFIG_BRIDGE_EBT_802_3=y
CONFIG_BRIDGE_EBT_AMONG=y
CONFIG_BRIDGE_EBT_ARP=y
CONFIG_BRIDGE_EBT_IP=y
# CONFIG_BRIDGE_EBT_IP6 is not set
CONFIG_BRIDGE_EBT_LIMIT=y
CONFIG_BRIDGE_EBT_MARK=y
CONFIG_BRIDGE_EBT_PKTTYPE=y
CONFIG_BRIDGE_EBT_STP=y
CONFIG_BRIDGE_EBT_VLAN=y
CONFIG_BRIDGE_EBT_ARPREPLY=y
CONFIG_BRIDGE_EBT_DNAT=y
CONFIG_BRIDGE_EBT_MARK_T=y
CONFIG_BRIDGE_EBT_REDIRECT=y
CONFIG_BRIDGE_EBT_SNAT=y
CONFIG_BRIDGE_EBT_LOG=y
CONFIG_BRIDGE_EBT_NFLOG=y
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_L2TP is not set
CONFIG_STP=y
CONFIG_BRIDGE=y
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_HAVE_NET_DSA=y
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
CONFIG_LLC=y
# CONFIG_LLC2 is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
# CONFIG_IEEE802154 is not set
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
# CONFIG_NET_SCH_CBQ is not set
# CONFIG_NET_SCH_HTB is not set
# CONFIG_NET_SCH_HFSC is not set
# CONFIG_NET_SCH_PRIO is not set
# CONFIG_NET_SCH_MULTIQ is not set
# CONFIG_NET_SCH_RED is not set
# CONFIG_NET_SCH_SFB is not set
# CONFIG_NET_SCH_SFQ is not set
# CONFIG_NET_SCH_TEQL is not set
# CONFIG_NET_SCH_TBF is not set
# CONFIG_NET_SCH_CBS is not set
# CONFIG_NET_SCH_GRED is not set
# CONFIG_NET_SCH_DSMARK is not set
CONFIG_NET_SCH_NETEM=y
# CONFIG_NET_SCH_DRR is not set
# CONFIG_NET_SCH_MQPRIO is not set
# CONFIG_NET_SCH_CHOKE is not set
# CONFIG_NET_SCH_QFQ is not set
# CONFIG_NET_SCH_CODEL is not set
# CONFIG_NET_SCH_FQ_CODEL is not set
# CONFIG_NET_SCH_FQ is not set
# CONFIG_NET_SCH_HHF is not set
# CONFIG_NET_SCH_PIE is not set
# CONFIG_NET_SCH_PLUG is not set
# CONFIG_NET_SCH_DEFAULT is not set

#
# Classification
#
# CONFIG_NET_CLS_BASIC is not set
# CONFIG_NET_CLS_TCINDEX is not set
# CONFIG_NET_CLS_ROUTE4 is not set
# CONFIG_NET_CLS_FW is not set
# CONFIG_NET_CLS_U32 is not set
# CONFIG_NET_CLS_RSVP is not set
# CONFIG_NET_CLS_RSVP6 is not set
# CONFIG_NET_CLS_FLOW is not set
# CONFIG_NET_CLS_CGROUP is not set
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_CLS_FLOWER is not set
# CONFIG_NET_CLS_MATCHALL is not set
# CONFIG_NET_EMATCH is not set
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_SCH_FIFO=y
# CONFIG_DCB is not set
CONFIG_DNS_RESOLVER=y
# CONFIG_BATMAN_ADV is not set
# CONFIG_OPENVSWITCH is not set
# CONFIG_VSOCKETS is not set
# CONFIG_NETLINK_DIAG is not set
# CONFIG_MPLS is not set
# CONFIG_NET_NSH is not set
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
# CONFIG_NET_L3_MASTER_DEV is not set
# CONFIG_NET_NCSI is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
# CONFIG_CGROUP_NET_PRIO is not set
# CONFIG_CGROUP_NET_CLASSID is not set
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
# CONFIG_BPF_JIT is not set
CONFIG_NET_FLOW_LIMIT=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_DROP_MONITOR is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
# CONFIG_WIRELESS is not set
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
# CONFIG_CEPH_LIB is not set
# CONFIG_NFC is not set
# CONFIG_PSAMPLE is not set
# CONFIG_NET_IFE is not set
# CONFIG_LWTUNNEL is not set
CONFIG_DST_CACHE=y
CONFIG_GRO_CELLS=y
# CONFIG_NET_DEVLINK is not set
CONFIG_MAY_USE_DEVLINK=y
CONFIG_HAVE_EBPF_JIT=y

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_STANDALONE is not set
# CONFIG_PREVENT_FIRMWARE_BUILD is not set
CONFIG_FW_LOADER=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=y
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_DMA_FENCE_TRACE is not set

#
# Bus devices
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
# CONFIG_MTD is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
CONFIG_PNP=y
CONFIG_PNP_DEBUG_MESSAGES=y

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_NULL_BLK is not set
CONFIG_BLK_DEV_FD=y
CONFIG_CDROM=y
# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
CONFIG_BLK_DEV_CRYPTOLOOP=y
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SKD is not set
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=65536
CONFIG_CDROM_PKTCDVD=y
CONFIG_CDROM_PKTCDVD_BUFFERS=128
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_VIRTIO_BLK=y
# CONFIG_VIRTIO_BLK_SCSI is not set
# CONFIG_BLK_DEV_RBD is not set
# CONFIG_BLK_DEV_RSXX is not set

#
# NVME Support
#
# CONFIG_BLK_DEV_NVME is not set
# CONFIG_NVME_FC is not set
# CONFIG_NVME_TARGET is not set

#
# Misc devices
#
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ICS932S401 is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_APDS9802ALS is not set
# CONFIG_ISL29003 is not set
# CONFIG_ISL29020 is not set
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_SENSORS_BH1770 is not set
# CONFIG_SENSORS_APDS990X is not set
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
# CONFIG_USB_SWITCH_FSA9480 is not set
# CONFIG_SRAM is not set
# CONFIG_PCI_ENDPOINT_TEST is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_AT24 is not set
# CONFIG_EEPROM_LEGACY is not set
# CONFIG_EEPROM_MAX6875 is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_EEPROM_IDT_89HPESX is not set
# CONFIG_CB710_CORE is not set

#
# Texas Instruments shared transport line discipline
#
# CONFIG_SENSORS_LIS3_I2C is not set
# CONFIG_ALTERA_STAPL is not set
# CONFIG_INTEL_MEI is not set
# CONFIG_INTEL_MEI_ME is not set
# CONFIG_INTEL_MEI_TXE is not set
# CONFIG_VMWARE_VMCI is not set

#
# Intel MIC & related support
#

#
# Intel MIC Bus Driver
#
# CONFIG_INTEL_MIC_BUS is not set

#
# SCIF Bus Driver
#
# CONFIG_SCIF_BUS is not set

#
# VOP Bus Driver
#
# CONFIG_VOP_BUS is not set

#
# Intel MIC Host Driver
#

#
# Intel MIC Card Driver
#

#
# SCIF Driver
#

#
# Intel MIC Coprocessor State Management (COSM) Drivers
#

#
# VOP Driver
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_MISC_RTSX_PCI is not set
# CONFIG_MISC_RTSX_USB is not set
CONFIG_HAVE_IDE=y
CONFIG_IDE=y

#
# Please see Documentation/ide/ide.txt for help/info on IDE drives
#
CONFIG_IDE_XFER_MODE=y
# CONFIG_BLK_DEV_IDE_SATA is not set
CONFIG_IDE_GD=y
CONFIG_IDE_GD_ATA=y
# CONFIG_IDE_GD_ATAPI is not set
# CONFIG_BLK_DEV_IDECD is not set
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEACPI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_PROC_FS=y

#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_PLATFORM is not set
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEDMA_SFF=y

#
# PCI IDE chipsets support
#
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_PCIBUS_ORDER=y
# CONFIG_BLK_DEV_GENERIC is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_JMICRON is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_IT8172 is not set
# CONFIG_BLK_DEV_IT8213 is not set
# CONFIG_BLK_DEV_IT821X is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_BLK_DEV_TC86C001 is not set
CONFIG_BLK_DEV_IDEDMA=y

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
# CONFIG_CHR_DEV_SCH is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
CONFIG_SCSI_SCAN_ASYNC=y

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=y
CONFIG_SCSI_FC_ATTRS=y
CONFIG_SCSI_ISCSI_ATTRS=y
CONFIG_SCSI_SAS_ATTRS=y
CONFIG_SCSI_SAS_LIBSAS=y
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_SAS_HOST_SMP=y
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
CONFIG_ISCSI_BOOT_SYSFS=y
# CONFIG_SCSI_CXGB3_ISCSI is not set
# CONFIG_SCSI_CXGB4_ISCSI is not set
# CONFIG_SCSI_BNX2_ISCSI is not set
# CONFIG_BE2ISCSI is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_HPSA is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_3W_SAS is not set
# CONFIG_SCSI_ACARD is not set
CONFIG_SCSI_AACRAID=y
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=5000
# CONFIG_AIC7XXX_BUILD_FIRMWARE is not set
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC7XXX_REG_PRETTY_PRINT=y
CONFIG_SCSI_AIC79XX=y
CONFIG_AIC79XX_CMDS_PER_DEVICE=32
CONFIG_AIC79XX_RESET_DELAY_MS=5000
# CONFIG_AIC79XX_BUILD_FIRMWARE is not set
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_AIC79XX_REG_PRETTY_PRINT=y
CONFIG_SCSI_AIC94XX=y
CONFIG_AIC94XX_DEBUG=y
CONFIG_SCSI_MVSAS=y
CONFIG_SCSI_MVSAS_DEBUG=y
# CONFIG_SCSI_MVSAS_TASKLET is not set
# CONFIG_SCSI_MVUMI is not set
CONFIG_SCSI_DPT_I2O=y
CONFIG_SCSI_ADVANSYS=y
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_SCSI_ESAS2R is not set
CONFIG_MEGARAID_NEWGEN=y
CONFIG_MEGARAID_MM=y
CONFIG_MEGARAID_MAILBOX=y
CONFIG_MEGARAID_LEGACY=y
CONFIG_MEGARAID_SAS=y
CONFIG_SCSI_MPT3SAS=y
CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=y
# CONFIG_SCSI_SMARTPQI is not set
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_VMWARE_PVSCSI is not set
# CONFIG_LIBFC is not set
# CONFIG_SCSI_SNIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
CONFIG_SCSI_GDTH=y
CONFIG_SCSI_ISCI=y
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
CONFIG_SCSI_QLOGIC_1280=y
CONFIG_SCSI_QLA_FC=y
CONFIG_SCSI_QLA_ISCSI=y
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_WD719X is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_PMCRAID is not set
# CONFIG_SCSI_PM8001 is not set
# CONFIG_SCSI_BFA_FC is not set
CONFIG_SCSI_VIRTIO=y
# CONFIG_SCSI_CHELSIO_FCOE is not set
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_ACPI=y
# CONFIG_SATA_ZPODD is not set
CONFIG_SATA_PMP=y

#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=y
CONFIG_SATA_MOBILE_LPM_POLICY=0
# CONFIG_SATA_AHCI_PLATFORM is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_SATA_ACARD_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y

#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_SX4 is not set
CONFIG_ATA_BMDMA=y

#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=y
# CONFIG_SATA_DWC is not set
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set

#
# PATA SFF controllers with BMDMA
#
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_ATP867X is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
# CONFIG_PATA_SCH is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_TOSHIBA is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set

#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
CONFIG_PATA_PLATFORM=y
# CONFIG_PATA_RZ1000 is not set

#
# Generic fallback / legacy drivers
#
# CONFIG_PATA_ACPI is not set
CONFIG_ATA_GENERIC=y
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID10=y
CONFIG_MD_RAID456=y
CONFIG_MD_MULTIPATH=y
CONFIG_MD_FAULTY=y
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=y
# CONFIG_DM_MQ_DEFAULT is not set
CONFIG_DM_DEBUG=y
CONFIG_DM_BUFIO=y
# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
# CONFIG_DM_UNSTRIPED is not set
CONFIG_DM_CRYPT=y
CONFIG_DM_SNAPSHOT=y
# CONFIG_DM_THIN_PROVISIONING is not set
# CONFIG_DM_CACHE is not set
# CONFIG_DM_ERA is not set
CONFIG_DM_MIRROR=y
CONFIG_DM_LOG_USERSPACE=y
# CONFIG_DM_RAID is not set
CONFIG_DM_ZERO=y
CONFIG_DM_MULTIPATH=y
CONFIG_DM_MULTIPATH_QL=y
CONFIG_DM_MULTIPATH_ST=y
CONFIG_DM_DELAY=y
CONFIG_DM_UEVENT=y
# CONFIG_DM_FLAKEY is not set
# CONFIG_DM_VERITY is not set
# CONFIG_DM_SWITCH is not set
# CONFIG_DM_LOG_WRITES is not set
# CONFIG_DM_INTEGRITY is not set
# CONFIG_TARGET_CORE is not set
CONFIG_FUSION=y
CONFIG_FUSION_SPI=y
CONFIG_FUSION_FC=y
CONFIG_FUSION_SAS=y
CONFIG_FUSION_MAX_SGE=128
CONFIG_FUSION_CTL=y
CONFIG_FUSION_LOGGING=y

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_FIREWIRE_NOSY is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
# CONFIG_BONDING is not set
CONFIG_DUMMY=y
# CONFIG_EQUALIZER is not set
# CONFIG_NET_FC is not set
# CONFIG_NET_TEAM is not set
# CONFIG_MACVLAN is not set
# CONFIG_IPVLAN is not set
# CONFIG_VXLAN is not set
# CONFIG_MACSEC is not set
CONFIG_NETCONSOLE=y
CONFIG_NETCONSOLE_DYNAMIC=y
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_TUN=y
# CONFIG_TUN_VNET_CROSS_LE is not set
# CONFIG_VETH is not set
CONFIG_VIRTIO_NET=y
# CONFIG_NLMON is not set
# CONFIG_ARCNET is not set

#
# CAIF transport drivers
#

#
# Distributed Switch Architecture drivers
#
CONFIG_ETHERNET=y
CONFIG_MDIO=y
# CONFIG_NET_VENDOR_3COM is not set
CONFIG_NET_VENDOR_ADAPTEC=y
# CONFIG_ADAPTEC_STARFIRE is not set
CONFIG_NET_VENDOR_AGERE=y
# CONFIG_ET131X is not set
CONFIG_NET_VENDOR_ALACRITECH=y
# CONFIG_SLICOSS is not set
CONFIG_NET_VENDOR_ALTEON=y
# CONFIG_ACENIC is not set
# CONFIG_ALTERA_TSE is not set
CONFIG_NET_VENDOR_AMAZON=y
# CONFIG_ENA_ETHERNET is not set
CONFIG_NET_VENDOR_AMD=y
# CONFIG_AMD8111_ETH is not set
# CONFIG_PCNET32 is not set
# CONFIG_AMD_XGBE is not set
CONFIG_NET_VENDOR_AQUANTIA=y
# CONFIG_AQTION is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ATHEROS=y
CONFIG_ATL2=y
CONFIG_ATL1=y
CONFIG_ATL1E=y
CONFIG_ATL1C=y
# CONFIG_ALX is not set
# CONFIG_NET_VENDOR_AURORA is not set
CONFIG_NET_CADENCE=y
# CONFIG_MACB is not set
CONFIG_NET_VENDOR_BROADCOM=y
# CONFIG_B44 is not set
CONFIG_BNX2=y
CONFIG_CNIC=y
CONFIG_TIGON3=y
CONFIG_TIGON3_HWMON=y
# CONFIG_BNX2X is not set
# CONFIG_BNXT is not set
CONFIG_NET_VENDOR_BROCADE=y
# CONFIG_BNA is not set
CONFIG_NET_VENDOR_CAVIUM=y
# CONFIG_THUNDER_NIC_PF is not set
# CONFIG_THUNDER_NIC_VF is not set
# CONFIG_THUNDER_NIC_BGX is not set
# CONFIG_THUNDER_NIC_RGX is not set
CONFIG_CAVIUM_PTP=y
# CONFIG_LIQUIDIO is not set
# CONFIG_LIQUIDIO_VF is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
# CONFIG_CHELSIO_T3 is not set
# CONFIG_CHELSIO_T4 is not set
# CONFIG_CHELSIO_T4VF is not set
CONFIG_NET_VENDOR_CISCO=y
# CONFIG_ENIC is not set
CONFIG_NET_VENDOR_CORTINA=y
# CONFIG_CX_ECAT is not set
# CONFIG_DNET is not set
CONFIG_NET_VENDOR_DEC=y
CONFIG_NET_TULIP=y
# CONFIG_DE2104X is not set
# CONFIG_TULIP is not set
# CONFIG_DE4X5 is not set
# CONFIG_WINBOND_840 is not set
# CONFIG_DM9102 is not set
# CONFIG_ULI526X is not set
CONFIG_NET_VENDOR_DLINK=y
# CONFIG_DL2K is not set
# CONFIG_SUNDANCE is not set
CONFIG_NET_VENDOR_EMULEX=y
# CONFIG_BE2NET is not set
CONFIG_NET_VENDOR_EZCHIP=y
CONFIG_NET_VENDOR_EXAR=y
# CONFIG_S2IO is not set
# CONFIG_VXGE is not set
CONFIG_NET_VENDOR_HP=y
# CONFIG_HP100 is not set
CONFIG_NET_VENDOR_HUAWEI=y
# CONFIG_HINIC is not set
CONFIG_NET_VENDOR_INTEL=y
CONFIG_E100=y
CONFIG_E1000=y
CONFIG_E1000E=y
CONFIG_E1000E_HWTS=y
CONFIG_IGB=y
CONFIG_IGB_HWMON=y
CONFIG_IGB_DCA=y
CONFIG_IGBVF=y
CONFIG_IXGB=y
CONFIG_IXGBE=y
CONFIG_IXGBE_HWMON=y
CONFIG_IXGBE_DCA=y
# CONFIG_IXGBEVF is not set
# CONFIG_I40E is not set
# CONFIG_I40EVF is not set
# CONFIG_FM10K is not set
CONFIG_NET_VENDOR_I825XX=y
CONFIG_JME=y
CONFIG_NET_VENDOR_MARVELL=y
# CONFIG_MVMDIO is not set
CONFIG_SKGE=y
# CONFIG_SKGE_DEBUG is not set
# CONFIG_SKGE_GENESIS is not set
CONFIG_SKY2=y
# CONFIG_SKY2_DEBUG is not set
CONFIG_NET_VENDOR_MELLANOX=y
# CONFIG_MLX4_EN is not set
# CONFIG_MLX5_CORE is not set
# CONFIG_MLXSW_CORE is not set
# CONFIG_MLXFW is not set
CONFIG_NET_VENDOR_MICREL=y
# CONFIG_KS8842 is not set
# CONFIG_KS8851_MLL is not set
# CONFIG_KSZ884X_PCI is not set
CONFIG_NET_VENDOR_MYRI=y
# CONFIG_MYRI10GE is not set
# CONFIG_FEALNX is not set
CONFIG_NET_VENDOR_NATSEMI=y
# CONFIG_NATSEMI is not set
# CONFIG_NS83820 is not set
CONFIG_NET_VENDOR_NETRONOME=y
# CONFIG_NFP is not set
CONFIG_NET_VENDOR_8390=y
# CONFIG_NE2K_PCI is not set
CONFIG_NET_VENDOR_NVIDIA=y
# CONFIG_FORCEDETH is not set
CONFIG_NET_VENDOR_OKI=y
# CONFIG_ETHOC is not set
# CONFIG_NET_PACKET_ENGINE is not set
CONFIG_NET_VENDOR_QLOGIC=y
# CONFIG_QLA3XXX is not set
# CONFIG_QLCNIC is not set
# CONFIG_QLGE is not set
# CONFIG_NETXEN_NIC is not set
# CONFIG_QED is not set
CONFIG_NET_VENDOR_QUALCOMM=y
# CONFIG_QCOM_EMAC is not set
# CONFIG_RMNET is not set
CONFIG_NET_VENDOR_REALTEK=y
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
CONFIG_R8169=y
CONFIG_NET_VENDOR_RENESAS=y
CONFIG_NET_VENDOR_RDC=y
# CONFIG_R6040 is not set
CONFIG_NET_VENDOR_ROCKER=y
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
CONFIG_NET_VENDOR_SEEQ=y
CONFIG_NET_VENDOR_SILAN=y
# CONFIG_SC92031 is not set
CONFIG_NET_VENDOR_SIS=y
# CONFIG_SIS900 is not set
CONFIG_SIS190=y
CONFIG_NET_VENDOR_SOLARFLARE=y
# CONFIG_SFC is not set
# CONFIG_SFC_FALCON is not set
CONFIG_NET_VENDOR_SMSC=y
# CONFIG_EPIC100 is not set
# CONFIG_SMSC911X is not set
# CONFIG_SMSC9420 is not set
CONFIG_NET_VENDOR_SOCIONEXT=y
CONFIG_NET_VENDOR_STMICRO=y
# CONFIG_STMMAC_ETH is not set
CONFIG_NET_VENDOR_SUN=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NIU is not set
CONFIG_NET_VENDOR_TEHUTI=y
# CONFIG_TEHUTI is not set
CONFIG_NET_VENDOR_TI=y
# CONFIG_TI_CPSW_ALE is not set
# CONFIG_TLAN is not set
CONFIG_NET_VENDOR_VIA=y
# CONFIG_VIA_RHINE is not set
CONFIG_VIA_VELOCITY=y
CONFIG_NET_VENDOR_WIZNET=y
# CONFIG_WIZNET_W5100 is not set
# CONFIG_WIZNET_W5300 is not set
CONFIG_NET_VENDOR_SYNOPSYS=y
# CONFIG_DWC_XLGMAC is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_NET_SB1000 is not set
CONFIG_MDIO_DEVICE=y
CONFIG_MDIO_BUS=y
# CONFIG_MDIO_BITBANG is not set
# CONFIG_MDIO_THUNDER is not set
CONFIG_PHYLIB=y
# CONFIG_LED_TRIGGER_PHY is not set

#
# MII PHY device drivers
#
# CONFIG_AMD_PHY is not set
# CONFIG_AQUANTIA_PHY is not set
# CONFIG_AT803X_PHY is not set
# CONFIG_BCM7XXX_PHY is not set
# CONFIG_BCM87XX_PHY is not set
CONFIG_BCM_NET_PHYLIB=y
CONFIG_BROADCOM_PHY=y
CONFIG_CICADA_PHY=y
# CONFIG_CORTINA_PHY is not set
CONFIG_DAVICOM_PHY=y
# CONFIG_DP83822_PHY is not set
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
# CONFIG_FIXED_PHY is not set
CONFIG_ICPLUS_PHY=y
# CONFIG_INTEL_XWAY_PHY is not set
# CONFIG_LSI_ET1011C_PHY is not set
CONFIG_LXT_PHY=y
CONFIG_MARVELL_PHY=y
# CONFIG_MARVELL_10G_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_MICROCHIP_PHY is not set
# CONFIG_MICROSEMI_PHY is not set
# CONFIG_NATIONAL_PHY is not set
CONFIG_QSEMI_PHY=y
# CONFIG_REALTEK_PHY is not set
# CONFIG_RENESAS_PHY is not set
# CONFIG_ROCKCHIP_PHY is not set
CONFIG_SMSC_PHY=y
# CONFIG_STE10XP is not set
# CONFIG_TERANETICS_PHY is not set
CONFIG_VITESSE_PHY=y
# CONFIG_XILINX_GMII2RGMII is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
CONFIG_USB_NET_DRIVERS=y
CONFIG_USB_CATC=y
CONFIG_USB_KAWETH=y
CONFIG_USB_PEGASUS=y
CONFIG_USB_RTL8150=y
# CONFIG_USB_RTL8152 is not set
# CONFIG_USB_LAN78XX is not set
CONFIG_USB_USBNET=y
CONFIG_USB_NET_AX8817X=y
CONFIG_USB_NET_AX88179_178A=y
CONFIG_USB_NET_CDCETHER=y
CONFIG_USB_NET_CDC_EEM=y
CONFIG_USB_NET_CDC_NCM=y
# CONFIG_USB_NET_HUAWEI_CDC_NCM is not set
# CONFIG_USB_NET_CDC_MBIM is not set
CONFIG_USB_NET_DM9601=y
# CONFIG_USB_NET_SR9700 is not set
# CONFIG_USB_NET_SR9800 is not set
CONFIG_USB_NET_SMSC75XX=y
CONFIG_USB_NET_SMSC95XX=y
CONFIG_USB_NET_GL620A=y
CONFIG_USB_NET_NET1080=y
CONFIG_USB_NET_PLUSB=y
CONFIG_USB_NET_MCS7830=y
CONFIG_USB_NET_RNDIS_HOST=y
CONFIG_USB_NET_CDC_SUBSET_ENABLE=y
CONFIG_USB_NET_CDC_SUBSET=y
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
CONFIG_USB_KC2190=y
CONFIG_USB_NET_ZAURUS=y
# CONFIG_USB_NET_CX82310_ETH is not set
# CONFIG_USB_NET_KALMIA is not set
# CONFIG_USB_NET_QMI_WWAN is not set
CONFIG_USB_NET_INT51X1=y
# CONFIG_USB_IPHETH is not set
# CONFIG_USB_SIERRA_NET is not set
# CONFIG_USB_VL600 is not set
# CONFIG_USB_NET_CH9200 is not set
# CONFIG_WLAN is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
# CONFIG_WAN is not set
# CONFIG_VMXNET3 is not set
# CONFIG_FUJITSU_ES is not set
# CONFIG_NETDEVSIM is not set
# CONFIG_ISDN is not set
# CONFIG_NVM is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_LEDS=y
# CONFIG_INPUT_FF_MEMLESS is not set
CONFIG_INPUT_POLLDEV=y
CONFIG_INPUT_SPARSEKMAP=y
# CONFIG_INPUT_MATRIXKMAP is not set

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_DLINK_DIR685 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
CONFIG_KEYBOARD_NEWTON=y
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set
CONFIG_KEYBOARD_XTKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_SYNAPTICS_SMBUS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
# CONFIG_MOUSE_PS2_SENTELIC is not set
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_PS2_VMMOUSE is not set
CONFIG_MOUSE_PS2_SMBUS=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_CYAPA is not set
# CONFIG_MOUSE_ELAN_I2C is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_MOUSE_SYNAPTICS_I2C is not set
# CONFIG_MOUSE_SYNAPTICS_USB is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_E3X0_BUTTON is not set
# CONFIG_INPUT_PCSPKR is not set
# CONFIG_INPUT_MMA8450 is not set
# CONFIG_INPUT_APANEL is not set
# CONFIG_INPUT_ATLAS_BTNS is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_KXTJ9 is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
# CONFIG_INPUT_CM109 is not set
CONFIG_INPUT_UINPUT=y
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_ADXL34X is not set
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_CMA3000 is not set
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_DRV2665_HAPTICS is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set
# CONFIG_RMI4_CORE is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_SERIO_PS2MULT is not set
# CONFIG_SERIO_ARC_PS2 is not set
# CONFIG_USERIO is not set
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_NOZOMI=y
# CONFIG_N_GSM is not set
# CONFIG_TRACE_SINK is not set
CONFIG_DEVMEM=y
CONFIG_DEVKMEM=y

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_DEPRECATED_OPTIONS=y
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_DMA=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_EXAR=y
CONFIG_SERIAL_8250_NR_UARTS=16
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
CONFIG_SERIAL_8250_RSA=y
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_LPSS=y
CONFIG_SERIAL_8250_MID=y
# CONFIG_SERIAL_8250_MOXA is not set

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_DEV_BUS is not set
# CONFIG_TTY_PRINTK is not set
CONFIG_HVC_DRIVER=y
CONFIG_VIRTIO_CONSOLE=y
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
# CONFIG_HW_RANDOM_TIMERIOMEM is not set
CONFIG_HW_RANDOM_INTEL=y
# CONFIG_HW_RANDOM_AMD is not set
CONFIG_HW_RANDOM_VIA=y
CONFIG_HW_RANDOM_VIRTIO=y
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
CONFIG_HPET_MMAP_DEFAULT=y
CONFIG_HANGCHECK_TIMER=y
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
# CONFIG_XILLYBUS is not set

#
# I2C support
#
CONFIG_I2C=y
CONFIG_ACPI_I2C_OPREGION=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_COMPAT=y
# CONFIG_I2C_CHARDEV is not set
# CONFIG_I2C_MUX is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_ALGOBIT=y

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
# CONFIG_I2C_I801 is not set
# CONFIG_I2C_ISCH is not set
# CONFIG_I2C_ISMT is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set

#
# ACPI drivers
#
# CONFIG_I2C_SCMI is not set

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_DESIGNWARE_PLATFORM is not set
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_EMEV2 is not set
# CONFIG_I2C_OCORES is not set
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_SIMTEC is not set
# CONFIG_I2C_XILINX is not set

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_DIOLAN_U2C is not set
# CONFIG_I2C_PARPORT_LIGHT is not set
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_MLXCPLD is not set
# CONFIG_I2C_STUB is not set
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
CONFIG_PPS=y
# CONFIG_PPS_DEBUG is not set

#
# PPS clients support
#
# CONFIG_PPS_CLIENT_KTIMER is not set
# CONFIG_PPS_CLIENT_LDISC is not set
# CONFIG_PPS_CLIENT_GPIO is not set

#
# PPS generators support
#

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=y

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
CONFIG_PTP_1588_CLOCK_KVM=y
# CONFIG_PINCTRL is not set
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
# CONFIG_POWER_AVS is not set
# CONFIG_POWER_RESET is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_CHARGER_SBS is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_BQ2415X is not set
# CONFIG_CHARGER_SMB347 is not set
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
CONFIG_HWMON=y
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Native drivers
#
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7414 is not set
# CONFIG_SENSORS_AD7418 is not set
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1026 is not set
# CONFIG_SENSORS_ADM1029 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM9240 is not set
# CONFIG_SENSORS_ADT7410 is not set
# CONFIG_SENSORS_ADT7411 is not set
# CONFIG_SENSORS_ADT7462 is not set
# CONFIG_SENSORS_ADT7470 is not set
# CONFIG_SENSORS_ADT7475 is not set
# CONFIG_SENSORS_ASC7621 is not set
# CONFIG_SENSORS_K8TEMP is not set
# CONFIG_SENSORS_K10TEMP is not set
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_ASPEED is not set
# CONFIG_SENSORS_ATXP1 is not set
# CONFIG_SENSORS_DS620 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_DELL_SMM is not set
# CONFIG_SENSORS_I5K_AMB is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
# CONFIG_SENSORS_FSCHMD is not set
# CONFIG_SENSORS_FTSTEUTATES is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_GL520SM is not set
# CONFIG_SENSORS_G760A is not set
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_HIH6130 is not set
# CONFIG_SENSORS_I5500 is not set
# CONFIG_SENSORS_CORETEMP is not set
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWR1220 is not set
# CONFIG_SENSORS_LINEAGE is not set
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC2990 is not set
# CONFIG_SENSORS_LTC4151 is not set
# CONFIG_SENSORS_LTC4215 is not set
# CONFIG_SENSORS_LTC4222 is not set
# CONFIG_SENSORS_LTC4245 is not set
# CONFIG_SENSORS_LTC4260 is not set
# CONFIG_SENSORS_LTC4261 is not set
# CONFIG_SENSORS_MAX16065 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_MAX1668 is not set
# CONFIG_SENSORS_MAX197 is not set
# CONFIG_SENSORS_MAX6621 is not set
# CONFIG_SENSORS_MAX6639 is not set
# CONFIG_SENSORS_MAX6642 is not set
# CONFIG_SENSORS_MAX6650 is not set
# CONFIG_SENSORS_MAX6697 is not set
# CONFIG_SENSORS_MAX31790 is not set
# CONFIG_SENSORS_MCP3021 is not set
# CONFIG_SENSORS_TC654 is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM73 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_LM92 is not set
# CONFIG_SENSORS_LM93 is not set
# CONFIG_SENSORS_LM95234 is not set
# CONFIG_SENSORS_LM95241 is not set
# CONFIG_SENSORS_LM95245 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_NTC_THERMISTOR is not set
# CONFIG_SENSORS_NCT6683 is not set
# CONFIG_SENSORS_NCT6775 is not set
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_NCT7904 is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_PMBUS is not set
# CONFIG_SENSORS_SHT21 is not set
# CONFIG_SENSORS_SHT3x is not set
# CONFIG_SENSORS_SHTC1 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_DME1737 is not set
# CONFIG_SENSORS_EMC1403 is not set
# CONFIG_SENSORS_EMC2103 is not set
# CONFIG_SENSORS_EMC6W201 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47M192 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_SCH5627 is not set
# CONFIG_SENSORS_SCH5636 is not set
# CONFIG_SENSORS_STTS751 is not set
# CONFIG_SENSORS_SMM665 is not set
# CONFIG_SENSORS_ADC128D818 is not set
# CONFIG_SENSORS_ADS1015 is not set
# CONFIG_SENSORS_ADS7828 is not set
# CONFIG_SENSORS_AMC6821 is not set
# CONFIG_SENSORS_INA209 is not set
# CONFIG_SENSORS_INA2XX is not set
# CONFIG_SENSORS_INA3221 is not set
# CONFIG_SENSORS_TC74 is not set
# CONFIG_SENSORS_THMC50 is not set
# CONFIG_SENSORS_TMP102 is not set
# CONFIG_SENSORS_TMP103 is not set
# CONFIG_SENSORS_TMP108 is not set
# CONFIG_SENSORS_TMP401 is not set
# CONFIG_SENSORS_TMP421 is not set
# CONFIG_SENSORS_VIA_CPUTEMP is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83773G is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83791D is not set
# CONFIG_SENSORS_W83792D is not set
# CONFIG_SENSORS_W83793 is not set
# CONFIG_SENSORS_W83795 is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83L786NG is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
# CONFIG_SENSORS_XGENE is not set

#
# ACPI drivers
#
# CONFIG_SENSORS_ACPI_POWER is not set
# CONFIG_SENSORS_ATK0110 is not set
CONFIG_THERMAL=y
CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_WRITABLE_TRIPS=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
# CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR is not set
# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
CONFIG_THERMAL_GOV_STEP_WISE=y
# CONFIG_THERMAL_GOV_BANG_BANG is not set
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set
# CONFIG_THERMAL_EMULATION is not set
# CONFIG_INTEL_POWERCLAMP is not set
CONFIG_X86_PKG_TEMP_THERMAL=m
# CONFIG_INTEL_SOC_DTS_THERMAL is not set

#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
# CONFIG_INTEL_PCH_THERMAL is not set
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
# CONFIG_WATCHDOG_SYSFS is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=y
# CONFIG_WDAT_WDT is not set
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_ZIIRAVE_WATCHDOG is not set
# CONFIG_CADENCE_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_MAX63XX_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
# CONFIG_ALIM1535_WDT is not set
# CONFIG_ALIM7101_WDT is not set
# CONFIG_EBC_C384_WDT is not set
# CONFIG_F71808E_WDT is not set
# CONFIG_SP5100_TCO is not set
# CONFIG_SBC_FITPC2_WATCHDOG is not set
# CONFIG_EUROTECH_WDT is not set
# CONFIG_IB700_WDT is not set
# CONFIG_IBMASR is not set
# CONFIG_WAFER_WDT is not set
CONFIG_I6300ESB_WDT=y
# CONFIG_IE6XX_WDT is not set
CONFIG_ITCO_WDT=y
CONFIG_ITCO_VENDOR_SUPPORT=y
# CONFIG_IT8712F_WDT is not set
# CONFIG_IT87_WDT is not set
# CONFIG_HP_WATCHDOG is not set
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
# CONFIG_NV_TCO is not set
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_SMSC_SCH311X_WDT is not set
# CONFIG_SMSC37B787_WDT is not set
# CONFIG_VIA_WDT is not set
# CONFIG_W83627HF_WDT is not set
# CONFIG_W83877F_WDT is not set
# CONFIG_W83977F_WDT is not set
# CONFIG_MACHZ_WDT is not set
# CONFIG_SBC_EPX_C3_WATCHDOG is not set
# CONFIG_NI903X_WDT is not set
# CONFIG_NIC7018_WDT is not set

#
# PCI-based Watchdog Cards
#
# CONFIG_PCIPCWATCHDOG is not set
# CONFIG_WDTPCI is not set

#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set

#
# Watchdog Pretimeout Governors
#
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
CONFIG_SSB_POSSIBLE=y
CONFIG_SSB=y
CONFIG_SSB_SPROM=y
CONFIG_SSB_PCIHOST_POSSIBLE=y
CONFIG_SSB_PCIHOST=y
# CONFIG_SSB_SILENT is not set
# CONFIG_SSB_DEBUG is not set
CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
CONFIG_SSB_DRIVER_PCICORE=y
CONFIG_BCMA_POSSIBLE=y
# CONFIG_BCMA is not set

#
# Multifunction device drivers
#
CONFIG_MFD_CORE=y
# CONFIG_MFD_AS3711 is not set
# CONFIG_PMIC_ADP5520 is not set
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_BD9571MWV is not set
# CONFIG_MFD_AXP20X_I2C is not set
# CONFIG_MFD_CROS_EC is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_DA9052_I2C is not set
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
CONFIG_LPC_ICH=y
# CONFIG_LPC_SCH is not set
# CONFIG_INTEL_SOC_PMIC_CHTWC is not set
# CONFIG_MFD_INTEL_LPSS_ACPI is not set
# CONFIG_MFD_INTEL_LPSS_PCI is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_88PM860X is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX77843 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_MENF21BMC is not set
# CONFIG_MFD_VIPERBOARD is not set
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_RT5033 is not set
# CONFIG_MFD_RC5T583 is not set
# CONFIG_MFD_SEC_CORE is not set
# CONFIG_MFD_SI476X_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SKY81452 is not set
# CONFIG_MFD_SMSC is not set
# CONFIG_ABX500_CORE is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_MFD_LP8788 is not set
# CONFIG_MFD_TI_LMU is not set
# CONFIG_MFD_PALMAS is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65086 is not set
# CONFIG_MFD_TPS65090 is not set
# CONFIG_MFD_TPS68470 is not set
# CONFIG_MFD_TI_LP873X is not set
# CONFIG_MFD_TPS6586X is not set
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_TPS80031 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_TWL6040_CORE is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_VX855 is not set
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_REGULATOR is not set
# CONFIG_RC_CORE is not set
# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_INTEL=y
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_VIA is not set
CONFIG_INTEL_GTT=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
# CONFIG_VGA_SWITCHEROO is not set
CONFIG_DRM=y
CONFIG_DRM_MIPI_DSI=y
# CONFIG_DRM_DP_AUX_CHARDEV is not set
# CONFIG_DRM_DEBUG_MM is not set
# CONFIG_DRM_DEBUG_MM_SELFTEST is not set
CONFIG_DRM_KMS_HELPER=y
CONFIG_DRM_KMS_FB_HELPER=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_FBDEV_OVERALLOC=100
# CONFIG_DRM_LOAD_EDID_FIRMWARE is not set

#
# I2C encoder or helper chips
#
# CONFIG_DRM_I2C_CH7006 is not set
# CONFIG_DRM_I2C_SIL164 is not set
# CONFIG_DRM_I2C_NXP_TDA998X is not set
# CONFIG_DRM_RADEON is not set
# CONFIG_DRM_AMDGPU is not set

#
# ACP (Audio CoProcessor) Configuration
#

#
# AMD Library routines
#
# CONFIG_DRM_NOUVEAU is not set
CONFIG_DRM_I915=y
# CONFIG_DRM_I915_ALPHA_SUPPORT is not set
CONFIG_DRM_I915_CAPTURE_ERROR=y
CONFIG_DRM_I915_COMPRESS_ERROR=y
CONFIG_DRM_I915_USERPTR=y
# CONFIG_DRM_I915_GVT is not set

#
# drm/i915 Debugging
#
# CONFIG_DRM_I915_WERROR is not set
# CONFIG_DRM_I915_DEBUG is not set
# CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS is not set
# CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is not set
# CONFIG_DRM_I915_SELFTEST is not set
# CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is not set
# CONFIG_DRM_I915_DEBUG_VBLANK_EVADE is not set
# CONFIG_DRM_VGEM is not set
# CONFIG_DRM_VMWGFX is not set
# CONFIG_DRM_GMA500 is not set
# CONFIG_DRM_UDL is not set
# CONFIG_DRM_AST is not set
# CONFIG_DRM_MGAG200 is not set
# CONFIG_DRM_CIRRUS_QEMU is not set
# CONFIG_DRM_QXL is not set
# CONFIG_DRM_BOCHS is not set
# CONFIG_DRM_VIRTIO_GPU is not set
CONFIG_DRM_PANEL=y

#
# Display Panels
#
# CONFIG_DRM_PANEL_RASPBERRYPI_TOUCHSCREEN is not set
CONFIG_DRM_BRIDGE=y
CONFIG_DRM_PANEL_BRIDGE=y

#
# Display Interface Bridges
#
# CONFIG_DRM_ANALOGIX_ANX78XX is not set
# CONFIG_DRM_HISI_HIBMC is not set
# CONFIG_DRM_TINYDRM is not set
# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y

#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SYS_FILLRECT=y
CONFIG_FB_SYS_COPYAREA=y
CONFIG_FB_SYS_IMAGEBLIT=y
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYS_FOPS=y
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_MODE_HELPERS=y
# CONFIG_FB_TILEBLITTING is not set

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
# CONFIG_FB_VESA is not set
# CONFIG_FB_EFI is not set
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_IBM_GXT4500 is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_FB_AUO_K190X is not set
# CONFIG_FB_SIMPLE is not set
# CONFIG_FB_SM712 is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=y
# CONFIG_LCD_PLATFORM is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_GENERIC=y
# CONFIG_BACKLIGHT_APPLE is not set
# CONFIG_BACKLIGHT_PM8941_WLED is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3639 is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_BACKLIGHT_ARCXCNN is not set
CONFIG_HDMI=y

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=1024
# CONFIG_VGACON_SOFT_SCROLLBACK_PERSISTENT_ENABLE_BY_DEFAULT is not set
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
# CONFIG_FRAMEBUFFER_CONSOLE_ROTATION is not set
# CONFIG_LOGO is not set
CONFIG_SOUND=y
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_HWDEP=y
CONFIG_SND_SEQ_DEVICE=y
CONFIG_SND_JACK=y
CONFIG_SND_JACK_INPUT_DEV=y
# CONFIG_SND_OSSEMUL is not set
CONFIG_SND_PCM_TIMER=y
CONFIG_SND_HRTIMER=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=32
# CONFIG_SND_SUPPORT_OLD_API is not set
CONFIG_SND_PROC_FS=y
CONFIG_SND_VERBOSE_PROCFS=y
CONFIG_SND_VERBOSE_PRINTK=y
CONFIG_SND_DEBUG=y
CONFIG_SND_DEBUG_VERBOSE=y
CONFIG_SND_PCM_XRUN_DEBUG=y
CONFIG_SND_VMASTER=y
CONFIG_SND_DMA_SGBUF=y
CONFIG_SND_SEQUENCER=y
CONFIG_SND_SEQ_DUMMY=y
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_DRIVERS=y
CONFIG_SND_PCSP=m
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_ALOOP is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
CONFIG_SND_PCI=y
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ASIHPI is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_OXYGEN is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CTXFI is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_INDIGOIOX is not set
# CONFIG_SND_INDIGODJX is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_LOLA is not set
# CONFIG_SND_LX6464ES is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SE6X is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set

#
# HD-Audio
#
CONFIG_SND_HDA=y
CONFIG_SND_HDA_INTEL=y
CONFIG_SND_HDA_HWDEP=y
CONFIG_SND_HDA_RECONFIG=y
CONFIG_SND_HDA_INPUT_BEEP=y
CONFIG_SND_HDA_INPUT_BEEP_MODE=1
CONFIG_SND_HDA_PATCH_LOADER=y
CONFIG_SND_HDA_CODEC_REALTEK=y
CONFIG_SND_HDA_CODEC_ANALOG=y
CONFIG_SND_HDA_CODEC_SIGMATEL=y
CONFIG_SND_HDA_CODEC_VIA=y
CONFIG_SND_HDA_CODEC_HDMI=y
CONFIG_SND_HDA_CODEC_CIRRUS=y
CONFIG_SND_HDA_CODEC_CONEXANT=y
CONFIG_SND_HDA_CODEC_CA0110=y
CONFIG_SND_HDA_CODEC_CA0132=y
# CONFIG_SND_HDA_CODEC_CA0132_DSP is not set
CONFIG_SND_HDA_CODEC_CMEDIA=y
CONFIG_SND_HDA_CODEC_SI3054=y
CONFIG_SND_HDA_GENERIC=y
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
CONFIG_SND_HDA_CORE=y
CONFIG_SND_HDA_I915=y
CONFIG_SND_HDA_PREALLOC_SIZE=64
# CONFIG_SND_USB is not set
# CONFIG_SND_SOC is not set
CONFIG_SND_X86=y
# CONFIG_HDMI_LPE_AUDIO is not set

#
# HID support
#
CONFIG_HID=y
CONFIG_HID_BATTERY_STRENGTH=y
# CONFIG_HIDRAW is not set
# CONFIG_UHID is not set
CONFIG_HID_GENERIC=y

#
# Special HID drivers
#
# CONFIG_HID_A4TECH is not set
# CONFIG_HID_ACCUTOUCH is not set
# CONFIG_HID_ACRUX is not set
# CONFIG_HID_APPLE is not set
# CONFIG_HID_APPLEIR is not set
# CONFIG_HID_ASUS is not set
# CONFIG_HID_AUREAL is not set
# CONFIG_HID_BELKIN is not set
# CONFIG_HID_BETOP_FF is not set
# CONFIG_HID_CHERRY is not set
# CONFIG_HID_CHICONY is not set
# CONFIG_HID_CORSAIR is not set
# CONFIG_HID_PRODIKEYS is not set
# CONFIG_HID_CMEDIA is not set
# CONFIG_HID_CYPRESS is not set
# CONFIG_HID_DRAGONRISE is not set
# CONFIG_HID_EMS_FF is not set
# CONFIG_HID_ELAN is not set
# CONFIG_HID_ELECOM is not set
# CONFIG_HID_ELO is not set
# CONFIG_HID_EZKEY is not set
# CONFIG_HID_GEMBIRD is not set
# CONFIG_HID_GFRM is not set
# CONFIG_HID_HOLTEK is not set
# CONFIG_HID_GT683R is not set
# CONFIG_HID_KEYTOUCH is not set
# CONFIG_HID_KYE is not set
# CONFIG_HID_UCLOGIC is not set
# CONFIG_HID_WALTOP is not set
# CONFIG_HID_GYRATION is not set
# CONFIG_HID_ICADE is not set
# CONFIG_HID_ITE is not set
# CONFIG_HID_JABRA is not set
# CONFIG_HID_TWINHAN is not set
# CONFIG_HID_KENSINGTON is not set
# CONFIG_HID_LCPOWER is not set
# CONFIG_HID_LED is not set
# CONFIG_HID_LENOVO is not set
# CONFIG_HID_LOGITECH is not set
# CONFIG_HID_MAGICMOUSE is not set
# CONFIG_HID_MAYFLASH is not set
# CONFIG_HID_MICROSOFT is not set
# CONFIG_HID_MONTEREY is not set
# CONFIG_HID_MULTITOUCH is not set
# CONFIG_HID_NTI is not set
# CONFIG_HID_NTRIG is not set
# CONFIG_HID_ORTEK is not set
# CONFIG_HID_PANTHERLORD is not set
# CONFIG_HID_PENMOUNT is not set
# CONFIG_HID_PETALYNX is not set
# CONFIG_HID_PICOLCD is not set
# CONFIG_HID_PLANTRONICS is not set
# CONFIG_HID_PRIMAX is not set
# CONFIG_HID_RETRODE is not set
# CONFIG_HID_ROCCAT is not set
# CONFIG_HID_SAITEK is not set
# CONFIG_HID_SAMSUNG is not set
# CONFIG_HID_SONY is not set
# CONFIG_HID_SPEEDLINK is not set
# CONFIG_HID_STEELSERIES is not set
# CONFIG_HID_SUNPLUS is not set
# CONFIG_HID_RMI is not set
# CONFIG_HID_GREENASIA is not set
# CONFIG_HID_SMARTJOYPLUS is not set
# CONFIG_HID_TIVO is not set
# CONFIG_HID_TOPSEED is not set
# CONFIG_HID_THINGM is not set
# CONFIG_HID_THRUSTMASTER is not set
# CONFIG_HID_UDRAW_PS3 is not set
# CONFIG_HID_WACOM is not set
# CONFIG_HID_WIIMOTE is not set
# CONFIG_HID_XINMO is not set
# CONFIG_HID_ZEROPLUS is not set
# CONFIG_HID_ZYDACRON is not set
# CONFIG_HID_SENSOR_HUB is not set
# CONFIG_HID_ALPS is not set

#
# USB HID support
#
CONFIG_USB_HID=y
# CONFIG_HID_PID is not set
CONFIG_USB_HIDDEV=y

#
# I2C HID support
#
# CONFIG_I2C_HID is not set

#
# Intel ISH HID support
#
# CONFIG_INTEL_ISH_HID is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_PCI=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
CONFIG_USB_DYNAMIC_MINORS=y
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_OTG_BLACKLIST_HUB is not set
# CONFIG_USB_LEDS_TRIGGER_USBPORT is not set
# CONFIG_USB_MON is not set
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
# CONFIG_USB_XHCI_HCD is not set
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
# CONFIG_USB_EHCI_TT_NEWSCHED is not set
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
# CONFIG_USB_OHCI_HCD is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_HCD_SSB is not set
# CONFIG_USB_HCD_TEST_MODE is not set

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_WDM is not set
CONFIG_USB_TMC=m

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_REALTEK is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_USBAT=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_STORAGE_ALAUDA=y
CONFIG_USB_STORAGE_ONETOUCH=y
CONFIG_USB_STORAGE_KARMA=y
CONFIG_USB_STORAGE_CYPRESS_ATACB=y
# CONFIG_USB_STORAGE_ENE_UB6250 is not set
# CONFIG_USB_UAS is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
# CONFIG_USB_DWC3 is not set
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set
# CONFIG_USB_ISP1760 is not set

#
# USB port drivers
#
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_CONSOLE=y
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_SIMPLE is not set
# CONFIG_USB_SERIAL_AIRCABLE is not set
# CONFIG_USB_SERIAL_ARK3116 is not set
CONFIG_USB_SERIAL_BELKIN=y
# CONFIG_USB_SERIAL_CH341 is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
# CONFIG_USB_SERIAL_CP210X is not set
# CONFIG_USB_SERIAL_CYPRESS_M8 is not set
# CONFIG_USB_SERIAL_EMPEG is not set
# CONFIG_USB_SERIAL_FTDI_SIO is not set
# CONFIG_USB_SERIAL_VISOR is not set
# CONFIG_USB_SERIAL_IPAQ is not set
# CONFIG_USB_SERIAL_IR is not set
# CONFIG_USB_SERIAL_EDGEPORT is not set
# CONFIG_USB_SERIAL_EDGEPORT_TI is not set
# CONFIG_USB_SERIAL_F81232 is not set
# CONFIG_USB_SERIAL_F8153X is not set
# CONFIG_USB_SERIAL_GARMIN is not set
# CONFIG_USB_SERIAL_IPW is not set
# CONFIG_USB_SERIAL_IUU is not set
# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
# CONFIG_USB_SERIAL_KEYSPAN is not set
# CONFIG_USB_SERIAL_KLSI is not set
# CONFIG_USB_SERIAL_KOBIL_SCT is not set
CONFIG_USB_SERIAL_MCT_U232=y
# CONFIG_USB_SERIAL_METRO is not set
# CONFIG_USB_SERIAL_MOS7720 is not set
# CONFIG_USB_SERIAL_MOS7840 is not set
# CONFIG_USB_SERIAL_MXUPORT is not set
# CONFIG_USB_SERIAL_NAVMAN is not set
# CONFIG_USB_SERIAL_PL2303 is not set
# CONFIG_USB_SERIAL_OTI6858 is not set
# CONFIG_USB_SERIAL_QCAUX is not set
# CONFIG_USB_SERIAL_QUALCOMM is not set
# CONFIG_USB_SERIAL_SPCP8X5 is not set
# CONFIG_USB_SERIAL_SAFE is not set
# CONFIG_USB_SERIAL_SIERRAWIRELESS is not set
# CONFIG_USB_SERIAL_SYMBOL is not set
# CONFIG_USB_SERIAL_TI is not set
# CONFIG_USB_SERIAL_CYBERJACK is not set
# CONFIG_USB_SERIAL_XIRCOM is not set
# CONFIG_USB_SERIAL_OPTION is not set
# CONFIG_USB_SERIAL_OMNINET is not set
# CONFIG_USB_SERIAL_OPTICON is not set
# CONFIG_USB_SERIAL_XSENS_MT is not set
# CONFIG_USB_SERIAL_WISHBONE is not set
# CONFIG_USB_SERIAL_SSU100 is not set
# CONFIG_USB_SERIAL_QT2 is not set
# CONFIG_USB_SERIAL_UPD78F0730 is not set
# CONFIG_USB_SERIAL_DEBUG is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
CONFIG_USB_TEST=y
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_YUREX is not set
# CONFIG_USB_EZUSB_FX2 is not set
# CONFIG_USB_HUB_USB251XB is not set
# CONFIG_USB_HSIC_USB3503 is not set
# CONFIG_USB_HSIC_USB4604 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_CHAOSKEY is not set

#
# USB Physical Layer drivers
#
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_USB_ISP1301 is not set
# CONFIG_USB_GADGET is not set
# CONFIG_TYPEC is not set
# CONFIG_USB_LED_TRIG is not set
# CONFIG_USB_ULPI_BUS is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set
# CONFIG_LEDS_BRIGHTNESS_HW_CHANGED is not set

#
# LED drivers
#
# CONFIG_LEDS_APU is not set
# CONFIG_LEDS_LM3530 is not set
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_LP3944 is not set
# CONFIG_LEDS_LP5521 is not set
# CONFIG_LEDS_LP5523 is not set
# CONFIG_LEDS_LP5562 is not set
# CONFIG_LEDS_LP8501 is not set
# CONFIG_LEDS_CLEVO_MAIL is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_BD2802 is not set
# CONFIG_LEDS_INTEL_SS4200 is not set
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_TLC591XX is not set
# CONFIG_LEDS_LM355x is not set

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
# CONFIG_LEDS_BLINKM is not set
# CONFIG_LEDS_MLXCPLD is not set
# CONFIG_LEDS_MLXREG is not set
# CONFIG_LEDS_USER is not set
# CONFIG_LEDS_NIC78BX is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
# CONFIG_LEDS_TRIGGER_TIMER is not set
# CONFIG_LEDS_TRIGGER_ONESHOT is not set
# CONFIG_LEDS_TRIGGER_DISK is not set
# CONFIG_LEDS_TRIGGER_HEARTBEAT is not set
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_ACTIVITY is not set
# CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_LEDS_TRIGGER_TRANSIENT is not set
# CONFIG_LEDS_TRIGGER_CAMERA is not set
# CONFIG_LEDS_TRIGGER_PANIC is not set
# CONFIG_LEDS_TRIGGER_NETDEV is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_MC146818_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
CONFIG_RTC_SYSTOHC=y
CONFIG_RTC_SYSTOHC_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set
CONFIG_RTC_NVMEM=y

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
# CONFIG_RTC_DRV_ABX80X is not set
# CONFIG_RTC_DRV_DS1307 is not set
# CONFIG_RTC_DRV_DS1374 is not set
# CONFIG_RTC_DRV_DS1672 is not set
# CONFIG_RTC_DRV_MAX6900 is not set
# CONFIG_RTC_DRV_RS5C372 is not set
# CONFIG_RTC_DRV_ISL1208 is not set
# CONFIG_RTC_DRV_ISL12022 is not set
# CONFIG_RTC_DRV_X1205 is not set
# CONFIG_RTC_DRV_PCF8523 is not set
# CONFIG_RTC_DRV_PCF85063 is not set
# CONFIG_RTC_DRV_PCF85363 is not set
# CONFIG_RTC_DRV_PCF8563 is not set
# CONFIG_RTC_DRV_PCF8583 is not set
# CONFIG_RTC_DRV_M41T80 is not set
# CONFIG_RTC_DRV_BQ32K is not set
# CONFIG_RTC_DRV_S35390A is not set
# CONFIG_RTC_DRV_FM3130 is not set
# CONFIG_RTC_DRV_RX8010 is not set
# CONFIG_RTC_DRV_RX8581 is not set
# CONFIG_RTC_DRV_RX8025 is not set
# CONFIG_RTC_DRV_EM3027 is not set
# CONFIG_RTC_DRV_RV8803 is not set

#
# SPI RTC drivers
#
CONFIG_RTC_I2C_AND_SPI=y

#
# SPI and I2C RTC drivers
#
# CONFIG_RTC_DRV_DS3232 is not set
# CONFIG_RTC_DRV_PCF2127 is not set
# CONFIG_RTC_DRV_RV3029C2 is not set

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_DS2404 is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_MSM6242 is not set
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_RP5C01 is not set
# CONFIG_RTC_DRV_V3020 is not set

#
# on-CPU RTC drivers
#
# CONFIG_RTC_DRV_FTRTC010 is not set

#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set

#
# DMA Devices
#
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y
CONFIG_DMA_ACPI=y
# CONFIG_ALTERA_MSGDMA is not set
# CONFIG_INTEL_IDMA64 is not set
CONFIG_INTEL_IOATDMA=y
# CONFIG_QCOM_HIDMA_MGMT is not set
# CONFIG_QCOM_HIDMA is not set
CONFIG_DW_DMAC_CORE=y
# CONFIG_DW_DMAC is not set
# CONFIG_DW_DMAC_PCI is not set
CONFIG_HSU_DMA=y

#
# DMA Clients
#
# CONFIG_ASYNC_TX_DMA is not set
# CONFIG_DMATEST is not set
CONFIG_DMA_ENGINE_RAID=y

#
# DMABUF options
#
CONFIG_SYNC_FILE=y
# CONFIG_SW_SYNC is not set
CONFIG_DCA=y
# CONFIG_AUXDISPLAY is not set
CONFIG_UIO=y
# CONFIG_UIO_CIF is not set
# CONFIG_UIO_PDRV_GENIRQ is not set
# CONFIG_UIO_DMEM_GENIRQ is not set
# CONFIG_UIO_AEC is not set
# CONFIG_UIO_SERCOS3 is not set
# CONFIG_UIO_PCI_GENERIC is not set
# CONFIG_UIO_NETX is not set
# CONFIG_UIO_PRUSS is not set
# CONFIG_UIO_MF624 is not set
CONFIG_IRQ_BYPASS_MANAGER=y
# CONFIG_VIRT_DRIVERS is not set
CONFIG_VIRTIO=y
CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_VIRTIO_BALLOON=y
# CONFIG_VIRTIO_INPUT is not set
CONFIG_VIRTIO_MMIO=y
# CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set

#
# Microsoft Hyper-V guest support
#
# CONFIG_HYPERV is not set
# CONFIG_STAGING is not set
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACER_WMI=y
# CONFIG_ACER_WIRELESS is not set
# CONFIG_ACERHDF is not set
# CONFIG_ALIENWARE_WMI is not set
# CONFIG_ASUS_LAPTOP is not set
CONFIG_DELL_SMBIOS=y
# CONFIG_DELL_SMBIOS_WMI is not set
# CONFIG_DELL_LAPTOP is not set
CONFIG_DELL_WMI=y
CONFIG_DELL_WMI_DESCRIPTOR=y
# CONFIG_DELL_WMI_AIO is not set
# CONFIG_DELL_WMI_LED is not set
# CONFIG_DELL_SMO8800 is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_FUJITSU_TABLET is not set
# CONFIG_GPD_POCKET_FAN is not set
# CONFIG_HP_ACCEL is not set
# CONFIG_HP_WIRELESS is not set
CONFIG_HP_WMI=y
# CONFIG_PANASONIC_LAPTOP is not set
CONFIG_THINKPAD_ACPI=y
CONFIG_THINKPAD_ACPI_ALSA_SUPPORT=y
# CONFIG_THINKPAD_ACPI_DEBUGFACILITIES is not set
# CONFIG_THINKPAD_ACPI_DEBUG is not set
# CONFIG_THINKPAD_ACPI_UNSAFE_LEDS is not set
CONFIG_THINKPAD_ACPI_VIDEO=y
CONFIG_THINKPAD_ACPI_HOTKEY_POLL=y
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_INTEL_MENLOW is not set
CONFIG_EEEPC_LAPTOP=y
# CONFIG_ASUS_WMI is not set
# CONFIG_ASUS_WIRELESS is not set
CONFIG_ACPI_WMI=y
CONFIG_WMI_BMOF=y
# CONFIG_INTEL_WMI_THUNDERBOLT is not set
# CONFIG_MSI_WMI is not set
# CONFIG_PEAQ_WMI is not set
# CONFIG_TOPSTAR_LAPTOP is not set
# CONFIG_TOSHIBA_BT_RFKILL is not set
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_TOSHIBA_WMI is not set
# CONFIG_ACPI_CMPC is not set
# CONFIG_INTEL_HID_EVENT is not set
# CONFIG_INTEL_VBTN is not set
# CONFIG_INTEL_IPS is not set
# CONFIG_INTEL_PMC_CORE is not set
# CONFIG_IBM_RTL is not set
# CONFIG_SAMSUNG_LAPTOP is not set
# CONFIG_MXM_WMI is not set
# CONFIG_SAMSUNG_Q10 is not set
# CONFIG_APPLE_GMUX is not set
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
# CONFIG_PVPANIC is not set
# CONFIG_INTEL_PMC_IPC is not set
# CONFIG_SURFACE_PRO3_BUTTON is not set
# CONFIG_INTEL_PUNIT_IPC is not set
# CONFIG_MLX_PLATFORM is not set
# CONFIG_INTEL_TURBO_MAX_3 is not set
CONFIG_PMC_ATOM=y
# CONFIG_CHROME_PLATFORMS is not set
# CONFIG_MELLANOX_PLATFORM is not set
CONFIG_CLKDEV_LOOKUP=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y

#
# Common Clock Framework
#
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_CDCE706 is not set
# CONFIG_COMMON_CLK_CS2000_CP is not set
# CONFIG_HWSPINLOCK is not set

#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
CONFIG_MAILBOX=y
CONFIG_PCC=y
# CONFIG_ALTERA_MBOX is not set
CONFIG_IOMMU_SUPPORT=y

#
# Generic IOMMU Pagetable Support
#
# CONFIG_AMD_IOMMU is not set
# CONFIG_INTEL_IOMMU is not set
# CONFIG_IRQ_REMAP is not set

#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set

#
# Rpmsg drivers
#
# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
# CONFIG_RPMSG_VIRTIO is not set
# CONFIG_SOUNDWIRE is not set

#
# SOC (System On Chip) specific Drivers
#

#
# Amlogic SoC drivers
#

#
# Broadcom SoC drivers
#

#
# i.MX SoC drivers
#

#
# Qualcomm SoC drivers
#
# CONFIG_SOC_TI is not set

#
# Xilinx SoC drivers
#
# CONFIG_XILINX_VCU is not set
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_VME_BUS is not set
# CONFIG_PWM is not set

#
# IRQ chip support
#
CONFIG_ARM_GIC_MAX_NR=1
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
# CONFIG_FMC is not set

#
# PHY Subsystem
#
# CONFIG_GENERIC_PHY is not set
# CONFIG_BCM_KONA_USB2_PHY is not set
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set

#
# Performance monitor support
#
CONFIG_RAS=y
# CONFIG_RAS_CEC is not set
# CONFIG_THUNDERBOLT is not set

#
# Android
#
# CONFIG_ANDROID is not set
# CONFIG_LIBNVDIMM is not set
CONFIG_DAX=y
CONFIG_NVMEM=y
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
# CONFIG_FPGA is not set
# CONFIG_FSI is not set
# CONFIG_UNISYS_VISORBUS is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_DMI_SYSFS is not set
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
# CONFIG_ISCSI_IBFT_FIND is not set
# CONFIG_FW_CFG_SYSFS is not set
# CONFIG_GOOGLE_FIRMWARE is not set

#
# EFI (Extensible Firmware Interface) Support
#
# CONFIG_EFI_VARS is not set
CONFIG_EFI_ESRT=y
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_WRAPPERS=y
# CONFIG_EFI_CAPSULE_LOADER is not set
# CONFIG_EFI_TEST is not set

#
# Tegra firmware driver
#

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_FS_IOMAP=y
# CONFIG_EXT2_FS is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_ENCRYPTION is not set
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
CONFIG_REISERFS_PROC_INFO=y
# CONFIG_REISERFS_FS_XATTR is not set
CONFIG_JFS_FS=y
# CONFIG_JFS_POSIX_ACL is not set
# CONFIG_JFS_SECURITY is not set
# CONFIG_JFS_DEBUG is not set
# CONFIG_JFS_STATISTICS is not set
CONFIG_XFS_FS=y
# CONFIG_XFS_QUOTA is not set
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
# CONFIG_XFS_ONLINE_SCRUB is not set
# CONFIG_XFS_WARN is not set
# CONFIG_XFS_DEBUG is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
CONFIG_BTRFS_FS=y
# CONFIG_BTRFS_FS_POSIX_ACL is not set
# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
# CONFIG_BTRFS_FS_REF_VERIFY is not set
CONFIG_NILFS2_FS=y
CONFIG_F2FS_FS=m
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
# CONFIG_F2FS_FS_SECURITY is not set
# CONFIG_F2FS_CHECK_FS is not set
# CONFIG_F2FS_FS_ENCRYPTION is not set
# CONFIG_F2FS_IO_TRACE is not set
# CONFIG_F2FS_FAULT_INJECTION is not set
# CONFIG_FS_DAX is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
CONFIG_FILE_LOCKING=y
CONFIG_MANDATORY_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_FANOTIFY is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS4_FS is not set
CONFIG_FUSE_FS=y
# CONFIG_CUSE is not set
# CONFIG_OVERLAY_FS is not set

#
# Caches
#
# CONFIG_FSCACHE is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
# CONFIG_JOLIET is not set
# CONFIG_ZISOFS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
# CONFIG_MSDOS_FS is not set
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_FAT_DEFAULT_UTF8 is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
# CONFIG_TMPFS_POSIX_ACL is not set
# CONFIG_TMPFS_XATTR is not set
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_MEMFD_CREATE=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=m
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ORANGEFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=y
CONFIG_CRAMFS_BLOCKDEV=y
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_PSTORE is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V2=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
# CONFIG_NFS_SWAP is not set
# CONFIG_NFS_V4_1 is not set
CONFIG_ROOT_NFS=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V3_ACL is not set
CONFIG_NFSD_V4=y
# CONFIG_NFSD_BLOCKLAYOUT is not set
# CONFIG_NFSD_SCSILAYOUT is not set
# CONFIG_NFSD_FLEXFILELAYOUT is not set
# CONFIG_NFSD_FAULT_INJECTION is not set
CONFIG_GRACE_PERIOD=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
# CONFIG_SUNRPC_DEBUG is not set
# CONFIG_CEPH_FS is not set
CONFIG_CIFS=y
CONFIG_CIFS_STATS=y
CONFIG_CIFS_STATS2=y
# CONFIG_CIFS_WEAK_PW_HASH is not set
# CONFIG_CIFS_UPCALL is not set
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
# CONFIG_CIFS_ACL is not set
CONFIG_CIFS_DEBUG=y
CONFIG_CIFS_DEBUG2=y
# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set
# CONFIG_CIFS_DFS_UPCALL is not set
# CONFIG_CIFS_SMB311 is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=y
CONFIG_9P_FS_POSIX_ACL=y
# CONFIG_9P_FS_SECURITY is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
CONFIG_NLS_CODEPAGE_936=y
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_MAC_ROMAN is not set
# CONFIG_NLS_MAC_CELTIC is not set
# CONFIG_NLS_MAC_CENTEURO is not set
# CONFIG_NLS_MAC_CROATIAN is not set
# CONFIG_NLS_MAC_CYRILLIC is not set
# CONFIG_NLS_MAC_GAELIC is not set
# CONFIG_NLS_MAC_GREEK is not set
# CONFIG_NLS_MAC_ICELAND is not set
# CONFIG_NLS_MAC_INUIT is not set
# CONFIG_NLS_MAC_ROMANIAN is not set
# CONFIG_NLS_MAC_TURKISH is not set
CONFIG_NLS_UTF8=y
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
# CONFIG_DYNAMIC_DEBUG is not set

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_REDUCED=y
# CONFIG_DEBUG_INFO_SPLIT is not set
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_GDB_SCRIPTS is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_STACK_VALIDATION=y
CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_MAGIC_SYSRQ_SERIAL=y
CONFIG_DEBUG_KERNEL=y

#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_DEBUG_OBJECTS=y
# CONFIG_DEBUG_OBJECTS_SELFTEST is not set
# CONFIG_DEBUG_OBJECTS_FREE is not set
# CONFIG_DEBUG_OBJECTS_TIMERS is not set
# CONFIG_DEBUG_OBJECTS_WORK is not set
# CONFIG_DEBUG_OBJECTS_RCU_HEAD is not set
# CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER is not set
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
CONFIG_SLUB_DEBUG_ON=y
CONFIG_SLUB_STATS=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUG_VM=y
# CONFIG_DEBUG_VM_VMACACHE is not set
# CONFIG_DEBUG_VM_RB is not set
# CONFIG_DEBUG_VM_PGFLAGS is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
CONFIG_DEBUG_VIRTUAL=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_PER_CPU_MAPS=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_HAVE_ARCH_KASAN=y
# CONFIG_KASAN is not set
CONFIG_ARCH_HAS_KCOV=y
# CONFIG_KCOV is not set
CONFIG_DEBUG_SHIRQ=y

#
# Debug Lockups and Hangs
#
# CONFIG_SOFTLOCKUP_DETECTOR is not set
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
# CONFIG_HARDLOCKUP_DETECTOR is not set
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
# CONFIG_WQ_WATCHDOG is not set
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
CONFIG_SCHEDSTATS=y
# CONFIG_SCHED_STACK_END_CHECK is not set
# CONFIG_DEBUG_TIMEKEEPING is not set

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_LOCK_STAT=y
# CONFIG_DEBUG_LOCKDEP is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
# CONFIG_LOCK_TORTURE_TEST is not set
# CONFIG_WW_MUTEX_SELFTEST is not set
CONFIG_TRACE_IRQFLAGS=y
CONFIG_STACKTRACE=y
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_SG=y
CONFIG_DEBUG_NOTIFIERS=y
CONFIG_DEBUG_CREDENTIALS=y

#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
CONFIG_TORTURE_TEST=y
# CONFIG_RCU_PERF_TEST is not set
CONFIG_RCU_TORTURE_TEST=y
CONFIG_RCU_CPU_STALL_TIMEOUT=60
CONFIG_RCU_TRACE=y
# CONFIG_RCU_EQS_DEBUG is not set
# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
CONFIG_FAULT_INJECTION=y
CONFIG_FUNCTION_ERROR_INJECTION=y
# CONFIG_FAILSLAB is not set
CONFIG_FAIL_PAGE_ALLOC=y
CONFIG_FAIL_MAKE_REQUEST=y
# CONFIG_FAIL_IO_TIMEOUT is not set
# CONFIG_FAIL_FUTEX is not set
# CONFIG_FAIL_FUNCTION is not set
CONFIG_FAULT_INJECTION_DEBUG_FS=y
CONFIG_LATENCYTOP=y
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_IRQSOFF_TRACER=y
CONFIG_SCHED_TRACER=y
# CONFIG_HWLAT_TRACER is not set
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
# CONFIG_STACK_TRACER is not set
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENTS=y
CONFIG_UPROBE_EVENTS=y
CONFIG_PROBE_EVENTS=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
CONFIG_MMIOTRACE=y
# CONFIG_HIST_TRIGGERS is not set
CONFIG_MMIOTRACE_TEST=m
# CONFIG_TRACEPOINT_BENCHMARK is not set
# CONFIG_RING_BUFFER_BENCHMARK is not set
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
# CONFIG_TRACE_EVAL_MAP_FILE is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_DMA_API_DEBUG is not set
CONFIG_RUNTIME_TESTING_MENU=y
CONFIG_LKDTM=y
CONFIG_TEST_LIST_SORT=y
# CONFIG_TEST_SORT is not set
CONFIG_KPROBES_SANITY_TEST=y
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_INTERVAL_TREE_TEST is not set
# CONFIG_PERCPU_TEST is not set
CONFIG_ATOMIC64_SELFTEST=y
# CONFIG_ASYNC_RAID6_TEST is not set
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
# CONFIG_TEST_KSTRTOX is not set
# CONFIG_TEST_PRINTF is not set
# CONFIG_TEST_BITMAP is not set
# CONFIG_TEST_UUID is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_TEST_HASH is not set
# CONFIG_TEST_LKM is not set
# CONFIG_TEST_USER_COPY is not set
# CONFIG_TEST_BPF is not set
# CONFIG_FIND_BIT_BENCHMARK is not set
# CONFIG_TEST_FIRMWARE is not set
# CONFIG_TEST_SYSCTL is not set
# CONFIG_TEST_UDELAY is not set
# CONFIG_TEST_STATIC_KEYS is not set
# CONFIG_TEST_KMOD is not set
# CONFIG_TEST_DEBUG_VIRTUAL is not set
CONFIG_MEMTEST=y
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
# CONFIG_EARLY_PRINTK_EFI is not set
# CONFIG_EARLY_PRINTK_USB_XDBC is not set
CONFIG_X86_PTDUMP_CORE=y
CONFIG_X86_PTDUMP=y
# CONFIG_EFI_PGT_DUMP is not set
# CONFIG_DEBUG_WX is not set
CONFIG_DOUBLEFAULT=y
# CONFIG_DEBUG_TLBFLUSH is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
# CONFIG_X86_DECODER_SELFTEST is not set
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
# CONFIG_OPTIMIZE_INLINING is not set
# CONFIG_DEBUG_ENTRY is not set
CONFIG_DEBUG_NMI_SELFTEST=y
CONFIG_X86_DEBUG_FPU=y
# CONFIG_PUNIT_ATOM_DEBUG is not set
CONFIG_UNWINDER_ORC=y
# CONFIG_UNWINDER_FRAME_POINTER is not set
# CONFIG_UNWINDER_GUESS is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_KEYS_COMPAT=y
# CONFIG_PERSISTENT_KEYRINGS is not set
# CONFIG_BIG_KEYS is not set
# CONFIG_ENCRYPTED_KEYS is not set
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
CONFIG_PAGE_TABLE_ISOLATION=y
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
# CONFIG_HARDENED_USERCOPY is not set
# CONFIG_FORTIFY_SOURCE is not set
# CONFIG_STATIC_USERMODEHELPER is not set
# CONFIG_LOCK_DOWN_KERNEL is not set
# CONFIG_LOCK_DOWN_IN_EFI_SECURE_BOOT is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_DEFAULT_SECURITY=""
CONFIG_XOR_BLOCKS=y
CONFIG_ASYNC_CORE=y
CONFIG_ASYNC_MEMCPY=y
CONFIG_ASYNC_XOR=y
CONFIG_ASYNC_PQ=y
CONFIG_ASYNC_RAID6_RECOV=y
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_ACOMP2=y
# CONFIG_CRYPTO_RSA is not set
# CONFIG_CRYPTO_DH is not set
# CONFIG_CRYPTO_ECDH is not set
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_USER is not set
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
# CONFIG_CRYPTO_PCRYPT is not set
CONFIG_CRYPTO_WORKQUEUE=y
# CONFIG_CRYPTO_CRYPTD is not set
# CONFIG_CRYPTO_MCRYPTD is not set
CONFIG_CRYPTO_AUTHENC=y
# CONFIG_CRYPTO_TEST is not set
CONFIG_CRYPTO_ENGINE=m

#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=y
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=y

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=y
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_PCBC is not set
CONFIG_CRYPTO_XTS=y
# CONFIG_CRYPTO_KEYWRAP is not set

#
# Hash modes
#
CONFIG_CRYPTO_CMAC=y
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
# CONFIG_CRYPTO_VMAC is not set

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32C_INTEL is not set
CONFIG_CRYPTO_CRC32=m
# CONFIG_CRYPTO_CRC32_PCLMUL is not set
CONFIG_CRYPTO_CRCT10DIF=y
# CONFIG_CRYPTO_CRCT10DIF_PCLMUL is not set
# CONFIG_CRYPTO_GHASH is not set
# CONFIG_CRYPTO_POLY1305 is not set
# CONFIG_CRYPTO_POLY1305_X86_64 is not set
CONFIG_CRYPTO_MD4=y
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_RMD128 is not set
# CONFIG_CRYPTO_RMD160 is not set
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
CONFIG_CRYPTO_SHA1=y
# CONFIG_CRYPTO_SHA1_SSSE3 is not set
# CONFIG_CRYPTO_SHA256_SSSE3 is not set
# CONFIG_CRYPTO_SHA512_SSSE3 is not set
# CONFIG_CRYPTO_SHA1_MB is not set
# CONFIG_CRYPTO_SHA256_MB is not set
# CONFIG_CRYPTO_SHA512_MB is not set
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
# CONFIG_CRYPTO_SHA3 is not set
# CONFIG_CRYPTO_SM3 is not set
# CONFIG_CRYPTO_TGR192 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_TI is not set
# CONFIG_CRYPTO_AES_X86_64 is not set
# CONFIG_CRYPTO_AES_NI_INTEL is not set
# CONFIG_CRYPTO_ANUBIS is not set
CONFIG_CRYPTO_ARC4=y
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_BLOWFISH_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA is not set
# CONFIG_CRYPTO_CAMELLIA_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
# CONFIG_CRYPTO_FCRYPT is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SALSA20_X86_64 is not set
# CONFIG_CRYPTO_CHACHA20 is not set
# CONFIG_CRYPTO_CHACHA20_X86_64 is not set
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
# CONFIG_CRYPTO_SPECK is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
# CONFIG_CRYPTO_LZO is not set
# CONFIG_CRYPTO_842 is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
# CONFIG_CRYPTO_USER_API_HASH is not set
# CONFIG_CRYPTO_USER_API_SKCIPHER is not set
# CONFIG_CRYPTO_USER_API_RNG is not set
# CONFIG_CRYPTO_USER_API_AEAD is not set
CONFIG_CRYPTO_HW=y
# CONFIG_CRYPTO_DEV_PADLOCK is not set
# CONFIG_CRYPTO_DEV_CCP is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
# CONFIG_CRYPTO_DEV_QAT_C62X is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
CONFIG_CRYPTO_DEV_VIRTIO=m
# CONFIG_ASYMMETRIC_KEY_TYPE is not set

#
# Certificates for signature checking
#
# CONFIG_SYSTEM_BLACKLIST_KEYRING is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=y
CONFIG_KVM_INTEL=y
# CONFIG_KVM_AMD is not set
# CONFIG_KVM_MMU_AUDIT is not set
CONFIG_VHOST_NET=y
CONFIG_VHOST=y
# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_RAID6_PQ=y
CONFIG_BITREVERSE=y
CONFIG_RATIONAL=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_CRC_CCITT=y
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC4 is not set
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=y
# CONFIG_CRC8 is not set
CONFIG_XXHASH=y
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_ZSTD_COMPRESS=y
CONFIG_ZSTD_DECOMPRESS=y
# CONFIG_XZ_DEC is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=y
CONFIG_TEXTSEARCH_BM=y
CONFIG_TEXTSEARCH_FSM=y
CONFIG_BTREE=y
CONFIG_INTERVAL_TREE=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_SGL_ALLOC=y
# CONFIG_CPUMASK_OFFSTACK is not set
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
# CONFIG_CORDIC is not set
# CONFIG_DDR is not set
# CONFIG_IRQ_POLL is not set
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_FONT_SUPPORT=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_SG_CHAIN=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
CONFIG_SBITMAP=y
# CONFIG_STRING_SELFTEST is not set

[-- Attachment #3: job-script --]
[-- Type: text/plain, Size: 4349 bytes --]

#!/bin/sh

export_top_env()
{
	export suite='boot'
	export testcase='boot'
	export timeout='10m'
	export job_origin='/lkp/lkp/src/jobs/boot.yaml'
	export queue='bisect'
	export testbox='vm-lkp-nex04-4G-18'
	export tbox_group='vm-lkp-nex04-4G'
	export branch='linux-review/Laurent-Dufour/Speculative-page-faults/20180316-151833'
	export commit='b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e'
	export kconfig='x86_64-nfsroot'
	export submit_id='5aab89310b9a93cba20a4ba0'
	export job_file='/lkp/scheduled/vm-lkp-nex04-4G-18/boot-1-debian-x86_64-2016-08-31.cgz-b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e-20180316-52130-1wxdvik-0.yaml'
	export id='cdc6ea0921656abe0725891a10a3ad7514fb9ecf'
	export model='qemu-system-x86_64 -enable-kvm -cpu host'
	export nr_vm=20
	export nr_cpu=2
	export memory='4G'
	export hdd_partitions='/dev/vda'
	export swap_partitions='/dev/vdb'
	export need_kconfig='CONFIG_KVM_GUEST=y'
	export ssh_base_port=23230
	export compiler='gcc-7'
	export rootfs='debian-x86_64-2016-08-31.cgz'
	export enqueue_time='2018-03-16 17:06:57 +0800'
	export _id='5aab89310b9a93cba20a4ba0'
	export _rt='/result/boot/1/vm-lkp-nex04-4G/debian-x86_64-2016-08-31.cgz/x86_64-nfsroot/gcc-7/b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e'
	export user='lkp'
	export result_root='/result/boot/1/vm-lkp-nex04-4G/debian-x86_64-2016-08-31.cgz/x86_64-nfsroot/gcc-7/b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e/0'
	export LKP_SERVER='inn'
	export max_uptime=600
	export initrd='/osimage/debian/debian-x86_64-2016-08-31.cgz'
	export bootloader_append='root=/dev/ram0
user=lkp
job=/lkp/scheduled/vm-lkp-nex04-4G-18/boot-1-debian-x86_64-2016-08-31.cgz-b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e-20180316-52130-1wxdvik-0.yaml
ARCH=x86_64
kconfig=x86_64-nfsroot
branch=linux-review/Laurent-Dufour/Speculative-page-faults/20180316-151833
commit=b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e
BOOT_IMAGE=/pkg/linux/x86_64-nfsroot/gcc-7/b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e/vmlinuz-4.16.0-rc4-next-20180309-00017-gb33ddf5
max_uptime=600
RESULT_ROOT=/result/boot/1/vm-lkp-nex04-4G/debian-x86_64-2016-08-31.cgz/x86_64-nfsroot/gcc-7/b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e/0
LKP_SERVER=inn
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
net.ifnames=0
printk.devkmsg=on
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
drbd.minor_count=8
systemd.log_level=err
ignore_loglevel
console=tty0
earlyprintk=ttyS0,115200
console=ttyS0,115200
vga=normal
rw'
	export modules_initrd='/pkg/linux/x86_64-nfsroot/gcc-7/b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e/modules.cgz'
	export bm_initrd='/osimage/deps/debian-x86_64-2016-08-31.cgz/lkp_2017-12-14.cgz,/osimage/deps/debian-x86_64-2016-08-31.cgz/rsync-rootfs_2016-11-15.cgz,/osimage/deps/debian-x86_64-2016-08-31.cgz/run-ipconfig_2016-11-15.cgz'
	export lkp_initrd='/lkp/lkp/lkp-x86_64.cgz'
	export site='inn'
	export LKP_CGI_PORT=80
	export LKP_CIFS_PORT=139
	export kernel='/pkg/linux/x86_64-nfsroot/gcc-7/b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e/vmlinuz-4.16.0-rc4-next-20180309-00017-gb33ddf5'
	export dequeue_time='2018-03-16 17:15:13 +0800'
	export job_initrd='/lkp/scheduled/vm-lkp-nex04-4G-18/boot-1-debian-x86_64-2016-08-31.cgz-b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e-20180316-52130-1wxdvik-0.cgz'

	[ -n "$LKP_SRC" ] ||
	export LKP_SRC=/lkp/${user:-lkp}/src
}

run_job()
{
	echo $$ > $TMP/run-job.pid

	. $LKP_SRC/lib/http.sh
	. $LKP_SRC/lib/job.sh
	. $LKP_SRC/lib/env.sh

	export_top_env

	run_monitor $LKP_SRC/monitors/one-shot/wrapper boot-slabinfo
	run_monitor $LKP_SRC/monitors/one-shot/wrapper boot-meminfo
	run_monitor $LKP_SRC/monitors/one-shot/wrapper memmap
	run_monitor $LKP_SRC/monitors/no-stdout/wrapper boot-time
	run_monitor $LKP_SRC/monitors/wrapper kmsg
	run_monitor $LKP_SRC/monitors/wrapper oom-killer
	run_monitor $LKP_SRC/monitors/plain/watchdog

	run_test $LKP_SRC/tests/wrapper sleep 1
}

extract_stats()
{
	$LKP_SRC/stats/wrapper boot-slabinfo
	$LKP_SRC/stats/wrapper boot-meminfo
	$LKP_SRC/stats/wrapper memmap
	$LKP_SRC/stats/wrapper boot-memory
	$LKP_SRC/stats/wrapper boot-time
	$LKP_SRC/stats/wrapper kernel-size
	$LKP_SRC/stats/wrapper kmsg

	$LKP_SRC/stats/wrapper time sleep.time
	$LKP_SRC/stats/wrapper time
	$LKP_SRC/stats/wrapper dmesg
	$LKP_SRC/stats/wrapper kmsg
	$LKP_SRC/stats/wrapper stderr
	$LKP_SRC/stats/wrapper last_state
}

"$@"

[-- Attachment #4: dmesg.xz --]
[-- Type: application/x-xz, Size: 15688 bytes --]

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b33ddf50eb: INFO:trying_to_register_non-static_key
  2018-03-16 10:23   ` [mm] b33ddf50eb: INFO:trying_to_register_non-static_key kernel test robot
@ 2018-03-16 16:38     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-16 16:38 UTC (permalink / raw)
  To: kernel test robot
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86, lkp

On 16/03/2018 11:23, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
> 
> commit: b33ddf50ebcc740b990dd2e0e8ff0b92c7acf58e ("mm: Protect mm_rb tree with a rwlock")
> url: https://github.com/0day-ci/linux/commits/Laurent-Dufour/Speculative-page-faults/20180316-151833
> 
> 
> in testcase: boot
> 
> on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G
> 
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> 
> 
> +----------------------------------------+------------+------------+
> |                                        | 7f3f7b4e80 | b33ddf50eb |
> +----------------------------------------+------------+------------+
> | boot_successes                         | 8          | 0          |
> | boot_failures                          | 0          | 6          |
> | INFO:trying_to_register_non-static_key | 0          | 6          |
> +----------------------------------------+------------+------------+
> 
> 
> 
> [   22.218186] INFO: trying to register non-static key.
> [   22.220252] the code is fine but needs lockdep annotation.
> [   22.222471] turning off the locking correctness validator.
> [   22.224839] CPU: 0 PID: 1 Comm: init Not tainted 4.16.0-rc4-next-20180309-00017-gb33ddf5 #1
> [   22.228528] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [   22.232443] Call Trace:
> [   22.234234]  dump_stack+0x85/0xbc
> [   22.236085]  register_lock_class+0x237/0x477
> [   22.238057]  __lock_acquire+0xd0/0xf15
> [   22.240032]  lock_acquire+0x19c/0x1ce
> [   22.241927]  ? do_mmap+0x3aa/0x3ff
> [   22.243749]  mmap_region+0x37a/0x4c0
> [   22.245619]  ? do_mmap+0x3aa/0x3ff
> [   22.247425]  do_mmap+0x3aa/0x3ff
> [   22.249175]  vm_mmap_pgoff+0xa1/0xea
> [   22.251083]  elf_map+0x6d/0x134
> [   22.252873]  load_elf_binary+0x56f/0xe07
> [   22.254853]  search_binary_handler+0x75/0x1f8
> [   22.256934]  do_execveat_common+0x661/0x92b
> [   22.259164]  ? rest_init+0x22e/0x22e
> [   22.261082]  do_execve+0x1f/0x21
> [   22.262884]  kernel_init+0x5a/0xf0
> [   22.264722]  ret_from_fork+0x3a/0x50
> [   22.303240] systemd[1]: RTC configured in localtime, applying delta of 480 minutes to system time.
> [   22.313544] systemd[1]: Failed to insert module 'autofs4': No such file or directory

Thanks a lot for reporting this.

I found the issue introduced in that patch.
I mistakenly remove in the call to seqcount_init(&vma->vm_sequence) in
__vma_link_rb().
This doesn't have a functional impact as the vm_sequence is incremented
monotonically.

I'll fix that in the next series.

Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [mm]  b1f0502d04: INFO:trying_to_register_non-static_key
       [not found] ` <1520963994-28477-8-git-send-email-ldufour@linux.vnet.ibm.com>
@ 2018-03-17  7:51   ` kernel test robot
  2018-03-21 12:21     ` Laurent Dufour
  2018-03-27 21:30   ` [PATCH v9 07/24] mm: VMA sequence count David Rientjes
  1 sibling, 1 reply; 98+ messages in thread
From: kernel test robot @ 2018-03-17  7:51 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86, lkp

[-- Attachment #1: Type: text/plain, Size: 4926 bytes --]

FYI, we noticed the following commit (built with gcc-7):

commit: b1f0502d04537ef55b0c296823affe332b100eb5 ("mm: VMA sequence count")
url: https://github.com/0day-ci/linux/commits/Laurent-Dufour/Speculative-page-faults/20180316-151833


in testcase: trinity
with following parameters:

	runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -m 512M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+----------------------------------------+------------+------------+
|                                        | 6a4ce82339 | b1f0502d04 |
+----------------------------------------+------------+------------+
| boot_successes                         | 8          | 4          |
| boot_failures                          | 0          | 4          |
| INFO:trying_to_register_non-static_key | 0          | 4          |
+----------------------------------------+------------+------------+



[   22.212940] INFO: trying to register non-static key.
[   22.213687] the code is fine but needs lockdep annotation.
[   22.214459] turning off the locking correctness validator.
[   22.227459] CPU: 0 PID: 547 Comm: trinity-main Not tainted 4.16.0-rc4-next-20180309-00007-gb1f0502 #239
[   22.228904] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   22.230043] Call Trace:
[   22.230409]  dump_stack+0x5d/0x79
[   22.231025]  register_lock_class+0x226/0x45e
[   22.231827]  ? kvm_clock_read+0x21/0x30
[   22.232544]  ? kvm_sched_clock_read+0x5/0xd
[   22.233330]  __lock_acquire+0xa2/0x774
[   22.234152]  lock_acquire+0x4b/0x66
[   22.234805]  ? unmap_vmas+0x30/0x3d
[   22.245680]  unmap_page_range+0x56/0x48c
[   22.248127]  ? unmap_vmas+0x30/0x3d
[   22.248741]  ? lru_deactivate_file_fn+0x2c6/0x2c6
[   22.249537]  ? pagevec_lru_move_fn+0x9a/0xa9
[   22.250244]  unmap_vmas+0x30/0x3d
[   22.250791]  unmap_region+0xad/0x105
[   22.251419]  mmap_region+0x3cc/0x455
[   22.252011]  do_mmap+0x394/0x3e9
[   22.261224]  vm_mmap_pgoff+0x9c/0xe5
[   22.261798]  SyS_mmap_pgoff+0x19a/0x1d4
[   22.262475]  ? task_work_run+0x5e/0x9c
[   22.263163]  do_syscall_64+0x6d/0x103
[   22.263814]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   22.264697] RIP: 0033:0x4573da
[   22.267248] RSP: 002b:00007fffa22f1398 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[   22.274720] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00000000004573da
[   22.276083] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[   22.277343] RBP: 000000000000001c R08: 000000000000001c R09: 0000000000000000
[   22.278686] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
[   22.279930] R13: 0000000000001000 R14: 0000000000000002 R15: 0000000000000000
[   22.391866] trinity-main uses obsolete (PF_INET,SOCK_PACKET)
[  327.566956] sysrq: SysRq : Emergency Sync
[  327.567849] Emergency Sync complete
[  327.569975] sysrq: SysRq : Resetting

Elapsed time: 330

#!/bin/bash

# To reproduce,
# 1) save job-script and this script (both are attached in 0day report email)
# 2) run this script with your compiled kernel and optional env $INSTALL_MOD_PATH

kernel=$1

initrds=(
	/osimage/yocto/yocto-minimal-x86_64-2016-04-22.cgz
	/lkp/lkp/lkp-x86_64.cgz
	/osimage/pkg/debian-x86_64-2016-08-31.cgz/trinity-static-x86_64-x86_64-6ddabfd2_2017-11-10.cgz
)

HTTP_PREFIX=https://github.com/0day-ci/lkp-qemu/raw/master
wget --timestamping "${initrds[@]/#/$HTTP_PREFIX}"

{
	cat "${initrds[@]//*\//}"
	[[ $INSTALL_MOD_PATH ]] && (
		cd "$INSTALL_MOD_PATH"
		find lib | cpio -o -H newc --quiet | gzip
	)
	echo  job-script | cpio -o -H newc --quiet | gzip
} > initrd.img

qemu-img create -f qcow2 disk-vm-kbuild-yocto-x86_64-62-0 256G

kvm=(
	qemu-system-x86_64
	-enable-kvm
	-cpu SandyBridge
	-kernel $kernel
	-initrd initrd.img
	-m 512
	-smp 1
	-device e1000,netdev=net0
	-netdev user,id=net0
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-watchdog-action debug
	-rtc base=localtime
	-drive file=disk-vm-kbuild-yocto-x86_64-62-0,media=disk,if=virtio
	-serial stdio
	-display none
	-monitor null
)

append=(
	ip=::::vm-kbuild-yocto-x86_64-62::dhcp
	root=/dev/ram0
	user=lkp
	job=/job-script
	ARCH=x86_64
	kconfig=x86_64-acpi-redef
	branch=linux-devel/devel-catchup-201803161558
	commit=b1f0502d04537ef55b0c296823affe332b100eb5
	BOOT_IMAGE=/pkg/linux/x86_64-acpi-redef/gcc-7/b1f0502d04537ef55b0c296823affe332b100eb5/vmlinuz-4.16.0-rc4-next-20180309-00007-gb1f0502
	max_uptime=1500
	RESULT_ROOT=/result/trinity/300s/vm-kbuild-yocto-x86_64/yocto-minimal-x86_64-2016-04-22.cgz/x86_64-acpi-redef/gcc-7/b1f0502d04537ef55b0c296823affe332b100eb5/0


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> job-script  # job-script is attached in this email



Thanks,
lkp

[-- Attachment #2: config-4.16.0-rc4-next-20180309-00007-gb1f0502 --]
[-- Type: text/plain, Size: 125596 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.16.0-rc4 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
CONFIG_KERNEL_XZ=y
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
# CONFIG_SYSVIPC is not set
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_GENERIC_IRQ_CHIP=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
# CONFIG_TASK_XACCT is not set
CONFIG_CPU_ISOLATION=y

#
# RCU Subsystem
#
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=20
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_CGROUPS=y
# CONFIG_MEMCG is not set
# CONFIG_BLK_CGROUP is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
CONFIG_RT_GROUP_SCHED=y
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
CONFIG_CGROUP_FREEZER=y
# CONFIG_CGROUP_HUGETLB is not set
CONFIG_CPUSETS=y
# CONFIG_PROC_PID_CPUSET is not set
# CONFIG_CGROUP_DEVICE is not set
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_DEBUG=y
# CONFIG_NAMESPACES is not set
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
# CONFIG_RD_LZO is not set
CONFIG_RD_LZ4=y
# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
CONFIG_EXPERT=y
# CONFIG_UID16 is not set
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_PRINTK_NMI=y
CONFIG_BUG=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
# CONFIG_BPF_SYSCALL is not set
# CONFIG_USERFAULTFD is not set
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_EMBEDDED=y
CONFIG_HAVE_PERF_EVENTS=y
# CONFIG_PC104 is not set

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
CONFIG_COMPAT_BRK=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_SLAB_MERGE_DEFAULT=y
# CONFIG_SLAB_FREELIST_RANDOM is not set
# CONFIG_SLAB_FREELIST_HARDENED is not set
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_PROFILING is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
# CONFIG_JUMP_LABEL is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_RCU_TABLE_FREE=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_GCC_PLUGINS=y
# CONFIG_GCC_PLUGINS is not set
CONFIG_HAVE_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR_NONE is not set
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_CC_STACKPROTECTOR_STRONG is not set
CONFIG_CC_STACKPROTECTOR_AUTO=y
CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y
CONFIG_HAVE_COPY_THREAD_TLS=y
CONFIG_HAVE_STACK_VALIDATION=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y
CONFIG_HAVE_ARCH_VMAP_STACK=y
CONFIG_VMAP_STACK=y
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_ARCH_HAS_PHYS_TO_DMA=y
CONFIG_ARCH_HAS_REFCOUNT=y
# CONFIG_REFCOUNT_FULL is not set

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
# CONFIG_MODULES is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLK_SCSI_REQUEST=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
# CONFIG_BLK_DEV_ZONED is not set
# CONFIG_BLK_CMDLINE_PARSER is not set
# CONFIG_BLK_WBT is not set
CONFIG_BLK_DEBUG_FS=y
# CONFIG_BLK_SED_OPAL is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_EFI_PARTITION=y
CONFIG_BLOCK_COMPAT=y
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_DEFAULT_NOOP=y
CONFIG_DEFAULT_IOSCHED="noop"
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_MQ_IOSCHED_KYBER=y
# CONFIG_IOSCHED_BFQ is not set
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
CONFIG_FREEZER=y

#
# Processor type and features
#
# CONFIG_ZONE_DMA is not set
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_FAST_FEATURE_TESTS=y
# CONFIG_X86_X2APIC is not set
CONFIG_X86_MPPARSE=y
# CONFIG_GOLDFISH is not set
CONFIG_RETPOLINE=y
# CONFIG_INTEL_RDT is not set
# CONFIG_X86_EXTENDED_PLATFORM is not set
# CONFIG_X86_INTEL_LPSS is not set
# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
# CONFIG_IOSF_MBI is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
# CONFIG_PARAVIRT_SPINLOCKS is not set
CONFIG_XEN=y
CONFIG_XEN_PV=y
CONFIG_XEN_PV_SMP=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PVHVM=y
CONFIG_XEN_PVHVM_SMP=y
CONFIG_XEN_512GB=y
CONFIG_XEN_SAVE_RESTORE=y
CONFIG_XEN_DEBUG_FS=y
# CONFIG_XEN_PVH is not set
CONFIG_KVM_GUEST=y
# CONFIG_KVM_DEBUG_FS is not set
CONFIG_PARAVIRT_TIME_ACCOUNTING=y
CONFIG_PARAVIRT_CLOCK=y
# CONFIG_JAILHOUSE_GUEST is not set
CONFIG_NO_BOOTMEM=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
# CONFIG_PROCESSOR_SELECT is not set
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS_RANGE_BEGIN=2
CONFIG_NR_CPUS_RANGE_END=512
CONFIG_NR_CPUS_DEFAULT=64
CONFIG_NR_CPUS=8
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
CONFIG_SCHED_MC_PRIO=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCELOG_LEGACY is not set
# CONFIG_X86_MCE_INTEL is not set
CONFIG_X86_MCE_AMD=y
CONFIG_X86_MCE_THRESHOLD=y
# CONFIG_X86_MCE_INJECT is not set

#
# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_PERF_EVENTS_INTEL_RAPL=y
CONFIG_PERF_EVENTS_INTEL_CSTATE=y
# CONFIG_PERF_EVENTS_AMD_POWER is not set
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
CONFIG_X86_CPUID=y
# CONFIG_X86_5LEVEL is not set
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_X86_DIRECT_GBPAGES=y
CONFIG_ARCH_HAS_MEM_ENCRYPT=y
# CONFIG_AMD_MEM_ENCRYPT is not set
# CONFIG_NUMA is not set
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
# CONFIG_SPARSEMEM_VMEMMAP is not set
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_HAVE_GENERIC_GUP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_MEMORY_ISOLATION=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
# CONFIG_BALLOON_COMPACTION is not set
CONFIG_COMPACTION=y
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_ARCH_ENABLE_THP_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_VIRT_TO_BUS=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
# CONFIG_HWPOISON_INJECT is not set
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_ARCH_WANTS_THP_SWAP=y
CONFIG_THP_SWAP=y
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
# CONFIG_CLEANCACHE is not set
# CONFIG_FRONTSWAP is not set
# CONFIG_CMA is not set
# CONFIG_ZPOOL is not set
# CONFIG_ZBUD is not set
CONFIG_ZSMALLOC=y
# CONFIG_PGTABLE_MAPPING is not set
# CONFIG_ZSMALLOC_STAT is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ARCH_HAS_ZONE_DEVICE=y
CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
CONFIG_ARCH_HAS_PKEYS=y
# CONFIG_PERCPU_STATS is not set
# CONFIG_GUP_BENCHMARK is not set
CONFIG_SPECULATIVE_PAGE_FAULT=y
# CONFIG_X86_PMEM_LEGACY is not set
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
# CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
CONFIG_X86_RESERVE_LOW=64
# CONFIG_MTRR is not set
CONFIG_ARCH_RANDOM=y
# CONFIG_X86_SMAP is not set
CONFIG_X86_INTEL_UMIP=y
# CONFIG_X86_INTEL_MPX is not set
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_EFI=y
CONFIG_EFI_STUB=y
# CONFIG_EFI_MIXED is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_SCHED_HRTICK=y
# CONFIG_KEXEC is not set
# CONFIG_KEXEC_FILE is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_X86_NEED_RELOCS=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_DYNAMIC_MEMORY_LAYOUT=y
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
CONFIG_HOTPLUG_CPU=y
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
CONFIG_COMPAT_VDSO=y
# CONFIG_LEGACY_VSYSCALL_NATIVE is not set
CONFIG_LEGACY_VSYSCALL_EMULATE=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
CONFIG_HAVE_LIVEPATCH=y
CONFIG_ARCH_HAS_ADD_PAGES=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management and ACPI options
#
# CONFIG_SUSPEND is not set
CONFIG_HIBERNATE_CALLBACKS=y
# CONFIG_HIBERNATION is not set
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_ADVANCED_DEBUG is not set
CONFIG_PM_SLEEP_DEBUG=y
# CONFIG_PM_TRACE_RTC is not set
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SPCR_TABLE=y
CONFIG_ACPI_LPIT=y
# CONFIG_ACPI_PROCFS_POWER is not set
CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
CONFIG_ACPI_EC_DEBUGFS=y
# CONFIG_ACPI_AC is not set
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_CSTATE=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_CPPC_LIB=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_PROCESSOR_AGGREGATOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_TABLE_UPGRADE=y
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_PCI_SLOT=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_ACPI_SBS=y
CONFIG_ACPI_HED=y
# CONFIG_ACPI_CUSTOM_METHOD is not set
# CONFIG_ACPI_BGRT is not set
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
# CONFIG_ACPI_NFIT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
# CONFIG_ACPI_APEI is not set
# CONFIG_DPTF_POWER is not set
# CONFIG_ACPI_EXTLOG is not set
# CONFIG_PMIC_OPREGION is not set
# CONFIG_ACPI_CONFIGFS is not set
CONFIG_X86_PM_TIMER=y
CONFIG_SFI=y

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set

#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
CONFIG_X86_PCC_CPUFREQ=y
# CONFIG_X86_ACPI_CPUFREQ is not set
CONFIG_X86_SPEEDSTEP_CENTRINO=y
CONFIG_X86_P4_CLOCKMOD=y

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=y

#
# CPU Idle
#
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
# CONFIG_CPU_IDLE_GOV_MENU is not set
CONFIG_INTEL_IDLE=y

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_MMCONFIG is not set
CONFIG_PCI_XEN=y
CONFIG_PCI_DOMAINS=y
# CONFIG_PCI_CNB20LE_QUIRK is not set
# CONFIG_PCIEPORTBUS is not set
CONFIG_PCI_BUS_ADDR_T_64BIT=y
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
CONFIG_PCI_QUIRKS=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
CONFIG_XEN_PCIDEV_FRONTEND=y
CONFIG_PCI_ATS=y
CONFIG_PCI_LOCKLESS_CONFIG=y
# CONFIG_PCI_IOV is not set
CONFIG_PCI_PRI=y
# CONFIG_PCI_PASID is not set
CONFIG_PCI_LABEL=y
CONFIG_HOTPLUG_PCI=y
# CONFIG_HOTPLUG_PCI_ACPI is not set
# CONFIG_HOTPLUG_PCI_CPCI is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set

#
# Cadence PCIe controllers support
#

#
# DesignWare PCI Core Support
#
# CONFIG_PCIE_DW_PLAT is not set

#
# PCI host controller drivers
#
# CONFIG_VMD is not set

#
# PCI Endpoint
#
# CONFIG_PCI_ENDPOINT is not set

#
# PCI switch controller drivers
#
# CONFIG_PCI_SW_SWITCHTEC is not set
# CONFIG_ISA_BUS is not set
# CONFIG_ISA_DMA_API is not set
CONFIG_AMD_NB=y
# CONFIG_PCCARD is not set
# CONFIG_RAPIDIO is not set
# CONFIG_X86_SYSFB is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_BINFMT_SCRIPT=y
CONFIG_BINFMT_MISC=y
# CONFIG_COREDUMP is not set
CONFIG_IA32_EMULATION=y
CONFIG_IA32_AOUT=y
CONFIG_X86_X32=y
CONFIG_COMPAT_32=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_NET=y
CONFIG_NET_INGRESS=y

#
# Networking options
#
# CONFIG_PACKET is not set
CONFIG_UNIX=y
CONFIG_UNIX_DIAG=y
# CONFIG_TLS is not set
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
# CONFIG_XFRM_STATISTICS is not set
CONFIG_XFRM_IPCOMP=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_ROUTE_CLASSID=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
# CONFIG_IP_PNP_RARP is not set
CONFIG_NET_IPIP=y
# CONFIG_NET_IPGRE_DEMUX is not set
CONFIG_NET_IP_TUNNEL=y
CONFIG_SYN_COOKIES=y
# CONFIG_NET_FOU is not set
# CONFIG_NET_FOU_IP_TUNNELS is not set
# CONFIG_INET_AH is not set
CONFIG_INET_ESP=y
# CONFIG_INET_ESP_OFFLOAD is not set
CONFIG_INET_IPCOMP=y
CONFIG_INET_XFRM_TUNNEL=y
CONFIG_INET_TUNNEL=y
CONFIG_INET_XFRM_MODE_TRANSPORT=y
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
CONFIG_INET_XFRM_MODE_BEET=y
# CONFIG_INET_DIAG is not set
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=y
CONFIG_TCP_CONG_HTCP=y
# CONFIG_TCP_CONG_HSTCP is not set
CONFIG_TCP_CONG_HYBLA=y
CONFIG_TCP_CONG_VEGAS=y
# CONFIG_TCP_CONG_NV is not set
CONFIG_TCP_CONG_SCALABLE=y
# CONFIG_TCP_CONG_LP is not set
CONFIG_TCP_CONG_VENO=y
CONFIG_TCP_CONG_YEAH=y
# CONFIG_TCP_CONG_ILLINOIS is not set
# CONFIG_TCP_CONG_DCTCP is not set
# CONFIG_TCP_CONG_CDG is not set
# CONFIG_TCP_CONG_BBR is not set
# CONFIG_DEFAULT_BIC is not set
# CONFIG_DEFAULT_CUBIC is not set
# CONFIG_DEFAULT_HTCP is not set
# CONFIG_DEFAULT_HYBLA is not set
# CONFIG_DEFAULT_VEGAS is not set
CONFIG_DEFAULT_VENO=y
# CONFIG_DEFAULT_WESTWOOD is not set
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="veno"
# CONFIG_TCP_MD5SIG is not set
# CONFIG_IPV6 is not set
# CONFIG_NETWORK_SECMARK is not set
CONFIG_NET_PTP_CLASSIFY=y
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
CONFIG_NETFILTER_ADVANCED=y

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_NETLINK_ACCT=y
CONFIG_NETFILTER_NETLINK_QUEUE=y
CONFIG_NETFILTER_NETLINK_LOG=y
# CONFIG_NF_CONNTRACK is not set
CONFIG_NF_LOG_COMMON=y
# CONFIG_NF_LOG_NETDEV is not set
# CONFIG_NF_TABLES is not set
CONFIG_NETFILTER_XTABLES=y

#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=y
CONFIG_NETFILTER_XT_SET=y

#
# Xtables targets
#
# CONFIG_NETFILTER_XT_TARGET_AUDIT is not set
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=y
CONFIG_NETFILTER_XT_TARGET_HMARK=y
# CONFIG_NETFILTER_XT_TARGET_IDLETIMER is not set
CONFIG_NETFILTER_XT_TARGET_LED=y
CONFIG_NETFILTER_XT_TARGET_LOG=y
# CONFIG_NETFILTER_XT_TARGET_MARK is not set
# CONFIG_NETFILTER_XT_TARGET_NFLOG is not set
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
# CONFIG_NETFILTER_XT_TARGET_RATEEST is not set
# CONFIG_NETFILTER_XT_TARGET_TEE is not set
# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set

#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=y
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
# CONFIG_NETFILTER_XT_MATCH_COMMENT is not set
CONFIG_NETFILTER_XT_MATCH_CPU=y
CONFIG_NETFILTER_XT_MATCH_DCCP=y
CONFIG_NETFILTER_XT_MATCH_DEVGROUP=y
# CONFIG_NETFILTER_XT_MATCH_DSCP is not set
# CONFIG_NETFILTER_XT_MATCH_ECN is not set
CONFIG_NETFILTER_XT_MATCH_ESP=y
# CONFIG_NETFILTER_XT_MATCH_HASHLIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_HL is not set
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
# CONFIG_NETFILTER_XT_MATCH_IPRANGE is not set
# CONFIG_NETFILTER_XT_MATCH_L2TP is not set
# CONFIG_NETFILTER_XT_MATCH_LENGTH is not set
# CONFIG_NETFILTER_XT_MATCH_LIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_MAC is not set
CONFIG_NETFILTER_XT_MATCH_MARK=y
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=y
CONFIG_NETFILTER_XT_MATCH_NFACCT=y
CONFIG_NETFILTER_XT_MATCH_OSF=y
CONFIG_NETFILTER_XT_MATCH_OWNER=y
# CONFIG_NETFILTER_XT_MATCH_POLICY is not set
# CONFIG_NETFILTER_XT_MATCH_PKTTYPE is not set
CONFIG_NETFILTER_XT_MATCH_QUOTA=y
# CONFIG_NETFILTER_XT_MATCH_RATEEST is not set
CONFIG_NETFILTER_XT_MATCH_REALM=y
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
CONFIG_NETFILTER_XT_MATCH_SCTP=y
# CONFIG_NETFILTER_XT_MATCH_STATISTIC is not set
CONFIG_NETFILTER_XT_MATCH_STRING=y
# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set
CONFIG_NETFILTER_XT_MATCH_TIME=y
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
CONFIG_IP_SET=y
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_BITMAP_IP=y
# CONFIG_IP_SET_BITMAP_IPMAC is not set
CONFIG_IP_SET_BITMAP_PORT=y
CONFIG_IP_SET_HASH_IP=y
# CONFIG_IP_SET_HASH_IPMARK is not set
CONFIG_IP_SET_HASH_IPPORT=y
CONFIG_IP_SET_HASH_IPPORTIP=y
# CONFIG_IP_SET_HASH_IPPORTNET is not set
# CONFIG_IP_SET_HASH_IPMAC is not set
# CONFIG_IP_SET_HASH_MAC is not set
# CONFIG_IP_SET_HASH_NETPORTNET is not set
# CONFIG_IP_SET_HASH_NET is not set
# CONFIG_IP_SET_HASH_NETNET is not set
# CONFIG_IP_SET_HASH_NETPORT is not set
CONFIG_IP_SET_HASH_NETIFACE=y
CONFIG_IP_SET_LIST_SET=y
CONFIG_IP_VS=y
CONFIG_IP_VS_DEBUG=y
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
# CONFIG_IP_VS_PROTO_TCP is not set
# CONFIG_IP_VS_PROTO_UDP is not set
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
# CONFIG_IP_VS_PROTO_AH is not set
# CONFIG_IP_VS_PROTO_SCTP is not set

#
# IPVS scheduler
#
# CONFIG_IP_VS_RR is not set
# CONFIG_IP_VS_WRR is not set
CONFIG_IP_VS_LC=y
CONFIG_IP_VS_WLC=y
# CONFIG_IP_VS_FO is not set
# CONFIG_IP_VS_OVF is not set
CONFIG_IP_VS_LBLC=y
# CONFIG_IP_VS_LBLCR is not set
CONFIG_IP_VS_DH=y
# CONFIG_IP_VS_SH is not set
CONFIG_IP_VS_SED=y
# CONFIG_IP_VS_NQ is not set

#
# IPVS SH scheduler
#
CONFIG_IP_VS_SH_TAB_BITS=8

#
# IPVS application helper
#

#
# IP: Netfilter Configuration
#
# CONFIG_NF_SOCKET_IPV4 is not set
# CONFIG_NF_DUP_IPV4 is not set
# CONFIG_NF_LOG_ARP is not set
CONFIG_NF_LOG_IPV4=y
# CONFIG_NF_REJECT_IPV4 is not set
# CONFIG_IP_NF_IPTABLES is not set
# CONFIG_IP_NF_ARPTABLES is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_L2TP is not set
# CONFIG_BRIDGE is not set
CONFIG_HAVE_NET_DSA=y
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_IEEE802154 is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set
CONFIG_DNS_RESOLVER=y
# CONFIG_BATMAN_ADV is not set
# CONFIG_OPENVSWITCH is not set
# CONFIG_VSOCKETS is not set
# CONFIG_NETLINK_DIAG is not set
# CONFIG_MPLS is not set
# CONFIG_NET_NSH is not set
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
# CONFIG_NET_L3_MASTER_DEV is not set
# CONFIG_NET_NCSI is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
# CONFIG_CGROUP_NET_PRIO is not set
# CONFIG_CGROUP_NET_CLASSID is not set
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_NET_FLOW_LIMIT=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
CONFIG_WIRELESS=y
# CONFIG_CFG80211 is not set

#
# CFG80211 needs to be enabled for MAC80211
#
CONFIG_MAC80211_STA_HASH_MAX_SIZE=0
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
# CONFIG_NET_9P_XEN is not set
CONFIG_NET_9P_DEBUG=y
CONFIG_CAIF=y
# CONFIG_CAIF_DEBUG is not set
CONFIG_CAIF_NETDEV=y
# CONFIG_CAIF_USB is not set
CONFIG_CEPH_LIB=y
CONFIG_CEPH_LIB_PRETTYDEBUG=y
# CONFIG_CEPH_LIB_USE_DNS_RESOLVER is not set
# CONFIG_NFC is not set
# CONFIG_PSAMPLE is not set
# CONFIG_NET_IFE is not set
# CONFIG_LWTUNNEL is not set
CONFIG_DST_CACHE=y
CONFIG_GRO_CELLS=y
# CONFIG_NET_DEVLINK is not set
CONFIG_MAY_USE_DEVLINK=y
CONFIG_HAVE_EBPF_JIT=y

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
CONFIG_WANT_DEV_COREDUMP=y
CONFIG_ALLOW_DEV_COREDUMP=y
CONFIG_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=y
CONFIG_REGMAP_SPI=y
CONFIG_REGMAP_IRQ=y
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_DMA_FENCE_TRACE is not set

#
# Bus devices
#
CONFIG_CONNECTOR=y
# CONFIG_PROC_EVENTS is not set
CONFIG_MTD=y
# CONFIG_MTD_REDBOOT_PARTS is not set
# CONFIG_MTD_CMDLINE_PARTS is not set
# CONFIG_MTD_AR7_PARTS is not set

#
# Partition parsers
#

#
# User Modules And Translation Layers
#
CONFIG_MTD_BLKDEVS=y
# CONFIG_MTD_BLOCK is not set
# CONFIG_MTD_BLOCK_RO is not set
CONFIG_FTL=y
# CONFIG_NFTL is not set
# CONFIG_INFTL is not set
# CONFIG_RFD_FTL is not set
CONFIG_SSFDC=y
# CONFIG_SM_FTL is not set
# CONFIG_MTD_OOPS is not set
CONFIG_MTD_SWAP=y
# CONFIG_MTD_PARTITIONED_MASTER is not set

#
# RAM/ROM/Flash chip drivers
#
CONFIG_MTD_CFI=y
CONFIG_MTD_JEDECPROBE=y
CONFIG_MTD_GEN_PROBE=y
# CONFIG_MTD_CFI_ADV_OPTIONS is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
CONFIG_MTD_CFI_INTELEXT=y
CONFIG_MTD_CFI_AMDSTD=y
# CONFIG_MTD_CFI_STAA is not set
CONFIG_MTD_CFI_UTIL=y
CONFIG_MTD_RAM=y
# CONFIG_MTD_ROM is not set
CONFIG_MTD_ABSENT=y

#
# Mapping drivers for chip access
#
CONFIG_MTD_COMPLEX_MAPPINGS=y
# CONFIG_MTD_PHYSMAP is not set
# CONFIG_MTD_SBC_GXX is not set
# CONFIG_MTD_AMD76XROM is not set
CONFIG_MTD_ICHXROM=y
CONFIG_MTD_ESB2ROM=y
# CONFIG_MTD_CK804XROM is not set
# CONFIG_MTD_SCB2_FLASH is not set
# CONFIG_MTD_NETtel is not set
CONFIG_MTD_L440GX=y
CONFIG_MTD_PCI=y
CONFIG_MTD_GPIO_ADDR=y
# CONFIG_MTD_INTEL_VR_NOR is not set
CONFIG_MTD_PLATRAM=y
# CONFIG_MTD_LATCH_ADDR is not set

#
# Self-contained MTD device drivers
#
# CONFIG_MTD_PMC551 is not set
CONFIG_MTD_DATAFLASH=y
CONFIG_MTD_DATAFLASH_WRITE_VERIFY=y
# CONFIG_MTD_DATAFLASH_OTP is not set
# CONFIG_MTD_MCHP23K256 is not set
CONFIG_MTD_SST25L=y
CONFIG_MTD_SLRAM=y
CONFIG_MTD_PHRAM=y
CONFIG_MTD_MTDRAM=y
CONFIG_MTDRAM_TOTAL_SIZE=4096
CONFIG_MTDRAM_ERASE_SIZE=128
CONFIG_MTD_BLOCK2MTD=y

#
# Disk-On-Chip Device Drivers
#
CONFIG_MTD_DOCG3=y
CONFIG_BCH_CONST_M=14
CONFIG_BCH_CONST_T=4
# CONFIG_MTD_NAND is not set
# CONFIG_MTD_ONENAND is not set

#
# LPDDR & LPDDR2 PCM memory drivers
#
# CONFIG_MTD_LPDDR is not set
# CONFIG_MTD_SPI_NOR is not set
CONFIG_MTD_UBI=y
CONFIG_MTD_UBI_WL_THRESHOLD=4096
CONFIG_MTD_UBI_BEB_LIMIT=20
CONFIG_MTD_UBI_FASTMAP=y
# CONFIG_MTD_UBI_GLUEBI is not set
# CONFIG_MTD_UBI_BLOCK is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_NULL_BLK is not set
CONFIG_CDROM=y
CONFIG_BLK_DEV_PCIESSD_MTIP32XX=y
# CONFIG_ZRAM is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_LOOP is not set
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SKD is not set
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_CDROM_PKTCDVD=y
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=y
CONFIG_XEN_BLKDEV_FRONTEND=y
# CONFIG_XEN_BLKDEV_BACKEND is not set
CONFIG_VIRTIO_BLK=y
# CONFIG_VIRTIO_BLK_SCSI is not set
# CONFIG_BLK_DEV_RBD is not set
# CONFIG_BLK_DEV_RSXX is not set

#
# NVME Support
#
# CONFIG_BLK_DEV_NVME is not set
# CONFIG_NVME_FC is not set
# CONFIG_NVME_TARGET is not set

#
# Misc devices
#
CONFIG_SENSORS_LIS3LV02D=y
CONFIG_AD525X_DPOT=y
# CONFIG_AD525X_DPOT_I2C is not set
CONFIG_AD525X_DPOT_SPI=y
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
CONFIG_PHANTOM=y
CONFIG_SGI_IOC4=y
CONFIG_TIFM_CORE=y
CONFIG_TIFM_7XX1=y
# CONFIG_ICS932S401 is not set
CONFIG_ENCLOSURE_SERVICES=y
# CONFIG_HP_ILO is not set
# CONFIG_APDS9802ALS is not set
# CONFIG_ISL29003 is not set
CONFIG_ISL29020=y
CONFIG_SENSORS_TSL2550=y
CONFIG_SENSORS_BH1770=y
# CONFIG_SENSORS_APDS990X is not set
# CONFIG_HMC6352 is not set
CONFIG_DS1682=y
CONFIG_USB_SWITCH_FSA9480=y
CONFIG_LATTICE_ECP3_CONFIG=y
# CONFIG_SRAM is not set
# CONFIG_PCI_ENDPOINT_TEST is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_AT24 is not set
# CONFIG_EEPROM_AT25 is not set
# CONFIG_EEPROM_LEGACY is not set
CONFIG_EEPROM_MAX6875=y
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_EEPROM_93XX46 is not set
# CONFIG_EEPROM_IDT_89HPESX is not set
# CONFIG_CB710_CORE is not set

#
# Texas Instruments shared transport line discipline
#
# CONFIG_TI_ST is not set
CONFIG_SENSORS_LIS3_I2C=y
# CONFIG_ALTERA_STAPL is not set
# CONFIG_INTEL_MEI is not set
# CONFIG_INTEL_MEI_ME is not set
# CONFIG_INTEL_MEI_TXE is not set
# CONFIG_VMWARE_VMCI is not set

#
# Intel MIC & related support
#

#
# Intel MIC Bus Driver
#
# CONFIG_INTEL_MIC_BUS is not set

#
# SCIF Bus Driver
#
# CONFIG_SCIF_BUS is not set

#
# VOP Bus Driver
#
# CONFIG_VOP_BUS is not set

#
# Intel MIC Host Driver
#

#
# Intel MIC Card Driver
#

#
# SCIF Driver
#

#
# Intel MIC Coprocessor State Management (COSM) Drivers
#

#
# VOP Driver
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_MISC_RTSX_PCI is not set
# CONFIG_MISC_RTSX_USB is not set
CONFIG_HAVE_IDE=y
CONFIG_IDE=y

#
# Please see Documentation/ide/ide.txt for help/info on IDE drives
#
CONFIG_IDE_XFER_MODE=y
CONFIG_IDE_TIMINGS=y
CONFIG_IDE_ATAPI=y
# CONFIG_BLK_DEV_IDE_SATA is not set
CONFIG_IDE_GD=y
# CONFIG_IDE_GD_ATA is not set
# CONFIG_IDE_GD_ATAPI is not set
# CONFIG_BLK_DEV_IDECD is not set
CONFIG_BLK_DEV_IDETAPE=y
CONFIG_BLK_DEV_IDEACPI=y
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_PROC_FS=y

#
# IDE chipset support/bugfixes
#
# CONFIG_IDE_GENERIC is not set
CONFIG_BLK_DEV_PLATFORM=y
CONFIG_BLK_DEV_CMD640=y
CONFIG_BLK_DEV_CMD640_ENHANCED=y
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEDMA_SFF=y

#
# PCI IDE chipsets support
#
CONFIG_BLK_DEV_IDEPCI=y
# CONFIG_IDEPCI_PCIBUS_ORDER is not set
CONFIG_BLK_DEV_OFFBOARD=y
# CONFIG_BLK_DEV_GENERIC is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_BLK_DEV_AEC62XX=y
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
CONFIG_BLK_DEV_CMD64X=y
# CONFIG_BLK_DEV_TRIFLEX is not set
CONFIG_BLK_DEV_HPT366=y
CONFIG_BLK_DEV_JMICRON=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_BLK_DEV_IT8172=y
# CONFIG_BLK_DEV_IT8213 is not set
CONFIG_BLK_DEV_IT821X=y
CONFIG_BLK_DEV_NS87415=y
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
CONFIG_BLK_DEV_PDC202XX_NEW=y
CONFIG_BLK_DEV_SVWKS=y
CONFIG_BLK_DEV_SIIMAGE=y
CONFIG_BLK_DEV_SIS5513=y
CONFIG_BLK_DEV_SLC90E66=y
CONFIG_BLK_DEV_TRM290=y
CONFIG_BLK_DEV_VIA82CXXX=y
# CONFIG_BLK_DEV_TC86C001 is not set
CONFIG_BLK_DEV_IDEDMA=y

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
CONFIG_CHR_DEV_SG=y
# CONFIG_CHR_DEV_SCH is not set
CONFIG_SCSI_ENCLOSURE=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=y
CONFIG_SCSI_FC_ATTRS=y
CONFIG_SCSI_ISCSI_ATTRS=y
CONFIG_SCSI_SAS_ATTRS=y
CONFIG_SCSI_SAS_LIBSAS=y
# CONFIG_SCSI_SAS_ATA is not set
# CONFIG_SCSI_SAS_HOST_SMP is not set
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
CONFIG_ISCSI_BOOT_SYSFS=y
CONFIG_SCSI_CXGB3_ISCSI=y
CONFIG_SCSI_CXGB4_ISCSI=y
CONFIG_SCSI_BNX2_ISCSI=y
# CONFIG_SCSI_BNX2X_FCOE is not set
CONFIG_BE2ISCSI=y
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
CONFIG_SCSI_HPSA=y
CONFIG_SCSI_3W_9XXX=y
# CONFIG_SCSI_3W_SAS is not set
# CONFIG_SCSI_ACARD is not set
CONFIG_SCSI_AACRAID=y
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
CONFIG_SCSI_MVSAS=y
CONFIG_SCSI_MVSAS_DEBUG=y
CONFIG_SCSI_MVSAS_TASKLET=y
CONFIG_SCSI_MVUMI=y
CONFIG_SCSI_DPT_I2O=y
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_SCSI_ESAS2R is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
CONFIG_MEGARAID_SAS=y
CONFIG_SCSI_MPT3SAS=y
CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=y
# CONFIG_SCSI_SMARTPQI is not set
CONFIG_SCSI_UFSHCD=y
# CONFIG_SCSI_UFSHCD_PCI is not set
# CONFIG_SCSI_UFSHCD_PLATFORM is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_VMWARE_PVSCSI is not set
# CONFIG_XEN_SCSI_FRONTEND is not set
CONFIG_LIBFC=y
CONFIG_LIBFCOE=y
CONFIG_FCOE=y
CONFIG_FCOE_FNIC=y
# CONFIG_SCSI_SNIC is not set
CONFIG_SCSI_DMX3191D=y
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_ISCI is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_STEX is not set
CONFIG_SCSI_SYM53C8XX_2=y
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
CONFIG_SCSI_SYM53C8XX_MMIO=y
CONFIG_SCSI_IPR=y
# CONFIG_SCSI_IPR_TRACE is not set
CONFIG_SCSI_IPR_DUMP=y
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA_FC=y
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_WD719X is not set
# CONFIG_SCSI_DEBUG is not set
CONFIG_SCSI_PMCRAID=y
CONFIG_SCSI_PM8001=y
# CONFIG_SCSI_BFA_FC is not set
CONFIG_SCSI_VIRTIO=y
CONFIG_SCSI_CHELSIO_FCOE=y
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
CONFIG_ATA_VERBOSE_ERROR=y
# CONFIG_ATA_ACPI is not set
CONFIG_SATA_PMP=y

#
# Controllers with non-SFF native interface
#
# CONFIG_SATA_AHCI is not set
CONFIG_SATA_AHCI_PLATFORM=y
# CONFIG_SATA_INIC162X is not set
# CONFIG_SATA_ACARD_AHCI is not set
CONFIG_SATA_SIL24=y
CONFIG_ATA_SFF=y

#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_ATA_BMDMA is not set

#
# PIO-only SFF controllers
#
CONFIG_PATA_CMD640_PCI=y
CONFIG_PATA_MPIIX=y
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
CONFIG_PATA_PLATFORM=y
# CONFIG_PATA_RZ1000 is not set

#
# Generic fallback / legacy drivers
#
# CONFIG_PATA_LEGACY is not set
# CONFIG_MD is not set
# CONFIG_TARGET_CORE is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_FIREWIRE_NOSY is not set
CONFIG_MACINTOSH_DRIVERS=y
# CONFIG_MAC_EMUMOUSEBTN is not set
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
CONFIG_BONDING=y
CONFIG_DUMMY=y
CONFIG_EQUALIZER=y
# CONFIG_NET_FC is not set
# CONFIG_NET_TEAM is not set
CONFIG_MACVLAN=y
# CONFIG_MACVTAP is not set
# CONFIG_IPVLAN is not set
# CONFIG_VXLAN is not set
# CONFIG_MACSEC is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_TUN is not set
# CONFIG_TUN_VNET_CROSS_LE is not set
# CONFIG_VETH is not set
CONFIG_VIRTIO_NET=y
# CONFIG_NLMON is not set
CONFIG_SUNGEM_PHY=y
# CONFIG_ARCNET is not set

#
# CAIF transport drivers
#
CONFIG_CAIF_TTY=y
CONFIG_CAIF_SPI_SLAVE=y
# CONFIG_CAIF_SPI_SYNC is not set
# CONFIG_CAIF_HSI is not set
# CONFIG_CAIF_VIRTIO is not set

#
# Distributed Switch Architecture drivers
#
CONFIG_ETHERNET=y
CONFIG_MDIO=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_VENDOR_ADAPTEC is not set
CONFIG_NET_VENDOR_AGERE=y
CONFIG_ET131X=y
CONFIG_NET_VENDOR_ALACRITECH=y
CONFIG_SLICOSS=y
CONFIG_NET_VENDOR_ALTEON=y
# CONFIG_ACENIC is not set
# CONFIG_ALTERA_TSE is not set
CONFIG_NET_VENDOR_AMAZON=y
# CONFIG_ENA_ETHERNET is not set
CONFIG_NET_VENDOR_AMD=y
# CONFIG_AMD8111_ETH is not set
# CONFIG_PCNET32 is not set
# CONFIG_AMD_XGBE is not set
CONFIG_NET_VENDOR_AQUANTIA=y
# CONFIG_AQTION is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ATHEROS=y
CONFIG_ATL2=y
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
CONFIG_ATL1C=y
# CONFIG_ALX is not set
# CONFIG_NET_VENDOR_AURORA is not set
CONFIG_NET_CADENCE=y
CONFIG_MACB=y
CONFIG_MACB_USE_HWSTAMP=y
# CONFIG_MACB_PCI is not set
CONFIG_NET_VENDOR_BROADCOM=y
CONFIG_B44=y
CONFIG_B44_PCI_AUTOSELECT=y
CONFIG_B44_PCICORE_AUTOSELECT=y
CONFIG_B44_PCI=y
CONFIG_BNX2=y
CONFIG_CNIC=y
# CONFIG_TIGON3 is not set
# CONFIG_BNX2X is not set
# CONFIG_BNXT is not set
# CONFIG_NET_VENDOR_BROCADE is not set
CONFIG_NET_VENDOR_CAVIUM=y
# CONFIG_THUNDER_NIC_PF is not set
# CONFIG_THUNDER_NIC_VF is not set
# CONFIG_THUNDER_NIC_BGX is not set
# CONFIG_THUNDER_NIC_RGX is not set
CONFIG_CAVIUM_PTP=y
# CONFIG_LIQUIDIO is not set
# CONFIG_LIQUIDIO_VF is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
CONFIG_CHELSIO_T3=y
CONFIG_CHELSIO_T4=y
CONFIG_CHELSIO_T4VF=y
CONFIG_CHELSIO_LIB=y
CONFIG_NET_VENDOR_CISCO=y
# CONFIG_ENIC is not set
CONFIG_NET_VENDOR_CORTINA=y
# CONFIG_CX_ECAT is not set
# CONFIG_DNET is not set
# CONFIG_NET_VENDOR_DEC is not set
CONFIG_NET_VENDOR_DLINK=y
# CONFIG_DL2K is not set
CONFIG_SUNDANCE=y
CONFIG_SUNDANCE_MMIO=y
# CONFIG_NET_VENDOR_EMULEX is not set
CONFIG_NET_VENDOR_EZCHIP=y
CONFIG_NET_VENDOR_EXAR=y
# CONFIG_S2IO is not set
# CONFIG_VXGE is not set
# CONFIG_NET_VENDOR_HP is not set
CONFIG_NET_VENDOR_HUAWEI=y
# CONFIG_HINIC is not set
CONFIG_NET_VENDOR_INTEL=y
# CONFIG_E100 is not set
CONFIG_E1000=y
CONFIG_E1000E=y
CONFIG_E1000E_HWTS=y
CONFIG_IGB=y
CONFIG_IGB_HWMON=y
# CONFIG_IGBVF is not set
# CONFIG_IXGB is not set
CONFIG_IXGBE=y
CONFIG_IXGBE_HWMON=y
# CONFIG_IXGBEVF is not set
# CONFIG_I40E is not set
# CONFIG_I40EVF is not set
# CONFIG_FM10K is not set
CONFIG_NET_VENDOR_I825XX=y
# CONFIG_JME is not set
# CONFIG_NET_VENDOR_MARVELL is not set
CONFIG_NET_VENDOR_MELLANOX=y
CONFIG_MLX4_EN=y
CONFIG_MLX4_CORE=y
CONFIG_MLX4_DEBUG=y
CONFIG_MLX4_CORE_GEN2=y
# CONFIG_MLX5_CORE is not set
# CONFIG_MLXSW_CORE is not set
# CONFIG_MLXFW is not set
CONFIG_NET_VENDOR_MICREL=y
# CONFIG_KS8851 is not set
# CONFIG_KS8851_MLL is not set
# CONFIG_KSZ884X_PCI is not set
# CONFIG_NET_VENDOR_MICROCHIP is not set
CONFIG_NET_VENDOR_MYRI=y
# CONFIG_MYRI10GE is not set
# CONFIG_FEALNX is not set
# CONFIG_NET_VENDOR_NATSEMI is not set
CONFIG_NET_VENDOR_NETRONOME=y
# CONFIG_NFP is not set
CONFIG_NET_VENDOR_NVIDIA=y
CONFIG_FORCEDETH=y
CONFIG_NET_VENDOR_OKI=y
# CONFIG_ETHOC is not set
# CONFIG_NET_PACKET_ENGINE is not set
CONFIG_NET_VENDOR_QLOGIC=y
# CONFIG_QLA3XXX is not set
CONFIG_QLCNIC=y
CONFIG_QLCNIC_HWMON=y
CONFIG_QLGE=y
# CONFIG_NETXEN_NIC is not set
# CONFIG_QED is not set
CONFIG_NET_VENDOR_QUALCOMM=y
# CONFIG_QCOM_EMAC is not set
# CONFIG_RMNET is not set
# CONFIG_NET_VENDOR_REALTEK is not set
CONFIG_NET_VENDOR_RENESAS=y
CONFIG_NET_VENDOR_RDC=y
CONFIG_R6040=y
CONFIG_NET_VENDOR_ROCKER=y
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
CONFIG_NET_VENDOR_SEEQ=y
# CONFIG_NET_VENDOR_SILAN is not set
CONFIG_NET_VENDOR_SIS=y
# CONFIG_SIS900 is not set
# CONFIG_SIS190 is not set
CONFIG_NET_VENDOR_SOLARFLARE=y
# CONFIG_SFC is not set
# CONFIG_SFC_FALCON is not set
CONFIG_NET_VENDOR_SMSC=y
CONFIG_EPIC100=y
# CONFIG_SMSC911X is not set
# CONFIG_SMSC9420 is not set
CONFIG_NET_VENDOR_SOCIONEXT=y
# CONFIG_NET_VENDOR_STMICRO is not set
CONFIG_NET_VENDOR_SUN=y
# CONFIG_HAPPYMEAL is not set
CONFIG_SUNGEM=y
# CONFIG_CASSINI is not set
CONFIG_NIU=y
CONFIG_NET_VENDOR_TEHUTI=y
CONFIG_TEHUTI=y
CONFIG_NET_VENDOR_TI=y
# CONFIG_TI_CPSW_ALE is not set
# CONFIG_TLAN is not set
# CONFIG_NET_VENDOR_VIA is not set
CONFIG_NET_VENDOR_WIZNET=y
CONFIG_WIZNET_W5100=y
CONFIG_WIZNET_W5300=y
# CONFIG_WIZNET_BUS_DIRECT is not set
# CONFIG_WIZNET_BUS_INDIRECT is not set
CONFIG_WIZNET_BUS_ANY=y
# CONFIG_WIZNET_W5100_SPI is not set
CONFIG_NET_VENDOR_SYNOPSYS=y
# CONFIG_DWC_XLGMAC is not set
# CONFIG_FDDI is not set
CONFIG_HIPPI=y
# CONFIG_ROADRUNNER is not set
CONFIG_NET_SB1000=y
CONFIG_MDIO_DEVICE=y
CONFIG_MDIO_BUS=y
CONFIG_MDIO_BITBANG=y
# CONFIG_MDIO_GPIO is not set
# CONFIG_MDIO_THUNDER is not set
CONFIG_PHYLIB=y
CONFIG_SWPHY=y
# CONFIG_LED_TRIGGER_PHY is not set

#
# MII PHY device drivers
#
CONFIG_AMD_PHY=y
# CONFIG_AQUANTIA_PHY is not set
CONFIG_AT803X_PHY=y
# CONFIG_BCM7XXX_PHY is not set
# CONFIG_BCM87XX_PHY is not set
CONFIG_BCM_NET_PHYLIB=y
CONFIG_BROADCOM_PHY=y
# CONFIG_CICADA_PHY is not set
# CONFIG_CORTINA_PHY is not set
CONFIG_DAVICOM_PHY=y
# CONFIG_DP83822_PHY is not set
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
CONFIG_FIXED_PHY=y
CONFIG_ICPLUS_PHY=y
# CONFIG_INTEL_XWAY_PHY is not set
# CONFIG_LSI_ET1011C_PHY is not set
CONFIG_LXT_PHY=y
# CONFIG_MARVELL_PHY is not set
# CONFIG_MARVELL_10G_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_MICROCHIP_PHY is not set
# CONFIG_MICROSEMI_PHY is not set
CONFIG_NATIONAL_PHY=y
# CONFIG_QSEMI_PHY is not set
CONFIG_REALTEK_PHY=y
# CONFIG_RENESAS_PHY is not set
# CONFIG_ROCKCHIP_PHY is not set
# CONFIG_SMSC_PHY is not set
CONFIG_STE10XP=y
# CONFIG_TERANETICS_PHY is not set
CONFIG_VITESSE_PHY=y
# CONFIG_XILINX_GMII2RGMII is not set
CONFIG_MICREL_KS8995MA=y
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
CONFIG_USB_NET_DRIVERS=y
CONFIG_USB_CATC=y
# CONFIG_USB_KAWETH is not set
CONFIG_USB_PEGASUS=y
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_RTL8152 is not set
# CONFIG_USB_LAN78XX is not set
# CONFIG_USB_USBNET is not set
CONFIG_USB_IPHETH=y
# CONFIG_WLAN is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
# CONFIG_WAN is not set
CONFIG_XEN_NETDEV_FRONTEND=y
# CONFIG_XEN_NETDEV_BACKEND is not set
# CONFIG_VMXNET3 is not set
# CONFIG_FUJITSU_ES is not set
# CONFIG_NETDEVSIM is not set
CONFIG_ISDN=y
# CONFIG_ISDN_I4L is not set
# CONFIG_ISDN_CAPI is not set
# CONFIG_ISDN_DRV_GIGASET is not set
# CONFIG_MISDN is not set
# CONFIG_NVM is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_LEDS=y
# CONFIG_INPUT_FF_MEMLESS is not set
CONFIG_INPUT_POLLDEV=y
CONFIG_INPUT_SPARSEKMAP=y
CONFIG_INPUT_MATRIXKMAP=y

#
# Userland interfaces
#
# CONFIG_INPUT_MOUSEDEV is not set
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ADP5588=y
CONFIG_KEYBOARD_ADP5589=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_KEYBOARD_QT1070=y
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_DLINK_DIR685 is not set
CONFIG_KEYBOARD_LKKBD=y
CONFIG_KEYBOARD_GPIO=y
CONFIG_KEYBOARD_GPIO_POLLED=y
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
CONFIG_KEYBOARD_MATRIX=y
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
CONFIG_KEYBOARD_MAX7359=y
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set
CONFIG_KEYBOARD_XTKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_PS2_ALPS is not set
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_SYNAPTICS_SMBUS=y
# CONFIG_MOUSE_PS2_CYPRESS is not set
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
CONFIG_MOUSE_PS2_SENTELIC=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_PS2_VMMOUSE is not set
CONFIG_MOUSE_PS2_SMBUS=y
CONFIG_MOUSE_SERIAL=y
# CONFIG_MOUSE_APPLETOUCH is not set
CONFIG_MOUSE_BCM5974=y
CONFIG_MOUSE_CYAPA=y
# CONFIG_MOUSE_ELAN_I2C is not set
CONFIG_MOUSE_VSXXXAA=y
# CONFIG_MOUSE_GPIO is not set
CONFIG_MOUSE_SYNAPTICS_I2C=y
CONFIG_MOUSE_SYNAPTICS_USB=y
# CONFIG_INPUT_JOYSTICK is not set
CONFIG_INPUT_TABLET=y
# CONFIG_TABLET_USB_ACECAD is not set
# CONFIG_TABLET_USB_AIPTEK is not set
CONFIG_TABLET_USB_GTCO=y
CONFIG_TABLET_USB_HANWANG=y
# CONFIG_TABLET_USB_KBTAB is not set
# CONFIG_TABLET_USB_PEGASUS is not set
# CONFIG_TABLET_SERIAL_WACOM4 is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_88PM860X_ONKEY=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_ARIZONA_HAPTICS is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_E3X0_BUTTON is not set
# CONFIG_INPUT_PCSPKR is not set
# CONFIG_INPUT_MAX77693_HAPTIC is not set
# CONFIG_INPUT_MMA8450 is not set
CONFIG_INPUT_APANEL=y
CONFIG_INPUT_GP2A=y
# CONFIG_INPUT_GPIO_BEEPER is not set
# CONFIG_INPUT_GPIO_DECODER is not set
# CONFIG_INPUT_ATLAS_BTNS is not set
CONFIG_INPUT_ATI_REMOTE2=y
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
CONFIG_INPUT_KXTJ9=y
CONFIG_INPUT_KXTJ9_POLLED_MODE=y
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
CONFIG_INPUT_CM109=y
# CONFIG_INPUT_REGULATOR_HAPTIC is not set
CONFIG_INPUT_RETU_PWRBUTTON=y
# CONFIG_INPUT_TWL6040_VIBRA is not set
# CONFIG_INPUT_UINPUT is not set
# CONFIG_INPUT_PALMAS_PWRBUTTON is not set
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_PWM_BEEPER is not set
# CONFIG_INPUT_PWM_VIBRA is not set
# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set
CONFIG_INPUT_DA9052_ONKEY=y
# CONFIG_INPUT_PCAP is not set
CONFIG_INPUT_ADXL34X=y
CONFIG_INPUT_ADXL34X_I2C=y
CONFIG_INPUT_ADXL34X_SPI=y
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_CMA3000 is not set
CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_SOC_BUTTON_ARRAY is not set
# CONFIG_INPUT_DRV260X_HAPTICS is not set
# CONFIG_INPUT_DRV2665_HAPTICS is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set
# CONFIG_RMI4_CORE is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
CONFIG_SERIO_PCIPS2=y
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=y
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_SERIO_PS2MULT is not set
# CONFIG_SERIO_ARC_PS2 is not set
# CONFIG_SERIO_GPIO_PS2 is not set
# CONFIG_USERIO is not set
CONFIG_GAMEPORT=y
# CONFIG_GAMEPORT_NS558 is not set
CONFIG_GAMEPORT_L4=y
# CONFIG_GAMEPORT_EMU10K1 is not set
CONFIG_GAMEPORT_FM801=y

#
# Character devices
#
CONFIG_TTY=y
# CONFIG_VT is not set
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_SERIAL_NONSTANDARD is not set
CONFIG_NOZOMI=y
CONFIG_N_GSM=y
CONFIG_TRACE_ROUTER=y
CONFIG_TRACE_SINK=y
CONFIG_DEVMEM=y
CONFIG_DEVKMEM=y

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_DEPRECATED_OPTIONS=y
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_SERIAL_8250_PCI is not set
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
CONFIG_SERIAL_8250_DW=y
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_LPSS=y
CONFIG_SERIAL_8250_MID=y
# CONFIG_SERIAL_8250_MOXA is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_MAX3100=y
# CONFIG_SERIAL_MAX310X is not set
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_SERIAL_SCCNXP=y
# CONFIG_SERIAL_SCCNXP_CONSOLE is not set
# CONFIG_SERIAL_SC16IS7XX is not set
CONFIG_SERIAL_ALTERA_JTAGUART=y
# CONFIG_SERIAL_ALTERA_JTAGUART_CONSOLE is not set
# CONFIG_SERIAL_ALTERA_UART is not set
CONFIG_SERIAL_IFX6X60=y
CONFIG_SERIAL_ARC=y
CONFIG_SERIAL_ARC_CONSOLE=y
CONFIG_SERIAL_ARC_NR_PORTS=1
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_DEV_BUS is not set
# CONFIG_TTY_PRINTK is not set
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_XEN=y
# CONFIG_HVC_XEN_FRONTEND is not set
CONFIG_VIRTIO_CONSOLE=y
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=y
CONFIG_HW_RANDOM_INTEL=y
CONFIG_HW_RANDOM_AMD=y
# CONFIG_HW_RANDOM_VIA is not set
CONFIG_HW_RANDOM_VIRTIO=y
CONFIG_NVRAM=y
CONFIG_R3964=y
CONFIG_APPLICOM=y
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
CONFIG_HPET_MMAP_DEFAULT=y
CONFIG_HANGCHECK_TIMER=y
CONFIG_TCG_TPM=y
# CONFIG_HW_RANDOM_TPM is not set
# CONFIG_TCG_TIS is not set
# CONFIG_TCG_TIS_SPI is not set
# CONFIG_TCG_TIS_I2C_ATMEL is not set
# CONFIG_TCG_TIS_I2C_INFINEON is not set
# CONFIG_TCG_TIS_I2C_NUVOTON is not set
# CONFIG_TCG_NSC is not set
# CONFIG_TCG_ATMEL is not set
# CONFIG_TCG_INFINEON is not set
# CONFIG_TCG_XEN is not set
# CONFIG_TCG_CRB is not set
# CONFIG_TCG_VTPM_PROXY is not set
# CONFIG_TCG_TIS_ST33ZP24_I2C is not set
# CONFIG_TCG_TIS_ST33ZP24_SPI is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
# CONFIG_XILLYBUS is not set

#
# I2C support
#
CONFIG_I2C=y
CONFIG_ACPI_I2C_OPREGION=y
CONFIG_I2C_BOARDINFO=y
# CONFIG_I2C_COMPAT is not set
# CONFIG_I2C_CHARDEV is not set
# CONFIG_I2C_MUX is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_SMBUS=y
CONFIG_I2C_ALGOBIT=y

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
CONFIG_I2C_AMD756=y
CONFIG_I2C_AMD756_S4882=y
CONFIG_I2C_AMD8111=y
# CONFIG_I2C_I801 is not set
# CONFIG_I2C_ISCH is not set
# CONFIG_I2C_ISMT is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_NFORCE2 is not set
CONFIG_I2C_SIS5595=y
CONFIG_I2C_SIS630=y
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
CONFIG_I2C_VIAPRO=y

#
# ACPI drivers
#
CONFIG_I2C_SCMI=y

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_CBUS_GPIO is not set
# CONFIG_I2C_DESIGNWARE_PLATFORM is not set
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_EMEV2 is not set
# CONFIG_I2C_GPIO is not set
# CONFIG_I2C_OCORES is not set
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_SIMTEC is not set
CONFIG_I2C_XILINX=y

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_DIOLAN_U2C is not set
CONFIG_I2C_PARPORT_LIGHT=y
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set
# CONFIG_I2C_VIPERBOARD is not set

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_MLXCPLD is not set
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
CONFIG_SPI=y
# CONFIG_SPI_DEBUG is not set
CONFIG_SPI_MASTER=y

#
# SPI Master Controller Drivers
#
CONFIG_SPI_ALTERA=y
# CONFIG_SPI_AXI_SPI_ENGINE is not set
CONFIG_SPI_BITBANG=y
# CONFIG_SPI_CADENCE is not set
CONFIG_SPI_DESIGNWARE=y
# CONFIG_SPI_DW_PCI is not set
# CONFIG_SPI_DW_MMIO is not set
# CONFIG_SPI_GPIO is not set
CONFIG_SPI_OC_TINY=y
# CONFIG_SPI_PXA2XX is not set
# CONFIG_SPI_ROCKCHIP is not set
CONFIG_SPI_SC18IS602=y
# CONFIG_SPI_XCOMM is not set
CONFIG_SPI_XILINX=y
# CONFIG_SPI_ZYNQMP_GQSPI is not set

#
# SPI Protocol Masters
#
# CONFIG_SPI_SPIDEV is not set
CONFIG_SPI_TLE62X0=y
# CONFIG_SPI_SLAVE is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
CONFIG_PPS=y
# CONFIG_PPS_DEBUG is not set
# CONFIG_NTP_PPS is not set

#
# PPS clients support
#
# CONFIG_PPS_CLIENT_KTIMER is not set
CONFIG_PPS_CLIENT_LDISC=y
CONFIG_PPS_CLIENT_GPIO=y

#
# PPS generators support
#

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=y

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
CONFIG_PTP_1588_CLOCK_KVM=y
# CONFIG_PINCTRL is not set
CONFIG_GPIOLIB=y
CONFIG_GPIO_ACPI=y
CONFIG_DEBUG_GPIO=y
CONFIG_GPIO_SYSFS=y
CONFIG_GPIO_GENERIC=y
CONFIG_GPIO_MAX730X=y

#
# Memory mapped GPIO drivers
#
# CONFIG_GPIO_AMDPT is not set
# CONFIG_GPIO_DWAPB is not set
CONFIG_GPIO_GENERIC_PLATFORM=y
CONFIG_GPIO_ICH=y
# CONFIG_GPIO_LYNXPOINT is not set
# CONFIG_GPIO_MB86S7X is not set
# CONFIG_GPIO_MOCKUP is not set
# CONFIG_GPIO_VX855 is not set

#
# Port-mapped I/O GPIO drivers
#
# CONFIG_GPIO_F7188X is not set
# CONFIG_GPIO_IT87 is not set
CONFIG_GPIO_SCH=y
# CONFIG_GPIO_SCH311X is not set
# CONFIG_GPIO_WINBOND is not set
# CONFIG_GPIO_WS16C48 is not set

#
# I2C GPIO expanders
#
# CONFIG_GPIO_ADP5588 is not set
# CONFIG_GPIO_MAX7300 is not set
# CONFIG_GPIO_MAX732X is not set
# CONFIG_GPIO_PCA953X is not set
# CONFIG_GPIO_PCF857X is not set
# CONFIG_GPIO_TPIC2810 is not set

#
# MFD GPIO expanders
#
# CONFIG_GPIO_ARIZONA is not set
CONFIG_GPIO_DA9052=y
# CONFIG_GPIO_JANZ_TTL is not set
# CONFIG_GPIO_PALMAS is not set
CONFIG_GPIO_RC5T583=y
CONFIG_GPIO_TPS6586X=y
# CONFIG_GPIO_TPS65910 is not set
CONFIG_GPIO_TWL6040=y
CONFIG_GPIO_UCB1400=y
# CONFIG_GPIO_WM8350 is not set

#
# PCI GPIO expanders
#
# CONFIG_GPIO_AMD8111 is not set
CONFIG_GPIO_BT8XX=y
CONFIG_GPIO_ML_IOH=y
# CONFIG_GPIO_PCI_IDIO_16 is not set
# CONFIG_GPIO_PCIE_IDIO_24 is not set
# CONFIG_GPIO_RDC321X is not set

#
# SPI GPIO expanders
#
# CONFIG_GPIO_MAX3191X is not set
CONFIG_GPIO_MAX7301=y
CONFIG_GPIO_MC33880=y
# CONFIG_GPIO_PISOSR is not set
# CONFIG_GPIO_XRA1403 is not set

#
# USB GPIO expanders
#
# CONFIG_GPIO_VIPERBOARD is not set
CONFIG_W1=y
# CONFIG_W1_CON is not set

#
# 1-wire Bus Masters
#
CONFIG_W1_MASTER_MATROX=y
# CONFIG_W1_MASTER_DS2490 is not set
CONFIG_W1_MASTER_DS2482=y
CONFIG_W1_MASTER_DS1WM=y
# CONFIG_W1_MASTER_GPIO is not set

#
# 1-wire Slaves
#
# CONFIG_W1_SLAVE_THERM is not set
CONFIG_W1_SLAVE_SMEM=y
# CONFIG_W1_SLAVE_DS2405 is not set
# CONFIG_W1_SLAVE_DS2408 is not set
CONFIG_W1_SLAVE_DS2413=y
# CONFIG_W1_SLAVE_DS2406 is not set
# CONFIG_W1_SLAVE_DS2423 is not set
# CONFIG_W1_SLAVE_DS2805 is not set
CONFIG_W1_SLAVE_DS2431=y
# CONFIG_W1_SLAVE_DS2433 is not set
# CONFIG_W1_SLAVE_DS2438 is not set
CONFIG_W1_SLAVE_DS2760=y
CONFIG_W1_SLAVE_DS2780=y
CONFIG_W1_SLAVE_DS2781=y
# CONFIG_W1_SLAVE_DS28E04 is not set
# CONFIG_W1_SLAVE_DS28E17 is not set
# CONFIG_POWER_AVS is not set
CONFIG_POWER_RESET=y
# CONFIG_POWER_RESET_RESTART is not set
CONFIG_POWER_SUPPLY=y
CONFIG_POWER_SUPPLY_DEBUG=y
CONFIG_PDA_POWER=y
# CONFIG_WM8350_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_88PM860X is not set
CONFIG_BATTERY_DS2760=y
CONFIG_BATTERY_DS2780=y
CONFIG_BATTERY_DS2781=y
CONFIG_BATTERY_DS2782=y
# CONFIG_BATTERY_SBS is not set
# CONFIG_CHARGER_SBS is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_BATTERY_DA9030 is not set
CONFIG_BATTERY_DA9052=y
CONFIG_BATTERY_MAX17040=y
CONFIG_BATTERY_MAX17042=y
# CONFIG_BATTERY_MAX1721X is not set
CONFIG_CHARGER_ISP1704=y
# CONFIG_CHARGER_MAX8903 is not set
CONFIG_CHARGER_LP8727=y
# CONFIG_CHARGER_GPIO is not set
# CONFIG_CHARGER_MANAGER is not set
# CONFIG_CHARGER_LTC3651 is not set
# CONFIG_CHARGER_MAX77693 is not set
CONFIG_CHARGER_BQ2415X=y
# CONFIG_CHARGER_BQ24190 is not set
# CONFIG_CHARGER_BQ24257 is not set
# CONFIG_CHARGER_BQ24735 is not set
# CONFIG_CHARGER_BQ25890 is not set
CONFIG_CHARGER_SMB347=y
# CONFIG_CHARGER_TPS65090 is not set
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
# CONFIG_CHARGER_RT9455 is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=y
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Native drivers
#
CONFIG_SENSORS_ABITUGURU=y
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7314 is not set
CONFIG_SENSORS_AD7414=y
CONFIG_SENSORS_AD7418=y
CONFIG_SENSORS_ADM1021=y
CONFIG_SENSORS_ADM1025=y
# CONFIG_SENSORS_ADM1026 is not set
CONFIG_SENSORS_ADM1029=y
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM9240 is not set
# CONFIG_SENSORS_ADT7310 is not set
# CONFIG_SENSORS_ADT7410 is not set
CONFIG_SENSORS_ADT7411=y
# CONFIG_SENSORS_ADT7462 is not set
CONFIG_SENSORS_ADT7470=y
CONFIG_SENSORS_ADT7475=y
CONFIG_SENSORS_ASC7621=y
# CONFIG_SENSORS_K8TEMP is not set
CONFIG_SENSORS_K10TEMP=y
# CONFIG_SENSORS_FAM15H_POWER is not set
CONFIG_SENSORS_APPLESMC=y
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_ASPEED is not set
CONFIG_SENSORS_ATXP1=y
CONFIG_SENSORS_DS620=y
CONFIG_SENSORS_DS1621=y
# CONFIG_SENSORS_DELL_SMM is not set
CONFIG_SENSORS_DA9052_ADC=y
# CONFIG_SENSORS_I5K_AMB is not set
CONFIG_SENSORS_F71805F=y
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
CONFIG_SENSORS_FSCHMD=y
# CONFIG_SENSORS_FTSTEUTATES is not set
CONFIG_SENSORS_GL518SM=y
# CONFIG_SENSORS_GL520SM is not set
CONFIG_SENSORS_G760A=y
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_HIH6130 is not set
# CONFIG_SENSORS_I5500 is not set
# CONFIG_SENSORS_CORETEMP is not set
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWR1220 is not set
CONFIG_SENSORS_LINEAGE=y
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC2990 is not set
# CONFIG_SENSORS_LTC4151 is not set
# CONFIG_SENSORS_LTC4215 is not set
# CONFIG_SENSORS_LTC4222 is not set
CONFIG_SENSORS_LTC4245=y
# CONFIG_SENSORS_LTC4260 is not set
CONFIG_SENSORS_LTC4261=y
CONFIG_SENSORS_MAX1111=y
CONFIG_SENSORS_MAX16065=y
# CONFIG_SENSORS_MAX1619 is not set
CONFIG_SENSORS_MAX1668=y
# CONFIG_SENSORS_MAX197 is not set
# CONFIG_SENSORS_MAX31722 is not set
# CONFIG_SENSORS_MAX6621 is not set
CONFIG_SENSORS_MAX6639=y
CONFIG_SENSORS_MAX6642=y
# CONFIG_SENSORS_MAX6650 is not set
# CONFIG_SENSORS_MAX6697 is not set
# CONFIG_SENSORS_MAX31790 is not set
CONFIG_SENSORS_MCP3021=y
# CONFIG_SENSORS_TC654 is not set
CONFIG_SENSORS_ADCXX=y
CONFIG_SENSORS_LM63=y
CONFIG_SENSORS_LM70=y
# CONFIG_SENSORS_LM73 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
CONFIG_SENSORS_LM80=y
CONFIG_SENSORS_LM83=y
CONFIG_SENSORS_LM85=y
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
CONFIG_SENSORS_LM92=y
CONFIG_SENSORS_LM93=y
# CONFIG_SENSORS_LM95234 is not set
CONFIG_SENSORS_LM95241=y
# CONFIG_SENSORS_LM95245 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_NTC_THERMISTOR is not set
# CONFIG_SENSORS_NCT6683 is not set
# CONFIG_SENSORS_NCT6775 is not set
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_NCT7904 is not set
CONFIG_SENSORS_PCF8591=y
CONFIG_PMBUS=y
# CONFIG_SENSORS_PMBUS is not set
# CONFIG_SENSORS_ADM1275 is not set
# CONFIG_SENSORS_IBM_CFFPS is not set
# CONFIG_SENSORS_IR35221 is not set
CONFIG_SENSORS_LM25066=y
CONFIG_SENSORS_LTC2978=y
# CONFIG_SENSORS_LTC2978_REGULATOR is not set
# CONFIG_SENSORS_LTC3815 is not set
# CONFIG_SENSORS_MAX16064 is not set
# CONFIG_SENSORS_MAX20751 is not set
# CONFIG_SENSORS_MAX31785 is not set
CONFIG_SENSORS_MAX34440=y
CONFIG_SENSORS_MAX8688=y
# CONFIG_SENSORS_TPS40422 is not set
# CONFIG_SENSORS_TPS53679 is not set
# CONFIG_SENSORS_UCD9000 is not set
CONFIG_SENSORS_UCD9200=y
CONFIG_SENSORS_ZL6100=y
# CONFIG_SENSORS_SHT15 is not set
# CONFIG_SENSORS_SHT21 is not set
# CONFIG_SENSORS_SHT3x is not set
# CONFIG_SENSORS_SHTC1 is not set
CONFIG_SENSORS_SIS5595=y
# CONFIG_SENSORS_DME1737 is not set
CONFIG_SENSORS_EMC1403=y
CONFIG_SENSORS_EMC2103=y
# CONFIG_SENSORS_EMC6W201 is not set
CONFIG_SENSORS_SMSC47M1=y
# CONFIG_SENSORS_SMSC47M192 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
CONFIG_SENSORS_SCH56XX_COMMON=y
# CONFIG_SENSORS_SCH5627 is not set
CONFIG_SENSORS_SCH5636=y
# CONFIG_SENSORS_STTS751 is not set
CONFIG_SENSORS_SMM665=y
# CONFIG_SENSORS_ADC128D818 is not set
CONFIG_SENSORS_ADS1015=y
CONFIG_SENSORS_ADS7828=y
# CONFIG_SENSORS_ADS7871 is not set
CONFIG_SENSORS_AMC6821=y
# CONFIG_SENSORS_INA209 is not set
CONFIG_SENSORS_INA2XX=y
# CONFIG_SENSORS_INA3221 is not set
# CONFIG_SENSORS_TC74 is not set
CONFIG_SENSORS_THMC50=y
# CONFIG_SENSORS_TMP102 is not set
# CONFIG_SENSORS_TMP103 is not set
# CONFIG_SENSORS_TMP108 is not set
CONFIG_SENSORS_TMP401=y
CONFIG_SENSORS_TMP421=y
CONFIG_SENSORS_VIA_CPUTEMP=y
CONFIG_SENSORS_VIA686A=y
CONFIG_SENSORS_VT1211=y
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83773G is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83791D is not set
# CONFIG_SENSORS_W83792D is not set
# CONFIG_SENSORS_W83793 is not set
CONFIG_SENSORS_W83795=y
# CONFIG_SENSORS_W83795_FANCTRL is not set
# CONFIG_SENSORS_W83L785TS is not set
CONFIG_SENSORS_W83L786NG=y
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
# CONFIG_SENSORS_WM8350 is not set
# CONFIG_SENSORS_XGENE is not set

#
# ACPI drivers
#
CONFIG_SENSORS_ACPI_POWER=y
CONFIG_SENSORS_ATK0110=y
CONFIG_THERMAL=y
CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
CONFIG_THERMAL_HWMON=y
# CONFIG_THERMAL_WRITABLE_TRIPS is not set
# CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE is not set
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE=y
# CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR is not set
# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_GOV_STEP_WISE is not set
# CONFIG_THERMAL_GOV_BANG_BANG is not set
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set
# CONFIG_CLOCK_THERMAL is not set
# CONFIG_DEVFREQ_THERMAL is not set
# CONFIG_THERMAL_EMULATION is not set
CONFIG_INTEL_POWERCLAMP=y
# CONFIG_INTEL_SOC_DTS_THERMAL is not set

#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
# CONFIG_INTEL_PCH_THERMAL is not set
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
# CONFIG_WATCHDOG_SYSFS is not set

#
# Watchdog Device Drivers
#
# CONFIG_SOFT_WATCHDOG is not set
# CONFIG_DA9052_WATCHDOG is not set
# CONFIG_WDAT_WDT is not set
# CONFIG_WM8350_WATCHDOG is not set
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_ZIIRAVE_WATCHDOG is not set
# CONFIG_CADENCE_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_MAX63XX_WATCHDOG is not set
# CONFIG_RETU_WATCHDOG is not set
CONFIG_ACQUIRE_WDT=y
# CONFIG_ADVANTECH_WDT is not set
# CONFIG_ALIM1535_WDT is not set
# CONFIG_ALIM7101_WDT is not set
# CONFIG_EBC_C384_WDT is not set
CONFIG_F71808E_WDT=y
# CONFIG_SP5100_TCO is not set
CONFIG_SBC_FITPC2_WATCHDOG=y
# CONFIG_EUROTECH_WDT is not set
CONFIG_IB700_WDT=y
CONFIG_IBMASR=y
CONFIG_WAFER_WDT=y
# CONFIG_I6300ESB_WDT is not set
# CONFIG_IE6XX_WDT is not set
CONFIG_ITCO_WDT=y
# CONFIG_ITCO_VENDOR_SUPPORT is not set
# CONFIG_IT8712F_WDT is not set
# CONFIG_IT87_WDT is not set
# CONFIG_HP_WATCHDOG is not set
CONFIG_SC1200_WDT=y
# CONFIG_PC87413_WDT is not set
# CONFIG_NV_TCO is not set
# CONFIG_60XX_WDT is not set
CONFIG_CPU5_WDT=y
CONFIG_SMSC_SCH311X_WDT=y
CONFIG_SMSC37B787_WDT=y
# CONFIG_VIA_WDT is not set
# CONFIG_W83627HF_WDT is not set
# CONFIG_W83877F_WDT is not set
# CONFIG_W83977F_WDT is not set
CONFIG_MACHZ_WDT=y
# CONFIG_SBC_EPX_C3_WATCHDOG is not set
# CONFIG_NI903X_WDT is not set
# CONFIG_NIC7018_WDT is not set
# CONFIG_MEN_A21_WDT is not set
CONFIG_XEN_WDT=y

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=y
# CONFIG_WDTPCI is not set

#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set

#
# Watchdog Pretimeout Governors
#
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
CONFIG_SSB_POSSIBLE=y
CONFIG_SSB=y
CONFIG_SSB_SPROM=y
CONFIG_SSB_PCIHOST_POSSIBLE=y
CONFIG_SSB_PCIHOST=y
# CONFIG_SSB_SILENT is not set
CONFIG_SSB_DEBUG=y
CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
CONFIG_SSB_DRIVER_PCICORE=y
CONFIG_SSB_DRIVER_GPIO=y
CONFIG_BCMA_POSSIBLE=y
CONFIG_BCMA=y
CONFIG_BCMA_HOST_PCI_POSSIBLE=y
# CONFIG_BCMA_HOST_PCI is not set
# CONFIG_BCMA_HOST_SOC is not set
CONFIG_BCMA_DRIVER_PCI=y
# CONFIG_BCMA_DRIVER_GMAC_CMN is not set
CONFIG_BCMA_DRIVER_GPIO=y
# CONFIG_BCMA_DEBUG is not set

#
# Multifunction device drivers
#
CONFIG_MFD_CORE=y
# CONFIG_MFD_AS3711 is not set
# CONFIG_PMIC_ADP5520 is not set
CONFIG_MFD_AAT2870_CORE=y
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_BD9571MWV is not set
# CONFIG_MFD_AXP20X_I2C is not set
# CONFIG_MFD_CROS_EC is not set
CONFIG_PMIC_DA903X=y
CONFIG_PMIC_DA9052=y
# CONFIG_MFD_DA9052_SPI is not set
CONFIG_MFD_DA9052_I2C=y
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_SPI is not set
# CONFIG_MFD_MC13XXX_I2C is not set
CONFIG_HTC_PASIC3=y
CONFIG_HTC_I2CPLD=y
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
CONFIG_LPC_ICH=y
CONFIG_LPC_SCH=y
# CONFIG_INTEL_SOC_PMIC is not set
# CONFIG_INTEL_SOC_PMIC_CHTWC is not set
# CONFIG_INTEL_SOC_PMIC_CHTDC_TI is not set
# CONFIG_MFD_INTEL_LPSS_ACPI is not set
# CONFIG_MFD_INTEL_LPSS_PCI is not set
CONFIG_MFD_JANZ_CMODIO=y
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
CONFIG_MFD_88PM805=y
CONFIG_MFD_88PM860X=y
# CONFIG_MFD_MAX14577 is not set
CONFIG_MFD_MAX77693=y
# CONFIG_MFD_MAX77843 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_MENF21BMC is not set
CONFIG_EZX_PCAP=y
CONFIG_MFD_VIPERBOARD=y
CONFIG_MFD_RETU=y
# CONFIG_MFD_PCF50633 is not set
CONFIG_UCB1400_CORE=y
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_RT5033 is not set
CONFIG_MFD_RC5T583=y
# CONFIG_MFD_SEC_CORE is not set
# CONFIG_MFD_SI476X_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SKY81452 is not set
CONFIG_MFD_SMSC=y
# CONFIG_ABX500_CORE is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
CONFIG_MFD_LP8788=y
# CONFIG_MFD_TI_LMU is not set
CONFIG_MFD_PALMAS=y
CONFIG_TPS6105X=y
# CONFIG_TPS65010 is not set
CONFIG_TPS6507X=y
# CONFIG_MFD_TPS65086 is not set
CONFIG_MFD_TPS65090=y
# CONFIG_MFD_TPS68470 is not set
# CONFIG_MFD_TI_LP873X is not set
CONFIG_MFD_TPS6586X=y
CONFIG_MFD_TPS65910=y
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_TPS65912_SPI is not set
CONFIG_MFD_TPS80031=y
# CONFIG_TWL4030_CORE is not set
CONFIG_TWL6040_CORE=y
CONFIG_MFD_WL1273_CORE=y
CONFIG_MFD_LM3533=y
CONFIG_MFD_VX855=y
CONFIG_MFD_ARIZONA=y
# CONFIG_MFD_ARIZONA_I2C is not set
CONFIG_MFD_ARIZONA_SPI=y
# CONFIG_MFD_CS47L24 is not set
# CONFIG_MFD_WM5102 is not set
CONFIG_MFD_WM5110=y
# CONFIG_MFD_WM8997 is not set
# CONFIG_MFD_WM8998 is not set
CONFIG_MFD_WM8400=y
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM831X_SPI is not set
CONFIG_MFD_WM8350=y
CONFIG_MFD_WM8350_I2C=y
# CONFIG_MFD_WM8994 is not set
CONFIG_REGULATOR=y
# CONFIG_REGULATOR_DEBUG is not set
CONFIG_REGULATOR_FIXED_VOLTAGE=y
# CONFIG_REGULATOR_VIRTUAL_CONSUMER is not set
# CONFIG_REGULATOR_USERSPACE_CONSUMER is not set
CONFIG_REGULATOR_88PM8607=y
# CONFIG_REGULATOR_ACT8865 is not set
# CONFIG_REGULATOR_AD5398 is not set
# CONFIG_REGULATOR_AAT2870 is not set
# CONFIG_REGULATOR_ARIZONA_LDO1 is not set
# CONFIG_REGULATOR_ARIZONA_MICSUPP is not set
# CONFIG_REGULATOR_DA903X is not set
# CONFIG_REGULATOR_DA9052 is not set
# CONFIG_REGULATOR_DA9210 is not set
# CONFIG_REGULATOR_DA9211 is not set
# CONFIG_REGULATOR_FAN53555 is not set
# CONFIG_REGULATOR_GPIO is not set
# CONFIG_REGULATOR_ISL9305 is not set
CONFIG_REGULATOR_ISL6271A=y
# CONFIG_REGULATOR_LP3971 is not set
CONFIG_REGULATOR_LP3972=y
# CONFIG_REGULATOR_LP872X is not set
# CONFIG_REGULATOR_LP8755 is not set
# CONFIG_REGULATOR_LP8788 is not set
# CONFIG_REGULATOR_LTC3589 is not set
# CONFIG_REGULATOR_LTC3676 is not set
# CONFIG_REGULATOR_MAX1586 is not set
CONFIG_REGULATOR_MAX8649=y
CONFIG_REGULATOR_MAX8660=y
# CONFIG_REGULATOR_MAX8952 is not set
# CONFIG_REGULATOR_MAX77693 is not set
# CONFIG_REGULATOR_MT6311 is not set
CONFIG_REGULATOR_PALMAS=y
CONFIG_REGULATOR_PCAP=y
# CONFIG_REGULATOR_PFUZE100 is not set
# CONFIG_REGULATOR_PV88060 is not set
# CONFIG_REGULATOR_PV88080 is not set
# CONFIG_REGULATOR_PV88090 is not set
# CONFIG_REGULATOR_PWM is not set
# CONFIG_REGULATOR_RC5T583 is not set
CONFIG_REGULATOR_TPS51632=y
# CONFIG_REGULATOR_TPS6105X is not set
# CONFIG_REGULATOR_TPS62360 is not set
CONFIG_REGULATOR_TPS65023=y
CONFIG_REGULATOR_TPS6507X=y
# CONFIG_REGULATOR_TPS65090 is not set
# CONFIG_REGULATOR_TPS65132 is not set
CONFIG_REGULATOR_TPS6524X=y
# CONFIG_REGULATOR_TPS6586X is not set
CONFIG_REGULATOR_TPS65910=y
# CONFIG_REGULATOR_TPS80031 is not set
CONFIG_REGULATOR_WM8350=y
# CONFIG_REGULATOR_WM8400 is not set
CONFIG_RC_CORE=y
# CONFIG_RC_MAP is not set
# CONFIG_LIRC is not set
CONFIG_RC_DECODERS=y
CONFIG_IR_NEC_DECODER=y
# CONFIG_IR_RC5_DECODER is not set
# CONFIG_IR_RC6_DECODER is not set
# CONFIG_IR_JVC_DECODER is not set
# CONFIG_IR_SONY_DECODER is not set
CONFIG_IR_SANYO_DECODER=y
# CONFIG_IR_SHARP_DECODER is not set
# CONFIG_IR_MCE_KBD_DECODER is not set
# CONFIG_IR_XMP_DECODER is not set
CONFIG_RC_DEVICES=y
# CONFIG_RC_ATI_REMOTE is not set
CONFIG_IR_ENE=y
CONFIG_IR_IMON=y
CONFIG_IR_MCEUSB=y
CONFIG_IR_ITE_CIR=y
CONFIG_IR_FINTEK=y
CONFIG_IR_NUVOTON=y
# CONFIG_IR_REDRAT3 is not set
# CONFIG_IR_STREAMZAP is not set
# CONFIG_IR_WINBOND_CIR is not set
# CONFIG_IR_IGORPLUGUSB is not set
# CONFIG_IR_IGUANA is not set
# CONFIG_IR_TTUSBIR is not set
CONFIG_RC_LOOPBACK=y
# CONFIG_IR_SERIAL is not set
# CONFIG_IR_SIR is not set
CONFIG_MEDIA_SUPPORT=y

#
# Multimedia core support
#
# CONFIG_MEDIA_CAMERA_SUPPORT is not set
CONFIG_MEDIA_ANALOG_TV_SUPPORT=y
CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y
# CONFIG_MEDIA_RADIO_SUPPORT is not set
# CONFIG_MEDIA_SDR_SUPPORT is not set
# CONFIG_MEDIA_CEC_SUPPORT is not set
# CONFIG_MEDIA_CONTROLLER is not set
CONFIG_VIDEO_DEV=y
CONFIG_VIDEO_V4L2=y
# CONFIG_VIDEO_ADV_DEBUG is not set
# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set
CONFIG_V4L2_FWNODE=y
CONFIG_DVB_CORE=y
# CONFIG_DVB_MMAP is not set
CONFIG_DVB_NET=y
CONFIG_DVB_MAX_ADAPTERS=8
# CONFIG_DVB_DYNAMIC_MINORS is not set
# CONFIG_DVB_DEMUX_SECTION_LOSS_LOG is not set
# CONFIG_DVB_ULE_DEBUG is not set

#
# Media drivers
#
# CONFIG_MEDIA_USB_SUPPORT is not set
# CONFIG_MEDIA_PCI_SUPPORT is not set
# CONFIG_DVB_PLATFORM_DRIVERS is not set

#
# Supported MMC/SDIO adapters
#
# CONFIG_CYPRESS_FIRMWARE is not set

#
# Media ancillary drivers (tuners, sensors, i2c, spi, frontends)
#
# CONFIG_MEDIA_SUBDRV_AUTOSELECT is not set
CONFIG_VIDEO_IR_I2C=y

#
# I2C Encoders, decoders, sensors and other helper chips
#

#
# Audio decoders, processors and mixers
#
# CONFIG_VIDEO_TVAUDIO is not set
CONFIG_VIDEO_TDA7432=y
CONFIG_VIDEO_TDA9840=y
# CONFIG_VIDEO_TEA6415C is not set
CONFIG_VIDEO_TEA6420=y
# CONFIG_VIDEO_MSP3400 is not set
# CONFIG_VIDEO_CS3308 is not set
CONFIG_VIDEO_CS5345=y
CONFIG_VIDEO_CS53L32A=y
# CONFIG_VIDEO_TLV320AIC23B is not set
# CONFIG_VIDEO_UDA1342 is not set
# CONFIG_VIDEO_WM8775 is not set
CONFIG_VIDEO_WM8739=y
# CONFIG_VIDEO_VP27SMPX is not set
# CONFIG_VIDEO_SONY_BTF_MPX is not set

#
# RDS decoders
#
CONFIG_VIDEO_SAA6588=y

#
# Video decoders
#
CONFIG_VIDEO_ADV7183=y
CONFIG_VIDEO_BT819=y
CONFIG_VIDEO_BT856=y
# CONFIG_VIDEO_BT866 is not set
CONFIG_VIDEO_KS0127=y
# CONFIG_VIDEO_ML86V7667 is not set
CONFIG_VIDEO_SAA7110=y
CONFIG_VIDEO_SAA711X=y
CONFIG_VIDEO_TVP514X=y
# CONFIG_VIDEO_TVP5150 is not set
# CONFIG_VIDEO_TVP7002 is not set
# CONFIG_VIDEO_TW2804 is not set
# CONFIG_VIDEO_TW9903 is not set
# CONFIG_VIDEO_TW9906 is not set
# CONFIG_VIDEO_TW9910 is not set
CONFIG_VIDEO_VPX3220=y

#
# Video and audio decoders
#
# CONFIG_VIDEO_SAA717X is not set
# CONFIG_VIDEO_CX25840 is not set

#
# Video encoders
#
# CONFIG_VIDEO_SAA7127 is not set
CONFIG_VIDEO_SAA7185=y
# CONFIG_VIDEO_ADV7170 is not set
# CONFIG_VIDEO_ADV7175 is not set
# CONFIG_VIDEO_ADV7343 is not set
# CONFIG_VIDEO_ADV7393 is not set
CONFIG_VIDEO_AK881X=y
# CONFIG_VIDEO_THS8200 is not set

#
# Camera sensor devices
#
# CONFIG_VIDEO_MT9M111 is not set

#
# Flash devices
#

#
# Video improvement chips
#
CONFIG_VIDEO_UPD64031A=y
CONFIG_VIDEO_UPD64083=y

#
# Audio/Video compression chips
#
# CONFIG_VIDEO_SAA6752HS is not set

#
# SDR tuner chips
#

#
# Miscellaneous helper chips
#
CONFIG_VIDEO_THS7303=y
CONFIG_VIDEO_M52790=y

#
# Sensors used on soc_camera driver
#

#
# SPI helper chips
#

#
# Media SPI Adapters
#
CONFIG_CXD2880_SPI_DRV=y
CONFIG_MEDIA_TUNER=y

#
# Customize TV tuners
#
# CONFIG_MEDIA_TUNER_SIMPLE is not set
CONFIG_MEDIA_TUNER_TDA18250=y
# CONFIG_MEDIA_TUNER_TDA8290 is not set
CONFIG_MEDIA_TUNER_TDA827X=y
# CONFIG_MEDIA_TUNER_TDA18271 is not set
CONFIG_MEDIA_TUNER_TDA9887=y
# CONFIG_MEDIA_TUNER_TEA5761 is not set
# CONFIG_MEDIA_TUNER_TEA5767 is not set
CONFIG_MEDIA_TUNER_MSI001=y
# CONFIG_MEDIA_TUNER_MT20XX is not set
# CONFIG_MEDIA_TUNER_MT2060 is not set
CONFIG_MEDIA_TUNER_MT2063=y
CONFIG_MEDIA_TUNER_MT2266=y
CONFIG_MEDIA_TUNER_MT2131=y
CONFIG_MEDIA_TUNER_QT1010=y
# CONFIG_MEDIA_TUNER_XC2028 is not set
CONFIG_MEDIA_TUNER_XC5000=y
# CONFIG_MEDIA_TUNER_XC4000 is not set
# CONFIG_MEDIA_TUNER_MXL5005S is not set
CONFIG_MEDIA_TUNER_MXL5007T=y
# CONFIG_MEDIA_TUNER_MC44S803 is not set
# CONFIG_MEDIA_TUNER_MAX2165 is not set
# CONFIG_MEDIA_TUNER_TDA18218 is not set
CONFIG_MEDIA_TUNER_FC0011=y
# CONFIG_MEDIA_TUNER_FC0012 is not set
# CONFIG_MEDIA_TUNER_FC0013 is not set
CONFIG_MEDIA_TUNER_TDA18212=y
# CONFIG_MEDIA_TUNER_E4000 is not set
# CONFIG_MEDIA_TUNER_FC2580 is not set
CONFIG_MEDIA_TUNER_M88RS6000T=y
# CONFIG_MEDIA_TUNER_TUA9001 is not set
CONFIG_MEDIA_TUNER_SI2157=y
CONFIG_MEDIA_TUNER_IT913X=y
CONFIG_MEDIA_TUNER_R820T=y
CONFIG_MEDIA_TUNER_MXL301RF=y
CONFIG_MEDIA_TUNER_QM1D1C0042=y

#
# Customise DVB Frontends
#

#
# Multistandard (satellite) frontends
#
# CONFIG_DVB_STB0899 is not set
CONFIG_DVB_STB6100=y
CONFIG_DVB_STV090x=y
CONFIG_DVB_STV0910=y
CONFIG_DVB_STV6110x=y
CONFIG_DVB_STV6111=y
CONFIG_DVB_MXL5XX=y

#
# Multistandard (cable + terrestrial) frontends
#
# CONFIG_DVB_DRXK is not set
CONFIG_DVB_TDA18271C2DD=y
CONFIG_DVB_SI2165=y
CONFIG_DVB_MN88472=y
CONFIG_DVB_MN88473=y

#
# DVB-S (satellite) frontends
#
CONFIG_DVB_CX24110=y
CONFIG_DVB_CX24123=y
CONFIG_DVB_MT312=y
# CONFIG_DVB_ZL10036 is not set
# CONFIG_DVB_ZL10039 is not set
# CONFIG_DVB_S5H1420 is not set
# CONFIG_DVB_STV0288 is not set
# CONFIG_DVB_STB6000 is not set
CONFIG_DVB_STV0299=y
CONFIG_DVB_STV6110=y
# CONFIG_DVB_STV0900 is not set
CONFIG_DVB_TDA8083=y
CONFIG_DVB_TDA10086=y
# CONFIG_DVB_TDA8261 is not set
CONFIG_DVB_VES1X93=y
# CONFIG_DVB_TUNER_ITD1000 is not set
CONFIG_DVB_TUNER_CX24113=y
# CONFIG_DVB_TDA826X is not set
CONFIG_DVB_TUA6100=y
# CONFIG_DVB_CX24116 is not set
CONFIG_DVB_CX24117=y
CONFIG_DVB_CX24120=y
CONFIG_DVB_SI21XX=y
CONFIG_DVB_TS2020=y
# CONFIG_DVB_DS3000 is not set
CONFIG_DVB_MB86A16=y
# CONFIG_DVB_TDA10071 is not set

#
# DVB-T (terrestrial) frontends
#
# CONFIG_DVB_SP8870 is not set
CONFIG_DVB_SP887X=y
CONFIG_DVB_CX22700=y
CONFIG_DVB_CX22702=y
CONFIG_DVB_S5H1432=y
CONFIG_DVB_DRXD=y
# CONFIG_DVB_L64781 is not set
# CONFIG_DVB_TDA1004X is not set
CONFIG_DVB_NXT6000=y
CONFIG_DVB_MT352=y
# CONFIG_DVB_ZL10353 is not set
# CONFIG_DVB_DIB3000MB is not set
# CONFIG_DVB_DIB3000MC is not set
CONFIG_DVB_DIB7000M=y
CONFIG_DVB_DIB7000P=y
# CONFIG_DVB_DIB9000 is not set
CONFIG_DVB_TDA10048=y
CONFIG_DVB_AF9013=y
# CONFIG_DVB_EC100 is not set
CONFIG_DVB_STV0367=y
# CONFIG_DVB_CXD2820R is not set
CONFIG_DVB_CXD2841ER=y
CONFIG_DVB_ZD1301_DEMOD=y
CONFIG_DVB_CXD2880=y

#
# DVB-C (cable) frontends
#
# CONFIG_DVB_VES1820 is not set
# CONFIG_DVB_TDA10021 is not set
# CONFIG_DVB_TDA10023 is not set
# CONFIG_DVB_STV0297 is not set

#
# ATSC (North American/Korean Terrestrial/Cable DTV) frontends
#
# CONFIG_DVB_NXT200X is not set
# CONFIG_DVB_OR51211 is not set
# CONFIG_DVB_OR51132 is not set
CONFIG_DVB_BCM3510=y
CONFIG_DVB_LGDT330X=y
# CONFIG_DVB_LGDT3305 is not set
# CONFIG_DVB_LG2160 is not set
CONFIG_DVB_S5H1409=y
CONFIG_DVB_AU8522=y
CONFIG_DVB_AU8522_DTV=y
# CONFIG_DVB_AU8522_V4L is not set
# CONFIG_DVB_S5H1411 is not set

#
# ISDB-T (terrestrial) frontends
#
# CONFIG_DVB_S921 is not set
CONFIG_DVB_DIB8000=y
# CONFIG_DVB_MB86A20S is not set

#
# ISDB-S (satellite) & ISDB-T (terrestrial) frontends
#
CONFIG_DVB_TC90522=y

#
# Digital terrestrial only tuners/PLL
#
# CONFIG_DVB_PLL is not set
# CONFIG_DVB_TUNER_DIB0070 is not set
# CONFIG_DVB_TUNER_DIB0090 is not set

#
# SEC control devices for DVB-S
#
CONFIG_DVB_DRX39XYJ=y
CONFIG_DVB_LNBH25=y
CONFIG_DVB_LNBP21=y
CONFIG_DVB_LNBP22=y
# CONFIG_DVB_ISL6405 is not set
# CONFIG_DVB_ISL6421 is not set
CONFIG_DVB_ISL6423=y
# CONFIG_DVB_A8293 is not set
# CONFIG_DVB_LGS8GL5 is not set
# CONFIG_DVB_LGS8GXX is not set
# CONFIG_DVB_ATBM8830 is not set
# CONFIG_DVB_TDA665x is not set
# CONFIG_DVB_IX2505V is not set
# CONFIG_DVB_M88RS2000 is not set
# CONFIG_DVB_AF9033 is not set
CONFIG_DVB_HORUS3A=y
CONFIG_DVB_ASCOT2E=y
CONFIG_DVB_HELENE=y

#
# Common Interface (EN50221) controller drivers
#
CONFIG_DVB_CXD2099=y
CONFIG_DVB_SP2=y

#
# Tools to develop new frontends
#
# CONFIG_DVB_DUMMY_FE is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
# CONFIG_AGP_INTEL is not set
CONFIG_AGP_SIS=y
CONFIG_AGP_VIA=y
# CONFIG_VGA_ARB is not set
# CONFIG_VGA_SWITCHEROO is not set
CONFIG_DRM=y
# CONFIG_DRM_DP_AUX_CHARDEV is not set
# CONFIG_DRM_DEBUG_MM is not set
# CONFIG_DRM_DEBUG_MM_SELFTEST is not set
CONFIG_DRM_KMS_HELPER=y
CONFIG_DRM_KMS_FB_HELPER=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_FBDEV_OVERALLOC=100
CONFIG_DRM_LOAD_EDID_FIRMWARE=y
CONFIG_DRM_TTM=y

#
# I2C encoder or helper chips
#
CONFIG_DRM_I2C_CH7006=y
# CONFIG_DRM_I2C_SIL164 is not set
# CONFIG_DRM_I2C_NXP_TDA998X is not set
# CONFIG_DRM_RADEON is not set
# CONFIG_DRM_AMDGPU is not set

#
# ACP (Audio CoProcessor) Configuration
#

#
# AMD Library routines
#
# CONFIG_DRM_NOUVEAU is not set
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_VGEM is not set
# CONFIG_DRM_VMWGFX is not set
CONFIG_DRM_GMA500=y
CONFIG_DRM_GMA600=y
CONFIG_DRM_GMA3600=y
# CONFIG_DRM_UDL is not set
CONFIG_DRM_AST=y
CONFIG_DRM_MGAG200=y
# CONFIG_DRM_CIRRUS_QEMU is not set
# CONFIG_DRM_QXL is not set
# CONFIG_DRM_BOCHS is not set
# CONFIG_DRM_VIRTIO_GPU is not set
CONFIG_DRM_PANEL=y

#
# Display Panels
#
CONFIG_DRM_BRIDGE=y
CONFIG_DRM_PANEL_BRIDGE=y

#
# Display Interface Bridges
#
# CONFIG_DRM_ANALOGIX_ANX78XX is not set
# CONFIG_DRM_HISI_HIBMC is not set
# CONFIG_DRM_TINYDRM is not set
# CONFIG_DRM_LEGACY is not set
CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y

#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
CONFIG_FB_DDC=y
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SYS_FILLRECT=y
CONFIG_FB_SYS_COPYAREA=y
CONFIG_FB_SYS_IMAGEBLIT=y
CONFIG_FB_FOREIGN_ENDIAN=y
CONFIG_FB_BOTH_ENDIAN=y
# CONFIG_FB_BIG_ENDIAN is not set
# CONFIG_FB_LITTLE_ENDIAN is not set
CONFIG_FB_SYS_FOPS=y
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_SVGALIB=y
CONFIG_FB_BACKLIGHT=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
CONFIG_FB_PM2=y
# CONFIG_FB_PM2_FIFO_DISCONNECT is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
CONFIG_FB_ASILIANT=y
CONFIG_FB_IMSTT=y
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
# CONFIG_FB_EFI is not set
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
CONFIG_FB_S1D13XXX=y
CONFIG_FB_NVIDIA=y
# CONFIG_FB_NVIDIA_I2C is not set
# CONFIG_FB_NVIDIA_DEBUG is not set
# CONFIG_FB_NVIDIA_BACKLIGHT is not set
CONFIG_FB_RIVA=y
CONFIG_FB_RIVA_I2C=y
CONFIG_FB_RIVA_DEBUG=y
CONFIG_FB_RIVA_BACKLIGHT=y
# CONFIG_FB_I740 is not set
CONFIG_FB_LE80578=y
# CONFIG_FB_CARILLO_RANCH is not set
CONFIG_FB_MATROX=y
CONFIG_FB_MATROX_MILLENIUM=y
# CONFIG_FB_MATROX_MYSTIQUE is not set
# CONFIG_FB_MATROX_G is not set
# CONFIG_FB_MATROX_I2C is not set
# CONFIG_FB_RADEON is not set
CONFIG_FB_ATY128=y
# CONFIG_FB_ATY128_BACKLIGHT is not set
CONFIG_FB_ATY=y
# CONFIG_FB_ATY_CT is not set
# CONFIG_FB_ATY_GX is not set
# CONFIG_FB_ATY_BACKLIGHT is not set
CONFIG_FB_S3=y
CONFIG_FB_S3_DDC=y
# CONFIG_FB_SAVAGE is not set
CONFIG_FB_SIS=y
CONFIG_FB_SIS_300=y
# CONFIG_FB_SIS_315 is not set
# CONFIG_FB_VIA is not set
CONFIG_FB_NEOMAGIC=y
CONFIG_FB_KYRO=y
# CONFIG_FB_3DFX is not set
CONFIG_FB_VOODOO1=y
# CONFIG_FB_VT8623 is not set
CONFIG_FB_TRIDENT=y
CONFIG_FB_ARK=y
CONFIG_FB_PM3=y
CONFIG_FB_CARMINE=y
# CONFIG_FB_CARMINE_DRAM_EVAL is not set
CONFIG_CARMINE_DRAM_CUSTOM=y
CONFIG_FB_SMSCUFX=y
CONFIG_FB_UDL=y
# CONFIG_FB_IBM_GXT4500 is not set
CONFIG_FB_VIRTUAL=y
# CONFIG_XEN_FBDEV_FRONTEND is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
CONFIG_FB_BROADSHEET=y
CONFIG_FB_AUO_K190X=y
# CONFIG_FB_AUO_K1900 is not set
# CONFIG_FB_AUO_K1901 is not set
# CONFIG_FB_SIMPLE is not set
# CONFIG_FB_SM712 is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=y
# CONFIG_LCD_L4F00242T03 is not set
CONFIG_LCD_LMS283GF05=y
# CONFIG_LCD_LTV350QV is not set
# CONFIG_LCD_ILI922X is not set
# CONFIG_LCD_ILI9320 is not set
CONFIG_LCD_TDO24M=y
# CONFIG_LCD_VGG2432A4 is not set
# CONFIG_LCD_PLATFORM is not set
CONFIG_LCD_S6E63M0=y
CONFIG_LCD_LD9040=y
CONFIG_LCD_AMS369FG06=y
# CONFIG_LCD_LMS501KF03 is not set
# CONFIG_LCD_HX8357 is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_GENERIC=y
# CONFIG_BACKLIGHT_LM3533 is not set
# CONFIG_BACKLIGHT_CARILLO_RANCH is not set
CONFIG_BACKLIGHT_PWM=y
CONFIG_BACKLIGHT_DA903X=y
CONFIG_BACKLIGHT_DA9052=y
CONFIG_BACKLIGHT_APPLE=y
# CONFIG_BACKLIGHT_PM8941_WLED is not set
CONFIG_BACKLIGHT_SAHARA=y
CONFIG_BACKLIGHT_ADP8860=y
CONFIG_BACKLIGHT_ADP8870=y
CONFIG_BACKLIGHT_88PM860X=y
CONFIG_BACKLIGHT_AAT2870=y
# CONFIG_BACKLIGHT_LM3630A is not set
CONFIG_BACKLIGHT_LM3639=y
# CONFIG_BACKLIGHT_LP855X is not set
# CONFIG_BACKLIGHT_LP8788 is not set
# CONFIG_BACKLIGHT_GPIO is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_BACKLIGHT_ARCXCNN is not set
CONFIG_VGASTATE=y
CONFIG_HDMI=y
# CONFIG_LOGO is not set
CONFIG_SOUND=y
CONFIG_SOUND_OSS_CORE=y
CONFIG_SOUND_OSS_CORE_PRECLAIM=y
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_HWDEP=y
CONFIG_SND_SEQ_DEVICE=y
CONFIG_SND_RAWMIDI=y
CONFIG_SND_JACK=y
CONFIG_SND_JACK_INPUT_DEV=y
CONFIG_SND_OSSEMUL=y
# CONFIG_SND_MIXER_OSS is not set
# CONFIG_SND_PCM_OSS is not set
CONFIG_SND_PCM_TIMER=y
CONFIG_SND_HRTIMER=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=32
# CONFIG_SND_SUPPORT_OLD_API is not set
CONFIG_SND_PROC_FS=y
CONFIG_SND_VERBOSE_PROCFS=y
CONFIG_SND_VERBOSE_PRINTK=y
CONFIG_SND_DEBUG=y
CONFIG_SND_DEBUG_VERBOSE=y
# CONFIG_SND_PCM_XRUN_DEBUG is not set
CONFIG_SND_VMASTER=y
CONFIG_SND_DMA_SGBUF=y
CONFIG_SND_SEQUENCER=y
# CONFIG_SND_SEQ_DUMMY is not set
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_SEQ_MIDI_EVENT=y
CONFIG_SND_SEQ_MIDI=y
CONFIG_SND_SEQ_MIDI_EMUL=y
CONFIG_SND_MPU401_UART=y
CONFIG_SND_OPL3_LIB=y
CONFIG_SND_OPL3_LIB_SEQ=y
CONFIG_SND_VX_LIB=y
CONFIG_SND_AC97_CODEC=y
# CONFIG_SND_DRIVERS is not set
CONFIG_SND_PCI=y
CONFIG_SND_AD1889=y
# CONFIG_SND_ASIHPI is not set
CONFIG_SND_ATIIXP=y
CONFIG_SND_ATIIXP_MODEM=y
CONFIG_SND_AU8810=y
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
CONFIG_SND_BT87X=y
# CONFIG_SND_BT87X_OVERCLOCK is not set
CONFIG_SND_CA0106=y
CONFIG_SND_CMIPCI=y
CONFIG_SND_OXYGEN_LIB=y
CONFIG_SND_OXYGEN=y
# CONFIG_SND_CS4281 is not set
CONFIG_SND_CS46XX=y
CONFIG_SND_CS46XX_NEW_DSP=y
# CONFIG_SND_CTXFI is not set
# CONFIG_SND_DARLA20 is not set
CONFIG_SND_GINA20=y
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
CONFIG_SND_LAYLA24=y
CONFIG_SND_MONA=y
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
CONFIG_SND_INDIGODJ=y
CONFIG_SND_INDIGOIOX=y
# CONFIG_SND_INDIGODJX is not set
CONFIG_SND_ENS1370=y
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_HDSP is not set
CONFIG_SND_HDSPM=y
CONFIG_SND_ICE1724=y
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
CONFIG_SND_KORG1212=y
CONFIG_SND_LOLA=y
CONFIG_SND_LX6464ES=y
CONFIG_SND_MIXART=y
CONFIG_SND_NM256=y
CONFIG_SND_PCXHR=y
# CONFIG_SND_RIPTIDE is not set
CONFIG_SND_RME32=y
CONFIG_SND_RME96=y
# CONFIG_SND_RME9652 is not set
CONFIG_SND_VIA82XX=y
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
CONFIG_SND_VX222=y
# CONFIG_SND_YMFPCI is not set

#
# HD-Audio
#
# CONFIG_SND_HDA_INTEL is not set
CONFIG_SND_HDA_PREALLOC_SIZE=64
# CONFIG_SND_SPI is not set
CONFIG_SND_USB=y
CONFIG_SND_USB_AUDIO=y
CONFIG_SND_USB_UA101=y
# CONFIG_SND_USB_USX2Y is not set
CONFIG_SND_USB_CAIAQ=y
# CONFIG_SND_USB_CAIAQ_INPUT is not set
CONFIG_SND_USB_US122L=y
# CONFIG_SND_USB_6FIRE is not set
# CONFIG_SND_USB_HIFACE is not set
# CONFIG_SND_BCD2000 is not set
# CONFIG_SND_USB_POD is not set
# CONFIG_SND_USB_PODHD is not set
# CONFIG_SND_USB_TONEPORT is not set
# CONFIG_SND_USB_VARIAX is not set
CONFIG_SND_SOC=y
# CONFIG_SND_SOC_AMD_ACP is not set
# CONFIG_SND_ATMEL_SOC is not set
# CONFIG_SND_DESIGNWARE_I2S is not set

#
# SoC Audio for Freescale CPUs
#

#
# Common SoC Audio options for Freescale CPUs:
#
# CONFIG_SND_SOC_FSL_ASRC is not set
# CONFIG_SND_SOC_FSL_SAI is not set
# CONFIG_SND_SOC_FSL_SSI is not set
# CONFIG_SND_SOC_FSL_SPDIF is not set
# CONFIG_SND_SOC_FSL_ESAI is not set
# CONFIG_SND_SOC_IMX_AUDMUX is not set
# CONFIG_SND_I2S_HI6210_I2S is not set
# CONFIG_SND_SOC_IMG is not set
CONFIG_SND_SOC_INTEL_SST_TOPLEVEL=y
# CONFIG_SND_SST_ATOM_HIFI2_PLATFORM_PCI is not set
# CONFIG_SND_SST_ATOM_HIFI2_PLATFORM is not set
# CONFIG_SND_SOC_INTEL_SKYLAKE is not set
CONFIG_SND_SOC_INTEL_MACH=y

#
# STMicroelectronics STM32 SOC audio support
#
# CONFIG_SND_SOC_XTFPGA_I2S is not set
# CONFIG_ZX_TDM is not set
CONFIG_SND_SOC_I2C_AND_SPI=y

#
# CODEC drivers
#
# CONFIG_SND_SOC_AC97_CODEC is not set
# CONFIG_SND_SOC_ADAU1701 is not set
# CONFIG_SND_SOC_ADAU1761_I2C is not set
# CONFIG_SND_SOC_ADAU1761_SPI is not set
# CONFIG_SND_SOC_ADAU7002 is not set
# CONFIG_SND_SOC_AK4104 is not set
# CONFIG_SND_SOC_AK4458 is not set
# CONFIG_SND_SOC_AK4554 is not set
# CONFIG_SND_SOC_AK4613 is not set
# CONFIG_SND_SOC_AK4642 is not set
# CONFIG_SND_SOC_AK5386 is not set
# CONFIG_SND_SOC_AK5558 is not set
# CONFIG_SND_SOC_ALC5623 is not set
# CONFIG_SND_SOC_BD28623 is not set
# CONFIG_SND_SOC_BT_SCO is not set
# CONFIG_SND_SOC_CS35L32 is not set
# CONFIG_SND_SOC_CS35L33 is not set
# CONFIG_SND_SOC_CS35L34 is not set
# CONFIG_SND_SOC_CS35L35 is not set
# CONFIG_SND_SOC_CS42L42 is not set
# CONFIG_SND_SOC_CS42L51_I2C is not set
# CONFIG_SND_SOC_CS42L52 is not set
# CONFIG_SND_SOC_CS42L56 is not set
# CONFIG_SND_SOC_CS42L73 is not set
# CONFIG_SND_SOC_CS4265 is not set
# CONFIG_SND_SOC_CS4270 is not set
# CONFIG_SND_SOC_CS4271_I2C is not set
# CONFIG_SND_SOC_CS4271_SPI is not set
# CONFIG_SND_SOC_CS42XX8_I2C is not set
# CONFIG_SND_SOC_CS43130 is not set
# CONFIG_SND_SOC_CS4349 is not set
# CONFIG_SND_SOC_CS53L30 is not set
# CONFIG_SND_SOC_DIO2125 is not set
# CONFIG_SND_SOC_ES7134 is not set
# CONFIG_SND_SOC_ES8316 is not set
# CONFIG_SND_SOC_ES8328_I2C is not set
# CONFIG_SND_SOC_ES8328_SPI is not set
# CONFIG_SND_SOC_GTM601 is not set
# CONFIG_SND_SOC_INNO_RK3036 is not set
# CONFIG_SND_SOC_MAX98504 is not set
# CONFIG_SND_SOC_MAX9867 is not set
# CONFIG_SND_SOC_MAX98927 is not set
# CONFIG_SND_SOC_MAX98373 is not set
# CONFIG_SND_SOC_MAX9860 is not set
# CONFIG_SND_SOC_MSM8916_WCD_DIGITAL is not set
# CONFIG_SND_SOC_PCM1681 is not set
# CONFIG_SND_SOC_PCM179X_I2C is not set
# CONFIG_SND_SOC_PCM179X_SPI is not set
# CONFIG_SND_SOC_PCM186X_I2C is not set
# CONFIG_SND_SOC_PCM186X_SPI is not set
# CONFIG_SND_SOC_PCM3168A_I2C is not set
# CONFIG_SND_SOC_PCM3168A_SPI is not set
# CONFIG_SND_SOC_PCM512x_I2C is not set
# CONFIG_SND_SOC_PCM512x_SPI is not set
# CONFIG_SND_SOC_RT5616 is not set
# CONFIG_SND_SOC_RT5631 is not set
# CONFIG_SND_SOC_SGTL5000 is not set
# CONFIG_SND_SOC_SIRF_AUDIO_CODEC is not set
# CONFIG_SND_SOC_SPDIF is not set
# CONFIG_SND_SOC_SSM2602_SPI is not set
# CONFIG_SND_SOC_SSM2602_I2C is not set
# CONFIG_SND_SOC_SSM4567 is not set
# CONFIG_SND_SOC_STA32X is not set
# CONFIG_SND_SOC_STA350 is not set
# CONFIG_SND_SOC_STI_SAS is not set
# CONFIG_SND_SOC_TAS2552 is not set
# CONFIG_SND_SOC_TAS5086 is not set
# CONFIG_SND_SOC_TAS571X is not set
# CONFIG_SND_SOC_TAS5720 is not set
# CONFIG_SND_SOC_TAS6424 is not set
# CONFIG_SND_SOC_TFA9879 is not set
# CONFIG_SND_SOC_TLV320AIC23_I2C is not set
# CONFIG_SND_SOC_TLV320AIC23_SPI is not set
# CONFIG_SND_SOC_TLV320AIC31XX is not set
# CONFIG_SND_SOC_TLV320AIC32X4_I2C is not set
# CONFIG_SND_SOC_TLV320AIC32X4_SPI is not set
# CONFIG_SND_SOC_TLV320AIC3X is not set
# CONFIG_SND_SOC_TS3A227E is not set
# CONFIG_SND_SOC_TSCS42XX is not set
# CONFIG_SND_SOC_WM8510 is not set
# CONFIG_SND_SOC_WM8523 is not set
# CONFIG_SND_SOC_WM8524 is not set
# CONFIG_SND_SOC_WM8580 is not set
# CONFIG_SND_SOC_WM8711 is not set
# CONFIG_SND_SOC_WM8728 is not set
# CONFIG_SND_SOC_WM8731 is not set
# CONFIG_SND_SOC_WM8737 is not set
# CONFIG_SND_SOC_WM8741 is not set
# CONFIG_SND_SOC_WM8750 is not set
# CONFIG_SND_SOC_WM8753 is not set
# CONFIG_SND_SOC_WM8770 is not set
# CONFIG_SND_SOC_WM8776 is not set
# CONFIG_SND_SOC_WM8804_I2C is not set
# CONFIG_SND_SOC_WM8804_SPI is not set
# CONFIG_SND_SOC_WM8903 is not set
# CONFIG_SND_SOC_WM8960 is not set
# CONFIG_SND_SOC_WM8962 is not set
# CONFIG_SND_SOC_WM8974 is not set
# CONFIG_SND_SOC_WM8978 is not set
# CONFIG_SND_SOC_WM8985 is not set
# CONFIG_SND_SOC_ZX_AUD96P22 is not set
# CONFIG_SND_SOC_MAX9759 is not set
# CONFIG_SND_SOC_NAU8540 is not set
# CONFIG_SND_SOC_NAU8810 is not set
# CONFIG_SND_SOC_NAU8824 is not set
# CONFIG_SND_SOC_TPA6130A2 is not set
CONFIG_SND_SIMPLE_CARD_UTILS=y
CONFIG_SND_SIMPLE_CARD=y
CONFIG_SND_X86=y
CONFIG_AC97_BUS=y

#
# HID support
#
# CONFIG_HID is not set

#
# USB HID support
#
# CONFIG_USB_HID is not set
# CONFIG_HID_PID is not set

#
# USB HID Boot Protocol drivers
#
CONFIG_USB_KBD=y
# CONFIG_USB_MOUSE is not set

#
# I2C HID support
#
# CONFIG_I2C_HID is not set

#
# Intel ISH HID support
#
# CONFIG_INTEL_ISH_HID is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_PCI=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
CONFIG_USB_DYNAMIC_MINORS=y
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_OTG_BLACKLIST_HUB is not set
# CONFIG_USB_LEDS_TRIGGER_USBPORT is not set
# CONFIG_USB_MON is not set
CONFIG_USB_WUSB_CBAF=y
CONFIG_USB_WUSB_CBAF_DEBUG=y

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_XHCI_HCD=y
# CONFIG_USB_XHCI_DBGCAP is not set
CONFIG_USB_XHCI_PCI=y
# CONFIG_USB_XHCI_PLATFORM is not set
# CONFIG_USB_EHCI_HCD is not set
CONFIG_USB_OXU210HP_HCD=y
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
# CONFIG_USB_MAX3421_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
# CONFIG_USB_OHCI_HCD_SSB is not set
CONFIG_USB_OHCI_HCD_PLATFORM=y
# CONFIG_USB_UHCI_HCD is not set
# CONFIG_USB_SL811_HCD is not set
CONFIG_USB_R8A66597_HCD=y
CONFIG_USB_HCD_BCMA=y
# CONFIG_USB_HCD_SSB is not set
# CONFIG_USB_HCD_TEST_MODE is not set

#
# USB Device Class drivers
#
CONFIG_USB_ACM=y
CONFIG_USB_PRINTER=y
CONFIG_USB_WDM=y
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=y
CONFIG_USB_STORAGE_DEBUG=y
CONFIG_USB_STORAGE_REALTEK=y
CONFIG_REALTEK_AUTOPM=y
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_USBAT is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_STORAGE_ALAUDA=y
# CONFIG_USB_STORAGE_ONETOUCH is not set
# CONFIG_USB_STORAGE_KARMA is not set
# CONFIG_USB_STORAGE_CYPRESS_ATACB is not set
# CONFIG_USB_STORAGE_ENE_UB6250 is not set
# CONFIG_USB_UAS is not set

#
# USB Imaging devices
#
CONFIG_USB_MDC800=y
# CONFIG_USB_MICROTEK is not set
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
# CONFIG_USB_DWC3 is not set
# CONFIG_USB_DWC2 is not set
CONFIG_USB_CHIPIDEA=y
CONFIG_USB_CHIPIDEA_PCI=y
# CONFIG_USB_CHIPIDEA_UDC is not set
# CONFIG_USB_ISP1760 is not set

#
# USB port drivers
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
CONFIG_USB_EMI62=y
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
CONFIG_USB_SEVSEG=y
CONFIG_USB_RIO500=y
# CONFIG_USB_LEGOTOWER is not set
CONFIG_USB_LCD=y
CONFIG_USB_CYPRESS_CY7C63=y
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
CONFIG_USB_IOWARRIOR=y
CONFIG_USB_TEST=y
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_YUREX is not set
# CONFIG_USB_EZUSB_FX2 is not set
# CONFIG_USB_HUB_USB251XB is not set
# CONFIG_USB_HSIC_USB3503 is not set
# CONFIG_USB_HSIC_USB4604 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_CHAOSKEY is not set

#
# USB Physical Layer drivers
#
CONFIG_USB_PHY=y
CONFIG_NOP_USB_XCEIV=y
CONFIG_USB_GPIO_VBUS=y
# CONFIG_TAHVO_USB is not set
# CONFIG_USB_ISP1301 is not set
CONFIG_USB_GADGET=y
# CONFIG_USB_GADGET_DEBUG is not set
# CONFIG_USB_GADGET_DEBUG_FILES is not set
# CONFIG_USB_GADGET_DEBUG_FS is not set
CONFIG_USB_GADGET_VBUS_DRAW=2
CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2
# CONFIG_U_SERIAL_CONSOLE is not set

#
# USB Peripheral Controller
#
# CONFIG_USB_FOTG210_UDC is not set
# CONFIG_USB_GR_UDC is not set
# CONFIG_USB_R8A66597 is not set
# CONFIG_USB_PXA27X is not set
# CONFIG_USB_MV_UDC is not set
# CONFIG_USB_MV_U3D is not set
CONFIG_USB_M66592=y
# CONFIG_USB_BDC_UDC is not set
# CONFIG_USB_AMD5536UDC is not set
# CONFIG_USB_NET2272 is not set
# CONFIG_USB_NET2280 is not set
# CONFIG_USB_GOKU is not set
# CONFIG_USB_EG20T is not set
CONFIG_USB_DUMMY_HCD=y
CONFIG_USB_LIBCOMPOSITE=y
CONFIG_USB_F_ACM=y
CONFIG_USB_U_SERIAL=y
CONFIG_USB_F_MASS_STORAGE=y
# CONFIG_USB_CONFIGFS is not set
# CONFIG_USB_ZERO is not set
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_ETH is not set
# CONFIG_USB_G_NCM is not set
# CONFIG_USB_GADGETFS is not set
# CONFIG_USB_FUNCTIONFS is not set
# CONFIG_USB_MASS_STORAGE is not set
# CONFIG_USB_G_SERIAL is not set
# CONFIG_USB_MIDI_GADGET is not set
# CONFIG_USB_G_PRINTER is not set
# CONFIG_USB_CDC_COMPOSITE is not set
CONFIG_USB_G_ACM_MS=y
# CONFIG_USB_G_MULTI is not set
# CONFIG_USB_G_HID is not set
# CONFIG_USB_G_DBGP is not set
# CONFIG_USB_G_WEBCAM is not set
# CONFIG_TYPEC is not set
# CONFIG_USB_LED_TRIG is not set
# CONFIG_USB_ULPI_BUS is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set
# CONFIG_LEDS_BRIGHTNESS_HW_CHANGED is not set

#
# LED drivers
#
# CONFIG_LEDS_88PM860X is not set
# CONFIG_LEDS_APU is not set
# CONFIG_LEDS_LM3530 is not set
# CONFIG_LEDS_LM3533 is not set
CONFIG_LEDS_LM3642=y
CONFIG_LEDS_PCA9532=y
# CONFIG_LEDS_PCA9532_GPIO is not set
CONFIG_LEDS_GPIO=y
# CONFIG_LEDS_LP3944 is not set
# CONFIG_LEDS_LP3952 is not set
# CONFIG_LEDS_LP5521 is not set
# CONFIG_LEDS_LP5523 is not set
# CONFIG_LEDS_LP5562 is not set
# CONFIG_LEDS_LP8501 is not set
CONFIG_LEDS_LP8788=y
# CONFIG_LEDS_CLEVO_MAIL is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
CONFIG_LEDS_WM8350=y
# CONFIG_LEDS_DA903X is not set
# CONFIG_LEDS_DA9052 is not set
CONFIG_LEDS_DAC124S085=y
# CONFIG_LEDS_PWM is not set
CONFIG_LEDS_REGULATOR=y
CONFIG_LEDS_BD2802=y
# CONFIG_LEDS_INTEL_SS4200 is not set
CONFIG_LEDS_LT3593=y
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_TLC591XX is not set
# CONFIG_LEDS_LM355x is not set

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
CONFIG_LEDS_BLINKM=y
# CONFIG_LEDS_MLXCPLD is not set
# CONFIG_LEDS_MLXREG is not set
# CONFIG_LEDS_USER is not set
# CONFIG_LEDS_NIC78BX is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=y
# CONFIG_LEDS_TRIGGER_ONESHOT is not set
# CONFIG_LEDS_TRIGGER_DISK is not set
# CONFIG_LEDS_TRIGGER_MTD is not set
CONFIG_LEDS_TRIGGER_HEARTBEAT=y
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_ACTIVITY is not set
# CONFIG_LEDS_TRIGGER_GPIO is not set
# CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_LEDS_TRIGGER_TRANSIENT is not set
# CONFIG_LEDS_TRIGGER_CAMERA is not set
# CONFIG_LEDS_TRIGGER_PANIC is not set
# CONFIG_LEDS_TRIGGER_NETDEV is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
# CONFIG_EDAC_LEGACY_SYSFS is not set
CONFIG_EDAC_DEBUG=y
# CONFIG_EDAC_DECODE_MCE is not set
# CONFIG_EDAC_E752X is not set
# CONFIG_EDAC_I82975X is not set
# CONFIG_EDAC_I3000 is not set
# CONFIG_EDAC_I3200 is not set
# CONFIG_EDAC_IE31200 is not set
# CONFIG_EDAC_X38 is not set
# CONFIG_EDAC_I5400 is not set
# CONFIG_EDAC_I5000 is not set
# CONFIG_EDAC_I5100 is not set
# CONFIG_EDAC_I7300 is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_MC146818_LIB=y
# CONFIG_RTC_CLASS is not set
# CONFIG_DMADEVICES is not set

#
# DMABUF options
#
CONFIG_SYNC_FILE=y
# CONFIG_SW_SYNC is not set
CONFIG_AUXDISPLAY=y
# CONFIG_HD44780 is not set
# CONFIG_IMG_ASCII_LCD is not set
CONFIG_UIO=y
# CONFIG_UIO_CIF is not set
# CONFIG_UIO_PDRV_GENIRQ is not set
CONFIG_UIO_DMEM_GENIRQ=y
# CONFIG_UIO_AEC is not set
CONFIG_UIO_SERCOS3=y
# CONFIG_UIO_PCI_GENERIC is not set
CONFIG_UIO_NETX=y
# CONFIG_UIO_PRUSS is not set
# CONFIG_UIO_MF624 is not set
CONFIG_VIRT_DRIVERS=y
# CONFIG_VBOXGUEST is not set
CONFIG_VIRTIO=y
CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_VIRTIO_BALLOON=y
# CONFIG_VIRTIO_INPUT is not set
# CONFIG_VIRTIO_MMIO is not set

#
# Microsoft Hyper-V guest support
#
# CONFIG_HYPERV is not set

#
# Xen driver support
#
CONFIG_XEN_BALLOON=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=y
CONFIG_XEN_BACKEND=y
# CONFIG_XENFS is not set
# CONFIG_XEN_SYS_HYPERVISOR is not set
CONFIG_XEN_XENBUS_FRONTEND=y
# CONFIG_XEN_GNTDEV is not set
CONFIG_XEN_GRANT_DEV_ALLOC=y
CONFIG_SWIOTLB_XEN=y
# CONFIG_XEN_PCIDEV_BACKEND is not set
# CONFIG_XEN_PVCALLS_FRONTEND is not set
# CONFIG_XEN_PVCALLS_BACKEND is not set
CONFIG_XEN_PRIVCMD=y
# CONFIG_XEN_ACPI_PROCESSOR is not set
# CONFIG_XEN_MCE_LOG is not set
CONFIG_XEN_HAVE_PVMMU=y
CONFIG_XEN_EFI=y
CONFIG_XEN_AUTO_XLATE=y
CONFIG_XEN_ACPI=y
CONFIG_XEN_HAVE_VPMU=y
CONFIG_STAGING=y
# CONFIG_IRDA is not set
# CONFIG_IPX is not set
# CONFIG_NCP_FS is not set
# CONFIG_COMEDI is not set
# CONFIG_RTS5208 is not set
# CONFIG_FB_SM750 is not set
# CONFIG_FB_XGI is not set

#
# Speakup console speech
#
# CONFIG_STAGING_MEDIA is not set

#
# Android
#
CONFIG_ASHMEM=y
# CONFIG_ION is not set
# CONFIG_LNET is not set
# CONFIG_DGNC is not set
# CONFIG_GS_FPGABOOT is not set
# CONFIG_CRYPTO_SKEIN is not set
# CONFIG_UNISYSSPAR is not set
# CONFIG_FB_TFT is not set
# CONFIG_MOST is not set
# CONFIG_GREYBUS is not set

#
# USB Power Delivery and Type-C drivers
#
# CONFIG_DRM_VBOXVIDEO is not set
# CONFIG_PI433 is not set
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACER_WMI=y
# CONFIG_ACER_WIRELESS is not set
# CONFIG_ACERHDF is not set
# CONFIG_ALIENWARE_WMI is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_DELL_SMBIOS_WMI is not set
# CONFIG_DELL_LAPTOP is not set
# CONFIG_DELL_WMI is not set
# CONFIG_DELL_WMI_AIO is not set
# CONFIG_DELL_WMI_LED is not set
# CONFIG_DELL_SMO8800 is not set
# CONFIG_FUJITSU_LAPTOP is not set
CONFIG_FUJITSU_TABLET=y
# CONFIG_GPD_POCKET_FAN is not set
CONFIG_HP_ACCEL=y
# CONFIG_HP_WIRELESS is not set
CONFIG_HP_WMI=y
CONFIG_PANASONIC_LAPTOP=y
# CONFIG_SURFACE3_WMI is not set
CONFIG_THINKPAD_ACPI=y
CONFIG_THINKPAD_ACPI_ALSA_SUPPORT=y
# CONFIG_THINKPAD_ACPI_DEBUGFACILITIES is not set
# CONFIG_THINKPAD_ACPI_DEBUG is not set
CONFIG_THINKPAD_ACPI_UNSAFE_LEDS=y
# CONFIG_THINKPAD_ACPI_VIDEO is not set
CONFIG_THINKPAD_ACPI_HOTKEY_POLL=y
# CONFIG_SENSORS_HDAPS is not set
CONFIG_INTEL_MENLOW=y
# CONFIG_EEEPC_LAPTOP is not set
# CONFIG_ASUS_WMI is not set
# CONFIG_ASUS_WIRELESS is not set
CONFIG_ACPI_WMI=y
CONFIG_WMI_BMOF=y
# CONFIG_INTEL_WMI_THUNDERBOLT is not set
CONFIG_MSI_WMI=y
# CONFIG_PEAQ_WMI is not set
CONFIG_TOPSTAR_LAPTOP=y
# CONFIG_TOSHIBA_BT_RFKILL is not set
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_TOSHIBA_WMI is not set
CONFIG_ACPI_CMPC=y
# CONFIG_INTEL_CHT_INT33FE is not set
# CONFIG_INTEL_INT0002_VGPIO is not set
# CONFIG_INTEL_HID_EVENT is not set
# CONFIG_INTEL_VBTN is not set
# CONFIG_INTEL_IPS is not set
# CONFIG_INTEL_PMC_CORE is not set
# CONFIG_IBM_RTL is not set
# CONFIG_SAMSUNG_LAPTOP is not set
CONFIG_MXM_WMI=y
CONFIG_SAMSUNG_Q10=y
CONFIG_APPLE_GMUX=y
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
# CONFIG_PVPANIC is not set
# CONFIG_INTEL_PMC_IPC is not set
# CONFIG_SURFACE_PRO3_BUTTON is not set
# CONFIG_SURFACE_3_BUTTON is not set
# CONFIG_INTEL_PUNIT_IPC is not set
# CONFIG_MLX_PLATFORM is not set
# CONFIG_INTEL_TURBO_MAX_3 is not set
CONFIG_PMC_ATOM=y
# CONFIG_CHROME_PLATFORMS is not set
# CONFIG_MELLANOX_PLATFORM is not set
CONFIG_CLKDEV_LOOKUP=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y

#
# Common Clock Framework
#
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_CDCE706 is not set
# CONFIG_COMMON_CLK_CS2000_CP is not set
# CONFIG_CLK_TWL6040 is not set
# CONFIG_COMMON_CLK_PALMAS is not set
# CONFIG_COMMON_CLK_PWM is not set
# CONFIG_HWSPINLOCK is not set

#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
CONFIG_MAILBOX=y
CONFIG_PCC=y
# CONFIG_ALTERA_MBOX is not set
# CONFIG_IOMMU_SUPPORT is not set

#
# Remoteproc drivers
#
CONFIG_REMOTEPROC=y

#
# Rpmsg drivers
#
# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
# CONFIG_RPMSG_VIRTIO is not set
# CONFIG_SOUNDWIRE is not set

#
# SOC (System On Chip) specific Drivers
#

#
# Amlogic SoC drivers
#

#
# Broadcom SoC drivers
#

#
# i.MX SoC drivers
#

#
# Qualcomm SoC drivers
#
# CONFIG_SOC_TI is not set

#
# Xilinx SoC drivers
#
# CONFIG_XILINX_VCU is not set
CONFIG_PM_DEVFREQ=y

#
# DEVFREQ Governors
#
CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=y
# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set
# CONFIG_DEVFREQ_GOV_POWERSAVE is not set
# CONFIG_DEVFREQ_GOV_USERSPACE is not set
# CONFIG_DEVFREQ_GOV_PASSIVE is not set

#
# DEVFREQ Drivers
#
# CONFIG_PM_DEVFREQ_EVENT is not set
CONFIG_EXTCON=y

#
# Extcon Device Drivers
#
# CONFIG_EXTCON_ARIZONA is not set
# CONFIG_EXTCON_GPIO is not set
# CONFIG_EXTCON_INTEL_INT3496 is not set
# CONFIG_EXTCON_MAX3355 is not set
# CONFIG_EXTCON_MAX77693 is not set
# CONFIG_EXTCON_PALMAS is not set
# CONFIG_EXTCON_RT8973A is not set
# CONFIG_EXTCON_SM5502 is not set
# CONFIG_EXTCON_USB_GPIO is not set
CONFIG_MEMORY=y
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_VME_BUS is not set
CONFIG_PWM=y
CONFIG_PWM_SYSFS=y
# CONFIG_PWM_LPSS_PCI is not set
# CONFIG_PWM_LPSS_PLATFORM is not set
# CONFIG_PWM_PCA9685 is not set

#
# IRQ chip support
#
CONFIG_ARM_GIC_MAX_NR=1
# CONFIG_IPACK_BUS is not set
CONFIG_RESET_CONTROLLER=y
# CONFIG_RESET_TI_SYSCON is not set
# CONFIG_FMC is not set

#
# PHY Subsystem
#
# CONFIG_GENERIC_PHY is not set
# CONFIG_BCM_KONA_USB2_PHY is not set
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set

#
# Performance monitor support
#
CONFIG_RAS=y
# CONFIG_RAS_CEC is not set
# CONFIG_THUNDERBOLT is not set

#
# Android
#
CONFIG_ANDROID=y
# CONFIG_ANDROID_BINDER_IPC is not set
# CONFIG_LIBNVDIMM is not set
# CONFIG_DAX is not set
# CONFIG_NVMEM is not set
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
# CONFIG_FPGA is not set
# CONFIG_FSI is not set
CONFIG_PM_OPP=y
# CONFIG_UNISYS_VISORBUS is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DELL_RBU=y
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_DMI_SYSFS is not set
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
CONFIG_ISCSI_IBFT_FIND=y
# CONFIG_ISCSI_IBFT is not set
# CONFIG_FW_CFG_SYSFS is not set
# CONFIG_GOOGLE_FIRMWARE is not set

#
# EFI (Extensible Firmware Interface) Support
#
# CONFIG_EFI_VARS is not set
CONFIG_EFI_ESRT=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_WRAPPERS=y
# CONFIG_EFI_CAPSULE_LOADER is not set
# CONFIG_EFI_TEST is not set
# CONFIG_APPLE_PROPERTIES is not set
# CONFIG_RESET_ATTACK_MITIGATION is not set

#
# Tegra firmware driver
#

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_FS_IOMAP=y
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_EXT4_FS=y
# CONFIG_EXT4_FS_POSIX_ACL is not set
# CONFIG_EXT4_FS_SECURITY is not set
# CONFIG_EXT4_ENCRYPTION is not set
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
CONFIG_REISERFS_FS_XATTR=y
CONFIG_REISERFS_FS_POSIX_ACL=y
# CONFIG_REISERFS_FS_SECURITY is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=y
# CONFIG_XFS_QUOTA is not set
# CONFIG_XFS_POSIX_ACL is not set
# CONFIG_XFS_RT is not set
# CONFIG_XFS_ONLINE_SCRUB is not set
# CONFIG_XFS_WARN is not set
# CONFIG_XFS_DEBUG is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
CONFIG_BTRFS_FS=y
# CONFIG_BTRFS_FS_POSIX_ACL is not set
# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
# CONFIG_BTRFS_FS_REF_VERIFY is not set
CONFIG_NILFS2_FS=y
# CONFIG_F2FS_FS is not set
# CONFIG_FS_DAX is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
CONFIG_FILE_LOCKING=y
CONFIG_MANDATORY_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
CONFIG_FSNOTIFY=y
# CONFIG_DNOTIFY is not set
CONFIG_INOTIFY_USER=y
# CONFIG_FANOTIFY is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_FUSE_FS is not set
# CONFIG_OVERLAY_FS is not set

#
# Caches
#
# CONFIG_FSCACHE is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
# CONFIG_ZISOFS is not set
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
# CONFIG_MSDOS_FS is not set
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_FAT_DEFAULT_UTF8 is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_MEMFD_CREATE=y
CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V2=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V3_ACL is not set
CONFIG_NFS_V4=y
# CONFIG_NFS_SWAP is not set
# CONFIG_NFS_V4_1 is not set
# CONFIG_ROOT_NFS is not set
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
# CONFIG_NFSD is not set
CONFIG_GRACE_PERIOD=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
# CONFIG_SUNRPC_DEBUG is not set
# CONFIG_CEPH_FS is not set
CONFIG_CIFS=y
# CONFIG_CIFS_STATS is not set
# CONFIG_CIFS_WEAK_PW_HASH is not set
# CONFIG_CIFS_UPCALL is not set
# CONFIG_CIFS_XATTR is not set
CONFIG_CIFS_DEBUG=y
# CONFIG_CIFS_DEBUG2 is not set
# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set
# CONFIG_CIFS_DFS_UPCALL is not set
# CONFIG_CIFS_SMB311 is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=y
# CONFIG_9P_FS_POSIX_ACL is not set
# CONFIG_9P_FS_SECURITY is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
# CONFIG_NLS_CODEPAGE_437 is not set
CONFIG_NLS_CODEPAGE_737=y
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
CONFIG_NLS_CODEPAGE_862=y
CONFIG_NLS_CODEPAGE_863=y
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
CONFIG_NLS_CODEPAGE_936=y
CONFIG_NLS_CODEPAGE_950=y
# CONFIG_NLS_CODEPAGE_932 is not set
CONFIG_NLS_CODEPAGE_949=y
# CONFIG_NLS_CODEPAGE_874 is not set
CONFIG_NLS_ISO8859_8=y
# CONFIG_NLS_CODEPAGE_1250 is not set
CONFIG_NLS_CODEPAGE_1251=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_2=y
# CONFIG_NLS_ISO8859_3 is not set
CONFIG_NLS_ISO8859_4=y
# CONFIG_NLS_ISO8859_5 is not set
CONFIG_NLS_ISO8859_6=y
CONFIG_NLS_ISO8859_7=y
# CONFIG_NLS_ISO8859_9 is not set
CONFIG_NLS_ISO8859_13=y
CONFIG_NLS_ISO8859_14=y
CONFIG_NLS_ISO8859_15=y
CONFIG_NLS_KOI8_R=y
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_MAC_ROMAN is not set
CONFIG_NLS_MAC_CELTIC=y
# CONFIG_NLS_MAC_CENTEURO is not set
# CONFIG_NLS_MAC_CROATIAN is not set
CONFIG_NLS_MAC_CYRILLIC=y
CONFIG_NLS_MAC_GAELIC=y
# CONFIG_NLS_MAC_GREEK is not set
CONFIG_NLS_MAC_ICELAND=y
# CONFIG_NLS_MAC_INUIT is not set
# CONFIG_NLS_MAC_ROMANIAN is not set
# CONFIG_NLS_MAC_TURKISH is not set
CONFIG_NLS_UTF8=y
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
# CONFIG_DYNAMIC_DEBUG is not set

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_REDUCED=y
# CONFIG_DEBUG_INFO_SPLIT is not set
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_GDB_SCRIPTS is not set
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_FRAME_WARN=2048
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_STACK_VALIDATION=y
CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_MAGIC_SYSRQ_SERIAL=y
CONFIG_DEBUG_KERNEL=y

#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_SELFTEST=y
CONFIG_DEBUG_OBJECTS_FREE=y
# CONFIG_DEBUG_OBJECTS_TIMERS is not set
CONFIG_DEBUG_OBJECTS_WORK=y
# CONFIG_DEBUG_OBJECTS_RCU_HEAD is not set
# CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER is not set
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
# CONFIG_SLUB_DEBUG_ON is not set
CONFIG_SLUB_STATS=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUG_VM=y
# CONFIG_DEBUG_VM_VMACACHE is not set
CONFIG_DEBUG_VM_RB=y
# CONFIG_DEBUG_VM_PGFLAGS is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
# CONFIG_DEBUG_VIRTUAL is not set
# CONFIG_DEBUG_MEMORY_INIT is not set
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_HAVE_ARCH_KASAN=y
# CONFIG_KASAN is not set
CONFIG_ARCH_HAS_KCOV=y
# CONFIG_KCOV is not set
CONFIG_DEBUG_SHIRQ=y

#
# Debug Lockups and Hangs
#
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_WQ_WATCHDOG is not set
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_SCHED_STACK_END_CHECK is not set
# CONFIG_DEBUG_TIMEKEEPING is not set
# CONFIG_DEBUG_PREEMPT is not set

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
CONFIG_DEBUG_LOCK_ALLOC=y
# CONFIG_PROVE_LOCKING is not set
CONFIG_LOCKDEP=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
# CONFIG_WW_MUTEX_SELFTEST is not set
CONFIG_STACKTRACE=y
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_LIST is not set
CONFIG_DEBUG_PI_LIST=y
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
CONFIG_DEBUG_CREDENTIALS=y

#
# RCU Debugging
#
# CONFIG_RCU_PERF_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_DMA_API_DEBUG is not set
CONFIG_RUNTIME_TESTING_MENU=y
# CONFIG_LKDTM is not set
# CONFIG_TEST_LIST_SORT is not set
# CONFIG_TEST_SORT is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_INTERVAL_TREE_TEST is not set
CONFIG_ATOMIC64_SELFTEST=y
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
# CONFIG_TEST_KSTRTOX is not set
# CONFIG_TEST_PRINTF is not set
# CONFIG_TEST_BITMAP is not set
# CONFIG_TEST_UUID is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_TEST_HASH is not set
# CONFIG_FIND_BIT_BENCHMARK is not set
# CONFIG_TEST_FIRMWARE is not set
# CONFIG_TEST_SYSCTL is not set
# CONFIG_TEST_UDELAY is not set
# CONFIG_MEMTEST is not set
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
# CONFIG_EARLY_PRINTK is not set
# CONFIG_X86_PTDUMP is not set
# CONFIG_EFI_PGT_DUMP is not set
# CONFIG_DEBUG_WX is not set
CONFIG_DOUBLEFAULT=y
CONFIG_DEBUG_TLBFLUSH=y
# CONFIG_IOMMU_DEBUG is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
# CONFIG_IO_DELAY_0X80 is not set
CONFIG_IO_DELAY_0XED=y
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=1
CONFIG_DEBUG_BOOT_PARAMS=y
# CONFIG_CPA_DEBUG is not set
# CONFIG_OPTIMIZE_INLINING is not set
# CONFIG_DEBUG_ENTRY is not set
# CONFIG_DEBUG_NMI_SELFTEST is not set
CONFIG_X86_DEBUG_FPU=y
# CONFIG_PUNIT_ATOM_DEBUG is not set
CONFIG_UNWINDER_ORC=y
# CONFIG_UNWINDER_FRAME_POINTER is not set
# CONFIG_UNWINDER_GUESS is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_KEYS_COMPAT=y
# CONFIG_PERSISTENT_KEYRINGS is not set
# CONFIG_BIG_KEYS is not set
CONFIG_TRUSTED_KEYS=y
CONFIG_ENCRYPTED_KEYS=y
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITY is not set
CONFIG_SECURITYFS=y
CONFIG_PAGE_TABLE_ISOLATION=y
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
# CONFIG_HARDENED_USERCOPY is not set
# CONFIG_FORTIFY_SOURCE is not set
# CONFIG_STATIC_USERMODEHELPER is not set
# CONFIG_LOCK_DOWN_KERNEL is not set
# CONFIG_LOCK_DOWN_IN_EFI_SECURE_BOOT is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_DEFAULT_SECURITY=""
CONFIG_XOR_BLOCKS=y
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_KPP2=y
CONFIG_CRYPTO_ACOMP2=y
# CONFIG_CRYPTO_RSA is not set
# CONFIG_CRYPTO_DH is not set
# CONFIG_CRYPTO_ECDH is not set
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=y
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
# CONFIG_CRYPTO_PCRYPT is not set
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CRYPTD=y
# CONFIG_CRYPTO_MCRYPTD is not set
CONFIG_CRYPTO_AUTHENC=y
CONFIG_CRYPTO_SIMD=y
CONFIG_CRYPTO_GLUE_HELPER_X86=y

#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=y
# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=y

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=y
CONFIG_CRYPTO_PCBC=y
CONFIG_CRYPTO_XTS=y
# CONFIG_CRYPTO_KEYWRAP is not set

#
# Hash modes
#
CONFIG_CRYPTO_CMAC=y
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
CONFIG_CRYPTO_VMAC=y

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32C_INTEL is not set
# CONFIG_CRYPTO_CRC32 is not set
# CONFIG_CRYPTO_CRC32_PCLMUL is not set
CONFIG_CRYPTO_CRCT10DIF=y
# CONFIG_CRYPTO_CRCT10DIF_PCLMUL is not set
CONFIG_CRYPTO_GHASH=y
# CONFIG_CRYPTO_POLY1305 is not set
# CONFIG_CRYPTO_POLY1305_X86_64 is not set
CONFIG_CRYPTO_MD4=y
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_RMD128 is not set
CONFIG_CRYPTO_RMD160=y
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
CONFIG_CRYPTO_SHA1=y
# CONFIG_CRYPTO_SHA1_SSSE3 is not set
# CONFIG_CRYPTO_SHA256_SSSE3 is not set
# CONFIG_CRYPTO_SHA512_SSSE3 is not set
# CONFIG_CRYPTO_SHA1_MB is not set
# CONFIG_CRYPTO_SHA256_MB is not set
# CONFIG_CRYPTO_SHA512_MB is not set
CONFIG_CRYPTO_SHA256=y
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_SHA3 is not set
# CONFIG_CRYPTO_SM3 is not set
CONFIG_CRYPTO_TGR192=y
# CONFIG_CRYPTO_WP512 is not set
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=y

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
# CONFIG_CRYPTO_AES_TI is not set
# CONFIG_CRYPTO_AES_X86_64 is not set
# CONFIG_CRYPTO_AES_NI_INTEL is not set
CONFIG_CRYPTO_ANUBIS=y
CONFIG_CRYPTO_ARC4=y
CONFIG_CRYPTO_BLOWFISH=y
CONFIG_CRYPTO_BLOWFISH_COMMON=y
CONFIG_CRYPTO_BLOWFISH_X86_64=y
# CONFIG_CRYPTO_CAMELLIA is not set
CONFIG_CRYPTO_CAMELLIA_X86_64=y
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=y
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
CONFIG_CRYPTO_CAST_COMMON=y
CONFIG_CRYPTO_CAST5=y
# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
CONFIG_CRYPTO_CAST6=y
CONFIG_CRYPTO_CAST6_AVX_X86_64=y
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
CONFIG_CRYPTO_FCRYPT=y
CONFIG_CRYPTO_KHAZAD=y
CONFIG_CRYPTO_SALSA20=y
CONFIG_CRYPTO_SALSA20_X86_64=y
# CONFIG_CRYPTO_CHACHA20 is not set
# CONFIG_CRYPTO_CHACHA20_X86_64 is not set
CONFIG_CRYPTO_SEED=y
CONFIG_CRYPTO_SERPENT=y
# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
CONFIG_CRYPTO_SERPENT_AVX_X86_64=y
# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
# CONFIG_CRYPTO_SPECK is not set
CONFIG_CRYPTO_TEA=y
# CONFIG_CRYPTO_TWOFISH is not set
CONFIG_CRYPTO_TWOFISH_COMMON=y
CONFIG_CRYPTO_TWOFISH_X86_64=y
CONFIG_CRYPTO_TWOFISH_X86_64_3WAY=y
CONFIG_CRYPTO_TWOFISH_AVX_X86_64=y

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
# CONFIG_CRYPTO_LZO is not set
# CONFIG_CRYPTO_842 is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
# CONFIG_CRYPTO_USER_API_HASH is not set
# CONFIG_CRYPTO_USER_API_SKCIPHER is not set
# CONFIG_CRYPTO_USER_API_RNG is not set
# CONFIG_CRYPTO_USER_API_AEAD is not set
CONFIG_CRYPTO_HASH_INFO=y
# CONFIG_CRYPTO_HW is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
# CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE is not set

#
# Certificates for signature checking
#
# CONFIG_SYSTEM_TRUSTED_KEYRING is not set
# CONFIG_SYSTEM_BLACKLIST_KEYRING is not set
CONFIG_HAVE_KVM=y
CONFIG_VIRTUALIZATION=y
# CONFIG_KVM is not set
CONFIG_VHOST_NET=y
CONFIG_VHOST=y
# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set

#
# Library routines
#
CONFIG_RAID6_PQ=y
CONFIG_BITREVERSE=y
CONFIG_RATIONAL=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_CRC_CCITT=y
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
CONFIG_CRC32_SELFTEST=y
# CONFIG_CRC32_SLICEBY8 is not set
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
CONFIG_CRC32_BIT=y
# CONFIG_CRC4 is not set
CONFIG_CRC7=y
CONFIG_LIBCRC32C=y
# CONFIG_CRC8 is not set
CONFIG_XXHASH=y
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_ZSTD_COMPRESS=y
CONFIG_ZSTD_DECOMPRESS=y
CONFIG_XZ_DEC=y
# CONFIG_XZ_DEC_X86 is not set
# CONFIG_XZ_DEC_POWERPC is not set
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
# CONFIG_XZ_DEC_SPARC is not set
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_BCH=y
CONFIG_BCH_CONST_PARAMS=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=y
CONFIG_TEXTSEARCH_BM=y
CONFIG_TEXTSEARCH_FSM=y
CONFIG_BTREE=y
CONFIG_RADIX_TREE_MULTIORDER=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_SGL_ALLOC=y
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_CORDIC=y
# CONFIG_DDR is not set
CONFIG_IRQ_POLL=y
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_SG_CHAIN=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
CONFIG_SBITMAP=y
# CONFIG_STRING_SELFTEST is not set

[-- Attachment #3: job-script --]
[-- Type: text/plain, Size: 3775 bytes --]

#!/bin/sh

export_top_env()
{
	export suite='trinity'
	export testcase='trinity'
	export runtime=300
	export job_origin='/lkp/lkp/src/allot/rand/vm-kbuild-yocto-x86_64/trinity.yaml'
	export testbox='vm-kbuild-yocto-x86_64-62'
	export tbox_group='vm-kbuild-yocto-x86_64'
	export kconfig='x86_64-acpi-redef'
	export compiler='gcc-7'
	export queue='bisect'
	export branch='linux-devel/devel-catchup-201803161558'
	export commit='b1f0502d04537ef55b0c296823affe332b100eb5'
	export submit_id='5aacbfbf0b9a9316d564cb62'
	export job_file='/lkp/scheduled/vm-kbuild-yocto-x86_64-62/trinity-300s-yocto-minimal-x86_64-2016-04-22.cgz-b1f0502d04537ef55b0c296823affe332b100eb5-20180317-5845-1ji2aul-0.yaml'
	export id='06a0be8f45c61324ef6913e73758b6565d86f204'
	export model='qemu-system-x86_64 -enable-kvm -cpu SandyBridge'
	export nr_vm=64
	export nr_cpu=1
	export memory='512M'
	export rootfs='yocto-minimal-x86_64-2016-04-22.cgz'
	export swap_partitions='/dev/vda'
	export need_kconfig='CONFIG_KVM_GUEST=y'
	export enqueue_time='2018-03-17 15:11:59 +0800'
	export _id='5aacbfbf0b9a9316d564cb62'
	export _rt='/result/trinity/300s/vm-kbuild-yocto-x86_64/yocto-minimal-x86_64-2016-04-22.cgz/x86_64-acpi-redef/gcc-7/b1f0502d04537ef55b0c296823affe332b100eb5'
	export user='lkp'
	export result_root='/result/trinity/300s/vm-kbuild-yocto-x86_64/yocto-minimal-x86_64-2016-04-22.cgz/x86_64-acpi-redef/gcc-7/b1f0502d04537ef55b0c296823affe332b100eb5/0'
	export LKP_SERVER='inn'
	export max_uptime=1500
	export initrd='/osimage/yocto/yocto-minimal-x86_64-2016-04-22.cgz'
	export bootloader_append='root=/dev/ram0
user=lkp
job=/lkp/scheduled/vm-kbuild-yocto-x86_64-62/trinity-300s-yocto-minimal-x86_64-2016-04-22.cgz-b1f0502d04537ef55b0c296823affe332b100eb5-20180317-5845-1ji2aul-0.yaml
ARCH=x86_64
kconfig=x86_64-acpi-redef
branch=linux-devel/devel-catchup-201803161558
commit=b1f0502d04537ef55b0c296823affe332b100eb5
BOOT_IMAGE=/pkg/linux/x86_64-acpi-redef/gcc-7/b1f0502d04537ef55b0c296823affe332b100eb5/vmlinuz-4.16.0-rc4-next-20180309-00007-gb1f0502
max_uptime=1500
RESULT_ROOT=/result/trinity/300s/vm-kbuild-yocto-x86_64/yocto-minimal-x86_64-2016-04-22.cgz/x86_64-acpi-redef/gcc-7/b1f0502d04537ef55b0c296823affe332b100eb5/0
LKP_SERVER=inn
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
net.ifnames=0
printk.devkmsg=on
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
drbd.minor_count=8
systemd.log_level=err
ignore_loglevel
console=tty0
earlyprintk=ttyS0,115200
console=ttyS0,115200
vga=normal
rw'
	export bm_initrd='/osimage/pkg/debian-x86_64-2016-08-31.cgz/trinity-static-x86_64-x86_64-6ddabfd2_2017-11-10.cgz'
	export lkp_initrd='/lkp/lkp/lkp-x86_64.cgz'
	export site='inn'
	export LKP_CGI_PORT=80
	export LKP_CIFS_PORT=139
	export kernel='/pkg/linux/x86_64-acpi-redef/gcc-7/b1f0502d04537ef55b0c296823affe332b100eb5/vmlinuz-4.16.0-rc4-next-20180309-00007-gb1f0502'
	export dequeue_time='2018-03-17 15:16:09 +0800'
	export job_initrd='/lkp/scheduled/vm-kbuild-yocto-x86_64-62/trinity-300s-yocto-minimal-x86_64-2016-04-22.cgz-b1f0502d04537ef55b0c296823affe332b100eb5-20180317-5845-1ji2aul-0.cgz'

	[ -n "$LKP_SRC" ] ||
	export LKP_SRC=/lkp/${user:-lkp}/src
}

run_job()
{
	echo $$ > $TMP/run-job.pid

	. $LKP_SRC/lib/http.sh
	. $LKP_SRC/lib/job.sh
	. $LKP_SRC/lib/env.sh

	export_top_env

	run_monitor $LKP_SRC/monitors/wrapper kmsg
	run_monitor $LKP_SRC/monitors/wrapper oom-killer
	run_monitor $LKP_SRC/monitors/plain/watchdog

	run_test $LKP_SRC/tests/wrapper trinity
}

extract_stats()
{
	$LKP_SRC/stats/wrapper kmsg

	$LKP_SRC/stats/wrapper time trinity.time
	$LKP_SRC/stats/wrapper time
	$LKP_SRC/stats/wrapper dmesg
	$LKP_SRC/stats/wrapper kmsg
	$LKP_SRC/stats/wrapper stderr
	$LKP_SRC/stats/wrapper last_state
}

"$@"

[-- Attachment #4: dmesg.xz --]
[-- Type: application/x-xz, Size: 13744 bytes --]

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-03-17  7:51   ` [mm] b1f0502d04: INFO:trying_to_register_non-static_key kernel test robot
@ 2018-03-21 12:21     ` Laurent Dufour
  2018-03-25 22:10       ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-21 12:21 UTC (permalink / raw)
  To: kernel test robot
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86, lkp



On 17/03/2018 08:51, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
> 
> commit: b1f0502d04537ef55b0c296823affe332b100eb5 ("mm: VMA sequence count")
> url: https://github.com/0day-ci/linux/commits/Laurent-Dufour/Speculative-page-faults/20180316-151833
> 
> 
> in testcase: trinity
> with following parameters:
> 
> 	runtime: 300s
> 
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
> 
> 
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -m 512M
> 
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> 
> 
> +----------------------------------------+------------+------------+
> |                                        | 6a4ce82339 | b1f0502d04 |
> +----------------------------------------+------------+------------+
> | boot_successes                         | 8          | 4          |
> | boot_failures                          | 0          | 4          |
> | INFO:trying_to_register_non-static_key | 0          | 4          |
> +----------------------------------------+------------+------------+
> 
> 
> 
> [   22.212940] INFO: trying to register non-static key.
> [   22.213687] the code is fine but needs lockdep annotation.
> [   22.214459] turning off the locking correctness validator.
> [   22.227459] CPU: 0 PID: 547 Comm: trinity-main Not tainted 4.16.0-rc4-next-20180309-00007-gb1f0502 #239
> [   22.228904] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [   22.230043] Call Trace:
> [   22.230409]  dump_stack+0x5d/0x79
> [   22.231025]  register_lock_class+0x226/0x45e
> [   22.231827]  ? kvm_clock_read+0x21/0x30
> [   22.232544]  ? kvm_sched_clock_read+0x5/0xd
> [   22.233330]  __lock_acquire+0xa2/0x774
> [   22.234152]  lock_acquire+0x4b/0x66
> [   22.234805]  ? unmap_vmas+0x30/0x3d
> [   22.245680]  unmap_page_range+0x56/0x48c
> [   22.248127]  ? unmap_vmas+0x30/0x3d
> [   22.248741]  ? lru_deactivate_file_fn+0x2c6/0x2c6
> [   22.249537]  ? pagevec_lru_move_fn+0x9a/0xa9
> [   22.250244]  unmap_vmas+0x30/0x3d
> [   22.250791]  unmap_region+0xad/0x105
> [   22.251419]  mmap_region+0x3cc/0x455
> [   22.252011]  do_mmap+0x394/0x3e9
> [   22.261224]  vm_mmap_pgoff+0x9c/0xe5
> [   22.261798]  SyS_mmap_pgoff+0x19a/0x1d4
> [   22.262475]  ? task_work_run+0x5e/0x9c
> [   22.263163]  do_syscall_64+0x6d/0x103
> [   22.263814]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [   22.264697] RIP: 0033:0x4573da
> [   22.267248] RSP: 002b:00007fffa22f1398 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
> [   22.274720] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00000000004573da
> [   22.276083] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
> [   22.277343] RBP: 000000000000001c R08: 000000000000001c R09: 0000000000000000
> [   22.278686] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
> [   22.279930] R13: 0000000000001000 R14: 0000000000000002 R15: 0000000000000000
> [   22.391866] trinity-main uses obsolete (PF_INET,SOCK_PACKET)
> [  327.566956] sysrq: SysRq : Emergency Sync
> [  327.567849] Emergency Sync complete
> [  327.569975] sysrq: SysRq : Resetting

I found the root cause of this lockdep warning.

In mmap_region(), unmap_region() may be called while vma_link() has not been
called. This happens during the error path if call_mmap() failed.

The only to fix that particular case is to call
seqcount_init(&vma->vm_sequence) when initializing the vma in mmap_region().

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 00/24] Speculative page faults
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (22 preceding siblings ...)
  2018-03-14 13:11 ` [PATCH v9 00/24] Speculative page faults Michal Hocko
@ 2018-03-22  1:21 ` Ganesh Mahendran
  2018-03-29 12:49   ` Laurent Dufour
       [not found] ` <1520963994-28477-5-git-send-email-ldufour@linux.vnet.ibm.com>
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 98+ messages in thread
From: Ganesh Mahendran @ 2018-03-22  1:21 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, Peter Zijlstra, Andrew Morton, kirill, ak, Michal Hocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	Sergey Senozhatsky, Daniel Jordan, linux-kernel, Linux-MM, haren,
	khandual, npiggin, Balbir Singh, Tim Chen, linuxppc-dev, x86

Hi, Laurent

2018-03-14 1:59 GMT+08:00 Laurent Dufour <ldufour@linux.vnet.ibm.com>:
> This is a port on kernel 4.16 of the work done by Peter Zijlstra to
> handle page fault without holding the mm semaphore [1].
>
> The idea is to try to handle user space page faults without holding the
> mmap_sem. This should allow better concurrency for massively threaded
> process since the page fault handler will not wait for other threads memory
> layout change to be done, assuming that this change is done in another part
> of the process's memory space. This type page fault is named speculative
> page fault. If the speculative page fault fails because of a concurrency is
> detected or because underlying PMD or PTE tables are not yet allocating, it
> is failing its processing and a classic page fault is then tried.
>
> The speculative page fault (SPF) has to look for the VMA matching the fault
> address without holding the mmap_sem, this is done by introducing a rwlock
> which protects the access to the mm_rb tree. Previously this was done using
> SRCU but it was introducing a lot of scheduling to process the VMA's
> freeing
> operation which was hitting the performance by 20% as reported by Kemi Wang
> [2].Using a rwlock to protect access to the mm_rb tree is limiting the
> locking contention to these operations which are expected to be in a O(log
> n)
> order. In addition to ensure that the VMA is not freed in our back a
> reference count is added and 2 services (get_vma() and put_vma()) are
> introduced to handle the reference count. When a VMA is fetch from the RB
> tree using get_vma() is must be later freeed using put_vma(). Furthermore,
> to allow the VMA to be used again by the classic page fault handler a
> service is introduced can_reuse_spf_vma(). This service is expected to be
> called with the mmap_sem hold. It checked that the VMA is still matching
> the specified address and is releasing its reference count as the mmap_sem
> is hold it is ensure that it will not be freed in our back. In general, the
> VMA's reference count could be decremented when holding the mmap_sem but it
> should not be increased as holding the mmap_sem is ensuring that the VMA is
> stable. I can't see anymore the overhead I got while will-it-scale
> benchmark anymore.
>
> The VMA's attributes checked during the speculative page fault processing
> have to be protected against parallel changes. This is done by using a per
> VMA sequence lock. This sequence lock allows the speculative page fault
> handler to fast check for parallel changes in progress and to abort the
> speculative page fault in that case.
>
> Once the VMA is found, the speculative page fault handler would check for
> the VMA's attributes to verify that the page fault has to be handled
> correctly or not. Thus the VMA is protected through a sequence lock which
> allows fast detection of concurrent VMA changes. If such a change is
> detected, the speculative page fault is aborted and a *classic* page fault
> is tried.  VMA sequence lockings are added when VMA attributes which are
> checked during the page fault are modified.
>
> When the PTE is fetched, the VMA is checked to see if it has been changed,
> so once the page table is locked, the VMA is valid, so any other changes
> leading to touching this PTE will need to lock the page table, so no
> parallel change is possible at this time.
>
> The locking of the PTE is done with interrupts disabled, this allows to
> check for the PMD to ensure that there is not an ongoing collapsing
> operation. Since khugepaged is firstly set the PMD to pmd_none and then is
> waiting for the other CPU to have catch the IPI interrupt, if the pmd is
> valid at the time the PTE is locked, we have the guarantee that the
> collapsing opertion will have to wait on the PTE lock to move foward. This
> allows the SPF handler to map the PTE safely. If the PMD value is different
> than the one recorded at the beginning of the SPF operation, the classic
> page fault handler will be called to handle the operation while holding the
> mmap_sem. As the PTE lock is done with the interrupts disabled, the lock is
> done using spin_trylock() to avoid dead lock when handling a page fault
> while a TLB invalidate is requested by an other CPU holding the PTE.
>
> Support for THP is not done because when checking for the PMD, we can be
> confused by an in progress collapsing operation done by khugepaged. The
> issue is that pmd_none() could be true either if the PMD is not already
> populated or if the underlying PTE are in the way to be collapsed. So we
> cannot safely allocate a PMD if pmd_none() is true.
>
> This series a new software performance event named 'speculative-faults' or
> 'spf'. It counts the number of successful page fault event handled in a
> speculative way. When recording 'faults,spf' events, the faults one is
> counting the total number of page fault events while 'spf' is only counting
> the part of the faults processed in a speculative way.
>
> There are some trace events introduced by this series. They allow to
> identify why the page faults where not processed in a speculative way. This
> doesn't take in account the faults generated by a monothreaded process
> which directly processed while holding the mmap_sem. This trace events are
> grouped in a system named 'pagefault', they are:
>  - pagefault:spf_pte_lock : if the pte was already locked by another thread
>  - pagefault:spf_vma_changed : if the VMA has been changed in our back
>  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set.
>  - pagefault:spf_vma_notsup : the VMA's type is not supported
>  - pagefault:spf_vma_access : the VMA's access right are not respected
>  - pagefault:spf_pmd_changed : the upper PMD pointer has changed in our
>  back.
>
> To record all the related events, the easier is to run perf with the
> following arguments :
> $ perf stat -e 'faults,spf,pagefault:*' <command>
>
> This series builds on top of v4.16-rc2-mmotm-2018-02-21-14-48 and is
> functional on x86 and PowerPC.
>
> ---------------------
> Real Workload results
>
> As mentioned in previous email, we did non official runs using a "popular
> in memory multithreaded database product" on 176 cores SMT8 Power system
> which showed a 30% improvements in the number of transaction processed per
> second. This run has been done on the v6 series, but changes introduced in
> this new verion should not impact the performance boost seen.
>
> Here are the perf data captured during 2 of these runs on top of the v8
> series:
>                 vanilla         spf
> faults          89.418          101.364
> spf                n/a           97.989
>
> With the SPF kernel, most of the page fault were processed in a speculative
> way.
>
> ------------------
> Benchmarks results
>
> Base kernel is v4.16-rc4-mmotm-2018-03-09-16-34
> SPF is BASE + this series
>
> Kernbench:
> ----------
> Here are the results on a 16 CPUs X86 guest using kernbench on a 4.13-rc4
> kernel (kernel is build 5 times):
>
> Average Half load -j 8
>                  Run    (std deviation)
>                  BASE                   SPF
> Elapsed Time     151.36  (1.40139)      151.748 (1.09716)       0.26%
> User    Time     1023.19 (3.58972)      1027.35 (2.30396)       0.41%
> System  Time     125.026 (1.8547)       124.504 (0.980015)      -0.42%
> Percent CPU      758.2   (5.54076)      758.6   (3.97492)       0.05%
> Context Switches 54924   (453.634)      54851   (382.293)       -0.13%
> Sleeps           105589  (704.581)      105282  (435.502)       -0.29%
>
> Average Optimal load -j 16
>                  Run    (std deviation)
>                  BASE                   SPF
> Elapsed Time     74.804  (1.25139)      74.368  (0.406288)      -0.58%
> User    Time     962.033 (64.5125)      963.93  (66.8797)       0.20%
> System  Time     110.771 (15.0817)      110.387 (14.8989)       -0.35%
> Percent CPU      1045.7  (303.387)      1049.1  (306.255)       0.33%
> Context Switches 76201.8 (22433.1)      76170.4 (22482.9)       -0.04%
> Sleeps           110289  (5024.05)      110220  (5248.58)       -0.06%
>
> During a run on the SPF, perf events were captured:
>  Performance counter stats for '../kernbench -M':
>          510334017      faults
>                200      spf
>                  0      pagefault:spf_pte_lock
>                  0      pagefault:spf_vma_changed
>                  0      pagefault:spf_vma_noanon
>               2174      pagefault:spf_vma_notsup
>                  0      pagefault:spf_vma_access
>                  0      pagefault:spf_pmd_changed
>
> Very few speculative page fault were recorded as most of the processes
> involved are monothreaded (sounds that on this architecture some threads
> were created during the kernel build processing).
>
> Here are the kerbench results on a 80 CPUs Power8 system:
>
> Average Half load -j 40
>                  Run    (std deviation)
>                  BASE                   SPF
> Elapsed Time     116.958 (0.73401)      117.43  (0.927497)      0.40%
> User    Time     4472.35 (7.85792)      4480.16 (19.4909)       0.17%
> System  Time     136.248 (0.587639)     136.922 (1.09058)       0.49%
> Percent CPU      3939.8  (20.6567)      3931.2  (17.2829)       -0.22%
> Context Switches 92445.8 (236.672)      92720.8 (270.118)       0.30%
> Sleeps           318475  (1412.6)       317996  (1819.07)       -0.15%
>
> Average Optimal load -j 80
>                  Run    (std deviation)
>                  BASE                   SPF
> Elapsed Time     106.976 (0.406731)     107.72  (0.329014)      0.70%
> User    Time     5863.47 (1466.45)      5865.38 (1460.27)       0.03%
> System  Time     159.995 (25.0393)      160.329 (24.6921)       0.21%
> Percent CPU      5446.2  (1588.23)      5416    (1565.34)       -0.55%
> Context Switches 223018  (137637)       224867  (139305)        0.83%
> Sleeps           330846  (13127.3)      332348  (15556.9)       0.45%
>
> During a run on the SPF, perf events were captured:
>  Performance counter stats for '../kernbench -M':
>          116612488      faults
>                  0      spf
>                  0      pagefault:spf_pte_lock
>                  0      pagefault:spf_vma_changed
>                  0      pagefault:spf_vma_noanon
>                473      pagefault:spf_vma_notsup
>                  0      pagefault:spf_vma_access
>                  0      pagefault:spf_pmd_changed
>
> Most of the processes involved are monothreaded so SPF is not activated but
> there is no impact on the performance.
>
> Ebizzy:
> -------
> The test is counting the number of records per second it can manage, the
> higher is the best. I run it like this 'ebizzy -mTRp'. To get consistent
> result I repeated the test 100 times and measure the average result. The
> number is the record processes per second, the higher is the best.
>
>                 BASE            SPF             delta
> 16 CPUs x86 VM  14902.6         95905.16        543.55%
> 80 CPUs P8 node 37240.24        78185.67        109.95%
>
> Here are the performance counter read during a run on a 16 CPUs x86 VM:
>  Performance counter stats for './ebizzy -mRTp':
>             888157      faults
>             884773      spf
>                 92      pagefault:spf_pte_lock
>               2379      pagefault:spf_vma_changed
>                  0      pagefault:spf_vma_noanon
>                 80      pagefault:spf_vma_notsup
>                  0      pagefault:spf_vma_access
>                  0      pagefault:spf_pmd_changed
>
> And the ones captured during a run on a 80 CPUs Power node:
>  Performance counter stats for './ebizzy -mRTp':
>             762134      faults
>             728663      spf
>              19101      pagefault:spf_pte_lock
>              13969      pagefault:spf_vma_changed
>                  0      pagefault:spf_vma_noanon
>                272      pagefault:spf_vma_notsup
>                  0      pagefault:spf_vma_access
>                  0      pagefault:spf_pmd_changed
>
> In ebizzy's case most of the page fault were handled in a speculative way,
> leading the ebizzy performance boost.

We ported the SPF to kernel 4.9 in android devices.
For the app launch time, It improves about 15% average. For the apps
which have hundreds of threads, it will be about 20%.

Thanks.

>
> ------------------
> Changes since v8:
>  - Don't check PMD when locking the pte when THP is disabled
>    Thanks to Daniel Jordan for reporting this.
>  - Rebase on 4.16
> Changes since v7:
>  - move pte_map_lock() and pte_spinlock() upper in mm/memory.c (patch 4 &
>    5)
>  - make pte_unmap_same() compatible with the speculative page fault (patch
>    6)
> Changes since v6:
>  - Rename config variable to CONFIG_SPECULATIVE_PAGE_FAULT (patch 1)
>  - Review the way the config variable is set (patch 1 to 3)
>  - Introduce mm_rb_write_*lock() in mm/mmap.c (patch 18)
>  - Merge patch introducing pte try locking in the patch 18.
> Changes since v5:
>  - use rwlock agains the mm RB tree in place of SRCU
>  - add a VMA's reference count to protect VMA while using it without
>    holding the mmap_sem.
>  - check PMD value to detect collapsing operation
>  - don't try speculative page fault for mono threaded processes
>  - try to reuse the fetched VMA if VM_RETRY is returned
>  - go directly to the error path if an error is detected during the SPF
>    path
>  - fix race window when moving VMA in move_vma()
> Changes since v4:
>  - As requested by Andrew Morton, use CONFIG_SPF and define it earlier in
>  the series to ease bisection.
> Changes since v3:
>  - Don't build when CONFIG_SMP is not set
>  - Fixed a lock dependency warning in __vma_adjust()
>  - Use READ_ONCE to access p*d values in handle_speculative_fault()
>  - Call memcp_oom() service in handle_speculative_fault()
> Changes since v2:
>  - Perf event is renamed in PERF_COUNT_SW_SPF
>  - On Power handle do_page_fault()'s cleaning
>  - On Power if the VM_FAULT_ERROR is returned by
>  handle_speculative_fault(), do not retry but jump to the error path
>  - If VMA's flags are not matching the fault, directly returns
>  VM_FAULT_SIGSEGV and not VM_FAULT_RETRY
>  - Check for pud_trans_huge() to avoid speculative path
>  - Handles _vm_normal_page()'s introduced by 6f16211df3bf
>  ("mm/device-public-memory: device memory cache coherent with CPU")
>  - add and review few comments in the code
> Changes since v1:
>  - Remove PERF_COUNT_SW_SPF_FAILED perf event.
>  - Add tracing events to details speculative page fault failures.
>  - Cache VMA fields values which are used once the PTE is unlocked at the
>  end of the page fault events.
>  - Ensure that fields read during the speculative path are written and read
>  using WRITE_ONCE and READ_ONCE.
>  - Add checks at the beginning of the speculative path to abort it if the
>  VMA is known to not be supported.
> Changes since RFC V5 [5]
>  - Port to 4.13 kernel
>  - Merging patch fixing lock dependency into the original patch
>  - Replace the 2 parameters of vma_has_changed() with the vmf pointer
>  - In patch 7, don't call __do_fault() in the speculative path as it may
>  want to unlock the mmap_sem.
>  - In patch 11-12, don't check for vma boundaries when
>  page_add_new_anon_rmap() is called during the spf path and protect against
>  anon_vma pointer's update.
>  - In patch 13-16, add performance events to report number of successful
>  and failed speculative events.
>
> [1]
> http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none
> [2] https://patchwork.kernel.org/patch/9999687/
>
>
> Laurent Dufour (20):
>   mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
>   x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
>   powerpc/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
>   mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
>   mm: make pte_unmap_same compatible with SPF
>   mm: Protect VMA modifications using VMA sequence count
>   mm: protect mremap() against SPF hanlder
>   mm: Protect SPF handler against anon_vma changes
>   mm: Cache some VMA fields in the vm_fault structure
>   mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()
>   mm: Introduce __lru_cache_add_active_or_unevictable
>   mm: Introduce __maybe_mkwrite()
>   mm: Introduce __vm_normal_page()
>   mm: Introduce __page_add_new_anon_rmap()
>   mm: Protect mm_rb tree with a rwlock
>   mm: Adding speculative page fault failure trace events
>   perf: Add a speculative page fault sw event
>   perf tools: Add support for the SPF perf event
>   mm: Speculative page fault handler return VMA
>   powerpc/mm: Add speculative page fault
>
> Peter Zijlstra (4):
>   mm: Prepare for FAULT_FLAG_SPECULATIVE
>   mm: VMA sequence count
>   mm: Provide speculative fault infrastructure
>   x86/mm: Add speculative pagefault handling
>
>  arch/powerpc/Kconfig                  |   1 +
>  arch/powerpc/mm/fault.c               |  31 +-
>  arch/x86/Kconfig                      |   1 +
>  arch/x86/mm/fault.c                   |  38 ++-
>  fs/proc/task_mmu.c                    |   5 +-
>  fs/userfaultfd.c                      |  17 +-
>  include/linux/hugetlb_inline.h        |   2 +-
>  include/linux/migrate.h               |   4 +-
>  include/linux/mm.h                    |  92 +++++-
>  include/linux/mm_types.h              |   7 +
>  include/linux/pagemap.h               |   4 +-
>  include/linux/rmap.h                  |  12 +-
>  include/linux/swap.h                  |  10 +-
>  include/trace/events/pagefault.h      |  87 +++++
>  include/uapi/linux/perf_event.h       |   1 +
>  kernel/fork.c                         |   3 +
>  mm/Kconfig                            |   3 +
>  mm/hugetlb.c                          |   2 +
>  mm/init-mm.c                          |   3 +
>  mm/internal.h                         |  20 ++
>  mm/khugepaged.c                       |   5 +
>  mm/madvise.c                          |   6 +-
>  mm/memory.c                           | 594 ++++++++++++++++++++++++++++++----
>  mm/mempolicy.c                        |  51 ++-
>  mm/migrate.c                          |   4 +-
>  mm/mlock.c                            |  13 +-
>  mm/mmap.c                             | 211 +++++++++---
>  mm/mprotect.c                         |   4 +-
>  mm/mremap.c                           |  13 +
>  mm/rmap.c                             |   5 +-
>  mm/swap.c                             |   6 +-
>  mm/swap_state.c                       |   8 +-
>  tools/include/uapi/linux/perf_event.h |   1 +
>  tools/perf/util/evsel.c               |   1 +
>  tools/perf/util/parse-events.c        |   4 +
>  tools/perf/util/parse-events.l        |   1 +
>  tools/perf/util/python.c              |   1 +
>  37 files changed, 1097 insertions(+), 174 deletions(-)
>  create mode 100644 include/trace/events/pagefault.h
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-13 17:59 ` [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
@ 2018-03-25 21:50   ` David Rientjes
  2018-03-28  7:49     ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-25 21:50 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> This configuration variable will be used to build the code needed to
> handle speculative page fault.
> 
> By default it is turned off, and activated depending on architecture
> support.
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> ---
>  mm/Kconfig | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index abefa573bcd8..07c566c88faf 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -759,3 +759,6 @@ config GUP_BENCHMARK
>  	  performance of get_user_pages_fast().
>  
>  	  See tools/testing/selftests/vm/gup_benchmark.c
> +
> +config SPECULATIVE_PAGE_FAULT
> +       bool

Should this be configurable even if the arch supports it?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 04/24] mm: Prepare for FAULT_FLAG_SPECULATIVE
       [not found] ` <1520963994-28477-5-git-send-email-ldufour@linux.vnet.ibm.com>
@ 2018-03-25 21:50   ` David Rientjes
  2018-03-28 10:27     ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-25 21:50 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> From: Peter Zijlstra <peterz@infradead.org>
> 
> When speculating faults (without holding mmap_sem) we need to validate
> that the vma against which we loaded pages is still valid when we're
> ready to install the new PTE.
> 
> Therefore, replace the pte_offset_map_lock() calls that (re)take the
> PTL with pte_map_lock() which can fail in case we find the VMA changed
> since we started the fault.
> 

Based on how its used, I would have suspected this to be named 
pte_map_trylock().

> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> [Port to 4.12 kernel]
> [Remove the comment about the fault_env structure which has been
>  implemented as the vm_fault structure in the kernel]
> [move pte_map_lock()'s definition upper in the file]
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> ---
>  include/linux/mm.h |  1 +
>  mm/memory.c        | 56 ++++++++++++++++++++++++++++++++++++++----------------
>  2 files changed, 41 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4d02524a7998..2f3e98edc94a 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -300,6 +300,7 @@ extern pgprot_t protection_map[16];
>  #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
>  #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
>  #define FAULT_FLAG_INSTRUCTION  0x100	/* The fault was during an instruction fetch */
> +#define FAULT_FLAG_SPECULATIVE	0x200	/* Speculative fault, not holding mmap_sem */
>  
>  #define FAULT_FLAG_TRACE \
>  	{ FAULT_FLAG_WRITE,		"WRITE" }, \

I think FAULT_FLAG_SPECULATIVE should be introduced in the patch that 
actually uses it.

> diff --git a/mm/memory.c b/mm/memory.c
> index e0ae4999c824..8ac241b9f370 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2288,6 +2288,13 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
>  }
>  EXPORT_SYMBOL_GPL(apply_to_page_range);
>  
> +static bool pte_map_lock(struct vm_fault *vmf)

inline?

> +{
> +	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
> +				       vmf->address, &vmf->ptl);
> +	return true;
> +}
> +
>  /*
>   * handle_pte_fault chooses page fault handler according to an entry which was
>   * read non-atomically.  Before making any commitment, on those architectures
> @@ -2477,6 +2484,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>  	const unsigned long mmun_start = vmf->address & PAGE_MASK;
>  	const unsigned long mmun_end = mmun_start + PAGE_SIZE;
>  	struct mem_cgroup *memcg;
> +	int ret = VM_FAULT_OOM;
>  
>  	if (unlikely(anon_vma_prepare(vma)))
>  		goto oom;
> @@ -2504,7 +2512,11 @@ static int wp_page_copy(struct vm_fault *vmf)
>  	/*
>  	 * Re-check the pte - we dropped the lock
>  	 */
> -	vmf->pte = pte_offset_map_lock(mm, vmf->pmd, vmf->address, &vmf->ptl);
> +	if (!pte_map_lock(vmf)) {
> +		mem_cgroup_cancel_charge(new_page, memcg, false);
> +		ret = VM_FAULT_RETRY;
> +		goto oom_free_new;
> +	}

Ugh, but we aren't oom here, so maybe rename oom_free_new so that it makes 
sense for return values other than VM_FAULT_OOM?

>  	if (likely(pte_same(*vmf->pte, vmf->orig_pte))) {
>  		if (old_page) {
>  			if (!PageAnon(old_page)) {
> @@ -2596,7 +2608,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>  oom:
>  	if (old_page)
>  		put_page(old_page);
> -	return VM_FAULT_OOM;
> +	return ret;
>  }
>  
>  /**
> @@ -2617,8 +2629,8 @@ static int wp_page_copy(struct vm_fault *vmf)
>  int finish_mkwrite_fault(struct vm_fault *vmf)
>  {
>  	WARN_ON_ONCE(!(vmf->vma->vm_flags & VM_SHARED));
> -	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd, vmf->address,
> -				       &vmf->ptl);
> +	if (!pte_map_lock(vmf))
> +		return VM_FAULT_RETRY;
>  	/*
>  	 * We might have raced with another page fault while we released the
>  	 * pte_offset_map_lock.
> @@ -2736,8 +2748,11 @@ static int do_wp_page(struct vm_fault *vmf)
>  			get_page(vmf->page);
>  			pte_unmap_unlock(vmf->pte, vmf->ptl);
>  			lock_page(vmf->page);
> -			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
> -					vmf->address, &vmf->ptl);
> +			if (!pte_map_lock(vmf)) {
> +				unlock_page(vmf->page);
> +				put_page(vmf->page);
> +				return VM_FAULT_RETRY;
> +			}
>  			if (!pte_same(*vmf->pte, vmf->orig_pte)) {
>  				unlock_page(vmf->page);
>  				pte_unmap_unlock(vmf->pte, vmf->ptl);
> @@ -2947,8 +2962,10 @@ int do_swap_page(struct vm_fault *vmf)
>  			 * Back out if somebody else faulted in this pte
>  			 * while we released the pte lock.
>  			 */

Comment needs updating, pte_same() isn't the only reason to bail out here.

> -			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
> -					vmf->address, &vmf->ptl);
> +			if (!pte_map_lock(vmf)) {
> +				delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> +				return VM_FAULT_RETRY;
> +			}
>  			if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
>  				ret = VM_FAULT_OOM;
>  			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);

Not crucial, but it would be nice if this could do goto out instead, 
otherwise this is the first mid function return.

> @@ -3003,8 +3020,11 @@ int do_swap_page(struct vm_fault *vmf)
>  	/*
>  	 * Back out if somebody else already faulted in this pte.
>  	 */

Same as above.

> -	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
> -			&vmf->ptl);
> +	if (!pte_map_lock(vmf)) {
> +		ret = VM_FAULT_RETRY;
> +		mem_cgroup_cancel_charge(page, memcg, false);
> +		goto out_page;
> +	}
>  	if (unlikely(!pte_same(*vmf->pte, vmf->orig_pte)))
>  		goto out_nomap;
>  

mem_cgroup_try_charge() is done before grabbing pte_offset_map_lock(), why 
does the out_nomap exit path do mem_cgroup_cancel_charge(); 
pte_unmap_unlock()?  If the pte lock can be droppde first, there's no need 
to embed the mem_cgroup_cancel_charge() here.

> @@ -3133,8 +3153,8 @@ static int do_anonymous_page(struct vm_fault *vmf)
>  			!mm_forbids_zeropage(vma->vm_mm)) {
>  		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
>  						vma->vm_page_prot));
> -		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
> -				vmf->address, &vmf->ptl);
> +		if (!pte_map_lock(vmf))
> +			return VM_FAULT_RETRY;
>  		if (!pte_none(*vmf->pte))
>  			goto unlock;
>  		ret = check_stable_address_space(vma->vm_mm);
> @@ -3169,8 +3189,11 @@ static int do_anonymous_page(struct vm_fault *vmf)
>  	if (vma->vm_flags & VM_WRITE)
>  		entry = pte_mkwrite(pte_mkdirty(entry));
>  
> -	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
> -			&vmf->ptl);
> +	if (!pte_map_lock(vmf)) {
> +		mem_cgroup_cancel_charge(page, memcg, false);
> +		put_page(page);
> +		return VM_FAULT_RETRY;
> +	}
>  	if (!pte_none(*vmf->pte))
>  		goto release;
>  

This is more spaghetti, can the exit path be fixed up so we order things 
consistently for all gotos?

> @@ -3294,8 +3317,9 @@ static int pte_alloc_one_map(struct vm_fault *vmf)
>  	 * pte_none() under vmf->ptl protection when we return to
>  	 * alloc_set_pte().
>  	 */
> -	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
> -			&vmf->ptl);
> +	if (!pte_map_lock(vmf))
> +		return VM_FAULT_RETRY;
> +
>  	return 0;
>  }
>  

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 05/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
  2018-03-13 17:59 ` [PATCH v9 05/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE Laurent Dufour
@ 2018-03-25 21:50   ` David Rientjes
  2018-03-28  8:15     ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-25 21:50 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> When handling page fault without holding the mmap_sem the fetch of the
> pte lock pointer and the locking will have to be done while ensuring
> that the VMA is not touched in our back.
> 
> So move the fetch and locking operations in a dedicated function.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> ---
>  mm/memory.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 8ac241b9f370..21b1212a0892 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2288,6 +2288,13 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
>  }
>  EXPORT_SYMBOL_GPL(apply_to_page_range);
>  
> +static bool pte_spinlock(struct vm_fault *vmf)

inline?

> +{
> +	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
> +	spin_lock(vmf->ptl);
> +	return true;
> +}
> +
>  static bool pte_map_lock(struct vm_fault *vmf)
>  {
>  	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,

Shouldn't pte_unmap_same() take struct vm_fault * and use the new 
pte_spinlock()?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-03-21 12:21     ` Laurent Dufour
@ 2018-03-25 22:10       ` David Rientjes
  2018-03-28 13:30         ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-25 22:10 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp

On Wed, 21 Mar 2018, Laurent Dufour wrote:

> I found the root cause of this lockdep warning.
> 
> In mmap_region(), unmap_region() may be called while vma_link() has not been
> called. This happens during the error path if call_mmap() failed.
> 
> The only to fix that particular case is to call
> seqcount_init(&vma->vm_sequence) when initializing the vma in mmap_region().
> 

Ack, although that would require a fixup to dup_mmap() as well.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 24/24] powerpc/mm: Add speculative page fault
  2018-03-13 17:59 ` [PATCH v9 24/24] powerpc/mm: Add speculative page fault Laurent Dufour
@ 2018-03-26 21:39   ` David Rientjes
  0 siblings, 0 replies; 98+ messages in thread
From: David Rientjes @ 2018-03-26 21:39 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 866446cf2d9a..104f3cc86b51 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -392,6 +392,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  			   unsigned long error_code)
>  {
>  	struct vm_area_struct * vma;
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	struct vm_area_struct *spf_vma = NULL;
> +#endif
>  	struct mm_struct *mm = current->mm;
>  	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
>   	int is_exec = TRAP(regs) == 0x400;
> @@ -459,6 +462,20 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  	if (is_exec)
>  		flags |= FAULT_FLAG_INSTRUCTION;
>  
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	if (is_user && (atomic_read(&mm->mm_users) > 1)) {
> +		/* let's try a speculative page fault without grabbing the
> +		 * mmap_sem.
> +		 */
> +		fault = handle_speculative_fault(mm, address, flags, &spf_vma);
> +		if (!(fault & VM_FAULT_RETRY)) {
> +			perf_sw_event(PERF_COUNT_SW_SPF, 1,
> +				      regs, address);
> +			goto done;
> +		}
> +	}
> +#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
> +

Can't you elimiate all #ifdef's in this patch if 
handle_speculative_fault() can be passed is_user and return some error 
code that fallback is needed?  Maybe reuse VM_FAULT_FALLBACK?

>  	/* When running in the kernel we expect faults to occur only to
>  	 * addresses in user space.  All other faults represent errors in the
>  	 * kernel and should generate an OOPS.  Unfortunately, in the case of an
> @@ -489,7 +506,16 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  		might_sleep();
>  	}
>  
> -	vma = find_vma(mm, address);
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	if (spf_vma) {
> +		if (can_reuse_spf_vma(spf_vma, address))
> +			vma = spf_vma;
> +		else
> +			vma =  find_vma(mm, address);
> +		spf_vma = NULL;
> +	} else
> +#endif
> +		vma = find_vma(mm, address);
>  	if (unlikely(!vma))
>  		return bad_area(regs, address);
>  	if (likely(vma->vm_start <= address))

I think the code quality here could be improved such that you can pass mm, 
&spf_vma, and address and some helper function would return spf_vma if 
can_reuse_spf_vma() is true (and do *spf_vma to NULL) or otherwise return 
find_vma(mm, address).

Also, spf_vma is being set to NULL because of VM_FAULT_RETRY, but does it 
make sense to retry handle_speculative_fault() in this case since we've 
dropped mm->mmap_sem and there may have been a writer queued behind it?

> @@ -568,6 +594,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  
>  	up_read(&current->mm->mmap_sem);
>  
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +done:
> +#endif
>  	if (unlikely(fault & VM_FAULT_ERROR))
>  		return mm_fault_error(regs, address, fault);
>  

And things like this are trivially handled by doing

done: __maybe_unused

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 23/24] x86/mm: Add speculative pagefault handling
  2018-03-13 17:59 ` [PATCH v9 23/24] x86/mm: Add speculative pagefault handling Laurent Dufour
@ 2018-03-26 21:41   ` David Rientjes
  0 siblings, 0 replies; 98+ messages in thread
From: David Rientjes @ 2018-03-26 21:41 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index e6af2b464c3d..a73cf227edd6 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1239,6 +1239,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  		unsigned long address)
>  {
>  	struct vm_area_struct *vma;
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	struct vm_area_struct *spf_vma = NULL;
> +#endif
>  	struct task_struct *tsk;
>  	struct mm_struct *mm;
>  	int fault, major = 0;
> @@ -1332,6 +1335,27 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  	if (error_code & X86_PF_INSTR)
>  		flags |= FAULT_FLAG_INSTRUCTION;
>  
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	if ((error_code & X86_PF_USER) && (atomic_read(&mm->mm_users) > 1)) {
> +		fault = handle_speculative_fault(mm, address, flags,
> +						 &spf_vma);
> +
> +		if (!(fault & VM_FAULT_RETRY)) {
> +			if (!(fault & VM_FAULT_ERROR)) {
> +				perf_sw_event(PERF_COUNT_SW_SPF, 1,
> +					      regs, address);
> +				goto done;
> +			}
> +			/*
> +			 * In case of error we need the pkey value, but
> +			 * can't get it from the spf_vma as it is only returned
> +			 * when VM_FAULT_RETRY is returned. So we have to
> +			 * retry the page fault with the mmap_sem grabbed.
> +			 */
> +		}
> +	}
> +#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */

All the comments from the powerpc version will apply here as well, the 
only interesting point is whether VM_FAULT_FALLBACK can be returned from 
handle_speculative_fault() to indicate its not possible.

> +
>  	/*
>  	 * When running in the kernel we expect faults to occur only to
>  	 * addresses in user space.  All other faults represent errors in
> @@ -1365,7 +1389,16 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  		might_sleep();
>  	}
>  
> -	vma = find_vma(mm, address);
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	if (spf_vma) {
> +		if (can_reuse_spf_vma(spf_vma, address))
> +			vma = spf_vma;
> +		else
> +			vma = find_vma(mm, address);
> +		spf_vma = NULL;
> +	} else
> +#endif
> +		vma = find_vma(mm, address);
>  	if (unlikely(!vma)) {
>  		bad_area(regs, error_code, address);
>  		return;
> @@ -1451,6 +1484,9 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  		return;
>  	}
>  
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +done:
> +#endif
>  	/*
>  	 * Major/minor page fault accounting. If any of the events
>  	 * returned VM_FAULT_MAJOR, we account it as a major fault.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 20/24] perf: Add a speculative page fault sw event
  2018-03-13 17:59 ` [PATCH v9 20/24] perf: Add a speculative page fault sw event Laurent Dufour
@ 2018-03-26 21:43   ` David Rientjes
  0 siblings, 0 replies; 98+ messages in thread
From: David Rientjes @ 2018-03-26 21:43 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> Add a new software event to count succeeded speculative page faults.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 21/24] perf tools: Add support for the SPF perf event
  2018-03-13 17:59 ` [PATCH v9 21/24] perf tools: Add support for the SPF perf event Laurent Dufour
@ 2018-03-26 21:44   ` David Rientjes
  2018-03-27  3:49     ` Andi Kleen
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-26 21:44 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> Add support for the new speculative faults event.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>

Aside: should there be a new spec_flt field for struct task_struct that 
complements maj_flt and min_flt and be exported through /proc/pid/stat?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 21/24] perf tools: Add support for the SPF perf event
  2018-03-26 21:44   ` David Rientjes
@ 2018-03-27  3:49     ` Andi Kleen
  2018-04-10  6:47       ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Andi Kleen @ 2018-03-27  3:49 UTC (permalink / raw)
  To: David Rientjes
  Cc: Laurent Dufour, paulmck, peterz, akpm, kirill, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86

On Mon, Mar 26, 2018 at 02:44:48PM -0700, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
> > Add support for the new speculative faults event.
> > 
> > Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> 
> Acked-by: David Rientjes <rientjes@google.com>
> 
> Aside: should there be a new spec_flt field for struct task_struct that 
> complements maj_flt and min_flt and be exported through /proc/pid/stat?

No. task_struct is already too bloated. If you need per process tracking 
you can always get it through trace points.

-Andi

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-03-13 17:59 ` [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF Laurent Dufour
@ 2018-03-27 21:18   ` David Rientjes
  2018-03-28  8:27     ` Laurent Dufour
  2018-04-03 19:10   ` Jerome Glisse
  1 sibling, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-27 21:18 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2f3e98edc94a..b6432a261e63 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1199,6 +1199,7 @@ static inline void clear_page_pfmemalloc(struct page *page)
>  #define VM_FAULT_NEEDDSYNC  0x2000	/* ->fault did not modify page tables
>  					 * and needs fsync() to complete (for
>  					 * synchronous page faults in DAX) */
> +#define VM_FAULT_PTNOTSAME 0x4000	/* Page table entries have changed */
>  
>  #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
>  			 VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE | \
> diff --git a/mm/memory.c b/mm/memory.c
> index 21b1212a0892..4bc7b0bdcb40 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
>   * parts, do_swap_page must check under lock before unmapping the pte and
>   * proceeding (but do_wp_page is only called after already making such a check;
>   * and do_anonymous_page can safely check later on).
> + *
> + * pte_unmap_same() returns:
> + *	0			if the PTE are the same
> + *	VM_FAULT_PTNOTSAME	if the PTE are different
> + *	VM_FAULT_RETRY		if the VMA has changed in our back during
> + *				a speculative page fault handling.
>   */
> -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> -				pte_t *page_table, pte_t orig_pte)
> +static inline int pte_unmap_same(struct vm_fault *vmf)
>  {
> -	int same = 1;
> +	int ret = 0;
> +
>  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
>  	if (sizeof(pte_t) > sizeof(unsigned long)) {
> -		spinlock_t *ptl = pte_lockptr(mm, pmd);
> -		spin_lock(ptl);
> -		same = pte_same(*page_table, orig_pte);
> -		spin_unlock(ptl);
> +		if (pte_spinlock(vmf)) {
> +			if (!pte_same(*vmf->pte, vmf->orig_pte))
> +				ret = VM_FAULT_PTNOTSAME;
> +			spin_unlock(vmf->ptl);
> +		} else
> +			ret = VM_FAULT_RETRY;
>  	}
>  #endif
> -	pte_unmap(page_table);
> -	return same;
> +	pte_unmap(vmf->pte);
> +	return ret;
>  }
>  
>  static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
>  	int exclusive = 0;
>  	int ret = 0;

Initialization is now unneeded.

Otherwise:

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 07/24] mm: VMA sequence count
       [not found] ` <1520963994-28477-8-git-send-email-ldufour@linux.vnet.ibm.com>
  2018-03-17  7:51   ` [mm] b1f0502d04: INFO:trying_to_register_non-static_key kernel test robot
@ 2018-03-27 21:30   ` David Rientjes
  2018-03-28 17:58     ` Laurent Dufour
  1 sibling, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-27 21:30 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/mm/mmap.c b/mm/mmap.c
> index faf85699f1a1..5898255d0aeb 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -558,6 +558,10 @@ void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
>  	else
>  		mm->highest_vm_end = vm_end_gap(vma);
>  
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	seqcount_init(&vma->vm_sequence);
> +#endif
> +
>  	/*
>  	 * vma->vm_prev wasn't known when we followed the rbtree to find the
>  	 * correct insertion point for that vma. As a result, we could not
> @@ -692,6 +696,30 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  	long adjust_next = 0;
>  	int remove_next = 0;
>  
> +	/*
> +	 * Why using vm_raw_write*() functions here to avoid lockdep's warning ?
> +	 *
> +	 * Locked is complaining about a theoretical lock dependency, involving
> +	 * 3 locks:
> +	 *   mapping->i_mmap_rwsem --> vma->vm_sequence --> fs_reclaim
> +	 *
> +	 * Here are the major path leading to this dependency :
> +	 *  1. __vma_adjust() mmap_sem  -> vm_sequence -> i_mmap_rwsem
> +	 *  2. move_vmap() mmap_sem -> vm_sequence -> fs_reclaim
> +	 *  3. __alloc_pages_nodemask() fs_reclaim -> i_mmap_rwsem
> +	 *  4. unmap_mapping_range() i_mmap_rwsem -> vm_sequence
> +	 *
> +	 * So there is no way to solve this easily, especially because in
> +	 * unmap_mapping_range() the i_mmap_rwsem is grab while the impacted
> +	 * VMAs are not yet known.
> +	 * However, the way the vm_seq is used is guarantying that we will
> +	 * never block on it since we just check for its value and never wait
> +	 * for it to move, see vma_has_changed() and handle_speculative_fault().
> +	 */
> +	vm_raw_write_begin(vma);
> +	if (next)
> +		vm_raw_write_begin(next);
> +
>  	if (next && !insert) {
>  		struct vm_area_struct *exporter = NULL, *importer = NULL;
>  

Eek, what about later on:

		/*
		 * Easily overlooked: when mprotect shifts the boundary,
		 * make sure the expanding vma has anon_vma set if the
		 * shrinking vma had, to cover any anon pages imported.
		 */
		if (exporter && exporter->anon_vma && !importer->anon_vma) {
			int error;

			importer->anon_vma = exporter->anon_vma;
			error = anon_vma_clone(importer, exporter);
			if (error)
				return error;
		}

This needs

if (error) {
	if (next && next != vma)
		vm_raw_write_end(next);
	vm_raw_write_end(vma);
	return error;
}

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count
  2018-03-13 17:59 ` [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count Laurent Dufour
@ 2018-03-27 21:45   ` David Rientjes
  2018-03-28 16:57     ` Laurent Dufour
  2018-03-27 21:57   ` David Rientjes
  1 sibling, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-27 21:45 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 65ae54659833..a2d9c87b7b0b 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1136,8 +1136,11 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
>  					goto out_mm;
>  				}
>  				for (vma = mm->mmap; vma; vma = vma->vm_next) {
> -					vma->vm_flags &= ~VM_SOFTDIRTY;
> +					vm_write_begin(vma);
> +					WRITE_ONCE(vma->vm_flags,
> +						   vma->vm_flags & ~VM_SOFTDIRTY);
>  					vma_set_page_prot(vma);
> +					vm_write_end(vma);
>  				}
>  				downgrade_write(&mm->mmap_sem);
>  				break;
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index cec550c8468f..b8212ba17695 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -659,8 +659,11 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)
>  
>  	octx = vma->vm_userfaultfd_ctx.ctx;
>  	if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) {
> +		vm_write_begin(vma);
>  		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
> -		vma->vm_flags &= ~(VM_UFFD_WP | VM_UFFD_MISSING);
> +		WRITE_ONCE(vma->vm_flags,
> +			   vma->vm_flags & ~(VM_UFFD_WP | VM_UFFD_MISSING));
> +		vm_write_end(vma);
>  		return 0;
>  	}
>  

In several locations in this patch vm_write_begin(vma) -> 
vm_write_end(vma) is nesting things other than vma->vm_flags, 
vma->vm_policy, etc.  I think it's better to do vm_write_end(vma) as soon 
as the members that the seqcount protects are modified.  In other words, 
this isn't offering protection for vma->vm_userfaultfd_ctx.  There are 
several examples of this in the patch.

> @@ -885,8 +888,10 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
>  			vma = prev;
>  		else
>  			prev = vma;
> -		vma->vm_flags = new_flags;
> +		vm_write_begin(vma);
> +		WRITE_ONCE(vma->vm_flags, new_flags);
>  		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
> +		vm_write_end(vma);
>  	}
>  	up_write(&mm->mmap_sem);
>  	mmput(mm);
> @@ -1434,8 +1439,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  		 * the next vma was merged into the current one and
>  		 * the current one has not been updated yet.
>  		 */
> -		vma->vm_flags = new_flags;
> +		vm_write_begin(vma);
> +		WRITE_ONCE(vma->vm_flags, new_flags);
>  		vma->vm_userfaultfd_ctx.ctx = ctx;
> +		vm_write_end(vma);
>  
>  	skip:
>  		prev = vma;
> @@ -1592,8 +1599,10 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  		 * the next vma was merged into the current one and
>  		 * the current one has not been updated yet.
>  		 */
> -		vma->vm_flags = new_flags;
> +		vm_write_begin(vma);
> +		WRITE_ONCE(vma->vm_flags, new_flags);
>  		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
> +		vm_write_end(vma);
>  
>  	skip:
>  		prev = vma;
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index b7e2268dfc9a..32314e9e48dd 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1006,6 +1006,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	if (mm_find_pmd(mm, address) != pmd)
>  		goto out;
>  
> +	vm_write_begin(vma);
>  	anon_vma_lock_write(vma->anon_vma);
>  
>  	pte = pte_offset_map(pmd, address);
> @@ -1041,6 +1042,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>  		pmd_populate(mm, pmd, pmd_pgtable(_pmd));
>  		spin_unlock(pmd_ptl);
>  		anon_vma_unlock_write(vma->anon_vma);
> +		vm_write_end(vma);
>  		result = SCAN_FAIL;
>  		goto out;
>  	}
> @@ -1075,6 +1077,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	set_pmd_at(mm, address, pmd, _pmd);
>  	update_mmu_cache_pmd(vma, address, pmd);
>  	spin_unlock(pmd_ptl);
> +	vm_write_end(vma);
>  
>  	*hpage = NULL;
>  
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 4d3c922ea1a1..e328f7ab5942 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -184,7 +184,9 @@ static long madvise_behavior(struct vm_area_struct *vma,
>  	/*
>  	 * vm_flags is protected by the mmap_sem held in write mode.
>  	 */
> -	vma->vm_flags = new_flags;
> +	vm_write_begin(vma);
> +	WRITE_ONCE(vma->vm_flags, new_flags);
> +	vm_write_end(vma);
>  out:
>  	return error;
>  }
> @@ -450,9 +452,11 @@ static void madvise_free_page_range(struct mmu_gather *tlb,
>  		.private = tlb,
>  	};
>  
> +	vm_write_begin(vma);
>  	tlb_start_vma(tlb, vma);
>  	walk_page_range(addr, end, &free_walk);
>  	tlb_end_vma(tlb, vma);
> +	vm_write_end(vma);
>  }
>  
>  static int madvise_free_single_vma(struct vm_area_struct *vma,
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index e0e706f0b34e..2632c6f93b63 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -380,8 +380,11 @@ void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
>  	struct vm_area_struct *vma;
>  
>  	down_write(&mm->mmap_sem);
> -	for (vma = mm->mmap; vma; vma = vma->vm_next)
> +	for (vma = mm->mmap; vma; vma = vma->vm_next) {
> +		vm_write_begin(vma);
>  		mpol_rebind_policy(vma->vm_policy, new);
> +		vm_write_end(vma);
> +	}
>  	up_write(&mm->mmap_sem);
>  }
>  
> @@ -554,9 +557,11 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
>  {
>  	int nr_updated;
>  
> +	vm_write_begin(vma);
>  	nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1);
>  	if (nr_updated)
>  		count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated);
> +	vm_write_end(vma);
>  
>  	return nr_updated;
>  }
> @@ -657,6 +662,7 @@ static int vma_replace_policy(struct vm_area_struct *vma,
>  	if (IS_ERR(new))
>  		return PTR_ERR(new);
>  
> +	vm_write_begin(vma);
>  	if (vma->vm_ops && vma->vm_ops->set_policy) {
>  		err = vma->vm_ops->set_policy(vma, new);
>  		if (err)
> @@ -664,11 +670,17 @@ static int vma_replace_policy(struct vm_area_struct *vma,
>  	}
>  
>  	old = vma->vm_policy;
> -	vma->vm_policy = new; /* protected by mmap_sem */
> +	/*
> +	 * The speculative page fault handler access this field without
> +	 * hodling the mmap_sem.
> +	 */

"The speculative page fault handler accesses this field without holding 
vma->vm_mm->mmap_sem"

> +	WRITE_ONCE(vma->vm_policy,  new);
> +	vm_write_end(vma);
>  	mpol_put(old);
>  
>  	return 0;
>   err_out:
> +	vm_write_end(vma);
>  	mpol_put(new);
>  	return err;
>  }

Wait, doesn't vma_dup_policy() also need to protect dst->vm_policy?

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2121,7 +2121,9 @@ int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
 
 	if (IS_ERR(pol))
 		return PTR_ERR(pol);
-	dst->vm_policy = pol;
+	vm_write_begin(dst);
+	WRITE_ONCE(dst->vm_policy, pol);
+	vm_write_end(dst);
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count
  2018-03-13 17:59 ` [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count Laurent Dufour
  2018-03-27 21:45   ` David Rientjes
@ 2018-03-27 21:57   ` David Rientjes
  2018-03-28 17:10     ` Laurent Dufour
  1 sibling, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-27 21:57 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, Andrew Morton, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/mm/mmap.c b/mm/mmap.c
> index 5898255d0aeb..d6533cb85213 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -847,17 +847,18 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  	}
>  
>  	if (start != vma->vm_start) {
> -		vma->vm_start = start;
> +		WRITE_ONCE(vma->vm_start, start);
>  		start_changed = true;
>  	}
>  	if (end != vma->vm_end) {
> -		vma->vm_end = end;
> +		WRITE_ONCE(vma->vm_end, end);
>  		end_changed = true;
>  	}
> -	vma->vm_pgoff = pgoff;
> +	WRITE_ONCE(vma->vm_pgoff, pgoff);
>  	if (adjust_next) {
> -		next->vm_start += adjust_next << PAGE_SHIFT;
> -		next->vm_pgoff += adjust_next;
> +		WRITE_ONCE(next->vm_start,
> +			   next->vm_start + (adjust_next << PAGE_SHIFT));
> +		WRITE_ONCE(next->vm_pgoff, next->vm_pgoff + adjust_next);
>  	}
>  
>  	if (root) {
> @@ -1781,6 +1782,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  out:
>  	perf_event_mmap(vma);
>  
> +	vm_write_begin(vma);
>  	vm_stat_account(mm, vm_flags, len >> PAGE_SHIFT);
>  	if (vm_flags & VM_LOCKED) {
>  		if (!((vm_flags & VM_SPECIAL) || is_vm_hugetlb_page(vma) ||
> @@ -1803,6 +1805,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	vma->vm_flags |= VM_SOFTDIRTY;
>  
>  	vma_set_page_prot(vma);
> +	vm_write_end(vma);
>  
>  	return addr;
>  

Shouldn't this also protect vma->vm_flags?

diff --git a/mm/mmap.c b/mm/mmap.c
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1796,7 +1796,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 					vma == get_gate_vma(current->mm)))
 			mm->locked_vm += (len >> PAGE_SHIFT);
 		else
-			vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
+			WRITE_ONCE(vma->vm_flags,
+				   vma->vm_flags & VM_LOCKED_CLEAR_MASK);
 	}
 
 	if (file)
@@ -1809,7 +1810,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	 * then new mapped in-place (which must be aimed as
 	 * a completely new data area).
 	 */
-	vma->vm_flags |= VM_SOFTDIRTY;
+	WRITE_ONCE(vma->vm_flags, vma->vm_flags | VM_SOFTDIRTY);
 
 	vma_set_page_prot(vma);
 	vm_write_end(vma);

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 09/24] mm: protect mremap() against SPF hanlder
  2018-03-13 17:59 ` [PATCH v9 09/24] mm: protect mremap() against SPF hanlder Laurent Dufour
@ 2018-03-27 22:12   ` David Rientjes
  2018-03-28 18:11     ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-27 22:12 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 88042d843668..ef6ef0627090 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2189,16 +2189,24 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
>  extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
>  extern int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert,
> -	struct vm_area_struct *expand);
> +	struct vm_area_struct *expand, bool keep_locked);
>  static inline int vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert)
>  {
> -	return __vma_adjust(vma, start, end, pgoff, insert, NULL);
> +	return __vma_adjust(vma, start, end, pgoff, insert, NULL, false);
>  }
> -extern struct vm_area_struct *vma_merge(struct mm_struct *,
> +extern struct vm_area_struct *__vma_merge(struct mm_struct *,
>  	struct vm_area_struct *prev, unsigned long addr, unsigned long end,
>  	unsigned long vm_flags, struct anon_vma *, struct file *, pgoff_t,
> -	struct mempolicy *, struct vm_userfaultfd_ctx);
> +	struct mempolicy *, struct vm_userfaultfd_ctx, bool keep_locked);
> +static inline struct vm_area_struct *vma_merge(struct mm_struct *vma,
> +	struct vm_area_struct *prev, unsigned long addr, unsigned long end,
> +	unsigned long vm_flags, struct anon_vma *anon, struct file *file,
> +	pgoff_t off, struct mempolicy *pol, struct vm_userfaultfd_ctx uff)
> +{
> +	return __vma_merge(vma, prev, addr, end, vm_flags, anon, file, off,
> +			   pol, uff, false);
> +}

The first formal to vma_merge() is an mm, not a vma.

This area could use an uncluttering.

>  extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
>  extern int __split_vma(struct mm_struct *, struct vm_area_struct *,
>  	unsigned long addr, int new_below);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index d6533cb85213..ac32b577a0c9 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -684,7 +684,7 @@ static inline void __vma_unlink_prev(struct mm_struct *mm,
>   */
>  int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert,
> -	struct vm_area_struct *expand)
> +	struct vm_area_struct *expand, bool keep_locked)
>  {
>  	struct mm_struct *mm = vma->vm_mm;
>  	struct vm_area_struct *next = vma->vm_next, *orig_vma = vma;
> @@ -996,7 +996,8 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  
>  	if (next && next != vma)
>  		vm_raw_write_end(next);
> -	vm_raw_write_end(vma);
> +	if (!keep_locked)
> +		vm_raw_write_end(vma);
>  
>  	validate_mm(mm);
>  

This will require a fixup for the following patch where a retval from 
anon_vma_close() can also return without vma locked even though 
keep_locked == true.

How does vma_merge() handle that error wrt vm_raw_write_begin(vma)?

> @@ -1132,12 +1133,13 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
>   * parameter) may establish ptes with the wrong permissions of NNNN
>   * instead of the right permissions of XXXX.
>   */
> -struct vm_area_struct *vma_merge(struct mm_struct *mm,
> +struct vm_area_struct *__vma_merge(struct mm_struct *mm,
>  			struct vm_area_struct *prev, unsigned long addr,
>  			unsigned long end, unsigned long vm_flags,
>  			struct anon_vma *anon_vma, struct file *file,
>  			pgoff_t pgoff, struct mempolicy *policy,
> -			struct vm_userfaultfd_ctx vm_userfaultfd_ctx)
> +			struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
> +			bool keep_locked)
>  {
>  	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
>  	struct vm_area_struct *area, *next;
> @@ -1185,10 +1187,11 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
>  							/* cases 1, 6 */
>  			err = __vma_adjust(prev, prev->vm_start,
>  					 next->vm_end, prev->vm_pgoff, NULL,
> -					 prev);
> +					 prev, keep_locked);
>  		} else					/* cases 2, 5, 7 */
>  			err = __vma_adjust(prev, prev->vm_start,
> -					 end, prev->vm_pgoff, NULL, prev);
> +					   end, prev->vm_pgoff, NULL, prev,
> +					   keep_locked);
>  		if (err)
>  			return NULL;
>  		khugepaged_enter_vma_merge(prev, vm_flags);
> @@ -1205,10 +1208,12 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
>  					     vm_userfaultfd_ctx)) {
>  		if (prev && addr < prev->vm_end)	/* case 4 */
>  			err = __vma_adjust(prev, prev->vm_start,
> -					 addr, prev->vm_pgoff, NULL, next);
> +					 addr, prev->vm_pgoff, NULL, next,
> +					 keep_locked);
>  		else {					/* cases 3, 8 */
>  			err = __vma_adjust(area, addr, next->vm_end,
> -					 next->vm_pgoff - pglen, NULL, next);
> +					 next->vm_pgoff - pglen, NULL, next,
> +					 keep_locked);
>  			/*
>  			 * In case 3 area is already equal to next and
>  			 * this is a noop, but in case 8 "area" has
> @@ -3163,9 +3168,20 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>  
>  	if (find_vma_links(mm, addr, addr + len, &prev, &rb_link, &rb_parent))
>  		return NULL;	/* should never get here */
> -	new_vma = vma_merge(mm, prev, addr, addr + len, vma->vm_flags,
> -			    vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
> -			    vma->vm_userfaultfd_ctx);
> +
> +	/* There is 3 cases to manage here in
> +	 *     AAAA            AAAA              AAAA              AAAA
> +	 * PPPP....      PPPP......NNNN      PPPP....NNNN      PP........NN
> +	 * PPPPPPPP(A)   PPPP..NNNNNNNN(B)   PPPPPPPPPPPP(1)       NULL
> +	 *                                   PPPPPPPPNNNN(2)
> +	 *				     PPPPNNNNNNNN(3)
> +	 *
> +	 * new_vma == prev in case A,1,2
> +	 * new_vma == next in case B,3
> +	 */

Interleaved tabs and whitespace.

> +	new_vma = __vma_merge(mm, prev, addr, addr + len, vma->vm_flags,
> +			      vma->anon_vma, vma->vm_file, pgoff,
> +			      vma_policy(vma), vma->vm_userfaultfd_ctx, true);
>  	if (new_vma) {
>  		/*
>  		 * Source vma may have been merged into new_vma
> @@ -3205,6 +3221,15 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>  			get_file(new_vma->vm_file);
>  		if (new_vma->vm_ops && new_vma->vm_ops->open)
>  			new_vma->vm_ops->open(new_vma);
> +		/*
> +		 * As the VMA is linked right now, it may be hit by the
> +		 * speculative page fault handler. But we don't want it to
> +		 * to start mapping page in this area until the caller has
> +		 * potentially move the pte from the moved VMA. To prevent
> +		 * that we protect it right now, and let the caller unprotect
> +		 * it once the move is done.
> +		 */
> +		vm_raw_write_begin(new_vma);
>  		vma_link(mm, new_vma, prev, rb_link, rb_parent);
>  		*need_rmap_locks = false;
>  	}
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 049470aa1e3e..8ed1a1d6eaed 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -302,6 +302,14 @@ static unsigned long move_vma(struct vm_area_struct *vma,
>  	if (!new_vma)
>  		return -ENOMEM;
>  
> +	/* new_vma is returned protected by copy_vma, to prevent speculative
> +	 * page fault to be done in the destination area before we move the pte.
> +	 * Now, we must also protect the source VMA since we don't want pages
> +	 * to be mapped in our back while we are copying the PTEs.
> +	 */
> +	if (vma != new_vma)
> +		vm_raw_write_begin(vma);
> +
>  	moved_len = move_page_tables(vma, old_addr, new_vma, new_addr, old_len,
>  				     need_rmap_locks);
>  	if (moved_len < old_len) {
> @@ -318,6 +326,8 @@ static unsigned long move_vma(struct vm_area_struct *vma,
>  		 */
>  		move_page_tables(new_vma, new_addr, vma, old_addr, moved_len,
>  				 true);
> +		if (vma != new_vma)
> +			vm_raw_write_end(vma);
>  		vma = new_vma;
>  		old_len = new_len;
>  		old_addr = new_addr;
> @@ -326,7 +336,10 @@ static unsigned long move_vma(struct vm_area_struct *vma,
>  		mremap_userfaultfd_prep(new_vma, uf);
>  		arch_remap(mm, old_addr, old_addr + old_len,
>  			   new_addr, new_addr + new_len);
> +		if (vma != new_vma)
> +			vm_raw_write_end(vma);
>  	}
> +	vm_raw_write_end(new_vma);

Just do

vm_raw_write_end(vma);
vm_raw_write_end(new_vma);

here.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-25 21:50   ` David Rientjes
@ 2018-03-28  7:49     ` Laurent Dufour
  2018-03-28 10:16       ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28  7:49 UTC (permalink / raw)
  To: David Rientjes, Thomas Gleixner
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Ingo Molnar, hpa, Will Deacon,
	Sergey Senozhatsky, Andrea Arcangeli, Alexei Starovoitov,
	kemi.wang, sergey.senozhatsky.work, Daniel Jordan, linux-kernel,
	linux-mm, haren, khandual, npiggin, bsingharora, Tim Chen,
	linuxppc-dev, x86

Hi David,

Thanks a lot for your deep review on this series.

On 25/03/2018 23:50, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> This configuration variable will be used to build the code needed to
>> handle speculative page fault.
>>
>> By default it is turned off, and activated depending on architecture
>> support.
>>
>> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>> ---
>>  mm/Kconfig | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index abefa573bcd8..07c566c88faf 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -759,3 +759,6 @@ config GUP_BENCHMARK
>>  	  performance of get_user_pages_fast().
>>  
>>  	  See tools/testing/selftests/vm/gup_benchmark.c
>> +
>> +config SPECULATIVE_PAGE_FAULT
>> +       bool
> 
> Should this be configurable even if the arch supports it?

Actually, this is not configurable unless by manually editing the .config file.

I made it this way on the Thomas's request :
https://lkml.org/lkml/2018/1/15/969

That sounds to be the smarter way to achieve that, isn't it ?

Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 05/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
  2018-03-25 21:50   ` David Rientjes
@ 2018-03-28  8:15     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28  8:15 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 25/03/2018 23:50, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> When handling page fault without holding the mmap_sem the fetch of the
>> pte lock pointer and the locking will have to be done while ensuring
>> that the VMA is not touched in our back.
>>
>> So move the fetch and locking operations in a dedicated function.
>>
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>> ---
>>  mm/memory.c | 15 +++++++++++----
>>  1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 8ac241b9f370..21b1212a0892 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2288,6 +2288,13 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
>>  }
>>  EXPORT_SYMBOL_GPL(apply_to_page_range);
>>  
>> +static bool pte_spinlock(struct vm_fault *vmf)
> 
> inline?

You're right.
Indeed this was done in the patch 18 : "mm: Provide speculative fault
infrastructure", but this has to be done there too, I'll fix that.

> 
>> +{
>> +	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
>> +	spin_lock(vmf->ptl);
>> +	return true;
>> +}
>> +
>>  static bool pte_map_lock(struct vm_fault *vmf)
>>  {
>>  	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
> 
> Shouldn't pte_unmap_same() take struct vm_fault * and use the new 
> pte_spinlock()?

done in the next patch, but you already acked it..

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-03-27 21:18   ` David Rientjes
@ 2018-03-28  8:27     ` Laurent Dufour
  2018-03-28 10:20       ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28  8:27 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 27/03/2018 23:18, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 2f3e98edc94a..b6432a261e63 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1199,6 +1199,7 @@ static inline void clear_page_pfmemalloc(struct page *page)
>>  #define VM_FAULT_NEEDDSYNC  0x2000	/* ->fault did not modify page tables
>>  					 * and needs fsync() to complete (for
>>  					 * synchronous page faults in DAX) */
>> +#define VM_FAULT_PTNOTSAME 0x4000	/* Page table entries have changed */
>>  
>>  #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
>>  			 VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE | \
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 21b1212a0892..4bc7b0bdcb40 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
>>   * parts, do_swap_page must check under lock before unmapping the pte and
>>   * proceeding (but do_wp_page is only called after already making such a check;
>>   * and do_anonymous_page can safely check later on).
>> + *
>> + * pte_unmap_same() returns:
>> + *	0			if the PTE are the same
>> + *	VM_FAULT_PTNOTSAME	if the PTE are different
>> + *	VM_FAULT_RETRY		if the VMA has changed in our back during
>> + *				a speculative page fault handling.
>>   */
>> -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
>> -				pte_t *page_table, pte_t orig_pte)
>> +static inline int pte_unmap_same(struct vm_fault *vmf)
>>  {
>> -	int same = 1;
>> +	int ret = 0;
>> +
>>  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
>>  	if (sizeof(pte_t) > sizeof(unsigned long)) {
>> -		spinlock_t *ptl = pte_lockptr(mm, pmd);
>> -		spin_lock(ptl);
>> -		same = pte_same(*page_table, orig_pte);
>> -		spin_unlock(ptl);
>> +		if (pte_spinlock(vmf)) {
>> +			if (!pte_same(*vmf->pte, vmf->orig_pte))
>> +				ret = VM_FAULT_PTNOTSAME;
>> +			spin_unlock(vmf->ptl);
>> +		} else
>> +			ret = VM_FAULT_RETRY;
>>  	}
>>  #endif
>> -	pte_unmap(page_table);
>> -	return same;
>> +	pte_unmap(vmf->pte);
>> +	return ret;
>>  }
>>  
>>  static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
>> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
>>  	int exclusive = 0;
>>  	int ret = 0;
> 
> Initialization is now unneeded.

I'm sorry, what "initialization" are you talking about here ?

> 
> Otherwise:
> 
> Acked-by: David Rientjes <rientjes@google.com>

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-28  7:49     ` Laurent Dufour
@ 2018-03-28 10:16       ` David Rientjes
  2018-03-28 11:15         ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-28 10:16 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Thomas Gleixner, paulmck, peterz, akpm, kirill, ak, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Wed, 28 Mar 2018, Laurent Dufour wrote:

> >> This configuration variable will be used to build the code needed to
> >> handle speculative page fault.
> >>
> >> By default it is turned off, and activated depending on architecture
> >> support.
> >>
> >> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> >> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> >> ---
> >>  mm/Kconfig | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/mm/Kconfig b/mm/Kconfig
> >> index abefa573bcd8..07c566c88faf 100644
> >> --- a/mm/Kconfig
> >> +++ b/mm/Kconfig
> >> @@ -759,3 +759,6 @@ config GUP_BENCHMARK
> >>  	  performance of get_user_pages_fast().
> >>  
> >>  	  See tools/testing/selftests/vm/gup_benchmark.c
> >> +
> >> +config SPECULATIVE_PAGE_FAULT
> >> +       bool
> > 
> > Should this be configurable even if the arch supports it?
> 
> Actually, this is not configurable unless by manually editing the .config file.
> 
> I made it this way on the Thomas's request :
> https://lkml.org/lkml/2018/1/15/969
> 
> That sounds to be the smarter way to achieve that, isn't it ?
> 

Putting this in mm/Kconfig is definitely the right way to go about it 
instead of any generic option in arch/*.

My question, though, was making this configurable by the user:

config SPECULATIVE_PAGE_FAULT
	bool "Speculative page faults"
	depends on X86_64 || PPC
	default y
	help
	  ..

It's a question about whether we want this always enabled on x86_64 and 
power or whether the user should be able to disable it (right now they 
can't).  With a large feature like this, you may want to offer something 
simple (disable CONFIG_SPECULATIVE_PAGE_FAULT) if someone runs into 
regressions.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-03-28  8:27     ` Laurent Dufour
@ 2018-03-28 10:20       ` David Rientjes
  2018-03-28 10:43         ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-28 10:20 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, Andrew Morton, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Wed, 28 Mar 2018, Laurent Dufour wrote:

> >> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
> >>  	int exclusive = 0;
> >>  	int ret = 0;
> > 
> > Initialization is now unneeded.
> 
> I'm sorry, what "initialization" are you talking about here ?
> 

The initialization of the ret variable.

@@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
 	int exclusive = 0;
 	int ret = 0;
 
-	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
+	ret = pte_unmap_same(vmf);
+	if (ret)
 		goto out;
 
 	entry = pte_to_swp_entry(vmf->orig_pte);

"ret" is immediately set to the return value of pte_unmap_same(), so there 
is no need to initialize it to 0.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 04/24] mm: Prepare for FAULT_FLAG_SPECULATIVE
  2018-03-25 21:50   ` [PATCH v9 04/24] mm: Prepare for FAULT_FLAG_SPECULATIVE David Rientjes
@ 2018-03-28 10:27     ` Laurent Dufour
  2018-04-03 21:57       ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 10:27 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 25/03/2018 23:50, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> From: Peter Zijlstra <peterz@infradead.org>
>>
>> When speculating faults (without holding mmap_sem) we need to validate
>> that the vma against which we loaded pages is still valid when we're
>> ready to install the new PTE.
>>
>> Therefore, replace the pte_offset_map_lock() calls that (re)take the
>> PTL with pte_map_lock() which can fail in case we find the VMA changed
>> since we started the fault.
>>
> 
> Based on how its used, I would have suspected this to be named 
> pte_map_trylock().
> 
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>
>> [Port to 4.12 kernel]
>> [Remove the comment about the fault_env structure which has been
>>  implemented as the vm_fault structure in the kernel]
>> [move pte_map_lock()'s definition upper in the file]
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>> ---
>>  include/linux/mm.h |  1 +
>>  mm/memory.c        | 56 ++++++++++++++++++++++++++++++++++++++----------------
>>  2 files changed, 41 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 4d02524a7998..2f3e98edc94a 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -300,6 +300,7 @@ extern pgprot_t protection_map[16];
>>  #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
>>  #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
>>  #define FAULT_FLAG_INSTRUCTION  0x100	/* The fault was during an instruction fetch */
>> +#define FAULT_FLAG_SPECULATIVE	0x200	/* Speculative fault, not holding mmap_sem */
>>  
>>  #define FAULT_FLAG_TRACE \
>>  	{ FAULT_FLAG_WRITE,		"WRITE" }, \
> 
> I think FAULT_FLAG_SPECULATIVE should be introduced in the patch that 
> actually uses it.

I think you're right, I'll move down this define in the series.

>> diff --git a/mm/memory.c b/mm/memory.c
>> index e0ae4999c824..8ac241b9f370 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2288,6 +2288,13 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
>>  }
>>  EXPORT_SYMBOL_GPL(apply_to_page_range);
>>  
>> +static bool pte_map_lock(struct vm_fault *vmf)
> 
> inline?

Agreed.

>> +{
>> +	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
>> +				       vmf->address, &vmf->ptl);
>> +	return true;
>> +}
>> +
>>  /*
>>   * handle_pte_fault chooses page fault handler according to an entry which was
>>   * read non-atomically.  Before making any commitment, on those architectures
>> @@ -2477,6 +2484,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>>  	const unsigned long mmun_start = vmf->address & PAGE_MASK;
>>  	const unsigned long mmun_end = mmun_start + PAGE_SIZE;
>>  	struct mem_cgroup *memcg;
>> +	int ret = VM_FAULT_OOM;
>>  
>>  	if (unlikely(anon_vma_prepare(vma)))
>>  		goto oom;
>> @@ -2504,7 +2512,11 @@ static int wp_page_copy(struct vm_fault *vmf)
>>  	/*
>>  	 * Re-check the pte - we dropped the lock
>>  	 */
>> -	vmf->pte = pte_offset_map_lock(mm, vmf->pmd, vmf->address, &vmf->ptl);
>> +	if (!pte_map_lock(vmf)) {
>> +		mem_cgroup_cancel_charge(new_page, memcg, false);
>> +		ret = VM_FAULT_RETRY;
>> +		goto oom_free_new;
>> +	}
> 
> Ugh, but we aren't oom here, so maybe rename oom_free_new so that it makes 
> sense for return values other than VM_FAULT_OOM?

You're right, now this label name is not correct, I'll rename it to
"out_free_new" and rename also the label "oom" to "out" since it is generic too
now.

>>  	if (likely(pte_same(*vmf->pte, vmf->orig_pte))) {
>>  		if (old_page) {
>>  			if (!PageAnon(old_page)) {
>> @@ -2596,7 +2608,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>>  oom:
>>  	if (old_page)
>>  		put_page(old_page);
>> -	return VM_FAULT_OOM;
>> +	return ret;
>>  }
>>  
>>  /**
>> @@ -2617,8 +2629,8 @@ static int wp_page_copy(struct vm_fault *vmf)
>>  int finish_mkwrite_fault(struct vm_fault *vmf)
>>  {
>>  	WARN_ON_ONCE(!(vmf->vma->vm_flags & VM_SHARED));
>> -	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd, vmf->address,
>> -				       &vmf->ptl);
>> +	if (!pte_map_lock(vmf))
>> +		return VM_FAULT_RETRY;
>>  	/*
>>  	 * We might have raced with another page fault while we released the
>>  	 * pte_offset_map_lock.
>> @@ -2736,8 +2748,11 @@ static int do_wp_page(struct vm_fault *vmf)
>>  			get_page(vmf->page);
>>  			pte_unmap_unlock(vmf->pte, vmf->ptl);
>>  			lock_page(vmf->page);
>> -			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
>> -					vmf->address, &vmf->ptl);
>> +			if (!pte_map_lock(vmf)) {
>> +				unlock_page(vmf->page);
>> +				put_page(vmf->page);
>> +				return VM_FAULT_RETRY;
>> +			}
>>  			if (!pte_same(*vmf->pte, vmf->orig_pte)) {
>>  				unlock_page(vmf->page);
>>  				pte_unmap_unlock(vmf->pte, vmf->ptl);
>> @@ -2947,8 +2962,10 @@ int do_swap_page(struct vm_fault *vmf)
>>  			 * Back out if somebody else faulted in this pte
>>  			 * while we released the pte lock.
>>  			 */
> 
> Comment needs updating, pte_same() isn't the only reason to bail out here.

I'll update it to :
			/*
			 * Back out if the VMA has changed in our back during
			 * a speculative page fault or if somebody else
			 * faulted in this pte while we released the pte lock.
			 */

> 
>> -			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
>> -					vmf->address, &vmf->ptl);
>> +			if (!pte_map_lock(vmf)) {
>> +				delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
>> +				return VM_FAULT_RETRY;
>> +			}
>>  			if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
>>  				ret = VM_FAULT_OOM;
>>  			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> 
> Not crucial, but it would be nice if this could do goto out instead, 
> otherwise this is the first mid function return.

ok will do.

> 
>> @@ -3003,8 +3020,11 @@ int do_swap_page(struct vm_fault *vmf)
>>  	/*
>>  	 * Back out if somebody else already faulted in this pte.
>>  	 */
> 
> Same as above.

Ok changing as above.

> 
>> -	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
>> -			&vmf->ptl);
>> +	if (!pte_map_lock(vmf)) {
>> +		ret = VM_FAULT_RETRY;
>> +		mem_cgroup_cancel_charge(page, memcg, false);
>> +		goto out_page;
>> +	}
>>  	if (unlikely(!pte_same(*vmf->pte, vmf->orig_pte)))
>>  		goto out_nomap;
>>  
> 
> mem_cgroup_try_charge() is done before grabbing pte_offset_map_lock(), why 
> does the out_nomap exit path do mem_cgroup_cancel_charge(); 
> pte_unmap_unlock()?  If the pte lock can be droppde first, there's no need 
> to embed the mem_cgroup_cancel_charge() here.

I think we can safely invert the call to mem_cgroup_cancel_charge() and to
pte_unmap_unlock(), and then introduce a new label and jump in if
pte_map_lock() failed.
Something like this:

@@ -3001,10 +3020,13 @@ int do_swap_page(struct vm_fault *vmf)
        }

        /*
-        * Back out if somebody else already faulted in this pte.
+        * Back out if the VMA has changed in our back during a speculative
+        * page fault or if somebody else already faulted in this pte.
         */
-       vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-                       &vmf->ptl);
+       if (!pte_map_lock(vmf)) {
+               ret = VM_FAULT_RETRY;
+               goto out_cancel_cgroup;
+       }
        if (unlikely(!pte_same(*vmf->pte, vmf->orig_pte)))
                goto out_nomap;

@@ -3082,8 +3104,9 @@ int do_swap_page(struct vm_fault *vmf)
 out:
        return ret;
 out_nomap:
-       mem_cgroup_cancel_charge(page, memcg, false);
        pte_unmap_unlock(vmf->pte, vmf->ptl);
+out_cancel_cgroup:
+       mem_cgroup_cancel_charge(page, memcg, false);
 out_page:
        unlock_page(page);
 out_release:



>> @@ -3133,8 +3153,8 @@ static int do_anonymous_page(struct vm_fault *vmf)
>>  			!mm_forbids_zeropage(vma->vm_mm)) {
>>  		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
>>  						vma->vm_page_prot));
>> -		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
>> -				vmf->address, &vmf->ptl);
>> +		if (!pte_map_lock(vmf))
>> +			return VM_FAULT_RETRY;
>>  		if (!pte_none(*vmf->pte))
>>  			goto unlock;
>>  		ret = check_stable_address_space(vma->vm_mm);
>> @@ -3169,8 +3189,11 @@ static int do_anonymous_page(struct vm_fault *vmf)
>>  	if (vma->vm_flags & VM_WRITE)
>>  		entry = pte_mkwrite(pte_mkdirty(entry));
>>  
>> -	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
>> -			&vmf->ptl);
>> +	if (!pte_map_lock(vmf)) {
>> +		mem_cgroup_cancel_charge(page, memcg, false);
>> +		put_page(page);
>> +		return VM_FAULT_RETRY;
>> +	}
>>  	if (!pte_none(*vmf->pte))
>>  		goto release;
>>  
> 
> This is more spaghetti, can the exit path be fixed up so we order things 
> consistently for all gotos?

I do agree, this was due to inverted calls to mem_cgroup_cancel_charge() and
pte_unmap_unlock().

This will become:
@@ -3170,14 +3193,16 @@ static int do_anonymous_page(struct vm_fault *vmf)
        if (vma->vm_flags & VM_WRITE)
                entry = pte_mkwrite(pte_mkdirty(entry));

-       vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-                       &vmf->ptl);
-       if (!pte_none(*vmf->pte))
+       if (!pte_map_lock(vmf)) {
+               ret = VM_FAULT_RETRY;
                goto release;
+       }
+       if (!pte_none(*vmf->pte))
+               goto unlock_and_release;

        ret = check_stable_address_space(vma->vm_mm);
        if (ret)
-               goto release;
+               goto unlock_and_release;

        /* Deliver the page fault to userland, check inside PT lock */
        if (userfaultfd_missing(vma)) {
@@ -3199,10 +3224,12 @@ static int do_anonymous_page(struct vm_fault *vmf)
 unlock:
        pte_unmap_unlock(vmf->pte, vmf->ptl);
        return ret;
+unlock_and_release:
+       pte_unmap_unlock(vmf->pte, vmf->ptl);
 release:
        mem_cgroup_cancel_charge(page, memcg, false);
        put_page(page);
-       goto unlock;
+       return ret;
 oom_free_page:
        put_page(page);
 oom:

Thanks,
Laurent.

> 
>> @@ -3294,8 +3317,9 @@ static int pte_alloc_one_map(struct vm_fault *vmf)
>>  	 * pte_none() under vmf->ptl protection when we return to
>>  	 * alloc_set_pte().
>>  	 */
>> -	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
>> -			&vmf->ptl);
>> +	if (!pte_map_lock(vmf))
>> +		return VM_FAULT_RETRY;
>> +
>>  	return 0;
>>  }
>>  
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-03-28 10:20       ` David Rientjes
@ 2018-03-28 10:43         ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 10:43 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, Andrew Morton, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 28/03/2018 12:20, David Rientjes wrote:
> On Wed, 28 Mar 2018, Laurent Dufour wrote:
> 
>>>> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
>>>>  	int exclusive = 0;
>>>>  	int ret = 0;
>>>
>>> Initialization is now unneeded.
>>
>> I'm sorry, what "initialization" are you talking about here ?
>>
> 
> The initialization of the ret variable.
> 
> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
>  	int exclusive = 0;
>  	int ret = 0;
> 
> -	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> +	ret = pte_unmap_same(vmf);
> +	if (ret)
>  		goto out;
> 
>  	entry = pte_to_swp_entry(vmf->orig_pte);
> 
> "ret" is immediately set to the return value of pte_unmap_same(), so there 
> is no need to initialize it to 0.

Sorry, I missed that. I'll remove this initialization.

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-28 10:16       ` David Rientjes
@ 2018-03-28 11:15         ` Laurent Dufour
  2018-03-28 21:18           ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 11:15 UTC (permalink / raw)
  To: David Rientjes
  Cc: Thomas Gleixner, paulmck, peterz, akpm, kirill, ak, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 28/03/2018 12:16, David Rientjes wrote:
> On Wed, 28 Mar 2018, Laurent Dufour wrote:
> 
>>>> This configuration variable will be used to build the code needed to
>>>> handle speculative page fault.
>>>>
>>>> By default it is turned off, and activated depending on architecture
>>>> support.
>>>>
>>>> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>>>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>>>> ---
>>>>  mm/Kconfig | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>>> index abefa573bcd8..07c566c88faf 100644
>>>> --- a/mm/Kconfig
>>>> +++ b/mm/Kconfig
>>>> @@ -759,3 +759,6 @@ config GUP_BENCHMARK
>>>>  	  performance of get_user_pages_fast().
>>>>  
>>>>  	  See tools/testing/selftests/vm/gup_benchmark.c
>>>> +
>>>> +config SPECULATIVE_PAGE_FAULT
>>>> +       bool
>>>
>>> Should this be configurable even if the arch supports it?
>>
>> Actually, this is not configurable unless by manually editing the .config file.
>>
>> I made it this way on the Thomas's request :
>> https://lkml.org/lkml/2018/1/15/969
>>
>> That sounds to be the smarter way to achieve that, isn't it ?
>>
> 
> Putting this in mm/Kconfig is definitely the right way to go about it 
> instead of any generic option in arch/*.
> 
> My question, though, was making this configurable by the user:
> 
> config SPECULATIVE_PAGE_FAULT
> 	bool "Speculative page faults"
> 	depends on X86_64 || PPC
> 	default y
> 	help
> 	  ..
> 
> It's a question about whether we want this always enabled on x86_64 and 
> power or whether the user should be able to disable it (right now they 
> can't).  With a large feature like this, you may want to offer something 
> simple (disable CONFIG_SPECULATIVE_PAGE_FAULT) if someone runs into 
> regressions.

I agree, but I think it would be important to get the per architecture
enablement to avoid complex check here. For instance in the case of powerPC
this is only supported for PPC_BOOK3S_64.

To avoid exposing such per architecture define here, what do you think about
having supporting architectures setting ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
and the SPECULATIVE_PAGE_FAULT depends on this, like this:

In mm/Kconfig:
config SPECULATIVE_PAGE_FAULT
 	bool "Speculative page faults"
 	depends on ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT && SMP
 	default y
 	help
		...

In arch/powerpc/Kconfig:
config PPC
	...
	select ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT	if PPC_BOOK3S_64

In arch/x86/Kconfig:
config X86_64
	...
	select ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-03-25 22:10       ` David Rientjes
@ 2018-03-28 13:30         ` Laurent Dufour
  2018-04-04  0:48           ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 13:30 UTC (permalink / raw)
  To: David Rientjes
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp

On 26/03/2018 00:10, David Rientjes wrote:
> On Wed, 21 Mar 2018, Laurent Dufour wrote:
> 
>> I found the root cause of this lockdep warning.
>>
>> In mmap_region(), unmap_region() may be called while vma_link() has not been
>> called. This happens during the error path if call_mmap() failed.
>>
>> The only to fix that particular case is to call
>> seqcount_init(&vma->vm_sequence) when initializing the vma in mmap_region().
>>
> 
> Ack, although that would require a fixup to dup_mmap() as well.

You're right, I'll fix that too.

Thanks a lot.
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count
  2018-03-27 21:45   ` David Rientjes
@ 2018-03-28 16:57     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 16:57 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 27/03/2018 23:45, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 65ae54659833..a2d9c87b7b0b 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -1136,8 +1136,11 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
>>  					goto out_mm;
>>  				}
>>  				for (vma = mm->mmap; vma; vma = vma->vm_next) {
>> -					vma->vm_flags &= ~VM_SOFTDIRTY;
>> +					vm_write_begin(vma);
>> +					WRITE_ONCE(vma->vm_flags,
>> +						   vma->vm_flags & ~VM_SOFTDIRTY);
>>  					vma_set_page_prot(vma);
>> +					vm_write_end(vma);
>>  				}
>>  				downgrade_write(&mm->mmap_sem);
>>  				break;
>> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
>> index cec550c8468f..b8212ba17695 100644
>> --- a/fs/userfaultfd.c
>> +++ b/fs/userfaultfd.c
>> @@ -659,8 +659,11 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)
>>  
>>  	octx = vma->vm_userfaultfd_ctx.ctx;
>>  	if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) {
>> +		vm_write_begin(vma);
>>  		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
>> -		vma->vm_flags &= ~(VM_UFFD_WP | VM_UFFD_MISSING);
>> +		WRITE_ONCE(vma->vm_flags,
>> +			   vma->vm_flags & ~(VM_UFFD_WP | VM_UFFD_MISSING));
>> +		vm_write_end(vma);
>>  		return 0;
>>  	}
>>  
> 
> In several locations in this patch vm_write_begin(vma) -> 
> vm_write_end(vma) is nesting things other than vma->vm_flags, 
> vma->vm_policy, etc.  I think it's better to do vm_write_end(vma) as soon 
> as the members that the seqcount protects are modified.  In other words, 
> this isn't offering protection for vma->vm_userfaultfd_ctx.  There are 
> several examples of this in the patch.

That's true in this particular case, and I could change that to not include the
change to vm_userfaultfd_ctx.
This being said, I don't think this will have a major impact, but I'll make a
close review on this patch to be sure there is too large protected part of code.

>> @@ -885,8 +888,10 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
>>  			vma = prev;
>>  		else
>>  			prev = vma;
>> -		vma->vm_flags = new_flags;
>> +		vm_write_begin(vma);
>> +		WRITE_ONCE(vma->vm_flags, new_flags);
>>  		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
>> +		vm_write_end(vma);
>>  	}
>>  	up_write(&mm->mmap_sem);
>>  	mmput(mm);
>> @@ -1434,8 +1439,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>>  		 * the next vma was merged into the current one and
>>  		 * the current one has not been updated yet.
>>  		 */
>> -		vma->vm_flags = new_flags;
>> +		vm_write_begin(vma);
>> +		WRITE_ONCE(vma->vm_flags, new_flags);
>>  		vma->vm_userfaultfd_ctx.ctx = ctx;
>> +		vm_write_end(vma);
>>  
>>  	skip:
>>  		prev = vma;
>> @@ -1592,8 +1599,10 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>>  		 * the next vma was merged into the current one and
>>  		 * the current one has not been updated yet.
>>  		 */
>> -		vma->vm_flags = new_flags;
>> +		vm_write_begin(vma);
>> +		WRITE_ONCE(vma->vm_flags, new_flags);
>>  		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
>> +		vm_write_end(vma);
>>  
>>  	skip:
>>  		prev = vma;
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index b7e2268dfc9a..32314e9e48dd 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -1006,6 +1006,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>>  	if (mm_find_pmd(mm, address) != pmd)
>>  		goto out;
>>  
>> +	vm_write_begin(vma);
>>  	anon_vma_lock_write(vma->anon_vma);
>>  
>>  	pte = pte_offset_map(pmd, address);
>> @@ -1041,6 +1042,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>>  		pmd_populate(mm, pmd, pmd_pgtable(_pmd));
>>  		spin_unlock(pmd_ptl);
>>  		anon_vma_unlock_write(vma->anon_vma);
>> +		vm_write_end(vma);
>>  		result = SCAN_FAIL;
>>  		goto out;
>>  	}
>> @@ -1075,6 +1077,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>>  	set_pmd_at(mm, address, pmd, _pmd);
>>  	update_mmu_cache_pmd(vma, address, pmd);
>>  	spin_unlock(pmd_ptl);
>> +	vm_write_end(vma);
>>  
>>  	*hpage = NULL;
>>  
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index 4d3c922ea1a1..e328f7ab5942 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -184,7 +184,9 @@ static long madvise_behavior(struct vm_area_struct *vma,
>>  	/*
>>  	 * vm_flags is protected by the mmap_sem held in write mode.
>>  	 */
>> -	vma->vm_flags = new_flags;
>> +	vm_write_begin(vma);
>> +	WRITE_ONCE(vma->vm_flags, new_flags);
>> +	vm_write_end(vma);
>>  out:
>>  	return error;
>>  }
>> @@ -450,9 +452,11 @@ static void madvise_free_page_range(struct mmu_gather *tlb,
>>  		.private = tlb,
>>  	};
>>  
>> +	vm_write_begin(vma);
>>  	tlb_start_vma(tlb, vma);
>>  	walk_page_range(addr, end, &free_walk);
>>  	tlb_end_vma(tlb, vma);
>> +	vm_write_end(vma);
>>  }
>>  
>>  static int madvise_free_single_vma(struct vm_area_struct *vma,
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index e0e706f0b34e..2632c6f93b63 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -380,8 +380,11 @@ void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
>>  	struct vm_area_struct *vma;
>>  
>>  	down_write(&mm->mmap_sem);
>> -	for (vma = mm->mmap; vma; vma = vma->vm_next)
>> +	for (vma = mm->mmap; vma; vma = vma->vm_next) {
>> +		vm_write_begin(vma);
>>  		mpol_rebind_policy(vma->vm_policy, new);
>> +		vm_write_end(vma);
>> +	}
>>  	up_write(&mm->mmap_sem);
>>  }
>>  
>> @@ -554,9 +557,11 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
>>  {
>>  	int nr_updated;
>>  
>> +	vm_write_begin(vma);
>>  	nr_updated = change_protection(vma, addr, end, PAGE_NONE, 0, 1);
>>  	if (nr_updated)
>>  		count_vm_numa_events(NUMA_PTE_UPDATES, nr_updated);
>> +	vm_write_end(vma);
>>  
>>  	return nr_updated;
>>  }
>> @@ -657,6 +662,7 @@ static int vma_replace_policy(struct vm_area_struct *vma,
>>  	if (IS_ERR(new))
>>  		return PTR_ERR(new);
>>  
>> +	vm_write_begin(vma);
>>  	if (vma->vm_ops && vma->vm_ops->set_policy) {
>>  		err = vma->vm_ops->set_policy(vma, new);
>>  		if (err)
>> @@ -664,11 +670,17 @@ static int vma_replace_policy(struct vm_area_struct *vma,
>>  	}
>>  
>>  	old = vma->vm_policy;
>> -	vma->vm_policy = new; /* protected by mmap_sem */
>> +	/*
>> +	 * The speculative page fault handler access this field without
>> +	 * hodling the mmap_sem.
>> +	 */
> 
> "The speculative page fault handler accesses this field without holding 
> vma->vm_mm->mmap_sem"

Oops :/

> 
>> +	WRITE_ONCE(vma->vm_policy,  new);
>> +	vm_write_end(vma);
>>  	mpol_put(old);
>>  
>>  	return 0;
>>   err_out:
>> +	vm_write_end(vma);
>>  	mpol_put(new);
>>  	return err;
>>  }
> 
> Wait, doesn't vma_dup_policy() also need to protect dst->vm_policy?

Indeed this is not necessary because vma_dup_policy() is called when dst is not
yet linked in the RB tree, so it can't be found by the speculative page fault
handler. This is not the case of vma_replace_policy, and this is why the
protection are needed here.

> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -2121,7 +2121,9 @@ int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
> 
>  	if (IS_ERR(pol))
>  		return PTR_ERR(pol);
> -	dst->vm_policy = pol;
> +	vm_write_begin(dst);
> +	WRITE_ONCE(dst->vm_policy, pol);
> +	vm_write_end(dst);
>  	return 0;
>  }

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count
  2018-03-27 21:57   ` David Rientjes
@ 2018-03-28 17:10     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 17:10 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, Andrew Morton, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 27/03/2018 23:57, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 5898255d0aeb..d6533cb85213 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -847,17 +847,18 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>  	}
>>  
>>  	if (start != vma->vm_start) {
>> -		vma->vm_start = start;
>> +		WRITE_ONCE(vma->vm_start, start);
>>  		start_changed = true;
>>  	}
>>  	if (end != vma->vm_end) {
>> -		vma->vm_end = end;
>> +		WRITE_ONCE(vma->vm_end, end);
>>  		end_changed = true;
>>  	}
>> -	vma->vm_pgoff = pgoff;
>> +	WRITE_ONCE(vma->vm_pgoff, pgoff);
>>  	if (adjust_next) {
>> -		next->vm_start += adjust_next << PAGE_SHIFT;
>> -		next->vm_pgoff += adjust_next;
>> +		WRITE_ONCE(next->vm_start,
>> +			   next->vm_start + (adjust_next << PAGE_SHIFT));
>> +		WRITE_ONCE(next->vm_pgoff, next->vm_pgoff + adjust_next);
>>  	}
>>  
>>  	if (root) {
>> @@ -1781,6 +1782,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>>  out:
>>  	perf_event_mmap(vma);
>>  
>> +	vm_write_begin(vma);
>>  	vm_stat_account(mm, vm_flags, len >> PAGE_SHIFT);
>>  	if (vm_flags & VM_LOCKED) {
>>  		if (!((vm_flags & VM_SPECIAL) || is_vm_hugetlb_page(vma) ||
>> @@ -1803,6 +1805,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>>  	vma->vm_flags |= VM_SOFTDIRTY;
>>  
>>  	vma_set_page_prot(vma);
>> +	vm_write_end(vma);
>>  
>>  	return addr;
>>  
> 
> Shouldn't this also protect vma->vm_flags?

Nice catch !
I just found that too while reviewing the entire patch to answer your previous
email.

> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1796,7 +1796,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  					vma == get_gate_vma(current->mm)))
>  			mm->locked_vm += (len >> PAGE_SHIFT);
>  		else
> -			vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
> +			WRITE_ONCE(vma->vm_flags,
> +				   vma->vm_flags & VM_LOCKED_CLEAR_MASK);
>  	}
> 
>  	if (file)
> @@ -1809,7 +1810,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	 * then new mapped in-place (which must be aimed as
>  	 * a completely new data area).
>  	 */
> -	vma->vm_flags |= VM_SOFTDIRTY;
> +	WRITE_ONCE(vma->vm_flags, vma->vm_flags | VM_SOFTDIRTY);
> 
>  	vma_set_page_prot(vma);
>  	vm_write_end(vma);
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 07/24] mm: VMA sequence count
  2018-03-27 21:30   ` [PATCH v9 07/24] mm: VMA sequence count David Rientjes
@ 2018-03-28 17:58     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 17:58 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 27/03/2018 23:30, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index faf85699f1a1..5898255d0aeb 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -558,6 +558,10 @@ void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
>>  	else
>>  		mm->highest_vm_end = vm_end_gap(vma);
>>  
>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>> +	seqcount_init(&vma->vm_sequence);
>> +#endif
>> +
>>  	/*
>>  	 * vma->vm_prev wasn't known when we followed the rbtree to find the
>>  	 * correct insertion point for that vma. As a result, we could not
>> @@ -692,6 +696,30 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>  	long adjust_next = 0;
>>  	int remove_next = 0;
>>  
>> +	/*
>> +	 * Why using vm_raw_write*() functions here to avoid lockdep's warning ?
>> +	 *
>> +	 * Locked is complaining about a theoretical lock dependency, involving
>> +	 * 3 locks:
>> +	 *   mapping->i_mmap_rwsem --> vma->vm_sequence --> fs_reclaim
>> +	 *
>> +	 * Here are the major path leading to this dependency :
>> +	 *  1. __vma_adjust() mmap_sem  -> vm_sequence -> i_mmap_rwsem
>> +	 *  2. move_vmap() mmap_sem -> vm_sequence -> fs_reclaim
>> +	 *  3. __alloc_pages_nodemask() fs_reclaim -> i_mmap_rwsem
>> +	 *  4. unmap_mapping_range() i_mmap_rwsem -> vm_sequence
>> +	 *
>> +	 * So there is no way to solve this easily, especially because in
>> +	 * unmap_mapping_range() the i_mmap_rwsem is grab while the impacted
>> +	 * VMAs are not yet known.
>> +	 * However, the way the vm_seq is used is guarantying that we will
>> +	 * never block on it since we just check for its value and never wait
>> +	 * for it to move, see vma_has_changed() and handle_speculative_fault().
>> +	 */
>> +	vm_raw_write_begin(vma);
>> +	if (next)
>> +		vm_raw_write_begin(next);
>> +
>>  	if (next && !insert) {
>>  		struct vm_area_struct *exporter = NULL, *importer = NULL;
>>  
> 
> Eek, what about later on:
> 
> 		/*
> 		 * Easily overlooked: when mprotect shifts the boundary,
> 		 * make sure the expanding vma has anon_vma set if the
> 		 * shrinking vma had, to cover any anon pages imported.
> 		 */
> 		if (exporter && exporter->anon_vma && !importer->anon_vma) {
> 			int error;
> 
> 			importer->anon_vma = exporter->anon_vma;
> 			error = anon_vma_clone(importer, exporter);
> 			if (error)
> 				return error;
> 		}
> 
> This needs
> 
> if (error) {
> 	if (next && next != vma)
> 		vm_raw_write_end(next);
> 	vm_raw_write_end(vma);
> 	return error;
> }

Nice catch !

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 09/24] mm: protect mremap() against SPF hanlder
  2018-03-27 22:12   ` David Rientjes
@ 2018-03-28 18:11     ` Laurent Dufour
  2018-03-28 21:21       ` David Rientjes
  0 siblings, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-03-28 18:11 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 28/03/2018 00:12, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 88042d843668..ef6ef0627090 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2189,16 +2189,24 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
>>  extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
>>  extern int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>  	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert,
>> -	struct vm_area_struct *expand);
>> +	struct vm_area_struct *expand, bool keep_locked);
>>  static inline int vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>  	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert)
>>  {
>> -	return __vma_adjust(vma, start, end, pgoff, insert, NULL);
>> +	return __vma_adjust(vma, start, end, pgoff, insert, NULL, false);
>>  }
>> -extern struct vm_area_struct *vma_merge(struct mm_struct *,
>> +extern struct vm_area_struct *__vma_merge(struct mm_struct *,
>>  	struct vm_area_struct *prev, unsigned long addr, unsigned long end,
>>  	unsigned long vm_flags, struct anon_vma *, struct file *, pgoff_t,
>> -	struct mempolicy *, struct vm_userfaultfd_ctx);
>> +	struct mempolicy *, struct vm_userfaultfd_ctx, bool keep_locked);
>> +static inline struct vm_area_struct *vma_merge(struct mm_struct *vma,
>> +	struct vm_area_struct *prev, unsigned long addr, unsigned long end,
>> +	unsigned long vm_flags, struct anon_vma *anon, struct file *file,
>> +	pgoff_t off, struct mempolicy *pol, struct vm_userfaultfd_ctx uff)
>> +{
>> +	return __vma_merge(vma, prev, addr, end, vm_flags, anon, file, off,
>> +			   pol, uff, false);
>> +}
> 
> The first formal to vma_merge() is an mm, not a vma.

Oops !

> This area could use an uncluttering.
> 
>>  extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
>>  extern int __split_vma(struct mm_struct *, struct vm_area_struct *,
>>  	unsigned long addr, int new_below);
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index d6533cb85213..ac32b577a0c9 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -684,7 +684,7 @@ static inline void __vma_unlink_prev(struct mm_struct *mm,
>>   */
>>  int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>  	unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert,
>> -	struct vm_area_struct *expand)
>> +	struct vm_area_struct *expand, bool keep_locked)
>>  {
>>  	struct mm_struct *mm = vma->vm_mm;
>>  	struct vm_area_struct *next = vma->vm_next, *orig_vma = vma;
>> @@ -996,7 +996,8 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>>  
>>  	if (next && next != vma)
>>  		vm_raw_write_end(next);
>> -	vm_raw_write_end(vma);
>> +	if (!keep_locked)
>> +		vm_raw_write_end(vma);
>>  
>>  	validate_mm(mm);
>>  
> 
> This will require a fixup for the following patch where a retval from 
> anon_vma_close() can also return without vma locked even though 
> keep_locked == true.

Yes I saw your previous email about that.

> 
> How does vma_merge() handle that error wrt vm_raw_write_begin(vma)?

The not written assumption is that in the case __vma_adjust() is returning an
error, the vma is no more sequence locked. That's need to be clarify a huge
comment near the __vma_adjust() definition.

This being said, in that case of __vma_merge() is returning NULL which means
that it didn't do the job (the caller don't know if this is due to an error or
because the VMA cannot be merged). In that case again the assumption is that
the VMA are no locked in that case since the merge operation was not done. But
again this is not documented at all and I've to fix that.
In addition the caller copy_vma() which is the only one calling __vma_merge()
with keep_locked=true, is assuming that in that case, the VMA is not locked and
it will allocate a new VMA which will be locked before it is inserted in the RB
tree.

So it should work, but I've to make a huge effort to document that in the code.
Thanks a lot for raising this !

> 
>> @@ -1132,12 +1133,13 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
>>   * parameter) may establish ptes with the wrong permissions of NNNN
>>   * instead of the right permissions of XXXX.
>>   */
>> -struct vm_area_struct *vma_merge(struct mm_struct *mm,
>> +struct vm_area_struct *__vma_merge(struct mm_struct *mm,
>>  			struct vm_area_struct *prev, unsigned long addr,
>>  			unsigned long end, unsigned long vm_flags,
>>  			struct anon_vma *anon_vma, struct file *file,
>>  			pgoff_t pgoff, struct mempolicy *policy,
>> -			struct vm_userfaultfd_ctx vm_userfaultfd_ctx)
>> +			struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
>> +			bool keep_locked)
>>  {
>>  	pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
>>  	struct vm_area_struct *area, *next;
>> @@ -1185,10 +1187,11 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
>>  							/* cases 1, 6 */
>>  			err = __vma_adjust(prev, prev->vm_start,
>>  					 next->vm_end, prev->vm_pgoff, NULL,
>> -					 prev);
>> +					 prev, keep_locked);
>>  		} else					/* cases 2, 5, 7 */
>>  			err = __vma_adjust(prev, prev->vm_start,
>> -					 end, prev->vm_pgoff, NULL, prev);
>> +					   end, prev->vm_pgoff, NULL, prev,
>> +					   keep_locked);
>>  		if (err)
>>  			return NULL;
>>  		khugepaged_enter_vma_merge(prev, vm_flags);
>> @@ -1205,10 +1208,12 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
>>  					     vm_userfaultfd_ctx)) {
>>  		if (prev && addr < prev->vm_end)	/* case 4 */
>>  			err = __vma_adjust(prev, prev->vm_start,
>> -					 addr, prev->vm_pgoff, NULL, next);
>> +					 addr, prev->vm_pgoff, NULL, next,
>> +					 keep_locked);
>>  		else {					/* cases 3, 8 */
>>  			err = __vma_adjust(area, addr, next->vm_end,
>> -					 next->vm_pgoff - pglen, NULL, next);
>> +					 next->vm_pgoff - pglen, NULL, next,
>> +					 keep_locked);
>>  			/*
>>  			 * In case 3 area is already equal to next and
>>  			 * this is a noop, but in case 8 "area" has
>> @@ -3163,9 +3168,20 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>>  
>>  	if (find_vma_links(mm, addr, addr + len, &prev, &rb_link, &rb_parent))
>>  		return NULL;	/* should never get here */
>> -	new_vma = vma_merge(mm, prev, addr, addr + len, vma->vm_flags,
>> -			    vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma),
>> -			    vma->vm_userfaultfd_ctx);
>> +
>> +	/* There is 3 cases to manage here in
>> +	 *     AAAA            AAAA              AAAA              AAAA
>> +	 * PPPP....      PPPP......NNNN      PPPP....NNNN      PP........NN
>> +	 * PPPPPPPP(A)   PPPP..NNNNNNNN(B)   PPPPPPPPPPPP(1)       NULL
>> +	 *                                   PPPPPPPPNNNN(2)
>> +	 *				     PPPPNNNNNNNN(3)
>> +	 *
>> +	 * new_vma == prev in case A,1,2
>> +	 * new_vma == next in case B,3
>> +	 */
> 
> Interleaved tabs and whitespace.

Fair enough, I will try to fix that.

>> +	new_vma = __vma_merge(mm, prev, addr, addr + len, vma->vm_flags,
>> +			      vma->anon_vma, vma->vm_file, pgoff,
>> +			      vma_policy(vma), vma->vm_userfaultfd_ctx, true);
>>  	if (new_vma) {
>>  		/*
>>  		 * Source vma may have been merged into new_vma
>> @@ -3205,6 +3221,15 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>>  			get_file(new_vma->vm_file);
>>  		if (new_vma->vm_ops && new_vma->vm_ops->open)
>>  			new_vma->vm_ops->open(new_vma);
>> +		/*
>> +		 * As the VMA is linked right now, it may be hit by the
>> +		 * speculative page fault handler. But we don't want it to
>> +		 * to start mapping page in this area until the caller has
>> +		 * potentially move the pte from the moved VMA. To prevent
>> +		 * that we protect it right now, and let the caller unprotect
>> +		 * it once the move is done.
>> +		 */
>> +		vm_raw_write_begin(new_vma);
>>  		vma_link(mm, new_vma, prev, rb_link, rb_parent);
>>  		*need_rmap_locks = false;
>>  	}
>> diff --git a/mm/mremap.c b/mm/mremap.c
>> index 049470aa1e3e..8ed1a1d6eaed 100644
>> --- a/mm/mremap.c
>> +++ b/mm/mremap.c
>> @@ -302,6 +302,14 @@ static unsigned long move_vma(struct vm_area_struct *vma,
>>  	if (!new_vma)
>>  		return -ENOMEM;
>>  
>> +	/* new_vma is returned protected by copy_vma, to prevent speculative
>> +	 * page fault to be done in the destination area before we move the pte.
>> +	 * Now, we must also protect the source VMA since we don't want pages
>> +	 * to be mapped in our back while we are copying the PTEs.
>> +	 */
>> +	if (vma != new_vma)
>> +		vm_raw_write_begin(vma);
>> +
>>  	moved_len = move_page_tables(vma, old_addr, new_vma, new_addr, old_len,
>>  				     need_rmap_locks);
>>  	if (moved_len < old_len) {
>> @@ -318,6 +326,8 @@ static unsigned long move_vma(struct vm_area_struct *vma,
>>  		 */
>>  		move_page_tables(new_vma, new_addr, vma, old_addr, moved_len,
>>  				 true);
>> +		if (vma != new_vma)
>> +			vm_raw_write_end(vma);
>>  		vma = new_vma;
>>  		old_len = new_len;
>>  		old_addr = new_addr;
>> @@ -326,7 +336,10 @@ static unsigned long move_vma(struct vm_area_struct *vma,
>>  		mremap_userfaultfd_prep(new_vma, uf);
>>  		arch_remap(mm, old_addr, old_addr + old_len,
>>  			   new_addr, new_addr + new_len);
>> +		if (vma != new_vma)
>> +			vm_raw_write_end(vma);
>>  	}
>> +	vm_raw_write_end(new_vma);
> 
> Just do
> 
> vm_raw_write_end(vma);
> vm_raw_write_end(new_vma);
> 
> here.

Are you sure ? we can have vma = new_vma done if (unlikely(err))

Cheers,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
  2018-03-28 11:15         ` Laurent Dufour
@ 2018-03-28 21:18           ` David Rientjes
  0 siblings, 0 replies; 98+ messages in thread
From: David Rientjes @ 2018-03-28 21:18 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Thomas Gleixner, paulmck, peterz, akpm, kirill, ak, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Wed, 28 Mar 2018, Laurent Dufour wrote:

> > Putting this in mm/Kconfig is definitely the right way to go about it 
> > instead of any generic option in arch/*.
> > 
> > My question, though, was making this configurable by the user:
> > 
> > config SPECULATIVE_PAGE_FAULT
> > 	bool "Speculative page faults"
> > 	depends on X86_64 || PPC
> > 	default y
> > 	help
> > 	  ..
> > 
> > It's a question about whether we want this always enabled on x86_64 and 
> > power or whether the user should be able to disable it (right now they 
> > can't).  With a large feature like this, you may want to offer something 
> > simple (disable CONFIG_SPECULATIVE_PAGE_FAULT) if someone runs into 
> > regressions.
> 
> I agree, but I think it would be important to get the per architecture
> enablement to avoid complex check here. For instance in the case of powerPC
> this is only supported for PPC_BOOK3S_64.
> 
> To avoid exposing such per architecture define here, what do you think about
> having supporting architectures setting ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
> and the SPECULATIVE_PAGE_FAULT depends on this, like this:
> 
> In mm/Kconfig:
> config SPECULATIVE_PAGE_FAULT
>  	bool "Speculative page faults"
>  	depends on ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT && SMP
>  	default y
>  	help
> 		...
> 
> In arch/powerpc/Kconfig:
> config PPC
> 	...
> 	select ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT	if PPC_BOOK3S_64
> 
> In arch/x86/Kconfig:
> config X86_64
> 	...
> 	select ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
> 
> 

Looks good to me!  It feels like this will add more assurance that if 
things regress for certain workloads that it can be disabled.  I don't 
feel strongly about the default value, I'm ok with it being enabled by 
default.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 09/24] mm: protect mremap() against SPF hanlder
  2018-03-28 18:11     ` Laurent Dufour
@ 2018-03-28 21:21       ` David Rientjes
  2018-04-04  8:24         ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-03-28 21:21 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Wed, 28 Mar 2018, Laurent Dufour wrote:

> >> @@ -326,7 +336,10 @@ static unsigned long move_vma(struct vm_area_struct *vma,
> >>  		mremap_userfaultfd_prep(new_vma, uf);
> >>  		arch_remap(mm, old_addr, old_addr + old_len,
> >>  			   new_addr, new_addr + new_len);
> >> +		if (vma != new_vma)
> >> +			vm_raw_write_end(vma);
> >>  	}
> >> +	vm_raw_write_end(new_vma);
> > 
> > Just do
> > 
> > vm_raw_write_end(vma);
> > vm_raw_write_end(new_vma);
> > 
> > here.
> 
> Are you sure ? we can have vma = new_vma done if (unlikely(err))
> 

Sorry, what I meant was do

if (vma != new_vma)
	vm_raw_write_end(vma);
vm_raw_write_end(new_vma);

after the conditional.  Having the locking unnecessarily embedded in the 
conditional has been an issue in the past with other areas of core code, 
unless you have a strong reason for it.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 00/24] Speculative page faults
  2018-03-22  1:21 ` Ganesh Mahendran
@ 2018-03-29 12:49   ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-03-29 12:49 UTC (permalink / raw)
  To: Ganesh Mahendran
  Cc: paulmck, Peter Zijlstra, Andrew Morton, kirill, ak, Michal Hocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	Sergey Senozhatsky, Daniel Jordan, linux-kernel, Linux-MM, haren,
	khandual, npiggin, Balbir Singh, Tim Chen, linuxppc-dev, x86

On 22/03/2018 02:21, Ganesh Mahendran wrote:
> Hi, Laurent
> 
> 2018-03-14 1:59 GMT+08:00 Laurent Dufour <ldufour@linux.vnet.ibm.com>:
>> This is a port on kernel 4.16 of the work done by Peter Zijlstra to
>> handle page fault without holding the mm semaphore [1].
>>
>> The idea is to try to handle user space page faults without holding the
>> mmap_sem. This should allow better concurrency for massively threaded
>> process since the page fault handler will not wait for other threads memory
>> layout change to be done, assuming that this change is done in another part
>> of the process's memory space. This type page fault is named speculative
>> page fault. If the speculative page fault fails because of a concurrency is
>> detected or because underlying PMD or PTE tables are not yet allocating, it
>> is failing its processing and a classic page fault is then tried.
>>
>> The speculative page fault (SPF) has to look for the VMA matching the fault
>> address without holding the mmap_sem, this is done by introducing a rwlock
>> which protects the access to the mm_rb tree. Previously this was done using
>> SRCU but it was introducing a lot of scheduling to process the VMA's
>> freeing
>> operation which was hitting the performance by 20% as reported by Kemi Wang
>> [2].Using a rwlock to protect access to the mm_rb tree is limiting the
>> locking contention to these operations which are expected to be in a O(log
>> n)
>> order. In addition to ensure that the VMA is not freed in our back a
>> reference count is added and 2 services (get_vma() and put_vma()) are
>> introduced to handle the reference count. When a VMA is fetch from the RB
>> tree using get_vma() is must be later freeed using put_vma(). Furthermore,
>> to allow the VMA to be used again by the classic page fault handler a
>> service is introduced can_reuse_spf_vma(). This service is expected to be
>> called with the mmap_sem hold. It checked that the VMA is still matching
>> the specified address and is releasing its reference count as the mmap_sem
>> is hold it is ensure that it will not be freed in our back. In general, the
>> VMA's reference count could be decremented when holding the mmap_sem but it
>> should not be increased as holding the mmap_sem is ensuring that the VMA is
>> stable. I can't see anymore the overhead I got while will-it-scale
>> benchmark anymore.
>>
>> The VMA's attributes checked during the speculative page fault processing
>> have to be protected against parallel changes. This is done by using a per
>> VMA sequence lock. This sequence lock allows the speculative page fault
>> handler to fast check for parallel changes in progress and to abort the
>> speculative page fault in that case.
>>
>> Once the VMA is found, the speculative page fault handler would check for
>> the VMA's attributes to verify that the page fault has to be handled
>> correctly or not. Thus the VMA is protected through a sequence lock which
>> allows fast detection of concurrent VMA changes. If such a change is
>> detected, the speculative page fault is aborted and a *classic* page fault
>> is tried.  VMA sequence lockings are added when VMA attributes which are
>> checked during the page fault are modified.
>>
>> When the PTE is fetched, the VMA is checked to see if it has been changed,
>> so once the page table is locked, the VMA is valid, so any other changes
>> leading to touching this PTE will need to lock the page table, so no
>> parallel change is possible at this time.
>>
>> The locking of the PTE is done with interrupts disabled, this allows to
>> check for the PMD to ensure that there is not an ongoing collapsing
>> operation. Since khugepaged is firstly set the PMD to pmd_none and then is
>> waiting for the other CPU to have catch the IPI interrupt, if the pmd is
>> valid at the time the PTE is locked, we have the guarantee that the
>> collapsing opertion will have to wait on the PTE lock to move foward. This
>> allows the SPF handler to map the PTE safely. If the PMD value is different
>> than the one recorded at the beginning of the SPF operation, the classic
>> page fault handler will be called to handle the operation while holding the
>> mmap_sem. As the PTE lock is done with the interrupts disabled, the lock is
>> done using spin_trylock() to avoid dead lock when handling a page fault
>> while a TLB invalidate is requested by an other CPU holding the PTE.
>>
>> Support for THP is not done because when checking for the PMD, we can be
>> confused by an in progress collapsing operation done by khugepaged. The
>> issue is that pmd_none() could be true either if the PMD is not already
>> populated or if the underlying PTE are in the way to be collapsed. So we
>> cannot safely allocate a PMD if pmd_none() is true.
>>
>> This series a new software performance event named 'speculative-faults' or
>> 'spf'. It counts the number of successful page fault event handled in a
>> speculative way. When recording 'faults,spf' events, the faults one is
>> counting the total number of page fault events while 'spf' is only counting
>> the part of the faults processed in a speculative way.
>>
>> There are some trace events introduced by this series. They allow to
>> identify why the page faults where not processed in a speculative way. This
>> doesn't take in account the faults generated by a monothreaded process
>> which directly processed while holding the mmap_sem. This trace events are
>> grouped in a system named 'pagefault', they are:
>>  - pagefault:spf_pte_lock : if the pte was already locked by another thread
>>  - pagefault:spf_vma_changed : if the VMA has been changed in our back
>>  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set.
>>  - pagefault:spf_vma_notsup : the VMA's type is not supported
>>  - pagefault:spf_vma_access : the VMA's access right are not respected
>>  - pagefault:spf_pmd_changed : the upper PMD pointer has changed in our
>>  back.
>>
>> To record all the related events, the easier is to run perf with the
>> following arguments :
>> $ perf stat -e 'faults,spf,pagefault:*' <command>
>>
>> This series builds on top of v4.16-rc2-mmotm-2018-02-21-14-48 and is
>> functional on x86 and PowerPC.
>>
>> ---------------------
>> Real Workload results
>>
>> As mentioned in previous email, we did non official runs using a "popular
>> in memory multithreaded database product" on 176 cores SMT8 Power system
>> which showed a 30% improvements in the number of transaction processed per
>> second. This run has been done on the v6 series, but changes introduced in
>> this new verion should not impact the performance boost seen.
>>
>> Here are the perf data captured during 2 of these runs on top of the v8
>> series:
>>                 vanilla         spf
>> faults          89.418          101.364
>> spf                n/a           97.989
>>
>> With the SPF kernel, most of the page fault were processed in a speculative
>> way.
>>
>> ------------------
>> Benchmarks results
>>
>> Base kernel is v4.16-rc4-mmotm-2018-03-09-16-34
>> SPF is BASE + this series
>>
>> Kernbench:
>> ----------
>> Here are the results on a 16 CPUs X86 guest using kernbench on a 4.13-rc4
>> kernel (kernel is build 5 times):
>>
>> Average Half load -j 8
>>                  Run    (std deviation)
>>                  BASE                   SPF
>> Elapsed Time     151.36  (1.40139)      151.748 (1.09716)       0.26%
>> User    Time     1023.19 (3.58972)      1027.35 (2.30396)       0.41%
>> System  Time     125.026 (1.8547)       124.504 (0.980015)      -0.42%
>> Percent CPU      758.2   (5.54076)      758.6   (3.97492)       0.05%
>> Context Switches 54924   (453.634)      54851   (382.293)       -0.13%
>> Sleeps           105589  (704.581)      105282  (435.502)       -0.29%
>>
>> Average Optimal load -j 16
>>                  Run    (std deviation)
>>                  BASE                   SPF
>> Elapsed Time     74.804  (1.25139)      74.368  (0.406288)      -0.58%
>> User    Time     962.033 (64.5125)      963.93  (66.8797)       0.20%
>> System  Time     110.771 (15.0817)      110.387 (14.8989)       -0.35%
>> Percent CPU      1045.7  (303.387)      1049.1  (306.255)       0.33%
>> Context Switches 76201.8 (22433.1)      76170.4 (22482.9)       -0.04%
>> Sleeps           110289  (5024.05)      110220  (5248.58)       -0.06%
>>
>> During a run on the SPF, perf events were captured:
>>  Performance counter stats for '../kernbench -M':
>>          510334017      faults
>>                200      spf
>>                  0      pagefault:spf_pte_lock
>>                  0      pagefault:spf_vma_changed
>>                  0      pagefault:spf_vma_noanon
>>               2174      pagefault:spf_vma_notsup
>>                  0      pagefault:spf_vma_access
>>                  0      pagefault:spf_pmd_changed
>>
>> Very few speculative page fault were recorded as most of the processes
>> involved are monothreaded (sounds that on this architecture some threads
>> were created during the kernel build processing).
>>
>> Here are the kerbench results on a 80 CPUs Power8 system:
>>
>> Average Half load -j 40
>>                  Run    (std deviation)
>>                  BASE                   SPF
>> Elapsed Time     116.958 (0.73401)      117.43  (0.927497)      0.40%
>> User    Time     4472.35 (7.85792)      4480.16 (19.4909)       0.17%
>> System  Time     136.248 (0.587639)     136.922 (1.09058)       0.49%
>> Percent CPU      3939.8  (20.6567)      3931.2  (17.2829)       -0.22%
>> Context Switches 92445.8 (236.672)      92720.8 (270.118)       0.30%
>> Sleeps           318475  (1412.6)       317996  (1819.07)       -0.15%
>>
>> Average Optimal load -j 80
>>                  Run    (std deviation)
>>                  BASE                   SPF
>> Elapsed Time     106.976 (0.406731)     107.72  (0.329014)      0.70%
>> User    Time     5863.47 (1466.45)      5865.38 (1460.27)       0.03%
>> System  Time     159.995 (25.0393)      160.329 (24.6921)       0.21%
>> Percent CPU      5446.2  (1588.23)      5416    (1565.34)       -0.55%
>> Context Switches 223018  (137637)       224867  (139305)        0.83%
>> Sleeps           330846  (13127.3)      332348  (15556.9)       0.45%
>>
>> During a run on the SPF, perf events were captured:
>>  Performance counter stats for '../kernbench -M':
>>          116612488      faults
>>                  0      spf
>>                  0      pagefault:spf_pte_lock
>>                  0      pagefault:spf_vma_changed
>>                  0      pagefault:spf_vma_noanon
>>                473      pagefault:spf_vma_notsup
>>                  0      pagefault:spf_vma_access
>>                  0      pagefault:spf_pmd_changed
>>
>> Most of the processes involved are monothreaded so SPF is not activated but
>> there is no impact on the performance.
>>
>> Ebizzy:
>> -------
>> The test is counting the number of records per second it can manage, the
>> higher is the best. I run it like this 'ebizzy -mTRp'. To get consistent
>> result I repeated the test 100 times and measure the average result. The
>> number is the record processes per second, the higher is the best.
>>
>>                 BASE            SPF             delta
>> 16 CPUs x86 VM  14902.6         95905.16        543.55%
>> 80 CPUs P8 node 37240.24        78185.67        109.95%
>>
>> Here are the performance counter read during a run on a 16 CPUs x86 VM:
>>  Performance counter stats for './ebizzy -mRTp':
>>             888157      faults
>>             884773      spf
>>                 92      pagefault:spf_pte_lock
>>               2379      pagefault:spf_vma_changed
>>                  0      pagefault:spf_vma_noanon
>>                 80      pagefault:spf_vma_notsup
>>                  0      pagefault:spf_vma_access
>>                  0      pagefault:spf_pmd_changed
>>
>> And the ones captured during a run on a 80 CPUs Power node:
>>  Performance counter stats for './ebizzy -mRTp':
>>             762134      faults
>>             728663      spf
>>              19101      pagefault:spf_pte_lock
>>              13969      pagefault:spf_vma_changed
>>                  0      pagefault:spf_vma_noanon
>>                272      pagefault:spf_vma_notsup
>>                  0      pagefault:spf_vma_access
>>                  0      pagefault:spf_pmd_changed
>>
>> In ebizzy's case most of the page fault were handled in a speculative way,
>> leading the ebizzy performance boost.
> 
> We ported the SPF to kernel 4.9 in android devices.
> For the app launch time, It improves about 15% average. For the apps
> which have hundreds of threads, it will be about 20%.

Hi Ganesh,

Thanks for sharing these great and encouraging results.

Could you please detail a bit more about your system configuration and
application ?

Laurent.

> Thanks.
> 
>>
>> ------------------
>> Changes since v8:
>>  - Don't check PMD when locking the pte when THP is disabled
>>    Thanks to Daniel Jordan for reporting this.
>>  - Rebase on 4.16
>> Changes since v7:
>>  - move pte_map_lock() and pte_spinlock() upper in mm/memory.c (patch 4 &
>>    5)
>>  - make pte_unmap_same() compatible with the speculative page fault (patch
>>    6)
>> Changes since v6:
>>  - Rename config variable to CONFIG_SPECULATIVE_PAGE_FAULT (patch 1)
>>  - Review the way the config variable is set (patch 1 to 3)
>>  - Introduce mm_rb_write_*lock() in mm/mmap.c (patch 18)
>>  - Merge patch introducing pte try locking in the patch 18.
>> Changes since v5:
>>  - use rwlock agains the mm RB tree in place of SRCU
>>  - add a VMA's reference count to protect VMA while using it without
>>    holding the mmap_sem.
>>  - check PMD value to detect collapsing operation
>>  - don't try speculative page fault for mono threaded processes
>>  - try to reuse the fetched VMA if VM_RETRY is returned
>>  - go directly to the error path if an error is detected during the SPF
>>    path
>>  - fix race window when moving VMA in move_vma()
>> Changes since v4:
>>  - As requested by Andrew Morton, use CONFIG_SPF and define it earlier in
>>  the series to ease bisection.
>> Changes since v3:
>>  - Don't build when CONFIG_SMP is not set
>>  - Fixed a lock dependency warning in __vma_adjust()
>>  - Use READ_ONCE to access p*d values in handle_speculative_fault()
>>  - Call memcp_oom() service in handle_speculative_fault()
>> Changes since v2:
>>  - Perf event is renamed in PERF_COUNT_SW_SPF
>>  - On Power handle do_page_fault()'s cleaning
>>  - On Power if the VM_FAULT_ERROR is returned by
>>  handle_speculative_fault(), do not retry but jump to the error path
>>  - If VMA's flags are not matching the fault, directly returns
>>  VM_FAULT_SIGSEGV and not VM_FAULT_RETRY
>>  - Check for pud_trans_huge() to avoid speculative path
>>  - Handles _vm_normal_page()'s introduced by 6f16211df3bf
>>  ("mm/device-public-memory: device memory cache coherent with CPU")
>>  - add and review few comments in the code
>> Changes since v1:
>>  - Remove PERF_COUNT_SW_SPF_FAILED perf event.
>>  - Add tracing events to details speculative page fault failures.
>>  - Cache VMA fields values which are used once the PTE is unlocked at the
>>  end of the page fault events.
>>  - Ensure that fields read during the speculative path are written and read
>>  using WRITE_ONCE and READ_ONCE.
>>  - Add checks at the beginning of the speculative path to abort it if the
>>  VMA is known to not be supported.
>> Changes since RFC V5 [5]
>>  - Port to 4.13 kernel
>>  - Merging patch fixing lock dependency into the original patch
>>  - Replace the 2 parameters of vma_has_changed() with the vmf pointer
>>  - In patch 7, don't call __do_fault() in the speculative path as it may
>>  want to unlock the mmap_sem.
>>  - In patch 11-12, don't check for vma boundaries when
>>  page_add_new_anon_rmap() is called during the spf path and protect against
>>  anon_vma pointer's update.
>>  - In patch 13-16, add performance events to report number of successful
>>  and failed speculative events.
>>
>> [1]
>> http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none
>> [2] https://patchwork.kernel.org/patch/9999687/
>>
>>
>> Laurent Dufour (20):
>>   mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT
>>   x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
>>   powerpc/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT
>>   mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
>>   mm: make pte_unmap_same compatible with SPF
>>   mm: Protect VMA modifications using VMA sequence count
>>   mm: protect mremap() against SPF hanlder
>>   mm: Protect SPF handler against anon_vma changes
>>   mm: Cache some VMA fields in the vm_fault structure
>>   mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()
>>   mm: Introduce __lru_cache_add_active_or_unevictable
>>   mm: Introduce __maybe_mkwrite()
>>   mm: Introduce __vm_normal_page()
>>   mm: Introduce __page_add_new_anon_rmap()
>>   mm: Protect mm_rb tree with a rwlock
>>   mm: Adding speculative page fault failure trace events
>>   perf: Add a speculative page fault sw event
>>   perf tools: Add support for the SPF perf event
>>   mm: Speculative page fault handler return VMA
>>   powerpc/mm: Add speculative page fault
>>
>> Peter Zijlstra (4):
>>   mm: Prepare for FAULT_FLAG_SPECULATIVE
>>   mm: VMA sequence count
>>   mm: Provide speculative fault infrastructure
>>   x86/mm: Add speculative pagefault handling
>>
>>  arch/powerpc/Kconfig                  |   1 +
>>  arch/powerpc/mm/fault.c               |  31 +-
>>  arch/x86/Kconfig                      |   1 +
>>  arch/x86/mm/fault.c                   |  38 ++-
>>  fs/proc/task_mmu.c                    |   5 +-
>>  fs/userfaultfd.c                      |  17 +-
>>  include/linux/hugetlb_inline.h        |   2 +-
>>  include/linux/migrate.h               |   4 +-
>>  include/linux/mm.h                    |  92 +++++-
>>  include/linux/mm_types.h              |   7 +
>>  include/linux/pagemap.h               |   4 +-
>>  include/linux/rmap.h                  |  12 +-
>>  include/linux/swap.h                  |  10 +-
>>  include/trace/events/pagefault.h      |  87 +++++
>>  include/uapi/linux/perf_event.h       |   1 +
>>  kernel/fork.c                         |   3 +
>>  mm/Kconfig                            |   3 +
>>  mm/hugetlb.c                          |   2 +
>>  mm/init-mm.c                          |   3 +
>>  mm/internal.h                         |  20 ++
>>  mm/khugepaged.c                       |   5 +
>>  mm/madvise.c                          |   6 +-
>>  mm/memory.c                           | 594 ++++++++++++++++++++++++++++++----
>>  mm/mempolicy.c                        |  51 ++-
>>  mm/migrate.c                          |   4 +-
>>  mm/mlock.c                            |  13 +-
>>  mm/mmap.c                             | 211 +++++++++---
>>  mm/mprotect.c                         |   4 +-
>>  mm/mremap.c                           |  13 +
>>  mm/rmap.c                             |   5 +-
>>  mm/swap.c                             |   6 +-
>>  mm/swap_state.c                       |   8 +-
>>  tools/include/uapi/linux/perf_event.h |   1 +
>>  tools/perf/util/evsel.c               |   1 +
>>  tools/perf/util/parse-events.c        |   4 +
>>  tools/perf/util/parse-events.l        |   1 +
>>  tools/perf/util/python.c              |   1 +
>>  37 files changed, 1097 insertions(+), 174 deletions(-)
>>  create mode 100644 include/trace/events/pagefault.h
>>
>> --
>> 2.7.4
>>
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 11/24] mm: Cache some VMA fields in the vm_fault structure
  2018-03-13 17:59 ` [PATCH v9 11/24] mm: Cache some VMA fields in the vm_fault structure Laurent Dufour
@ 2018-04-02 22:24   ` David Rientjes
  2018-04-04 15:48     ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-02 22:24 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ef6ef0627090..dfa81a638b7c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -359,6 +359,12 @@ struct vm_fault {
>  					 * page table to avoid allocation from
>  					 * atomic context.
>  					 */
> +	/*
> +	 * These entries are required when handling speculative page fault.
> +	 * This way the page handling is done using consistent field values.
> +	 */
> +	unsigned long vma_flags;
> +	pgprot_t vma_page_prot;
>  };
>  
>  /* page entry size for vm->huge_fault() */
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 446427cafa19..f71db2b42b30 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3717,6 +3717,8 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  				.vma = vma,
>  				.address = address,
>  				.flags = flags,
> +				.vma_flags = vma->vm_flags,
> +				.vma_page_prot = vma->vm_page_prot,
>  				/*
>  				 * Hard to debug if it ends up being
>  				 * used by a callee that assumes
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 32314e9e48dd..a946d5306160 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -882,6 +882,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>  		.flags = FAULT_FLAG_ALLOW_RETRY,
>  		.pmd = pmd,
>  		.pgoff = linear_page_index(vma, address),
> +		.vma_flags = vma->vm_flags,
> +		.vma_page_prot = vma->vm_page_prot,
>  	};
>  
>  	/* we only decide to swapin, if there is enough young ptes */
> diff --git a/mm/memory.c b/mm/memory.c
> index 0200340ef089..46fe92b93682 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2615,7 +2615,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>  		 * Don't let another task, with possibly unlocked vma,
>  		 * keep the mlocked page.
>  		 */
> -		if (page_copied && (vma->vm_flags & VM_LOCKED)) {
> +		if (page_copied && (vmf->vma_flags & VM_LOCKED)) {
>  			lock_page(old_page);	/* LRU manipulation */
>  			if (PageMlocked(old_page))
>  				munlock_vma_page(old_page);

Doesn't wp_page_copy() also need to pass this to anon_vma_prepare() so 
that find_mergeable_anon_vma() works correctly?

> @@ -2649,7 +2649,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>   */
>  int finish_mkwrite_fault(struct vm_fault *vmf)
>  {
> -	WARN_ON_ONCE(!(vmf->vma->vm_flags & VM_SHARED));
> +	WARN_ON_ONCE(!(vmf->vma_flags & VM_SHARED));
>  	if (!pte_map_lock(vmf))
>  		return VM_FAULT_RETRY;
>  	/*
> @@ -2751,7 +2751,7 @@ static int do_wp_page(struct vm_fault *vmf)
>  		 * We should not cow pages in a shared writeable mapping.
>  		 * Just mark the pages writable and/or call ops->pfn_mkwrite.
>  		 */
> -		if ((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
> +		if ((vmf->vma_flags & (VM_WRITE|VM_SHARED)) ==
>  				     (VM_WRITE|VM_SHARED))
>  			return wp_pfn_shared(vmf);
>  
> @@ -2798,7 +2798,7 @@ static int do_wp_page(struct vm_fault *vmf)
>  			return VM_FAULT_WRITE;
>  		}
>  		unlock_page(vmf->page);
> -	} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
> +	} else if (unlikely((vmf->vma_flags & (VM_WRITE|VM_SHARED)) ==
>  					(VM_WRITE|VM_SHARED))) {
>  		return wp_page_shared(vmf);
>  	}
> @@ -3067,7 +3067,7 @@ int do_swap_page(struct vm_fault *vmf)
>  
>  	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
>  	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
> -	pte = mk_pte(page, vma->vm_page_prot);
> +	pte = mk_pte(page, vmf->vma_page_prot);
>  	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
>  		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
>  		vmf->flags &= ~FAULT_FLAG_WRITE;
> @@ -3093,7 +3093,7 @@ int do_swap_page(struct vm_fault *vmf)
>  
>  	swap_free(entry);
>  	if (mem_cgroup_swap_full(page) ||
> -	    (vma->vm_flags & VM_LOCKED) || PageMlocked(page))
> +	    (vmf->vma_flags & VM_LOCKED) || PageMlocked(page))
>  		try_to_free_swap(page);
>  	unlock_page(page);
>  	if (page != swapcache && swapcache) {
> @@ -3150,7 +3150,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
>  	pte_t entry;
>  
>  	/* File mapping without ->vm_ops ? */
> -	if (vma->vm_flags & VM_SHARED)
> +	if (vmf->vma_flags & VM_SHARED)
>  		return VM_FAULT_SIGBUS;
>  
>  	/*
> @@ -3174,7 +3174,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
>  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
>  			!mm_forbids_zeropage(vma->vm_mm)) {
>  		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
> -						vma->vm_page_prot));
> +						vmf->vma_page_prot));
>  		if (!pte_map_lock(vmf))
>  			return VM_FAULT_RETRY;
>  		if (!pte_none(*vmf->pte))
> @@ -3207,8 +3207,8 @@ static int do_anonymous_page(struct vm_fault *vmf)
>  	 */
>  	__SetPageUptodate(page);
>  
> -	entry = mk_pte(page, vma->vm_page_prot);
> -	if (vma->vm_flags & VM_WRITE)
> +	entry = mk_pte(page, vmf->vma_page_prot);
> +	if (vmf->vma_flags & VM_WRITE)
>  		entry = pte_mkwrite(pte_mkdirty(entry));
>  
>  	if (!pte_map_lock(vmf)) {
> @@ -3404,7 +3404,7 @@ static int do_set_pmd(struct vm_fault *vmf, struct page *page)
>  	for (i = 0; i < HPAGE_PMD_NR; i++)
>  		flush_icache_page(vma, page + i);
>  
> -	entry = mk_huge_pmd(page, vma->vm_page_prot);
> +	entry = mk_huge_pmd(page, vmf->vma_page_prot);
>  	if (write)
>  		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
>  
> @@ -3478,11 +3478,11 @@ int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  		return VM_FAULT_NOPAGE;
>  
>  	flush_icache_page(vma, page);
> -	entry = mk_pte(page, vma->vm_page_prot);
> +	entry = mk_pte(page, vmf->vma_page_prot);
>  	if (write)
>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>  	/* copy-on-write page */
> -	if (write && !(vma->vm_flags & VM_SHARED)) {
> +	if (write && !(vmf->vma_flags & VM_SHARED)) {
>  		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
>  		page_add_new_anon_rmap(page, vma, vmf->address, false);
>  		mem_cgroup_commit_charge(page, memcg, false, false);
> @@ -3521,7 +3521,7 @@ int finish_fault(struct vm_fault *vmf)
>  
>  	/* Did we COW the page? */
>  	if ((vmf->flags & FAULT_FLAG_WRITE) &&
> -	    !(vmf->vma->vm_flags & VM_SHARED))
> +	    !(vmf->vma_flags & VM_SHARED))
>  		page = vmf->cow_page;
>  	else
>  		page = vmf->page;
> @@ -3775,7 +3775,7 @@ static int do_fault(struct vm_fault *vmf)
>  		ret = VM_FAULT_SIGBUS;
>  	else if (!(vmf->flags & FAULT_FLAG_WRITE))
>  		ret = do_read_fault(vmf);
> -	else if (!(vma->vm_flags & VM_SHARED))
> +	else if (!(vmf->vma_flags & VM_SHARED))
>  		ret = do_cow_fault(vmf);
>  	else
>  		ret = do_shared_fault(vmf);
> @@ -3832,7 +3832,7 @@ static int do_numa_page(struct vm_fault *vmf)
>  	 * accessible ptes, some can allow access by kernel mode.
>  	 */
>  	pte = ptep_modify_prot_start(vma->vm_mm, vmf->address, vmf->pte);
> -	pte = pte_modify(pte, vma->vm_page_prot);
> +	pte = pte_modify(pte, vmf->vma_page_prot);
>  	pte = pte_mkyoung(pte);
>  	if (was_writable)
>  		pte = pte_mkwrite(pte);
> @@ -3866,7 +3866,7 @@ static int do_numa_page(struct vm_fault *vmf)
>  	 * Flag if the page is shared between multiple address spaces. This
>  	 * is later used when determining whether to group tasks together
>  	 */
> -	if (page_mapcount(page) > 1 && (vma->vm_flags & VM_SHARED))
> +	if (page_mapcount(page) > 1 && (vmf->vma_flags & VM_SHARED))
>  		flags |= TNF_SHARED;
>  
>  	last_cpupid = page_cpupid_last(page);
> @@ -3911,7 +3911,7 @@ static inline int wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
>  		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
>  
>  	/* COW handled on pte level: split pmd */
> -	VM_BUG_ON_VMA(vmf->vma->vm_flags & VM_SHARED, vmf->vma);
> +	VM_BUG_ON_VMA(vmf->vma_flags & VM_SHARED, vmf->vma);
>  	__split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
>  
>  	return VM_FAULT_FALLBACK;
> @@ -4058,6 +4058,8 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>  		.flags = flags,
>  		.pgoff = linear_page_index(vma, address),
>  		.gfp_mask = __get_fault_gfp_mask(vma),
> +		.vma_flags = vma->vm_flags,
> +		.vma_page_prot = vma->vm_page_prot,
>  	};
>  	unsigned int dirty = flags & FAULT_FLAG_WRITE;
>  	struct mm_struct *mm = vma->vm_mm;

Don't you also need to do this?

diff --git a/include/linux/mm.h b/include/linux/mm.h
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -694,9 +694,9 @@ void free_compound_page(struct page *page);
  * pte_mkwrite.  But get_user_pages can cause write faults for mappings
  * that do not have writing enabled, when used by access_process_vm.
  */
-static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
+static inline pte_t maybe_mkwrite(pte_t pte, unsigned long vma_flags)
 {
-	if (likely(vma->vm_flags & VM_WRITE))
+	if (likely(vma_flags & VM_WRITE))
 		pte = pte_mkwrite(pte);
 	return pte;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1195,8 +1195,8 @@ static int do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, pmd_t orig_pmd,
 
 	for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
 		pte_t entry;
-		entry = mk_pte(pages[i], vma->vm_page_prot);
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		entry = mk_pte(pages[i], vmf->vma_page_prot);
+		entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
 		memcg = (void *)page_private(pages[i]);
 		set_page_private(pages[i], 0);
 		page_add_new_anon_rmap(pages[i], vmf->vma, haddr, false);
@@ -2169,7 +2169,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 				entry = pte_swp_mksoft_dirty(entry);
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
-			entry = maybe_mkwrite(entry, vma);
+			entry = maybe_mkwrite(entry, vma->vm_flags);
 			if (!write)
 				entry = pte_wrprotect(entry);
 			if (!young)
diff --git a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1826,7 +1826,7 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 out_mkwrite:
 	if (mkwrite) {
 		entry = pte_mkyoung(entry);
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		entry = maybe_mkwrite(pte_mkdirty(entry), vma->vm_flags);
 	}
 
 	set_pte_at(mm, addr, pte, entry);
@@ -2472,7 +2472,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
 
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = pte_mkyoung(vmf->orig_pte);
-	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
 	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
 		update_mmu_cache(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -2549,8 +2549,8 @@ static int wp_page_copy(struct vm_fault *vmf)
 			inc_mm_counter_fast(mm, MM_ANONPAGES);
 		}
 		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
-		entry = mk_pte(new_page, vma->vm_page_prot);
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		entry = mk_pte(new_page, vmf->vma_page_prot);
+		entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
 		/*
 		 * Clear the pte entry and flush it first, before updating the
 		 * pte with the new entry. This will avoid a race condition
@@ -3069,7 +3069,7 @@ int do_swap_page(struct vm_fault *vmf)
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
 	pte = mk_pte(page, vmf->vma_page_prot);
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
-		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
+		pte = maybe_mkwrite(pte_mkdirty(pte), vmf->vm_flags);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
 		ret |= VM_FAULT_WRITE;
 		exclusive = RMAP_EXCLUSIVE;
@@ -3481,7 +3481,7 @@ int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vmf->vma_page_prot);
 	if (write)
-		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vm_flags);
 	/* copy-on-write page */
 	if (write && !(vmf->vma_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
diff --git a/mm/migrate.c b/mm/migrate.c
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -240,7 +240,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		 */
 		entry = pte_to_swp_entry(*pvmw.pte);
 		if (is_write_migration_entry(entry))
-			pte = maybe_mkwrite(pte, vma);
+			pte = maybe_mkwrite(pte, vma->vm_flags);
 
 		if (unlikely(is_zone_device_page(new))) {
 			if (is_device_private_page(new)) {

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 12/24] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()
  2018-03-13 17:59 ` [PATCH v9 12/24] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page() Laurent Dufour
@ 2018-04-02 23:00   ` David Rientjes
  0 siblings, 0 replies; 98+ messages in thread
From: David Rientjes @ 2018-04-02 23:00 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> migrate_misplaced_page() is only called during the page fault handling so
> it's better to pass the pointer to the struct vm_fault instead of the vma.
> 
> This way during the speculative page fault path the saved vma->vm_flags
> could be used.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 13/24] mm: Introduce __lru_cache_add_active_or_unevictable
  2018-03-13 17:59 ` [PATCH v9 13/24] mm: Introduce __lru_cache_add_active_or_unevictable Laurent Dufour
@ 2018-04-02 23:11   ` David Rientjes
  0 siblings, 0 replies; 98+ messages in thread
From: David Rientjes @ 2018-04-02 23:11 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> The speculative page fault handler which is run without holding the
> mmap_sem is calling lru_cache_add_active_or_unevictable() but the vm_flags
> is not guaranteed to remain constant.
> Introducing __lru_cache_add_active_or_unevictable() which has the vma flags
> value parameter instead of the vma pointer.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 14/24] mm: Introduce __maybe_mkwrite()
  2018-03-13 17:59 ` [PATCH v9 14/24] mm: Introduce __maybe_mkwrite() Laurent Dufour
@ 2018-04-02 23:12   ` David Rientjes
  2018-04-04 15:56     ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-02 23:12 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index dfa81a638b7c..a84ddc218bbd 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -684,13 +684,18 @@ void free_compound_page(struct page *page);
>   * pte_mkwrite.  But get_user_pages can cause write faults for mappings
>   * that do not have writing enabled, when used by access_process_vm.
>   */
> -static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
> +static inline pte_t __maybe_mkwrite(pte_t pte, unsigned long vma_flags)
>  {
> -	if (likely(vma->vm_flags & VM_WRITE))
> +	if (likely(vma_flags & VM_WRITE))
>  		pte = pte_mkwrite(pte);
>  	return pte;
>  }
>  
> +static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
> +{
> +	return __maybe_mkwrite(pte, vma->vm_flags);
> +}
> +
>  int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  		struct page *page);
>  int finish_fault(struct vm_fault *vmf);
> diff --git a/mm/memory.c b/mm/memory.c
> index 0a0a483d9a65..af0338fbc34d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2472,7 +2472,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
>  
>  	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
>  	entry = pte_mkyoung(vmf->orig_pte);
> -	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +	entry = __maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
>  	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
>  		update_mmu_cache(vma, vmf->address, vmf->pte);
>  	pte_unmap_unlock(vmf->pte, vmf->ptl);
> @@ -2549,8 +2549,8 @@ static int wp_page_copy(struct vm_fault *vmf)
>  			inc_mm_counter_fast(mm, MM_ANONPAGES);
>  		}
>  		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
> -		entry = mk_pte(new_page, vma->vm_page_prot);
> -		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		entry = mk_pte(new_page, vmf->vma_page_prot);
> +		entry = __maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
>  		/*
>  		 * Clear the pte entry and flush it first, before updating the
>  		 * pte with the new entry. This will avoid a race condition

Don't you also need to do this in do_swap_page()?

diff --git a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3067,9 +3067,9 @@ int do_swap_page(struct vm_fault *vmf)
 
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
-	pte = mk_pte(page, vma->vm_page_prot);
+	pte = mk_pte(page, vmf->vma_page_prot);
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
-		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
+		pte = __maybe_mkwrite(pte_mkdirty(pte), vmf->vma_flags);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
 		ret |= VM_FAULT_WRITE;
 		exclusive = RMAP_EXCLUSIVE;

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-03-13 17:59 ` [PATCH v9 15/24] mm: Introduce __vm_normal_page() Laurent Dufour
@ 2018-04-02 23:18   ` David Rientjes
  2018-04-04 16:04     ` Laurent Dufour
  2018-04-03 19:39   ` Jerome Glisse
  1 sibling, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-02 23:18 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a84ddc218bbd..73b8b99f482b 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1263,8 +1263,11 @@ struct zap_details {
>  	pgoff_t last_index;			/* Highest page->index to unmap */
>  };
>  
> -struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> -			     pte_t pte, bool with_public_device);
> +struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> +			      pte_t pte, bool with_public_device,
> +			      unsigned long vma_flags);
> +#define _vm_normal_page(vma, addr, pte, with_public_device) \
> +	__vm_normal_page(vma, addr, pte, with_public_device, (vma)->vm_flags)
>  #define vm_normal_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, false)
>  
>  struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,

If _vm_normal_page() is a static inline function does it break somehow?  
It's nice to avoid the #define's.

> diff --git a/mm/memory.c b/mm/memory.c
> index af0338fbc34d..184a0d663a76 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -826,8 +826,9 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
>  #else
>  # define HAVE_PTE_SPECIAL 0
>  #endif
> -struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> -			     pte_t pte, bool with_public_device)
> +struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> +			      pte_t pte, bool with_public_device,
> +			      unsigned long vma_flags)
>  {
>  	unsigned long pfn = pte_pfn(pte);
>  

Would it be possible to update the comment since the function itself is no 
longer named vm_normal_page?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 16/24] mm: Introduce __page_add_new_anon_rmap()
  2018-03-13 17:59 ` [PATCH v9 16/24] mm: Introduce __page_add_new_anon_rmap() Laurent Dufour
@ 2018-04-02 23:57   ` David Rientjes
  2018-04-10 16:30     ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-02 23:57 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> When dealing with speculative page fault handler, we may race with VMA
> being split or merged. In this case the vma->vm_start and vm->vm_end
> fields may not match the address the page fault is occurring.
> 
> This can only happens when the VMA is split but in that case, the
> anon_vma pointer of the new VMA will be the same as the original one,
> because in __split_vma the new->anon_vma is set to src->anon_vma when
> *new = *vma.
> 
> So even if the VMA boundaries are not correct, the anon_vma pointer is
> still valid.
> 
> If the VMA has been merged, then the VMA in which it has been merged
> must have the same anon_vma pointer otherwise the merge can't be done.
> 
> So in all the case we know that the anon_vma is valid, since we have
> checked before starting the speculative page fault that the anon_vma
> pointer is valid for this VMA and since there is an anon_vma this
> means that at one time a page has been backed and that before the VMA
> is cleaned, the page table lock would have to be grab to clean the
> PTE, and the anon_vma field is checked once the PTE is locked.
> 
> This patch introduce a new __page_add_new_anon_rmap() service which
> doesn't check for the VMA boundaries, and create a new inline one
> which do the check.
> 
> When called from a page fault handler, if this is not a speculative one,
> there is a guarantee that vm_start and vm_end match the faulting address,
> so this check is useless. In the context of the speculative page fault
> handler, this check may be wrong but anon_vma is still valid as explained
> above.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

I'm indifferent on this: it could be argued both sides that the new 
function and its variant for a simple VM_BUG_ON() isn't worth it and it 
would should rather be done in the callers of page_add_new_anon_rmap().  
It feels like it would be better left to the caller and add a comment to 
page_add_anon_rmap() itself in mm/rmap.c.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
  2018-03-13 17:59 ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock Laurent Dufour
  2018-03-14  8:48   ` Peter Zijlstra
  2018-03-16 10:23   ` [mm] b33ddf50eb: INFO:trying_to_register_non-static_key kernel test robot
@ 2018-04-03  0:11   ` David Rientjes
  2018-04-06 14:23     ` Laurent Dufour
  2018-04-10 16:20     ` Laurent Dufour
  2 siblings, 2 replies; 98+ messages in thread
From: David Rientjes @ 2018-04-03  0:11 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, 13 Mar 2018, Laurent Dufour wrote:

> This change is inspired by the Peter's proposal patch [1] which was
> protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
> that particular case, and it is introducing major performance degradation
> due to excessive scheduling operations.
> 
> To allow access to the mm_rb tree without grabbing the mmap_sem, this patch
> is protecting it access using a rwlock.  As the mm_rb tree is a O(log n)
> search it is safe to protect it using such a lock.  The VMA cache is not
> protected by the new rwlock and it should not be used without holding the
> mmap_sem.
> 
> To allow the picked VMA structure to be used once the rwlock is released, a
> use count is added to the VMA structure. When the VMA is allocated it is
> set to 1.  Each time the VMA is picked with the rwlock held its use count
> is incremented. Each time the VMA is released it is decremented. When the
> use count hits zero, this means that the VMA is no more used and should be
> freed.
> 
> This patch is preparing for 2 kind of VMA access :
>  - as usual, under the control of the mmap_sem,
>  - without holding the mmap_sem for the speculative page fault handler.
> 
> Access done under the control the mmap_sem doesn't require to grab the
> rwlock to protect read access to the mm_rb tree, but access in write must
> be done under the protection of the rwlock too. This affects inserting and
> removing of elements in the RB tree.
> 
> The patch is introducing 2 new functions:
>  - vma_get() to find a VMA based on an address by holding the new rwlock.
>  - vma_put() to release the VMA when its no more used.
> These services are designed to be used when access are made to the RB tree
> without holding the mmap_sem.
> 
> When a VMA is removed from the RB tree, its vma->vm_rb field is cleared and
> we rely on the WMB done when releasing the rwlock to serialize the write
> with the RMB done in a later patch to check for the VMA's validity.
> 
> When free_vma is called, the file associated with the VMA is closed
> immediately, but the policy and the file structure remained in used until
> the VMA's use count reach 0, which may happens later when exiting an
> in progress speculative page fault.
> 
> [1] https://patchwork.kernel.org/patch/5108281/
> 
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>

Can __free_vma() be generalized for mm/nommu.c's delete_vma() and 
do_mmap()?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-03-13 17:59 ` [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF Laurent Dufour
  2018-03-27 21:18   ` David Rientjes
@ 2018-04-03 19:10   ` Jerome Glisse
  2018-04-03 20:40     ` David Rientjes
  2018-04-04  9:53     ` Laurent Dufour
  1 sibling, 2 replies; 98+ messages in thread
From: Jerome Glisse @ 2018-04-03 19:10 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, Mar 13, 2018 at 06:59:36PM +0100, Laurent Dufour wrote:
> pte_unmap_same() is making the assumption that the page table are still
> around because the mmap_sem is held.
> This is no more the case when running a speculative page fault and
> additional check must be made to ensure that the final page table are still
> there.
> 
> This is now done by calling pte_spinlock() to check for the VMA's
> consistency while locking for the page tables.
> 
> This is requiring passing a vm_fault structure to pte_unmap_same() which is
> containing all the needed parameters.
> 
> As pte_spinlock() may fail in the case of a speculative page fault, if the
> VMA has been touched in our back, pte_unmap_same() should now return 3
> cases :
> 	1. pte are the same (0)
> 	2. pte are different (VM_FAULT_PTNOTSAME)
> 	3. a VMA's changes has been detected (VM_FAULT_RETRY)
> 
> The case 2 is handled by the introduction of a new VM_FAULT flag named
> VM_FAULT_PTNOTSAME which is then trapped in cow_user_page().
> If VM_FAULT_RETRY is returned, it is passed up to the callers to retry the
> page fault while holding the mmap_sem.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> ---
>  include/linux/mm.h |  1 +
>  mm/memory.c        | 29 +++++++++++++++++++----------
>  2 files changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2f3e98edc94a..b6432a261e63 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1199,6 +1199,7 @@ static inline void clear_page_pfmemalloc(struct page *page)
>  #define VM_FAULT_NEEDDSYNC  0x2000	/* ->fault did not modify page tables
>  					 * and needs fsync() to complete (for
>  					 * synchronous page faults in DAX) */
> +#define VM_FAULT_PTNOTSAME 0x4000	/* Page table entries have changed */
>  
>  #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
>  			 VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE | \
> diff --git a/mm/memory.c b/mm/memory.c
> index 21b1212a0892..4bc7b0bdcb40 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
>   * parts, do_swap_page must check under lock before unmapping the pte and
>   * proceeding (but do_wp_page is only called after already making such a check;
>   * and do_anonymous_page can safely check later on).
> + *
> + * pte_unmap_same() returns:
> + *	0			if the PTE are the same
> + *	VM_FAULT_PTNOTSAME	if the PTE are different
> + *	VM_FAULT_RETRY		if the VMA has changed in our back during
> + *				a speculative page fault handling.
>   */
> -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> -				pte_t *page_table, pte_t orig_pte)
> +static inline int pte_unmap_same(struct vm_fault *vmf)
>  {
> -	int same = 1;
> +	int ret = 0;
> +
>  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
>  	if (sizeof(pte_t) > sizeof(unsigned long)) {
> -		spinlock_t *ptl = pte_lockptr(mm, pmd);
> -		spin_lock(ptl);
> -		same = pte_same(*page_table, orig_pte);
> -		spin_unlock(ptl);
> +		if (pte_spinlock(vmf)) {
> +			if (!pte_same(*vmf->pte, vmf->orig_pte))
> +				ret = VM_FAULT_PTNOTSAME;
> +			spin_unlock(vmf->ptl);
> +		} else
> +			ret = VM_FAULT_RETRY;
>  	}
>  #endif
> -	pte_unmap(page_table);
> -	return same;
> +	pte_unmap(vmf->pte);
> +	return ret;
>  }
>  
>  static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
>  	int exclusive = 0;
>  	int ret = 0;
>  
> -	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> +	ret = pte_unmap_same(vmf);
> +	if (ret)
>  		goto out;
>  

This change what do_swap_page() returns ie before it was returning 0
when locked pte lookup was different from orig_pte. After this patch
it returns VM_FAULT_PTNOTSAME but this is a new return value for
handle_mm_fault() (the do_swap_page() return value is what ultimately
get return by handle_mm_fault())

Do we really want that ? This might confuse some existing user of
handle_mm_fault() and i am not sure of the value of that information
to caller.

Note i do understand that you want to return retry if anything did
change from underneath and thus need to differentiate from when the
pte value are not the same.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-03-13 17:59 ` [PATCH v9 15/24] mm: Introduce __vm_normal_page() Laurent Dufour
  2018-04-02 23:18   ` David Rientjes
@ 2018-04-03 19:39   ` Jerome Glisse
  2018-04-03 20:45     ` David Rientjes
  2018-04-04 16:26     ` Laurent Dufour
  1 sibling, 2 replies; 98+ messages in thread
From: Jerome Glisse @ 2018-04-03 19:39 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, Mar 13, 2018 at 06:59:45PM +0100, Laurent Dufour wrote:
> When dealing with the speculative fault path we should use the VMA's field
> cached value stored in the vm_fault structure.
> 
> Currently vm_normal_page() is using the pointer to the VMA to fetch the
> vm_flags value. This patch provides a new __vm_normal_page() which is
> receiving the vm_flags flags value as parameter.
> 
> Note: The speculative path is turned on for architecture providing support
> for special PTE flag. So only the first block of vm_normal_page is used
> during the speculative path.

Might be a good idea to explicitly have SPECULATIVE Kconfig option depends
on ARCH_PTE_SPECIAL and a comment for !HAVE_PTE_SPECIAL in the function
explaining that speculative page fault should never reach that point.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 00/24] Speculative page faults
  2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
                   ` (25 preceding siblings ...)
       [not found] ` <1520963994-28477-8-git-send-email-ldufour@linux.vnet.ibm.com>
@ 2018-04-03 20:37 ` Jerome Glisse
  2018-04-04  7:59   ` Laurent Dufour
  26 siblings, 1 reply; 98+ messages in thread
From: Jerome Glisse @ 2018-04-03 20:37 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Tue, Mar 13, 2018 at 06:59:30PM +0100, Laurent Dufour wrote:
> This is a port on kernel 4.16 of the work done by Peter Zijlstra to
> handle page fault without holding the mm semaphore [1].
> 
> The idea is to try to handle user space page faults without holding the
> mmap_sem. This should allow better concurrency for massively threaded
> process since the page fault handler will not wait for other threads memory
> layout change to be done, assuming that this change is done in another part
> of the process's memory space. This type page fault is named speculative
> page fault. If the speculative page fault fails because of a concurrency is
> detected or because underlying PMD or PTE tables are not yet allocating, it
> is failing its processing and a classic page fault is then tried.
> 
> The speculative page fault (SPF) has to look for the VMA matching the fault
> address without holding the mmap_sem, this is done by introducing a rwlock
> which protects the access to the mm_rb tree. Previously this was done using
> SRCU but it was introducing a lot of scheduling to process the VMA's
> freeing
> operation which was hitting the performance by 20% as reported by Kemi Wang
> [2].Using a rwlock to protect access to the mm_rb tree is limiting the
> locking contention to these operations which are expected to be in a O(log
> n)
> order. In addition to ensure that the VMA is not freed in our back a
> reference count is added and 2 services (get_vma() and put_vma()) are
> introduced to handle the reference count. When a VMA is fetch from the RB
> tree using get_vma() is must be later freeed using put_vma(). Furthermore,
> to allow the VMA to be used again by the classic page fault handler a
> service is introduced can_reuse_spf_vma(). This service is expected to be
> called with the mmap_sem hold. It checked that the VMA is still matching
> the specified address and is releasing its reference count as the mmap_sem
> is hold it is ensure that it will not be freed in our back. In general, the
> VMA's reference count could be decremented when holding the mmap_sem but it
> should not be increased as holding the mmap_sem is ensuring that the VMA is
> stable. I can't see anymore the overhead I got while will-it-scale
> benchmark anymore.
> 
> The VMA's attributes checked during the speculative page fault processing
> have to be protected against parallel changes. This is done by using a per
> VMA sequence lock. This sequence lock allows the speculative page fault
> handler to fast check for parallel changes in progress and to abort the
> speculative page fault in that case.
> 
> Once the VMA is found, the speculative page fault handler would check for
> the VMA's attributes to verify that the page fault has to be handled
> correctly or not. Thus the VMA is protected through a sequence lock which
> allows fast detection of concurrent VMA changes. If such a change is
> detected, the speculative page fault is aborted and a *classic* page fault
> is tried.  VMA sequence lockings are added when VMA attributes which are
> checked during the page fault are modified.
> 
> When the PTE is fetched, the VMA is checked to see if it has been changed,
> so once the page table is locked, the VMA is valid, so any other changes
> leading to touching this PTE will need to lock the page table, so no
> parallel change is possible at this time.

What would have been nice is some pseudo highlevel code before all the
above detailed description. Something like:
  speculative_fault(addr) {
    mm_lock_for_vma_snapshot()
    vma_snapshot = snapshot_vma_infos(addr)
    mm_unlock_for_vma_snapshot()
    ...
    if (!vma_can_speculatively_fault(vma_snapshot, addr))
        return;
    ...
    /* Do fault ie alloc memory, read from file ... */
    page = ...;

    preempt_disable();
    if (vma_snapshot_still_valid(vma_snapshot, addr) &&
        vma_pte_map_lock(vma_snapshot, addr)) {
        if (pte_same(ptep, orig_pte)) {
            /* Setup new pte */
            page = NULL;
        }
    }
    preempt_enable();
    if (page)
        put(page)
  }

I just find pseudo code easier for grasping the highlevel view of the
expected code flow.


> 
> The locking of the PTE is done with interrupts disabled, this allows to
> check for the PMD to ensure that there is not an ongoing collapsing
> operation. Since khugepaged is firstly set the PMD to pmd_none and then is
> waiting for the other CPU to have catch the IPI interrupt, if the pmd is
> valid at the time the PTE is locked, we have the guarantee that the
> collapsing opertion will have to wait on the PTE lock to move foward. This
> allows the SPF handler to map the PTE safely. If the PMD value is different
> than the one recorded at the beginning of the SPF operation, the classic
> page fault handler will be called to handle the operation while holding the
> mmap_sem. As the PTE lock is done with the interrupts disabled, the lock is
> done using spin_trylock() to avoid dead lock when handling a page fault
> while a TLB invalidate is requested by an other CPU holding the PTE.
> 
> Support for THP is not done because when checking for the PMD, we can be
> confused by an in progress collapsing operation done by khugepaged. The
> issue is that pmd_none() could be true either if the PMD is not already
> populated or if the underlying PTE are in the way to be collapsed. So we
> cannot safely allocate a PMD if pmd_none() is true.

Might be a good topic fo LSF/MM, should we set the pmd to something
else then 0 when collapsing pmd (apply to pud too) ? This would allow
support THP.

[...]

> 
> Ebizzy:
> -------
> The test is counting the number of records per second it can manage, the
> higher is the best. I run it like this 'ebizzy -mTRp'. To get consistent
> result I repeated the test 100 times and measure the average result. The
> number is the record processes per second, the higher is the best.
> 
>   		BASE		SPF		delta	
> 16 CPUs x86 VM	14902.6		95905.16	543.55%
> 80 CPUs P8 node	37240.24	78185.67	109.95%

I find those results interesting as it seems that SPF do not scale well
on big configuration. Note that it still have a sizeable improvement so
it is still a very interesting feature i believe.

Still understanding what is happening here might a good idea. From the
numbers below it seems there is 2 causes to the scaling issue. First
pte lock contention (kind of expected i guess). Second changes to vma
while faulting.

Have you thought about this ? Do i read those numbers in the wrong way ?

> 
> Here are the performance counter read during a run on a 16 CPUs x86 VM:
>  Performance counter stats for './ebizzy -mRTp':
>             888157      faults
>             884773      spf
>                 92      pagefault:spf_pte_lock
>               2379      pagefault:spf_vma_changed
>                  0      pagefault:spf_vma_noanon
>                 80      pagefault:spf_vma_notsup
>                  0      pagefault:spf_vma_access
>                  0      pagefault:spf_pmd_changed
> 
> And the ones captured during a run on a 80 CPUs Power node:
>  Performance counter stats for './ebizzy -mRTp':
>             762134      faults
>             728663      spf
>              19101      pagefault:spf_pte_lock
>              13969      pagefault:spf_vma_changed
>                  0      pagefault:spf_vma_noanon
>                272      pagefault:spf_vma_notsup
>                  0      pagefault:spf_vma_access
>                  0      pagefault:spf_pmd_changed


There is one aspect that i would like to see cover. Maybe i am not
understanding something fundamental, but it seems to me that SPF can
trigger OOM or at very least over stress page allocation.

Assume you have a lot of concurrent SPF to anonymous vma and they all
allocate new pages, then you might overallocate for a single address
by a factor correlated with the number of CPUs in your system. Now,
multiply this for several distinc address and you might be allocating
a lot of memory transiently ie just for a short period time. While
the fact that you quickly free when you fail should prevent the OOM
reaper. But still this might severly stress the memory allocation
path.

Am i missing something in how this all work ? Or is the above some-
thing that might be of concern ? Should there be some boundary on the
maximum number of concurrent SPF (and thus boundary on maximum page
temporary page allocation) ?

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-04-03 19:10   ` Jerome Glisse
@ 2018-04-03 20:40     ` David Rientjes
  2018-04-03 21:04       ` Jerome Glisse
  2018-04-04  9:53     ` Laurent Dufour
  1 sibling, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-03 20:40 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Laurent Dufour, paulmck, peterz, akpm, kirill, ak, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86

On Tue, 3 Apr 2018, Jerome Glisse wrote:

> > diff --git a/mm/memory.c b/mm/memory.c
> > index 21b1212a0892..4bc7b0bdcb40 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
> >   * parts, do_swap_page must check under lock before unmapping the pte and
> >   * proceeding (but do_wp_page is only called after already making such a check;
> >   * and do_anonymous_page can safely check later on).
> > + *
> > + * pte_unmap_same() returns:
> > + *	0			if the PTE are the same
> > + *	VM_FAULT_PTNOTSAME	if the PTE are different
> > + *	VM_FAULT_RETRY		if the VMA has changed in our back during
> > + *				a speculative page fault handling.
> >   */
> > -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> > -				pte_t *page_table, pte_t orig_pte)
> > +static inline int pte_unmap_same(struct vm_fault *vmf)
> >  {
> > -	int same = 1;
> > +	int ret = 0;
> > +
> >  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
> >  	if (sizeof(pte_t) > sizeof(unsigned long)) {
> > -		spinlock_t *ptl = pte_lockptr(mm, pmd);
> > -		spin_lock(ptl);
> > -		same = pte_same(*page_table, orig_pte);
> > -		spin_unlock(ptl);
> > +		if (pte_spinlock(vmf)) {
> > +			if (!pte_same(*vmf->pte, vmf->orig_pte))
> > +				ret = VM_FAULT_PTNOTSAME;
> > +			spin_unlock(vmf->ptl);
> > +		} else
> > +			ret = VM_FAULT_RETRY;
> >  	}
> >  #endif
> > -	pte_unmap(page_table);
> > -	return same;
> > +	pte_unmap(vmf->pte);
> > +	return ret;
> >  }
> >  
> >  static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
> > @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
> >  	int exclusive = 0;
> >  	int ret = 0;
> >  
> > -	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> > +	ret = pte_unmap_same(vmf);
> > +	if (ret)
> >  		goto out;
> >  
> 
> This change what do_swap_page() returns ie before it was returning 0
> when locked pte lookup was different from orig_pte. After this patch
> it returns VM_FAULT_PTNOTSAME but this is a new return value for
> handle_mm_fault() (the do_swap_page() return value is what ultimately
> get return by handle_mm_fault())
> 
> Do we really want that ? This might confuse some existing user of
> handle_mm_fault() and i am not sure of the value of that information
> to caller.
> 
> Note i do understand that you want to return retry if anything did
> change from underneath and thus need to differentiate from when the
> pte value are not the same.
> 

I think VM_FAULT_RETRY should be handled appropriately for any user of 
handle_mm_fault() already, and would be surprised to learn differently.  
Khugepaged has the appropriate handling.  I think the concern is whether a 
user is handling anything other than VM_FAULT_RETRY and VM_FAULT_ERROR 
(which VM_FAULT_PTNOTSAME is not set in)?  I haven't done a full audit of 
the users.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-04-03 19:39   ` Jerome Glisse
@ 2018-04-03 20:45     ` David Rientjes
  2018-04-04 16:26     ` Laurent Dufour
  1 sibling, 0 replies; 98+ messages in thread
From: David Rientjes @ 2018-04-03 20:45 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Laurent Dufour, paulmck, peterz, akpm, kirill, ak, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86

On Tue, 3 Apr 2018, Jerome Glisse wrote:

> > When dealing with the speculative fault path we should use the VMA's field
> > cached value stored in the vm_fault structure.
> > 
> > Currently vm_normal_page() is using the pointer to the VMA to fetch the
> > vm_flags value. This patch provides a new __vm_normal_page() which is
> > receiving the vm_flags flags value as parameter.
> > 
> > Note: The speculative path is turned on for architecture providing support
> > for special PTE flag. So only the first block of vm_normal_page is used
> > during the speculative path.
> 
> Might be a good idea to explicitly have SPECULATIVE Kconfig option depends
> on ARCH_PTE_SPECIAL and a comment for !HAVE_PTE_SPECIAL in the function
> explaining that speculative page fault should never reach that point.

Yeah, I think that's appropriate but in a follow-up patch since this is 
only propagating vma_flags.  It will require that __HAVE_ARCH_PTE_SPECIAL 
become an actual Kconfig entry, however.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-04-03 20:40     ` David Rientjes
@ 2018-04-03 21:04       ` Jerome Glisse
  0 siblings, 0 replies; 98+ messages in thread
From: Jerome Glisse @ 2018-04-03 21:04 UTC (permalink / raw)
  To: David Rientjes
  Cc: Laurent Dufour, paulmck, peterz, akpm, kirill, ak, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86

On Tue, Apr 03, 2018 at 01:40:18PM -0700, David Rientjes wrote:
> On Tue, 3 Apr 2018, Jerome Glisse wrote:
> 
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 21b1212a0892..4bc7b0bdcb40 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
> > >   * parts, do_swap_page must check under lock before unmapping the pte and
> > >   * proceeding (but do_wp_page is only called after already making such a check;
> > >   * and do_anonymous_page can safely check later on).
> > > + *
> > > + * pte_unmap_same() returns:
> > > + *	0			if the PTE are the same
> > > + *	VM_FAULT_PTNOTSAME	if the PTE are different
> > > + *	VM_FAULT_RETRY		if the VMA has changed in our back during
> > > + *				a speculative page fault handling.
> > >   */

[...]

> > >  
> > 
> > This change what do_swap_page() returns ie before it was returning 0
> > when locked pte lookup was different from orig_pte. After this patch
> > it returns VM_FAULT_PTNOTSAME but this is a new return value for
> > handle_mm_fault() (the do_swap_page() return value is what ultimately
> > get return by handle_mm_fault())
> > 
> > Do we really want that ? This might confuse some existing user of
> > handle_mm_fault() and i am not sure of the value of that information
> > to caller.
> > 
> > Note i do understand that you want to return retry if anything did
> > change from underneath and thus need to differentiate from when the
> > pte value are not the same.
> > 
> 
> I think VM_FAULT_RETRY should be handled appropriately for any user of 
> handle_mm_fault() already, and would be surprised to learn differently.  
> Khugepaged has the appropriate handling.  I think the concern is whether a 
> user is handling anything other than VM_FAULT_RETRY and VM_FAULT_ERROR 
> (which VM_FAULT_PTNOTSAME is not set in)?  I haven't done a full audit of 
> the users.

I am not worried about VM_FAULT_RETRY and barely have any worry about
VM_FAULT_PTNOTSAME either as they are other comparable new return value
(VM_FAULT_NEEDDSYNC for instance which is quite recent).

I wonder if adding a new value is really needed here. I don't see any
value to it for caller of handle_mm_fault() except for stats.

Note that I am not oppose, but while today we have free bits, maybe
tomorrow we will run out, i am always worried about thing like that :)

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 04/24] mm: Prepare for FAULT_FLAG_SPECULATIVE
  2018-03-28 10:27     ` Laurent Dufour
@ 2018-04-03 21:57       ` David Rientjes
  2018-04-04  9:23         ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-03 21:57 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Wed, 28 Mar 2018, Laurent Dufour wrote:

> >> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >> index 4d02524a7998..2f3e98edc94a 100644
> >> --- a/include/linux/mm.h
> >> +++ b/include/linux/mm.h
> >> @@ -300,6 +300,7 @@ extern pgprot_t protection_map[16];
> >>  #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
> >>  #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
> >>  #define FAULT_FLAG_INSTRUCTION  0x100	/* The fault was during an instruction fetch */
> >> +#define FAULT_FLAG_SPECULATIVE	0x200	/* Speculative fault, not holding mmap_sem */
> >>  
> >>  #define FAULT_FLAG_TRACE \
> >>  	{ FAULT_FLAG_WRITE,		"WRITE" }, \
> > 
> > I think FAULT_FLAG_SPECULATIVE should be introduced in the patch that 
> > actually uses it.
> 
> I think you're right, I'll move down this define in the series.
> 
> >> diff --git a/mm/memory.c b/mm/memory.c
> >> index e0ae4999c824..8ac241b9f370 100644
> >> --- a/mm/memory.c
> >> +++ b/mm/memory.c
> >> @@ -2288,6 +2288,13 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
> >>  }
> >>  EXPORT_SYMBOL_GPL(apply_to_page_range);
> >>  
> >> +static bool pte_map_lock(struct vm_fault *vmf)
> > 
> > inline?
> 
> Agreed.
> 

Ignore this, the final form of the function after the full patchset 
shouldn't be inline.

> >> +{
> >> +	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
> >> +				       vmf->address, &vmf->ptl);
> >> +	return true;
> >> +}
> >> +
> >>  /*
> >>   * handle_pte_fault chooses page fault handler according to an entry which was
> >>   * read non-atomically.  Before making any commitment, on those architectures
> >> @@ -2477,6 +2484,7 @@ static int wp_page_copy(struct vm_fault *vmf)
> >>  	const unsigned long mmun_start = vmf->address & PAGE_MASK;
> >>  	const unsigned long mmun_end = mmun_start + PAGE_SIZE;
> >>  	struct mem_cgroup *memcg;
> >> +	int ret = VM_FAULT_OOM;
> >>  
> >>  	if (unlikely(anon_vma_prepare(vma)))
> >>  		goto oom;
> >> @@ -2504,7 +2512,11 @@ static int wp_page_copy(struct vm_fault *vmf)
> >>  	/*
> >>  	 * Re-check the pte - we dropped the lock
> >>  	 */
> >> -	vmf->pte = pte_offset_map_lock(mm, vmf->pmd, vmf->address, &vmf->ptl);
> >> +	if (!pte_map_lock(vmf)) {
> >> +		mem_cgroup_cancel_charge(new_page, memcg, false);
> >> +		ret = VM_FAULT_RETRY;
> >> +		goto oom_free_new;
> >> +	}
> > 
> > Ugh, but we aren't oom here, so maybe rename oom_free_new so that it makes 
> > sense for return values other than VM_FAULT_OOM?
> 
> You're right, now this label name is not correct, I'll rename it to
> "out_free_new" and rename also the label "oom" to "out" since it is generic too
> now.
> 

I think it would just be better to introduce a out_uncharge that handles 
the mem_cgroup_cancel_charge() in the exit path.

diff --git a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2645,9 +2645,8 @@ static int wp_page_copy(struct vm_fault *vmf)
 	 * Re-check the pte - we dropped the lock
 	 */
 	if (!pte_map_lock(vmf)) {
-		mem_cgroup_cancel_charge(new_page, memcg, false);
 		ret = VM_FAULT_RETRY;
-		goto oom_free_new;
+		goto out_uncharge;
 	}
 	if (likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 		if (old_page) {
@@ -2735,6 +2734,8 @@ static int wp_page_copy(struct vm_fault *vmf)
 		put_page(old_page);
 	}
 	return page_copied ? VM_FAULT_WRITE : 0;
+out_uncharge:
+	mem_cgroup_cancel_charge(new_page, memcg, false);
 oom_free_new:
 	put_page(new_page);
 oom:

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-03-28 13:30         ` Laurent Dufour
@ 2018-04-04  0:48           ` David Rientjes
  2018-04-04  1:03             ` David Rientjes
  2018-04-04 10:19             ` Laurent Dufour
  0 siblings, 2 replies; 98+ messages in thread
From: David Rientjes @ 2018-04-04  0:48 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp

On Wed, 28 Mar 2018, Laurent Dufour wrote:

> On 26/03/2018 00:10, David Rientjes wrote:
> > On Wed, 21 Mar 2018, Laurent Dufour wrote:
> > 
> >> I found the root cause of this lockdep warning.
> >>
> >> In mmap_region(), unmap_region() may be called while vma_link() has not been
> >> called. This happens during the error path if call_mmap() failed.
> >>
> >> The only to fix that particular case is to call
> >> seqcount_init(&vma->vm_sequence) when initializing the vma in mmap_region().
> >>
> > 
> > Ack, although that would require a fixup to dup_mmap() as well.
> 
> You're right, I'll fix that too.
> 

I also think the following is needed:

diff --git a/fs/exec.c b/fs/exec.c
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -312,6 +312,10 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
 	vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
 	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	seqcount_init(&vma->vm_sequence);
+	atomic_set(&vma->vm_ref_count, 0);
+#endif
 
 	err = insert_vm_struct(mm, vma);
 	if (err)

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-04-04  0:48           ` David Rientjes
@ 2018-04-04  1:03             ` David Rientjes
  2018-04-04 10:28               ` Laurent Dufour
  2018-04-04 10:19             ` Laurent Dufour
  1 sibling, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-04  1:03 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp

On Tue, 3 Apr 2018, David Rientjes wrote:

> > >> I found the root cause of this lockdep warning.
> > >>
> > >> In mmap_region(), unmap_region() may be called while vma_link() has not been
> > >> called. This happens during the error path if call_mmap() failed.
> > >>
> > >> The only to fix that particular case is to call
> > >> seqcount_init(&vma->vm_sequence) when initializing the vma in mmap_region().
> > >>
> > > 
> > > Ack, although that would require a fixup to dup_mmap() as well.
> > 
> > You're right, I'll fix that too.
> > 
> 
> I also think the following is needed:
> 
> diff --git a/fs/exec.c b/fs/exec.c
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -312,6 +312,10 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
>  	vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
>  	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>  	INIT_LIST_HEAD(&vma->anon_vma_chain);
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	seqcount_init(&vma->vm_sequence);
> +	atomic_set(&vma->vm_ref_count, 0);
> +#endif
>  
>  	err = insert_vm_struct(mm, vma);
>  	if (err)
> 

Ugh, I think there are a number of other places where this is needed as 
well in mm/mmap.c.  I think it would be better to just create a new 
alloc_vma(unsigned long flags) that all vma allocators can use and for 
CONFIG_SPECULATIVE_PAGE_FAULT will initialize the seqcount_t and atomic_t.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 00/24] Speculative page faults
  2018-04-03 20:37 ` [PATCH v9 00/24] Speculative page faults Jerome Glisse
@ 2018-04-04  7:59   ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04  7:59 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

Hi Jerome,

Thanks for reviewing this series.

On 03/04/2018 22:37, Jerome Glisse wrote:
> On Tue, Mar 13, 2018 at 06:59:30PM +0100, Laurent Dufour wrote:
>> This is a port on kernel 4.16 of the work done by Peter Zijlstra to
>> handle page fault without holding the mm semaphore [1].
>>
>> The idea is to try to handle user space page faults without holding the
>> mmap_sem. This should allow better concurrency for massively threaded
>> process since the page fault handler will not wait for other threads memory
>> layout change to be done, assuming that this change is done in another part
>> of the process's memory space. This type page fault is named speculative
>> page fault. If the speculative page fault fails because of a concurrency is
>> detected or because underlying PMD or PTE tables are not yet allocating, it
>> is failing its processing and a classic page fault is then tried.
>>
>> The speculative page fault (SPF) has to look for the VMA matching the fault
>> address without holding the mmap_sem, this is done by introducing a rwlock
>> which protects the access to the mm_rb tree. Previously this was done using
>> SRCU but it was introducing a lot of scheduling to process the VMA's
>> freeing
>> operation which was hitting the performance by 20% as reported by Kemi Wang
>> [2].Using a rwlock to protect access to the mm_rb tree is limiting the
>> locking contention to these operations which are expected to be in a O(log
>> n)
>> order. In addition to ensure that the VMA is not freed in our back a
>> reference count is added and 2 services (get_vma() and put_vma()) are
>> introduced to handle the reference count. When a VMA is fetch from the RB
>> tree using get_vma() is must be later freeed using put_vma(). Furthermore,
>> to allow the VMA to be used again by the classic page fault handler a
>> service is introduced can_reuse_spf_vma(). This service is expected to be
>> called with the mmap_sem hold. It checked that the VMA is still matching
>> the specified address and is releasing its reference count as the mmap_sem
>> is hold it is ensure that it will not be freed in our back. In general, the
>> VMA's reference count could be decremented when holding the mmap_sem but it
>> should not be increased as holding the mmap_sem is ensuring that the VMA is
>> stable. I can't see anymore the overhead I got while will-it-scale
>> benchmark anymore.
>>
>> The VMA's attributes checked during the speculative page fault processing
>> have to be protected against parallel changes. This is done by using a per
>> VMA sequence lock. This sequence lock allows the speculative page fault
>> handler to fast check for parallel changes in progress and to abort the
>> speculative page fault in that case.
>>
>> Once the VMA is found, the speculative page fault handler would check for
>> the VMA's attributes to verify that the page fault has to be handled
>> correctly or not. Thus the VMA is protected through a sequence lock which
>> allows fast detection of concurrent VMA changes. If such a change is
>> detected, the speculative page fault is aborted and a *classic* page fault
>> is tried.  VMA sequence lockings are added when VMA attributes which are
>> checked during the page fault are modified.
>>
>> When the PTE is fetched, the VMA is checked to see if it has been changed,
>> so once the page table is locked, the VMA is valid, so any other changes
>> leading to touching this PTE will need to lock the page table, so no
>> parallel change is possible at this time.
> 
> What would have been nice is some pseudo highlevel code before all the
> above detailed description. Something like:
>   speculative_fault(addr) {
>     mm_lock_for_vma_snapshot()
>     vma_snapshot = snapshot_vma_infos(addr)
>     mm_unlock_for_vma_snapshot()
>     ...
>     if (!vma_can_speculatively_fault(vma_snapshot, addr))
>         return;
>     ...
>     /* Do fault ie alloc memory, read from file ... */
>     page = ...;
> 
>     preempt_disable();
>     if (vma_snapshot_still_valid(vma_snapshot, addr) &&
>         vma_pte_map_lock(vma_snapshot, addr)) {
>         if (pte_same(ptep, orig_pte)) {
>             /* Setup new pte */
>             page = NULL;
>         }
>     }
>     preempt_enable();
>     if (page)
>         put(page)
>   }
> 
> I just find pseudo code easier for grasping the highlevel view of the
> expected code flow.

Fair enough, I agree that sounds easier this way, but one might argue that the
pseudo code is not more valid or accurate at one time :)

As always, the updated documentation is the code itself.

I'll try to put one inspired by yours in the next series's header.

>>
>> The locking of the PTE is done with interrupts disabled, this allows to
>> check for the PMD to ensure that there is not an ongoing collapsing
>> operation. Since khugepaged is firstly set the PMD to pmd_none and then is
>> waiting for the other CPU to have catch the IPI interrupt, if the pmd is
>> valid at the time the PTE is locked, we have the guarantee that the
>> collapsing opertion will have to wait on the PTE lock to move foward. This
>> allows the SPF handler to map the PTE safely. If the PMD value is different
>> than the one recorded at the beginning of the SPF operation, the classic
>> page fault handler will be called to handle the operation while holding the
>> mmap_sem. As the PTE lock is done with the interrupts disabled, the lock is
>> done using spin_trylock() to avoid dead lock when handling a page fault
>> while a TLB invalidate is requested by an other CPU holding the PTE.
>>
>> Support for THP is not done because when checking for the PMD, we can be
>> confused by an in progress collapsing operation done by khugepaged. The
>> issue is that pmd_none() could be true either if the PMD is not already
>> populated or if the underlying PTE are in the way to be collapsed. So we
>> cannot safely allocate a PMD if pmd_none() is true.
> 
> Might be a good topic fo LSF/MM, should we set the pmd to something
> else then 0 when collapsing pmd (apply to pud too) ? This would allow
> support THP.

Absolutely !

> [...]
> 
>>
>> Ebizzy:
>> -------
>> The test is counting the number of records per second it can manage, the
>> higher is the best. I run it like this 'ebizzy -mTRp'. To get consistent
>> result I repeated the test 100 times and measure the average result. The
>> number is the record processes per second, the higher is the best.
>>
>>   		BASE		SPF		delta	
>> 16 CPUs x86 VM	14902.6		95905.16	543.55%
>> 80 CPUs P8 node	37240.24	78185.67	109.95%
> 
> I find those results interesting as it seems that SPF do not scale well
> on big configuration. Note that it still have a sizeable improvement so
> it is still a very interesting feature i believe.
> 
> Still understanding what is happening here might a good idea. From the
> numbers below it seems there is 2 causes to the scaling issue. First
> pte lock contention (kind of expected i guess). Second changes to vma
> while faulting.
> 
> Have you thought about this ? Do i read those numbers in the wrong way ?

Your reading of the numbers is correct, but there is also another point to keep
in mind, on ppc64, the default page size is 64K, and since we are mapping new
pages for user space, those pages have to be cleared, leading to more time
spent clearing pages on ppc64 which leads to less page fault ratio on ppc64.
And since the VMA is checked again once the cleared page is allocated, there is
a major chance for that VMA to be touched in the ebizzy case.

>>
>> Here are the performance counter read during a run on a 16 CPUs x86 VM:
>>  Performance counter stats for './ebizzy -mRTp':
>>             As always, the updated documentation is the code itself.
>>             888157      faults
>>             884773      spf
>>                 92      pagefault:spf_pte_lock
>>               2379      pagefault:spf_vma_changed
>>                  0      pagefault:spf_vma_noanon
>>                 80      pagefault:spf_vma_notsup
>>                  0      pagefault:spf_vma_access
>>                  0      pagefault:spf_pmd_changed
>>
>> And the ones captured during a run on a 80 CPUs Power node:
>>  Performance counter stats for './ebizzy -mRTp':
>>             762134      faults
>>             728663      spf
>>              19101      pagefault:spf_pte_lock
>>              13969      pagefault:spf_vma_changed
>>                  0      pagefault:spf_vma_noanon
>>                272      pagefault:spf_vma_notsup
>>                  0      pagefault:spf_vma_access
>>                  0      pagefault:spf_pmd_changed
> 
> 
> There is one aspect that i would like to see cover. Maybe i am not
> understanding something fundamental, but it seems to me that SPF can
> trigger OOM or at very least over stress page allocation.
> 
> Assume you have a lot of concurrent SPF to anonymous vma and they all
> allocate new pages, then you might overallocate for a single address
> by a factor correlated with the number of CPUs in your system. Now,
> multiply this for several distinc address and you might be allocating
> a lot of memory transiently ie just for a short period time. While
> the fact that you quickly free when you fail should prevent the OOM
> reaper. But still this might severly stress the memory allocation
> path.

That's an interesting point, and you're right, SPF may lead to page allocation
that will not be used.
But as you mentioned this will be a factor of CPU numbers, so the max page
overhead, assuming that all minus one threads of the same process are dealing
with page on the same VMA and the last one is touching that VMA parallel,
is (nrcpus-1) page allocated at one time which may not be used immediately.

I'm not sure this will be a major risk, but I might be too optimistic.

This raises also the question of the cleared page cache, I'd have to see if
there is such a cache is in place.

> Am i missing something in how this all work ? Or is the above some-
> thing that might be of concern ? Should there be some boundary on the
> maximum number of concurrent SPF (and thus boundary on maximum page
> temporary page allocation) ?

I don't think you're missing anything ;)

It would be easy to introduce such a limit in the case OOM are trigger too many
times due to SPF handling.

Cheers,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 09/24] mm: protect mremap() against SPF hanlder
  2018-03-28 21:21       ` David Rientjes
@ 2018-04-04  8:24         ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04  8:24 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 28/03/2018 23:21, David Rientjes wrote:
> On Wed, 28 Mar 2018, Laurent Dufour wrote:
> 
>>>> @@ -326,7 +336,10 @@ static unsigned long move_vma(struct vm_area_struct *vma,
>>>>  		mremap_userfaultfd_prep(new_vma, uf);
>>>>  		arch_remap(mm, old_addr, old_addr + old_len,
>>>>  			   new_addr, new_addr + new_len);
>>>> +		if (vma != new_vma)
>>>> +			vm_raw_write_end(vma);
>>>>  	}
>>>> +	vm_raw_write_end(new_vma);
>>>
>>> Just do
>>>
>>> vm_raw_write_end(vma);
>>> vm_raw_write_end(new_vma);
>>>
>>> here.
>>
>> Are you sure ? we can have vma = new_vma done if (unlikely(err))
>>
> 
> Sorry, what I meant was do
> 
> if (vma != new_vma)
> 	vm_raw_write_end(vma);
> vm_raw_write_end(new_vma);
> 
> after the conditional.  Having the locking unnecessarily embedded in the 
> conditional has been an issue in the past with other areas of core code, 
> unless you have a strong reason for it.

Unfortunately, I can't see how doing this in another way since vma = new_vma is
done in the error branch.
So releasing the VMAs outside of the conditional may lead to miss 'vma' if the
error branch is taken.

Here is the code snippet as a reminder:

	new_vma = copy_vma(&vma, new_addr, new_len, new_pgoff,
			   &need_rmap_locks);
	[...]
	if (vma != new_vma)
		vm_raw_write_begin(vma);
	[...]
	if (unlikely(err)) {
		[...]
		if (vma != new_vma)
			vm_raw_write_end(vma);
		vma = new_vma; <<<< here we lost reference to vma
		[...]
	} else {
		[...]
		if (vma != new_vma)
			vm_raw_write_end(vma);
	}
	vm_raw_write_end(new_vma);

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 04/24] mm: Prepare for FAULT_FLAG_SPECULATIVE
  2018-04-03 21:57       ` David Rientjes
@ 2018-04-04  9:23         ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04  9:23 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 03/04/2018 23:57, David Rientjes wrote:
> On Wed, 28 Mar 2018, Laurent Dufour wrote:
> 
>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>> index 4d02524a7998..2f3e98edc94a 100644
>>>> --- a/include/linux/mm.h
>>>> +++ b/include/linux/mm.h
>>>> @@ -300,6 +300,7 @@ extern pgprot_t protection_map[16];
>>>>  #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
>>>>  #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
>>>>  #define FAULT_FLAG_INSTRUCTION  0x100	/* The fault was during an instruction fetch */
>>>> +#define FAULT_FLAG_SPECULATIVE	0x200	/* Speculative fault, not holding mmap_sem */
>>>>  
>>>>  #define FAULT_FLAG_TRACE \
>>>>  	{ FAULT_FLAG_WRITE,		"WRITE" }, \
>>>
>>> I think FAULT_FLAG_SPECULATIVE should be introduced in the patch that 
>>> actually uses it.
>>
>> I think you're right, I'll move down this define in the series.
>>
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index e0ae4999c824..8ac241b9f370 100644
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -2288,6 +2288,13 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(apply_to_page_range);
>>>>  
>>>> +static bool pte_map_lock(struct vm_fault *vmf)
>>>
>>> inline?
>>
>> Agreed.
>>
> 
> Ignore this, the final form of the function after the full patchset 
> shouldn't be inline.

Indeed, I only kept as inlined the small pte_map_lock() and later
pte_spinlock() defined when CONFIG_SPECULATIVE_PAGE_FAULT is not set.

>>>> +{
>>>> +	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
>>>> +				       vmf->address, &vmf->ptl);
>>>> +	return true;
>>>> +}
>>>> +
>>>>  /*
>>>>   * handle_pte_fault chooses page fault handler according to an entry which was
>>>>   * read non-atomically.  Before making any commitment, on those architectures
>>>> @@ -2477,6 +2484,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>>>>  	const unsigned long mmun_start = vmf->address & PAGE_MASK;
>>>>  	const unsigned long mmun_end = mmun_start + PAGE_SIZE;
>>>>  	struct mem_cgroup *memcg;
>>>> +	int ret = VM_FAULT_OOM;
>>>>  
>>>>  	if (unlikely(anon_vma_prepare(vma)))
>>>>  		goto oom;
>>>> @@ -2504,7 +2512,11 @@ static int wp_page_copy(struct vm_fault *vmf)
>>>>  	/*
>>>>  	 * Re-check the pte - we dropped the lock
>>>>  	 */
>>>> -	vmf->pte = pte_offset_map_lock(mm, vmf->pmd, vmf->address, &vmf->ptl);
>>>> +	if (!pte_map_lock(vmf)) {
>>>> +		mem_cgroup_cancel_charge(new_page, memcg, false);
>>>> +		ret = VM_FAULT_RETRY;
>>>> +		goto oom_free_new;
>>>> +	}
>>>
>>> Ugh, but we aren't oom here, so maybe rename oom_free_new so that it makes 
>>> sense for return values other than VM_FAULT_OOM?
>>
>> You're right, now this label name is not correct, I'll rename it to
>> "out_free_new" and rename also the label "oom" to "out" since it is generic too
>> now.
>>
> 
> I think it would just be better to introduce a out_uncharge that handles 
> the mem_cgroup_cancel_charge() in the exit path.

Yes adding an out_uncharge label sounds good too. I'll add it and also rename
oom_* ones to out_*.

> 
> diff --git a/mm/memory.c b/mm/memory.c
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2645,9 +2645,8 @@ static int wp_page_copy(struct vm_fault *vmf)
>  	 * Re-check the pte - we dropped the lock
>  	 */
>  	if (!pte_map_lock(vmf)) {
> -		mem_cgroup_cancel_charge(new_page, memcg, false);
>  		ret = VM_FAULT_RETRY;
> -		goto oom_free_new;
> +		goto out_uncharge;
>  	}
>  	if (likely(pte_same(*vmf->pte, vmf->orig_pte))) {
>  		if (old_page) {
> @@ -2735,6 +2734,8 @@ static int wp_page_copy(struct vm_fault *vmf)
>  		put_page(old_page);
>  	}
>  	return page_copied ? VM_FAULT_WRITE : 0;
> +out_uncharge:
> +	mem_cgroup_cancel_charge(new_page, memcg, false);
>  oom_free_new:
>  	put_page(new_page);
>  oom:
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
  2018-04-03 19:10   ` Jerome Glisse
  2018-04-03 20:40     ` David Rientjes
@ 2018-04-04  9:53     ` Laurent Dufour
  1 sibling, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04  9:53 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 03/04/2018 21:10, Jerome Glisse wrote:
> On Tue, Mar 13, 2018 at 06:59:36PM +0100, Laurent Dufour wrote:
>> pte_unmap_same() is making the assumption that the page table are still
>> around because the mmap_sem is held.
>> This is no more the case when running a speculative page fault and
>> additional check must be made to ensure that the final page table are still
>> there.
>>
>> This is now done by calling pte_spinlock() to check for the VMA's
>> consistency while locking for the page tables.
>>
>> This is requiring passing a vm_fault structure to pte_unmap_same() which is
>> containing all the needed parameters.
>>
>> As pte_spinlock() may fail in the case of a speculative page fault, if the
>> VMA has been touched in our back, pte_unmap_same() should now return 3
>> cases :
>> 	1. pte are the same (0)
>> 	2. pte are different (VM_FAULT_PTNOTSAME)
>> 	3. a VMA's changes has been detected (VM_FAULT_RETRY)
>>
>> The case 2 is handled by the introduction of a new VM_FAULT flag named
>> VM_FAULT_PTNOTSAME which is then trapped in cow_user_page().
>> If VM_FAULT_RETRY is returned, it is passed up to the callers to retry the
>> page fault while holding the mmap_sem.
>>
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>> ---
>>  include/linux/mm.h |  1 +
>>  mm/memory.c        | 29 +++++++++++++++++++----------
>>  2 files changed, 20 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 2f3e98edc94a..b6432a261e63 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1199,6 +1199,7 @@ static inline void clear_page_pfmemalloc(struct page *page)
>>  #define VM_FAULT_NEEDDSYNC  0x2000	/* ->fault did not modify page tables
>>  					 * and needs fsync() to complete (for
>>  					 * synchronous page faults in DAX) */
>> +#define VM_FAULT_PTNOTSAME 0x4000	/* Page table entries have changed */
>>  
>>  #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
>>  			 VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE | \
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 21b1212a0892..4bc7b0bdcb40 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
>>   * parts, do_swap_page must check under lock before unmapping the pte and
>>   * proceeding (but do_wp_page is only called after already making such a check;
>>   * and do_anonymous_page can safely check later on).
>> + *
>> + * pte_unmap_same() returns:
>> + *	0			if the PTE are the same
>> + *	VM_FAULT_PTNOTSAME	if the PTE are different
>> + *	VM_FAULT_RETRY		if the VMA has changed in our back during
>> + *				a speculative page fault handling.
>>   */
>> -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
>> -				pte_t *page_table, pte_t orig_pte)
>> +static inline int pte_unmap_same(struct vm_fault *vmf)
>>  {
>> -	int same = 1;
>> +	int ret = 0;
>> +
>>  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
>>  	if (sizeof(pte_t) > sizeof(unsigned long)) {
>> -		spinlock_t *ptl = pte_lockptr(mm, pmd);
>> -		spin_lock(ptl);
>> -		same = pte_same(*page_table, orig_pte);
>> -		spin_unlock(ptl);
>> +		if (pte_spinlock(vmf)) {
>> +			if (!pte_same(*vmf->pte, vmf->orig_pte))
>> +				ret = VM_FAULT_PTNOTSAME;
>> +			spin_unlock(vmf->ptl);
>> +		} else
>> +			ret = VM_FAULT_RETRY;
>>  	}
>>  #endif
>> -	pte_unmap(page_table);
>> -	return same;
>> +	pte_unmap(vmf->pte);
>> +	return ret;
>>  }
>>  
>>  static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
>> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
>>  	int exclusive = 0;
>>  	int ret = 0;
>>  
>> -	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
>> +	ret = pte_unmap_same(vmf);
>> +	if (ret)
>>  		goto out;
>>  
> 
> This change what do_swap_page() returns ie before it was returning 0
> when locked pte lookup was different from orig_pte. After this patch
> it returns VM_FAULT_PTNOTSAME but this is a new return value for
> handle_mm_fault() (the do_swap_page() return value is what ultimately
> get return by handle_mm_fault())
> 
> Do we really want that ? This might confuse some existing user of
> handle_mm_fault() and i am not sure of the value of that information
> to caller.
> 
> Note i do understand that you want to return retry if anything did
> change from underneath and thus need to differentiate from when the
> pte value are not the same.

You're right, do_swap_page() should still return 0 in the case the lookup pte
is different from orig_pte, assuming that the swap operation has been handled
in our back by another concurrent thread.

So in do_swap_page(), VM_FAULT_PTNOTSAME should be translated in ret = 0.

	ret = pte_unmap_same(vmf);
	if (ret) {
		/*
		 * If pte != orig_pte, this means another thread did the
		 * swap operation in our back.
		 * So nothing else to do.
		 */
		if (ret == VM_FAULT_PTNOTSAME)
			ret = 0;
		goto out;
	}

This means that VM_FAULT_PTNOTSAME will never been reported up and limited to
do_swap_page().

Doing this will make easier to understand why when pte_unmap_same() is
returning 0, do_swap_page() is done.

Cheers,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-04-04  0:48           ` David Rientjes
  2018-04-04  1:03             ` David Rientjes
@ 2018-04-04 10:19             ` Laurent Dufour
  2018-04-04 21:53               ` David Rientjes
  1 sibling, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04 10:19 UTC (permalink / raw)
  To: David Rientjes
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp



On 04/04/2018 02:48, David Rientjes wrote:
> On Wed, 28 Mar 2018, Laurent Dufour wrote:
> 
>> On 26/03/2018 00:10, David Rientjes wrote:
>>> On Wed, 21 Mar 2018, Laurent Dufour wrote:
>>>
>>>> I found the root cause of this lockdep warning.
>>>>
>>>> In mmap_region(), unmap_region() may be called while vma_link() has not been
>>>> called. This happens during the error path if call_mmap() failed.
>>>>
>>>> The only to fix that particular case is to call
>>>> seqcount_init(&vma->vm_sequence) when initializing the vma in mmap_region().
>>>>
>>>
>>> Ack, although that would require a fixup to dup_mmap() as well.
>>
>> You're right, I'll fix that too.
>>
> 
> I also think the following is needed:
> 
> diff --git a/fs/exec.c b/fs/exec.c
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -312,6 +312,10 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
>  	vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
>  	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>  	INIT_LIST_HEAD(&vma->anon_vma_chain);
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	seqcount_init(&vma->vm_sequence);
> +	atomic_set(&vma->vm_ref_count, 0);
> +#endif
> 
>  	err = insert_vm_struct(mm, vma);
>  	if (err)

No, this not needed because the vma is allocated with kmem_cache_zalloc() so
vm_ref_count is 0, and insert_vm_struc() will later call
__vma_link_rb() which will call seqcount_init().

Furhtermore, in case of error, the vma structure is freed without calling
get_vma() so there is risk of lockdep warning.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-04-04  1:03             ` David Rientjes
@ 2018-04-04 10:28               ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04 10:28 UTC (permalink / raw)
  To: David Rientjes
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp



On 04/04/2018 03:03, David Rientjes wrote:
> On Tue, 3 Apr 2018, David Rientjes wrote:
> 
>>>>> I found the root cause of this lockdep warning.
>>>>>
>>>>> In mmap_region(), unmap_region() may be called while vma_link() has not been
>>>>> called. This happens during the error path if call_mmap() failed.
>>>>>
>>>>> The only to fix that particular case is to call
>>>>> seqcount_init(&vma->vm_sequence) when initializing the vma in mmap_region().
>>>>>
>>>>
>>>> Ack, although that would require a fixup to dup_mmap() as well.
>>>
>>> You're right, I'll fix that too.
>>>
>>
>> I also think the following is needed:
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -312,6 +312,10 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
>>  	vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
>>  	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>>  	INIT_LIST_HEAD(&vma->anon_vma_chain);
>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>> +	seqcount_init(&vma->vm_sequence);
>> +	atomic_set(&vma->vm_ref_count, 0);
>> +#endif
>>  
>>  	err = insert_vm_struct(mm, vma);
>>  	if (err)
>>
> 
> Ugh, I think there are a number of other places where this is needed as 
> well in mm/mmap.c.  I think it would be better to just create a new 
> alloc_vma(unsigned long flags) that all vma allocators can use and for 
> CONFIG_SPECULATIVE_PAGE_FAULT will initialize the seqcount_t and atomic_t.
> 

I don't think this is really needed, most of the time the vma structure is
allocated cleared and is then link to rb tree via vma_link.

The only case generating a locked warning is when the vma is referenced in the
error path before being linked in the mm tree and that is not usual.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 11/24] mm: Cache some VMA fields in the vm_fault structure
  2018-04-02 22:24   ` David Rientjes
@ 2018-04-04 15:48     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04 15:48 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 03/04/2018 00:24, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index ef6ef0627090..dfa81a638b7c 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -359,6 +359,12 @@ struct vm_fault {
>>  					 * page table to avoid allocation from
>>  					 * atomic context.
>>  					 */
>> +	/*
>> +	 * These entries are required when handling speculative page fault.
>> +	 * This way the page handling is done using consistent field values.
>> +	 */
>> +	unsigned long vma_flags;
>> +	pgprot_t vma_page_prot;
>>  };
>>  
>>  /* page entry size for vm->huge_fault() */
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 446427cafa19..f71db2b42b30 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3717,6 +3717,8 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
>>  				.vma = vma,
>>  				.address = address,
>>  				.flags = flags,
>> +				.vma_flags = vma->vm_flags,
>> +				.vma_page_prot = vma->vm_page_prot,
>>  				/*
>>  				 * Hard to debug if it ends up being
>>  				 * used by a callee that assumes
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 32314e9e48dd..a946d5306160 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -882,6 +882,8 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>>  		.flags = FAULT_FLAG_ALLOW_RETRY,
>>  		.pmd = pmd,
>>  		.pgoff = linear_page_index(vma, address),
>> +		.vma_flags = vma->vm_flags,
>> +		.vma_page_prot = vma->vm_page_prot,
>>  	};
>>  
>>  	/* we only decide to swapin, if there is enough young ptes */
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 0200340ef089..46fe92b93682 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2615,7 +2615,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>>  		 * Don't let another task, with possibly unlocked vma,
>>  		 * keep the mlocked page.
>>  		 */
>> -		if (page_copied && (vma->vm_flags & VM_LOCKED)) {
>> +		if (page_copied && (vmf->vma_flags & VM_LOCKED)) {
>>  			lock_page(old_page);	/* LRU manipulation */
>>  			if (PageMlocked(old_page))
>>  				munlock_vma_page(old_page);
> 
> Doesn't wp_page_copy() also need to pass this to anon_vma_prepare() so 
> that find_mergeable_anon_vma() works correctly?

In the case of the spf handler, we check that the vma->anon_vma is not null.
So __anon_vma_prepare(vma) is never called in the context of the SPF handler.

> 
>> @@ -2649,7 +2649,7 @@ static int wp_page_copy(struct vm_fault *vmf)
>>   */
>>  int finish_mkwrite_fault(struct vm_fault *vmf)
>>  {
>> -	WARN_ON_ONCE(!(vmf->vma->vm_flags & VM_SHARED));
>> +	WARN_ON_ONCE(!(vmf->vma_flags & VM_SHARED));
>>  	if (!pte_map_lock(vmf))
>>  		return VM_FAULT_RETRY;
>>  	/*
>> @@ -2751,7 +2751,7 @@ static int do_wp_page(struct vm_fault *vmf)
>>  		 * We should not cow pages in a shared writeable mapping.
>>  		 * Just mark the pages writable and/or call ops->pfn_mkwrite.
>>  		 */
>> -		if ((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
>> +		if ((vmf->vma_flags & (VM_WRITE|VM_SHARED)) ==
>>  				     (VM_WRITE|VM_SHARED))
>>  			return wp_pfn_shared(vmf);
>>  
>> @@ -2798,7 +2798,7 @@ static int do_wp_page(struct vm_fault *vmf)
>>  			return VM_FAULT_WRITE;
>>  		}
>>  		unlock_page(vmf->page);
>> -	} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
>> +	} else if (unlikely((vmf->vma_flags & (VM_WRITE|VM_SHARED)) ==
>>  					(VM_WRITE|VM_SHARED))) {
>>  		return wp_page_shared(vmf);
>>  	}
>> @@ -3067,7 +3067,7 @@ int do_swap_page(struct vm_fault *vmf)
>>  
>>  	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
>>  	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
>> -	pte = mk_pte(page, vma->vm_page_prot);
>> +	pte = mk_pte(page, vmf->vma_page_prot);
>>  	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
>>  		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
>>  		vmf->flags &= ~FAULT_FLAG_WRITE;
>> @@ -3093,7 +3093,7 @@ int do_swap_page(struct vm_fault *vmf)
>>  
>>  	swap_free(entry);
>>  	if (mem_cgroup_swap_full(page) ||
>> -	    (vma->vm_flags & VM_LOCKED) || PageMlocked(page))
>> +	    (vmf->vma_flags & VM_LOCKED) || PageMlocked(page))
>>  		try_to_free_swap(page);
>>  	unlock_page(page);
>>  	if (page != swapcache && swapcache) {
>> @@ -3150,7 +3150,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
>>  	pte_t entry;
>>  
>>  	/* File mapping without ->vm_ops ? */
>> -	if (vma->vm_flags & VM_SHARED)
>> +	if (vmf->vma_flags & VM_SHARED)
>>  		return VM_FAULT_SIGBUS;
>>  
>>  	/*
>> @@ -3174,7 +3174,7 @@ static int do_anonymous_page(struct vm_fault *vmf)
>>  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
>>  			!mm_forbids_zeropage(vma->vm_mm)) {
>>  		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
>> -						vma->vm_page_prot));
>> +						vmf->vma_page_prot));
>>  		if (!pte_map_lock(vmf))
>>  			return VM_FAULT_RETRY;
>>  		if (!pte_none(*vmf->pte))
>> @@ -3207,8 +3207,8 @@ static int do_anonymous_page(struct vm_fault *vmf)
>>  	 */
>>  	__SetPageUptodate(page);
>>  
>> -	entry = mk_pte(page, vma->vm_page_prot);
>> -	if (vma->vm_flags & VM_WRITE)
>> +	entry = mk_pte(page, vmf->vma_page_prot);
>> +	if (vmf->vma_flags & VM_WRITE)
>>  		entry = pte_mkwrite(pte_mkdirty(entry));
>>  
>>  	if (!pte_map_lock(vmf)) {
>> @@ -3404,7 +3404,7 @@ static int do_set_pmd(struct vm_fault *vmf, struct page *page)
>>  	for (i = 0; i < HPAGE_PMD_NR; i++)
>>  		flush_icache_page(vma, page + i);
>>  
>> -	entry = mk_huge_pmd(page, vma->vm_page_prot);
>> +	entry = mk_huge_pmd(page, vmf->vma_page_prot);
>>  	if (write)
>>  		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
>>  
>> @@ -3478,11 +3478,11 @@ int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>>  		return VM_FAULT_NOPAGE;
>>  
>>  	flush_icache_page(vma, page);
>> -	entry = mk_pte(page, vma->vm_page_prot);
>> +	entry = mk_pte(page, vmf->vma_page_prot);
>>  	if (write)
>>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>>  	/* copy-on-write page */
>> -	if (write && !(vma->vm_flags & VM_SHARED)) {
>> +	if (write && !(vmf->vma_flags & VM_SHARED)) {
>>  		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
>>  		page_add_new_anon_rmap(page, vma, vmf->address, false);
>>  		mem_cgroup_commit_charge(page, memcg, false, false);
>> @@ -3521,7 +3521,7 @@ int finish_fault(struct vm_fault *vmf)
>>  
>>  	/* Did we COW the page? */
>>  	if ((vmf->flags & FAULT_FLAG_WRITE) &&
>> -	    !(vmf->vma->vm_flags & VM_SHARED))
>> +	    !(vmf->vma_flags & VM_SHARED))
>>  		page = vmf->cow_page;
>>  	else
>>  		page = vmf->page;
>> @@ -3775,7 +3775,7 @@ static int do_fault(struct vm_fault *vmf)
>>  		ret = VM_FAULT_SIGBUS;
>>  	else if (!(vmf->flags & FAULT_FLAG_WRITE))
>>  		ret = do_read_fault(vmf);
>> -	else if (!(vma->vm_flags & VM_SHARED))
>> +	else if (!(vmf->vma_flags & VM_SHARED))
>>  		ret = do_cow_fault(vmf);
>>  	else
>>  		ret = do_shared_fault(vmf);
>> @@ -3832,7 +3832,7 @@ static int do_numa_page(struct vm_fault *vmf)
>>  	 * accessible ptes, some can allow access by kernel mode.
>>  	 */
>>  	pte = ptep_modify_prot_start(vma->vm_mm, vmf->address, vmf->pte);
>> -	pte = pte_modify(pte, vma->vm_page_prot);
>> +	pte = pte_modify(pte, vmf->vma_page_prot);
>>  	pte = pte_mkyoung(pte);
>>  	if (was_writable)
>>  		pte = pte_mkwrite(pte);
>> @@ -3866,7 +3866,7 @@ static int do_numa_page(struct vm_fault *vmf)
>>  	 * Flag if the page is shared between multiple address spaces. This
>>  	 * is later used when determining whether to group tasks together
>>  	 */
>> -	if (page_mapcount(page) > 1 && (vma->vm_flags & VM_SHARED))
>> +	if (page_mapcount(page) > 1 && (vmf->vma_flags & VM_SHARED))
>>  		flags |= TNF_SHARED;
>>  
>>  	last_cpupid = page_cpupid_last(page);
>> @@ -3911,7 +3911,7 @@ static inline int wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
>>  		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
>>  
>>  	/* COW handled on pte level: split pmd */
>> -	VM_BUG_ON_VMA(vmf->vma->vm_flags & VM_SHARED, vmf->vma);
>> +	VM_BUG_ON_VMA(vmf->vma_flags & VM_SHARED, vmf->vma);
>>  	__split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
>>  
>>  	return VM_FAULT_FALLBACK;
>> @@ -4058,6 +4058,8 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>>  		.flags = flags,
>>  		.pgoff = linear_page_index(vma, address),
>>  		.gfp_mask = __get_fault_gfp_mask(vma),
>> +		.vma_flags = vma->vm_flags,
>> +		.vma_page_prot = vma->vm_page_prot,
>>  	};
>>  	unsigned int dirty = flags & FAULT_FLAG_WRITE;
>>  	struct mm_struct *mm = vma->vm_mm;
> 
> Don't you also need to do this?

In theory there is no risk there, because if the vma->vm_flags have changed in
our back, the locking of the pte will prevent concurrent update of the pte's
values.
So if a mprotect() call is occuring in parallel, once the vm_flags have been
touched, the pte needs to be modified and this requires the pte lock to be
held. So this will happen after we have revalidated the vma and locked the pte.

This being said, that sounds better to deal with the vmf->vma_flags when the
vmf structure is available so I'll apply the following.

> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -694,9 +694,9 @@ void free_compound_page(struct page *page);
>   * pte_mkwrite.  But get_user_pages can cause write faults for mappings
>   * that do not have writing enabled, when used by access_process_vm.
>   */
> -static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
> +static inline pte_t maybe_mkwrite(pte_t pte, unsigned long vma_flags)
>  {
> -	if (likely(vma->vm_flags & VM_WRITE))
> +	if (likely(vma_flags & VM_WRITE))
>  		pte = pte_mkwrite(pte);
>  	return pte;
>  }
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1195,8 +1195,8 @@ static int do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, pmd_t orig_pmd,
> 
>  	for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
>  		pte_t entry;
> -		entry = mk_pte(pages[i], vma->vm_page_prot);
> -		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		entry = mk_pte(pages[i], vmf->vma_page_prot);
> +		entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
>  		memcg = (void *)page_private(pages[i]);
>  		set_page_private(pages[i], 0);
>  		page_add_new_anon_rmap(pages[i], vmf->vma, haddr, false);
> @@ -2169,7 +2169,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  				entry = pte_swp_mksoft_dirty(entry);
>  		} else {
>  			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
> -			entry = maybe_mkwrite(entry, vma);
> +			entry = maybe_mkwrite(entry, vma->vm_flags);
>  			if (!write)
>  				entry = pte_wrprotect(entry);
>  			if (!young)
> diff --git a/mm/memory.c b/mm/memory.c
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1826,7 +1826,7 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>  out_mkwrite:
>  	if (mkwrite) {
>  		entry = pte_mkyoung(entry);
> -		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		entry = maybe_mkwrite(pte_mkdirty(entry), vma->vm_flags);
>  	}
> 
>  	set_pte_at(mm, addr, pte, entry);
> @@ -2472,7 +2472,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
> 
>  	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
>  	entry = pte_mkyoung(vmf->orig_pte);
> -	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +	entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
>  	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
>  		update_mmu_cache(vma, vmf->address, vmf->pte);
>  	pte_unmap_unlock(vmf->pte, vmf->ptl);
> @@ -2549,8 +2549,8 @@ static int wp_page_copy(struct vm_fault *vmf)
>  			inc_mm_counter_fast(mm, MM_ANONPAGES);
>  		}
>  		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
> -		entry = mk_pte(new_page, vma->vm_page_prot);
> -		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		entry = mk_pte(new_page, vmf->vma_page_prot);
> +		entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
>  		/*
>  		 * Clear the pte entry and flush it first, before updating the
>  		 * pte with the new entry. This will avoid a race condition
> @@ -3069,7 +3069,7 @@ int do_swap_page(struct vm_fault *vmf)
>  	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
>  	pte = mk_pte(page, vmf->vma_page_prot);
>  	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
> -		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> +		pte = maybe_mkwrite(pte_mkdirty(pte), vmf->vm_flags);
>  		vmf->flags &= ~FAULT_FLAG_WRITE;
>  		ret |= VM_FAULT_WRITE;
>  		exclusive = RMAP_EXCLUSIVE;
> @@ -3481,7 +3481,7 @@ int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  	flush_icache_page(vma, page);
>  	entry = mk_pte(page, vmf->vma_page_prot);
>  	if (write)
> -		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		entry = maybe_mkwrite(pte_mkdirty(entry), vmf->vm_flags);
>  	/* copy-on-write page */
>  	if (write && !(vmf->vma_flags & VM_SHARED)) {
>  		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
> diff --git a/mm/migrate.c b/mm/migrate.c
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -240,7 +240,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
>  		 */
>  		entry = pte_to_swp_entry(*pvmw.pte);
>  		if (is_write_migration_entry(entry))
> -			pte = maybe_mkwrite(pte, vma);
> +			pte = maybe_mkwrite(pte, vma->vm_flags);
> 
>  		if (unlikely(is_zone_device_page(new))) {
>  			if (is_device_private_page(new)) {
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 14/24] mm: Introduce __maybe_mkwrite()
  2018-04-02 23:12   ` David Rientjes
@ 2018-04-04 15:56     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04 15:56 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 03/04/2018 01:12, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index dfa81a638b7c..a84ddc218bbd 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -684,13 +684,18 @@ void free_compound_page(struct page *page);
>>   * pte_mkwrite.  But get_user_pages can cause write faults for mappings
>>   * that do not have writing enabled, when used by access_process_vm.
>>   */
>> -static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
>> +static inline pte_t __maybe_mkwrite(pte_t pte, unsigned long vma_flags)
>>  {
>> -	if (likely(vma->vm_flags & VM_WRITE))
>> +	if (likely(vma_flags & VM_WRITE))
>>  		pte = pte_mkwrite(pte);
>>  	return pte;
>>  }
>>  
>> +static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
>> +{
>> +	return __maybe_mkwrite(pte, vma->vm_flags);
>> +}
>> +
>>  int alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>>  		struct page *page);
>>  int finish_fault(struct vm_fault *vmf);
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 0a0a483d9a65..af0338fbc34d 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2472,7 +2472,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
>>  
>>  	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
>>  	entry = pte_mkyoung(vmf->orig_pte);
>> -	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>> +	entry = __maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
>>  	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
>>  		update_mmu_cache(vma, vmf->address, vmf->pte);
>>  	pte_unmap_unlock(vmf->pte, vmf->ptl);
>> @@ -2549,8 +2549,8 @@ static int wp_page_copy(struct vm_fault *vmf)
>>  			inc_mm_counter_fast(mm, MM_ANONPAGES);
>>  		}
>>  		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
>> -		entry = mk_pte(new_page, vma->vm_page_prot);
>> -		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>> +		entry = mk_pte(new_page, vmf->vma_page_prot);
>> +		entry = __maybe_mkwrite(pte_mkdirty(entry), vmf->vma_flags);
>>  		/*
>>  		 * Clear the pte entry and flush it first, before updating the
>>  		 * pte with the new entry. This will avoid a race condition
> 
> Don't you also need to do this in do_swap_page()?

Indeed I'll drop this patch as all the changes are now done in the patch 11
"mm: Cache some VMA fields in the vm_fault structure" where, as you suggested,
maybe_mkwrite() is now getting passed the vm_flags value directly.

> diff --git a/mm/memory.c b/mm/memory.c
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3067,9 +3067,9 @@ int do_swap_page(struct vm_fault *vmf)
> 
>  	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
>  	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
> -	pte = mk_pte(page, vma->vm_page_prot);
> +	pte = mk_pte(page, vmf->vma_page_prot);
>  	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
> -		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> +		pte = __maybe_mkwrite(pte_mkdirty(pte), vmf->vma_flags);
>  		vmf->flags &= ~FAULT_FLAG_WRITE;
>  		ret |= VM_FAULT_WRITE;
>  		exclusive = RMAP_EXCLUSIVE;
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-04-02 23:18   ` David Rientjes
@ 2018-04-04 16:04     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04 16:04 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 03/04/2018 01:18, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index a84ddc218bbd..73b8b99f482b 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1263,8 +1263,11 @@ struct zap_details {
>>  	pgoff_t last_index;			/* Highest page->index to unmap */
>>  };
>>  
>> -struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
>> -			     pte_t pte, bool with_public_device);
>> +struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
>> +			      pte_t pte, bool with_public_device,
>> +			      unsigned long vma_flags);
>> +#define _vm_normal_page(vma, addr, pte, with_public_device) \
>> +	__vm_normal_page(vma, addr, pte, with_public_device, (vma)->vm_flags)
>>  #define vm_normal_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, false)
>>  
>>  struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
> 
> If _vm_normal_page() is a static inline function does it break somehow?  
> It's nice to avoid the #define's.

No problem, I'll create it as a static inline function.

> 
>> diff --git a/mm/memory.c b/mm/memory.c
>> index af0338fbc34d..184a0d663a76 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -826,8 +826,9 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
>>  #else
>>  # define HAVE_PTE_SPECIAL 0
>>  #endif
>> -struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
>> -			     pte_t pte, bool with_public_device)
>> +struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
>> +			      pte_t pte, bool with_public_device,
>> +			      unsigned long vma_flags)
>>  {
>>  	unsigned long pfn = pte_pfn(pte);
>>  
> 
> Would it be possible to update the comment since the function itself is no 
> longer named vm_normal_page?

Sure.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-04-03 19:39   ` Jerome Glisse
  2018-04-03 20:45     ` David Rientjes
@ 2018-04-04 16:26     ` Laurent Dufour
  2018-04-04 21:59       ` Jerome Glisse
  1 sibling, 1 reply; 98+ messages in thread
From: Laurent Dufour @ 2018-04-04 16:26 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 03/04/2018 21:39, Jerome Glisse wrote:
> On Tue, Mar 13, 2018 at 06:59:45PM +0100, Laurent Dufour wrote:
>> When dealing with the speculative fault path we should use the VMA's field
>> cached value stored in the vm_fault structure.
>>
>> Currently vm_normal_page() is using the pointer to the VMA to fetch the
>> vm_flags value. This patch provides a new __vm_normal_page() which is
>> receiving the vm_flags flags value as parameter.
>>
>> Note: The speculative path is turned on for architecture providing support
>> for special PTE flag. So only the first block of vm_normal_page is used
>> during the speculative path.
> 
> Might be a good idea to explicitly have SPECULATIVE Kconfig option depends
> on ARCH_PTE_SPECIAL and a comment for !HAVE_PTE_SPECIAL in the function
> explaining that speculative page fault should never reach that point.

Unfortunately there is no ARCH_PTE_SPECIAL in the config file, it is defined in
the per architecture header files.
So I can't do anything in the Kconfig file

However, I can check that at build time, and doing such a check in
__vm_normal_page sounds to be a good place, like that:

@@ -869,6 +870,14 @@ struct page *__vm_normal_page(struct vm_area_struct *vma,
unsigned long addr,

        /* !HAVE_PTE_SPECIAL case follows: */

+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+       /* This part should never get called when the speculative page fault
+        * handler is turned on. This is mainly because we can't rely on
+        * vm_start.
+        */
+#error CONFIG_SPECULATIVE_PAGE_FAULT requires HAVE_PTE_SPECIAL
+#endif
+
        if (unlikely(vma_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
                if (vma_flags & VM_MIXEDMAP) {
                        if (!pfn_valid(pfn))

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-04-04 10:19             ` Laurent Dufour
@ 2018-04-04 21:53               ` David Rientjes
  2018-04-05 16:55                 ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-04 21:53 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp

On Wed, 4 Apr 2018, Laurent Dufour wrote:

> > I also think the following is needed:
> > 
> > diff --git a/fs/exec.c b/fs/exec.c
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -312,6 +312,10 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
> >  	vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
> >  	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
> >  	INIT_LIST_HEAD(&vma->anon_vma_chain);
> > +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> > +	seqcount_init(&vma->vm_sequence);
> > +	atomic_set(&vma->vm_ref_count, 0);
> > +#endif
> > 
> >  	err = insert_vm_struct(mm, vma);
> >  	if (err)
> 
> No, this not needed because the vma is allocated with kmem_cache_zalloc() so
> vm_ref_count is 0, and insert_vm_struc() will later call
> __vma_link_rb() which will call seqcount_init().
> 
> Furhtermore, in case of error, the vma structure is freed without calling
> get_vma() so there is risk of lockdep warning.
> 

Perhaps you're working from a different tree than I am, or you fixed the 
lockdep warning differently when adding to dup_mmap() and mmap_region().

I got the following two lockdep errors.

I fixed it locally by doing the seqcount_init() and atomic_set() 
everywhere a vma could be initialized.

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 12 PID: 1 Comm: init Not tainted
Call Trace:
 [<ffffffff8b12026f>] dump_stack+0x67/0x98
 [<ffffffff8a92b616>] register_lock_class+0x1e6/0x4e0
 [<ffffffff8a92cfe9>] __lock_acquire+0xb9/0x1710
 [<ffffffff8a92ef3a>] lock_acquire+0xba/0x200
 [<ffffffff8aa827df>] mprotect_fixup+0x10f/0x310
 [<ffffffff8aade3fd>] setup_arg_pages+0x12d/0x230
 [<ffffffff8ab4564a>] load_elf_binary+0x44a/0x1740
 [<ffffffff8aadde9b>] search_binary_handler+0x9b/0x1e0
 [<ffffffff8ab44e96>] load_script+0x206/0x270
 [<ffffffff8aadde9b>] search_binary_handler+0x9b/0x1e0
 [<ffffffff8aae0355>] do_execveat_common.isra.32+0x6b5/0x9d0
 [<ffffffff8aae069c>] do_execve+0x2c/0x30
 [<ffffffff8a80047b>] run_init_process+0x2b/0x30
 [<ffffffff8b1358d4>] kernel_init+0x54/0x110
 [<ffffffff8b2001ca>] ret_from_fork+0x3a/0x50

and

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 21 PID: 1926 Comm: mkdir Not tainted
Call Trace:
 [<ffffffff985202af>] dump_stack+0x67/0x98
 [<ffffffff97d2b616>] register_lock_class+0x1e6/0x4e0
 [<ffffffff97d2cfe9>] __lock_acquire+0xb9/0x1710
 [<ffffffff97d2ef3a>] lock_acquire+0xba/0x200
 [<ffffffff97e73c09>] unmap_page_range+0x89/0xaa0
 [<ffffffff97e746af>] unmap_single_vma+0x8f/0x100
 [<ffffffff97e74a1b>] unmap_vmas+0x4b/0x90
 [<ffffffff97e7f833>] exit_mmap+0xa3/0x1c0
 [<ffffffff97cc1b23>] mmput+0x73/0x120
 [<ffffffff97ccbacd>] do_exit+0x2bd/0xd60
 [<ffffffff97ccc5b7>] SyS_exit+0x17/0x20
 [<ffffffff97c01f1d>] do_syscall_64+0x6d/0x1a0
 [<ffffffff9860005a>] entry_SYSCALL_64_after_hwframe+0x26/0x9b

I think it would just be better to generalize vma allocation to initialize 
certain fields and init both spf fields properly for 
CONFIG_SPECULATIVE_PAGE_FAULT.  It's obviously too delicate as is.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-04-04 16:26     ` Laurent Dufour
@ 2018-04-04 21:59       ` Jerome Glisse
  2018-04-05 12:53         ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: Jerome Glisse @ 2018-04-04 21:59 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On Wed, Apr 04, 2018 at 06:26:44PM +0200, Laurent Dufour wrote:
> 
> 
> On 03/04/2018 21:39, Jerome Glisse wrote:
> > On Tue, Mar 13, 2018 at 06:59:45PM +0100, Laurent Dufour wrote:
> >> When dealing with the speculative fault path we should use the VMA's field
> >> cached value stored in the vm_fault structure.
> >>
> >> Currently vm_normal_page() is using the pointer to the VMA to fetch the
> >> vm_flags value. This patch provides a new __vm_normal_page() which is
> >> receiving the vm_flags flags value as parameter.
> >>
> >> Note: The speculative path is turned on for architecture providing support
> >> for special PTE flag. So only the first block of vm_normal_page is used
> >> during the speculative path.
> > 
> > Might be a good idea to explicitly have SPECULATIVE Kconfig option depends
> > on ARCH_PTE_SPECIAL and a comment for !HAVE_PTE_SPECIAL in the function
> > explaining that speculative page fault should never reach that point.
> 
> Unfortunately there is no ARCH_PTE_SPECIAL in the config file, it is defined in
> the per architecture header files.
> So I can't do anything in the Kconfig file

Maybe adding a new Kconfig symbol for ARCH_PTE_SPECIAL very much like
others ARCH_HAS_

> 
> However, I can check that at build time, and doing such a check in
> __vm_normal_page sounds to be a good place, like that:
> 
> @@ -869,6 +870,14 @@ struct page *__vm_normal_page(struct vm_area_struct *vma,
> unsigned long addr,
> 
>         /* !HAVE_PTE_SPECIAL case follows: */
> 
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +       /* This part should never get called when the speculative page fault
> +        * handler is turned on. This is mainly because we can't rely on
> +        * vm_start.
> +        */
> +#error CONFIG_SPECULATIVE_PAGE_FAULT requires HAVE_PTE_SPECIAL
> +#endif
> +
>         if (unlikely(vma_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
>                 if (vma_flags & VM_MIXEDMAP) {
>                         if (!pfn_valid(pfn))
> 

I am not a fan of #if/#else/#endif in code. But that's a taste thing.
I honnestly think that adding a Kconfig for special pte is the cleanest
solution.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 15/24] mm: Introduce __vm_normal_page()
  2018-04-04 21:59       ` Jerome Glisse
@ 2018-04-05 12:53         ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-05 12:53 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 04/04/2018 23:59, Jerome Glisse wrote:
> On Wed, Apr 04, 2018 at 06:26:44PM +0200, Laurent Dufour wrote:
>>
>>
>> On 03/04/2018 21:39, Jerome Glisse wrote:
>>> On Tue, Mar 13, 2018 at 06:59:45PM +0100, Laurent Dufour wrote:
>>>> When dealing with the speculative fault path we should use the VMA's field
>>>> cached value stored in the vm_fault structure.
>>>>
>>>> Currently vm_normal_page() is using the pointer to the VMA to fetch the
>>>> vm_flags value. This patch provides a new __vm_normal_page() which is
>>>> receiving the vm_flags flags value as parameter.
>>>>
>>>> Note: The speculative path is turned on for architecture providing support
>>>> for special PTE flag. So only the first block of vm_normal_page is used
>>>> during the speculative path.
>>>
>>> Might be a good idea to explicitly have SPECULATIVE Kconfig option depends
>>> on ARCH_PTE_SPECIAL and a comment for !HAVE_PTE_SPECIAL in the function
>>> explaining that speculative page fault should never reach that point.
>>
>> Unfortunately there is no ARCH_PTE_SPECIAL in the config file, it is defined in
>> the per architecture header files.
>> So I can't do anything in the Kconfig file
> 
> Maybe adding a new Kconfig symbol for ARCH_PTE_SPECIAL very much like
> others ARCH_HAS_
> 
>>
>> However, I can check that at build time, and doing such a check in
>> __vm_normal_page sounds to be a good place, like that:
>>
>> @@ -869,6 +870,14 @@ struct page *__vm_normal_page(struct vm_area_struct *vma,
>> unsigned long addr,
>>
>>         /* !HAVE_PTE_SPECIAL case follows: */
>>
>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>> +       /* This part should never get called when the speculative page fault
>> +        * handler is turned on. This is mainly because we can't rely on
>> +        * vm_start.
>> +        */
>> +#error CONFIG_SPECULATIVE_PAGE_FAULT requires HAVE_PTE_SPECIAL
>> +#endif
>> +
>>         if (unlikely(vma_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
>>                 if (vma_flags & VM_MIXEDMAP) {
>>                         if (!pfn_valid(pfn))
>>
> 
> I am not a fan of #if/#else/#endif in code. But that's a taste thing.
> I honnestly think that adding a Kconfig for special pte is the cleanest
> solution.

I do agree, but this should be done in a separate series.

I'll see how this could be done but there are some arch (like powerpc) where
this is a bit obfuscated for unknown reason.

For the time being, I'll remove the check and just let the comment in place.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [mm] b1f0502d04: INFO:trying_to_register_non-static_key
  2018-04-04 21:53               ` David Rientjes
@ 2018-04-05 16:55                 ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-05 16:55 UTC (permalink / raw)
  To: David Rientjes
  Cc: kernel test robot, paulmck, peterz, akpm, kirill, ak, mhocko,
	dave, jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86, lkp

On 04/04/2018 23:53, David Rientjes wrote:
> On Wed, 4 Apr 2018, Laurent Dufour wrote:
> 
>>> I also think the following is needed:
>>>
>>> diff --git a/fs/exec.c b/fs/exec.c
>>> --- a/fs/exec.c
>>> +++ b/fs/exec.c
>>> @@ -312,6 +312,10 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
>>>  	vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
>>>  	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>>>  	INIT_LIST_HEAD(&vma->anon_vma_chain);
>>> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>>> +	seqcount_init(&vma->vm_sequence);
>>> +	atomic_set(&vma->vm_ref_count, 0);
>>> +#endif
>>>
>>>  	err = insert_vm_struct(mm, vma);
>>>  	if (err)
>>
>> No, this not needed because the vma is allocated with kmem_cache_zalloc() so
>> vm_ref_count is 0, and insert_vm_struc() will later call
>> __vma_link_rb() which will call seqcount_init().
>>
>> Furhtermore, in case of error, the vma structure is freed without calling
>> get_vma() so there is risk of lockdep warning.
>>
> 
> Perhaps you're working from a different tree than I am, or you fixed the 
> lockdep warning differently when adding to dup_mmap() and mmap_region().
> 
> I got the following two lockdep errors.
> 
> I fixed it locally by doing the seqcount_init() and atomic_set() 
> everywhere a vma could be initialized.

That's weird, I don't get that on my side with lockdep activated.

There is a call to seqcount_init() in dup_mmap(), in mmap_region() and
__vma_link_rb() and that's enough to cover all the case.

That's being said, it'll be better call seqcount_init each time as soon as a
vma structure is allocated. For the vm_ref_count value, as most of the time the
vma is zero allocated, I don't think this is needed.
I just have to check when new_vma = *old_vma is done, but this often just
follow a vma allocation.
> 
> INFO: trying to register non-static key.
> the code is fine but needs lockdep annotation.
> turning off the locking correctness validator.
> CPU: 12 PID: 1 Comm: init Not tainted
> Call Trace:
>  [<ffffffff8b12026f>] dump_stack+0x67/0x98
>  [<ffffffff8a92b616>] register_lock_class+0x1e6/0x4e0
>  [<ffffffff8a92cfe9>] __lock_acquire+0xb9/0x1710
>  [<ffffffff8a92ef3a>] lock_acquire+0xba/0x200
>  [<ffffffff8aa827df>] mprotect_fixup+0x10f/0x310
>  [<ffffffff8aade3fd>] setup_arg_pages+0x12d/0x230
>  [<ffffffff8ab4564a>] load_elf_binary+0x44a/0x1740
>  [<ffffffff8aadde9b>] search_binary_handler+0x9b/0x1e0
>  [<ffffffff8ab44e96>] load_script+0x206/0x270
>  [<ffffffff8aadde9b>] search_binary_handler+0x9b/0x1e0
>  [<ffffffff8aae0355>] do_execveat_common.isra.32+0x6b5/0x9d0
>  [<ffffffff8aae069c>] do_execve+0x2c/0x30
>  [<ffffffff8a80047b>] run_init_process+0x2b/0x30
>  [<ffffffff8b1358d4>] kernel_init+0x54/0x110
>  [<ffffffff8b2001ca>] ret_from_fork+0x3a/0x50
> 
> and
> 
> INFO: trying to register non-static key.
> the code is fine but needs lockdep annotation.
> turning off the locking correctness validator.
> CPU: 21 PID: 1926 Comm: mkdir Not tainted
> Call Trace:
>  [<ffffffff985202af>] dump_stack+0x67/0x98
>  [<ffffffff97d2b616>] register_lock_class+0x1e6/0x4e0
>  [<ffffffff97d2cfe9>] __lock_acquire+0xb9/0x1710
>  [<ffffffff97d2ef3a>] lock_acquire+0xba/0x200
>  [<ffffffff97e73c09>] unmap_page_range+0x89/0xaa0
>  [<ffffffff97e746af>] unmap_single_vma+0x8f/0x100
>  [<ffffffff97e74a1b>] unmap_vmas+0x4b/0x90
>  [<ffffffff97e7f833>] exit_mmap+0xa3/0x1c0
>  [<ffffffff97cc1b23>] mmput+0x73/0x120
>  [<ffffffff97ccbacd>] do_exit+0x2bd/0xd60
>  [<ffffffff97ccc5b7>] SyS_exit+0x17/0x20
>  [<ffffffff97c01f1d>] do_syscall_64+0x6d/0x1a0
>  [<ffffffff9860005a>] entry_SYSCALL_64_after_hwframe+0x26/0x9b
> 
> I think it would just be better to generalize vma allocation to initialize 
> certain fields and init both spf fields properly for 
> CONFIG_SPECULATIVE_PAGE_FAULT.  It's obviously too delicate as is.
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
  2018-04-03  0:11   ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock David Rientjes
@ 2018-04-06 14:23     ` Laurent Dufour
  2018-04-10 16:20     ` Laurent Dufour
  1 sibling, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-06 14:23 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 03/04/2018 02:11, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> This change is inspired by the Peter's proposal patch [1] which was
>> protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
>> that particular case, and it is introducing major performance degradation
>> due to excessive scheduling operations.
>>
>> To allow access to the mm_rb tree without grabbing the mmap_sem, this patch
>> is protecting it access using a rwlock.  As the mm_rb tree is a O(log n)
>> search it is safe to protect it using such a lock.  The VMA cache is not
>> protected by the new rwlock and it should not be used without holding the
>> mmap_sem.
>>
>> To allow the picked VMA structure to be used once the rwlock is released, a
>> use count is added to the VMA structure. When the VMA is allocated it is
>> set to 1.  Each time the VMA is picked with the rwlock held its use count
>> is incremented. Each time the VMA is released it is decremented. When the
>> use count hits zero, this means that the VMA is no more used and should be
>> freed.
>>
>> This patch is preparing for 2 kind of VMA access :
>>  - as usual, under the control of the mmap_sem,
>>  - without holding the mmap_sem for the speculative page fault handler.
>>
>> Access done under the control the mmap_sem doesn't require to grab the
>> rwlock to protect read access to the mm_rb tree, but access in write must
>> be done under the protection of the rwlock too. This affects inserting and
>> removing of elements in the RB tree.
>>
>> The patch is introducing 2 new functions:
>>  - vma_get() to find a VMA based on an address by holding the new rwlock.
>>  - vma_put() to release the VMA when its no more used.
>> These services are designed to be used when access are made to the RB tree
>> without holding the mmap_sem.
>>
>> When a VMA is removed from the RB tree, its vma->vm_rb field is cleared and
>> we rely on the WMB done when releasing the rwlock to serialize the write
>> with the RMB done in a later patch to check for the VMA's validity.
>>
>> When free_vma is called, the file associated with the VMA is closed
>> immediately, but the policy and the file structure remained in used until
>> the VMA's use count reach 0, which may happens later when exiting an
>> in progress speculative page fault.
>>
>> [1] https://patchwork.kernel.org/patch/5108281/
>>
>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> 
> Can __free_vma() be generalized for mm/nommu.c's delete_vma() and 
> do_mmap()?

To be honest I didn't look at mm/nommu.c assuming that such architecture would
probably be monothreaded. Am I wrong ?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 21/24] perf tools: Add support for the SPF perf event
  2018-03-27  3:49     ` Andi Kleen
@ 2018-04-10  6:47       ` David Rientjes
  2018-04-12 13:44         ` Laurent Dufour
  0 siblings, 1 reply; 98+ messages in thread
From: David Rientjes @ 2018-04-10  6:47 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Laurent Dufour, paulmck, peterz, akpm, kirill, mhocko, dave,
	jack, Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner,
	Ingo Molnar, hpa, Will Deacon, Sergey Senozhatsky,
	Andrea Arcangeli, Alexei Starovoitov, kemi.wang,
	sergey.senozhatsky.work, Daniel Jordan, linux-kernel, linux-mm,
	haren, khandual, npiggin, bsingharora, Tim Chen, linuxppc-dev,
	x86

On Mon, 26 Mar 2018, Andi Kleen wrote:

> > Aside: should there be a new spec_flt field for struct task_struct that 
> > complements maj_flt and min_flt and be exported through /proc/pid/stat?
> 
> No. task_struct is already too bloated. If you need per process tracking 
> you can always get it through trace points.
> 

Hi Andi,

We have

	count_vm_event(PGFAULT);
	count_memcg_event_mm(vma->vm_mm, PGFAULT);

in handle_mm_fault() but not counterpart for spf.  I think it would be 
helpful to be able to determine how much faulting can be done 
speculatively if there is no per-process tracking without tracing.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
  2018-04-03  0:11   ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock David Rientjes
  2018-04-06 14:23     ` Laurent Dufour
@ 2018-04-10 16:20     ` Laurent Dufour
  1 sibling, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-10 16:20 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86



On 03/04/2018 02:11, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> This change is inspired by the Peter's proposal patch [1] which was
>> protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
>> that particular case, and it is introducing major performance degradation
>> due to excessive scheduling operations.
>>
>> To allow access to the mm_rb tree without grabbing the mmap_sem, this patch
>> is protecting it access using a rwlock.  As the mm_rb tree is a O(log n)
>> search it is safe to protect it using such a lock.  The VMA cache is not
>> protected by the new rwlock and it should not be used without holding the
>> mmap_sem.
>>
>> To allow the picked VMA structure to be used once the rwlock is released, a
>> use count is added to the VMA structure. When the VMA is allocated it is
>> set to 1.  Each time the VMA is picked with the rwlock held its use count
>> is incremented. Each time the VMA is released it is decremented. When the
>> use count hits zero, this means that the VMA is no more used and should be
>> freed.
>>
>> This patch is preparing for 2 kind of VMA access :
>>  - as usual, under the control of the mmap_sem,
>>  - without holding the mmap_sem for the speculative page fault handler.
>>
>> Access done under the control the mmap_sem doesn't require to grab the
>> rwlock to protect read access to the mm_rb tree, but access in write must
>> be done under the protection of the rwlock too. This affects inserting and
>> removing of elements in the RB tree.
>>
>> The patch is introducing 2 new functions:
>>  - vma_get() to find a VMA based on an address by holding the new rwlock.
>>  - vma_put() to release the VMA when its no more used.
>> These services are designed to be used when access are made to the RB tree
>> without holding the mmap_sem.
>>
>> When a VMA is removed from the RB tree, its vma->vm_rb field is cleared and
>> we rely on the WMB done when releasing the rwlock to serialize the write
>> with the RMB done in a later patch to check for the VMA's validity.
>>
>> When free_vma is called, the file associated with the VMA is closed
>> immediately, but the policy and the file structure remained in used until
>> the VMA's use count reach 0, which may happens later when exiting an
>> in progress speculative page fault.
>>
>> [1] https://patchwork.kernel.org/patch/5108281/
>>
>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> 
> Can __free_vma() be generalized for mm/nommu.c's delete_vma() and 
> do_mmap()?

Good question !
I guess if there is no mmu, there is no page fault, so no speculative page
fault and this patch is clearly required by the speculative page fault handler.
By the I should probably make CONFIG_SPECULATIVE_PAGE_FAULT dependent on
CONFIG_MMU.

This being said, if your idea is to extend the mm_rb tree rwlocking to the
nommu case, then this is another story, and I wondering if there is a real need
in such case. But I've to admit I'm not so familliar with kernel built for
mmuless systems.

Am I missing something ?

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 16/24] mm: Introduce __page_add_new_anon_rmap()
  2018-04-02 23:57   ` David Rientjes
@ 2018-04-10 16:30     ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-10 16:30 UTC (permalink / raw)
  To: David Rientjes
  Cc: paulmck, peterz, akpm, kirill, ak, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 03/04/2018 01:57, David Rientjes wrote:
> On Tue, 13 Mar 2018, Laurent Dufour wrote:
> 
>> When dealing with speculative page fault handler, we may race with VMA
>> being split or merged. In this case the vma->vm_start and vm->vm_end
>> fields may not match the address the page fault is occurring.
>>
>> This can only happens when the VMA is split but in that case, the
>> anon_vma pointer of the new VMA will be the same as the original one,
>> because in __split_vma the new->anon_vma is set to src->anon_vma when
>> *new = *vma.
>>
>> So even if the VMA boundaries are not correct, the anon_vma pointer is
>> still valid.
>>
>> If the VMA has been merged, then the VMA in which it has been merged
>> must have the same anon_vma pointer otherwise the merge can't be done.
>>
>> So in all the case we know that the anon_vma is valid, since we have
>> checked before starting the speculative page fault that the anon_vma
>> pointer is valid for this VMA and since there is an anon_vma this
>> means that at one time a page has been backed and that before the VMA
>> is cleaned, the page table lock would have to be grab to clean the
>> PTE, and the anon_vma field is checked once the PTE is locked.
>>
>> This patch introduce a new __page_add_new_anon_rmap() service which
>> doesn't check for the VMA boundaries, and create a new inline one
>> which do the check.
>>
>> When called from a page fault handler, if this is not a speculative one,
>> there is a guarantee that vm_start and vm_end match the faulting address,
>> so this check is useless. In the context of the speculative page fault
>> handler, this check may be wrong but anon_vma is still valid as explained
>> above.
>>
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> 
> I'm indifferent on this: it could be argued both sides that the new 
> function and its variant for a simple VM_BUG_ON() isn't worth it and it 
> would should rather be done in the callers of page_add_new_anon_rmap().  
> It feels like it would be better left to the caller and add a comment to 
> page_add_anon_rmap() itself in mm/rmap.c.

Well there are 11 calls to page_add_new_anon_rmap() which will need to be
impacted and future ones too.

By introducing __page_add_new_anon_rmap() my goal was to make clear that this
call is *special* and that calling it is not the usual way. This also implies
that most of the time the check is done (when build with the right config) and
that we will not miss some.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 21/24] perf tools: Add support for the SPF perf event
  2018-04-10  6:47       ` David Rientjes
@ 2018-04-12 13:44         ` Laurent Dufour
  0 siblings, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-12 13:44 UTC (permalink / raw)
  To: David Rientjes, Andi Kleen
  Cc: paulmck, peterz, akpm, kirill, mhocko, dave, jack,
	Matthew Wilcox, benh, mpe, paulus, Thomas Gleixner, Ingo Molnar,
	hpa, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 10/04/2018 08:47, David Rientjes wrote:
> On Mon, 26 Mar 2018, Andi Kleen wrote:
> 
>>> Aside: should there be a new spec_flt field for struct task_struct that 
>>> complements maj_flt and min_flt and be exported through /proc/pid/stat?
>>
>> No. task_struct is already too bloated. If you need per process tracking 
>> you can always get it through trace points.
>>
> 
> Hi Andi,
> 
> We have
> 
> 	count_vm_event(PGFAULT);
> 	count_memcg_event_mm(vma->vm_mm, PGFAULT);
> 
> in handle_mm_fault() but not counterpart for spf.  I think it would be 
> helpful to be able to determine how much faulting can be done 
> speculatively if there is no per-process tracking without tracing.

That sounds to be a good idea, I will create a separate patch a dedicated
speculative_pgfault counter as PGFAULT is.

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v9 00/24] Speculative page faults
  2018-03-14 13:11 ` [PATCH v9 00/24] Speculative page faults Michal Hocko
  2018-03-14 13:33   ` Laurent Dufour
@ 2018-04-13 13:34   ` Laurent Dufour
  1 sibling, 0 replies; 98+ messages in thread
From: Laurent Dufour @ 2018-04-13 13:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: paulmck, peterz, akpm, kirill, ak, dave, jack, Matthew Wilcox,
	benh, mpe, paulus, Thomas Gleixner, Ingo Molnar, hpa,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang, sergey.senozhatsky.work,
	Daniel Jordan, linux-kernel, linux-mm, haren, khandual, npiggin,
	bsingharora, Tim Chen, linuxppc-dev, x86

On 14/03/2018 14:11, Michal Hocko wrote:
> On Tue 13-03-18 18:59:30, Laurent Dufour wrote:
>> Changes since v8:
>>  - Don't check PMD when locking the pte when THP is disabled
>>    Thanks to Daniel Jordan for reporting this.
>>  - Rebase on 4.16
> 
> Is this really worth reposting the whole pile? I mean this is at v9,
> each doing little changes. It is quite tiresome to barely get to a
> bookmarked version just to find out that there are 2 new versions out.
> 
> I am sorry to be grumpy and I can understand some frustration it doesn't
> move forward that easilly but this is a _big_ change. We should start
> with a real high level review rather than doing small changes here and
> there and reach v20 quickly.

I know this would mean v10, but there has been a bunch of reviews from David
Rientjes and Jerome Glisse, and I had to make many changes to address them.
So I think this is time to push a v10.

If you have already started a review of this v9 series, please send me your
remarks so that I can compile them in this v10 asap.

Thanks,
Laurent.

^ permalink raw reply	[flat|nested] 98+ messages in thread

end of thread, other threads:[~2018-04-13 13:34 UTC | newest]

Thread overview: 98+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-13 17:59 [PATCH v9 00/24] Speculative page faults Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
2018-03-25 21:50   ` David Rientjes
2018-03-28  7:49     ` Laurent Dufour
2018-03-28 10:16       ` David Rientjes
2018-03-28 11:15         ` Laurent Dufour
2018-03-28 21:18           ` David Rientjes
2018-03-13 17:59 ` [PATCH v9 02/24] x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 03/24] powerpc/mm: " Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 05/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE Laurent Dufour
2018-03-25 21:50   ` David Rientjes
2018-03-28  8:15     ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF Laurent Dufour
2018-03-27 21:18   ` David Rientjes
2018-03-28  8:27     ` Laurent Dufour
2018-03-28 10:20       ` David Rientjes
2018-03-28 10:43         ` Laurent Dufour
2018-04-03 19:10   ` Jerome Glisse
2018-04-03 20:40     ` David Rientjes
2018-04-03 21:04       ` Jerome Glisse
2018-04-04  9:53     ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 08/24] mm: Protect VMA modifications using VMA sequence count Laurent Dufour
2018-03-27 21:45   ` David Rientjes
2018-03-28 16:57     ` Laurent Dufour
2018-03-27 21:57   ` David Rientjes
2018-03-28 17:10     ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 09/24] mm: protect mremap() against SPF hanlder Laurent Dufour
2018-03-27 22:12   ` David Rientjes
2018-03-28 18:11     ` Laurent Dufour
2018-03-28 21:21       ` David Rientjes
2018-04-04  8:24         ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 10/24] mm: Protect SPF handler against anon_vma changes Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 11/24] mm: Cache some VMA fields in the vm_fault structure Laurent Dufour
2018-04-02 22:24   ` David Rientjes
2018-04-04 15:48     ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 12/24] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page() Laurent Dufour
2018-04-02 23:00   ` David Rientjes
2018-03-13 17:59 ` [PATCH v9 13/24] mm: Introduce __lru_cache_add_active_or_unevictable Laurent Dufour
2018-04-02 23:11   ` David Rientjes
2018-03-13 17:59 ` [PATCH v9 14/24] mm: Introduce __maybe_mkwrite() Laurent Dufour
2018-04-02 23:12   ` David Rientjes
2018-04-04 15:56     ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 15/24] mm: Introduce __vm_normal_page() Laurent Dufour
2018-04-02 23:18   ` David Rientjes
2018-04-04 16:04     ` Laurent Dufour
2018-04-03 19:39   ` Jerome Glisse
2018-04-03 20:45     ` David Rientjes
2018-04-04 16:26     ` Laurent Dufour
2018-04-04 21:59       ` Jerome Glisse
2018-04-05 12:53         ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 16/24] mm: Introduce __page_add_new_anon_rmap() Laurent Dufour
2018-04-02 23:57   ` David Rientjes
2018-04-10 16:30     ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock Laurent Dufour
2018-03-14  8:48   ` Peter Zijlstra
2018-03-14 16:25     ` Laurent Dufour
2018-03-16 10:23   ` [mm] b33ddf50eb: INFO:trying_to_register_non-static_key kernel test robot
2018-03-16 16:38     ` Laurent Dufour
2018-04-03  0:11   ` [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock David Rientjes
2018-04-06 14:23     ` Laurent Dufour
2018-04-10 16:20     ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 18/24] mm: Provide speculative fault infrastructure Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 19/24] mm: Adding speculative page fault failure trace events Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 20/24] perf: Add a speculative page fault sw event Laurent Dufour
2018-03-26 21:43   ` David Rientjes
2018-03-13 17:59 ` [PATCH v9 21/24] perf tools: Add support for the SPF perf event Laurent Dufour
2018-03-26 21:44   ` David Rientjes
2018-03-27  3:49     ` Andi Kleen
2018-04-10  6:47       ` David Rientjes
2018-04-12 13:44         ` Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 22/24] mm: Speculative page fault handler return VMA Laurent Dufour
2018-03-13 17:59 ` [PATCH v9 23/24] x86/mm: Add speculative pagefault handling Laurent Dufour
2018-03-26 21:41   ` David Rientjes
2018-03-13 17:59 ` [PATCH v9 24/24] powerpc/mm: Add speculative page fault Laurent Dufour
2018-03-26 21:39   ` David Rientjes
2018-03-14 13:11 ` [PATCH v9 00/24] Speculative page faults Michal Hocko
2018-03-14 13:33   ` Laurent Dufour
2018-04-13 13:34   ` Laurent Dufour
2018-03-22  1:21 ` Ganesh Mahendran
2018-03-29 12:49   ` Laurent Dufour
     [not found] ` <1520963994-28477-5-git-send-email-ldufour@linux.vnet.ibm.com>
2018-03-25 21:50   ` [PATCH v9 04/24] mm: Prepare for FAULT_FLAG_SPECULATIVE David Rientjes
2018-03-28 10:27     ` Laurent Dufour
2018-04-03 21:57       ` David Rientjes
2018-04-04  9:23         ` Laurent Dufour
     [not found] ` <1520963994-28477-8-git-send-email-ldufour@linux.vnet.ibm.com>
2018-03-17  7:51   ` [mm] b1f0502d04: INFO:trying_to_register_non-static_key kernel test robot
2018-03-21 12:21     ` Laurent Dufour
2018-03-25 22:10       ` David Rientjes
2018-03-28 13:30         ` Laurent Dufour
2018-04-04  0:48           ` David Rientjes
2018-04-04  1:03             ` David Rientjes
2018-04-04 10:28               ` Laurent Dufour
2018-04-04 10:19             ` Laurent Dufour
2018-04-04 21:53               ` David Rientjes
2018-04-05 16:55                 ` Laurent Dufour
2018-03-27 21:30   ` [PATCH v9 07/24] mm: VMA sequence count David Rientjes
2018-03-28 17:58     ` Laurent Dufour
2018-04-03 20:37 ` [PATCH v9 00/24] Speculative page faults Jerome Glisse
2018-04-04  7:59   ` Laurent Dufour

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).