diff for duplicates of <9FE19350E8A7EE45B64D8D63D368C8966B85F660@SHSMSX101.ccr.corp.intel.com>

diff --git a/a/1.txt b/N1/1.txt
index ec70551..2a4493a 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -1,613 +1,827 @@
-Hi Laurent,
-
-
-On the Intel 4s Skylake platform (192 CPUs, 768G memory), each of the test cases below was run 3 times.
-I checked the test results: only page_fault3_thread/enable THP has a 6% stddev for the head commit; the other tests have lower stddev.
-
-I did not find high variation in any other test case result.
-
-a). Enable THP
-testcase                          base     stddev       change      head     stddev         metric
-page_fault3/enable THP           10519      ± 3%        -20.5%      8368      ± 6%         will-it-scale.per_thread_ops
-page_fault2/enable THP            8281      ± 2%        -18.8%      6728                   will-it-scale.per_thread_ops
-brk1/enable THP                 998475                   -2.2%    976893                   will-it-scale.per_process_ops
-context_switch1/enable THP      223910                   -1.3%    220930                   will-it-scale.per_process_ops
-context_switch1/enable THP      233722                   -1.0%    231288                   will-it-scale.per_thread_ops
-
-b). Disable THP
-page_fault3/disable THP          10856                  -23.1%      8344                   will-it-scale.per_thread_ops
-page_fault2/disable THP           8147                  -18.8%      6613                   will-it-scale.per_thread_ops
-brk1/disable THP                   957                   -7.9%       881                   will-it-scale.per_thread_ops
-context_switch1/disable THP     237006                   -2.2%    231907                   will-it-scale.per_thread_ops
-brk1/disable THP                997317                   -2.0%    977778                   will-it-scale.per_process_ops
-page_fault3/disable THP         467454                   -1.8%    459251                   will-it-scale.per_process_ops
-context_switch1/disable THP     224431                   -1.3%    221567                   will-it-scale.per_process_ops
-
-
-Best regards,
-Haiyan Song
-________________________________________
-From: Laurent Dufour [ldufour@linux.vnet.ibm.com]
-Sent: Monday, July 02, 2018 4:59 PM
-To: Song, HaiyanX
-Cc: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; kirill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; Matthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Gleixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel@vger.kernel.org; linux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharora@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozlabs.org; x86@kernel.org
-Subject: Re: [PATCH v11 00/26] Speculative page faults
-
-On 11/06/2018 09:49, Song, HaiyanX wrote:
-> Hi Laurent,
->
-> Regression tests for the v11 patch series have been run; some regressions were found by LKP-tools (linux kernel performance)
-> on the Intel 4s Skylake platform. This time we only ran the test cases that had shown regressions on the
-> V9 patch series.
->
-> The regression result is sorted by the metric will-it-scale.per_thread_ops.
-> branch: Laurent-Dufour/Speculative-page-faults/20180520-045126
-> commit id:
->   head commit : a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12
->   base commit : ba98a1cdad71d259a194461b3a61471b49b14df1
-> Benchmark: will-it-scale
-> Download link: https://github.com/antonblanchard/will-it-scale/tree/master
->
-> Metrics:
->   will-it-scale.per_process_ops=processes/nr_cpu
->   will-it-scale.per_thread_ops=threads/nr_cpu
->   test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
-> THP: enable / disable
-> nr_task:100%
->
-> 1. Regressions:
->
-> a). Enable THP
-> testcase                          base           change      head           metric
-> page_fault3/enable THP           10519          -20.5%       8368      will-it-scale.per_thread_ops
-> page_fault2/enable THP            8281          -18.8%       6728      will-it-scale.per_thread_ops
-> brk1/enable THP                 998475           -2.2%     976893      will-it-scale.per_process_ops
-> context_switch1/enable THP      223910           -1.3%     220930      will-it-scale.per_process_ops
-> context_switch1/enable THP      233722           -1.0%     231288      will-it-scale.per_thread_ops
->
-> b). Disable THP
-> page_fault3/disable THP          10856          -23.1%       8344      will-it-scale.per_thread_ops
-> page_fault2/disable THP           8147          -18.8%       6613      will-it-scale.per_thread_ops
-> brk1/disable THP                   957           -7.9%        881      will-it-scale.per_thread_ops
-> context_switch1/disable THP     237006           -2.2%     231907      will-it-scale.per_thread_ops
-> brk1/disable THP                997317           -2.0%     977778      will-it-scale.per_process_ops
-> page_fault3/disable THP         467454           -1.8%     459251      will-it-scale.per_process_ops
-> context_switch1/disable THP     224431           -1.3%     221567      will-it-scale.per_process_ops
->
-> Note: for the test result values above, higher is better.
-
-I tried the same tests on my PowerPC victim VM (1024 CPUs, 11TB) and I can't
-get reproducible results. The results have huge variation, even on the vanilla
-kernel, so I can't draw any conclusion about changes.
-
-I tried on a smaller node (80 CPUs, 32G), and the tests ran better, but I didn't
-measure any change between the vanilla and the SPF patched kernels:
-
-test THP enabled                4.17.0-rc4-mm1  spf             delta
-page_fault3_threads             2697.7          2683.5          -0.53%
-page_fault2_threads             170660.6        169574.1        -0.64%
-context_switch1_threads         6915269.2       6877507.3       -0.55%
-context_switch1_processes       6478076.2       6529493.5       0.79%
-brk1                            243391.2        238527.5        -2.00%
-
-Tests were run 10 times, no high variation detected.
-
-Did you see high variation on your side? How many times were the tests run to
-compute the average values?
-
-Thanks,
-Laurent.
-
-
->
-> 2. Improvements: no improvement was found in the selected test cases.
->
->
-> Best regards
-> Haiyan Song
-> ________________________________________
-> From: owner-linux-mm@kvack.org [owner-linux-mm@kvack.org] on behalf of Laurent Dufour [ldufour@linux.vnet.ibm.com]
-> Sent: Monday, May 28, 2018 4:54 PM
-> To: Song, HaiyanX
-> Cc: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; kirill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; Matthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Gleixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel@vger.kernel.org; linux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharora@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozlabs.org; x86@kernel.org
-> Subject: Re: [PATCH v11 00/26] Speculative page faults
->
-> On 28/05/2018 10:22, Haiyan Song wrote:
->> Hi Laurent,
->>
->> Yes, these tests are done on V9 patch.
->
-> Do you plan to give this V11 a run ?
->
->>
->>
->> Best regards,
->> Haiyan Song
->>
->> On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote:
->>> On 28/05/2018 07:23, Song, HaiyanX wrote:
->>>>
->>>> Some regressions and improvements were found by LKP-tools (linux kernel performance) on the V9 patch series
->>>> tested on Intel 4s Skylake platform.
->>>
->>> Hi,
->>>
->>> Thanks for reporting these benchmark results, but you mentioned the "V9 patch
->>> series" while responding to the v11 header series...
->>> Were these tests done on v9 or v11?
->>>
->>> Cheers,
->>> Laurent.
->>>
->>>>
->>>> The regression result is sorted by the metric will-it-scale.per_thread_ops.
->>>> Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 patch series)
->>>> Commit id:
->>>>     base commit: d55f34411b1b126429a823d06c3124c16283231f
->>>>     head commit: 0355322b3577eeab7669066df42c550a56801110
->>>> Benchmark suite: will-it-scale
->>>> Download link:
->>>> https://github.com/antonblanchard/will-it-scale/tree/master/tests
->>>> Metrics:
->>>>     will-it-scale.per_process_ops=processes/nr_cpu
->>>>     will-it-scale.per_thread_ops=threads/nr_cpu
->>>> test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
->>>> THP: enable / disable
->>>> nr_task: 100%
->>>>
->>>> 1. Regressions:
->>>> a) THP enabled:
->>>> testcase                        base            change          head       metric
->>>> page_fault3/ enable THP         10092           -17.5%          8323       will-it-scale.per_thread_ops
->>>> page_fault2/ enable THP          8300           -17.2%          6869       will-it-scale.per_thread_ops
->>>> brk1/ enable THP                  957.67         -7.6%           885       will-it-scale.per_thread_ops
->>>> page_fault3/ enable THP        172821            -5.3%        163692       will-it-scale.per_process_ops
->>>> signal1/ enable THP              9125            -3.2%          8834       will-it-scale.per_process_ops
->>>>
->>>> b) THP disabled:
->>>> testcase                        base            change          head       metric
->>>> page_fault3/ disable THP        10107           -19.1%          8180       will-it-scale.per_thread_ops
->>>> page_fault2/ disable THP         8432           -17.8%          6931       will-it-scale.per_thread_ops
->>>> context_switch1/ disable THP   215389            -6.8%        200776       will-it-scale.per_thread_ops
->>>> brk1/ disable THP                 939.67         -6.6%           877.33    will-it-scale.per_thread_ops
->>>> page_fault3/ disable THP       173145            -4.7%        165064       will-it-scale.per_process_ops
->>>> signal1/ disable THP             9162            -3.9%          8802       will-it-scale.per_process_ops
->>>>
->>>> 2. Improvements:
->>>> a) THP enabled:
->>>> testcase                        base            change          head       metric
->>>> malloc1/ enable THP               66.33        +469.8%           383.67    will-it-scale.per_thread_ops
->>>> writeseek3/ enable THP          2531             +4.5%          2646       will-it-scale.per_thread_ops
->>>> signal1/ enable THP              989.33          +2.8%          1016       will-it-scale.per_thread_ops
->>>>
->>>> b) THP disabled:
->>>> testcase                        base            change          head       metric
->>>> malloc1/ disable THP              90.33        +417.3%           467.33    will-it-scale.per_thread_ops
->>>> read2/ disable THP             58934            +39.2%         82060       will-it-scale.per_thread_ops
->>>> page_fault1/ disable THP        8607            +36.4%         11736       will-it-scale.per_thread_ops
->>>> read1/ disable THP            314063            +12.7%        353934       will-it-scale.per_thread_ops
->>>> writeseek3/ disable THP         2452            +12.5%          2759       will-it-scale.per_thread_ops
->>>> signal1/ disable THP             971.33          +5.5%          1024       will-it-scale.per_thread_ops
->>>>
->>>> Notes: for above values in column "change", the higher value means that the related testcase result
->>>> on head commit is better than that on base commit for this benchmark.
->>>>
->>>>
->>>> Best regards
->>>> Haiyan Song
->>>>
->>>> ________________________________________
->>>> From: owner-linux-mm@kvack.org [owner-linux-mm@kvack.org] on behalf of Laurent Dufour [ldufour@linux.vnet.ibm.com]
->>>> Sent: Thursday, May 17, 2018 7:06 PM
->>>> To: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; kirill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; Matthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Gleixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi
->>>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharora@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozlabs.org; x86@kernel.org
->>>> Subject: [PATCH v11 00/26] Speculative page faults
->>>>
->>>> This is a port to kernel 4.17 of the work done by Peter Zijlstra to handle
->>>> page faults without holding the mm semaphore [1].
->>>>
->>>> The idea is to try to handle user space page faults without holding the
->>>> mmap_sem. This should allow better concurrency for massively threaded
->>>> processes, since the page fault handler will not wait for another thread's
->>>> memory layout change to complete, assuming that the change is made in another
->>>> part of the process's memory space. This type of page fault is called a
->>>> speculative page fault. If the speculative page fault fails, either because a
->>>> concurrent change is detected or because the underlying PMD or PTE tables are
->>>> not yet allocated, its processing is aborted and a classic page fault is tried.
->>>>
->>>> The speculative page fault (SPF) handler has to look up the VMA matching the
->>>> fault address without holding the mmap_sem. This is done by introducing a
->>>> rwlock which protects access to the mm_rb tree. Previously this was done using
->>>> SRCU, but that introduced a lot of scheduling to process the VMA freeing
->>>> operations, which hurt performance by 20% as reported by
->>>> Kemi Wang [2]. Using a rwlock to protect access to the mm_rb tree
->>>> limits the lock contention to these operations, which are expected to
->>>> be O(log n). In addition, to ensure that the VMA is not freed
->>>> behind our back, a reference count is added and 2 services (get_vma() and
->>>> put_vma()) are introduced to handle the reference count. Once a VMA is
->>>> fetched from the RB tree using get_vma(), it must later be released using
->>>> put_vma(). I no longer see the overhead I got with the will-it-scale
->>>> benchmark.
->>>>
->>>> The VMA's attributes checked during the speculative page fault processing
->>>> have to be protected against parallel changes. This is done by using a per
->>>> VMA sequence lock. This sequence lock allows the speculative page fault
->>>> handler to fast check for parallel changes in progress and to abort the
->>>> speculative page fault in that case.
->>>>
->>>> Once the VMA has been found, the speculative page fault handler checks
->>>> the VMA's attributes to verify whether the page fault can be handled
->>>> correctly. The VMA is protected through a sequence lock which
->>>> allows fast detection of concurrent VMA changes. If such a change is
->>>> detected, the speculative page fault is aborted and a *classic* page fault
->>>> is tried. VMA sequence locking is added wherever VMA attributes which are
->>>> checked during the page fault are modified.
->>>>
->>>> When the PTE is fetched, the VMA is checked to see if it has been changed,
->>>> so once the page table is locked, the VMA is valid, so any other changes
->>>> leading to touching this PTE will need to lock the page table, so no
->>>> parallel change is possible at this time.
->>>>
->>>> The locking of the PTE is done with interrupts disabled; this allows
->>>> checking the PMD to ensure that there is no ongoing collapsing
->>>> operation. Since khugepaged first sets the PMD to pmd_none and then
->>>> waits for the other CPU to have caught the IPI, if the pmd is
->>>> valid at the time the PTE is locked, we have the guarantee that the
->>>> collapsing operation will have to wait on the PTE lock to move forward.
->>>> This allows the SPF handler to map the PTE safely. If the PMD value is
->>>> different from the one recorded at the beginning of the SPF operation, the
->>>> classic page fault handler will be called to handle the operation while
->>>> holding the mmap_sem. As the PTE lock is taken with interrupts disabled,
->>>> the lock is taken using spin_trylock() to avoid deadlock when handling a
->>>> page fault while a TLB invalidate is requested by another CPU holding the
->>>> PTE.
->>>>
->>>> In pseudo code, this could be seen as:
->>>>     speculative_page_fault()
->>>>     {
->>>>             vma = get_vma()
->>>>             check vma sequence count
->>>>             check vma's support
->>>>             disable interrupt
->>>>                   check pgd,p4d,...,pte
->>>>                   save pmd and pte in vmf
->>>>                   save vma sequence counter in vmf
->>>>             enable interrupt
->>>>             check vma sequence count
->>>>             handle_pte_fault(vma)
->>>>                     ..
->>>>                     page = alloc_page()
->>>>                     pte_map_lock()
->>>>                             disable interrupt
->>>>                                     abort if sequence counter has changed
->>>>                                     abort if pmd or pte has changed
->>>>                                     pte map and lock
->>>>                             enable interrupt
->>>>                     if abort
->>>>                        free page
->>>>                        abort
->>>>                     ...
->>>>     }
->>>>
->>>>     arch_fault_handler()
->>>>     {
->>>>             if (speculative_page_fault(&vma))
->>>>                goto done
->>>>     again:
->>>>             lock(mmap_sem)
->>>>             vma = find_vma();
->>>>             handle_pte_fault(vma);
->>>>             if retry
->>>>                unlock(mmap_sem)
->>>>                goto again;
->>>>     done:
->>>>             handle fault error
->>>>     }
->>>>
->>>> Support for THP is not done because when checking for the PMD, we can be
->>>> confused by an in-progress collapsing operation done by khugepaged. The
->>>> issue is that pmd_none() could be true either if the PMD is not already
->>>> populated or if the underlying PTEs are about to be collapsed. So we
->>>> cannot safely allocate a PMD if pmd_none() is true.
->>>>
->>>> This series adds a new software performance event named 'speculative-faults'
->>>> or 'spf'. It counts the number of page fault events successfully handled
->>>> speculatively. When recording 'faults,spf' events, 'faults'
->>>> counts the total number of page fault events while 'spf' only counts
->>>> the faults processed speculatively.
->>>>
->>>> There are some trace events introduced by this series. They allow
->>>> identifying why page faults were not processed speculatively. This
->>>> doesn't take into account the faults generated by a monothreaded process,
->>>> which are directly processed while holding the mmap_sem. These trace events
->>>> are grouped in a system named 'pagefault'; they are:
->>>>  - pagefault:spf_vma_changed : the VMA has been changed behind our back.
->>>>  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set.
->>>>  - pagefault:spf_vma_notsup : the VMA's type is not supported.
->>>>  - pagefault:spf_vma_access : the VMA's access rights are not respected.
->>>>  - pagefault:spf_pmd_changed : the upper PMD pointer has changed behind our
->>>>    back.
->>>>
->>>> To record all the related events, the easiest way is to run perf with the
->>>> following arguments:
->>>> $ perf stat -e 'faults,spf,pagefault:*' <command>
->>>>
->>>> There is also a dedicated vmstat counter showing the number of successful
->>>> page faults handled speculatively. It can be seen this way:
->>>> $ grep speculative_pgfault /proc/vmstat
->>>>
->>>> This series builds on top of v4.16-mmotm-2018-04-13-17-28 and is functional
->>>> on x86, PowerPC and arm64.
->>>>
->>>> ---------------------
->>>> Real Workload results
->>>>
->>>> As mentioned in a previous email, we did unofficial runs using a "popular
->>>> in memory multithreaded database product" on a 176-core SMT8 Power system,
->>>> which showed a 30% improvement in the number of transactions processed per
->>>> second. This run was done on the v6 series, but the changes introduced in
->>>> this new version should not impact the performance boost seen.
->>>>
->>>> Here are the perf data captured during 2 of these runs on top of the v8
->>>> series:
->>>>                 vanilla         spf
->>>> faults          89.418          101.364         +13%
->>>> spf                n/a           97.989
->>>>
->>>> With the SPF kernel, most of the page faults were processed in a speculative
->>>> way.
->>>>
->>>> Ganesh Mahendran had backported the series on top of a 4.9 kernel and gave
->>>> it a try on an Android device. He reported that the application launch time
->>>> was improved on average by 6%, and for large applications (~100 threads) by
->>>> 20%.
->>>>
->>>> Here are the launch times Ganesh measured on Android 8.0 on top of a Qcom
->>>> MSM845 (8 cores) with 6GB (lower is better):
->>>>
->>>> Application                             4.9     4.9+spf delta
->>>> com.tencent.mm                          416     389     -7%
->>>> com.eg.android.AlipayGphone             1135    986     -13%
->>>> com.tencent.mtt                         455     454     0%
->>>> com.qqgame.hlddz                        1497    1409    -6%
->>>> com.autonavi.minimap                    711     701     -1%
->>>> com.tencent.tmgp.sgame                  788     748     -5%
->>>> com.immomo.momo                         501     487     -3%
->>>> com.tencent.peng                        2145    2112    -2%
->>>> com.smile.gifmaker                      491     461     -6%
->>>> com.baidu.BaiduMap                      479     366     -23%
->>>> com.taobao.taobao                       1341    1198    -11%
->>>> com.baidu.searchbox                     333     314     -6%
->>>> com.tencent.mobileqq                    394     384     -3%
->>>> com.sina.weibo                          907     906     0%
->>>> com.youku.phone                         816     731     -11%
->>>> com.happyelements.AndroidAnimal.qq      763     717     -6%
->>>> com.UCMobile                            415     411     -1%
->>>> com.tencent.tmgp.ak                     1464    1431    -2%
->>>> com.tencent.qqmusic                     336     329     -2%
->>>> com.sankuai.meituan                     1661    1302    -22%
->>>> com.netease.cloudmusic                  1193    1200    1%
->>>> air.tv.douyu.android                    4257    4152    -2%
->>>>
->>>> ------------------
->>>> Benchmarks results
->>>>
->>>> Base kernel is v4.17.0-rc4-mm1
->>>> SPF is BASE + this series
->>>>
->>>> Kernbench:
->>>> ----------
->>>> Here are the results on a 16 CPUs X86 guest using kernbench on a 4.15
->>>> kernel (the kernel is built 5 times):
->>>>
->>>> Average Half load -j 8
->>>>                  Run    (std deviation)
->>>>                  BASE                   SPF
->>>> Elapsed Time     1448.65 (5.72312)      1455.84 (4.84951)       0.50%
->>>> User    Time     10135.4 (30.3699)      10148.8 (31.1252)       0.13%
->>>> System  Time     900.47  (2.81131)      923.28  (7.52779)       2.53%
->>>> Percent CPU      761.4   (1.14018)      760.2   (0.447214)      -0.16%
->>>> Context Switches 85380   (3419.52)      84748   (1904.44)       -0.74%
->>>> Sleeps           105064  (1240.96)      105074  (337.612)       0.01%
->>>>
->>>> Average Optimal load -j 16
->>>>                  Run    (std deviation)
->>>>                  BASE                   SPF
->>>> Elapsed Time     920.528 (10.1212)      927.404 (8.91789)       0.75%
->>>> User    Time     11064.8 (981.142)      11085   (990.897)       0.18%
->>>> System  Time     979.904 (84.0615)      1001.14 (82.5523)       2.17%
->>>> Percent CPU      1089.5  (345.894)      1086.1  (343.545)       -0.31%
->>>> Context Switches 159488  (78156.4)      158223  (77472.1)       -0.79%
->>>> Sleeps           110566  (5877.49)      110388  (5617.75)       -0.16%
->>>>
->>>>
->>>> During a run on the SPF, perf events were captured:
->>>>  Performance counter stats for '../kernbench -M':
->>>>          526743764      faults
->>>>                210      spf
->>>>                  3      pagefault:spf_vma_changed
->>>>                  0      pagefault:spf_vma_noanon
->>>>               2278      pagefault:spf_vma_notsup
->>>>                  0      pagefault:spf_vma_access
->>>>                  0      pagefault:spf_pmd_changed
->>>>
->>>> Very few speculative page faults were recorded as most of the processes
->>>> involved are monothreaded (it seems that on this architecture some threads
->>>> were created during the kernel build processing).
->>>>
->>>> Here are the kernbench results on a 80 CPUs Power8 system:
->>>>
->>>> Average Half load -j 40
->>>>                  Run    (std deviation)
->>>>                  BASE                   SPF
->>>> Elapsed Time     117.152 (0.774642)     117.166 (0.476057)      0.01%
->>>> User    Time     4478.52 (24.7688)      4479.76 (9.08555)       0.03%
->>>> System  Time     131.104 (0.720056)     134.04  (0.708414)      2.24%
->>>> Percent CPU      3934    (19.7104)      3937.2  (19.0184)       0.08%
->>>> Context Switches 92125.4 (576.787)      92581.6 (198.622)       0.50%
->>>> Sleeps           317923  (652.499)      318469  (1255.59)       0.17%
->>>>
->>>> Average Optimal load -j 80
->>>>                  Run    (std deviation)
->>>>                  BASE                   SPF
->>>> Elapsed Time     107.73  (0.632416)     107.31  (0.584936)      -0.39%
->>>> User    Time     5869.86 (1466.72)      5871.71 (1467.27)       0.03%
->>>> System  Time     153.728 (23.8573)      157.153 (24.3704)       2.23%
->>>> Percent CPU      5418.6  (1565.17)      5436.7  (1580.91)       0.33%
->>>> Context Switches 223861  (138865)       225032  (139632)        0.52%
->>>> Sleeps           330529  (13495.1)      332001  (14746.2)       0.45%
->>>>
->>>> During a run on the SPF, perf events were captured:
->>>>  Performance counter stats for '../kernbench -M':
->>>>          116730856      faults
->>>>                  0      spf
->>>>                  3      pagefault:spf_vma_changed
->>>>                  0      pagefault:spf_vma_noanon
->>>>                476      pagefault:spf_vma_notsup
->>>>                  0      pagefault:spf_vma_access
->>>>                  0      pagefault:spf_pmd_changed
->>>>
->>>> Most of the processes involved are monothreaded so SPF is not activated but
->>>> there is no impact on the performance.
->>>>
->>>> Ebizzy:
->>>> -------
->>>> The test counts the number of records per second it can manage; higher
->>>> is better. I ran it like this: 'ebizzy -mTt <nrcpus>'. To get
->>>> consistent results I repeated the test 100 times and measured the average
->>>> result. The number is the records processed per second; higher is
->>>> better.
->>>>
->>>>                 BASE            SPF             delta
->>>> 16 CPUs x86 VM  742.57          1490.24         100.69%
->>>> 80 CPUs P8 node 13105.4         24174.23        84.46%
->>>>
->>>> Here are the performance counters read during a run on a 16 CPUs x86 VM:
->>>>  Performance counter stats for './ebizzy -mTt 16':
->>>>            1706379      faults
->>>>            1674599      spf
->>>>              30588      pagefault:spf_vma_changed
->>>>                  0      pagefault:spf_vma_noanon
->>>>                363      pagefault:spf_vma_notsup
->>>>                  0      pagefault:spf_vma_access
->>>>                  0      pagefault:spf_pmd_changed
->>>>
->>>> And the ones captured during a run on a 80 CPUs Power node:
->>>>  Performance counter stats for './ebizzy -mTt 80':
->>>>            1874773      faults
->>>>            1461153      spf
->>>>             413293      pagefault:spf_vma_changed
->>>>                  0      pagefault:spf_vma_noanon
->>>>                200      pagefault:spf_vma_notsup
->>>>                  0      pagefault:spf_vma_access
->>>>                  0      pagefault:spf_pmd_changed
->>>>
->>>> In ebizzy's case most of the page faults were handled in a speculative way,
->>>> leading to the ebizzy performance boost.
->>>>
->>>> ------------------
->>>> Changes since v10 (https://lkml.org/lkml/2018/4/17/572):
->>>>  - Accounted for all review feedback from Punit Agrawal, Ganesh Mahendran
->>>>    and Minchan Kim, hopefully.
->>>>  - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in
->>>>    __do_page_fault().
->>>>  - Loop in pte_spinlock() and pte_map_lock() when pte try lock fails
->>>>    instead
->>>>    of aborting the speculative page fault handling. Dropping the now
->>>> useless
->>>>    trace event pagefault:spf_pte_lock.
->>>>  - No longer try to reuse the fetched VMA during the speculative page fault
->>>>    handling when retrying is needed. This added a lot of complexity and
->>>>    additional tests didn't show a significant performance improvement.
->>>>  - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error.
->>>>
->>>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none
->>>> [2] https://patchwork.kernel.org/patch/9999687/
->>>>
->>>>
->>>> Laurent Dufour (20):
->>>>   mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT
->>>>   x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
->>>>   powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
->>>>   mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
->>>>   mm: make pte_unmap_same compatible with SPF
->>>>   mm: introduce INIT_VMA()
->>>>   mm: protect VMA modifications using VMA sequence count
->>>>   mm: protect mremap() against SPF hanlder
->>>>   mm: protect SPF handler against anon_vma changes
->>>>   mm: cache some VMA fields in the vm_fault structure
->>>>   mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()
->>>>   mm: introduce __lru_cache_add_active_or_unevictable
->>>>   mm: introduce __vm_normal_page()
->>>>   mm: introduce __page_add_new_anon_rmap()
->>>>   mm: protect mm_rb tree with a rwlock
->>>>   mm: adding speculative page fault failure trace events
->>>>   perf: add a speculative page fault sw event
->>>>   perf tools: add support for the SPF perf event
->>>>   mm: add speculative page fault vmstats
->>>>   powerpc/mm: add speculative page fault
->>>>
->>>> Mahendran Ganesh (2):
->>>>   arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
->>>>   arm64/mm: add speculative page fault
->>>>
->>>> Peter Zijlstra (4):
->>>>   mm: prepare for FAULT_FLAG_SPECULATIVE
->>>>   mm: VMA sequence count
->>>>   mm: provide speculative fault infrastructure
->>>>   x86/mm: add speculative pagefault handling
->>>>
->>>>  arch/arm64/Kconfig                    |   1 +
->>>>  arch/arm64/mm/fault.c                 |  12 +
->>>>  arch/powerpc/Kconfig                  |   1 +
->>>>  arch/powerpc/mm/fault.c               |  16 +
->>>>  arch/x86/Kconfig                      |   1 +
->>>>  arch/x86/mm/fault.c                   |  27 +-
->>>>  fs/exec.c                             |   2 +-
->>>>  fs/proc/task_mmu.c                    |   5 +-
->>>>  fs/userfaultfd.c                      |  17 +-
->>>>  include/linux/hugetlb_inline.h        |   2 +-
->>>>  include/linux/migrate.h               |   4 +-
->>>>  include/linux/mm.h                    | 136 +++++++-
->>>>  include/linux/mm_types.h              |   7 +
->>>>  include/linux/pagemap.h               |   4 +-
->>>>  include/linux/rmap.h                  |  12 +-
->>>>  include/linux/swap.h                  |  10 +-
->>>>  include/linux/vm_event_item.h         |   3 +
->>>>  include/trace/events/pagefault.h      |  80 +++++
->>>>  include/uapi/linux/perf_event.h       |   1 +
->>>>  kernel/fork.c                         |   5 +-
->>>>  mm/Kconfig                            |  22 ++
->>>>  mm/huge_memory.c                      |   6 +-
->>>>  mm/hugetlb.c                          |   2 +
->>>>  mm/init-mm.c                          |   3 +
->>>>  mm/internal.h                         |  20 ++
->>>>  mm/khugepaged.c                       |   5 +
->>>>  mm/madvise.c                          |   6 +-
->>>>  mm/memory.c                           | 612 +++++++++++++++++++++++++++++-----
->>>>  mm/mempolicy.c                        |  51 ++-
->>>>  mm/migrate.c                          |   6 +-
->>>>  mm/mlock.c                            |  13 +-
->>>>  mm/mmap.c                             | 229 ++++++++++---
->>>>  mm/mprotect.c                         |   4 +-
->>>>  mm/mremap.c                           |  13 +
->>>>  mm/nommu.c                            |   2 +-
->>>>  mm/rmap.c                             |   5 +-
->>>>  mm/swap.c                             |   6 +-
->>>>  mm/swap_state.c                       |   8 +-
->>>>  mm/vmstat.c                           |   5 +-
->>>>  tools/include/uapi/linux/perf_event.h |   1 +
->>>>  tools/perf/util/evsel.c               |   1 +
->>>>  tools/perf/util/parse-events.c        |   4 +
->>>>  tools/perf/util/parse-events.l        |   1 +
->>>>  tools/perf/util/python.c              |   1 +
->>>>  44 files changed, 1161 insertions(+), 211 deletions(-)
->>>>  create mode 100644 include/trace/events/pagefault.h
->>>>
->>>> --
->>>> 2.7.4
->>>>
->>>>
->>>
->>
->
\ No newline at end of file
+Hi Laurent,
+
+
+For the test results on the Intel 4s Skylake platform (192 CPUs, 768G memory), the test cases below were all run 3 times.
+I checked the test results: only page_fault3_thread/enable THP has a 6% stddev for the head commit; the other tests have lower stddev.
+
+I did not find any other high variation in the test case results.
+
+a). Enable THP
+testcase                          base     stddev       change      head     stddev         metric
+page_fault3/enable THP           10519      ± 3%        -20.5%      8368      ±6%          will-it-scale.per_thread_ops
+page_fault2/enable THP            8281      ± 2%        -18.8%      6728                   will-it-scale.per_thread_ops
+brk1/enable THP                 998475                   -2.2%    976893                   will-it-scale.per_process_ops
+context_switch1/enable THP      223910                   -1.3%    220930                   will-it-scale.per_process_ops
+context_switch1/enable THP      233722                   -1.0%    231288                   will-it-scale.per_thread_ops
+
+b). Disable THP
+testcase                          base     stddev       change      head     stddev         metric
+page_fault3/disable THP          10856                  -23.1%      8344                   will-it-scale.per_thread_ops
+page_fault2/disable THP           8147                  -18.8%      6613                   will-it-scale.per_thread_ops
+brk1/disable THP                   957                   -7.9%       881                   will-it-scale.per_thread_ops
+context_switch1/disable THP     237006                   -2.2%    231907                   will-it-scale.per_thread_ops
+brk1/disable THP                997317                   -2.0%    977778                   will-it-scale.per_process_ops
+page_fault3/disable THP         467454                   -1.8%    459251                   will-it-scale.per_process_ops
+context_switch1/disable THP     224431                   -1.3%    221567                   will-it-scale.per_process_ops
+
+
+Best regards,
+Haiyan Song
+________________________________________=0A=
+From: Laurent Dufour [ldufour@linux.vnet.ibm.com]=0A=
+Sent: Monday, July 02, 2018 4:59 PM=0A=
+To: Song, HaiyanX=0A=
+Cc: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; kir=
+ill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; Mat=
+thew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com; =
+benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Glei=
+xner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.s=
+enozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi=
+; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan K=
+im; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel@vger.kernel.org; l=
+inux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharora=
+@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozlabs=
+.org; x86@kernel.org=0A=
+Subject: Re: [PATCH v11 00/26] Speculative page faults=0A=
+=0A=
+On 11/06/2018 09:49, Song, HaiyanX wrote:=0A=
+> Hi Laurent,=0A=
+>=0A=
+> Regression tests for the v11 patch series have been run; some regressions were found by LKP-tools (linux kernel performance)
+> tested on the Intel 4s Skylake platform. This time only the cases which were run and showed regressions on
+> the V9 patch series were tested.
+>
+> The regression results are sorted by the metric will-it-scale.per_thread_ops.
+> branch: Laurent-Dufour/Speculative-page-faults/20180520-045126=0A=
+> commit id:=0A=
+>   head commit : a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12=0A=
+>   base commit : ba98a1cdad71d259a194461b3a61471b49b14df1=0A=
+> Benchmark: will-it-scale=0A=
+> Download link: https://github.com/antonblanchard/will-it-scale/tree/master
+>
+> Metrics:
+>   will-it-scale.per_process_ops=processes/nr_cpu
+>   will-it-scale.per_thread_ops=threads/nr_cpu
+>   test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
+> THP: enable / disable=0A=
+> nr_task:100%=0A=
+>=0A=
+> 1. Regressions:=0A=
+>=0A=
+> a). Enable THP
+> testcase                          base           change      head           metric
+> page_fault3/enable THP           10519          -20.5%       8368      will-it-scale.per_thread_ops
+> page_fault2/enable THP            8281          -18.8%       6728      will-it-scale.per_thread_ops
+> brk1/enable THP                 998475           -2.2%     976893      will-it-scale.per_process_ops
+> context_switch1/enable THP      223910           -1.3%     220930      will-it-scale.per_process_ops
+> context_switch1/enable THP      233722           -1.0%     231288      will-it-scale.per_thread_ops
+>
+> b). Disable THP
+> page_fault3/disable THP          10856          -23.1%       8344      will-it-scale.per_thread_ops
+> page_fault2/disable THP           8147          -18.8%       6613      will-it-scale.per_thread_ops
+> brk1/disable THP                   957           -7.9%        881      will-it-scale.per_thread_ops
+> context_switch1/disable THP     237006           -2.2%     231907      will-it-scale.per_thread_ops
+> brk1/disable THP                997317           -2.0%     977778      will-it-scale.per_process_ops
+> page_fault3/disable THP         467454           -1.8%     459251      will-it-scale.per_process_ops
+> context_switch1/disable THP     224431           -1.3%     221567      will-it-scale.per_process_ops
+>
+> Notes: for the above test result values, higher is better.
+=0A=
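The "change" column in the regression tables appears to be the relative difference between the head and base results. The following is a minimal C sketch of that arithmetic, assuming the column is computed as (head - base) / base. The helper names `percent_change` and `print_row` are illustrative only and are not part of LKP-tools or will-it-scale.

```c
#include <stdio.h>

/*
 * Relative change of the head result versus the base result, in percent.
 * A negative value means the head commit scored lower than the base commit.
 * (Assumed formula; the reporting tool may average runs before comparing.)
 */
double percent_change(double base, double head)
{
    return (head - base) / base * 100.0;
}

/* Print one row in roughly the same layout as the tables in this thread. */
void print_row(const char *testcase, double base, double head)
{
    printf("%-30s %10.0f %+8.1f%% %10.0f\n",
           testcase, base, percent_change(base, head), head);
}
```

For the page_fault2/enable THP row (base 8281, head 6728), `percent_change` gives about -18.8%, which agrees with the value reported in the table.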
+I tried the same tests on my PowerPC victim VM (1024 CPUs, 11TB) and I can't
+get reproducible results. The results vary hugely, even on the vanilla
+kernel, so I can't draw any conclusion about changes from them.
+
+I tried on a smaller node (80 CPUs, 32G), and the tests ran better, but I didn't
+measure any change between the vanilla and the SPF patched kernels:
+
+test THP enabled                4.17.0-rc4-mm1  spf             delta
+page_fault3_threads             2697.7          2683.5          -0.53%
+page_fault2_threads             170660.6        169574.1        -0.64%
+context_switch1_threads         6915269.2       6877507.3       -0.55%
+context_switch1_processes       6478076.2       6529493.5       0.79%
+brk1                            243391.2        238527.5        -2.00%
+
+Tests were run 10 times, no high variation detected.
+
+Did you see high variation on your side? How many times were the tests run to
+compute the average values?
+=0A=
+Thanks,=0A=
+Laurent.=0A=
+=0A=
+=0A=
+>=0A=
+> 2. Improvements: no improvement was found in the selected test cases.
+>=0A=
+>=0A=
+> Best regards=0A=
+> Haiyan Song=0A=
+> ________________________________________=0A=
+> From: owner-linux-mm@kvack.org [owner-linux-mm@kvack.org] on behalf of La=
+urent Dufour [ldufour@linux.vnet.ibm.com]=0A=
+> Sent: Monday, May 28, 2018 4:54 PM=0A=
+> To: Song, HaiyanX=0A=
+> Cc: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; k=
+irill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; M=
+atthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com=
+; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Gl=
+eixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey=
+.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Ke=
+mi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan=
+ Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel@vger.kernel.org;=
+ linux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharo=
+ra@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozla=
+bs.org; x86@kernel.org=0A=
+> Subject: Re: [PATCH v11 00/26] Speculative page faults=0A=
+>=0A=
+> On 28/05/2018 10:22, Haiyan Song wrote:=0A=
+>> Hi Laurent,=0A=
+>>=0A=
+>> Yes, these tests are done on V9 patch.=0A=
+>=0A=
+> Do you plan to give this V11 a run ?=0A=
+>=0A=
+>>=0A=
+>>=0A=
+>> Best regards,=0A=
+>> Haiyan Song=0A=
+>>=0A=
+>> On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote:=0A=
+>>> On 28/05/2018 07:23, Song, HaiyanX wrote:=0A=
+>>>>=0A=
+>>>> Some regressions and improvements were found by LKP-tools (linux kernel performance) on the V9 patch series
+>>>> tested on the Intel 4s Skylake platform.
+>>>=0A=
+>>> Hi,=0A=
+>>>=0A=
+>>> Thanks for reporting these benchmark results, but you mentioned the "V9 patch
+>>> series" while responding to the v11 header series...
+>>> Were these tests done on v9 or v11?
+>>>=0A=
+>>> Cheers,=0A=
+>>> Laurent.=0A=
+>>>=0A=
+>>>>=0A=
+>>>> The regression results are sorted by the metric will-it-scale.per_thread_ops.
+>>>> Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 patch series)
+>>>> Commit id:
+>>>>     base commit: d55f34411b1b126429a823d06c3124c16283231f
+>>>>     head commit: 0355322b3577eeab7669066df42c550a56801110
+>>>> Benchmark suite: will-it-scale
+>>>> Download link:
+>>>> https://github.com/antonblanchard/will-it-scale/tree/master/tests
+>>>> Metrics:
+>>>>     will-it-scale.per_process_ops=processes/nr_cpu
+>>>>     will-it-scale.per_thread_ops=threads/nr_cpu
+>>>> test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
+>>>> THP: enable / disable
+>>>> nr_task: 100%
+>>>>=0A=
+>>>> 1. Regressions:=0A=
+>>>> a) THP enabled:
+>>>> testcase                        base            change          head       metric
+>>>> page_fault3/ enable THP         10092           -17.5%          8323       will-it-scale.per_thread_ops
+>>>> page_fault2/ enable THP          8300           -17.2%          6869       will-it-scale.per_thread_ops
+>>>> brk1/ enable THP                  957.67         -7.6%           885       will-it-scale.per_thread_ops
+>>>> page_fault3/ enable THP        172821            -5.3%        163692       will-it-scale.per_process_ops
+>>>> signal1/ enable THP              9125            -3.2%          8834       will-it-scale.per_process_ops
+>>>>
+>>>> b) THP disabled:
+>>>> testcase                        base            change          head       metric
+>>>> page_fault3/ disable THP        10107           -19.1%          8180       will-it-scale.per_thread_ops
+>>>> page_fault2/ disable THP         8432           -17.8%          6931       will-it-scale.per_thread_ops
+>>>> context_switch1/ disable THP   215389            -6.8%        200776       will-it-scale.per_thread_ops
+>>>> brk1/ disable THP                 939.67         -6.6%           877.33    will-it-scale.per_thread_ops
+>>>> page_fault3/ disable THP       173145            -4.7%        165064       will-it-scale.per_process_ops
+>>>> signal1/ disable THP             9162            -3.9%          8802       will-it-scale.per_process_ops
+>>>>=0A=
+>>>> 2. Improvements:=0A=
+>>>> a) THP enabled:
+>>>> testcase                        base            change          head       metric
+>>>> malloc1/ enable THP               66.33        +469.8%           383.67    will-it-scale.per_thread_ops
+>>>> writeseek3/ enable THP          2531             +4.5%          2646       will-it-scale.per_thread_ops
+>>>> signal1/ enable THP              989.33          +2.8%          1016       will-it-scale.per_thread_ops
+>>>>
+>>>> b) THP disabled:
+>>>> testcase                        base            change          head       metric
+>>>> malloc1/ disable THP              90.33        +417.3%           467.33    will-it-scale.per_thread_ops
+>>>> read2/ disable THP             58934            +39.2%         82060       will-it-scale.per_thread_ops
+>>>> page_fault1/ disable THP        8607            +36.4%         11736       will-it-scale.per_thread_ops
+>>>> read1/ disable THP            314063            +12.7%        353934       will-it-scale.per_thread_ops
+>>>> writeseek3/ disable THP         2452            +12.5%          2759       will-it-scale.per_thread_ops
+>>>> signal1/ disable THP             971.33          +5.5%          1024       will-it-scale.per_thread_ops
+>>>>
+>>>> Notes: for the values in the "change" column, a higher value means that the related testcase result
+>>>> on the head commit is better than that on the base commit for this benchmark.
+>>>>=0A=
+>>>>=0A=
+>>>> Best regards=0A=
+>>>> Haiyan Song=0A=
+>>>>=0A=
+>>>> ________________________________________=0A=
+>>>> From: owner-linux-mm@kvack.org [owner-linux-mm@kvack.org] on behalf of=
+ Laurent Dufour [ldufour@linux.vnet.ibm.com]=0A=
+>>>> Sent: Thursday, May 17, 2018 7:06 PM=0A=
+>>>> To: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org=
+; kirill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz=
+; Matthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.=
+com; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas=
+ Gleixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; ser=
+gey.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang,=
+ Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minc=
+han Kim; Punit Agrawal; vinayak menon; Yang Shi=0A=
+>>>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; haren@linux.vnet=
+.ibm.com; npiggin@gmail.com; bsingharora@gmail.com; paulmck@linux.vnet.ibm.=
+com; Tim Chen; linuxppc-dev@lists.ozlabs.org; x86@kernel.org=0A=
+>>>> Subject: [PATCH v11 00/26] Speculative page faults=0A=
+>>>>=0A=
+>>>> This is a port to kernel 4.17 of the work done by Peter Zijlstra to handle
+>>>> page faults without holding the mm semaphore [1].
+>>>>
+>>>> The idea is to try to handle user space page faults without holding the
+>>>> mmap_sem. This should allow better concurrency for massively threaded
+>>>> processes, since the page fault handler will not wait for another thread's
+>>>> memory layout change to complete, assuming that this change is done in
+>>>> another part of the process's memory space. This type of page fault is named
+>>>> a speculative page fault. If the speculative page fault fails because a
+>>>> concurrent change is detected or because the underlying PMD or PTE tables are
+>>>> not yet allocated, its processing is abandoned and a classic page fault is
+>>>> then tried.
+>>>>=0A=
+>>>> The speculative page fault (SPF) has to look for the VMA matching the fault
+>>>> address without holding the mmap_sem. This is done by introducing a rwlock
+>>>> which protects access to the mm_rb tree. Previously this was done using
+>>>> SRCU, but that introduced a lot of scheduling to process the VMA's
+>>>> freeing operation, which hit performance by 20% as reported by
+>>>> Kemi Wang [2]. Using a rwlock to protect access to the mm_rb tree
+>>>> limits the locking contention to these operations, which are expected to
+>>>> be of O(log n) order. In addition, to ensure that the VMA is not freed
+>>>> behind our back, a reference count is added and 2 services (get_vma() and
+>>>> put_vma()) are introduced to handle the reference count. Once a VMA is
+>>>> fetched from the RB tree using get_vma(), it must later be released using
+>>>> put_vma(). I no longer see the overhead I previously got with the
+>>>> will-it-scale benchmark.
+>>>>=0A=
+>>>> The VMA's attributes checked during the speculative page fault processing
+>>>> have to be protected against parallel changes. This is done by using a
+>>>> per-VMA sequence lock. This sequence lock allows the speculative page fault
+>>>> handler to quickly check for parallel changes in progress and to abort the
+>>>> speculative page fault in that case.
+>>>>
+>>>> Once the VMA has been found, the speculative page fault handler checks
+>>>> the VMA's attributes to verify whether the page fault can be handled
+>>>> correctly or not. Thus, the VMA is protected through a sequence lock which
+>>>> allows fast detection of concurrent VMA changes. If such a change is
+>>>> detected, the speculative page fault is aborted and a *classic* page fault
+>>>> is tried. VMA sequence locking is added wherever VMA attributes which are
+>>>> checked during the page fault are modified.
+>>>>=0A=
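The sequence-count idea described in the cover letter can be illustrated outside the kernel. The following is a minimal userspace sketch using C11 atomics, not the kernel's seqcount API and not the code from this series. The type `seq_guard` and the four helpers are hypothetical names, a single writer at a time is assumed, and the default sequentially consistent ordering is used for simplicity. Real code would also need to read the protected fields with accesses that tolerate concurrent writes.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-object sequence count. */
struct seq_guard {
    atomic_uint seq;    /* odd while a writer is modifying the object */
};

/* Writer side: make the count odd before touching the protected fields. */
void seq_write_begin(struct seq_guard *g)
{
    atomic_fetch_add(&g->seq, 1);
}

/* Writer side: make the count even again once the update is complete. */
void seq_write_end(struct seq_guard *g)
{
    atomic_fetch_add(&g->seq, 1);
}

/* Reader side: take a snapshot of the count before reading the fields. */
unsigned seq_read_begin(struct seq_guard *g)
{
    return atomic_load(&g->seq);
}

/*
 * Reader side: returns true when the values read since the snapshot cannot
 * be trusted, either because a writer was active when the snapshot was
 * taken (odd value) or because the count has moved since then.
 */
bool seq_read_retry(struct seq_guard *g, unsigned snapshot)
{
    return (snapshot & 1u) != 0 || atomic_load(&g->seq) != snapshot;
}
```

A reader following this pattern would take a snapshot, read the fields it needs, and call `seq_read_retry()`; on a true result it gives up the optimistic path, which in the scheme described here corresponds to falling back to the classic page fault.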
+>>>> When the PTE is fetched, the VMA is checked to see if it has been changed.
+>>>> Once the page table is locked, the VMA is therefore valid, and any other
+>>>> change leading to touching this PTE will need to lock the page table, so no
+>>>> parallel change is possible at this time.
+>>>>
+>>>> The locking of the PTE is done with interrupts disabled. This allows
+>>>> checking the PMD to ensure that there is no ongoing collapsing
+>>>> operation. Since khugepaged first sets the PMD to pmd_none and then
+>>>> waits for the other CPUs to have caught the IPI interrupt, if the PMD is
+>>>> valid at the time the PTE is locked, we have the guarantee that the
+>>>> collapsing operation will have to wait on the PTE lock to move forward.
+>>>> This allows the SPF handler to map the PTE safely. If the PMD value is
+>>>> different from the one recorded at the beginning of the SPF operation, the
+>>>> classic page fault handler will be called to handle the operation while
+>>>> holding the mmap_sem. As the PTE lock is taken with interrupts disabled,
+>>>> the lock is taken using spin_trylock() to avoid a deadlock when handling a
+>>>> page fault while a TLB invalidate is requested by another CPU holding the
+>>>> PTE.
+>>>>=0A=
+>>>> In pseudo code, this could be seen as:
+>>>>     speculative_page_fault()
+>>>>     {
+>>>>             vma = get_vma()
+>>>>             check vma sequence count
+>>>>             check vma's support
+>>>>             disable interrupt
+>>>>                   check pgd,p4d,...,pte
+>>>>                   save pmd and pte in vmf
+>>>>                   save vma sequence counter in vmf
+>>>>             enable interrupt
+>>>>             check vma sequence count
+>>>>             handle_pte_fault(vma)
+>>>>                     ..
+>>>>                     page = alloc_page()
+>>>>                     pte_map_lock()
+>>>>                             disable interrupt
+>>>>                                     abort if sequence counter has changed
+>>>>                                     abort if pmd or pte has changed
+>>>>                                     pte map and lock
+>>>>                             enable interrupt
+>>>>                     if abort
+>>>>                        free page
+>>>>                        abort
+>>>>                     ...
+>>>>     }
+>>>>
+>>>>     arch_fault_handler()
+>>>>     {
+>>>>             if (speculative_page_fault(&vma))
+>>>>                goto done
+>>>>     again:
+>>>>             lock(mmap_sem)
+>>>>             vma = find_vma();
+>>>>             handle_pte_fault(vma);
+>>>>             if retry
+>>>>                unlock(mmap_sem)
+>>>>                goto again;
+>>>>     done:
+>>>>             handle fault error
+>>>>     }
+>>>>=0A=
+>>>> THP is not supported because, when checking the PMD, we can be
+>>>> confused by an in-progress collapsing operation done by khugepaged. The
+>>>> issue is that pmd_none() could be true either if the PMD is not yet
+>>>> populated or if the underlying PTEs are about to be collapsed. So we
+>>>> cannot safely allocate a PMD if pmd_none() is true.
+>>>>=0A=
+>>>> This series adds a new software performance event named 'speculative-faults'
+>>>> or 'spf'. It counts the number of page fault events successfully handled
+>>>> speculatively. When recording 'faults,spf' events, 'faults'
+>>>> counts the total number of page fault events while 'spf' only counts
+>>>> the part of the faults processed speculatively.
+>>>>=0A=
+>>>> Some trace events are introduced by this series. They allow
+>>>> identifying why page faults were not processed speculatively. This
+>>>> doesn't take into account the faults generated by a monothreaded process,
+>>>> which are directly processed while holding the mmap_sem. These trace events
+>>>> are grouped in a system named 'pagefault'; they are:
+>>>>  - pagefault:spf_vma_changed : the VMA has been changed behind our back
+>>>>  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set
+>>>>  - pagefault:spf_vma_notsup : the VMA's type is not supported
+>>>>  - pagefault:spf_vma_access : the VMA's access rights are not respected
+>>>>  - pagefault:spf_pmd_changed : the upper PMD pointer has changed behind
+>>>>    our back
+>>>>=0A=
+>>>> To record all the related events, the easiest way is to run perf with the
+>>>> following arguments:
+>>>> $ perf stat -e 'faults,spf,pagefault:*' <command>
+>>>>
+>>>> There is also a dedicated vmstat counter showing the number of page
+>>>> faults successfully handled speculatively. It can be seen this way:
+>>>> $ grep speculative_pgfault /proc/vmstat
+>>>>=0A=
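For programs that want this counter without shelling out to grep, the lookup can be sketched in C. The helper below is a hypothetical illustration, not part of the series: `vmstat_lookup` is an invented name, it works on text already read from /proc/vmstat (one "name value" pair per line), and the exact counter name `speculative_pgfault` is assumed from the grep pattern in the cover letter.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up a "name value" line in /proc/vmstat-style text.
 * Returns 0 and stores the value in *value on success,
 * or -1 if no line starts with exactly that name.
 */
int vmstat_lookup(const char *text, const char *name,
                  unsigned long long *value)
{
    size_t len = strlen(name);
    const char *line = text;

    while (line && *line) {
        /* The name must be followed by a space to avoid prefix matches. */
        if (strncmp(line, name, len) == 0 && line[len] == ' ')
            return sscanf(line + len, "%llu", value) == 1 ? 0 : -1;

        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

A caller would read /proc/vmstat into a NUL-terminated buffer and pass it as `text`. On a kernel without the series applied, the function simply returns -1 because the line is absent.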
+>>>> This series builds on top of v4.16-mmotm-2018-04-13-17-28 and is functional
+>>>> on x86, PowerPC and arm64.
+>>>>=0A=
+>>>> ---------------------=0A=
+>>>> Real Workload results=0A=
+>>>>=0A=
+>>>> As mentioned in a previous email, we did unofficial runs using a "popular
+>>>> in memory multithreaded database product" on a 176-core SMT8 Power system,
+>>>> which showed a 30% improvement in the number of transactions processed per
+>>>> second. This run was done on the v6 series, but the changes introduced in
+>>>> this new version should not impact the performance boost seen.
+>>>>
+>>>> Here are the perf data captured during 2 of these runs on top of the v8
+>>>> series:
+>>>>                 vanilla         spf
+>>>> faults          89.418          101.364         +13%
+>>>> spf                n/a           97.989
+>>>>
+>>>> With the SPF kernel, most of the page faults were processed in a speculative
+>>>> way.
+>>>>
+>>>> Ganesh Mahendran backported the series on top of a 4.9 kernel and gave
+>>>> it a try on an Android device. He reported that the application launch time
+>>>> was improved on average by 6%, and for large applications (~100 threads) by
+>>>> 20%.
+>>>>
+>>>> Here are the launch times Ganesh measured on Android 8.0 on top of a Qcom
+>>>> MSM845 (8 cores) with 6GB (lower is better):
+>>>>=0A=
+>>>> Application                             4.9     4.9+spf delta=0A=
+>>>> com.tencent.mm                          416     389     -7%=0A=
+>>>> com.eg.android.AlipayGphone             1135    986     -13%=0A=
+>>>> com.tencent.mtt                         455     454     0%=0A=
+>>>> com.qqgame.hlddz                        1497    1409    -6%=0A=
+>>>> com.autonavi.minimap                    711     701     -1%=0A=
+>>>> com.tencent.tmgp.sgame                  788     748     -5%=0A=
+>>>> com.immomo.momo                         501     487     -3%=0A=
+>>>> com.tencent.peng                        2145    2112    -2%=0A=
+>>>> com.smile.gifmaker                      491     461     -6%=0A=
+>>>> com.baidu.BaiduMap                      479     366     -23%=0A=
+>>>> com.taobao.taobao                       1341    1198    -11%=0A=
+>>>> com.baidu.searchbox                     333     314     -6%=0A=
+>>>> com.tencent.mobileqq                    394     384     -3%=0A=
+>>>> com.sina.weibo                          907     906     0%=0A=
+>>>> com.youku.phone                         816     731     -11%=0A=
+>>>> com.happyelements.AndroidAnimal.qq      763     717     -6%=0A=
+>>>> com.UCMobile                            415     411     -1%=0A=
+>>>> com.tencent.tmgp.ak                     1464    1431    -2%=0A=
+>>>> com.tencent.qqmusic                     336     329     -2%=0A=
+>>>> com.sankuai.meituan                     1661    1302    -22%=0A=
+>>>> com.netease.cloudmusic                  1193    1200    1%=0A=
+>>>> air.tv.douyu.android                    4257    4152    -2%=0A=
+>>>>=0A=
+>>>> ------------------=0A=
+>>>> Benchmarks results=0A=
+>>>>=0A=
+>>>> Base kernel is v4.17.0-rc4-mm1=0A=
+>>>> SPF is BASE + this series=0A=
+>>>>=0A=
+>>>> Kernbench:=0A=
+>>>> ----------=0A=
+>>>> Here are the results on a 16-CPU x86 guest using kernbench on a 4.15
+>>>> kernel (the kernel is built 5 times):
+>>>>
+>>>> Average Half load -j 8
+>>>>                  Run    (std deviation)
+>>>>                  BASE                   SPF
+>>>> Elapsed Time     1448.65 (5.72312)      1455.84 (4.84951)       0.50%
+>>>> User    Time     10135.4 (30.3699)      10148.8 (31.1252)       0.13%
+>>>> System  Time     900.47  (2.81131)      923.28  (7.52779)       2.53%
+>>>> Percent CPU      761.4   (1.14018)      760.2   (0.447214)      -0.16%
+>>>> Context Switches 85380   (3419.52)      84748   (1904.44)       -0.74%
+>>>> Sleeps           105064  (1240.96)      105074  (337.612)       0.01%
+>>>>
+>>>> Average Optimal load -j 16
+>>>>                  Run    (std deviation)
+>>>>                  BASE                   SPF
+>>>> Elapsed Time     920.528 (10.1212)      927.404 (8.91789)       0.75%
+>>>> User    Time     11064.8 (981.142)      11085   (990.897)       0.18%
+>>>> System  Time     979.904 (84.0615)      1001.14 (82.5523)       2.17%
+>>>> Percent CPU      1089.5  (345.894)      1086.1  (343.545)       -0.31%
+>>>> Context Switches 159488  (78156.4)      158223  (77472.1)       -0.79%
+>>>> Sleeps           110566  (5877.49)      110388  (5617.75)       -0.16%
+>>>>=0A=
+>>>>=0A=
+>>>> During a run on the SPF, perf events were captured:=0A=
+>>>>  Performance counter stats for '../kernbench -M':=0A=
+>>>>          526743764      faults=0A=
+>>>>                210      spf=0A=
+>>>>                  3      pagefault:spf_vma_changed=0A=
+>>>>                  0      pagefault:spf_vma_noanon=0A=
+>>>>               2278      pagefault:spf_vma_notsup=0A=
+>>>>                  0      pagefault:spf_vma_access=0A=
+>>>>                  0      pagefault:spf_pmd_changed=0A=
+>>>>=0A=
+>>>> Very few speculative page faults were recorded as most of the processes
+>>>> involved are monothreaded (it seems that on this architecture some threads
+>>>> were created during the kernel build processing).
+>>>>=0A=
+>>>> Here are the kernbench results on an 80-CPU Power8 system:
+>>>>
+>>>> Average Half load -j 40
+>>>>                  Run    (std deviation)
+>>>>                  BASE                   SPF
+>>>> Elapsed Time     117.152 (0.774642)     117.166 (0.476057)      0.01%
+>>>> User    Time     4478.52 (24.7688)      4479.76 (9.08555)       0.03%
+>>>> System  Time     131.104 (0.720056)     134.04  (0.708414)      2.24%
+>>>> Percent CPU      3934    (19.7104)      3937.2  (19.0184)       0.08%
+>>>> Context Switches 92125.4 (576.787)      92581.6 (198.622)       0.50%
+>>>> Sleeps           317923  (652.499)      318469  (1255.59)       0.17%
+>>>>
+>>>> Average Optimal load -j 80
+>>>>                  Run    (std deviation)
+>>>>                  BASE                   SPF
+>>>> Elapsed Time     107.73  (0.632416)     107.31  (0.584936)      -0.39%
+>>>> User    Time     5869.86 (1466.72)      5871.71 (1467.27)       0.03%
+>>>> System  Time     153.728 (23.8573)      157.153 (24.3704)       2.23%
+>>>> Percent CPU      5418.6  (1565.17)      5436.7  (1580.91)       0.33%
+>>>> Context Switches 223861  (138865)       225032  (139632)        0.52%
+>>>> Sleeps           330529  (13495.1)      332001  (14746.2)       0.45%
+>>>>=0A=
+>>>> During a run on the SPF, perf events were captured:=0A=
+>>>>  Performance counter stats for '../kernbench -M':=0A=
+>>>>          116730856      faults=0A=
+>>>>                  0      spf=0A=
+>>>>                  3      pagefault:spf_vma_changed=0A=
+>>>>                  0      pagefault:spf_vma_noanon=0A=
+>>>>                476      pagefault:spf_vma_notsup=0A=
+>>>>                  0      pagefault:spf_vma_access=0A=
+>>>>                  0      pagefault:spf_pmd_changed=0A=
+>>>>=0A=
+>>>> Most of the processes involved are monothreaded so SPF is not activated but
+>>>> there is no impact on the performance.
+>>>>=0A=
+>>>> Ebizzy:=0A=
+>>>> -------=0A=
+>>>> The test counts the number of records per second it can manage;
+>>>> higher is better. I run it like this: 'ebizzy -mTt <nrcpus>'. To get
+>>>> consistent results I repeated the test 100 times and measured the average
+>>>> result. The number is the records processed per second; higher is
+>>>> better.
+>>>>=0A=
+>>>>                 BASE            SPF             delta=0A=
+>>>> 16 CPUs x86 VM  742.57          1490.24         100.69%=0A=
+>>>> 80 CPUs P8 node 13105.4         24174.23        84.46%=0A=
+>>>>=0A=
+>>>> Here are the performance counter read during a run on a 16 CPUs x86 VM=
+:=0A=
+>>>>  Performance counter stats for './ebizzy -mTt 16':=0A=
+>>>>            1706379      faults=0A=
+>>>>            1674599      spf=0A=
+>>>>              30588      pagefault:spf_vma_changed=0A=
+>>>>                  0      pagefault:spf_vma_noanon=0A=
+>>>>                363      pagefault:spf_vma_notsup=0A=
+>>>>                  0      pagefault:spf_vma_access=0A=
+>>>>                  0      pagefault:spf_pmd_changed=0A=
+>>>>=0A=
+>>>> And the ones captured during a run on a 80 CPUs Power node:=0A=
+>>>>  Performance counter stats for './ebizzy -mTt 80':=0A=
+>>>>            1874773      faults=0A=
+>>>>            1461153      spf=0A=
+>>>>             413293      pagefault:spf_vma_changed=0A=
+>>>>                  0      pagefault:spf_vma_noanon=0A=
+>>>>                200      pagefault:spf_vma_notsup=0A=
+>>>>                  0      pagefault:spf_vma_access=0A=
+>>>>                  0      pagefault:spf_pmd_changed=0A=
+>>>>=0A=
+>>>> In ebizzy's case most of the page fault were handled in a speculative =
+way,=0A=
+>>>> leading the ebizzy performance boost.=0A=
+>>>>=0A=
+>>>> ------------------=0A=
+>>>> Changes since v10 (https://lkml.org/lkml/2018/4/17/572):=0A=
+>>>>  - Accounted for all review feedbacks from Punit Agrawal, Ganesh Mahen=
+dran=0A=
+>>>>    and Minchan Kim, hopefully.=0A=
+>>>>  - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in=0A=
+>>>>    __do_page_fault().=0A=
+>>>>  - Loop in pte_spinlock() and pte_map_lock() when pte try lock fails=
+=0A=
+>>>>    instead=0A=
+>>>>    of aborting the speculative page fault handling. Dropping the now=
+=0A=
+>>>> useless=0A=
+>>>>    trace event pagefault:spf_pte_lock.=0A=
+>>>>  - No more try to reuse the fetched VMA during the speculative page fa=
+ult=0A=
+>>>>    handling when retrying is needed. This adds a lot of complexity and=
+=0A=
+>>>>    additional tests done didn't show a significant performance improve=
+ment.=0A=
+>>>>  - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error.=
+=0A=
+>>>>=0A=
+>>>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at=
+-speculative-page-faults-tt965642.html#none=0A=
+>>>> [2] https://patchwork.kernel.org/patch/9999687/=0A=
+>>>>=0A=
+>>>>=0A=
+>>>> Laurent Dufour (20):=0A=
+>>>>   mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT=0A=
+>>>>   x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT=0A=
+>>>>   powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT=0A=
+>>>>   mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE=0A=
+>>>>   mm: make pte_unmap_same compatible with SPF=0A=
+>>>>   mm: introduce INIT_VMA()=0A=
+>>>>   mm: protect VMA modifications using VMA sequence count=0A=
+>>>>   mm: protect mremap() against SPF hanlder=0A=
+>>>>   mm: protect SPF handler against anon_vma changes=0A=
+>>>>   mm: cache some VMA fields in the vm_fault structure=0A=
+>>>>   mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()=0A=
+>>>>   mm: introduce __lru_cache_add_active_or_unevictable=0A=
+>>>>   mm: introduce __vm_normal_page()=0A=
+>>>>   mm: introduce __page_add_new_anon_rmap()=0A=
+>>>>   mm: protect mm_rb tree with a rwlock=0A=
+>>>>   mm: adding speculative page fault failure trace events=0A=
+>>>>   perf: add a speculative page fault sw event=0A=
+>>>>   perf tools: add support for the SPF perf event=0A=
+>>>>   mm: add speculative page fault vmstats=0A=
+>>>>   powerpc/mm: add speculative page fault=0A=
+>>>>=0A=
+>>>> Mahendran Ganesh (2):=0A=
+>>>>   arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT=0A=
+>>>>   arm64/mm: add speculative page fault=0A=
+>>>>=0A=
+>>>> Peter Zijlstra (4):=0A=
+>>>>   mm: prepare for FAULT_FLAG_SPECULATIVE=0A=
+>>>>   mm: VMA sequence count=0A=
+>>>>   mm: provide speculative fault infrastructure=0A=
+>>>>   x86/mm: add speculative pagefault handling=0A=
+>>>>=0A=
+>>>>  arch/arm64/Kconfig                    |   1 +=0A=
+>>>>  arch/arm64/mm/fault.c                 |  12 +=0A=
+>>>>  arch/powerpc/Kconfig                  |   1 +=0A=
+>>>>  arch/powerpc/mm/fault.c               |  16 +=0A=
+>>>>  arch/x86/Kconfig                      |   1 +=0A=
+>>>>  arch/x86/mm/fault.c                   |  27 +-=0A=
+>>>>  fs/exec.c                             |   2 +-=0A=
+>>>>  fs/proc/task_mmu.c                    |   5 +-=0A=
+>>>>  fs/userfaultfd.c                      |  17 +-=0A=
+>>>>  include/linux/hugetlb_inline.h        |   2 +-=0A=
+>>>>  include/linux/migrate.h               |   4 +-=0A=
+>>>>  include/linux/mm.h                    | 136 +++++++-=0A=
+>>>>  include/linux/mm_types.h              |   7 +=0A=
+>>>>  include/linux/pagemap.h               |   4 +-=0A=
+>>>>  include/linux/rmap.h                  |  12 +-=0A=
+>>>>  include/linux/swap.h                  |  10 +-=0A=
+>>>>  include/linux/vm_event_item.h         |   3 +=0A=
+>>>>  include/trace/events/pagefault.h      |  80 +++++=0A=
+>>>>  include/uapi/linux/perf_event.h       |   1 +=0A=
+>>>>  kernel/fork.c                         |   5 +-=0A=
+>>>>  mm/Kconfig                            |  22 ++=0A=
+>>>>  mm/huge_memory.c                      |   6 +-=0A=
+>>>>  mm/hugetlb.c                          |   2 +=0A=
+>>>>  mm/init-mm.c                          |   3 +=0A=
+>>>>  mm/internal.h                         |  20 ++=0A=
+>>>>  mm/khugepaged.c                       |   5 +=0A=
+>>>>  mm/madvise.c                          |   6 +-=0A=
+>>>>  mm/memory.c                           | 612 +++++++++++++++++++++++++=
+++++-----=0A=
+>>>>  mm/mempolicy.c                        |  51 ++-=0A=
+>>>>  mm/migrate.c                          |   6 +-=0A=
+>>>>  mm/mlock.c                            |  13 +-=0A=
+>>>>  mm/mmap.c                             | 229 ++++++++++---=0A=
+>>>>  mm/mprotect.c                         |   4 +-=0A=
+>>>>  mm/mremap.c                           |  13 +=0A=
+>>>>  mm/nommu.c                            |   2 +-=0A=
+>>>>  mm/rmap.c                             |   5 +-=0A=
+>>>>  mm/swap.c                             |   6 +-=0A=
+>>>>  mm/swap_state.c                       |   8 +-=0A=
+>>>>  mm/vmstat.c                           |   5 +-=0A=
+>>>>  tools/include/uapi/linux/perf_event.h |   1 +=0A=
+>>>>  tools/perf/util/evsel.c               |   1 +=0A=
+>>>>  tools/perf/util/parse-events.c        |   4 +=0A=
+>>>>  tools/perf/util/parse-events.l        |   1 +=0A=
+>>>>  tools/perf/util/python.c              |   1 +=0A=
+>>>>  44 files changed, 1161 insertions(+), 211 deletions(-)=0A=
+>>>>  create mode 100644 include/trace/events/pagefault.h=0A=
+>>>>=0A=
+>>>> --=0A=
+>>>> 2.7.4=0A=
+>>>>=0A=
+>>>>=0A=
+>>>=0A=
+>>=0A=
+>=0A=
+=0A=
\ No newline at end of file
diff --git a/a/content_digest b/N1/content_digest
index 433e1ff..419f628 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -80,619 +80,833 @@
   "b\0"
 ]
 [
-  "Hi Laurent,\n",
-  "\n",
-  "\n",
-  "For the test result on Intel 4s skylake platform (192 CPUs, 768G Memory), the below test cases all were run 3 times.\n",
-  "I check the test results, only page_fault3_thread/enable THP have 6% stddev for head commit, other tests have lower stddev.\n",
-  "\n",
-  "And I did not find other high variation on test case result.\n",
-  "\n",
-  "a). Enable THP\n",
-  "testcase                          base     stddev       change      head     stddev         metric\n",
-  "page_fault3/enable THP           10519      \302\261 3%        -20.5%      8368      \302\2616%          will-it-scale.per_thread_ops\n",
-  "page_fault2/enalbe THP            8281      \302\261 2%        -18.8%      6728                   will-it-scale.per_thread_ops\n",
-  "brk1/eanble THP                 998475                   -2.2%    976893                   will-it-scale.per_process_ops\n",
-  "context_switch1/enable THP      223910                   -1.3%    220930                   will-it-scale.per_process_ops\n",
-  "context_switch1/enable THP      233722                   -1.0%    231288                   will-it-scale.per_thread_ops\n",
-  "\n",
-  "b). Disable THP\n",
-  "page_fault3/disable THP          10856                  -23.1%      8344                   will-it-scale.per_thread_ops\n",
-  "page_fault2/disable THP           8147                  -18.8%      6613                   will-it-scale.per_thread_ops\n",
-  "brk1/disable THP                   957                    -7.9%      881                   will-it-scale.per_thread_ops\n",
-  "context_switch1/disable THP     237006                    -2.2%    231907                  will-it-scale.per_thread_ops\n",
-  "brk1/disable THP                997317                    -2.0%    977778                  will-it-scale.per_process_ops\n",
-  "page_fault3/disable THP         467454                    -1.8%    459251                  will-it-scale.per_process_ops\n",
-  "context_switch1/disable THP     224431                    -1.3%    221567                  will-it-scale.per_process_ops\n",
-  "\n",
-  "\n",
-  "Best regards,\n",
-  "Haiyan Song\n",
-  "________________________________________\n",
-  "From: Laurent Dufour [ldufour\@linux.vnet.ibm.com]\n",
-  "Sent: Monday, July 02, 2018 4:59 PM\n",
-  "To: Song, HaiyanX\n",
-  "Cc: akpm\@linux-foundation.org; mhocko\@kernel.org; peterz\@infradead.org; kirill\@shutemov.name; ak\@linux.intel.com; dave\@stgolabs.net; jack\@suse.cz; Matthew Wilcox; khandual\@linux.vnet.ibm.com; aneesh.kumar\@linux.vnet.ibm.com; benh\@kernel.crashing.org; mpe\@ellerman.id.au; paulus\@samba.org; Thomas Gleixner; Ingo Molnar; hpa\@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work\@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel\@vger.kernel.org; linux-mm\@kvack.org; haren\@linux.vnet.ibm.com; npiggin\@gmail.com; bsingharora\@gmail.com; paulmck\@linux.vnet.ibm.com; Tim Chen; linuxppc-dev\@lists.ozlabs.org; x86\@kernel.org\n",
-  "Subject: Re: [PATCH v11 00/26] Speculative page faults\n",
-  "\n",
-  "On 11/06/2018 09:49, Song, HaiyanX wrote:\n",
-  "> Hi Laurent,\n",
-  ">\n",
-  "> Regression test for v11 patch serials have been run, some regression is found by LKP-tools (linux kernel performance)\n",
-  "> tested on Intel 4s skylake platform. This time only test the cases which have been run and found regressions on\n",
-  "> V9 patch serials.\n",
-  ">\n",
-  "> The regression result is sorted by the metric will-it-scale.per_thread_ops.\n",
-  "> branch: Laurent-Dufour/Speculative-page-faults/20180520-045126\n",
-  "> commit id:\n",
-  ">   head commit : a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12\n",
-  ">   base commit : ba98a1cdad71d259a194461b3a61471b49b14df1\n",
-  "> Benchmark: will-it-scale\n",
-  "> Download link: https://github.com/antonblanchard/will-it-scale/tree/master\n",
-  ">\n",
-  "> Metrics:\n",
-  ">   will-it-scale.per_process_ops=processes/nr_cpu\n",
-  ">   will-it-scale.per_thread_ops=threads/nr_cpu\n",
-  ">   test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)\n",
-  "> THP: enable / disable\n",
-  "> nr_task:100%\n",
-  ">\n",
-  "> 1. Regressions:\n",
-  ">\n",
-  "> a). Enable THP\n",
-  "> testcase                          base           change      head           metric\n",
-  "> page_fault3/enable THP           10519          -20.5%        836      will-it-scale.per_thread_ops\n",
-  "> page_fault2/enalbe THP            8281          -18.8%       6728      will-it-scale.per_thread_ops\n",
-  "> brk1/eanble THP                 998475           -2.2%     976893      will-it-scale.per_process_ops\n",
-  "> context_switch1/enable THP      223910           -1.3%     220930      will-it-scale.per_process_ops\n",
-  "> context_switch1/enable THP      233722           -1.0%     231288      will-it-scale.per_thread_ops\n",
-  ">\n",
-  "> b). Disable THP\n",
-  "> page_fault3/disable THP          10856          -23.1%       8344      will-it-scale.per_thread_ops\n",
-  "> page_fault2/disable THP           8147          -18.8%       6613      will-it-scale.per_thread_ops\n",
-  "> brk1/disable THP                   957           -7.9%        881      will-it-scale.per_thread_ops\n",
-  "> context_switch1/disable THP     237006           -2.2%     231907      will-it-scale.per_thread_ops\n",
-  "> brk1/disable THP                997317           -2.0%     977778      will-it-scale.per_process_ops\n",
-  "> page_fault3/disable THP         467454           -1.8%     459251      will-it-scale.per_process_ops\n",
-  "> context_switch1/disable THP     224431           -1.3%     221567      will-it-scale.per_process_ops\n",
-  ">\n",
-  "> Notes: for the above  values of test result, the higher is better.\n",
-  "\n",
-  "I tried the same tests on my PowerPC victim VM (1024 CPUs, 11TB) and I can't\n",
-  "get reproducible results. The results have huge variation, even on the vanilla\n",
-  "kernel, and I can't state on any changes due to that.\n",
-  "\n",
-  "I tried on smaller node (80 CPUs, 32G), and the tests ran better, but I didn't\n",
-  "measure any changes between the vanilla and the SPF patched ones:\n",
-  "\n",
-  "test THP enabled                4.17.0-rc4-mm1  spf             delta\n",
-  "page_fault3_threads             2697.7          2683.5          -0.53%\n",
-  "page_fault2_threads             170660.6        169574.1        -0.64%\n",
-  "context_switch1_threads         6915269.2       6877507.3       -0.55%\n",
-  "context_switch1_processes       6478076.2       6529493.5       0.79%\n",
-  "brk1                            243391.2        238527.5        -2.00%\n",
-  "\n",
-  "Tests were run 10 times, no high variation detected.\n",
-  "\n",
-  "Did you see high variation on your side ? How many times the test were run to\n",
-  "compute the average values ?\n",
-  "\n",
-  "Thanks,\n",
-  "Laurent.\n",
-  "\n",
-  "\n",
-  ">\n",
-  "> 2. Improvement: not found improvement based on the selected test cases.\n",
-  ">\n",
-  ">\n",
-  "> Best regards\n",
-  "> Haiyan Song\n",
-  "> ________________________________________\n",
-  "> From: owner-linux-mm\@kvack.org [owner-linux-mm\@kvack.org] on behalf of Laurent Dufour [ldufour\@linux.vnet.ibm.com]\n",
-  "> Sent: Monday, May 28, 2018 4:54 PM\n",
-  "> To: Song, HaiyanX\n",
-  "> Cc: akpm\@linux-foundation.org; mhocko\@kernel.org; peterz\@infradead.org; kirill\@shutemov.name; ak\@linux.intel.com; dave\@stgolabs.net; jack\@suse.cz; Matthew Wilcox; khandual\@linux.vnet.ibm.com; aneesh.kumar\@linux.vnet.ibm.com; benh\@kernel.crashing.org; mpe\@ellerman.id.au; paulus\@samba.org; Thomas Gleixner; Ingo Molnar; hpa\@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work\@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel\@vger.kernel.org; linux-mm\@kvack.org; haren\@linux.vnet.ibm.com; npiggin\@gmail.com; bsingharora\@gmail.com; paulmck\@linux.vnet.ibm.com; Tim Chen; linuxppc-dev\@lists.ozlabs.org; x86\@kernel.org\n",
-  "> Subject: Re: [PATCH v11 00/26] Speculative page faults\n",
-  ">\n",
-  "> On 28/05/2018 10:22, Haiyan Song wrote:\n",
-  ">> Hi Laurent,\n",
-  ">>\n",
-  ">> Yes, these tests are done on V9 patch.\n",
-  ">\n",
-  "> Do you plan to give this V11 a run ?\n",
-  ">\n",
-  ">>\n",
-  ">>\n",
-  ">> Best regards,\n",
-  ">> Haiyan Song\n",
-  ">>\n",
-  ">> On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote:\n",
-  ">>> On 28/05/2018 07:23, Song, HaiyanX wrote:\n",
-  ">>>>\n",
-  ">>>> Some regression and improvements is found by LKP-tools(linux kernel performance) on V9 patch series\n",
-  ">>>> tested on Intel 4s Skylake platform.\n",
-  ">>>\n",
-  ">>> Hi,\n",
-  ">>>\n",
-  ">>> Thanks for reporting this benchmark results, but you mentioned the \"V9 patch\n",
-  ">>> series\" while responding to the v11 header series...\n",
-  ">>> Were these tests done on v9 or v11 ?\n",
-  ">>>\n",
-  ">>> Cheers,\n",
-  ">>> Laurent.\n",
-  ">>>\n",
-  ">>>>\n",
-  ">>>> The regression result is sorted by the metric will-it-scale.per_thread_ops.\n",
-  ">>>> Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 patch series)\n",
-  ">>>> Commit id:\n",
-  ">>>>     base commit: d55f34411b1b126429a823d06c3124c16283231f\n",
-  ">>>>     head commit: 0355322b3577eeab7669066df42c550a56801110\n",
-  ">>>> Benchmark suite: will-it-scale\n",
-  ">>>> Download link:\n",
-  ">>>> https://github.com/antonblanchard/will-it-scale/tree/master/tests\n",
-  ">>>> Metrics:\n",
-  ">>>>     will-it-scale.per_process_ops=processes/nr_cpu\n",
-  ">>>>     will-it-scale.per_thread_ops=threads/nr_cpu\n",
-  ">>>> test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)\n",
-  ">>>> THP: enable / disable\n",
-  ">>>> nr_task: 100%\n",
-  ">>>>\n",
-  ">>>> 1. Regressions:\n",
-  ">>>> a) THP enabled:\n",
-  ">>>> testcase                        base            change          head       metric\n",
-  ">>>> page_fault3/ enable THP         10092           -17.5%          8323       will-it-scale.per_thread_ops\n",
-  ">>>> page_fault2/ enable THP          8300           -17.2%          6869       will-it-scale.per_thread_ops\n",
-  ">>>> brk1/ enable THP                  957.67         -7.6%           885       will-it-scale.per_thread_ops\n",
-  ">>>> page_fault3/ enable THP        172821            -5.3%        163692       will-it-scale.per_process_ops\n",
-  ">>>> signal1/ enable THP              9125            -3.2%          8834       will-it-scale.per_process_ops\n",
-  ">>>>\n",
-  ">>>> b) THP disabled:\n",
-  ">>>> testcase                        base            change          head       metric\n",
-  ">>>> page_fault3/ disable THP        10107           -19.1%          8180       will-it-scale.per_thread_ops\n",
-  ">>>> page_fault2/ disable THP         8432           -17.8%          6931       will-it-scale.per_thread_ops\n",
-  ">>>> context_switch1/ disable THP   215389            -6.8%        200776       will-it-scale.per_thread_ops\n",
-  ">>>> brk1/ disable THP                 939.67         -6.6%           877.33    will-it-scale.per_thread_ops\n",
-  ">>>> page_fault3/ disable THP       173145            -4.7%        165064       will-it-scale.per_process_ops\n",
-  ">>>> signal1/ disable THP             9162            -3.9%          8802       will-it-scale.per_process_ops\n",
-  ">>>>\n",
-  ">>>> 2. Improvements:\n",
-  ">>>> a) THP enabled:\n",
-  ">>>> testcase                        base            change          head       metric\n",
-  ">>>> malloc1/ enable THP               66.33        +469.8%           383.67    will-it-scale.per_thread_ops\n",
-  ">>>> writeseek3/ enable THP          2531             +4.5%          2646       will-it-scale.per_thread_ops\n",
-  ">>>> signal1/ enable THP              989.33          +2.8%          1016       will-it-scale.per_thread_ops\n",
-  ">>>>\n",
-  ">>>> b) THP disabled:\n",
-  ">>>> testcase                        base            change          head       metric\n",
-  ">>>> malloc1/ disable THP              90.33        +417.3%           467.33    will-it-scale.per_thread_ops\n",
-  ">>>> read2/ disable THP             58934            +39.2%         82060       will-it-scale.per_thread_ops\n",
-  ">>>> page_fault1/ disable THP        8607            +36.4%         11736       will-it-scale.per_thread_ops\n",
-  ">>>> read1/ disable THP            314063            +12.7%        353934       will-it-scale.per_thread_ops\n",
-  ">>>> writeseek3/ disable THP         2452            +12.5%          2759       will-it-scale.per_thread_ops\n",
-  ">>>> signal1/ disable THP             971.33          +5.5%          1024       will-it-scale.per_thread_ops\n",
-  ">>>>\n",
-  ">>>> Notes: for above values in column \"change\", the higher value means that the related testcase result\n",
-  ">>>> on head commit is better than that on base commit for this benchmark.\n",
-  ">>>>\n",
-  ">>>>\n",
-  ">>>> Best regards\n",
-  ">>>> Haiyan Song\n",
-  ">>>>\n",
-  ">>>> ________________________________________\n",
-  ">>>> From: owner-linux-mm\@kvack.org [owner-linux-mm\@kvack.org] on behalf of Laurent Dufour [ldufour\@linux.vnet.ibm.com]\n",
-  ">>>> Sent: Thursday, May 17, 2018 7:06 PM\n",
-  ">>>> To: akpm\@linux-foundation.org; mhocko\@kernel.org; peterz\@infradead.org; kirill\@shutemov.name; ak\@linux.intel.com; dave\@stgolabs.net; jack\@suse.cz; Matthew Wilcox; khandual\@linux.vnet.ibm.com; aneesh.kumar\@linux.vnet.ibm.com; benh\@kernel.crashing.org; mpe\@ellerman.id.au; paulus\@samba.org; Thomas Gleixner; Ingo Molnar; hpa\@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work\@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi\n",
-  ">>>> Cc: linux-kernel\@vger.kernel.org; linux-mm\@kvack.org; haren\@linux.vnet.ibm.com; npiggin\@gmail.com; bsingharora\@gmail.com; paulmck\@linux.vnet.ibm.com; Tim Chen; linuxppc-dev\@lists.ozlabs.org; x86\@kernel.org\n",
-  ">>>> Subject: [PATCH v11 00/26] Speculative page faults\n",
-  ">>>>\n",
-  ">>>> This is a port on kernel 4.17 of the work done by Peter Zijlstra to handle\n",
-  ">>>> page fault without holding the mm semaphore [1].\n",
-  ">>>>\n",
-  ">>>> The idea is to try to handle user space page faults without holding the\n",
-  ">>>> mmap_sem. This should allow better concurrency for massively threaded\n",
-  ">>>> process since the page fault handler will not wait for other threads memory\n",
-  ">>>> layout change to be done, assuming that this change is done in another part\n",
-  ">>>> of the process's memory space. This type page fault is named speculative\n",
-  ">>>> page fault. If the speculative page fault fails because of a concurrency is\n",
-  ">>>> detected or because underlying PMD or PTE tables are not yet allocating, it\n",
-  ">>>> is failing its processing and a classic page fault is then tried.\n",
-  ">>>>\n",
-  ">>>> The speculative page fault (SPF) has to look for the VMA matching the fault\n",
-  ">>>> address without holding the mmap_sem, this is done by introducing a rwlock\n",
-  ">>>> which protects the access to the mm_rb tree. Previously this was done using\n",
-  ">>>> SRCU but it was introducing a lot of scheduling to process the VMA's\n",
-  ">>>> freeing operation which was hitting the performance by 20% as reported by\n",
-  ">>>> Kemi Wang [2]. Using a rwlock to protect access to the mm_rb tree is\n",
-  ">>>> limiting the locking contention to these operations which are expected to\n",
-  ">>>> be in a O(log n) order. In addition to ensure that the VMA is not freed in\n",
-  ">>>> our back a reference count is added and 2 services (get_vma() and\n",
-  ">>>> put_vma()) are introduced to handle the reference count. Once a VMA is\n",
-  ">>>> fetched from the RB tree using get_vma(), it must be later freed using\n",
-  ">>>> put_vma(). I can't see anymore the overhead I got while will-it-scale\n",
-  ">>>> benchmark anymore.\n",
-  ">>>>\n",
-  ">>>> The VMA's attributes checked during the speculative page fault processing\n",
-  ">>>> have to be protected against parallel changes. This is done by using a per\n",
-  ">>>> VMA sequence lock. This sequence lock allows the speculative page fault\n",
-  ">>>> handler to fast check for parallel changes in progress and to abort the\n",
-  ">>>> speculative page fault in that case.\n",
-  ">>>>\n",
-  ">>>> Once the VMA has been found, the speculative page fault handler would check\n",
-  ">>>> for the VMA's attributes to verify that the page fault has to be handled\n",
-  ">>>> correctly or not. Thus, the VMA is protected through a sequence lock which\n",
-  ">>>> allows fast detection of concurrent VMA changes. If such a change is\n",
-  ">>>> detected, the speculative page fault is aborted and a *classic* page fault\n",
-  ">>>> is tried.  VMA sequence lockings are added when VMA attributes which are\n",
-  ">>>> checked during the page fault are modified.\n",
-  ">>>>\n",
-  ">>>> When the PTE is fetched, the VMA is checked to see if it has been changed,\n",
-  ">>>> so once the page table is locked, the VMA is valid, so any other changes\n",
-  ">>>> leading to touching this PTE will need to lock the page table, so no\n",
-  ">>>> parallel change is possible at this time.\n",
-  ">>>>\n",
-  ">>>> The locking of the PTE is done with interrupts disabled, this allows\n",
-  ">>>> checking for the PMD to ensure that there is not an ongoing collapsing\n",
-  ">>>> operation. Since khugepaged is firstly set the PMD to pmd_none and then is\n",
-  ">>>> waiting for the other CPU to have caught the IPI interrupt, if the pmd is\n",
-  ">>>> valid at the time the PTE is locked, we have the guarantee that the\n",
-  ">>>> collapsing operation will have to wait on the PTE lock to move forward.\n",
-  ">>>> This allows the SPF handler to map the PTE safely. If the PMD value is\n",
-  ">>>> different from the one recorded at the beginning of the SPF operation, the\n",
-  ">>>> classic page fault handler will be called to handle the operation while\n",
-  ">>>> holding the mmap_sem. As the PTE lock is done with the interrupts disabled,\n",
-  ">>>> the lock is done using spin_trylock() to avoid dead lock when handling a\n",
-  ">>>> page fault while a TLB invalidate is requested by another CPU holding the\n",
-  ">>>> PTE.\n",
-  ">>>>\n",
-  ">>>> In pseudo code, this could be seen as:\n",
-  ">>>>     speculative_page_fault()\n",
-  ">>>>     {\n",
-  ">>>>             vma = get_vma()\n",
-  ">>>>             check vma sequence count\n",
-  ">>>>             check vma's support\n",
-  ">>>>             disable interrupt\n",
-  ">>>>                   check pgd,p4d,...,pte\n",
-  ">>>>                   save pmd and pte in vmf\n",
-  ">>>>                   save vma sequence counter in vmf\n",
-  ">>>>             enable interrupt\n",
-  ">>>>             check vma sequence count\n",
-  ">>>>             handle_pte_fault(vma)\n",
-  ">>>>                     ..\n",
-  ">>>>                     page = alloc_page()\n",
-  ">>>>                     pte_map_lock()\n",
-  ">>>>                             disable interrupt\n",
-  ">>>>                                     abort if sequence counter has changed\n",
-  ">>>>                                     abort if pmd or pte has changed\n",
-  ">>>>                                     pte map and lock\n",
-  ">>>>                             enable interrupt\n",
-  ">>>>                     if abort\n",
-  ">>>>                        free page\n",
-  ">>>>                        abort\n",
-  ">>>>                     ...\n",
-  ">>>>     }\n",
-  ">>>>\n",
-  ">>>>     arch_fault_handler()\n",
-  ">>>>     {\n",
-  ">>>>             if (speculative_page_fault(&vma))\n",
-  ">>>>                goto done\n",
-  ">>>>     again:\n",
-  ">>>>             lock(mmap_sem)\n",
-  ">>>>             vma = find_vma();\n",
-  ">>>>             handle_pte_fault(vma);\n",
-  ">>>>             if retry\n",
-  ">>>>                unlock(mmap_sem)\n",
-  ">>>>                goto again;\n",
-  ">>>>     done:\n",
-  ">>>>             handle fault error\n",
-  ">>>>     }\n",
-  ">>>>\n",
-  ">>>> Support for THP is not done because when checking for the PMD, we can be\n",
-  ">>>> confused by an in progress collapsing operation done by khugepaged. The\n",
-  ">>>> issue is that pmd_none() could be true either if the PMD is not already\n",
-  ">>>> populated or if the underlying PTE are in the way to be collapsed. So we\n",
-  ">>>> cannot safely allocate a PMD if pmd_none() is true.\n",
-  ">>>>\n",
-  ">>>> This series add a new software performance event named 'speculative-faults'\n",
-  ">>>> or 'spf'. It counts the number of successful page fault event handled\n",
-  ">>>> speculatively. When recording 'faults,spf' events, the faults one is\n",
-  ">>>> counting the total number of page fault events while 'spf' is only counting\n",
-  ">>>> the part of the faults processed speculatively.\n",
-  ">>>>\n",
-  ">>>> There are some trace events introduced by this series. They allow\n",
-  ">>>> identifying why the page faults were not processed speculatively. This\n",
-  ">>>> doesn't take in account the faults generated by a monothreaded process\n",
-  ">>>> which directly processed while holding the mmap_sem. This trace events are\n",
-  ">>>> grouped in a system named 'pagefault', they are:\n",
-  ">>>>  - pagefault:spf_vma_changed : if the VMA has been changed in our back\n",
-  ">>>>  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set.\n",
-  ">>>>  - pagefault:spf_vma_notsup : the VMA's type is not supported\n",
-  ">>>>  - pagefault:spf_vma_access : the VMA's access right are not respected\n",
-  ">>>>  - pagefault:spf_pmd_changed : the upper PMD pointer has changed in our\n",
-  ">>>>    back.\n",
-  ">>>>\n",
-  ">>>> To record all the related events, the easier is to run perf with the\n",
-  ">>>> following arguments :\n",
-  ">>>> \$ perf stat -e 'faults,spf,pagefault:*' <command>\n",
-  ">>>>\n",
-  ">>>> There is also a dedicated vmstat counter showing the number of successful\n",
-  ">>>> page fault handled speculatively. I can be seen this way:\n",
-  ">>>> \$ grep speculative_pgfault /proc/vmstat\n",
-  ">>>>\n",
-  ">>>> This series builds on top of v4.16-mmotm-2018-04-13-17-28 and is functional\n",
-  ">>>> on x86, PowerPC and arm64.\n",
-  ">>>>\n",
-  ">>>> ---------------------\n",
-  ">>>> Real Workload results\n",
-  ">>>>\n",
-  ">>>> As mentioned in previous email, we did non official runs using a \"popular\n",
-  ">>>> in memory multithreaded database product\" on 176 cores SMT8 Power system\n",
-  ">>>> which showed a 30% improvements in the number of transaction processed per\n",
-  ">>>> second. This run has been done on the v6 series, but changes introduced in\n",
-  ">>>> this new version should not impact the performance boost seen.\n",
-  ">>>>\n",
-  ">>>> Here are the perf data captured during 2 of these runs on top of the v8\n",
-  ">>>> series:\n",
-  ">>>>                 vanilla         spf\n",
-  ">>>> faults          89.418          101.364         +13%\n",
-  ">>>> spf                n/a           97.989\n",
-  ">>>>\n",
-  ">>>> With the SPF kernel, most of the page fault were processed in a speculative\n",
-  ">>>> way.\n",
-  ">>>>\n",
-  ">>>> Ganesh Mahendran had backported the series on top of a 4.9 kernel and gave\n",
-  ">>>> it a try on an android device. He reported that the application launch time\n",
-  ">>>> was improved in average by 6%, and for large applications (~100 threads) by\n",
-  ">>>> 20%.\n",
-  ">>>>\n",
-  ">>>> Here are the launch time Ganesh mesured on Android 8.0 on top of a Qcom\n",
-  ">>>> MSM845 (8 cores) with 6GB (the less is better):\n",
-  ">>>>\n",
-  ">>>> Application                             4.9     4.9+spf delta\n",
-  ">>>> com.tencent.mm                          416     389     -7%\n",
-  ">>>> com.eg.android.AlipayGphone             1135    986     -13%\n",
-  ">>>> com.tencent.mtt                         455     454     0%\n",
-  ">>>> com.qqgame.hlddz                        1497    1409    -6%\n",
-  ">>>> com.autonavi.minimap                    711     701     -1%\n",
-  ">>>> com.tencent.tmgp.sgame                  788     748     -5%\n",
-  ">>>> com.immomo.momo                         501     487     -3%\n",
-  ">>>> com.tencent.peng                        2145    2112    -2%\n",
-  ">>>> com.smile.gifmaker                      491     461     -6%\n",
-  ">>>> com.baidu.BaiduMap                      479     366     -23%\n",
-  ">>>> com.taobao.taobao                       1341    1198    -11%\n",
-  ">>>> com.baidu.searchbox                     333     314     -6%\n",
-  ">>>> com.tencent.mobileqq                    394     384     -3%\n",
-  ">>>> com.sina.weibo                          907     906     0%\n",
-  ">>>> com.youku.phone                         816     731     -11%\n",
-  ">>>> com.happyelements.AndroidAnimal.qq      763     717     -6%\n",
-  ">>>> com.UCMobile                            415     411     -1%\n",
-  ">>>> com.tencent.tmgp.ak                     1464    1431    -2%\n",
-  ">>>> com.tencent.qqmusic                     336     329     -2%\n",
-  ">>>> com.sankuai.meituan                     1661    1302    -22%\n",
-  ">>>> com.netease.cloudmusic                  1193    1200    1%\n",
-  ">>>> air.tv.douyu.android                    4257    4152    -2%\n",
-  ">>>>\n",
-  ">>>> ------------------\n",
-  ">>>> Benchmarks results\n",
-  ">>>>\n",
-  ">>>> Base kernel is v4.17.0-rc4-mm1\n",
-  ">>>> SPF is BASE + this series\n",
-  ">>>>\n",
-  ">>>> Kernbench:\n",
-  ">>>> ----------\n",
-  ">>>> Here are the results on a 16 CPUs X86 guest using kernbench on a 4.15\n",
-  ">>>> kernel (kernel is build 5 times):\n",
-  ">>>>\n",
-  ">>>> Average Half load -j 8\n",
-  ">>>>                  Run    (std deviation)\n",
-  ">>>>                  BASE                   SPF\n",
-  ">>>> Elapsed Time     1448.65 (5.72312)      1455.84 (4.84951)       0.50%\n",
-  ">>>> User    Time     10135.4 (30.3699)      10148.8 (31.1252)       0.13%\n",
-  ">>>> System  Time     900.47  (2.81131)      923.28  (7.52779)       2.53%\n",
-  ">>>> Percent CPU      761.4   (1.14018)      760.2   (0.447214)      -0.16%\n",
-  ">>>> Context Switches 85380   (3419.52)      84748   (1904.44)       -0.74%\n",
-  ">>>> Sleeps           105064  (1240.96)      105074  (337.612)       0.01%\n",
-  ">>>>\n",
-  ">>>> Average Optimal load -j 16\n",
-  ">>>>                  Run    (std deviation)\n",
-  ">>>>                  BASE                   SPF\n",
-  ">>>> Elapsed Time     920.528 (10.1212)      927.404 (8.91789)       0.75%\n",
-  ">>>> User    Time     11064.8 (981.142)      11085   (990.897)       0.18%\n",
-  ">>>> System  Time     979.904 (84.0615)      1001.14 (82.5523)       2.17%\n",
-  ">>>> Percent CPU      1089.5  (345.894)      1086.1  (343.545)       -0.31%\n",
-  ">>>> Context Switches 159488  (78156.4)      158223  (77472.1)       -0.79%\n",
-  ">>>> Sleeps           110566  (5877.49)      110388  (5617.75)       -0.16%\n",
-  ">>>>\n",
-  ">>>>\n",
-  ">>>> During a run on the SPF, perf events were captured:\n",
-  ">>>>  Performance counter stats for '../kernbench -M':\n",
-  ">>>>          526743764      faults\n",
-  ">>>>                210      spf\n",
-  ">>>>                  3      pagefault:spf_vma_changed\n",
-  ">>>>                  0      pagefault:spf_vma_noanon\n",
-  ">>>>               2278      pagefault:spf_vma_notsup\n",
-  ">>>>                  0      pagefault:spf_vma_access\n",
-  ">>>>                  0      pagefault:spf_pmd_changed\n",
-  ">>>>\n",
-  ">>>> Very few speculative page faults were recorded as most of the processes\n",
-  ">>>> involved are monothreaded (sounds that on this architecture some threads\n",
-  ">>>> were created during the kernel build processing).\n",
-  ">>>>\n",
-  ">>>> Here are the kerbench results on a 80 CPUs Power8 system:\n",
-  ">>>>\n",
-  ">>>> Average Half load -j 40\n",
-  ">>>>                  Run    (std deviation)\n",
-  ">>>>                  BASE                   SPF\n",
-  ">>>> Elapsed Time     117.152 (0.774642)     117.166 (0.476057)      0.01%\n",
-  ">>>> User    Time     4478.52 (24.7688)      4479.76 (9.08555)       0.03%\n",
-  ">>>> System  Time     131.104 (0.720056)     134.04  (0.708414)      2.24%\n",
-  ">>>> Percent CPU      3934    (19.7104)      3937.2  (19.0184)       0.08%\n",
-  ">>>> Context Switches 92125.4 (576.787)      92581.6 (198.622)       0.50%\n",
-  ">>>> Sleeps           317923  (652.499)      318469  (1255.59)       0.17%\n",
-  ">>>>\n",
-  ">>>> Average Optimal load -j 80\n",
-  ">>>>                  Run    (std deviation)\n",
-  ">>>>                  BASE                   SPF\n",
-  ">>>> Elapsed Time     107.73  (0.632416)     107.31  (0.584936)      -0.39%\n",
-  ">>>> User    Time     5869.86 (1466.72)      5871.71 (1467.27)       0.03%\n",
-  ">>>> System  Time     153.728 (23.8573)      157.153 (24.3704)       2.23%\n",
-  ">>>> Percent CPU      5418.6  (1565.17)      5436.7  (1580.91)       0.33%\n",
-  ">>>> Context Switches 223861  (138865)       225032  (139632)        0.52%\n",
-  ">>>> Sleeps           330529  (13495.1)      332001  (14746.2)       0.45%\n",
-  ">>>>\n",
-  ">>>> During a run on the SPF, perf events were captured:\n",
-  ">>>>  Performance counter stats for '../kernbench -M':\n",
-  ">>>>          116730856      faults\n",
-  ">>>>                  0      spf\n",
-  ">>>>                  3      pagefault:spf_vma_changed\n",
-  ">>>>                  0      pagefault:spf_vma_noanon\n",
-  ">>>>                476      pagefault:spf_vma_notsup\n",
-  ">>>>                  0      pagefault:spf_vma_access\n",
-  ">>>>                  0      pagefault:spf_pmd_changed\n",
-  ">>>>\n",
-  ">>>> Most of the processes involved are monothreaded so SPF is not activated but\n",
-  ">>>> there is no impact on the performance.\n",
-  ">>>>\n",
-  ">>>> Ebizzy:\n",
-  ">>>> -------\n",
-  ">>>> The test is counting the number of records per second it can manage, the\n",
-  ">>>> higher is the best. I run it like this 'ebizzy -mTt <nrcpus>'. To get\n",
-  ">>>> consistent result I repeated the test 100 times and measure the average\n",
-  ">>>> result. The number is the record processes per second, the higher is the\n",
-  ">>>> best.\n",
-  ">>>>\n",
-  ">>>>                 BASE            SPF             delta\n",
-  ">>>> 16 CPUs x86 VM  742.57          1490.24         100.69%\n",
-  ">>>> 80 CPUs P8 node 13105.4         24174.23        84.46%\n",
-  ">>>>\n",
-  ">>>> Here are the performance counter read during a run on a 16 CPUs x86 VM:\n",
-  ">>>>  Performance counter stats for './ebizzy -mTt 16':\n",
-  ">>>>            1706379      faults\n",
-  ">>>>            1674599      spf\n",
-  ">>>>              30588      pagefault:spf_vma_changed\n",
-  ">>>>                  0      pagefault:spf_vma_noanon\n",
-  ">>>>                363      pagefault:spf_vma_notsup\n",
-  ">>>>                  0      pagefault:spf_vma_access\n",
-  ">>>>                  0      pagefault:spf_pmd_changed\n",
-  ">>>>\n",
-  ">>>> And the ones captured during a run on a 80 CPUs Power node:\n",
-  ">>>>  Performance counter stats for './ebizzy -mTt 80':\n",
-  ">>>>            1874773      faults\n",
-  ">>>>            1461153      spf\n",
-  ">>>>             413293      pagefault:spf_vma_changed\n",
-  ">>>>                  0      pagefault:spf_vma_noanon\n",
-  ">>>>                200      pagefault:spf_vma_notsup\n",
-  ">>>>                  0      pagefault:spf_vma_access\n",
-  ">>>>                  0      pagefault:spf_pmd_changed\n",
-  ">>>>\n",
-  ">>>> In ebizzy's case most of the page fault were handled in a speculative way,\n",
-  ">>>> leading the ebizzy performance boost.\n",
-  ">>>>\n",
-  ">>>> ------------------\n",
-  ">>>> Changes since v10 (https://lkml.org/lkml/2018/4/17/572):\n",
-  ">>>>  - Accounted for all review feedbacks from Punit Agrawal, Ganesh Mahendran\n",
-  ">>>>    and Minchan Kim, hopefully.\n",
-  ">>>>  - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in\n",
-  ">>>>    __do_page_fault().\n",
-  ">>>>  - Loop in pte_spinlock() and pte_map_lock() when pte try lock fails\n",
-  ">>>>    instead\n",
-  ">>>>    of aborting the speculative page fault handling. Dropping the now\n",
-  ">>>> useless\n",
-  ">>>>    trace event pagefault:spf_pte_lock.\n",
-  ">>>>  - No more try to reuse the fetched VMA during the speculative page fault\n",
-  ">>>>    handling when retrying is needed. This adds a lot of complexity and\n",
-  ">>>>    additional tests done didn't show a significant performance improvement.\n",
-  ">>>>  - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error.\n",
-  ">>>>\n",
-  ">>>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none\n",
-  ">>>> [2] https://patchwork.kernel.org/patch/9999687/\n",
-  ">>>>\n",
-  ">>>>\n",
-  ">>>> Laurent Dufour (20):\n",
-  ">>>>   mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT\n",
-  ">>>>   x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT\n",
-  ">>>>   powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT\n",
-  ">>>>   mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE\n",
-  ">>>>   mm: make pte_unmap_same compatible with SPF\n",
-  ">>>>   mm: introduce INIT_VMA()\n",
-  ">>>>   mm: protect VMA modifications using VMA sequence count\n",
-  ">>>>   mm: protect mremap() against SPF hanlder\n",
-  ">>>>   mm: protect SPF handler against anon_vma changes\n",
-  ">>>>   mm: cache some VMA fields in the vm_fault structure\n",
-  ">>>>   mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()\n",
-  ">>>>   mm: introduce __lru_cache_add_active_or_unevictable\n",
-  ">>>>   mm: introduce __vm_normal_page()\n",
-  ">>>>   mm: introduce __page_add_new_anon_rmap()\n",
-  ">>>>   mm: protect mm_rb tree with a rwlock\n",
-  ">>>>   mm: adding speculative page fault failure trace events\n",
-  ">>>>   perf: add a speculative page fault sw event\n",
-  ">>>>   perf tools: add support for the SPF perf event\n",
-  ">>>>   mm: add speculative page fault vmstats\n",
-  ">>>>   powerpc/mm: add speculative page fault\n",
-  ">>>>\n",
-  ">>>> Mahendran Ganesh (2):\n",
-  ">>>>   arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT\n",
-  ">>>>   arm64/mm: add speculative page fault\n",
-  ">>>>\n",
-  ">>>> Peter Zijlstra (4):\n",
-  ">>>>   mm: prepare for FAULT_FLAG_SPECULATIVE\n",
-  ">>>>   mm: VMA sequence count\n",
-  ">>>>   mm: provide speculative fault infrastructure\n",
-  ">>>>   x86/mm: add speculative pagefault handling\n",
-  ">>>>\n",
-  ">>>>  arch/arm64/Kconfig                    |   1 +\n",
-  ">>>>  arch/arm64/mm/fault.c                 |  12 +\n",
-  ">>>>  arch/powerpc/Kconfig                  |   1 +\n",
-  ">>>>  arch/powerpc/mm/fault.c               |  16 +\n",
-  ">>>>  arch/x86/Kconfig                      |   1 +\n",
-  ">>>>  arch/x86/mm/fault.c                   |  27 +-\n",
-  ">>>>  fs/exec.c                             |   2 +-\n",
-  ">>>>  fs/proc/task_mmu.c                    |   5 +-\n",
-  ">>>>  fs/userfaultfd.c                      |  17 +-\n",
-  ">>>>  include/linux/hugetlb_inline.h        |   2 +-\n",
-  ">>>>  include/linux/migrate.h               |   4 +-\n",
-  ">>>>  include/linux/mm.h                    | 136 +++++++-\n",
-  ">>>>  include/linux/mm_types.h              |   7 +\n",
-  ">>>>  include/linux/pagemap.h               |   4 +-\n",
-  ">>>>  include/linux/rmap.h                  |  12 +-\n",
-  ">>>>  include/linux/swap.h                  |  10 +-\n",
-  ">>>>  include/linux/vm_event_item.h         |   3 +\n",
-  ">>>>  include/trace/events/pagefault.h      |  80 +++++\n",
-  ">>>>  include/uapi/linux/perf_event.h       |   1 +\n",
-  ">>>>  kernel/fork.c                         |   5 +-\n",
-  ">>>>  mm/Kconfig                            |  22 ++\n",
-  ">>>>  mm/huge_memory.c                      |   6 +-\n",
-  ">>>>  mm/hugetlb.c                          |   2 +\n",
-  ">>>>  mm/init-mm.c                          |   3 +\n",
-  ">>>>  mm/internal.h                         |  20 ++\n",
-  ">>>>  mm/khugepaged.c                       |   5 +\n",
-  ">>>>  mm/madvise.c                          |   6 +-\n",
-  ">>>>  mm/memory.c                           | 612 +++++++++++++++++++++++++++++-----\n",
-  ">>>>  mm/mempolicy.c                        |  51 ++-\n",
-  ">>>>  mm/migrate.c                          |   6 +-\n",
-  ">>>>  mm/mlock.c                            |  13 +-\n",
-  ">>>>  mm/mmap.c                             | 229 ++++++++++---\n",
-  ">>>>  mm/mprotect.c                         |   4 +-\n",
-  ">>>>  mm/mremap.c                           |  13 +\n",
-  ">>>>  mm/nommu.c                            |   2 +-\n",
-  ">>>>  mm/rmap.c                             |   5 +-\n",
-  ">>>>  mm/swap.c                             |   6 +-\n",
-  ">>>>  mm/swap_state.c                       |   8 +-\n",
-  ">>>>  mm/vmstat.c                           |   5 +-\n",
-  ">>>>  tools/include/uapi/linux/perf_event.h |   1 +\n",
-  ">>>>  tools/perf/util/evsel.c               |   1 +\n",
-  ">>>>  tools/perf/util/parse-events.c        |   4 +\n",
-  ">>>>  tools/perf/util/parse-events.l        |   1 +\n",
-  ">>>>  tools/perf/util/python.c              |   1 +\n",
-  ">>>>  44 files changed, 1161 insertions(+), 211 deletions(-)\n",
-  ">>>>  create mode 100644 include/trace/events/pagefault.h\n",
-  ">>>>\n",
-  ">>>> --\n",
-  ">>>> 2.7.4\n",
-  ">>>>\n",
-  ">>>>\n",
-  ">>>\n",
-  ">>\n",
-  ">"
+  "Hi Laurent,=0A=\n",
+  "=0A=\n",
+  "=0A=\n",
+  "For the test result on Intel 4s skylake platform (192 CPUs, 768G Memory), t=\n",
+  "he below test cases all were run 3 times.=0A=\n",
+  "I check the test results, only page_fault3_thread/enable THP have 6% stddev=\n",
+  " for head commit, other tests have lower stddev.=0A=\n",
+  "=0A=\n",
+  "And I did not find other high variation on test case result.=0A=\n",
+  "=0A=\n",
+  "a). Enable THP=0A=\n",
+  "testcase                          base     stddev       change      head   =\n",
+  "  stddev         metric=0A=\n",
+  "page_fault3/enable THP           10519      =B1 3%        -20.5%      8368 =\n",
+  "     =B16%          will-it-scale.per_thread_ops=0A=\n",
+  "page_fault2/enalbe THP            8281      =B1 2%        -18.8%      6728 =\n",
+  "                  will-it-scale.per_thread_ops=0A=\n",
+  "brk1/eanble THP                 998475                   -2.2%    976893   =\n",
+  "                will-it-scale.per_process_ops=0A=\n",
+  "context_switch1/enable THP      223910                   -1.3%    220930   =\n",
+  "                will-it-scale.per_process_ops=0A=\n",
+  "context_switch1/enable THP      233722                   -1.0%    231288   =\n",
+  "                will-it-scale.per_thread_ops=0A=\n",
+  "=0A=\n",
+  "b). Disable THP=0A=\n",
+  "page_fault3/disable THP          10856                  -23.1%      8344   =\n",
+  "                will-it-scale.per_thread_ops=0A=\n",
+  "page_fault2/disable THP           8147                  -18.8%      6613   =\n",
+  "                will-it-scale.per_thread_ops=0A=\n",
+  "brk1/disable THP                   957                    -7.9%      881   =\n",
+  "                will-it-scale.per_thread_ops=0A=\n",
+  "context_switch1/disable THP     237006                    -2.2%    231907  =\n",
+  "                will-it-scale.per_thread_ops=0A=\n",
+  "brk1/disable THP                997317                    -2.0%    977778  =\n",
+  "                will-it-scale.per_process_ops=0A=\n",
+  "page_fault3/disable THP         467454                    -1.8%    459251  =\n",
+  "                will-it-scale.per_process_ops=0A=\n",
+  "context_switch1/disable THP     224431                    -1.3%    221567  =\n",
+  "                will-it-scale.per_process_ops=0A=\n",
+  "=0A=\n",
+  "=0A=\n",
+  "Best regards,=0A=\n",
+  "Haiyan Song=0A=\n",
+  "________________________________________=0A=\n",
+  "From: Laurent Dufour [ldufour\@linux.vnet.ibm.com]=0A=\n",
+  "Sent: Monday, July 02, 2018 4:59 PM=0A=\n",
+  "To: Song, HaiyanX=0A=\n",
+  "Cc: akpm\@linux-foundation.org; mhocko\@kernel.org; peterz\@infradead.org; kir=\n",
+  "ill\@shutemov.name; ak\@linux.intel.com; dave\@stgolabs.net; jack\@suse.cz; Mat=\n",
+  "thew Wilcox; khandual\@linux.vnet.ibm.com; aneesh.kumar\@linux.vnet.ibm.com; =\n",
+  "benh\@kernel.crashing.org; mpe\@ellerman.id.au; paulus\@samba.org; Thomas Glei=\n",
+  "xner; Ingo Molnar; hpa\@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.s=\n",
+  "enozhatsky.work\@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi=\n",
+  "; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan K=\n",
+  "im; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel\@vger.kernel.org; l=\n",
+  "inux-mm\@kvack.org; haren\@linux.vnet.ibm.com; npiggin\@gmail.com; bsingharora=\n",
+  "\@gmail.com; paulmck\@linux.vnet.ibm.com; Tim Chen; linuxppc-dev\@lists.ozlabs=\n",
+  ".org; x86\@kernel.org=0A=\n",
+  "Subject: Re: [PATCH v11 00/26] Speculative page faults=0A=\n",
+  "=0A=\n",
+  "On 11/06/2018 09:49, Song, HaiyanX wrote:=0A=\n",
+  "> Hi Laurent,=0A=\n",
+  ">=0A=\n",
+  "> Regression test for v11 patch serials have been run, some regression is f=\n",
+  "ound by LKP-tools (linux kernel performance)=0A=\n",
+  "> tested on Intel 4s skylake platform. This time only test the cases which =\n",
+  "have been run and found regressions on=0A=\n",
+  "> V9 patch serials.=0A=\n",
+  ">=0A=\n",
+  "> The regression result is sorted by the metric will-it-scale.per_thread_op=\n",
+  "s.=0A=\n",
+  "> branch: Laurent-Dufour/Speculative-page-faults/20180520-045126=0A=\n",
+  "> commit id:=0A=\n",
+  ">   head commit : a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12=0A=\n",
+  ">   base commit : ba98a1cdad71d259a194461b3a61471b49b14df1=0A=\n",
+  "> Benchmark: will-it-scale=0A=\n",
+  "> Download link: https://github.com/antonblanchard/will-it-scale/tree/maste=\n",
+  "r=0A=\n",
+  ">=0A=\n",
+  "> Metrics:=0A=\n",
+  ">   will-it-scale.per_process_ops=3Dprocesses/nr_cpu=0A=\n",
+  ">   will-it-scale.per_thread_ops=3Dthreads/nr_cpu=0A=\n",
+  ">   test box: lkp-skl-4sp1(nr_cpu=3D192,memory=3D768G)=0A=\n",
+  "> THP: enable / disable=0A=\n",
+  "> nr_task:100%=0A=\n",
+  ">=0A=\n",
+  "> 1. Regressions:=0A=\n",
+  ">=0A=\n",
+  "> a). Enable THP=0A=\n",
+  "> testcase                          base           change      head        =\n",
+  "   metric=0A=\n",
+  "> page_fault3/enable THP           10519          -20.5%        836      wi=\n",
+  "ll-it-scale.per_thread_ops=0A=\n",
+  "> page_fault2/enalbe THP            8281          -18.8%       6728      wi=\n",
+  "ll-it-scale.per_thread_ops=0A=\n",
+  "> brk1/eanble THP                 998475           -2.2%     976893      wi=\n",
+  "ll-it-scale.per_process_ops=0A=\n",
+  "> context_switch1/enable THP      223910           -1.3%     220930      wi=\n",
+  "ll-it-scale.per_process_ops=0A=\n",
+  "> context_switch1/enable THP      233722           -1.0%     231288      wi=\n",
+  "ll-it-scale.per_thread_ops=0A=\n",
+  ">=0A=\n",
+  "> b). Disable THP=0A=\n",
+  "> page_fault3/disable THP          10856          -23.1%       8344      wi=\n",
+  "ll-it-scale.per_thread_ops=0A=\n",
+  "> page_fault2/disable THP           8147          -18.8%       6613      wi=\n",
+  "ll-it-scale.per_thread_ops=0A=\n",
+  "> brk1/disable THP                   957           -7.9%        881      wi=\n",
+  "ll-it-scale.per_thread_ops=0A=\n",
+  "> context_switch1/disable THP     237006           -2.2%     231907      wi=\n",
+  "ll-it-scale.per_thread_ops=0A=\n",
+  "> brk1/disable THP                997317           -2.0%     977778      wi=\n",
+  "ll-it-scale.per_process_ops=0A=\n",
+  "> page_fault3/disable THP         467454           -1.8%     459251      wi=\n",
+  "ll-it-scale.per_process_ops=0A=\n",
+  "> context_switch1/disable THP     224431           -1.3%     221567      wi=\n",
+  "ll-it-scale.per_process_ops=0A=\n",
+  ">=0A=\n",
+  "> Notes: for the above  values of test result, the higher is better.=0A=\n",
+  "=0A=\n",
+  "I tried the same tests on my PowerPC victim VM (1024 CPUs, 11TB) and I can'=\n",
+  "t=0A=\n",
+  "get reproducible results. The results have huge variation, even on the vani=\n",
+  "lla=0A=\n",
+  "kernel, and I can't state on any changes due to that.=0A=\n",
+  "=0A=\n",
+  "I tried on smaller node (80 CPUs, 32G), and the tests ran better, but I did=\n",
+  "n't=0A=\n",
+  "measure any changes between the vanilla and the SPF patched ones:=0A=\n",
+  "=0A=\n",
+  "test THP enabled                4.17.0-rc4-mm1  spf             delta=0A=\n",
+  "page_fault3_threads             2697.7          2683.5          -0.53%=0A=\n",
+  "page_fault2_threads             170660.6        169574.1        -0.64%=0A=\n",
+  "context_switch1_threads         6915269.2       6877507.3       -0.55%=0A=\n",
+  "context_switch1_processes       6478076.2       6529493.5       0.79%=0A=\n",
+  "brk1                            243391.2        238527.5        -2.00%=0A=\n",
+  "=0A=\n",
+  "Tests were run 10 times, no high variation detected.=0A=\n",
+  "=0A=\n",
+  "Did you see high variation on your side ? How many times the test were run =\n",
+  "to=0A=\n",
+  "compute the average values ?=0A=\n",
+  "=0A=\n",
+  "Thanks,=0A=\n",
+  "Laurent.=0A=\n",
+  "=0A=\n",
+  "=0A=\n",
+  ">=0A=\n",
+  "> 2. Improvement: not found improvement based on the selected test cases.=\n",
+  "=0A=\n",
+  ">=0A=\n",
+  ">=0A=\n",
+  "> Best regards=0A=\n",
+  "> Haiyan Song=0A=\n",
+  "> ________________________________________=0A=\n",
+  "> From: owner-linux-mm\@kvack.org [owner-linux-mm\@kvack.org] on behalf of La=\n",
+  "urent Dufour [ldufour\@linux.vnet.ibm.com]=0A=\n",
+  "> Sent: Monday, May 28, 2018 4:54 PM=0A=\n",
+  "> To: Song, HaiyanX=0A=\n",
+  "> Cc: akpm\@linux-foundation.org; mhocko\@kernel.org; peterz\@infradead.org; k=\n",
+  "irill\@shutemov.name; ak\@linux.intel.com; dave\@stgolabs.net; jack\@suse.cz; M=\n",
+  "atthew Wilcox; khandual\@linux.vnet.ibm.com; aneesh.kumar\@linux.vnet.ibm.com=\n",
+  "; benh\@kernel.crashing.org; mpe\@ellerman.id.au; paulus\@samba.org; Thomas Gl=\n",
+  "eixner; Ingo Molnar; hpa\@zytor.com; Will Deacon; Sergey Senozhatsky; sergey=\n",
+  ".senozhatsky.work\@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Ke=\n",
+  "mi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan=\n",
+  " Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel\@vger.kernel.org;=\n",
+  " linux-mm\@kvack.org; haren\@linux.vnet.ibm.com; npiggin\@gmail.com; bsingharo=\n",
+  "ra\@gmail.com; paulmck\@linux.vnet.ibm.com; Tim Chen; linuxppc-dev\@lists.ozla=\n",
+  "bs.org; x86\@kernel.org=0A=\n",
+  "> Subject: Re: [PATCH v11 00/26] Speculative page faults=0A=\n",
+  ">=0A=\n",
+  "> On 28/05/2018 10:22, Haiyan Song wrote:=0A=\n",
+  ">> Hi Laurent,=0A=\n",
+  ">>=0A=\n",
+  ">> Yes, these tests are done on V9 patch.=0A=\n",
+  ">=0A=\n",
+  "> Do you plan to give this V11 a run ?=0A=\n",
+  ">=0A=\n",
+  ">>=0A=\n",
+  ">>=0A=\n",
+  ">> Best regards,=0A=\n",
+  ">> Haiyan Song=0A=\n",
+  ">>=0A=\n",
+  ">> On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote:=0A=\n",
+  ">>> On 28/05/2018 07:23, Song, HaiyanX wrote:=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Some regression and improvements is found by LKP-tools(linux kernel pe=\n",
+  "rformance) on V9 patch series=0A=\n",
+  ">>>> tested on Intel 4s Skylake platform.=0A=\n",
+  ">>>=0A=\n",
+  ">>> Hi,=0A=\n",
+  ">>>=0A=\n",
+  ">>> Thanks for reporting this benchmark results, but you mentioned the \"V9 =\n",
+  "patch=0A=\n",
+  ">>> series\" while responding to the v11 header series...=0A=\n",
+  ">>> Were these tests done on v9 or v11 ?=0A=\n",
+  ">>>=0A=\n",
+  ">>> Cheers,=0A=\n",
+  ">>> Laurent.=0A=\n",
+  ">>>=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> The regression result is sorted by the metric will-it-scale.per_thread=\n",
+  "_ops.=0A=\n",
+  ">>>> Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 pat=\n",
+  "ch series)=0A=\n",
+  ">>>> Commit id:=0A=\n",
+  ">>>>     base commit: d55f34411b1b126429a823d06c3124c16283231f=0A=\n",
+  ">>>>     head commit: 0355322b3577eeab7669066df42c550a56801110=0A=\n",
+  ">>>> Benchmark suite: will-it-scale=0A=\n",
+  ">>>> Download link:=0A=\n",
+  ">>>> https://github.com/antonblanchard/will-it-scale/tree/master/tests=0A=\n",
+  ">>>> Metrics:=0A=\n",
+  ">>>>     will-it-scale.per_process_ops=3Dprocesses/nr_cpu=0A=\n",
+  ">>>>     will-it-scale.per_thread_ops=3Dthreads/nr_cpu=0A=\n",
+  ">>>> test box: lkp-skl-4sp1(nr_cpu=3D192,memory=3D768G)=0A=\n",
+  ">>>> THP: enable / disable=0A=\n",
+  ">>>> nr_task: 100%=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> 1. Regressions:=0A=\n",
+  ">>>> a) THP enabled:=0A=\n",
+  ">>>> testcase                        base            change          head  =\n",
+  "     metric=0A=\n",
+  ">>>> page_fault3/ enable THP         10092           -17.5%          8323  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> page_fault2/ enable THP          8300           -17.2%          6869  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> brk1/ enable THP                  957.67         -7.6%           885  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> page_fault3/ enable THP        172821            -5.3%        163692  =\n",
+  "     will-it-scale.per_process_ops=0A=\n",
+  ">>>> signal1/ enable THP              9125            -3.2%          8834  =\n",
+  "     will-it-scale.per_process_ops=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> b) THP disabled:=0A=\n",
+  ">>>> testcase                        base            change          head  =\n",
+  "     metric=0A=\n",
+  ">>>> page_fault3/ disable THP        10107           -19.1%          8180  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> page_fault2/ disable THP         8432           -17.8%          6931  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> context_switch1/ disable THP   215389            -6.8%        200776  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> brk1/ disable THP                 939.67         -6.6%           877.3=\n",
+  "3    will-it-scale.per_thread_ops=0A=\n",
+  ">>>> page_fault3/ disable THP       173145            -4.7%        165064  =\n",
+  "     will-it-scale.per_process_ops=0A=\n",
+  ">>>> signal1/ disable THP             9162            -3.9%          8802  =\n",
+  "     will-it-scale.per_process_ops=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> 2. Improvements:=0A=\n",
+  ">>>> a) THP enabled:=0A=\n",
+  ">>>> testcase                        base            change          head  =\n",
+  "     metric=0A=\n",
+  ">>>> malloc1/ enable THP               66.33        +469.8%           383.6=\n",
+  "7    will-it-scale.per_thread_ops=0A=\n",
+  ">>>> writeseek3/ enable THP          2531             +4.5%          2646  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> signal1/ enable THP              989.33          +2.8%          1016  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> b) THP disabled:=0A=\n",
+  ">>>> testcase                        base            change          head  =\n",
+  "     metric=0A=\n",
+  ">>>> malloc1/ disable THP              90.33        +417.3%           467.3=\n",
+  "3    will-it-scale.per_thread_ops=0A=\n",
+  ">>>> read2/ disable THP             58934            +39.2%         82060  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> page_fault1/ disable THP        8607            +36.4%         11736  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> read1/ disable THP            314063            +12.7%        353934  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> writeseek3/ disable THP         2452            +12.5%          2759  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>> signal1/ disable THP             971.33          +5.5%          1024  =\n",
+  "     will-it-scale.per_thread_ops=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Notes: for above values in column \"change\", the higher value means tha=\n",
+  "t the related testcase result=0A=\n",
+  ">>>> on head commit is better than that on base commit for this benchmark.=\n",
+  "=0A=\n",
+  ">>>>=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Best regards=0A=\n",
+  ">>>> Haiyan Song=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> ________________________________________=0A=\n",
+  ">>>> From: owner-linux-mm\@kvack.org [owner-linux-mm\@kvack.org] on behalf of=\n",
+  " Laurent Dufour [ldufour\@linux.vnet.ibm.com]=0A=\n",
+  ">>>> Sent: Thursday, May 17, 2018 7:06 PM=0A=\n",
+  ">>>> To: akpm\@linux-foundation.org; mhocko\@kernel.org; peterz\@infradead.org=\n",
+  "; kirill\@shutemov.name; ak\@linux.intel.com; dave\@stgolabs.net; jack\@suse.cz=\n",
+  "; Matthew Wilcox; khandual\@linux.vnet.ibm.com; aneesh.kumar\@linux.vnet.ibm.=\n",
+  "com; benh\@kernel.crashing.org; mpe\@ellerman.id.au; paulus\@samba.org; Thomas=\n",
+  " Gleixner; Ingo Molnar; hpa\@zytor.com; Will Deacon; Sergey Senozhatsky; ser=\n",
+  "gey.senozhatsky.work\@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang,=\n",
+  " Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minc=\n",
+  "han Kim; Punit Agrawal; vinayak menon; Yang Shi=0A=\n",
+  ">>>> Cc: linux-kernel\@vger.kernel.org; linux-mm\@kvack.org; haren\@linux.vnet=\n",
+  ".ibm.com; npiggin\@gmail.com; bsingharora\@gmail.com; paulmck\@linux.vnet.ibm.=\n",
+  "com; Tim Chen; linuxppc-dev\@lists.ozlabs.org; x86\@kernel.org=0A=\n",
+  ">>>> Subject: [PATCH v11 00/26] Speculative page faults=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> This is a port on kernel 4.17 of the work done by Peter Zijlstra to ha=\n",
+  "ndle=0A=\n",
+  ">>>> page fault without holding the mm semaphore [1].=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> The idea is to try to handle user space page faults without holding th=\n",
+  "e=0A=\n",
+  ">>>> mmap_sem. This should allow better concurrency for massively threaded=\n",
+  "=0A=\n",
+  ">>>> process since the page fault handler will not wait for other threads m=\n",
+  "emory=0A=\n",
+  ">>>> layout change to be done, assuming that this change is done in another=\n",
+  " part=0A=\n",
+  ">>>> of the process's memory space. This type page fault is named speculati=\n",
+  "ve=0A=\n",
+  ">>>> page fault. If the speculative page fault fails because of a concurren=\n",
+  "cy is=0A=\n",
+  ">>>> detected or because underlying PMD or PTE tables are not yet allocatin=\n",
+  "g, it=0A=\n",
+  ">>>> is failing its processing and a classic page fault is then tried.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> The speculative page fault (SPF) has to look for the VMA matching the =\n",
+  "fault=0A=\n",
+  ">>>> address without holding the mmap_sem, this is done by introducing a rw=\n",
+  "lock=0A=\n",
+  ">>>> which protects the access to the mm_rb tree. Previously this was done =\n",
+  "using=0A=\n",
+  ">>>> SRCU but it was introducing a lot of scheduling to process the VMA's=\n",
+  "=0A=\n",
+  ">>>> freeing operation which was hitting the performance by 20% as reported=\n",
+  " by=0A=\n",
+  ">>>> Kemi Wang [2]. Using a rwlock to protect access to the mm_rb tree is=\n",
+  "=0A=\n",
+  ">>>> limiting the locking contention to these operations which are expected=\n",
+  " to=0A=\n",
+  ">>>> be in a O(log n) order. In addition to ensure that the VMA is not free=\n",
+  "d in=0A=\n",
+  ">>>> our back a reference count is added and 2 services (get_vma() and=0A=\n",
+  ">>>> put_vma()) are introduced to handle the reference count. Once a VMA is=\n",
+  "=0A=\n",
+  ">>>> fetched from the RB tree using get_vma(), it must be later freed using=\n",
+  "=0A=\n",
+  ">>>> put_vma(). I can't see anymore the overhead I got while will-it-scale=\n",
+  "=0A=\n",
+  ">>>> benchmark anymore.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> The VMA's attributes checked during the speculative page fault process=\n",
+  "ing=0A=\n",
+  ">>>> have to be protected against parallel changes. This is done by using a=\n",
+  " per=0A=\n",
+  ">>>> VMA sequence lock. This sequence lock allows the speculative page faul=\n",
+  "t=0A=\n",
+  ">>>> handler to fast check for parallel changes in progress and to abort th=\n",
+  "e=0A=\n",
+  ">>>> speculative page fault in that case.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Once the VMA has been found, the speculative page fault handler would =\n",
+  "check=0A=\n",
+  ">>>> for the VMA's attributes to verify that the page fault has to be handl=\n",
+  "ed=0A=\n",
+  ">>>> correctly or not. Thus, the VMA is protected through a sequence lock w=\n",
+  "hich=0A=\n",
+  ">>>> allows fast detection of concurrent VMA changes. If such a change is=\n",
+  "=0A=\n",
+  ">>>> detected, the speculative page fault is aborted and a *classic* page f=\n",
+  "ault=0A=\n",
+  ">>>> is tried.  VMA sequence lockings are added when VMA attributes which a=\n",
+  "re=0A=\n",
+  ">>>> checked during the page fault are modified.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> When the PTE is fetched, the VMA is checked to see if it has been chan=\n",
+  "ged,=0A=\n",
+  ">>>> so once the page table is locked, the VMA is valid, so any other chang=\n",
+  "es=0A=\n",
+  ">>>> leading to touching this PTE will need to lock the page table, so no=\n",
+  "=0A=\n",
+  ">>>> parallel change is possible at this time.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> The locking of the PTE is done with interrupts disabled, this allows=\n",
+  "=0A=\n",
+  ">>>> checking for the PMD to ensure that there is not an ongoing collapsing=\n",
+  "=0A=\n",
+  ">>>> operation. Since khugepaged is firstly set the PMD to pmd_none and the=\n",
+  "n is=0A=\n",
+  ">>>> waiting for the other CPU to have caught the IPI interrupt, if the pmd=\n",
+  " is=0A=\n",
+  ">>>> valid at the time the PTE is locked, we have the guarantee that the=0A=\n",
+  ">>>> collapsing operation will have to wait on the PTE lock to move forward=\n",
+  ".=0A=\n",
+  ">>>> This allows the SPF handler to map the PTE safely. If the PMD value is=\n",
+  "=0A=\n",
+  ">>>> different from the one recorded at the beginning of the SPF operation,=\n",
+  " the=0A=\n",
+  ">>>> classic page fault handler will be called to handle the operation whil=\n",
+  "e=0A=\n",
+  ">>>> holding the mmap_sem. As the PTE lock is done with the interrupts disa=\n",
+  "bled,=0A=\n",
+  ">>>> the lock is done using spin_trylock() to avoid dead lock when handling=\n",
+  " a=0A=\n",
+  ">>>> page fault while a TLB invalidate is requested by another CPU holding =\n",
+  "the=0A=\n",
+  ">>>> PTE.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> In pseudo code, this could be seen as:=0A=\n",
+  ">>>>     speculative_page_fault()=0A=\n",
+  ">>>>     {=0A=\n",
+  ">>>>             vma =3D get_vma()=0A=\n",
+  ">>>>             check vma sequence count=0A=\n",
+  ">>>>             check vma's support=0A=\n",
+  ">>>>             disable interrupt=0A=\n",
+  ">>>>                   check pgd,p4d,...,pte=0A=\n",
+  ">>>>                   save pmd and pte in vmf=0A=\n",
+  ">>>>                   save vma sequence counter in vmf=0A=\n",
+  ">>>>             enable interrupt=0A=\n",
+  ">>>>             check vma sequence count=0A=\n",
+  ">>>>             handle_pte_fault(vma)=0A=\n",
+  ">>>>                     ..=0A=\n",
+  ">>>>                     page =3D alloc_page()=0A=\n",
+  ">>>>                     pte_map_lock()=0A=\n",
+  ">>>>                             disable interrupt=0A=\n",
+  ">>>>                                     abort if sequence counter has chan=\n",
+  "ged=0A=\n",
+  ">>>>                                     abort if pmd or pte has changed=0A=\n",
+  ">>>>                                     pte map and lock=0A=\n",
+  ">>>>                             enable interrupt=0A=\n",
+  ">>>>                     if abort=0A=\n",
+  ">>>>                        free page=0A=\n",
+  ">>>>                        abort=0A=\n",
+  ">>>>                     ...=0A=\n",
+  ">>>>     }=0A=\n",
+  ">>>>=0A=\n",
+  ">>>>     arch_fault_handler()=0A=\n",
+  ">>>>     {=0A=\n",
+  ">>>>             if (speculative_page_fault(&vma))=0A=\n",
+  ">>>>                goto done=0A=\n",
+  ">>>>     again:=0A=\n",
+  ">>>>             lock(mmap_sem)=0A=\n",
+  ">>>>             vma =3D find_vma();=0A=\n",
+  ">>>>             handle_pte_fault(vma);=0A=\n",
+  ">>>>             if retry=0A=\n",
+  ">>>>                unlock(mmap_sem)=0A=\n",
+  ">>>>                goto again;=0A=\n",
+  ">>>>     done:=0A=\n",
+  ">>>>             handle fault error=0A=\n",
+  ">>>>     }=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Support for THP is not done because when checking for the PMD, we can =\n",
+  "be=0A=\n",
+  ">>>> confused by an in progress collapsing operation done by khugepaged. Th=\n",
+  "e=0A=\n",
+  ">>>> issue is that pmd_none() could be true either if the PMD is not alread=\n",
+  "y=0A=\n",
+  ">>>> populated or if the underlying PTE are in the way to be collapsed. So =\n",
+  "we=0A=\n",
+  ">>>> cannot safely allocate a PMD if pmd_none() is true.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> This series add a new software performance event named 'speculative-fa=\n",
+  "ults'=0A=\n",
+  ">>>> or 'spf'. It counts the number of successful page fault event handled=\n",
+  "=0A=\n",
+  ">>>> speculatively. When recording 'faults,spf' events, the faults one is=\n",
+  "=0A=\n",
+  ">>>> counting the total number of page fault events while 'spf' is only cou=\n",
+  "nting=0A=\n",
+  ">>>> the part of the faults processed speculatively.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> There are some trace events introduced by this series. They allow=0A=\n",
+  ">>>> identifying why the page faults were not processed speculatively. This=\n",
+  "=0A=\n",
+  ">>>> doesn't take in account the faults generated by a monothreaded process=\n",
+  "=0A=\n",
+  ">>>> which directly processed while holding the mmap_sem. This trace events=\n",
+  " are=0A=\n",
+  ">>>> grouped in a system named 'pagefault', they are:=0A=\n",
+  ">>>>  - pagefault:spf_vma_changed : if the VMA has been changed in our back=\n",
+  "=0A=\n",
+  ">>>>  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set.=\n",
+  "=0A=\n",
+  ">>>>  - pagefault:spf_vma_notsup : the VMA's type is not supported=0A=\n",
+  ">>>>  - pagefault:spf_vma_access : the VMA's access right are not respected=\n",
+  "=0A=\n",
+  ">>>>  - pagefault:spf_pmd_changed : the upper PMD pointer has changed in ou=\n",
+  "r=0A=\n",
+  ">>>>    back.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> To record all the related events, the easier is to run perf with the=\n",
+  "=0A=\n",
+  ">>>> following arguments :=0A=\n",
+  ">>>> \$ perf stat -e 'faults,spf,pagefault:*' <command>=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> There is also a dedicated vmstat counter showing the number of success=\n",
+  "ful=0A=\n",
+  ">>>> page fault handled speculatively. I can be seen this way:=0A=\n",
+  ">>>> \$ grep speculative_pgfault /proc/vmstat=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> This series builds on top of v4.16-mmotm-2018-04-13-17-28 and is funct=\n",
+  "ional=0A=\n",
+  ">>>> on x86, PowerPC and arm64.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> ---------------------=0A=\n",
+  ">>>> Real Workload results=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> As mentioned in previous email, we did non official runs using a \"popu=\n",
+  "lar=0A=\n",
+  ">>>> in memory multithreaded database product\" on 176 cores SMT8 Power syst=\n",
+  "em=0A=\n",
+  ">>>> which showed a 30% improvements in the number of transaction processed=\n",
+  " per=0A=\n",
+  ">>>> second. This run has been done on the v6 series, but changes introduce=\n",
+  "d in=0A=\n",
+  ">>>> this new version should not impact the performance boost seen.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Here are the perf data captured during 2 of these runs on top of the v=\n",
+  "8=0A=\n",
+  ">>>> series:=0A=\n",
+  ">>>>                 vanilla         spf=0A=\n",
+  ">>>> faults          89.418          101.364         +13%=0A=\n",
+  ">>>> spf                n/a           97.989=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> With the SPF kernel, most of the page fault were processed in a specul=\n",
+  "ative=0A=\n",
+  ">>>> way.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Ganesh Mahendran had backported the series on top of a 4.9 kernel and =\n",
+  "gave=0A=\n",
+  ">>>> it a try on an android device. He reported that the application launch=\n",
+  " time=0A=\n",
+  ">>>> was improved in average by 6%, and for large applications (~100 thread=\n",
+  "s) by=0A=\n",
+  ">>>> 20%.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Here are the launch time Ganesh mesured on Android 8.0 on top of a Qco=\n",
+  "m=0A=\n",
+  ">>>> MSM845 (8 cores) with 6GB (the less is better):=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Application                             4.9     4.9+spf delta=0A=\n",
+  ">>>> com.tencent.mm                          416     389     -7%=0A=\n",
+  ">>>> com.eg.android.AlipayGphone             1135    986     -13%=0A=\n",
+  ">>>> com.tencent.mtt                         455     454     0%=0A=\n",
+  ">>>> com.qqgame.hlddz                        1497    1409    -6%=0A=\n",
+  ">>>> com.autonavi.minimap                    711     701     -1%=0A=\n",
+  ">>>> com.tencent.tmgp.sgame                  788     748     -5%=0A=\n",
+  ">>>> com.immomo.momo                         501     487     -3%=0A=\n",
+  ">>>> com.tencent.peng                        2145    2112    -2%=0A=\n",
+  ">>>> com.smile.gifmaker                      491     461     -6%=0A=\n",
+  ">>>> com.baidu.BaiduMap                      479     366     -23%=0A=\n",
+  ">>>> com.taobao.taobao                       1341    1198    -11%=0A=\n",
+  ">>>> com.baidu.searchbox                     333     314     -6%=0A=\n",
+  ">>>> com.tencent.mobileqq                    394     384     -3%=0A=\n",
+  ">>>> com.sina.weibo                          907     906     0%=0A=\n",
+  ">>>> com.youku.phone                         816     731     -11%=0A=\n",
+  ">>>> com.happyelements.AndroidAnimal.qq      763     717     -6%=0A=\n",
+  ">>>> com.UCMobile                            415     411     -1%=0A=\n",
+  ">>>> com.tencent.tmgp.ak                     1464    1431    -2%=0A=\n",
+  ">>>> com.tencent.qqmusic                     336     329     -2%=0A=\n",
+  ">>>> com.sankuai.meituan                     1661    1302    -22%=0A=\n",
+  ">>>> com.netease.cloudmusic                  1193    1200    1%=0A=\n",
+  ">>>> air.tv.douyu.android                    4257    4152    -2%=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> ------------------=0A=\n",
+  ">>>> Benchmarks results=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Base kernel is v4.17.0-rc4-mm1=0A=\n",
+  ">>>> SPF is BASE + this series=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Kernbench:=0A=\n",
+  ">>>> ----------=0A=\n",
+  ">>>> Here are the results on a 16 CPUs X86 guest using kernbench on a 4.15=\n",
+  "=0A=\n",
+  ">>>> kernel (kernel is build 5 times):=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Average Half load -j 8=0A=\n",
+  ">>>>                  Run    (std deviation)=0A=\n",
+  ">>>>                  BASE                   SPF=0A=\n",
+  ">>>> Elapsed Time     1448.65 (5.72312)      1455.84 (4.84951)       0.50%=\n",
+  "=0A=\n",
+  ">>>> User    Time     10135.4 (30.3699)      10148.8 (31.1252)       0.13%=\n",
+  "=0A=\n",
+  ">>>> System  Time     900.47  (2.81131)      923.28  (7.52779)       2.53%=\n",
+  "=0A=\n",
+  ">>>> Percent CPU      761.4   (1.14018)      760.2   (0.447214)      -0.16%=\n",
+  "=0A=\n",
+  ">>>> Context Switches 85380   (3419.52)      84748   (1904.44)       -0.74%=\n",
+  "=0A=\n",
+  ">>>> Sleeps           105064  (1240.96)      105074  (337.612)       0.01%=\n",
+  "=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Average Optimal load -j 16=0A=\n",
+  ">>>>                  Run    (std deviation)=0A=\n",
+  ">>>>                  BASE                   SPF=0A=\n",
+  ">>>> Elapsed Time     920.528 (10.1212)      927.404 (8.91789)       0.75%=\n",
+  "=0A=\n",
+  ">>>> User    Time     11064.8 (981.142)      11085   (990.897)       0.18%=\n",
+  "=0A=\n",
+  ">>>> System  Time     979.904 (84.0615)      1001.14 (82.5523)       2.17%=\n",
+  "=0A=\n",
+  ">>>> Percent CPU      1089.5  (345.894)      1086.1  (343.545)       -0.31%=\n",
+  "=0A=\n",
+  ">>>> Context Switches 159488  (78156.4)      158223  (77472.1)       -0.79%=\n",
+  "=0A=\n",
+  ">>>> Sleeps           110566  (5877.49)      110388  (5617.75)       -0.16%=\n",
+  "=0A=\n",
+  ">>>>=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> During a run on the SPF, perf events were captured:=0A=\n",
+  ">>>>  Performance counter stats for '../kernbench -M':=0A=\n",
+  ">>>>          526743764      faults=0A=\n",
+  ">>>>                210      spf=0A=\n",
+  ">>>>                  3      pagefault:spf_vma_changed=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_noanon=0A=\n",
+  ">>>>               2278      pagefault:spf_vma_notsup=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_access=0A=\n",
+  ">>>>                  0      pagefault:spf_pmd_changed=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Very few speculative page faults were recorded as most of the processe=\n",
+  "s=0A=\n",
+  ">>>> involved are monothreaded (sounds that on this architecture some threa=\n",
+  "ds=0A=\n",
+  ">>>> were created during the kernel build processing).=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Here are the kerbench results on a 80 CPUs Power8 system:=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Average Half load -j 40=0A=\n",
+  ">>>>                  Run    (std deviation)=0A=\n",
+  ">>>>                  BASE                   SPF=0A=\n",
+  ">>>> Elapsed Time     117.152 (0.774642)     117.166 (0.476057)      0.01%=\n",
+  "=0A=\n",
+  ">>>> User    Time     4478.52 (24.7688)      4479.76 (9.08555)       0.03%=\n",
+  "=0A=\n",
+  ">>>> System  Time     131.104 (0.720056)     134.04  (0.708414)      2.24%=\n",
+  "=0A=\n",
+  ">>>> Percent CPU      3934    (19.7104)      3937.2  (19.0184)       0.08%=\n",
+  "=0A=\n",
+  ">>>> Context Switches 92125.4 (576.787)      92581.6 (198.622)       0.50%=\n",
+  "=0A=\n",
+  ">>>> Sleeps           317923  (652.499)      318469  (1255.59)       0.17%=\n",
+  "=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Average Optimal load -j 80=0A=\n",
+  ">>>>                  Run    (std deviation)=0A=\n",
+  ">>>>                  BASE                   SPF=0A=\n",
+  ">>>> Elapsed Time     107.73  (0.632416)     107.31  (0.584936)      -0.39%=\n",
+  "=0A=\n",
+  ">>>> User    Time     5869.86 (1466.72)      5871.71 (1467.27)       0.03%=\n",
+  "=0A=\n",
+  ">>>> System  Time     153.728 (23.8573)      157.153 (24.3704)       2.23%=\n",
+  "=0A=\n",
+  ">>>> Percent CPU      5418.6  (1565.17)      5436.7  (1580.91)       0.33%=\n",
+  "=0A=\n",
+  ">>>> Context Switches 223861  (138865)       225032  (139632)        0.52%=\n",
+  "=0A=\n",
+  ">>>> Sleeps           330529  (13495.1)      332001  (14746.2)       0.45%=\n",
+  "=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> During a run on the SPF, perf events were captured:=0A=\n",
+  ">>>>  Performance counter stats for '../kernbench -M':=0A=\n",
+  ">>>>          116730856      faults=0A=\n",
+  ">>>>                  0      spf=0A=\n",
+  ">>>>                  3      pagefault:spf_vma_changed=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_noanon=0A=\n",
+  ">>>>                476      pagefault:spf_vma_notsup=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_access=0A=\n",
+  ">>>>                  0      pagefault:spf_pmd_changed=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Most of the processes involved are monothreaded so SPF is not activate=\n",
+  "d but=0A=\n",
+  ">>>> there is no impact on the performance.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Ebizzy:=0A=\n",
+  ">>>> -------=0A=\n",
+  ">>>> The test is counting the number of records per second it can manage, t=\n",
+  "he=0A=\n",
+  ">>>> higher is the best. I run it like this 'ebizzy -mTt <nrcpus>'. To get=\n",
+  "=0A=\n",
+  ">>>> consistent result I repeated the test 100 times and measure the averag=\n",
+  "e=0A=\n",
+  ">>>> result. The number is the record processes per second, the higher is t=\n",
+  "he=0A=\n",
+  ">>>> best.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>>                 BASE            SPF             delta=0A=\n",
+  ">>>> 16 CPUs x86 VM  742.57          1490.24         100.69%=0A=\n",
+  ">>>> 80 CPUs P8 node 13105.4         24174.23        84.46%=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Here are the performance counter read during a run on a 16 CPUs x86 VM=\n",
+  ":=0A=\n",
+  ">>>>  Performance counter stats for './ebizzy -mTt 16':=0A=\n",
+  ">>>>            1706379      faults=0A=\n",
+  ">>>>            1674599      spf=0A=\n",
+  ">>>>              30588      pagefault:spf_vma_changed=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_noanon=0A=\n",
+  ">>>>                363      pagefault:spf_vma_notsup=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_access=0A=\n",
+  ">>>>                  0      pagefault:spf_pmd_changed=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> And the ones captured during a run on a 80 CPUs Power node:=0A=\n",
+  ">>>>  Performance counter stats for './ebizzy -mTt 80':=0A=\n",
+  ">>>>            1874773      faults=0A=\n",
+  ">>>>            1461153      spf=0A=\n",
+  ">>>>             413293      pagefault:spf_vma_changed=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_noanon=0A=\n",
+  ">>>>                200      pagefault:spf_vma_notsup=0A=\n",
+  ">>>>                  0      pagefault:spf_vma_access=0A=\n",
+  ">>>>                  0      pagefault:spf_pmd_changed=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> In ebizzy's case most of the page fault were handled in a speculative =\n",
+  "way,=0A=\n",
+  ">>>> leading the ebizzy performance boost.=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> ------------------=0A=\n",
+  ">>>> Changes since v10 (https://lkml.org/lkml/2018/4/17/572):=0A=\n",
+  ">>>>  - Accounted for all review feedbacks from Punit Agrawal, Ganesh Mahen=\n",
+  "dran=0A=\n",
+  ">>>>    and Minchan Kim, hopefully.=0A=\n",
+  ">>>>  - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in=0A=\n",
+  ">>>>    __do_page_fault().=0A=\n",
+  ">>>>  - Loop in pte_spinlock() and pte_map_lock() when pte try lock fails=\n",
+  "=0A=\n",
+  ">>>>    instead=0A=\n",
+  ">>>>    of aborting the speculative page fault handling. Dropping the now=\n",
+  "=0A=\n",
+  ">>>> useless=0A=\n",
+  ">>>>    trace event pagefault:spf_pte_lock.=0A=\n",
+  ">>>>  - No more try to reuse the fetched VMA during the speculative page fa=\n",
+  "ult=0A=\n",
+  ">>>>    handling when retrying is needed. This adds a lot of complexity and=\n",
+  "=0A=\n",
+  ">>>>    additional tests done didn't show a significant performance improve=\n",
+  "ment.=0A=\n",
+  ">>>>  - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error.=\n",
+  "=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at=\n",
+  "-speculative-page-faults-tt965642.html#none=0A=\n",
+  ">>>> [2] https://patchwork.kernel.org/patch/9999687/=0A=\n",
+  ">>>>=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Laurent Dufour (20):=0A=\n",
+  ">>>>   mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT=0A=\n",
+  ">>>>   x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT=0A=\n",
+  ">>>>   powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT=0A=\n",
+  ">>>>   mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE=0A=\n",
+  ">>>>   mm: make pte_unmap_same compatible with SPF=0A=\n",
+  ">>>>   mm: introduce INIT_VMA()=0A=\n",
+  ">>>>   mm: protect VMA modifications using VMA sequence count=0A=\n",
+  ">>>>   mm: protect mremap() against SPF hanlder=0A=\n",
+  ">>>>   mm: protect SPF handler against anon_vma changes=0A=\n",
+  ">>>>   mm: cache some VMA fields in the vm_fault structure=0A=\n",
+  ">>>>   mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()=0A=\n",
+  ">>>>   mm: introduce __lru_cache_add_active_or_unevictable=0A=\n",
+  ">>>>   mm: introduce __vm_normal_page()=0A=\n",
+  ">>>>   mm: introduce __page_add_new_anon_rmap()=0A=\n",
+  ">>>>   mm: protect mm_rb tree with a rwlock=0A=\n",
+  ">>>>   mm: adding speculative page fault failure trace events=0A=\n",
+  ">>>>   perf: add a speculative page fault sw event=0A=\n",
+  ">>>>   perf tools: add support for the SPF perf event=0A=\n",
+  ">>>>   mm: add speculative page fault vmstats=0A=\n",
+  ">>>>   powerpc/mm: add speculative page fault=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Mahendran Ganesh (2):=0A=\n",
+  ">>>>   arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT=0A=\n",
+  ">>>>   arm64/mm: add speculative page fault=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> Peter Zijlstra (4):=0A=\n",
+  ">>>>   mm: prepare for FAULT_FLAG_SPECULATIVE=0A=\n",
+  ">>>>   mm: VMA sequence count=0A=\n",
+  ">>>>   mm: provide speculative fault infrastructure=0A=\n",
+  ">>>>   x86/mm: add speculative pagefault handling=0A=\n",
+  ">>>>=0A=\n",
+  ">>>>  arch/arm64/Kconfig                    |   1 +=0A=\n",
+  ">>>>  arch/arm64/mm/fault.c                 |  12 +=0A=\n",
+  ">>>>  arch/powerpc/Kconfig                  |   1 +=0A=\n",
+  ">>>>  arch/powerpc/mm/fault.c               |  16 +=0A=\n",
+  ">>>>  arch/x86/Kconfig                      |   1 +=0A=\n",
+  ">>>>  arch/x86/mm/fault.c                   |  27 +-=0A=\n",
+  ">>>>  fs/exec.c                             |   2 +-=0A=\n",
+  ">>>>  fs/proc/task_mmu.c                    |   5 +-=0A=\n",
+  ">>>>  fs/userfaultfd.c                      |  17 +-=0A=\n",
+  ">>>>  include/linux/hugetlb_inline.h        |   2 +-=0A=\n",
+  ">>>>  include/linux/migrate.h               |   4 +-=0A=\n",
+  ">>>>  include/linux/mm.h                    | 136 +++++++-=0A=\n",
+  ">>>>  include/linux/mm_types.h              |   7 +=0A=\n",
+  ">>>>  include/linux/pagemap.h               |   4 +-=0A=\n",
+  ">>>>  include/linux/rmap.h                  |  12 +-=0A=\n",
+  ">>>>  include/linux/swap.h                  |  10 +-=0A=\n",
+  ">>>>  include/linux/vm_event_item.h         |   3 +=0A=\n",
+  ">>>>  include/trace/events/pagefault.h      |  80 +++++=0A=\n",
+  ">>>>  include/uapi/linux/perf_event.h       |   1 +=0A=\n",
+  ">>>>  kernel/fork.c                         |   5 +-=0A=\n",
+  ">>>>  mm/Kconfig                            |  22 ++=0A=\n",
+  ">>>>  mm/huge_memory.c                      |   6 +-=0A=\n",
+  ">>>>  mm/hugetlb.c                          |   2 +=0A=\n",
+  ">>>>  mm/init-mm.c                          |   3 +=0A=\n",
+  ">>>>  mm/internal.h                         |  20 ++=0A=\n",
+  ">>>>  mm/khugepaged.c                       |   5 +=0A=\n",
+  ">>>>  mm/madvise.c                          |   6 +-=0A=\n",
+  ">>>>  mm/memory.c                           | 612 +++++++++++++++++++++++++=\n",
+  "++++-----=0A=\n",
+  ">>>>  mm/mempolicy.c                        |  51 ++-=0A=\n",
+  ">>>>  mm/migrate.c                          |   6 +-=0A=\n",
+  ">>>>  mm/mlock.c                            |  13 +-=0A=\n",
+  ">>>>  mm/mmap.c                             | 229 ++++++++++---=0A=\n",
+  ">>>>  mm/mprotect.c                         |   4 +-=0A=\n",
+  ">>>>  mm/mremap.c                           |  13 +=0A=\n",
+  ">>>>  mm/nommu.c                            |   2 +-=0A=\n",
+  ">>>>  mm/rmap.c                             |   5 +-=0A=\n",
+  ">>>>  mm/swap.c                             |   6 +-=0A=\n",
+  ">>>>  mm/swap_state.c                       |   8 +-=0A=\n",
+  ">>>>  mm/vmstat.c                           |   5 +-=0A=\n",
+  ">>>>  tools/include/uapi/linux/perf_event.h |   1 +=0A=\n",
+  ">>>>  tools/perf/util/evsel.c               |   1 +=0A=\n",
+  ">>>>  tools/perf/util/parse-events.c        |   4 +=0A=\n",
+  ">>>>  tools/perf/util/parse-events.l        |   1 +=0A=\n",
+  ">>>>  tools/perf/util/python.c              |   1 +=0A=\n",
+  ">>>>  44 files changed, 1161 insertions(+), 211 deletions(-)=0A=\n",
+  ">>>>  create mode 100644 include/trace/events/pagefault.h=0A=\n",
+  ">>>>=0A=\n",
+  ">>>> --=0A=\n",
+  ">>>> 2.7.4=0A=\n",
+  ">>>>=0A=\n",
+  ">>>>=0A=\n",
+  ">>>=0A=\n",
+  ">>=0A=\n",
+  ">=0A=\n",
+  "=0A="
 ]
 
-a20814b647c614b36a938ba0dabb3d7fc985be6f043bf93ba9ce31c5cbfc3db2
+3fe57aa64e67bec846efb9e40f093021b4adbcaceb036655e3916a57dd15f081