diff for duplicates of <9FE19350E8A7EE45B64D8D63D368C8966B847F54@SHSMSX101.ccr.corp.intel.com>

Hi Laurent,

Regression tests for the v11 patch series have been run; some regressions were found by LKP-tools (Linux Kernel Performance)
tested on an Intel 4-socket Skylake platform. This time we only tested the cases which had been run and had shown regressions on
the v9 patch series.
 
The regression result is sorted by the metric will-it-scale.per_thread_ops.

Benchmark: will-it-scale
Download link: https://github.com/antonblanchard/will-it-scale/tree/master

Metrics:
  will-it-scale.per_process_ops=processes/nr_cpu
  will-it-scale.per_thread_ops=threads/nr_cpu
  test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
THP: enable / disable
nr_task: 100%
1. Regressions:

a). Enable THP
testcase                          base           change      head           metric
page_fault3/enable THP           10519          -20.5%        836      will-it-scale.per_thread_ops
page_fault2/enable THP            8281          -18.8%       6728      will-it-scale.per_thread_ops
brk1/enable THP                 998475           -2.2%     976893      will-it-scale.per_process_ops
context_switch1/enable THP      223910           -1.3%     220930      will-it-scale.per_process_ops
context_switch1/enable THP      233722           -1.0%     231288      will-it-scale.per_thread_ops
 
b). Disable THP
page_fault3/disable THP          10856          -23.1%       8344      will-it-scale.per_thread_ops
page_fault2/disable THP           8147          -18.8%       6613      will-it-scale.per_thread_ops
brk1/disable THP                   957           -7.9%        881      will-it-scale.per_thread_ops
context_switch1/disable THP     237006           -2.2%     231907      will-it-scale.per_thread_ops
brk1/disable THP                997317           -2.0%     977778      will-it-scale.per_process_ops
page_fault3/disable THP         467454           -1.8%     459251      will-it-scale.per_process_ops
context_switch1/disable THP     224431           -1.3%     221567      will-it-scale.per_process_ops

Note: for the above test result values, higher is better.
 
Best regards
Haiyan Song
________________________________________
From: owner-linux-mm@kvack.org [owner-linux-mm@kvack.org] on behalf of Laurent Dufour [ldufour@linux.vnet.ibm.com]
Sent: Monday, May 28, 2018 4:54 PM
To: Song, HaiyanX
Cc: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; kirill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; Matthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Gleixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel@vger.kernel.org; linux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharora@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozlabs.org; x86@kernel.org
Subject: Re: [PATCH v11 00/26] Speculative page faults

On 28/05/2018 10:22, Haiyan Song wrote:
[...]
> On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote:
>> On 28/05/2018 07:23, Song, HaiyanX wrote:
>>>
>>> Some regressions and improvements were found by LKP-tools (Linux Kernel Performance) on the V9 patch series
>>> tested on the Intel 4s Skylake platform.
>>
>> Hi,
>>
>> Thanks for reporting these benchmark results, but you mentioned the "V9 patch
>> series" while responding to the v11 header series...
>> Were these tests done on v9 or v11 ?
>>
>> Laurent.
>>
>>>
>>> The regression result is sorted by the metric will-it-scale.per_thread_ops.
>>> Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 patch series)
>>> Commit id:
>>>     base commit: d55f34411b1b126429a823d06c3124c16283231f
>>>     head commit: 0355322b3577eeab7669066df42c550a56801110
>>> Benchmark: will-it-scale
>>> Download link:
>>> https://github.com/antonblanchard/will-it-scale/tree/master/tests
>>> Metrics:
>>>     will-it-scale.per_process_ops=processes/nr_cpu
>>>     will-it-scale.per_thread_ops=threads/nr_cpu
>>> test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
>>> THP: enable / disable
>>> nr_task: 100%
>>>
>>> 1. Regressions:
>>> a) THP enabled:
>>> testcase                        base            change          head       metric
>>> page_fault3/ enable THP         10092           -17.5%          8323       will-it-scale.per_thread_ops
>>> page_fault2/ enable THP          8300           -17.2%          6869       will-it-scale.per_thread_ops
>>> brk1/ enable THP                  957.67         -7.6%           885       will-it-scale.per_thread_ops
>>> page_fault3/ enable THP        172821            -5.3%        163692       will-it-scale.per_process_ops
>>> signal1/ enable THP              9125            -3.2%          8834       will-it-scale.per_process_ops
>>>
>>> b) THP disabled:
>>> testcase                        base            change          head       metric
>>> page_fault3/ disable THP        10107           -19.1%          8180       will-it-scale.per_thread_ops
>>> page_fault2/ disable THP         8432           -17.8%          6931       will-it-scale.per_thread_ops
>>> context_switch1/ disable THP   215389            -6.8%        200776       will-it-scale.per_thread_ops
>>> brk1/ disable THP                 939.67         -6.6%           877.33    will-it-scale.per_thread_ops
>>> page_fault3/ disable THP       173145            -4.7%        165064       will-it-scale.per_process_ops
>>> signal1/ disable THP             9162            -3.9%          8802       will-it-scale.per_process_ops
>>>
>>> 2. Improvements:
>>> a) THP enabled:
>>> testcase                        base            change          head       metric
>>> malloc1/ enable THP               66.33        +469.8%           383.67    will-it-scale.per_thread_ops
>>> writeseek3/ enable THP          2531             +4.5%          2646       will-it-scale.per_thread_ops
>>> signal1/ enable THP              989.33          +2.8%          1016       will-it-scale.per_thread_ops
>>>
>>> b) THP disabled:
>>> testcase                        base            change          head       metric
>>> malloc1/ disable THP              90.33        +417.3%           467.33    will-it-scale.per_thread_ops
>>> read2/ disable THP             58934            +39.2%         82060       will-it-scale.per_thread_ops
>>> page_fault1/ disable THP        8607            +36.4%         11736       will-it-scale.per_thread_ops
>>> read1/ disable THP            314063            +12.7%        353934       will-it-scale.per_thread_ops
>>> writeseek3/ disable THP         2452            +12.5%          2759       will-it-scale.per_thread_ops
>>> signal1/ disable THP             971.33          +5.5%          1024       will-it-scale.per_thread_ops
>>>
>>> Notes: for the above values in the "change" column, a higher value means that the related testcase result
>>> on head commit is better than that on base commit for this benchmark.
>>>
>>>
>>> Best regards
>>> Haiyan Song
>>>
>>> ________________________________________
>>> From: owner-linux-mm@kvack.org [owner-linux-mm@kvack.org] on behalf of Laurent Dufour [ldufour@linux.vnet.ibm.com]
>>> Sent: Thursday, May 17, 2018 7:06 PM
>>> To: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; kirill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; Matthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Gleixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi
>>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharora@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozlabs.org; x86@kernel.org
>>> Subject: [PATCH v11 00/26] Speculative page faults
>>>
>>> This is a port on kernel 4.17 of the work done by Peter Zijlstra to handle
>>> page faults without holding the mm semaphore [1].
>>>
>>> The idea is to try to handle user space page faults without holding the
>>> mmap_sem. This should allow better concurrency for massively threaded
>>> processes, since the page fault handler will not wait for other threads'
>>> memory layout changes to be done, assuming that the change is done in
>>> another part of the process's memory space. This type of page fault is
>>> named a speculative page fault. If the speculative page fault fails because
>>> a concurrency is detected or because the underlying PMD or PTE tables are
>>> not yet allocated, its processing is aborted and a classic page fault is
>>> then tried.
>>>
>>> The speculative page fault (SPF) has to look for the VMA matching the fault
>>> address without holding the mmap_sem; this is done by introducing a rwlock
>>> which protects access to the mm_rb tree. Previously this was done using
>>> SRCU, but it was introducing a lot of scheduling to process the VMA
>>> freeing operations, which was hurting the performance by 20% as reported by
>>> Kemi Wang [2]. Using a rwlock to protect access to the mm_rb tree limits
>>> the locking contention to these operations, which are expected to be of
>>> O(log n) order. In addition, to ensure that the VMA is not freed behind
>>> our back, a reference count is added, and 2 services (get_vma() and
>>> put_vma()) are introduced to handle the reference count. Once a VMA is
>>> fetched from the RB tree using get_vma(), it must later be freed using
>>> put_vma(). I can no longer see the overhead I previously got with the
>>> will-it-scale benchmark.
 >>>
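For illustration, here is a minimal C sketch of the rwlock-protected lookup plus reference counting described above; the mm_rb_lock, vm_ref_count, find_vma_rb() and __free_vma() names are assumptions made for the sketch, not the series' actual code:

    /* Sketch: fetch a VMA without holding mmap_sem, pinning it with a
     * reference count so that a parallel unmap cannot free it under us. */
    static struct vm_area_struct *get_vma(struct mm_struct *mm,
                                          unsigned long addr)
    {
            struct vm_area_struct *vma;

            read_lock(&mm->mm_rb_lock);          /* assumed rwlock over mm_rb */
            vma = find_vma_rb(mm, addr);         /* illustrative RB-tree walk */
            if (vma)
                    atomic_inc(&vma->vm_ref_count);  /* assumed refcount field */
            read_unlock(&mm->mm_rb_lock);
            return vma;
    }

    static void put_vma(struct vm_area_struct *vma)
    {
            /* Dropping the last reference actually frees the VMA. */
            if (atomic_dec_and_test(&vma->vm_ref_count))
                    __free_vma(vma);             /* illustrative free helper */
    }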
>>> The VMA's attributes checked during the speculative page fault processing
>>> have to be protected against parallel changes. This is done by using a per
>>> VMA sequence lock. This sequence lock allows the speculative page fault
>>> handler to quickly check for parallel changes in progress and to abort the
>>> speculative page fault in that case.
>>>
>>> Once the VMA has been found, the speculative page fault handler checks the
>>> VMA's attributes to verify whether the page fault can be handled correctly
>>> or not. Thus, the VMA is protected through a sequence lock which allows
>>> fast detection of concurrent VMA changes. If such a change is detected,
>>> the speculative page fault is aborted and a *classic* page fault is tried.
>>> VMA sequence locking is added where the VMA attributes which are checked
>>> during the page fault are modified.
 >>>
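For illustration, a minimal sketch of that sequence-count check, assuming a seqcount_t-style vm_sequence field on the VMA (the names and exact API usage are assumptions, not the patch code):

    /* Sketch: snapshot the per-VMA sequence count before checking the
     * attributes, then verify that no writer (mprotect, mremap, ...)
     * touched the VMA in the meantime. */
    unsigned int seq = raw_read_seqcount(&vma->vm_sequence);

    if (seq & 1)                        /* a writer is in progress */
            return VM_FAULT_RETRY;      /* fall back to the classic path */

    /* ... check vma->vm_flags, vma->vm_ops, vma->anon_vma, ... */

    if (read_seqcount_retry(&vma->vm_sequence, seq))
            return VM_FAULT_RETRY;      /* VMA changed under us: abort SPF */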
>>> When the PTE is fetched, the VMA is checked to see if it has been changed,
>>> so once the page table is locked, the VMA is known to be valid. Any other
>>> change leading to a touch of this PTE will need to lock the page table,
>>> so no parallel change is possible at this time.
>>>
>>> The locking of the PTE is done with interrupts disabled; this allows
>>> checking the PMD to ensure that there is no ongoing collapsing operation.
>>> Since khugepaged first sets the PMD to pmd_none and then waits for the
>>> other CPUs to have caught the IPI interrupt, if the pmd is valid at the
>>> time the PTE is locked, we have the guarantee that the collapsing
>>> operation will have to wait on the PTE lock to move forward. This allows
>>> the SPF handler to map the PTE safely. If the PMD value is different from
>>> the one recorded at the beginning of the SPF operation, the classic page
>>> fault handler will be called to handle the operation while holding the
>>> mmap_sem. As the PTE lock is taken with interrupts disabled, the lock is
>>> taken using spin_trylock() to avoid deadlock when handling a page fault
>>> while a TLB invalidate is requested by another CPU holding the PTE lock.
>>>
>>> In pseudo code, this could be seen as:
>>>     speculative_page_fault()
>>>     {
>>>             vma = get_vma()
>>>             check vma sequence count
>>>             check vma's support
>>>             disable interrupt
>>> [...]
>>>             check vma sequence count
>>>             handle_pte_fault(vma)
>>>                     ..
>>>                     page = alloc_page()
>>>                     pte_map_lock()
>>>                             disable interrupt
>>>                                     abort if sequence counter has changed
>>>                                     abort if pmd or pte has changed
>>>                                     pte map and lock
>>>                             enable interrupt
>>> [...]
>>>                goto done
>>>     again:
>>>             lock(mmap_sem)
>>>             vma = find_vma();
>>>             handle_pte_fault(vma);
>>>             if retry
>>>                unlock(mmap_sem)
>>>             goto again
>>>             handle fault error
>>>     }
>>>
>>> Support for THP is not done, because when checking for the PMD, we can be
>>> confused by an in-progress collapsing operation done by khugepaged. The
>>> issue is that pmd_none() could be true either if the PMD is not already
>>> populated or if the underlying PTEs are on the way to be collapsed. So we
>>> cannot safely allocate a PMD if pmd_none() is true.
 >>>
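For illustration, a short sketch of the conservative PMD check this implies (the field and helper usage are assumptions, not the patch code):

    /* Sketch: under SPF a pmd_none() PMD is ambiguous -- it may simply be
     * unpopulated, or khugepaged may have just cleared it while collapsing
     * the range -- so bail out to the classic, mmap_sem-protected path. */
    pmd_t pmdval = READ_ONCE(*vmf->pmd);

    if (pmd_none(pmdval) || pmd_trans_huge(pmdval))
            return VM_FAULT_RETRY;      /* retry via the classic fault path */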
>>> This series adds a new software performance event named 'speculative-faults'
>>> or 'spf'. It counts the number of successful page fault events handled
>>> speculatively. When recording 'faults,spf' events, the 'faults' one is
>>> counting the total number of page fault events while 'spf' is only counting
>>> the part of the faults processed speculatively.
>>>
>>> There are some trace events introduced by this series. They allow
>>> identifying why the page faults were not processed speculatively. This
>>> doesn't take into account the faults generated by a monothreaded process,
>>> which are directly processed while holding the mmap_sem. These trace
>>> events are grouped in a system named 'pagefault'; they are:
>>>  - pagefault:spf_vma_changed : if the VMA has been changed behind our back
>>>  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set.
>>> [...]
>>> following arguments :
>>> $ perf stat -e 'faults,spf,pagefault:*' <command>
>>>
>>> There is also a dedicated vmstat counter showing the number of successful
>>> page faults handled speculatively. It can be seen this way:
>>> $ grep speculative_pgfault /proc/vmstat
>>>
>>> This series builds on top of v4.16-mmotm-2018-04-13-17-28 and is functional
>>> on x86, PowerPC and arm64.
>>>
>>> ---------------------
>>> Real Workload results
>>>
>>> As mentioned in a previous email, we did non-official runs using a "popular
>>> in-memory multithreaded database product" on a 176-core SMT8 Power system,
>>> which showed a 30% improvement in the number of transactions processed per
>>> second. This run was done on the v6 series, but changes introduced in
>>> this new version should not impact the performance boost seen.
>>>
>>> Here are the perf data captured during 2 of these runs on top of the v8
>>> [...]
>>> faults          89.418          101.364         +13%
>>> spf                n/a           97.989
>>>
>>> With the SPF kernel, most of the page faults were processed in a speculative
>>> way.
>>>
>>> Ganesh Mahendran had backported the series on top of a 4.9 kernel and gave
>>> it a try on an Android device. He reported that the application launch time
>>> was improved on average by 6%, and for large applications (~100 threads) by
>>> 20%.
>>>
>>> Here are the launch times Ganesh measured on Android 8.0 on top of a Qcom
>>> [...]
>>>                  0      pagefault:spf_pmd_changed
>>>
>>> Very few speculative page faults were recorded as most of the processes
>>> involved are monothreaded (it seems that on this architecture some threads
>>> were created during the kernel build processing).
>>>
>>> Here are the kernbench results on a 80 CPUs Power8 system:
>>> [...]
>>>                  0      pagefault:spf_vma_access
>>>                  0      pagefault:spf_pmd_changed
>>>
>>> Most of the processes involved are monothreaded, so SPF is not activated,
>>> but there is no impact on the performance.
>>>
>>> Ebizzy:
>>> -------
>>> The test counts the number of records per second it can manage; the
>>> higher, the better. I ran it like this: 'ebizzy -mTt <nrcpus>'. To get
>>> consistent results I repeated the test 100 times and measured the average
>>> result. The number is the records processed per second; higher is
>>> better.
>>>
>>>                 BASE            SPF             delta
>>> [...]
>>>                  0      pagefault:spf_vma_access
>>>                  0      pagefault:spf_pmd_changed
>>>
>>> In ebizzy's case most of the page faults were handled in a speculative way,
>>> leading to the ebizzy performance boost.
>>>
>>> ------------------
>>> Changes since v10 (https://lkml.org/lkml/2018/4/17/572):
>>>  - Accounted for all review feedback from Punit Agrawal, Ganesh Mahendran
>>>    and Minchan Kim, hopefully.
>>>  - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in
>>>    __do_page_fault().
>>> [...]
>>>    of aborting the speculative page fault handling. Dropping the now
>>>    useless trace event pagefault:spf_pte_lock.
>>>  - No more trying to reuse the fetched VMA during the speculative page
>>>    fault handling when retrying is needed. This adds a lot of complexity
>>>    and additional tests done didn't show a significant performance
>>>    improvement.
>>>  - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error.
>>>
>>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none
>>> [2] https://patchwork.kernel.org/patch/9999687/
>>>
>>>
>>> [...]
>>>  mm/internal.h                         |  20 ++
>>>  mm/khugepaged.c                       |   5 +
>>>  mm/madvise.c                          |   6 +-
>>>  mm/memory.c                           | 612 +++++++++++++++++++++++++++++-----
>>>  mm/mempolicy.c                        |  51 ++-
>>>  mm/migrate.c                          |   6 +-
>>>  mm/mlock.c                            |  13 +-
-  ">>> was improved in average by 6%, and for large applications (~100 threads) by\n",
+  ">>> Ganesh Mahendran had backported the series on top of a 4.9 kernel and g=\n",
+  "ave\n",
+  ">>> it a try on an android device. He reported that the application launch =\n",
+  "time\n",
+  ">>> was improved in average by 6%, and for large applications (~100 threads=\n",
+  ") by\n",
   ">>> 20%.\n",
   ">>>\n",
   ">>> Here are the launch time Ganesh mesured on Android 8.0 on top of a Qcom\n",
@@ -448,7 +556,8 @@
   ">>>                  0      pagefault:spf_pmd_changed\n",
   ">>>\n",
   ">>> Very few speculative page faults were recorded as most of the processes\n",
-  ">>> involved are monothreaded (sounds that on this architecture some threads\n",
+  ">>> involved are monothreaded (sounds that on this architecture some thread=\n",
+  "s\n",
   ">>> were created during the kernel build processing).\n",
   ">>>\n",
   ">>> Here are the kerbench results on a 80 CPUs Power8 system:\n",
@@ -483,15 +592,18 @@
   ">>>                  0      pagefault:spf_vma_access\n",
   ">>>                  0      pagefault:spf_pmd_changed\n",
   ">>>\n",
-  ">>> Most of the processes involved are monothreaded so SPF is not activated but\n",
+  ">>> Most of the processes involved are monothreaded so SPF is not activated=\n",
+  " but\n",
   ">>> there is no impact on the performance.\n",
   ">>>\n",
   ">>> Ebizzy:\n",
   ">>> -------\n",
-  ">>> The test is counting the number of records per second it can manage, the\n",
+  ">>> The test is counting the number of records per second it can manage, th=\n",
+  "e\n",
   ">>> higher is the best. I run it like this 'ebizzy -mTt <nrcpus>'. To get\n",
   ">>> consistent result I repeated the test 100 times and measure the average\n",
-  ">>> result. The number is the record processes per second, the higher is the\n",
+  ">>> result. The number is the record processes per second, the higher is th=\n",
+  "e\n",
   ">>> best.\n",
   ">>>\n",
   ">>>                 BASE            SPF             delta\n",
@@ -518,12 +630,14 @@
   ">>>                  0      pagefault:spf_vma_access\n",
   ">>>                  0      pagefault:spf_pmd_changed\n",
   ">>>\n",
-  ">>> In ebizzy's case most of the page fault were handled in a speculative way,\n",
+  ">>> In ebizzy's case most of the page fault were handled in a speculative w=\n",
+  "ay,\n",
   ">>> leading the ebizzy performance boost.\n",
   ">>>\n",
   ">>> ------------------\n",
   ">>> Changes since v10 (https://lkml.org/lkml/2018/4/17/572):\n",
-  ">>>  - Accounted for all review feedbacks from Punit Agrawal, Ganesh Mahendran\n",
+  ">>>  - Accounted for all review feedbacks from Punit Agrawal, Ganesh Mahend=\n",
+  "ran\n",
   ">>>    and Minchan Kim, hopefully.\n",
   ">>>  - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in\n",
   ">>>    __do_page_fault().\n",
@@ -532,12 +646,15 @@
   ">>>    of aborting the speculative page fault handling. Dropping the now\n",
   ">>> useless\n",
   ">>>    trace event pagefault:spf_pte_lock.\n",
-  ">>>  - No more try to reuse the fetched VMA during the speculative page fault\n",
+  ">>>  - No more try to reuse the fetched VMA during the speculative page fau=\n",
+  "lt\n",
   ">>>    handling when retrying is needed. This adds a lot of complexity and\n",
-  ">>>    additional tests done didn't show a significant performance improvement.\n",
+  ">>>    additional tests done didn't show a significant performance improvem=\n",
+  "ent.\n",
   ">>>  - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error.\n",
   ">>>\n",
-  ">>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none\n",
+  ">>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-=\n",
+  "speculative-page-faults-tt965642.html#none\n",
   ">>> [2] https://patchwork.kernel.org/patch/9999687/\n",
   ">>>\n",
   ">>>\n",
@@ -600,7 +717,8 @@
   ">>>  mm/internal.h                         |  20 ++\n",
   ">>>  mm/khugepaged.c                       |   5 +\n",
   ">>>  mm/madvise.c                          |   6 +-\n",
-  ">>>  mm/memory.c                           | 612 +++++++++++++++++++++++++++++-----\n",
+  ">>>  mm/memory.c                           | 612 ++++++++++++++++++++++++++=\n",
+  "+++-----\n",
   ">>>  mm/mempolicy.c                        |  51 ++-\n",
   ">>>  mm/migrate.c                          |   6 +-\n",
   ">>>  mm/mlock.c                            |  13 +-\n",
@@ -628,4 +746,4 @@
   ">"
 ]
 

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.