From: Mel Gorman <mgorman@techsingularity.net>
To: Michel Lespinasse <michel@lespinasse.org>
Cc: Linux-MM <linux-mm@kvack.org>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	kernel-team@fb.com, Laurent Dufour <ldufour@linux.ibm.com>,
	Jerome Glisse <jglisse@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Matthew Wilcox <willy@infradead.org>,
	Liam Howlett <liam.howlett@oracle.com>,
	Rik van Riel <riel@surriel.com>,
	Paul McKenney <paulmck@kernel.org>,
	Song Liu <songliubraving@fb.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Minchan Kim <minchan@google.com>,
	Joel Fernandes <joelaf@google.com>,
	David Rientjes <rientjes@google.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Andy Lutomirski <luto@kernel.org>
Subject: Re: [PATCH v2 00/35] Speculative page faults
Date: Wed, 23 Feb 2022 16:11:41 +0000
Message-ID: <20220223161141.GG4423@techsingularity.net>
In-Reply-To: <20220128131006.67712-1-michel@lespinasse.org>

On Fri, Jan 28, 2022 at 05:09:31AM -0800, Michel Lespinasse wrote:
> This patchset is my take on speculative page faults (spf).
> It builds on ideas that have been previously proposed by Laurent Dufour,
> Peter Zijlstra and others before. While Laurent's previous proposal
> was rejected around the time of LSF/MM 2019, I am hoping we can revisit
> this now based on what I think is a simpler and more bisectable approach,
> much improved scaling numbers in the anonymous vma case, and the Android
> use case that has since emerged. I will expand on these points towards
> the end of this message.
> 
> The patch series applies on top of linux v5.17-rc1;
> a git tree is also available:
> git fetch https://github.com/lespinasse/linux.git v5.17-rc1-spf-anon
> 
> I would like these patches to be considered for inclusion into v5.18.
> Several android vendors are using Laurent Dufour's previous SPF work into
> their kernel tree in order to improve application startup performance,
> want to converge to an upstream accepted solution, and have reported good
> numbers with previous versions of this patchset. Also, there is a broader
> interest into reducing mmap lock dependencies in critical MM paths,
> and I think this patchset would be a good first step in that direction.
> 

I think there is a serious lack of performance data here. The only
performance point offered is the Android Application Startup case.
Unfortunately, that benefit may be specific to the Zygote process that
preloads classes that may be required and listens for new applications to
start. I suspect the benefit wouldn't apply to most Linux distributions
and even JVM-based workloads are not primarily constrained by the startup
cost. Improving application startup costs is not a great justification
for this level of code complexity even though I recognise why it is a
key performance indicator for Android given that startup times affect
the user experience.

Laurent's original work was partially motivated by the performance of
a proprietary application. While I cannot replicate a full production
workload as that can only be done by the company, I could do a basic
evaluation commonly conducted on standalone systems. It was extremely
fault intensive with SPF success rates greater than 96% but almost no
change in actual performance. It's perfectly possible that the application
has changed since SPF was first proposed. The developers did spend a fair
amount of effort at making the application NUMA-aware and reusing memory
more aggressively to avoid faults. It's still very fault intensive but,
judging from the data, it does not appear to suffer due to parallel
memory operations.

On my own tests, the only preliminary test that was a clear winner
was will-it-scale using threads for the page-fault workloads and
page-fault-test for threads. To be fair, the increases there are dramatic
with a high success rate of speculative faults.

pft timings
                                 5.17.0-rc3             5.17.0-rc3
                                    vanilla        mm-spfault-v2r1
Amean     elapsed-1        32.66 (   0.00%)       32.77 *  -0.36%*
Amean     elapsed-4         9.17 (   0.00%)        8.89 *   3.07%*
Amean     elapsed-7         5.53 (   0.00%)        5.26 *   4.95%*
Amean     elapsed-12        4.13 (   0.00%)        3.50 *  15.16%*
Amean     elapsed-21        3.93 (   0.00%)        2.79 *  29.03%*
Amean     elapsed-30        4.02 (   0.00%)        2.94 *  26.79%*
Amean     elapsed-48        4.37 (   0.00%)        2.83 *  35.24%*
Amean     elapsed-79        4.13 (   0.00%)        2.17 *  47.36%*
Amean     elapsed-80        4.12 (   0.00%)        2.13 *  48.22%*

Ops SPFault Attempt                        0.00  4734439786.00
Ops SPFault Abort                          0.00     9360014.00
Ops SPFault Success                        0.00          99.80
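
For reference, the success figure above reads as a percentage rather than
a count. A minimal sketch, assuming it is derived as
(attempts - aborts) / attempts; the helper name spf_success_rate is
hypothetical and is not taken from the series or from the reporting tool:

```python
def spf_success_rate(attempts, aborts):
    """Percentage of speculative fault attempts that were not aborted.

    Assumption: success = attempts - aborts. This derivation is inferred
    from the figures in the table, not confirmed by the patch series.
    """
    if attempts == 0:
        # No speculative faults were attempted (e.g. the vanilla kernel).
        return 0.0
    return 100.0 * (attempts - aborts) / attempts


if __name__ == "__main__":
    # Figures from the mm-spfault-v2r1 column above.
    print(f"{spf_success_rate(4734439786, 9360014):.2f}")  # 99.80
```

Under that assumption the abort rate for pft is roughly 0.2%, which is why
it represents the ideal case.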

This is the ideal case for SPF but not very realistic. Interestingly,
ebizzy barely benefitted even though it's threaded because it's not
guaranteed to be address space modification intensive.

Hackbench took a performance hit of 0-5% depending on the exact
configuration and machine used. It is threaded and had high SPF abort rates
(up to 50%). It's not a great example but it shows at least one example
where SPF hurts more than it helps and there may be other applications
that are harmed by having to retry faults.

The scope of SPF is narrow relative to the much older discussion of
breaking up mmap_sem. The only time SPF benefits is when faults are racing
against parallel memory address updates holding mmap_sem for write.
That requires a threaded application that is both intense in terms of
address space updates and fault intensive. That is much narrower than
threaded applications that are address space update intensive (e.g.
using mprotect to avoid accidentally leaking data, mapping data files
for IO etc). Have we examples of realistic applications that meet all the
criteria of "threaded", "address-space intensive" and "fault intensive"
that are common enough to justify the complexity?
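
To make the three criteria concrete, the shape of such a workload can be
sketched as below. This is a synthetic illustration only, assuming Linux
and private anonymous mappings; the names worker and run_workload are
hypothetical, and the Python interpreter lock serialises much of the work,
so it shows the access pattern rather than serving as a benchmark:

```python
import mmap
import threading

PAGE = mmap.PAGESIZE


def worker(iterations, pages, counts, idx):
    """Repeatedly map, touch and unmap an anonymous region."""
    touched = 0
    for _ in range(iterations):
        # Address space update: creating a mapping modifies the process
        # address space, which requires the mmap lock for write.
        region = mmap.mmap(-1, pages * PAGE,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
        # Fault intensive: the first write to each page of a fresh
        # anonymous mapping triggers a page fault.
        for offset in range(0, pages * PAGE, PAGE):
            region[offset] = 1
            touched += 1
        # Address space update: unmapping modifies the address space again.
        region.close()
    counts[idx] = touched


def run_workload(threads=4, iterations=50, pages=16):
    """Threaded: all workers share one address space. Returns pages touched."""
    counts = [0] * threads
    workers = [
        threading.Thread(target=worker, args=(iterations, pages, counts, i))
        for i in range(threads)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return sum(counts)


if __name__ == "__main__":
    print(run_workload())
```

A workload that drops any one of the three properties (for example, one
that maps once at startup and then only faults) would not be expected to
see faults racing against address space updates.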

Admittedly, I initially just threw this series at a collection of
workloads that simply stress the allocator because it stresses faults as
a side-effect but most of them did not match the criteria for "threaded
application that is both address space update intensive and fault
intensive". I'm struggling to think of good examples although redis
is a possibility. HPC workloads like NPB parallelised with OpenMP are a
possibility but I looked at some old results and while it does trap faults,
the vast majority are related to NUMA balancing.  The other ones I normally
consider for scaling purposes are process orientated rather than threaded.

On the patches themselves, I'm not sure the optimisation for ignoring SPF
is guaranteed to work as mm_users could be temporarily elevated although
probably not enough to matter. I also think patch 5 stands on its own and
could be sent separately. For the others, I didn't read them in sufficient
depth but noted that the amount of similar logic in the speculative and
non-speculative paths could be a maintenance headache, as the speculative
and !speculative rules have to be kept in sync. I didn't see obvious
problems as such but I still think the complexity is high for a corner case.

-- 
Mel Gorman
SUSE Labs
