linux-kernel.vger.kernel.org archive mirror
* [RFC 0/6] x86: prefetch_page() vDSO call
@ 2021-02-25  7:29 Nadav Amit
  2021-02-25  7:29 ` [RFC 1/6] vdso/extable: fix calculation of base Nadav Amit
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Nadav Amit @ 2021-02-25  7:29 UTC
  To: linux-mm, linux-kernel
  Cc: Hugh Dickins, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar, Borislav Petkov, Nadav Amit, Sean Christopherson,
	Andrew Morton, x86

From: Nadav Amit <namit@vmware.com>

Just as applications can use prefetch instructions to overlap
computation and memory accesses, applications may want to overlap
page faults with computation, or overlap the I/O accesses that are
required to service page faults on different pages.

Applications can use multiple threads and cores for this purpose, by
running one thread that prefetches the data (i.e., faults in the data)
and another that does the compute, but this scheme is inefficient. Using
mincore() can tell whether a page is mapped, but might not tell whether
the page is in the page-cache and does not fault in the data.
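
To illustrate the mincore() limitation, here is a minimal userspace
sketch. The helper name page_is_resident() is made up for this example
and is not part of the series. It only queries the residency bit and
never faults the page in.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of this series).
 * Returns 1 if mincore() reports the page containing addr as resident,
 * 0 if it does not, and -1 if mincore() fails (e.g., unmapped address).
 * mincore() only reports state; it does not fault the page in.
 */
static int page_is_resident(const void *addr)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char vec = 0;
	uintptr_t start;

	assert(page_size > 0);

	/* mincore() requires a page-aligned start address. */
	start = (uintptr_t)addr & ~((uintptr_t)page_size - 1);

	if (mincore((void *)start, (size_t)page_size, &vec) != 0)
		return -1;

	/* Only the least significant bit of each vector byte is defined. */
	return vec & 1;
}
```

A caller that gets 0 back still has to touch the page and block on the
fault, which is the gap prefetch_page() is meant to close.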

Introduce a prefetch_page() vDSO call to prefetch, i.e., fault in, memory
asynchronously. The semantics of this call are: try to prefetch the page
at a given address and return zero if the page is accessible following
the call. Start I/O operations to retrieve the page if such operations
are required and there is no high memory pressure that might introduce
slowdowns.

Note that, as usual, the page might be paged out at any point.
Therefore, similarly to mincore(), there is no guarantee that the page
will be present at the time that the user application uses the data that
resides on the page. Nevertheless, it is expected that in the vast
majority of cases this would not happen, since prefetch_page()
accesses the page and therefore sets the PTE access-bit (if it is
clear).
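
From the caller's side, the semantics above could be used roughly as
follows. This is only a sketch under assumptions: the function-pointer
signature prefetch_page_fn is a guess at how the vDSO entry would look
to userspace, and prefetch_page_fallback() is a made-up stand-in that
simply touches the page synchronously so the example runs without the
vDSO call.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: hypothetical signature; zero means "page is accessible". */
typedef int (*prefetch_page_fn)(const void *addr);

/* Made-up stand-in: fault the page in synchronously and report success. */
static int prefetch_page_fallback(const void *addr)
{
	(void)*(const volatile unsigned char *)addr;
	return 0;
}

/*
 * Sum a buffer page by page. Before computing on the page at offset
 * 'off', ask for the page 'distance' pages ahead so that its fault
 * handling or I/O can overlap with the compute on the current page.
 */
static unsigned long sum_with_prefetch(const unsigned char *buf, size_t len,
				       size_t page_size, size_t distance,
				       prefetch_page_fn prefetch)
{
	unsigned long sum = 0;
	size_t off, i, end;

	assert(page_size > 0);

	for (off = 0; off < len; off += page_size) {
		size_t ahead = off + distance * page_size;

		/*
		 * The result is only a hint. On failure, or if the page is
		 * paged out again later, the access below faults as usual.
		 */
		if (ahead < len)
			(void)prefetch(buf + ahead);

		end = (off + page_size < len) ? off + page_size : len;
		for (i = off; i < end; i++)
			sum += buf[i];
	}

	return sum;
}
```

The return value is deliberately ignored here, since correctness never
depends on the prefetch succeeding. A caller could instead use a nonzero
result to reorder work and come back to that page later.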

The implementation is as follows. The vDSO code accesses the data,
triggering a page fault if it is not present. The handler detects, based
on the instruction pointer, that this is an asynchronous #PF, using the
recently introduced vDSO exception tables. If the page can be brought in
without waiting (e.g., the page is already in the page-cache), the
kernel handles the fault and returns success (zero). If there is memory
pressure that prevents the proper handling of the fault (i.e., handling
it would require heavy-weight reclamation), it returns a failure.
Otherwise, it starts an I/O to bring in the page and returns failure.
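
The three outcomes described above can be summarized with a toy
userspace model. This is not the kernel code from the series. The
struct, the function name and the -1 failure value are all assumptions
made for illustration, since the text above only fixes success as zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ASSUMPTION: only "success is zero" comes from the text; -1 is made up. */
enum { PREFETCH_SUCCESS = 0, PREFETCH_FAILURE = -1 };

/* Hypothetical snapshot of what the fault handler knows about the page. */
struct prefetch_fault {
	bool available_without_wait;	/* e.g., already in the page-cache */
	bool needs_heavy_reclaim;	/* memory pressure blocks handling */
	bool io_started;		/* output: set when I/O is kicked off */
};

/* Toy model of the decision the handler makes for an asynchronous #PF. */
static int model_prefetch_fault(struct prefetch_fault *f)
{
	assert(f != NULL);
	f->io_started = false;

	/* Case 1: no waiting needed, so handle the fault right away. */
	if (f->available_without_wait)
		return PREFETCH_SUCCESS;

	/* Case 2: heavy-weight reclaim would be needed, so give up. */
	if (f->needs_heavy_reclaim)
		return PREFETCH_FAILURE;

	/* Case 3: start the I/O but do not wait for it to complete. */
	f->io_started = true;
	return PREFETCH_FAILURE;
}
```

Note that cases 2 and 3 both return failure to the caller. The only
difference is whether I/O is now in flight, so a later retry has a
chance to succeed.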

Compilers can be extended to issue the prefetch_page() calls when
needed.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: x86@kernel.org

Nadav Amit (6):
  vdso/extable: fix calculation of base
  x86/vdso: add mask and flags to extable
  x86/vdso: introduce page_prefetch()
  mm/swap_state: respect FAULT_FLAG_RETRY_NOWAIT
  mm: use lightweight reclaim on FAULT_FLAG_RETRY_NOWAIT
  testing/selftest: test vDSO prefetch_page()

 arch/x86/Kconfig                              |   1 +
 arch/x86/entry/vdso/Makefile                  |   1 +
 arch/x86/entry/vdso/extable.c                 |  70 +++--
 arch/x86/entry/vdso/extable.h                 |  21 +-
 arch/x86/entry/vdso/vdso.lds.S                |   1 +
 arch/x86/entry/vdso/vprefetch.S               |  39 +++
 arch/x86/entry/vdso/vsgx.S                    |   9 +-
 arch/x86/include/asm/vdso.h                   |  38 ++-
 arch/x86/mm/fault.c                           |  11 +-
 lib/vdso/Kconfig                              |   5 +
 mm/memory.c                                   |  47 +++-
 mm/shmem.c                                    |   1 +
 mm/swap_state.c                               |  12 +-
 tools/testing/selftests/vDSO/Makefile         |   2 +
 .../selftests/vDSO/vdso_test_prefetch_page.c  | 265 ++++++++++++++++++
 15 files changed, 470 insertions(+), 53 deletions(-)
 create mode 100644 arch/x86/entry/vdso/vprefetch.S
 create mode 100644 tools/testing/selftests/vDSO/vdso_test_prefetch_page.c

-- 
2.25.1



end of thread, other threads:[~2021-02-28  9:22 UTC | newest]

Thread overview: 19+ messages
2021-02-25  7:29 [RFC 0/6] x86: prefetch_page() vDSO call Nadav Amit
2021-02-25  7:29 ` [RFC 1/6] vdso/extable: fix calculation of base Nadav Amit
2021-02-25 21:16   ` Sean Christopherson
2021-02-26 17:24     ` Nadav Amit
2021-02-26 17:47       ` Sean Christopherson
2021-02-28  9:20         ` Nadav Amit
2021-02-25  7:29 ` [RFC 2/6] x86/vdso: add mask and flags to extable Nadav Amit
2021-02-25  7:29 ` [RFC 3/6] x86/vdso: introduce page_prefetch() Nadav Amit
2021-02-25  7:29 ` [RFC 4/6] mm/swap_state: respect FAULT_FLAG_RETRY_NOWAIT Nadav Amit
2021-02-25  7:29 ` [RFC 5/6] mm: use lightweight reclaim on FAULT_FLAG_RETRY_NOWAIT Nadav Amit
2021-02-25  7:29 ` [PATCH 6/6] testing/selftest: test vDSO prefetch_page() Nadav Amit
2021-02-25  8:40 ` [RFC 0/6] x86: prefetch_page() vDSO call Peter Zijlstra
2021-02-25  8:52   ` Nadav Amit
2021-02-25  9:32     ` Nadav Amit
2021-02-25  9:55       ` Peter Zijlstra
2021-02-25 12:16 ` Matthew Wilcox
2021-02-25 16:56   ` Nadav Amit
2021-02-25 17:32     ` Matthew Wilcox
2021-02-25 17:53       ` Nadav Amit
