linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: Skip opportunistic reclaim for dma pinned pages
From: Chris Wilson @ 2020-06-24 19:14 UTC
  To: linux-mm
  Cc: linux-kernel, intel-gfx, Chris Wilson, Andrew Morton, Jan Kara,
	Jérôme Glisse, John Hubbard, Claudio Imbrenda,
	Kirill A . Shutemov, Jason Gunthorpe

A general rule of thumb is that shrinkers should be fast and effective.
They are called from direct reclaim at the most inconvenient of times,
when the caller is waiting for a page. If we attempt to reclaim a page
being pinned for active dma [pin_user_pages()], we will incur far greater
latency than for a normal anonymous page mapped multiple times. Worse,
the page may be in use indefinitely by the HW and unable to be reclaimed
in a timely manner.

A side effect of the LRU shrinker not being dma aware is that we will
often attempt to perform direct reclaim on the persistent group of dma
pages while continuing to use the dma HW (an issue as the HW may already
be actively waiting for the next user request), and even attempt to
reclaim a partially allocated dma object in order to satisfy pinning
the next user page for that object.

We expect such pages to be made available for reclaim at the end of the
dma operation [unpin_user_pages()], and truly longterm pins to be
proactively recovered via device-specific shrinkers [i.e. stop the HW,
allow the pages to be returned to the system, and then compete again
for the memory].

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
---
This seems perhaps a little devious and overzealous. Is there a more
appropriate TTU flag? Would there be a way to limit its effect to, say,
FOLL_LONGTERM? Doing the migration first would seem to be sensible if
we disable opportunistic migration for the duration of the pin.
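
For readers unfamiliar with the check used in the hunk below, here is a
rough userspace model of how the "maybe pinned" heuristic behaves and why
it is only a "maybe". It is a sketch, not kernel code: the model_* names
and MODEL_PIN_BIAS are made up for illustration, and it covers only the
simple case where a pin is recorded by adding a large bias (1 << 10,
mirroring GUP_PIN_COUNTING_BIAS) to the page refcount. The separate pin
counter used for larger compound pages is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. The bias mirrors the kernel's
 * GUP_PIN_COUNTING_BIAS; every other name here is hypothetical.
 */
#define MODEL_PIN_BIAS (1U << 10)

struct model_page {
	unsigned int refcount;	/* stands in for the struct page refcount */
};

/* Model of pin_user_pages(): each pin adds the bias to the refcount. */
static void model_pin(struct model_page *p)
{
	p->refcount += MODEL_PIN_BIAS;
}

/* Model of unpin_user_pages(): drop the bias taken by model_pin(). */
static void model_unpin(struct model_page *p)
{
	assert(p->refcount >= MODEL_PIN_BIAS);
	p->refcount -= MODEL_PIN_BIAS;
}

/*
 * Model of the heuristic: any refcount at or above the bias is reported
 * as pinned. A page with that many ordinary references is therefore a
 * false positive, hence "maybe".
 */
static bool model_maybe_dma_pinned(const struct model_page *p)
{
	return p->refcount >= MODEL_PIN_BIAS;
}

/*
 * Model of the proposed early return: true means "leave this page
 * alone and move on" instead of paying to revoke the DMA access.
 */
static bool model_reclaim_should_skip(const struct model_page *p)
{
	return model_maybe_dma_pinned(p);
}
```

In this model a page with one ordinary reference is a reclaim candidate,
is skipped between model_pin() and model_unpin(), and becomes a candidate
again afterwards. A page holding 1024 ordinary references would also be
skipped, which is the false-positive cost of the heuristic.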
---
 mm/rmap.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/rmap.c b/mm/rmap.c
index 5fe2dedce1fc..374c6e65551b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1393,6 +1393,22 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	    is_zone_device_page(page) && !is_device_private_page(page))
 		return true;
 
+	/*
+	 * Try and fail early to revoke a costly DMA pinned page.
+	 *
+	 * Reclaiming an active DMA page requires stopping the hardware
+	 * and flushing access. [Hardware that does support pagefaulting,
+	 * and so can quickly revoke DMA pages at any time, does not need
+	 * to pin the DMA page.] At worst, the page may be indefinitely in
+	 * use by the hardware. Even at best it will take far longer to
+	 * revoke the access via the mmu notifier, forcing that latency
+	 * onto our callers rather than the consumer of the HW. As we are
+	 * called during opportunistic direct reclaim, declare the
+	 * opportunity cost too high and ignore the page.
+	 */
+	if (page_maybe_dma_pinned(page))
+		return true;
+
 	if (flags & TTU_SPLIT_HUGE_PMD) {
 		split_huge_pmd_address(vma, address,
 				flags & TTU_SPLIT_FREEZE, page);
-- 
2.20.1



Thread overview: 14+ messages
2020-06-24 19:14 [PATCH] mm: Skip opportunistic reclaim for dma pinned pages Chris Wilson
2020-06-24 19:21 ` Jason Gunthorpe
2020-06-24 20:23   ` Yang Shi
2020-06-24 21:02     ` Yang Shi
2020-06-24 20:47   ` John Hubbard
2020-06-24 23:20     ` Jason Gunthorpe
2020-06-25  0:11       ` John Hubbard
2020-06-25 11:24         ` Jan Kara
2020-06-25  7:57 ` Michal Hocko
     [not found]   ` <159308284703.4527.16058577374955415124@build.alporthouse.com>
2020-06-25 15:12     ` Michal Hocko
2020-06-25 11:42 ` Matthew Wilcox
2020-06-25 13:40   ` Jan Kara
2020-06-25 16:05     ` Matthew Wilcox
2020-06-25 16:32   ` Yang Shi
