From: Dave Chinner <david@fromorbit.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>,
	Christopher Lameter <cl@linux.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 4/6] mm: introduce page->dma_pinned_flags, _count
Date: Sun, 14 Oct 2018 10:01:24 +1100
Message-ID: <20181013230124.GB18822@dastard>
In-Reply-To: <7c2e3b54-0b1d-6726-a508-804ef8620cfd@nvidia.com>

On Sat, Oct 13, 2018 at 12:34:12AM -0700, John Hubbard wrote:
> On 10/12/18 8:55 PM, Dave Chinner wrote:
> > On Thu, Oct 11, 2018 at 11:00:12PM -0700, john.hubbard@gmail.com wrote:
> >> From: John Hubbard <jhubbard@nvidia.com>
> [...]
> >> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> >> index 5ed8f6292a53..017ab82e36ca 100644
> >> --- a/include/linux/mm_types.h
> >> +++ b/include/linux/mm_types.h
> >> @@ -78,12 +78,22 @@ struct page {
> >>  	 */
> >>  	union {
> >>  		struct {	/* Page cache and anonymous pages */
> >> -			/**
> >> -			 * @lru: Pageout list, eg. active_list protected by
> >> -			 * zone_lru_lock.  Sometimes used as a generic list
> >> -			 * by the page owner.
> >> -			 */
> >> -			struct list_head lru;
> >> +			union {
> >> +				/**
> >> +				 * @lru: Pageout list, eg. active_list protected
> >> +				 * by zone_lru_lock.  Sometimes used as a
> >> +				 * generic list by the page owner.
> >> +				 */
> >> +				struct list_head lru;
> >> +				/* Used by get_user_pages*(). Pages may not be
> >> +				 * on an LRU while these dma_pinned_* fields
> >> +				 * are in use.
> >> +				 */
> >> +				struct {
> >> +					unsigned long dma_pinned_flags;
> >> +					atomic_t      dma_pinned_count;
> >> +				};
> >> +			};
> > 
> > Isn't this broken for mapped file-backed pages? i.e. they may be
> > passed as the user buffer to read/write direct IO and so the pages
> > passed to gup will be on the active/inactive LRUs. Hence I can't see
> > how you can have dual use of the LRU list head like this....
> > 
> > What am I missing here?
> 
> Hi Dave,
> 
> In patch 6/6, pin_page_for_dma(), which is called at the end of get_user_pages(),
> unceremoniously rips the pages out of the LRU, as a prerequisite to using
> either of the page->dma_pinned_* fields. 

How is that safe? If you've ripped the page out of the LRU, it's no
longer being tracked by the page cache aging and reclaim algorithms.
Patch 6 doesn't appear to put these pages back in the LRU, either,
so it looks to me like this just dumps them on the ground after the
gup reference is dropped.  How do we reclaim these page cache pages
when there is memory pressure if they aren't in the LRU?
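
To make the concern concrete, here is a minimal userspace sketch of the
aliasing in the quoted hunk. It is not kernel code: struct model_page,
model_pin() and both model_unpin_*() helpers are hypothetical names made
up for illustration, a plain int stands in for atomic_t, and the list
helpers are simplified stand-ins for the kernel's list API. It only
models the two possible outcomes once the pin is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical model of the union in the quoted hunk. */
struct model_page {
	union {
		struct list_head lru;
		struct {
			unsigned long dma_pinned_flags;
			int dma_pinned_count;	/* atomic_t in the patch */
		};
	};
};

/* Both views start at the same offset, so they share storage. */
static_assert(offsetof(struct model_page, lru) ==
	      offsetof(struct model_page, dma_pinned_flags),
	      "lru and dma_pinned_flags alias each other");

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

static bool list_contains(const struct list_head *head,
			  const struct list_head *entry)
{
	const struct list_head *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		if (pos == entry)
			return true;
	return false;
}

/* Pin: unlink from the LRU first, then reuse the storage for pin state. */
static void model_pin(struct model_page *page)
{
	list_del(&page->lru);
	page->dma_pinned_flags = 1;
	page->dma_pinned_count = 1;
}

/* Unpin without relinking: the page stays invisible to the LRU walker. */
static void model_unpin_no_relink(struct model_page *page)
{
	if (--page->dma_pinned_count == 0)
		page->dma_pinned_flags = 0;
}

/* Unpin with relinking: the last unpin hands the page back to the LRU. */
static void model_unpin_relink(struct model_page *page,
			       struct list_head *lru)
{
	if (--page->dma_pinned_count == 0) {
		page->dma_pinned_flags = 0;
		list_add(&page->lru, lru);
	}
}
```

In this model, a page that goes through model_pin() followed by
model_unpin_no_relink() can no longer be found by walking the list head,
which is the "dumped on the ground" case. Only the model_unpin_relink()
variant makes it reachable again.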

> The idea is that LRU is not especially useful for this situation anyway,
> so we'll just make it one or the other: either a page is dma-pinned, and
> just hanging out doing RDMA most likely (and LRU is less meaningful during that
> time), or it's possibly on an LRU list.

gup isn't just used for RDMA. It's used by direct IO in far, far
more situations and machines than RDMA is. Please explain why
ripping pages out of the LRU and not putting them back is safe, has
no side effects, doesn't adversely impact page cache reclaim, etc.
Indeed, I'd love to see a description of all the page references and
where they come and go so we know the changes aren't just leaking
these pages until the filesystem invalidates them at unmount.

Maybe I'm just not seeing why this is safe yet, but since you haven't
explained why it is safe, the patch descriptions are, at minimum,
incomplete.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
