From: Jan Kara <jack@suse.cz>
To: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>, John Hubbard <jhubbard@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>, Jason Gunthorpe <jgg@ziepe.ca>,
	John Hubbard <john.hubbard@gmail.com>,
	Michal Hocko <mhocko@kernel.org>,
	Christopher Lameter <cl@linux.com>, Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()
Date: Tue, 19 Jun 2018 12:41:42 +0200
Message-ID: <20180619104142.lpilc6esz7w3a54i@quack2.suse.cz>
In-Reply-To: <20180619090255.GA25522@bombadil.infradead.org>

On Tue 19-06-18 02:02:55, Matthew Wilcox wrote:
> On Tue, Jun 19, 2018 at 10:29:49AM +0200, Jan Kara wrote:
> > And for record, the problem with page cache pages is not only that
> > try_to_unmap() may unmap them. It is also that page_mkclean() can
> > write-protect them. And once PTEs are write-protected filesystems may end
> > up doing bad things if DMA then modifies the page contents (DIF/DIX
> > failures, data corruption, oopses). As such I don't think that solutions
> > based on page reference count have a big chance of dealing with the
> > problem.
> > 
> > And your page flag approach would also need to take page_mkclean() into
> > account. And there the issue is that until the flag is cleared (i.e., we
> > are sure there are no writers using references from GUP) you cannot
> > writeback the page safely which does not work well with your idea of
> > clearing the flag only once the page is evicted from page cache (hint, page
> > cache page cannot get evicted until it is written back).
> > 
> > So as sad as it is, I don't see an easy solution here.
> 
> Pages which are "got" don't need to be on the LRU list.  They'll be
> marked dirty when they're put, so we can use page->lru for fun things
> like a "got" refcount.  If we use bit 1 of page->lru for PageGot, we've
> got 30/62 bits in the first word and a full 64 bits in the second word.

Interesting idea! It would destroy the aging information for the page, but
for pages accessed through GUP references that is a rather vague concept
anyway. It might be a bit tricky, as pulling a page out of the LRU requires
the page lock, but I don't think that's a huge problem. And page cache pages
that are not on the LRU already exist today while they are under reclaim, so
hopefully there won't be too many places in MM that would need fixing up for
such pages.
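
To make the bit layout concrete, here is a minimal userspace sketch of the
"got" refcount idea, not kernel code. It models the two words of page->lru
as a plain struct, uses bit 1 of the first word as the PageGot flag and the
bits above it as the count. All names (struct lru_words, got_get(),
got_put(), ...) are hypothetical, and leaving bit 0 untouched is my
assumption about why the count has 30/62 bits rather than 31/63. A real
implementation would also need atomic updates, which this sketch ignores.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace stand-in for the two words of page->lru (hypothetical). */
struct lru_words {
	unsigned long first;	/* bit 0: left alone, bit 1: "got" flag, rest: count */
	unsigned long second;	/* a full spare word, unused in this sketch */
};

#define GOT_FLAG	(1UL << 1)
#define GOT_SHIFT	2
#define GOT_MAX		(ULONG_MAX >> GOT_SHIFT)	/* 30 or 62 bits of count */

/* True while at least one "got" reference is outstanding. */
static bool page_got(const struct lru_words *w)
{
	return (w->first & GOT_FLAG) != 0;
}

/* Number of outstanding "got" references. */
static unsigned long got_count(const struct lru_words *w)
{
	return w->first >> GOT_SHIFT;
}

/* Take a reference: bump the count and make sure the flag is set. */
static void got_get(struct lru_words *w)
{
	assert(got_count(w) < GOT_MAX);
	w->first = (w->first + (1UL << GOT_SHIFT)) | GOT_FLAG;
}

/*
 * Drop a reference. Returns true when the last reference went away, which
 * is the point where the page could be marked dirty and put back on the LRU.
 */
static bool got_put(struct lru_words *w)
{
	assert(page_got(w) && got_count(w) > 0);
	w->first -= 1UL << GOT_SHIFT;
	if (got_count(w) == 0) {
		w->first &= ~GOT_FLAG;
		return true;
	}
	return false;
}
```

Since the count only ever changes in steps of 1UL << GOT_SHIFT, the two low
bits are never disturbed by get/put, so the flag and the count can share the
word.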

I'm also still pondering the idea of inserting a "virtual" VMA into the VMA
interval tree in the inode, as the GUP references are IMHO closest to an
mlocked mapping, and that would achieve all the functionality we need as
well. I just haven't had time to experiment with it.

And then there's the aspect that both of these approaches are a bit too
heavyweight for some get_user_pages_fast() users (e.g. direct IO). Al Viro
had an idea to use the page lock for that path, but e.g. fs/direct-io.c would
have problems due to lock ordering constraints (the filesystem's ->get_block
would suddenly get called with the page lock held). But we can probably leave
performance optimizations for phase two.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
