linux-rdma.vger.kernel.org archive mirror
From: Dan Williams <dan.j.williams@intel.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>, Jason Gunthorpe <jgg@ziepe.ca>,
	John Hubbard <john.hubbard@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>,
	Christopher Lameter <cl@linux.com>, Jan Kara <jack@suse.cz>,
	Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()
Date: Mon, 18 Jun 2018 12:21:46 -0700
Message-ID: <CAPcyv4iRBzmwWn_9zDvqdfVmTZL_Gn7uA_26A1T-kJib=84tvA@mail.gmail.com>
In-Reply-To: <3898ef6b-2fa0-e852-a9ac-d904b47320d5@nvidia.com>

On Mon, Jun 18, 2018 at 11:14 AM, John Hubbard <jhubbard@nvidia.com> wrote:
> On 06/18/2018 10:56 AM, Dan Williams wrote:
>> On Mon, Jun 18, 2018 at 10:50 AM, John Hubbard <jhubbard@nvidia.com> wrote:
>>> On 06/18/2018 01:12 AM, Christoph Hellwig wrote:
>>>> On Sun, Jun 17, 2018 at 01:28:18PM -0700, John Hubbard wrote:
>>>>> Yes. However, my thinking was: get_user_pages() can become a way to indicate that
>>>>> these pages are going to be treated specially. In particular, the caller
>>>>> does not really want or need to support certain file operations, while the
>>>>> page is flagged this way.
>>>>>
>>>>> If necessary, we could add a new API call.
>>>>
>>>> That API call is called get_user_pages_longterm.
>>>
>>> OK...I had the impression that this was just a semi-temporary API for dax, but
>>> given that it's an exported symbol, I guess it really is here to stay.
>>
>> The plan is to go back and provide API changes that bypass
>> get_user_pages_longterm() for RDMA. However, for VFIO and others, it's
>> not clear what we could do. In the VFIO case the guest would need to
>> be prepared to handle the revocation.
>
> OK, let's see if I understand that plan correctly:
>
> 1. Change RDMA users (this could be done entirely in the various device drivers'
> code, unless I'm overlooking something) to use mmu notifiers, and to do their
> DMA to/from non-pinned pages.

The problem with this approach is surprising the RDMA drivers with
notifications of teardowns. It's the RDMA userspace applications that
need the notification, and it likely needs to be explicit opt-in, at
least for the non-ODP drivers.

> 2. Return early from get_user_pages_longterm, if the memory is...marked for
> RDMA? (How? Same sort of page flag that I'm floating here, or something else?)
> That would avoid the problem with pinned pages getting their buffer heads
> removed--by disallowing the pinning. Makes sense.

Well, right now the RDMA workaround is DAX specific and it seems we
need to generalize it for the page-cache case. One thought is to have
try_to_unmap() take its own reference and wait for the page reference
count to drop to one so that the truncate path knows the page is
dma-idle and disconnected from the page cache, but I have not looked
at the details.
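
To make the reference-count idea above a bit more concrete, here is a
rough userspace model. It is not kernel code and not a proposed patch:
struct toy_page, toy_get_page(), toy_put_page() and toy_wait_dma_idle()
are hypothetical names, only the reference count is modeled, and the
bounded poll stands in for whatever sleep/wakeup mechanism a real
implementation would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct toy_page {
	atomic_int refcount;
};

/* Take one reference on the toy page. */
static void toy_get_page(struct toy_page *p)
{
	atomic_fetch_add(&p->refcount, 1);
}

/* Drop one reference; the count must have been positive. */
static void toy_put_page(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
}

/*
 * Called by the unmap side after it has taken its own reference.
 * Polls up to max_polls times and returns true once the caller's
 * reference is the only one left, i.e. nobody else (DMA, page cache)
 * still holds the page. Returns false if other holders remain.
 */
static bool toy_wait_dma_idle(struct toy_page *p, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&p->refcount) == 1)
			return true;
		/* A real implementation would sleep or wait here. */
	}
	return false;
}
```

In this model the unmap side calls toy_get_page() first, and
toy_wait_dma_idle() only succeeds after every other holder has called
toy_put_page(). What the truncate path should do when the wait does not
succeed is exactly the part I have not looked at.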

> Also, is there anything I can help with here, so that things can happen sooner?

I do think we should explore a page flag for pages that are "long
term" pinned. Michal asked for something along these lines at LSF/MM
so that the core-mm can give up on pages over which the kernel has lost
lifetime control. Michal, did I capture your ask correctly?

