From: John Hubbard <jhubbard@nvidia.com>
To: Christopher Lameter <cl@linux.com>, john.hubbard@gmail.com
Cc: Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH 0/2] mm: gup: don't unmap or drop filesystem buffers
Date: Sun, 17 Jun 2018 15:23:14 -0700
Message-ID: <4708f5be-1829-3a20-8fad-5a445d18aa84@nvidia.com>
In-Reply-To: <010001640fbe0dd8-f999e7f6-7b6e-4deb-b073-0c572006727d-000000@email.amazonses.com>

On 06/17/2018 02:54 PM, Christopher Lameter wrote:
> On Sat, 16 Jun 2018, john.hubbard@gmail.com wrote:
> 
>> I've come up with what I claim is a simple, robust fix, but...I'm
>> presuming to burn a struct page flag, and limit it to 64-bit arches, in
>> order to get there. Given that the problem is old (Jason Gunthorpe noted
>> that RDMA has been living with this problem since 2005), I think it's
>> worth it.
>>
>> Leaving the new page flag set "nearly forever" is not great, but on the
>> other hand, once the page is actually freed, the flag does get cleared.
>> It seems like an acceptable tradeoff, given that we only get one bit
>> (and are lucky to even have that).
> 
> This is not robust. Multiple processes may register a page with the RDMA
> subsystem. How do you decide when to clear the flag? I think you would
> need an additional refcount for the number of times the page was
> registered.

Effectively, page->_refcount is what does that here. A separate reference
count would be a nice, but not strictly required, optimization. That's
because the new page flag gets cleared when the page is fully freed. So unless
we're dealing with pages that never get freed, it's functional, right?

Each of those multiple processes also wants protection from the ravages
of try_to_unmap() and drop_buffers(), anyway. Having said that, it would
be nice to have that refcount, but it seems hard to get one.
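
To make the lifecycle concrete, here is a minimal userspace sketch of the
behavior described above. It is not kernel code: struct model_page and the
model_*() helpers are hypothetical names invented for illustration, and the
flag name PG_dma_pinned is taken from the subject of patch 2/2 in this
series. The sketch assumes the flag is set on every pin and cleared only
when the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; this is NOT the kernel's struct page. */
struct model_page {
	int refcount;     /* stands in for page->_refcount */
	bool dma_pinned;  /* stands in for the proposed PG_dma_pinned flag */
};

/* Model of a get_user_pages*() pin: take a reference and set the flag. */
static void model_pin(struct model_page *p)
{
	p->refcount++;
	p->dma_pinned = true;
}

/* Model of dropping a reference: the flag is cleared only on final free. */
static void model_put(struct model_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->dma_pinned = false;
}

/* Model of the check try_to_unmap()/drop_buffers() would make (assumed). */
static bool model_may_unmap(const struct model_page *p)
{
	return !p->dma_pinned;
}
```

With one base reference and two registrations, the page stays protected
after both registrants drop their references, and only becomes unmappable
again once the base reference goes away too. That is the "nearly forever"
tradeoff: correct for multiple registrants, but coarser than a dedicated
pin count would be.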

> 
> I still think the cleanest solution here is to require mmu notifier
> callbacks and to not pin the page in the first place. If a NIC does not
> support a hardware mmu then it can still simulate it in software by
> holding off the unmapping in the mmu notifier callback until any pending
> operation is complete and then invalidate the mapping so that future
> operations require a remapping (or refaulting).
> 

Interesting. I didn't want a solution that only supported the few devices
that can support their own replayable page faulting, so I was sort of putting
the mmu notifier idea on the back burner. But somehow I missed the
idea of just holding off the invalidation, in the MMU notifier callback, so
that it works for non-page-faultable hardware. On one hand, it's wild to hold
off the invalidation, perhaps for a long time, but on the other hand you get
behavior that the hardware cannot otherwise provide: access to non-pinned memory.
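
As a rough illustration of the "hold off the invalidation" idea, here is a
single-threaded userspace state machine. Everything in it is hypothetical
(struct model_mapping and the model_*() helpers are made up for this
sketch, and none of it is the mmu notifier API). A real callback would
presumably sleep until in-flight operations drain; this model just records
that the invalidation is pending and completes it when the last operation
ends.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one device mapping; not a kernel structure. */
struct model_mapping {
	bool valid;               /* device may use the mapping */
	bool invalidate_pending;  /* an invalidation is being held off */
	int inflight;             /* operations currently using the mapping */
};

/* Start a device operation; fails if a remap (refault) is needed first. */
static bool model_op_begin(struct model_mapping *m)
{
	if (!m->valid || m->invalidate_pending)
		return false;
	m->inflight++;
	return true;
}

/* Finish an operation; the last one out completes a held-off invalidation. */
static void model_op_end(struct model_mapping *m)
{
	assert(m->inflight > 0);
	if (--m->inflight == 0 && m->invalidate_pending) {
		m->valid = false;
		m->invalidate_pending = false;
	}
}

/*
 * Model of the notifier callback. Returns true if the mapping was
 * invalidated immediately, false if the invalidation had to be held off
 * because operations are still pending.
 */
static bool model_invalidate(struct model_mapping *m)
{
	if (m->inflight == 0) {
		m->valid = false;
		return true;
	}
	m->invalidate_pending = true;
	return false;
}

/* Re-establish the mapping after an invalidation (the "refault" step). */
static void model_remap(struct model_mapping *m)
{
	assert(m->inflight == 0);
	m->valid = true;
	m->invalidate_pending = false;
}
```

The point of the sketch is the ordering: new operations are refused as
soon as an invalidation is requested, the invalidation itself completes
only after pending operations finish, and later operations have to remap
before they can proceed.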

I know this was brought up before. Definitely would like to hear more 
opinions and brainstorming here.

thanks,
-- 
John Hubbard
NVIDIA
