From: Jan Kara <jack@suse.cz>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Jan Kara <jack@suse.cz>, Jason Gunthorpe <jgg@ziepe.ca>,
	Michal Hocko <mhocko@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	John Hubbard <john.hubbard@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christopher Lameter <cl@linux.com>, Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-rdma <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()
Date: Thu, 28 Jun 2018 11:17:43 +0200
Message-ID: <20180628091743.khhta7nafuwstd3m@quack2.suse.cz>
In-Reply-To: <1f6e79c5-5801-16d2-18a6-66bd0712b5b8@nvidia.com>

On Wed 27-06-18 19:42:01, John Hubbard wrote:
> On 06/27/2018 10:02 AM, Jan Kara wrote:
> > On Wed 27-06-18 08:57:18, Jason Gunthorpe wrote:
> >> On Wed, Jun 27, 2018 at 02:42:55PM +0200, Jan Kara wrote:
> >>> On Wed 27-06-18 13:59:27, Michal Hocko wrote:
> >>>> On Wed 27-06-18 13:53:49, Jan Kara wrote:
> >>>>> On Wed 27-06-18 13:32:21, Michal Hocko wrote:
> >>>> [...]
> >>>>>> Apart from that, do we really care about 32b here? Big DIO, IB users
> >>>>>> seem to be 64b only AFAIU.
> >>>>>
> >>>>> IMO it is a bad habit to leave unprivileged-user-triggerable oops in the
> >>>>> kernel even for uncommon platforms...
> >>>>
> >>>> Absolutely agreed! I didn't mean to keep the blow up for 32b. I just
> >>>> wanted to say that we can stay with a simple solution for 32b. I thought
> >>>> the g-u-p-longterm has plugged the most obvious breakage already. But
> >>>> maybe I just misunderstood.
> >>>
> >>> Most yes, but if you try hard enough, you can still trigger the oops e.g.
> >>> with appropriately set up direct IO when racing with writeback / reclaim.
> >>
> >> gup longterm is only different from normal gup if you have DAX and few
> >> people do, which really means it doesn't help at all.. AFAIK??
> > 
> > Right, what I wrote works only for DAX. For the non-DAX situation, g-u-p
> > longterm does not currently help at all. Sorry for the confusion.
> > 
> 
> OK, I've got an early version of this up and running, reusing the page->lru
> fields. I'll clean it up and do some heavier testing, and post as a PATCH v2.

Cool.

> One question though: I'm still vague on the best actions to take in the
> following functions:
> 
>     page_mkclean_one
>     try_to_unmap_one
> 
> At the moment, they are both just doing an evil little early-out:
> 
> 	if (PageDmaPinned(page))
> 		return false;
> 
> ...but we talked about maybe waiting for the condition to clear, instead?
> Thoughts?

What needs to happen in page_mkclean() depends on the caller. Most of the
callers really need to be sure the page is write-protected once
page_mkclean() returns. Those are:

  pagecache_isize_extended()
  fb_deferred_io_work()
  clear_page_dirty_for_io() if called for data-integrity writeback. Whether
    that is the case is currently known only in its caller (e.g.
    write_cache_pages()), where it can be determined as
    wbc->sync_mode == WB_SYNC_ALL. Getting this information into
    page_mkclean() will require some plumbing, and
    clear_page_dirty_for_io() has some 50 callers, but it's doable.

clear_page_dirty_for_io() for cleaning writeback (wbc->sync_mode !=
WB_SYNC_ALL) can just skip pinned pages, and we probably need to do that, as
otherwise memory cleaning would get stuck on pinned pages until RDMA
drivers release their pins.
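
To make the split between the callers concrete, here is a rough userspace
sketch of the decision, not kernel code. The enum mkclean_caller, enum
mkclean_action and mkclean_policy() names are made up for illustration, and
"wait for the pin to go away" is only one of the options we discussed for
the callers that need the write-protect guarantee:

```c
#include <stdbool.h>

/* Mirrors the two writeback sync modes referred to above. */
enum wb_sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

/* Hypothetical classification of page_mkclean() callers. */
enum mkclean_caller {
	CALLER_ISIZE_EXTENDED,	/* pagecache_isize_extended() */
	CALLER_FB_DEFERRED_IO,	/* fb_deferred_io_work() */
	CALLER_WRITEBACK,	/* clear_page_dirty_for_io() */
};

/* Hypothetical outcomes; none of these exist in the kernel. */
enum mkclean_action {
	MKCLEAN_PROTECT,	/* page not pinned: write-protect as usual */
	MKCLEAN_WAIT_FOR_UNPIN,	/* pinned, caller needs the guarantee (assumption) */
	MKCLEAN_SKIP,		/* pinned, cleaning writeback: leave the page alone */
};

static enum mkclean_action mkclean_policy(enum mkclean_caller caller,
					  enum wb_sync_mode sync_mode,
					  bool dma_pinned)
{
	if (!dma_pinned)
		return MKCLEAN_PROTECT;

	/* Cleaning writeback must not get stuck behind long-lived pins. */
	if (caller == CALLER_WRITEBACK && sync_mode != WB_SYNC_ALL)
		return MKCLEAN_SKIP;

	/* Everyone else relies on the page being write-protected on return. */
	return MKCLEAN_WAIT_FOR_UNPIN;
}
```

So mkclean_policy(CALLER_WRITEBACK, WB_SYNC_NONE, true) skips the page,
while the same call with WB_SYNC_ALL does not. The sync_mode argument is
the piece of information that would have to be plumbed through.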

> And if so, does it sound reasonable to refactor wait_on_page_bit_common(),
> so that it learns how to wait for a bit that, while inside struct page, is
> not within page->flags?

wait_on_page_bit_common() and the associated wait queue handling is a fast
path, so we should not make it slower for such a special case as waiting
for a DMA pin. OTOH we could probably refactor most of the infrastructure to
take a pointer to the word with flags instead of a pointer to the page. I'm
not sure what the result will look like, but it's probably worth a try.
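
Just to show the shape of the interface I have in mind, below is a toy
userspace model. Everything in it is an assumption for illustration:
struct toy_page, the bit numbers and the function names are invented, and
the bounded polling loop only stands in for the real hashed wait queues.
The point is that the core helper sees a word and a bit, and the
page-specific wrappers only pick which word to pass:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Toy stand-in for struct page: the usual flags word plus a second word
 * (think of the reused page->lru space) that holds the pin bit.
 */
struct toy_page {
	atomic_ulong flags;
	atomic_ulong extra;	/* hypothetical home of the pin bit */
};

#define TOY_PG_LOCKED		0	/* bit number in ->flags */
#define TOY_PG_DMA_PINNED	0	/* bit number in ->extra (assumption) */

/*
 * Core helper: knows only about a word and a bit, nothing about pages.
 * Polls at most max_polls times and returns true once the bit is clear.
 */
static bool wait_on_word_bit(atomic_ulong *word, int bit,
			     unsigned int max_polls)
{
	unsigned long mask = 1UL << bit;

	for (unsigned int i = 0; i < max_polls; i++) {
		if (!(atomic_load(word) & mask))
			return true;
	}
	return false;
}

/* Existing style of user: the bit lives in page->flags. */
static bool toy_wait_on_page_locked(struct toy_page *page,
				    unsigned int max_polls)
{
	return wait_on_word_bit(&page->flags, TOY_PG_LOCKED, max_polls);
}

/* New user: same helper, different word inside the page. */
static bool toy_wait_on_page_dma_pinned(struct toy_page *page,
					unsigned int max_polls)
{
	return wait_on_word_bit(&page->extra, TOY_PG_DMA_PINNED, max_polls);
}
```

In this model the page->flags users keep calling their wrapper unchanged,
and only the wrapper knows that the pin bit lives elsewhere. Whether the
real fast path can be restructured like this without a cost is exactly
what would need to be measured.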

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
