Linux-Fsdevel Archive on lore.kernel.org
From: Jerome Glisse <jglisse@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Jan Kara <jack@suse.cz>, Matthew Wilcox <willy@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Dan Williams <dan.j.williams@intel.com>,
	John Hubbard <john.hubbard@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	tom@talpey.com, Al Viro <viro@zeniv.linux.org.uk>,
	benve@cisco.com, Christoph Hellwig <hch@infradead.org>,
	Christopher Lameter <cl@linux.com>,
	"Dalessandro, Dennis" <dennis.dalessandro@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Michal Hocko <mhocko@kernel.org>,
	mike.marciniszyn@intel.com, rcampbell@nvidia.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
Date: Thu, 20 Dec 2018 11:50:31 -0500
Message-ID: <20181220165030.GC3963@redhat.com> (raw)
In-Reply-To: <8e98d553-7675-8fa1-3a60-4211fc836ed9@nvidia.com>

On Thu, Dec 20, 2018 at 02:54:49AM -0800, John Hubbard wrote:
> On 12/19/18 3:08 AM, Jan Kara wrote:
> > On Tue 18-12-18 21:07:24, Jerome Glisse wrote:
> >> On Tue, Dec 18, 2018 at 03:29:34PM -0800, John Hubbard wrote:
> >>> OK, so let's take another look at Jerome's _mapcount idea all by itself (using
> >>> *only* the tracking pinned pages aspect), given that it is the lightest weight
> >>> solution for that.  
> >>>
> >>> So as I understand it, this would use page->_mapcount to store both the real
> >>> mapcount, and the dma pinned count (simply added together), but only do so for
> >>> file-backed (non-anonymous) pages:
> >>>
> >>>
> >>> __get_user_pages()
> >>> {
> >>> 	...
> >>> 	get_page(page);
> >>>
> >>> 	if (!PageAnon(page))
> >>> 		atomic_inc(&page->_mapcount);
> >>> 	...
> >>> }
> >>>
> >>> put_user_page(struct page *page)
> >>> {
> >>> 	...
> >>> 	if (!PageAnon(page))
> >>> 		atomic_dec(&page->_mapcount);
> >>>
> >>> 	put_page(page);
> >>> 	...
> >>> }
> >>>
> >>> ...and then in the various consumers of the DMA pinned count, we use page_mapped(page)
> >>> to see if any mapcount remains, and if so, we treat it as DMA pinned. Is that what you 
> >>> had in mind?
> >>
> >> Mostly, with the extra two observations:
> >>     [1] We only need to know the pin count when a write back kicks in
> >>     [2] We need to protect GUP code with wait_for_write_back() in case
> >>         GUP is racing with a write back that might not see the
> >>         elevated mapcount in time.
> >>
> >> So for [2]
> >>
> >> __get_user_pages()
> >> {
> >>     get_page(page);
> >>
> >>     if (!PageAnon(page)) {
> >>         atomic_inc(&page->_mapcount);
> >> +       if (PageWriteback(page)) {
> >> +           // Assume we are racing and the current write back will not see
> >> +           // the elevated mapcount, so wait for the current write back and
> >> +           // force a page fault
> >> +           wait_on_page_writeback(page);
> >> +           // force the slow path, which will fault again
> >> +       }
> >>     }
> >> }
> > 
> > This is not needed AFAICT. __get_user_pages() gets a page reference (and it
> > should also increment page->_mapcount) under the PTE lock. So at that point we
> > are sure we have a writeable PTE nobody can change. So page_mkclean() has to
> > block on the PTE lock to make the PTE read-only, and only after going through
> > all PTEs like this can it check page->_mapcount. So the PTE lock provides
> > enough synchronization.
> > 
> >> For [1] only needing pin count during write back turns page_mkclean into
> >> the perfect spot to check for that so:
> >>
> >> int page_mkclean(struct page *page)
> >> {
> >>     int cleaned = 0;
> >> +   int real_mapcount = 0;
> >>     struct address_space *mapping;
> >>     struct rmap_walk_control rwc = {
> >>         .arg = (void *)&cleaned,
> >>         .rmap_one = page_mkclean_one,
> >>         .invalid_vma = invalid_mkclean_vma,
> >> +       .mapcount = &real_mapcount,
> >>     };
> >>
> >>     BUG_ON(!PageLocked(page));
> >>
> >>     if (!page_mapped(page))
> >>         return 0;
> >>
> >>     mapping = page_mapping(page);
> >>     if (!mapping)
> >>         return 0;
> >>
> >>     // rmap_walk() needs to change to count mappings and return the
> >>     // value in .mapcount (an easy change)
> >>     rmap_walk(page, &rwc);
> >>
> >>     // Big fat comment to explain what is going on
> >> +   if ((page_mapcount(page) - real_mapcount) > 0) {
> >> +       SetPageDMAPined(page);
> >> +   } else {
> >> +       ClearPageDMAPined(page);
> >> +   }
> > 
> > This is the detail I'm not sure about: Why can't rmap_walk_file() race
> > with e.g. zap_pte_range(), which decrements page->_mapcount, so that the
> > check we do in page_mkclean() is wrong?
> 
> Right. This looks like a dead end, after all. We can't lock a whole chunk 
> of "all these are mapped, hold still while we count you" pages. It's not
> designed to allow that at all.
> 
> IMHO, we are now back to something like dynamic_page, which provides an
> independent dma pinned count. 
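
To make the race Jan describes above concrete, here is a minimal user-space
sketch. It is a toy model, not kernel code: struct toy_page and every toy_*
function are hypothetical names invented for illustration, and the "rmap
walk" is reduced to a number passed in by the caller. It only shows why
"total count minus walked mappings" can miss a pin when an unmap lands
between the walk and the check.

```c
/* Toy user-space model of the quoted proposal. NOT kernel code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_page {
	atomic_int mapcount;	/* real mappings + DMA pins, added together */
	bool anon;		/* anonymous pages are not tracked this way */
};

/* Model a PTE being installed, or torn down (e.g. by zap_pte_range()). */
static void toy_map(struct toy_page *p)   { atomic_fetch_add(&p->mapcount, 1); }
static void toy_unmap(struct toy_page *p) { atomic_fetch_sub(&p->mapcount, 1); }

/* Model the proposed pin taken in __get_user_pages(). */
static void toy_gup_pin(struct toy_page *p)
{
	if (!p->anon)
		atomic_fetch_add(&p->mapcount, 1);
}

/* Model the proposed put_user_page(). */
static void toy_put_user_page(struct toy_page *p)
{
	if (!p->anon)
		atomic_fetch_sub(&p->mapcount, 1);
}

/* The check from the quoted page_mkclean(): total minus walked mappings. */
static bool toy_dma_pinned(struct toy_page *p, int walked_mappings)
{
	return atomic_load(&p->mapcount) - walked_mappings > 0;
}

/* Returns 1 if the pin is missed after an unmap races with the check. */
static int toy_demo_race(void)
{
	struct toy_page page = { .anon = false };
	int walked;

	atomic_init(&page.mapcount, 0);
	toy_map(&page);
	toy_map(&page);
	toy_gup_pin(&page);		/* count: 2 mappings + 1 pin */

	walked = 2;			/* the walk saw both mappings */
	assert(toy_dma_pinned(&page, walked));	/* quiet case: pin is seen */

	toy_unmap(&page);		/* unmap lands after the walk */
	int missed = !toy_dma_pinned(&page, walked);

	toy_put_user_page(&page);
	return missed;
}
```

In this model toy_demo_race() returns 1: the stale walk result cancels out
the pin, so the page looks unpinned even though the pin is still held.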

I will keep looking, because allocating a structure for every GUP is
insane to me: there are users out there that are GUPing gigabytes of data,
and it is going to waste tons of memory just to fix crappy hardware.
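
As a rough back-of-the-envelope sketch of that cost, assuming one tracking
structure per pinned page: the helper below is hypothetical, and both the
4096-byte page size and the 64-byte structure size used in the example are
assumed figures, not numbers from this thread.

```c
/* Hypothetical helper; all sizes are caller-supplied assumptions. */
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of tracking metadata needed to pin pinned_bytes of memory,
 * assuming one tracking structure of struct_size bytes per page.
 */
static uint64_t toy_tracking_overhead(uint64_t pinned_bytes,
				      uint64_t page_size,
				      uint64_t struct_size)
{
	assert(page_size > 0);	/* avoid dividing by zero */

	/* Round up: a partially covered page still needs a structure. */
	uint64_t pages = (pinned_bytes + page_size - 1) / page_size;

	return pages * struct_size;
}
```

With those assumed sizes, toy_tracking_overhead(1ULL << 30, 4096, 64) gives
16777216 bytes, i.e. 16 MiB of metadata for every GiB pinned.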

Cheers,
Jérôme
