linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Alistair Popple <apopple@nvidia.com>
Cc: <akpm@linux-foundation.org>, Matthew Wilcox <willy@infradead.org>,
	"Jan Kara" <jack@suse.cz>, "Darrick J. Wong" <djwong@kernel.org>,
	"Christoph Hellwig" <hch@lst.de>,
	John Hubbard <jhubbard@nvidia.com>,
	<linux-fsdevel@vger.kernel.org>, <nvdimm@lists.linux.dev>,
	<linux-xfs@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-ext4@vger.kernel.org>
Subject: Re: [PATCH v2 16/18] mm/memremap_pages: Support initializing pages to a zero reference count
Date: Wed, 21 Sep 2022 16:45:22 -0700	[thread overview]
Message-ID: <632ba212e32bb_349629451@dwillia2-xfh.jf.intel.com.notmuch> (raw)
In-Reply-To: <YyssywF6HmZrfqhD@nvidia.com>

Jason Gunthorpe wrote:
> On Thu, Sep 15, 2022 at 08:36:43PM -0700, Dan Williams wrote:
> > The initial memremap_pages() implementation inherited the
> > __init_single_page() default of pages starting life with an elevated
> > reference count. This originally allowed for the page->pgmap pointer to
> > alias with the storage for page->lru since a page was only allowed to be
> > on an lru list when its reference count was zero.
> > 
> > Since then, 'struct page' definition cleanups have arranged for
> > dedicated space for the ZONE_DEVICE page metadata, and the
> > MEMORY_DEVICE_{PRIVATE,COHERENT} work has arranged for the 1 -> 0
> > page->_refcount transition to route the page to free_zone_device_page()
> > and not the core-mm page-free. With those cleanups in place and with
> > filesystem-dax and device-dax now converted to take and drop references
> > at map and truncate time, it is possible to start MEMORY_DEVICE_FS_DAX
> > and MEMORY_DEVICE_GENERIC reference counts at 0.
> > 
> > MEMORY_DEVICE_{PRIVATE,COHERENT} still expect that their ZONE_DEVICE
> > pages start life at _refcount 1, so make that the default if
> > pgmap->init_mode is left at zero.
> 
> I'm shocked to read this - how does it make any sense?

I think what happened is that since memremap_pages() historically
produced pages with an elevated reference count that GPU drivers skipped
taking a reference on first allocation and just passed along an elevated
reference count page to the first user.

So either we keep that assumption or update all users to be prepared for
idle pages coming out of memremap_pages().

This is all in reaction to the "set_page_count(page, 1);" in
free_zone_device_page(). Which I am happy to get rid of but need from
help from MEMORY_DEVICE_{PRIVATE,COHERENT} folks to react to
memremap_pages() starting all pages at reference count 0.

> dev_pagemap_ops->page_free() is only called on the 1->0 transition, so
> any driver which implements it must be expecting pages to have a 0
> refcount.
> 
> Looking around everything but only fsdax_pagemap_ops implements
> page_free()

Right.

> So, how does it work? Surely the instant the page map is created all
> the pages must be considered 'free', and after page_free() is called I
> would also expect the page to be considered free.

The GPU drivers need to increment reference counts when they hand out
the page rather than reuse the reference count that they get by default.

> How on earth can a free'd page have both a 0 and 1 refcount??

This is residual wonkiness from memremap_pages() handing out pages with
elevated reference counts at the outset.

> eg look at the simple hmm_test, it threads pages on to the
> mdevice->free_pages list immediately after memremap_pages and then
> again inside page_free() - it is completely wrong that they would have
> different refcounts while on the free_pages list.

I do not see any page_ref_inc() in that test, only put_page() so it is
assuming non-idle pages at the outset.

> I would expect that after the page is removed from the free_pages list
> it will have its recount set to 1 to make it non-free then it will go
> through the migration.
> 
> Alistair how should the refcounting be working here in hmm_test?
> 
> Jason



  reply	other threads:[~2022-09-21 23:45 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-16  3:35 [PATCH v2 00/18] Fix the DAX-gup mistake Dan Williams
2022-09-16  3:35 ` [PATCH v2 01/18] fsdax: Wait on @page not @page->_refcount Dan Williams
2022-09-20 14:30   ` Jason Gunthorpe
2022-09-16  3:35 ` [PATCH v2 02/18] fsdax: Use dax_page_idle() to document DAX busy page checking Dan Williams
2022-09-20 14:31   ` Jason Gunthorpe
2022-09-16  3:35 ` [PATCH v2 03/18] fsdax: Include unmapped inodes for page-idle detection Dan Williams
2022-09-16  3:35 ` [PATCH v2 04/18] ext4: Add ext4_break_layouts() to the inode eviction path Dan Williams
2022-09-16  3:35 ` [PATCH v2 05/18] xfs: Add xfs_break_layouts() " Dan Williams
2022-09-18 22:57   ` Dave Chinner
2022-09-19 16:11     ` Dan Williams
2022-09-19 21:29       ` Dave Chinner
2022-09-20 16:44         ` Dan Williams
2022-09-21 22:14           ` Dave Chinner
2022-09-21 22:28             ` Jason Gunthorpe
2022-09-23  0:18               ` Dave Chinner
2022-09-23  0:41                 ` Dan Williams
2022-09-23  2:10                   ` Dave Chinner
2022-09-23  9:38                     ` Jan Kara
2022-09-23 23:06                       ` Dan Williams
2022-09-25 23:54                       ` Dave Chinner
2022-09-26 14:10                         ` Jan Kara
2022-09-29 23:33                           ` Dan Williams
2022-09-30 13:41                             ` Jan Kara
2022-09-30 17:56                               ` Dan Williams
2022-09-30 18:06                                 ` Jason Gunthorpe
2022-09-30 18:46                                   ` Dan Williams
2022-10-03  7:55                                   ` Jan Kara
2022-09-23 12:39                     ` Jason Gunthorpe
2022-09-26  0:34                       ` Dave Chinner
2022-09-26 13:04                         ` Jason Gunthorpe
2022-09-22  0:02             ` Dan Williams
2022-09-22  0:10               ` Jason Gunthorpe
2022-09-16  3:35 ` [PATCH v2 06/18] fsdax: Rework dax_layout_busy_page() to dax_zap_mappings() Dan Williams
2022-09-16  3:35 ` [PATCH v2 07/18] fsdax: Update dax_insert_entry() calling convention to return an error Dan Williams
2022-09-16  3:35 ` [PATCH v2 08/18] fsdax: Cleanup dax_associate_entry() Dan Williams
2022-09-16  3:36 ` [PATCH v2 09/18] fsdax: Rework dax_insert_entry() calling convention Dan Williams
2022-09-16  3:36 ` [PATCH v2 10/18] fsdax: Manage pgmap references at entry insertion and deletion Dan Williams
2022-09-21 14:03   ` Jason Gunthorpe
2022-09-21 15:18     ` Dan Williams
2022-09-21 21:38       ` Dan Williams
2022-09-21 22:07         ` Jason Gunthorpe
2022-09-22  0:14           ` Dan Williams
2022-09-22  0:25             ` Jason Gunthorpe
2022-09-22  2:17               ` Dan Williams
2022-09-22 17:55                 ` Jason Gunthorpe
2022-09-22 21:54                   ` Dan Williams
2022-09-23  1:36                     ` Dave Chinner
2022-09-23  2:01                       ` Dan Williams
2022-09-23 13:24                     ` Jason Gunthorpe
2022-09-23 16:29                       ` Dan Williams
2022-09-23 17:42                         ` Jason Gunthorpe
2022-09-23 19:03                           ` Dan Williams
2022-09-23 19:23                             ` Jason Gunthorpe
2022-09-27  6:07                             ` Alistair Popple
2022-09-27 12:56                               ` Jason Gunthorpe
2022-09-16  3:36 ` [PATCH v2 11/18] devdax: Minor warning fixups Dan Williams
2022-09-16  3:36 ` [PATCH v2 12/18] devdax: Move address_space helpers to the DAX core Dan Williams
2022-09-27  6:20   ` Alistair Popple
2022-09-29 22:38     ` Dan Williams
2022-09-16  3:36 ` [PATCH v2 13/18] dax: Prep mapping helpers for compound pages Dan Williams
2022-09-21 14:06   ` Jason Gunthorpe
2022-09-21 15:19     ` Dan Williams
2022-09-16  3:36 ` [PATCH v2 14/18] devdax: add PUD support to the DAX mapping infrastructure Dan Williams
2022-09-16  3:36 ` [PATCH v2 15/18] devdax: Use dax_insert_entry() + dax_delete_mapping_entry() Dan Williams
2022-09-21 14:10   ` Jason Gunthorpe
2022-09-21 15:48     ` Dan Williams
2022-09-21 22:23       ` Jason Gunthorpe
2022-09-22  0:15         ` Dan Williams
2022-09-16  3:36 ` [PATCH v2 16/18] mm/memremap_pages: Support initializing pages to a zero reference count Dan Williams
2022-09-21 15:24   ` Jason Gunthorpe
2022-09-21 23:45     ` Dan Williams [this message]
2022-09-22  0:03       ` Alistair Popple
2022-09-22  0:04       ` Jason Gunthorpe
2022-09-22  0:34         ` Dan Williams
2022-09-22  1:36           ` Alistair Popple
2022-09-22  2:34             ` Dan Williams
2022-09-26  6:17               ` Alistair Popple
2022-09-22  0:13       ` John Hubbard
2022-09-16  3:36 ` [PATCH v2 17/18] fsdax: Delete put_devmap_managed_page_refs() Dan Williams
2022-09-16  3:36 ` [PATCH v2 18/18] mm/gup: Drop DAX pgmap accounting Dan Williams
2022-09-20 14:29 ` [PATCH v2 00/18] Fix the DAX-gup mistake Jason Gunthorpe
2022-09-20 16:50   ` Dan Williams
2022-11-09  0:20 ` Andrew Morton
2022-11-09 11:38   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=632ba212e32bb_349629451@dwillia2-xfh.jf.intel.com.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).