From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Jason Gunthorpe <jgg@nvidia.com>, Jan Kara <jack@suse.cz>,
	Christoph Hellwig <hch@lst.de>,
	"Darrick J. Wong" <djwong@kernel.org>,
	John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, nvdimm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org
Subject: [PATCH 00/13] Fix the DAX-gup mistake
Date: Sat, 03 Sep 2022 19:16:00 -0700
Message-ID: <166225775968.2351842.11156458342486082012.stgit@dwillia2-xfh.jf.intel.com>

tl;dr: Move the pin of 'struct dev_pagemap' instances from gup-time to
map time, move the unpin of 'struct dev_pagemap' to truncate_inode_pages()
for fsdax and devdax inodes, and use page_maybe_dma_pinned() to
determine when filesystems can safely truncate DAX mappings vs DMA.

The longer story is that DAX has caused friction with folio development
and other device-memory use cases due to its hack of using a
page-reference count of 1 to indicate that the page is DMA idle. That
situation arose from the mistake of not managing DAX page reference
counts at map time. The lack of page reference counting at map time grew
organically from the original DAX experiment of attempting to manage DAX
mappings without page structures. The page lock, dirty tracking, and
other entry management were supported sans pages. However, page
support was then bolted on incrementally to solve problems with gup,
memory-failure, and all the other kernel services that are missing when
a pfn does not have an associated page structure.

Since then John has led an effort to account for when a page is pinned
for DMA vs other sources that elevate the reference count. The
page_maybe_dma_pinned() helper slots in seamlessly to replace the need
to track transitions to page->_refcount == 1.
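
As an illustration only (a toy userspace model, not code from this
series), the difference between the two idle checks can be sketched as
below. It assumes the order-0 page scheme in which a DMA pin adds a large
bias to the reference count; TOY_PIN_BIAS mirrors the kernel's
GUP_PIN_COUNTING_BIAS value of 1 << 10, and every toy_* name is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: mirrors GUP_PIN_COUNTING_BIAS (1 << 10) for order-0 pages. */
#define TOY_PIN_BIAS 1024

struct toy_page {
	int refcount;
};

/* Ordinary reference, e.g. a page cache or short-term reference. */
void toy_get(struct toy_page *p)
{
	p->refcount += 1;
}

void toy_put(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount -= 1;
}

/* DMA pin: adds the bias instead of 1 so pins stand out from plain refs. */
void toy_pin(struct toy_page *p)
{
	p->refcount += TOY_PIN_BIAS;
}

void toy_unpin(struct toy_page *p)
{
	assert(p->refcount >= TOY_PIN_BIAS);
	p->refcount -= TOY_PIN_BIAS;
}

/* Old DAX convention: only a refcount of exactly 1 counts as DMA idle. */
bool toy_old_dax_idle(const struct toy_page *p)
{
	return p->refcount == 1;
}

/*
 * Bias-based check: "maybe" because enough plain references could also
 * push the count past the bias and produce a false positive.
 */
bool toy_maybe_dma_pinned(const struct toy_page *p)
{
	return p->refcount >= TOY_PIN_BIAS;
}
```

In this model a single extra toy_get() makes the page look busy under the
old rule even though no DMA is in flight, while toy_maybe_dma_pinned()
only reports true once toy_pin() has been called.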

The larger change in this set comes from Jason's observation that
inserting DAX mappings without any reference taken is a bug. So
dax_insert_entry(), which fsdax uses, is updated to take 'struct
dev_pagemap' references, and devdax is updated to reuse the same.
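
The lifetime rule described above can be sketched with another toy
userspace model (again an illustration under stated assumptions, not the
code in these patches). Here the pagemap reference is taken when a
mapping entry is inserted and dropped when the entry is deleted, so the
gup path never has to touch the pagemap. All toy_* names and the 'dying'
flag are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for 'struct dev_pagemap': just a reference count. */
struct toy_pgmap {
	int refs;
	bool dying;	/* hypothetical: set once teardown has started */
};

/* Toy stand-in for one DAX mapping entry. */
struct toy_entry {
	struct toy_pgmap *pgmap;
	bool present;
};

/*
 * Map time: take the pagemap reference when the entry is inserted.
 * Returns 0 on success, -1 if the pagemap is already being torn down.
 */
int toy_insert_entry(struct toy_entry *e, struct toy_pgmap *pgmap)
{
	if (e->present)
		return 0;	/* reference already held by this entry */
	if (pgmap->dying)
		return -1;
	pgmap->refs++;
	e->pgmap = pgmap;
	e->present = true;
	return 0;
}

/* Truncate / delete time: drop the reference taken at insertion. */
void toy_delete_entry(struct toy_entry *e)
{
	if (!e->present)
		return;
	assert(e->pgmap->refs > 0);
	e->pgmap->refs--;
	e->pgmap = NULL;
	e->present = false;
}

/* The pagemap may only go away once no entry references it. */
bool toy_pgmap_can_teardown(const struct toy_pgmap *pgmap)
{
	return pgmap->refs == 0;
}
```

The point of the sketch is only the pairing: every successful insertion
is balanced by exactly one deletion, and the failure return models why an
insertion routine of this kind needs to be able to report an error.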

This allows for 'struct dev_pagemap' manipulation to be self-contained
to DAX-specific paths. It is also a foundation to build towards removing
pte_devmap() and treating DAX pages as another vm_normal_page(),
and perhaps more conversions of the DAX infrastructure to reuse typical
page mapping helpers. One of the immediate hurdles is the usage of
pmd_devmap() to distinguish large page mappings that are not transparent
huge pages.

---

Dan Williams (13):
      fsdax: Rename "busy page" to "pinned page"
      fsdax: Use page_maybe_dma_pinned() for DAX vs DMA collisions
      fsdax: Delete put_devmap_managed_page_refs()
      fsdax: Update dax_insert_entry() calling convention to return an error
      fsdax: Cleanup dax_associate_entry()
      fsdax: Rework dax_insert_entry() calling convention
      fsdax: Manage pgmap references at entry insertion and deletion
      devdax: Minor warning fixups
      devdax: Move address_space helpers to the DAX core
      dax: Prep dax_{associate,disassociate}_entry() for compound pages
      devdax: add PUD support to the DAX mapping infrastructure
      devdax: Use dax_insert_entry() + dax_delete_mapping_entry()
      mm/gup: Drop DAX pgmap accounting


 .clang-format             |    1
 drivers/Makefile          |    2
 drivers/dax/Kconfig       |    6
 drivers/dax/Makefile      |    1
 drivers/dax/bus.c         |    2
 drivers/dax/dax-private.h |    1
 drivers/dax/device.c      |   73 ++-
 drivers/dax/mapping.c     | 1020 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/dax/super.c       |    2
 fs/dax.c                  | 1049 ++-------------------------------------------
 fs/ext4/inode.c           |    9
 fs/fuse/dax.c             |   10
 fs/xfs/xfs_file.c         |    8
 fs/xfs/xfs_inode.c        |    2
 include/linux/dax.h       |  124 ++++-
 include/linux/huge_mm.h   |   23 -
 include/linux/memremap.h  |   24 +
 include/linux/mm.h        |   58 +-
 mm/gup.c                  |   92 +---
 mm/huge_memory.c          |   54 --
 mm/memremap.c             |   31 -
 mm/swap.c                 |    2
 22 files changed, 1326 insertions(+), 1268 deletions(-)
 create mode 100644 drivers/dax/mapping.c

base-commit: 1c23f9e627a7b412978b4e852793c5e3c3efc555

Thread overview: 34+ messages
2022-09-04  2:16 Dan Williams [this message]
2022-09-04  2:16 ` [PATCH 01/13] fsdax: Rename "busy page" to "pinned page" Dan Williams
2022-09-04  2:16 ` [PATCH 02/13] fsdax: Use page_maybe_dma_pinned() for DAX vs DMA collisions Dan Williams
2022-09-06 12:07   ` Jason Gunthorpe
2022-09-04  2:16 ` [PATCH 03/13] fsdax: Delete put_devmap_managed_page_refs() Dan Williams
2022-09-04  2:16 ` [PATCH 04/13] fsdax: Update dax_insert_entry() calling convention to return an error Dan Williams
2022-09-04  2:16 ` [PATCH 05/13] fsdax: Cleanup dax_associate_entry() Dan Williams
2022-09-04  2:16 ` [PATCH 06/13] fsdax: Rework dax_insert_entry() calling convention Dan Williams
2022-09-04  2:16 ` [PATCH 07/13] fsdax: Manage pgmap references at entry insertion and deletion Dan Williams
2022-09-06 12:30   ` Jason Gunthorpe
2022-09-04  2:16 ` [PATCH 08/13] devdax: Minor warning fixups Dan Williams
2022-09-04  2:16 ` [PATCH 09/13] devdax: Move address_space helpers to the DAX core Dan Williams
2022-09-04  5:53   ` kernel test robot
2022-09-04  2:16 ` [PATCH 10/13] dax: Prep dax_{associate, disassociate}_entry() for compound pages Dan Williams
2022-09-04  2:17 ` [PATCH 11/13] devdax: add PUD support to the DAX mapping infrastructure Dan Williams
2022-09-04  2:17 ` [PATCH 12/13] devdax: Use dax_insert_entry() + dax_delete_mapping_entry() Dan Williams
2022-09-04  2:17 ` [PATCH 13/13] mm/gup: Drop DAX pgmap accounting Dan Williams
2022-09-06 13:05 ` [PATCH 00/13] Fix the DAX-gup mistake Jason Gunthorpe
2022-09-06 17:23   ` Dan Williams
2022-09-06 17:29     ` Jason Gunthorpe
2022-09-06 18:37       ` Dan Williams
2022-09-06 18:49         ` Jason Gunthorpe
2022-09-06 19:41           ` Dan Williams
2022-09-07  0:54             ` Dan Williams
2022-09-07 12:58               ` Jason Gunthorpe
2022-09-07 17:10                 ` Dan Williams
2022-09-07 18:43                   ` Dan Williams
2022-09-07 19:30                     ` Jason Gunthorpe
2022-09-07 20:45                       ` Dan Williams
2022-09-08 18:49                         ` Jason Gunthorpe
2022-09-08 19:27                           ` Dan Williams
2022-09-09 11:53                             ` Jason Gunthorpe
2022-09-09 17:52                               ` Dan Williams
2022-09-09 18:11                             ` Matthew Wilcox
