From: John Hubbard <jhubbard@nvidia.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Michal Hocko" <mhocko@suse.com>, "Jan Kara" <jack@suse.cz>,
	kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	"David Airlie" <airlied@linux.ie>,
	"Dave Chinner" <david@fromorbit.com>,
	dri-devel@lists.freedesktop.org,
	LKML <linux-kernel@vger.kernel.org>, linux-mm@kvack.org,
	"Paul Mackerras" <paulus@samba.org>,
	linux-kselftest@vger.kernel.org,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Christoph Hellwig" <hch@lst.de>,
	"Jonathan Corbet" <corbet@lwn.net>, linux-rdma@vger.kernel.org,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	linux-media@vger.kernel.org, "Shuah Khan" <shuah@kernel.org>,
	"John Hubbard" <jhubbard@nvidia.com>,
	linux-block@vger.kernel.org,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Al Viro" <viro@zeniv.linux.org.uk>
Subject: [PATCH v5 04/24] mm: Cleanup __put_devmap_managed_page() vs ->page_free()
Date: Thu, 14 Nov 2019 21:53:20 -0800
Message-ID: <20191115055340.1825745-5-jhubbard@nvidia.com>
In-Reply-To: <20191115055340.1825745-1-jhubbard@nvidia.com>

From: Dan Williams <dan.j.williams@intel.com>

After the removal of the device-public infrastructure there are only 2
->page_free() call backs in the kernel. One of those is a device-private
callback in the nouveau driver, the other is a generic wakeup needed in
the DAX case.

In the hopes that all ->page_free() callbacks can be migrated to common
core kernel functionality, move the device-private specific actions in
__put_devmap_managed_page() under the is_device_private_page()
conditional, including the ->page_free() callback. For the other page
types just open-code the generic wakeup.

Yes, the wakeup is only needed in the MEMORY_DEVICE_FSDAX case, but it
does no harm in the MEMORY_DEVICE_DEVDAX and MEMORY_DEVICE_PCI_P2PDMA
case.

Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/nvdimm/pmem.c |  6 ----
 mm/memremap.c         | 80 ++++++++++++++++++++++++-------------------
 2 files changed, 44 insertions(+), 42 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f9f76f6ba07b..21db1ce8c0ae 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -338,13 +338,7 @@ static void pmem_release_disk(void *__pmem)
 	put_disk(pmem->disk);
 }
 
-static void pmem_pagemap_page_free(struct page *page)
-{
-	wake_up_var(&page->_refcount);
-}
-
 static const struct dev_pagemap_ops fsdax_pagemap_ops = {
-	.page_free		= pmem_pagemap_page_free,
 	.kill			= pmem_pagemap_kill,
 	.cleanup		= pmem_pagemap_cleanup,
 };
diff --git a/mm/memremap.c b/mm/memremap.c
index 03ccbdfeb697..e899fa876a62 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -27,7 +27,8 @@ static void devmap_managed_enable_put(void)
 
 static int devmap_managed_enable_get(struct dev_pagemap *pgmap)
 {
-	if (!pgmap->ops || !pgmap->ops->page_free) {
+	if (pgmap->type == MEMORY_DEVICE_PRIVATE &&
+	    (!pgmap->ops || !pgmap->ops->page_free)) {
 		WARN(1, "Missing page_free method\n");
 		return -EINVAL;
 	}
@@ -414,44 +415,51 @@ void __put_devmap_managed_page(struct page *page)
 {
 	int count = page_ref_dec_return(page);
 
-	/*
-	 * If refcount is 1 then page is freed and refcount is stable as nobody
-	 * holds a reference on the page.
-	 */
-	if (count == 1) {
-		/* Clear Active bit in case of parallel mark_page_accessed */
-		__ClearPageActive(page);
-		__ClearPageWaiters(page);
+	/* still busy */
+	if (count > 1)
+		return;
 
-		mem_cgroup_uncharge(page);
+	/* only triggered by the dev_pagemap shutdown path */
+	if (count == 0) {
+		__put_page(page);
+		return;
+	}
 
-		/*
-		 * When a device_private page is freed, the page->mapping field
-		 * may still contain a (stale) mapping value. For example, the
-		 * lower bits of page->mapping may still identify the page as
-		 * an anonymous page. Ultimately, this entire field is just
-		 * stale and wrong, and it will cause errors if not cleared.
-		 * One example is:
-		 *
-		 *  migrate_vma_pages()
-		 *    migrate_vma_insert_page()
-		 *      page_add_new_anon_rmap()
-		 *        __page_set_anon_rmap()
-		 *          ...checks page->mapping, via PageAnon(page) call,
-		 *            and incorrectly concludes that the page is an
-		 *            anonymous page. Therefore, it incorrectly,
-		 *            silently fails to set up the new anon rmap.
-		 *
-		 * For other types of ZONE_DEVICE pages, migration is either
-		 * handled differently or not done at all, so there is no need
-		 * to clear page->mapping.
-		 */
-		if (is_device_private_page(page))
-			page->mapping = NULL;
+	/* notify page idle for dax */
+	if (!is_device_private_page(page)) {
+		wake_up_var(&page->_refcount);
+		return;
+	}
 
-		page->pgmap->ops->page_free(page);
-	} else if (!count)
-		__put_page(page);
+	/* Clear Active bit in case of parallel mark_page_accessed */
+	__ClearPageActive(page);
+	__ClearPageWaiters(page);
+
+	mem_cgroup_uncharge(page);
+
+	/*
+	 * When a device_private page is freed, the page->mapping field
+	 * may still contain a (stale) mapping value. For example, the
+	 * lower bits of page->mapping may still identify the page as an
+	 * anonymous page. Ultimately, this entire field is just stale
+	 * and wrong, and it will cause errors if not cleared.  One
+	 * example is:
+	 *
+	 *  migrate_vma_pages()
+	 *    migrate_vma_insert_page()
+	 *      page_add_new_anon_rmap()
+	 *        __page_set_anon_rmap()
+	 *          ...checks page->mapping, via PageAnon(page) call,
+	 *            and incorrectly concludes that the page is an
+	 *            anonymous page. Therefore, it incorrectly,
+	 *            silently fails to set up the new anon rmap.
+	 *
+	 * For other types of ZONE_DEVICE pages, migration is either
+	 * handled differently or not done at all, so there is no need
+	 * to clear page->mapping.
+	 */
+	page->mapping = NULL;
+	page->pgmap->ops->page_free(page);
 }
 EXPORT_SYMBOL(__put_devmap_managed_page);
 #endif /* CONFIG_DEV_PAGEMAP_OPS */
-- 
2.24.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
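
For readers who want to see the new control flow without the diff markers, here is a small user-space sketch of the decision that the reworked __put_devmap_managed_page() makes after dropping a reference. It is only a model under stated assumptions: struct model_page, enum put_action, model_put_devmap_managed_page() and model_ops_valid() are hypothetical names invented for illustration and are not kernel APIs. The sketch returns a label for the action the kernel code would take instead of performing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the four outcomes in the patched function. */
enum put_action {
	PUT_STILL_BUSY,       /* count > 1: other references remain */
	PUT_FINAL_FREE,       /* count == 0: shutdown path, __put_page() */
	PUT_WAKE_WAITERS,     /* count == 1, not device-private: wake_up_var() */
	PUT_DRIVER_PAGE_FREE  /* count == 1, device-private: ->page_free() */
};

/* Hypothetical stand-in for the few struct page fields the patch touches. */
struct model_page {
	int refcount;         /* models page->_refcount */
	bool device_private;  /* models is_device_private_page(page) */
	void *mapping;        /* models page->mapping */
};

/*
 * Models the post-patch ordering of checks. ZONE_DEVICE pages are
 * 1-based: a count of 1 after the decrement means the page is idle.
 */
enum put_action model_put_devmap_managed_page(struct model_page *page)
{
	int count = --page->refcount;  /* models page_ref_dec_return() */

	if (count > 1)
		return PUT_STILL_BUSY;
	if (count == 0)
		return PUT_FINAL_FREE;
	if (!page->device_private)
		return PUT_WAKE_WAITERS;

	/* Device-private only: clear the stale mapping before the callback. */
	page->mapping = NULL;
	return PUT_DRIVER_PAGE_FREE;
}

/*
 * Models the relaxed check in devmap_managed_enable_get(): a missing
 * ->page_free() is an error only for device-private pagemaps.
 */
bool model_ops_valid(bool device_private, bool has_page_free)
{
	return !(device_private && !has_page_free);
}

/* Small walk-through: a DAX-like page dropping from 3 references to 0. */
void model_demo(void)
{
	struct model_page p = { 3, false, NULL };

	assert(model_put_devmap_managed_page(&p) == PUT_STILL_BUSY);
	assert(model_put_devmap_managed_page(&p) == PUT_WAKE_WAITERS);
	assert(model_put_devmap_managed_page(&p) == PUT_FINAL_FREE);
}
```

The model omits the __ClearPageActive(), __ClearPageWaiters() and mem_cgroup_uncharge() steps, which in the patch run only on the device-private branch. Its point is the ordering: the early returns keep the non-private page types away from the driver callback entirely, which is why pmem no longer needs a ->page_free() of its own.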