From: Dan Williams <dan.j.williams@intel.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Linux MM <linux-mm@kvack.org>,
    Vishal Verma <vishal.l.verma@intel.com>,
    Dave Jiang <dave.jiang@intel.com>,
    Naoya Horiguchi <naoya.horiguchi@nec.com>,
    Matthew Wilcox <willy@infradead.org>,
    Jason Gunthorpe <jgg@ziepe.ca>,
    John Hubbard <jhubbard@nvidia.com>,
    Jane Chu <jane.chu@oracle.com>,
    Muchun Song <songmuchun@bytedance.com>,
    Mike Kravetz <mike.kravetz@oracle.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Jonathan Corbet <corbet@lwn.net>, Christoph Hellwig <hch@lst.de>,
    Linux NVDIMM <nvdimm@lists.linux.dev>,
    Linux Doc Mailing List <linux-doc@vger.kernel.org>
Subject: Re: [PATCH v4 07/14] device-dax: compound devmap support
Date: Thu, 4 Nov 2021 17:38:19 -0700
Message-ID: <CAPcyv4jqdPaLPOydb_GWvVP4d+hRkcu7CnP_Ud-CQXHcqTLWKw@mail.gmail.com>
In-Reply-To: <20210827145819.16471-8-joao.m.martins@oracle.com>

On Fri, Aug 27, 2021 at 7:59 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> Use the newly added compound devmap facility which maps the assigned dax
> ranges as compound pages at a page size of @align. Currently, this means,
> that region/namespace bootstrap would take considerably less, given that
> you would initialize considerably less pages.
>
> On setups with 128G NVDIMMs the initialization with DRAM stored struct
> pages improves from ~268-358 ms to ~78-100 ms with 2M pages, and to less
> than a 1msec with 1G pages.
>
> dax devices are created with a fixed @align (huge page size) which is
> enforced through as well at mmap() of the device. Faults, consequently
> happen too at the specified @align specified at the creation, and those
> don't change through out dax device lifetime.

s/through out/throughout/

> MCEs poisons a whole dax huge page, as well as splits occurring at the configured page size.

A clarification here: MCEs trigger memory_failure() to *unmap* a whole
dax huge page; the poison itself stays limited to a single cacheline.

Otherwise the patch looks good to me.
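
To make the granularity point in the clarification concrete, here is a
minimal userspace sketch (not kernel code). It assumes a 64-byte
cacheline and a device created with a 2M @align; sketch_span() and
struct sketch_range are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CACHELINE 64ULL          /* assumption: 64-byte cacheline */
#define SKETCH_DAX_ALIGN (2ULL << 20)   /* assumption: 2M @align */

/* Half-open byte range [start, end). */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/* Return the naturally aligned block of 'size' bytes that contains 'addr'. */
static struct sketch_range sketch_span(uint64_t addr, uint64_t size)
{
	struct sketch_range r;

	/* size must be a non-zero power of two for the mask to be valid */
	assert(size && (size & (size - 1)) == 0);

	r.start = addr & ~(size - 1);
	r.end = r.start + size;
	return r;
}
```

For a bad address of 0x100234567, sketch_span(addr, SKETCH_CACHELINE)
gives the range that is actually poisoned, while
sketch_span(addr, SKETCH_DAX_ALIGN) gives the much larger range that
loses its mapping.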
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
> drivers/dax/device.c | 56 ++++++++++++++++++++++++++++++++++----------
> 1 file changed, 43 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
> index 6e348b5f9d45..5d23128f9a60 100644
> --- a/drivers/dax/device.c
> +++ b/drivers/dax/device.c
> @@ -192,6 +192,42 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
>  }
>  #endif /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>
> +static void set_page_mapping(struct vm_fault *vmf, pfn_t pfn,
> +			     unsigned long fault_size,
> +			     struct address_space *f_mapping)
> +{
> +	unsigned long i;
> +	pgoff_t pgoff;
> +
> +	pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size));
> +
> +	for (i = 0; i < fault_size / PAGE_SIZE; i++) {
> +		struct page *page;
> +
> +		page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
> +		if (page->mapping)
> +			continue;
> +		page->mapping = f_mapping;
> +		page->index = pgoff + i;
> +	}
> +}
> +
> +static void set_compound_mapping(struct vm_fault *vmf, pfn_t pfn,
> +				 unsigned long fault_size,
> +				 struct address_space *f_mapping)
> +{
> +	struct page *head;
> +
> +	head = pfn_to_page(pfn_t_to_pfn(pfn));
> +	head = compound_head(head);
> +	if (head->mapping)
> +		return;
> +
> +	head->mapping = f_mapping;
> +	head->index = linear_page_index(vmf->vma,
> +			ALIGN(vmf->address, fault_size));
> +}
> +
>  static vm_fault_t dev_dax_huge_fault(struct vm_fault *vmf,
>  		enum page_entry_size pe_size)
>  {
> @@ -225,8 +261,7 @@ static vm_fault_t dev_dax_huge_fault(struct vm_fault *vmf,
>  	}
>
>  	if (rc == VM_FAULT_NOPAGE) {
> -		unsigned long i;
> -		pgoff_t pgoff;
> +		struct dev_pagemap *pgmap = dev_dax->pgmap;
>
>  		/*
>  		 * In the device-dax case the only possibility for a
> @@ -234,17 +269,10 @@ static vm_fault_t dev_dax_huge_fault(struct vm_fault *vmf,
>  		 * mapped. No need to consider the zero page, or racing
>  		 * conflicting mappings.
>  		 */
> -		pgoff = linear_page_index(vmf->vma,
> -				ALIGN(vmf->address, fault_size));
> -		for (i = 0; i < fault_size / PAGE_SIZE; i++) {
> -			struct page *page;
> -
> -			page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
> -			if (page->mapping)
> -				continue;
> -			page->mapping = filp->f_mapping;
> -			page->index = pgoff + i;
> -		}
> +		if (pgmap_geometry(pgmap) > 1)
> +			set_compound_mapping(vmf, pfn, fault_size, filp->f_mapping);
> +		else
> +			set_page_mapping(vmf, pfn, fault_size, filp->f_mapping);
>  	}
>  	dax_read_unlock(id);
>
> @@ -426,6 +454,8 @@ int dev_dax_probe(struct dev_dax *dev_dax)
>  	}
>
>  	pgmap->type = MEMORY_DEVICE_GENERIC;
> +	if (dev_dax->align > PAGE_SIZE)
> +		pgmap->geometry = dev_dax->align >> PAGE_SHIFT;
>  	dev_dax->pgmap = pgmap;
>
>  	addr = devm_memremap_pages(dev, pgmap);
> --
> 2.17.1
>
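
For readers following along, the arithmetic behind the geometry
assignment and the two mapping helpers in the patch above can be
sketched in plain userspace C. This is an illustration only, not kernel
code: it assumes 4K base pages (PAGE_SHIFT of 12) and assumes that an
unset geometry is reported as 1, which the "pgmap_geometry(pgmap) > 1"
check suggests but this patch does not show. All sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                          /* assumption: 4K base pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/*
 * Mirrors "pgmap->geometry = dev_dax->align >> PAGE_SHIFT", applied
 * only when align > PAGE_SIZE. Returning 1 otherwise is an assumption.
 */
static uint64_t sketch_geometry(uint64_t align)
{
	/* align must be a power of two no smaller than a base page */
	assert(align >= SKETCH_PAGE_SIZE && (align & (align - 1)) == 0);

	return align > SKETCH_PAGE_SIZE ? align >> SKETCH_PAGE_SHIFT : 1;
}

/*
 * Upper bound on the struct pages whose ->mapping/->index are written
 * per fault: only the head page for a compound devmap (as in
 * set_compound_mapping()), every base page otherwise (as in
 * set_page_mapping()).
 */
static uint64_t sketch_mapping_updates(uint64_t fault_size, uint64_t geometry)
{
	return geometry > 1 ? 1 : fault_size / SKETCH_PAGE_SIZE;
}

/* Number of align-sized units covering a region of region_size bytes. */
static uint64_t sketch_units(uint64_t region_size, uint64_t align)
{
	return region_size / align;
}
```

With a 2M @align this gives a geometry of 512 base pages per compound
page, and a 128G region is covered by 65536 such units, compared with
33554432 individual 4K pages.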