nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Joao Martins <joao.m.martins@oracle.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux MM <linux-mm@kvack.org>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Matthew Wilcox <willy@infradead.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	John Hubbard <jhubbard@nvidia.com>,
	Jane Chu <jane.chu@oracle.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Linux NVDIMM <nvdimm@lists.linux.dev>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>
Subject: Re: [PATCH v3 12/14] device-dax: compound pagemap support
Date: Thu, 15 Jul 2021 13:00:47 +0100	[thread overview]
Message-ID: <f7217b61-c845-eaed-501e-c9e7067a6b87@oracle.com> (raw)
In-Reply-To: <CAPcyv4h5c9afuxXy=UhrRr_tTwHB62RODyCKWNFU5TumXHc76A@mail.gmail.com>

On 7/15/21 12:36 AM, Dan Williams wrote:
> On Wed, Jul 14, 2021 at 12:36 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>>
>> Use the newly added compound pagemap facility which maps the assigned dax
>> ranges as compound pages at a page size of @align. Currently, this means,
>> that region/namespace bootstrap would take considerably less, given that
>> you would initialize considerably less pages.
>>
>> On setups with 128G NVDIMMs the initialization with DRAM stored struct
>> pages improves from ~268-358 ms to ~78-100 ms with 2M pages, and to less
>> than a 1msec with 1G pages.
>>
>> dax devices are created with a fixed @align (huge page size) which is
>> enforced through as well at mmap() of the device. Faults, consequently
>> happen too at the specified @align specified at the creation, and those
>> don't change through out dax device lifetime. MCEs poisons a whole dax
>> huge page, as well as splits occurring at the configured page size.
>>
> 
> Hi Joao,
> 
> With this patch I'm hitting the following with the 'device-dax' test [1].
> 
Ugh, I can reproduce it too -- apologies for the oversight.

This patch is not the culprit, the flaw is early in the series, specifically the fourth patch.

It needs this chunk below change on the fourth patch due to the existing elevated page ref
count at zone device memmap init. put_page() called here in memunmap_pages():

for (i = 0; i < pgmap->nr_ranges; i++)
	for_each_device_pfn(pfn, pgmap, i)
		put_page(pfn_to_page(pfn));

... on a zone_device compound memmap would otherwise always decrease head page refcount by
@geometry pfn amount (leading to the aforementioned splat you reported).

diff --git a/mm/memremap.c b/mm/memremap.c
index b0e7b8cf3047..79a883af788e 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -102,15 +102,15 @@ static unsigned long pfn_end(struct dev_pagemap *pgmap, int range_id)
        return (range->start + range_len(range)) >> PAGE_SHIFT;
 }

-static unsigned long pfn_next(unsigned long pfn)
+static unsigned long pfn_next(struct dev_pagemap *pgmap, unsigned long pfn)
 {
        if (pfn % 1024 == 0)
                cond_resched();
-       return pfn + 1;
+       return pfn + pgmap_pfn_geometry(pgmap);
 }

 #define for_each_device_pfn(pfn, map, i) \
-       for (pfn = pfn_first(map, i); pfn < pfn_end(map, i); pfn = pfn_next(pfn))
+       for (pfn = pfn_first(map, i); pfn < pfn_end(map, i); pfn = pfn_next(map, pfn))

 static void dev_pagemap_kill(struct dev_pagemap *pgmap)
 {

It could also get this hunk below, but it is sort of redundant provided we won't touch
tail page refcount through out the devmap pages lifetime. This setting of tail pages
refcount to zero was in pre-v5.14 series, but it got removed under the assumption it comes
from the page allocator (where tail pages are already zeroed in refcount).

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 96975edac0a8..469a7aa5cf38 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6623,6 +6623,7 @@ static void __ref memmap_init_compound(struct page *page, unsigned
long pfn,
                __init_zone_device_page(page + i, pfn + i, zone_idx,
                                        nid, pgmap);
                prep_compound_tail(page, i);
+               set_page_count(page + i, 0);

                /*
                 * The first and second tail pages need to

  reply	other threads:[~2021-07-15 12:01 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-14 19:35 [PATCH v3 00/14] mm, sparse-vmemmap: Introduce compound pagemaps Joao Martins
2021-07-14 19:35 ` [PATCH v3 01/14] memory-failure: fetch compound_head after pgmap_pfn_valid() Joao Martins
2021-07-15  0:17   ` Dan Williams
2021-07-15  2:51   ` [External] " Muchun Song
2021-07-15  6:40     ` Christoph Hellwig
2021-07-15  9:19       ` Muchun Song
2021-07-15 13:17     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 02/14] mm/page_alloc: split prep_compound_page into head and tail subparts Joao Martins
2021-07-15  0:19   ` Dan Williams
2021-07-15  2:53   ` [External] " Muchun Song
2021-07-15 13:17     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 03/14] mm/page_alloc: refactor memmap_init_zone_device() page init Joao Martins
2021-07-15  0:20   ` Dan Williams
2021-07-14 19:35 ` [PATCH v3 04/14] mm/memremap: add ZONE_DEVICE support for compound pages Joao Martins
2021-07-15  1:08   ` Dan Williams
2021-07-15 12:52     ` Joao Martins
2021-07-15 13:06       ` Joao Martins
2021-07-15 19:48       ` Dan Williams
2021-07-30 16:13         ` Joao Martins
2021-07-22  0:38       ` Jane Chu
2021-07-22 10:56         ` Joao Martins
2021-07-15 12:59     ` Christoph Hellwig
2021-07-15 13:15       ` Joao Martins
2021-07-15  6:48   ` Christoph Hellwig
2021-07-15 13:15     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 05/14] mm/sparse-vmemmap: add a pgmap argument to section activation Joao Martins
2021-07-28  5:56   ` Dan Williams
2021-07-28  9:43     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 06/14] mm/sparse-vmemmap: refactor core of vmemmap_populate_basepages() to helper Joao Martins
2021-07-28  6:04   ` Dan Williams
2021-07-28 10:48     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 07/14] mm/hugetlb_vmemmap: move comment block to Documentation/vm Joao Martins
2021-07-15  2:47   ` [External] " Muchun Song
2021-07-15 13:16     ` Joao Martins
2021-07-28  6:09   ` Dan Williams
2021-07-14 19:35 ` [PATCH v3 08/14] mm/sparse-vmemmap: populate compound pagemaps Joao Martins
2021-07-28  6:55   ` Dan Williams
2021-07-28 15:35     ` Joao Martins
2021-07-28 18:03       ` Dan Williams
2021-07-28 18:54         ` Joao Martins
2021-07-28 20:04           ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 09/14] mm/page_alloc: reuse tail struct pages for " Joao Martins
2021-07-28  7:28   ` Dan Williams
2021-07-28 15:56     ` Joao Martins
2021-07-28 16:08       ` Dan Williams
2021-07-28 16:12         ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 10/14] device-dax: use ALIGN() for determining pgoff Joao Martins
2021-07-28  7:29   ` Dan Williams
2021-07-28 15:56     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 11/14] device-dax: ensure dev_dax->pgmap is valid for dynamic devices Joao Martins
2021-07-28  7:30   ` Dan Williams
2021-07-28 15:56     ` Joao Martins
2021-08-06 12:28       ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 12/14] device-dax: compound pagemap support Joao Martins
2021-07-14 23:36   ` Dan Williams
2021-07-15 12:00     ` Joao Martins [this message]
2021-07-27 23:51       ` Dan Williams
2021-07-28  9:36         ` Joao Martins
2021-07-28 18:51           ` Dan Williams
2021-07-28 18:59             ` Joao Martins
2021-07-28 19:03               ` Dan Williams
2021-07-14 19:35 ` [PATCH v3 13/14] mm/gup: grab head page refcount once for group of subpages Joao Martins
2021-07-28 19:55   ` Dan Williams
2021-07-28 20:07     ` Joao Martins
2021-07-28 20:23       ` Dan Williams
2021-08-25 19:10         ` Joao Martins
2021-08-25 19:15           ` Matthew Wilcox
2021-08-25 19:26             ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 14/14] mm/sparse-vmemmap: improve memory savings for compound pud geometry Joao Martins
2021-07-28 20:03   ` Dan Williams
2021-07-28 20:08     ` Joao Martins
2021-07-14 21:48 ` [PATCH v3 00/14] mm, sparse-vmemmap: Introduce compound pagemaps Andrew Morton
2021-07-14 23:47   ` Dan Williams
2021-07-22  2:24   ` Matthew Wilcox
2021-07-22 10:53     ` Joao Martins
2021-07-27 23:23       ` Dan Williams
2021-08-02 10:40         ` Joao Martins
2021-08-02 14:06           ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f7217b61-c845-eaed-501e-c9e7067a6b87@oracle.com \
    --to=joao.m.martins@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=jane.chu@oracle.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=songmuchun@bytedance.com \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    --subject='Re: [PATCH v3 12/14] device-dax: compound pagemap support' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox