nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Joao Martins <joao.m.martins@oracle.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux MM <linux-mm@kvack.org>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Matthew Wilcox <willy@infradead.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	John Hubbard <jhubbard@nvidia.com>,
	Jane Chu <jane.chu@oracle.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Linux NVDIMM <nvdimm@lists.linux.dev>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>
Subject: Re: [PATCH v3 12/14] device-dax: compound pagemap support
Date: Wed, 28 Jul 2021 10:36:04 +0100	[thread overview]
Message-ID: <156c4fb8-46c5-d8ae-b953-837b86516ded@oracle.com> (raw)
In-Reply-To: <CAPcyv4hRQhG+0ika-wbxSFYrpmMJHxxX456qE64PMxDoxS+Fwg@mail.gmail.com>

On 7/28/21 12:51 AM, Dan Williams wrote:
> On Thu, Jul 15, 2021 at 5:01 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>> On 7/15/21 12:36 AM, Dan Williams wrote:
>>> On Wed, Jul 14, 2021 at 12:36 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>> This patch is not the culprit, the flaw is early in the series, specifically the fourth patch.
>>
>> It needs this chunk below change on the fourth patch due to the existing elevated page ref
>> count at zone device memmap init. put_page() called here in memunmap_pages():
>>
>> for (i = 0; i < pgmap->nr_ranges; i++)
>>         for_each_device_pfn(pfn, pgmap, i)
>>                 put_page(pfn_to_page(pfn));
>>
>> ... on a zone_device compound memmap would otherwise always decrease head page refcount by
>> @geometry pfn amount (leading to the aforementioned splat you reported).
>>
>> diff --git a/mm/memremap.c b/mm/memremap.c
>> index b0e7b8cf3047..79a883af788e 100644
>> --- a/mm/memremap.c
>> +++ b/mm/memremap.c
>> @@ -102,15 +102,15 @@ static unsigned long pfn_end(struct dev_pagemap *pgmap, int range_id)
>>         return (range->start + range_len(range)) >> PAGE_SHIFT;
>>  }
>>
>> -static unsigned long pfn_next(unsigned long pfn)
>> +static unsigned long pfn_next(struct dev_pagemap *pgmap, unsigned long pfn)
>>  {
>>         if (pfn % 1024 == 0)
>>                 cond_resched();
>> -       return pfn + 1;
>> +       return pfn + pgmap_pfn_geometry(pgmap);
> 
> The cond_resched() would need to be fixed up too to something like:
> 
> if (pfn % (1024 << pgmap_geometry_order(pgmap)))
>     cond_resched();
> 
> ...because the goal is to take a break every 1024 iterations, not
> every 1024 pfns.
> 

Ah, good point.

>>  }
>>
>>  #define for_each_device_pfn(pfn, map, i) \
>> -       for (pfn = pfn_first(map, i); pfn < pfn_end(map, i); pfn = pfn_next(pfn))
>> +       for (pfn = pfn_first(map, i); pfn < pfn_end(map, i); pfn = pfn_next(map, pfn))
>>
>>  static void dev_pagemap_kill(struct dev_pagemap *pgmap)
>>  {
>>
>> It could also get this hunk below, but it is sort of redundant provided we won't touch
>> tail page refcount through out the devmap pages lifetime. This setting of tail pages
>> refcount to zero was in pre-v5.14 series, but it got removed under the assumption it comes
>> from the page allocator (where tail pages are already zeroed in refcount).
> 
> Wait, devmap pages never see the page allocator?
> 
"where tail pages are already zeroed in refcount" this actually meant 'freshly allocated
pages' and I was referring to commit 7118fc2906e2 ("hugetlb: address ref count racing in
prep_compound_gigantic_page") that removed set_page_count() because the setting of page
ref count to zero was redundant.

Albeit devmap pages don't come from page allocator, you know separate zone and these pages
aren't part of the regular page pools (e.g. accessible via alloc_pages()), as you are
aware. Unless of course, we reassign them via dax_kmem, but then the way we map the struct
pages would be regular without any devmap stuff.

>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 96975edac0a8..469a7aa5cf38 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6623,6 +6623,7 @@ static void __ref memmap_init_compound(struct page *page, unsigned
>> long pfn,
>>                 __init_zone_device_page(page + i, pfn + i, zone_idx,
>>                                         nid, pgmap);
>>                 prep_compound_tail(page, i);
>> +               set_page_count(page + i, 0);
> 
> Looks good to me and perhaps a for elevated tail page refcount at
> teardown as a sanity check that the tail pages was never pinned
> directly?
> 
Sorry didn't follow completely.

You meant to set tail page refcount back to 1 at teardown if it was kept to 0 (e.g.
memunmap_pages() after put_page()) or that the refcount is indeed kept to zero after the
put_page() in memunmap_pages() ?

  reply	other threads:[~2021-07-28  9:36 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-14 19:35 [PATCH v3 00/14] mm, sparse-vmemmap: Introduce compound pagemaps Joao Martins
2021-07-14 19:35 ` [PATCH v3 01/14] memory-failure: fetch compound_head after pgmap_pfn_valid() Joao Martins
2021-07-15  0:17   ` Dan Williams
2021-07-15  2:51   ` [External] " Muchun Song
2021-07-15  6:40     ` Christoph Hellwig
2021-07-15  9:19       ` Muchun Song
2021-07-15 13:17     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 02/14] mm/page_alloc: split prep_compound_page into head and tail subparts Joao Martins
2021-07-15  0:19   ` Dan Williams
2021-07-15  2:53   ` [External] " Muchun Song
2021-07-15 13:17     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 03/14] mm/page_alloc: refactor memmap_init_zone_device() page init Joao Martins
2021-07-15  0:20   ` Dan Williams
2021-07-14 19:35 ` [PATCH v3 04/14] mm/memremap: add ZONE_DEVICE support for compound pages Joao Martins
2021-07-15  1:08   ` Dan Williams
2021-07-15 12:52     ` Joao Martins
2021-07-15 13:06       ` Joao Martins
2021-07-15 19:48       ` Dan Williams
2021-07-30 16:13         ` Joao Martins
2021-07-22  0:38       ` Jane Chu
2021-07-22 10:56         ` Joao Martins
2021-07-15 12:59     ` Christoph Hellwig
2021-07-15 13:15       ` Joao Martins
2021-07-15  6:48   ` Christoph Hellwig
2021-07-15 13:15     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 05/14] mm/sparse-vmemmap: add a pgmap argument to section activation Joao Martins
2021-07-28  5:56   ` Dan Williams
2021-07-28  9:43     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 06/14] mm/sparse-vmemmap: refactor core of vmemmap_populate_basepages() to helper Joao Martins
2021-07-28  6:04   ` Dan Williams
2021-07-28 10:48     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 07/14] mm/hugetlb_vmemmap: move comment block to Documentation/vm Joao Martins
2021-07-15  2:47   ` [External] " Muchun Song
2021-07-15 13:16     ` Joao Martins
2021-07-28  6:09   ` Dan Williams
2021-07-14 19:35 ` [PATCH v3 08/14] mm/sparse-vmemmap: populate compound pagemaps Joao Martins
2021-07-28  6:55   ` Dan Williams
2021-07-28 15:35     ` Joao Martins
2021-07-28 18:03       ` Dan Williams
2021-07-28 18:54         ` Joao Martins
2021-07-28 20:04           ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 09/14] mm/page_alloc: reuse tail struct pages for " Joao Martins
2021-07-28  7:28   ` Dan Williams
2021-07-28 15:56     ` Joao Martins
2021-07-28 16:08       ` Dan Williams
2021-07-28 16:12         ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 10/14] device-dax: use ALIGN() for determining pgoff Joao Martins
2021-07-28  7:29   ` Dan Williams
2021-07-28 15:56     ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 11/14] device-dax: ensure dev_dax->pgmap is valid for dynamic devices Joao Martins
2021-07-28  7:30   ` Dan Williams
2021-07-28 15:56     ` Joao Martins
2021-08-06 12:28       ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 12/14] device-dax: compound pagemap support Joao Martins
2021-07-14 23:36   ` Dan Williams
2021-07-15 12:00     ` Joao Martins
2021-07-27 23:51       ` Dan Williams
2021-07-28  9:36         ` Joao Martins [this message]
2021-07-28 18:51           ` Dan Williams
2021-07-28 18:59             ` Joao Martins
2021-07-28 19:03               ` Dan Williams
2021-07-14 19:35 ` [PATCH v3 13/14] mm/gup: grab head page refcount once for group of subpages Joao Martins
2021-07-28 19:55   ` Dan Williams
2021-07-28 20:07     ` Joao Martins
2021-07-28 20:23       ` Dan Williams
2021-08-25 19:10         ` Joao Martins
2021-08-25 19:15           ` Matthew Wilcox
2021-08-25 19:26             ` Joao Martins
2021-07-14 19:35 ` [PATCH v3 14/14] mm/sparse-vmemmap: improve memory savings for compound pud geometry Joao Martins
2021-07-28 20:03   ` Dan Williams
2021-07-28 20:08     ` Joao Martins
2021-07-14 21:48 ` [PATCH v3 00/14] mm, sparse-vmemmap: Introduce compound pagemaps Andrew Morton
2021-07-14 23:47   ` Dan Williams
2021-07-22  2:24   ` Matthew Wilcox
2021-07-22 10:53     ` Joao Martins
2021-07-27 23:23       ` Dan Williams
2021-08-02 10:40         ` Joao Martins
2021-08-02 14:06           ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=156c4fb8-46c5-d8ae-b953-837b86516ded@oracle.com \
    --to=joao.m.martins@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=jane.chu@oracle.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=songmuchun@bytedance.com \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).