From: Joao Martins <joao.m.martins@oracle.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux MM <linux-mm@kvack.org>,
Linux NVDIMM <nvdimm@lists.linux.dev>,
Ira Weiny <ira.weiny@intel.com>,
Matthew Wilcox <willy@infradead.org>,
Jason Gunthorpe <jgg@ziepe.ca>, Jane Chu <jane.chu@oracle.com>,
Muchun Song <songmuchun@bytedance.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v1 09/11] mm/page_alloc: reuse tail struct pages for compound pagemaps
Date: Mon, 14 Jun 2021 19:41:45 +0100
Message-ID: <0c6a4dab-296d-94b2-f885-2371292f9e0d@oracle.com>
In-Reply-To: <CAPcyv4hJtqVGoA3ppCMfVQ4ZnWUa7jKtp=Huxu9mcSk4huq_7Q@mail.gmail.com>
On 6/7/21 8:32 PM, Dan Williams wrote:
> On Mon, Jun 7, 2021 at 6:49 AM Joao Martins <joao.m.martins@oracle.com> wrote:
> [..]
>>> Given all of the above I'm wondering if there should be a new
>>> "compound specific" flavor of this routine rather than trying to
>>> cleverly inter mingle the old path with the new. This is easier
>>> because you have already done the work in previous patches to break
>>> this into helpers. So just have memmap_init_zone_device() do it the
>>> "old" way and memmap_init_compound() handle all the tail page init +
>>> optimizations.
>>>
>> I can separate it out, should be easier to follow.
>>
>> Albeit just a note, I think memmap_init_compound() should be the new normal as metadata
>> more accurately represents what goes on the page tables. That's regardless of
>> vmemmap-based gains, and hence why my train of thought was to not separate it.
>>
>> After this series, all devmap pages where @geometry matches @align will have compound
>> pages be used instead. And we enable that in device-dax as first user (in the next patch).
>> altmap or not so far just differentiates on the uniqueness of struct pages as the former
>> doesn't reuse base pages that only contain tail pages and consequently makes us initialize
>> all tail struct pages.
>
> I think combining the cases into a common central routine makes the
> code that much harder to maintain. A small duplication cost is worth
> it in my opinion to help readability / maintainability.
>
I am addressing this comment and taking a step back. Just moving the tail page init to
memmap_init_compound() makes this a lot more readable. That said, I now think that having
separate top-level loops over pfns doesn't bring much improvement on top of that.

Here's what I have by moving just the tail init to a separate routine. Your original
suggestion follows after the scissors mark. I have a slight inclination towards the first
one, but no really strong preference. Thoughts?
[...]
static void __ref memmap_init_compound(struct page *page, unsigned long pfn,
					unsigned long zone_idx, int nid,
					struct dev_pagemap *pgmap,
					unsigned long nr_pages)
{
	unsigned int order_align = order_base_2(nr_pages);
	unsigned long i;

	__SetPageHead(page);

	for (i = 1; i < nr_pages; i++) {
		__init_zone_device_page(page + i, pfn + i, zone_idx,
					nid, pgmap);
		prep_compound_tail(page, i);

		/*
		 * The first and second tail pages need to be
		 * initialized first, hence the head page is
		 * prepared last.
		 */
		if (i == 2)
			prep_compound_head(page, order_align);
	}
}

void __ref memmap_init_zone_device(struct zone *zone,
				   unsigned long start_pfn,
				   unsigned long nr_pages,
				   struct dev_pagemap *pgmap)
{
	unsigned long pfn, end_pfn = start_pfn + nr_pages;
	struct pglist_data *pgdat = zone->zone_pgdat;
	struct vmem_altmap *altmap = pgmap_altmap(pgmap);
	unsigned long pfns_per_compound = pgmap_pfn_geometry(pgmap);
	unsigned long zone_idx = zone_idx(zone);
	unsigned long start = jiffies;
	int nid = pgdat->node_id;

	if (WARN_ON_ONCE(!pgmap || zone_idx(zone) != ZONE_DEVICE))
		return;

	/*
	 * The call to memmap_init_zone should have already taken care
	 * of the pages reserved for the memmap, so we can just jump to
	 * the end of that region and start processing the device pages.
	 */
	if (altmap) {
		start_pfn = altmap->base_pfn + vmem_altmap_offset(altmap);
		nr_pages = end_pfn - start_pfn;
	}

	for (pfn = start_pfn; pfn < end_pfn; pfn += pfns_per_compound) {
		struct page *page = pfn_to_page(pfn);

		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);

		if (pfns_per_compound == 1)
			continue;

		memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
				     pfns_per_compound);
	}

	pr_info("%s initialised %lu pages in %ums\n", __func__,
		nr_pages, jiffies_to_msecs(jiffies - start));
}
[...]
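
For anyone following along outside the kernel tree, the control flow of this first variant can be modelled in userspace. This is only a rough sketch: struct mock_page and every mock_*() function are hypothetical stand-ins, not kernel APIs, and keeping the order in the first tail page is an assumption made here purely to show why the tails are initialised before the head metadata is written. Unlike the code above, the sketch writes the head metadata after the loop rather than at i == 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; all fields are illustrative. */
struct mock_page {
	unsigned long pfn;
	int is_head;		/* models PageHead() */
	long head_idx;		/* head index for a tail, -1 otherwise */
	unsigned int order;	/* assumption: kept in the first tail page */
};

/* Userspace equivalent of order_base_2(): ceil(log2(n)) for n >= 1. */
static unsigned int mock_order_base_2(unsigned long n)
{
	unsigned int order = 0;

	while ((1UL << order) < n)
		order++;
	return order;
}

/* Models __init_zone_device_page(): resets every field of the page. */
static void mock_init_page(struct mock_page *page, unsigned long pfn)
{
	page->pfn = pfn;
	page->is_head = 0;
	page->head_idx = -1;
	page->order = 0;
}

/* Models memmap_init_compound(): the caller has initialised the head. */
static void mock_init_compound(struct mock_page *head, unsigned long pfn,
			       size_t head_idx, unsigned long nr_pages)
{
	unsigned long i;

	head->is_head = 1;
	for (i = 1; i < nr_pages; i++) {
		mock_init_page(head + i, pfn + i);
		head[i].head_idx = (long)head_idx;
	}
	/*
	 * Head metadata goes in last: writing it before the loop would be
	 * undone when mock_init_page() resets the first tail page.
	 */
	if (nr_pages > 1)
		head[1].order = mock_order_base_2(nr_pages);
}

/* Models the single top-level loop; nr_pages must be a multiple of geometry. */
static void mock_memmap_init(struct mock_page *map, unsigned long start_pfn,
			     unsigned long nr_pages, unsigned long geometry)
{
	unsigned long off;

	for (off = 0; off < nr_pages; off += geometry) {
		mock_init_page(&map[off], start_pfn + off);
		if (geometry == 1)
			continue;
		mock_init_compound(&map[off], start_pfn + off, off, geometry);
	}
}
```

With a geometry of 1 the loop degenerates into the plain per-pfn init, which is why this variant needs no separate base-page routine.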
--->8-----
Whereas your original suggestion would look more like this:
[...]
static void __ref memmap_init_compound(unsigned long zone_idx, int nid,
					struct dev_pagemap *pgmap,
					unsigned long start_pfn,
					unsigned long end_pfn)
{
	unsigned long pfns_per_compound = pgmap_pfn_geometry(pgmap);
	unsigned int order_align = order_base_2(pfns_per_compound);
	unsigned long pfn, i;

	for (pfn = start_pfn; pfn < end_pfn; pfn += pfns_per_compound) {
		struct page *page = pfn_to_page(pfn);

		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
		__SetPageHead(page);

		for (i = 1; i < pfns_per_compound; i++) {
			__init_zone_device_page(page + i, pfn + i, zone_idx,
						nid, pgmap);
			prep_compound_tail(page, i);

			/*
			 * The first and second tail pages need to be
			 * initialized first, hence the head page is
			 * prepared last.
			 */
			if (i == 2)
				prep_compound_head(page, order_align);
		}
	}
}

static void __ref memmap_init_base(unsigned long zone_idx, int nid,
				   struct dev_pagemap *pgmap,
				   unsigned long start_pfn,
				   unsigned long end_pfn)
{
	unsigned long pfn;

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		struct page *page = pfn_to_page(pfn);

		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
	}
}

void __ref memmap_init_zone_device(struct zone *zone,
				   unsigned long start_pfn,
				   unsigned long nr_pages,
				   struct dev_pagemap *pgmap)
{
	unsigned long end_pfn = start_pfn + nr_pages;
	struct pglist_data *pgdat = zone->zone_pgdat;
	struct vmem_altmap *altmap = pgmap_altmap(pgmap);
	bool compound = pgmap_geometry(pgmap) > PAGE_SIZE;
	unsigned long zone_idx = zone_idx(zone);
	unsigned long start = jiffies;
	int nid = pgdat->node_id;

	if (WARN_ON_ONCE(!pgmap || zone_idx(zone) != ZONE_DEVICE))
		return;

	/*
	 * The call to memmap_init_zone should have already taken care
	 * of the pages reserved for the memmap, so we can just jump to
	 * the end of that region and start processing the device pages.
	 */
	if (altmap) {
		start_pfn = altmap->base_pfn + vmem_altmap_offset(altmap);
		nr_pages = end_pfn - start_pfn;
	}

	if (compound)
		memmap_init_compound(zone_idx, nid, pgmap, start_pfn, end_pfn);
	else
		memmap_init_base(zone_idx, nid, pgmap, start_pfn, end_pfn);

	pr_info("%s initialised %lu pages in %ums\n", __func__,
		nr_pages, jiffies_to_msecs(jiffies - start));
}
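
Both variants are meant to leave the memmap in the same state; they differ only in where the loop over pfns lives. A rough userspace model of that claim follows. It is a sketch under stated assumptions: struct mock_page is a deliberately reduced, hypothetical stand-in for struct page, and none of the mock_*() helpers exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct page. */
struct mock_page {
	unsigned long pfn;
	int is_head;
	long head_idx;	/* head index for a tail, -1 otherwise */
};

static void mock_init_page(struct mock_page *page, unsigned long pfn)
{
	page->pfn = pfn;
	page->is_head = 0;
	page->head_idx = -1;
}

/* Shared tail setup for one compound page starting at index @head. */
static void mock_init_tails(struct mock_page *map, size_t head,
			    unsigned long pfn, unsigned long geometry)
{
	unsigned long i;

	map[head].is_head = 1;
	for (i = 1; i < geometry; i++) {
		mock_init_page(&map[head + i], pfn + i);
		map[head + i].head_idx = (long)head;
	}
}

/* First variant: one top-level loop, tails handled by a helper. */
static void mock_init_single_loop(struct mock_page *map, unsigned long pfn,
				  size_t nr, unsigned long geometry)
{
	size_t off;

	for (off = 0; off < nr; off += geometry) {
		mock_init_page(&map[off], pfn + off);
		if (geometry == 1)
			continue;
		mock_init_tails(map, off, pfn + off, geometry);
	}
}

/* Second variant: base and compound paths each own their loop. */
static void mock_init_base(struct mock_page *map, unsigned long pfn, size_t nr)
{
	size_t off;

	for (off = 0; off < nr; off++)
		mock_init_page(&map[off], pfn + off);
}

static void mock_init_compound(struct mock_page *map, unsigned long pfn,
			       size_t nr, unsigned long geometry)
{
	size_t off;

	for (off = 0; off < nr; off += geometry) {
		mock_init_page(&map[off], pfn + off);
		mock_init_tails(map, off, pfn + off, geometry);
	}
}

static void mock_init_split(struct mock_page *map, unsigned long pfn,
			    size_t nr, unsigned long geometry)
{
	if (geometry > 1)
		mock_init_compound(map, pfn, nr, geometry);
	else
		mock_init_base(map, pfn, nr);
}

/* Field-by-field comparison; returns 1 when both maps match. */
static int mock_maps_equal(const struct mock_page *a,
			   const struct mock_page *b, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (a[i].pfn != b[i].pfn || a[i].is_head != b[i].is_head ||
		    a[i].head_idx != b[i].head_idx)
			return 0;
	return 1;
}
```

In this model the second variant repeats only the head init call in each loop, which corresponds to the small duplication cost discussed above.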