From: Pavel Tatashin <pasha.tatashin@soleen.com>
Date: Sat, 4 May 2019 15:40:08 -0400
Subject: Re: [PATCH v6 03/12] mm/sparsemem: Add helpers track active portions of a section at boot
To: Dan Williams <dan.j.williams@intel.com>
Cc: akpm@linux-foundation.org, mhocko@suse.com, Vlastimil Babka,
 Logan Gunthorpe, linux-mm, linux-nvdimm, LKML, David Hildenbrand

On Sat, May 4, 2019, 3:26 PM Dan Williams <dan.j.williams@intel.com> wrote:
> On Thu, May 2, 2019 at 9:12 AM Pavel Tatashin
> <pasha.tatashin@soleen.com> wrote:
> >
> > On Wed, Apr 17, 2019 at 2:53 PM Dan Williams
> > <dan.j.williams@intel.com> wrote:
> > >
> > > Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
> > > section active bitmask, each bit representing 2MB (SECTION_SIZE (128M) /
> > > map_active bitmask length (64)). If it turns out that 2MB is too large
> > > of an active tracking granularity, it is trivial to increase the size
> > > of the map_active bitmap.
> >
> > Please mention that this is 2M on Intel and 16M on Arm64.
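> > (For the record, that follows from SECTION_SIZE / BITS_PER_LONG:
> > 2^27 / 64 = 2M per bit with 128M sections on x86_64, and, assuming
> > arm64's 1G sections (SECTION_SIZE_BITS == 30), 2^30 / 64 = 16M per
> > bit there.)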
> >
> > >
> > > The implication of a partially populated section is that pfn_valid()
> > > needs to go beyond a valid_section() check and read the sub-section
> > > active ranges from the bitmask.
> > >
> > > Cc: Michal Hocko <mhocko@suse.com>
> > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Logan Gunthorpe <logang@deltatee.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > ---
> > >  include/linux/mmzone.h |   29 ++++++++++++++++++++++++++++-
> > >  mm/page_alloc.c        |    4 +++-
> > >  mm/sparse.c            |   48 ++++++++++++++++++++++++++++++++++++++++++++
> > >  3 files changed, 79 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > > index 6726fc175b51..cffde898e345 100644
> > > --- a/include/linux/mmzone.h
> > > +++ b/include/linux/mmzone.h
> > > @@ -1175,6 +1175,8 @@ struct mem_section_usage {
> > >         unsigned long pageblock_flags[0];
> > >  };
> > >
> > > +void section_active_init(unsigned long pfn, unsigned long nr_pages);
> > > +
> > >  struct page;
> > >  struct page_ext;
> > >  struct mem_section {
> > > @@ -1312,12 +1314,36 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
> > >
> > >  extern int __highest_present_section_nr;
> > >
> > > +static inline int section_active_index(phys_addr_t phys)
> > > +{
> > > +       return (phys & ~(PA_SECTION_MASK)) / SECTION_ACTIVE_SIZE;
> >
> > How about also defining SECTION_ACTIVE_SHIFT like this:
> >
> > /* BITS_PER_LONG = 2^6 */
> > #define BITS_PER_LONG_SHIFT 6
> > #define SECTION_ACTIVE_SHIFT (SECTION_SIZE_BITS - BITS_PER_LONG_SHIFT)
> > #define SECTION_ACTIVE_SIZE (1 << SECTION_ACTIVE_SHIFT)
> >
> > The return above would become:
> > return (phys & ~(PA_SECTION_MASK)) >> SECTION_ACTIVE_SHIFT;
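> > (Sanity check: since SECTION_ACTIVE_SIZE is 1 << SECTION_ACTIVE_SHIFT,
> > the divide and the shift compute the same index, and with
> > SECTION_SIZE_BITS == 27 the shift is 27 - 6 == 21, i.e. the 2M
> > granule from the changelog.)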
> >
> > > +}
> > > +
> > > +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> > > +static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> > > +{
> > > +       int idx = section_active_index(PFN_PHYS(pfn));
> > > +
> > > +       return !!(ms->usage->map_active & (1UL << idx));
> > > +}
> > > +#else
> > > +static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> > > +{
> > > +       return 1;
> > > +}
> > > +#endif
> > > +
> > >  #ifndef CONFIG_HAVE_ARCH_PFN_VALID
> > >  static inline int pfn_valid(unsigned long pfn)
> > >  {
> > > +       struct mem_section *ms;
> > > +
> > >         if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> > >                 return 0;
> > > -       return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
> > > +       ms = __nr_to_section(pfn_to_section_nr(pfn));
> > > +       if (!valid_section(ms))
> > > +               return 0;
> > > +       return pfn_section_valid(ms, pfn);
> > >  }
> > >  #endif
> > >
> > > @@ -1349,6 +1375,7 @@ void sparse_init(void);
> > >  #define sparse_init()  do {} while (0)
> > >  #define sparse_index_init(_sec, _nid)  do {} while (0)
> > >  #define pfn_present pfn_valid
> > > +#define section_active_init(_pfn, _nr_pages) do {} while (0)
> > >  #endif /* CONFIG_SPARSEMEM */
> > >
> > >  /*
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index f671401a7c0b..c9ad28a78018 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -7273,10 +7273,12 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
> > >
> > >         /* Print out the early node map */
> > >         pr_info("Early memory node ranges\n");
> > > -       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
> > > +       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> > >                 pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
> > >                         (u64)start_pfn << PAGE_SHIFT,
> > >                         ((u64)end_pfn << PAGE_SHIFT) - 1);
> > > +               section_active_init(start_pfn, end_pfn - start_pfn);
> > > +       }
> > >
> > >         /* Initialise every node */
> > >         mminit_verify_pageflags_layout();
> > > diff --git a/mm/sparse.c b/mm/sparse.c
> > > index f87de7ad32c8..5ef2f884c4e1 100644
> > > --- a/mm/sparse.c
> > > +++ b/mm/sparse.c
> > > @@ -210,6 +210,54 @@ static inline unsigned long first_present_section_nr(void)
> > >         return next_present_section_nr(-1);
> > >  }
> > >
> > > +static unsigned long section_active_mask(unsigned long pfn,
> > > +               unsigned long nr_pages)
> > > +{
> > > +       int idx_start, idx_size;
> > > +       phys_addr_t start, size;
> > > +
> > > +       if (!nr_pages)
> > > +               return 0;
> > > +
> > > +       start = PFN_PHYS(pfn);
> > > +       size = PFN_PHYS(min(nr_pages, PAGES_PER_SECTION
> > > +                               - (pfn & ~PAGE_SECTION_MASK)));
> > > +       size = ALIGN(size, SECTION_ACTIVE_SIZE);
> > > +
> > > +       idx_start = section_active_index(start);
> > > +       idx_size = section_active_index(size);
> > > +
> > > +       if (idx_size == 0)
> > > +               return -1;
> > > +       return ((1UL << idx_size) - 1) << idx_start;
> > > +}
> > > +
> > > +void section_active_init(unsigned long pfn, unsigned long nr_pages)
> > > +{
> > > +       int end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
> > > +       int i, start_sec = pfn_to_section_nr(pfn);
> > > +
> > > +       if (!nr_pages)
> > > +               return;
> > > +
> > > +       for (i = start_sec; i <= end_sec; i++) {
> > > +               struct mem_section *ms;
> > > +               unsigned long mask;
> > > +               unsigned long pfns;
> > > +
> > > +               pfns = min(nr_pages, PAGES_PER_SECTION
> > > +                               - (pfn & ~PAGE_SECTION_MASK));
> > > +               mask = section_active_mask(pfn, pfns);
> > > +
> > > +               ms = __nr_to_section(i);
> > > +               pr_debug("%s: sec: %d mask: %#018lx\n", __func__, i, mask);
> > > +               ms->usage->map_active = mask;
> > > +
> > > +               pfn += pfns;
> > > +               nr_pages -= pfns;
> > > +       }
> > > +}
> >
> > For some reason the above code is confusing to me. It seems all the
> > code is supposed to do is set all of map_active to -1 and trim the
> > first and last sections (which can, of course, be the same section).
> > So I would replace the above two functions with one function like
> > this:
> >
> > void section_active_init(unsigned long pfn, unsigned long nr_pages)
> > {
> >         int end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
> >         int i, idx, start_sec = pfn_to_section_nr(pfn);
> >         struct mem_section *ms;
> >
> >         if (!nr_pages)
> >                 return;
> >
> >         for (i = start_sec; i <= end_sec; i++) {
> >                 ms = __nr_to_section(i);
> >                 ms->usage->map_active = ~0ul;
> >         }
> >
> >         /* Might need to trim active pfns from the beginning and end */
> >         idx = section_active_index(PFN_PHYS(pfn));
> >         ms = __nr_to_section(start_sec);
> >         ms->usage->map_active &= (~0ul << idx);
> >
> >         idx = section_active_index(PFN_PHYS(pfn + nr_pages - 1));
> >         ms = __nr_to_section(end_sec);
> >         ms->usage->map_active &= (~0ul >> (BITS_PER_LONG - idx - 1));
> > }
>
> I like the cleanup, but one of the fixes in v7 resulted in the
> realization that a given section may be populated twice at init time.
> For example, enabling that pr_debug() yields:
>
>     section_active_init: sec: 12 mask: 0x00000003ffffffff
>     section_active_init: sec: 12 mask: 0xe000000000000000
>
> So the implementation can't blindly clear bits based on the current
> parameters. However, I'm switching this code over to use bitmap_*()
> helpers, which should help with the readability.

Yes, bitmap_* will help, and I assume you will also make active_map
size scalable? I will take another look at version 8.
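In case it is useful, here is a rough sketch of how I picture the
bitmap_*() variant (purely illustrative and untested; it assumes
bitmap_set() from linux/bitmap.h plus the helpers and fields from this
patch):

/*
 * Illustrative sketch only: OR the sub-section bits into map_active
 * with bitmap_set() rather than assigning a freshly computed mask, so
 * a section populated by two disjoint ranges keeps the bits from both
 * calls (e.g. your sec 12 example would end up as 0xe0000003ffffffff).
 */
void section_active_init(unsigned long pfn, unsigned long nr_pages)
{
        int i, start_sec = pfn_to_section_nr(pfn);
        int end_sec = pfn_to_section_nr(pfn + nr_pages - 1);

        if (!nr_pages)
                return;

        for (i = start_sec; i <= end_sec; i++) {
                struct mem_section *ms = __nr_to_section(i);
                /* pfns that fall into this particular section */
                unsigned long pfns = min(nr_pages, PAGES_PER_SECTION
                                - (pfn & ~PAGE_SECTION_MASK));
                /* first and last sub-sections touched in this section */
                int idx = section_active_index(PFN_PHYS(pfn));
                int last = section_active_index(PFN_PHYS(pfn + pfns - 1));

                bitmap_set(&ms->usage->map_active, idx, last - idx + 1);

                pfn += pfns;
                nr_pages -= pfns;
        }
}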
Thank you,
Pasha