From: Wei Yang <richardw.yang@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@suse.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Wei Yang <richard.weiyang@gmail.com>,
	Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH v9 02/12] mm/sparsemem: Add helpers track active portions of a section at boot
Date: Tue, 18 Jun 2019 09:03:51 +0800
Message-ID: <20190618010351.GC18161@richard>
In-Reply-To: <CAPcyv4hdsvNL0QfA2ACHAaGZE+21RmAnfKYfrZsKGKUxu3eKRQ@mail.gmail.com>

On Mon, Jun 17, 2019 at 03:32:45PM -0700, Dan Williams wrote:
>On Mon, Jun 17, 2019 at 3:22 PM Wei Yang <richard.weiyang@gmail.com> wrote:
>>
>> On Wed, Jun 05, 2019 at 02:57:59PM -0700, Dan Williams wrote:
>> >Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
>> >sub-section active bitmask, each bit representing a PMD_SIZE span of the
>> >architecture's memory hotplug section size.
>> >
>> >The implications of a partially populated section is that pfn_valid()
>> >needs to go beyond a valid_section() check and read the sub-section
>> >active ranges from the bitmask. The expectation is that the bitmask
>> >(subsection_map) fits in the same cacheline as the valid_section() data,
>> >so the incremental performance overhead to pfn_valid() should be
>> >negligible.
>> >
>> >Cc: Michal Hocko <mhocko@suse.com>
>> >Cc: Vlastimil Babka <vbabka@suse.cz>
>> >Cc: Logan Gunthorpe <logang@deltatee.com>
>> >Cc: Oscar Salvador <osalvador@suse.de>
>> >Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
>> >Tested-by: Jane Chu <jane.chu@oracle.com>
>> >Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> >---
>> > include/linux/mmzone.h |   29 ++++++++++++++++++++++++++++-
>> > mm/page_alloc.c        |    4 +++-
>> > mm/sparse.c            |   35 +++++++++++++++++++++++++++++++++++
>> > 3 files changed, 66 insertions(+), 2 deletions(-)
>> >
>> >diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> >index ac163f2f274f..6dd52d544857 100644
>> >--- a/include/linux/mmzone.h
>> >+++ b/include/linux/mmzone.h
>> >@@ -1199,6 +1199,8 @@ struct mem_section_usage {
>> > 	unsigned long pageblock_flags[0];
>> > };
>> >
>> >+void subsection_map_init(unsigned long pfn, unsigned long nr_pages);
>> >+
>> > struct page;
>> > struct page_ext;
>> > struct mem_section {
>> >@@ -1336,12 +1338,36 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
>> >
>> > extern int __highest_present_section_nr;
>> >
>> >+static inline int subsection_map_index(unsigned long pfn)
>> >+{
>> >+	return (pfn & ~(PAGE_SECTION_MASK)) / PAGES_PER_SUBSECTION;
>> >+}
>> >+
>> >+#ifdef CONFIG_SPARSEMEM_VMEMMAP
>> >+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>> >+{
>> >+	int idx = subsection_map_index(pfn);
>> >+
>> >+	return test_bit(idx, ms->usage->subsection_map);
>> >+}
>> >+#else
>> >+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>> >+{
>> >+	return 1;
>> >+}
>> >+#endif
>> >+
>> > #ifndef CONFIG_HAVE_ARCH_PFN_VALID
>> > static inline int pfn_valid(unsigned long pfn)
>> > {
>> >+	struct mem_section *ms;
>> >+
>> > 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>> > 		return 0;
>> >-	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
>> >+	ms = __nr_to_section(pfn_to_section_nr(pfn));
>> >+	if (!valid_section(ms))
>> >+		return 0;
>> >+	return pfn_section_valid(ms, pfn);
>> > }
>> > #endif
>> >
>> >@@ -1373,6 +1399,7 @@ void sparse_init(void);
>> > #define sparse_init()	do {} while (0)
>> > #define sparse_index_init(_sec, _nid)  do {} while (0)
>> > #define pfn_present pfn_valid
>> >+#define subsection_map_init(_pfn, _nr_pages) do {} while (0)
>> > #endif /* CONFIG_SPARSEMEM */
>> >
>> > /*
>> >diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> >index c6d8224d792e..bd773efe5b82 100644
>> >--- a/mm/page_alloc.c
>> >+++ b/mm/page_alloc.c
>> >@@ -7292,10 +7292,12 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
>> >
>> > 	/* Print out the early node map */
>> > 	pr_info("Early memory node ranges\n");
>> >-	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
>> >+	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
>> > 		pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
>> > 			(u64)start_pfn << PAGE_SHIFT,
>> > 			((u64)end_pfn << PAGE_SHIFT) - 1);
>> >+		subsection_map_init(start_pfn, end_pfn - start_pfn);
>> >+	}
>>
>> Just curious about why we set subsection here?
>>
>> Function free_area_init_nodes() mostly handles pgdat, if I am correct. Setup
>> subsection here looks like touching some lower level system data structure.
>
>Correct, I'm not sure how it ended up there, but it was the source of
>a bug that was fixed with this change:
>
>https://lore.kernel.org/lkml/CAPcyv4hjvBPDYKpp2Gns3-cc2AQ0AVS1nLk-K3fwXeRUvvzQLg@mail.gmail.com/

So this one is moved to sparse_init_nid().

The bug is strange, while the code now is more reasonable to me.

Thanks :-)

>_______________________________________________
>Linux-nvdimm mailing list
>Linux-nvdimm@lists.01.org
>https://lists.01.org/mailman/listinfo/linux-nvdimm

-- 
Wei Yang
Help you, Help me
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
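
For readers following the quoted patch, the subsection lookup can be tried outside the kernel. What follows is only a user-space sketch, not the kernel code: the constants assume an x86_64-like layout (4 KiB pages, 128 MiB sections, 2 MiB subsections, so 64 subsections per section), and struct toy_section, toy_subsection_map_init() and toy_pfn_valid() are hypothetical stand-ins for struct mem_section, subsection_map_init() and pfn_valid(). Only subsection_map_index() mirrors the arithmetic in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86_64-like values; other architectures differ. */
#define PAGE_SHIFT              12	/* 4 KiB pages */
#define SECTION_SIZE_BITS       27	/* 128 MiB sections */
#define SUBSECTION_SHIFT        21	/* 2 MiB, i.e. PMD_SIZE */

#define PFN_SECTION_SHIFT       (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION       (1UL << PFN_SECTION_SHIFT)
#define PAGE_SECTION_MASK       (~(PAGES_PER_SECTION - 1))
#define PAGES_PER_SUBSECTION    (1UL << (SUBSECTION_SHIFT - PAGE_SHIFT))

/* Hypothetical stand-in for one mem_section and its usage bitmap. */
struct toy_section {
	bool valid;			/* plays the role of valid_section() */
	uint64_t subsection_map;	/* one bit per 2 MiB subsection */
};

/* Same arithmetic as the helper in the quoted patch. */
static int subsection_map_index(unsigned long pfn)
{
	return (pfn & ~(PAGE_SECTION_MASK)) / PAGES_PER_SUBSECTION;
}

/* Mark every subsection touched by [pfn, pfn + nr_pages) as active.
 * The toy version only handles a range inside a single section. */
static void toy_subsection_map_init(struct toy_section *ms,
				    unsigned long pfn, unsigned long nr_pages)
{
	assert(nr_pages > 0);
	assert((pfn >> PFN_SECTION_SHIFT) ==
	       ((pfn + nr_pages - 1) >> PFN_SECTION_SHIFT));

	int first = subsection_map_index(pfn);
	int last = subsection_map_index(pfn + nr_pages - 1);

	for (int i = first; i <= last; i++)
		ms->subsection_map |= UINT64_C(1) << i;
	ms->valid = true;
}

/* Section check first, then the per-subsection bit. */
static int toy_pfn_valid(const struct toy_section *ms, unsigned long pfn)
{
	if (!ms->valid)
		return 0;
	return (ms->subsection_map >> subsection_map_index(pfn)) & 1;
}
```

With these assumed constants, initialising pfns 512..1535 sets bits 1 and 2 of the map, so pfn 511 and pfn 1536 are reported invalid even though the section as a whole is valid. That is the behaviour the commit message describes for a partially populated section.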