From: Ingo Molnar <mingo@kernel.org>
To: Mike Galbraith <efault@gmx.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-kernel@vger.kernel.org, stable@vger.kernel.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Andy Lutomirski <luto@amacapital.net>,
Borislav Petkov <bp@suse.de>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
linux-mm@kvack.org
Subject: Re: [PATCH 4.14 023/159] mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
Date: Mon, 8 Jan 2018 17:04:44 +0100
Message-ID: <20180108160444.2ol4fvgqbxnjmlpg@gmail.com>
In-Reply-To: <1515302062.6507.18.camel@gmx.de>

Hi Kirill,

As Mike reported below, your 5-level paging related upstream commit
83e3c48729d9 and all its follow-up fixes:

  83e3c48729d9: mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y
  629a359bdb0e: mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y
  d09cfbbfa0f7: mm/sparse.c: wrong allocation for mem_section

... still break kexec - and that now regresses -stable as well.

Given that 5-level paging now syntactically depends on having this commit, if we
fully revert this then we'll have to disable 5-level paging as well.

Thanks,

	Ingo

* Mike Galbraith <efault@gmx.de> wrote:
> On Fri, 2017-12-22 at 09:45 +0100, Greg Kroah-Hartman wrote:
> > 4.14-stable review patch. If anyone has any objections, please let me know.
>
> FYI, this broke kdump, or rather the makedumpfile part thereof.
> Forward looking wreckage is par for the kdump course, but...
>
> > ------------------
> >
> > From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >
> > commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream.
> >
> > Size of the mem_section[] array depends on the size of the physical address space.
> >
> > In preparation for boot-time switching between paging modes on x86-64
> > we need to make the allocation of mem_section[] dynamic, because otherwise
> > we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB
> > for 4-level paging and 2MB for 5-level paging mode.
> >
> > The patch allocates the array on the first call to sparse_memory_present_with_active_regions().
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@amacapital.net>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: linux-mm@kvack.org
> > Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >
> > ---
> > include/linux/mmzone.h | 6 +++++-
> > mm/page_alloc.c | 10 ++++++++++
> > mm/sparse.c | 17 +++++++++++------
> > 3 files changed, 26 insertions(+), 7 deletions(-)
> >
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -1152,13 +1152,17 @@ struct mem_section {
> > #define SECTION_ROOT_MASK (SECTIONS_PER_ROOT - 1)
> >
> > #ifdef CONFIG_SPARSEMEM_EXTREME
> > -extern struct mem_section *mem_section[NR_SECTION_ROOTS];
> > +extern struct mem_section **mem_section;
> > #else
> > extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
> > #endif
> >
> > static inline struct mem_section *__nr_to_section(unsigned long nr)
> > {
> > +#ifdef CONFIG_SPARSEMEM_EXTREME
> > + if (!mem_section)
> > + return NULL;
> > +#endif
> > if (!mem_section[SECTION_NR_TO_ROOT(nr)])
> > return NULL;
> > return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5651,6 +5651,16 @@ void __init sparse_memory_present_with_a
> > unsigned long start_pfn, end_pfn;
> > int i, this_nid;
> >
> > +#ifdef CONFIG_SPARSEMEM_EXTREME
> > + if (!mem_section) {
> > + unsigned long size, align;
> > +
> > + size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
> > + align = 1 << (INTERNODE_CACHE_SHIFT);
> > + mem_section = memblock_virt_alloc(size, align);
> > + }
> > +#endif
> > +
> > for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, &this_nid)
> > memory_present(this_nid, start_pfn, end_pfn);
> > }
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -23,8 +23,7 @@
> > * 1) mem_section - memory sections, mem_map's for valid memory
> > */
> > #ifdef CONFIG_SPARSEMEM_EXTREME
> > -struct mem_section *mem_section[NR_SECTION_ROOTS]
> > - ____cacheline_internodealigned_in_smp;
> > +struct mem_section **mem_section;
> > #else
> > struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
> > ____cacheline_internodealigned_in_smp;
> > @@ -101,7 +100,7 @@ static inline int sparse_index_init(unsi
> > int __section_nr(struct mem_section* ms)
> > {
> > unsigned long root_nr;
> > - struct mem_section* root;
> > + struct mem_section *root = NULL;
> >
> > for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
> > root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
> > @@ -112,7 +111,7 @@ int __section_nr(struct mem_section* ms)
> > break;
> > }
> >
> > - VM_BUG_ON(root_nr == NR_SECTION_ROOTS);
> > + VM_BUG_ON(!root);
> >
> > return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
> > }
> > @@ -330,11 +329,17 @@ again:
> > static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
> > {
> > unsigned long usemap_snr, pgdat_snr;
> > - static unsigned long old_usemap_snr = NR_MEM_SECTIONS;
> > - static unsigned long old_pgdat_snr = NR_MEM_SECTIONS;
> > + static unsigned long old_usemap_snr;
> > + static unsigned long old_pgdat_snr;
> > struct pglist_data *pgdat = NODE_DATA(nid);
> > int usemap_nid;
> >
> > + /* First call */
> > + if (!old_usemap_snr) {
> > + old_usemap_snr = NR_MEM_SECTIONS;
> > + old_pgdat_snr = NR_MEM_SECTIONS;
> > + }
> > +
> > usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
> > pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
> > if (usemap_snr == pgdat_snr)
> >
> >