All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	x86@kernel.org, kasan-dev@googlegroups.com,
	borntraeger@de.ibm.com, heiko.carstens@de.ibm.com,
	davem@davemloft.net, willy@infradead.org, mhocko@kernel.org,
	ard.biesheuvel@linaro.org, mark.rutland@arm.com,
	will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org,
	mgorman@techsingularity.net, steven.sistare@oracle.com,
	daniel.m.jordan@oracle.com, bob.picco@oracle.com
Subject: [PATCH v10 01/10] x86/mm: setting fields in deferred pages
Date: Thu,  5 Oct 2017 17:11:15 -0400	[thread overview]
Message-ID: <20171005211124.26524-2-pasha.tatashin@oracle.com> (raw)
In-Reply-To: <20171005211124.26524-1-pasha.tatashin@oracle.com>

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled, however, we set fields in
register_page_bootmem_info that are subsequently clobbered right after in
free_all_bootmem:

        mem_init() {
                register_page_bootmem_info();
                free_all_bootmem();
                ...
        }

When register_page_bootmem_info() is called only non-deferred struct pages
are initialized. But, this function goes through some reserved pages which
might be part of the deferred, and thus are not yet initialized.

  mem_init
   register_page_bootmem_info
    register_page_bootmem_info_node
     get_page_bootmem
      .. setting fields here ..
      such as: page->freelist = (void *)type;

  free_all_bootmem()
   free_low_memory_core_early()
    for_each_reserved_mem_region()
     reserve_bootmem_region()
      init_reserved_page() <- Only if this is deferred reserved page
       __init_single_pfn()
        __init_single_page()
            memset(0) <-- Loose the set fields here

We end-up with issue where, currently we do not observe problem as memory
is explicitly zeroed. But, if flag asserts are changed we can start hitting
issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 arch/x86/mm/init_64.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ea1c3c2636e..8822523fdcd7 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1182,12 +1182,18 @@ void __init mem_init(void)
 
 	/* clear_bss() already clear the empty_zero_page */
 
-	register_page_bootmem_info();
-
 	/* this will put all memory onto the freelists */
 	free_all_bootmem();
 	after_bootmem = 1;
 
+	/*
+	 * Must be done after boot memory is put on freelist, because here we
+	 * might set fields in deferred struct pages that have not yet been
+	 * initialized, and free_all_bootmem() initializes all the reserved
+	 * deferred pages for us.
+	 */
+	register_page_bootmem_info();
+
 	/* Register memory areas for /proc/kcore */
 	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
 			 PAGE_SIZE, KCORE_OTHER);
-- 
2.14.2

WARNING: multiple messages have this Message-ID (diff)
From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v10 01/10] x86/mm: setting fields in deferred pages
Date: Thu, 05 Oct 2017 21:11:15 +0000	[thread overview]
Message-ID: <20171005211124.26524-2-pasha.tatashin@oracle.com> (raw)
In-Reply-To: <20171005211124.26524-1-pasha.tatashin@oracle.com>

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled, however, we set fields in
register_page_bootmem_info that are subsequently clobbered right after in
free_all_bootmem:

        mem_init() {
                register_page_bootmem_info();
                free_all_bootmem();
                ...
        }

When register_page_bootmem_info() is called only non-deferred struct pages
are initialized. But, this function goes through some reserved pages which
might be part of the deferred, and thus are not yet initialized.

  mem_init
   register_page_bootmem_info
    register_page_bootmem_info_node
     get_page_bootmem
      .. setting fields here ..
      such as: page->freelist = (void *)type;

  free_all_bootmem()
   free_low_memory_core_early()
    for_each_reserved_mem_region()
     reserve_bootmem_region()
      init_reserved_page() <- Only if this is deferred reserved page
       __init_single_pfn()
        __init_single_page()
            memset(0) <-- Loose the set fields here

We end-up with issue where, currently we do not observe problem as memory
is explicitly zeroed. But, if flag asserts are changed we can start hitting
issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 arch/x86/mm/init_64.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ea1c3c2636e..8822523fdcd7 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1182,12 +1182,18 @@ void __init mem_init(void)
 
 	/* clear_bss() already clear the empty_zero_page */
 
-	register_page_bootmem_info();
-
 	/* this will put all memory onto the freelists */
 	free_all_bootmem();
 	after_bootmem = 1;
 
+	/*
+	 * Must be done after boot memory is put on freelist, because here we
+	 * might set fields in deferred struct pages that have not yet been
+	 * initialized, and free_all_bootmem() initializes all the reserved
+	 * deferred pages for us.
+	 */
+	register_page_bootmem_info();
+
 	/* Register memory areas for /proc/kcore */
 	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
 			 PAGE_SIZE, KCORE_OTHER);
-- 
2.14.2


WARNING: multiple messages have this Message-ID (diff)
From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	x86@kernel.org, kasan-dev@googlegroups.com,
	borntraeger@de.ibm.com, heiko.carstens@de.ibm.com,
	davem@davemloft.net, willy@infradead.org, mhocko@kernel.org,
	ard.biesheuvel@linaro.org, mark.rutland@arm.com,
	will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org,
	mgorman@techsingularity.net, steven.sistare@oracle.com,
	daniel.m.jordan@oracle.com, bob.picco@oracle.com
Subject: [PATCH v10 01/10] x86/mm: setting fields in deferred pages
Date: Thu,  5 Oct 2017 17:11:15 -0400	[thread overview]
Message-ID: <20171005211124.26524-2-pasha.tatashin@oracle.com> (raw)
In-Reply-To: <20171005211124.26524-1-pasha.tatashin@oracle.com>

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled, however, we set fields in
register_page_bootmem_info that are subsequently clobbered right after in
free_all_bootmem:

        mem_init() {
                register_page_bootmem_info();
                free_all_bootmem();
                ...
        }

When register_page_bootmem_info() is called only non-deferred struct pages
are initialized. But, this function goes through some reserved pages which
might be part of the deferred, and thus are not yet initialized.

  mem_init
   register_page_bootmem_info
    register_page_bootmem_info_node
     get_page_bootmem
      .. setting fields here ..
      such as: page->freelist = (void *)type;

  free_all_bootmem()
   free_low_memory_core_early()
    for_each_reserved_mem_region()
     reserve_bootmem_region()
      init_reserved_page() <- Only if this is deferred reserved page
       __init_single_pfn()
        __init_single_page()
            memset(0) <-- Loose the set fields here

We end-up with issue where, currently we do not observe problem as memory
is explicitly zeroed. But, if flag asserts are changed we can start hitting
issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 arch/x86/mm/init_64.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ea1c3c2636e..8822523fdcd7 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1182,12 +1182,18 @@ void __init mem_init(void)
 
 	/* clear_bss() already clear the empty_zero_page */
 
-	register_page_bootmem_info();
-
 	/* this will put all memory onto the freelists */
 	free_all_bootmem();
 	after_bootmem = 1;
 
+	/*
+	 * Must be done after boot memory is put on freelist, because here we
+	 * might set fields in deferred struct pages that have not yet been
+	 * initialized, and free_all_bootmem() initializes all the reserved
+	 * deferred pages for us.
+	 */
+	register_page_bootmem_info();
+
 	/* Register memory areas for /proc/kcore */
 	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
 			 PAGE_SIZE, KCORE_OTHER);
-- 
2.14.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: pasha.tatashin@oracle.com (Pavel Tatashin)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v10 01/10] x86/mm: setting fields in deferred pages
Date: Thu,  5 Oct 2017 17:11:15 -0400	[thread overview]
Message-ID: <20171005211124.26524-2-pasha.tatashin@oracle.com> (raw)
In-Reply-To: <20171005211124.26524-1-pasha.tatashin@oracle.com>

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to first
initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled, however, we set fields in
register_page_bootmem_info that are subsequently clobbered right after in
free_all_bootmem:

        mem_init() {
                register_page_bootmem_info();
                free_all_bootmem();
                ...
        }

When register_page_bootmem_info() is called only non-deferred struct pages
are initialized. But, this function goes through some reserved pages which
might be part of the deferred, and thus are not yet initialized.

  mem_init
   register_page_bootmem_info
    register_page_bootmem_info_node
     get_page_bootmem
      .. setting fields here ..
      such as: page->freelist = (void *)type;

  free_all_bootmem()
   free_low_memory_core_early()
    for_each_reserved_mem_region()
     reserve_bootmem_region()
      init_reserved_page() <- Only if this is deferred reserved page
       __init_single_pfn()
        __init_single_page()
            memset(0) <-- Loose the set fields here

We end-up with issue where, currently we do not observe problem as memory
is explicitly zeroed. But, if flag asserts are changed we can start hitting
issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 arch/x86/mm/init_64.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ea1c3c2636e..8822523fdcd7 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1182,12 +1182,18 @@ void __init mem_init(void)
 
 	/* clear_bss() already clear the empty_zero_page */
 
-	register_page_bootmem_info();
-
 	/* this will put all memory onto the freelists */
 	free_all_bootmem();
 	after_bootmem = 1;
 
+	/*
+	 * Must be done after boot memory is put on freelist, because here we
+	 * might set fields in deferred struct pages that have not yet been
+	 * initialized, and free_all_bootmem() initializes all the reserved
+	 * deferred pages for us.
+	 */
+	register_page_bootmem_info();
+
 	/* Register memory areas for /proc/kcore */
 	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
 			 PAGE_SIZE, KCORE_OTHER);
-- 
2.14.2

  reply	other threads:[~2017-10-05 21:12 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-05 21:11 [PATCH v10 00/10] complete deferred page initialization Pavel Tatashin
2017-10-05 21:11 ` Pavel Tatashin
2017-10-05 21:11 ` Pavel Tatashin
2017-10-05 21:11 ` Pavel Tatashin
2017-10-05 21:11 ` Pavel Tatashin [this message]
2017-10-05 21:11   ` [PATCH v10 01/10] x86/mm: setting fields in deferred pages Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 02/10] sparc64/mm: " Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 03/10] sparc64: simplify vmemmap_populate Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 04/10] mm: defining memblock_virt_alloc_try_nid_raw Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 05/10] mm: zero reserved and unavailable struct pages Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-06 12:30   ` Michal Hocko
2017-10-06 12:30     ` Michal Hocko
2017-10-06 12:30     ` Michal Hocko
2017-10-06 12:30     ` Michal Hocko
2017-10-06 15:25     ` Pasha Tatashin
2017-10-06 15:25       ` Pasha Tatashin
2017-10-06 15:25       ` Pasha Tatashin
2017-10-06 15:25       ` Pasha Tatashin
2017-10-10 13:39       ` Michal Hocko
2017-10-10 13:39         ` Michal Hocko
2017-10-10 13:39         ` Michal Hocko
2017-10-10 13:39         ` Michal Hocko
2017-10-05 21:11 ` [PATCH v10 06/10] mm/kasan: kasan specific map populate function Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 07/10] x86/kasan: use kasan_map_populate() Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 08/10] arm64/kasan: " Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 09/10] mm: stop zeroing memory during allocation in vmemmap Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-06 11:10   ` David Laight
2017-10-06 11:10     ` David Laight
2017-10-06 11:10     ` David Laight
2017-10-06 11:10     ` David Laight
2017-10-06 11:10     ` David Laight
2017-10-06 11:47     ` Michal Hocko
2017-10-06 11:47       ` Michal Hocko
2017-10-06 11:47       ` Michal Hocko
2017-10-06 11:47       ` Michal Hocko
2017-10-06 11:47       ` Michal Hocko
2017-10-06 12:11       ` David Laight
2017-10-06 12:11         ` David Laight
2017-10-06 12:11         ` David Laight
2017-10-06 12:11         ` David Laight
2017-10-06 12:11         ` David Laight
2017-10-06 12:25         ` Michal Hocko
2017-10-06 12:25           ` Michal Hocko
2017-10-06 12:25           ` Michal Hocko
2017-10-06 12:25           ` Michal Hocko
2017-10-06 12:25           ` Michal Hocko
2017-10-05 21:11 ` [PATCH v10 10/10] sparc64: optimized struct page zeroing Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin
2017-10-05 21:11   ` Pavel Tatashin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171005211124.26524-2-pasha.tatashin@oracle.com \
    --to=pasha.tatashin@oracle.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=bob.picco@oracle.com \
    --cc=borntraeger@de.ibm.com \
    --cc=catalin.marinas@arm.com \
    --cc=daniel.m.jordan@oracle.com \
    --cc=davem@davemloft.net \
    --cc=heiko.carstens@de.ibm.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=sam@ravnborg.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=steven.sistare@oracle.com \
    --cc=will.deacon@arm.com \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.