* [PATCH v5] mm: don't allow deferred pages with NEED_PER_CPU_KM
@ 2018-05-15 17:51 Pavel Tatashin
  2018-05-15 20:43 ` Michal Hocko
  2018-05-15 21:12 ` Andrew Morton
  0 siblings, 2 replies; 4+ messages in thread
From: Pavel Tatashin @ 2018-05-15 17:51 UTC
  To: steven.sistare, daniel.m.jordan, akpm, linux-kernel, tglx,
	mhocko, linux-mm, mgorman, mingo, peterz, rostedt, fengguang.wu,
	dennisszhou

It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS). This is because, when deferred struct
pages are used, struct pages for all the allocated memory are initialized
only in mm_init().

My recent fix exposed this problem, because it greatly reduced the number
of pages that are initialized before mm_init(), but the problem existed even
before my fix, as Fengguang Wu found.

Below is a more detailed explanation of the problem.

We initialize struct pages in four places:

1. Early in boot a small set of struct pages is initialized to fill
the first section, and lower zones.
2. During mm_init() we initialize "struct pages" for all the memory
that is allocated, i.e. reserved in memblock.
3. Using on-demand logic when pages are allocated after the mm_init() call
(when memblock is finished).
4. After smp_init(), when the rest of the free deferred pages are
initialized.

The problem occurs if we try to do a va to phys translation of memory
between steps 1 and 2. Because we have not yet initialized struct pages for
all the reserved pages, it is inherently unsafe to do va to phys if the
translation itself requires access to "struct page", as in the case of this
combination: CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP

The following path exposes the problem:

start_kernel()
 trap_init()
  setup_cpu_entry_areas()
   setup_cpu_entry_area(cpu)
    get_cpu_gdt_paddr(cpu)
     per_cpu_ptr_to_phys(addr)
      pcpu_addr_to_page(addr)
       virt_to_page(addr)
        pfn_to_page(__pa(addr) >> PAGE_SHIFT)

We disable this path by not allowing NEED_PER_CPU_KM with the deferred
struct pages feature.

The problems are discussed in these threads:
http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com

Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 mm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index d5004d82a1d6..e14c01513bfd 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -636,6 +636,7 @@ config DEFERRED_STRUCT_PAGE_INIT
 	default n
 	depends on NO_BOOTMEM
 	depends on !FLATMEM
+	depends on !NEED_PER_CPU_KM
 	help
 	  Ordinarily all struct pages are initialised during early boot in a
 	  single thread. On very large machines this can take a considerable
-- 
2.17.0


* Re: [PATCH v5] mm: don't allow deferred pages with NEED_PER_CPU_KM
  2018-05-15 17:51 [PATCH v5] mm: don't allow deferred pages with NEED_PER_CPU_KM Pavel Tatashin
@ 2018-05-15 20:43 ` Michal Hocko
  2018-05-15 21:12 ` Andrew Morton
  1 sibling, 0 replies; 4+ messages in thread
From: Michal Hocko @ 2018-05-15 20:43 UTC
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, akpm, linux-kernel, tglx,
	linux-mm, mgorman, mingo, peterz, rostedt, fengguang.wu,
	dennisszhou

On Tue 15-05-18 13:51:24, Pavel Tatashin wrote:
> It is unsafe to do virtual to physical translations before mm_init() is
> called if struct page is needed in order to determine the memory section
> number (see SECTION_IN_PAGE_FLAGS). This is because, when deferred struct
> pages are used, struct pages for all the allocated memory are initialized
> only in mm_init().
> 
> My recent fix exposed this problem, because it greatly reduced the number
> of pages that are initialized before mm_init(), but the problem existed even
> before my fix, as Fengguang Wu found.
> 
> Below is a more detailed explanation of the problem.
> 
> We initialize struct pages in four places:
> 
> 1. Early in boot a small set of struct pages is initialized to fill
> the first section, and lower zones.
> 2. During mm_init() we initialize "struct pages" for all the memory
> that is allocated, i.e. reserved in memblock.
> 3. Using on-demand logic when pages are allocated after the mm_init() call
> (when memblock is finished).
> 4. After smp_init(), when the rest of the free deferred pages are
> initialized.
> 
> The problem occurs if we try to do a va to phys translation of memory
> between steps 1 and 2. Because we have not yet initialized struct pages for
> all the reserved pages, it is inherently unsafe to do va to phys if the
> translation itself requires access to "struct page", as in the case of this
> combination: CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
> 
> The following path exposes the problem:
> 
> start_kernel()
>  trap_init()
>   setup_cpu_entry_areas()
>    setup_cpu_entry_area(cpu)
>     get_cpu_gdt_paddr(cpu)
>      per_cpu_ptr_to_phys(addr)
>       pcpu_addr_to_page(addr)
>        virt_to_page(addr)
>         pfn_to_page(__pa(addr) >> PAGE_SHIFT)
> 
> We disable this path by not allowing NEED_PER_CPU_KM with the deferred
> struct pages feature.
> 
> The problems are discussed in these threads:
> http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
> http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
> http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com
> 
> Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks a lot!

> ---
>  mm/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index d5004d82a1d6..e14c01513bfd 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -636,6 +636,7 @@ config DEFERRED_STRUCT_PAGE_INIT
>  	default n
>  	depends on NO_BOOTMEM
>  	depends on !FLATMEM
> +	depends on !NEED_PER_CPU_KM
>  	help
>  	  Ordinarily all struct pages are initialised during early boot in a
>  	  single thread. On very large machines this can take a considerable
> -- 
> 2.17.0

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v5] mm: don't allow deferred pages with NEED_PER_CPU_KM
  2018-05-15 17:51 [PATCH v5] mm: don't allow deferred pages with NEED_PER_CPU_KM Pavel Tatashin
  2018-05-15 20:43 ` Michal Hocko
@ 2018-05-15 21:12 ` Andrew Morton
  2018-05-15 21:33   ` Pavel Tatashin
  1 sibling, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2018-05-15 21:12 UTC
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, linux-kernel, tglx, mhocko,
	linux-mm, mgorman, mingo, peterz, rostedt, fengguang.wu,
	dennisszhou

On Tue, 15 May 2018 13:51:24 -0400 Pavel Tatashin <pasha.tatashin@oracle.com> wrote:

> It is unsafe to do virtual to physical translations before mm_init() is
> called if struct page is needed in order to determine the memory section
> number (see SECTION_IN_PAGE_FLAGS). This is because, when deferred struct
> pages are used, struct pages for all the allocated memory are initialized
> only in mm_init().
> 
> My recent fix exposed this problem,

"my recent fix" isn't very useful.  I changed this to identify
c9e97a1997 ("mm: initialize pages on demand during boot"), yes?

> 
> Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>

And I added cc:stable.


* Re: [PATCH v5] mm: don't allow deferred pages with NEED_PER_CPU_KM
  2018-05-15 21:12 ` Andrew Morton
@ 2018-05-15 21:33   ` Pavel Tatashin
  0 siblings, 0 replies; 4+ messages in thread
From: Pavel Tatashin @ 2018-05-15 21:33 UTC
  To: Andrew Morton
  Cc: Steven Sistare, Daniel Jordan, LKML, tglx, Michal Hocko,
	Linux Memory Management List, mgorman, mingo, peterz,
	Steven Rostedt, Fengguang Wu, Dennis Zhou

> > My recent fix exposed this problem,

> "my recent fix" isn't very useful.  I changed this to identify
> c9e97a1997 ("mm: initialize pages on demand during boot"), yes?

Yes, thank you.

Pavel
