All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
@ 2020-11-10 19:32 David Hildenbrand
  2020-11-11  8:47 ` Michal Hocko
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: David Hildenbrand @ 2020-11-10 19:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Alexander Potapenko,
	Michal Hocko, Mike Kravetz, Vlastimil Babka, Mike Rapoport,
	Oscar Salvador, Kees Cook, Michael Ellerman

commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
leaving the buddy via alloc_pages() and friends to be
initialized/cleared/zeroed on allocation.

However, the same logic is currently not applied to
alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
with init_on_alloc=1 and init_on_free=0. Let's also properly clear
pages on that allocation path and add support for __GFP_ZERO.

With this change, we will see double clearing of pages in some
cases. One example are gigantic pages (either allocated via CMA, or
allocated dynamically via alloc_contig_pages()) - which is the right
thing to do (and to be optimized outside of the buddy in the callers) as
discussed in:
  https://lkml.kernel.org/r/20201019182853.7467-1-gpiccoli@canonical.com

This change implies that with init_on_alloc=1
- All CMA allocations will be cleared
- Gigantic pages allocated via alloc_contig_pages() will be cleared
- virtio-mem memory to be unplugged will be cleared. While this is
  suboptimal, it's similar to memory balloon drivers handling, where
  all pages to be inflated will get cleared as well.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/page_alloc.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eed4f4075b3c..0361b119b74e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8453,6 +8453,19 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	return 0;
 }
 
+static void __alloc_contig_clear_range(unsigned long start_pfn,
+				       unsigned long end_pfn)
+{
+	unsigned long pfn;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES) {
+		cond_resched();
+		kernel_init_free_pages(pfn_to_page(pfn),
+				       min_t(unsigned long, end_pfn - pfn,
+					     MAX_ORDER_NR_PAGES));
+	}
+}
+
 /**
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
@@ -8461,7 +8474,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
  *			in range must have the same migratetype and it must
  *			be either of the two.
- * @gfp_mask:	GFP mask to use during compaction
+ * @gfp_mask:	GFP mask to use during compaction. __GFP_ZERO clears allocated
+ *		pages.
  *
  * The PFN range does not have to be pageblock or MAX_ORDER_NR_PAGES
  * aligned.  The PFN range must belong to a single zone.
@@ -8488,7 +8502,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 		.mode = MIGRATE_SYNC,
 		.ignore_skip_hint = true,
 		.no_set_skip_hint = true,
-		.gfp_mask = current_gfp_context(gfp_mask),
+		.gfp_mask = current_gfp_context(gfp_mask & ~__GFP_ZERO),
 		.alloc_contig = true,
 	};
 	INIT_LIST_HEAD(&cc.migratepages);
@@ -8600,6 +8614,9 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	if (end != outer_end)
 		free_contig_range(end, outer_end - end);
 
+	if (!want_init_on_free() && want_init_on_alloc(gfp_mask))
+		__alloc_contig_clear_range(start, end);
+
 done:
 	undo_isolate_page_range(pfn_max_align_down(start),
 				pfn_max_align_up(end), migratetype);
@@ -8653,7 +8670,8 @@ static bool zone_spans_last_pfn(const struct zone *zone,
 /**
  * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
  * @nr_pages:	Number of contiguous pages to allocate
- * @gfp_mask:	GFP mask to limit search and used during compaction
+ * @gfp_mask:	GFP mask to limit search and used during compaction. __GFP_ZERO
+ *		clears allocated pages.
  * @nid:	Target node
  * @nodemask:	Mask for other possible nodes
  *
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-10 19:32 [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO David Hildenbrand
@ 2020-11-11  8:47 ` Michal Hocko
  2020-11-11  9:06   ` David Hildenbrand
  2020-11-11  9:25 ` David Hildenbrand
  2020-11-11  9:59 ` Mike Rapoport
  2 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2020-11-11  8:47 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Alexander Potapenko,
	Mike Kravetz, Vlastimil Babka, Mike Rapoport, Oscar Salvador,
	Kees Cook, Michael Ellerman

On Tue 10-11-20 20:32:40, David Hildenbrand wrote:
> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
> leaving the buddy via alloc_pages() and friends to be
> initialized/cleared/zeroed on allocation.
> 
> However, the same logic is currently not applied to
> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
> pages on that allocation path and add support for __GFP_ZERO.

AFAIR we do not have any user for __GFP_ZERO right? Not that this is
harmful but it is better to call that explicitly because a missing
implementation would be a real problem and as such a bug fix.

I am also not sure handling init_on_free at the higher level is good.
As we have discussed recently the primary point of this feature is to
add clearing at very few well defined entry points rather than spill it over
many places. In this case the entry point for the allocator is
__isolate_free_page which removes pages from the page allocator. I
haven't checked how much this is used elsewhere but I would expect
init_on_alloc to be handled there.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-11  8:47 ` Michal Hocko
@ 2020-11-11  9:06   ` David Hildenbrand
  2020-11-11  9:58     ` Vlastimil Babka
  0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2020-11-11  9:06 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Alexander Potapenko,
	Mike Kravetz, Vlastimil Babka, Mike Rapoport, Oscar Salvador,
	Kees Cook, Michael Ellerman

On 11.11.20 09:47, Michal Hocko wrote:
> On Tue 10-11-20 20:32:40, David Hildenbrand wrote:
>> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
>> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
>> leaving the buddy via alloc_pages() and friends to be
>> initialized/cleared/zeroed on allocation.
>>
>> However, the same logic is currently not applied to
>> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
>> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
>> pages on that allocation path and add support for __GFP_ZERO.
> 
> AFAIR we do not have any user for __GFP_ZERO right? Not that this is

Sorry, I had extended information under "---" but accidentally 
regenerated the patch before sending it out.

__GFP_ZERO is not used yet. It's intended to be used in 
https://lkml.kernel.org/r/20201029162718.29910-1-david@redhat.com
and I can move that change into a separate patch if desired.

> harmful but it is better to call that explicitly because a missing
> implementation would be a real problem and as such a bug fix.
> 
> I am also not sure handling init_on_free at the higher level is good.
> As we have discussed recently the primary point of this feature is to
> add clearing at very few well defined entry points rather than spill it over
> many places. In this case the entry point for the allocator is
> __isolate_free_page which removes pages from the page allocator. I
> haven't checked how much this is used elsewhere but I would expect
> init_on_alloc to be handled there.

Well, this is the entry point to our range allocator, which lives in 
page_alloc.c - used by actual high-level allocators (CMA, gigantic 
pages, etc). It's just a matter of taste where we want to have that 
handling exactly inside our allocator.

isolate_freepages_range()->split_map_pages() does the post_alloc_hook 
call. As we certainly don't want to zero pages during compaction, we 
could either pass the gfp_mask/"bool clear" down to that functions and 
handle it in there, or handle it in isolate_freepages_range(), after the 
->split_map_pages() call. Whatever you prefer.

Thanks!

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-10 19:32 [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO David Hildenbrand
  2020-11-11  8:47 ` Michal Hocko
@ 2020-11-11  9:25 ` David Hildenbrand
  2020-11-11  9:59 ` Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: David Hildenbrand @ 2020-11-11  9:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Andrew Morton, Alexander Potapenko, Michal Hocko,
	Mike Kravetz, Vlastimil Babka, Mike Rapoport, Oscar Salvador,
	Kees Cook, Michael Ellerman

On 10.11.20 20:32, David Hildenbrand wrote:
> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
> leaving the buddy via alloc_pages() and friends to be
> initialized/cleared/zeroed on allocation.
> 
> However, the same logic is currently not applied to
> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
> pages on that allocation path and add support for __GFP_ZERO.
> 
> With this change, we will see double clearing of pages in some
> cases. One example are gigantic pages (either allocated via CMA, or
> allocated dynamically via alloc_contig_pages()) - which is the right
> thing to do (and to be optimized outside of the buddy in the callers) as
> discussed in:
>    https://lkml.kernel.org/r/20201019182853.7467-1-gpiccoli@canonical.com
> 
> This change implies that with init_on_alloc=1
> - All CMA allocations will be cleared
> - Gigantic pages allocated via alloc_contig_pages() will be cleared
> - virtio-mem memory to be unplugged will be cleared. While this is
>    suboptimal, it's similar to memory balloon drivers handling, where
>    all pages to be inflated will get cleared as well.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   mm/page_alloc.c | 24 +++++++++++++++++++++---
>   1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index eed4f4075b3c..0361b119b74e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8453,6 +8453,19 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>   	return 0;
>   }
>   
> +static void __alloc_contig_clear_range(unsigned long start_pfn,
> +				       unsigned long end_pfn)
> +{
> +	unsigned long pfn;
> +
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES) {
> +		cond_resched();
> +		kernel_init_free_pages(pfn_to_page(pfn),
> +				       min_t(unsigned long, end_pfn - pfn,
> +					     MAX_ORDER_NR_PAGES));

In weird cases, we might cross a MAX_ORDER - 1 block here. I'll fix that.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-11  9:06   ` David Hildenbrand
@ 2020-11-11  9:58     ` Vlastimil Babka
  2020-11-11 10:05       ` David Hildenbrand
  0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2020-11-11  9:58 UTC (permalink / raw)
  To: David Hildenbrand, Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Alexander Potapenko,
	Mike Kravetz, Mike Rapoport, Oscar Salvador, Kees Cook,
	Michael Ellerman

On 11/11/20 10:06 AM, David Hildenbrand wrote:
> On 11.11.20 09:47, Michal Hocko wrote:
>> On Tue 10-11-20 20:32:40, David Hildenbrand wrote:
>>> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
>>> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
>>> leaving the buddy via alloc_pages() and friends to be
>>> initialized/cleared/zeroed on allocation.
>>>
>>> However, the same logic is currently not applied to
>>> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
>>> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
>>> pages on that allocation path and add support for __GFP_ZERO.
>> 
>> AFAIR we do not have any user for __GFP_ZERO right? Not that this is
> 
> Sorry, I had extended information under "---" but accidentally
> regenerated the patch before sending it out.
> 
> __GFP_ZERO is not used yet. It's intended to be used in
> https://lkml.kernel.org/r/20201029162718.29910-1-david@redhat.com
> and I can move that change into a separate patch if desired.
> 
>> harmful but it is better to call that explicitly because a missing
>> implementation would be a real problem and as such a bug fix.
>> 
>> I am also not sure handling init_on_free at the higher level is good.
>> As we have discussed recently the primary point of this feature is to
>> add clearing at very few well defined entry points rather than spill it over
>> many places. In this case the entry point for the allocator is
>> __isolate_free_page which removes pages from the page allocator. I
>> haven't checked how much this is used elsewhere but I would expect
>> init_on_alloc to be handled there.
> 
> Well, this is the entry point to our range allocator, which lives in
> page_alloc.c - used by actual high-level allocators (CMA, gigantic
> pages, etc). It's just a matter of taste where we want to have that
> handling exactly inside our allocator.

I agree alloc_contig_range() is fine as an entry point.

> isolate_freepages_range()->split_map_pages() does the post_alloc_hook
> call. As we certainly don't want to zero pages during compaction, we
> could either pass the gfp_mask/"bool clear" down to that functions and
> handle it in there, or handle it in isolate_freepages_range(), after the
> ->split_map_pages() call. Whatever you prefer.

I'd rather not put it in post_alloc_hook() where the bool would then get checked 
from allocator fast path as well.
Maybe split_map_page() then as it contains a for-cycle already.

> Thanks!
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-10 19:32 [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO David Hildenbrand
  2020-11-11  8:47 ` Michal Hocko
  2020-11-11  9:25 ` David Hildenbrand
@ 2020-11-11  9:59 ` Mike Rapoport
  2020-11-11 10:06   ` David Hildenbrand
  2 siblings, 1 reply; 10+ messages in thread
From: Mike Rapoport @ 2020-11-11  9:59 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Alexander Potapenko,
	Michal Hocko, Mike Kravetz, Vlastimil Babka, Oscar Salvador,
	Kees Cook, Michael Ellerman

On Tue, Nov 10, 2020 at 08:32:40PM +0100, David Hildenbrand wrote:
> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
> leaving the buddy via alloc_pages() and friends to be
> initialized/cleared/zeroed on allocation.
> 
> However, the same logic is currently not applied to
> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
> pages on that allocation path and add support for __GFP_ZERO.
> 
> With this change, we will see double clearing of pages in some
> cases. One example are gigantic pages (either allocated via CMA, or
> allocated dynamically via alloc_contig_pages()) - which is the right
> thing to do (and to be optimized outside of the buddy in the callers) as
> discussed in:
>   https://lkml.kernel.org/r/20201019182853.7467-1-gpiccoli@canonical.com
> 
> This change implies that with init_on_alloc=1
> - All CMA allocations will be cleared
> - Gigantic pages allocated via alloc_contig_pages() will be cleared
> - virtio-mem memory to be unplugged will be cleared. While this is
>   suboptimal, it's similar to memory balloon drivers handling, where
>   all pages to be inflated will get cleared as well.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/page_alloc.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index eed4f4075b3c..0361b119b74e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8453,6 +8453,19 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  	return 0;
>  }
>  
> +static void __alloc_contig_clear_range(unsigned long start_pfn,
> +				       unsigned long end_pfn)

Maybe clear_contig_range() ?

> +{
> +	unsigned long pfn;
> +
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES) {
> +		cond_resched();
> +		kernel_init_free_pages(pfn_to_page(pfn),
> +				       min_t(unsigned long, end_pfn - pfn,
> +					     MAX_ORDER_NR_PAGES));
> +	}
> +}
> +
>  /**
>   * alloc_contig_range() -- tries to allocate given range of pages
>   * @start:	start PFN to allocate
> @@ -8461,7 +8474,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>   *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
>   *			in range must have the same migratetype and it must
>   *			be either of the two.
> - * @gfp_mask:	GFP mask to use during compaction
> + * @gfp_mask:	GFP mask to use during compaction. __GFP_ZERO clears allocated
> + *		pages.

"__GFP_ZERO is not passed to compaction but rather clears allocated pages"

>   *
>   * The PFN range does not have to be pageblock or MAX_ORDER_NR_PAGES
>   * aligned.  The PFN range must belong to a single zone.
> @@ -8488,7 +8502,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  		.mode = MIGRATE_SYNC,
>  		.ignore_skip_hint = true,
>  		.no_set_skip_hint = true,
> -		.gfp_mask = current_gfp_context(gfp_mask),
> +		.gfp_mask = current_gfp_context(gfp_mask & ~__GFP_ZERO),
>  		.alloc_contig = true,
>  	};
>  	INIT_LIST_HEAD(&cc.migratepages);
> @@ -8600,6 +8614,9 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	if (end != outer_end)
>  		free_contig_range(end, outer_end - end);
>  
> +	if (!want_init_on_free() && want_init_on_alloc(gfp_mask))
> +		__alloc_contig_clear_range(start, end);
> +
>  done:
>  	undo_isolate_page_range(pfn_max_align_down(start),
>  				pfn_max_align_up(end), migratetype);
> @@ -8653,7 +8670,8 @@ static bool zone_spans_last_pfn(const struct zone *zone,
>  /**
>   * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
>   * @nr_pages:	Number of contiguous pages to allocate
> - * @gfp_mask:	GFP mask to limit search and used during compaction
> + * @gfp_mask:	GFP mask to limit search and used during compaction. __GFP_ZERO
> + *		clears allocated pages.
>   * @nid:	Target node
>   * @nodemask:	Mask for other possible nodes
>   *
> -- 
> 2.26.2
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-11  9:58     ` Vlastimil Babka
@ 2020-11-11 10:05       ` David Hildenbrand
  2020-11-11 10:22         ` Michal Hocko
  0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2020-11-11 10:05 UTC (permalink / raw)
  To: Vlastimil Babka, Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Alexander Potapenko,
	Mike Kravetz, Mike Rapoport, Oscar Salvador, Kees Cook,
	Michael Ellerman

On 11.11.20 10:58, Vlastimil Babka wrote:
> On 11/11/20 10:06 AM, David Hildenbrand wrote:
>> On 11.11.20 09:47, Michal Hocko wrote:
>>> On Tue 10-11-20 20:32:40, David Hildenbrand wrote:
>>>> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
>>>> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
>>>> leaving the buddy via alloc_pages() and friends to be
>>>> initialized/cleared/zeroed on allocation.
>>>>
>>>> However, the same logic is currently not applied to
>>>> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
>>>> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
>>>> pages on that allocation path and add support for __GFP_ZERO.
>>>
>>> AFAIR we do not have any user for __GFP_ZERO right? Not that this is
>>
>> Sorry, I had extended information under "---" but accidentally
>> regenerated the patch before sending it out.
>>
>> __GFP_ZERO is not used yet. It's intended to be used in
>> https://lkml.kernel.org/r/20201029162718.29910-1-david@redhat.com
>> and I can move that change into a separate patch if desired.
>>
>>> harmful but it is better to call that explicitly because a missing
>>> implementation would be a real problem and as such a bug fix.
>>>
>>> I am also not sure handling init_on_free at the higher level is good.
>>> As we have discussed recently the primary point of this feature is to
>>> add clearing at very few well defined entry points rather than spill it over
>>> many places. In this case the entry point for the allocator is
>>> __isolate_free_page which removes pages from the page allocator. I
>>> haven't checked how much this is used elsewhere but I would expect
>>> init_on_alloc to be handled there.
>>
>> Well, this is the entry point to our range allocator, which lives in
>> page_alloc.c - used by actual high-level allocators (CMA, gigantic
>> pages, etc). It's just a matter of taste where we want to have that
>> handling exactly inside our allocator.
> 
> I agree alloc_contig_range() is fine as an entry point.

Thanks, let's see if Michal insists of having this somewhere inside 
isolate_freepages_range() instead.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-11  9:59 ` Mike Rapoport
@ 2020-11-11 10:06   ` David Hildenbrand
  0 siblings, 0 replies; 10+ messages in thread
From: David Hildenbrand @ 2020-11-11 10:06 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, linux-mm, Andrew Morton, Alexander Potapenko,
	Michal Hocko, Mike Kravetz, Vlastimil Babka, Oscar Salvador,
	Kees Cook, Michael Ellerman

On 11.11.20 10:59, Mike Rapoport wrote:
> On Tue, Nov 10, 2020 at 08:32:40PM +0100, David Hildenbrand wrote:
>> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
>> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
>> leaving the buddy via alloc_pages() and friends to be
>> initialized/cleared/zeroed on allocation.
>>
>> However, the same logic is currently not applied to
>> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
>> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
>> pages on that allocation path and add support for __GFP_ZERO.
>>
>> With this change, we will see double clearing of pages in some
>> cases. One example are gigantic pages (either allocated via CMA, or
>> allocated dynamically via alloc_contig_pages()) - which is the right
>> thing to do (and to be optimized outside of the buddy in the callers) as
>> discussed in:
>>    https://lkml.kernel.org/r/20201019182853.7467-1-gpiccoli@canonical.com
>>
>> This change implies that with init_on_alloc=1
>> - All CMA allocations will be cleared
>> - Gigantic pages allocated via alloc_contig_pages() will be cleared
>> - virtio-mem memory to be unplugged will be cleared. While this is
>>    suboptimal, it's similar to memory balloon drivers handling, where
>>    all pages to be inflated will get cleared as well.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Alexander Potapenko <glider@google.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Mike Rapoport <rppt@linux.ibm.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   mm/page_alloc.c | 24 +++++++++++++++++++++---
>>   1 file changed, 21 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index eed4f4075b3c..0361b119b74e 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -8453,6 +8453,19 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>>   	return 0;
>>   }
>>   
>> +static void __alloc_contig_clear_range(unsigned long start_pfn,
>> +				       unsigned long end_pfn)
> 
> Maybe clear_contig_range() ?

I chose the naming to match "__alloc_contig_migrate_range", but I agree 
that your version sounds better.

> 
>> +{
>> +	unsigned long pfn;
>> +
>> +	for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES) {
>> +		cond_resched();
>> +		kernel_init_free_pages(pfn_to_page(pfn),
>> +				       min_t(unsigned long, end_pfn - pfn,
>> +					     MAX_ORDER_NR_PAGES));
>> +	}
>> +}
>> +
>>   /**
>>    * alloc_contig_range() -- tries to allocate given range of pages
>>    * @start:	start PFN to allocate
>> @@ -8461,7 +8474,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>>    *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
>>    *			in range must have the same migratetype and it must
>>    *			be either of the two.
>> - * @gfp_mask:	GFP mask to use during compaction
>> + * @gfp_mask:	GFP mask to use during compaction. __GFP_ZERO clears allocated
>> + *		pages.
> 
> "__GFP_ZERO is not passed to compaction but rather clears allocated pages"

Bought! Thanks :)


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-11 10:05       ` David Hildenbrand
@ 2020-11-11 10:22         ` Michal Hocko
  2020-11-11 10:32           ` David Hildenbrand
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Hocko @ 2020-11-11 10:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, linux-kernel, linux-mm, Andrew Morton,
	Alexander Potapenko, Mike Kravetz, Mike Rapoport, Oscar Salvador,
	Kees Cook, Michael Ellerman

On Wed 11-11-20 11:05:21, David Hildenbrand wrote:
> On 11.11.20 10:58, Vlastimil Babka wrote:
> > On 11/11/20 10:06 AM, David Hildenbrand wrote:
> > > On 11.11.20 09:47, Michal Hocko wrote:
> > > > On Tue 10-11-20 20:32:40, David Hildenbrand wrote:
> > > > > commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
> > > > > init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
> > > > > leaving the buddy via alloc_pages() and friends to be
> > > > > initialized/cleared/zeroed on allocation.
> > > > > 
> > > > > However, the same logic is currently not applied to
> > > > > alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
> > > > > with init_on_alloc=1 and init_on_free=0. Let's also properly clear
> > > > > pages on that allocation path and add support for __GFP_ZERO.
> > > > 
> > > > AFAIR we do not have any user for __GFP_ZERO right? Not that this is
> > > 
> > > Sorry, I had extended information under "---" but accidentally
> > > regenerated the patch before sending it out.
> > > 
> > > __GFP_ZERO is not used yet. It's intended to be used in
> > > https://lkml.kernel.org/r/20201029162718.29910-1-david@redhat.com
> > > and I can move that change into a separate patch if desired.

OK, it would make sense to add it with its user.

> > > > harmful but it is better to call that explicitly because a missing
> > > > implementation would be a real problem and as such a bug fix.
> > > > 
> > > > I am also not sure handling init_on_free at the higher level is good.
> > > > As we have discussed recently the primary point of this feature is to
> > > > add clearing at very few well defined entry points rather than spill it over
> > > > many places. In this case the entry point for the allocator is
> > > > __isolate_free_page which removes pages from the page allocator. I
> > > > haven't checked how much this is used elsewhere but I would expect
> > > > init_on_alloc to be handled there.
> > > 
> > > Well, this is the entry point to our range allocator, which lives in
> > > page_alloc.c - used by actual high-level allocators (CMA, gigantic
> > > pages, etc). It's just a matter of taste where we want to have that
> > > handling exactly inside our allocator.

Yes I completely agree here. I just believe it should the lowest we can
achieve.

> > I agree alloc_contig_range() is fine as an entry point.
> 
> Thanks, let's see if Michal insists of having this somewhere inside
> isolate_freepages_range() instead.
 
It's not that I would be insisting. I am just pointing out that changes
like this one go against the idea of init_on_alloc because it is adding
more special casing and long term more places to be really careful about
when one has to be really careful to not undermine the security aspect
of the feature. I haven't really checked why compaction is not the
problem but I suspect it is the fact that it unconditionally copy the
full page content to the isolated page so there is no way to sneak
any data leak there. That is fine. We should however make that clear by
using a special cased function which skips this particular
initialization and make sure everybody else will just do the right thing
without much thinking.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO
  2020-11-11 10:22         ` Michal Hocko
@ 2020-11-11 10:32           ` David Hildenbrand
  0 siblings, 0 replies; 10+ messages in thread
From: David Hildenbrand @ 2020-11-11 10:32 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, linux-kernel, linux-mm, Andrew Morton,
	Alexander Potapenko, Mike Kravetz, Mike Rapoport, Oscar Salvador,
	Kees Cook, Michael Ellerman

On 11.11.20 11:22, Michal Hocko wrote:
> On Wed 11-11-20 11:05:21, David Hildenbrand wrote:
>> On 11.11.20 10:58, Vlastimil Babka wrote:
>>> On 11/11/20 10:06 AM, David Hildenbrand wrote:
>>>> On 11.11.20 09:47, Michal Hocko wrote:
>>>>> On Tue 10-11-20 20:32:40, David Hildenbrand wrote:
>>>>>> commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
>>>>>> init_on_free=1 boot options") resulted with init_on_alloc=1 in all pages
>>>>>> leaving the buddy via alloc_pages() and friends to be
>>>>>> initialized/cleared/zeroed on allocation.
>>>>>>
>>>>>> However, the same logic is currently not applied to
>>>>>> alloc_contig_pages(): allocated pages leaving the buddy aren't cleared
>>>>>> with init_on_alloc=1 and init_on_free=0. Let's also properly clear
>>>>>> pages on that allocation path and add support for __GFP_ZERO.
>>>>>
>>>>> AFAIR we do not have any user for __GFP_ZERO right? Not that this is
>>>>
>>>> Sorry, I had extended information under "---" but accidentally
>>>> regenerated the patch before sending it out.
>>>>
>>>> __GFP_ZERO is not used yet. It's intended to be used in
>>>> https://lkml.kernel.org/r/20201029162718.29910-1-david@redhat.com
>>>> and I can move that change into a separate patch if desired.
> 
> OK, it would make sense to add it with its user.
> 
>>>>> harmful but it is better to call that explicitly because a missing
>>>>> implementation would be a real problem and as such a bug fix.
>>>>>
>>>>> I am also not sure handling init_on_free at the higher level is good.
>>>>> As we have discussed recently the primary point of this feature is to
>>>>> add clearing at very few well defined entry points rather than spill it over
>>>>> many places. In this case the entry point for the allocator is
>>>>> __isolate_free_page which removes pages from the page allocator. I
>>>>> haven't checked how much this is used elsewhere but I would expect
>>>>> init_on_alloc to be handled there.
>>>>
>>>> Well, this is the entry point to our range allocator, which lives in
>>>> page_alloc.c - used by actual high-level allocators (CMA, gigantic
>>>> pages, etc). It's just a matter of taste where we want to have that
>>>> handling exactly inside our allocator.
> 
> Yes I completely agree here. I just believe it should the lowest we can
> achieve.
> 
>>> I agree alloc_contig_range() is fine as an entry point.
>>
>> Thanks, let's see if Michal insists of having this somewhere inside
>> isolate_freepages_range() instead.
>   
> It's not that I would be insisting. I am just pointing out that changes
> like this one go against the idea of init_on_alloc because it is adding
> more special casing and long term more places to be really careful about
> when one has to be really careful to not undermine the security aspect
> of the feature. I haven't really checked why compaction is not the
> problem but I suspect it is the fact that it unconditionally copy the
> full page content to the isolated page so there is no way to sneak
> any data leak there. That is fine. We should however make that clear by

Exactly.

> using a special cased function which skips this particular
> initialization and make sure everybody else will just do the right thing
> without much thinking.

I totally agree, but I think we don't have many places where free pages 
actually leave the buddy besides alloc_pages() and friends (compaction 
is something special). I agree having a single place to handle that 
would be preferred. I'll have a look if that can be reworked without 
doing too much harm / affecting other hot paths.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-11-11 10:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-10 19:32 [PATCH v1] mm/page_alloc: clear pages in alloc_contig_pages() with init_on_alloc=1 or __GFP_ZERO David Hildenbrand
2020-11-11  8:47 ` Michal Hocko
2020-11-11  9:06   ` David Hildenbrand
2020-11-11  9:58     ` Vlastimil Babka
2020-11-11 10:05       ` David Hildenbrand
2020-11-11 10:22         ` Michal Hocko
2020-11-11 10:32           ` David Hildenbrand
2020-11-11  9:25 ` David Hildenbrand
2020-11-11  9:59 ` Mike Rapoport
2020-11-11 10:06   ` David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.