linux-mm.kvack.org archive mirror
* New helper to free highmem pages in larger chunks
@ 2015-10-03 12:55 Vineet Gupta
  2015-10-05 22:09 ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Vineet Gupta @ 2015-10-03 12:55 UTC (permalink / raw)
  To: Andrew Morton, Robin Holt, Nathan Zimmer
  Cc: Jiang Liu, linux-mm, linux-arch, lkml, Mel Gorman

Hi,

I noticed increased boot time when enabling highmem for ARC. It turns out that
freeing highmem pages into the buddy allocator is done one page at a time, while
it is batched for low mem pages. Below is the call flow.

I'm thinking of writing free_highmem_pages(), which takes a start and end pfn,
and want to solicit ideas on whether to write it from scratch or, preferably,
call the existing __free_pages_memory() to reuse its logic for converting a pfn
range into {pfn, order} tuples.

For the latter, however, there are semantic differences which I'm not sure about:
  - highmem page->count is set to 1, while it is 0 for low mem
  - atomic clearing of the page's reserved flag vs. non-atomic


mem_init
     for (tmp = min_high_pfn; tmp < max_pfn; tmp++)
	free_highmem_page(pfn_to_page(tmp));
	     __free_reserved_page
		ClearPageReserved(page);   <--- atomic
		init_page_count(page);  <-- _count = 1
		__free_page(page);    <-- free SINGLE page


     free_all_bootmem
	free_low_memory_core_early
	   __free_memory_core(start, end)
	       __free_pages_memory(s_pfn, e_pfn) <- creates "order" sized batches
		    __free_pages_bootmem(pfn, order)
		        __free_pages_boot_core(start_page, start_pfn, order)
				loops from 0 to (1 << order)
				    __ClearPageReserved(p);   <-- non atomic
				    set_page_count(p, 0);  <--- _count = 0

				__free_pages(page, order);    <--- free BATCH

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: New helper to free highmem pages in larger chunks
  2015-10-03 12:55 New helper to free highmem pages in larger chunks Vineet Gupta
@ 2015-10-05 22:09 ` Andrew Morton
  2015-10-06  5:35   ` Vineet Gupta
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2015-10-05 22:09 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Robin Holt, Nathan Zimmer, Jiang Liu, linux-mm, linux-arch, lkml,
	Mel Gorman

On Sat, 3 Oct 2015 18:25:13 +0530 Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:

> Hi,
> 
> I noticed increased boot time when enabling highmem for ARC. It turns out that
> freeing highmem pages into the buddy allocator is done one page at a time, while
> it is batched for low mem pages. Below is the call flow.
> 
> I'm thinking of writing free_highmem_pages(), which takes a start and end pfn,
> and want to solicit ideas on whether to write it from scratch or, preferably,
> call the existing __free_pages_memory() to reuse its logic for converting a pfn
> range into {pfn, order} tuples.
> 
> For the latter, however, there are semantic differences which I'm not sure about:
>   - highmem page->count is set to 1, while it is 0 for low mem

That would be weird.

Look more closely at __free_pages_boot_core() - it uses
set_page_refcounted() to set the page's refcount to 1.  Those
set_page_count() calls look superfluous to me.

>   - atomic clearing of the page's reserved flag vs. non-atomic

I doubt if the atomic is needed - who else can be looking at this page
at this time?



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: New helper to free highmem pages in larger chunks
  2015-10-05 22:09 ` Andrew Morton
@ 2015-10-06  5:35   ` Vineet Gupta
  2015-10-06  8:42     ` [arc-linux-dev] " Vineet Gupta
  0 siblings, 1 reply; 4+ messages in thread
From: Vineet Gupta @ 2015-10-06  5:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: arc-linux-dev, Robin Holt, Nathan Zimmer, Jiang Liu, linux-mm,
	linux-arch, lkml, Mel Gorman

On Tuesday 06 October 2015 03:40 AM, Andrew Morton wrote:
> On Sat, 3 Oct 2015 18:25:13 +0530 Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>
>> Hi,
>>
>> I noticed increased boot time when enabling highmem for ARC. It turns out that
>> freeing highmem pages into the buddy allocator is done one page at a time, while
>> it is batched for low mem pages. Below is the call flow.
>>
>> I'm thinking of writing free_highmem_pages(), which takes a start and end pfn,
>> and want to solicit ideas on whether to write it from scratch or, preferably,
>> call the existing __free_pages_memory() to reuse its logic for converting a pfn
>> range into {pfn, order} tuples.
>>
>> For the latter, however, there are semantic differences which I'm not sure about:
>>   - highmem page->count is set to 1, while it is 0 for low mem
> That would be weird.
>
> Look more closely at __free_pages_boot_core() - it uses
> set_page_refcounted() to set the page's refcount to 1.  Those
> set_page_count() calls look superfluous to me.

If you look closer still, set_page_refcounted() is called outside the loop, for the
first page only; the loop sets the count of every page to 0. Turns out there's
more fun here....

I ran this under a debugger, and much earlier in the boot process there's an
existing setting of the page count to 1 for *all* pages of *all* zones (including
highmem pages). See the call flow below.

free_area_init_node
    free_area_init_core
        loops thru all zones
            memmap_init_zone
               loops thru all pages of zones
               __init_single_page

This means the subsequent setting of the page count to 0 (or 1 for the special
first page) is superfluous at best - buggy at worst. I will send a patch to fix
that. I hope I don't break some obscure init path which doesn't hit the above init.


>
>>   -atomic clearing of page reserved flag vs. non atomic
> I doubt if the atomic is needed - who else can be looking at this page
> at this time?

I'll send another one to separately fix that as well. It seems boot mem setup is
a relatively neglected part of the kernel.

-Vineet


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [arc-linux-dev] Re: New helper to free highmem pages in larger chunks
  2015-10-06  5:35   ` Vineet Gupta
@ 2015-10-06  8:42     ` Vineet Gupta
  0 siblings, 0 replies; 4+ messages in thread
From: Vineet Gupta @ 2015-10-06  8:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: arc-linux-dev, Robin Holt, Nathan Zimmer, Jiang Liu, linux-mm,
	linux-arch, lkml, Mel Gorman

On Tuesday 06 October 2015 11:06 AM, Vineet Gupta wrote:
> On Tuesday 06 October 2015 03:40 AM, Andrew Morton wrote:
>> On Sat, 3 Oct 2015 18:25:13 +0530 Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>>
>>> Hi,
>>>
>>> I noticed increased boot time when enabling highmem for ARC. It turns out that
>>> freeing highmem pages into the buddy allocator is done one page at a time, while
>>> it is batched for low mem pages. Below is the call flow.
>>>
>>> I'm thinking of writing free_highmem_pages(), which takes a start and end pfn,
>>> and want to solicit ideas on whether to write it from scratch or, preferably,
>>> call the existing __free_pages_memory() to reuse its logic for converting a pfn
>>> range into {pfn, order} tuples.
>>>
>>> For the latter, however, there are semantic differences which I'm not sure about:
>>>   - highmem page->count is set to 1, while it is 0 for low mem
>> That would be weird.
>>
>> Look more closely at __free_pages_boot_core() - it uses
>> set_page_refcounted() to set the page's refcount to 1.  Those
>> set_page_count() calls look superfluous to me.
> If you look closer still, set_page_refcounted() is called outside the loop, for the
> first page only; the loop sets the count of every page to 0. Turns out there's
> more fun here....
>
> I ran this under a debugger, and much earlier in the boot process there's an
> existing setting of the page count to 1 for *all* pages of *all* zones (including
> highmem pages). See the call flow below.
>
> free_area_init_node
>     free_area_init_core
>         loops thru all zones
>             memmap_init_zone
>                loops thru all pages of zones
>                __init_single_page
>
> This means the subsequent setting of the page count to 0 (or 1 for the special
> first page) is superfluous at best - buggy at worst. I will send a patch to fix
> that. I hope I don't break some obscure init path which doesn't hit the above init.

So I took a stab at it and broke it royally - I was too naive to begin with. The
explicit setting to 1 for highmem pages, and to 0 for all low mem pages except the
first page of the @order block (which gets 1), is all by design.

__free_pages(), called by both code paths, always decrements the refcount of the
struct page it is handed. For a batch (order != 0) it only decrements the first
page's refcount. This was my find of the month - but you have probably known it
for the longest time! Live and learn.

The current highmem path only uses order == 0, so an initial refcount of 1 is
needed (although the one done from __init_single_page() is sufficient - no need to
do that again in free_highmem_page()). The low mem pages, though, are typically
freed with order > 0, so the caller carefully sets up the first page of the @order
block with a refcount of 1 (using set_page_refcounted()), while the rest of the
pages are set to a refcount of 0 in the loop.

Thus the seemingly redundant setting to 0 is fine IMHO - though it would perhaps
be better to document it, assuming I've got it right so far.


>>>   -atomic clearing of page reserved flag vs. non atomic
>> I doubt if the atomic is needed - who else can be looking at this page
>> at this time?
> I'll send another one to separately fix that as well. Seems like boot mem setup is
> a relatively neglect part of kernel.
>
> -Vineet
>
>



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-10-06  8:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-03 12:55 New helper to free highmem pages in larger chunks Vineet Gupta
2015-10-05 22:09 ` Andrew Morton
2015-10-06  5:35   ` Vineet Gupta
2015-10-06  8:42     ` [arc-linux-dev] " Vineet Gupta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).