All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>, Yang Shi <yang.shi@linaro.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: "mm: use early_pfn_to_nid in page_ext_init" broken on some configurations?
Date: Fri, 14 Jul 2017 11:34:31 +0200	[thread overview]
Message-ID: <18f28347-af10-0726-5a62-0dd1afdbd2a9@suse.cz> (raw)
In-Reply-To: <20170714091304.GC2618@dhcp22.suse.cz>

On 07/14/2017 11:13 AM, Michal Hocko wrote:
> On Fri 07-07-17 14:00:03, Vlastimil Babka wrote:
>> On 07/04/2017 07:17 AM, Joonsoo Kim wrote:
>>>>
>>>> Still, backporting b8f1a75d61d8 fixes this:
>>>>
>>>> [    1.538379] allocated 738197504 bytes of page_ext
>>>> [    1.539340] Node 0, zone      DMA: page owner found early allocated 0 pages
>>>> [    1.540179] Node 0, zone    DMA32: page owner found early allocated 33 pages
>>>> [    1.611173] Node 0, zone   Normal: page owner found early allocated 96755 pages
>>>> [    1.683167] Node 1, zone   Normal: page owner found early allocated 96575 pages
>>>>
>>>> No panic, notice how it allocated more for page_ext, and found smaller number of
>>>> early allocated pages.
>>>>
>>>> Now backporting fe53ca54270a on top:
>>>>
>>>> [    0.000000] allocated 738197504 bytes of page_ext
>>>> [    0.000000] Node 0, zone      DMA: page owner found early allocated 0 pages
>>>> [    0.000000] Node 0, zone    DMA32: page owner found early allocated 33 pages
>>>> [    0.000000] Node 0, zone   Normal: page owner found early allocated 2842622 pages
>>>> [    0.000000] Node 1, zone   Normal: page owner found early allocated 3694362 pages
>>>>
>>>> Again no panic, and same amount of page_ext usage. But the "early allocated" numbers
>>>> seem bogus to me. I think it's because init_pages_in_zone() is running and inspecting
>>>> struct pages that have not been yet initialized. It doesn't end up crashing, but
>>>> still doesn't seem correct?
>>>
>>> Numbers looks sane to me. fe53ca54270a makes init_pages_in_zone()
>>> called before page_alloc_init_late(). So, there would be many
>>> uninitialized pages with PageReserved(). Page owner regarded these
>>> PageReserved() page as allocated page.
>>
>> That seems incorrect for two reasons:
>> - init_pages_in_zone() actually skips PageReserved() pages
>> - the pages don't have PageReserved() flag, until the deferred struct page init
>> thread processes them via deferred_init_memmap() -> __init_single_page() AFAICS
>>
>> Now I've found out why upstream reports much less early allocated pages than our
>> kernel. We're missing 9d43f5aec950 ("mm/page_owner: add zone range overlapping
>> check") which adds a "page_zone(page) != zone" check. I think this only works
>> because the pages are not initialized and thus have no nid/zone links. Probably
>> page_zone() only doesn't break because it's all zeroed. I don't think it's safe
>> to rely on this?
> 
> Yes, if anything PageReserved should be checked before the zone check.

That wouldn't change anything, because we skip PageReserved and it's not
set. Perhaps we could skip pages that have the raw page flags value
zero, but then a) we should make sure that the allocation of the struct
page array zeroes the range, and b) the first modification of struct
page in the initialization is setting the PageReserved flag.

WARNING: multiple messages have this Message-ID (diff)
From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>, Yang Shi <yang.shi@linaro.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: "mm: use early_pfn_to_nid in page_ext_init" broken on some configurations?
Date: Fri, 14 Jul 2017 11:34:31 +0200	[thread overview]
Message-ID: <18f28347-af10-0726-5a62-0dd1afdbd2a9@suse.cz> (raw)
In-Reply-To: <20170714091304.GC2618@dhcp22.suse.cz>

On 07/14/2017 11:13 AM, Michal Hocko wrote:
> On Fri 07-07-17 14:00:03, Vlastimil Babka wrote:
>> On 07/04/2017 07:17 AM, Joonsoo Kim wrote:
>>>>
>>>> Still, backporting b8f1a75d61d8 fixes this:
>>>>
>>>> [    1.538379] allocated 738197504 bytes of page_ext
>>>> [    1.539340] Node 0, zone      DMA: page owner found early allocated 0 pages
>>>> [    1.540179] Node 0, zone    DMA32: page owner found early allocated 33 pages
>>>> [    1.611173] Node 0, zone   Normal: page owner found early allocated 96755 pages
>>>> [    1.683167] Node 1, zone   Normal: page owner found early allocated 96575 pages
>>>>
>>>> No panic, notice how it allocated more for page_ext, and found smaller number of
>>>> early allocated pages.
>>>>
>>>> Now backporting fe53ca54270a on top:
>>>>
>>>> [    0.000000] allocated 738197504 bytes of page_ext
>>>> [    0.000000] Node 0, zone      DMA: page owner found early allocated 0 pages
>>>> [    0.000000] Node 0, zone    DMA32: page owner found early allocated 33 pages
>>>> [    0.000000] Node 0, zone   Normal: page owner found early allocated 2842622 pages
>>>> [    0.000000] Node 1, zone   Normal: page owner found early allocated 3694362 pages
>>>>
>>>> Again no panic, and same amount of page_ext usage. But the "early allocated" numbers
>>>> seem bogus to me. I think it's because init_pages_in_zone() is running and inspecting
>>>> struct pages that have not been yet initialized. It doesn't end up crashing, but
>>>> still doesn't seem correct?
>>>
>>> Numbers looks sane to me. fe53ca54270a makes init_pages_in_zone()
>>> called before page_alloc_init_late(). So, there would be many
>>> uninitialized pages with PageReserved(). Page owner regarded these
>>> PageReserved() page as allocated page.
>>
>> That seems incorrect for two reasons:
>> - init_pages_in_zone() actually skips PageReserved() pages
>> - the pages don't have PageReserved() flag, until the deferred struct page init
>> thread processes them via deferred_init_memmap() -> __init_single_page() AFAICS
>>
>> Now I've found out why upstream reports much less early allocated pages than our
>> kernel. We're missing 9d43f5aec950 ("mm/page_owner: add zone range overlapping
>> check") which adds a "page_zone(page) != zone" check. I think this only works
>> because the pages are not initialized and thus have no nid/zone links. Probably
>> page_zone() only doesn't break because it's all zeroed. I don't think it's safe
>> to rely on this?
> 
> Yes, if anything PageReserved should be checked before the zone check.

That wouldn't change anything, because we skip PageReserved and it's not
set. Perhaps we could skip pages that have the raw page flags value
zero, but then a) we should make sure that the allocation of the struct
page array zeroes the range, and b) the first modification of struct
page in the initialization is setting the PageReserved flag.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-07-14  9:34 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-30 14:18 "mm: use early_pfn_to_nid in page_ext_init" broken on some configurations? Michal Hocko
2017-06-30 14:18 ` Michal Hocko
2017-06-30 15:42 ` Michal Hocko
2017-06-30 15:42   ` Michal Hocko
2017-06-30 15:44   ` Michal Hocko
2017-06-30 15:44     ` Michal Hocko
2017-07-04  5:11     ` Joonsoo Kim
2017-07-04  5:11       ` Joonsoo Kim
2017-07-04  6:51       ` Michal Hocko
2017-07-04  6:51         ` Michal Hocko
2017-07-03 11:48 ` Vlastimil Babka
2017-07-03 11:48   ` Vlastimil Babka
2017-07-03 14:18   ` Vlastimil Babka
2017-07-03 14:18     ` Vlastimil Babka
2017-07-04  5:23     ` Joonsoo Kim
2017-07-04  5:23       ` Joonsoo Kim
2017-07-04  9:39       ` Vlastimil Babka
2017-07-04  9:39         ` Vlastimil Babka
2017-07-04 10:53         ` Michal Hocko
2017-07-04 10:53           ` Michal Hocko
2017-07-04  5:17   ` Joonsoo Kim
2017-07-04  5:17     ` Joonsoo Kim
2017-07-07 12:00     ` Vlastimil Babka
2017-07-07 12:00       ` Vlastimil Babka
2017-07-14  9:13       ` Michal Hocko
2017-07-14  9:13         ` Michal Hocko
2017-07-14  9:34         ` Vlastimil Babka [this message]
2017-07-14  9:34           ` Vlastimil Babka
2017-07-14 11:35           ` Michal Hocko
2017-07-14 11:35             ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18f28347-af10-0726-5a62-0dd1afdbd2a9@suse.cz \
    --to=vbabka@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=js1304@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=yang.shi@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.