From: Michal Hocko <mhocko@kernel.org>
To: Pasha Tatashin <pasha.tatashin@oracle.com>
Cc: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	x86@kernel.org, kasan-dev@googlegroups.com,
	borntraeger@de.ibm.com, heiko.carstens@de.ibm.com,
	davem@davemloft.net, willy@infradead.org,
	ard.biesheuvel@linaro.org, mark.rutland@arm.com,
	will.deacon@arm.com, catalin.marinas@arm.com, sam@ravnborg.org,
	mgorman@techsingularity.net, steven.sistare@oracle.com,
	daniel.m.jordan@oracle.com, bob.picco@oracle.com
Subject: Re: [PATCH v10 05/10] mm: zero reserved and unavailable struct pages
Date: Tue, 10 Oct 2017 15:39:06 +0200
Message-ID: <20171010133906.l2anyahcvgn6mg7o@dhcp22.suse.cz>
In-Reply-To: <bcf24369-ac37-cedd-a264-3396fb5cf39e@oracle.com>

On Fri 06-10-17 11:25:16, Pasha Tatashin wrote:
> Hi Michal,
> 
> > 
> > As I've said in another reply, this should go in only if the scenario
> > you describe is real. I am somewhat suspicious, to be honest. I simply
> > do not see how those weird struct pages would be in a valid pfn range of
> > any zone.
> > 
> 
> There are examples of both: cases where unavailable memory is not part of
> any zone, and cases where it is part of a zone.
> 
> I run Linux in kvm with these arguments:
> 
>         qemu-system-x86_64
>         -enable-kvm
>         -cpu kvm64
>         -kernel $kernel
>         -initrd $initrd
>         -m 512
>         -smp 2
>         -device e1000,netdev=net0
>         -netdev user,id=net0
>         -boot order=nc
>         -no-reboot
>         -watchdog i6300esb
>         -watchdog-action debug
>         -rtc base=localtime
>         -serial stdio
>         -display none
>         -monitor null
> 
> This patch reports that there are 98 unavailable pages.
> 
> They are: pfn 0 and pfns in range [159, 255].
> 
> Note that trim_low_memory_range() reserves only pfns in the range [0, 15];
> it does not reserve the [159, 255] ones.
> 
> e820__memblock_setup() reports to Linux that the following physical ranges
> are available:
>     [  1,    158]
>     [256, 130783]
> 
> Notice that exactly the unavailable pfns are missing!
> 
> Now, let's check what we have in zone 0: [1, 131039]
> 
> pfn 0 is not part of the zone, but pfns [1, 158] are.
> 
> However, the bigger problem if we do not initialize these struct pages is
> with memory hotplug, because that path operates at 2M boundaries
> (section_nr) and checks whether a 2M range of pages is hot removable. It
> starts with the first pfn of the zone, rounds it down to a 2M boundary
> (struct pages are allocated at 2M boundaries when vmemmap is created), and
> checks whether that section is hot removable. In this case it starts with
> pfn 1 and rounds it down to pfn 0.

Hmm, this is really interesting! I thought each memblock was guaranteed
to be section size aligned, but I suspect that is more wishful thinking.
Now I see what the problem is.
-- 
Michal Hocko
SUSE Labs

Thread overview: 76+ messages
2017-10-05 21:11 [PATCH v10 00/10] complete deferred page initialization Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 01/10] x86/mm: setting fields in deferred pages Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 02/10] sparc64/mm: setting fields in deferred pages Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 03/10] sparc64: simplify vmemmap_populate Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 04/10] mm: defining memblock_virt_alloc_try_nid_raw Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 05/10] mm: zero reserved and unavailable struct pages Pavel Tatashin
2017-10-06 12:30   ` Michal Hocko
2017-10-06 15:25     ` Pasha Tatashin
2017-10-10 13:39       ` Michal Hocko [this message]
2017-10-05 21:11 ` [PATCH v10 06/10] mm/kasan: kasan specific map populate function Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 07/10] x86/kasan: use kasan_map_populate() Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 08/10] arm64/kasan: use kasan_map_populate() Pavel Tatashin
2017-10-05 21:11 ` [PATCH v10 09/10] mm: stop zeroing memory during allocation in vmemmap Pavel Tatashin
2017-10-06 11:10   ` David Laight
2017-10-06 11:47     ` Michal Hocko
2017-10-06 12:11       ` David Laight
2017-10-06 12:25         ` Michal Hocko
2017-10-05 21:11 ` [PATCH v10 10/10] sparc64: optimized struct page zeroing Pavel Tatashin