All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Chao Fan <fanc.fnst@cn.fujitsu.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	yasu.isimatu@gmail.com, keescook@chromium.org,
	indou.takao@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com,
	douly.fnst@cn.fujitsu.com, mhocko@suse.com, vbabka@suse.cz,
	mgorman@techsingularity.net
Subject: Re: Bug report about KASLR and ZONE_MOVABLE
Date: Wed, 11 Jul 2018 18:41:58 +0800	[thread overview]
Message-ID: <20180711104158.GE2070@MiWiFi-R3L-srv> (raw)
In-Reply-To: <20180711094244.GA2019@localhost.localdomain>

On 07/11/18 at 05:42pm, Chao Fan wrote:
> Hi all,
> 
> I found there is a BUG about KASLR and ZONE_MOVABLE.
> 
> When users use 'kernelcore=' parameter without 'movable_node',
> movable memory is evenly distributed to all nodes. The size of
> ZONE_MOVABLE depends on the kernel parameter 'kernelcore=' and
> 'movablecore='.
> But sometiomes, KASLR may put the uncompressed kernel to the
> tail position of a node, which will cause the kernel memory
> set as ZONE_MOVABLE. This region can not be offlined.
> 
> Here is a very simple test in my qemu-kvm machine, there is
> only one node:
> 
> The command line:
> [root@localhost ~]# cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-4.18.0-rc3+ root=/dev/mapper/fedora_localhost--live-root
> ro resume=/dev/mapper/fedora_localhost--live-swap
> rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap
> console=ttyS0 earlyprintk=ttyS0,115200n8 memblock=debug kernelcore=50%
> 
> I use 'kernelcore=50%' here.
> 
> Here is my early print result, I print the random_addr after KASLR chooses
> physical memory:
> early console in extract_kernel
> input_data: 0x000000000266b3b1
> input_len: 0x00000000007d8802
> output: 0x0000000001000000
> output_len: 0x0000000001e15698
> kernel_total_size: 0x0000000001a8b000
> trampoline_32bit: 0x000000000009d000
> booted via startup_32()
> Physical KASLR using RDRAND RDTSC...
> random_addr: 0x000000012f000000
> Virtual KASLR using RDRAND RDTSC...
> 
> The address for kernel is 0x000000012f000000
> 
> Here is the log of ZONE:
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> [    0.000000]   Normal   [mem 0x0000000100000000-0x00000001f57fffff]
> [    0.000000]   Device   empty
> [    0.000000] Movable zone start for each node
> [    0.000000]   Node 0: 0x000000011b000000
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
> [    0.000000]   node   0: [mem 0x0000000000100000-0x00000000bffd6fff]
> [    0.000000]   node   0: [mem 0x0000000100000000-0x00000001f57fffff]
> [    0.000000] Initmem setup node 0 [mem
> 0x0000000000001000-0x00000001f57fffff]
> 
> Only one node in my machine, ZONE_MOVABLE begins from 0x000000011b000000,
> which is lower than 0x000000012f000000.
> So KASLR put the kernel to the ZONE_MOVABLE.
> Try to solve this problem, I think there should be a new tactic in function
> find_zone_movable_pfns_for_nodes() of mm/page_alloc.c. If kernel is uncompressed
> in a tail position, then just set the memory after the kernel as ZONE_MOVABLE,
> at the same time, memory in other nodes will be set as ZONE_MOVABLE.

Hmm, it's an issue, worth fixing it. Otherwise the size of
movable area will be smaller than we expect when add "kernel_core="
or "movable_core=".

Add a check in find_zone_movable_pfns_for_nodes(), and use min() to get
the starting address of movable area between aligned '_etext'
and start_pfn. It will go to label 'restart' to calculate the 2nd round
if not satisfiled. 

Hi Chao,

Could you check if below patch works for you?


From ab6e47c6a78d1a4ccb577b995b7b386f3149732f Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Wed, 11 Jul 2018 18:30:04 +0800
Subject: [PATCH] mm, page_alloc: find movable zone after kernel text

In find_zone_movable_pfns_for_nodes(), when try to find the starting
PFN movable zone begins in each node, kernel text position is not
considered. KASLR may put kernel after which movable zone begins.

Fix it by finding movable zone after kernel text.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100..fe346b4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6678,6 +6678,8 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 			unsigned long size_pages;
 
 			start_pfn = max(start_pfn, zone_movable_pfn[nid]);
+			/* KASLR may put kernel after 'start_pfn', start after kernel */
+			start_pfn = max(start_pfn, PAGE_ALIGN(_etext));
 			if (start_pfn >= end_pfn)
 				continue;
 
-- 
2.1.0


WARNING: multiple messages have this Message-ID (diff)
From: Baoquan He <bhe@redhat.com>
To: Chao Fan <fanc.fnst@cn.fujitsu.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	yasu.isimatu@gmail.com, keescook@chromium.org,
	indou.takao@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com,
	douly.fnst@cn.fujitsu.com, mhocko@suse.com, vbabka@suse.cz,
	mgorman@techsingularity.net
Subject: Re: Bug report about KASLR and ZONE_MOVABLE
Date: Wed, 11 Jul 2018 18:41:58 +0800	[thread overview]
Message-ID: <20180711104158.GE2070@MiWiFi-R3L-srv> (raw)
In-Reply-To: <20180711094244.GA2019@localhost.localdomain>

On 07/11/18 at 05:42pm, Chao Fan wrote:
> Hi all,
> 
> I found there is a BUG about KASLR and ZONE_MOVABLE.
> 
> When users use 'kernelcore=' parameter without 'movable_node',
> movable memory is evenly distributed to all nodes. The size of
> ZONE_MOVABLE depends on the kernel parameter 'kernelcore=' and
> 'movablecore='.
> But sometiomes, KASLR may put the uncompressed kernel to the
> tail position of a node, which will cause the kernel memory
> set as ZONE_MOVABLE. This region can not be offlined.
> 
> Here is a very simple test in my qemu-kvm machine, there is
> only one node:
> 
> The command line:
> [root@localhost ~]# cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-4.18.0-rc3+ root=/dev/mapper/fedora_localhost--live-root
> ro resume=/dev/mapper/fedora_localhost--live-swap
> rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap
> console=ttyS0 earlyprintk=ttyS0,115200n8 memblock=debug kernelcore=50%
> 
> I use 'kernelcore=50%' here.
> 
> Here is my early print result, I print the random_addr after KASLR chooses
> physical memory:
> early console in extract_kernel
> input_data: 0x000000000266b3b1
> input_len: 0x00000000007d8802
> output: 0x0000000001000000
> output_len: 0x0000000001e15698
> kernel_total_size: 0x0000000001a8b000
> trampoline_32bit: 0x000000000009d000
> booted via startup_32()
> Physical KASLR using RDRAND RDTSC...
> random_addr: 0x000000012f000000
> Virtual KASLR using RDRAND RDTSC...
> 
> The address for kernel is 0x000000012f000000
> 
> Here is the log of ZONE:
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> [    0.000000]   Normal   [mem 0x0000000100000000-0x00000001f57fffff]
> [    0.000000]   Device   empty
> [    0.000000] Movable zone start for each node
> [    0.000000]   Node 0: 0x000000011b000000
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
> [    0.000000]   node   0: [mem 0x0000000000100000-0x00000000bffd6fff]
> [    0.000000]   node   0: [mem 0x0000000100000000-0x00000001f57fffff]
> [    0.000000] Initmem setup node 0 [mem
> 0x0000000000001000-0x00000001f57fffff]
> 
> Only one node in my machine, ZONE_MOVABLE begins from 0x000000011b000000,
> which is lower than 0x000000012f000000.
> So KASLR put the kernel to the ZONE_MOVABLE.
> Try to solve this problem, I think there should be a new tactic in function
> find_zone_movable_pfns_for_nodes() of mm/page_alloc.c. If kernel is uncompressed
> in a tail position, then just set the memory after the kernel as ZONE_MOVABLE,
> at the same time, memory in other nodes will be set as ZONE_MOVABLE.

Hmm, it's an issue, worth fixing it. Otherwise the size of
movable area will be smaller than we expect when add "kernel_core="
or "movable_core=".

Add a check in find_zone_movable_pfns_for_nodes(), and use min() to get
the starting address of movable area between aligned '_etext'
and start_pfn. It will go to label 'restart' to calculate the 2nd round
if not satisfiled. 

Hi Chao,

Could you check if below patch works for you?

  parent reply	other threads:[~2018-07-11 10:42 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-11  9:42 Bug report about KASLR and ZONE_MOVABLE Chao Fan
2018-07-11 10:04 ` Chao Fan
2018-07-11 10:16 ` Chao Fan
2018-07-11 10:41 ` Baoquan He [this message]
2018-07-11 10:41   ` Baoquan He
2018-07-11 10:49   ` Baoquan He
2018-07-11 10:49     ` Baoquan He
2018-07-11 12:40     ` Baoquan He
2018-07-11 12:40       ` Baoquan He
2018-07-11 17:59       ` [PATCH v3] mm, page_alloc: find movable zone after kernel text kbuild test robot
2018-07-11 19:02       ` kbuild test robot
2018-07-12  1:19       ` Bug report about KASLR and ZONE_MOVABLE Chao Fan
2018-07-12  1:19         ` Chao Fan
2018-07-12 12:08         ` Baoquan He
2018-07-12  5:49       ` Dou Liyang
2018-07-12  5:49         ` Dou Liyang
2018-07-12  6:01         ` Chao Fan
2018-07-12  6:01           ` Chao Fan
2018-07-12 12:32           ` Michal Hocko
2018-07-12 23:52             ` Baoquan He
2018-07-12 23:52               ` Baoquan He
2018-07-13  1:44               ` Chao Fan
2018-07-13  1:44                 ` Chao Fan
2018-07-16 11:38               ` Michal Hocko
2018-07-16 13:02                 ` Baoquan He
2018-07-16 15:24                   ` Michal Hocko
2018-07-17  1:51                     ` Baoquan He
2018-07-17  8:22                       ` Michal Hocko
2018-07-11 12:41     ` Baoquan He

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180711104158.GE2070@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=caoj.fnst@cn.fujitsu.com \
    --cc=douly.fnst@cn.fujitsu.com \
    --cc=fanc.fnst@cn.fujitsu.com \
    --cc=indou.takao@jp.fujitsu.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    --cc=x86@kernel.org \
    --cc=yasu.isimatu@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.