From: Baoquan He <bhe@redhat.com>
To: linux-kernel@vger.kernel.org, mingo@kernel.org,
	lcapitulino@redhat.com, keescook@chromium.org,
	tglx@linutronix.de
Cc: x86@kernel.org, hpa@zytor.com, fanc.fnst@cn.fujitsu.com,
	yasu.isimatu@gmail.com, indou.takao@jp.fujitsu.com,
	douly.fnst@cn.fujitsu.com, Baoquan He <bhe@redhat.com>
Subject: [PATCH v2 2/2] x86/boot/KASLR: Skip specified number of 1GB huge pages when do physical randomization
Date: Mon, 25 Jun 2018 11:16:56 +0800
Message-ID: <20180625031656.12443-3-bhe@redhat.com>
In-Reply-To: <20180625031656.12443-1-bhe@redhat.com>

A regression in 1GB huge page allocation can be triggered when KASLR is
enabled. On a KVM guest with 4GB RAM, add the following to the kernel
command-line:

	'default_hugepagesz=1G hugepagesz=1G hugepages=1'

Then boot the guest and check the number of 1GB pages reserved:
  # grep HugePages_Total /proc/meminfo

When booting with "nokaslr", HugePages_Total is always 1. When booting
without "nokaslr", HugePages_Total is sometimes 0 (that is, reserving
the 1GB page fails). Note that it may take a few boots to trigger the
issue.

After investigation, the root cause is that the kernel may be randomly
placed into the only good 1GB huge page, [0x40000000, 0x7fffffff]. Below
is a dmesg output snippet from the KVM guest. It shows that only the
[0x40000000, 0x7fffffff] region is a good 1GB huge page, because
[0x100000000, 0x13fffffff] will be touched by memblock top-down
allocation.

[...] e820: BIOS-provided physical RAM map:
[...] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[...] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[...] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[...] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable
[...] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved
[...] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[...] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[...] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable
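
To make the arithmetic behind this map explicit, here is a minimal
user-space sketch that counts the whole, 1GB-aligned pages inside an
inclusive physical range. The helper name count_gb_pages() is
hypothetical and is not a kernel function; it only illustrates why each
of the two large usable ranges holds exactly one candidate 1GB page.

```c
#include <assert.h>
#include <stdint.h>

#define GB (1ULL << 30)

/*
 * Hypothetical helper (not kernel code): number of whole, 1GB-aligned
 * pages contained in the inclusive range [start, end].
 */
static uint64_t count_gb_pages(uint64_t start, uint64_t end)
{
	uint64_t first, limit;

	assert(end >= start);

	first = (start + GB - 1) & ~(GB - 1);	/* round start up to 1GB */
	limit = (end + 1) & ~(GB - 1);		/* round end + 1 down to 1GB */

	return limit > first ? (limit - first) / GB : 0;
}
```

For the ranges above, count_gb_pages(0x100000, 0xbffdffff) and
count_gb_pages(0x100000000, 0x13fffffff) each evaluate to 1. The first
range yields [0x40000000, 0x7fffffff]; the page in the second range is
the one that memblock top-down allocation touches.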

Besides, on bare-metal machines with more memory, one fewer 1GB huge
page may be available with KASLR enabled. This is also because the
kernel may be randomized into one of those good 1GB huge pages.

To fix this, first parse the kernel command-line to get how many 1GB
huge pages are specified. Then try to skip the specified number of 1GB
huge pages when deciding which memory regions the kernel can be
randomized into.
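
The two steps can be sketched in user space as follows. This is an
illustration under stated assumptions, not the code from patch 1/2:
parse_gb_pages(), skip_gb_pages() and struct range are hypothetical
names, the parser relies on libc (strtok, strtoul) that the
decompressor does not have, and it only recognizes the literal size
"1G".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GB (1ULL << 30)

/* Hypothetical region type: [start, start + size). */
struct range {
	uint64_t start;
	uint64_t size;
};

/*
 * Hypothetical parser: return N from a "hugepages=N" option that follows
 * "hugepagesz=1G" or "default_hugepagesz=1G"; 0 if there is none.
 */
static unsigned long parse_gb_pages(const char *cmdline)
{
	char buf[256], *tok;
	unsigned long count = 0;
	int gb_size = 0;

	assert(cmdline != NULL);
	strncpy(buf, cmdline, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
		char *val = strchr(tok, '=');

		if (!val)
			continue;
		*val++ = '\0';

		if (!strcmp(tok, "hugepagesz") ||
		    !strcmp(tok, "default_hugepagesz"))
			gb_size = !strcmp(val, "1G");
		else if (!strcmp(tok, "hugepages") && gb_size)
			count = strtoul(val, NULL, 10);
	}

	return count;
}

/*
 * Carve up to `want` whole 1GB-aligned pages out of region r. The pieces
 * left over (before and after the skipped pages) are written to out[];
 * the return value is the number of pieces (0, 1 or 2).
 */
static int skip_gb_pages(struct range r, unsigned long want,
			 struct range out[2])
{
	uint64_t end = r.start + r.size;
	uint64_t first = (r.start + GB - 1) & ~(GB - 1);
	uint64_t avail = first < end ? (end - first) / GB : 0;
	uint64_t skip_end;
	int n = 0;

	if (avail > want)
		avail = want;
	if (!avail) {
		out[0] = r;	/* nothing to skip, keep the whole region */
		return 1;
	}

	skip_end = first + avail * GB;
	if (first > r.start)
		out[n++] = (struct range){ r.start, first - r.start };
	if (end > skip_end)
		out[n++] = (struct range){ skip_end, end - skip_end };

	return n;
}
```

With the command-line from the reproduction, parse_gb_pages() returns 1.
Applying skip_gb_pages() with that count to the usable range starting at
0x100000 leaves two candidate pieces, [0x100000, 0x3fffffff] and
[0x80000000, 0xbffdffff], so [0x40000000, 0x7fffffff] stays free for the
huge page reservation.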

Also rename handle_mem_memmap() to handle_mem_options(), since it now
handles not only 'mem=' and 'memmap=' but also 'hugepagesxxx'.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/boot/compressed/kaslr.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 0fea96f9cc28..ff8a865de36b 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -244,7 +244,7 @@ static void parse_gb_huge_pages(char *param, char *val)
 }
 
 
-static int handle_mem_memmap(void)
+static int handle_mem_options(void)
 {
 	char *args = (char *)get_cmd_line_ptr();
 	size_t len = strlen((char *)args);
@@ -252,7 +252,8 @@ static int handle_mem_memmap(void)
 	char *param, *val;
 	u64 mem_size;
 
-	if (!strstr(args, "memmap=") && !strstr(args, "mem="))
+	if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
+		!strstr(args, "hugepages"))
 		return 0;
 
 	tmp_cmdline = malloc(len + 1);
@@ -277,6 +278,8 @@ static int handle_mem_memmap(void)
 
 		if (!strcmp(param, "memmap")) {
 			mem_avoid_memmap(val);
+		} else if (strstr(param, "hugepages")) {
+			parse_gb_huge_pages(param, val);
 		} else if (!strcmp(param, "mem")) {
 			char *p = val;
 
@@ -416,7 +419,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	/* We don't need to set a mapping for setup_data. */
 
 	/* Mark the memmap regions we need to avoid */
-	handle_mem_memmap();
+	handle_mem_options();
 
 #ifdef CONFIG_X86_VERBOSE_BOOTUP
 	/* Make sure video RAM can be used. */
@@ -629,7 +632,7 @@ static void process_mem_region(struct mem_vector *entry,
 
 		/* If nothing overlaps, store the region and return. */
 		if (!mem_avoid_overlap(&region, &overlap)) {
-			store_slot_info(&region, image_size);
+			process_gb_huge_pages(&region, image_size);
 			return;
 		}
 
@@ -639,7 +642,7 @@ static void process_mem_region(struct mem_vector *entry,
 
 			beginning.start = region.start;
 			beginning.size = overlap.start - region.start;
-			store_slot_info(&beginning, image_size);
+			process_gb_huge_pages(&beginning, image_size);
 		}
 
 		/* Return if overlap extends to or past end of region. */
-- 
2.13.6


Thread overview: 6+ messages
2018-06-25  3:16 [PATCH v2 0/2] x86/boot/KASLR: Skip specified number of 1GB huge pages when do physical randomization Baoquan He
2018-06-25  3:16 ` [PATCH v2 1/2] x86/boot/KASLR: Add two functions for 1GB huge pages handling Baoquan He
2018-07-03 15:58   ` [tip:x86/boot] x86/boot/KASLR: Add two new " tip-bot for Baoquan He
2018-06-25  3:16 ` Baoquan He [this message]
2018-07-03 15:58   ` [tip:x86/boot] x86/boot/KASLR: Skip specified number of 1GB huge pages when doing physical randomization (KASLR) tip-bot for Baoquan He
2018-06-26 15:06 ` [PATCH v2 0/2] x86/boot/KASLR: Skip specified number of 1GB huge pages when do physical randomization Luiz Capitulino
