From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933366AbdDGVZp (ORCPT ); Fri, 7 Apr 2017 17:25:45 -0400
Received: from mail-io0-f169.google.com ([209.85.223.169]:33960 "EHLO mail-io0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751174AbdDGVZh (ORCPT ); Fri, 7 Apr 2017 17:25:37 -0400
MIME-Version: 1.0
In-Reply-To:
References:
From: Kees Cook
Date: Fri, 7 Apr 2017 14:25:35 -0700
X-Google-Sender-Auth: x24k3Do8W4tW682M1P1F7_DWtVM
Message-ID:
Subject: Re: KASLR causes intermittent boot failures on some systems
To: Jeff Moyer
Cc: Thomas Garnier, Ingo Molnar, Baoquan He, Dan Williams, LKML, linux-nvdimm@ml01.01.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 7, 2017 at 7:41 AM, Jeff Moyer wrote:
> Hi,
>
> commit 021182e52fe01 ("x86/mm: Enable KASLR for physical mapping memory
> regions") causes some of my systems with persistent memory (whether real
> or emulated) to fail to boot with a couple of different crash
> signatures. The first signature is a NMI watchdog lockup of all but 1
> cpu, which causes much difficulty in extracting useful information from
> the console. The second variant is an invalid paging request, listed
> below.

Just to rule out some of the stuff in the boot path, does booting with
"nokaslr" solve this? (i.e. I want to figure out if this is from some
of the rearrangements done that are exposed under that commit, or if
it is genuinely the randomization that is killing the systems...)

> On some systems, I haven't hit this problem at all. Other systems
> experience a failed boot maybe 20-30% of the time. To reproduce it,
> configure some emulated pmem on your system. You can find directions
> for that here: https://nvdimm.wiki.kernel.org/
>
> Install ndctl (https://github.com/pmem/ndctl).
> Configure the namespace:
> # ndctl create-namespace -f -e namespace0.0 -m memory
>
> Then just reboot several times (5 should be enough), and hopefully
> you'll hit the issue.
>
> I've attached both my .config and the dmesg output from a successful
> boot at the end of this mail.

Thanks! Considering I know nothing about pmem (yet), I bet there is
some oversight in what's happening with how KASLR scans for available
memory areas. I'll carve out some time next week to look into this.

-Kees

-- 
Kees Cook
Pixel Security