From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752700AbdDJTSR (ORCPT ); Mon, 10 Apr 2017 15:18:17 -0400
Received: from mx1.redhat.com ([209.132.183.28]:44124 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751140AbdDJTSQ (ORCPT ); Mon, 10 Apr 2017 15:18:16 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com CB36FEEF0E
Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=jmoyer@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com CB36FEEF0E
From: Jeff Moyer
To: Kees Cook
Cc: Thomas Garnier, Ingo Molnar, Baoquan He, Dan Williams, LKML, linux-nvdimm@ml01.01.org
Subject: Re: KASLR causes intermittent boot failures on some systems
References:
X-PGP-KeyID: 1F78E1B4
X-PGP-CertKey: F6FE 280D 8293 F72C 65FD 5A58 1FF8 A7CA 1F78 E1B4
X-PCLoadLetter: What the f**k does that mean?
Date: Mon, 10 Apr 2017 15:18:14 -0400
In-Reply-To: (Kees Cook's message of "Mon, 10 Apr 2017 12:03:55 -0700")
Message-ID:
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Mon, 10 Apr 2017 19:18:16 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Kees Cook writes:

> On Mon, Apr 10, 2017 at 11:22 AM, Jeff Moyer wrote:
>> Kees Cook writes:
>>
>>> On Mon, Apr 10, 2017 at 8:49 AM, Jeff Moyer wrote:
>>>> Kees Cook writes:
>>>>
>>>>> On Fri, Apr 7, 2017 at 7:41 AM, Jeff Moyer wrote:
>>>>>> Hi,
>>>>>>
>>>>>> commit 021182e52fe01 ("x86/mm: Enable KASLR for physical mapping memory
>>>>>> regions") causes some of my systems with persistent memory (whether real
>>>>>> or emulated) to fail to boot with a couple of different crash
>>>>>> signatures. The first signature is a NMI watchdog lockup of all but 1
>>>>>> cpu, which causes much difficulty in extracting useful information from
>>>>>> the console. The second variant is an invalid paging request, listed
>>>>>> below.
>>>>>
>>>>> Just to rule out some of the stuff in the boot path, does booting with
>>>>> "nokaslr" solve this? (i.e. I want to figure out if this is from some
>>>>> of the rearrangements done that are exposed under that commit, or if
>>>>> it is genuinely the randomization that is killing the systems...)
>>>>
>>>> Adding "nokaslr" to the boot line does indeed make the problem go away.
>>>
>>> Are you booting with a memmap= flag?
>>
>> From my first email:
>>
>> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.11.0-rc5+
>> root=/dev/mapper/rhel_intel--lizardhead--04-root ro memmap=192G!1024G
>> crashkernel=auto rd.lvm.lv=rhel_intel-lizardhead-04/root
>> rd.lvm.lv=rhel_intel-lizardhead-04/swap console=ttyS0,115200n81
>> LANG=en_US.UTF-8
>>
>> Did you not receive the attachments?
>
> I see it now, thanks!
>
> The memmap parsing was added in -rc1 (f28442497b5ca), so I'd expect
> that to be handled. Hmmm.

I can also reproduce this on a system with real persistent memory, which
does not require the memmap parameter.

Cheers,
Jeff