From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932486AbcGZUsW (ORCPT ); Tue, 26 Jul 2016 16:48:22 -0400 Received: from cloudserver094114.home.net.pl ([79.96.170.134]:53086 "HELO cloudserver094114.home.net.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S932465AbcGZUsS (ORCPT ); Tue, 26 Jul 2016 16:48:18 -0400 From: "Rafael J. Wysocki" To: Kees Cook Cc: Borislav Petkov , Ingo Molnar , Josh Poimboeuf , Pavel Machek , Linux PM list , Linux Kernel Mailing List , Thomas Gleixner , shuzzle@mailbox.org, Thomas Garnier Subject: Re: Fwd: [Bug 150021] New: kernel panic: "kernel tried to execute NX-protected page" when resuming from hibernate to disk Date: Tue, 26 Jul 2016 22:53:20 +0200 Message-ID: <2264716.77pSArDZvG@vostro.rjw.lan> User-Agent: KMail/4.11.5 (Linux/4.5.0-rc1+; KDE/4.11.5; x86_64; ; ) In-Reply-To: References: <16541580.dFLT14ScxF@vostro.rjw.lan> <2437449.FpDj7DlX4Y@vostro.rjw.lan> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="utf-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tuesday, July 26, 2016 01:33:02 PM Kees Cook wrote: > On Tue, Jul 26, 2016 at 1:24 PM, Rafael J. Wysocki wrote: > > On Tuesday, July 26, 2016 04:04:42 PM Borislav Petkov wrote: > >> On Tue, Jul 26, 2016 at 01:32:28PM +0200, Rafael J. Wysocki wrote: > >> > Hi, > >> > > >> > The following commit: > >> > > >> > commit 13523309495cdbd57a0d344c0d5d574987af007f > >> > Author: Josh Poimboeuf > >> > Date: Thu Jan 21 16:49:21 2016 -0600 > >> > > >> > x86/asm/acpi: Create a stack frame in do_suspend_lowlevel() > >> > > >> > do_suspend_lowlevel() is a callable non-leaf function which doesn't > >> > honor CONFIG_FRAME_POINTER, which can result in bad stack traces. > >> > > >> > Create a stack frame for it when CONFIG_FRAME_POINTER is enabled. > >> > > >> > is reported to cause a resume-from-hibernation regression due to an attempt > >> > to execute an NX page (we've seen quite a bit of that recently). > >> > > >> > I'm asking the reporter to try 4.7, but if the problem is still there, we'll > >> > need to revert the above I'm afraid. > >> > >> So I can't resume properly from disk too, on the Intel laptop this time. Top > >> commit is from tip/master: > >> > >> commit 516f48acf59722429acd323b3d283f74f02891fe (refs/remotes/tip/master) > >> Merge: a4823bbffc96 dd9506954539 > >> Author: Ingo Molnar > >> Date: Mon Jul 25 08:39:43 2016 +0200 > >> > >> Merge branch 'linus' > >> > >> > >> So I thought it might be Josh's patch above and reverted it. No joy. > >> > >> Then I remembered that I enabled CONFIG_RANDOMIZE_MEMORY for the > >> microcode loader breakage which we've been debugging. Turned that off > >> and machine resumes fine again. > > > > Well, I wasn't aware of *another* flavor of ASLR in the works. And there > > was no hope it would not break hibernation if you asked me. > > > >> It looks like > >> > >> 0483e1fa6e09 ("x86/mm: Implement ASLR for kernel memory regions") > >> > >> broke a bunch of things. Off the top of my head, we probably should make > >> suspend to disk and CONFIG_RANDOMIZE_MEMORY mutually exclusive, like it > >> was the case with ASLR previously, AFAIR. > > > > Please no. > > > > First off, it should be perfectly possible to make hibernation work along > > with this new variant of ASLR. Second, quite obviously, the author of these > > ASLR changes had not done sufficient research to estimate the possible > > impact of them. > > I think that's a bit unfair: Thomas did a lot of testing, and it has > been living in -next for a while. Well, with all due respect, "a lot of testing" is not quite the same thing as "sufficient research" IMO. It should be known (at least from experience) that hibernation on x86-64 doesn't play well with ASLR quite as a rule, so it would be good to at least check that particular thing or CC a relevant person (ie. me). Or even ask me on IRC for that matter. Give me a heads up ahead of time. But no. I'm still on the receiving end of the "hibernation doesn't work with ASLR" story which was entirely avoidable this time around. Sigh. > > Honestly, I don't think it is a good idea to introduce random Kconfig options > > for working around cases in which the author of some changes cannot be bothered > > with doing things right. Even if that is security. > > I would agree: let's try to get this fixed soon. > > > So IMO, either we should fix the problem, or that whole new ASLR stuff should > > be reverted. > > > > I think I know how to fix it, but I won't be able to get to that before the > > next week. I guess it can wait till then, though. > > Thomas, will you have some time to examine this and estimate the work for a fix? FWIW, my hunch ATM is that you need to look at the "Set up the direct mapping from scratch" loop in set_up_temporary_mappings() and make it do the right thing when the new ASLR stuff is enabled. Thanks, Rafael