From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@suse.de>,
	Pavel Machek <pavel@ucw.cz>,
	Linux PM list <linux-pm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	shuzzle@mailbox.org
Subject: Re: [PATCH] x86/asm/power: Fix hibernation return address corruption
Date: Thu, 28 Jul 2016 23:36:39 +0200
Message-ID: <1760536.lNRnLG8nBr@vostro.rjw.lan>
In-Reply-To: <20160728151707.nmtkzri4jtumaq6h@treble>

On Thursday, July 28, 2016 10:17:07 AM Josh Poimboeuf wrote:
> On Thu, Jul 28, 2016 at 01:29:49AM +0200, Rafael J. Wysocki wrote:
> > On Thursday, July 28, 2016 01:20:53 AM Rafael J. Wysocki wrote:
> > > On Wednesday, July 27, 2016 05:17:38 PM Josh Poimboeuf wrote:
> > > > On Thu, Jul 28, 2016 at 12:12:15AM +0200, Rafael J. Wysocki wrote:
> > > > > On Wednesday, July 27, 2016 12:59:18 PM Josh Poimboeuf wrote:
> > > > > > Hm... I have a theory, but I'm not sure about it.  I noticed that
> > > > > > x86_acpi_enter_sleep_state(),
> > > > > 
> > > > > I think you mean x86_acpi_suspend_lowlevel().
> > > > 
> > > > Oops!
> > > > 
> > > > > > which is involved in suspend, overwrites
> > > > > > > several global variables (e.g., initial_code) which are used by the CPU
> > > > > > boot code in head_64.S.  But surprisingly, it doesn't restore those
> > > > > > variables to their original values after it resumes.
> > > > > 
> > > > > Is the head_64.S code also used to bring up offline CPUs?
> > > > 
> > > > Yes.
> > > 
> > > OK
> > > 
> > > So it is really interesting why and how that stuff works for everybody.
> > > 
> > > Basically, CPU online should fail after a suspend-resume cycle, but it
> > > doesn't most of the time AFAICS.
> > 
> > do_boot_cpu() restores those values, so I think we're safe from that angle.
> > 
> > That should apply to the CPU online during resume from hibernation too.
> 
> Yeah, my theory was bogus.  And as it turns out, the bug reporter made a
> mistake in the bisect.  The actual offending commit was apparently:
> 
>   ef0f3ed5a4ac ("x86/asm/power: Create stack frames in hibernate_asm_64.S")
> 
> Amazingly enough, I authored that patch as well.  I think "git bisect"
> doesn't like me!
> 
> Here's the fix:
> 
> ----
> 
> From: Josh Poimboeuf <jpoimboe@redhat.com>
> Subject: [PATCH] x86/asm/power: Fix hibernation return address corruption
> 
> In kernel bug 150021, a kernel panic was reported when restoring a
> hibernate image.  Only a picture of the oops was reported, so I can't
> paste the whole thing here.  But here are the most interesting parts:
> 
>   kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
>   BUG: unable to handle kernel paging request at ffff8804615cfd78
>   ...
>   RIP: ffff8804615cfd78
>   RSP: ffff8804615f0000
>   RBP: ffff8804615cfdc0
>   ...
>   Call Trace:
>    do_signal+0x23
>    exit_to_usermode_loop+0x64
>    ...
> 
> The RIP is on the same page as RBP, so it apparently started executing
> on the stack.
> 
> The bug was bisected to commit ef0f3ed5a4ac ("x86/asm/power: Create
> stack frames in hibernate_asm_64.S"), which in retrospect seems quite
> dangerous, since that code saves and restores the stack pointer from a
> global variable ('saved_context').
> 
> There are a lot of moving parts in the hibernate save and restore paths,
> so I don't know exactly what caused the panic.  Presumably, a FRAME_END
> was executed without the corresponding FRAME_BEGIN, or vice versa.  That
> would corrupt the return address on the stack and would be consistent
> with the details of the above panic.

One problem that I can see immediately is that the stack pointer may not
be valid any more by the time the FRAME_BEGIN in restore_registers() is
executed.  The memory it points to (which used to be a stack area of the
restore kernel) may have been overwritten by some image memory contents
from before hibernation and that page frame may now be used for whatever
other purpose it had been allocated for before hibernation.  If that
happens, the FRAME_BEGIN will corrupt that memory.
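
To make that concrete, here is a minimal user-space sketch, not kernel
code.  It assumes CONFIG_FRAME_POINTER, where FRAME_BEGIN expands to
"push %rbp; mov %rsp, %rbp".  The struct cpu model, the 8-word "page" and
the helper names are hypothetical and exist only for illustration.  The
page stands for a former restore-kernel stack page that now holds
restored image data, while the stack pointer still points into it:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 8	/* tiny stand-in for a page frame (hypothetical size) */

/* Hypothetical model of the two registers that matter here. */
struct cpu {
	uint64_t *rsp;	/* stack pointer */
	uint64_t rbp;	/* frame pointer */
};

/* Models FRAME_BEGIN with frame pointers enabled: push %rbp; mov %rsp, %rbp */
void frame_begin(struct cpu *c)
{
	c->rsp -= 1;		/* push: move the stack pointer down one word ... */
	*c->rsp = c->rbp;	/* ... and store the old frame pointer there */
	c->rbp = (uint64_t)(uintptr_t)c->rsp;
}

/* Returns how many words of restored image data the stale push changes. */
int words_clobbered_by_stale_push(void)
{
	uint64_t page[PAGE_WORDS], before[PAGE_WORDS];
	/* The stack pointer is left over from the restore kernel. */
	struct cpu c = { .rsp = &page[4], .rbp = 0x1111 };
	int i, changed = 0;

	/* Image restoration has reused the page frame for unrelated data. */
	for (i = 0; i < PAGE_WORDS; i++)
		page[i] = UINT64_C(0xD0D0D0D000) + i;
	memcpy(before, page, sizeof(page));

	/* The push writes through the stale stack pointer. */
	frame_begin(&c);
	assert(c.rsp == &page[3]);

	for (i = 0; i < PAGE_WORDS; i++)
		if (page[i] != before[i])
			changed++;
	return changed;
}
```

In this model words_clobbered_by_stale_push() returns 1: the single word
just below the stale stack pointer now holds the old frame pointer value
instead of the restored image data.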

Embarrassingly enough, I have looked at that piece of code dozens of
times recently, but somehow I've never translated that FRAME_BEGIN into
a push instruction. :-/

> Instead of doing the frame pointer save/restore around the bounds of the
> affected functions, just do it around the call to swsusp_save().
> That has the same effect of ensuring that if swsusp_save() sleeps, the
> frame pointers will be correct.  It's also a much more obviously safe
> way to do it than the original patch.  And objtool still doesn't report
> any warnings.
> 
> Fixes: ef0f3ed5a4ac ("x86/asm/power: Create stack frames in hibernate_asm_64.S")
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=150021
> Reported-by: <shuzzle@mailbox.org>
> Tested-by: <shuzzle@mailbox.org>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

I've queued this up as an urgent fix.  I hope there are no objections.

Thanks,
Rafael
