linux-kernel.vger.kernel.org archive mirror
From: Jiri Kosina <jikos@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>, X86 ML <x86@kernel.org>,
	Borislav Petkov <bpetkov@suse.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume
Date: Thu, 7 Sep 2017 21:55:38 +0200 (CEST)
Message-ID: <nycvar.YFH.7.76.1709072152500.3285@jbgna.fhfr.qr>
In-Reply-To: <20170907074834.tmwo6vsvody2qrlg@gmail.com>

On Thu, 7 Sep 2017, Ingo Molnar wrote:

> > > When Linux brings a CPU down and back up, it switches to init_mm and then
> > > loads swapper_pg_dir into CR3.  With PCID enabled, this has the side effect
> > > of masking off the ASID bits in CR3.
> > > 
> > > This can result in some confusion in the TLB handling code.  If we
> > > bring a CPU down and back up with any ASID other than 0, we end up
> > > with the wrong ASID active on the CPU after resume.  This could
> > > cause our internal state to become corrupt, although major
> > > corruption is unlikely because init_mm doesn't have any user pages.
> > > More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion
> > > in the next context switch.  The result of *that* is a failure to
> > > resume from suspend with probability 1 - 1/6^(cpus-1).
> > > 
> > > Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.
> > > 
> > > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > > Reported-by: Jiri Kosina <jikos@kernel.org>
> > > Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID")
> > > Signed-off-by: Andy Lutomirski <luto@kernel.org>
> > 
> > Tested-by: Jiri Kosina <jkosina@suse.cz>
> 
> The fix should be upstream already, as of 1c9fe4409ce3 and later.

Hm, so I've just experienced two instances in a row of a reboot just after
reading the hibernation image (i.e. exactly the same symptom as before), even
with the 3b9f8ed kernel (which contains the fix). It seems the fix is either
incomplete (it only lowers the probability of the failure), or I'm seeing
something different with the same symptom.

I'll try to figure out whether it's the same VM_BUG_ON() triggering, but
I probably won't be able to do so until tomorrow.

-- 
Jiri Kosina
SUSE Labs

Thread overview: 24+ messages
2017-09-07  2:54 [PATCH 0/2] Fix resume failure due to PCID Andy Lutomirski
2017-09-07  2:54 ` [PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume Andy Lutomirski
2017-09-07  7:01   ` [PATCH] mm/debug: Change BUG_ON() crashes to survivable WARN_ON() warnings Ingo Molnar
2017-09-07 20:50     ` Linus Torvalds
2017-09-07  7:31   ` [PATCH 1/2] x86/mm: Reinitialize TLB state on hotplug and resume Jiri Kosina
2017-09-07  7:48     ` Ingo Molnar
2017-09-07 19:55       ` Jiri Kosina [this message]
2017-09-08  1:23         ` Andy Lutomirski
2017-09-07  9:54   ` Borislav Petkov
2017-09-07  9:59     ` Ingo Molnar
2017-09-07 10:10       ` Borislav Petkov
2017-09-07  2:54 ` [PATCH 2/2] x86/mm: Document how CR4.PCIDE restore works Andy Lutomirski
2017-09-07  3:25 ` [PATCH 0/2] Fix resume failure due to PCID Linus Torvalds
2017-09-07  4:15   ` Andy Lutomirski
2017-09-15  6:59   ` x60: warnings on boot and resume, arch/x86/mm/tlb.c:257 initialize_ ... was " Pavel Machek
2017-09-15  8:39     ` Ingo Molnar
2017-09-15  9:16       ` Pavel Machek
2017-09-15  9:35         ` Ingo Molnar
2017-09-15 10:22       ` [4.14-rc0 regression] " Pavel Machek
2017-09-15 18:47         ` Linus Torvalds
2017-09-15 19:29           ` Andy Lutomirski
2017-09-15 21:06             ` Andy Lutomirski
2017-09-07  8:59 ` Borislav Petkov
2017-09-15 11:01 ` Pavel Machek
