From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: Borislav Petkov <bp@suse.de>,
Richard Purdie <richard.purdie@linuxfoundation.org>,
Toshi Kani <toshi.kani@hp.com>
Cc: Bruce Ashfield <bruce.ashfield@windriver.com>,
openembedded-core <openembedded-core@lists.openembedded.org>,
"Hart, Darren" <darren.hart@intel.com>,
"saul.wold" <saul.wold@intel.com>, <linux-kernel@vger.kernel.org>
Subject: Re: runtime regression with "x86/mm/pat: Emulate PAT when it is disabled"
Date: Thu, 3 Mar 2016 16:18:03 -0500
Message-ID: <20160303211803.GC25222@windriver.com>
In-Reply-To: <20160303205924.GA25222@windriver.com>
[runtime regression with "x86/mm/pat: Emulate PAT when it is disabled"] On 03/03/2016 (Thu 15:59) Paul Gortmaker wrote:
> So, the yocto folks moved from 4.1 to 4.4 and one of their automated
> qemu x86-32 boot tests started failing. None of the yocto details seem
> to matter since I offered to help and I've reproduced it using 100%
> mainline kernels and a generic distro toolchain as well.
>
> The test case is slightly complicated, in that it relies on uvesafb
> being modular, and so one has to juggle modules within an ext4 image
> that qemu boots from. We tried making uvesafb builtin, but that made
> the issue magically vanish. Given PAT, this isn't too surprising.
>
> Richard did the preliminary investigation and analysis, and from that I
> did a bisect, and found the commit in $SUBJECT to be the root cause, as
> per the discussion here:
>
> http://lists.openembedded.org/pipermail/openembedded-core/2016-March/118397.html
>
> I'd mentioned the above to bpetkov on IRC and after confirming it was
> still an issue on 4.5-rc6, he'd asked if I had a portable reproducer.
>
> Not sure how complicated that would be, I set out to make one from my
> build. With a little LD_PRELOAD type magic and ensuring all the qemu
> components are in ./ I have one that runs on an otherwise qemu-free
> x86-64 box.
>
> The standalone reproducer is here; it is launched via 00-runme:
>
> http://openlinux.wrs.com/pat-splat/reproducer.tar.bz2
Apologies, I'd used an internal DNS abbreviation here that isn't global.
Replace the wrs with windriver and everything should be good.
P.
--
>
> It is nothing fancy, just a generic yocto build of "sato" (gfx enabled
> rootfs). When it "works" it boots to a UI touchscreen interface. When
> it fails, you get a black screen with a blinking cursor (as seen in
> "vncviewer localhost:0").
>
> Upon failure, you can do <Ctrl>-<Alt>-<2> to get to a passwd-less root
> login ; there you can run dmesg and see the splat. The image is
> currently using 4.5-rc6 ; but any kernel can be inserted; "make
> modules_install INSTALL_MOD_PATH=here" and then populating those modules
> from "here" into /lib/modules of the loopback mounted image. And of
> course updating the bzImage on the qemu cmdline. Currently it
> contains a bzImage and modules for 4.5-rc6 as I last tested that.
>
> Also note that vncviewer will disconnect when it goes from early boot
> 80x25 to a higher res gfx mode; just reconnect and continue observing the
> target.
>
> I've ruled out yocto kernel changes, and yocto toolchain -- but maybe it
> is a qemu issue this commit triggers ; who knows at this point.
>
> Since I've NFI what component(s) cause this, I wanted to have the qemu
> binary, all libraries etc as part of the reproducer and nothing left to
> chance, and I've tested the reproducer on an ancient dual core w/o vmx
> and w/o any qemu binaries installed. Bruce also tested it on a slightly
> more modern dual socket xeon with vmx and confirmed it failed there.
>
> Inside there is a 00-runme ; mostly a copy of qemu args the yocto
> automated tests were using. There is also everything the qemu binaries
> need to run ; toplevel dir is noisy since qemu only looks in ./ it
> seems. There is also an ext4.img ; as mentioned earlier, this only
> happens when uvesafb.ko is a module, so one has to loopback mount that
> image and repopulate /lib/modules/ for each boot test/bisect step.
>
> I've also included 00-bisect.txt as the output of git bisect log. And
> there is also 00-configs/ dir that has the ".config" kernel file for
> each build (dir names are "git describe" in here for easy correlation)
> done for the bisect (plus the latest mainline build). The failing commit
> in the subject is v4.1-rc5-22-g9cd25aac1f44 .
>
> My contribution here is largely a bisect that can be relied on and
> providing a portable reproducer of the regression; I am by no means a
> PAT expert ; Richard invested more time into actually understanding the
> problem than I did, so I'm going to totally throw him under the bus on
> this when it comes to considering the ultimate root cause and possible
> fixes. :)
>
> Paul.
> --
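
For reference, the module refresh described in the quoted mail (stage the
modules with modules_install, then copy them into /lib/modules of the
loopback-mounted image) might look roughly like the sketch below. It is not
taken from the reproducer tarball: the function names, the argument order and
the mount point are assumptions made for illustration, and only the "here"
staging directory and the INSTALL_MOD_PATH usage come from the mail itself.

```shell
#!/bin/sh
# Sketch only: refresh the kernel modules inside a rootfs image between
# boot tests. Function names and arguments are hypothetical.

# Copy a staged modules tree into a mounted root filesystem.
# $1 = directory that was passed as INSTALL_MOD_PATH
# $2 = mount point of the loopback-mounted image
populate_modules() {
    staging=$1
    rootfs=$2
    if [ ! -d "$staging/lib/modules" ]; then
        echo "no lib/modules under $staging" >&2
        return 1
    fi
    mkdir -p "$rootfs/lib/modules"
    # The trailing "/." copies the directory contents, not the directory.
    cp -a "$staging/lib/modules/." "$rootfs/lib/modules/"
}

# One full cycle for a boot test or bisect step. Needs root for the mount.
# $1 = kernel build tree, $2 = ext4 image, $3 = mount point
refresh_image() {
    ksrc=$1
    img=$2
    mnt=$3
    staging=$(pwd)/here
    # Repeat any ARCH= or CROSS_COMPILE= settings used for the build here.
    make -C "$ksrc" modules_install INSTALL_MOD_PATH="$staging" || return 1
    mkdir -p "$mnt"
    mount -o loop "$img" "$mnt" || return 1
    populate_modules "$staging" "$mnt"
    status=$?
    umount "$mnt"
    return $status
}
```

With the functions sourced into a root shell, one step would be something like
"refresh_image /path/to/linux ext4.img ./mnt" (paths are placeholders),
followed by pointing the qemu command line at the matching bzImage as noted in
the mail.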