From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56294)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bpWna-0007so-R6 for qemu-devel@nongnu.org;
 Thu, 29 Sep 2016 04:34:36 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1bpWnW-0007kv-Hn for qemu-devel@nongnu.org;
 Thu, 29 Sep 2016 04:34:33 -0400
MIME-Version: 1.0
In-Reply-To: <20160929074715.GE30519@umbus.fritz.box>
References: <20160929074715.GE30519@umbus.fritz.box>
From: Bharata B Rao
Date: Thu, 29 Sep 2016 14:04:28 +0530
Message-ID:
Content-Type: text/plain; charset=UTF-8
Subject: Re: [Qemu-devel] ppc64 TCG emulation broken
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: David Gibson
Cc: "qemu-devel@nongnu.org" , "qemu-ppc@nongnu.org" , "Nikunj A. Dadhania" , raji@linux.vnet.ibm.com

On Thu, Sep 29, 2016 at 1:17 PM, David Gibson wrote:
> On Thu, Sep 29, 2016 at 12:41:04PM +0530, Bharata B Rao wrote:
>> Hi,
>>
>> I am observing a kernel crash with ppc64 TCG guest on x86 and git
>> bisect points to this commit:
>>
>> e7b1e06fbcb81ac66e2586214a6c42fdf15fadf3
>> [target-ppc: add vector insert instructions]
>>
>> I hit the following guest kernel panic during boot:
>>
>> Starting Switch Root...
>> [ 76.632260] systemd-journald[113]: Received SIGTERM from PID 1 (systemd).
>> [ 77.082688] systemd-cgroups[1143]: unhandled signal 4 at
>> 00003fff85d3d718 nip 00003fff85d3d718 lr 00003fff85c8c274 code 30001
>> [ 77.479368] systemd-coredum[1144]: unhandled signal 4 at
>> 00003fff948bd718 nip 00003fff948bd718 lr 00003fff9480c274 code 30001
>> [ 77.479860] audit_printk_skb: 39 callbacks suppressed
>> [ 77.479988] audit: type=1701 audit(1475132719.390:35):
>> auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=1144
>> comm="systemd-coredum" exe="/usr/lib/systemd/systemd-coredump" sig=4
>> [ 77.485034] Process 1144(systemd-coredum) has RLIMIT_CORE set to 1
>> [ 77.485156] Aborting core
>> [ 77.858307] systemd[1]: unhandled signal 4 at 00003fff9a48d718 nip
>> 00003fff9a48d718 lr 00003fff9a3dc274 code 30001
>> [ 77.858868] audit: type=1701 audit(1475132719.770:36):
>> auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=1
>> comm="systemd" exe="/usr/lib/systemd/systemd" sig=4
>> [ 78.004191] systemd-coredum[1145]: unhandled signal 4 at
>> 00003fffa475d718 nip 00003fffa475d718 lr 00003fffa46ac274 code 30001
>> [ 78.004648] audit: type=1701 audit(1475132719.910:37):
>> auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=1145
>> comm="systemd-coredum" exe="/usr/lib/systemd/systemd-coredump" sig=4
>> [ 78.004971] Process 1145(systemd-coredum) has RLIMIT_CORE set to 1
>> [ 78.005066] Aborting core
>> [ 78.015142] Kernel panic - not syncing: Attempted to kill init!
>> exitcode=0x00000084
>> [ 78.015142]
>> [ 78.016926] CPU: 0 PID: 1 Comm: systemd Not tainted 4.6.4-301.fc24.ppc64 #1
>> [ 78.017726] Call Trace:
>> [ 78.019154] [c00000007e6638d0] [c0000000009df0f0]
>> .dump_stack+0xa8/0xe8 (unreliable)
>> [ 78.022485] [c00000007e663960] [c0000000009dd6bc] .panic+0x12c/0x2fc
>> [ 78.022631] [c00000007e663a00] [c0000000000cd6a8] .do_exit+0xca8/0xcb0
>> [ 78.022735] [c00000007e663ae0] [c0000000000cd77c] .do_group_exit+0x5c/0xf0
>> [ 78.022815] [c00000007e663b70] [c0000000000dd4ec] .get_signal+0x3bc/0x770
>> [ 78.022918] [c00000007e663c70] [c00000000001761c] .do_signal+0x4c/0x2a0
>> [ 78.023021] [c00000007e663db0] [c000000000017a4c] .do_notify_resume+0xac/0xc0
>> [ 78.023411] [c00000007e663e30] [c000000000009944]
>> .ret_from_except_lite+0x70/0x74
>> [ 78.077889] ---[ end Kernel panic - not syncing: Attempted to kill
>> init! exitcode=0x00000084
>
> Huh. Well, that's unfortunate. I don't imagine the guest is trying
> to use any of those new POWER9 instructions, so I guess we must have
> broken decode of one of the existing vector instructions with which it
> shares some part of the opcode.
>
> Any chance you could trace this and work out what instruction is
> giving the first illegal instruction exception?

The following patch fixes the immediate problem for me; Nikunj will
send a more complete fix.

diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 59ae68a..3813a26 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -672,11 +672,11 @@ GEN_VXFORM_UIMM_ENV(vcfux, 5, 12);
 GEN_VXFORM_UIMM_ENV(vcfsx, 5, 13);
 GEN_VXFORM_UIMM_ENV(vctuxs, 5, 14);
 GEN_VXFORM_UIMM_ENV(vctsxs, 5, 15);
-GEN_VXFORM_DUAL(vspltisb, PPC_NONE, PPC2_ALTIVEC_207,
+GEN_VXFORM_DUAL(vspltisb, PPC_ALTIVEC, PPC_NONE,
                   vinsertb, PPC_NONE, PPC2_ISA300);
-GEN_VXFORM_DUAL(vspltish, PPC_NONE, PPC2_ALTIVEC_207,
+GEN_VXFORM_DUAL(vspltish, PPC_ALTIVEC, PPC_NONE,
                   vinserth, PPC_NONE, PPC2_ISA300);
-GEN_VXFORM_DUAL(vspltisw, PPC_NONE, PPC2_ALTIVEC_207,
+GEN_VXFORM_DUAL(vspltisw, PPC_ALTIVEC, PPC_NONE,
                   vinsertw, PPC_NONE, PPC2_ISA300);
 
 static void gen_vsldoi(DisasContext *ctx)
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index e6abeae..0e9d078 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -198,11 +198,11 @@ GEN_VXRFORM_DUAL(vcmpbfp, vcmpgtsd, 3, 15, PPC_ALTIVEC, PPC_NONE)
 GEN_OPCODE_DUAL(name0##_##name1, 0x04, opc2, opc3, inval0, inval1, type, \
                                                                PPC_NONE)
 GEN_VXFORM_DUAL_INV(vspltisb, vinsertb, 6, 12, 0x00000000, 0x100000,
-                                               PPC2_ALTIVEC_207),
+                                               PPC_ALTIVEC),
 GEN_VXFORM_DUAL_INV(vspltish, vinserth, 6, 13, 0x00000000, 0x100000,
-                                               PPC2_ALTIVEC_207),
+                                               PPC_ALTIVEC),
 GEN_VXFORM_DUAL_INV(vspltisw, vinsertw, 6, 14, 0x00000000, 0x100000,
-                                               PPC2_ALTIVEC_207),
+                                               PPC_ALTIVEC),
 GEN_VXFORM_300_EXT(vinsertd, 6, 15, 0x100000),
 
 #define GEN_VXFORM_NOA(name, opc2, opc3)                                \
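
For anyone following the flag change in the patch above, here is a minimal
standalone sketch of how I understand a dual-instruction entry to be gated.
It is not QEMU code. The Sk* types, the sk_* helpers and the flag bit values
are all hypothetical, made up for illustration. The sketch assumes the two
instructions of a pair differ only in the lowest opcode bit (vspltisb has
extended opcode 780, vinsertb 781) and that each one is accepted only if the
CPU model advertises one of the feature flags passed for it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch only; QEMU's real values differ. */
#define SK_NONE          0x0ULL
#define SK_ALTIVEC       0x1ULL  /* stands in for PPC_ALTIVEC (first flag word) */
#define SK2_ALTIVEC_207  0x1ULL  /* stands in for PPC2_ALTIVEC_207 (second word) */
#define SK2_ISA300       0x2ULL  /* stands in for PPC2_ISA300 (second word) */

/* Hypothetical CPU model: two words of instruction-set feature flags. */
typedef struct {
    uint64_t insns_flags;
    uint64_t insns_flags2;
} SketchCpu;

typedef enum { SK_FIRST, SK_SECOND, SK_INVALID } SketchDecode;

/* An instruction is enabled if the CPU has any of its required flags. */
static int sk_enabled(const SketchCpu *cpu, uint64_t flg, uint64_t flg2)
{
    return (cpu->insns_flags & flg) != 0 || (cpu->insns_flags2 & flg2) != 0;
}

/*
 * Pick between the two instructions sharing one table slot. The lowest
 * opcode bit selects the candidate; if that candidate is not enabled on
 * this CPU, the result is an invalid-instruction decode.
 */
static SketchDecode sk_decode_dual(const SketchCpu *cpu, uint32_t opcode,
                                   uint64_t flg0, uint64_t flg2_0,
                                   uint64_t flg1, uint64_t flg2_1)
{
    uint32_t low_bit = opcode & 1u;

    if (low_bit == 0 && sk_enabled(cpu, flg0, flg2_0)) {
        return SK_FIRST;
    }
    if (low_bit == 1 && sk_enabled(cpu, flg1, flg2_1)) {
        return SK_SECOND;
    }
    return SK_INVALID;
}
```

Assuming a CPU model with base AltiVec but without the 2.07 flag (the
report does not say which CPU model was in use), calling sk_decode_dual()
on 0x1000030C (vspltisb) with the pre-patch flags (SK_NONE, SK2_ALTIVEC_207)
gives SK_INVALID, while the post-patch flags (SK_ALTIVEC, SK_NONE) give
SK_FIRST. That would match the SIGILL seen in the guest, but it is only a
model of the gating, not a trace of the actual failure.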