From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754059AbbIAGU0 (ORCPT );
	Tue, 1 Sep 2015 02:20:26 -0400
Received: from mx1.redhat.com ([209.132.183.28]:34225 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753692AbbIAGUZ (ORCPT );
	Tue, 1 Sep 2015 02:20:25 -0400
Date: Tue, 1 Sep 2015 07:20:23 +0100
From: "Richard W.M. Jones"
To: Chuck Ebbert
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Thomas Gleixner,
	Ingo Molnar, "H. Peter Anvin"
Subject: Re: [BUG 4.2-rc8] Interrupt occurs while apply_alternatives() is patching the handler
Message-ID: <20150901062022.GA19002@redhat.com>
References: <20150830223757.6e4c5c02@as>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150830223757.6e4c5c02@as>
User-Agent: Mutt/1.5.20 (2009-12-10)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Aug 30, 2015 at 10:37:57PM -0400, Chuck Ebbert wrote:
> This is from https://bugzilla.redhat.com/show_bug.cgi?id=1258223
>
> [    0.036000] BUG: unable to handle kernel paging request at 55501e06
[...]
> [    0.036000]  [] ? add_nops+0x90/0xa0
> [    0.036000]  [] apply_alternatives+0x274/0x630
> [    0.036000]  [] ? wait_for_xmitr+0xa0/0xa0
> [    0.036000]  [] ? sprintf+0x1c/0x20
> [    0.036000]  [] ? irq_entries_start+0x698/0x698
> [    0.036000]  [] ? memcpy+0xb/0x30
> [    0.036000]  [] ? serial8250_set_termios+0x20/0x20
[...]
> Interrupt 0x30 occurred while the alternatives code was replacing the
> initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the optimized
> version, 0x8d,0x76,0x00. Only the first byte has been replaced so far,
> and it makes a mess out of the insn decoding.

Chuck, thanks for reporting this.

I have only been able to reproduce this so far using qemu and TCG (not
KVM) which of course raises a range of questions: could it be a qemu
bug or a TCG bug?
Could it be that an atomic op is not correctly implemented by qemu?
I will keep trying on KVM.

Because I don't have a convenient server with a 32 bit kernel and a
serial port that I can reboot thousands of times, I have not tried to
reproduce this on bare metal yet.

Here's how to reproduce it.  (The host can be x86-64.)

(1) Grab the 32 bit Fedora kernel we are using from
https://kojipkgs.fedoraproject.org//packages/kernel/4.2.0/1.fc24/i686/kernel-core-4.2.0-1.fc24.i686.rpm
(from http://koji.fedoraproject.org/koji/buildinfo?buildID=681723)

(2) Unpack it to extract vmlinuz:

  cd /tmp
  rpm2cpio /mnt/scratch/kernel-core-4.2.0-1.fc24.i686.rpm | cpio -id
  cp ./lib/modules/4.2.0-1.fc24.i686/vmlinuz .

(3) Boot the kernel under qemu/KVM.  The following single line command
repeatedly boots the kernel until the bug is hit:

  while qemu-system-x86_64 -nographic -no-reboot -M accel=kvm:tcg -kernel vmlinuz -append 'console=ttyS0 panic=1' -serial stdio -monitor none >& log; ! grep add_nops log; do echo -n .; done

It takes many iterations (hundreds with TCG) to hit the bug.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v