From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751337AbdH1NPk (ORCPT ); Mon, 28 Aug 2017 09:15:40 -0400
Received: from mx2.suse.de ([195.135.220.15]:53693 "EHLO mx1.suse.de"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1751198AbdH1NPi (ORCPT ); Mon, 28 Aug 2017 09:15:38 -0400
Date: Mon, 28 Aug 2017 15:15:36 +0200
Message-ID:
From: Takashi Iwai
To: Adam Borowski
Cc: Paolo Bonzini, kvm@vger.kernel.org, x86@kernel.org,
        linux-kernel@vger.kernel.org
Subject: Re: 4.13-rc7: WARNING at arch/x86/kvm/mmu.c:717 (and a crash thereafter)
In-Reply-To: <20170828130600.rgm5vilrhwkjmjxq@angband.pl>
References: <20170828130600.rgm5vilrhwkjmjxq@angband.pl>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka)
        FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8
        Emacs/25.2 (x86_64-suse-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka")
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 28 Aug 2017 15:06:00 +0200,
Adam Borowski wrote:
>
> On Mon, Aug 28, 2017 at 02:26:06PM +0200, Takashi Iwai wrote:
> > I seem to get a kernel warning when running KVM on Dell desktop with
> > IvyBridge like below. As you can see, a bad page BUG is triggered
> > after that, too. The problem is not triggered always, but it happens
> > occasionally.
>
> See the thread starting with 20170820231302.s732zclznrqxwr46@angband.pl
>
> > I haven't seen this on 4.13-rc4 at all, and IIRC, it started happening
> > since rc5. So this might be a regression at rc5. But, as it doesn't
> > happen always, I can't be 100% sure about it, and it's quite difficult
> > to bisect (the test case isn't reliable), unfortunately.
>
> Same here -- it sometimes takes a few hours of trying to reproduce, which
> makes proving the negative greatly unpleasant.
>
> And all I've been able to tell so far is that the problem is between
> 4.13-rc4 and 4.13-rc5, just like you say.

Good to hear that we can chorus!

So if it's really a regression between rc4 and rc5, I see no obvious
changes in arch/x86, i.e. it's likely somewhere else.

(snip)
> The first WARN is always the above. But the rest seems to be totally random
> -- a nasty case of fandango on core whose results range from harmless
> through crash to massive data loss (just guess what would happen if some
> idiot picked balancing the disk as a test load -- no one would be that
> stupid, right? At least an incomplete idiot has checksums and backups).

Yeah, the crash after the WARNING seems quite random.


thanks,

Takashi