From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753481AbdHUB1A (ORCPT <rfc822;w@1wt.eu>);
        Sun, 20 Aug 2017 21:27:00 -0400
Received: from mail-oi0-f47.google.com ([209.85.218.47]:36743 "EHLO
        mail-oi0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753439AbdHUB06 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Sun, 20 Aug 2017 21:26:58 -0400
MIME-Version: 1.0
In-Reply-To: <20170820231302.s732zclznrqxwr46@angband.pl>
References: <20170820231302.s732zclznrqxwr46@angband.pl>
From: Wanpeng Li <kernellwp@gmail.com>
Date: Mon, 21 Aug 2017 09:26:57 +0800
Message-ID: <CANRm+CyjzpMFUJtfcK_WSCOGz4DcgjNTkA6ih2Hvm9bOBitdjA@mail.gmail.com>
Subject: Re: kvm splat in mmu_spte_clear_track_bits
To: Adam Borowski <kilobyte@angband.pl>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
        =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com>,
        kvm <kvm@vger.kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by nfs id v7L1RE0t018742

2017-08-21 7:13 GMT+08:00 Adam Borowski <kilobyte@angband.pl>:
> Hi!
> I'm afraid I keep getting a splat when running KVM.  It reproduces
> reliably, but at random times:

I reported something similar before: https://lkml.org/lkml/2017/6/29/64

Regards,
Wanpeng Li

>
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 5826 at arch/x86/kvm/mmu.c:717 mmu_spte_clear_track_bits+0x123/0x170
> Modules linked in: tun nbd arc4 rtl8xxxu mac80211 cfg80211 rfkill nouveau video ttm
> CPU: 5 PID: 5826 Comm: qemu-system-x86 Not tainted 4.13.0-rc5-vanilla-ubsan-00211-g7f680d7ec315 #1
> Hardware name: System manufacturer System Product Name/M4A77T, BIOS 2401    05/18/2011
> task: ffff880207ef0400 task.stack: ffffc900035e4000
> RIP: 0010:mmu_spte_clear_track_bits+0x123/0x170
> RSP: 0018:ffffc900035e7ab0 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 000000010501cc67 RCX: 0000000000000001
> RDX: dead0000000000ff RSI: ffff88020e501df8 RDI: 0000000004140700
> RBP: ffffc900035e7ad8 R08: 0000000000000100 R09: 0000000000000003
> R10: 0000000000000003 R11: 0000000000000005 R12: 000000000010501c
> R13: ffffea0004140700 R14: ffff88020e1d0000 R15: 0000000000000000
> FS:  00007f0213fbd700(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000022187f000 CR4: 00000000000006e0
> Call Trace:
>  drop_spte+0x26/0x130
>  mmu_page_zap_pte+0xc4/0x160
>  kvm_mmu_prepare_zap_page+0x65/0x660
>  kvm_mmu_invalidate_zap_all_pages+0xc5/0x1f0
>  kvm_mmu_invalidate_zap_pages_in_memslot+0x9/0x10
>  kvm_page_track_flush_slot+0x86/0xd0
>  kvm_arch_flush_shadow_memslot+0x9/0x10
>  __kvm_set_memory_region+0x8fb/0x14f0
>  kvm_set_memory_region+0x2f/0x50
>  kvm_vm_ioctl+0x559/0xcc0
>  ? kvm_vcpu_ioctl+0x171/0x620
>  ? __switch_to+0x30b/0x740
>  do_vfs_ioctl+0xbb/0x8d0
>  ? find_vma+0x23/0x100
>  ? __fget_light+0x94/0x110
>  SyS_ioctl+0x86/0xa0
>  entry_SYSCALL_64_fastpath+0x17/0x98
> RIP: 0033:0x7f021c80ddc7
> RSP: 002b:00007f0213fbc518 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f021c80ddc7
> RDX: 00007f0213fbc5b0 RSI: 000000004020ae46 RDI: 000000000000000a
> RBP: 0000000000000000 R08: 00007f020c1698a0 R09: 0000000000000000
> R10: 00007f020c1698a0 R11: 0000000000000246 R12: 0000000000000006
> R13: 00007f022201c000 R14: 0000000000000002 R15: 0000558c3899e550
> Code: ae fc 01 48 85 c0 75 1c 4c 89 e7 e8 98 de fd ff 48 8b 05 81 ae fc 01 48 85 c0 74 ba 48 85 c3 0f 95 c3 eb b8 48 85 c3 74 e7 eb dd <0f> ff eb 97 4c 89 e7 66 0f 1f 44 00 00 e8 6b de fd ff eb 97 31
> ---[ end trace 16c196134f0dd0a9 ]---
>
> After this, there are hundreds of repeats and lots of secondary damage which
> kills the host quickly.
>
> Usually this happens within a few minutes, but sometimes it takes ~half an
> hour to reproduce.  Because of this, it'd be unpleasant to bisect -- is this
> problem already known?
>
>
> Meow!
> --
> ⢀⣴⠾⠻⢶⣦⠀
> ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
> ⢿⡄⠘⠷⠚⠋⠀                                        -- Genghis Ht'rok'din
> ⠈⠳⣄⠀⠀⠀⠀