From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752045AbdDKETz (ORCPT ); Tue, 11 Apr 2017 00:19:55 -0400
Received: from mout.gmx.net ([212.227.17.22]:52080 "EHLO mout.gmx.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750721AbdDKETy (ORCPT ); Tue, 11 Apr 2017 00:19:54 -0400
Message-ID: <1491884382.5645.49.camel@gmx.de>
Subject: Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")
From: Mike Galbraith
To: "Michael S. Tsirkin"
Cc: Christoph Hellwig, Thorsten Leemhuis, virtio-dev@lists.oasis-open.org,
	Linux Kernel Mailing List, rjones@redhat.com
Date: Tue, 11 Apr 2017 06:19:42 +0200
In-Reply-To: <20170411002012-mutt-send-email-mst@kernel.org>
References: <1491544999.5501.7.camel@gmx.de>
	<20170407090354-mutt-send-email-mst@kernel.org>
	<1491547457.5501.10.camel@gmx.de>
	<1491548751.5501.12.camel@gmx.de>
	<1491549722.5501.13.camel@gmx.de>
	<20170407161641-mutt-send-email-mst@kernel.org>
	<20170407163437-mutt-send-email-mst@kernel.org>
	<1491575393.3341.0.camel@gmx.de>
	<20170407215510-mutt-send-email-mst@kernel.org>
	<1491627694.4479.21.camel@gmx.de>
	<20170411002012-mutt-send-email-mst@kernel.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.16.5
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2017-04-11 at 00:23 +0300, Michael S. Tsirkin wrote:
> On Sat, Apr 08, 2017 at 07:01:34AM +0200, Mike Galbraith wrote:
> > On Fri, 2017-04-07 at 21:56 +0300, Michael S. Tsirkin wrote:
> >
> > > OK. test3 and test4 are now pushed: test3 should fix your hang,
> > > test4 is trying to fix a crash reported independently.
> >
> > test3 does not fix the post hibernate hang business that I can easily
> > reproduce, those are NFS, and at least as old as 4.4.  Host/guest,
> > dunno, put 4.4 on both, guest hangs intermittently.
>
> OK so IIUC you agree it's a good idea to send test4 to Linus, right?

Well, my box agrees that that is a viable option.

> Hibernation's still broken but that's not a regression.

Yup.

> >  [] __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> >  [] rpc_wait_bit_killable+0x1e/0xb0 [sunrpc]
> >  [] __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> >  [] autoremove_wake_function+0x50/0x50
> >  [] call_decode+0x850/0x850 [sunrpc]
> >  [] call_decode+0x850/0x850 [sunrpc]
> >  [] __rpc_execute+0x14e/0x440 [sunrpc]
> >  [] ktime_get+0x35/0xa0
> >  [] rpc_run_task+0x120/0x170 [sunrpc]
> >  [] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
> >  [] _nfs4_proc_getattr+0xb0/0xc0 [nfsv4]
> >  [] path_lookupat+0xd2/0x100
> >  [] nfs4_proc_getattr+0x5c/0xe0 [nfsv4]
> >  [] __nfs_revalidate_inode+0xa0/0x300 [nfs]
> >  [] nfs_getattr+0x95/0x250 [nfs]
> >  [] vfs_statx+0x7b/0xc0
> >  [] SYSC_newstat+0x20/0x40
> >  [] entry_SYSCALL_64_fastpath+0x1a/0xa9
> >  [] 0xffffffffffffffff
> >
> > I noted no _other_ misbehavior in either kernel, w/wo threadirqs.
> >
> >
> > -Mike
>
> Interesting. I would guess virtio net does not complete some
> packets. So you were unable to find an old guest where this
> works fine?

I just tried my opensuse 13.2 clone.  It works markedly less fine, turns
into a brick either on the way down or back up in short order.

	-Mike