From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steven Haigh
Subject: Re: 4.4: INFO: rcu_sched self-detected stall on CPU
Date: Wed, 30 Mar 2016 05:04:28 +1100
Message-ID: <56FAC3AC.9050802__8265.11544078148$1459274749$gmane$org@crc.id.au>
References: <56F4A816.3050505@crc.id.au> <56F52DBF.5080006@oracle.com>
 <56F545B1.8080609@crc.id.au> <56F54EE0.6030004@oracle.com>
 <56F56172.9020805@crc.id.au> <56F5653B.1090700@oracle.com>
 <56F5A87A.8000903@crc.id.au> <56FA4336.2030301@crc.id.au>
 <56FA8DDD.7070406@oracle.com> <56FABF17.7090608@crc.id.au>
In-Reply-To: <56FABF17.7090608@crc.id.au>
To: Boris Ostrovsky , xen-devel , linux-kernel@vger.kernel.org
Cc: "gregkh@linuxfoundation.org"
List-Id: xen-devel@lists.xenproject.org

Greg, please see below - this is probably more for you...

On 03/29/2016 04:56 AM, Steven Haigh wrote:
>
> Interestingly enough, this just happened again - but on a different
> virtual machine. I'm starting to wonder if this may have something to do
> with the uptime of the machine - as the system that this seems to happen
> to is always different.
>
> Destroying it and monitoring it again has so far come up blank.
>
> I've thrown the latest lot of kernel messages here:
> http://paste.fedoraproject.org/346802/59241532

So I just did a bit of digging via the almighty Google.

I started hunting for these lines, as they happen just before the stall:
BUG: Bad rss-counter state mm:ffff88007b7db480 idx:2 val:-1
BUG: Bad rss-counter state mm:ffff880079c638c0 idx:0 val:-1
BUG: Bad rss-counter state mm:ffff880079c638c0 idx:2 val:-1

I stumbled across this post on the lkml:
http://marc.info/?l=linux-kernel&m=145141546409607

The patch attached seems to reference the following change in
unmap_mapping_range in mm/memory.c:
> -       struct zap_details details;
> +       struct zap_details details = { };

When I browse the GIT tree for 4.4.6:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/mm/memory.c?id=refs/tags/v4.4.6

I see at line 2411:
struct zap_details details;

Is this something that has been missed being merged into the 4.4 tree?
I'll admit my kernel knowledge is not enough to understand what the code
actually does - but the similarities here seem uncanny.

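For illustration only, the effect of that one-line change can be sketched
in plain userspace C. This is not kernel code: demo_details and its fields
are made up for the example and merely stand in for struct zap_details,
and the stale memory is simulated with memset because reading a truly
uninitialized variable is undefined behaviour. The sketch uses the
portable "{ 0 }" spelling; the empty "{ }" in the patch is a GNU C
extension (standard from C23) and zeroes the members the same way.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for illustration; NOT the kernel's struct zap_details. */
struct demo_details {
    unsigned long first_index;
    unsigned long last_index;
    int optional_flag;   /* made-up field that the caller never sets explicitly */
};

/*
 * Mimics "struct demo_details details;": any field the caller does not
 * assign keeps whatever bytes were in that memory before. The stale
 * contents are simulated with memset instead of a real uninitialized read.
 */
static struct demo_details make_uninitialized_like(void)
{
    struct demo_details details;

    memset(&details, 0xAA, sizeof(details)); /* pretend stale stack bytes */
    details.first_index = 0;
    details.last_index = 10;
    return details;
}

/*
 * Mimics "struct demo_details details = { };": every member that is not
 * named in the initializer starts out as zero.
 */
static struct demo_details make_zero_initialized(void)
{
    struct demo_details details = { 0 };

    details.first_index = 0;
    details.last_index = 10;
    assert(details.optional_flag == 0); /* guaranteed by the initializer */
    return details;
}
```

With the first helper, optional_flag comes back as leftover garbage; with
the second it is always 0. Whether an unset field like this is what
actually goes wrong in 4.4 is not something the sketch can confirm.
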
-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897