From: Steven Haigh
To: glenn@rimuhosting.com, Juergen Gross
Cc: Andrew Cooper, Roger Pau Monné, Dietmar Hahn, xen-devel@lists.xen.org
Subject: Re: null domains after xl destroy
Date: Wed, 3 May 2017 20:45:16 +1000
In-Reply-To: <4da36c5e-0712-376c-423e-97988796c393@rimuhosting.com>
References: <78571a7b-62ec-b046-02e3-3d6739b779a6@rimuhosting.com>
 <95efee87-6925-5376-e347-55e438c90212@suse.com>
 <70eae378-2392-bd82-670a-5dafff58c259@rimuhosting.com>
 <3385656.IoOB642KYU@amur>
 <6e150a33-576b-5cf8-7abc-2cba584602ff@citrix.com>
 <05cd7b43-153a-8b51-8fd9-e8ae4a8b5287@rimuhosting.com>
 <06829f8f-def6-4822-c18a-877d8633556c@suse.com>
 <034c9f96-1bfe-6793-68a7-9b070676971a@suse.com>
 <20170419071624.6enfeemielfqhqw2@dhcp-3-128.uk.xensource.com>
 <0b981374-700b-f26a-9504-583bad046f7d@suse.com>
 <4da36c5e-0712-376c-423e-97988796c393@rimuhosting.com>
List-Id: xen-devel@lists.xenproject.org

Just wanted to give this a little nudge now people seem to be back on
deck...

On 01/05/17 10:55, Glenn Enright wrote:
> On 19/04/17 22:09, Juergen Gross wrote:
>> On 19/04/17 09:16, Roger Pau Monné wrote:
>>> On Wed, Apr 19, 2017 at 06:39:41AM +0200, Juergen Gross wrote:
>>>> On 19/04/17 03:02, Glenn Enright wrote:
>>>>> Thanks Juergen. I applied that, to our 4.9.23 dom0 kernel, which still
>>>>> shows the issue. When replicating the leak I now see this trace (via
>>>>> dmesg). Hopefully that is useful.
>>>>>
>>>>> Please note, I'm going to be offline next week, but am keen to keep on
>>>>> with this, it may just be a while before I followup is all.
>>>>>
>>>>> Regards, Glenn
>>>>> http://rimuhosting.com
>>>>>
>>>>>
>>>>> ------------[ cut here ]------------
>>>>> WARNING: CPU: 0 PID: 19 at drivers/block/xen-blkback/xenbus.c:508
>>>>> xen_blkbk_remove+0x138/0x140
>>>>> Modules linked in: xen_pciback xen_netback xen_gntalloc xen_gntdev
>>>>> xen_evtchn xenfs xen_privcmd xt_CT ipt_REJECT nf_reject_ipv4
>>>>> ebtable_filter ebtables xt_hashlimit xt_recent xt_state iptable_security
>>>>> iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
>>>>> nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables bridge stp llc
>>>>> ipv6 crc_ccitt ppdev parport_pc parport serio_raw sg i2c_i801 i2c_smbus
>>>>> i2c_core e1000e ptp pps_core i3000_edac edac_core raid1 sd_mod ahci
>>>>> libahci floppy dm_mirror dm_region_hash dm_log dm_mod
>>>>> CPU: 0 PID: 19 Comm: xenwatch Not tainted 4.9.23-1.el6xen.x86_64 #1
>>>>> Hardware name: Supermicro PDSML/PDSML+, BIOS 6.00 08/27/2007
>>>>>  ffffc90040cfbba8 ffffffff8136b61f 0000000000000013 0000000000000000
>>>>>  0000000000000000 0000000000000000 ffffc90040cfbbf8 ffffffff8108007d
>>>>>  ffffea0001373fe0 000001fc33394434 ffff880000000001 ffff88004d93fac0
>>>>> Call Trace:
>>>>>  [] dump_stack+0x67/0x98
>>>>>  [] __warn+0xfd/0x120
>>>>>  [] warn_slowpath_null+0x1d/0x20
>>>>>  [] xen_blkbk_remove+0x138/0x140
>>>>>  [] xenbus_dev_remove+0x47/0xa0
>>>>>  [] __device_release_driver+0xb4/0x160
>>>>>  [] device_release_driver+0x2d/0x40
>>>>>  [] bus_remove_device+0x124/0x190
>>>>>  [] device_del+0x112/0x210
>>>>>  [] ? xenbus_read+0x53/0x70
>>>>>  [] device_unregister+0x22/0x60
>>>>>  [] frontend_changed+0xad/0x4c0
>>>>>  [] ? schedule_tail+0x1e/0xc0
>>>>>  [] xenbus_otherend_changed+0xc7/0x140
>>>>>  [] ? _raw_spin_unlock_irqrestore+0x16/0x20
>>>>>  [] ? schedule_tail+0x1e/0xc0
>>>>>  [] frontend_changed+0x10/0x20
>>>>>  [] xenwatch_thread+0x9c/0x140
>>>>>  [] ? woken_wake_function+0x20/0x20
>>>>>  [] ? schedule+0x3a/0xa0
>>>>>  [] ? _raw_spin_unlock_irqrestore+0x16/0x20
>>>>>  [] ? complete+0x4d/0x60
>>>>>  [] ? split+0xf0/0xf0
>>>>>  [] kthread+0xcd/0xf0
>>>>>  [] ? schedule_tail+0x1e/0xc0
>>>>>  [] ? __kthread_init_worker+0x40/0x40
>>>>>  [] ? __kthread_init_worker+0x40/0x40
>>>>>  [] ret_from_fork+0x25/0x30
>>>>> ---[ end trace ee097287c9865a62 ]---
>>>>
>>>> Konrad, Roger,
>>>>
>>>> this was triggered by a debug patch in xen_blkbk_remove():
>>>>
>>>>     if (be->blkif)
>>>> -       xen_blkif_disconnect(be->blkif);
>>>> +       WARN_ON(xen_blkif_disconnect(be->blkif));
>>>>
>>>> So I guess we need something like xen_blk_drain_io() in case of calls to
>>>> xen_blkif_disconnect() which are not allowed to fail (either at the call
>>>> sites of xen_blkif_disconnect() or in this function depending on a new
>>>> boolean parameter indicating it should wait for outstanding I/Os).
>>>>
>>>> I can try a patch, but I'd appreciate if you could confirm this wouldn't
>>>> add further problems...
>>>
>>> Hello,
>>>
>>> Thanks for debugging this, the easiest solution seems to be to replace the
>>> ring->inflight atomic_read check in xen_blkif_disconnect with a call to
>>> xen_blk_drain_io instead, and making xen_blkif_disconnect return void (to
>>> prevent further issues like this one).
>>
>> Glenn,
>>
>> can you please try the attached patch (in dom0)?
>>
>>
>> Juergen
>>
>
> (resending with full CC list)
>
> I'm back. After testing unfortunately I'm still seeing the leak. The
> below trace is with the debug patch applied as well under 4.9.25. It
> looks very similar to me. I am still able to replicate this reliably.
>
> Regards, Glenn
> http://rimuhosting.com
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 19 at drivers/block/xen-blkback/xenbus.c:511
> xen_blkbk_remove+0x138/0x140
> Modules linked in: ebt_ip xen_pciback xen_netback xen_gntalloc
> xen_gntdev xen_evtchn xenfs xen_privcmd xt_CT ipt_REJECT nf_reject_ipv4
> ebtable_filter ebtables xt_hashlimit xt_recent xt_state iptable_security
> iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables bridge stp llc
> ipv6 crc_ccitt ppdev parport_pc parport serio_raw i2c_i801 i2c_smbus
> i2c_core sg e1000e ptp pps_core i3000_edac edac_core raid1 sd_mod ahci
> libahci floppy dm_mirror dm_region_hash dm_log dm_mod
> CPU: 0 PID: 19 Comm: xenwatch Not tainted 4.9.25-1.el6xen.x86_64 #1
> Hardware name: Supermicro PDSML/PDSML+, BIOS 6.00 08/27/2007
>  ffffc90040cfbb98 ffffffff8136b76f 0000000000000013 0000000000000000
>  0000000000000000 0000000000000000 ffffc90040cfbbe8 ffffffff8108007d
>  ffffea0000141720 000001ff41334434 ffff880000000001 ffff88004d3aedc0
> Call Trace:
>  [] dump_stack+0x67/0x98
>  [] __warn+0xfd/0x120
>  [] warn_slowpath_null+0x1d/0x20
>  [] xen_blkbk_remove+0x138/0x140
>  [] xenbus_dev_remove+0x47/0xa0
>  [] __device_release_driver+0xb4/0x160
>  [] device_release_driver+0x2d/0x40
>  [] bus_remove_device+0x124/0x190
>  [] device_del+0x112/0x210
>  [] ? xenbus_read+0x53/0x70
>  [] device_unregister+0x22/0x60
>  [] frontend_changed+0xad/0x4c0
>  [] xenbus_otherend_changed+0xc7/0x140
>  [] ? _raw_spin_unlock_irqrestore+0x16/0x20
>  [] frontend_changed+0x10/0x20
>  [] xenwatch_thread+0x9c/0x140
>  [] ? woken_wake_function+0x20/0x20
>  [] ? schedule+0x3a/0xa0
>  [] ? _raw_spin_unlock_irqrestore+0x16/0x20
>  [] ? complete+0x4d/0x60
>  [] ? split+0xf0/0xf0
>  [] kthread+0xe5/0x100
>  [] ? kthread+0xcd/0x100
>  [] ? __kthread_init_worker+0x40/0x40
>  [] ? __kthread_init_worker+0x40/0x40
>  [] ? __kthread_init_worker+0x40/0x40
>  [] ret_from_fork+0x25/0x30
> ---[ end trace ea3a48c80e4ad79d ]---
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897