From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jorgen S. Hansen"
Subject: Re: [PATCH net] VSOCK: check sk state before receive
Date: Fri, 21 Sep 2018 07:48:25 +0000
Message-ID:
References: <1527382936-4850-1-git-send-email-liuhangbin@gmail.com> <20180527152945.GQ8958@leo.usersys.redhat.com> <20180530091727.GF14623@stefanha-x1.localdomain> ,<20180613014402.GU8958@leo.usersys.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
Cc: Stefan Hajnoczi, "netdev@vger.kernel.org", "David S. Miller"
To: Hangbin Liu
Return-path:
Received: from mail-by2nam01on0046.outbound.protection.outlook.com ([104.47.34.46]:54504 "EHLO NAM01-BY2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2388875AbeIUNgH (ORCPT ); Fri, 21 Sep 2018 09:36:07 -0400
In-Reply-To: <20180613014402.GU8958@leo.usersys.redhat.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

Hi Hangbin,

I finally got to the bottom of this - the issue was indeed in the VMCI driver. The patch is posted here:

https://lkml.org/lkml/2018/9/21/326

I used your reproduce.log to test the fix. Thanks for discovering this issue.

Thanks,
Jørgen

________________________________________
From: Hangbin Liu
Sent: Wednesday, June 13, 2018 3:44 AM
To: Jorgen S. Hansen
Cc: Stefan Hajnoczi; netdev@vger.kernel.org; David S. Miller
Subject: Re: [PATCH net] VSOCK: check sk state before receive

On Mon, Jun 04, 2018 at 04:02:39PM +0000, Jorgen S. Hansen wrote:
>
> > On May 30, 2018, at 11:17 AM, Stefan Hajnoczi wrote:
> >
> > On Sun, May 27, 2018 at 11:29:45PM +0800, Hangbin Liu wrote:
> >> Hmm... Although I can't reproduce this bug with my reproducer after
> >> applying my patch, I could still get a similar issue with syzkaller sock vnet test.
> >>
> >> It looks like this patch is not complete. Here is the KASAN call trace with my patch.
> >> I can also reproduce it without my patch.
> >
> > Seems like a race between vmci_datagram_destroy_handle() and the
> > delayed callback, vmci_transport_recv_dgram_cb().
> >
> > I don't know the VMCI transport well so I'll leave this to Jorgen.
>
> Yes, it looks like we are calling the delayed callback after we return from vmci_datagram_destroy_handle(). I'll take a closer look at the VMCI side here - the refcounting of VMCI datagram endpoints should guard against this, since the delayed callback does a get on the datagram resource, so this could be a VMCI driver issue, and not a problem in the VMCI transport for AF_VSOCK.

Hi Jorgen,

Thanks for helping look at this. I'm happy to run tests for your patch.

Thanks
Hangbin
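
For readers following the thread, the refcount guard Jorgen mentions (the delayed callback "does a get on the datagram resource") can be pictured with a small userspace sketch. This is not the VMCI driver code: struct endpoint and every function name below are hypothetical, and the "free" is only a flag so the ordering can be checked. It shows the intended invariant only, namely that an endpoint with a queued callback must outlive the destroy call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a datagram endpoint; not the real VMCI struct. */
struct endpoint {
    atomic_int refcount; /* starts at 1: the owner's reference */
    bool freed;          /* set where a real implementation would free memory */
    int delivered;       /* datagrams handed to the delayed callback */
};

static void endpoint_init(struct endpoint *ep)
{
    atomic_init(&ep->refcount, 1);
    ep->freed = false;
    ep->delivered = 0;
}

static void endpoint_get(struct endpoint *ep)
{
    atomic_fetch_add(&ep->refcount, 1);
}

/* Drop one reference; returns true if it was the last one. */
static bool endpoint_put(struct endpoint *ep)
{
    if (atomic_fetch_sub(&ep->refcount, 1) == 1) {
        ep->freed = true;
        return true;
    }
    return false;
}

/* Dispatch path: take a reference before the callback is deferred. */
static void queue_delayed_callback(struct endpoint *ep)
{
    endpoint_get(ep);
}

/* Delayed callback: use the endpoint, then drop the dispatch reference. */
static bool delayed_callback(struct endpoint *ep)
{
    assert(!ep->freed); /* would fire on a use-after-free ordering */
    ep->delivered++;
    return endpoint_put(ep);
}

/* Destroy path (hypothetical): drops only the owner's reference. */
static bool destroy_handle(struct endpoint *ep)
{
    return endpoint_put(ep);
}

/* Returns 0 if the endpoint survived destroy and was released by the callback. */
int run_demo(void)
{
    struct endpoint ep;
    bool freed_by_destroy, freed_by_callback;

    endpoint_init(&ep);
    queue_delayed_callback(&ep);            /* refcount: 2 */
    freed_by_destroy = destroy_handle(&ep); /* refcount: 1, still alive */
    freed_by_callback = delayed_callback(&ep); /* refcount: 0, released */

    return (!freed_by_destroy && freed_by_callback && ep.delivered == 1) ? 0 : 1;
}
```

In this sketch the callback runs after the destroy call returns yet still sees a live endpoint, because the reference taken at dispatch keeps it alive. The KASAN report in the thread suggests that this invariant was being violated somewhere on the driver side.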