Subject: Re: BUG: KASAN: use-after-free Read in tun_net_xmit
From: YueHaibing
To: Jason Wang, Cong Wang
CC: netdev
Date: Fri, 26 Apr 2019 00:04:02 +0800
References: <97932452-086c-0086-8d17-640c804cc6c8@redhat.com> <84f86b46-2de2-d9e2-cb04-d5c50a72449f@redhat.com> <89745da0-c6c4-d100-eb47-61abfde753a1@huawei.com>
In-Reply-To: <89745da0-c6c4-d100-eb47-61abfde753a1@huawei.com>
X-Mailing-List: netdev@vger.kernel.org

On 2019/4/24 20:25, YueHaibing wrote:
> On 2019/4/24 17:11, Jason Wang wrote:
>>
>> On 2019/4/24 12:41 AM, Cong Wang wrote:
>>> On Mon, Apr 22, 2019 at 11:42 PM Jason Wang wrote:
>>>>
>>>> On 2019/4/23 2:00 PM, Cong Wang wrote:
>>>>> On Mon, Apr 22, 2019 at 2:41 AM Jason Wang wrote:
>>>>>> On 2019/4/22 11:57 AM, YueHaibing wrote:
>>>>>>> We get a KASAN report as below, but don't have any reproducer.
>>>>>>>
>>>>>>> Any comments are appreciated.
>>>>>>>
>>>>>>> ==================================================================
>>>>>>> BUG: KASAN: use-after-free in tun_net_xmit+0x1670/0x1750 drivers/net/tun.c:1104
>>>>>>> Read of size 8 at addr ffff88836cc26a70 by task swapper/3/0
>>>>>> Which kernel version did you use? The calltrace points to a
>>>>>> use-after-free of the tun_file structure, which should be
>>>>>> synchronized through RCU + RTNL lock.
>>>>> The tfile socket has to be marked with SOCK_RCU_FREE in order
>>>>> to fully respect the RCU grace period.
>>>>>
>>>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>>>> index e9ca1c088d0b..31c3210288cb 100644
>>>>> --- a/drivers/net/tun.c
>>>>> +++ b/drivers/net/tun.c
>>>>> @@ -3431,6 +3431,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
>>>>>         file->private_data = tfile;
>>>>>         INIT_LIST_HEAD(&tfile->next);
>>>>>
>>>>> +       sock_set_flag(&tfile->sk, SOCK_RCU_FREE);
>>>>>         sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
>>>>>
>>>>>         return 0;
>>>>
>>>> We did a synchronize_net() when the socket is detached from the
>>>> netdevice in __tun_detach(), so it looks to me this is unnecessary.
>>> I know, but it is only called conditionally, that is:
>>>
>>> 695         if (tun && !tfile->detached) {
>>> ...
>>> 710
>>> 711                 synchronize_net();
>>>
>>> And it looks like syzbot just skipped this condition,
>>
>>
>> If tfile is detached, it should have gone for the path of
>> synchronize_net() before.
>> If tfile is never attached, tun_net_xmit() doesn't have the chance
>> to access that. I wonder whether or not we should use WRITE_ONCE()
>> for tun->numqueues-- in this function. If the value was not committed
>> to memory before synchronize_net(), we may race with tun_net_xmit(),
>> which checks txq against tun->numqueues.
>>
>>
>>> this is why I believe
>>> you still need to respect RCU grace period _unconditionally_ for tfile.
>>
>>
>> This is true if I missed a subtle race in the code.
>>
>>
>> Haibing: could you please try the following test?
>>
>> 1) start VM with multiple queue

I configured 8 queues with the virtio driver to start the VM.

>>
>> 2) using pktgen to inject packets to all queues through tap

I injected packets into the tap NIC on the host.

>>
>> 3) using ethtool to change the combined channels in guest in a loop

In the VM, I repeated the following in a loop:

ethtool -L eth0 combined 4
ethtool -L eth0 combined 8

>>
>> 4) kill the guest
>>

This does not reproduce the issue.

>>
>
> Ok, I will try this.
>
>> Thanks
>>
>>
>>> Thanks.
>>
>
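
To make the WRITE_ONCE() suggestion quoted above concrete, here is a minimal userspace sketch of the access pattern being discussed: the detach side publishes the decremented queue count with a single marked store, and the transmit side reads it once before checking txq. This is not the tun.c code. The struct, the function names and the SKETCH_* macros are hypothetical stand-ins, and the macros only force one volatile access each; they provide no memory ordering and no RCU grace period, which is what synchronize_net() is for in the driver.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE().
 * They force a single volatile access; they are not memory barriers.
 * __typeof__ is a GCC/Clang extension. */
#define SKETCH_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define SKETCH_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, stripped-down device state (not struct tun_struct). */
struct sketch_tun {
    unsigned int numqueues;
};

/* Detach side: publish the decremented queue count with one marked store. */
static void sketch_detach_last_queue(struct sketch_tun *tun)
{
    assert(tun->numqueues > 0);
    SKETCH_WRITE_ONCE(tun->numqueues, tun->numqueues - 1);
}

/* Transmit side: read the count once, then bounds-check the queue index.
 * Returns 1 if txq is still valid, 0 if the packet should be dropped. */
static int sketch_xmit_allowed(struct sketch_tun *tun, unsigned int txq)
{
    unsigned int numqueues = SKETCH_READ_ONCE(tun->numqueues);

    return txq < numqueues;
}
```

With numqueues starting at 8, sketch_xmit_allowed() accepts txq 7 before the detach and rejects it afterwards. Whether the real driver needs the marked store depends on how the compiler and the surrounding locking treat tun->numqueues, which this sketch does not model.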