From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:50268 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750836AbeCIDr6 (ORCPT ); Thu, 8 Mar 2018 22:47:58 -0500
Subject: Re: [PATCH] vhost_net: initialize rx_ring in vhost_net_open()
To: "Michael S. Tsirkin"
Cc: Alexander Potapenko , Dmitriy Vyukov , kvm@vger.kernel.org, Networking
References: <20180308133717.149524-1-glider@google.com> <20180308173247-mutt-send-email-mst@kernel.org> <20180308175642-mutt-send-email-mst@kernel.org> <9d5c574b-7418-e87f-feed-cf5bd7e3af2a@redhat.com> <20180309052745-mutt-send-email-mst@kernel.org>
From: Jason Wang
Message-ID: <4f4b5436-0984-4c7e-8b68-b79dfa01038b@redhat.com>
Date: Fri, 9 Mar 2018 11:47:51 +0800
MIME-Version: 1.0
In-Reply-To: <20180309052745-mutt-send-email-mst@kernel.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 2018-03-09 11:29, Michael S. Tsirkin wrote:
> On Fri, Mar 09, 2018 at 10:30:17AM +0800, Jason Wang wrote:
>>
>> On 2018-03-09 00:00, Michael S. Tsirkin wrote:
>>> On Thu, Mar 08, 2018 at 04:55:39PM +0100, Alexander Potapenko wrote:
>>>> On Thu, Mar 8, 2018 at 4:33 PM, Michael S. Tsirkin wrote:
>>>>> On Thu, Mar 08, 2018 at 02:37:17PM +0100, Alexander Potapenko wrote:
>>>>>> KMSAN reported a use of uninit memory in vhost_net_buf_unproduce()
>>>>>> while trying to access n->vqs[VHOST_NET_VQ_TX].rx_ring:
>>>>>>
>>>>>> ==================================================================
>>>>>> BUG: KMSAN: use of uninitialized memory in vhost_net_buf_unproduce+0x7bb/0x9a0 drivers/vho
>>>>>> et.c:170
>>>>>> CPU: 0 PID: 3021 Comm: syz-fuzzer Not tainted 4.16.0-rc4+ #3853
>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>>>>>> Call Trace:
>>>>>>  __dump_stack lib/dump_stack.c:17 [inline]
>>>>>>  dump_stack+0x185/0x1d0 lib/dump_stack.c:53
>>>>>>  kmsan_report+0x142/0x1f0 mm/kmsan/kmsan.c:1093
>>>>>>  __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
>>>>>>  vhost_net_buf_unproduce+0x7bb/0x9a0 drivers/vhost/net.c:170
>>>>>>  vhost_net_stop_vq drivers/vhost/net.c:974 [inline]
>>>>>>  vhost_net_stop+0x146/0x380 drivers/vhost/net.c:982
>>>>>>  vhost_net_release+0xb1/0x4f0 drivers/vhost/net.c:1015
>>>>>>  __fput+0x49f/0xa00 fs/file_table.c:209
>>>>>>  ____fput+0x37/0x40 fs/file_table.c:243
>>>>>>  task_work_run+0x243/0x2c0 kernel/task_work.c:113
>>>>>>  tracehook_notify_resume include/linux/tracehook.h:191 [inline]
>>>>>>  exit_to_usermode_loop arch/x86/entry/common.c:166 [inline]
>>>>>>  prepare_exit_to_usermode+0x349/0x3b0 arch/x86/entry/common.c:196
>>>>>>  syscall_return_slowpath+0xf3/0x6d0 arch/x86/entry/common.c:265
>>>>>>  do_syscall_64+0x34d/0x450 arch/x86/entry/common.c:292
>>>>>>  ...
>>>>>> origin:
>>>>>>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:303 [inline]
>>>>>>  kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:213
>>>>>>  kmsan_kmalloc_large+0x6f/0xd0 mm/kmsan/kmsan.c:392
>>>>>>  kmalloc_large_node_hook mm/slub.c:1366 [inline]
>>>>>>  kmalloc_large_node mm/slub.c:3808 [inline]
>>>>>>  __kmalloc_node+0x100e/0x1290 mm/slub.c:3818
>>>>>>  kmalloc_node include/linux/slab.h:554 [inline]
>>>>>>  kvmalloc_node+0x1a5/0x2e0 mm/util.c:419
>>>>>>  kvmalloc include/linux/mm.h:541 [inline]
>>>>>>  vhost_net_open+0x64/0x5f0 drivers/vhost/net.c:921
>>>>>>  misc_open+0x7b5/0x8b0 drivers/char/misc.c:154
>>>>>>  chrdev_open+0xc28/0xd90 fs/char_dev.c:417
>>>>>>  do_dentry_open+0xccb/0x1430 fs/open.c:752
>>>>>>  vfs_open+0x272/0x2e0 fs/open.c:866
>>>>>>  do_last fs/namei.c:3378 [inline]
>>>>>>  path_openat+0x49ad/0x6580 fs/namei.c:3519
>>>>>>  do_filp_open+0x267/0x640 fs/namei.c:3553
>>>>>>  do_sys_open+0x6ad/0x9c0 fs/open.c:1059
>>>>>>  SYSC_openat+0xc7/0xe0 fs/open.c:1086
>>>>>>  SyS_openat+0x63/0x90 fs/open.c:1080
>>>>>>  do_syscall_64+0x2f1/0x450 arch/x86/entry/common.c:287
>>>>>> ==================================================================
>>>>>>
>>>>>> Signed-off-by: Alexander Potapenko
>>>>>> ---
>>>>>>  drivers/vhost/net.c | 1 +
>>>>>>  1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>>>>> index 610cba276d47..60f1080bffc7 100644
>>>>>> --- a/drivers/vhost/net.c
>>>>>> +++ b/drivers/vhost/net.c
>>>>>> @@ -948,6 +948,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
>>>>>>          n->vqs[i].done_idx = 0;
>>>>>>          n->vqs[i].vhost_hlen = 0;
>>>>>>          n->vqs[i].sock_hlen = 0;
>>>>>> +        n->vqs[i].rx_ring = NULL;
>>>>>>          vhost_net_buf_init(&n->vqs[i].rxq);
>>>>>>      }
>>>>>>      vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
>>>>>> --
>>>>>> 2.16.2.395.g2e18187dfd-goog
>>>>> I suspect that's not sufficient. rx ring is tied to the tap device.
>>>>> I think we need to drop it every time we drop the device.
>>>> Unfortunately I've no idea where is the device dropped. Are you
>>>> referring to vhost_net_vq_reset()?
>>>> I can fix that part if needed, but won't be able to validate it with KMSAN.
>>> I see several issues. For example in vhost_net_set_backend
>>> if there's a value then rx ring will point to the
>>> ring of the wrong socket.
>>> Something like the below might help but we really need
>>> documentation of when is rx_ring valid. Is it only valid
>>> when private-data is valid?
>> I think so, we need keep rx_ring synced with private_data.
>>
>>> If yes need to make sure
>>> we reset it with private_data.
>>>
>>> Also I see __skb_array_destroy_skb used with ptr_ring which
>>> seems suspicious: how do we know the entries are skbs?
>> Good catch, will post an independent patch to fix this.
>>
>>> Patch below is on top of yours, and
>>>
>>> Signed-off-by: Michael S. Tsirkin
>>>
>>> But I really would like Jason to look and come up with a
>>> patch to address all these issues.
>>>
>>> ---
>>>
>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>> index 610cba2..7a65b69 100644
>>> --- a/drivers/vhost/net.c
>>> +++ b/drivers/vhost/net.c
>>> @@ -972,6 +973,7 @@ static struct socket *vhost_net_stop_vq(struct vhost_net *n,
>>>      vhost_net_disable_vq(n, vq);
>>>      vq->private_data = NULL;
>>>      vhost_net_buf_unproduce(nvq);
>>> +    vq->rx_ring = NULL;
>>>      mutex_unlock(&vq->mutex);
>>>      return sock;
>>>  }
>>> @@ -1161,8 +1163,6 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>>>      vhost_net_disable_vq(n, vq);
>>>      vq->private_data = sock;
>>>      vhost_net_buf_unproduce(nvq);
>>> -    if (index == VHOST_NET_VQ_RX)
>>> -        nvq->rx_ring = get_tap_ptr_ring(fd);
>>>      r = vhost_vq_init_access(vq);
>>>      if (r)
>>>          goto err_used;
>>> @@ -1172,6 +1172,10 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>>>      oldubufs = nvq->ubufs;
>>>      nvq->ubufs = ubufs;
>>> +    if (index == VHOST_NET_VQ_RX)
>>> +        nvq->rx_ring = get_tap_ptr_ring(fd);
>>> +    else
>>> +        nvq->rx_ring = NULL;
>> Any reason to move those after vhost_net_enable_vq()?
> Otherwise I see an issue if there is an error and
> we revert the change.

I see.

>
>> And consider we won't
>> try to assign rx_ring to TX, the "else" part seems unnecessary.
>>
>> Thanks
> ok, pls pack up all fixes as you see fit and post
> a patchset.

Ok.

Thanks

>
>>> n->tx_packets = 0;
>>> n->tx_zcopy_err = 0;
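
For readers following the thread, the rule being converged on (rx_ring is only valid while private_data is valid, and it should be updated only after the fallible step has succeeded) can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the vhost code and not the patch that was eventually posted: model_ring, model_sock, model_vq, model_set_backend() and model_stop_vq() are all hypothetical names, and the init_ok flag merely stands in for the result of vhost_vq_init_access().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real vhost types. */
struct model_ring { int id; };
struct model_sock { struct model_ring rx; };

enum { MODEL_VQ_RX = 0, MODEL_VQ_TX = 1 };

struct model_vq {
    struct model_sock *private_data; /* current backend, or NULL */
    struct model_ring *rx_ring;      /* ring of the current backend, or NULL */
};

/* Invariant from the thread: rx_ring is either NULL or the ring that
 * belongs to the backend currently stored in private_data. */
static bool model_vq_invariant(const struct model_vq *vq)
{
    return vq->rx_ring == NULL ||
           (vq->private_data != NULL && vq->rx_ring == &vq->private_data->rx);
}

/* Same shape as the stop path: drop the backend and the ring together. */
static struct model_sock *model_stop_vq(struct model_vq *vq)
{
    struct model_sock *old = vq->private_data;

    vq->private_data = NULL;
    vq->rx_ring = NULL;
    assert(model_vq_invariant(vq));
    return old;
}

/* init_ok simulates whether the fallible setup step succeeds. */
static int model_set_backend(struct model_vq *vq, int index,
                             struct model_sock *sock, bool init_ok)
{
    struct model_sock *old = vq->private_data;

    vq->private_data = sock;
    if (!init_ok) {
        /* Roll back. rx_ring was not touched yet, so it still
         * matches the old backend. */
        vq->private_data = old;
        assert(model_vq_invariant(vq));
        return -1;
    }

    /* Commit the ring only once nothing else can fail. */
    vq->rx_ring = (index == MODEL_VQ_RX && sock != NULL) ? &sock->rx : NULL;
    assert(model_vq_invariant(vq));
    return 0;
}
```

In this model, assigning rx_ring before the fallible step would leave it pointing at the new socket's ring after a rollback to the old socket, which is the "ring of the wrong socket" case raised above.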