From: Badari Pulavarty
Subject: Re: [RFC PATCH]vhost-blk: In-kernel accelerator for virtio block device
Date: Thu, 11 Aug 2011 21:50:08 -0700
Message-ID: <4E44B100.3000208@us.ibm.com>
References: <1311863346-4338-1-git-send-email-namei.unix@gmail.com> <4E325F98.5090308@gmail.com> <4E32F7F2.4080607@us.ibm.com> <4E363DB9.70801@gmail.com> <1312495132.9603.4.camel@badari-desktop> <4E3BCE4D.7090809@gmail.com> <4E3C302A.3040500@us.ibm.com> <4E3F3D4E.70104@gmail.com> <4E3F6E72.1000907@us.ibm.com> <4E3F90E3.9080600@gmail.com> <4E4019E1.2090508@us.ibm.com> <4E41EAC5.8060001@gmail.com> <1313008667.9603.14.camel@badari-desktop> <4E4345F1.90107@gmail.com> <4E434A51.8000902@gmail.com>
In-Reply-To: <4E434A51.8000902@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
To: Liu Yuan
Cc: kvm@vger.kernel.org

On 8/10/2011 8:19 PM, Liu Yuan wrote:
> On 08/11/2011 11:01 AM, Liu Yuan wrote:
>>
>>> It looks like the patch wouldn't work for testing multiple devices.
>>>
>>> vhost_blk_open() does
>>> +    used_info_cachep = KMEM_CACHE(used_info, SLAB_HWCACHE_ALIGN | SLAB_PANIC);
>>>
>>
>> This is weird. How do you open multiple devices? I just opened the
>> devices with the following command:
>>
>> -drive file=/dev/sda6,if=virtio,cache=none,aio=native -drive
>> file=~/data0.img,if=virtio,cache=none,aio=native -drive
>> file=~/data1.img,if=virtio,cache=none,aio=native
>>
>> And I didn't hit any problem.
>>
>> This tells qemu to open three devices and pass three FDs to three
>> instances of the vhost_blk module, so KMEM_CACHE() is okay in
>> vhost_blk_open().
>>
>
> Oh, you are right. KMEM_CACHE() is in the wrong place; three instances
> of the vhost worker thread are created. Hmmm, but I didn't hit any
> problem when opening it and running it. So strange. I'll go figure
> it out.
>
>>> When opening the second device, we get a panic since used_info_cachep
>>> is already created. Just to make progress I moved this call to
>>> vhost_blk_init().
>>>
>>> I don't see any host panics now. With a single block device (dd),
>>> it seems to work fine. But when I start testing multiple block
>>> devices I quickly run into hangs in the guest. I see the following
>>> messages in the guest from virtio_ring.c:
>>>
>>> virtio_blk virtio2: requests: id 0 is not a head !
>>> virtio_blk virtio1: requests: id 0 is not a head !
>>> virtio_blk virtio4: requests: id 1 is not a head !
>>> virtio_blk virtio3: requests: id 39 is not a head !
>>>
>>> Thanks,
>>> Badari
>>>
>>>
>>
>> vq->data[] is initialized by the guest virtio-blk driver and vhost_blk
>> is unaware of it.
>> It looks like the used ID passed over by vhost_blk to the guest
>> virtio_blk is wrong, but that should not happen. :|
>>
>> And I can't reproduce this on my laptop. :(
>>

Finally, found the issue :) The culprit is:

+static struct io_event events[MAX_EVENTS];

With multiple devices, multiple threads could be executing
handle_completion() (one for each fd) at the same time, but the
"events" array is global :( It needs to be one per device/fd.

For testing, I changed MAX_EVENTS to 32 and moved the "events" array to
be local (on the stack) in handle_completion(). Tests are running fine.

Your laptop must have a single processor, hence only one thread executes
handle_completion() at any time.

Thanks,
Badari
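
PS: To make the events[] change concrete, here is roughly the shape I
ended up with. This is only a sketch: apart from handle_completion(),
MAX_EVENTS and struct io_event, the names below (struct vhost_blk,
drain_aio_events()) are stand-ins I made up for illustration, not code
from your patch.

    #include <linux/aio.h>        /* struct kioctx (in-kernel aio context) */
    #include <linux/aio_abi.h>    /* struct io_event */

    #define MAX_EVENTS 32

    /* Stand-in for the patch's per-device state; only what this sketch
     * touches is shown. */
    struct vhost_blk {
            struct kioctx *ioctx;
            /* ... vhost dev, virtqueue, worker thread, ... */
    };

    static void handle_completion(struct vhost_blk *blk)
    {
            /* Was a single file-scope "static struct io_event events[]"
             * shared by every device.  On the stack (or embedded in
             * struct vhost_blk) each completion thread gets its own
             * buffer, so two devices completing I/O concurrently no
             * longer overwrite each other's events. */
            struct io_event events[MAX_EVENTS];
            int nr, i;

            /* drain_aio_events() is hypothetical -- in the patch this is
             * wherever the completed io_events are actually read out of
             * blk->ioctx. */
            nr = drain_aio_events(blk->ioctx, MAX_EVENTS, events);
            for (i = 0; i < nr; i++) {
                    /* ... push events[i] back to the guest as a used
                     * buffer and kick the virtqueue ... */
            }
    }

32 events of 32 bytes each is only 1KB of stack, so the quick hack is
fine for testing, but embedding the array in the per-device struct (one
per device/fd, as I said above) avoids the stack usage entirely and is
probably the cleaner long-term fix.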
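
PS: And for the KMEM_CACHE() placement issue from earlier in the
thread, this is the kind of move-to-module-init I had in mind. Again
only a sketch: struct used_info's fields and the device registration
are guesses, not taken from the patch. The point is just that the slab
cache is created once per module instead of once per open(), so the
second open() no longer tries to create a second cache with the same
name.

    #include <linux/module.h>
    #include <linux/slab.h>
    #include <linux/fs.h>

    /* Stand-in for the real used_info from the patch. */
    struct used_info {
            int head;
            int len;
            void *status;
    };

    static struct kmem_cache *used_info_cachep;

    /* Per-open (per-device) setup only -- no cache creation here. */
    static int vhost_blk_open(struct inode *inode, struct file *f)
    {
            /* ... allocate and initialise the per-device vhost_blk ... */
            return 0;
    }

    static const struct file_operations vhost_blk_fops = {
            .owner = THIS_MODULE,
            .open  = vhost_blk_open,
    };

    static int __init vhost_blk_init(void)
    {
            used_info_cachep = KMEM_CACHE(used_info, SLAB_HWCACHE_ALIGN);
            if (!used_info_cachep)
                    return -ENOMEM;
            /* ... register the char/misc device using vhost_blk_fops,
             * exactly as the patch already does ... */
            return 0;
    }
    module_init(vhost_blk_init);

    static void __exit vhost_blk_exit(void)
    {
            kmem_cache_destroy(used_info_cachep);
    }
    module_exit(vhost_blk_exit);

    MODULE_LICENSE("GPL");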