From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756802Ab2IDKZW (ORCPT ); Tue, 4 Sep 2012 06:25:22 -0400 Received: from mx1.redhat.com ([209.132.183.28]:65226 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756711Ab2IDKZQ (ORCPT ); Tue, 4 Sep 2012 06:25:16 -0400 Message-ID: <5045D6FF.5020801@redhat.com> Date: Tue, 04 Sep 2012 12:25:03 +0200 From: Paolo Bonzini User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0 MIME-Version: 1.0 To: "Michael S. Tsirkin" CC: "Nicholas A. Bellinger" , linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, kvm@vger.kernel.org, rusty@rustcorp.com.au, jasowang@redhat.com, virtualization@lists.linux-foundation.org, Christoph Hellwig , Jens Axboe , target-devel Subject: Re: [PATCH 5/5] virtio-scsi: introduce multiqueue support References: <1346154857-12487-1-git-send-email-pbonzini@redhat.com> <1346154857-12487-6-git-send-email-pbonzini@redhat.com> <1346725294.4162.79.camel@haakon2.linux-iscsi.org> <5045A3B4.2030101@redhat.com> <20120904084628.GA8437@redhat.com> In-Reply-To: <20120904084628.GA8437@redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Il 04/09/2012 10:46, Michael S. Tsirkin ha scritto: >>>> +static int virtscsi_queuecommand_multi(struct Scsi_Host *sh, >>>> + struct scsi_cmnd *sc) >>>> +{ >>>> + struct virtio_scsi *vscsi = shost_priv(sh); >>>> + struct virtio_scsi_target_state *tgt = vscsi->tgt[sc->device->id]; >>>> + unsigned long flags; >>>> + u32 queue_num; >>>> + >>>> + /* Using an atomic_t for tgt->reqs lets the virtqueue handler >>>> + * decrement it without taking the spinlock. >>>> + */ > > Above comment is not really helpful - reader can be safely assumed to > know what atomic_t is. Sure, the comment explains that we use an atomic because _elsewhere_ the tgt_lock is not held while modifying reqs. > Please delete, and replace with the text from commit log > that explains the heuristic used to select req_vq. Ok. > Also please add a comment near 'reqs' definition. > Something like "number of outstanding requests - used to detect idle > target". Ok. > >>>> + spin_lock_irqsave(&tgt->tgt_lock, flags); > > Looks like this lock can be removed - req_vq is only > modified when target is idle and only used when it is > not idle. If you have two incoming requests at the same time, req_vq is also modified when the target is not idle; that's the point of the lock. Suppose tgt->reqs = 0 initially, and you have two processors/queues. Initially tgt->req_vq is queue #1. If you have this: queuecommand on CPU #0 queuecommand #2 on CPU #1 -------------------------------------------------------------- atomic_inc_return(...) == 1 atomic_inc_return(...) == 2 virtscsi_queuecommand to queue #1 tgt->req_vq = queue #0 virtscsi_queuecommand to queue #0 then two requests are issued to different queues without a quiescent point in the middle. >>>> + if (atomic_inc_return(&tgt->reqs) == 1) { >>>> + queue_num = smp_processor_id(); >>>> + while (unlikely(queue_num >= vscsi->num_queues)) >>>> + queue_num -= vscsi->num_queues; >>>> + tgt->req_vq = &vscsi->req_vqs[queue_num]; >>>> + } >>>> + spin_unlock_irqrestore(&tgt->tgt_lock, flags); >>>> + return virtscsi_queuecommand(vscsi, tgt, sc); >>>> +} >>>> + >>>> + > > ..... > >>>> +static int virtscsi_queuecommand_single(struct Scsi_Host *sh, >>>> + struct scsi_cmnd *sc) >>>> +{ >>>> + struct virtio_scsi *vscsi = shost_priv(sh); >>>> + struct virtio_scsi_target_state *tgt = vscsi->tgt[sc->device->id]; >>>> + >>>> + atomic_inc(&tgt->reqs); >>>> + return virtscsi_queuecommand(vscsi, tgt, sc); >>>> +} >>>> + > > Here, reqs is unused - why bother incrementing it? > A branch on completion would be cheaper IMHO. Well, I could also let tgt->reqs go negative, but it would be a bit untidy. Another alternative is to access the target's target_busy field with ACCESS_ONCE, and drop reqs altogether. Too tricky to do this kind of micro-optimization so early, though. >> virtio-scsi multiqueue has a performance benefit up to 20% > > To be fair, you could be running in single queue mode. > In that case extra atomics and indirection that this code > brings will just add overhead without benefits. > I don't know how significant would that be. Not measurable in my experiments. Paolo