Date: Wed, 3 Feb 2016 13:02:34 -0500 (EST)
From: Ladi Prosek
To: Amit Shah
Cc: pagupta@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] rng-random: implement request queue
Message-ID: <831310576.31418757.1454522554452.JavaMail.zimbra@redhat.com>
In-Reply-To: <20160203123639.GA20527@grmbl.mre>
References: <1453465198-11000-1-git-send-email-lprosek@redhat.com>
 <20160203123639.GA20527@grmbl.mre>

Hi Amit,

----- Original Message -----
> Hi Ladi,
>
> Adding Pankaj to CC, he too looked at this recently.
>
> On (Fri) 22 Jan 2016 [13:19:58], Ladi Prosek wrote:
> > If the guest adds a buffer to the virtio queue while another buffer
> > is still pending and hasn't been filled and returned by the rng
> > device, rng-random internally discards the pending request, which
> > leads to the second buffer getting stuck in the queue. For the guest
> > this manifests as delayed completion of reads from virtio-rng, i.e.
> > a read is completed only after another read is issued.
> >
> > This patch adds an internal queue of requests, analogous to what
> > rng-egd uses, to make sure that requests and responses are balanced
> > and correctly ordered.
>
> ... and this can lead to breaking migration (the queue of requests on
> the host needs to be migrated, else the new host will have no idea of
> the queue).

I was under the impression that clearing the queue pre-migration, as
implemented by the RngBackendClass::cancel_requests callback, is enough.
If it weren't, the rng-egd backend would already be broken, as its
queueing logic is pretty much identical.

/**
 * rng_backend_cancel_requests:
 * @s: the backend to cancel all pending requests in
 *
 * Cancels all pending requests submitted by @rng_backend_request_entropy. This
 * should be used by a device during reset or in preparation for live migration
 * to stop tracking any request.
 */
void rng_backend_cancel_requests(RngBackend *s);

Upon closer inspection, though, this function appears to have no callers.
Either I'm missing something or there's another bug to be fixed.

> I think we should limit the queue size to 1 instead. Multiple rng
> requests should not be common, because if we did have entropy, we'd
> just service the guest request and be done with it. If we haven't
> replied to the guest, it just means that the host itself is waiting
> for more entropy, or is waiting for the timeout before the guest's
> ratelimit is lifted.

The scenario I had in mind is multiple processes in the guest requesting
entropy at the same time, no rate limiting, and a fast entropy source on
the host. Being able to queue up requests would definitely help boost
performance. I think I even benchmarked it, but I must have lost the
numbers; I can set it up again and rerun the benchmark if you're
interested.
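In case a concrete picture helps the discussion, below is a rough sketch of
what the queueing amounts to on the rng-random side. It is a simplification,
not the patch verbatim: the request struct and the s->requests field are
names I'm making up here (s->requests stands for a new GSList member on
RndRandom), and error handling and partial reads are trimmed.

/* Sketch only, meant to read like backends/rng-random.c; names and
 * details are approximate, not the actual patch. */
typedef struct RngRandomRequest {
    EntropyReceiveFunc *receive_entropy;
    void *opaque;
    size_t size;
} RngRandomRequest;

static void entropy_available(void *opaque)
{
    RndRandom *s = RNG_RANDOM(opaque);
    RngRandomRequest *req = s->requests->data;  /* oldest request first */
    uint8_t buffer[req->size];
    ssize_t len;

    len = read(s->fd, buffer, req->size);
    if (len < 0 && errno == EAGAIN) {
        return;
    }
    g_assert(len != -1);

    /* complete the request at the head of the queue and pop it */
    req->receive_entropy(req->opaque, buffer, len);
    s->requests = g_slist_remove(s->requests, req);
    g_free(req);

    if (!s->requests) {
        /* nothing left to serve, stop polling the fd */
        qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
    }
}

static void rng_random_request_entropy(RngBackend *b, size_t size,
                                       EntropyReceiveFunc *receive_entropy,
                                       void *opaque)
{
    RndRandom *s = RNG_RANDOM(b);
    RngRandomRequest *req = g_new0(RngRandomRequest, 1);

    req->receive_entropy = receive_entropy;
    req->opaque = opaque;
    req->size = size;

    /* append instead of discarding whatever is already pending */
    s->requests = g_slist_append(s->requests, req);

    qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
}

The bookkeeping is essentially what rng-egd already does for its requests;
the only real difference is that the entropy comes from the file descriptor
instead of the chardev, which is why I'd expect the migration story to be no
worse than what we have today.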
> So, instead of fixing this using a queue, how about limiting the size
> of the vq to have just one element at a time?

I don't believe that this is a good solution. Although perfectly valid
spec-wise, I can see how a one-element queue could confuse less-than-perfect
driver implementations. Additionally, the driver would have to implement
some kind of guest-side queueing logic and serialize its requests, or else
drop them when the virtqueue is full. Overall, I don't think it's completely
crazy to call it a breaking change.

> Thanks,
>
>			Amit
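P.S. Just to make sure we are talking about the same thing: I read "limiting
the size of the vq" as creating the virtio-rng queue with a single
descriptor, roughly as in the sketch below. This is illustrative only, not a
proposed patch, and if I remember right the queue is currently created with
8 entries.

/* hw/virtio/virtio-rng.c -- illustrative sketch of the one-element-vq idea */
static void virtio_rng_device_realize(DeviceState *dev, Error **errp)
{
    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
    VirtIORNG *vrng = VIRTIO_RNG(dev);

    /* ... existing setup unchanged ... */

    /* a one-entry ring: the guest can never have more than one buffer
     * outstanding, so concurrent readers in the guest have to be
     * serialized by the driver itself */
    vrng->vq = virtio_add_queue(vdev, 1, handle_input);

    /* ... */
}

With a ring that small, a second reader's buffer simply doesn't fit until
the first one has been returned, so the driver either queues guest-side or
drops the request, which is the breaking-change concern above.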