From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: [PATCH 2 of 5] add can_dma/post_dma for direct IO
Date: Sun, 14 Dec 2008 17:08:39 -0600
Message-ID: <494591F7.3080002@codemonkey.ws>
References: <4942B841.6010900@codemonkey.ws> <4942BDEE.7020003@codemonkey.ws>
 <49437EC8.6020506@redhat.com> <4943E68E.3030400@codemonkey.ws>
 <4944117C.6030404@redhat.com> <49442410.7020608@codemonkey.ws>
 <4944A1B5.5080300@redhat.com> <49455A33.207@codemonkey.ws>
 <49456337.4000000@redhat.com>
In-Reply-To: <49456337.4000000@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
To: Avi Kivity
Cc: qemu-devel@nongnu.org, Andrea Arcangeli, chrisw@redhat.com,
 Gerd Hoffmann, kvm@vger.kernel.org
Sender: kvm-owner@vger.kernel.org

Avi Kivity wrote:
> Anthony Liguori wrote:
>> I've thought quite a bit about it, and I'm becoming less convinced
>> that this sort of API is going to be helpful.
>>
>> I was thinking that we need to make one minor change to the map API I
>> proposed. It should return a mapped size as an output parameter and
>> take a flag as to whether partial mappings can be handled. The
>> effect would be that you never bounce to RAM which means that you can
>> also quite accurately determine the maximum amount of bouncing (it
>> should be proportional to the amount of MMIO memory that's registered).
>
> That's pointless; cirrus for example has 8MB of mmio while a
> cpu-to-vram blit is in progress, and some random device we'll add
> tomorrow could easily introduce more. Our APIs shouldn't depend on
> properties of emulated hardware, at least as much as possible.

One way to think of what I'm suggesting: if, for every
cpu_register_physical_memory() call for MMIO, we allocated a buffer,
then whenever map() was called on MMIO, we would return that
already-allocated buffer. The overhead is fixed and honestly relatively
small, much smaller than what dma.c proposes.

But you can be smarter and lazily allocate those buffers for MMIO.
That will reduce the up-front memory consumption. I'd be perfectly
happy, though, if the first implementation just called malloc() in
cpu_register_physical_memory() for the sake of simplicity.

> I'll enumerate the functions that dma.c provides:
> - convert guest physical addresses to host virtual addresses

The map() api does this.

> - construct an iovec for scatter/gather

The map() api does not do this.

> - handle guest physical addresses for which no host virtual addresses
>   exist, while controlling memory use
> - take care of the dirty bit
> - provide a place for adding hooks to hardware that can modify dma
>   operations generically (emulated iommus, transforming dma engines)

The map() api does all of this.

> I believe that a dma api that fails to address all of these
> requirements is trying to solve too few problems at the same time, and
> will either cause dma clients to be unduly complicated, or will
> require rewriting.

I think there's a disconnect between what you describe and what the
current code does. I think there's a very simple solution: let's start
with the map() api. I'm convinced that the virtio support for it will
be trivial and that virtio will not benefit from the proposed dma.c
api. I'm willing to do the virtio map() implementation to demonstrate
this.

Let's see two implementations that use the dma.c api before we commit
to it. I'd like to see at least a network device and a block device.
I don't believe network devices will benefit from it because they
don't support partial submissions.
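To make the bounce-buffer idea above concrete, here is a rough C sketch
(every type and name below is hypothetical, for illustration only; this
is not QEMU's actual interface). The point it demonstrates: map() hands
out a direct pointer for RAM, lazily allocates one bounce buffer per
registered MMIO region, and shrinks the length output parameter when
only a partial mapping is possible.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical region descriptor; not QEMU's real data structures. */
typedef struct PhysRegion {
    uint64_t base;
    uint64_t size;
    int is_mmio;
    void *ram_ptr;   /* host backing for RAM regions */
    void *bounce;    /* lazily allocated bounce buffer for MMIO regions */
} PhysRegion;

/* One table entry per (hypothetical) cpu_register_physical_memory() call. */
static PhysRegion *find_region(PhysRegion *table, int n, uint64_t addr)
{
    for (int i = 0; i < n; i++) {
        if (addr >= table[i].base && addr - table[i].base < table[i].size) {
            return &table[i];
        }
    }
    return NULL;
}

/*
 * map(): return a host pointer covering [addr, addr + *len).
 * For RAM, this is a direct pointer into guest memory.
 * For MMIO, it is a per-region bounce buffer allocated on first use,
 * so total bounce memory stays bounded by the amount of MMIO registered.
 * *len is shrunk to what the region can cover (the "mapped size"
 * output parameter discussed above).
 */
void *map(PhysRegion *table, int n, uint64_t addr, uint64_t *len)
{
    PhysRegion *r = find_region(table, n, addr);
    if (!r) {
        return NULL;
    }
    uint64_t off = addr - r->base;
    if (*len > r->size - off) {
        *len = r->size - off;            /* partial mapping */
    }
    if (!r->is_mmio) {
        return (uint8_t *)r->ram_ptr + off;
    }
    if (!r->bounce) {
        r->bounce = malloc(r->size);     /* lazy: only MMIO touched by DMA pays */
    }
    return (uint8_t *)r->bounce + off;
}
```

With something like this, the worst-case bounce memory is bounded by
the total size of registered MMIO actually touched by DMA, which is
exactly why returning a mapped size and allowing partial mappings
matters.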
I like any API that reduces duplicate code. It's easy to demonstrate
that with patches. Based on my current understanding of the API and
what I expect from the devices using it, I don't believe the API will
actually do that. It's quite easy to prove me wrong, though.

Regards,

Anthony Liguori