From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Blue Swirl" <blauwirbel@gmail.com>
Subject: Re: [Qemu-devel] Re: [PATCH 2 of 5] add can_dma/post_dma for direct IO
Date: Tue, 16 Dec 2008 17:57:11 +0200
Message-ID: <f43fc5580812160757h64dc84aak98c43b29bb63251a@mail.gmail.com>
References: <cc5d812eb9369a7ad2ef.1229105804@duo.random>
	 <4943E68E.3030400@codemonkey.ws> <4944117C.6030404@redhat.com>
	 <49442410.7020608@codemonkey.ws> <4944A1B5.5080300@redhat.com>
	 <49455A33.207@codemonkey.ws> <49456337.4000000@redhat.com>
	 <494591F7.3080002@codemonkey.ws>
	 <f43fc5580812151035o40fad1d2m47c006eb802eefac@mail.gmail.com>
	 <4946D501.4020109@codemonkey.ws>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: qemu-devel@nongnu.org, "Avi Kivity" <avi@redhat.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>, chrisw@redhat.com,
	kvm@vger.kernel.org, "Gerd Hoffmann" <kraxel@redhat.com>
To: "Anthony Liguori" <anthony@codemonkey.ws>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-bw0-f21.google.com ([209.85.218.21]:33132 "EHLO
	mail-bw0-f21.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1757103AbYLPP5O (ORCPT <rfc822;kvm@vger.kernel.org>);
	Tue, 16 Dec 2008 10:57:14 -0500
Received: by bwz14 with SMTP id 14so4012116bwz.13
        for <kvm@vger.kernel.org>; Tue, 16 Dec 2008 07:57:12 -0800 (PST)
In-Reply-To: <4946D501.4020109@codemonkey.ws>
Content-Disposition: inline
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 12/16/08, Anthony Liguori <anthony@codemonkey.ws> wrote:
> Blue Swirl wrote:
>
> > I changed the ESP SCSI and Lance Ethernet on Sparc32 to resolve the IO
> > address to physical memory (see patch). ESP works (no zero copy yet),
> > Lance doesn't. It looks much better. Because the resolving activity is
> > performed in serial steps, unbounded IO vector allocation does not
> > happen, but we still could launch as many IO as there are free IO
> > vectors.
> >
> >
>
>  It is a good cleanup.
>
>
> > There are still some issues I'm not happy with yet:
> > - handling of access violations: resolving should stop before the bad
> > page, the transfers should be done up to that point, and then an error
> > should be posted.
> > - bounce buffers needed for Lance byte swapping are not well designed
> > (stack)
> >
> >
>
>  I think you could approach the bouncing via a map/unmap API but I'm not
> sure.  You would need a map() function to take a virtual address which is
> sort of weird.  That would allow you to stack them in an arbitrary fashion
> though.

I'd still like to keep resolving virtual addresses to physical
addresses separate from physical address to host pointer mapping. I
think this was the enlightening discovery we made earlier.

The generic resolving API should look something like

int (*resolve)(target_phys_addr_t address_in,
               target_phys_addr_t length_in,
               target_phys_addr_t *address_out,
               target_phys_addr_t *length_out);

whereas the map/unmap API would be as you specified earlier.
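
To make the idea a bit more concrete, here is a rough sketch of how such
a resolver could be chained through several levels. This is not from the
patch: the Resolver struct, the extra "r" argument, window_resolve() and
resolve_chain() are all made up for illustration, and target_phys_addr_t
is just typedef'ed to a 64-bit integer. Each level takes the same
parameters and may return a shorter length than requested, so the caller
can stop before a bad page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for QEMU's target_phys_addr_t; the width is an assumption. */
typedef uint64_t target_phys_addr_t;

/* Hypothetical resolver node: every level uses the same signature. */
typedef struct Resolver Resolver;
struct Resolver {
    /* Returns 0 on success, -1 on an access violation. */
    int (*resolve)(Resolver *r,
                   target_phys_addr_t address_in,
                   target_phys_addr_t length_in,
                   target_phys_addr_t *address_out,
                   target_phys_addr_t *length_out);
    Resolver *parent;           /* next higher level, NULL at the top */
    target_phys_addr_t base;    /* toy translation: start of window */
    target_phys_addr_t size;    /* toy translation: size of window */
    target_phys_addr_t offset;  /* output address of the window start */
};

/* Toy resolver: translates one contiguous window, clamps the length. */
int window_resolve(Resolver *r,
                   target_phys_addr_t address_in,
                   target_phys_addr_t length_in,
                   target_phys_addr_t *address_out,
                   target_phys_addr_t *length_out)
{
    target_phys_addr_t delta, avail;

    assert(r != NULL && address_out != NULL && length_out != NULL);
    if (address_in < r->base || address_in - r->base >= r->size) {
        return -1;              /* outside the window: access violation */
    }
    delta = address_in - r->base;
    avail = r->size - delta;    /* bytes left before the window ends */
    *address_out = r->offset + delta;
    *length_out = length_in < avail ? length_in : avail;
    return 0;
}

/* Walk up the chain; the device never needs to know who is above it. */
int resolve_chain(Resolver *r,
                  target_phys_addr_t address,
                  target_phys_addr_t length,
                  target_phys_addr_t *address_out,
                  target_phys_addr_t *length_out)
{
    for (; r != NULL; r = r->parent) {
        if (r->resolve(r, address, length, &address, &length) != 0) {
            return -1;
        }
    }
    *address_out = address;
    *length_out = length;
    return 0;
}
```

The device would call resolve_chain() repeatedly, advancing by the
returned length each time, and only then hand the resulting physical
ranges to the map/unmap layer.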

> > This led me to the thought that maybe we should not hide the bounce
> > buffer activity, but instead make it more explicit for the device that
> > needs bouncing. For the other devices, the buffering or lack of it
> > should be opaque.
> >
> >
>
>  I think that's reasonable.
>
>
> > Also the virtual-to-physical address resolution API could be generic,
> > i.e. all resolver functions should take the same parameters so that the
> > devices would not need to know the next higher level device.
> >
> >
>
>  Yes.  I think this is key.  The only observation I would make is that the
> resolution API should have some sort of release function (so map/unmap,
> lock/unlock, whatever).

I'm not so sure resolving works in the same symmetric way, because guest
activity will change the virtual to physical mapping over time, so a
resolved address is only a snapshot and there is nothing to release. We
could cache the resolved translations, but then there would have to be a
way to invalidate the cache entries throughout the device chain, with
callbacks etc.
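
Just to illustrate what I mean by callbacks, something along these lines
(all names here are invented for the example, nothing like this exists
in the patch): the translating unit keeps a list of listeners, and each
device that caches a translation registers a function that drops its
entry when the affected virtual range overlaps it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for QEMU's target_phys_addr_t; the width is an assumption. */
typedef uint64_t target_phys_addr_t;

#define MAX_LISTENERS 8

/* Hypothetical callback type, called when a virtual range changes. */
typedef void (*InvalidateFn)(void *opaque,
                             target_phys_addr_t virt,
                             target_phys_addr_t len);

typedef struct Listener {
    InvalidateFn fn;
    void *opaque;
} Listener;

/* Hypothetical translating unit (e.g. an IOMMU model). */
typedef struct Translator {
    Listener listeners[MAX_LISTENERS];
    int count;
} Translator;

/* Register a listener; returns 0 on success, -1 if the table is full. */
int translator_add_listener(Translator *t, InvalidateFn fn, void *opaque)
{
    assert(t != NULL && fn != NULL);
    if (t->count >= MAX_LISTENERS) {
        return -1;
    }
    t->listeners[t->count].fn = fn;
    t->listeners[t->count].opaque = opaque;
    t->count++;
    return 0;
}

/* Called when the guest changes the mapping of [virt, virt + len). */
void translator_invalidate(Translator *t,
                           target_phys_addr_t virt,
                           target_phys_addr_t len)
{
    int i;

    for (i = 0; i < t->count; i++) {
        t->listeners[i].fn(t->listeners[i].opaque, virt, len);
    }
}

/* One cached translation held by a device further down the chain. */
typedef struct CachedMapping {
    int valid;
    target_phys_addr_t virt, phys, len;
} CachedMapping;

/* Listener: drop the cached entry if the changed range overlaps it. */
void cached_mapping_invalidate(void *opaque,
                               target_phys_addr_t virt,
                               target_phys_addr_t len)
{
    CachedMapping *m = opaque;

    if (m->valid && virt < m->virt + m->len && m->virt < virt + len) {
        m->valid = 0;
    }
}
```

A device that is itself a translator for devices below it could call
translator_invalidate() on its own listener list from inside its
callback, which is how the invalidation would travel down the chain.
Whether this bookkeeping is worth it compared to simply resolving again
for every transfer is an open question.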
