From: "Blue Swirl" <blauwirbel@gmail.com>
To: qemu-devel@nongnu.org, "Anthony Liguori" <anthony@codemonkey.ws>
Cc: "Avi Kivity" <avi@redhat.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	chrisw@redhat.com, kvm@vger.kernel.org,
	"Gerd Hoffmann" <kraxel@redhat.com>
Subject: Re: [Qemu-devel] Re: [PATCH 2 of 5] add can_dma/post_dma for direct IO
Date: Mon, 15 Dec 2008 20:35:36 +0200	[thread overview]
Message-ID: <f43fc5580812151035o40fad1d2m47c006eb802eefac@mail.gmail.com> (raw)
In-Reply-To: <494591F7.3080002@codemonkey.ws>

[-- Attachment #1: Type: text/plain, Size: 4357 bytes --]

On 12/15/08, Anthony Liguori <anthony@codemonkey.ws> wrote:
> Avi Kivity wrote:
>
> > Anthony Liguori wrote:
> >
> > > I've thought quite a bit about it, and I'm becoming less convinced
> > > that this sort of API is going to be helpful.
> > >
> > > I was thinking that we need to make one minor change to the map API
> > > I proposed.  It should return a mapped size as an output parameter
> > > and take a flag as to whether partial mappings can be handled.  The
> > > effect would be that you never bounce to RAM which means that you
> > > can also quite accurately determine the maximum amount of bouncing
> > > (it should be proportional to the amount of MMIO memory that's
> > > registered).
> > >
> >
> > That's pointless; cirrus for example has 8MB of mmio while a
> > cpu-to-vram blit is in progress, and some random device we'll add
> > tomorrow could easily introduce more.  Our APIs shouldn't depend on
> > properties of emulated hardware, at least as much as possible.
> >
>
>  One way to think of what I'm suggesting, is that if for every
> cpu_register_physical_memory call for MMIO, we allocated a buffer, then
> whenever map() was called on MMIO, we would return that already allocated
> buffer.  The overhead is fixed and honestly relatively small.  Much smaller
> than dma.c proposes.
>
>  But you can be smarter, and lazily allocate those buffers for MMIO.  That
> will reduce the up front memory consumption.  I'd be perfectly happy though
> if the first implementation just malloc() on cpu_register_physical_memory
> for the sake of simplicity.

Maybe the buffer thing should not be called DMA, but something like
bounce-buffers?
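
Something like this is how I read the lazy allocation idea (a minimal
sketch only, with invented names and types; nothing taken from the tree):

    #include <stdint.h>
    #include <stdlib.h>

    /* one lazily allocated bounce buffer per registered MMIO region,
     * handed out by a map()-style call */
    typedef struct MMIORegion {
        uint64_t base;        /* stands in for target_phys_addr_t */
        uint64_t size;
        uint8_t *bounce;      /* NULL until the first map() hits the region */
    } MMIORegion;

    static void *mmio_map(MMIORegion *r, uint64_t addr, uint64_t len)
    {
        uint64_t off = addr - r->base;

        if (addr < r->base || off >= r->size || len > r->size - off) {
            return NULL;      /* caller must split the request or fall back */
        }
        if (!r->bounce) {
            r->bounce = malloc(r->size);   /* lazy: overhead stays bounded */
        }
        return r->bounce ? r->bounce + off : NULL;
    }

The overhead is then at most one buffer per registered MMIO region, which
matches the fixed upper bound you describe.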

> > I'll enumerate the functions that dma.c provides:
> > - convert guest physical addresses to host virtual addresses
> >
>
>  The map() api does this.
>
>
> > - construct an iovec for scatter/gather
> >
>
>  The map() api does not do this.
>
>
> > - handle guest physical addresses for which no host virtual
> > addresses exist, while controlling memory use
> > - take care of the dirty bit
> > - provide a place for adding hooks to hardware that can modify dma
> > operations generically (emulated iommus, transforming dma engines)
> >
>
>  The map() api does all of this.
>
>
> > I believe that a dma api that fails to address all of these
> > requirements is trying to solve too few problems at the same time,
> > and will either cause dma clients to be unduly complicated, or will
> > require rewriting.
> >
>
>  I think there's a disconnect between what you describe and what the current
> code does.  I think there's a very simple solution, let's start with the
> map() api.  I'm convinced that the virtio support for it will be trivial and
> that virtio will not benefit from the dma.c api proposed.  I'm willing to do
> the virtio map implementation to demonstrate this.  Let's see two
> implementations that use the dma.c api before we commit to it.  I'd like to
> see at least a network device and a block device.  I don't believe network
> devices will benefit from it because they don't support partial submissions.
>
>  I like any API that reduces duplicate code.  It's easy to demonstrate that
> with patches.  Based on my current understanding of the API and what I
> expect from the devices using it, I don't believe the API will actually do
> that.  It's quite easy to prove me wrong though.

I changed the ESP SCSI and Lance Ethernet devices on Sparc32 to resolve the
IO address to physical memory (see patch). ESP works (no zero copy yet),
Lance doesn't. It looks much better. Because the resolution is performed in
serial steps, unbounded IO vector allocation does not happen, but we can
still launch as many IOs as there are free IO vectors.
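
Roughly, the resolution loop looks like this (a simplified sketch, not the
actual patch; resolve_chunk() and the constants are placeholders I made up):

    #include <stdint.h>
    #include <stddef.h>
    #include <sys/uio.h>

    #define MAX_IOV 64          /* fixed pool, so the vector stays bounded */

    /* placeholder for the per-device resolver: maps one contiguous chunk
     * starting at io_addr, shrinking *plen to what it could map, or
     * returns NULL on an access violation */
    void *resolve_chunk(uint64_t io_addr, size_t *plen);

    static int build_iov(uint64_t addr, size_t len, struct iovec *iov)
    {
        int n = 0;

        while (len > 0 && n < MAX_IOV) {
            size_t chunk = len;
            void *host = resolve_chunk(addr, &chunk);

            if (!host) {
                break;          /* stop at the bad page */
            }
            iov[n].iov_base = host;
            iov[n].iov_len = chunk;
            n++;
            addr += chunk;
            len -= chunk;
        }
        return n;               /* number of entries actually resolved */
    }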

There are still some issues I'm not happy with yet:
- handling of access violations: resolution should stop before the bad
page, the transfers up to that point should be performed, and only then
should an error be posted.
- the bounce buffers needed for Lance byte swapping are not well designed
(they currently live on the stack).

This led me to the thought that maybe we should not hide the bounce
buffer activity, but instead make it explicit for the device that needs
bouncing. For devices that don't need it, the buffering or lack of it
should be opaque.

Also, the virtual-to-physical address resolution API could be generic,
i.e. all resolver functions should take the same parameters, so that a
device would not need to know about the next higher-level device.
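
For example, every level could expose the same kind of callback (again
just an illustration; none of these names exist in the tree):

    #include <stdint.h>
    #include <stddef.h>

    /* uniform resolver signature: an IOMMU, a bus or the machine itself
     * would each register one of these, so a device only ever talks to
     * its immediate parent through a single type */
    typedef void *(*dma_resolve_fn)(void *opaque,   /* next level up           */
                                    uint64_t addr,  /* bus/IO address          */
                                    size_t *len,    /* in: wanted, out: mapped */
                                    int is_write);

    typedef struct DMAParent {
        dma_resolve_fn resolve;
        void *opaque;
    } DMAParent;

    /* a device makes the same call no matter what sits above it */
    static void *device_dma_map(DMAParent *p, uint64_t addr,
                                size_t *len, int is_write)
    {
        return p->resolve(p->opaque, addr, len, is_write);
    }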

[-- Attachment #2: translation-api-sparc.diff --]
[-- Type: plain/text, Size: 13400 bytes --]
