From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: joro@8bytes.org, blauwirbel@gmail.com, paul@codesourcery.com,
	avi@redhat.com, anthony@codemonkey.ws, av1474@comtv.ru,
	yamahata@valinux.co.jp, kvm@vger.kernel.org,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 4/7] ide: use the PCI memory access interface
Date: Thu, 2 Sep 2010 18:31:39 +0300
Message-ID: <20100902153139.GB18182@redhat.com>
In-Reply-To: <20100902150135.GA7136@localhost>

On Thu, Sep 02, 2010 at 06:01:35PM +0300, Eduard - Gabriel Munteanu wrote:
> On Thu, Sep 02, 2010 at 12:58:13PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Sep 02, 2010 at 12:12:00PM +0300, Eduard - Gabriel Munteanu wrote:
> > > On Thu, Sep 02, 2010 at 08:19:11AM +0300, Michael S. Tsirkin wrote:
> > > > On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
> > 
> > I don't insist on this solution, but what other way do you propose to
> > avoid the overhead for everyone not using an iommu?
> > I'm all for a solution that would help iommu as well,
> > but one wasn't yet proposed.
> > 
> 
> Hm, we could get even better performance by simply making the IOMMU a
> compile-time option. It also avoids problems in case some device hasn't
> been converted yet, and involves little to no tradeoffs. What do you
> think?
> 
> AFAICT, there are few uses for the IOMMU besides development and
> avantgarde stuff, as you note. So distributions can continue supplying
> prebuilt QEMU/KVM packages compiled with the IOMMU turned off for the
> time being. The only practical (commercial) use right now would be in
> the case of private virtual servers, which could be divided further into
> nested guests (though real IOMMU hardware isn't widespread yet).
> 
> Blue Swirl, in the light of this, do you agree on making it a
> compile-time option?
> 
> > > > >  static inline IDEState *idebus_active_if(IDEBus *bus)
> > > > >  {
> > > > >      return bus->ifs + bus->unit;
> > > > > diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> > > > > index bd1c73e..962ae13 100644
> > > > > --- a/hw/ide/macio.c
> > > > > +++ b/hw/ide/macio.c
> > > > > @@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
> > > > >  
> > > > >      s->io_buffer_size = io->len;
> > > > >  
> > > > > -    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > > +    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > > >      qemu_sglist_add(&s->sg, io->addr, io->len);
> > > > >      io->addr += io->len;
> > > > >      io->len = 0;
> > > > > @@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
> > > > >      s->io_buffer_index = 0;
> > > > >      s->io_buffer_size = io->len;
> > > > >  
> > > > > -    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > > +    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > > >      qemu_sglist_add(&s->sg, io->addr, io->len);
> > > > >      io->addr += io->len;
> > > > >      io->len = 0;
> > > > > diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> > > > > index 4d95cc5..5879044 100644
> > > > > --- a/hw/ide/pci.c
> > > > > +++ b/hw/ide/pci.c
> > > > > @@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
> > > > >              continue;
> > > > >          ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
> > > > >      }
> > > > > +
> > > > > +    for (i = 0; i < 2; i++) {
> > > > > +        d->bmdma[i].rw = (void *) pci_memory_rw;
> > > > > +        d->bmdma[i].map = (void *) pci_memory_map;
> > > > > +        d->bmdma[i].unmap = (void *) pci_memory_unmap;
> > > > > +        d->bmdma[i].opaque = dev;
> > > > > +    }
> > > > >  }
> > > > 
> > > > These casts show something is wrong with the API, IMO.
> > > > 
> > > 
> > > Hm, here's an oversight on my part: I think I should provide explicit
> > > bmdma hooks, since pcibus_t is a uint64_t and target_phys_addr_t is a
> > > uint{32,64}_t depending on the guest machine, so it might be buggy on
> > > 32-bit wrt calling conventions. But that introduces yet another
> > > non-inlined function call :-(. That would drop the (void *) cast,
> > > though.
> > > 
> > > 
> > > 	Eduard
> > 
> > So we get away with it without casts but only because C compiler
> > will let us silently convert the types, possibly discarding
> > data in the process. Or we'll add a check that will try and detect
> > this, but there's no good way to report a DMA error to user.
> > IOW, if our code only works because target fits in pcibus, what good
> > is the abstraction and using distinct types?
> > 
> > This is why I think we need a generic DMA API using dma addresses.
> 
> The API was made so that it doesn't report errors. That's because the
> PCI bus doesn't provide any possibility of doing so (real devices can't
> retry transfers in case an I/O page fault occurs).

This is what I am saying. We can't deal with errors.
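
To make the point concrete, here is a minimal sketch of the silent
narrowing discussed above. It is not QEMU code: demo_bus_addr_t and
demo_target_addr_t are hypothetical stand-ins for a 64-bit bus address
and a 32-bit target address, and all function names are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 64-bit bus address and a 32-bit target address. */
typedef uint64_t demo_bus_addr_t;
typedef uint32_t demo_target_addr_t;

/* Compiles without a cast; the upper 32 bits are silently dropped. */
static demo_target_addr_t demo_narrow(demo_bus_addr_t addr)
{
    return addr;
}

/* The check itself is easy to write. */
static bool demo_fits(demo_bus_addr_t addr)
{
    return addr <= UINT32_MAX;
}

/* But with no error path, all a checked variant can do is abort. */
static demo_target_addr_t demo_narrow_checked(demo_bus_addr_t addr)
{
    assert(demo_fits(addr));
    return (demo_target_addr_t)addr;
}
```

With these definitions demo_narrow(0x100001000ULL) yields 0x1000, and
demo_narrow_checked() on the same value can only trip the assertion.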

> In my previous generic IOMMU layer implementation pci_memory_*()
> returned non-zero on failure, but I decided to drop it when switching to
> a PCI-only (rather a PCI-specific) approach.
> 
> In case target_phys_addr_t no longer fits in pcibus_t by a simple
> implicit conversion, those explicit bmdma hooks I was going to add will
> do the necessary conversions.
> 
> The idea of using distinct types is two-fold: let the programmer know
> not to rely on them being the same thing, and let the compiler prevent
> him from shooting himself in the foot (like I did). Even if there is a
> dma_addr_t, some piece of code still needs to provide glue and
> conversion between the DMA code and bus-specific code.
> 
> 
> 	Eduard

Nothing I see here is bus-specific, really. Without an iommu, the
addresses that make sense are target addresses; with an iommu, whatever
the iommu supports. So make the iommu work with dma_addr_t and do the
conversion.
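
Roughly, and only as a sketch under stated assumptions: every name below
(demo_dma_addr_t, DemoDMAContext, demo_dma_translate, the toy iommu) is
hypothetical and not an existing QEMU interface. The hook has an exact
function-pointer type, so no (void *) cast is needed, and the no-iommu
case is an identity conversion. The int return value is an assumption
too; whether a fault can be reported at all is the open question in
this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, not the QEMU API. */
typedef uint64_t demo_dma_addr_t;     /* address as the device sees it */
typedef uint64_t demo_target_addr_t;  /* guest-physical address */

/* Translation hook: returns 0 and fills *out on success, -1 on a fault. */
typedef int (*demo_translate_fn)(void *opaque, demo_dma_addr_t in,
                                 demo_target_addr_t *out);

typedef struct {
    demo_translate_fn translate;  /* NULL means no iommu: identity mapping */
    void *opaque;
} DemoDMAContext;

/* Single conversion point between DMA addresses and target addresses. */
static int demo_dma_translate(const DemoDMAContext *ctx, demo_dma_addr_t in,
                              demo_target_addr_t *out)
{
    assert(out != NULL);
    if (ctx == NULL || ctx->translate == NULL) {
        *out = (demo_target_addr_t)in;  /* same width here, nothing is lost */
        return 0;
    }
    return ctx->translate(ctx->opaque, in, out);
}

/* Toy iommu for illustration: fixed offset, fault at or above a limit. */
typedef struct {
    demo_dma_addr_t limit;
    demo_target_addr_t base;
} DemoToyIommu;

static int demo_toy_translate(void *opaque, demo_dma_addr_t in,
                              demo_target_addr_t *out)
{
    const DemoToyIommu *mmu = opaque;

    if (in >= mmu->limit) {
        return -1;
    }
    *out = mmu->base + in;
    return 0;
}
```

A device would hold a DemoDMAContext and call demo_dma_translate()
before touching memory; with translate left NULL the extra cost is one
pointer test.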

-- 
MST

