All of lore.kernel.org
 help / color / mirror / Atom feed
From: Blue Swirl <blauwirbel@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>,
	joro@8bytes.org, paul@codesourcery.com, avi@redhat.com,
	anthony@codemonkey.ws, av1474@comtv.ru, yamahata@valinux.co.jp,
	kvm@vger.kernel.org, qemu-devel@nongnu.org
Subject: Re: [PATCH 2/7] pci: memory access API and IOMMU support
Date: Sat, 4 Sep 2010 09:01:06 +0000	[thread overview]
Message-ID: <AANLkTik+NJB7dC3GMcsrX98EM_VfMYb=Txj3sTaGE33T@mail.gmail.com> (raw)
In-Reply-To: <20100902094947.GA9085@redhat.com>

On Thu, Sep 2, 2010 at 9:49 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Sep 02, 2010 at 11:40:58AM +0300, Eduard - Gabriel Munteanu wrote:
>> On Thu, Sep 02, 2010 at 08:28:26AM +0300, Michael S. Tsirkin wrote:
>> > On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
>> > > PCI devices should access memory through pci_memory_*() instead of
>> > > cpu_physical_memory_*(). This also provides support for translation and
>> > > access checking in case an IOMMU is emulated.
>> > >
>> > > Memory maps are treated as remote IOTLBs (that is, translation caches
>> > > belonging to the IOMMU-aware device itself). Clients (devices) must
>> > > provide callbacks for map invalidation in case these maps are
>> > > persistent beyond the current I/O context, e.g. AIO DMA transfers.
>> > >
>> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>> >
>> >
>> > I am concerned about adding more pointer chaising on data path.
>> > Could we have
>> > 1. an iommu pointer in a device, inherited by secondary buses
>> >    when they are created and by devices from buses when they are attached.
>> > 2. translation pointer in the iommu instead of the bus
>>
>> The first solution I proposed was based on qdev, that is, each
>> DeviceState had an 'iommu' field. Translation would be done by
>> recursively looking in the parent bus/devs for an IOMMU.
>>
>> But Anthony said we're better off with bus-specific APIs, mostly because
>> (IIRC) there may be different types of addresses and it might be
>> difficult to abstract those properly.
>
> Well we ended up with casting
> away types to make pci callbacks fit in ide structure,
> and silently assuming that all addresses are in fact 64 bit.
> So maybe it's hard to abstract addresses properly, but
> it appears we'll have to, to avoid even worse problems.
>
>> I suppose I could revisit the idea by integrating the IOMMU in a
>> PCIDevice as opposed to a DeviceState.
>>
>> Anthony, Paul, any thoughts on this?
>
> Just to clarify: this is an optimization idea:
> instead of a bus walk on each access, do the walk
> when device is attached to the bus, and copy the iommu
> from the root to the device itself.
>
> This will also make it possible to create
> DMADeviceState structure which would have this iommu field,
> and we'd use this structure instead of the void pointers all over.
>
>
>> > 3. pci_memory_XX functions inline, doing fast path for non-iommu case:
>> >
>> >     if (__builtin_expect(!dev->iommu, 1)
>> >             return cpu_memory_rw
>>
>> But isn't this some sort of 'if (likely(!dev->iommu))' from the Linux
>> kernel? If so, it puts the IOMMU-enabled case at disadvantage.
>
> IOMMU has a ton of indirections anyway.
>
>> I suppose most emulated systems would have at least some theoretical
>> reasons to enable the IOMMU, e.g. as a GART replacement (say for 32-bit
>> PCI devices) or for userspace drivers.
>> So there are reasons to enable
>> the IOMMU even when you don't have a real host IOMMU and you're not
>> using nested guests.
>
> The time most people enable iommu for all devices in both real and virtualized
> systems appears distant, one of the reasons is because it has a lot of overhead.
> Let's start with not adding overhead for existing users, makes sense?

I think the goal architecture (not for IOMMU, but in general) is one
with zero copy DMA. This means we have stage one where the addresses
are translated to host pointers and stage two where the read/write
operation happens. The devices need to be converted.

Now, let's consider the IOMMU in this zero copy architecture. It's one
stage of address translation, for the access operation it will not
matter. We can add translation caching at device level (or even at any
intermediate levels), but that needs a cache invalidation callback
system as discussed earlier. This can be implemented later, we need
the zero copy stuff first.

Given this overall picture, I think eliminating some pointer
dereference overheads in non-zero copy architecture is a very
premature optimization and it may even direct the architecture to
wrong direction.

If the performance degradation at this point is not acceptable, we
could also postpone merging IOMMU until zero copy conversion has
happened, or make IOMMU a compile time option. But it would be nice to
back the decision by performance figures.

WARNING: multiple messages have this Message-ID (diff)
From: Blue Swirl <blauwirbel@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: kvm@vger.kernel.org, joro@8bytes.org, qemu-devel@nongnu.org,
	yamahata@valinux.co.jp, avi@redhat.com,
	Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>,
	paul@codesourcery.com
Subject: [Qemu-devel] Re: [PATCH 2/7] pci: memory access API and IOMMU support
Date: Sat, 4 Sep 2010 09:01:06 +0000	[thread overview]
Message-ID: <AANLkTik+NJB7dC3GMcsrX98EM_VfMYb=Txj3sTaGE33T@mail.gmail.com> (raw)
In-Reply-To: <20100902094947.GA9085@redhat.com>

On Thu, Sep 2, 2010 at 9:49 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Sep 02, 2010 at 11:40:58AM +0300, Eduard - Gabriel Munteanu wrote:
>> On Thu, Sep 02, 2010 at 08:28:26AM +0300, Michael S. Tsirkin wrote:
>> > On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
>> > > PCI devices should access memory through pci_memory_*() instead of
>> > > cpu_physical_memory_*(). This also provides support for translation and
>> > > access checking in case an IOMMU is emulated.
>> > >
>> > > Memory maps are treated as remote IOTLBs (that is, translation caches
>> > > belonging to the IOMMU-aware device itself). Clients (devices) must
>> > > provide callbacks for map invalidation in case these maps are
>> > > persistent beyond the current I/O context, e.g. AIO DMA transfers.
>> > >
>> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>> >
>> >
>> > I am concerned about adding more pointer chaising on data path.
>> > Could we have
>> > 1. an iommu pointer in a device, inherited by secondary buses
>> >    when they are created and by devices from buses when they are attached.
>> > 2. translation pointer in the iommu instead of the bus
>>
>> The first solution I proposed was based on qdev, that is, each
>> DeviceState had an 'iommu' field. Translation would be done by
>> recursively looking in the parent bus/devs for an IOMMU.
>>
>> But Anthony said we're better off with bus-specific APIs, mostly because
>> (IIRC) there may be different types of addresses and it might be
>> difficult to abstract those properly.
>
> Well we ended up with casting
> away types to make pci callbacks fit in ide structure,
> and silently assuming that all addresses are in fact 64 bit.
> So maybe it's hard to abstract addresses properly, but
> it appears we'll have to, to avoid even worse problems.
>
>> I suppose I could revisit the idea by integrating the IOMMU in a
>> PCIDevice as opposed to a DeviceState.
>>
>> Anthony, Paul, any thoughts on this?
>
> Just to clarify: this is an optimization idea:
> instead of a bus walk on each access, do the walk
> when device is attached to the bus, and copy the iommu
> from the root to the device itself.
>
> This will also make it possible to create
> DMADeviceState structure which would have this iommu field,
> and we'd use this structure instead of the void pointers all over.
>
>
>> > 3. pci_memory_XX functions inline, doing fast path for non-iommu case:
>> >
>> >     if (__builtin_expect(!dev->iommu, 1)
>> >             return cpu_memory_rw
>>
>> But isn't this some sort of 'if (likely(!dev->iommu))' from the Linux
>> kernel? If so, it puts the IOMMU-enabled case at disadvantage.
>
> IOMMU has a ton of indirections anyway.
>
>> I suppose most emulated systems would have at least some theoretical
>> reasons to enable the IOMMU, e.g. as a GART replacement (say for 32-bit
>> PCI devices) or for userspace drivers.
>> So there are reasons to enable
>> the IOMMU even when you don't have a real host IOMMU and you're not
>> using nested guests.
>
> The time most people enable iommu for all devices in both real and virtualized
> systems appears distant, one of the reasons is because it has a lot of overhead.
> Let's start with not adding overhead for existing users, makes sense?

I think the goal architecture (not for IOMMU, but in general) is one
with zero copy DMA. This means we have stage one where the addresses
are translated to host pointers and stage two where the read/write
operation happens. The devices need to be converted.

Now, let's consider the IOMMU in this zero copy architecture. It's one
stage of address translation, for the access operation it will not
matter. We can add translation caching at device level (or even at any
intermediate levels), but that needs a cache invalidation callback
system as discussed earlier. This can be implemented later, we need
the zero copy stuff first.

Given this overall picture, I think eliminating some pointer
dereference overheads in non-zero copy architecture is a very
premature optimization and it may even direct the architecture to
wrong direction.

If the performance degradation at this point is not acceptable, we
could also postpone merging IOMMU until zero copy conversion has
happened, or make IOMMU a compile time option. But it would be nice to
back the decision by performance figures.

  reply	other threads:[~2010-09-04  9:01 UTC|newest]

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-28 14:54 [PATCH 0/7] AMD IOMMU emulation patchset v4 Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 1/7] pci: expand tabs to spaces in pci_regs.h Eduard - Gabriel Munteanu
2010-08-28 14:54   ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-31 20:29   ` Michael S. Tsirkin
2010-08-31 20:29     ` [Qemu-devel] " Michael S. Tsirkin
2010-08-31 22:58     ` Eduard - Gabriel Munteanu
2010-08-31 22:58       ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-01 10:39       ` Michael S. Tsirkin
2010-09-01 10:39         ` [Qemu-devel] " Michael S. Tsirkin
2010-08-28 14:54 ` [PATCH 2/7] pci: memory access API and IOMMU support Eduard - Gabriel Munteanu
2010-08-28 14:54   ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-02  5:28   ` Michael S. Tsirkin
2010-09-02  5:28     ` [Qemu-devel] " Michael S. Tsirkin
2010-09-02  8:40     ` Eduard - Gabriel Munteanu
2010-09-02  8:40       ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-02  9:49       ` Michael S. Tsirkin
2010-09-02  9:49         ` [Qemu-devel] " Michael S. Tsirkin
2010-09-04  9:01         ` Blue Swirl [this message]
2010-09-04  9:01           ` Blue Swirl
2010-09-05  7:10           ` Michael S. Tsirkin
2010-09-05  7:10             ` [Qemu-devel] " Michael S. Tsirkin
2010-08-28 14:54 ` [PATCH 3/7] AMD IOMMU emulation Eduard - Gabriel Munteanu
2010-08-28 14:54   ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-28 15:58   ` Blue Swirl
2010-08-28 15:58     ` [Qemu-devel] " Blue Swirl
2010-08-28 21:53     ` Eduard - Gabriel Munteanu
2010-08-28 21:53       ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-29 20:37       ` Blue Swirl
2010-08-29 20:37         ` [Qemu-devel] " Blue Swirl
2010-08-30  3:07   ` [Qemu-devel] " Isaku Yamahata
2010-08-30  3:07     ` Isaku Yamahata
2010-08-30  5:54     ` Eduard - Gabriel Munteanu
2010-08-30  5:54       ` Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 4/7] ide: use the PCI memory access interface Eduard - Gabriel Munteanu
2010-08-28 14:54   ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-02  5:19   ` Michael S. Tsirkin
2010-09-02  5:19     ` [Qemu-devel] " Michael S. Tsirkin
2010-09-02  9:12     ` Eduard - Gabriel Munteanu
2010-09-02  9:12       ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-02  9:58       ` Michael S. Tsirkin
2010-09-02  9:58         ` [Qemu-devel] " Michael S. Tsirkin
2010-09-02 15:01         ` Eduard - Gabriel Munteanu
2010-09-02 15:01           ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-02 15:24           ` Avi Kivity
2010-09-02 15:24             ` [Qemu-devel] " Avi Kivity
2010-09-02 15:39             ` Michael S. Tsirkin
2010-09-02 15:39               ` [Qemu-devel] " Michael S. Tsirkin
2010-09-02 16:07               ` Avi Kivity
2010-09-02 16:07                 ` [Qemu-devel] " Avi Kivity
2010-09-02 15:31           ` Michael S. Tsirkin
2010-09-02 15:31             ` [Qemu-devel] " Michael S. Tsirkin
2010-08-28 14:54 ` [PATCH 5/7] rtl8139: " Eduard - Gabriel Munteanu
2010-08-28 14:54   ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 6/7] eepro100: " Eduard - Gabriel Munteanu
2010-08-28 14:54   ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-28 14:54 ` [PATCH 7/7] ac97: " Eduard - Gabriel Munteanu
2010-08-28 14:54   ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-28 16:00 ` [PATCH 0/7] AMD IOMMU emulation patchset v4 Blue Swirl
2010-08-28 16:00   ` [Qemu-devel] " Blue Swirl
2010-08-29  9:55   ` Joerg Roedel
2010-08-29  9:55     ` [Qemu-devel] " Joerg Roedel
2010-08-29 20:44     ` Blue Swirl
2010-08-29 20:44       ` [Qemu-devel] " Blue Swirl
2010-08-29 22:08       ` [PATCH 2/7] pci: memory access API and IOMMU support Eduard - Gabriel Munteanu
2010-08-29 22:08         ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-08-29 22:11         ` Eduard - Gabriel Munteanu
2010-08-29 22:11           ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-01 20:10         ` [Qemu-devel] " Stefan Weil
2010-09-01 20:10           ` Stefan Weil
2010-09-02  6:00           ` Michael S. Tsirkin
2010-09-02  6:00             ` Michael S. Tsirkin
2010-09-02  9:08             ` Eduard - Gabriel Munteanu
2010-09-02  9:08               ` Eduard - Gabriel Munteanu
2010-09-02 13:24               ` Anthony Liguori
2010-09-02 13:24                 ` Anthony Liguori
2010-09-02  8:51           ` Eduard - Gabriel Munteanu
2010-09-02  8:51             ` Eduard - Gabriel Munteanu
2010-09-02 16:05             ` Stefan Weil
2010-09-02 16:05               ` Stefan Weil
2010-09-02 16:14               ` Eduard - Gabriel Munteanu
2010-09-02 16:14                 ` Eduard - Gabriel Munteanu
2010-09-13 20:01 ` [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4) Michael S. Tsirkin
2010-09-13 20:01   ` [Qemu-devel] " Michael S. Tsirkin
2010-09-13 20:45   ` Anthony Liguori
2010-09-13 20:45     ` Anthony Liguori
2010-09-16  7:12     ` Eduard - Gabriel Munteanu
2010-09-16  7:12       ` Eduard - Gabriel Munteanu
2010-09-16  9:35     ` Michael S. Tsirkin
2010-09-16  9:35       ` Michael S. Tsirkin
2010-09-16  7:06   ` Eduard - Gabriel Munteanu
2010-09-16  7:06     ` [Qemu-devel] " Eduard - Gabriel Munteanu
2010-09-16  9:20     ` Michael S. Tsirkin
2010-09-16  9:20       ` [Qemu-devel] " Michael S. Tsirkin
2010-09-16 11:15       ` Eduard - Gabriel Munteanu
2010-09-16 11:15         ` [Qemu-devel] " Eduard - Gabriel Munteanu
  -- strict thread matches above, loose matches on Subject: below --
2010-08-15 19:27 [PATCH 0/7] AMD IOMMU emulation patches v3 Eduard - Gabriel Munteanu
2010-08-15 19:27 ` [PATCH 2/7] pci: memory access API and IOMMU support Eduard - Gabriel Munteanu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='AANLkTik+NJB7dC3GMcsrX98EM_VfMYb=Txj3sTaGE33T@mail.gmail.com' \
    --to=blauwirbel@gmail.com \
    --cc=anthony@codemonkey.ws \
    --cc=av1474@comtv.ru \
    --cc=avi@redhat.com \
    --cc=eduard.munteanu@linux360.ro \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=paul@codesourcery.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yamahata@valinux.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.