From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: linuxppc-dev@lists.ozlabs.org
Cc: kvm@vger.kernel.org, Alexey Kardashevskiy <aik@ozlabs.ru>,
linux-kernel@vger.kernel.org,
Alex Williamson <alex.williamson@redhat.com>,
Paul Mackerras <paulus@samba.org>
Subject: [PATCH v5 00/29] powerpc/iommu/vfio: Enable Dynamic DMA windows
Date: Tue, 10 Mar 2015 01:06:56 +1100
Message-ID: <1425910045-26167-1-git-send-email-aik@ozlabs.ru>
This enables the sPAPR-defined feature called Dynamic DMA windows (DDW).
Each Partitionable Endpoint (IOMMU group) has an address range on a PCI bus
where devices are allowed to do DMA. These ranges are called DMA windows.
By default, there is a single DMA window, 1 or 2GB big, mapped at zero
on a PCI bus.
High-speed devices may suffer from the limited size of the window.
The recent host kernels use a TCE bypass window on POWER8 CPUs which implements
direct PCI bus address range mapping (with an offset of 1<<59) to the host memory.
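To illustrate the bypass mapping, here is a minimal sketch. It is not code from this patchset: the helper names are made up, and it assumes the bypass window simply adds the 1<<59 offset to the host physical address.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the bypass window on the PCI bus, as mentioned above. */
#define TCE_BYPASS_BASE (1ULL << 59)

/*
 * Hypothetical helper: bus address a device would use to reach
 * host_phys through the bypass window (assumes a plain offset).
 */
static inline uint64_t bypass_bus_addr(uint64_t host_phys)
{
	assert(host_phys < TCE_BYPASS_BASE);
	return TCE_BYPASS_BASE + host_phys;
}

/* Hypothetical helper: the reverse translation. */
static inline uint64_t bypass_host_addr(uint64_t bus_addr)
{
	assert(bus_addr >= TCE_BYPASS_BASE);
	return bus_addr - TCE_BYPASS_BASE;
}
```

With these assumptions, host address 0x1000 appears on the bus at 0x0800000000001000 and no TCE table lookup is involved.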
For guests, PAPR defines a DDW RTAS API which allows pseries guests
to query the hypervisor about DDW support and capabilities (page size mask
for now). A pseries guest may request an additional DMA window (on top of
the default one) using this RTAS API.
The existing pseries Linux guests request an additional window as big as
the guest RAM and map the entire guest RAM into it, which effectively creates
a direct mapping of the guest memory to a PCI bus.
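To give a feeling for what a window as big as the guest RAM costs, here is a rough sketch of the table size arithmetic. The helper names are hypothetical, and it assumes a single-level table with one 64-bit TCE per IOMMU page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one 8-byte TCE per IOMMU page, single-level table. */
#define TCE_ENTRY_SIZE 8ULL

/* Number of TCEs needed to cover window_size bytes (rounded up). */
static uint64_t tce_entries(uint64_t window_size, unsigned int page_shift)
{
	uint64_t page_size;

	assert(page_shift < 64);
	page_size = 1ULL << page_shift;
	return (window_size + page_size - 1) >> page_shift;
}

/* Memory taken by the TCE table itself, in bytes. */
static uint64_t tce_table_bytes(uint64_t window_size, unsigned int page_shift)
{
	return tce_entries(window_size, page_shift) * TCE_ENTRY_SIZE;
}
```

Under these assumptions a 64GB guest needs a 128MB table with 4K pages, 8MB with 64K pages and 32KB with 16M pages, which is why the biggest possible page size is preferred.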
The multiple DMA windows feature is supported by POWER7/POWER8 CPUs; however
this patchset only adds support for POWER8 as TCE tables are implemented
in POWER7 in a quite different way and POWER7 is not the highest priority.
This patchset reworks PPC64 IOMMU code and adds necessary structures
to support big windows.
Once a Linux guest discovers the presence of DDW, it does:
1. query hypervisor about number of available windows and page size masks;
2. create a window with the biggest possible page size (today 4K/64K/16M);
3. map the entire guest RAM via H_PUT_TCE* hypercalls;
4. switch dma_ops to direct_dma_ops on the selected PE.
Once this is done, H_PUT_TCE is not called anymore for 64bit devices and
the guest does not waste time on DMA map/unmap operations.
Note that 32bit devices won't use DDW and will keep using the default
DMA window so KVM optimizations will be required (to be posted later).
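Step 2 of the sequence above can be sketched as follows. This is only an illustration, not code from the patchset: biggest_page_shift() is a hypothetical helper, and it assumes the page size mask has bit n set when pages of 1<<n bytes are supported, which may not match the actual RTAS encoding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the biggest page size from a mask.
 * Assumption: bit n set means pages of (1 << n) bytes are supported.
 * Returns the page shift; the mask must not be empty.
 */
static int biggest_page_shift(uint64_t pgsize_mask)
{
	int shift;

	assert(pgsize_mask != 0);
	for (shift = 63; shift >= 0; shift--)
		if (pgsize_mask & (1ULL << shift))
			return shift;

	return -1; /* only reachable if asserts are compiled out */
}
```

With a mask advertising 4K, 64K and 16M pages this returns 24, so the window would be created with 16M pages.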
Changes:
v5:
* added SPAPR_TCE_IOMMU_v2 to tell the userspace that there is a memory
pre-registration feature
* added backward compatibility
* renamed a few things (mostly powerpc_iommu -> iommu_table_group)
v4:
* moved patches around to have VFIO and PPC patches separated as much as
possible
* now works with the existing upstream QEMU
v3:
* redesigned the whole thing
* multiple IOMMU groups per PHB -> one PHB is needed for VFIO in the guest ->
no problems with locked_vm counting; also we save memory on actual tables
* guest RAM preregistration is required for DDW
* PEs (IOMMU groups) are passed to VFIO with no DMA windows at all so
we do not bother with iommu_table::it_map anymore
* added multilevel TCE tables support to support really huge guests
v2:
* added missing __pa() in "powerpc/powernv: Release replaced TCE"
* reposted to make some noise
Alexey Kardashevskiy (29):
vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU
driver
vfio: powerpc/spapr: Do cleanup when releasing the group
vfio: powerpc/spapr: Check that TCE page size is equal to it_page_size
vfio: powerpc/spapr: Use it_page_size
vfio: powerpc/spapr: Move locked_vm accounting to helpers
vfio: powerpc/spapr: Disable DMA mappings on disabled container
vfio: powerpc/spapr: Moving pinning/unpinning to helpers
vfio: powerpc/spapr: Register memory
vfio: powerpc/spapr: Rework attach/detach
powerpc/powernv: Do not set "read" flag if direction==DMA_NONE
powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table
powerpc/iommu: Introduce iommu_table_alloc() helper
powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group
vfio: powerpc/spapr: powerpc/iommu: Rework IOMMU ownership control
vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework IOMMU ownership
control
powerpc/iommu: Fix IOMMU ownership control functions
powerpc/powernv/ioda/ioda2: Rework tce_build()/tce_free()
powerpc/iommu/powernv: Release replaced TCE
poweppc/powernv/ioda2: Rework iommu_table creation
powerpc/powernv/ioda2: Introduce
pnv_pci_ioda2_create_table/pnc_pci_free_table
powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window
powerpc/iommu: Split iommu_free_table into 2 helpers
powerpc/powernv: Implement multilevel TCE tables
powerpc/powernv: Change prototypes to receive iommu
powerpc/powernv/ioda: Define and implement DMA table/window management
callbacks
vfio: powerpc/spapr: Define v2 IOMMU
vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework ownership
vfio: powerpc/spapr: Support multiple groups in one container if
possible
vfio: powerpc/spapr: Support Dynamic DMA windows
Documentation/vfio.txt | 38 +
arch/powerpc/include/asm/iommu.h | 109 ++-
arch/powerpc/include/asm/machdep.h | 25 -
arch/powerpc/kernel/iommu.c | 327 +++++----
arch/powerpc/kernel/vio.c | 5 +
arch/powerpc/platforms/cell/iommu.c | 8 +-
arch/powerpc/platforms/pasemi/iommu.c | 7 +-
arch/powerpc/platforms/powernv/pci-ioda.c | 500 ++++++++++---
arch/powerpc/platforms/powernv/pci-p5ioc2.c | 34 +-
arch/powerpc/platforms/powernv/pci.c | 114 ++-
arch/powerpc/platforms/powernv/pci.h | 12 +-
arch/powerpc/platforms/pseries/iommu.c | 55 +-
arch/powerpc/sysdev/dart_iommu.c | 12 +-
drivers/vfio/vfio_iommu_spapr_tce.c | 1001 ++++++++++++++++++++++++---
include/uapi/linux/vfio.h | 51 +-
15 files changed, 1816 insertions(+), 482 deletions(-)
--
2.0.0
Thread overview: 43+ messages
2015-03-09 14:06 Alexey Kardashevskiy [this message]
2015-03-09 14:06 ` [PATCH v5 01/29] vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver Alexey Kardashevskiy
2015-03-09 14:06 ` [PATCH v5 02/29] vfio: powerpc/spapr: Do cleanup when releasing the group Alexey Kardashevskiy
2015-03-09 14:06 ` [PATCH v5 03/29] vfio: powerpc/spapr: Check that TCE page size is equal to it_page_size Alexey Kardashevskiy
2015-03-10 19:56 ` Alex Williamson
2015-03-10 22:57 ` Alexey Kardashevskiy
2015-03-10 23:03 ` Alex Williamson
2015-03-10 23:14 ` Benjamin Herrenschmidt
2015-03-10 23:34 ` Alex Williamson
2015-03-10 23:45 ` Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 04/29] vfio: powerpc/spapr: Use it_page_size Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 05/29] vfio: powerpc/spapr: Move locked_vm accounting to helpers Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 06/29] vfio: powerpc/spapr: Disable DMA mappings on disabled container Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 07/29] vfio: powerpc/spapr: Moving pinning/unpinning to helpers Alexey Kardashevskiy
2015-03-10 23:36 ` Alex Williamson
2015-03-09 14:07 ` [PATCH v5 08/29] vfio: powerpc/spapr: Register memory Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 09/29] vfio: powerpc/spapr: Rework attach/detach Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 10/29] powerpc/powernv: Do not set "read" flag if direction==DMA_NONE Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 11/29] powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 12/29] powerpc/iommu: Introduce iommu_table_alloc() helper Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 13/29] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 14/29] vfio: powerpc/spapr: powerpc/iommu: Rework IOMMU ownership control Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 15/29] vfio: powerpc/spapr: powerpc/powernv/ioda2: " Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 16/29] powerpc/iommu: Fix IOMMU ownership control functions Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 17/29] powerpc/powernv/ioda/ioda2: Rework tce_build()/tce_free() Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 18/29] powerpc/iommu/powernv: Release replaced TCE Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 19/29] poweppc/powernv/ioda2: Rework iommu_table creation Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 20/29] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_create_table/pnc_pci_free_table Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 21/29] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 22/29] powerpc/iommu: Split iommu_free_table into 2 helpers Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 23/29] powerpc/powernv: Implement multilevel TCE tables Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 24/29] powerpc/powernv: Change prototypes to receive iommu Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 25/29] powerpc/powernv/ioda: Define and implement DMA table/window management callbacks Alexey Kardashevskiy
2015-03-11 8:54 ` Alexey Kardashevskiy
2015-03-11 9:31 ` Benjamin Herrenschmidt
2015-03-09 14:07 ` [PATCH v5 26/29] vfio: powerpc/spapr: Define v2 IOMMU Alexey Kardashevskiy
2015-03-11 0:00 ` Alex Williamson
2015-03-09 14:07 ` [PATCH v5 27/29] vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework ownership Alexey Kardashevskiy
2015-03-11 0:09 ` Alex Williamson
2015-03-11 0:29 ` Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 28/29] vfio: powerpc/spapr: Support multiple groups in one container if possible Alexey Kardashevskiy
2015-03-09 14:07 ` [PATCH v5 29/29] vfio: powerpc/spapr: Support Dynamic DMA windows Alexey Kardashevskiy
2015-03-11 1:10 ` Alex Williamson