From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: linuxppc-dev@lists.ozlabs.org
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Gavin Shan <gwshan@linux.vnet.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Alexander Graf <agraf@suse.de>,
	Alexander Gordeev <agordeev@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH v3 00/24] powerpc/iommu/vfio: Enable Dynamic DMA windows
Date: Thu, 29 Jan 2015 20:21:41 +1100
Message-ID: <1422523325-1389-1-git-send-email-aik@ozlabs.ru>

This enables the PAPR-defined feature called Dynamic DMA windows (DDW).

Each Partitionable Endpoint (IOMMU group) has a separate DMA window on
a PCI bus where devices are allowed to perform DMA. By default there is
a 1 or 2GB window allocated at host boot time, and these windows are
used when an IOMMU group is passed to userspace (the guest). These
windows are mapped at zero offset on a PCI bus.

High-speed devices may suffer from the limited size of this window. On
the host side, a TCE bypass mode is enabled on POWER8 CPUs; it
implements direct mapping of the host memory to a PCI bus at 1<<59.

For the guest, PAPR defines a DDW RTAS API which allows the pseries
guest to ask the hypervisor whether it supports DDW and what the
parameters of possible windows are.

Currently POWER8 supports 2 DMA windows per PE - the small 32bit window
mentioned above, which is already in use, and a 64bit window which can
only start from 1<<59 and can support various page sizes.

This patchset reworks the PPC IOMMU code and adds the structures
necessary to extend it to support big windows.

When the guest detects the feature and the PE is capable of 64bit DMA,
it:
1. queries the hypervisor about the number of available windows and
   the page masks;
2. creates a window with the biggest possible page size (current guests
   can do 64K or 16MB TCEs);
3. maps the entire guest RAM via H_PUT_TCE* hypercalls;
4. switches dma_ops to direct_dma_ops on the selected PE.

Once this is done, H_PUT_TCE is not called anymore and the guest gets
maximum performance.

Changes:
v3:
* (!) redesigned the whole thing
* multiple IOMMU groups per PHB -> one PHB is needed for VFIO in the
  guest -> no problems with locked_vm counting; also we save memory on
  actual tables
* guest RAM preregistration is required for DDW
* PEs (IOMMU groups) are passed to VFIO with no DMA windows at all so
  we do not bother with iommu_table::it_map anymore
* added multilevel TCE tables support to support really huge guests

v2:
* added missing __pa() in "powerpc/powernv: Release replaced TCE"
* reposted to make some noise

Alexey Kardashevskiy (24):
  vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU
    driver
  vfio: powerpc/iommu: Check that TCE page size is equal to
    it_page_size
  powerpc/powernv: Do not set "read" flag if direction==DMA_NONE
  vfio: powerpc/spapr: Use it_page_size
  vfio: powerpc/spapr: Move locked_vm accounting to helpers
  powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table
  powerpc/iommu: Introduce iommu_table_alloc() helper
  powerpc/spapr: vfio: Switch from iommu_table to new powerpc_iommu
  powerpc/iommu: Fix IOMMU ownership control functions
  powerpc/powernv/ioda2: Rework IOMMU ownership control
  powerpc/powernv/ioda/ioda2: Rework tce_build()/tce_free()
  powerpc/iommu/powernv: Release replaced TCE
  powerpc/pseries/lpar: Enable VFIO
  vfio: powerpc/spapr: Register memory
  poweppc/powernv/ioda2: Rework iommu_table creation
  powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_create_table
  powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window
  powerpc/iommu: Split iommu_free_table into 2 helpers
  powerpc/powernv: Implement multilevel TCE tables
  powerpc/powernv: Change prototypes to receive iommu
  powerpc/powernv/ioda: Define and implement DMA table/window
    management callbacks
  powerpc/iommu: Get rid of ownership helpers
  vfio/spapr: Enable multiple groups in a container
  vfio: powerpc/spapr: Support Dynamic DMA windows

 arch/powerpc/include/asm/iommu.h            | 107 +++-
 arch/powerpc/include/asm/machdep.h          |  25 -
 arch/powerpc/kernel/eeh.c                   |   2 +-
 arch/powerpc/kernel/iommu.c                 | 282 +++------
 arch/powerpc/kernel/vio.c                   |   5 +
 arch/powerpc/platforms/cell/iommu.c         |   8 +-
 arch/powerpc/platforms/pasemi/iommu.c       |   7 +-
 arch/powerpc/platforms/powernv/pci-ioda.c   | 470 ++++++++++++---
 arch/powerpc/platforms/powernv/pci-p5ioc2.c |  21 +-
 arch/powerpc/platforms/powernv/pci.c        | 130 +++--
 arch/powerpc/platforms/powernv/pci.h        |  14 +-
 arch/powerpc/platforms/pseries/iommu.c      |  99 +++-
 arch/powerpc/sysdev/dart_iommu.c            |  12 +-
 drivers/vfio/vfio_iommu_spapr_tce.c         | 874 ++++++++++++++++++++++++----
 include/uapi/linux/vfio.h                   |  53 +-
 15 files changed, 1584 insertions(+), 525 deletions(-)

-- 
2.0.0
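The arithmetic behind steps 2 and 3 of the guest-side sequence in the cover letter can be illustrated with a small sketch. It is not code from the patchset: the function names, the DDW_* constants, and the bit layout of the page-size mask are assumptions made for illustration, as is the 8-byte size of a TCE. Only the window start at 1<<59 and the 64K/16MB page sizes come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of the page-size mask returned by the DDW query. */
#define DDW_PGSIZE_4K   0x1u
#define DDW_PGSIZE_64K  0x2u
#define DDW_PGSIZE_16M  0x4u

/* Start of the 64bit window on the PCI bus, per the cover letter. */
#define DDW_WINDOW_START (1ULL << 59)

/* Step 2: choose the biggest page size offered; returns 0 if none. */
static inline unsigned int ddw_pick_page_shift(uint32_t page_mask)
{
	if (page_mask & DDW_PGSIZE_16M)
		return 24;
	if (page_mask & DDW_PGSIZE_64K)
		return 16;
	if (page_mask & DDW_PGSIZE_4K)
		return 12;
	return 0;
}

/* Step 3: number of TCEs needed to map ram_bytes, rounded up. */
static inline uint64_t ddw_tce_count(uint64_t ram_bytes,
				     unsigned int page_shift)
{
	uint64_t page_size;

	assert(page_shift >= 12 && page_shift < 59);
	page_size = 1ULL << page_shift;
	return (ram_bytes + page_size - 1) >> page_shift;
}

/* Size of a flat, single-level table, assuming 8 bytes per TCE. */
static inline uint64_t ddw_table_bytes(uint64_t ram_bytes,
				       unsigned int page_shift)
{
	return ddw_tce_count(ram_bytes, page_shift) * sizeof(uint64_t);
}

/*
 * Bus address a device would use for a guest physical address once the
 * whole RAM is mapped linearly from the window start (assumed layout).
 */
static inline uint64_t ddw_bus_addr(uint64_t guest_phys)
{
	return DDW_WINDOW_START + guest_phys;
}
```

Under these assumptions, a 64GB guest needs 4096 TCEs with 16MB pages but 1048576 TCEs (an 8MB flat table) with 64K pages, which hints at why the biggest page size is preferred and why multilevel tables help for really huge guests.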
Thread overview: 70+ messages

2015-01-29  9:21 Alexey Kardashevskiy [this message]
2015-01-29  9:21 ` [PATCH v3 01/24] vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 02/24] vfio: powerpc/iommu: Check that TCE page size is equal to it_page_size Alexey Kardashevskiy
2015-02-02 21:45   ` Alex Williamson
2015-01-29  9:21 ` [PATCH v3 03/24] powerpc/powernv: Do not set "read" flag if direction==DMA_NONE Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 04/24] vfio: powerpc/spapr: Use it_page_size Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 05/24] vfio: powerpc/spapr: Move locked_vm accounting to helpers Alexey Kardashevskiy
2015-02-03  0:12   ` Alex Williamson
2015-01-29  9:21 ` [PATCH v3 06/24] powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 07/24] powerpc/iommu: Introduce iommu_table_alloc() helper Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 08/24] powerpc/spapr: vfio: Switch from iommu_table to new powerpc_iommu Alexey Kardashevskiy
2015-02-03  0:12   ` Alex Williamson
2015-02-04 13:32   ` Alexander Graf
2015-02-05  4:58   ` Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 09/24] powerpc/iommu: Fix IOMMU ownership control functions Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 10/24] powerpc/powernv/ioda2: Rework IOMMU ownership control Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 11/24] powerpc/powernv/ioda/ioda2: Rework tce_build()/tce_free() Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 12/24] powerpc/iommu/powernv: Release replaced TCE Alexey Kardashevskiy
2015-02-04  6:08   ` Paul Mackerras
2015-02-05  4:57   ` Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 13/24] powerpc/pseries/lpar: Enable VFIO Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 14/24] vfio: powerpc/spapr: Register memory Alexey Kardashevskiy
2015-02-03  0:11   ` Alex Williamson
2015-02-03  5:51   ` Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 15/24] poweppc/powernv/ioda2: Rework iommu_table creation Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 16/24] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_create_table Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 17/24] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window Alexey Kardashevskiy
2015-01-29  9:21 ` [PATCH v3 18/24] powerpc/iommu: Split iommu_free_table into 2 helpers Alexey Kardashevskiy
2015-01-29  9:22 ` [PATCH v3 19/24] powerpc/powernv: Implement multilevel TCE tables Alexey Kardashevskiy
2015-01-29  9:22 ` [PATCH v3 20/24] powerpc/powernv: Change prototypes to receive iommu Alexey Kardashevskiy
2015-01-29  9:22 ` [PATCH v3 21/24] powerpc/powernv/ioda: Define and implement DMA table/window management callbacks Alexey Kardashevskiy
2015-01-29  9:22 ` [PATCH v3 22/24] powerpc/iommu: Get rid of ownership helpers Alexey Kardashevskiy
2015-01-29  9:22 ` [PATCH v3 23/24] vfio/spapr: Enable multiple groups in a container Alexey Kardashevskiy
2015-01-29  9:22 ` [PATCH v3 24/24] vfio: powerpc/spapr: Support Dynamic DMA windows Alexey Kardashevskiy
2015-02-03  2:53   ` Alex Williamson