From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexey Kardashevskiy
Subject: [PATCH kernel v3 0/6] powerpc/powernv/iommu: Optimize memory use
Date: Wed, 4 Jul 2018 16:13:43 +1000
Message-ID: <20180704061349.20742-1-aik@ozlabs.ru>
To: linuxppc-dev@lists.ozlabs.org
Cc: kvm@vger.kernel.org, Alexey Kardashevskiy, kvm-ppc@vger.kernel.org, Alex Williamson, David Gibson
Sender: "Linuxppc-dev"
List-Id: kvm.vger.kernel.org

This patchset aims to reduce actual memory use for guests with sparse
memory. The pseries guest uses dynamic DMA windows to map the entire guest
RAM, but it only actually maps onlined memory, which may not be contiguous.
I hit this when I tried passing through the NVLink2-connected GPU RAM of
an NVIDIA V100; trying to map this RAM at the same offset as in the real
hardware forced me to rework how I handle these windows.

This moves the userspace-to-host-physical translation table
(iommu_table::it_userspace) from the VFIO TCE IOMMU subdriver to
the platform code and reuses the already existing multilevel TCE table
code which we have for the hardware tables. Finally, in 6/6 I switch to
on-demand allocation so we do not allocate huge chunks of the table
if we do not have to; there is some math in 6/6.

Changes:
v3:
* rebased on v4.18-rc3 and fixed compile error in 6/6

v2:
* bugfix and error handling in 6/6

This is based on sha1
021c917 Linus Torvalds "Linux 4.18-rc3".

Please comment. Thanks.
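For readers unfamiliar with the idea, the on-demand allocation of indirect levels mentioned above can be illustrated with a small userspace sketch. This is not the code from 6/6: the names (sparse_table, sparse_table_entry, LEVEL_BITS) and the two-level, 512-entries-per-level layout are hypothetical, and the real tables hold big-endian TCEs and are allocated with kernel page allocators rather than calloc(). The sketch only shows the general technique of leaving a leaf chunk unallocated until something is actually mapped in it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical geometry: two levels of 512 entries each. */
#define LEVEL_BITS    9u
#define LEVEL_ENTRIES (1u << LEVEL_BITS)

struct sparse_table {
	uint64_t **first;         /* first level: pointers to leaf chunks */
	size_t leaves_allocated;  /* number of leaf chunks currently backed */
};

/* Allocate only the first level; every leaf pointer starts out NULL. */
int sparse_table_init(struct sparse_table *t)
{
	t->first = calloc(LEVEL_ENTRIES, sizeof(*t->first));
	t->leaves_allocated = 0;
	return t->first ? 0 : -1;
}

/*
 * Return a pointer to entry idx. If the leaf chunk that would hold it is
 * missing, allocate it only when alloc is nonzero; otherwise return NULL
 * so that lookups never consume memory.
 */
uint64_t *sparse_table_entry(struct sparse_table *t, uint32_t idx, int alloc)
{
	uint32_t hi, lo;

	assert(idx < LEVEL_ENTRIES * LEVEL_ENTRIES);
	hi = idx >> LEVEL_BITS;
	lo = idx & (LEVEL_ENTRIES - 1);

	if (!t->first[hi]) {
		if (!alloc)
			return NULL;
		t->first[hi] = calloc(LEVEL_ENTRIES, sizeof(uint64_t));
		if (!t->first[hi])
			return NULL;
		t->leaves_allocated++;
	}
	return &t->first[hi][lo];
}

/* Free every leaf that was allocated, then the first level itself. */
void sparse_table_free(struct sparse_table *t)
{
	size_t i;

	if (!t->first)
		return;
	for (i = 0; i < LEVEL_ENTRIES; i++)
		free(t->first[i]);
	free(t->first);
	t->first = NULL;
	t->leaves_allocated = 0;
}
```

With this layout, a window whose mapped entries cluster in a few ranges only pays for the leaf chunks covering those ranges plus the single first-level array, which is the kind of saving the cover letter is after for sparse guests.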

Alexey Kardashevskiy (6):
  powerpc/powernv: Remove useless wrapper
  powerpc/powernv: Move TCE manupulation code to its own file
  KVM: PPC: Make iommu_table::it_userspace big endian
  powerpc/powernv: Add indirect levels to it_userspace
  powerpc/powernv: Rework TCE level allocation
  powerpc/powernv/ioda: Allocate indirect TCE levels on demand

 arch/powerpc/platforms/powernv/Makefile       |   2 +-
 arch/powerpc/include/asm/iommu.h              |  11 +-
 arch/powerpc/platforms/powernv/pci.h          |  44 ++-
 arch/powerpc/kvm/book3s_64_vio.c              |  11 +-
 arch/powerpc/kvm/book3s_64_vio_hv.c           |  18 +-
 arch/powerpc/platforms/powernv/pci-ioda-tce.c | 399 ++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/pci-ioda.c     | 184 ++----------
 arch/powerpc/platforms/powernv/pci.c          | 158 ----------
 drivers/vfio/vfio_iommu_spapr_tce.c           |  65 +----
 9 files changed, 478 insertions(+), 414 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/pci-ioda-tce.c

-- 
2.11.0