* [PATCH 0/6] map big page by platform IOMMU
@ 2015-04-16 11:06 Vince Hsu
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-16 11:06 UTC (permalink / raw)
  To: bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tbergstrom-DDmLM1+adcrQT0dZR+AlfA

Hi,

Generally the imported buffers which have memory type TTM_PL_TT are
mapped as small pages, probably due to the lack of big page allocation. But
the platform device which also uses memory type TTM_PL_TT, like GK20A, can
*allocate* big pages through the IOMMU hardware inside the SoC. This series
is an attempt to map the imported buffers as big pages in the GMMU with the
help of the platform IOMMU. With some preparation work to map decreate small
pages into big page(s) by the IOMMU, the GMMU eventually sees the imported
buffer as chunks of big pages and does the mapping. And then we can probably
do compression on the imported buffer which is composed of non-contiguous
small pages. The compbits-related patches shall come later.
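
To make the idea concrete, the intended flow is roughly the sketch below.
This is an illustration only, not code from the patches; the helper name
map_buffer_as_big_page() is made up:

/*
 * Illustration only: one 128KB big page backed by 32 scattered 4KB pages.
 */
static int map_buffer_as_big_page(struct iommu_domain *domain,
				  dma_addr_t *pages,       /* 32 x 4KB, scattered  */
				  unsigned long iova,      /* 128KB-aligned IOVA   */
				  unsigned long trans_bit) /* e.g. bit 34 on GK20A */
{
	int i, ret;

	/* 1) aggregate the small pages into one contiguous IOVA range */
	for (i = 0; i < 32; i++) {
		ret = iommu_map(domain, iova + (i << PAGE_SHIFT), pages[i],
				PAGE_SIZE, IOMMU_READ | IOMMU_WRITE);
		if (ret)
			return ret;
	}

	/* 2) point a single large-page GMMU PTE at the IOVA, with the
	 *    translation bit set so the GPU resolves it through the IOMMU:
	 *    write_large_pte(..., (u64)iova | BIT_ULL(trans_bit));
	 */
	return 0;
}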

I guess most of you won't like the change for the MMU code in this series.
So please comment and guide me how to do this better. :)

Thanks,
Vince

Vince Hsu (6):
  platform: specify the IOMMU physical translation bit
  instmem/gk20a: refer to IOMMU physical translation bit
  mmu: map small pages into big pages(s) by IOMMU if possible
  drm: enable big page mapping for small pages when IOMMU is available
  mmu: gf100: share most of functions with GK20A
  mmu: gk20a: implement IOMMU mapping for big pages

 drm/nouveau/include/nvkm/subdev/mmu.h   |  16 ++
 drm/nouveau/nouveau_bo.c                |   9 ++
 drm/nouveau/nouveau_platform.c          |  19 +++
 drm/nouveau/nouveau_platform.h          |   1 +
 drm/nouveau/nvkm/engine/device/gk104.c  |   2 +-
 drm/nouveau/nvkm/subdev/instmem/gk20a.c |  13 +-
 drm/nouveau/nvkm/subdev/mmu/Kbuild      |   1 +
 drm/nouveau/nvkm/subdev/mmu/base.c      | 158 +++++++++++++++++++-
 drm/nouveau/nvkm/subdev/mmu/gf100.c     |  28 +---
 drm/nouveau/nvkm/subdev/mmu/gf100.h     |  46 ++++++
 drm/nouveau/nvkm/subdev/mmu/gk20a.c     | 253 ++++++++++++++++++++++++++++++++
 lib/include/nvif/os.h                   |  12 ++
 12 files changed, 526 insertions(+), 32 deletions(-)
 create mode 100644 drm/nouveau/nvkm/subdev/mmu/gf100.h
 create mode 100644 drm/nouveau/nvkm/subdev/mmu/gk20a.c

-- 
2.1.4

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-16 11:06   ` Vince Hsu
       [not found]     ` <1429182379-31964-2-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 11:06   ` [PATCH 2/6] instmem/gk20a: refer to " Vince Hsu
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-16 11:06 UTC (permalink / raw)
  To: bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tbergstrom-DDmLM1+adcrQT0dZR+AlfA

The IOMMU physical translation bit might vary between SoCs, so add a
variable to specify this bit for GK20A.
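
The bit itself only matters when composing the address that the GPU will
see. Roughly speaking (illustrative; the actual users are in the following
patches):

	/* addresses handed to the GPU with this bit set are resolved
	 * through the SoC IOMMU instead of being treated as physical */
	u64 gpu_addr = iova | BIT_ULL(gpu->iommu.phys_addr_bit);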

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
---
 drm/nouveau/nouveau_platform.c | 19 +++++++++++++++++++
 drm/nouveau/nouveau_platform.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/drm/nouveau/nouveau_platform.c b/drm/nouveau/nouveau_platform.c
index 775277f1edb0..0d002f73e356 100644
--- a/drm/nouveau/nouveau_platform.c
+++ b/drm/nouveau/nouveau_platform.c
@@ -25,6 +25,7 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/reset.h>
 #include <linux/regulator/consumer.h>
 #include <linux/iommu.h>
@@ -92,6 +93,22 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
 	return 0;
 }
 
+static unsigned long nouveau_platform_get_iommu_bit(struct device *dev)
+{
+	const struct of_device_id *match;
+
+	match = of_match_device(dev->driver->of_match_table, dev);
+	if (!match) {
+		dev_warn(dev, "cannot find OF match for device\n");
+		return 0;
+	}
+
+	if (!strcmp(match->compatible, "nvidia,gk20a"))
+		return 34;
+	else
+		return 0;
+}
+
 static void nouveau_platform_probe_iommu(struct device *dev,
 					 struct nouveau_platform_gpu *gpu)
 {
@@ -122,6 +139,8 @@ static void nouveau_platform_probe_iommu(struct device *dev,
 			gpu->iommu.pgshift -= 1;
 		}
 
+		gpu->iommu.phys_addr_bit = nouveau_platform_get_iommu_bit(dev);
+
 		err = iommu_attach_device(gpu->iommu.domain, dev);
 		if (err)
 			goto free_domain;
diff --git a/drm/nouveau/nouveau_platform.h b/drm/nouveau/nouveau_platform.h
index 392874cf4725..3e9bd7dc0092 100644
--- a/drm/nouveau/nouveau_platform.h
+++ b/drm/nouveau/nouveau_platform.h
@@ -53,6 +53,7 @@ struct nouveau_platform_gpu {
 		struct nvkm_mm *mm;
 		struct iommu_domain *domain;
 		unsigned long pgshift;
+		unsigned long phys_addr_bit;
 	} iommu;
 };
 
-- 
2.1.4

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 2/6] instmem/gk20a: refer to IOMMU physical translation bit
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 11:06   ` [PATCH 1/6] platform: specify the IOMMU physical translation bit Vince Hsu
@ 2015-04-16 11:06   ` Vince Hsu
       [not found]     ` <1429182379-31964-3-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 11:06   ` [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible Vince Hsu
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-16 11:06 UTC (permalink / raw)
  To: bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tbergstrom-DDmLM1+adcrQT0dZR+AlfA

Instead of hard-coding the translation bit in the subdev driver, refer to
the platform data.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
---
 drm/nouveau/nvkm/subdev/instmem/gk20a.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
index dd0994d9ebfc..69ef5eae3279 100644
--- a/drm/nouveau/nvkm/subdev/instmem/gk20a.c
+++ b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
@@ -89,6 +89,7 @@ struct gk20a_instmem_priv {
 	struct nvkm_mm *mm;
 	struct iommu_domain *domain;
 	unsigned long iommu_pgshift;
+	unsigned long iommu_phys_addr_bit;
 
 	/* Only used by DMA API */
 	struct dma_attrs attrs;
@@ -169,8 +170,8 @@ gk20a_instobj_dtor_iommu(struct gk20a_instobj_priv *_node)
 	r = list_first_entry(&_node->mem->regions, struct nvkm_mm_node,
 			     rl_entry);
 
-	/* clear bit 34 to unmap pages */
-	r->offset &= ~BIT(34 - priv->iommu_pgshift);
+	/* clear IOMMU translation bit to unmap pages */
+	r->offset &= ~BIT(priv->iommu_phys_addr_bit - priv->iommu_pgshift);
 
 	/* Unmap pages from GPU address space and free them */
 	for (i = 0; i < _node->mem->size; i++) {
@@ -298,8 +299,11 @@ gk20a_instobj_ctor_iommu(struct nvkm_object *parent, struct nvkm_object *engine,
 		}
 	}
 
-	/* Bit 34 tells that an address is to be resolved through the IOMMU */
-	r->offset |= BIT(34 - priv->iommu_pgshift);
+	/*
+	 * The iommu_phys_addr_bit tells that an address is to be resolved
+	 * through the IOMMU
+	 */
+	r->offset |= BIT(priv->iommu_phys_addr_bit - priv->iommu_pgshift);
 
 	node->base._mem.offset = ((u64)r->offset) << priv->iommu_pgshift;
 
@@ -407,6 +411,7 @@ gk20a_instmem_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 		priv->domain = plat->gpu->iommu.domain;
 		priv->mm = plat->gpu->iommu.mm;
 		priv->iommu_pgshift = plat->gpu->iommu.pgshift;
+		priv->iommu_phys_addr_bit = plat->gpu->iommu.phys_addr_bit;
 		priv->mm_mutex = &plat->gpu->iommu.mutex;
 
 		nv_info(priv, "using IOMMU\n");
-- 
2.1.4

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 11:06   ` [PATCH 1/6] platform: specify the IOMMU physical translation bit Vince Hsu
  2015-04-16 11:06   ` [PATCH 2/6] instmem/gk20a: refer to " Vince Hsu
@ 2015-04-16 11:06   ` Vince Hsu
       [not found]     ` <1429182379-31964-4-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 11:06   ` [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available Vince Hsu
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-16 11:06 UTC (permalink / raw)
  To: bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tbergstrom-DDmLM1+adcrQT0dZR+AlfA

This patch implements a way to aggregate small pages and have them mapped
as big page(s) by utilizing the platform IOMMU where supported. This will
let us enable compression support for these big pages later.
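
The core ends up driving the new hooks roughly like this, one iteration per
big page (simplified from the base.c changes below, error handling omitted):

	while (num) {
		void *priv;

		/* the backend aggregates the small pages behind one IOVA
		 * and writes a single large-page PTE for them */
		mmu->map_iommu(vma, pgt, mem, pte, list, &priv);

		/* remember the mapping so nvkm_vm_unmap_iommu() can undo it */
		nvkm_vm_link_bp(vma, pde, pte, &vm->pgt[pde], priv);

		list += 1 << bits;
		num--;
		pte++;
	}
	mmu->flush(vm);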

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
---
 drm/nouveau/include/nvkm/subdev/mmu.h |  16 ++++
 drm/nouveau/nvkm/subdev/mmu/base.c    | 158 ++++++++++++++++++++++++++++++++--
 lib/include/nvif/os.h                 |  12 +++
 3 files changed, 179 insertions(+), 7 deletions(-)

diff --git a/drm/nouveau/include/nvkm/subdev/mmu.h b/drm/nouveau/include/nvkm/subdev/mmu.h
index 3a5368776c31..3230d31a7971 100644
--- a/drm/nouveau/include/nvkm/subdev/mmu.h
+++ b/drm/nouveau/include/nvkm/subdev/mmu.h
@@ -22,6 +22,8 @@ struct nvkm_vma {
 	struct nvkm_mm_node *node;
 	u64 offset;
 	u32 access;
+	struct list_head bp;
+	bool has_iommu_bp;
 };
 
 struct nvkm_vm {
@@ -37,6 +39,13 @@ struct nvkm_vm {
 	u32 lpde;
 };
 
+struct nvkm_vm_bp_list {
+	struct list_head head;
+	u32 pde;
+	u32 pte;
+	void *priv;
+};
+
 struct nvkm_mmu {
 	struct nvkm_subdev base;
 
@@ -45,6 +54,7 @@ struct nvkm_mmu {
 	u32 pgt_bits;
 	u8  spg_shift;
 	u8  lpg_shift;
+	bool iommu_capable;
 
 	int  (*create)(struct nvkm_mmu *, u64 offset, u64 length,
 		       u64 mm_offset, struct nvkm_vm **);
@@ -56,7 +66,12 @@ struct nvkm_mmu {
 		    u64 phys, u64 delta);
 	void (*map_sg)(struct nvkm_vma *, struct nvkm_gpuobj *,
 		       struct nvkm_mem *, u32 pte, u32 cnt, dma_addr_t *);
+	void (*map_iommu)(struct nvkm_vma *, struct nvkm_gpuobj *,
+		       struct nvkm_mem *, u32 pte, dma_addr_t *, void **);
+	void (*map_sg_iommu)(struct nvkm_vma *, struct nvkm_gpuobj *,
+		       struct nvkm_mem *, u32 pte, struct sg_page_iter *, void **);
 	void (*unmap)(struct nvkm_gpuobj *pgt, u32 pte, u32 cnt);
+	void (*unmap_iommu)(struct nvkm_vma *, void *);
 	void (*flush)(struct nvkm_vm *);
 };
 
@@ -84,6 +99,7 @@ extern struct nvkm_oclass nv41_mmu_oclass;
 extern struct nvkm_oclass nv44_mmu_oclass;
 extern struct nvkm_oclass nv50_mmu_oclass;
 extern struct nvkm_oclass gf100_mmu_oclass;
+extern struct nvkm_oclass gk20a_mmu_oclass;
 
 int  nv04_vm_create(struct nvkm_mmu *, u64, u64, u64,
 		    struct nvkm_vm **);
diff --git a/drm/nouveau/nvkm/subdev/mmu/base.c b/drm/nouveau/nvkm/subdev/mmu/base.c
index 277b6ec04e24..747c836d9fa6 100644
--- a/drm/nouveau/nvkm/subdev/mmu/base.c
+++ b/drm/nouveau/nvkm/subdev/mmu/base.c
@@ -26,6 +26,43 @@
 
 #include <core/gpuobj.h>
 
+static int
+nvkm_vm_map_pgt(struct nvkm_vm *vm, u32 pde, u32 type);
+
+static int
+nvkm_vm_link_bp(struct nvkm_vma *vma, u32 pde, u32 pte,
+	struct nvkm_vm_pgt *vpgt, void *priv)
+{
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+	struct nvkm_vm_bp_list *list;
+	list = kzalloc(sizeof(*list), GFP_KERNEL);
+	if (!list)
+		return -ENOMEM;
+
+	mutex_lock(&nv_subdev(mmu)->mutex);
+
+	if (!vma->has_iommu_bp) {
+		INIT_LIST_HEAD(&vma->bp);
+		vma->has_iommu_bp = true;
+	}
+	list->pde = pde;
+	list->pte = pte;
+	list->priv = priv;
+	list_add_tail(&list->head, &vma->bp);
+
+	mutex_unlock(&nv_subdev(mmu)->mutex);
+
+	return 0;
+}
+
+static void
+nvkm_vm_unlink_bp(struct nvkm_vma *vma, struct nvkm_vm_bp_list *list)
+{
+	list_del(&list->head);
+	kfree(list);
+}
+
 void
 nvkm_vm_map_at(struct nvkm_vma *vma, u64 delta, struct nvkm_mem *node)
 {
@@ -129,6 +166,48 @@ finish:
 }
 
 static void
+nvkm_vm_map_sg_table_with_iommu(struct nvkm_vma *vma, u64 delta, u64 length,
+		     struct nvkm_mem *mem)
+{
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+	int big = vma->node->type != mmu->spg_shift;
+	u32 offset = vma->node->offset + (delta >> 12);
+	u32 bits = vma->node->type - 12;
+	u32 pde  = (offset >> mmu->pgt_bits) - vm->fpde;
+	u32 pte  = (offset & ((1 << mmu->pgt_bits) - 1)) >> bits;
+	u32 max  = 1 << (mmu->pgt_bits - bits);
+	struct sg_page_iter iter;
+	u32 bpoff, i;
+	u32 multiple = 1 << bits;
+
+	i = 0;
+	for_each_sg_page(mem->sg->sgl, &iter, mem->sg->nents, 0) {
+		struct nvkm_gpuobj *pgt = vm->pgt[pde].obj[big];
+		void *priv;
+
+		bpoff = offset + i;
+
+		pde = (bpoff >> mmu->pgt_bits) - vm->fpde;
+		pte = (bpoff & ((1 << mmu->pgt_bits) - 1)) >> bits;
+		pgt = vm->pgt[pde].obj[1];
+
+		mmu->map_sg_iommu(vma, pgt, mem, pte, &iter, &priv);
+
+		nvkm_vm_link_bp(vma, pde, pte, &vm->pgt[pde], priv);
+
+		i += multiple;
+		pte++;
+		if (unlikely(pte >= max)) {
+			pde++;
+			pte = 0;
+		}
+	}
+
+	mmu->flush(vm);
+}
+
+static void
 nvkm_vm_map_sg(struct nvkm_vma *vma, u64 delta, u64 length,
 	       struct nvkm_mem *mem)
 {
@@ -166,15 +245,59 @@ nvkm_vm_map_sg(struct nvkm_vma *vma, u64 delta, u64 length,
 	mmu->flush(vm);
 }
 
+static void
+nvkm_vm_map_sg_with_iommu(struct nvkm_vma *vma, u64 delta, u64 length,
+	       struct nvkm_mem *mem)
+{
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+	dma_addr_t *list = mem->pages;
+	int big = vma->node->type != mmu->spg_shift;
+	u32 offset = vma->node->offset + (delta >> 12);
+	u32 bits = vma->node->type - 12;
+	u32 num  = length >> vma->node->type;
+	u32 pde  = (offset >> mmu->pgt_bits) - vm->fpde;
+	u32 pte  = (offset & ((1 << mmu->pgt_bits) - 1)) >> bits;
+	u32 max  = 1 << (mmu->pgt_bits - bits);
+	u32 multiple = 1 << bits;
+
+	while (num) {
+		struct nvkm_gpuobj *pgt = vm->pgt[pde].obj[big];
+		void *priv;
+
+		mmu->map_iommu(vma, pgt, mem, pte, list, &priv);
+
+		nvkm_vm_link_bp(vma, pde, pte, &vm->pgt[pde], priv);
+
+		list += multiple;
+		num--;
+		pte++;
+		if (unlikely(pte >= max)) {
+			pde++;
+			pte = 0;
+		}
+	}
+
+	mmu->flush(vm);
+}
+
 void
 nvkm_vm_map(struct nvkm_vma *vma, struct nvkm_mem *node)
 {
-	if (node->sg)
-		nvkm_vm_map_sg_table(vma, 0, node->size << 12, node);
-	else
-	if (node->pages)
-		nvkm_vm_map_sg(vma, 0, node->size << 12, node);
-	else
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+
+	if (node->sg) {
+		if (mmu->iommu_capable && vma->node->type == mmu->lpg_shift)
+			nvkm_vm_map_sg_table_with_iommu(vma, 0, node->size << 12, node);
+		else
+			nvkm_vm_map_sg_table(vma, 0, node->size << 12, node);
+	} else if (node->pages) {
+		if (mmu->iommu_capable && vma->node->type == mmu->lpg_shift)
+			nvkm_vm_map_sg_with_iommu(vma, 0, node->size << 12, node);
+		else
+			nvkm_vm_map_sg(vma, 0, node->size << 12, node);
+	} else
 		nvkm_vm_map_at(vma, 0, node);
 }
 
@@ -214,9 +337,30 @@ nvkm_vm_unmap_at(struct nvkm_vma *vma, u64 delta, u64 length)
 }
 
 void
+nvkm_vm_unmap_iommu(struct nvkm_vma *vma)
+{
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+	struct nvkm_vm_bp_list *list, *tmp;
+
+	list_for_each_entry_safe(list, tmp, &vma->bp, head) {
+		struct nvkm_gpuobj *pgt = vm->pgt[list->pde].obj[1];
+
+		mmu->unmap(pgt, list->pte, 1);
+		mmu->unmap_iommu(vma, list->priv);
+		nvkm_vm_unlink_bp(vma, list);
+	}
+
+	vma->has_iommu_bp = false;
+}
+
+void
 nvkm_vm_unmap(struct nvkm_vma *vma)
 {
-	nvkm_vm_unmap_at(vma, 0, (u64)vma->node->length << 12);
+	if (vma->has_iommu_bp)
+		nvkm_vm_unmap_iommu(vma);
+	else
+		nvkm_vm_unmap_at(vma, 0, (u64)vma->node->length << 12);
 }
 
 static void
diff --git a/lib/include/nvif/os.h b/lib/include/nvif/os.h
index 275fa84ad003..f56a5e5d3a4d 100644
--- a/lib/include/nvif/os.h
+++ b/lib/include/nvif/os.h
@@ -88,6 +88,7 @@ typedef dma_addr_t resource_size_t;
 #define likely(a) (a)
 #define unlikely(a) (a)
 #define BIT(a) (1UL << (a))
+#define BIT_ULL(a) (1ULL << (a))
 
 #define ERR_PTR(err) ((void *)(long)(err))
 #define PTR_ERR(ptr) ((long)(ptr))
@@ -914,6 +915,17 @@ struct sg_table {
 #define sg_dma_address(a) 0ULL
 #define sg_dma_len(a) 0ULL
 
+struct sg_page_iter {
+};
+
+#define sg_page_iter_dma_address(piter) 0ULL
+
+#define for_each_sg_page(sglist, piter, nents, pgoffset)		   \
+	for (__sg_page_iter_start((piter), (sglist), (nents), (pgoffset)); \
+	     __sg_page_iter_next(piter);)
+#define __sg_page_iter_start(piter, sglist, nents, pgoffset) (piter)
+#define __sg_page_iter_next(piter) (false)
+
 /******************************************************************************
  * firmware
  *****************************************************************************/
-- 
2.1.4

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-04-16 11:06   ` [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible Vince Hsu
@ 2015-04-16 11:06   ` Vince Hsu
       [not found]     ` <1429182379-31964-5-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 11:06   ` [PATCH 5/6] mmu: gf100: share most of functions with GK20A Vince Hsu
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-16 11:06 UTC (permalink / raw)
  To: bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tbergstrom-DDmLM1+adcrQT0dZR+AlfA

Some platforms have an IOMMU to map non-contiguous physical memory into
contiguous GPU virtual addresses. We can use this feature to enable big page
mapping for scattered small pages. To achieve that, we also need the
corresponding changes in subdev/mmu.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
---
 drm/nouveau/nouveau_bo.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drm/nouveau/nouveau_bo.c b/drm/nouveau/nouveau_bo.c
index 77326e344dad..da76ee1121e4 100644
--- a/drm/nouveau/nouveau_bo.c
+++ b/drm/nouveau/nouveau_bo.c
@@ -221,6 +221,11 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
 	if (drm->client.vm) {
 		if (!(flags & TTM_PL_FLAG_TT) && size > 256 * 1024)
 			nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
+
+		if ((flags & TTM_PL_FLAG_TT) &&
+				drm->client.vm->mmu->iommu_capable &&
+				(size % (1 << drm->client.vm->mmu->lpg_shift)) == 0)
+			nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
 	}
 
 	nouveau_bo_fixup_align(nvbo, flags, &align, &size);
@@ -1641,6 +1646,10 @@ nouveau_bo_vma_add(struct nouveau_bo *nvbo, struct nvkm_vm *vm,
 	    (nvbo->bo.mem.mem_type == TTM_PL_VRAM ||
 	     nvbo->page_shift != vma->vm->mmu->lpg_shift))
 		nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
+	else if (nvbo->bo.mem.mem_type == TTM_PL_TT &&
+		vma->vm->mmu->iommu_capable &&
+		nvbo->page_shift == vma->vm->mmu->lpg_shift)
+		nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
 
 	list_add_tail(&vma->head, &nvbo->vma_list);
 	vma->refcount = 1;
-- 
2.1.4

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 5/6] mmu: gf100: share most of functions with GK20A
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
                     ` (3 preceding siblings ...)
  2015-04-16 11:06   ` [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available Vince Hsu
@ 2015-04-16 11:06   ` Vince Hsu
       [not found]     ` <1429182379-31964-6-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 11:06   ` [PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages Vince Hsu
  2015-04-17  6:25   ` [PATCH 0/6] map big page by platform IOMMU Alexandre Courbot
  6 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-16 11:06 UTC (permalink / raw)
  To: bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tbergstrom-DDmLM1+adcrQT0dZR+AlfA

This patch makes the functions which can be shared with GK20A public for
later use.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
---
 drm/nouveau/nvkm/subdev/mmu/gf100.c | 28 +++++++---------------
 drm/nouveau/nvkm/subdev/mmu/gf100.h | 46 +++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 20 deletions(-)
 create mode 100644 drm/nouveau/nvkm/subdev/mmu/gf100.h

diff --git a/drm/nouveau/nvkm/subdev/mmu/gf100.c b/drm/nouveau/nvkm/subdev/mmu/gf100.c
index 294cda37f068..b067ded5d3be 100644
--- a/drm/nouveau/nvkm/subdev/mmu/gf100.c
+++ b/drm/nouveau/nvkm/subdev/mmu/gf100.c
@@ -29,6 +29,8 @@
 
 #include <core/gpuobj.h>
 
+#include "gf100.h"
+
 struct gf100_mmu_priv {
 	struct nvkm_mmu base;
 };
@@ -74,7 +76,7 @@ const u8 gf100_pte_storage_type_map[256] =
 };
 
 
-static void
+void
 gf100_vm_map_pgt(struct nvkm_gpuobj *pgd, u32 index, struct nvkm_gpuobj *pgt[2])
 {
 	u32 pde[2] = { 0, 0 };
@@ -88,21 +90,7 @@ gf100_vm_map_pgt(struct nvkm_gpuobj *pgd, u32 index, struct nvkm_gpuobj *pgt[2])
 	nv_wo32(pgd, (index * 8) + 4, pde[1]);
 }
 
-static inline u64
-gf100_vm_addr(struct nvkm_vma *vma, u64 phys, u32 memtype, u32 target)
-{
-	phys >>= 8;
-
-	phys |= 0x00000001; /* present */
-	if (vma->access & NV_MEM_ACCESS_SYS)
-		phys |= 0x00000002;
-
-	phys |= ((u64)target  << 32);
-	phys |= ((u64)memtype << 36);
-	return phys;
-}
-
-static void
+void
 gf100_vm_map(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
 	     struct nvkm_mem *mem, u32 pte, u32 cnt, u64 phys, u64 delta)
 {
@@ -127,7 +115,7 @@ gf100_vm_map(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
 	}
 }
 
-static void
+void
 gf100_vm_map_sg(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
 		struct nvkm_mem *mem, u32 pte, u32 cnt, dma_addr_t *list)
 {
@@ -144,7 +132,7 @@ gf100_vm_map_sg(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
 	}
 }
 
-static void
+void
 gf100_vm_unmap(struct nvkm_gpuobj *pgt, u32 pte, u32 cnt)
 {
 	pte <<= 3;
@@ -155,7 +143,7 @@ gf100_vm_unmap(struct nvkm_gpuobj *pgt, u32 pte, u32 cnt)
 	}
 }
 
-static void
+void
 gf100_vm_flush(struct nvkm_vm *vm)
 {
 	struct gf100_mmu_priv *priv = (void *)vm->mmu;
@@ -191,7 +179,7 @@ gf100_vm_flush(struct nvkm_vm *vm)
 	mutex_unlock(&nv_subdev(priv)->mutex);
 }
 
-static int
+int
 gf100_vm_create(struct nvkm_mmu *mmu, u64 offset, u64 length, u64 mm_offset,
 		struct nvkm_vm **pvm)
 {
diff --git a/drm/nouveau/nvkm/subdev/mmu/gf100.h b/drm/nouveau/nvkm/subdev/mmu/gf100.h
new file mode 100644
index 000000000000..a66ca45bc755
--- /dev/null
+++ b/drm/nouveau/nvkm/subdev/mmu/gf100.h
@@ -0,0 +1,46 @@
+#ifndef __GF100_MMU_PRIV__
+#define __GF100_MMU_PRIV__
+
+#include <subdev/mmu.h>
+
+struct nv04_mmu_priv {
+	struct nvkm_mmu base;
+	struct nvkm_vm *vm;
+	dma_addr_t null;
+	void *nullp;
+};
+
+int
+gf100_vm_create(struct nvkm_mmu *mmu, u64 offset, u64 length, u64 mm_offset,
+		struct nvkm_vm **pvm);
+
+void
+gf100_vm_map_pgt(struct nvkm_gpuobj *pgd, u32 index,
+		struct nvkm_gpuobj *pgt[2]);
+void
+gf100_vm_map(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
+	     struct nvkm_mem *mem, u32 pte, u32 cnt, u64 phys, u64 delta);
+void
+gf100_vm_map_sg(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
+		struct nvkm_mem *mem, u32 pte, u32 cnt, dma_addr_t *list);
+void
+gf100_vm_unmap(struct nvkm_gpuobj *pgt, u32 pte, u32 cnt);
+
+void
+gf100_vm_flush(struct nvkm_vm *vm);
+
+static inline u64
+gf100_vm_addr(struct nvkm_vma *vma, u64 phys, u32 memtype, u32 target)
+{
+	phys >>= 8;
+
+	phys |= 0x00000001; /* present */
+	if (vma->access & NV_MEM_ACCESS_SYS)
+		phys |= 0x00000002;
+
+	phys |= ((u64)target  << 32);
+	phys |= ((u64)memtype << 36);
+	return phys;
+}
+
+#endif
-- 
2.1.4

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
                     ` (4 preceding siblings ...)
  2015-04-16 11:06   ` [PATCH 5/6] mmu: gf100: share most of functions with GK20A Vince Hsu
@ 2015-04-16 11:06   ` Vince Hsu
       [not found]     ` <1429182379-31964-7-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-17  6:25   ` [PATCH 0/6] map big page by platform IOMMU Alexandre Courbot
  6 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-16 11:06 UTC (permalink / raw)
  To: bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tbergstrom-DDmLM1+adcrQT0dZR+AlfA

This patch uses the IOMMU to aggregate (probably) discrete small pages into
larger big page(s) and maps them in the GMMU.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
---
 drm/nouveau/nvkm/engine/device/gk104.c |   2 +-
 drm/nouveau/nvkm/subdev/mmu/Kbuild     |   1 +
 drm/nouveau/nvkm/subdev/mmu/gk20a.c    | 253 +++++++++++++++++++++++++++++++++
 3 files changed, 255 insertions(+), 1 deletion(-)
 create mode 100644 drm/nouveau/nvkm/subdev/mmu/gk20a.c

diff --git a/drm/nouveau/nvkm/engine/device/gk104.c b/drm/nouveau/nvkm/engine/device/gk104.c
index 6a9483f65d83..9ea48ba31c0d 100644
--- a/drm/nouveau/nvkm/engine/device/gk104.c
+++ b/drm/nouveau/nvkm/engine/device/gk104.c
@@ -172,7 +172,7 @@ gk104_identify(struct nvkm_device *device)
 		device->oclass[NVDEV_SUBDEV_LTC    ] =  gk104_ltc_oclass;
 		device->oclass[NVDEV_SUBDEV_IBUS   ] = &gk20a_ibus_oclass;
 		device->oclass[NVDEV_SUBDEV_INSTMEM] = gk20a_instmem_oclass;
-		device->oclass[NVDEV_SUBDEV_MMU    ] = &gf100_mmu_oclass;
+		device->oclass[NVDEV_SUBDEV_MMU    ] = &gk20a_mmu_oclass;
 		device->oclass[NVDEV_SUBDEV_BAR    ] = &gk20a_bar_oclass;
 		device->oclass[NVDEV_ENGINE_DMAOBJ ] =  gf110_dmaeng_oclass;
 		device->oclass[NVDEV_ENGINE_FIFO   ] =  gk20a_fifo_oclass;
diff --git a/drm/nouveau/nvkm/subdev/mmu/Kbuild b/drm/nouveau/nvkm/subdev/mmu/Kbuild
index 012c9db687b2..141302a8e933 100644
--- a/drm/nouveau/nvkm/subdev/mmu/Kbuild
+++ b/drm/nouveau/nvkm/subdev/mmu/Kbuild
@@ -4,3 +4,4 @@ nvkm-y += nvkm/subdev/mmu/nv41.o
 nvkm-y += nvkm/subdev/mmu/nv44.o
 nvkm-y += nvkm/subdev/mmu/nv50.o
 nvkm-y += nvkm/subdev/mmu/gf100.o
+nvkm-y += nvkm/subdev/mmu/gk20a.o
diff --git a/drm/nouveau/nvkm/subdev/mmu/gk20a.c b/drm/nouveau/nvkm/subdev/mmu/gk20a.c
new file mode 100644
index 000000000000..b444b73e208d
--- /dev/null
+++ b/drm/nouveau/nvkm/subdev/mmu/gk20a.c
@@ -0,0 +1,253 @@
+/*
+ * Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <subdev/fb.h>
+#include <subdev/ltc.h>
+#include <subdev/mmu.h>
+
+#ifdef __KERNEL__
+#include <linux/iommu.h>
+#include <nouveau_platform.h>
+#endif
+
+#include "gf100.h"
+
+struct gk20a_mmu_priv {
+	struct nvkm_mmu base;
+};
+
+struct gk20a_mmu_iommu_mapping {
+	struct nvkm_mm_node *node;
+	u64 iova;
+};
+
+extern const u8 gf100_pte_storage_type_map[256];
+
+static void
+gk20a_vm_map(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
+		struct nvkm_mem *mem, u32 pte, u64 list)
+{
+	u32 target = (vma->access & NV_MEM_ACCESS_NOSNOOP) ? 7 : 5;
+	u64 phys;
+
+	pte <<= 3;
+	phys = gf100_vm_addr(vma, list, mem->memtype, target);
+
+	if (mem->tag) {
+		struct nvkm_ltc *ltc = nvkm_ltc(vma->vm->mmu);
+		u32 tag = mem->tag->offset;
+		phys |= (u64)tag << (32 + 12);
+		ltc->tags_clear(ltc, tag, 1);
+	}
+
+	nv_wo32(pgt, pte + 0, lower_32_bits(phys));
+	nv_wo32(pgt, pte + 4, upper_32_bits(phys));
+}
+
+static void
+gk20a_vm_map_iommu(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
+		struct nvkm_mem *mem, u32 pte, dma_addr_t *list,
+		void **priv)
+{
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+	struct nvkm_mm_node *node;
+	struct nouveau_platform_device *plat;
+	struct gk20a_mmu_iommu_mapping *p;
+	int npages = 1 << (mmu->lpg_shift - mmu->spg_shift);
+	int i, ret;
+	u64 addr;
+
+	plat = nv_device_to_platform(nv_device(&mmu->base));
+
+	*priv = kzalloc(sizeof(struct gk20a_mmu_iommu_mapping), GFP_KERNEL);
+	if (!*priv)
+		return;
+
+	mutex_lock(&plat->gpu->iommu.mutex);
+	ret = nvkm_mm_head(plat->gpu->iommu.mm,
+			0,
+			1,
+			npages,
+			npages,
+			(1 << mmu->lpg_shift) >> 12,
+			&node);
+	mutex_unlock(&plat->gpu->iommu.mutex);
+	if (ret)
+		return;
+
+	for (i = 0; i < npages; i++, list++) {
+		ret = iommu_map(plat->gpu->iommu.domain,
+				(node->offset + i) << PAGE_SHIFT,
+				*list,
+				PAGE_SIZE,
+				IOMMU_READ | IOMMU_WRITE);
+
+		if (ret < 0)
+			return;
+
+		nv_trace(mmu, "IOMMU: IOVA=0x%016llx-> IOMMU -> PA=%016llx\n",
+				(u64)(node->offset + i) << PAGE_SHIFT, (u64)(*list));
+	}
+
+	addr = (u64)node->offset << PAGE_SHIFT;
+	addr |= BIT_ULL(plat->gpu->iommu.phys_addr_bit);
+
+	gk20a_vm_map(vma, pgt, mem, pte, addr);
+
+	p = *priv;
+	p->node = node;
+	p->iova = node->offset << PAGE_SHIFT;
+}
+
+static void
+gk20a_vm_map_sg_iommu(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
+		struct nvkm_mem *mem, u32 pte, struct sg_page_iter *iter,
+		void **priv)
+{
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+	struct nvkm_mm_node *node;
+	struct nouveau_platform_device *plat;
+	struct gk20a_mmu_iommu_mapping *p;
+	int npages = 1 << (mmu->lpg_shift - mmu->spg_shift);
+	int i, ret;
+	u64 addr;
+
+	plat = nv_device_to_platform(nv_device(&mmu->base));
+
+	*priv = kzalloc(sizeof(struct gk20a_mmu_iommu_mapping), GFP_KERNEL);
+	if (!*priv)
+		return;
+
+	mutex_lock(&plat->gpu->iommu.mutex);
+	ret = nvkm_mm_head(plat->gpu->iommu.mm,
+			0,
+			1,
+			npages,
+			npages,
+			(1 << mmu->lpg_shift) >> 12,
+			&node);
+	mutex_unlock(&plat->gpu->iommu.mutex);
+	if (ret)
+		return;
+
+	for (i = 0; i < npages; i++) {
+		dma_addr_t phys = sg_page_iter_dma_address(iter);
+
+		ret = iommu_map(plat->gpu->iommu.domain,
+				(node->offset + i) << PAGE_SHIFT,
+				phys,
+				PAGE_SIZE,
+				IOMMU_READ | IOMMU_WRITE);
+
+		if (ret < 0)
+			return;
+
+		nv_trace(mmu, "IOMMU: IOVA=0x%016llx-> IOMMU -> PA=%016llx\n",
+				(u64)(node->offset + i) << PAGE_SHIFT, (u64)phys);
+
+		if ((i < npages - 1) && !__sg_page_iter_next(iter)) {
+			nv_error(mmu, "failed to iterate sg table\n");
+			return;
+		}
+	}
+
+	addr = (u64)node->offset << PAGE_SHIFT;
+	addr |= BIT_ULL(plat->gpu->iommu.phys_addr_bit);
+
+	gk20a_vm_map(vma, pgt, mem, pte, addr);
+
+	p = *priv;
+	p->node = node;
+	p->iova = node->offset << PAGE_SHIFT;
+}
+
+static void
+gk20a_vm_unmap_iommu(struct nvkm_vma *vma, void *priv)
+{
+	struct nvkm_vm *vm = vma->vm;
+	struct nvkm_mmu *mmu = vm->mmu;
+	struct nouveau_platform_device *plat;
+	struct gk20a_mmu_iommu_mapping *p = priv;
+	int ret;
+
+	plat = nv_device_to_platform(nv_device(&mmu->base));
+
+	ret = iommu_unmap(plat->gpu->iommu.domain, p->iova,
+			1 << mmu->lpg_shift);
+	WARN(ret < 0, "failed to unmap IOMMU address 0x%16llx, ret=%d\n",
+			p->iova, ret);
+
+	mutex_lock(&plat->gpu->iommu.mutex);
+	nvkm_mm_free(plat->gpu->iommu.mm, &p->node);
+	mutex_unlock(&plat->gpu->iommu.mutex);
+
+	kfree(priv);
+}
+
+static int
+gk20a_mmu_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
+	       struct nvkm_oclass *oclass, void *data, u32 size,
+	       struct nvkm_object **pobject)
+{
+	struct gk20a_mmu_priv *priv;
+	struct nouveau_platform_device *plat;
+	int ret;
+
+	ret = nvkm_mmu_create(parent, engine, oclass, "VM", "vm", &priv);
+	*pobject = nv_object(priv);
+	if (ret)
+		return ret;
+
+	plat = nv_device_to_platform(nv_device(parent));
+	if (plat->gpu->iommu.domain)
+		priv->base.iommu_capable = true;
+
+	priv->base.limit = 1ULL << 40;
+	priv->base.dma_bits = 40;
+	priv->base.pgt_bits  = 27 - 12;
+	priv->base.spg_shift = 12;
+	priv->base.lpg_shift = 17;
+	priv->base.create = gf100_vm_create;
+	priv->base.map_pgt = gf100_vm_map_pgt;
+	priv->base.map = gf100_vm_map;
+	priv->base.map_sg = gf100_vm_map_sg;
+	priv->base.map_iommu = gk20a_vm_map_iommu;
+	priv->base.unmap_iommu = gk20a_vm_unmap_iommu;
+	priv->base.map_sg_iommu = gk20a_vm_map_sg_iommu;
+	priv->base.unmap = gf100_vm_unmap;
+	priv->base.flush = gf100_vm_flush;
+
+	return 0;
+}
+
+struct nvkm_oclass
+gk20a_mmu_oclass = {
+	.handle = NV_SUBDEV(MMU, 0xea),
+	.ofuncs = &(struct nvkm_ofuncs) {
+		.ctor = gk20a_mmu_ctor,
+		.dtor = _nvkm_mmu_dtor,
+		.init = _nvkm_mmu_init,
+		.fini = _nvkm_mmu_fini,
+	},
+};
-- 
2.1.4

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages
       [not found]     ` <1429182379-31964-7-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-16 19:31       ` Ilia Mirkin
       [not found]         ` <CAKb7UvhEuNSYshiP+kt66zHfsmB=Edy+t7HLs2j_+-hvzzkdMQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Ilia Mirkin @ 2015-04-16 19:31 UTC (permalink / raw)
  To: Vince Hsu
  Cc: Terje Bergstrom, Ben Skeggs, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Two questions --

(a) What's the perf impact of doing this? Less work for the GPU MMU
but more work for the IOMMU...
(b) Would it be a good idea to do this for desktop GPUs that are on
CPUs with IOMMUs in them (VT-d and whatever the AMD one is)? Is there
some sort of shared API for this stuff that you should be (or are?)
using?

  -ilia


_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages
       [not found]         ` <CAKb7UvhEuNSYshiP+kt66zHfsmB=Edy+t7HLs2j_+-hvzzkdMQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-16 19:55           ` Terje Bergstrom
       [not found]             ` <553013A4.1020700-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-16 19:55 UTC (permalink / raw)
  To: Ilia Mirkin, Vince Hsu
  Cc: Ben Skeggs, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


On 04/16/2015 12:31 PM, Ilia Mirkin wrote:
> Two questions --
>
> (a) What's the perf impact of doing this? Less work for the GPU MMU
> but more work for the IOMMU...
> (b) Would it be a good idea to do this for desktop GPUs that are on
> CPUs with IOMMUs in them (VT-d and whatever the AMD one is)? Is there
> some sort of shared API for this stuff that you should be (or are?)
> using?
a) Using IOMMU mapping is the best way of getting contiguous post-GMMU
address spaces. The contiguity is required to be able to use frame
buffer compression. So the overall performance impact, when compression is
factored in, is about 20-30%.
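
For reference, the comptag ends up in the large-page PTE itself - simplified
from gk20a_vm_map() in patch 6:

	if (mem->tag) {
		u32 tag = mem->tag->offset;
		phys |= (u64)tag << (32 + 12);	/* comptag line in the PTE */
		ltc->tags_clear(ltc, tag, 1);
	}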

If compression is left out of the equation, the impact of SMMU translation
and small versus large pages should not be noticeable, but I haven't
measured it. We have measured large versus small pages with compression
disabled in both cases on gk20a and the difference was noise.

An additional advantage is extra protection against the GPU accidentally
walking over kernel memory if the kernel driver has a bug.

b) This is a Tegra-specific mechanism, and for dGPU sysmem is handled
differently, so I don't have a good answer to that. I *believe* that on dGPU
sysmem does not support compression, so it would be a question of memory
protection, not performance.

(I'm hoping this email does not get a corporate boilerplate added - if
it does, I apologize, and feel free to ignore it)

-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages
       [not found]             ` <553013A4.1020700-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-16 20:01               ` Ilia Mirkin
  0 siblings, 0 replies; 39+ messages in thread
From: Ilia Mirkin @ 2015-04-16 20:01 UTC (permalink / raw)
  To: Terje Bergstrom; +Cc: Ben Skeggs, nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Thu, Apr 16, 2015 at 3:55 PM, Terje Bergstrom <tbergstrom@nvidia.com> wrote:
>
> On 04/16/2015 12:31 PM, Ilia Mirkin wrote:
>>
>> Two questions --
>>
>> (a) What's the perf impact of doing this? Less work for the GPU MMU
>> but more work for the IOMMU...
>> (b) Would it be a good idea to do this for desktop GPUs that are on
>> CPUs with IOMMUs in them (VT-d and whatever the AMD one is)? Is there
>> some sort of shared API for this stuff that you should be (or are?)
>> using?
>
> a) Using IOMMU mapping is the best way of getting contiguous post-GMMU
> address spaces. The continuity is required to be able to use frame buffer
> compression. So overall performance impact when compression is factored in
> is about 20-30%.
>
> If compression is left out of the equation, the impact SMMU translation and
> small versus large pages should not be noticeable, but I haven't measured
> it. We have measured large versus small pages with compression disabled in
> both cases in gk20a and the difference was noise.

Ah, I never made the connection to compression. I had assumed it was
something done at a higher level by PGRAPH rather than at the PTE
level by the VM. [I did know that you had to set compression at the
PTE level, but didn't think that page size mattered.]

>
> Additional advantage is extra protection against GPU accidentally walking
> over kernel memory if kernel driver has a bug.

Yeah, IOMMU's are nice :)
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found]     ` <1429182379-31964-2-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-16 20:41       ` Terje Bergstrom
       [not found]         ` <55301E65.5000705-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-17  6:26       ` Alexandre Courbot
  1 sibling, 1 reply; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-16 20:41 UTC (permalink / raw)
  To: Vince Hsu, bskeggs-H+wXaHxf7aLQT0dZR+AlfA, gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


On 04/16/2015 04:06 AM, Vince Hsu wrote:
> diff --git a/drm/nouveau/nouveau_platform.h b/drm/nouveau/nouveau_platform.h
> index 392874cf4725..3e9bd7dc0092 100644
> --- a/drm/nouveau/nouveau_platform.h
> +++ b/drm/nouveau/nouveau_platform.h
> @@ -53,6 +53,7 @@ struct nouveau_platform_gpu {
>   		struct nvkm_mm *mm;
>   		struct iommu_domain *domain;
>   		unsigned long pgshift;
> +		unsigned long phys_addr_bit;
>   	} iommu;
>   };
The naming is a bit reversed - the bit being set tells that we use the SMMU,
and the bit being cleared tells that we use physical addressing. So the name
of the bit should be iommu_addr_bit.

Terje
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found]         ` <55301E65.5000705-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  2:22           ` Vince Hsu
  0 siblings, 0 replies; 39+ messages in thread
From: Vince Hsu @ 2015-04-17  2:22 UTC (permalink / raw)
  To: Terje Bergstrom, bskeggs-H+wXaHxf7aLQT0dZR+AlfA,
	gnurou-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 04/17/2015 04:41 AM, Terje Bergstrom wrote:
>
> On 04/16/2015 04:06 AM, Vince Hsu wrote:
>> diff --git a/drm/nouveau/nouveau_platform.h 
>> b/drm/nouveau/nouveau_platform.h
>> index 392874cf4725..3e9bd7dc0092 100644
>> --- a/drm/nouveau/nouveau_platform.h
>> +++ b/drm/nouveau/nouveau_platform.h
>> @@ -53,6 +53,7 @@ struct nouveau_platform_gpu {
>>           struct nvkm_mm *mm;
>>           struct iommu_domain *domain;
>>           unsigned long pgshift;
>> +        unsigned long phys_addr_bit;
>>       } iommu;
>>   };
> The naming is a bit reversed - the bit set tells we use SMMU, and bit 
> cleared tells we use physical addressing. So the name of the bit 
> should be iommu_addr_bit.
Yep, you're right. How about just "addr_bit", since we already have the
struct name iommu as a prefix?

Thanks,
Vince



-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/6] map big page by platform IOMMU
       [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
                     ` (5 preceding siblings ...)
  2015-04-16 11:06   ` [PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages Vince Hsu
@ 2015-04-17  6:25   ` Alexandre Courbot
       [not found]     ` <CAAVeFuJLRo1XYum3FZBhO5=5u63KLUY_tr0wk=TPxWutvCdL8A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  6 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  6:25 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
> Hi,
>
> Generally the the imported buffers which has memory type TTM_PL_TT are
> mapped as small pages probably due to lack of big page allocation. But the
> platform device which also use memory type TTM_PL_TT, like GK20A, can

Nit: GK20A can *only* allocate GPU memory from TTM_PL_TT. Trying to
allocate from VRAM will result in an error.

> *allocate* big page though the IOMMU hardware inside the SoC. This is a try
> to map the imported buffers as big pages in GMMU by the platform IOMMU. With
> some preparation work to map decreate small pages into big page(s) by IOMMU

decreate?

> the GMMU eventually sees the imported buffer as chunks of big pages and does
> the mapping. And then we can probably do the compression on teh imported
> buffer which is composed of non-contiguous small pages. The compbits related
> patches shall come later.
>
> I guess most of you won't like the change for the MMU code in this series.
> So please comment and guide me how to do this better. :)
>
> Thanks,
> Vince
>
> Vince Hsu (6):
>   platform: specify the IOMMU physical translation bit
>   instmem/gk20a: refer to IOMMU physical translation bit
>   mmu: map small pages into big pages(s) by IOMMU if possible
>   drm: enable big page mapping for small pages when IOMMU is available
>   mmu: gf100: share most of functions with GK20A
>   mmu: gk20a: implement IOMMU mapping for big pages
>
>  drm/nouveau/include/nvkm/subdev/mmu.h   |  16 ++
>  drm/nouveau/nouveau_bo.c                |   9 ++
>  drm/nouveau/nouveau_platform.c          |  19 +++
>  drm/nouveau/nouveau_platform.h          |   1 +
>  drm/nouveau/nvkm/engine/device/gk104.c  |   2 +-
>  drm/nouveau/nvkm/subdev/instmem/gk20a.c |  13 +-
>  drm/nouveau/nvkm/subdev/mmu/Kbuild      |   1 +
>  drm/nouveau/nvkm/subdev/mmu/base.c      | 158 +++++++++++++++++++-
>  drm/nouveau/nvkm/subdev/mmu/gf100.c     |  28 +---
>  drm/nouveau/nvkm/subdev/mmu/gf100.h     |  46 ++++++
>  drm/nouveau/nvkm/subdev/mmu/gk20a.c     | 253 ++++++++++++++++++++++++++++++++
>  lib/include/nvif/os.h                   |  12 ++
>  12 files changed, 526 insertions(+), 32 deletions(-)
>  create mode 100644 drm/nouveau/nvkm/subdev/mmu/gf100.h
>  create mode 100644 drm/nouveau/nvkm/subdev/mmu/gk20a.c
>
> --
> 2.1.4
>
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found]     ` <1429182379-31964-2-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-16 20:41       ` Terje Bergstrom
@ 2015-04-17  6:26       ` Alexandre Courbot
       [not found]         ` <CAAVeFuL10UidtE4_Y2mYEYpWjDqXn-FQF3zJQv2eQDqg_QTJ1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  6:26 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
> The IOMMU physical translation bit might vary with different SoCs. So add
> a variable to specify this bit for GK20A.
>
> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
> ---
>  drm/nouveau/nouveau_platform.c | 19 +++++++++++++++++++
>  drm/nouveau/nouveau_platform.h |  1 +
>  2 files changed, 20 insertions(+)
>
> diff --git a/drm/nouveau/nouveau_platform.c b/drm/nouveau/nouveau_platform.c
> index 775277f1edb0..0d002f73e356 100644
> --- a/drm/nouveau/nouveau_platform.c
> +++ b/drm/nouveau/nouveau_platform.c
> @@ -25,6 +25,7 @@
>  #include <linux/module.h>
>  #include <linux/platform_device.h>
>  #include <linux/of.h>
> +#include <linux/of_device.h>
>  #include <linux/reset.h>
>  #include <linux/regulator/consumer.h>
>  #include <linux/iommu.h>
> @@ -92,6 +93,22 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
>         return 0;
>  }
>
> +static unsigned long nouveau_platform_get_iommu_bit(struct device *dev)
> +{
> +       const struct of_device_id *match;
> +
> +       match = of_match_device(dev->driver->of_match_table, dev);
> +       if (!match) {
> +               dev_warn(dev, "cannot find OF match for device\n");
> +               return 0;
> +       }
> +
> +       if (!strcmp(match->compatible, "nvidia,gk20a"))
> +               return 34;
> +       else
> +               return 0;
> +}

Instead of this function, you should probably use the data field of
struct of_device_id. Define a local struct called, say,
nouveau_platform_params containing (for now) a single iommu_addr_bit
field and instantiate one for each entry of nouveau_platform_match.
Then you can cast match->data and retrieve the field directly instead
of using strcmp.

I'd say this is then simple enough to do directly in
nouveau_platform_probe_iommu() instead of having a function dedicated
to it. It is also safer because when we add a new chip, we have to
update nouveau_platform_match but might very well forget about your
function, and will end up with bit 0 being set on all our GPU
addresses and endless hours of fun figuring out how this happened. :)

While I am at it: how about defining this as a u64 mask to set/unset
on GPU addresses instead of just a bit? This is more flexible and
again safer in case someone "omits" to specify the correct bit for a
chip.
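
Roughly what I have in mind (untested sketch; the params struct and its
instance are named arbitrarily, only nouveau_platform_match and the
iommu.phys_addr_bit field from patch 1 are taken from the series):

    struct nouveau_platform_params {
            u64 iommu_addr_bit;
    };

    static const struct nouveau_platform_params gk20a_platform_params = {
            .iommu_addr_bit = 34,
    };

    static const struct of_device_id nouveau_platform_match[] = {
            { .compatible = "nvidia,gk20a", .data = &gk20a_platform_params },
            { }
    };

    /* in nouveau_platform_probe_iommu() */
    const struct of_device_id *match;

    match = of_match_device(dev->driver->of_match_table, dev);
    if (match && match->data) {
            const struct nouveau_platform_params *params = match->data;

            gpu->iommu.phys_addr_bit = params->iommu_addr_bit;
    }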
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/6] instmem/gk20a: refer to IOMMU physical translation bit
       [not found]     ` <1429182379-31964-3-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  6:26       ` Alexandre Courbot
       [not found]         ` <CAAVeFuJqWuQeH1tbAdkVpocBAjq52GPv_O_yxd0X+7gSHjCoJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  6:26 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
> Instead of hard-coding the translation bit in subdev driver, we refer to
> the platform data.
>
> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
> ---
>  drm/nouveau/nvkm/subdev/instmem/gk20a.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
> index dd0994d9ebfc..69ef5eae3279 100644
> --- a/drm/nouveau/nvkm/subdev/instmem/gk20a.c
> +++ b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
> @@ -89,6 +89,7 @@ struct gk20a_instmem_priv {
>         struct nvkm_mm *mm;
>         struct iommu_domain *domain;
>         unsigned long iommu_pgshift;
> +       unsigned long iommu_phys_addr_bit;
>
>         /* Only used by DMA API */
>         struct dma_attrs attrs;
> @@ -169,8 +170,8 @@ gk20a_instobj_dtor_iommu(struct gk20a_instobj_priv *_node)
>         r = list_first_entry(&_node->mem->regions, struct nvkm_mm_node,
>                              rl_entry);
>
> -       /* clear bit 34 to unmap pages */
> -       r->offset &= ~BIT(34 - priv->iommu_pgshift);
> +       /* clear IOMMU translation bit to unmap pages */
> +       r->offset &= ~BIT(priv->iommu_phys_addr_bit - priv->iommu_pgshift);
>
>         /* Unmap pages from GPU address space and free them */
>         for (i = 0; i < _node->mem->size; i++) {
> @@ -298,8 +299,11 @@ gk20a_instobj_ctor_iommu(struct nvkm_object *parent, struct nvkm_object *engine,
>                 }
>         }
>
> -       /* Bit 34 tells that an address is to be resolved through the IOMMU */
> -       r->offset |= BIT(34 - priv->iommu_pgshift);
> +       /*
> +        * The iommu_phys_addr_bit tells that an address is to be resolved
> +        * through the IOMMU
> +        */
> +       r->offset |= BIT(priv->iommu_phys_addr_bit - priv->iommu_pgshift);
>
>         node->base._mem.offset = ((u64)r->offset) << priv->iommu_pgshift;
>
> @@ -407,6 +411,7 @@ gk20a_instmem_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
>                 priv->domain = plat->gpu->iommu.domain;
>                 priv->mm = plat->gpu->iommu.mm;
>                 priv->iommu_pgshift = plat->gpu->iommu.pgshift;
> +               priv->iommu_phys_addr_bit = plat->gpu->iommu.phys_addr_bit;

Looks good, but I think I would definitely prefer this to be a mask
instead of a bit index, i.e:

    r->offset &= ~(priv->iommu_addr_mask >> priv->iommu_pgshift);

and

    r->offset |= (priv->iommu_addr_mask >> priv->iommu_pgshift);
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
       [not found]     ` <1429182379-31964-5-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  6:26       ` Alexandre Courbot
       [not found]         ` <CAAVeFu+_zJ4DUwebhzpfA9uJroFEVrOBRqiXUuS9Gk04hVA88A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  6:26 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
> Some platforms have IOMMU to map non-contiguous physical memory into
> contiguous GPU virtual address. We can use this feature to enable big pages
> mapping on scattered small pages. To achieve that, we also need changes in
> subdev/mmu as well.
>
> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
> ---
>  drm/nouveau/nouveau_bo.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drm/nouveau/nouveau_bo.c b/drm/nouveau/nouveau_bo.c
> index 77326e344dad..da76ee1121e4 100644
> --- a/drm/nouveau/nouveau_bo.c
> +++ b/drm/nouveau/nouveau_bo.c
> @@ -221,6 +221,11 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
>         if (drm->client.vm) {
>                 if (!(flags & TTM_PL_FLAG_TT) && size > 256 * 1024)
>                         nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
> +
> +               if ((flags & TTM_PL_FLAG_TT) &&
> +                               drm->client.vm->mmu->iommu_capable &&
> +                               (size % (1 << drm->client.vm->mmu->lpg_shift)) == 0)
> +                       nvbo->page_shift = drm->client.vm->mmu->lpg_shift;

I wonder if we should not just use the same size heuristics as for VRAM above?

Here, unless your buffer size is an exact multiple of the big page
size (128K for GK20A), you will not use big pages at all. In effect,
many buffers will be rejected for this reason. A behavior like "if the
buffer size of more than 256KB, increase the size of the buffer to the
next multiple of 128K and use big pages" would probably yield better
results.
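
Concretely, something along these lines (rough sketch, not tested):

    if ((flags & TTM_PL_FLAG_TT) && drm->client.vm->mmu->iommu_capable &&
        size > 256 * 1024) {
            /* round up so the whole buffer can be covered by big pages */
            size = roundup(size, 1 << drm->client.vm->mmu->lpg_shift);
            nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
    }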

>         }
>
>         nouveau_bo_fixup_align(nvbo, flags, &align, &size);
> @@ -1641,6 +1646,10 @@ nouveau_bo_vma_add(struct nouveau_bo *nvbo, struct nvkm_vm *vm,
>             (nvbo->bo.mem.mem_type == TTM_PL_VRAM ||
>              nvbo->page_shift != vma->vm->mmu->lpg_shift))
>                 nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
> +       else if (nvbo->bo.mem.mem_type == TTM_PL_TT &&
> +               vma->vm->mmu->iommu_capable &&
> +               nvbo->page_shift == vma->vm->mmu->lpg_shift)
> +               nvkm_vm_map(vma, nvbo->bo.mem.mm_node);

Sorry, I don't understand why this is needed, could you explain?
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 5/6] mmu: gf100: share most of functions with GK20A
       [not found]     ` <1429182379-31964-6-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  6:27       ` Alexandre Courbot
       [not found]         ` <CAAVeFuJay4TNdOwEorver+-WiYCLJ21BXcRgCBv1aush_i1wbg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  6:27 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
> This patch moves all of the functions which can be shared with GK20A to
> public for later use.
>
> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
> ---
>  drm/nouveau/nvkm/subdev/mmu/gf100.c | 28 +++++++---------------
>  drm/nouveau/nvkm/subdev/mmu/gf100.h | 46 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+), 20 deletions(-)
>  create mode 100644 drm/nouveau/nvkm/subdev/mmu/gf100.h
>
> diff --git a/drm/nouveau/nvkm/subdev/mmu/gf100.c b/drm/nouveau/nvkm/subdev/mmu/gf100.c
> index 294cda37f068..b067ded5d3be 100644
> --- a/drm/nouveau/nvkm/subdev/mmu/gf100.c
> +++ b/drm/nouveau/nvkm/subdev/mmu/gf100.c
> @@ -29,6 +29,8 @@
>
>  #include <core/gpuobj.h>
>
> +#include "gf100.h"
> +
>  struct gf100_mmu_priv {
>         struct nvkm_mmu base;
>  };
> @@ -74,7 +76,7 @@ const u8 gf100_pte_storage_type_map[256] =
>  };
>
>
> -static void
> +void
>  gf100_vm_map_pgt(struct nvkm_gpuobj *pgd, u32 index, struct nvkm_gpuobj *pgt[2])
>  {
>         u32 pde[2] = { 0, 0 };
> @@ -88,21 +90,7 @@ gf100_vm_map_pgt(struct nvkm_gpuobj *pgd, u32 index, struct nvkm_gpuobj *pgt[2])
>         nv_wo32(pgd, (index * 8) + 4, pde[1]);
>  }
>
> -static inline u64
> -gf100_vm_addr(struct nvkm_vma *vma, u64 phys, u32 memtype, u32 target)
> -{
> -       phys >>= 8;
> -
> -       phys |= 0x00000001; /* present */
> -       if (vma->access & NV_MEM_ACCESS_SYS)
> -               phys |= 0x00000002;
> -
> -       phys |= ((u64)target  << 32);
> -       phys |= ((u64)memtype << 36);
> -       return phys;
> -}
> -
> -static void
> +void
>  gf100_vm_map(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
>              struct nvkm_mem *mem, u32 pte, u32 cnt, u64 phys, u64 delta)
>  {
> @@ -127,7 +115,7 @@ gf100_vm_map(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
>         }
>  }
>
> -static void
> +void
>  gf100_vm_map_sg(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
>                 struct nvkm_mem *mem, u32 pte, u32 cnt, dma_addr_t *list)
>  {
> @@ -144,7 +132,7 @@ gf100_vm_map_sg(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
>         }
>  }
>
> -static void
> +void
>  gf100_vm_unmap(struct nvkm_gpuobj *pgt, u32 pte, u32 cnt)
>  {
>         pte <<= 3;
> @@ -155,7 +143,7 @@ gf100_vm_unmap(struct nvkm_gpuobj *pgt, u32 pte, u32 cnt)
>         }
>  }
>
> -static void
> +void
>  gf100_vm_flush(struct nvkm_vm *vm)
>  {
>         struct gf100_mmu_priv *priv = (void *)vm->mmu;
> @@ -191,7 +179,7 @@ gf100_vm_flush(struct nvkm_vm *vm)
>         mutex_unlock(&nv_subdev(priv)->mutex);
>  }
>
> -static int
> +int
>  gf100_vm_create(struct nvkm_mmu *mmu, u64 offset, u64 length, u64 mm_offset,
>                 struct nvkm_vm **pvm)
>  {
> diff --git a/drm/nouveau/nvkm/subdev/mmu/gf100.h b/drm/nouveau/nvkm/subdev/mmu/gf100.h
> new file mode 100644
> index 000000000000..a66ca45bc755
> --- /dev/null
> +++ b/drm/nouveau/nvkm/subdev/mmu/gf100.h
> @@ -0,0 +1,46 @@
> +#ifndef __GF100_MMU_PRIV__
> +#define __GF100_MMU_PRIV__
> +
> +#include <subdev/mmu.h>
> +
> +struct nv04_mmu_priv {
> +       struct nvkm_mmu base;
> +       struct nvkm_vm *vm;
> +       dma_addr_t null;
> +       void *nullp;
> +};

This structure is already declared in nv04.h - it looks wrong to
re-declare it here, especially since it seems to be unused in all your
code?

> +
> +int
> +gf100_vm_create(struct nvkm_mmu *mmu, u64 offset, u64 length, u64 mm_offset,
> +               struct nvkm_vm **pvm);
> +
> +void
> +gf100_vm_map_pgt(struct nvkm_gpuobj *pgd, u32 index,
> +               struct nvkm_gpuobj *pgt[2]);
> +void
> +gf100_vm_map(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
> +            struct nvkm_mem *mem, u32 pte, u32 cnt, u64 phys, u64 delta);
> +void
> +gf100_vm_map_sg(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
> +               struct nvkm_mem *mem, u32 pte, u32 cnt, dma_addr_t *list);
> +void
> +gf100_vm_unmap(struct nvkm_gpuobj *pgt, u32 pte, u32 cnt);
> +
> +void
> +gf100_vm_flush(struct nvkm_vm *vm);
> +
> +static inline u64
> +gf100_vm_addr(struct nvkm_vma *vma, u64 phys, u32 memtype, u32 target)
> +{
> +       phys >>= 8;
> +
> +       phys |= 0x00000001; /* present */
> +       if (vma->access & NV_MEM_ACCESS_SYS)
> +               phys |= 0x00000002;
> +
> +       phys |= ((u64)target  << 32);
> +       phys |= ((u64)memtype << 36);
> +       return phys;
> +}
> +
> +#endif
> --
> 2.1.4
>
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found]         ` <CAAVeFuL10UidtE4_Y2mYEYpWjDqXn-FQF3zJQv2eQDqg_QTJ1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-17  7:07           ` Vince Hsu
  2015-04-17 15:10           ` Terje Bergstrom
  1 sibling, 0 replies; 39+ messages in thread
From: Vince Hsu @ 2015-04-17  7:07 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs



On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>> The IOMMU physical translation bit might vary with different SoCs. So add
>> a variable to specify this bit for GK20A.
>>
>> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
>> ---
>>   drm/nouveau/nouveau_platform.c | 19 +++++++++++++++++++
>>   drm/nouveau/nouveau_platform.h |  1 +
>>   2 files changed, 20 insertions(+)
>>
>> diff --git a/drm/nouveau/nouveau_platform.c b/drm/nouveau/nouveau_platform.c
>> index 775277f1edb0..0d002f73e356 100644
>> --- a/drm/nouveau/nouveau_platform.c
>> +++ b/drm/nouveau/nouveau_platform.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/module.h>
>>   #include <linux/platform_device.h>
>>   #include <linux/of.h>
>> +#include <linux/of_device.h>
>>   #include <linux/reset.h>
>>   #include <linux/regulator/consumer.h>
>>   #include <linux/iommu.h>
>> @@ -92,6 +93,22 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
>>          return 0;
>>   }
>>
>> +static unsigned long nouveau_platform_get_iommu_bit(struct device *dev)
>> +{
>> +       const struct of_device_id *match;
>> +
>> +       match = of_match_device(dev->driver->of_match_table, dev);
>> +       if (!match) {
>> +               dev_warn(dev, "cannot find OF match for device\n");
>> +               return 0;
>> +       }
>> +
>> +       if (!strcmp(match->compatible, "nvidia,gk20a"))
>> +               return 34;
>> +       else
>> +               return 0;
>> +}
> Instead of this function, you should probably use the data field of
> struct of_device_id. Define a local struct called, say,
> nouveau_platform_params containing (for now) a single iommu_addr_bit
> field and instanciate one for each entry of nouveau_platform_match.
> Then you can cast match->data and retrieve the field directly instead
> of using strcmp.
Actually I did it that way when I was based on your tree, since you already 
defined the probe data. I admit I jerry-built this patch while rebasing to 
Ben's tree. Will fix in the next version.

>
> I'd say this is then simple enough to do directly in
> nouveau_platform_probe_iommu() instead of having a function dedicated
> to it. It is also safer because when we add a new chip, we have to
> update nouveau_platform_match but might very well forget about your
> function, and will end up with bit 0 being set on all our GPU
> addresses and endless hours of fun figuring out how this happened. :)
>
> While I am at it: how about defining this as a u64 mask to set/unset
> on GPU addresses instead of just a bit? This is more flexible and
> again safer in case someone "omits" to specify the correct bit for a
> chip.
>


_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/6] instmem/gk20a: refer to IOMMU physical translation bit
       [not found]         ` <CAAVeFuJqWuQeH1tbAdkVpocBAjq52GPv_O_yxd0X+7gSHjCoJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-17  7:19           ` Vince Hsu
       [not found]             ` <5530B3F7.3000000-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-17 15:14           ` Terje Bergstrom
  1 sibling, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-17  7:19 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs


On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>> Instead of hard-coding the translation bit in subdev driver, we refer to
>> the platform data.
>>
>> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
>> ---
>>   drm/nouveau/nvkm/subdev/instmem/gk20a.c | 13 +++++++++----
>>   1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
>> index dd0994d9ebfc..69ef5eae3279 100644
>> --- a/drm/nouveau/nvkm/subdev/instmem/gk20a.c
>> +++ b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
>> @@ -89,6 +89,7 @@ struct gk20a_instmem_priv {
>>          struct nvkm_mm *mm;
>>          struct iommu_domain *domain;
>>          unsigned long iommu_pgshift;
>> +       unsigned long iommu_phys_addr_bit;
>>
>>          /* Only used by DMA API */
>>          struct dma_attrs attrs;
>> @@ -169,8 +170,8 @@ gk20a_instobj_dtor_iommu(struct gk20a_instobj_priv *_node)
>>          r = list_first_entry(&_node->mem->regions, struct nvkm_mm_node,
>>                               rl_entry);
>>
>> -       /* clear bit 34 to unmap pages */
>> -       r->offset &= ~BIT(34 - priv->iommu_pgshift);
>> +       /* clear IOMMU translation bit to unmap pages */
>> +       r->offset &= ~BIT(priv->iommu_phys_addr_bit - priv->iommu_pgshift);
>>
>>          /* Unmap pages from GPU address space and free them */
>>          for (i = 0; i < _node->mem->size; i++) {
>> @@ -298,8 +299,11 @@ gk20a_instobj_ctor_iommu(struct nvkm_object *parent, struct nvkm_object *engine,
>>                  }
>>          }
>>
>> -       /* Bit 34 tells that an address is to be resolved through the IOMMU */
>> -       r->offset |= BIT(34 - priv->iommu_pgshift);
>> +       /*
>> +        * The iommu_phys_addr_bit tells that an address is to be resolved
>> +        * through the IOMMU
>> +        */
>> +       r->offset |= BIT(priv->iommu_phys_addr_bit - priv->iommu_pgshift);
>>
>>          node->base._mem.offset = ((u64)r->offset) << priv->iommu_pgshift;
>>
>> @@ -407,6 +411,7 @@ gk20a_instmem_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
>>                  priv->domain = plat->gpu->iommu.domain;
>>                  priv->mm = plat->gpu->iommu.mm;
>>                  priv->iommu_pgshift = plat->gpu->iommu.pgshift;
>> +               priv->iommu_phys_addr_bit = plat->gpu->iommu.phys_addr_bit;
> Looks good, but I think I would definitely prefer this to be a mask
> instead of a bit index, i.e:
>
>      r->offset &= ~(priv->iommu_addr_mask >> priv->iommu_pgshift);
>
> and
>
>      r->offset |= (priv->iommu_addr_mask >> priv->iommu_pgshift);
Which one is for setting up an IOMMU address if iommu_addr_mask is 
defined as ~BIT_ULL(34)?

Thanks,
Vince




_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
       [not found]         ` <CAAVeFu+_zJ4DUwebhzpfA9uJroFEVrOBRqiXUuS9Gk04hVA88A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-17  7:28           ` Vince Hsu
       [not found]             ` <5530B603.9010703-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-17  7:28 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs



On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>> Some platforms have IOMMU to map non-contiguous physical memory into
>> contiguous GPU virtual address. We can use this feature to enable big pages
>> mapping on scattered small pages. To achieve that, we also need changes in
>> subdev/mmu as well.
>>
>> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
>> ---
>>   drm/nouveau/nouveau_bo.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drm/nouveau/nouveau_bo.c b/drm/nouveau/nouveau_bo.c
>> index 77326e344dad..da76ee1121e4 100644
>> --- a/drm/nouveau/nouveau_bo.c
>> +++ b/drm/nouveau/nouveau_bo.c
>> @@ -221,6 +221,11 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
>>          if (drm->client.vm) {
>>                  if (!(flags & TTM_PL_FLAG_TT) && size > 256 * 1024)
>>                          nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
>> +
>> +               if ((flags & TTM_PL_FLAG_TT) &&
>> +                               drm->client.vm->mmu->iommu_capable &&
>> +                               (size % (1 << drm->client.vm->mmu->lpg_shift)) == 0)
>> +                       nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
> I wonder if we should not just use the same size heuristics as for VRAM above?
>
> Here, unless your buffer size is an exact multiple of the big page
> size (128K for GK20A), you will not use big pages at all. In effect,
> many buffers will be rejected for this reason. A behavior like "if the
> buffer size of more than 256KB, increase the size of the buffer to the
> next multiple of 128K and use big pages" would probably yield better
> results.
I'm told that user space would align the buffer to the big page size, so 
I made this patch based on that impression. I'm not very familiar with 
the user space behavior, so sorry if I say something wrong here.

And are we allowed to extend the buffer size even if the exporter doesn't 
know about it?

>>          }
>>
>>          nouveau_bo_fixup_align(nvbo, flags, &align, &size);
>> @@ -1641,6 +1646,10 @@ nouveau_bo_vma_add(struct nouveau_bo *nvbo, struct nvkm_vm *vm,
>>              (nvbo->bo.mem.mem_type == TTM_PL_VRAM ||
>>               nvbo->page_shift != vma->vm->mmu->lpg_shift))
>>                  nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
>> +       else if (nvbo->bo.mem.mem_type == TTM_PL_TT &&
>> +               vma->vm->mmu->iommu_capable &&
>> +               nvbo->page_shift == vma->vm->mmu->lpg_shift)
>> +               nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
> Sorry, I don't understand why this is needed, could you explain?
Originally the buffer could only be mapped when the memory type is VRAM or 
the page is small, so I added another condition to map big pages for TT.

Thanks,
Vince





_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 5/6] mmu: gf100: share most of functions with GK20A
       [not found]         ` <CAAVeFuJay4TNdOwEorver+-WiYCLJ21BXcRgCBv1aush_i1wbg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-17  7:30           ` Vince Hsu
  0 siblings, 0 replies; 39+ messages in thread
From: Vince Hsu @ 2015-04-17  7:30 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs



On 04/17/2015 02:27 PM, Alexandre Courbot wrote:
>> -static int
>> >+int
>> >  gf100_vm_create(struct nvkm_mmu *mmu, u64 offset, u64 length, u64 mm_offset,
>> >                 struct nvkm_vm **pvm)
>> >  {
>> >diff --git a/drm/nouveau/nvkm/subdev/mmu/gf100.h b/drm/nouveau/nvkm/subdev/mmu/gf100.h
>> >new file mode 100644
>> >index 000000000000..a66ca45bc755
>> >--- /dev/null
>> >+++ b/drm/nouveau/nvkm/subdev/mmu/gf100.h
>> >@@ -0,0 +1,46 @@
>> >+#ifndef __GF100_MMU_PRIV__
>> >+#define __GF100_MMU_PRIV__
>> >+
>> >+#include <subdev/mmu.h>
>> >+
>> >+struct nv04_mmu_priv {
>> >+       struct nvkm_mmu base;
>> >+       struct nvkm_vm *vm;
>> >+       dma_addr_t null;
>> >+       void *nullp;
>> >+};
> This structure is already declared in nv04.h - it looks wrong to
> re-declare it here, especially since it seems to be unused in all your
> code?
>
I don't know why this structure shows up here, either. It must be a 
copy-paste error. Sorry. :P

Thanks,
Vince


_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
       [not found]             ` <5530B603.9010703-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  7:33               ` Alexandre Courbot
  2015-04-17 15:24               ` Terje Bergstrom
  1 sibling, 0 replies; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  7:33 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Fri, Apr 17, 2015 at 4:28 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>
>
> On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
>>
>> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>>>
>>> Some platforms have IOMMU to map non-contiguous physical memory into
>>> contiguous GPU virtual address. We can use this feature to enable big
>>> pages
>>> mapping on scattered small pages. To achieve that, we also need changes
>>> in
>>> subdev/mmu as well.
>>>
>>> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
>>> ---
>>>   drm/nouveau/nouveau_bo.c | 9 +++++++++
>>>   1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drm/nouveau/nouveau_bo.c b/drm/nouveau/nouveau_bo.c
>>> index 77326e344dad..da76ee1121e4 100644
>>> --- a/drm/nouveau/nouveau_bo.c
>>> +++ b/drm/nouveau/nouveau_bo.c
>>> @@ -221,6 +221,11 @@ nouveau_bo_new(struct drm_device *dev, int size, int
>>> align,
>>>          if (drm->client.vm) {
>>>                  if (!(flags & TTM_PL_FLAG_TT) && size > 256 * 1024)
>>>                          nvbo->page_shift =
>>> drm->client.vm->mmu->lpg_shift;
>>> +
>>> +               if ((flags & TTM_PL_FLAG_TT) &&
>>> +                               drm->client.vm->mmu->iommu_capable &&
>>> +                               (size % (1 <<
>>> drm->client.vm->mmu->lpg_shift)) == 0)
>>> +                       nvbo->page_shift =
>>> drm->client.vm->mmu->lpg_shift;
>>
>> I wonder if we should not just use the same size heuristics as for VRAM
>> above?
>>
>> Here, unless your buffer size is an exact multiple of the big page
>> size (128K for GK20A), you will not use big pages at all. In effect,
>> many buffers will be rejected for this reason. A behavior like "if the
>> buffer size of more than 256KB, increase the size of the buffer to the
>> next multiple of 128K and use big pages" would probably yield better
>> results.
>
> I'm told that the user space would align the buffer to the big page size. So
> I made this patch based on that impression. I'm not pretty familiar with the
> user space behavior, so sorry if I say something wrong here.

We would have to add this logic into Mesa, and only for GPUs that
feature no VRAM. Not impossible, but let's gather more informed
opinions about this.
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/6] map big page by platform IOMMU
       [not found]     ` <CAAVeFuJLRo1XYum3FZBhO5=5u63KLUY_tr0wk=TPxWutvCdL8A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-17  7:35       ` Vince Hsu
       [not found]         ` <5530B7D3.60700-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-17  7:35 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs



On 04/17/2015 02:25 PM, Alexandre Courbot wrote:
> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>> Hi,
>>
>> Generally the the imported buffers which has memory type TTM_PL_TT are
>> mapped as small pages probably due to lack of big page allocation. But the
>> platform device which also use memory type TTM_PL_TT, like GK20A, can
> Nit: GK20A can *only* allocate GPU memory from TTM_PL_TT. Trying to
> allocate from VRAM will result in an error.
Yep.

>
>> *allocate* big page though the IOMMU hardware inside the SoC. This is a try
>> to map the imported buffers as big pages in GMMU by the platform IOMMU. With
>> some preparation work to map decreate small pages into big page(s) by IOMMU
> decreate?
It should be discrete. Sorry for the typo.


>
>> the GMMU eventually sees the imported buffer as chunks of big pages and does
>> the mapping. And then we can probably do the compression on teh imported
>> buffer which is composed of non-contiguous small pages. The compbits related
>> patches shall come later.
>>
>> I guess most of you won't like the change for the MMU code in this series.
>> So please comment and guide me how to do this better. :)
>>
>> Thanks,
>> Vince
>>
>> Vince Hsu (6):
>>    platform: specify the IOMMU physical translation bit
>>    instmem/gk20a: refer to IOMMU physical translation bit
>>    mmu: map small pages into big pages(s) by IOMMU if possible
>>    drm: enable big page mapping for small pages when IOMMU is available
>>    mmu: gf100: share most of functions with GK20A
>>    mmu: gk20a: implement IOMMU mapping for big pages
>>
>>   drm/nouveau/include/nvkm/subdev/mmu.h   |  16 ++
>>   drm/nouveau/nouveau_bo.c                |   9 ++
>>   drm/nouveau/nouveau_platform.c          |  19 +++
>>   drm/nouveau/nouveau_platform.h          |   1 +
>>   drm/nouveau/nvkm/engine/device/gk104.c  |   2 +-
>>   drm/nouveau/nvkm/subdev/instmem/gk20a.c |  13 +-
>>   drm/nouveau/nvkm/subdev/mmu/Kbuild      |   1 +
>>   drm/nouveau/nvkm/subdev/mmu/base.c      | 158 +++++++++++++++++++-
>>   drm/nouveau/nvkm/subdev/mmu/gf100.c     |  28 +---
>>   drm/nouveau/nvkm/subdev/mmu/gf100.h     |  46 ++++++
>>   drm/nouveau/nvkm/subdev/mmu/gk20a.c     | 253 ++++++++++++++++++++++++++++++++
>>   lib/include/nvif/os.h                   |  12 ++
>>   12 files changed, 526 insertions(+), 32 deletions(-)
>>   create mode 100644 drm/nouveau/nvkm/subdev/mmu/gf100.h
>>   create mode 100644 drm/nouveau/nvkm/subdev/mmu/gk20a.c
>>
>> --
>> 2.1.4
>>


-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/6] map big page by platform IOMMU
       [not found]         ` <5530B7D3.60700-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  7:37           ` Alexandre Courbot
       [not found]             ` <CAAVeFuLdr7h29G6_1XGV8kTag=m4Spe_R9Aog38eRs9gAbgdVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  7:37 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Fri, Apr 17, 2015 at 4:35 PM, Vince Hsu <vinceh@nvidia.com> wrote:
> -----------------------------------------------------------------------------------
> This email message is for the sole use of the intended recipient(s) and may
> contain
> confidential information.  Any unauthorized review, use, disclosure or
> distribution
> is prohibited.  If you are not the intended recipient, please contact the
> sender by
> reply email and destroy all copies of the original message.
> -----------------------------------------------------------------------------------

Unrelated: I asked IT to add the Nouveau ML to our copyright footer
whitelist. Sorry for the inconvenience guys, please bear with us.
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/6] instmem/gk20a: refer to IOMMU physical translation bit
       [not found]             ` <5530B3F7.3000000-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  7:40               ` Alexandre Courbot
  0 siblings, 0 replies; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  7:40 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Fri, Apr 17, 2015 at 4:19 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>
> On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
>>
>> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>>>
>>> Instead of hard-coding the translation bit in subdev driver, we refer to
>>> the platform data.
>>>
>>> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
>>> ---
>>>   drm/nouveau/nvkm/subdev/instmem/gk20a.c | 13 +++++++++----
>>>   1 file changed, 9 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drm/nouveau/nvkm/subdev/instmem/gk20a.c
>>> b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
>>> index dd0994d9ebfc..69ef5eae3279 100644
>>> --- a/drm/nouveau/nvkm/subdev/instmem/gk20a.c
>>> +++ b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
>>> @@ -89,6 +89,7 @@ struct gk20a_instmem_priv {
>>>          struct nvkm_mm *mm;
>>>          struct iommu_domain *domain;
>>>          unsigned long iommu_pgshift;
>>> +       unsigned long iommu_phys_addr_bit;
>>>
>>>          /* Only used by DMA API */
>>>          struct dma_attrs attrs;
>>> @@ -169,8 +170,8 @@ gk20a_instobj_dtor_iommu(struct gk20a_instobj_priv
>>> *_node)
>>>          r = list_first_entry(&_node->mem->regions, struct nvkm_mm_node,
>>>                               rl_entry);
>>>
>>> -       /* clear bit 34 to unmap pages */
>>> -       r->offset &= ~BIT(34 - priv->iommu_pgshift);
>>> +       /* clear IOMMU translation bit to unmap pages */
>>> +       r->offset &= ~BIT(priv->iommu_phys_addr_bit -
>>> priv->iommu_pgshift);
>>>
>>>          /* Unmap pages from GPU address space and free them */
>>>          for (i = 0; i < _node->mem->size; i++) {
>>> @@ -298,8 +299,11 @@ gk20a_instobj_ctor_iommu(struct nvkm_object *parent,
>>> struct nvkm_object *engine,
>>>                  }
>>>          }
>>>
>>> -       /* Bit 34 tells that an address is to be resolved through the
>>> IOMMU */
>>> -       r->offset |= BIT(34 - priv->iommu_pgshift);
>>> +       /*
>>> +        * The iommu_phys_addr_bit tells that an address is to be
>>> resolved
>>> +        * through the IOMMU
>>> +        */
>>> +       r->offset |= BIT(priv->iommu_phys_addr_bit -
>>> priv->iommu_pgshift);
>>>
>>>          node->base._mem.offset = ((u64)r->offset) <<
>>> priv->iommu_pgshift;
>>>
>>> @@ -407,6 +411,7 @@ gk20a_instmem_ctor(struct nvkm_object *parent, struct
>>> nvkm_object *engine,
>>>                  priv->domain = plat->gpu->iommu.domain;
>>>                  priv->mm = plat->gpu->iommu.mm;
>>>                  priv->iommu_pgshift = plat->gpu->iommu.pgshift;
>>> +               priv->iommu_phys_addr_bit =
>>> plat->gpu->iommu.phys_addr_bit;
>>
>> Looks good, but I think I would definitely prefer this to be a mask
>> instead of a bit index, i.e:
>>
>>      r->offset &= ~(priv->iommu_addr_mask >> priv->iommu_pgshift);
>>
>> and
>>
>>      r->offset |= (priv->iommu_addr_mask >> priv->iommu_pgshift);
>
> Which one is for setting up an IOMMU address If the iommu_addr_mask is
> defined as ~BIT_ULL(34)?

I would define iommu_addr_mask as BIT_ULL(34). Then the |= is for
setting the mask and &= to remove it.
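
In other words (sketch, reusing the names from my earlier suggestion):

    /* platform data provides the mask, e.g. BIT_ULL(34) for GK20A */
    r->offset |=   priv->iommu_addr_mask >> priv->iommu_pgshift;   /* map */
    r->offset &= ~(priv->iommu_addr_mask >> priv->iommu_pgshift);  /* unmap */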
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/6] map big page by platform IOMMU
       [not found]             ` <CAAVeFuLdr7h29G6_1XGV8kTag=m4Spe_R9Aog38eRs9gAbgdVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-17  7:42               ` Vince Hsu
  0 siblings, 0 replies; 39+ messages in thread
From: Vince Hsu @ 2015-04-17  7:42 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs


On 04/17/2015 03:37 PM, Alexandre Courbot wrote:
> On Fri, Apr 17, 2015 at 4:35 PM, Vince Hsu <vinceh@nvidia.com> wrote:
>> -----------------------------------------------------------------------------------
>> This email message is for the sole use of the intended recipient(s) and may
>> contain
>> confidential information.  Any unauthorized review, use, disclosure or
>> distribution
>> is prohibited.  If you are not the intended recipient, please contact the
>> sender by
>> reply email and destroy all copies of the original message.
>> -----------------------------------------------------------------------------------
> Unrelated: I asked IT to add the Nouveau ML to our copyright footer
> whitelist. Sorry for the inconvenience guys, please bear with us.
Terje told me I can remove this via some setting, but I haven't tried 
that yet. Sorry.

Thanks,
Vince
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
       [not found]     ` <1429182379-31964-4-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-17  9:11       ` Alexandre Courbot
       [not found]         ` <CAAVeFuLSYGV8ghOQ26ubA9kvWHCPD2y+y_bJMLHv1V1Ab8NOYg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-17  9:11 UTC (permalink / raw)
  To: Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Terje Bergstrom, Ben Skeggs

On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh@nvidia.com> wrote:
> This patch implements a way to aggregate the small pages and make them be
> mapped as big page(s) by utilizing the platform IOMMU if supported. And then
> we can enable compression support for these big pages later.
>
> Signed-off-by: Vince Hsu <vinceh@nvidia.com>
> ---
>  drm/nouveau/include/nvkm/subdev/mmu.h |  16 ++++
>  drm/nouveau/nvkm/subdev/mmu/base.c    | 158 ++++++++++++++++++++++++++++++++--

I believe (although I may have missed something) that this patch (and
patch 6/6) can be rewritten such that these two files remain untouched.
IOW, no new data structures (because the PTE will contain all the
information you need), and no change to base.c (because IOMMU is
chip-specific logic, although one may argue that the use of the IOMMU
API makes it more generic).

But let's review the extra data structures first:

>  lib/include/nvif/os.h                 |  12 +++
>  3 files changed, 179 insertions(+), 7 deletions(-)
>
> diff --git a/drm/nouveau/include/nvkm/subdev/mmu.h b/drm/nouveau/include/nvkm/subdev/mmu.h
> index 3a5368776c31..3230d31a7971 100644
> --- a/drm/nouveau/include/nvkm/subdev/mmu.h
> +++ b/drm/nouveau/include/nvkm/subdev/mmu.h
> @@ -22,6 +22,8 @@ struct nvkm_vma {
>         struct nvkm_mm_node *node;
>         u64 offset;
>         u32 access;
> +       struct list_head bp;
> +       bool has_iommu_bp;

Whether a chunk of memory is mapped through the IOMMU can be tested by
checking if the IOMMU bit is set in the address recorded in the PTE.
So has_iommu_bp looks redundant here.

>  };
>
>  struct nvkm_vm {
> @@ -37,6 +39,13 @@ struct nvkm_vm {
>         u32 lpde;
>  };
>
> +struct nvkm_vm_bp_list {
> +       struct list_head head;
> +       u32 pde;
> +       u32 pte;
> +       void *priv;
> +};
> +

Tracking the PDE and PTE of each memory chunk can probably be avoided
if you change your unmapping strategy. Currently you are going through
the list of nvkm_vm_bp_list, but you know your PDE and PTE are always
going to be adjacent, since a nvkm_vma represents a contiguous block
in the GPU VA. So when unmapping, you can simply check for each PTE
entry whether the IOMMU bit is set, and unmap from the IOMMU space
after unmapping from the GPU VA space, in a loop similar to that of
nvkm_vm_unmap_at().
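
Roughly, at the pseudocode level (all helper names below are hypothetical,
just to show the shape of the loop):

    /* walk the PTEs covered by the vma, as nvkm_vm_unmap_at() does */
    for (pte = first; pte <= last; pte++) {
            u64 entry = read_pte(pgt, pte);       /* hypothetical helper */

            /* the chunk was mapped through the IOMMU: release its VA too */
            if (entry & iommu_bit)
                    unmap_iommu_chunk(vm, entry); /* hypothetical helper */

            clear_pte(pgt, pte);                  /* hypothetical helper */
    }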

Then we only need priv. You are keeping the nvkm_mm_node of the IOMMU
space in it, and you need it to free the IOMMU VA space. If only we
could find another way to store it, we could get rid of the whole
structure and associated list_head in nvkm_vma...

I need to give it some more thought, and we will probably need to
change a few things in base.c to make the hooks more flexible, so
please give me some more time to think about it. :) I just wanted to
share my thoughts so far in case this puts you on track.
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found]         ` <CAAVeFuL10UidtE4_Y2mYEYpWjDqXn-FQF3zJQv2eQDqg_QTJ1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-04-17  7:07           ` Vince Hsu
@ 2015-04-17 15:10           ` Terje Bergstrom
       [not found]             ` <55312257.8000606-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-17 15:10 UTC (permalink / raw)
  To: Alexandre Courbot, Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs


On 04/16/2015 11:26 PM, Alexandre Courbot wrote:
> Instead of this function, you should probably use the data field of
> struct of_device_id. Define a local struct called, say,
> nouveau_platform_params containing (for now) a single iommu_addr_bit
> field and instanciate one for each entry of nouveau_platform_match.
> Then you can cast match->data and retrieve the field directly instead
> of using strcmp.
>
> I'd say this is then simple enough to do directly in
> nouveau_platform_probe_iommu() instead of having a function dedicated
> to it. It is also safer because when we add a new chip, we have to
> update nouveau_platform_match but might very well forget about your
> function, and will end up with bit 0 being set on all our GPU
> addresses and endless hours of fun figuring out how this happened. :)
>
> While I am at it: how about defining this as a u64 mask to set/unset
> on GPU addresses instead of just a bit? This is more flexible and
> again safer in case someone "omits" to specify the correct bit for a
> chip.
Wouldn't a u64 be just as dangerous? We'd just be messing up the address 
in another way.

Another way of expressing this information would be to define the number of 
physical(*) address bits. The IOMMU bit is the bit right above the physical 
address bits.

Terje

* Physical here means after GPU MMU translation, i.e. either physical or 
IOMMU address. Somebody needs to come up with a term to cover that.
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/6] instmem/gk20a: refer to IOMMU physical translation bit
       [not found]         ` <CAAVeFuJqWuQeH1tbAdkVpocBAjq52GPv_O_yxd0X+7gSHjCoJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-04-17  7:19           ` Vince Hsu
@ 2015-04-17 15:14           ` Terje Bergstrom
       [not found]             ` <5531234C.8020806-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-17 15:14 UTC (permalink / raw)
  To: Alexandre Courbot, Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs


On 04/16/2015 11:26 PM, Alexandre Courbot wrote:
> Looks good, but I think I would definitely prefer this to be a mask
> instead of a bit index, i.e:
>
>      r->offset &= ~(priv->iommu_addr_mask >> priv->iommu_pgshift);
>
> and
>
>      r->offset |= (priv->iommu_addr_mask >> priv->iommu_pgshift);
Wouldn't that be just a more complicated way of expressing the same thing?
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
       [not found]             ` <5530B603.9010703-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-17  7:33               ` Alexandre Courbot
@ 2015-04-17 15:24               ` Terje Bergstrom
  1 sibling, 0 replies; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-17 15:24 UTC (permalink / raw)
  To: Vince Hsu, Alexandre Courbot
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs


On 04/17/2015 12:28 AM, Vince Hsu wrote:
>
> On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
>> I wonder if we should not just use the same size heuristics as for 
>> VRAM above?
>>
>> Here, unless your buffer size is an exact multiple of the big page
>> size (128K for GK20A), you will not use big pages at all. In effect,
>> many buffers will be rejected for this reason. A behavior like "if the
>> buffer size of more than 256KB, increase the size of the buffer to the
>> next multiple of 128K and use big pages" would probably yield better
>> results.
> I'm told that the user space would align the buffer to the big page 
> size. So I made this patch based on that impression. I'm not pretty 
> familiar with the user space behavior, so sorry if I say something 
> wrong here.
>
> And are we allowed to extend the buffer size even the exporter doesn't 
> know about that?
The alignment of the IOMMU VA base address and size has to be correct. As a 
consequence you could conclude that the allocated size of the buffer has to 
be big page aligned.

You could also just allocate the buffer's actual size with whatever 
alignment, and take care of alignment at buffer mapping time. If the buffer 
is 120K, we could allocate 128K of IOMMU VA at a 128K-aligned base address, 
leave the remaining 8K of IOMMU space unmapped, and map it as a single large 
page to the GPU MMU. But that would allow triggering IOMMU faults from GPU 
command buffers, and IOMMU faults are much more difficult to debug than GPU 
MMU faults, so I don't recommend this.

So in summary, I prefer taking care of buffer size alignment at 
allocation time, and IOMMU VA base alignment at mapping time.
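
I.e. something along these lines (sketch; assumes the allocation and mapping
paths have the big page shift at hand, and I am going from memory on the
nvkm_mm_head() arguments):

    /* allocation time: back the object with whole big pages */
    size = ALIGN(size, 1 << lpg_shift);

    /* mapping time: ask the IOMMU VA allocator for a big-page-aligned base,
     * with the alignment expressed in IOMMU page units */
    ret = nvkm_mm_head(priv->mm, 0, 1, npages, npages,
                       (1 << lpg_shift) >> priv->iommu_pgshift, &r);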

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
       [not found]         ` <CAAVeFuLSYGV8ghOQ26ubA9kvWHCPD2y+y_bJMLHv1V1Ab8NOYg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-17 15:37           ` Terje Bergstrom
       [not found]             ` <553128CA.40304-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-17 15:37 UTC (permalink / raw)
  To: Alexandre Courbot, Vince Hsu
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs


On 04/17/2015 02:11 AM, Alexandre Courbot wrote:
> Tracking the PDE and PTE of each memory chunk can probably be avoided
> if you change your unmapping strategy. Currently you are going through
> the list of nvkm_vm_bp_list, but you know your PDE and PTE are always
> going to be adjacent, since a nvkm_vma represents a contiguous block
> in the GPU VA. So when unmapping, you can simply check for each PTE
> entry whether the IOMMU bit is set, and unmap from the IOMMU space
> after unmapping from the GPU VA space, in a loop similar to that of
> nvkm_vm_unmap_at().
>
> Then we only need priv. You are keeping the nvkm_mm_node of the IOMMU
> space into it, and you need it to free the IOMMU VA space. If only we
> could find another way to store it, we could get rid of the whole
> structure and associated list_head in nvkm_vma...
>
> I need to give it some more thoughts, and we will probably need to
> change a few things in base.c to make the hooks more flexible, so
> please give me some more time to think about it. :) I just wanted to
> share my thoughts so far in case this puts you on track.
The way you described it would make GPU MMU and IOMMU mappings 1:1: whenever 
we map a buffer to the GPU MMU, we would also map the buffer page by page to 
the IOMMU. There are disadvantages here.

IOMMU addresses are global and are used in the GPU caches. When a buffer is 
mapped multiple times into different graphics contexts, we want to avoid 
cache aliasing by mapping the buffer only once to the IOMMU. We also want to 
unmap the buffer from the IOMMU only once, after all the instances of the 
buffer have been unmapped, or only when the buffer is actually freed, in 
order to cache IOMMU mappings.

Doing the IOMMU mapping for the whole buffer with dma_map_sg is also faster 
than mapping page by page, because you can do a single TLB invalidate at the 
end of the loop instead of after every page as you would with dma_map_single.
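
For illustration, a sketch assuming the backing pages are already described
by a struct sg_table:

    #include <linux/dma-mapping.h>

    /* one call for the whole buffer, one TLB maintenance at the end */
    mapped = dma_map_sg(dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
    if (mapped == 0)
            return -ENOMEM;

versus calling dma_map_single() (or iommu_map()) once per page, where each
call may invalidate the IOMMU TLB individually.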

All of this argues for keeping the IOMMU and GMMU mapping loops separate. 
This patch set does not yet implement both of the advantages above, but your 
suggestion would take us further away from them than Vince's version.
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found]             ` <55312257.8000606-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-20  7:37               ` Alexandre Courbot
       [not found]                 ` <CAAVeFuKSY9AvtmjkVY7CApMex3Gq5L5cQ=suaERY5-gtAYai6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-20  7:37 UTC (permalink / raw)
  To: Terje Bergstrom; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Sat, Apr 18, 2015 at 12:10 AM, Terje Bergstrom <tbergstrom@nvidia.com> wrote:
>
> On 04/16/2015 11:26 PM, Alexandre Courbot wrote:
>>
>> Instead of this function, you should probably use the data field of
>> struct of_device_id. Define a local struct called, say,
>> nouveau_platform_params containing (for now) a single iommu_addr_bit
>> field and instanciate one for each entry of nouveau_platform_match.
>> Then you can cast match->data and retrieve the field directly instead
>> of using strcmp.
>>
>> I'd say this is then simple enough to do directly in
>> nouveau_platform_probe_iommu() instead of having a function dedicated
>> to it. It is also safer because when we add a new chip, we have to
>> update nouveau_platform_match but might very well forget about your
>> function, and will end up with bit 0 being set on all our GPU
>> addresses and endless hours of fun figuring out how this happened. :)
>>
>> While I am at it: how about defining this as a u64 mask to set/unset
>> on GPU addresses instead of just a bit? This is more flexible and
>> again safer in case someone "omits" to specify the correct bit for a
>> chip.
>
> Wouldn't u64 be just as dangerous? We'd just be messing with the address
> in another way.
>
> Another way of expressing this information would be to define the number
> of physical(*) address bits. The IOMMU bit is then the bit right after
> the physical address bits.

I don't have a strong opinion on this - the advantage of a bit mask
over a bit index is that it at least gives us the option of not having
an IOMMU translation bit at all (mask == 0), whereas a bit index of 0
will always set/unset the first bit.

Not that we need this for now, but keeping that possibility open
doesn't sound bad for the future (nothing guarantees the GPU will
always be behind the IOMMU).
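
For reference, a sketch of the of_device_id data approach mentioned above
(struct and field names here are illustrative, not necessarily what the
series or the driver will end up using), with the mask carried per
compatible entry:

#include <linux/bitops.h>
#include <linux/of_device.h>

struct nouveau_platform_params {
	u64 iommu_addr_mask;	/* 0 == no IOMMU translation bit at all */
};

static const struct nouveau_platform_params gk20a_platform_params = {
	.iommu_addr_mask = BIT_ULL(34),
};

static const struct of_device_id nouveau_platform_match[] = {
	{ .compatible = "nvidia,gk20a", .data = &gk20a_platform_params },
	{ }
};

/* In the probe path, no strcmp() needed: */
static u64 nouveau_platform_iommu_mask(struct device *dev)
{
	const struct of_device_id *match =
		of_match_device(nouveau_platform_match, dev);
	const struct nouveau_platform_params *params;

	if (!match || !match->data)
		return 0;
	params = match->data;
	return params->iommu_addr_mask;
}

A new chip whose entry forgets to provide params then simply gets no
IOMMU bit applied, instead of silently inheriting bit 0.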
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/6] instmem/gk20a: refer to IOMMU physical translation bit
       [not found]             ` <5531234C.8020806-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-20  7:38               ` Alexandre Courbot
       [not found]                 ` <CAAVeFu+p=FFqMNTYNVo1DKnFWZ88_5osiTbiHQR=wQ6tJhHzBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-20  7:38 UTC (permalink / raw)
  To: Terje Bergstrom; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Sat, Apr 18, 2015 at 12:14 AM, Terje Bergstrom <tbergstrom@nvidia.com> wrote:
>
> On 04/16/2015 11:26 PM, Alexandre Courbot wrote:
>>
>> Looks good, but I think I would definitely prefer this to be a mask
>> instead of a bit index, i.e:
>>
>>      r->offset &= ~(priv->iommu_addr_mask >> priv->iommu_pgshift);
>>
>> and
>>
>>      r->offset |= (priv->iommu_addr_mask >> priv->iommu_pgshift);
>
> Wouldn't that be just a more complicated way of expressing the same thing?

Right now we have r->offset |= BIT(priv->iommu_phys_addr_bit -
priv->iommu_pgshift), which doesn't look much simpler. :)
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
       [not found]             ` <553128CA.40304-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-20  7:49               ` Alexandre Courbot
       [not found]                 ` <CAAVeFuKc1szojynHrGM9TJrJsTK0RjRiO7i7uwxoVPAW=tZJrA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Alexandre Courbot @ 2015-04-20  7:49 UTC (permalink / raw)
  To: Terje Bergstrom; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Sat, Apr 18, 2015 at 12:37 AM, Terje Bergstrom <tbergstrom@nvidia.com> wrote:
>
> On 04/17/2015 02:11 AM, Alexandre Courbot wrote:
>>
>> Tracking the PDE and PTE of each memory chunk can probably be avoided
>> if you change your unmapping strategy. Currently you are going through
>> the list of nvkm_vm_bp_list, but you know your PDE and PTE are always
>> going to be adjacent, since a nvkm_vma represents a contiguous block
>> in the GPU VA. So when unmapping, you can simply check for each PTE
>> entry whether the IOMMU bit is set, and unmap from the IOMMU space
>> after unmapping from the GPU VA space, in a loop similar to that of
>> nvkm_vm_unmap_at().
>>
>> Then we only need priv. You are keeping the nvkm_mm_node of the IOMMU
>> space into it, and you need it to free the IOMMU VA space. If only we
>> could find another way to store it, we could get rid of the whole
>> structure and associated list_head in nvkm_vma...
>>
>> I need to give it some more thoughts, and we will probably need to
>> change a few things in base.c to make the hooks more flexible, so
>> please give me some more time to think about it. :) I just wanted to
>> share my thoughts so far in case this puts you on track.
>
> The way you described it would make GPU MMU and IOMMU mappings 1:1. So when
> we map a buffer to GPU MMU, we always map page by page the buffer also to
> IOMMU. There are disadvantages here.
>
> IOMMU addresses are global, and uses in the GPU caches. When a buffer is
> mapped multiple times to different graphics contexts, we want to avoid cache
> aliasing by mapping the buffer only once to IOMMU. We also want to unmap the
> buffer from IOMMU only once after all the instances of the buffer have been
> unmapped, or only when the buffer is actually freed to cache IOMMU mappings.
>
> Doing IOMMU mapping for the whole buffer with dma_map_sg is also faster than
> mapping page by page, because you can do only one TLB invalidate in the end
> of the loop instead of after every page if you use dma_map_single.
>
> All of these would talk for having IOMMU and GMMU mapping loops separate.
> This patch set does not implement both the advantages above, but your
> suggestion would take us further away from that than Vince's version.

Aha, looks like both Vince and I overlooked this point. So IIUC we
would need to make sure a GPU buffer is only ever mapped once by the
IOMMU. This means we either need to preemptively map it entirely at
some point and just keep a reference count, or keep track of which
128k ranges are already mapped (again, with a reference count to know
when to unmap them).

The first solution is tempting because it is simpler, but surely there
is something wrong with it?
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
       [not found]                 ` <CAAVeFuKc1szojynHrGM9TJrJsTK0RjRiO7i7uwxoVPAW=tZJrA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-20 11:51                   ` Vince Hsu
       [not found]                     ` <5534E83D.80304-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-04-20 16:47                   ` Terje Bergstrom
  1 sibling, 1 reply; 39+ messages in thread
From: Vince Hsu @ 2015-04-20 11:51 UTC (permalink / raw)
  To: Alexandre Courbot, Terje Bergstrom
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On 04/20/2015 03:49 PM, Alexandre Courbot wrote:
> On Sat, Apr 18, 2015 at 12:37 AM, Terje Bergstrom <tbergstrom@nvidia.com> wrote:
>> On 04/17/2015 02:11 AM, Alexandre Courbot wrote:
>>> Tracking the PDE and PTE of each memory chunk can probably be avoided
>>> if you change your unmapping strategy. Currently you are going through
>>> the list of nvkm_vm_bp_list, but you know your PDE and PTE are always
>>> going to be adjacent, since a nvkm_vma represents a contiguous block
>>> in the GPU VA. So when unmapping, you can simply check for each PTE
>>> entry whether the IOMMU bit is set, and unmap from the IOMMU space
>>> after unmapping from the GPU VA space, in a loop similar to that of
>>> nvkm_vm_unmap_at().
>>>
>>> Then we only need priv. You are keeping the nvkm_mm_node of the IOMMU
>>> space into it, and you need it to free the IOMMU VA space. If only we
>>> could find another way to store it, we could get rid of the whole
>>> structure and associated list_head in nvkm_vma...
>>>
>>> I need to give it some more thoughts, and we will probably need to
>>> change a few things in base.c to make the hooks more flexible, so
>>> please give me some more time to think about it. :) I just wanted to
>>> share my thoughts so far in case this puts you on track.
>> The way you described it would make GPU MMU and IOMMU mappings 1:1. So when
>> we map a buffer to GPU MMU, we always map page by page the buffer also to
>> IOMMU. There are disadvantages here.
>>
>> IOMMU addresses are global, and uses in the GPU caches. When a buffer is
>> mapped multiple times to different graphics contexts, we want to avoid cache
>> aliasing by mapping the buffer only once to IOMMU. We also want to unmap the
>> buffer from IOMMU only once after all the instances of the buffer have been
>> unmapped, or only when the buffer is actually freed to cache IOMMU mappings.
>>
>> Doing IOMMU mapping for the whole buffer with dma_map_sg is also faster than
>> mapping page by page, because you can do only one TLB invalidate in the end
>> of the loop instead of after every page if you use dma_map_single.
>>
>> All of these would talk for having IOMMU and GMMU mapping loops separate.
>> This patch set does not implement both the advantages above, but your
>> suggestion would take us further away from that than Vince's version.
> Aha, looks like both Vince and I overlooked this point. So IIUC we
> would need to make sure a GPU buffer is only ever mapped once by the
> IOMMU. This means we either need to preemptively entilery map it at
> some point and just keep a reference count, or keep track of which
> 128k ranges are already mapped (again, with a reference count to know
> when to unmap them).
>
> First solution is tempting because it is simpler, but surely there is
> something wrong with it?
We can maintain a list in mmu/gk20a.c and use the first small page's 
address as a search index for the corresponding 128K page.

e.g.

--- a/drm/nouveau/nvkm/subdev/mmu/gk20a.c
+++ b/drm/nouveau/nvkm/subdev/mmu/gk20a.c
@@ -38,6 +38,8 @@ struct gk20a_mmu_priv {
  struct gk20a_mmu_iommu_mapping {
         struct nvkm_mm_node *node;
         u64 iova;
+       u64 index;
+       struct list_head head;
  };

  extern const u8 gf100_pte_storage_type_map[256];
@@ -79,6 +81,14 @@ gk20a_vm_map_iommu(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,

         plat = nv_device_to_platform(nv_device(&mmu->base));

+       list_for_each_entry(mapping, mapping_list, head) {
+               if (mapping->index == *list) {
+                       mapping->refcnt++;
+                       *priv = mapping;
+                       return;
+               }
+       }
+
         *priv = kzalloc(sizeof(struct gk20a_mmu_iommu_mapping), GFP_KERNEL);
         if (!*priv)
                 return;
@@ -95,6 +105,8 @@ gk20a_vm_map_iommu(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
         if (ret)
                 return;

+       priv->index = *list;
+
         for (i = 0; i < npages; i++, list++) {
                 ret = iommu_map(plat->gpu->iommu.domain,
                                 (node->offset + i) << PAGE_SHIFT,
@@ -117,6 +129,7 @@ gk20a_vm_map_iommu(struct nvkm_vma *vma, struct nvkm_gpuobj *pgt,
         p = *priv;
         p->node = node;
         p->iova = node->offset << PAGE_SHIFT;
+       list_add_tail(&priv->head, mapping_list);
  }


Thanks,
Vince



_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
       [not found]                     ` <5534E83D.80304-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-04-20 11:55                       ` Vince Hsu
  0 siblings, 0 replies; 39+ messages in thread
From: Vince Hsu @ 2015-04-20 11:55 UTC (permalink / raw)
  To: Alexandre Courbot, Terje Bergstrom
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs



On 04/20/2015 07:51 PM, Vince Hsu wrote:
> On 04/20/2015 03:49 PM, Alexandre Courbot wrote:
>> On Sat, Apr 18, 2015 at 12:37 AM, Terje Bergstrom 
>> <tbergstrom@nvidia.com> wrote:
>>> On 04/17/2015 02:11 AM, Alexandre Courbot wrote:
>>>> Tracking the PDE and PTE of each memory chunk can probably be avoided
>>>> if you change your unmapping strategy. Currently you are going through
>>>> the list of nvkm_vm_bp_list, but you know your PDE and PTE are always
>>>> going to be adjacent, since a nvkm_vma represents a contiguous block
>>>> in the GPU VA. So when unmapping, you can simply check for each PTE
>>>> entry whether the IOMMU bit is set, and unmap from the IOMMU space
>>>> after unmapping from the GPU VA space, in a loop similar to that of
>>>> nvkm_vm_unmap_at().
>>>>
>>>> Then we only need priv. You are keeping the nvkm_mm_node of the IOMMU
>>>> space into it, and you need it to free the IOMMU VA space. If only we
>>>> could find another way to store it, we could get rid of the whole
>>>> structure and associated list_head in nvkm_vma...
>>>>
>>>> I need to give it some more thoughts, and we will probably need to
>>>> change a few things in base.c to make the hooks more flexible, so
>>>> please give me some more time to think about it. :) I just wanted to
>>>> share my thoughts so far in case this puts you on track.
>>> The way you described it would make GPU MMU and IOMMU mappings 1:1. 
>>> So when
>>> we map a buffer to GPU MMU, we always map page by page the buffer 
>>> also to
>>> IOMMU. There are disadvantages here.
>>>
>>> IOMMU addresses are global, and uses in the GPU caches. When a 
>>> buffer is
>>> mapped multiple times to different graphics contexts, we want to 
>>> avoid cache
>>> aliasing by mapping the buffer only once to IOMMU. We also want to 
>>> unmap the
>>> buffer from IOMMU only once after all the instances of the buffer 
>>> have been
>>> unmapped, or only when the buffer is actually freed to cache IOMMU 
>>> mappings.
>>>
>>> Doing IOMMU mapping for the whole buffer with dma_map_sg is also 
>>> faster than
>>> mapping page by page, because you can do only one TLB invalidate in 
>>> the end
>>> of the loop instead of after every page if you use dma_map_single.
>>>
>>> All of these would talk for having IOMMU and GMMU mapping loops 
>>> separate.
>>> This patch set does not implement both the advantages above, but your
>>> suggestion would take us further away from that than Vince's version.
>> Aha, looks like both Vince and I overlooked this point. So IIUC we
>> would need to make sure a GPU buffer is only ever mapped once by the
>> IOMMU. This means we either need to preemptively entilery map it at
>> some point and just keep a reference count, or keep track of which
>> 128k ranges are already mapped (again, with a reference count to know
>> when to unmap them).
>>
>> First solution is tempting because it is simpler, but surely there is
>> something wrong with it?
> We can maintain a list in mmu/gk20a.c and use the first small page's 
> address as a search index for the corresponding 128K page.
>
> e.g.
>
> --- a/drm/nouveau/nvkm/subdev/mmu/gk20a.c
> +++ b/drm/nouveau/nvkm/subdev/mmu/gk20a.c
> @@ -38,6 +38,8 @@ struct gk20a_mmu_priv {
>  struct gk20a_mmu_iommu_mapping {
>         struct nvkm_mm_node *node;
>         u64 iova;
> +       u64 index;
> +       struct list_head head;
>  };
>
>  extern const u8 gf100_pte_storage_type_map[256];
> @@ -79,6 +81,14 @@ gk20a_vm_map_iommu(struct nvkm_vma *vma, struct 
> nvkm_gpuobj *pgt,
>
>         plat = nv_device_to_platform(nv_device(&mmu->base));
>
> +       list_for_each_entry(mapping, mapping_list, head) {
> +               if (mapping->index == *list) {
> +                       mapping->refcnt++;
> +                       *priv = mapping;
Bad example. We have to program the PTE for the new GVA before returning here.

> + return;
> +               }
> +       }
> +
>         *priv = kzalloc(sizeof(struct gk20a_mmu_iommu_mapping), 
> GFP_KERNEL);
>         if (!*priv)
>                 return;
> @@ -95,6 +105,8 @@ gk20a_vm_map_iommu(struct nvkm_vma *vma, struct 
> nvkm_gpuobj *pgt,
>         if (ret)
>                 return;
>
> +       priv->index = *list;
> +
>         for (i = 0; i < npages; i++, list++) {
>                 ret = iommu_map(plat->gpu->iommu.domain,
>                                 (node->offset + i) << PAGE_SHIFT,
> @@ -117,6 +129,7 @@ gk20a_vm_map_iommu(struct nvkm_vma *vma, struct 
> nvkm_gpuobj *pgt,
>         p = *priv;
>         p->node = node;
>         p->iova = node->offset << PAGE_SHIFT;
> +       list_add_tail(&priv->head, mapping_list);
>  }
>
>
> Thanks,
> Vince
>
>
>

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/6] platform: specify the IOMMU physical translation bit
       [not found]                 ` <CAAVeFuKSY9AvtmjkVY7CApMex3Gq5L5cQ=suaERY5-gtAYai6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-20 16:35                   ` Terje Bergstrom
  0 siblings, 0 replies; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-20 16:35 UTC (permalink / raw)
  To: Alexandre Courbot; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs


On 04/20/2015 12:37 AM, Alexandre Courbot wrote:
> I don't have a strong opinion on this - the advantage of a bit mask
> over a bit index is that it at least gives us the option of not having
> a MMU translation bit at all (mask == 0) whereas a bit index of 0 will
> set/unset the first bit.
>
> Not that we need this for now, but keeping that possibility open
> doesn't sound bad for the future (nothing guarantees the GPU will
> always be behind the IOMMU).

We should not rely only on globally enabling/disabling the IOMMU. The
mask you describe is global. We'll want to bypass the IOMMU, but we'll
want to do that per buffer, in the mapping code path. VPR is an example:
VPR buffers would sometimes be touched by the GPU, and for those the GPU
needs to bypass the IOMMU. But interleaved with that, the GPU is also
using normal buffers with IOMMU translation.

Of course, sometimes we want to bypass the IOMMU globally, so it is good
to also have a mechanism to disable the IOMMU for the whole GPU.
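
Purely as a sketch of where such a per-buffer decision could live (the
field names below are invented for illustration and are not part of this
series):

/* Hypothetical: decide per mapping whether the GPU address should be
 * routed through the IOMMU, on top of an optional global disable. */
static u64 gk20a_gpu_addr(struct gk20a_mmu_priv *priv,
			  struct nvkm_vma *vma, u64 phys)
{
	if (priv->iommu_disabled)		/* global bypass */
		return phys;
	if (vma->bypass_iommu)			/* per-buffer bypass, e.g. VPR */
		return phys;
	return phys | priv->iommu_addr_mask;	/* translate through IOMMU */
}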

Terje
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/6] instmem/gk20a: refer to IOMMU physical translation bit
       [not found]                 ` <CAAVeFu+p=FFqMNTYNVo1DKnFWZ88_5osiTbiHQR=wQ6tJhHzBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-20 16:37                   ` Terje Bergstrom
  0 siblings, 0 replies; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-20 16:37 UTC (permalink / raw)
  To: Alexandre Courbot; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs


On 04/20/2015 12:38 AM, Alexandre Courbot wrote:
> On Sat, Apr 18, 2015 at 12:14 AM, Terje Bergstrom <tbergstrom@nvidia.com> wrote:
>> Wouldn't that be just a more complicated way of expressing the same 
>> thing? 
> Right now we have r->offset |= BIT(priv->iommu_phys_addr_bit -
> priv->iommu_pgshift), which doesn't look much simpler. :)

One reason is that you're doing two things in the same bit operation.
The original address has already been shifted, so you're compensating
for that shift and, at the same time, setting bit 34.
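
To spell out the two forms being compared (both expressions are taken
from earlier in this thread; neither is new code):

	/* bit-index form: one expression compensates for the page shift
	 * already applied to r->offset and sets bit 34 */
	r->offset |= BIT(priv->iommu_phys_addr_bit - priv->iommu_pgshift);

	/* mask form: the mask is expressed in physical-address terms
	 * (bit 34), and the shift compensation is explicit */
	r->offset |= priv->iommu_addr_mask >> priv->iommu_pgshift;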

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible
       [not found]                 ` <CAAVeFuKc1szojynHrGM9TJrJsTK0RjRiO7i7uwxoVPAW=tZJrA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-04-20 11:51                   ` Vince Hsu
@ 2015-04-20 16:47                   ` Terje Bergstrom
  1 sibling, 0 replies; 39+ messages in thread
From: Terje Bergstrom @ 2015-04-20 16:47 UTC (permalink / raw)
  To: Alexandre Courbot; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs


On 04/20/2015 12:49 AM, Alexandre Courbot wrote:
> Aha, looks like both Vince and I overlooked this point. So IIUC we
> would need to make sure a GPU buffer is only ever mapped once by the
> IOMMU. This means we either need to preemptively entilery map it at
> some point and just keep a reference count, or keep track of which
> 128k ranges are already mapped (again, with a reference count to know
> when to unmap them).
>
> First solution is tempting because it is simpler, but surely there is
> something wrong with it?

If you map the whole buffer to the IOMMU at once, you can use map_sg
style calls, which makes the IOMMU mapping faster. It also matches how
dma_buf_map_attachment() works. So one way would be to first check
whether the buffer has already been mapped into the GPU's IOMMU space
and, if so, increment the refcount and reuse the address. If not, map
the buffer, for example via dma_buf_map_attachment(), and stash the
IOMMU address.

For me the trickiest part to get right was freeing the buffer. When
user space is done with the buffer, can we ensure all mappings are torn
down first? Does that still hold if user space just crashes?
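
A rough sketch of that map-once flow (the cache structure and lookup
below are hypothetical; dma_buf_attach() and dma_buf_map_attachment()
are the existing dma-buf APIs, and error handling is omitted for
brevity):

#include <linux/dma-buf.h>
#include <linux/kref.h>
#include <linux/list.h>
#include <linux/slab.h>

/* Hypothetical cached mapping, keyed by the dma_buf it belongs to. */
struct gk20a_dmabuf_mapping {
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	dma_addr_t iova;
	struct kref ref;
	struct list_head head;
};

static dma_addr_t gk20a_map_dmabuf_once(struct device *dev,
					struct dma_buf *buf,
					struct list_head *cache)
{
	struct gk20a_dmabuf_mapping *m;

	/* Reuse an existing IOMMU mapping of this buffer if there is one. */
	list_for_each_entry(m, cache, head) {
		if (m->attach->dmabuf == buf) {
			kref_get(&m->ref);
			return m->iova;
		}
	}

	m = kzalloc(sizeof(*m), GFP_KERNEL);
	m->attach = dma_buf_attach(buf, dev);
	m->sgt = dma_buf_map_attachment(m->attach, DMA_BIDIRECTIONAL);
	m->iova = sg_dma_address(m->sgt->sgl);	/* assumes a contiguous IOVA */
	kref_init(&m->ref);
	list_add_tail(&m->head, cache);
	return m->iova;
}

The harder part, as noted above, is the teardown: the release path
(including the case where user space crashes) has to drop the last
reference and only then unmap and detach.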
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2015-04-20 16:47 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-16 11:06 [PATCH 0/6] map big page by platform IOMMU Vince Hsu
     [not found] ` <1429182379-31964-1-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-16 11:06   ` [PATCH 1/6] platform: specify the IOMMU physical translation bit Vince Hsu
     [not found]     ` <1429182379-31964-2-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-16 20:41       ` Terje Bergstrom
     [not found]         ` <55301E65.5000705-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  2:22           ` Vince Hsu
2015-04-17  6:26       ` Alexandre Courbot
     [not found]         ` <CAAVeFuL10UidtE4_Y2mYEYpWjDqXn-FQF3zJQv2eQDqg_QTJ1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-17  7:07           ` Vince Hsu
2015-04-17 15:10           ` Terje Bergstrom
     [not found]             ` <55312257.8000606-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-20  7:37               ` Alexandre Courbot
     [not found]                 ` <CAAVeFuKSY9AvtmjkVY7CApMex3Gq5L5cQ=suaERY5-gtAYai6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-20 16:35                   ` Terje Bergstrom
2015-04-16 11:06   ` [PATCH 2/6] instmem/gk20a: refer to " Vince Hsu
     [not found]     ` <1429182379-31964-3-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  6:26       ` Alexandre Courbot
     [not found]         ` <CAAVeFuJqWuQeH1tbAdkVpocBAjq52GPv_O_yxd0X+7gSHjCoJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-17  7:19           ` Vince Hsu
     [not found]             ` <5530B3F7.3000000-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  7:40               ` Alexandre Courbot
2015-04-17 15:14           ` Terje Bergstrom
     [not found]             ` <5531234C.8020806-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-20  7:38               ` Alexandre Courbot
     [not found]                 ` <CAAVeFu+p=FFqMNTYNVo1DKnFWZ88_5osiTbiHQR=wQ6tJhHzBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-20 16:37                   ` Terje Bergstrom
2015-04-16 11:06   ` [PATCH 3/6] mmu: map small pages into big pages(s) by IOMMU if possible Vince Hsu
     [not found]     ` <1429182379-31964-4-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  9:11       ` Alexandre Courbot
     [not found]         ` <CAAVeFuLSYGV8ghOQ26ubA9kvWHCPD2y+y_bJMLHv1V1Ab8NOYg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-17 15:37           ` Terje Bergstrom
     [not found]             ` <553128CA.40304-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-20  7:49               ` Alexandre Courbot
     [not found]                 ` <CAAVeFuKc1szojynHrGM9TJrJsTK0RjRiO7i7uwxoVPAW=tZJrA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-20 11:51                   ` Vince Hsu
     [not found]                     ` <5534E83D.80304-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-20 11:55                       ` Vince Hsu
2015-04-20 16:47                   ` Terje Bergstrom
2015-04-16 11:06   ` [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available Vince Hsu
     [not found]     ` <1429182379-31964-5-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  6:26       ` Alexandre Courbot
     [not found]         ` <CAAVeFu+_zJ4DUwebhzpfA9uJroFEVrOBRqiXUuS9Gk04hVA88A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-17  7:28           ` Vince Hsu
     [not found]             ` <5530B603.9010703-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  7:33               ` Alexandre Courbot
2015-04-17 15:24               ` Terje Bergstrom
2015-04-16 11:06   ` [PATCH 5/6] mmu: gf100: share most of functions with GK20A Vince Hsu
     [not found]     ` <1429182379-31964-6-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  6:27       ` Alexandre Courbot
     [not found]         ` <CAAVeFuJay4TNdOwEorver+-WiYCLJ21BXcRgCBv1aush_i1wbg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-17  7:30           ` Vince Hsu
2015-04-16 11:06   ` [PATCH 6/6] mmu: gk20a: implement IOMMU mapping for big pages Vince Hsu
     [not found]     ` <1429182379-31964-7-git-send-email-vinceh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-16 19:31       ` Ilia Mirkin
     [not found]         ` <CAKb7UvhEuNSYshiP+kt66zHfsmB=Edy+t7HLs2j_+-hvzzkdMQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-16 19:55           ` Terje Bergstrom
     [not found]             ` <553013A4.1020700-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-16 20:01               ` Ilia Mirkin
2015-04-17  6:25   ` [PATCH 0/6] map big page by platform IOMMU Alexandre Courbot
     [not found]     ` <CAAVeFuJLRo1XYum3FZBhO5=5u63KLUY_tr0wk=TPxWutvCdL8A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-17  7:35       ` Vince Hsu
     [not found]         ` <5530B7D3.60700-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2015-04-17  7:37           ` Alexandre Courbot
     [not found]             ` <CAAVeFuLdr7h29G6_1XGV8kTag=m4Spe_R9Aog38eRs9gAbgdVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-17  7:42               ` Vince Hsu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.