* [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access
@ 2021-09-27 11:41 ` Matthew Auld
  0 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström, Christian König

In commit:

commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
Author: Felix Kuehling <Felix.Kuehling@amd.com>
Date:   Thu Jul 13 17:01:16 2017 -0400

    drm/ttm: Implement vm_operations_struct.access v2

we added the vm_access hook, where we also directly call tt_swapin for
some reason. If something is swapped out then the ttm_tt must also be
unpopulated, and since access_kmap should also call tt_populate, if
needed, then swapping-in will already be handled there.

If anything, calling tt_swapin directly here would likely always fail,
since the tt->pages won't yet be populated; or worse, since the tt->pages
array is never actually cleared in unpopulate, this might lead to a nasty
use-after-free.
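
For illustration, a heavily abridged sketch of the system-memory access
path after this change (not part of the patch; based on the call chain
described above):

    case TTM_PL_SYSTEM:
        fallthrough;
    case TTM_PL_TT:
        /*
         * ttm_bo_vm_access_kmap() populates the ttm_tt if needed, and
         * ttm_tt_populate() already performs the swap-in on its own:
         *
         *    if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED))
         *        ret = ttm_tt_swapin(ttm);
         *
         * so an extra, earlier swap-in attempt buys nothing.
         */
        ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);
        break;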

Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..5b9b7fd01a69 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
 
 	switch (bo->resource->mem_type) {
 	case TTM_PL_SYSTEM:
-		if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
-			ret = ttm_tt_swapin(bo->ttm);
-			if (unlikely(ret != 0))
-				return ret;
-		}
 		fallthrough;
 	case TTM_PL_TT:
 		ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);
-- 
2.26.3


* [PATCH v5 02/13] drm/ttm: stop setting page->index for the ttm_tt
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström, Christian König

In commit:

commit 58aa6622d32af7d2c08d45085f44c54554a16ed7
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date:   Fri Jan 3 11:47:23 2014 +0100

    drm/ttm: Correctly set page mapping and -index members

we started setting the page->mapping and page->index to point to the
virtual address space, if the pages were faulted with TTM. Apparently
this was needed for core-mm to be able to reverse-lookup the virtual
address given the struct page, and potentially unmap it from the page
tables. However, as pointed out by Thomas, since we are now using
PFN_MAP, instead of say PFN_MIXED, this should no longer be the case.
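
As a rough illustration of what that reverse lookup involves (a sketch of
the generic rmap arithmetic, not code from this series): with
page->mapping and page->index filled in, core-mm can recompute the user
address of a page within a given VMA, roughly as:

    static unsigned long sketch_vma_address(struct page *page,
                                            struct vm_area_struct *vma)
    {
        /* page->index is what the old TTM fault code filled in */
        return vma->vm_start +
               ((page->index - vma->vm_pgoff) << PAGE_SHIFT);
    }

With PFN_MAP mappings, core-mm does not rely on this struct-page-based
reverse lookup for these PTEs, which (per the above) is why setting
page->index should no longer be needed.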

There was also apparently some usecase in vmwgfx which needed this for
dirty tracking, but that also doesn't appear to be the case anymore, as
pointed out by Thomas.

We still need to keep the page->mapping for now, since that is still
needed for different reasons, but we try to address that in the next patch.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 --
 drivers/gpu/drm/ttm/ttm_tt.c    | 4 +---
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 5b9b7fd01a69..9a2119fe4bdd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -346,8 +346,6 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 			} else if (unlikely(!page)) {
 				break;
 			}
-			page->index = drm_vma_node_start(&bo->base.vma_node) +
-				page_offset;
 			pfn = page_to_pfn(page);
 		}
 
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index dae52433beeb..1cc04c224988 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -367,10 +367,8 @@ static void ttm_tt_clear_mapping(struct ttm_tt *ttm)
 	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
 		return;
 
-	for (i = 0; i < ttm->num_pages; ++i) {
+	for (i = 0; i < ttm->num_pages; ++i)
 		(*page)->mapping = NULL;
-		(*page++)->index = 0;
-	}
 }
 
 void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)
-- 
2.26.3


* [PATCH v5 03/13] drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström, Christian König

Now that setting page->index shouldn't be needed anymore, we are just
left with setting page->mapping, and here it looks like amdgpu is the
only user, where pointing the page->mapping at the dev_mapping is used
to verify that the pages do indeed belong to the device, if userspace
later tries to touch them.

v2(Christian):
  - Drop the functions altogether and just inline modifying
    the page->mapping
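
As a rough sketch of the kind of check this enables (hypothetical helper
for illustration only; the actual amdgpu call sites are not reproduced
here):

    /* Does this page really belong to our device? */
    static bool sketch_page_belongs_to_device(struct ttm_device *bdev,
                                              struct page *page)
    {
        return page->mapping == bdev->dev_mapping;
    }

If userspace later hands the driver a page whose mapping doesn't match
the device's dev_mapping, a check like this lets amdgpu reject it.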

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 15 ++++++++++++++-
 drivers/gpu/drm/ttm/ttm_tt.c            | 25 -------------------------
 2 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 820fcb24231f..438377a89aa3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1119,6 +1119,8 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
+	pgoff_t i;
+	int ret;
 
 	/* user pages are bound by amdgpu_ttm_tt_pin_userptr() */
 	if (gtt->userptr) {
@@ -1131,7 +1133,14 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
 	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
 		return 0;
 
-	return ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx);
+	ret = ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < ttm->num_pages; ++i)
+		ttm->pages[i]->mapping = bdev->dev_mapping;
+
+	return 0;
 }
 
 /*
@@ -1145,6 +1154,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev,
 {
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
 	struct amdgpu_device *adev;
+	pgoff_t i;
 
 	amdgpu_ttm_backend_unbind(bdev, ttm);
 
@@ -1158,6 +1168,9 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev,
 	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
 		return;
 
+	for (i = 0; i < ttm->num_pages; ++i)
+		ttm->pages[i]->mapping = NULL;
+
 	adev = amdgpu_ttm_adev(bdev);
 	return ttm_pool_free(&adev->mman.bdev.pool, ttm);
 }
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 1cc04c224988..980ecb079b2c 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -289,17 +289,6 @@ int ttm_tt_swapout(struct ttm_device *bdev, struct ttm_tt *ttm,
 	return ret;
 }
 
-static void ttm_tt_add_mapping(struct ttm_device *bdev, struct ttm_tt *ttm)
-{
-	pgoff_t i;
-
-	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
-		return;
-
-	for (i = 0; i < ttm->num_pages; ++i)
-		ttm->pages[i]->mapping = bdev->dev_mapping;
-}
-
 int ttm_tt_populate(struct ttm_device *bdev,
 		    struct ttm_tt *ttm, struct ttm_operation_ctx *ctx)
 {
@@ -336,7 +325,6 @@ int ttm_tt_populate(struct ttm_device *bdev,
 	if (ret)
 		goto error;
 
-	ttm_tt_add_mapping(bdev, ttm);
 	ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED;
 	if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
 		ret = ttm_tt_swapin(ttm);
@@ -359,24 +347,11 @@ int ttm_tt_populate(struct ttm_device *bdev,
 }
 EXPORT_SYMBOL(ttm_tt_populate);
 
-static void ttm_tt_clear_mapping(struct ttm_tt *ttm)
-{
-	pgoff_t i;
-	struct page **page = ttm->pages;
-
-	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
-		return;
-
-	for (i = 0; i < ttm->num_pages; ++i)
-		(*page)->mapping = NULL;
-}
-
 void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)
 {
 	if (!ttm_tt_is_populated(ttm))
 		return;
 
-	ttm_tt_clear_mapping(ttm);
 	if (bdev->funcs->ttm_tt_unpopulate)
 		bdev->funcs->ttm_tt_unpopulate(bdev, ttm);
 	else
-- 
2.26.3


* [PATCH v5 04/13] drm/ttm: remove TTM_PAGE_FLAG_NO_RETRY
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström, Christian König

No longer used it seems.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 include/drm/ttm/ttm_tt.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 89b15d673b22..842ce756213c 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -41,7 +41,6 @@ struct ttm_operation_ctx;
 #define TTM_PAGE_FLAG_SWAPPED         (1 << 4)
 #define TTM_PAGE_FLAG_ZERO_ALLOC      (1 << 6)
 #define TTM_PAGE_FLAG_SG              (1 << 8)
-#define TTM_PAGE_FLAG_NO_RETRY	      (1 << 9)
 
 #define TTM_PAGE_FLAG_PRIV_POPULATED  (1 << 31)
 
-- 
2.26.3


* [PATCH v5 05/13] drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Christian König, Thomas Hellström

It covers more than just ttm_bo_type_sg usage (like with, say, dma-buf),
since one other user is userptr in amdgpu, and in the future we might
have some more. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
  - Rename these to TTM_TT_FLAGS_*
  - Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +++++-----
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c |  6 +++---
 drivers/gpu/drm/nouveau/nouveau_bo.c    |  4 ++--
 drivers/gpu/drm/radeon/radeon_ttm.c     |  8 ++++----
 drivers/gpu/drm/ttm/ttm_bo.c            |  4 ++--
 drivers/gpu/drm/ttm/ttm_bo_util.c       |  4 ++--
 drivers/gpu/drm/ttm/ttm_bo_vm.c         |  2 +-
 drivers/gpu/drm/ttm/ttm_pool.c          |  2 +-
 drivers/gpu/drm/ttm/ttm_tt.c            | 24 ++++++++++++------------
 include/drm/ttm/ttm_device.h            |  2 +-
 include/drm/ttm/ttm_tt.h                | 18 +++++++++---------
 11 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 438377a89aa3..0cf94421665f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -894,7 +894,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_device *bdev,
 			DRM_ERROR("failed to pin userptr\n");
 			return r;
 		}
-	} else if (ttm->page_flags & TTM_PAGE_FLAG_SG) {
+	} else if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL) {
 		if (!ttm->sg) {
 			struct dma_buf_attachment *attach;
 			struct sg_table *sgt;
@@ -1130,7 +1130,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
 		return 0;
 	}
 
-	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
+	if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
 		return 0;
 
 	ret = ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx);
@@ -1165,7 +1165,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev,
 		return;
 	}
 
-	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
+	if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
 		return;
 
 	for (i = 0; i < ttm->num_pages; ++i)
@@ -1198,8 +1198,8 @@ int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo,
 			return -ENOMEM;
 	}
 
-	/* Set TTM_PAGE_FLAG_SG before populate but after create. */
-	bo->ttm->page_flags |= TTM_PAGE_FLAG_SG;
+	/* Set TTM_TT_FLAG_EXTERNAL before populate but after create. */
+	bo->ttm->page_flags |= TTM_TT_FLAG_EXTERNAL;
 
 	gtt = (void *)bo->ttm;
 	gtt->userptr = addr;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index b94497989995..a77e90f300fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -194,7 +194,7 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 
 	if (obj->flags & I915_BO_ALLOC_CPU_CLEAR &&
 	    man->use_tt)
-		page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC;
+		page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
 
 	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
 			  i915_ttm_select_tt_caching(obj));
@@ -562,7 +562,7 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 	}
 
 	/* Populate ttm with pages if needed. Typically system memory. */
-	if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_PAGE_FLAG_SWAPPED))) {
+	if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_TT_FLAG_SWAPPED))) {
 		ret = ttm_tt_populate(bo->bdev, ttm, ctx);
 		if (ret)
 			return ret;
@@ -573,7 +573,7 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 		return PTR_ERR(dst_st);
 
 	clear = !cpu_maps_iomem(bo->resource) && (!ttm || !ttm_tt_is_populated(ttm));
-	if (!(clear && ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)))
+	if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC)))
 		__i915_ttm_move(bo, clear, dst_mem, bo->ttm, dst_st, true);
 
 	ttm_bo_move_sync_cleanup(bo, dst_mem);
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index d3b21d318b42..12b107acb6ee 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1250,7 +1250,7 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev,
 	struct ttm_tt *ttm_dma = (void *)ttm;
 	struct nouveau_drm *drm;
 	struct device *dev;
-	bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
+	bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);
 
 	if (ttm_tt_is_populated(ttm))
 		return 0;
@@ -1273,7 +1273,7 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev,
 {
 	struct nouveau_drm *drm;
 	struct device *dev;
-	bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
+	bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);
 
 	if (slave)
 		return;
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 7793249bc549..11b21d605584 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -545,14 +545,14 @@ static int radeon_ttm_tt_populate(struct ttm_device *bdev,
 {
 	struct radeon_device *rdev = radeon_get_rdev(bdev);
 	struct radeon_ttm_tt *gtt = radeon_ttm_tt_to_gtt(rdev, ttm);
-	bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
+	bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);
 
 	if (gtt && gtt->userptr) {
 		ttm->sg = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
 		if (!ttm->sg)
 			return -ENOMEM;
 
-		ttm->page_flags |= TTM_PAGE_FLAG_SG;
+		ttm->page_flags |= TTM_TT_FLAG_EXTERNAL;
 		return 0;
 	}
 
@@ -569,13 +569,13 @@ static void radeon_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm
 {
 	struct radeon_device *rdev = radeon_get_rdev(bdev);
 	struct radeon_ttm_tt *gtt = radeon_ttm_tt_to_gtt(rdev, ttm);
-	bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
+	bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);
 
 	radeon_ttm_tt_unbind(bdev, ttm);
 
 	if (gtt && gtt->userptr) {
 		kfree(ttm->sg);
-		ttm->page_flags &= ~TTM_PAGE_FLAG_SG;
+		ttm->page_flags &= ~TTM_TT_FLAG_EXTERNAL;
 		return;
 	}
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 3b22c0013dbf..d62b2013c367 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1115,8 +1115,8 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 		return -EBUSY;
 
 	if (!bo->ttm || !ttm_tt_is_populated(bo->ttm) ||
-	    bo->ttm->page_flags & TTM_PAGE_FLAG_SG ||
-	    bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED ||
+	    bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL ||
+	    bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED ||
 	    !ttm_bo_get_unless_zero(bo)) {
 		if (locked)
 			dma_resv_unlock(bo->base.resv);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index c893c3db2623..a342d701c91c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -147,7 +147,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
 	bool clear;
 	int ret = 0;
 
-	if (ttm && ((ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) ||
+	if (ttm && ((ttm->page_flags & TTM_TT_FLAG_SWAPPED) ||
 		    dst_man->use_tt)) {
 		ret = ttm_tt_populate(bdev, ttm, ctx);
 		if (ret)
@@ -169,7 +169,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
 	}
 
 	clear = src_iter->ops->maps_tt && (!ttm || !ttm_tt_is_populated(ttm));
-	if (!(clear && ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)))
+	if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC)))
 		ttm_move_memcpy(clear, dst_mem->num_pages, dst_iter, src_iter);
 
 	if (!src_iter->ops->maps_tt)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 9a2119fe4bdd..950f4f132802 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -162,7 +162,7 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo,
 	 * Refuse to fault imported pages. This should be handled
 	 * (if at all) by redirecting mmap to the exporter.
 	 */
-	if (bo->ttm && (bo->ttm->page_flags & TTM_PAGE_FLAG_SG)) {
+	if (bo->ttm && (bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
 		dma_resv_unlock(bo->base.resv);
 		return VM_FAULT_SIGBUS;
 	}
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index c961a788b519..1bba0a0ed3f9 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -371,7 +371,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 	WARN_ON(!num_pages || ttm_tt_is_populated(tt));
 	WARN_ON(dma_addr && !pool->dev);
 
-	if (tt->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)
+	if (tt->page_flags & TTM_TT_FLAG_ZERO_ALLOC)
 		gfp_flags |= __GFP_ZERO;
 
 	if (ctx->gfp_retry_mayfail)
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 980ecb079b2c..86f31fde6e35 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -68,12 +68,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
 	switch (bo->type) {
 	case ttm_bo_type_device:
 		if (zero_alloc)
-			page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC;
+			page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
 		break;
 	case ttm_bo_type_kernel:
 		break;
 	case ttm_bo_type_sg:
-		page_flags |= TTM_PAGE_FLAG_SG;
+		page_flags |= TTM_TT_FLAG_EXTERNAL;
 		break;
 	default:
 		pr_err("Illegal buffer object type\n");
@@ -156,7 +156,7 @@ EXPORT_SYMBOL(ttm_tt_init);
 
 void ttm_tt_fini(struct ttm_tt *ttm)
 {
-	WARN_ON(ttm->page_flags & TTM_PAGE_FLAG_PRIV_POPULATED);
+	WARN_ON(ttm->page_flags & TTM_TT_FLAG_PRIV_POPULATED);
 
 	if (ttm->swap_storage)
 		fput(ttm->swap_storage);
@@ -178,7 +178,7 @@ int ttm_sg_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
 
 	ttm_tt_init_fields(ttm, bo, page_flags, caching);
 
-	if (page_flags & TTM_PAGE_FLAG_SG)
+	if (page_flags & TTM_TT_FLAG_EXTERNAL)
 		ret = ttm_sg_tt_alloc_page_directory(ttm);
 	else
 		ret = ttm_dma_tt_alloc_page_directory(ttm);
@@ -224,7 +224,7 @@ int ttm_tt_swapin(struct ttm_tt *ttm)
 
 	fput(swap_storage);
 	ttm->swap_storage = NULL;
-	ttm->page_flags &= ~TTM_PAGE_FLAG_SWAPPED;
+	ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
 
 	return 0;
 
@@ -279,7 +279,7 @@ int ttm_tt_swapout(struct ttm_device *bdev, struct ttm_tt *ttm,
 
 	ttm_tt_unpopulate(bdev, ttm);
 	ttm->swap_storage = swap_storage;
-	ttm->page_flags |= TTM_PAGE_FLAG_SWAPPED;
+	ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
 
 	return ttm->num_pages;
 
@@ -300,7 +300,7 @@ int ttm_tt_populate(struct ttm_device *bdev,
 	if (ttm_tt_is_populated(ttm))
 		return 0;
 
-	if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
+	if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
 		atomic_long_add(ttm->num_pages, &ttm_pages_allocated);
 		if (bdev->pool.use_dma32)
 			atomic_long_add(ttm->num_pages,
@@ -325,8 +325,8 @@ int ttm_tt_populate(struct ttm_device *bdev,
 	if (ret)
 		goto error;
 
-	ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED;
-	if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
+	ttm->page_flags |= TTM_TT_FLAG_PRIV_POPULATED;
+	if (unlikely(ttm->page_flags & TTM_TT_FLAG_SWAPPED)) {
 		ret = ttm_tt_swapin(ttm);
 		if (unlikely(ret != 0)) {
 			ttm_tt_unpopulate(bdev, ttm);
@@ -337,7 +337,7 @@ int ttm_tt_populate(struct ttm_device *bdev,
 	return 0;
 
 error:
-	if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
+	if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
 		atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
 		if (bdev->pool.use_dma32)
 			atomic_long_sub(ttm->num_pages,
@@ -357,14 +357,14 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)
 	else
 		ttm_pool_free(&bdev->pool, ttm);
 
-	if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
+	if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
 		atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
 		if (bdev->pool.use_dma32)
 			atomic_long_sub(ttm->num_pages,
 					&ttm_dma32_pages_allocated);
 	}
 
-	ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED;
+	ttm->page_flags &= ~TTM_TT_FLAG_PRIV_POPULATED;
 }
 
 #ifdef CONFIG_DEBUG_FS
diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index cbe03d45e883..0a4ddec78d8f 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -65,7 +65,7 @@ struct ttm_device_funcs {
 	 * ttm_tt_create
 	 *
 	 * @bo: The buffer object to create the ttm for.
-	 * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags.
+	 * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags.
 	 *
 	 * Create a struct ttm_tt to back data with system memory pages.
 	 * No pages are actually allocated.
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 842ce756213c..b023cd58ff38 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -38,17 +38,17 @@ struct ttm_resource;
 struct ttm_buffer_object;
 struct ttm_operation_ctx;
 
-#define TTM_PAGE_FLAG_SWAPPED         (1 << 4)
-#define TTM_PAGE_FLAG_ZERO_ALLOC      (1 << 6)
-#define TTM_PAGE_FLAG_SG              (1 << 8)
+#define TTM_TT_FLAG_SWAPPED	(1 << 0)
+#define TTM_TT_FLAG_ZERO_ALLOC	(1 << 1)
+#define TTM_TT_FLAG_EXTERNAL	(1 << 2)
 
-#define TTM_PAGE_FLAG_PRIV_POPULATED  (1 << 31)
+#define TTM_TT_FLAG_PRIV_POPULATED  (1 << 31)
 
 /**
  * struct ttm_tt
  *
  * @pages: Array of pages backing the data.
- * @page_flags: see TTM_PAGE_FLAG_*
+ * @page_flags: see TTM_TT_FLAG_*
  * @num_pages: Number of pages in the page array.
  * @sg: for SG objects via dma-buf
  * @dma_address: The DMA (bus) addresses of the pages
@@ -84,7 +84,7 @@ struct ttm_kmap_iter_tt {
 
 static inline bool ttm_tt_is_populated(struct ttm_tt *tt)
 {
-	return tt->page_flags & TTM_PAGE_FLAG_PRIV_POPULATED;
+	return tt->page_flags & TTM_TT_FLAG_PRIV_POPULATED;
 }
 
 /**
@@ -103,7 +103,7 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc);
  *
  * @ttm: The struct ttm_tt.
  * @bo: The buffer object we create the ttm for.
- * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags.
+ * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags.
  * @caching: the desired caching state of the pages
  *
  * Create a struct ttm_tt to back data with system memory pages.
@@ -178,7 +178,7 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm);
  */
 static inline void ttm_tt_mark_for_clear(struct ttm_tt *ttm)
 {
-	ttm->page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC;
+	ttm->page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
 }
 
 void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages);
@@ -194,7 +194,7 @@ struct ttm_kmap_iter *ttm_kmap_iter_tt_init(struct ttm_kmap_iter_tt *iter_tt,
  *
  * @bo: Buffer object we allocate the ttm for.
  * @bridge: The agp bridge this device is sitting on.
- * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags.
+ * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags.
  *
  *
  * Create a TTM backend that uses the indicated AGP bridge as an aperture
-- 
2.26.3


* [PATCH v5 06/13] drm/ttm: add some kernel-doc for TTM_TT_FLAG_*
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström, Christian König

Move it to inline kernel-doc, since otherwise it seems we can't add
empty lines. Also drop the kernel-doc for pages_list, which doesn't seem
to exist.

v2(Christian):
  - Add a note that FLAG_SWAPPED shouldn't need to be touched by drivers.
  - Mention what FLAG_POPULATED does.
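
For comparison, a minimal sketch of the two kernel-doc styles (generic
example, unrelated to the actual struct below):

    /*
     * Header-list style: each member gets a single "@member:" entry,
     * and (apparently) blank lines or extra paragraphs don't fit there.
     */
    /**
     * struct foo - some structure
     * @bar: all of @bar's documentation has to fit in this entry
     */

    /* Inline style, which this patch switches to: */
    struct foo {
        /**
         * @bar: Documentation can now span multiple paragraphs.
         *
         * Including blank lines like the one above.
         */
        int bar;
    };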

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 include/drm/ttm/ttm_tt.h | 60 +++++++++++++++++++++++++++-------------
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index b023cd58ff38..86d74069be3e 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -38,35 +38,57 @@ struct ttm_resource;
 struct ttm_buffer_object;
 struct ttm_operation_ctx;
 
-#define TTM_TT_FLAG_SWAPPED	(1 << 0)
-#define TTM_TT_FLAG_ZERO_ALLOC	(1 << 1)
-#define TTM_TT_FLAG_EXTERNAL	(1 << 2)
-
-#define TTM_TT_FLAG_PRIV_POPULATED  (1 << 31)
-
 /**
- * struct ttm_tt
- *
- * @pages: Array of pages backing the data.
- * @page_flags: see TTM_TT_FLAG_*
- * @num_pages: Number of pages in the page array.
- * @sg: for SG objects via dma-buf
- * @dma_address: The DMA (bus) addresses of the pages
- * @swap_storage: Pointer to shmem struct file for swap storage.
- * @pages_list: used by some page allocation backend
- * @caching: The current caching state of the pages, see enum ttm_caching.
- *
- * This is a structure holding the pages, caching- and aperture binding
- * status for a buffer object that isn't backed by fixed (VRAM / AGP)
+ * struct ttm_tt - This is a structure holding the pages, caching- and aperture
+ * binding status for a buffer object that isn't backed by fixed (VRAM / AGP)
  * memory.
  */
 struct ttm_tt {
+	/** @pages: Array of pages backing the data. */
 	struct page **pages;
+	/**
+	 * @page_flags: The page flags.
+	 *
+	 * Supported values:
+	 *
+	 * TTM_TT_FLAG_SWAPPED: Set by TTM when the pages have been unpopulated
+	 * and swapped out by TTM.  Calling ttm_tt_populate() will then swap the
+	 * pages back in, and unset the flag. Drivers should in general never
+	 * need to touch this.
+	 *
+	 * TTM_TT_FLAG_ZERO_ALLOC: Set if the pages will be zeroed on
+	 * allocation.
+	 *
+	 * TTM_TT_FLAG_EXTERNAL: Set if the underlying pages were allocated
+	 * externally, like with dma-buf or userptr. This effectively disables
+	 * TTM swapping out such pages.  Also important is to prevent TTM from
+	 * ever directly mapping these pages.
+	 *
+	 * Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable
+	 * this flag.
+	 *
+	 * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is
+	 * set by TTM after ttm_tt_populate() has successfully returned, and is
+	 * then unset when TTM calls ttm_tt_unpopulate().
+	 */
+#define TTM_TT_FLAG_SWAPPED	(1 << 0)
+#define TTM_TT_FLAG_ZERO_ALLOC	(1 << 1)
+#define TTM_TT_FLAG_EXTERNAL	(1 << 2)
+
+#define TTM_TT_FLAG_PRIV_POPULATED  (1 << 31)
 	uint32_t page_flags;
+	/** @num_pages: Number of pages in the page array. */
 	uint32_t num_pages;
+	/** @sg: for SG objects via dma-buf. */
 	struct sg_table *sg;
+	/** @dma_address: The DMA (bus) addresses of the pages. */
 	dma_addr_t *dma_address;
+	/** @swap_storage: Pointer to shmem struct file for swap storage. */
 	struct file *swap_storage;
+	/**
+	 * @caching: The current caching state of the pages, see enum
+	 * ttm_caching.
+	 */
 	enum ttm_caching caching;
 };
 
-- 
2.26.3


* [PATCH v5 07/13] drm/ttm: add TTM_TT_FLAG_EXTERNAL_MAPPABLE
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström, Christian König

In commit:

commit 667a50db0477d47fdff01c666f5ee1ce26b5264c
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date:   Fri Jan 3 11:17:18 2014 +0100

    drm/ttm: Refuse to fault (prime-) imported pages

we introduced the restriction that imported pages should not be directly
mappable through TTM (this also extends to userptr). In the next patch we
want to introduce a shmem_tt backend, which should follow all the existing
rules for TTM_TT_FLAG_EXTERNAL, since it will need to handle swapping
itself, but with the above mapping restriction lifted.
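
As a rough illustration (not part of this patch; the hook name and the
caching mode here are made up), a driver-side ttm_tt create hook that
opts in to the relaxed mapping behaviour would set both flags before
calling ttm_tt_init():

  static struct ttm_tt *example_tt_create(struct ttm_buffer_object *bo,
                                          uint32_t page_flags)
  {
          struct ttm_tt *tt;

          tt = kzalloc(sizeof(*tt), GFP_KERNEL);
          if (!tt)
                  return NULL;

          /* EXTERNAL_MAPPABLE is only valid together with EXTERNAL */
          page_flags |= TTM_TT_FLAG_EXTERNAL |
                        TTM_TT_FLAG_EXTERNAL_MAPPABLE;

          if (ttm_tt_init(tt, bo, page_flags, ttm_cached)) {
                  kfree(tt);
                  return NULL;
          }

          return tt;
  }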

v2(Christian):
  - Don't OR together EXTERNAL and EXTERNAL_MAPPABLE in the definition
    of EXTERNAL_MAPPABLE; just leave it to the caller to handle this
    correctly, otherwise we might encounter subtle issues.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c |  6 ++++--
 drivers/gpu/drm/ttm/ttm_tt.c    |  3 +++
 include/drm/ttm/ttm_tt.h        | 19 ++++++++++++++++---
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 950f4f132802..33680c94127c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -163,8 +163,10 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo,
 	 * (if at all) by redirecting mmap to the exporter.
 	 */
 	if (bo->ttm && (bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
-		dma_resv_unlock(bo->base.resv);
-		return VM_FAULT_SIGBUS;
+		if (!(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE)) {
+			dma_resv_unlock(bo->base.resv);
+			return VM_FAULT_SIGBUS;
+		}
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 86f31fde6e35..7e83c00a3f48 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -84,6 +84,9 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
 	if (unlikely(bo->ttm == NULL))
 		return -ENOMEM;
 
+	WARN_ON(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE &&
+		!(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL));
+
 	return 0;
 }
 
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 86d74069be3e..f20832139815 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -67,13 +67,26 @@ struct ttm_tt {
 	 * Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable
 	 * this flag.
 	 *
+	 * TTM_TT_FLAG_EXTERNAL_MAPPABLE: Same behaviour as
+	 * TTM_TT_FLAG_EXTERNAL, but with the reduced restriction that it is
+	 * still valid to use TTM to map the pages directly. This is useful when
+	 * implementing a ttm_tt backend which still allocates driver owned
+	 * pages underneath(say with shmem).
+	 *
+	 * Note that since this also implies TTM_TT_FLAG_EXTERNAL, the usage
+	 * here should always be:
+	 *
+	 *   page_flags = TTM_TT_FLAG_EXTERNAL |
+	 *		  TTM_TT_FLAG_EXTERNAL_MAPPABLE;
+	 *
 	 * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is
 	 * set by TTM after ttm_tt_populate() has successfully returned, and is
 	 * then unset when TTM calls ttm_tt_unpopulate().
 	 */
-#define TTM_TT_FLAG_SWAPPED	(1 << 0)
-#define TTM_TT_FLAG_ZERO_ALLOC	(1 << 1)
-#define TTM_TT_FLAG_EXTERNAL	(1 << 2)
+#define TTM_TT_FLAG_SWAPPED		(1 << 0)
+#define TTM_TT_FLAG_ZERO_ALLOC		(1 << 1)
+#define TTM_TT_FLAG_EXTERNAL		(1 << 2)
+#define TTM_TT_FLAG_EXTERNAL_MAPPABLE	(1 << 3)
 
 #define TTM_TT_FLAG_PRIV_POPULATED  (1 << 31)
 	uint32_t page_flags;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v5 08/13] drm/i915/gem: Break out some shmem backend utils
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Break out some shmem backend utils for future reuse by the TTM backend:
shmem_alloc_st(), shmem_free_st() and __shmem_writeback(), which we can
use to provide a shmem-backed TTM page pool for cached-only TTM
buffer objects.
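
For reference, the broken-out helpers end up with roughly the following
interfaces (they stay static in this patch and are only exported later
in the series):

  struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
                                  size_t size, struct intel_memory_region *mr,
                                  struct address_space *mapping,
                                  unsigned int max_segment);
  void shmem_free_st(struct sg_table *st, struct address_space *mapping,
                     bool dirty, bool backup);
  void __shmem_writeback(size_t size, struct address_space *mapping);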

The main functional change here is that we now compute the page sizes from
the DMA segments rather than from the physical page address segments.
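
A hand-rolled equivalent of that computation (illustrative only, not the
exact helper the code uses) simply ORs together the mapped DMA segment
lengths:

  static unsigned int example_sg_dma_sizes(struct scatterlist *sg)
  {
          unsigned int page_sizes = 0;

          /* Walk the mapped entries and accumulate their DMA lengths. */
          for (; sg; sg = sg_next(sg)) {
                  if (!sg_dma_len(sg))
                          break;
                  page_sizes |= sg_dma_len(sg);
          }

          return page_sizes;
  }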

v2(Reported-by: kernel test robot <lkp@intel.com>)
    - Make sure we initialise the mapping on the error path in
      shmem_get_pages()

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 181 +++++++++++++---------
 1 file changed, 106 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 11f072193f3b..36b711ae9e28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,46 +25,61 @@ static void check_release_pagevec(struct pagevec *pvec)
 	cond_resched();
 }
 
-static int shmem_get_pages(struct drm_i915_gem_object *obj)
+static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+			  bool dirty, bool backup)
 {
-	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	struct intel_memory_region *mem = obj->mm.region;
-	const unsigned long page_count = obj->base.size / PAGE_SIZE;
+	struct sgt_iter sgt_iter;
+	struct pagevec pvec;
+	struct page *page;
+
+	mapping_clear_unevictable(mapping);
+
+	pagevec_init(&pvec);
+	for_each_sgt_page(page, sgt_iter, st) {
+		if (dirty)
+			set_page_dirty(page);
+
+		if (backup)
+			mark_page_accessed(page);
+
+		if (!pagevec_add(&pvec, page))
+			check_release_pagevec(&pvec);
+	}
+	if (pagevec_count(&pvec))
+		check_release_pagevec(&pvec);
+
+	sg_free_table(st);
+	kfree(st);
+}
+
+static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				       size_t size, struct intel_memory_region *mr,
+				       struct address_space *mapping,
+				       unsigned int max_segment)
+{
+	const unsigned long page_count = size / PAGE_SIZE;
 	unsigned long i;
-	struct address_space *mapping;
 	struct sg_table *st;
 	struct scatterlist *sg;
-	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment = i915_sg_segment_size();
-	unsigned int sg_page_sizes;
 	gfp_t noreclaim;
 	int ret;
 
-	/*
-	 * Assert that the object is not currently in any GPU domain. As it
-	 * wasn't in the GTT, there shouldn't be any way it could have been in
-	 * a GPU cache
-	 */
-	GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS);
-	GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS);
-
 	/*
 	 * If there's no chance of allocating enough pages for the whole
 	 * object, bail early.
 	 */
-	if (obj->base.size > resource_size(&mem->region))
-		return -ENOMEM;
+	if (size > resource_size(&mr->region))
+		return ERR_PTR(-ENOMEM);
 
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (!st)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
-rebuild_st:
 	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
 		kfree(st);
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	}
 
 	/*
@@ -73,14 +88,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	 *
 	 * Fail silently without starting the shrinker
 	 */
-	mapping = obj->base.filp->f_mapping;
 	mapping_set_unevictable(mapping);
 	noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
 	noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
 
 	sg = st->sgl;
 	st->nents = 0;
-	sg_page_sizes = 0;
 	for (i = 0; i < page_count; i++) {
 		const unsigned int shrink[] = {
 			I915_SHRINK_BOUND | I915_SHRINK_UNBOUND,
@@ -135,10 +148,9 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 		if (!i ||
 		    sg->length >= max_segment ||
 		    page_to_pfn(page) != last_pfn + 1) {
-			if (i) {
-				sg_page_sizes |= sg->length;
+			if (i)
 				sg = sg_next(sg);
-			}
+
 			st->nents++;
 			sg_set_page(sg, page, PAGE_SIZE, 0);
 		} else {
@@ -149,14 +161,65 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 		/* Check that the i965g/gm workaround works. */
 		GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL);
 	}
-	if (sg) { /* loop terminated early; short sg table */
-		sg_page_sizes |= sg->length;
+	if (sg) /* loop terminated early; short sg table */
 		sg_mark_end(sg);
-	}
 
 	/* Trim unused sg entries to avoid wasting memory. */
 	i915_sg_trim(st);
 
+	return st;
+err_sg:
+	sg_mark_end(sg);
+	if (sg != st->sgl) {
+		shmem_free_st(st, mapping, false, false);
+	} else {
+		mapping_clear_unevictable(mapping);
+		sg_free_table(st);
+		kfree(st);
+	}
+
+	/*
+	 * shmemfs first checks if there is enough memory to allocate the page
+	 * and reports ENOSPC should there be insufficient, along with the usual
+	 * ENOMEM for a genuine allocation failure.
+	 *
+	 * We use ENOSPC in our driver to mean that we have run out of aperture
+	 * space and so want to translate the error from shmemfs back to our
+	 * usual understanding of ENOMEM.
+	 */
+	if (ret == -ENOSPC)
+		ret = -ENOMEM;
+
+	return ERR_PTR(ret);
+}
+
+static int shmem_get_pages(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct intel_memory_region *mem = obj->mm.region;
+	struct address_space *mapping = obj->base.filp->f_mapping;
+	const unsigned long page_count = obj->base.size / PAGE_SIZE;
+	unsigned int max_segment = i915_sg_segment_size();
+	struct sg_table *st;
+	struct sgt_iter sgt_iter;
+	struct page *page;
+	int ret;
+
+	/*
+	 * Assert that the object is not currently in any GPU domain. As it
+	 * wasn't in the GTT, there shouldn't be any way it could have been in
+	 * a GPU cache
+	 */
+	GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS);
+	GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS);
+
+rebuild_st:
+	st = shmem_alloc_st(i915, obj->base.size, mem, mapping, max_segment);
+	if (IS_ERR(st)) {
+		ret = PTR_ERR(st);
+		goto err_st;
+	}
+
 	ret = i915_gem_gtt_prepare_pages(obj, st);
 	if (ret) {
 		/*
@@ -168,6 +231,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 			for_each_sgt_page(page, sgt_iter, st)
 				put_page(page);
 			sg_free_table(st);
+			kfree(st);
 
 			max_segment = PAGE_SIZE;
 			goto rebuild_st;
@@ -200,28 +264,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
 		obj->cache_dirty = true;
 
-	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
+	__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
 
 	return 0;
 
-err_sg:
-	sg_mark_end(sg);
 err_pages:
-	mapping_clear_unevictable(mapping);
-	if (sg != st->sgl) {
-		struct pagevec pvec;
-
-		pagevec_init(&pvec);
-		for_each_sgt_page(page, sgt_iter, st) {
-			if (!pagevec_add(&pvec, page))
-				check_release_pagevec(&pvec);
-		}
-		if (pagevec_count(&pvec))
-			check_release_pagevec(&pvec);
-	}
-	sg_free_table(st);
-	kfree(st);
-
+	shmem_free_st(st, mapping, false, false);
 	/*
 	 * shmemfs first checks if there is enough memory to allocate the page
 	 * and reports ENOSPC should there be insufficient, along with the usual
@@ -231,6 +279,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	 * space and so want to translate the error from shmemfs back to our
 	 * usual understanding of ENOMEM.
 	 */
+err_st:
 	if (ret == -ENOSPC)
 		ret = -ENOMEM;
 
@@ -251,10 +300,8 @@ shmem_truncate(struct drm_i915_gem_object *obj)
 	obj->mm.pages = ERR_PTR(-EFAULT);
 }
 
-static void
-shmem_writeback(struct drm_i915_gem_object *obj)
+static void __shmem_writeback(size_t size, struct address_space *mapping)
 {
-	struct address_space *mapping;
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
 		.nr_to_write = SWAP_CLUSTER_MAX,
@@ -270,10 +317,9 @@ shmem_writeback(struct drm_i915_gem_object *obj)
 	 * instead of invoking writeback so they are aged and paged out
 	 * as normal.
 	 */
-	mapping = obj->base.filp->f_mapping;
 
 	/* Begin writeback on each dirty page */
-	for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) {
+	for (i = 0; i < size >> PAGE_SHIFT; i++) {
 		struct page *page;
 
 		page = find_lock_page(mapping, i);
@@ -296,6 +342,12 @@ shmem_writeback(struct drm_i915_gem_object *obj)
 	}
 }
 
+static void
+shmem_writeback(struct drm_i915_gem_object *obj)
+{
+	__shmem_writeback(obj->base.size, obj->base.filp->f_mapping);
+}
+
 void
 __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 				struct sg_table *pages,
@@ -316,11 +368,6 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 
 void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
 {
-	struct sgt_iter sgt_iter;
-	struct pagevec pvec;
-	struct page *page;
-
-	GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
 	__i915_gem_object_release_shmem(obj, pages, true);
 
 	i915_gem_gtt_finish_pages(obj, pages);
@@ -328,25 +375,9 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_save_bit_17_swizzle(obj, pages);
 
-	mapping_clear_unevictable(file_inode(obj->base.filp)->i_mapping);
-
-	pagevec_init(&pvec);
-	for_each_sgt_page(page, sgt_iter, pages) {
-		if (obj->mm.dirty)
-			set_page_dirty(page);
-
-		if (obj->mm.madv == I915_MADV_WILLNEED)
-			mark_page_accessed(page);
-
-		if (!pagevec_add(&pvec, page))
-			check_release_pagevec(&pvec);
-	}
-	if (pagevec_count(&pvec))
-		check_release_pagevec(&pvec);
+	shmem_free_st(pages, file_inode(obj->base.filp)->i_mapping,
+		      obj->mm.dirty, obj->mm.madv == I915_MADV_WILLNEED);
 	obj->mm.dirty = false;
-
-	sg_free_table(pages);
-	kfree(pages);
 }
 
 static void
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström, Christian König

For cached objects we can allocate our pages directly in shmem. This
should make it possible (in a later patch) to utilise the existing
i915-gem shrinker code for such objects. For now this is still disabled.
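
The core of the backend selection in i915_ttm_tt_create() boils down to
the following (abridged from the diff below):

  enum ttm_caching caching = i915_ttm_select_tt_caching(obj);

  if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
          /* Pages will come from shmem; TTM must not swap or map them itself. */
          page_flags |= TTM_TT_FLAG_EXTERNAL |
                        TTM_TT_FLAG_EXTERNAL_MAPPABLE;
          i915_tt->is_shmem = true;
  }

  ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);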

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
    to check if the object is even still shrinkable; in between us
    dropping the shrinker LRU lock and acquiring the object lock it could for
    example have been moved. Also we need to differentiate between
    "lazy" shrinking and the immediate writeback mode. Also later we need to
    handle objects which don't even have mm.pages, so bundling this into
    put_pages() would require somehow handling that edge case, hence
    just letting the ttm backend handle everything in try_to_writeback
    doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook, since
    it's not possible to hold a reference without also creating a circular
    dependency, so this is likely too fragile. For now just ensure we at
    least mark the pages as dirty/accessed when called from the shrinker
    on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
    than just writeback.
  - Get rid of the do_backup boolean and just set the SWAPPED flag prior
    to calling unpopulate (see the condensed flow sketched below).
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
    these just get skipped anyway. We can try to come up with something
    better later.
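
Condensed, the shrinker_release_pages() flow for shmem-backed objects
then looks like this (abridged from the diff below):

  switch (obj->mm.madv) {
  case I915_MADV_DONTNEED:
          return __i915_ttm_purge(obj);
  case __I915_MADV_PURGED:
          return 0;
  }

  if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
          return 0; /* already backed up */

  bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
  ret = ttm_bo_validate(bo, &place, &ctx); /* empty placement unpopulates */
  if (ret) {
          bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
          return ret;
  }

  if (should_writeback)
          __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);

  return 0;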

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++--
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3043fcbd31bd..1c9a1d8d3434 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
 bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
 					enum intel_memory_type type);
 
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index fa2ba9e2a4d0..f0fb17be2f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
 			  struct sg_table *pages);
 	void (*truncate)(struct drm_i915_gem_object *obj);
 	void (*writeback)(struct drm_i915_gem_object *obj);
+	int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+				      bool should_writeback);
 
 	int (*pread)(struct drm_i915_gem_object *obj,
 		     const struct drm_i915_gem_pread *arg);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 36b711ae9e28..19e55cc29a15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
 	cond_resched();
 }
 
-static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
-			  bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup)
 {
 	struct sgt_iter sgt_iter;
 	struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
 	kfree(st);
 }
 
-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
-				       size_t size, struct intel_memory_region *mr,
-				       struct address_space *mapping,
-				       unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment)
 {
 	const unsigned long page_count = size / PAGE_SIZE;
 	unsigned long i;
@@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
 	obj->mm.pages = ERR_PTR(-EFAULT);
 }
 
-static void __shmem_writeback(size_t size, struct address_space *mapping)
+void __shmem_writeback(size_t size, struct address_space *mapping)
 {
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index e382b7f2353b..cc80bd23d323 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
 	return false;
 }
 
-static void try_to_writeback(struct drm_i915_gem_object *obj,
-			     unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags)
 {
+	if (obj->ops->shrinker_release_pages)
+		return obj->ops->shrinker_release_pages(obj,
+							flags & I915_SHRINK_WRITEBACK);
+
 	switch (obj->mm.madv) {
 	case I915_MADV_DONTNEED:
 		i915_gem_object_truncate(obj);
-		return;
+		return 0;
 	case __I915_MADV_PURGED:
-		return;
+		return 0;
 	}
 
 	if (flags & I915_SHRINK_WRITEBACK)
 		i915_gem_object_writeback(obj);
+
+	return 0;
 }
 
 /**
@@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 				}
 
 				if (!__i915_gem_object_put_pages(obj)) {
-					try_to_writeback(obj, shrink);
-					count += obj->base.size >> PAGE_SHIFT;
+					if (!try_to_writeback(obj, shrink))
+						count += obj->base.size >> PAGE_SHIFT;
 				}
 				if (!ww)
 					i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index a77e90f300fe..c7402995a8f9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -35,6 +35,8 @@
  * @ttm: The base TTM page vector.
  * @dev: The struct device used for dma mapping and unmapping.
  * @cached_st: The cached scatter-gather table.
+ * @is_shmem: Set if using shmem.
+ * @filp: The shmem file, if using shmem backend.
  *
  * Note that DMA may be going on right up to the point where the page-
  * vector is unpopulated in delayed destroy. Hence keep the
@@ -46,6 +48,9 @@ struct i915_ttm_tt {
 	struct ttm_tt ttm;
 	struct device *dev;
 	struct sg_table *cached_st;
+
+	bool is_shmem;
+	struct file *filp;
 };
 
 static const struct ttm_place sys_placement_flags = {
@@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
 	placement->busy_placement = busy;
 }
 
+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
+				      struct ttm_tt *ttm,
+				      struct ttm_operation_ctx *ctx)
+{
+	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
+	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	const unsigned int max_segment = i915_sg_segment_size();
+	const size_t size = ttm->num_pages << PAGE_SHIFT;
+	struct file *filp = i915_tt->filp;
+	struct sgt_iter sgt_iter;
+	struct sg_table *st;
+	struct page *page;
+	unsigned long i;
+	int err;
+
+	if (!filp) {
+		struct address_space *mapping;
+		gfp_t mask;
+
+		filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
+		if (IS_ERR(filp))
+			return PTR_ERR(filp);
+
+		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+
+		mapping = filp->f_mapping;
+		mapping_set_gfp_mask(mapping, mask);
+		GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
+
+		i915_tt->filp = filp;
+	}
+
+	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
+	if (IS_ERR(st))
+		return PTR_ERR(st);
+
+	err = dma_map_sg_attrs(i915_tt->dev,
+			       st->sgl, st->nents,
+			       PCI_DMA_BIDIRECTIONAL,
+			       DMA_ATTR_SKIP_CPU_SYNC |
+			       DMA_ATTR_NO_KERNEL_MAPPING |
+			       DMA_ATTR_NO_WARN);
+	if (err <= 0) {
+		err = -EINVAL;
+		goto err_free_st;
+	}
+
+	i = 0;
+	for_each_sgt_page(page, sgt_iter, st)
+		ttm->pages[i++] = page;
+
+	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+
+	i915_tt->cached_st = st;
+	return 0;
+
+err_free_st:
+	shmem_free_st(st, filp->f_mapping, false, false);
+	return err;
+}
+
+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm)
+{
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
+
+	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
+		     i915_tt->cached_st->nents,
+		     PCI_DMA_BIDIRECTIONAL);
+
+	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
+		      file_inode(i915_tt->filp)->i_mapping,
+		      backup, backup);
+}
+
 static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 					 uint32_t page_flags)
 {
 	struct ttm_resource_manager *man =
 		ttm_manager_type(bo->bdev, bo->resource->mem_type);
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+	enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
 	struct i915_ttm_tt *i915_tt;
 	int ret;
 
@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 	    man->use_tt)
 		page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
 
-	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
-			  i915_ttm_select_tt_caching(obj));
-	if (ret) {
-		kfree(i915_tt);
-		return NULL;
+	if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
+		page_flags |= TTM_TT_FLAG_EXTERNAL |
+			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
+		i915_tt->is_shmem = true;
 	}
 
+	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
+	if (ret)
+		goto err_free;
+
 	i915_tt->dev = obj->base.dev->dev;
 
 	return &i915_tt->ttm;
+
+err_free:
+	kfree(i915_tt);
+	return NULL;
+}
+
+static int i915_ttm_tt_populate(struct ttm_device *bdev,
+				struct ttm_tt *ttm,
+				struct ttm_operation_ctx *ctx)
+{
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+
+	if (i915_tt->is_shmem)
+		return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
+
+	return ttm_pool_alloc(&bdev->pool, ttm, ctx);
 }
 
 static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)
 {
 	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
 
-	if (i915_tt->cached_st) {
-		dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
-				  DMA_BIDIRECTIONAL, 0);
-		sg_free_table(i915_tt->cached_st);
-		kfree(i915_tt->cached_st);
-		i915_tt->cached_st = NULL;
+	if (i915_tt->is_shmem) {
+		i915_ttm_tt_shmem_unpopulate(ttm);
+	} else {
+		if (i915_tt->cached_st) {
+			dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
+					  DMA_BIDIRECTIONAL, 0);
+			sg_free_table(i915_tt->cached_st);
+			kfree(i915_tt->cached_st);
+			i915_tt->cached_st = NULL;
+		}
+		ttm_pool_free(&bdev->pool, ttm);
 	}
-	ttm_pool_free(&bdev->pool, ttm);
 }
 
 static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)
 {
 	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
 
+	if (i915_tt->filp)
+		fput(i915_tt->filp);
+
 	ttm_tt_fini(ttm);
 	kfree(i915_tt);
 }
@@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo,
 {
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
 
+	/*
+	 * EXTERNAL objects should never be swapped out by TTM, instead we need
+	 * to handle that ourselves. TTM will already skip such objects for us,
+	 * but we would like to avoid grabbing locks for no good reason.
+	 */
+	if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
+		return -EBUSY;
+
 	/* Will do for now. Our pinned objects are still on TTM's LRU lists */
 	return i915_gem_object_evictable(obj);
 }
@@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 }
 
-static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
 {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 	struct ttm_operation_ctx ctx = {
 		.interruptible = true,
 		.no_wait_gpu = false,
@@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)
 	int ret;
 
 	if (obj->mm.madv == __I915_MADV_PURGED)
-		return;
+		return 0;
 
-	/* TTM's purge interface. Note that we might be reentering. */
 	ret = ttm_bo_validate(bo, &place, &ctx);
-	if (!ret) {
-		obj->write_domain = 0;
-		obj->read_domains = 0;
-		i915_ttm_adjust_gem_after_move(obj);
-		i915_ttm_free_cached_io_st(obj);
-		obj->mm.madv = __I915_MADV_PURGED;
+	if (ret)
+		return ret;
+
+	if (bo->ttm && i915_tt->filp) {
+		/*
+		 * The below fput(which eventually calls shmem_truncate) might
+		 * be delayed by worker, so when directly called to purge the
+		 * pages(like by the shrinker) we should try to be more
+		 * aggressive and release the pages immediately.
+		 */
+		shmem_truncate_range(file_inode(i915_tt->filp),
+				     0, (loff_t)-1);
+		fput(fetch_and_zero(&i915_tt->filp));
+	}
+
+	obj->write_domain = 0;
+	obj->read_domains = 0;
+	i915_ttm_adjust_gem_after_move(obj);
+	i915_ttm_free_cached_io_st(obj);
+	obj->mm.madv = __I915_MADV_PURGED;
+	return 0;
+}
+
+static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+{
+	__i915_ttm_purge(obj);
+}
+
+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
+					   bool should_writeback)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
+	struct ttm_operation_ctx ctx = {
+		.interruptible = true,
+		.no_wait_gpu = false,
+	};
+	struct ttm_placement place = {};
+	int ret;
+
+	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
+		return 0;
+
+	GEM_BUG_ON(!i915_tt->is_shmem);
+
+	if (!i915_tt->filp)
+		return 0;
+
+	switch (obj->mm.madv) {
+	case I915_MADV_DONTNEED:
+		return __i915_ttm_purge(obj);
+	case __I915_MADV_PURGED:
+		return 0;
+	}
+
+	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		return 0;
+
+	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
+	ret = ttm_bo_validate(bo, &place, &ctx);
+	if (ret) {
+		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+		return ret;
 	}
+
+	if (should_writeback)
+		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
+
+	return 0;
 }
 
 static void i915_ttm_swap_notify(struct ttm_buffer_object *bo)
@@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
 
 static struct ttm_device_funcs i915_ttm_bo_driver = {
 	.ttm_tt_create = i915_ttm_tt_create,
+	.ttm_tt_populate = i915_ttm_tt_populate,
 	.ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
 	.ttm_tt_destroy = i915_ttm_tt_destroy,
 	.eviction_valuable = i915_ttm_eviction_valuable,
@@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_object_has_pages(obj)) {
+		struct i915_ttm_tt *i915_tt =
+			container_of(bo->ttm, typeof(*i915_tt), ttm);
+
 		/* Object either has a page vector or is an iomem object */
 		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
 		if (IS_ERR(st))
 			return PTR_ERR(st);
 
 		__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
+		if (!bo->ttm || !i915_tt->is_shmem)
+			i915_gem_object_make_unshrinkable(obj);
 	}
 
 	return ret;
@@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
 static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 
 	/*
 	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
@@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 	 * Put on the correct LRU list depending on the MADV status
 	 */
 	spin_lock(&bo->bdev->lru_lock);
-	if (obj->mm.madv != I915_MADV_WILLNEED) {
+	if (bo->ttm && i915_tt->filp) {
+		/* Try to keep shmem_tt from being considered for shrinking. */
+		bo->priority = TTM_MAX_BO_PRIORITY - 1;
+	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
 		bo->priority = I915_TTM_PRIO_PURGE;
 	} else if (!i915_gem_object_has_pages(obj)) {
 		if (bo->priority < I915_TTM_PRIO_HAS_PAGES)
@@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
 	.get_pages = i915_ttm_get_pages,
 	.put_pages = i915_ttm_put_pages,
 	.truncate = i915_ttm_purge,
+	.shrinker_release_pages = i915_ttm_shrinker_release_pages,
+
 	.adjust_lru = i915_ttm_adjust_lru,
 	.delayed_free = i915_ttm_delayed_free,
 	.migrate = i915_ttm_migrate,
+
 	.mmap_offset = i915_ttm_mmap_offset,
 	.mmap_ops = &vm_ops_ttm,
 };
@@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
 	i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
 	i915_gem_object_init_memory_region(obj, mem);
-	i915_gem_object_make_unshrinkable(obj);
 	INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
 	mutex_init(&obj->ttm.get_io_page.lock);
 	bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

+	 * to handle that ourselves. TTM will already skip such objects for us,
+	 * but we would like to avoid grabbing locks for no good reason.
+	 */
+	if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
+		return -EBUSY;
+
 	/* Will do for now. Our pinned objects are still on TTM's LRU lists */
 	return i915_gem_object_evictable(obj);
 }
@@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 }
 
-static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
 {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 	struct ttm_operation_ctx ctx = {
 		.interruptible = true,
 		.no_wait_gpu = false,
@@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)
 	int ret;
 
 	if (obj->mm.madv == __I915_MADV_PURGED)
-		return;
+		return 0;
 
-	/* TTM's purge interface. Note that we might be reentering. */
 	ret = ttm_bo_validate(bo, &place, &ctx);
-	if (!ret) {
-		obj->write_domain = 0;
-		obj->read_domains = 0;
-		i915_ttm_adjust_gem_after_move(obj);
-		i915_ttm_free_cached_io_st(obj);
-		obj->mm.madv = __I915_MADV_PURGED;
+	if (ret)
+		return ret;
+
+	if (bo->ttm && i915_tt->filp) {
+		/*
+		 * The below fput(which eventually calls shmem_truncate) might
+		 * be delayed by worker, so when directly called to purge the
+		 * pages(like by the shrinker) we should try to be more
+		 * aggressive and release the pages immediately.
+		 */
+		shmem_truncate_range(file_inode(i915_tt->filp),
+				     0, (loff_t)-1);
+		fput(fetch_and_zero(&i915_tt->filp));
+	}
+
+	obj->write_domain = 0;
+	obj->read_domains = 0;
+	i915_ttm_adjust_gem_after_move(obj);
+	i915_ttm_free_cached_io_st(obj);
+	obj->mm.madv = __I915_MADV_PURGED;
+	return 0;
+}
+
+static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+{
+	__i915_ttm_purge(obj);
+}
+
+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
+					   bool should_writeback)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
+	struct ttm_operation_ctx ctx = {
+		.interruptible = true,
+		.no_wait_gpu = false,
+	};
+	struct ttm_placement place = {};
+	int ret;
+
+	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
+		return 0;
+
+	GEM_BUG_ON(!i915_tt->is_shmem);
+
+	if (!i915_tt->filp)
+		return 0;
+
+	switch (obj->mm.madv) {
+	case I915_MADV_DONTNEED:
+		return __i915_ttm_purge(obj);
+	case __I915_MADV_PURGED:
+		return 0;
+	}
+
+	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		return 0;
+
+	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
+	ret = ttm_bo_validate(bo, &place, &ctx);
+	if (ret) {
+		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+		return ret;
 	}
+
+	if (should_writeback)
+		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
+
+	return 0;
 }
 
 static void i915_ttm_swap_notify(struct ttm_buffer_object *bo)
@@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
 
 static struct ttm_device_funcs i915_ttm_bo_driver = {
 	.ttm_tt_create = i915_ttm_tt_create,
+	.ttm_tt_populate = i915_ttm_tt_populate,
 	.ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
 	.ttm_tt_destroy = i915_ttm_tt_destroy,
 	.eviction_valuable = i915_ttm_eviction_valuable,
@@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_object_has_pages(obj)) {
+		struct i915_ttm_tt *i915_tt =
+			container_of(bo->ttm, typeof(*i915_tt), ttm);
+
 		/* Object either has a page vector or is an iomem object */
 		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
 		if (IS_ERR(st))
 			return PTR_ERR(st);
 
 		__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
+		if (!bo->ttm || !i915_tt->is_shmem)
+			i915_gem_object_make_unshrinkable(obj);
 	}
 
 	return ret;
@@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
 static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 
 	/*
 	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
@@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 	 * Put on the correct LRU list depending on the MADV status
 	 */
 	spin_lock(&bo->bdev->lru_lock);
-	if (obj->mm.madv != I915_MADV_WILLNEED) {
+	if (bo->ttm && i915_tt->filp) {
+		/* Try to keep shmem_tt from being considered for shrinking. */
+		bo->priority = TTM_MAX_BO_PRIORITY - 1;
+	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
 		bo->priority = I915_TTM_PRIO_PURGE;
 	} else if (!i915_gem_object_has_pages(obj)) {
 		if (bo->priority < I915_TTM_PRIO_HAS_PAGES)
@@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
 	.get_pages = i915_ttm_get_pages,
 	.put_pages = i915_ttm_put_pages,
 	.truncate = i915_ttm_purge,
+	.shrinker_release_pages = i915_ttm_shrinker_release_pages,
+
 	.adjust_lru = i915_ttm_adjust_lru,
 	.delayed_free = i915_ttm_delayed_free,
 	.migrate = i915_ttm_migrate,
+
 	.mmap_offset = i915_ttm_mmap_offset,
 	.mmap_ops = &vm_ops_ttm,
 };
@@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
 	i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
 	i915_gem_object_init_memory_region(obj, mem);
-	i915_gem_object_make_unshrinkable(obj);
 	INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
 	mutex_init(&obj->ttm.get_io_page.lock);
 	bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
-- 
2.26.3

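To restate the core of the new shrink path above in one place:
i915_ttm_shrinker_release_pages() validates the BO against an empty
placement list, which makes TTM drop the backing store, and since
TTM_TT_FLAG_SWAPPED is set first the shmem unpopulate path appears to
keep (and dirty) the page contents in the shmem file rather than
truncating them, leaving them for the core VM to swap out. A
stripped-down sketch, assuming the same local variables as the function
above and omitting the madv and mem_type checks:

	struct ttm_operation_ctx ctx = { .interruptible = true };
	struct ttm_placement place = {}; /* no placements: just drop the backing store */
	int err;

	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
	err = ttm_bo_validate(bo, &place, &ctx); /* unpopulates the shmem-backed ttm_tt */
	if (err) {
		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
		return err;
	}

	if (should_writeback)
		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);

	return 0;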

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v5 10/13] drm/i915: try to simplify make_{un}shrinkable
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

Drop the atomic shrink_pin stuff, and just have make_{un}shrinkable
update the shrinker visible lists immediately. This at least simplifies
the next patch, and does make the behaviour more obvious. The potential
downside is that make_unshrinkable now grabs a global lock even when the
object itself is no longer shrinkable (transitioning from purgeable <->
shrinkable doesn't seem to be a thing), so for example in the ppGTT
insertion paths we should now be careful not to needlessly call
make_unshrinkable multiple times. Outside of that there is some fallout
in intel_context, which relies on nesting calls to shrink_pin.

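For reference, a minimal usage sketch of the resulting interface; the
create/pin helpers named here are just the usual i915 ones (with an i915
device pointer assumed to be in scope) and are not touched by this
patch:

	struct drm_i915_gem_object *obj;
	int err;

	obj = i915_gem_object_create_internal(i915, SZ_64K);
	if (IS_ERR(obj))
		return PTR_ERR(obj);

	err = i915_gem_object_pin_pages_unlocked(obj);
	if (err) {
		i915_gem_object_put(obj);
		return err;
	}

	/*
	 * Perma-pinned kernel object: hide it from the shrinker. With the
	 * shrink_pin counter gone this now delists the object immediately,
	 * under the global obj_lock, so avoid calling it repeatedly in hot
	 * paths such as the ppGTT insertion loops.
	 */
	i915_gem_object_make_unshrinkable(obj);

	/* ... later, once the object may be reclaimed again ... */
	i915_gem_object_make_shrinkable(obj);
	i915_gem_object_unpin_pages(obj);
	i915_gem_object_put(obj);
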
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  9 ----
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 16 +-----
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  | 52 +++++++++++++------
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  1 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  1 -
 drivers/gpu/drm/i915/gt/intel_context.c       |  9 +---
 7 files changed, 41 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 6fb9afb65034..e8265a432fcb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -305,15 +305,6 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	 */
 	atomic_inc(&i915->mm.free_count);
 
-	/*
-	 * This serializes freeing with the shrinker. Since the free
-	 * is delayed, first by RCU then by the workqueue, we want the
-	 * shrinker to be able to free pages of unreferenced objects,
-	 * or else we may oom whilst there are plenty of deferred
-	 * freed objects.
-	 */
-	i915_gem_object_make_unshrinkable(obj);
-
 	/*
 	 * Since we require blocking on struct_mutex to unbind the freed
 	 * object from the GPU before releasing resources back to the
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index f0fb17be2f7a..e4f8a6774da8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -461,7 +461,6 @@ struct drm_i915_gem_object {
 		 * instead go through the pin/unpin interfaces.
 		 */
 		atomic_t pages_pin_count;
-		atomic_t shrink_pin;
 
 		/**
 		 * Priority list of potential placements for this object.
@@ -522,7 +521,7 @@ struct drm_i915_gem_object {
 		struct i915_gem_object_page_iter get_dma_page;
 
 		/**
-		 * Element within i915->mm.unbound_list or i915->mm.bound_list,
+		 * Element within i915->mm.shrink_list or i915->mm.purge_list,
 		 * locked by i915->mm.obj_lock.
 		 */
 		struct list_head link;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 8eb1c3a6fc9c..f0df1394d7f6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -64,28 +64,16 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 		GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj));
 		i915_gem_object_set_tiling_quirk(obj);
 		GEM_BUG_ON(!list_empty(&obj->mm.link));
-		atomic_inc(&obj->mm.shrink_pin);
 		shrinkable = false;
 	}
 
 	if (shrinkable) {
-		struct list_head *list;
-		unsigned long flags;
-
 		assert_object_held(obj);
-		spin_lock_irqsave(&i915->mm.obj_lock, flags);
-
-		i915->mm.shrink_count++;
-		i915->mm.shrink_memory += obj->base.size;
 
 		if (obj->mm.madv != I915_MADV_WILLNEED)
-			list = &i915->mm.purge_list;
+			i915_gem_object_make_purgeable(obj);
 		else
-			list = &i915->mm.shrink_list;
-		list_add_tail(&obj->mm.link, list);
-
-		atomic_set(&obj->mm.shrink_pin, 0);
-		spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
+			i915_gem_object_make_shrinkable(obj);
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index cc80bd23d323..0440696f786a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -460,23 +460,26 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
 
 #define obj_to_i915(obj__) to_i915((obj__)->base.dev)
 
+/**
+ * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
+ * default all object types that support shrinking(see IS_SHRINKABLE), will also
+ * make the object visible to the shrinker after allocating the system memory
+ * pages.
+ * @obj: The GEM object.
+ *
+ * This is typically used for special kernel internal objects that can't be
+ * easily processed by the shrinker, like if they are perma-pinned.
+ */
 void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *i915 = obj_to_i915(obj);
 	unsigned long flags;
 
-	/*
-	 * We can only be called while the pages are pinned or when
-	 * the pages are released. If pinned, we should only be called
-	 * from a single caller under controlled conditions; and on release
-	 * only one caller may release us. Neither the two may cross.
-	 */
-	if (atomic_add_unless(&obj->mm.shrink_pin, 1, 0))
+	if (!i915_gem_object_is_shrinkable(obj))
 		return;
 
 	spin_lock_irqsave(&i915->mm.obj_lock, flags);
-	if (!atomic_fetch_inc(&obj->mm.shrink_pin) &&
-	    !list_empty(&obj->mm.link)) {
+	if (!list_empty(&obj->mm.link)) {
 		list_del_init(&obj->mm.link);
 		i915->mm.shrink_count--;
 		i915->mm.shrink_memory -= obj->base.size;
@@ -494,28 +497,45 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
 	if (!i915_gem_object_is_shrinkable(obj))
 		return;
 
-	if (atomic_add_unless(&obj->mm.shrink_pin, -1, 1))
-		return;
-
 	spin_lock_irqsave(&i915->mm.obj_lock, flags);
+
 	GEM_BUG_ON(!kref_read(&obj->base.refcount));
-	if (atomic_dec_and_test(&obj->mm.shrink_pin)) {
-		GEM_BUG_ON(!list_empty(&obj->mm.link));
 
-		list_add_tail(&obj->mm.link, head);
+	if (list_empty(&obj->mm.link)) {
 		i915->mm.shrink_count++;
 		i915->mm.shrink_memory += obj->base.size;
-
+		list_add_tail(&obj->mm.link, head);
+	} else {
+		list_move_tail(&obj->mm.link, head);
 	}
+
 	spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 }
 
+
+/**
+ * i915_gem_object_make_shrinkable - Move the object to the tail of the
+ * shrinkable list. Objects on this list might be swapped out. Used with
+ * WILLNEED objects.
+ * @obj: The GEM object.
+ *
+ * Should only be called on objects which have backing pages.
+ */
 void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj)
 {
 	__i915_gem_object_make_shrinkable(obj,
 					  &obj_to_i915(obj)->mm.shrink_list);
 }
 
+/**
+ * i915_gem_object_make_purgeable - Move the object to the tail of the purgeable
+ * list. Used with DONTNEED objects. Unlike with shrinkable objects, the
+ * shrinker will attempt to discard the backing pages, instead of trying to swap
+ * them out.
+ * @obj: The GEM object.
+ *
+ * Should only be called on objects which have backing pages.
+ */
 void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj)
 {
 	__i915_gem_object_make_shrinkable(obj,
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 890191f286e3..baea9770200a 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -185,7 +185,6 @@ static void gen6_alloc_va_range(struct i915_address_space *vm,
 
 			pt = stash->pt[0];
 			__i915_gem_object_pin_pages(pt->base);
-			i915_gem_object_make_unshrinkable(pt->base);
 
 			fill32_px(pt, vm->scratch[0]->encode);
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 037a9a6e4889..8af2f709571c 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -301,7 +301,6 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * const vm,
 
 			pt = stash->pt[!!lvl];
 			__i915_gem_object_pin_pages(pt->base);
-			i915_gem_object_make_unshrinkable(pt->base);
 
 			fill_px(pt, vm->scratch[lvl]->encode);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index ff637147b1a9..1b7dc57e6ec1 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -111,11 +111,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww)
 	if (err)
 		goto err_unpin;
 
-	/*
-	 * And mark it as a globally pinned object to let the shrinker know
-	 * it cannot reclaim the object until we release it.
-	 */
-	i915_vma_make_unshrinkable(vma);
 	vma->obj->mm.dirty = true;
 
 	return 0;
@@ -127,7 +122,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww)
 
 static void __context_unpin_state(struct i915_vma *vma)
 {
-	i915_vma_make_shrinkable(vma);
 	i915_active_release(&vma->active);
 	__i915_vma_unpin(vma);
 }
@@ -180,7 +174,6 @@ static int intel_context_pre_pin(struct intel_context *ce,
 	if (err)
 		goto err_timeline;
 
-
 	return 0;
 
 err_timeline:
@@ -338,6 +331,8 @@ static void __intel_context_retire(struct i915_active *active)
 
 	set_bit(CONTEXT_VALID_BIT, &ce->flags);
 	intel_context_post_unpin(ce);
+	if (ce->state)
+		i915_vma_make_shrinkable(ce->state);
 	intel_context_put(ce);
 }
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v5 11/13] drm/i915/ttm: make evicted shmem pages visible to the shrinker
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

We currently just evict lmem objects to system memory when under memory
pressure, and in the next patch we want to use the shmem backend even
for this case. Here we lack the usual object mm.pages, which
effectively hides the pages from the i915-gem shrinker, until we
actually "attach" the TT to the object, or, in the case of lmem-only
objects, until the object is migrated back to lmem when touched again.

For all cases we can just adjust the i915 shrinker LRU each time we also
adjust the TTM LRU. The two cases we care about are:

  1) When something is moved by TTM, including when initially populating
     an object. Importantly this covers the case where TTM moves something from
     lmem <-> smem, outside of the normal get_pages() interface, which
     should still ensure the shmem pages underneath are reclaimable.

  2) When calling into i915_gem_object_unlock(). The unlock should
     ensure the object is removed from the shrinker LRU, if it was indeed
     swapped out, or just purged, when the shrinker drops the object lock.

We can optimise this (if needed) by tracking if the object is already
visible to the shrinker (protected by the object lock), so we don't touch
the shrinker LRU more than needed.

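Concretely, the rule applied in i915_ttm_adjust_lru() boils down to the
following; the names are taken from the diff below, with the locking and
kref check elided, so treat it as a sketch rather than the exact hunk:

	/*
	 * Mirror the TTM state onto the i915 shrinker LRU: if the ttm_tt is
	 * populated and backed by a shmem file then the pages are reclaimable,
	 * even though obj->mm.pages may be missing (e.g. after an lmem -> smem
	 * eviction), so keep the object visible. Otherwise hide it.
	 */
	if (bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm))
		__i915_gem_object_make_shrinkable(obj);
	else
		i915_gem_object_make_unshrinkable(obj);
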
v2(Thomas)
  - Handle managing the shrinker LRU in adjust_lru, where it is always
    safe to touch the object.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 29 +++++++++++++++-----
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c      | 28 +++++++++++++++----
 3 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 1c9a1d8d3434..640dfbf1f01e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -523,6 +523,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 
 void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj);
 void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj);
+void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj);
 void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj);
 
 static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 0440696f786a..4b6b2bb6f180 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -487,13 +487,12 @@ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj)
 	spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 }
 
-static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
-					      struct list_head *head)
+static void ___i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
+					       struct list_head *head)
 {
 	struct drm_i915_private *i915 = obj_to_i915(obj);
 	unsigned long flags;
 
-	GEM_BUG_ON(!i915_gem_object_has_pages(obj));
 	if (!i915_gem_object_is_shrinkable(obj))
 		return;
 
@@ -512,6 +511,21 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
 	spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 }
 
+/**
+ * __i915_gem_object_make_shrinkable - Move the object to the tail of the
+ * shrinkable list. Objects on this list might be swapped out. Used with
+ * WILLNEED objects.
+ * @obj: The GEM object.
+ *
+ * DO NOT USE. This is intended to be called on very special objects that don't
+ * yet have mm.pages, but are guaranteed to have potentially reclaimable pages
+ * underneath.
+ */
+void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj)
+{
+	___i915_gem_object_make_shrinkable(obj,
+					   &obj_to_i915(obj)->mm.shrink_list);
+}
 
 /**
  * i915_gem_object_make_shrinkable - Move the object to the tail of the
@@ -523,8 +537,8 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
  */
 void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj)
 {
-	__i915_gem_object_make_shrinkable(obj,
-					  &obj_to_i915(obj)->mm.shrink_list);
+	GEM_BUG_ON(!i915_gem_object_has_pages(obj));
+	__i915_gem_object_make_shrinkable(obj);
 }
 
 /**
@@ -538,6 +552,7 @@ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj)
  */
 void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj)
 {
-	__i915_gem_object_make_shrinkable(obj,
-					  &obj_to_i915(obj)->mm.purge_list);
+	GEM_BUG_ON(!i915_gem_object_has_pages(obj));
+	___i915_gem_object_make_shrinkable(obj,
+					   &obj_to_i915(obj)->mm.purge_list);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index c7402995a8f9..194e5f1deda8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -749,6 +749,8 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 			return ret;
 	}
 
+	i915_ttm_adjust_lru(obj);
+
 	dst_st = i915_ttm_resource_get_st(obj, dst_mem);
 	if (IS_ERR(dst_st))
 		return PTR_ERR(dst_st);
@@ -856,7 +858,6 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
 			return i915_ttm_err_to_gem(ret);
 	}
 
-	i915_ttm_adjust_lru(obj);
 	if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) {
 		ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx);
 		if (ret)
@@ -876,10 +877,10 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
 			return PTR_ERR(st);
 
 		__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
-		if (!bo->ttm || !i915_tt->is_shmem)
-			i915_gem_object_make_unshrinkable(obj);
 	}
 
+	i915_ttm_adjust_lru(obj);
+
 	return ret;
 }
 
@@ -950,8 +951,6 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
 	 * If the object is not destroyed next, The TTM eviction logic
 	 * and shrinkers will move it out if needed.
 	 */
-
-	i915_ttm_adjust_lru(obj);
 }
 
 static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
@@ -967,6 +966,17 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 	if (!kref_read(&bo->kref))
 		return;
 
+	/*
+	 * Even if we lack mm.pages for this object(which will be the case when
+	 * something is evicted to system memory by TTM), we still want to make
+	 * this object visible to the shrinker, since the underlying ttm_tt
+	 * still has the real shmem pages.
+	 */
+	if (bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm))
+		__i915_gem_object_make_shrinkable(obj);
+	else
+		i915_gem_object_make_unshrinkable(obj);
+
 	/*
 	 * Put on the correct LRU list depending on the MADV status
 	 */
@@ -1006,6 +1016,14 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj)
 {
 	if (obj->ttm.created) {
+		/*
+		 * We freely manage the shrinker LRU outside of the mm.pages life
+		 * cycle. As a result when destroying the object it's up to us
+		 * to ensure we remove it from the LRU, before we free the
+		 * object.
+		 */
+		i915_gem_object_make_unshrinkable(obj);
+
 		ttm_bo_put(i915_gem_to_ttm(obj));
 	} else {
 		__i915_gem_free_object(obj);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

This should let us do an accelerated copy directly to the shmem pages
when temporarily moving lmem-only objects, where the i915-gem shrinker
can later kick in to swap out the pages, if needed.

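As an illustration, consider an lmem-only buffer; the create helper
below is just the usual i915 one and is not part of this patch:

	struct drm_i915_gem_object *obj;

	/*
	 * Only placeable in local memory, so obj->mm.n_placements <= 1. When
	 * TTM evicts it to TTM_PL_SYSTEM under pressure,
	 * i915_ttm_select_tt_caching() now picks ttm_cached for the
	 * system-memory ttm_tt instead of ttm_write_combined, so the
	 * accelerated copy lands directly in cacheable, reclaimable shmem
	 * pages.
	 */
	obj = i915_gem_object_create_lmem(i915, SZ_2M, 0);
	if (IS_ERR(obj))
		return PTR_ERR(obj);
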
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 194e5f1deda8..46d57541c0b2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -134,11 +134,11 @@ static enum ttm_caching
 i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
 {
 	/*
-	 * Objects only allowed in system get cached cpu-mappings.
-	 * Other objects get WC mapping for now. Even if in system.
+	 * Objects only allowed in system get cached cpu-mappings, or when
+	 * evicting lmem-only buffers to system for swapping. Other objects get
+	 * WC mapping for now. Even if in system.
 	 */
-	if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
-	    obj->mm.n_placements <= 1)
+	if (obj->mm.n_placements <= 1)
 		return ttm_cached;
 
 	return ttm_write_combined;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v5 13/13] drm/i915/ttm: enable shmem tt backend
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:41   ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 11:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

Turn on the shmem tt backend, and enable shrinking.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 46d57541c0b2..4ae630fbc5cd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -1093,6 +1093,7 @@ static u64 i915_ttm_mmap_offset(struct drm_i915_gem_object *obj)
 
 static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
 	.name = "i915_gem_object_ttm",
+	.flags = I915_GEM_OBJECT_IS_SHRINKABLE,
 
 	.get_pages = i915_ttm_get_pages,
 	.put_pages = i915_ttm_put_pages,
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
@ 2021-09-27 11:47   ` Christian König
  -1 siblings, 0 replies; 60+ messages in thread
From: Christian König @ 2021-09-27 11:47 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel, Thomas Hellström

Any objections that I just push patches 1-7 to drm-misc-next?

Christian.

Am 27.09.21 um 13:41 schrieb Matthew Auld:
> In commit:
>
> commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
> Author: Felix Kuehling <Felix.Kuehling@amd.com>
> Date:   Thu Jul 13 17:01:16 2017 -0400
>
>      drm/ttm: Implement vm_operations_struct.access v2
>
> we added the vm_access hook, where we also directly call tt_swapin for
> some reason. If something is swapped-out then the ttm_tt must also be
> unpopulated, and since access_kmap should also call tt_populate, if
> needed, then swapping-in will already be handled there.
>
> If anything, calling tt_swapin directly here would likely always fail
> since the tt->pages won't yet be populated, or worse since the tt->pages
> array is never actually cleared in unpopulate this might lead to a nasty
> uaf.
>
> Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -----
>   1 file changed, 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index f56be5bc0861..5b9b7fd01a69 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>   
>   	switch (bo->resource->mem_type) {
>   	case TTM_PL_SYSTEM:
> -		if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
> -			ret = ttm_tt_swapin(bo->ttm);
> -			if (unlikely(ret != 0))
> -				return ret;
> -		}
>   		fallthrough;
>   	case TTM_PL_TT:
>   		ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-27 11:47   ` [Intel-gfx] " Christian König
  (?)
@ 2021-09-27 16:14   ` Matthew Auld
  2021-09-29 12:01     ` Christian König
  -1 siblings, 1 reply; 60+ messages in thread
From: Matthew Auld @ 2021-09-27 16:14 UTC (permalink / raw)
  To: Christian König
  Cc: Matthew Auld, Intel Graphics Development, ML dri-devel,
	Thomas Hellström

On Mon, 27 Sept 2021 at 12:47, Christian König <christian.koenig@amd.com> wrote:
>
> Any objections that I just push patches 1-7 to drm-misc-next?

Please go ahead Christian. Thanks.

>
> Christian.
>
> Am 27.09.21 um 13:41 schrieb Matthew Auld:
> > In commit:
> >
> > commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
> > Author: Felix Kuehling <Felix.Kuehling@amd.com>
> > Date:   Thu Jul 13 17:01:16 2017 -0400
> >
> >      drm/ttm: Implement vm_operations_struct.access v2
> >
> > we added the vm_access hook, where we also directly call tt_swapin for
> > some reason. If something is swapped-out then the ttm_tt must also be
> > unpopulated, and since access_kmap should also call tt_populate, if
> > needed, then swapping-in will already be handled there.
> >
> > If anything, calling tt_swapin directly here would likely always fail
> > since the tt->pages won't yet be populated, or worse since the tt->pages
> > array is never actually cleared in unpopulate this might lead to a nasty
> > uaf.
> >
> > Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Reviewed-by: Christian König <christian.koenig@amd.com>
> > ---
> >   drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -----
> >   1 file changed, 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > index f56be5bc0861..5b9b7fd01a69 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > @@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
> >
> >       switch (bo->resource->mem_type) {
> >       case TTM_PL_SYSTEM:
> > -             if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
> > -                     ret = ttm_tt_swapin(bo->ttm);
> > -                     if (unlikely(ret != 0))
> > -                             return ret;
> > -             }
> >               fallthrough;
> >       case TTM_PL_TT:
> >               ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);
>
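
For context on the quoted reasoning: the argument is that the kmap access
path already ends up in ttm_tt_populate(), and populate is where a
swapped-out ttm_tt gets brought back in. A rough sketch of that ordering
(a simplified reading of the TTM code of this period, not verbatim kernel
source; the intermediate ttm_bo_kmap() step is from reading ttm_bo_vm.c,
not from the patch itself) looks like:

	/* sketch of the call chain for TTM_PL_SYSTEM / TTM_PL_TT, not verbatim */
	ttm_bo_vm_access()
	  -> ttm_bo_vm_access_kmap()
	       -> ttm_bo_kmap()            /* maps the page being accessed   */
	            -> ttm_tt_populate()   /* allocates tt->pages if needed  */
	                 -> ttm_tt_swapin() /* runs only once pages exist and
	                                       TTM_PAGE_FLAG_SWAPPED is set  */

Calling ttm_tt_swapin() straight from ttm_bo_vm_access(), before populate
has filled tt->pages, therefore hits exactly the failure modes the commit
message describes: it either fails outright or touches stale pointers left
behind by unpopulate.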

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
                   ` (13 preceding siblings ...)
  (?)
@ 2021-09-27 17:31 ` Patchwork
  -1 siblings, 0 replies; 60+ messages in thread
From: Patchwork @ 2021-09-27 17:31 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
URL   : https://patchwork.freedesktop.org/series/95093/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
63f7046be80e drm/ttm: stop calling tt_swapin in vm_access
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")'
#11: 
commit 09ac4fcb3f255e9225967c75f5893325c116cdbe

total: 1 errors, 0 warnings, 0 checks, 11 lines checked
2d21266abd0f drm/ttm: stop setting page->index for the ttm_tt
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 58aa6622d32a ("drm/ttm: Correctly set page mapping and -index members")'
#11: 
commit 58aa6622d32af7d2c08d45085f44c54554a16ed7

total: 1 errors, 0 warnings, 0 checks, 19 lines checked
5eb69ebfcc1a drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu
bdd2fc691ea5 drm/ttm: remove TTM_PAGE_FLAG_NO_RETRY
643501af65f9 drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
19f76bc63e36 drm/ttm: add some kernel-doc for TTM_TT_FLAG_*
783333293777 drm/ttm: add TTM_TT_FLAG_EXTERNAL_MAPPABLE
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 667a50db0477 ("drm/ttm: Refuse to fault (prime-) imported pages")'
#11: 
commit 667a50db0477d47fdff01c666f5ee1ce26b5264c

total: 1 errors, 0 warnings, 0 checks, 50 lines checked
00214d4c089c drm/i915/gem: Break out some shmem backend utils
fc0e7596391b drm/i915/ttm: add tt shmem backend
-:16: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#16: 
    dropping the shrinker LRU lock and acquiring the object lock it could for

total: 0 errors, 1 warnings, 0 checks, 448 lines checked
1c423f436dab drm/i915: try to simplify make_{un}shrinkable
-:164: CHECK:LINE_SPACING: Please don't use multiple blank lines
#164: FILE: drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:515:
 
+

total: 0 errors, 0 warnings, 1 checks, 194 lines checked
1d054b38014d drm/i915/ttm: make evicted shmem pages visible to the shrinker
-:21: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#21: 
     an object. Importantly this covers the case where TTM moves something from

total: 0 errors, 1 warnings, 0 checks, 128 lines checked
abeba24ed9b1 drm/i915/ttm: use cached system pages when evicting lmem
713ad3558e87 drm/i915/ttm: enable shmem tt backend



^ permalink raw reply	[flat|nested] 60+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
                   ` (14 preceding siblings ...)
  (?)
@ 2021-09-27 17:34 ` Patchwork
  -1 siblings, 0 replies; 60+ messages in thread
From: Patchwork @ 2021-09-27 17:34 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
URL   : https://patchwork.freedesktop.org/series/95093/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
[snip: the same amd_sriov_msg_vf2pf_info static assertion failure is reported 80 more times in the sparse log]
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:316:49: error: static assertion failed: "amd_sriov_msg_pf2vf_info must be 1 KB"
[snip: the same amd_sriov_msg_pf2vf_info static assertion failure is reported 80 more times in the sparse log]
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1345:25: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1345:25:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1345:25:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1346:17: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1346:17:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1346:17:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1405:17: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1405:17:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1405:17:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:354:16: error: incompatible types in comparison expression (different type sizes):
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:354:16:    unsigned long *
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:354:16:    unsigned long long *
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4491:31: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4491:31:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4491:31:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4493:33: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4493:33:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4493:33:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:294:25: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:294:25:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:294:25:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:295:17: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:295:17:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:295:17:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:344:17: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:344:17:    struct dma_fence *
+drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:344:17:    struct dma_fence [noderef] __rcu *
+drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c:117:1: warning: no newline at end of file
+drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h:312:49: error: static assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
[snip: the same amd_sriov_msg_vf2pf_info static assertion failure is reported 38 more times before the log is cut off]
+drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h:312:49: error: static assertion faile



^ permalink raw reply	[flat|nested] 60+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
                   ` (15 preceding siblings ...)
  (?)
@ 2021-09-27 18:01 ` Patchwork
  -1 siblings, 0 replies; 60+ messages in thread
From: Patchwork @ 2021-09-27 18:01 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 6663 bytes --]

== Series Details ==

Series: series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
URL   : https://patchwork.freedesktop.org/series/95093/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10648 -> Patchwork_21167
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/index.html

Known issues
------------

  Here are the changes found in Patchwork_21167 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@query-info:
    - fi-bsw-kefka:       NOTRUN -> [SKIP][1] ([fdo#109271]) +32 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-bsw-kefka/igt@amdgpu/amd_basic@query-info.html

  * igt@gem_huc_copy@huc-copy:
    - fi-tgl-1115g4:      NOTRUN -> [SKIP][2] ([i915#2190])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@gem_huc_copy@huc-copy.html
    - fi-bxt-dsi:         NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-bxt-dsi/igt@gem_huc_copy@huc-copy.html

  * igt@i915_pm_backlight@basic-brightness:
    - fi-tgl-1115g4:      NOTRUN -> [SKIP][4] ([i915#1155])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@i915_pm_backlight@basic-brightness.html

  * igt@i915_pm_rpm@module-reload:
    - fi-tgl-1115g4:      NOTRUN -> [INCOMPLETE][5] ([i915#4006] / [i915#4193])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@i915_pm_rpm@module-reload.html

  * igt@kms_addfb_basic@too-wide:
    - fi-tgl-1115g4:      NOTRUN -> [DMESG-WARN][6] ([i915#4002]) +89 similar issues
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@kms_addfb_basic@too-wide.html

  * igt@kms_chamelium@common-hpd-after-suspend:
    - fi-bxt-dsi:         NOTRUN -> [SKIP][7] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-bxt-dsi/igt@kms_chamelium@common-hpd-after-suspend.html
    - fi-tgl-1115g4:      NOTRUN -> [SKIP][8] ([fdo#111827]) +8 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@kms_chamelium@common-hpd-after-suspend.html

  * igt@kms_chamelium@hdmi-edid-read:
    - fi-bsw-kefka:       NOTRUN -> [SKIP][9] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-bsw-kefka/igt@kms_chamelium@hdmi-edid-read.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-tgl-1115g4:      NOTRUN -> [SKIP][10] ([i915#4103]) +1 similar issue
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_force_connector_basic@force-load-detect:
    - fi-bxt-dsi:         NOTRUN -> [SKIP][11] ([fdo#109271]) +30 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-bxt-dsi/igt@kms_force_connector_basic@force-load-detect.html
    - fi-tgl-1115g4:      NOTRUN -> [SKIP][12] ([fdo#109285])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@kms_force_connector_basic@force-load-detect.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
    - fi-bxt-dsi:         NOTRUN -> [SKIP][13] ([fdo#109271] / [i915#533])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-bxt-dsi/igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d.html

  * igt@kms_psr@primary_mmap_gtt:
    - fi-tgl-1115g4:      NOTRUN -> [SKIP][14] ([i915#1072]) +3 similar issues
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@kms_psr@primary_mmap_gtt.html

  * igt@prime_vgem@basic-userptr:
    - fi-tgl-1115g4:      NOTRUN -> [SKIP][15] ([i915#3301])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@prime_vgem@basic-userptr.html

  * igt@runner@aborted:
    - fi-tgl-1115g4:      NOTRUN -> [FAIL][16] ([i915#2722])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/fi-tgl-1115g4/igt@runner@aborted.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1155]: https://gitlab.freedesktop.org/drm/intel/issues/1155
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#4002]: https://gitlab.freedesktop.org/drm/intel/issues/4002
  [i915#4006]: https://gitlab.freedesktop.org/drm/intel/issues/4006
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4193]: https://gitlab.freedesktop.org/drm/intel/issues/4193
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533


Participating hosts (38 -> 34)
------------------------------

  Additional (2): fi-bsw-kefka fi-tgl-1115g4 
  Missing    (6): bat-adls-5 fi-bsw-cyan bat-adlp-4 fi-bdw-samus bat-jsl-2 bat-jsl-1 


Build changes
-------------

  * Linux: CI_DRM_10648 -> Patchwork_21167

  CI-20190529: 20190529
  CI_DRM_10648: 73d93dcb0d48bb76af25ca3f7149598e4bc68098 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6219: 4b5644c9751b489c73c9bb174644c08b31533cc8 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_21167: 713ad3558e87357bdd2db88c16883fc7e07bca3f @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

713ad3558e87 drm/i915/ttm: enable shmem tt backend
abeba24ed9b1 drm/i915/ttm: use cached system pages when evicting lmem
1d054b38014d drm/i915/ttm: make evicted shmem pages visible to the shrinker
1c423f436dab drm/i915: try to simplify make_{un}shrinkable
fc0e7596391b drm/i915/ttm: add tt shmem backend
00214d4c089c drm/i915/gem: Break out some shmem backend utils
783333293777 drm/ttm: add TTM_TT_FLAG_EXTERNAL_MAPPABLE
19f76bc63e36 drm/ttm: add some kernel-doc for TTM_TT_FLAG_*
643501af65f9 drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
bdd2fc691ea5 drm/ttm: remove TTM_PAGE_FLAG_NO_RETRY
5eb69ebfcc1a drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu
2d21266abd0f drm/ttm: stop setting page->index for the ttm_tt
63f7046be80e drm/ttm: stop calling tt_swapin in vm_access

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/index.html

[-- Attachment #2: Type: text/html, Size: 8252 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
                   ` (16 preceding siblings ...)
  (?)
@ 2021-09-27 21:29 ` Patchwork
  -1 siblings, 0 replies; 60+ messages in thread
From: Patchwork @ 2021-09-27 21:29 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 30296 bytes --]

== Series Details ==

Series: series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access
URL   : https://patchwork.freedesktop.org/series/95093/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10648_full -> Patchwork_21167_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_21167_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21167_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_21167_full:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-pgflip-blt:
    - shard-iclb:         [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-pgflip-blt.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-pgflip-blt.html

  
Known issues
------------

  Here are the changes found in Patchwork_21167_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_persistence@legacy-engines-queued:
    - shard-snb:          NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +4 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-snb2/igt@gem_ctx_persistence@legacy-engines-queued.html

  * igt@gem_eio@hibernate:
    - shard-glk:          NOTRUN -> [DMESG-WARN][4] ([i915#1610])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk8/igt@gem_eio@hibernate.html

  * igt@gem_eio@in-flight-contexts-1us:
    - shard-tglb:         [PASS][5] -> [TIMEOUT][6] ([i915#3063])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb8/igt@gem_eio@in-flight-contexts-1us.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb5/igt@gem_eio@in-flight-contexts-1us.html

  * igt@gem_eio@unwedge-stress:
    - shard-tglb:         [PASS][7] -> [TIMEOUT][8] ([i915#2369] / [i915#3063] / [i915#3648])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb5/igt@gem_eio@unwedge-stress.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb2/igt@gem_eio@unwedge-stress.html
    - shard-snb:          NOTRUN -> [FAIL][9] ([i915#3354])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-snb5/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_fair@basic-flow@rcs0:
    - shard-tglb:         [PASS][10] -> [FAIL][11] ([i915#2842])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb7/igt@gem_exec_fair@basic-flow@rcs0.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb5/igt@gem_exec_fair@basic-flow@rcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
    - shard-iclb:         [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb4/igt@gem_exec_fair@basic-none-share@rcs0.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb5/igt@gem_exec_fair@basic-none-share@rcs0.html
    - shard-apl:          [PASS][14] -> [SKIP][15] ([fdo#109271])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-apl7/igt@gem_exec_fair@basic-none-share@rcs0.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl2/igt@gem_exec_fair@basic-none-share@rcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [PASS][16] -> [FAIL][17] ([i915#2842])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-glk9/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk6/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_fair@basic-pace@rcs0:
    - shard-kbl:          [PASS][18] -> [FAIL][19] ([i915#2842])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl7/igt@gem_exec_fair@basic-pace@rcs0.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl7/igt@gem_exec_fair@basic-pace@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-tglb:         NOTRUN -> [FAIL][20] ([i915#2842]) +2 similar issues
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_suspend@basic-s3:
    - shard-tglb:         [PASS][21] -> [INCOMPLETE][22] ([i915#4173] / [i915#456])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb2/igt@gem_exec_suspend@basic-s3.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb7/igt@gem_exec_suspend@basic-s3.html

  * igt@gem_render_copy@y-tiled-to-vebox-linear:
    - shard-iclb:         NOTRUN -> [SKIP][23] ([i915#768])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@gem_render_copy@y-tiled-to-vebox-linear.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-apl:          NOTRUN -> [SKIP][24] ([fdo#109271] / [i915#3323])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl7/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@dmabuf-unsync:
    - shard-tglb:         NOTRUN -> [SKIP][25] ([i915#3297]) +1 similar issue
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@gem_userptr_blits@dmabuf-unsync.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-apl:          NOTRUN -> [FAIL][26] ([i915#3318])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl8/igt@gem_userptr_blits@vma-merge.html
    - shard-tglb:         NOTRUN -> [FAIL][27] ([i915#3318])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@gem_userptr_blits@vma-merge.html

  * igt@gen7_exec_parse@cmd-crossing-page:
    - shard-tglb:         NOTRUN -> [SKIP][28] ([fdo#109289]) +1 similar issue
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@gen7_exec_parse@cmd-crossing-page.html

  * igt@gen9_exec_parse@bb-large:
    - shard-tglb:         NOTRUN -> [SKIP][29] ([i915#2856])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@gen9_exec_parse@bb-large.html

  * igt@i915_pm_lpsp@screens-disabled:
    - shard-tglb:         NOTRUN -> [SKIP][30] ([i915#1902])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@i915_pm_lpsp@screens-disabled.html

  * igt@i915_pm_rc6_residency@rc6-fence:
    - shard-tglb:         NOTRUN -> [WARN][31] ([i915#2681])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@i915_pm_rc6_residency@rc6-fence.html

  * igt@i915_pm_rpm@dpms-non-lpsp:
    - shard-iclb:         NOTRUN -> [SKIP][32] ([fdo#110892])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@i915_pm_rpm@dpms-non-lpsp.html

  * igt@i915_pm_rpm@system-suspend-execbuf:
    - shard-tglb:         [PASS][33] -> [INCOMPLETE][34] ([i915#2411] / [i915#456] / [i915#750])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb5/igt@i915_pm_rpm@system-suspend-execbuf.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb7/igt@i915_pm_rpm@system-suspend-execbuf.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-hflip:
    - shard-apl:          NOTRUN -> [SKIP][35] ([fdo#109271] / [i915#3777])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl8/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-hflip.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-180-hflip:
    - shard-glk:          NOTRUN -> [SKIP][36] ([fdo#109271] / [i915#3777])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk8/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-180-hflip.html

  * igt@kms_big_fb@yf-tiled-64bpp-rotate-0:
    - shard-tglb:         NOTRUN -> [SKIP][37] ([fdo#111615]) +1 similar issue
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@kms_big_fb@yf-tiled-64bpp-rotate-0.html

  * igt@kms_ccs@pipe-a-bad-aux-stride-y_tiled_gen12_rc_ccs_cc:
    - shard-skl:          NOTRUN -> [SKIP][38] ([fdo#109271] / [i915#3886])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl1/igt@kms_ccs@pipe-a-bad-aux-stride-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_mc_ccs:
    - shard-kbl:          NOTRUN -> [SKIP][39] ([fdo#109271] / [i915#3886]) +4 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl2/igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_ccs:
    - shard-snb:          NOTRUN -> [SKIP][40] ([fdo#109271]) +359 similar issues
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-snb5/igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_ccs.html

  * igt@kms_ccs@pipe-b-crc-primary-basic-y_tiled_gen12_mc_ccs:
    - shard-apl:          NOTRUN -> [SKIP][41] ([fdo#109271] / [i915#3886]) +3 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl8/igt@kms_ccs@pipe-b-crc-primary-basic-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-b-random-ccs-data-y_tiled_gen12_rc_ccs_cc:
    - shard-iclb:         NOTRUN -> [SKIP][42] ([fdo#109278] / [i915#3886])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@kms_ccs@pipe-b-random-ccs-data-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-c-bad-aux-stride-y_tiled_gen12_rc_ccs_cc:
    - shard-glk:          NOTRUN -> [SKIP][43] ([fdo#109271] / [i915#3886]) +1 similar issue
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk8/igt@kms_ccs@pipe-c-bad-aux-stride-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-c-crc-sprite-planes-basic-y_tiled_gen12_mc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][44] ([i915#3689] / [i915#3886]) +1 similar issue
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@kms_ccs@pipe-c-crc-sprite-planes-basic-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-random-ccs-data-yf_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][45] ([i915#3689]) +4 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@kms_ccs@pipe-c-random-ccs-data-yf_tiled_ccs.html

  * igt@kms_ccs@pipe-d-crc-sprite-planes-basic-y_tiled_gen12_mc_ccs:
    - shard-iclb:         NOTRUN -> [SKIP][46] ([fdo#109278]) +2 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@kms_ccs@pipe-d-crc-sprite-planes-basic-y_tiled_gen12_mc_ccs.html

  * igt@kms_chamelium@dp-crc-fast:
    - shard-kbl:          NOTRUN -> [SKIP][47] ([fdo#109271] / [fdo#111827]) +3 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl2/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_chamelium@dp-mode-timings:
    - shard-apl:          NOTRUN -> [SKIP][48] ([fdo#109271] / [fdo#111827]) +11 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl8/igt@kms_chamelium@dp-mode-timings.html
    - shard-iclb:         NOTRUN -> [SKIP][49] ([fdo#109284] / [fdo#111827])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@kms_chamelium@dp-mode-timings.html

  * igt@kms_chamelium@hdmi-hpd-enable-disable-mode:
    - shard-snb:          NOTRUN -> [SKIP][50] ([fdo#109271] / [fdo#111827]) +15 similar issues
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-snb2/igt@kms_chamelium@hdmi-hpd-enable-disable-mode.html

  * igt@kms_chamelium@vga-hpd-for-each-pipe:
    - shard-skl:          NOTRUN -> [SKIP][51] ([fdo#109271] / [fdo#111827]) +1 similar issue
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl3/igt@kms_chamelium@vga-hpd-for-each-pipe.html

  * igt@kms_color_chamelium@pipe-b-ctm-0-5:
    - shard-tglb:         NOTRUN -> [SKIP][52] ([fdo#109284] / [fdo#111827]) +5 similar issues
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@kms_color_chamelium@pipe-b-ctm-0-5.html

  * igt@kms_color_chamelium@pipe-d-ctm-0-25:
    - shard-glk:          NOTRUN -> [SKIP][53] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk8/igt@kms_color_chamelium@pipe-d-ctm-0-25.html

  * igt@kms_content_protection@legacy:
    - shard-glk:          NOTRUN -> [SKIP][54] ([fdo#109271]) +27 similar issues
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk8/igt@kms_content_protection@legacy.html

  * igt@kms_content_protection@mei_interface:
    - shard-tglb:         NOTRUN -> [SKIP][55] ([fdo#111828])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@kms_content_protection@mei_interface.html

  * igt@kms_cursor_crc@pipe-a-cursor-512x512-offscreen:
    - shard-iclb:         NOTRUN -> [SKIP][56] ([fdo#109278] / [fdo#109279]) +1 similar issue
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@kms_cursor_crc@pipe-a-cursor-512x512-offscreen.html

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-sliding:
    - shard-tglb:         NOTRUN -> [SKIP][57] ([fdo#109279] / [i915#3359])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@kms_cursor_crc@pipe-d-cursor-512x512-sliding.html

  * igt@kms_cursor_crc@pipe-d-cursor-max-size-sliding:
    - shard-tglb:         NOTRUN -> [SKIP][58] ([i915#3359]) +4 similar issues
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@kms_cursor_crc@pipe-d-cursor-max-size-sliding.html

  * igt@kms_cursor_legacy@pipe-d-single-bo:
    - shard-kbl:          NOTRUN -> [SKIP][59] ([fdo#109271] / [i915#533])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl2/igt@kms_cursor_legacy@pipe-d-single-bo.html

  * igt@kms_flip@flip-vs-expired-vblank@a-edp1:
    - shard-skl:          [PASS][60] -> [FAIL][61] ([i915#79])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl4/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl7/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html

  * igt@kms_flip@flip-vs-suspend-interruptible@a-edp1:
    - shard-tglb:         [PASS][62] -> [INCOMPLETE][63] ([i915#2411] / [i915#4173] / [i915#456]) +1 similar issue
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb3/igt@kms_flip@flip-vs-suspend-interruptible@a-edp1.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb7/igt@kms_flip@flip-vs-suspend-interruptible@a-edp1.html

  * igt@kms_flip@flip-vs-suspend@c-dp1:
    - shard-apl:          [PASS][64] -> [DMESG-WARN][65] ([i915#180])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-apl7/igt@kms_flip@flip-vs-suspend@c-dp1.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl3/igt@kms_flip@flip-vs-suspend@c-dp1.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-pri-indfb-draw-mmap-gtt:
    - shard-iclb:         NOTRUN -> [SKIP][66] ([fdo#109280])
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-pri-indfb-draw-mmap-gtt.html

  * igt@kms_frontbuffer_tracking@psr-2p-primscrn-indfb-pgflip-blt:
    - shard-tglb:         NOTRUN -> [SKIP][67] ([fdo#111825]) +17 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@kms_frontbuffer_tracking@psr-2p-primscrn-indfb-pgflip-blt.html

  * igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d:
    - shard-apl:          NOTRUN -> [SKIP][68] ([fdo#109271] / [i915#533])
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl7/igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d.html

  * igt@kms_pipe_crc_basic@hang-read-crc-pipe-d:
    - shard-glk:          NOTRUN -> [SKIP][69] ([fdo#109271] / [i915#533])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk8/igt@kms_pipe_crc_basic@hang-read-crc-pipe-d.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c:
    - shard-kbl:          [PASS][70] -> [DMESG-WARN][71] ([i915#180]) +1 similar issue
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl2/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl4/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html

  * igt@kms_plane_alpha_blend@pipe-b-alpha-basic:
    - shard-kbl:          NOTRUN -> [FAIL][72] ([fdo#108145] / [i915#265])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl2/igt@kms_plane_alpha_blend@pipe-b-alpha-basic.html

  * igt@kms_plane_lowres@pipe-a-tiling-yf:
    - shard-tglb:         NOTRUN -> [SKIP][73] ([fdo#112054])
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@kms_plane_lowres@pipe-a-tiling-yf.html

  * igt@kms_plane_lowres@pipe-c-tiling-x:
    - shard-iclb:         NOTRUN -> [SKIP][74] ([i915#3536])
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@kms_plane_lowres@pipe-c-tiling-x.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1:
    - shard-kbl:          NOTRUN -> [SKIP][75] ([fdo#109271] / [i915#658]) +1 similar issue
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl7/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1.html
    - shard-iclb:         NOTRUN -> [SKIP][76] ([i915#658])
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1.html
    - shard-apl:          NOTRUN -> [SKIP][77] ([fdo#109271] / [i915#658]) +1 similar issue
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl8/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-3:
    - shard-glk:          NOTRUN -> [SKIP][78] ([fdo#109271] / [i915#658])
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk8/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-3.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-3:
    - shard-tglb:         NOTRUN -> [SKIP][79] ([i915#2920]) +1 similar issue
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb3/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-3.html

  * igt@kms_psr2_su@frontbuffer:
    - shard-iclb:         [PASS][80] -> [SKIP][81] ([fdo#109642] / [fdo#111068] / [i915#658])
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb2/igt@kms_psr2_su@frontbuffer.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb1/igt@kms_psr2_su@frontbuffer.html

  * igt@kms_psr@psr2_sprite_mmap_gtt:
    - shard-tglb:         NOTRUN -> [FAIL][82] ([i915#132] / [i915#3467])
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@kms_psr@psr2_sprite_mmap_gtt.html

  * igt@kms_setmode@basic:
    - shard-snb:          NOTRUN -> [FAIL][83] ([i915#31])
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-snb5/igt@kms_setmode@basic.html

  * igt@kms_vblank@pipe-d-query-forked-busy:
    - shard-skl:          NOTRUN -> [SKIP][84] ([fdo#109271]) +13 similar issues
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl4/igt@kms_vblank@pipe-d-query-forked-busy.html

  * igt@kms_vblank@pipe-d-wait-forked-hang:
    - shard-apl:          NOTRUN -> [SKIP][85] ([fdo#109271]) +118 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl8/igt@kms_vblank@pipe-d-wait-forked-hang.html

  * igt@kms_writeback@writeback-check-output:
    - shard-apl:          NOTRUN -> [SKIP][86] ([fdo#109271] / [i915#2437])
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl7/igt@kms_writeback@writeback-check-output.html

  * igt@nouveau_crc@pipe-c-source-outp-inactive:
    - shard-tglb:         NOTRUN -> [SKIP][87] ([i915#2530]) +1 similar issue
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@nouveau_crc@pipe-c-source-outp-inactive.html

  * igt@prime_nv_api@i915_nv_double_import:
    - shard-tglb:         NOTRUN -> [SKIP][88] ([fdo#109291]) +3 similar issues
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@prime_nv_api@i915_nv_double_import.html

  * igt@sysfs_clients@recycle-many:
    - shard-apl:          NOTRUN -> [SKIP][89] ([fdo#109271] / [i915#2994]) +2 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl7/igt@sysfs_clients@recycle-many.html

  * igt@sysfs_clients@sema-10:
    - shard-tglb:         NOTRUN -> [SKIP][90] ([i915#2994])
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb8/igt@sysfs_clients@sema-10.html

  * igt@sysfs_clients@split-10:
    - shard-iclb:         NOTRUN -> [SKIP][91] ([i915#2994])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@sysfs_clients@split-10.html
    - shard-kbl:          NOTRUN -> [SKIP][92] ([fdo#109271] / [i915#2994])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl7/igt@sysfs_clients@split-10.html

  * igt@tools_test@sysfs_l3_parity:
    - shard-kbl:          NOTRUN -> [SKIP][93] ([fdo#109271]) +65 similar issues
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl2/igt@tools_test@sysfs_l3_parity.html

  
#### Possible fixes ####

  * igt@gem_exec_fair@basic-none-vip@rcs0:
    - shard-glk:          [FAIL][94] ([i915#2842]) -> [PASS][95]
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-glk4/igt@gem_exec_fair@basic-none-vip@rcs0.html
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk5/igt@gem_exec_fair@basic-none-vip@rcs0.html

  * igt@gem_exec_fair@basic-pace@bcs0:
    - shard-iclb:         [FAIL][96] ([i915#2842]) -> [PASS][97]
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb1/igt@gem_exec_fair@basic-pace@bcs0.html
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb8/igt@gem_exec_fair@basic-pace@bcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-iclb:         [FAIL][98] ([i915#2849]) -> [PASS][99]
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb3/igt@gem_exec_fair@basic-throttle@rcs0.html
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb6/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_workarounds@suspend-resume:
    - shard-skl:          [INCOMPLETE][100] ([i915#198]) -> [PASS][101]
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl6/igt@gem_workarounds@suspend-resume.html
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl1/igt@gem_workarounds@suspend-resume.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-skl:          [DMESG-WARN][102] ([i915#1436] / [i915#716]) -> [PASS][103]
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl7/igt@gen9_exec_parse@allowed-single.html
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl4/igt@gen9_exec_parse@allowed-single.html

  * igt@i915_selftest@live@gt_heartbeat:
    - shard-skl:          [DMESG-FAIL][104] ([i915#2291] / [i915#541]) -> [PASS][105]
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl6/igt@i915_selftest@live@gt_heartbeat.html
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl1/igt@i915_selftest@live@gt_heartbeat.html

  * igt@kms_big_fb@linear-64bpp-rotate-180:
    - shard-glk:          [DMESG-WARN][106] -> [PASS][107]
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-glk7/igt@kms_big_fb@linear-64bpp-rotate-180.html
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-glk1/igt@kms_big_fb@linear-64bpp-rotate-180.html

  * igt@kms_color@pipe-a-ctm-0-75:
    - shard-skl:          [DMESG-WARN][108] ([i915#1982]) -> [PASS][109]
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl2/igt@kms_color@pipe-a-ctm-0-75.html
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl5/igt@kms_color@pipe-a-ctm-0-75.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          [FAIL][110] ([i915#2346]) -> [PASS][111]
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl4/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl7/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_flip@flip-vs-expired-vblank@b-edp1:
    - shard-skl:          [FAIL][112] ([i915#2122]) -> [PASS][113]
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl4/igt@kms_flip@flip-vs-expired-vblank@b-edp1.html
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl7/igt@kms_flip@flip-vs-expired-vblank@b-edp1.html

  * igt@kms_flip@flip-vs-suspend-interruptible@a-dp1:
    - shard-kbl:          [DMESG-WARN][114] ([i915#180]) -> [PASS][115] +4 similar issues
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl4/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl2/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile:
    - shard-iclb:         [SKIP][116] ([i915#3701]) -> [PASS][117]
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb2/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile.html
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb1/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-tglb:         [INCOMPLETE][118] ([i915#2411] / [i915#456]) -> [PASS][119] +1 similar issue
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb7/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb3/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_plane@plane-panning-bottom-right-suspend@pipe-a-planes:
    - shard-apl:          [DMESG-WARN][120] ([i915#180]) -> [PASS][121]
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-apl3/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-a-planes.html
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-apl8/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-a-planes.html

  * igt@kms_plane@plane-panning-bottom-right-suspend@pipe-b-planes:
    - shard-tglb:         [INCOMPLETE][122] ([i915#456]) -> [PASS][123]
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-tglb7/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-b-planes.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-tglb6/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [FAIL][124] ([fdo#108145] / [i915#265]) -> [PASS][125] +2 similar issues
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl1/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl2/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_plane_onoff:
    - shard-iclb:         [SKIP][126] ([fdo#109441]) -> [PASS][127]
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb8/igt@kms_psr@psr2_cursor_plane_onoff.html
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html

  
#### Warnings ####

  * igt@i915_pm_rc6_residency@rc6-idle:
    - shard-iclb:         [WARN][128] ([i915#2684]) -> [WARN][129] ([i915#1804] / [i915#2684])
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-iclb5/igt@i915_pm_rc6_residency@rc6-idle.html
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-iclb7/igt@i915_pm_rc6_residency@rc6-idle.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
    - shard-skl:          [FAIL][130] ([i915#3722]) -> [FAIL][131] ([i915#3743])
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-skl8/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-skl2/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html

  * igt@runner@aborted:
    - shard-kbl:          ([FAIL][132], [FAIL][133], [FAIL][134], [FAIL][135], [FAIL][136], [FAIL][137]) ([i915#180] / [i915#1814] / [i915#3002] / [i915#3363]) -> ([FAIL][138], [FAIL][139], [FAIL][140], [FAIL][141], [FAIL][142]) ([i915#180] / [i915#1814] / [i915#3002] / [i915#3363] / [i915#602])
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl6/igt@runner@aborted.html
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl6/igt@runner@aborted.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl4/igt@runner@aborted.html
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl6/igt@runner@aborted.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl4/igt@runner@aborted.html
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10648/shard-kbl1/igt@runner@aborted.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/shard-kbl4/igt@runner@aborted.html

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21167/index.html

[-- Attachment #2: Type: text/html, Size: 33549 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
  2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
@ 2021-09-29 11:07     ` Thomas Hellström
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 11:07 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel, Christian König

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> For cached objects we can allocate our pages directly in shmem. This
> should make it possible (in a later patch) to utilise the existing
> i915-gem shrinker code for such objects. For now this is still
> disabled.

Some minor comments below, with those either fixed or deemed
unnecessary, 
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>




> 
> v2(Thomas):
>   - Add optional try_to_writeback hook for objects. Importantly we
> need
>     to check if the object is even still shrinkable; in between us
>     dropping the shrinker LRU lock and acquiring the object lock it
> could for
>     example have been moved. Also we need to differentiate between
>     "lazy" shrinking and the immediate writeback mode. Also later we
> need to
>     handle objects which don't even have mm.pages, so bundling this
> into
>     put_pages() would require somehow handling that edge case, hence
>     just letting the ttm backend handle everything in
> try_to_writeback
>     doesn't seem too bad.
> v3(Thomas):
>   - Likely a bad idea to touch the object from the unpopulate hook,
>     since it's not possible to hold a reference, without also
> creating
>     circular dependency, so likely this is too fragile. For now just
>     ensure we at least mark the pages as dirty/accessed when called
> from the
>     shrinker on WILLNEED objects.
>   - s/try_to_writeback/shrinker_release_pages, since this can do more
>     than just writeback.
>   - Get rid of do_backup boolean and just set the SWAPPED flag prior
> to
>     calling unpopulate.
>   - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
> since
>     these just get skipped anyway. We can try to come up with
> something
>     better later.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
>  drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240
> ++++++++++++++++--
>  5 files changed, 245 insertions(+), 36 deletions(-)
> 
> 

...

> +
> +       err = dma_map_sg_attrs(i915_tt->dev,
> +                              st->sgl, st->nents,
> +                              PCI_DMA_BIDIRECTIONAL,

nit: Since this is a dma api call, should we use DMA_BIDIRECTIONAL
instead of PCI_DMA_BIDIRECTIONAL? DMA_BIDIRECTIONAL is used elsewhere
in this file, but not throughout the driver IIRC.
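
For illustration, the suggested spelling would only swap the direction
constant; a sketch of the same call with DMA_BIDIRECTIONAL (untested,
just to show the rename):

	err = dma_map_sg_attrs(i915_tt->dev,
			       st->sgl, st->nents,
			       DMA_BIDIRECTIONAL,
			       DMA_ATTR_SKIP_CPU_SYNC |
			       DMA_ATTR_NO_KERNEL_MAPPING |
			       DMA_ATTR_NO_WARN);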

> +                              DMA_ATTR_SKIP_CPU_SYNC |
> +                              DMA_ATTR_NO_KERNEL_MAPPING |
> +                              DMA_ATTR_NO_WARN);
> +       if (err <= 0) {
> +               err = -EINVAL;
> +               goto err_free_st;
> +       }
> +
> +       i = 0;
> +       for_each_sgt_page(page, sgt_iter, st)
> +               ttm->pages[i++] = page;
> +
> +       if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> +               ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> +
> +       i915_tt->cached_st = st;
> +       return 0;
> +
> +err_free_st:
> +       shmem_free_st(st, filp->f_mapping, false, false);
> +       return err;
> +}
> +
> +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm)
> +{
> +       struct i915_ttm_tt *i915_tt = container_of(ttm,
> typeof(*i915_tt), ttm);
> +       bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
> +
> +       dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
> +                    i915_tt->cached_st->nents,
> +                    PCI_DMA_BIDIRECTIONAL);

Same here. 
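
And a matching sketch on the unmap side, again with only the direction
constant renamed:

	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
		     i915_tt->cached_st->nents,
		     DMA_BIDIRECTIONAL);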
> +
> +       shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
> +                     file_inode(i915_tt->filp)->i_mapping,
> +                     backup, backup);
> +}
> +
>  static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object
> *bo,
>                                          uint32_t page_flags)
>  {
>         struct ttm_resource_manager *man =
>                 ttm_manager_type(bo->bdev, bo->resource->mem_type);
>         struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> +       enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
>         struct i915_ttm_tt *i915_tt;
>         int ret;
>  
> @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct
> ttm_buffer_object *bo,
>             man->use_tt)
>                 page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
>  
> -       ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
> -                         i915_ttm_select_tt_caching(obj));
> -       if (ret) {
> -               kfree(i915_tt);
> -               return NULL;
> +       if (i915_gem_object_is_shrinkable(obj) && caching ==
> ttm_cached) {
> +               page_flags |= TTM_TT_FLAG_EXTERNAL |
> +                             TTM_TT_FLAG_EXTERNAL_MAPPABLE;
> +               i915_tt->is_shmem = true;
>         }
>  
> +       ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
> +       if (ret)
> +               goto err_free;
> +
>         i915_tt->dev = obj->base.dev->dev;
>  
>         return &i915_tt->ttm;
> +
> +err_free:
> +       kfree(i915_tt);
> +       return NULL;
> +}
> +
> +static int i915_ttm_tt_populate(struct ttm_device *bdev,
> +                               struct ttm_tt *ttm,
> +                               struct ttm_operation_ctx *ctx)
> +{
> +       struct i915_ttm_tt *i915_tt = container_of(ttm,
> typeof(*i915_tt), ttm);
> +
> +       if (i915_tt->is_shmem)
> +               return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
> +
> +       return ttm_pool_alloc(&bdev->pool, ttm, ctx);
>  }
>  
>  static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct
> ttm_tt *ttm)
>  {
>         struct i915_ttm_tt *i915_tt = container_of(ttm,
> typeof(*i915_tt), ttm);
>  
> -       if (i915_tt->cached_st) {
> -               dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
> -                                 DMA_BIDIRECTIONAL, 0);
> -               sg_free_table(i915_tt->cached_st);
> -               kfree(i915_tt->cached_st);
> -               i915_tt->cached_st = NULL;
> +       if (i915_tt->is_shmem) {
> +               i915_ttm_tt_shmem_unpopulate(ttm);
> +       } else {
> +               if (i915_tt->cached_st) {
> +                       dma_unmap_sgtable(i915_tt->dev, i915_tt-
> >cached_st,
> +                                         DMA_BIDIRECTIONAL, 0);
> +                       sg_free_table(i915_tt->cached_st);
> +                       kfree(i915_tt->cached_st);
> +                       i915_tt->cached_st = NULL;
> +               }
> +               ttm_pool_free(&bdev->pool, ttm);
>         }
> -       ttm_pool_free(&bdev->pool, ttm);
>  }
>  
>  static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct
> ttm_tt *ttm)
>  {
>         struct i915_ttm_tt *i915_tt = container_of(ttm,
> typeof(*i915_tt), ttm);
>  
> +       if (i915_tt->filp)
> +               fput(i915_tt->filp);
> +
>         ttm_tt_fini(ttm);
>         kfree(i915_tt);
>  }
> @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct
> ttm_buffer_object *bo,
>  {
>         struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
>  
> +       /*
> +        * EXTERNAL objects should never be swapped out by TTM,
> instead we need
> +        * to handle that ourselves. TTM will already skip such
> objects for us,
> +        * but we would like to avoid grabbing locks for no good
> reason.
> +        */
> +       if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
> +               return -EBUSY;
> +
>         /* Will do for now. Our pinned objects are still on TTM's LRU
> lists */
>         return i915_gem_object_evictable(obj);
>  }
> @@ -328,9 +445,11 @@ static void
> i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)
>         i915_gem_object_set_cache_coherency(obj, cache_level);
>  }
>  
> -static void i915_ttm_purge(struct drm_i915_gem_object *obj)
> +static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
>  {
>         struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> +       struct i915_ttm_tt *i915_tt =
> +               container_of(bo->ttm, typeof(*i915_tt), ttm);
>         struct ttm_operation_ctx ctx = {
>                 .interruptible = true,
>                 .no_wait_gpu = false,
> @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct
> drm_i915_gem_object *obj)
>         int ret;
>  
>         if (obj->mm.madv == __I915_MADV_PURGED)
> -               return;
> +               return 0;
>  
> -       /* TTM's purge interface. Note that we might be reentering.
> */
>         ret = ttm_bo_validate(bo, &place, &ctx);
> -       if (!ret) {
> -               obj->write_domain = 0;
> -               obj->read_domains = 0;
> -               i915_ttm_adjust_gem_after_move(obj);
> -               i915_ttm_free_cached_io_st(obj);
> -               obj->mm.madv = __I915_MADV_PURGED;
> +       if (ret)
> +               return ret;
> +
> +       if (bo->ttm && i915_tt->filp) {
> +               /*
> +                * The below fput(which eventually calls
> shmem_truncate) might
> +                * be delayed by worker, so when directly called to
> purge the
> +                * pages(like by the shrinker) we should try to be
> more
> +                * aggressive and release the pages immediately.
> +                */
> +               shmem_truncate_range(file_inode(i915_tt->filp),
> +                                    0, (loff_t)-1);
> +               fput(fetch_and_zero(&i915_tt->filp));
> +       }
> +
> +       obj->write_domain = 0;
> +       obj->read_domains = 0;
> +       i915_ttm_adjust_gem_after_move(obj);
> +       i915_ttm_free_cached_io_st(obj);
> +       obj->mm.madv = __I915_MADV_PURGED;
> +       return 0;
> +}
> +
> +static void i915_ttm_purge(struct drm_i915_gem_object *obj)
> +{
> +       __i915_ttm_purge(obj);

Do we need a comment here as to why we choose to ignore the return
value? I typically use a void cast (void)__i915_ttm_purge(obj); to
indicate that ignoring the return value is intentional. Not sure if
that's common practice with i915? 
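
A minimal sketch of what that could look like, keeping the int return on
the __ variant (the comment wording is only illustrative):

	static void i915_ttm_purge(struct drm_i915_gem_object *obj)
	{
		/* Best effort: a failed purge simply leaves the pages in place. */
		(void)__i915_ttm_purge(obj);
	}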

> +}
> +
> +static int i915_ttm_shrinker_release_pages(struct
> drm_i915_gem_object *obj,
> +                                          bool should_writeback)
> +{
> +       struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> +       struct i915_ttm_tt *i915_tt =
> +               container_of(bo->ttm, typeof(*i915_tt), ttm);
> +       struct ttm_operation_ctx ctx = {
> +               .interruptible = true,
> +               .no_wait_gpu = false,
> +       };
> +       struct ttm_placement place = {};
> +       int ret;
> +
> +       if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
> +               return 0;
> +
> +       GEM_BUG_ON(!i915_tt->is_shmem);
> +
> +       if (!i915_tt->filp)
> +               return 0;
> +
> +       switch (obj->mm.madv) {
> +       case I915_MADV_DONTNEED:
> +               return __i915_ttm_purge(obj);
> +       case __I915_MADV_PURGED:
> +               return 0;
> +       }
> +
> +       if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> +               return 0;
> +
> +       bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
> +       ret = ttm_bo_validate(bo, &place, &ctx);
> +       if (ret) {
> +               bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> +               return ret;
>         }
> +
> +       if (should_writeback)
> +               __shmem_writeback(obj->base.size, i915_tt->filp-
> >f_mapping);
> +
> +       return 0;
>  }
>  
>  static void i915_ttm_swap_notify(struct ttm_buffer_object *bo)
> @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct
> ttm_buffer_object *bo,
>  
>  static struct ttm_device_funcs i915_ttm_bo_driver = {
>         .ttm_tt_create = i915_ttm_tt_create,
> +       .ttm_tt_populate = i915_ttm_tt_populate,
>         .ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
>         .ttm_tt_destroy = i915_ttm_tt_destroy,
>         .eviction_valuable = i915_ttm_eviction_valuable,
> @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct
> drm_i915_gem_object *obj,
>         }
>  
>         if (!i915_gem_object_has_pages(obj)) {
> +               struct i915_ttm_tt *i915_tt =
> +                       container_of(bo->ttm, typeof(*i915_tt), ttm);
> +
>                 /* Object either has a page vector or is an iomem
> object */
>                 st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj-
> >ttm.cached_io_st;
>                 if (IS_ERR(st))
>                         return PTR_ERR(st);
>  
>                 __i915_gem_object_set_pages(obj, st,
> i915_sg_dma_sizes(st->sgl));
> +               if (!bo->ttm || !i915_tt->is_shmem)
> +                       i915_gem_object_make_unshrinkable(obj);
>         }
>  
>         return ret;
> @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct
> drm_i915_gem_object *obj,
>  static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
>  {
>         struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> +       struct i915_ttm_tt *i915_tt =
> +               container_of(bo->ttm, typeof(*i915_tt), ttm);
>  
>         /*
>          * Don't manipulate the TTM LRUs while in TTM bo destruction.
> @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)
>          * Put on the correct LRU list depending on the MADV status
>          */
>         spin_lock(&bo->bdev->lru_lock);
> -       if (obj->mm.madv != I915_MADV_WILLNEED) {
> +       if (bo->ttm && i915_tt->filp) {
> +               /* Try to keep shmem_tt from being considered for
> shrinking. */
> +               bo->priority = TTM_MAX_BO_PRIORITY - 1;
> +       } else if (obj->mm.madv != I915_MADV_WILLNEED) {
>                 bo->priority = I915_TTM_PRIO_PURGE;
>         } else if (!i915_gem_object_has_pages(obj)) {
>                 if (bo->priority < I915_TTM_PRIO_HAS_PAGES)
> @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops
> i915_gem_ttm_obj_ops = {
>         .get_pages = i915_ttm_get_pages,
>         .put_pages = i915_ttm_put_pages,
>         .truncate = i915_ttm_purge,
> +       .shrinker_release_pages = i915_ttm_shrinker_release_pages,
> +
>         .adjust_lru = i915_ttm_adjust_lru,
>         .delayed_free = i915_ttm_delayed_free,
>         .migrate = i915_ttm_migrate,
> +
>         .mmap_offset = i915_ttm_mmap_offset,
>         .mmap_ops = &vm_ops_ttm,
>  };
> @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct
> intel_memory_region *mem,
>         drm_gem_private_object_init(&i915->drm, &obj->base, size);
>         i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class,
> flags);
>         i915_gem_object_init_memory_region(obj, mem);
> -       i915_gem_object_make_unshrinkable(obj);
>         INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL |
> __GFP_NOWARN);
>         mutex_init(&obj->ttm.get_io_page.lock);
>         bo_type = (obj->flags & I915_BO_ALLOC_USER) ?
> ttm_bo_type_device :



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 11/13] drm/i915/ttm: make evicted shmem pages visible to the shrinker
  2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
@ 2021-09-29 11:47     ` Thomas Hellström
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 11:47 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> We currently just evict lmem objects to system memory when under
> memory
> pressure, and in the next patch we want to use the shmem backend even
> for this case. For this case we lack the usual object mm.pages, which
> effectively hides the pages from the i915-gem shrinker, until we
> actually "attach" the TT to the object, or in the case of lmem-only
> objects it just gets migrated back to lmem when touched again.
> 
> For all cases we can just adjust the i915 shrinker LRU each time we
> also
> adjust the TTM LRU. The two cases we care about are:
> 
>   1) When something is moved by TTM, including when initially
> populating
>      an object. Importantly this covers the case where TTM moves
> something from
>      lmem <-> smem, outside of the normal get_pages() interface,
> which
>      should still ensure the shmem pages underneath are reclaimable.
> 
>   2) When calling into i915_gem_object_unlock(). The unlock should
>      ensure the object is removed from the shrinker LRU, if it was
> indeed
>      swapped out, or just purged, when the shrinker drops the object
> lock.
> 
> We can optimise this (if needed) by tracking if the object is already
> visible to the shrinker (protected by the object lock), so we don't
> touch
> the shrinker LRU more than needed.
> 
> v2(Thomas)
>   - Handle managing the shrinker LRU in adjust_lru, where it is
> always
>     safe to touch the object.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.h   |  1 +
>  drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 29 +++++++++++++++---
> --
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c      | 28 +++++++++++++++---
> -
>  3 files changed, 46 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 1c9a1d8d3434..640dfbf1f01e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -523,6 +523,7 @@ i915_gem_object_pin_to_display_plane(struct
> drm_i915_gem_object *obj,
>  
>  void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object
> *obj);
>  void i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj);
> +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj);
>  void i915_gem_object_make_purgeable(struct drm_i915_gem_object
> *obj);
>  
>  static inline bool cpu_write_needs_clflush(struct
> drm_i915_gem_object *obj)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> index 0440696f786a..4b6b2bb6f180 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> @@ -487,13 +487,12 @@ void i915_gem_object_make_unshrinkable(struct
> drm_i915_gem_object *obj)
>         spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
>  }
>  
> -static void __i915_gem_object_make_shrinkable(struct
> drm_i915_gem_object *obj,
> -                                             struct list_head *head)
> +static void ___i915_gem_object_make_shrinkable(struct
> drm_i915_gem_object *obj,
> +                                              struct list_head
> *head)
>  {
>         struct drm_i915_private *i915 = obj_to_i915(obj);
>         unsigned long flags;
>  
> -       GEM_BUG_ON(!i915_gem_object_has_pages(obj));
>         if (!i915_gem_object_is_shrinkable(obj))
>                 return;
>  
> @@ -512,6 +511,21 @@ static void
> __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
>         spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
>  }
>  
> +/**
> + * __i915_gem_object_make_shrinkable - Move the object to the tail
> of the
> + * shrinkable list. Objects on this list might be swapped out. Used
> with
> + * WILLNEED objects.
> + * @obj: The GEM object.
> + *
> + * DO NOT USE. This is intended to be called on very special objects
> that don't
> + * yet have mm.pages, but are guaranteed to have potentially
> reclaimable pages
> + * underneath.
> + */
> +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj)
> +{
> +       ___i915_gem_object_make_shrinkable(obj,
> +                                          &obj_to_i915(obj)-
> >mm.shrink_list);
> +}
>  
>  /**
>   * i915_gem_object_make_shrinkable - Move the object to the tail of
> the
> @@ -523,8 +537,8 @@ static void
> __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
>   */
>  void i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj)
>  {
> -       __i915_gem_object_make_shrinkable(obj,
> -                                         &obj_to_i915(obj)-
> >mm.shrink_list);
> +       GEM_BUG_ON(!i915_gem_object_has_pages(obj));
> +       __i915_gem_object_make_shrinkable(obj);
>  }
>  
>  /**
> @@ -538,6 +552,7 @@ void i915_gem_object_make_shrinkable(struct
> drm_i915_gem_object *obj)
>   */
>  void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj)
>  {
> -       __i915_gem_object_make_shrinkable(obj,
> -                                         &obj_to_i915(obj)-
> >mm.purge_list);
> +       GEM_BUG_ON(!i915_gem_object_has_pages(obj));
> +       ___i915_gem_object_make_shrinkable(obj,
> +                                          &obj_to_i915(obj)-
> >mm.purge_list);
>  }
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index c7402995a8f9..194e5f1deda8 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -749,6 +749,8 @@ static int i915_ttm_move(struct ttm_buffer_object
> *bo, bool evict,
>                         return ret;
>         }
>  
> +       i915_ttm_adjust_lru(obj);
> +

This will put the object on the shrinker list a little earlier than if
we rely on the adjust_lru() from object_unlock() only, but is that
strictly necessary? I figure even if the shrinker picks the object up,
it will fail in the object trylock and ignore the object, until we call
object_unlock() anyway?
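
The pattern being relied on is roughly the following -- a simplified
sketch of the shrinker walk, assuming the plain object trylock helper,
not the exact i915 shrinker code:

	/* for each object on the shrink list */
	if (!i915_gem_object_trylock(obj))
		continue;	/* still held across the move; revisit later */
	/* ... swap out or release the pages ... */
	i915_gem_object_unlock(obj);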


>         dst_st = i915_ttm_resource_get_st(obj, dst_mem);
>         if (IS_ERR(dst_st))
>                 return PTR_ERR(dst_st);
> @@ -856,7 +858,6 @@ static int __i915_ttm_get_pages(struct
> drm_i915_gem_object *obj,
>                         return i915_ttm_err_to_gem(ret);
>         }
>  
> -       i915_ttm_adjust_lru(obj);
>         if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) {
>                 ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx);
>                 if (ret)
> @@ -876,10 +877,10 @@ static int __i915_ttm_get_pages(struct
> drm_i915_gem_object *obj,
>                         return PTR_ERR(st);
>  
>                 __i915_gem_object_set_pages(obj, st,
> i915_sg_dma_sizes(st->sgl));
> -               if (!bo->ttm || !i915_tt->is_shmem)
> -                       i915_gem_object_make_unshrinkable(obj);
>         }
>  
> +       i915_ttm_adjust_lru(obj);
> +
>         return ret;
>  }
>  
> @@ -950,8 +951,6 @@ static void i915_ttm_put_pages(struct
> drm_i915_gem_object *obj,
>          * If the object is not destroyed next, The TTM eviction
> logic
>          * and shrinkers will move it out if needed.
>          */
> -
> -       i915_ttm_adjust_lru(obj);
>  }
>  
>  static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
> @@ -967,6 +966,17 @@ static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)
>         if (!kref_read(&bo->kref))
>                 return;
>  
> +       /*
> +        * Even if we lack mm.pages for this object(which will be the
> case when
> +        * something is evicted to system memory by TTM), we still
> want to make
> +        * this object visible to the shrinker, since the underlying
> ttm_tt
> +        * still has the real shmem pages.
> +        */
> +       if (bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm))
> +               __i915_gem_object_make_shrinkable(obj);
> +       else
> +               i915_gem_object_make_unshrinkable(obj);
> +
>         /*
>          * Put on the correct LRU list depending on the MADV status
>          */
> @@ -1006,6 +1016,14 @@ static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)
>  static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj)
>  {
>         if (obj->ttm.created) {
> +               /*
> +                * We freely manage the shrinker LRU outside of the
> mm.pages life
> +                * cycle. As a result when destroying the object it's
> up to us
> +                * to ensure we remove it from the LRU, before we
> free the
> +                * object.
> +                */
> +               i915_gem_object_make_unshrinkable(obj);
> +

I guess this is not *strictly* necessary at this point, since the
shrinker has a kref_get_unless_zero() guard, but I guess we need to
remove the object from the shrinker LRU at some point during
destruction anyway.
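
The guard referred to is the usual take-a-reference-unless-dying pattern;
a simplified sketch, not the exact shrinker code:

	if (!kref_get_unless_zero(&obj->base.refcount))
		continue;	/* already being destroyed; skip it */
	/* shrink, then drop the temporary reference */
	i915_gem_object_put(obj);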

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>


>                 ttm_bo_put(i915_gem_to_ttm(obj));
>         } else {
>                 __i915_gem_free_object(obj);




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 11/13] drm/i915/ttm: make evicted shmem pages visible to the shrinker
@ 2021-09-29 11:47     ` Thomas Hellström
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 11:47 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> We currently just evict lmem objects to system memory when under
> memory
> pressure, and in the next patch we want to use the shmem backend even
> for this case. For this case we lack the usual object mm.pages, which
> effectively hides the pages from the i915-gem shrinker, until we
> actually "attach" the TT to the object, or in the case of lmem-only
> objects it just gets migrated back to lmem when touched again.
> 
> For all cases we can just adjust the i915 shrinker LRU each time we
> also
> adjust the TTM LRU. The two cases we care about are:
> 
>   1) When something is moved by TTM, including when initially
> populating
>      an object. Importantly this covers the case where TTM moves
> something from
>      lmem <-> smem, outside of the normal get_pages() interface,
> which
>      should still ensure the shmem pages underneath are reclaimable.
> 
>   2) When calling into i915_gem_object_unlock(). The unlock should
>      ensure the object is removed from the shrinker LRU, if it was
> indeed
>      swapped out, or just purged, when the shrinker drops the object
> lock.
> 
> We can optimise this(if needed) by tracking if the object is already
> visible to the shrinker(protected by the object lock), so we don't
> touch
> the shrinker LRU more than needed.
> 
> v2(Thomas)
>   - Handle managing the shrinker LRU in adjust_lru, where it is
> always
>     safe to touch the object.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.h   |  1 +
>  drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 29 +++++++++++++++---
> --
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c      | 28 +++++++++++++++---
> -
>  3 files changed, 46 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 1c9a1d8d3434..640dfbf1f01e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -523,6 +523,7 @@ i915_gem_object_pin_to_display_plane(struct
> drm_i915_gem_object *obj,
>  
>  void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object
> *obj);
>  void i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj);
> +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj);
>  void i915_gem_object_make_purgeable(struct drm_i915_gem_object
> *obj);
>  
>  static inline bool cpu_write_needs_clflush(struct
> drm_i915_gem_object *obj)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> index 0440696f786a..4b6b2bb6f180 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> @@ -487,13 +487,12 @@ void i915_gem_object_make_unshrinkable(struct
> drm_i915_gem_object *obj)
>         spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
>  }
>  
> -static void __i915_gem_object_make_shrinkable(struct
> drm_i915_gem_object *obj,
> -                                             struct list_head *head)
> +static void ___i915_gem_object_make_shrinkable(struct
> drm_i915_gem_object *obj,
> +                                              struct list_head
> *head)
>  {
>         struct drm_i915_private *i915 = obj_to_i915(obj);
>         unsigned long flags;
>  
> -       GEM_BUG_ON(!i915_gem_object_has_pages(obj));
>         if (!i915_gem_object_is_shrinkable(obj))
>                 return;
>  
> @@ -512,6 +511,21 @@ static void
> __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
>         spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
>  }
>  
> +/**
> + * __i915_gem_object_make_shrinkable - Move the object to the tail
> of the
> + * shrinkable list. Objects on this list might be swapped out. Used
> with
> + * WILLNEED objects.
> + * @obj: The GEM object.
> + *
> + * DO NOT USE. This is intended to be called on very special objects
> that don't
> + * yet have mm.pages, but are guaranteed to have potentially
> reclaimable pages
> + * underneath.
> + */
> +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj)
> +{
> +       ___i915_gem_object_make_shrinkable(obj,
> +                                          &obj_to_i915(obj)-
> >mm.shrink_list);
> +}
>  
>  /**
>   * i915_gem_object_make_shrinkable - Move the object to the tail of
> the
> @@ -523,8 +537,8 @@ static void
> __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,
>   */
>  void i915_gem_object_make_shrinkable(struct drm_i915_gem_object
> *obj)
>  {
> -       __i915_gem_object_make_shrinkable(obj,
> -                                         &obj_to_i915(obj)-
> >mm.shrink_list);
> +       GEM_BUG_ON(!i915_gem_object_has_pages(obj));
> +       __i915_gem_object_make_shrinkable(obj);
>  }
>  
>  /**
> @@ -538,6 +552,7 @@ void i915_gem_object_make_shrinkable(struct
> drm_i915_gem_object *obj)
>   */
>  void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj)
>  {
> -       __i915_gem_object_make_shrinkable(obj,
> -                                         &obj_to_i915(obj)-
> >mm.purge_list);
> +       GEM_BUG_ON(!i915_gem_object_has_pages(obj));
> +       ___i915_gem_object_make_shrinkable(obj,
> +                                          &obj_to_i915(obj)-
> >mm.purge_list);
>  }
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index c7402995a8f9..194e5f1deda8 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -749,6 +749,8 @@ static int i915_ttm_move(struct ttm_buffer_object
> *bo, bool evict,
>                         return ret;
>         }
>  
> +       i915_ttm_adjust_lru(obj);
> +

This will put the object on the shrinker list a little earlier than if
we rely on the adjust_lru() from object_unlock() only, but is that
strictly necessary? I figure even if the shrinker picks the object up,
it will fail in the object trylock and ignore the object, until we call
object_unlock() anyway?
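
(To spell out the ordering in question: a rough sketch, assuming the
object lock is held across the whole TTM move, as it is today.)

    /*
     * move path                          shrinker
     * ---------                          --------
     * i915_gem_object_lock(obj)
     *   ...copy / swap placements...
     *   i915_ttm_adjust_lru(obj)
     *     -> obj lands on shrink_list    sees obj on shrink_list,
     *   ...                              trylock(obj) fails, obj skipped
     * i915_gem_object_unlock(obj)        trylock(obj) succeeds, reclaim
     */

i.e. the earlier call should at most make the shrinker skip the object a
few extra times while the move is still in flight.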


>         dst_st = i915_ttm_resource_get_st(obj, dst_mem);
>         if (IS_ERR(dst_st))
>                 return PTR_ERR(dst_st);
> @@ -856,7 +858,6 @@ static int __i915_ttm_get_pages(struct
> drm_i915_gem_object *obj,
>                         return i915_ttm_err_to_gem(ret);
>         }
>  
> -       i915_ttm_adjust_lru(obj);
>         if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) {
>                 ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx);
>                 if (ret)
> @@ -876,10 +877,10 @@ static int __i915_ttm_get_pages(struct
> drm_i915_gem_object *obj,
>                         return PTR_ERR(st);
>  
>                 __i915_gem_object_set_pages(obj, st,
> i915_sg_dma_sizes(st->sgl));
> -               if (!bo->ttm || !i915_tt->is_shmem)
> -                       i915_gem_object_make_unshrinkable(obj);
>         }
>  
> +       i915_ttm_adjust_lru(obj);
> +
>         return ret;
>  }
>  
> @@ -950,8 +951,6 @@ static void i915_ttm_put_pages(struct
> drm_i915_gem_object *obj,
>          * If the object is not destroyed next, The TTM eviction
> logic
>          * and shrinkers will move it out if needed.
>          */
> -
> -       i915_ttm_adjust_lru(obj);
>  }
>  
>  static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
> @@ -967,6 +966,17 @@ static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)
>         if (!kref_read(&bo->kref))
>                 return;
>  
> +       /*
> +        * Even if we lack mm.pages for this object(which will be the
> case when
> +        * something is evicted to system memory by TTM), we still
> want to make
> +        * this object visible to the shrinker, since the underlying
> ttm_tt
> +        * still has the real shmem pages.
> +        */
> +       if (bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm))
> +               __i915_gem_object_make_shrinkable(obj);
> +       else
> +               i915_gem_object_make_unshrinkable(obj);
> +
>         /*
>          * Put on the correct LRU list depending on the MADV status
>          */
> @@ -1006,6 +1016,14 @@ static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)
>  static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj)
>  {
>         if (obj->ttm.created) {
> +               /*
> +                * We freely manage the shrinker LRU outside of the
> mm.pages life
> +                * cycle. As a result when destroying the object it's
> up to us
> +                * to ensure we remove it from the LRU, before we
> free the
> +                * object.
> +                */
> +               i915_gem_object_make_unshrinkable(obj);
> +

I guess this is not *strictly* necessary at this point, since the
shrinker has a kref_get_unless_zero() guard, but I guess we need to
remove the object from the shrinker LRU at some point during
destruction anyway.
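
(For context, the guard referred to here is the shrinker-side pattern of
only touching objects it can still take a reference to; a simplified
sketch, not the exact i915_gem_shrink() loop, with obj/next/i915 as in
i915_gem_shrinker.c:)

    list_for_each_entry_safe(obj, next, &i915->mm.shrink_list, mm.link) {
            /* object already on its way to destruction: skip it */
            if (!kref_get_unless_zero(&obj->base.refcount))
                    continue;

            if (i915_gem_object_trylock(obj)) {
                    /* ...drop pages / writeback... */
                    i915_gem_object_unlock(obj);
            }

            i915_gem_object_put(obj);
    }

which is why the call in the free path reads as being about removing the
object from the list during destruction rather than about preventing a
use-after-free.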

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>


>                 ttm_bo_put(i915_gem_to_ttm(obj));
>         } else {
>                 __i915_gem_free_object(obj);




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
  2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
@ 2021-09-29 11:54     ` Thomas Hellström
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 11:54 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> This should let us do an accelerated copy directly to the shmem pages
> when temporarily moving lmem-only objects, where the i915-gem
> shrinker
> can later kick in to swap out the pages, if needed.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 194e5f1deda8..46d57541c0b2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -134,11 +134,11 @@ static enum ttm_caching
>  i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>  {
>         /*
> -        * Objects only allowed in system get cached cpu-mappings.
> -        * Other objects get WC mapping for now. Even if in system.
> +        * Objects only allowed in system get cached cpu-mappings, or
> when
> +        * evicting lmem-only buffers to system for swapping. Other
> objects get
> +        * WC mapping for now. Even if in system.
>          */
> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
> -           obj->mm.n_placements <= 1)
> +       if (obj->mm.n_placements <= 1)
>                 return ttm_cached;
>  
>         return ttm_write_combined;

We should be aware that with TTM, even evicted bos can be mapped by
user-space while evicted, and this will appear to user-space like the
WC-mapped object suddenly became WB-mapped. But it appears like mesa
doesn't care about this as long as the mappings are fully coherent.
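
(For reference, the behaviour after the hunk above, spelled out:)

    /*
     * obj->mm.n_placements <= 1 (system-only or lmem-only) -> ttm_cached
     * multiple placements (e.g. lmem + smem)               -> ttm_write_combined
     *
     * so an lmem-only BO, which userspace maps WC while it is resident in
     * lmem, is transiently backed by WB-cached system pages while evicted,
     * which is the WC to WB transition described above.
     */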

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>





^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
@ 2021-09-29 11:54     ` Thomas Hellström
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 11:54 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> This should let us do an accelerated copy directly to the shmem pages
> when temporarily moving lmem-only objects, where the i915-gem
> shrinker
> can later kick in to swap out the pages, if needed.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 194e5f1deda8..46d57541c0b2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -134,11 +134,11 @@ static enum ttm_caching
>  i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>  {
>         /*
> -        * Objects only allowed in system get cached cpu-mappings.
> -        * Other objects get WC mapping for now. Even if in system.
> +        * Objects only allowed in system get cached cpu-mappings, or
> when
> +        * evicting lmem-only buffers to system for swapping. Other
> objects get
> +        * WC mapping for now. Even if in system.
>          */
> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
> -           obj->mm.n_placements <= 1)
> +       if (obj->mm.n_placements <= 1)
>                 return ttm_cached;
>  
>         return ttm_write_combined;

We should be aware that with TTM, even evicted bos can be mapped by
user-space while evicted, and this will appear to user-space like the
WC-mapped object suddenly became WB-mapped. But it appears like mesa
doesn't care about this as long as the mappings are fully coherent.

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>





^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 13/13] drm/i915/ttm: enable shmem tt backend
  2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
@ 2021-09-29 12:00     ` Thomas Hellström
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 12:00 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> Turn on the shmem tt backend, and enable shrinking.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 46d57541c0b2..4ae630fbc5cd 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -1093,6 +1093,7 @@ static u64 i915_ttm_mmap_offset(struct
> drm_i915_gem_object *obj)
>  
>  static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
>         .name = "i915_gem_object_ttm",
> +       .flags = I915_GEM_OBJECT_IS_SHRINKABLE,
>  
>         .get_pages = i915_ttm_get_pages,
>         .put_pages = i915_ttm_put_pages,

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Now that BAT is running a DG1 again, it might be worth giving the
series a rerun. Perhaps with the "rework object initialization
slightly" as a HAX patch to unblock the mman + following selftest.

/Thomas



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 13/13] drm/i915/ttm: enable shmem tt backend
@ 2021-09-29 12:00     ` Thomas Hellström
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 12:00 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> Turn on the shmem tt backend, and enable shrinking.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 46d57541c0b2..4ae630fbc5cd 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -1093,6 +1093,7 @@ static u64 i915_ttm_mmap_offset(struct
> drm_i915_gem_object *obj)
>  
>  static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
>         .name = "i915_gem_object_ttm",
> +       .flags = I915_GEM_OBJECT_IS_SHRINKABLE,
>  
>         .get_pages = i915_ttm_get_pages,
>         .put_pages = i915_ttm_put_pages,

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Now that BAT is running a DG1 again, it might be worth giving the
series a rerun. Perhaps with the "rework object initialization
slightly" as a HAX patch to unblock the mman + following selftest.

/Thomas



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-27 16:14   ` Matthew Auld
@ 2021-09-29 12:01     ` Christian König
  2021-09-29 13:45       ` Matthew Auld
  0 siblings, 1 reply; 60+ messages in thread
From: Christian König @ 2021-09-29 12:01 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Matthew Auld, Intel Graphics Development, ML dri-devel,
	Thomas Hellström

Am 27.09.21 um 18:14 schrieb Matthew Auld:
> On Mon, 27 Sept 2021 at 12:47, Christian König <christian.koenig@amd.com> wrote:
>> Any objections that I just push patches 1-7 to drm-misc-next?
> Please go ahead Christian. Thanks.

Well I've pushed patches #1-#4 because #5 won't apply on current 
drm-misc-next (some conflict in i915).

Could you rebase this and/or request backmerging of drm-next into 
drm-misc-next when potential i915 prerequisites have landed there.

Thanks,
Christian.

>
>> Christian.
>>
>> Am 27.09.21 um 13:41 schrieb Matthew Auld:
>>> In commit:
>>>
>>> commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
>>> Author: Felix Kuehling <Felix.Kuehling@amd.com>
>>> Date:   Thu Jul 13 17:01:16 2017 -0400
>>>
>>>       drm/ttm: Implement vm_operations_struct.access v2
>>>
>>> we added the vm_access hook, where we also directly call tt_swapin for
>>> some reason. If something is swapped-out then the ttm_tt must also be
>>> unpopulated, and since access_kmap should also call tt_populate, if
>>> needed, then swapping-in will already be handled there.
>>>
>>> If anything, calling tt_swapin directly here would likely always fail
>>> since the tt->pages won't yet be populated, or worse since the tt->pages
>>> array is never actually cleared in unpopulate this might lead to a nasty
>>> uaf.
>>>
>>> Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>    drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -----
>>>    1 file changed, 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> index f56be5bc0861..5b9b7fd01a69 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> @@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>>>
>>>        switch (bo->resource->mem_type) {
>>>        case TTM_PL_SYSTEM:
>>> -             if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
>>> -                     ret = ttm_tt_swapin(bo->ttm);
>>> -                     if (unlikely(ret != 0))
>>> -                             return ret;
>>> -             }
>>>                fallthrough;
>>>        case TTM_PL_TT:
>>>                ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 10/13] drm/i915: try to simplify make_{un}shrinkable
  2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
@ 2021-09-29 13:00     ` Thomas Hellström
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 13:00 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> Drop the atomic shrink_pin stuff, and just have make_{un}shrinkable
> update the shrinker visible lists immediately. This at least
> simplifies
> the next patch, and does make the behaviour more obvious. The
> potential
> downside is that make_unshrinkable now grabs a global lock even when
> the
> object itself is no longer shrinkable(transitioning from purgeable <-
> >
> shrinkable doesn't seem to be a thing), for example in the ppGTT
> insertion paths we should now be careful not to needlessly call
> make_unshrinkable multiple times. Outside of that there is some
> fallout
> in intel_context which relies on nesting calls to shrink_pin.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Hmm. One thing that worries me a bit here: Let's say we have, for
example an LMEM context state, and TTM has it made unshrinkable. Then
the context becomes active and calls _make_unshrinkable again. And when
it retires it calls _make_shrinkable. Doesn't it end up on the
shrinker list at that point, even if still in LMEM?
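
(Writing the sequence out, with intel_context assumed to be the caller
of the inner pair, per the fallout mentioned in the commit message:)

    /*
     * i915_ttm_adjust_lru(obj)                     obj resident in lmem
     *   -> i915_gem_object_make_unshrinkable()     off the shrinker lists
     * context pinned (active)
     *   -> i915_gem_object_make_unshrinkable()     still off the lists
     * context retires
     *   -> i915_gem_object_make_shrinkable()       back on shrink_list,
     *                                              while still in lmem?
     */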

/Thomas



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 10/13] drm/i915: try to simplify make_{un}shrinkable
@ 2021-09-29 13:00     ` Thomas Hellström
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Hellström @ 2021-09-29 13:00 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> Drop the atomic shrink_pin stuff, and just have make_{un}shrinkable
> update the shrinker visible lists immediately. This at least
> simplifies
> the next patch, and does make the behaviour more obvious. The
> potential
> downside is that make_unshrinkable now grabs a global lock even when
> the
> object itself is no longer shrinkable(transitioning from purgeable <-
> >
> shrinkable doesn't seem to be a thing), for example in the ppGTT
> insertion paths we should now be careful not to needlessly call
> make_unshrinkable multiple times. Outside of that there is some
> fallout
> in intel_context which relies on nesting calls to shrink_pin.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Hmm. One thing that worries me a bit here: Let's say we have, for
example an LMEM context state, and TTM has it made unshrinkable. Then
the context becomes active and calls _make_unshrinkable again. And when
it retires it calls _make_shrinkable. Doesn't it end up on the
shrinker list at that point, even if still in LMEM?

/Thomas



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access
  2021-09-29 12:01     ` Christian König
@ 2021-09-29 13:45       ` Matthew Auld
  0 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-29 13:45 UTC (permalink / raw)
  To: Christian König
  Cc: Matthew Auld, Intel Graphics Development, ML dri-devel,
	Thomas Hellström

On Wed, 29 Sept 2021 at 13:01, Christian König <christian.koenig@amd.com> wrote:
>
> Am 27.09.21 um 18:14 schrieb Matthew Auld:
> > On Mon, 27 Sept 2021 at 12:47, Christian König <christian.koenig@amd.com> wrote:
> >> Any objections that I just push patches 1-7 to drm-misc-next?
> > Please go ahead Christian. Thanks.
>
> Well I've pushed patches #1-#4 because #5 won't apply on current
> drm-misc-next (some conflict in i915).
>
> Could you rebase this an/or request backmerging of drm-next into
> drm-misc-next when potential i915 prerequisites have landed there.

Version which should apply to drm-misc-next:
https://patchwork.freedesktop.org/series/95219/

>
> Thanks,
> Christian.
>
> >
> >> Christian.
> >>
> >> Am 27.09.21 um 13:41 schrieb Matthew Auld:
> >>> In commit:
> >>>
> >>> commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
> >>> Author: Felix Kuehling <Felix.Kuehling@amd.com>
> >>> Date:   Thu Jul 13 17:01:16 2017 -0400
> >>>
> >>>       drm/ttm: Implement vm_operations_struct.access v2
> >>>
> >>> we added the vm_access hook, where we also directly call tt_swapin for
> >>> some reason. If something is swapped-out then the ttm_tt must also be
> >>> unpopulated, and since access_kmap should also call tt_populate, if
> >>> needed, then swapping-in will already be handled there.
> >>>
> >>> If anything, calling tt_swapin directly here would likely always fail
> >>> since the tt->pages won't yet be populated, or worse since the tt->pages
> >>> array is never actually cleared in unpopulate this might lead to a nasty
> >>> uaf.
> >>>
> >>> Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
> >>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> >>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> >>> Cc: Christian König <christian.koenig@amd.com>
> >>> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> >>> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>> ---
> >>>    drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -----
> >>>    1 file changed, 5 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>> index f56be5bc0861..5b9b7fd01a69 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >>> @@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
> >>>
> >>>        switch (bo->resource->mem_type) {
> >>>        case TTM_PL_SYSTEM:
> >>> -             if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
> >>> -                     ret = ttm_tt_swapin(bo->ttm);
> >>> -                     if (unlikely(ret != 0))
> >>> -                             return ret;
> >>> -             }
> >>>                fallthrough;
> >>>        case TTM_PL_TT:
> >>>                ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
  2021-09-29 11:54     ` [Intel-gfx] " Thomas Hellström
@ 2021-09-30 10:04       ` Michel Dänzer
  -1 siblings, 0 replies; 60+ messages in thread
From: Michel Dänzer @ 2021-09-30 10:04 UTC (permalink / raw)
  To: Thomas Hellström, Matthew Auld; +Cc: dri-devel, intel-gfx

On 2021-09-29 13:54, Thomas Hellström wrote:
> On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
>> This should let us do an accelerated copy directly to the shmem pages
>> when temporarily moving lmem-only objects, where the i915-gem
>> shrinker
>> can later kick in to swap out the pages, if needed.
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> index 194e5f1deda8..46d57541c0b2 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> @@ -134,11 +134,11 @@ static enum ttm_caching
>>  i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>>  {
>>         /*
>> -        * Objects only allowed in system get cached cpu-mappings.
>> -        * Other objects get WC mapping for now. Even if in system.
>> +        * Objects only allowed in system get cached cpu-mappings, or
>> when
>> +        * evicting lmem-only buffers to system for swapping. Other
>> objects get
>> +        * WC mapping for now. Even if in system.
>>          */
>> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
>> -           obj->mm.n_placements <= 1)
>> +       if (obj->mm.n_placements <= 1)
>>                 return ttm_cached;
>>  
>>         return ttm_write_combined;
> 
> We should be aware that with TTM, even evicted bos can be mapped by
> user-space while evicted, and this will appear to user-space like the
> WC-mapped object suddenly became WB-mapped. But it appears like mesa
> doesn't care about this as long as the mappings are fully coherent.

FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).
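
(For reference, that flag is requested at BO creation time; a minimal,
untested sketch against the amdgpu uapi, where fd and size are whatever
the caller already has at hand:)

    #include <xf86drm.h>
    #include <drm/amdgpu_drm.h>

    union drm_amdgpu_gem_create args = {
            .in = {
                    .bo_size      = size,
                    .alignment    = 4096,
                    .domains      = AMDGPU_GEM_DOMAIN_VRAM,
                    /* keep the CPU mapping USWC even while evicted to GTT */
                    .domain_flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC,
            },
    };

    int ret = drmIoctl(fd, DRM_IOCTL_AMDGPU_GEM_CREATE, &args);
    /* args.out.handle is the GEM handle on success */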


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
@ 2021-09-30 10:04       ` Michel Dänzer
  0 siblings, 0 replies; 60+ messages in thread
From: Michel Dänzer @ 2021-09-30 10:04 UTC (permalink / raw)
  To: Thomas Hellström, Matthew Auld; +Cc: dri-devel, intel-gfx

On 2021-09-29 13:54, Thomas Hellström wrote:
> On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
>> This should let us do an accelerated copy directly to the shmem pages
>> when temporarily moving lmem-only objects, where the i915-gem
>> shrinker
>> can later kick in to swap out the pages, if needed.
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> index 194e5f1deda8..46d57541c0b2 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> @@ -134,11 +134,11 @@ static enum ttm_caching
>>  i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>>  {
>>         /*
>> -        * Objects only allowed in system get cached cpu-mappings.
>> -        * Other objects get WC mapping for now. Even if in system.
>> +        * Objects only allowed in system get cached cpu-mappings, or
>> when
>> +        * evicting lmem-only buffers to system for swapping. Other
>> objects get
>> +        * WC mapping for now. Even if in system.
>>          */
>> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
>> -           obj->mm.n_placements <= 1)
>> +       if (obj->mm.n_placements <= 1)
>>                 return ttm_cached;
>>  
>>         return ttm_write_combined;
> 
> We should be aware that with TTM, even evicted bos can be mapped by
> user-space while evicted, and this will appear to user-space like the
> WC-mapped object suddenly became WB-mapped. But it appears like mesa
> doesn't care about this as long as the mappings are fully coherent.

FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
  2021-09-30 10:04       ` [Intel-gfx] " Michel Dänzer
@ 2021-09-30 12:27         ` Matthew Auld
  -1 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-30 12:27 UTC (permalink / raw)
  To: Michel Dänzer, Thomas Hellström; +Cc: dri-devel, intel-gfx

On 30/09/2021 11:04, Michel Dänzer wrote:
> On 2021-09-29 13:54, Thomas Hellström wrote:
>> On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
>>> This should let us do an accelerated copy directly to the shmem pages
>>> when temporarily moving lmem-only objects, where the i915-gem
>>> shrinker
>>> can later kick in to swap out the pages, if needed.
>>>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> index 194e5f1deda8..46d57541c0b2 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> @@ -134,11 +134,11 @@ static enum ttm_caching
>>>   i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>>>   {
>>>          /*
>>> -        * Objects only allowed in system get cached cpu-mappings.
>>> -        * Other objects get WC mapping for now. Even if in system.
>>> +        * Objects only allowed in system get cached cpu-mappings, or
>>> when
>>> +        * evicting lmem-only buffers to system for swapping. Other
>>> objects get
>>> +        * WC mapping for now. Even if in system.
>>>           */
>>> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
>>> -           obj->mm.n_placements <= 1)
>>> +       if (obj->mm.n_placements <= 1)
>>>                  return ttm_cached;
>>>   
>>>          return ttm_write_combined;
>>
>> We should be aware that with TTM, even evicted bos can be mapped by
>> user-space while evicted, and this will appear to user-space like the
>> WC-mapped object suddenly became WB-mapped. But it appears like mesa
>> doesn't care about this as long as the mappings are fully coherent.
> 
> FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).
> 

Ok, so amdgpu just defaults to cached system memory, even for evicted 
VRAM, unless userspace requests USWC, in which case it will use WC?

> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
@ 2021-09-30 12:27         ` Matthew Auld
  0 siblings, 0 replies; 60+ messages in thread
From: Matthew Auld @ 2021-09-30 12:27 UTC (permalink / raw)
  To: Michel Dänzer, Thomas Hellström; +Cc: dri-devel, intel-gfx

On 30/09/2021 11:04, Michel Dänzer wrote:
> On 2021-09-29 13:54, Thomas Hellström wrote:
>> On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
>>> This should let us do an accelerated copy directly to the shmem pages
>>> when temporarily moving lmem-only objects, where the i915-gem
>>> shrinker
>>> can later kick in to swap out the pages, if needed.
>>>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> index 194e5f1deda8..46d57541c0b2 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> @@ -134,11 +134,11 @@ static enum ttm_caching
>>>   i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>>>   {
>>>          /*
>>> -        * Objects only allowed in system get cached cpu-mappings.
>>> -        * Other objects get WC mapping for now. Even if in system.
>>> +        * Objects only allowed in system get cached cpu-mappings, or
>>> when
>>> +        * evicting lmem-only buffers to system for swapping. Other
>>> objects get
>>> +        * WC mapping for now. Even if in system.
>>>           */
>>> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
>>> -           obj->mm.n_placements <= 1)
>>> +       if (obj->mm.n_placements <= 1)
>>>                  return ttm_cached;
>>>   
>>>          return ttm_write_combined;
>>
>> We should be aware that with TTM, even evicted bos can be mapped by
>> user-space while evicted, and this will appear to user-space like the
>> WC-mapped object suddenly became WB-mapped. But it appears like mesa
>> doesn't care about this as long as the mappings are fully coherent.
> 
> FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).
> 

Ok, so amdgpu just defaults to cached system memory, even for evicted 
VRAM, unless userspace requests USWC, in which case it will use WC?

> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
  2021-09-30 12:27         ` [Intel-gfx] " Matthew Auld
@ 2021-09-30 12:55           ` Michel Dänzer
  -1 siblings, 0 replies; 60+ messages in thread
From: Michel Dänzer @ 2021-09-30 12:55 UTC (permalink / raw)
  To: Matthew Auld, Thomas Hellström; +Cc: dri-devel, intel-gfx

On 2021-09-30 14:27, Matthew Auld wrote:
> On 30/09/2021 11:04, Michel Dänzer wrote:
>> On 2021-09-29 13:54, Thomas Hellström wrote:
>>> On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
>>>> This should let us do an accelerated copy directly to the shmem pages
>>>> when temporarily moving lmem-only objects, where the i915-gem
>>>> shrinker
>>>> can later kick in to swap out the pages, if needed.
>>>>
>>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> index 194e5f1deda8..46d57541c0b2 100644
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> @@ -134,11 +134,11 @@ static enum ttm_caching
>>>>   i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>>>>   {
>>>>          /*
>>>> -        * Objects only allowed in system get cached cpu-mappings.
>>>> -        * Other objects get WC mapping for now. Even if in system.
>>>> +        * Objects only allowed in system get cached cpu-mappings, or
>>>> when
>>>> +        * evicting lmem-only buffers to system for swapping. Other
>>>> objects get
>>>> +        * WC mapping for now. Even if in system.
>>>>           */
>>>> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
>>>> -           obj->mm.n_placements <= 1)
>>>> +       if (obj->mm.n_placements <= 1)
>>>>                  return ttm_cached;
>>>>            return ttm_write_combined;
>>>
>>> We should be aware that with TTM, even evicted bos can be mapped by
>>> user-space while evicted, and this will appear to user-space like the
>>> WC-mapped object suddenly became WB-mapped. But it appears like mesa
>>> doesn't care about this as long as the mappings are fully coherent.
>>
>> FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).
>>
> 
> Ok, so amdgpu just defaults to cached system memory, even for evicted VRAM, unless userspace requests USWC, in which case it will use WC?

Right.


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem
@ 2021-09-30 12:55           ` Michel Dänzer
  0 siblings, 0 replies; 60+ messages in thread
From: Michel Dänzer @ 2021-09-30 12:55 UTC (permalink / raw)
  To: Matthew Auld, Thomas Hellström; +Cc: dri-devel, intel-gfx

On 2021-09-30 14:27, Matthew Auld wrote:
> On 30/09/2021 11:04, Michel Dänzer wrote:
>> On 2021-09-29 13:54, Thomas Hellström wrote:
>>> On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
>>>> This should let us do an accelerated copy directly to the shmem pages
>>>> when temporarily moving lmem-only objects, where the i915-gem
>>>> shrinker
>>>> can later kick in to swap out the pages, if needed.
>>>>
>>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++----
>>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> index 194e5f1deda8..46d57541c0b2 100644
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>> @@ -134,11 +134,11 @@ static enum ttm_caching
>>>>   i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
>>>>   {
>>>>          /*
>>>> -        * Objects only allowed in system get cached cpu-mappings.
>>>> -        * Other objects get WC mapping for now. Even if in system.
>>>> +        * Objects only allowed in system get cached cpu-mappings, or
>>>> when
>>>> +        * evicting lmem-only buffers to system for swapping. Other
>>>> objects get
>>>> +        * WC mapping for now. Even if in system.
>>>>           */
>>>> -       if (obj->mm.region->type == INTEL_MEMORY_SYSTEM &&
>>>> -           obj->mm.n_placements <= 1)
>>>> +       if (obj->mm.n_placements <= 1)
>>>>                  return ttm_cached;
>>>>            return ttm_write_combined;
>>>
>>> We should be aware that with TTM, even evicted bos can be mapped by
>>> user-space while evicted, and this will appear to user-space like the
>>> WC-mapped object suddenly became WB-mapped. But it appears like mesa
>>> doesn't care about this as long as the mappings are fully coherent.
>>
>> FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).
>>
> 
> Ok, so amdgpu just defaults to cached system memory, even for evicted VRAM, unless userspace requests USWC, in which case it will use WC?

Right.


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
  2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
@ 2021-10-05  2:05     ` Zeng, Oak
  -1 siblings, 0 replies; 60+ messages in thread
From: Zeng, Oak @ 2021-10-05  2:05 UTC (permalink / raw)
  To: Auld, Matthew, intel-gfx
  Cc: dri-devel, Thomas Hellström, Christian König

Hi Matthew/Thomas,

See one question inline

Regards,
Oak

-----Original Message-----
From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Matthew Auld
Sent: September 27, 2021 7:41 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org; Thomas Hellström <thomas.hellstrom@linux.intel.com>; Christian König <christian.koenig@amd.com>
Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This should make it possible(in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
    to check if the object is even still shrinkable; in between us
    dropping the shrinker LRU lock and acquiring the object lock it could for
    example have been moved. Also we need to differentiate between
    "lazy" shrinking and the immediate writeback mode. Also later we need to
    handle objects which don't even have mm.pages, so bundling this into
    put_pages() would require somehow handling that edge case, hence
    just letting the ttm backend handle everything in try_to_writeback
    doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
    since it's not possible to hold a reference, without also creating
    circular dependency, so likely this is too fragile. For now just
    ensure we at least mark the pages as dirty/accessed when called from the
    shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
    than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
    calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
    these just get skipped anyway. We can try to come up with something
    better later.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++--
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3043fcbd31bd..1c9a1d8d3434 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,  bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
 					enum intel_memory_type type);
 
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index fa2ba9e2a4d0..f0fb17be2f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
 			  struct sg_table *pages);
 	void (*truncate)(struct drm_i915_gem_object *obj);
 	void (*writeback)(struct drm_i915_gem_object *obj);
+	int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+				      bool should_writeback);
 
 	int (*pread)(struct drm_i915_gem_object *obj,
 		     const struct drm_i915_gem_pread *arg); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 36b711ae9e28..19e55cc29a15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
 	cond_resched();
 }
 
-static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
-			  bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup)
 {
 	struct sgt_iter sgt_iter;
 	struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
 	kfree(st);
 }
 
-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
-				       size_t size, struct intel_memory_region *mr,
-				       struct address_space *mapping,
-				       unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment)
 {
 	const unsigned long page_count = size / PAGE_SIZE;
 	unsigned long i;
@@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
 	obj->mm.pages = ERR_PTR(-EFAULT);
 }
 
-static void __shmem_writeback(size_t size, struct address_space *mapping)
+void __shmem_writeback(size_t size, struct address_space *mapping)
 {
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index e382b7f2353b..cc80bd23d323 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
 	return false;
 }
 
-static void try_to_writeback(struct drm_i915_gem_object *obj,
-			     unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned 
+int flags)
 {
+	if (obj->ops->shrinker_release_pages)
+		return obj->ops->shrinker_release_pages(obj,
+							flags & I915_SHRINK_WRITEBACK);
+
 	switch (obj->mm.madv) {
 	case I915_MADV_DONTNEED:
 		i915_gem_object_truncate(obj);
-		return;
+		return 0;
 	case __I915_MADV_PURGED:
-		return;
+		return 0;
 	}
 
 	if (flags & I915_SHRINK_WRITEBACK)
 		i915_gem_object_writeback(obj);
+
+	return 0;
 }
 
 /**
@@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 				}
 
 				if (!__i915_gem_object_put_pages(obj)) {
-					try_to_writeback(obj, shrink);
-					count += obj->base.size >> PAGE_SHIFT;
+					if (!try_to_writeback(obj, shrink))
+						count += obj->base.size >> PAGE_SHIFT;
 				}
 				if (!ww)
 					i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index a77e90f300fe..c7402995a8f9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -35,6 +35,8 @@
  * @ttm: The base TTM page vector.
  * @dev: The struct device used for dma mapping and unmapping.
  * @cached_st: The cached scatter-gather table.
+ * @is_shmem: Set if using shmem.
+ * @filp: The shmem file, if using shmem backend.
  *
  * Note that DMA may be going on right up to the point where the page-
  * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6 +48,9 @@ struct i915_ttm_tt {
 	struct ttm_tt ttm;
 	struct device *dev;
 	struct sg_table *cached_st;
+
+	bool is_shmem;
+	struct file *filp;
 };
 
 static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
 	placement->busy_placement = busy;
 }
 
+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
+				      struct ttm_tt *ttm,
+				      struct ttm_operation_ctx *ctx) {
+	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
+	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	const unsigned int max_segment = i915_sg_segment_size();
+	const size_t size = ttm->num_pages << PAGE_SHIFT;
+	struct file *filp = i915_tt->filp;
+	struct sgt_iter sgt_iter;
+	struct sg_table *st;
+	struct page *page;
+	unsigned long i;
+	int err;
+
+	if (!filp) {
+		struct address_space *mapping;
+		gfp_t mask;
+
+		filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
+		if (IS_ERR(filp))
+			return PTR_ERR(filp);
+
+		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+
+		mapping = filp->f_mapping;
+		mapping_set_gfp_mask(mapping, mask);
+		GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
+
+		i915_tt->filp = filp;
+	}
+
+	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
+	if (IS_ERR(st))
+		return PTR_ERR(st);
+
+	err = dma_map_sg_attrs(i915_tt->dev,
+			       st->sgl, st->nents,
+			       PCI_DMA_BIDIRECTIONAL,
+			       DMA_ATTR_SKIP_CPU_SYNC |
+			       DMA_ATTR_NO_KERNEL_MAPPING |
+			       DMA_ATTR_NO_WARN);
+	if (err <= 0) {
+		err = -EINVAL;
+		goto err_free_st;
+	}
+
+	i = 0;
+	for_each_sgt_page(page, sgt_iter, st)
+		ttm->pages[i++] = page;
+
+	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+
+	i915_tt->cached_st = st;
+	return 0;
+
+err_free_st:
+	shmem_free_st(st, filp->f_mapping, false, false);
+	return err;
+}
+
+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
+
+	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
+		     i915_tt->cached_st->nents,
+		     PCI_DMA_BIDIRECTIONAL);
+
+	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
+		      file_inode(i915_tt->filp)->i_mapping,
+		      backup, backup);

Should we do something to undo the shmem_file_setup operation here? From its implementation it takes a reference on an inode and allocates a file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084
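
(From the rest of the patch the reference looks intentionally kept across
unpopulate; a rough summary of the pairing as it stands in this patch:)

    /*
     * populate (first call):  i915_tt->filp = shmem_file_setup(...)
     * unpopulate:             pages released, filp kept for later reuse
     * i915_ttm_purge():       shmem_truncate_range(); fput(filp)
     * i915_ttm_tt_destroy():  if (i915_tt->filp) fput(i915_tt->filp)
     */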

Regards,
Oak

+}
+
 static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 					 uint32_t page_flags)
 {
 	struct ttm_resource_manager *man =
 		ttm_manager_type(bo->bdev, bo->resource->mem_type);
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+	enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
 	struct i915_ttm_tt *i915_tt;
 	int ret;
 
@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 	    man->use_tt)
 		page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
 
-	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
-			  i915_ttm_select_tt_caching(obj));
-	if (ret) {
-		kfree(i915_tt);
-		return NULL;
+	if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
+		page_flags |= TTM_TT_FLAG_EXTERNAL |
+			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
+		i915_tt->is_shmem = true;
 	}
 
+	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
+	if (ret)
+		goto err_free;
+
 	i915_tt->dev = obj->base.dev->dev;
 
 	return &i915_tt->ttm;
+
+err_free:
+	kfree(i915_tt);
+	return NULL;
+}
+
+static int i915_ttm_tt_populate(struct ttm_device *bdev,
+				struct ttm_tt *ttm,
+				struct ttm_operation_ctx *ctx)
+{
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), 
+ttm);
+
+	if (i915_tt->is_shmem)
+		return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
+
+	return ttm_pool_alloc(&bdev->pool, ttm, ctx);
 }
 
 static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)  {
 	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
 
-	if (i915_tt->cached_st) {
-		dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
-				  DMA_BIDIRECTIONAL, 0);
-		sg_free_table(i915_tt->cached_st);
-		kfree(i915_tt->cached_st);
-		i915_tt->cached_st = NULL;
+	if (i915_tt->is_shmem) {
+		i915_ttm_tt_shmem_unpopulate(ttm);
+	} else {
+		if (i915_tt->cached_st) {
+			dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
+					  DMA_BIDIRECTIONAL, 0);
+			sg_free_table(i915_tt->cached_st);
+			kfree(i915_tt->cached_st);
+			i915_tt->cached_st = NULL;
+		}
+		ttm_pool_free(&bdev->pool, ttm);
 	}
-	ttm_pool_free(&bdev->pool, ttm);
 }
 
 static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)  {
 	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
 
+	if (i915_tt->filp)
+		fput(i915_tt->filp);
+
 	ttm_tt_fini(ttm);
 	kfree(i915_tt);
 }
@@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo,  {
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
 
+	/*
+	 * EXTERNAL objects should never be swapped out by TTM, instead we need
+	 * to handle that ourselves. TTM will already skip such objects for us,
+	 * but we would like to avoid grabbing locks for no good reason.
+	 */
+	if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
+		return -EBUSY;
+
 	/* Will do for now. Our pinned objects are still on TTM's LRU lists */
 	return i915_gem_object_evictable(obj);  } @@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)
 	i915_gem_object_set_cache_coherency(obj, cache_level);  }
 
-static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
 {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 	struct ttm_operation_ctx ctx = {
 		.interruptible = true,
 		.no_wait_gpu = false,
@@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)
 	int ret;
 
 	if (obj->mm.madv == __I915_MADV_PURGED)
-		return;
+		return 0;
 
-	/* TTM's purge interface. Note that we might be reentering. */
 	ret = ttm_bo_validate(bo, &place, &ctx);
-	if (!ret) {
-		obj->write_domain = 0;
-		obj->read_domains = 0;
-		i915_ttm_adjust_gem_after_move(obj);
-		i915_ttm_free_cached_io_st(obj);
-		obj->mm.madv = __I915_MADV_PURGED;
+	if (ret)
+		return ret;
+
+	if (bo->ttm && i915_tt->filp) {
+		/*
+		 * The below fput(which eventually calls shmem_truncate) might
+		 * be delayed by worker, so when directly called to purge the
+		 * pages(like by the shrinker) we should try to be more
+		 * aggressive and release the pages immediately.
+		 */
+		shmem_truncate_range(file_inode(i915_tt->filp),
+				     0, (loff_t)-1);
+		fput(fetch_and_zero(&i915_tt->filp));
+	}
+
+	obj->write_domain = 0;
+	obj->read_domains = 0;
+	i915_ttm_adjust_gem_after_move(obj);
+	i915_ttm_free_cached_io_st(obj);
+	obj->mm.madv = __I915_MADV_PURGED;
+	return 0;
+}
+
+static void i915_ttm_purge(struct drm_i915_gem_object *obj) {
+	__i915_ttm_purge(obj);
+}
+
+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
+					   bool should_writeback)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
+	struct ttm_operation_ctx ctx = {
+		.interruptible = true,
+		.no_wait_gpu = false,
+	};
+	struct ttm_placement place = {};
+	int ret;
+
+	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
+		return 0;
+
+	GEM_BUG_ON(!i915_tt->is_shmem);
+
+	if (!i915_tt->filp)
+		return 0;
+
+	switch (obj->mm.madv) {
+	case I915_MADV_DONTNEED:
+		return __i915_ttm_purge(obj);
+	case __I915_MADV_PURGED:
+		return 0;
+	}
+
+	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		return 0;
+
+	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
+	ret = ttm_bo_validate(bo, &place, &ctx);
+	if (ret) {
+		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+		return ret;
 	}
+
+	if (should_writeback)
+		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
+
+	return 0;
 }
 
 static void i915_ttm_swap_notify(struct ttm_buffer_object *bo)
@@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
 
 static struct ttm_device_funcs i915_ttm_bo_driver = {
 	.ttm_tt_create = i915_ttm_tt_create,
+	.ttm_tt_populate = i915_ttm_tt_populate,
 	.ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
 	.ttm_tt_destroy = i915_ttm_tt_destroy,
 	.eviction_valuable = i915_ttm_eviction_valuable,
@@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_object_has_pages(obj)) {
+		struct i915_ttm_tt *i915_tt =
+			container_of(bo->ttm, typeof(*i915_tt), ttm);
+
 		/* Object either has a page vector or is an iomem object */
 		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
 		if (IS_ERR(st))
 			return PTR_ERR(st);
 
 		__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
+		if (!bo->ttm || !i915_tt->is_shmem)
+			i915_gem_object_make_unshrinkable(obj);
 	}
 
 	return ret;
@@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
 static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 
 	/*
 	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
@@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 	 * Put on the correct LRU list depending on the MADV status
 	 */
 	spin_lock(&bo->bdev->lru_lock);
-	if (obj->mm.madv != I915_MADV_WILLNEED) {
+	if (bo->ttm && i915_tt->filp) {
+		/* Try to keep shmem_tt from being considered for shrinking. */
+		bo->priority = TTM_MAX_BO_PRIORITY - 1;
+	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
 		bo->priority = I915_TTM_PRIO_PURGE;
 	} else if (!i915_gem_object_has_pages(obj)) {
 		if (bo->priority < I915_TTM_PRIO_HAS_PAGES)
@@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
 	.get_pages = i915_ttm_get_pages,
 	.put_pages = i915_ttm_put_pages,
 	.truncate = i915_ttm_purge,
+	.shrinker_release_pages = i915_ttm_shrinker_release_pages,
+
 	.adjust_lru = i915_ttm_adjust_lru,
 	.delayed_free = i915_ttm_delayed_free,
 	.migrate = i915_ttm_migrate,
+
 	.mmap_offset = i915_ttm_mmap_offset,
 	.mmap_ops = &vm_ops_ttm,
 };
@@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
 	i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
 	i915_gem_object_init_memory_region(obj, mem);
-	i915_gem_object_make_unshrinkable(obj);
 	INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
 	mutex_init(&obj->ttm.get_io_page.lock);
 	bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
--
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* RE: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
@ 2021-10-05  2:05     ` Zeng, Oak
  0 siblings, 0 replies; 60+ messages in thread
From: Zeng, Oak @ 2021-10-05  2:05 UTC (permalink / raw)
  To: Auld, Matthew, intel-gfx
  Cc: dri-devel, Thomas Hellström, Christian König

Hi Matthew/Thomas,

See one question inline

Regards,
Oak

-----Original Message-----
From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Matthew Auld
Sent: September 27, 2021 7:41 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org; Thomas Hellström <thomas.hellstrom@linux.intel.com>; Christian König <christian.koenig@amd.com>
Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This should make it possible (in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
    to check if the object is even still shrinkable; in between us
    dropping the shrinker LRU lock and acquiring the object lock it could for
    example have been moved. Also we need to differentiate between
    "lazy" shrinking and the immediate writeback mode. Also later we need to
    handle objects which don't even have mm.pages, so bundling this into
    put_pages() would require somehow handling that edge case, hence
    just letting the ttm backend handle everything in try_to_writeback
    doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
    since it's not possible to hold a reference, without also creating
    circular dependency, so likely this is too fragile. For now just
    ensure we at least mark the pages as dirty/accessed when called from the
    shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
    than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
    calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
    these just get skipped anyway. We can try to come up with something
    better later.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++--
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3043fcbd31bd..1c9a1d8d3434 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,  bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
 					enum intel_memory_type type);
 
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index fa2ba9e2a4d0..f0fb17be2f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
 			  struct sg_table *pages);
 	void (*truncate)(struct drm_i915_gem_object *obj);
 	void (*writeback)(struct drm_i915_gem_object *obj);
+	int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+				      bool should_writeback);
 
 	int (*pread)(struct drm_i915_gem_object *obj,
 		     const struct drm_i915_gem_pread *arg); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 36b711ae9e28..19e55cc29a15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
 	cond_resched();
 }
 
-static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
-			  bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup)
 {
 	struct sgt_iter sgt_iter;
 	struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
 	kfree(st);
 }
 
-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
-				       size_t size, struct intel_memory_region *mr,
-				       struct address_space *mapping,
-				       unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment)
 {
 	const unsigned long page_count = size / PAGE_SIZE;
 	unsigned long i;
@@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
 	obj->mm.pages = ERR_PTR(-EFAULT);
 }
 
-static void __shmem_writeback(size_t size, struct address_space *mapping)
+void __shmem_writeback(size_t size, struct address_space *mapping)
 {
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index e382b7f2353b..cc80bd23d323 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
 	return false;
 }
 
-static void try_to_writeback(struct drm_i915_gem_object *obj,
-			     unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned 
+int flags)
 {
+	if (obj->ops->shrinker_release_pages)
+		return obj->ops->shrinker_release_pages(obj,
+							flags & I915_SHRINK_WRITEBACK);
+
 	switch (obj->mm.madv) {
 	case I915_MADV_DONTNEED:
 		i915_gem_object_truncate(obj);
-		return;
+		return 0;
 	case __I915_MADV_PURGED:
-		return;
+		return 0;
 	}
 
 	if (flags & I915_SHRINK_WRITEBACK)
 		i915_gem_object_writeback(obj);
+
+	return 0;
 }
 
 /**
@@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 				}
 
 				if (!__i915_gem_object_put_pages(obj)) {
-					try_to_writeback(obj, shrink);
-					count += obj->base.size >> PAGE_SHIFT;
+					if (!try_to_writeback(obj, shrink))
+						count += obj->base.size >> PAGE_SHIFT;
 				}
 				if (!ww)
 					i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index a77e90f300fe..c7402995a8f9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -35,6 +35,8 @@
  * @ttm: The base TTM page vector.
  * @dev: The struct device used for dma mapping and unmapping.
  * @cached_st: The cached scatter-gather table.
+ * @is_shmem: Set if using shmem.
+ * @filp: The shmem file, if using shmem backend.
  *
  * Note that DMA may be going on right up to the point where the page-
  * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6 +48,9 @@ struct i915_ttm_tt {
 	struct ttm_tt ttm;
 	struct device *dev;
 	struct sg_table *cached_st;
+
+	bool is_shmem;
+	struct file *filp;
 };
 
 static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
 	placement->busy_placement = busy;
 }
 
+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
+				      struct ttm_tt *ttm,
+				      struct ttm_operation_ctx *ctx) {
+	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
+	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	const unsigned int max_segment = i915_sg_segment_size();
+	const size_t size = ttm->num_pages << PAGE_SHIFT;
+	struct file *filp = i915_tt->filp;
+	struct sgt_iter sgt_iter;
+	struct sg_table *st;
+	struct page *page;
+	unsigned long i;
+	int err;
+
+	if (!filp) {
+		struct address_space *mapping;
+		gfp_t mask;
+
+		filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
+		if (IS_ERR(filp))
+			return PTR_ERR(filp);
+
+		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+
+		mapping = filp->f_mapping;
+		mapping_set_gfp_mask(mapping, mask);
+		GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
+
+		i915_tt->filp = filp;
+	}
+
+	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
+	if (IS_ERR(st))
+		return PTR_ERR(st);
+
+	err = dma_map_sg_attrs(i915_tt->dev,
+			       st->sgl, st->nents,
+			       PCI_DMA_BIDIRECTIONAL,
+			       DMA_ATTR_SKIP_CPU_SYNC |
+			       DMA_ATTR_NO_KERNEL_MAPPING |
+			       DMA_ATTR_NO_WARN);
+	if (err <= 0) {
+		err = -EINVAL;
+		goto err_free_st;
+	}
+
+	i = 0;
+	for_each_sgt_page(page, sgt_iter, st)
+		ttm->pages[i++] = page;
+
+	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+
+	i915_tt->cached_st = st;
+	return 0;
+
+err_free_st:
+	shmem_free_st(st, filp->f_mapping, false, false);
+	return err;
+}
+
+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
+
+	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
+		     i915_tt->cached_st->nents,
+		     PCI_DMA_BIDIRECTIONAL);
+
+	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
+		      file_inode(i915_tt->filp)->i_mapping,
+		      backup, backup);

Should we do something to undo the shmem_file_setup() operation here? From its implementation it takes a reference on the inode and allocates a struct file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084

Regards,
Oak
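
(To make the question concrete, here is a rough sketch of what shmem_file_setup() leaves behind and what the matching release would look like; this is illustrative only, not code taken from the patch:)

/*
 * shmem_file_setup() allocates a new, unlinked shmem inode and wraps it
 * in a freshly allocated struct file that holds a reference to it.  The
 * matching undo is a single fput(); the final reference drop also
 * releases the inode and any pages still attached to its mapping.
 */
struct file *filp;

filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);	/* size: backing size in bytes */
if (IS_ERR(filp))
	return PTR_ERR(filp);

/* ... allocate pages from filp->f_mapping and map them for DMA ... */

fput(filp);	/* must happen exactly once on some teardown path */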

+}
+
 static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 					 uint32_t page_flags)
 {
 	struct ttm_resource_manager *man =
 		ttm_manager_type(bo->bdev, bo->resource->mem_type);
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+	enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
 	struct i915_ttm_tt *i915_tt;
 	int ret;
 
@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 	    man->use_tt)
 		page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
 
-	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
-			  i915_ttm_select_tt_caching(obj));
-	if (ret) {
-		kfree(i915_tt);
-		return NULL;
+	if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
+		page_flags |= TTM_TT_FLAG_EXTERNAL |
+			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
+		i915_tt->is_shmem = true;
 	}
 
+	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
+	if (ret)
+		goto err_free;
+
 	i915_tt->dev = obj->base.dev->dev;
 
 	return &i915_tt->ttm;
+
+err_free:
+	kfree(i915_tt);
+	return NULL;
+}
+
+static int i915_ttm_tt_populate(struct ttm_device *bdev,
+				struct ttm_tt *ttm,
+				struct ttm_operation_ctx *ctx)
+{
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), 
+ttm);
+
+	if (i915_tt->is_shmem)
+		return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
+
+	return ttm_pool_alloc(&bdev->pool, ttm, ctx);
 }
 
 static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)  {
 	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
 
-	if (i915_tt->cached_st) {
-		dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
-				  DMA_BIDIRECTIONAL, 0);
-		sg_free_table(i915_tt->cached_st);
-		kfree(i915_tt->cached_st);
-		i915_tt->cached_st = NULL;
+	if (i915_tt->is_shmem) {
+		i915_ttm_tt_shmem_unpopulate(ttm);
+	} else {
+		if (i915_tt->cached_st) {
+			dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
+					  DMA_BIDIRECTIONAL, 0);
+			sg_free_table(i915_tt->cached_st);
+			kfree(i915_tt->cached_st);
+			i915_tt->cached_st = NULL;
+		}
+		ttm_pool_free(&bdev->pool, ttm);
 	}
-	ttm_pool_free(&bdev->pool, ttm);
 }
 
 static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)  {
 	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
 
+	if (i915_tt->filp)
+		fput(i915_tt->filp);
+
 	ttm_tt_fini(ttm);
 	kfree(i915_tt);
 }
@@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo,  {
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
 
+	/*
+	 * EXTERNAL objects should never be swapped out by TTM, instead we need
+	 * to handle that ourselves. TTM will already skip such objects for us,
+	 * but we would like to avoid grabbing locks for no good reason.
+	 */
+	if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
+		return -EBUSY;
+
 	/* Will do for now. Our pinned objects are still on TTM's LRU lists */
 	return i915_gem_object_evictable(obj);  } @@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)
 	i915_gem_object_set_cache_coherency(obj, cache_level);  }
 
-static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
 {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 	struct ttm_operation_ctx ctx = {
 		.interruptible = true,
 		.no_wait_gpu = false,
@@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)
 	int ret;
 
 	if (obj->mm.madv == __I915_MADV_PURGED)
-		return;
+		return 0;
 
-	/* TTM's purge interface. Note that we might be reentering. */
 	ret = ttm_bo_validate(bo, &place, &ctx);
-	if (!ret) {
-		obj->write_domain = 0;
-		obj->read_domains = 0;
-		i915_ttm_adjust_gem_after_move(obj);
-		i915_ttm_free_cached_io_st(obj);
-		obj->mm.madv = __I915_MADV_PURGED;
+	if (ret)
+		return ret;
+
+	if (bo->ttm && i915_tt->filp) {
+		/*
+		 * The below fput(which eventually calls shmem_truncate) might
+		 * be delayed by worker, so when directly called to purge the
+		 * pages(like by the shrinker) we should try to be more
+		 * aggressive and release the pages immediately.
+		 */
+		shmem_truncate_range(file_inode(i915_tt->filp),
+				     0, (loff_t)-1);
+		fput(fetch_and_zero(&i915_tt->filp));
+	}
+
+	obj->write_domain = 0;
+	obj->read_domains = 0;
+	i915_ttm_adjust_gem_after_move(obj);
+	i915_ttm_free_cached_io_st(obj);
+	obj->mm.madv = __I915_MADV_PURGED;
+	return 0;
+}
+
+static void i915_ttm_purge(struct drm_i915_gem_object *obj) {
+	__i915_ttm_purge(obj);
+}
+
+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
+					   bool should_writeback)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
+	struct ttm_operation_ctx ctx = {
+		.interruptible = true,
+		.no_wait_gpu = false,
+	};
+	struct ttm_placement place = {};
+	int ret;
+
+	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
+		return 0;
+
+	GEM_BUG_ON(!i915_tt->is_shmem);
+
+	if (!i915_tt->filp)
+		return 0;
+
+	switch (obj->mm.madv) {
+	case I915_MADV_DONTNEED:
+		return __i915_ttm_purge(obj);
+	case __I915_MADV_PURGED:
+		return 0;
+	}
+
+	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		return 0;
+
+	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
+	ret = ttm_bo_validate(bo, &place, &ctx);
+	if (ret) {
+		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+		return ret;
 	}
+
+	if (should_writeback)
+		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
+
+	return 0;
 }
 
 static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
 
 static struct ttm_device_funcs i915_ttm_bo_driver = {
 	.ttm_tt_create = i915_ttm_tt_create,
+	.ttm_tt_populate = i915_ttm_tt_populate,
 	.ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
 	.ttm_tt_destroy = i915_ttm_tt_destroy,
 	.eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
 	}
 
 	if (!i915_gem_object_has_pages(obj)) {
+		struct i915_ttm_tt *i915_tt =
+			container_of(bo->ttm, typeof(*i915_tt), ttm);
+
 		/* Object either has a page vector or is an iomem object */
 		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
 		if (IS_ERR(st))
 			return PTR_ERR(st);
 
 		__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
+		if (!bo->ttm || !i915_tt->is_shmem)
+			i915_gem_object_make_unshrinkable(obj);
 	}
 
 	return ret;
@@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,  static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)  {
 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
 
 	/*
 	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
@@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 	 * Put on the correct LRU list depending on the MADV status
 	 */
 	spin_lock(&bo->bdev->lru_lock);
-	if (obj->mm.madv != I915_MADV_WILLNEED) {
+	if (bo->ttm && i915_tt->filp) {
+		/* Try to keep shmem_tt from being considered for shrinking. */
+		bo->priority = TTM_MAX_BO_PRIORITY - 1;
+	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
 		bo->priority = I915_TTM_PRIO_PURGE;
 	} else if (!i915_gem_object_has_pages(obj)) {
 		if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
 	.get_pages = i915_ttm_get_pages,
 	.put_pages = i915_ttm_put_pages,
 	.truncate = i915_ttm_purge,
+	.shrinker_release_pages = i915_ttm_shrinker_release_pages,
+
 	.adjust_lru = i915_ttm_adjust_lru,
 	.delayed_free = i915_ttm_delayed_free,
 	.migrate = i915_ttm_migrate,
+
 	.mmap_offset = i915_ttm_mmap_offset,
 	.mmap_ops = &vm_ops_ttm,
 };
@@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
 	i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
 	i915_gem_object_init_memory_region(obj, mem);
-	i915_gem_object_make_unshrinkable(obj);
 	INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
 	mutex_init(&obj->ttm.get_io_page.lock);
 	bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
--
2.26.3


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access (rev2)
  2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
                   ` (17 preceding siblings ...)
  (?)
@ 2021-10-05  2:17 ` Patchwork
  -1 siblings, 0 replies; 60+ messages in thread
From: Patchwork @ 2021-10-05  2:17 UTC (permalink / raw)
  To: Zeng, Oak; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access (rev2)
URL   : https://patchwork.freedesktop.org/series/95093/
State : failure

== Summary ==

Applying: drm/ttm: stop calling tt_swapin in vm_access
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/ttm/ttm_bo_vm.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/ttm/ttm_bo_vm.c
No changes -- Patch already applied.
Applying: drm/ttm: stop setting page->index for the ttm_tt
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/ttm/ttm_bo_vm.c
M	drivers/gpu/drm/ttm/ttm_tt.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/ttm/ttm_tt.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/ttm/ttm_tt.c
Auto-merging drivers/gpu/drm/ttm/ttm_bo_vm.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 drm/ttm: stop setting page->index for the ttm_tt
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
  2021-10-05  2:05     ` Zeng, Oak
  (?)
@ 2021-10-05 13:48     ` Thomas Hellström
  2021-10-05 14:23         ` Zeng, Oak
  -1 siblings, 1 reply; 60+ messages in thread
From: Thomas Hellström @ 2021-10-05 13:48 UTC (permalink / raw)
  To: Zeng, Oak, Auld, Matthew, intel-gfx; +Cc: dri-devel, Christian König


On 10/5/21 04:05, Zeng, Oak wrote:
> Hi Matthew/Thomas,
>
> See one question inline
>
> Regards,
> Oak
>
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Matthew Auld
> Sent: September 27, 2021 7:41 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org; Thomas Hellström <thomas.hellstrom@linux.intel.com>; Christian König <christian.koenig@amd.com>
> Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
>
> For cached objects we can allocate our pages directly in shmem. This should make it possible(in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.
>
> v2(Thomas):
>    - Add optional try_to_writeback hook for objects. Importantly we need
>      to check if the object is even still shrinkable; in between us
>      dropping the shrinker LRU lock and acquiring the object lock it could for
>      example have been moved. Also we need to differentiate between
>      "lazy" shrinking and the immediate writeback mode. Also later we need to
>      handle objects which don't even have mm.pages, so bundling this into
>      put_pages() would require somehow handling that edge case, hence
>      just letting the ttm backend handle everything in try_to_writeback
>      doesn't seem too bad.
> v3(Thomas):
>    - Likely a bad idea to touch the object from the unpopulate hook,
>      since it's not possible to hold a reference, without also creating
>      circular dependency, so likely this is too fragile. For now just
>      ensure we at least mark the pages as dirty/accessed when called from the
>      shrinker on WILLNEED objects.
>    - s/try_to_writeback/shrinker_release_pages, since this can do more
>      than just writeback.
>    - Get rid of do_backup boolean and just set the SWAPPED flag prior to
>      calling unpopulate.
>    - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
>      these just get skipped anyway. We can try to come up with something
>      better later.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
>   .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
>   drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
>   drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++--
>   5 files changed, 245 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 3043fcbd31bd..1c9a1d8d3434 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,  bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
>   					enum intel_memory_type type);
>   
> +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> +				size_t size, struct intel_memory_region *mr,
> +				struct address_space *mapping,
> +				unsigned int max_segment);
> +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> +		   bool dirty, bool backup);
> +void __shmem_writeback(size_t size, struct address_space *mapping);
> +
>   #ifdef CONFIG_MMU_NOTIFIER
>   static inline bool
>   i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index fa2ba9e2a4d0..f0fb17be2f7a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
>   			  struct sg_table *pages);
>   	void (*truncate)(struct drm_i915_gem_object *obj);
>   	void (*writeback)(struct drm_i915_gem_object *obj);
> +	int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
> +				      bool should_writeback);
>   
>   	int (*pread)(struct drm_i915_gem_object *obj,
>   		     const struct drm_i915_gem_pread *arg); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index 36b711ae9e28..19e55cc29a15 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
>   	cond_resched();
>   }
>   
> -static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> -			  bool dirty, bool backup)
> +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> +		   bool dirty, bool backup)
>   {
>   	struct sgt_iter sgt_iter;
>   	struct pagevec pvec;
> @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
>   	kfree(st);
>   }
>   
> -static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> -				       size_t size, struct intel_memory_region *mr,
> -				       struct address_space *mapping,
> -				       unsigned int max_segment)
> +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> +				size_t size, struct intel_memory_region *mr,
> +				struct address_space *mapping,
> +				unsigned int max_segment)
>   {
>   	const unsigned long page_count = size / PAGE_SIZE;
>   	unsigned long i;
> @@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
>   	obj->mm.pages = ERR_PTR(-EFAULT);
>   }
>   
> -static void __shmem_writeback(size_t size, struct address_space *mapping)
> +void __shmem_writeback(size_t size, struct address_space *mapping)
>   {
>   	struct writeback_control wbc = {
>   		.sync_mode = WB_SYNC_NONE,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> index e382b7f2353b..cc80bd23d323 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
>   	return false;
>   }
>   
> -static void try_to_writeback(struct drm_i915_gem_object *obj,
> -			     unsigned int flags)
> +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned
> +int flags)
>   {
> +	if (obj->ops->shrinker_release_pages)
> +		return obj->ops->shrinker_release_pages(obj,
> +							flags & I915_SHRINK_WRITEBACK);
> +
>   	switch (obj->mm.madv) {
>   	case I915_MADV_DONTNEED:
>   		i915_gem_object_truncate(obj);
> -		return;
> +		return 0;
>   	case __I915_MADV_PURGED:
> -		return;
> +		return 0;
>   	}
>   
>   	if (flags & I915_SHRINK_WRITEBACK)
>   		i915_gem_object_writeback(obj);
> +
> +	return 0;
>   }
>   
>   /**
> @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
>   				}
>   
>   				if (!__i915_gem_object_put_pages(obj)) {
> -					try_to_writeback(obj, shrink);
> -					count += obj->base.size >> PAGE_SHIFT;
> +					if (!try_to_writeback(obj, shrink))
> +						count += obj->base.size >> PAGE_SHIFT;
>   				}
>   				if (!ww)
>   					i915_gem_object_unlock(obj);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index a77e90f300fe..c7402995a8f9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -35,6 +35,8 @@
>    * @ttm: The base TTM page vector.
>    * @dev: The struct device used for dma mapping and unmapping.
>    * @cached_st: The cached scatter-gather table.
> + * @is_shmem: Set if using shmem.
> + * @filp: The shmem file, if using shmem backend.
>    *
>    * Note that DMA may be going on right up to the point where the page-
>    * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6 +48,9 @@ struct i915_ttm_tt {
>   	struct ttm_tt ttm;
>   	struct device *dev;
>   	struct sg_table *cached_st;
> +
> +	bool is_shmem;
> +	struct file *filp;
>   };
>   
>   static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
>   	placement->busy_placement = busy;
>   }
>   
> +static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
> +				      struct ttm_tt *ttm,
> +				      struct ttm_operation_ctx *ctx) {
> +	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
> +	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
> +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
> +	const unsigned int max_segment = i915_sg_segment_size();
> +	const size_t size = ttm->num_pages << PAGE_SHIFT;
> +	struct file *filp = i915_tt->filp;
> +	struct sgt_iter sgt_iter;
> +	struct sg_table *st;
> +	struct page *page;
> +	unsigned long i;
> +	int err;
> +
> +	if (!filp) {
> +		struct address_space *mapping;
> +		gfp_t mask;
> +
> +		filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
> +		if (IS_ERR(filp))
> +			return PTR_ERR(filp);
> +
> +		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> +
> +		mapping = filp->f_mapping;
> +		mapping_set_gfp_mask(mapping, mask);
> +		GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
> +
> +		i915_tt->filp = filp;
> +	}
> +
> +	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
> +	if (IS_ERR(st))
> +		return PTR_ERR(st);
> +
> +	err = dma_map_sg_attrs(i915_tt->dev,
> +			       st->sgl, st->nents,
> +			       PCI_DMA_BIDIRECTIONAL,
> +			       DMA_ATTR_SKIP_CPU_SYNC |
> +			       DMA_ATTR_NO_KERNEL_MAPPING |
> +			       DMA_ATTR_NO_WARN);
> +	if (err <= 0) {
> +		err = -EINVAL;
> +		goto err_free_st;
> +	}
> +
> +	i = 0;
> +	for_each_sgt_page(page, sgt_iter, st)
> +		ttm->pages[i++] = page;
> +
> +	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> +		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> +
> +	i915_tt->cached_st = st;
> +	return 0;
> +
> +err_free_st:
> +	shmem_free_st(st, filp->f_mapping, false, false);
> +	return err;
> +}
> +
> +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {
> +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
> +	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
> +
> +	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
> +		     i915_tt->cached_st->nents,
> +		     PCI_DMA_BIDIRECTIONAL);
> +
> +	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
> +		      file_inode(i915_tt->filp)->i_mapping,
> +		      backup, backup);
>
> Should we do something to undo the shmem_file_setup() operation here? From its implementation it takes a reference on the inode and allocates a struct file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084
>
> Regards,
> Oak

Hi, Oak,

That's done in i915_ttm_tt_destroy() afaict.
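
(For convenience, the relevant part of the patch, i.e. the destroy hook that drops the file reference taken when the shmem backing was set up:)

static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)
{
	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

	if (i915_tt->filp)
		fput(i915_tt->filp);

	ttm_tt_fini(ttm);
	kfree(i915_tt);
}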

/Thomas



>
> +}
> +
>   static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
>   					 uint32_t page_flags)
>   {
>   	struct ttm_resource_manager *man =
>   		ttm_manager_type(bo->bdev, bo->resource->mem_type);
>   	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> +	enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
>   	struct i915_ttm_tt *i915_tt;
>   	int ret;
>   
> @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
>   	    man->use_tt)
>   		page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
>   
> -	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
> -			  i915_ttm_select_tt_caching(obj));
> -	if (ret) {
> -		kfree(i915_tt);
> -		return NULL;
> +	if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
> +		page_flags |= TTM_TT_FLAG_EXTERNAL |
> +			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
> +		i915_tt->is_shmem = true;
>   	}
>   
> +	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
> +	if (ret)
> +		goto err_free;
> +
>   	i915_tt->dev = obj->base.dev->dev;
>   
>   	return &i915_tt->ttm;
> +
> +err_free:
> +	kfree(i915_tt);
> +	return NULL;
> +}
> +
> +static int i915_ttm_tt_populate(struct ttm_device *bdev,
> +				struct ttm_tt *ttm,
> +				struct ttm_operation_ctx *ctx)
> +{
> +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> +ttm);
> +
> +	if (i915_tt->is_shmem)
> +		return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
> +
> +	return ttm_pool_alloc(&bdev->pool, ttm, ctx);
>   }
>   
>   static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)  {
>   	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
>   
> -	if (i915_tt->cached_st) {
> -		dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
> -				  DMA_BIDIRECTIONAL, 0);
> -		sg_free_table(i915_tt->cached_st);
> -		kfree(i915_tt->cached_st);
> -		i915_tt->cached_st = NULL;
> +	if (i915_tt->is_shmem) {
> +		i915_ttm_tt_shmem_unpopulate(ttm);
> +	} else {
> +		if (i915_tt->cached_st) {
> +			dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
> +					  DMA_BIDIRECTIONAL, 0);
> +			sg_free_table(i915_tt->cached_st);
> +			kfree(i915_tt->cached_st);
> +			i915_tt->cached_st = NULL;
> +		}
> +		ttm_pool_free(&bdev->pool, ttm);
>   	}
> -	ttm_pool_free(&bdev->pool, ttm);
>   }
>   
>   static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)  {
>   	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
>   
> +	if (i915_tt->filp)
> +		fput(i915_tt->filp);
> +
>   	ttm_tt_fini(ttm);
>   	kfree(i915_tt);
>   }
> @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo,  {
>   	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
>   
> +	/*
> +	 * EXTERNAL objects should never be swapped out by TTM, instead we need
> +	 * to handle that ourselves. TTM will already skip such objects for us,
> +	 * but we would like to avoid grabbing locks for no good reason.
> +	 */
> +	if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
> +		return -EBUSY;
> +
>   	/* Will do for now. Our pinned objects are still on TTM's LRU lists */
>   	return i915_gem_object_evictable(obj);  } @@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)
>   	i915_gem_object_set_cache_coherency(obj, cache_level);  }
>   
> -static void i915_ttm_purge(struct drm_i915_gem_object *obj)
> +static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
>   {
>   	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> +	struct i915_ttm_tt *i915_tt =
> +		container_of(bo->ttm, typeof(*i915_tt), ttm);
>   	struct ttm_operation_ctx ctx = {
>   		.interruptible = true,
>   		.no_wait_gpu = false,
> @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)
>   	int ret;
>   
>   	if (obj->mm.madv == __I915_MADV_PURGED)
> -		return;
> +		return 0;
>   
> -	/* TTM's purge interface. Note that we might be reentering. */
>   	ret = ttm_bo_validate(bo, &place, &ctx);
> -	if (!ret) {
> -		obj->write_domain = 0;
> -		obj->read_domains = 0;
> -		i915_ttm_adjust_gem_after_move(obj);
> -		i915_ttm_free_cached_io_st(obj);
> -		obj->mm.madv = __I915_MADV_PURGED;
> +	if (ret)
> +		return ret;
> +
> +	if (bo->ttm && i915_tt->filp) {
> +		/*
> +		 * The below fput(which eventually calls shmem_truncate) might
> +		 * be delayed by worker, so when directly called to purge the
> +		 * pages(like by the shrinker) we should try to be more
> +		 * aggressive and release the pages immediately.
> +		 */
> +		shmem_truncate_range(file_inode(i915_tt->filp),
> +				     0, (loff_t)-1);
> +		fput(fetch_and_zero(&i915_tt->filp));
> +	}
> +
> +	obj->write_domain = 0;
> +	obj->read_domains = 0;
> +	i915_ttm_adjust_gem_after_move(obj);
> +	i915_ttm_free_cached_io_st(obj);
> +	obj->mm.madv = __I915_MADV_PURGED;
> +	return 0;
> +}
> +
> +static void i915_ttm_purge(struct drm_i915_gem_object *obj) {
> +	__i915_ttm_purge(obj);
> +}
> +
> +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
> +					   bool should_writeback)
> +{
> +	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> +	struct i915_ttm_tt *i915_tt =
> +		container_of(bo->ttm, typeof(*i915_tt), ttm);
> +	struct ttm_operation_ctx ctx = {
> +		.interruptible = true,
> +		.no_wait_gpu = false,
> +	};
> +	struct ttm_placement place = {};
> +	int ret;
> +
> +	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
> +		return 0;
> +
> +	GEM_BUG_ON(!i915_tt->is_shmem);
> +
> +	if (!i915_tt->filp)
> +		return 0;
> +
> +	switch (obj->mm.madv) {
> +	case I915_MADV_DONTNEED:
> +		return __i915_ttm_purge(obj);
> +	case __I915_MADV_PURGED:
> +		return 0;
> +	}
> +
> +	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> +		return 0;
> +
> +	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
> +	ret = ttm_bo_validate(bo, &place, &ctx);
> +	if (ret) {
> +		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> +		return ret;
>   	}
> +
> +	if (should_writeback)
> +		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
> +
> +	return 0;
>   }
>   
>   static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
>   
>   static struct ttm_device_funcs i915_ttm_bo_driver = {
>   	.ttm_tt_create = i915_ttm_tt_create,
> +	.ttm_tt_populate = i915_ttm_tt_populate,
>   	.ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
>   	.ttm_tt_destroy = i915_ttm_tt_destroy,
>   	.eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
>   	}
>   
>   	if (!i915_gem_object_has_pages(obj)) {
> +		struct i915_ttm_tt *i915_tt =
> +			container_of(bo->ttm, typeof(*i915_tt), ttm);
> +
>   		/* Object either has a page vector or is an iomem object */
>   		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
>   		if (IS_ERR(st))
>   			return PTR_ERR(st);
>   
>   		__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
> +		if (!bo->ttm || !i915_tt->is_shmem)
> +			i915_gem_object_make_unshrinkable(obj);
>   	}
>   
>   	return ret;
> @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,  static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)  {
>   	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> +	struct i915_ttm_tt *i915_tt =
> +		container_of(bo->ttm, typeof(*i915_tt), ttm);
>   
>   	/*
>   	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
> @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
>   	 * Put on the correct LRU list depending on the MADV status
>   	 */
>   	spin_lock(&bo->bdev->lru_lock);
> -	if (obj->mm.madv != I915_MADV_WILLNEED) {
> +	if (bo->ttm && i915_tt->filp) {
> +		/* Try to keep shmem_tt from being considered for shrinking. */
> +		bo->priority = TTM_MAX_BO_PRIORITY - 1;
> +	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
>   		bo->priority = I915_TTM_PRIO_PURGE;
>   	} else if (!i915_gem_object_has_pages(obj)) {
>   		if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
>   	.get_pages = i915_ttm_get_pages,
>   	.put_pages = i915_ttm_put_pages,
>   	.truncate = i915_ttm_purge,
> +	.shrinker_release_pages = i915_ttm_shrinker_release_pages,
> +
>   	.adjust_lru = i915_ttm_adjust_lru,
>   	.delayed_free = i915_ttm_delayed_free,
>   	.migrate = i915_ttm_migrate,
> +
>   	.mmap_offset = i915_ttm_mmap_offset,
>   	.mmap_ops = &vm_ops_ttm,
>   };
> @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
>   	drm_gem_private_object_init(&i915->drm, &obj->base, size);
>   	i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
>   	i915_gem_object_init_memory_region(obj, mem);
> -	i915_gem_object_make_unshrinkable(obj);
>   	INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
>   	mutex_init(&obj->ttm.get_io_page.lock);
>   	bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
> --
> 2.26.3
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
  2021-10-05 13:48     ` Thomas Hellström
@ 2021-10-05 14:23         ` Zeng, Oak
  0 siblings, 0 replies; 60+ messages in thread
From: Zeng, Oak @ 2021-10-05 14:23 UTC (permalink / raw)
  To: Thomas Hellström, Auld, Matthew, intel-gfx
  Cc: dri-devel, Christian König



Regards,
Oak

> -----Original Message-----
> From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Sent: October 5, 2021 9:48 AM
> To: Zeng, Oak <oak.zeng@intel.com>; Auld, Matthew
> <matthew.auld@intel.com>; intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org; Christian König
> <christian.koenig@amd.com>
> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> backend
> 
> 
> On 10/5/21 04:05, Zeng, Oak wrote:
> > Hi Matthew/Thomas,
> >
> > See one question inline
> >
> > Regards,
> > Oak
> >
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Matthew Auld
> > Sent: September 27, 2021 7:41 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org; Thomas Hellström
> <thomas.hellstrom@linux.intel.com>; Christian König
> <christian.koenig@amd.com>
> > Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
> >
> > For cached objects we can allocate our pages directly in shmem. This should
> make it possible(in a later patch) to utilise the existing i915-gem shrinker code
> for such objects. For now this is still disabled.
> >
> > v2(Thomas):
> >    - Add optional try_to_writeback hook for objects. Importantly we need
> >      to check if the object is even still shrinkable; in between us
> >      dropping the shrinker LRU lock and acquiring the object lock it could for
> >      example have been moved. Also we need to differentiate between
> >      "lazy" shrinking and the immediate writeback mode. Also later we need
> to
> >      handle objects which don't even have mm.pages, so bundling this into
> >      put_pages() would require somehow handling that edge case, hence
> >      just letting the ttm backend handle everything in try_to_writeback
> >      doesn't seem too bad.
> > v3(Thomas):
> >    - Likely a bad idea to touch the object from the unpopulate hook,
> >      since it's not possible to hold a reference, without also creating
> >      circular dependency, so likely this is too fragile. For now just
> >      ensure we at least mark the pages as dirty/accessed when called from the
> >      shrinker on WILLNEED objects.
> >    - s/try_to_writeback/shrinker_release_pages, since this can do more
> >      than just writeback.
> >    - Get rid of do_backup boolean and just set the SWAPPED flag prior to
> >      calling unpopulate.
> >    - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
> since
> >      these just get skipped anyway. We can try to come up with something
> >      better later.
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
> >   .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
> >   drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++-
> -
> >   5 files changed, 245 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > index 3043fcbd31bd..1c9a1d8d3434 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct
> drm_i915_gem_object *obj,  bool
> i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
> >   					enum intel_memory_type type);
> >
> > +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> > +				size_t size, struct intel_memory_region *mr,
> > +				struct address_space *mapping,
> > +				unsigned int max_segment);
> > +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> > +		   bool dirty, bool backup);
> > +void __shmem_writeback(size_t size, struct address_space *mapping);
> > +
> >   #ifdef CONFIG_MMU_NOTIFIER
> >   static inline bool
> >   i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git
> a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > index fa2ba9e2a4d0..f0fb17be2f7a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
> >   			  struct sg_table *pages);
> >   	void (*truncate)(struct drm_i915_gem_object *obj);
> >   	void (*writeback)(struct drm_i915_gem_object *obj);
> > +	int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
> > +				      bool should_writeback);
> >
> >   	int (*pread)(struct drm_i915_gem_object *obj,
> >   		     const struct drm_i915_gem_pread *arg); diff --git
> a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > index 36b711ae9e28..19e55cc29a15 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec
> *pvec)
> >   	cond_resched();
> >   }
> >
> > -static void shmem_free_st(struct sg_table *st, struct address_space
> *mapping,
> > -			  bool dirty, bool backup)
> > +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> > +		   bool dirty, bool backup)
> >   {
> >   	struct sgt_iter sgt_iter;
> >   	struct pagevec pvec;
> > @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct
> address_space *mapping,
> >   	kfree(st);
> >   }
> >
> > -static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> > -				       size_t size, struct intel_memory_region
> *mr,
> > -				       struct address_space *mapping,
> > -				       unsigned int max_segment)
> > +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> > +				size_t size, struct intel_memory_region *mr,
> > +				struct address_space *mapping,
> > +				unsigned int max_segment)
> >   {
> >   	const unsigned long page_count = size / PAGE_SIZE;
> >   	unsigned long i;
> > @@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
> >   	obj->mm.pages = ERR_PTR(-EFAULT);
> >   }
> >
> > -static void __shmem_writeback(size_t size, struct address_space
> *mapping)
> > +void __shmem_writeback(size_t size, struct address_space *mapping)
> >   {
> >   	struct writeback_control wbc = {
> >   		.sync_mode = WB_SYNC_NONE,
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > index e382b7f2353b..cc80bd23d323 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct
> drm_i915_gem_object *obj,
> >   	return false;
> >   }
> >
> > -static void try_to_writeback(struct drm_i915_gem_object *obj,
> > -			     unsigned int flags)
> > +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned
> > +int flags)
> >   {
> > +	if (obj->ops->shrinker_release_pages)
> > +		return obj->ops->shrinker_release_pages(obj,
> > +							flags &
> I915_SHRINK_WRITEBACK);
> > +
> >   	switch (obj->mm.madv) {
> >   	case I915_MADV_DONTNEED:
> >   		i915_gem_object_truncate(obj);
> > -		return;
> > +		return 0;
> >   	case __I915_MADV_PURGED:
> > -		return;
> > +		return 0;
> >   	}
> >
> >   	if (flags & I915_SHRINK_WRITEBACK)
> >   		i915_gem_object_writeback(obj);
> > +
> > +	return 0;
> >   }
> >
> >   /**
> > @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
> >   				}
> >
> >   				if (!__i915_gem_object_put_pages(obj)) {
> > -					try_to_writeback(obj, shrink);
> > -					count += obj->base.size >>
> PAGE_SHIFT;
> > +					if (!try_to_writeback(obj, shrink))
> > +						count += obj->base.size >>
> PAGE_SHIFT;
> >   				}
> >   				if (!ww)
> >   					i915_gem_object_unlock(obj);
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> > index a77e90f300fe..c7402995a8f9 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> > @@ -35,6 +35,8 @@
> >    * @ttm: The base TTM page vector.
> >    * @dev: The struct device used for dma mapping and unmapping.
> >    * @cached_st: The cached scatter-gather table.
> > + * @is_shmem: Set if using shmem.
> > + * @filp: The shmem file, if using shmem backend.
> >    *
> >    * Note that DMA may be going on right up to the point where the page-
> >    * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6
> +48,9 @@ struct i915_ttm_tt {
> >   	struct ttm_tt ttm;
> >   	struct device *dev;
> >   	struct sg_table *cached_st;
> > +
> > +	bool is_shmem;
> > +	struct file *filp;
> >   };
> >
> >   static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90
> @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
> >   	placement->busy_placement = busy;
> >   }
> >
> > +static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
> > +				      struct ttm_tt *ttm,
> > +				      struct ttm_operation_ctx *ctx) {
> > +	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915),
> bdev);
> > +	struct intel_memory_region *mr = i915-
> >mm.regions[INTEL_MEMORY_SYSTEM];
> > +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> > +	const unsigned int max_segment = i915_sg_segment_size();
> > +	const size_t size = ttm->num_pages << PAGE_SHIFT;
> > +	struct file *filp = i915_tt->filp;
> > +	struct sgt_iter sgt_iter;
> > +	struct sg_table *st;
> > +	struct page *page;
> > +	unsigned long i;
> > +	int err;
> > +
> > +	if (!filp) {
> > +		struct address_space *mapping;
> > +		gfp_t mask;
> > +
> > +		filp = shmem_file_setup("i915-shmem-tt", size,
> VM_NORESERVE);
> > +		if (IS_ERR(filp))
> > +			return PTR_ERR(filp);
> > +
> > +		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> > +
> > +		mapping = filp->f_mapping;
> > +		mapping_set_gfp_mask(mapping, mask);
> > +		GEM_BUG_ON(!(mapping_gfp_mask(mapping) &
> __GFP_RECLAIM));
> > +
> > +		i915_tt->filp = filp;
> > +	}
> > +
> > +	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
> > +	if (IS_ERR(st))
> > +		return PTR_ERR(st);
> > +
> > +	err = dma_map_sg_attrs(i915_tt->dev,
> > +			       st->sgl, st->nents,
> > +			       PCI_DMA_BIDIRECTIONAL,
> > +			       DMA_ATTR_SKIP_CPU_SYNC |
> > +			       DMA_ATTR_NO_KERNEL_MAPPING |
> > +			       DMA_ATTR_NO_WARN);
> > +	if (err <= 0) {
> > +		err = -EINVAL;
> > +		goto err_free_st;
> > +	}
> > +
> > +	i = 0;
> > +	for_each_sgt_page(page, sgt_iter, st)
> > +		ttm->pages[i++] = page;
> > +
> > +	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> > +		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> > +
> > +	i915_tt->cached_st = st;
> > +	return 0;
> > +
> > +err_free_st:
> > +	shmem_free_st(st, filp->f_mapping, false, false);
> > +	return err;
> > +}
> > +
> > +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {
> > +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> > +	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
> > +
> > +	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
> > +		     i915_tt->cached_st->nents,
> > +		     PCI_DMA_BIDIRECTIONAL);
> > +
> > +	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
> > +		      file_inode(i915_tt->filp)->i_mapping,
> > +		      backup, backup);
> >
> > Should we do something to undo the shmem_file_setup() operation here?
> > From its implementation it takes a reference on the inode and
> > allocates a struct file:
> > https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084
> >
> > Regards,
> > Oak
> 
> Hi, Oak,
> 
> That's done in i915_ttm_tt_destroy() afaict.
> 
> /Thomas

Do we know whether this tt is backed by shmem at create time? If yes, I think a better place to do the shmem_file_setup is in i915_ttm_tt_create - this pairs with the fput in i915_ttm_tt_destroy. If we don't have such information at tt create time, I agree with the current approach.
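
Something like the below is the pairing I have in mind - just a rough sketch to illustrate the alternative, not what this series does; the *_eager names are made up and the check for whether the object is actually shmem-eligible is left out:

/*
 * Rough sketch only: eager shmem_file_setup() at tt create time, paired
 * with the fput() in destroy. Names with _eager are invented for this
 * illustration; gating on the shmem-eligible case is omitted.
 */
static struct ttm_tt *i915_ttm_tt_create_eager(struct ttm_buffer_object *bo,
					       uint32_t page_flags)
{
	struct i915_ttm_tt *i915_tt;
	int ret;

	i915_tt = kzalloc(sizeof(*i915_tt), GFP_KERNEL);
	if (!i915_tt)
		return NULL;

	/* Back the tt with a shmem file already at create time. */
	i915_tt->filp = shmem_file_setup("i915-shmem-tt",
					 (loff_t)bo->base.size,
					 VM_NORESERVE);
	if (IS_ERR(i915_tt->filp))
		goto err_free;

	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, ttm_cached);
	if (ret)
		goto err_fput;

	i915_tt->is_shmem = true;
	return &i915_tt->ttm;

err_fput:
	fput(i915_tt->filp);
err_free:
	kfree(i915_tt);
	return NULL;
}

static void i915_ttm_tt_destroy_eager(struct ttm_device *bdev, struct ttm_tt *ttm)
{
	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

	if (i915_tt->filp)
		fput(i915_tt->filp);	/* pairs with shmem_file_setup() in create */

	ttm_tt_fini(ttm);
	kfree(i915_tt);
}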

Regards,
Oak

> 
> 
> 
> >
> > +}
> > +
> >   static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
> >   					 uint32_t page_flags)
> >   {
> >   	struct ttm_resource_manager *man =
> >   		ttm_manager_type(bo->bdev, bo->resource->mem_type);
> >   	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> > +	enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
> >   	struct i915_ttm_tt *i915_tt;
> >   	int ret;
> >
> > @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct
> ttm_buffer_object *bo,
> >   	    man->use_tt)
> >   		page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
> >
> > -	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
> > -			  i915_ttm_select_tt_caching(obj));
> > -	if (ret) {
> > -		kfree(i915_tt);
> > -		return NULL;
> > +	if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
> > +		page_flags |= TTM_TT_FLAG_EXTERNAL |
> > +			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
> > +		i915_tt->is_shmem = true;
> >   	}
> >
> > +	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
> > +	if (ret)
> > +		goto err_free;
> > +
> >   	i915_tt->dev = obj->base.dev->dev;
> >
> >   	return &i915_tt->ttm;
> > +
> > +err_free:
> > +	kfree(i915_tt);
> > +	return NULL;
> > +}
> > +
> > +static int i915_ttm_tt_populate(struct ttm_device *bdev,
> > +				struct ttm_tt *ttm,
> > +				struct ttm_operation_ctx *ctx)
> > +{
> > +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> > +ttm);
> > +
> > +	if (i915_tt->is_shmem)
> > +		return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
> > +
> > +	return ttm_pool_alloc(&bdev->pool, ttm, ctx);
> >   }
> >
> >   static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt
> *ttm)  {
> >   	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> >
> > -	if (i915_tt->cached_st) {
> > -		dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
> > -				  DMA_BIDIRECTIONAL, 0);
> > -		sg_free_table(i915_tt->cached_st);
> > -		kfree(i915_tt->cached_st);
> > -		i915_tt->cached_st = NULL;
> > +	if (i915_tt->is_shmem) {
> > +		i915_ttm_tt_shmem_unpopulate(ttm);
> > +	} else {
> > +		if (i915_tt->cached_st) {
> > +			dma_unmap_sgtable(i915_tt->dev, i915_tt-
> >cached_st,
> > +					  DMA_BIDIRECTIONAL, 0);
> > +			sg_free_table(i915_tt->cached_st);
> > +			kfree(i915_tt->cached_st);
> > +			i915_tt->cached_st = NULL;
> > +		}
> > +		ttm_pool_free(&bdev->pool, ttm);
> >   	}
> > -	ttm_pool_free(&bdev->pool, ttm);
> >   }
> >
> >   static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt
> *ttm)  {
> >   	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> >
> > +	if (i915_tt->filp)
> > +		fput(i915_tt->filp);
> > +
> >   	ttm_tt_fini(ttm);
> >   	kfree(i915_tt);
> >   }
> > @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct
> ttm_buffer_object *bo,  {
> >   	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> >
> > +	/*
> > +	 * EXTERNAL objects should never be swapped out by TTM, instead
> we need
> > +	 * to handle that ourselves. TTM will already skip such objects for us,
> > +	 * but we would like to avoid grabbing locks for no good reason.
> > +	 */
> > +	if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
> > +		return -EBUSY;
> > +
> >   	/* Will do for now. Our pinned objects are still on TTM's LRU lists */
> >   	return i915_gem_object_evictable(obj);  } @@ -328,9 +445,11 @@
> static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object
> *obj)
> >   	i915_gem_object_set_cache_coherency(obj, cache_level);  }
> >
> > -static void i915_ttm_purge(struct drm_i915_gem_object *obj)
> > +static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
> >   {
> >   	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> > +	struct i915_ttm_tt *i915_tt =
> > +		container_of(bo->ttm, typeof(*i915_tt), ttm);
> >   	struct ttm_operation_ctx ctx = {
> >   		.interruptible = true,
> >   		.no_wait_gpu = false,
> > @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct
> drm_i915_gem_object *obj)
> >   	int ret;
> >
> >   	if (obj->mm.madv == __I915_MADV_PURGED)
> > -		return;
> > +		return 0;
> >
> > -	/* TTM's purge interface. Note that we might be reentering. */
> >   	ret = ttm_bo_validate(bo, &place, &ctx);
> > -	if (!ret) {
> > -		obj->write_domain = 0;
> > -		obj->read_domains = 0;
> > -		i915_ttm_adjust_gem_after_move(obj);
> > -		i915_ttm_free_cached_io_st(obj);
> > -		obj->mm.madv = __I915_MADV_PURGED;
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (bo->ttm && i915_tt->filp) {
> > +		/*
> > +		 * The below fput(which eventually calls shmem_truncate)
> might
> > +		 * be delayed by worker, so when directly called to purge the
> > +		 * pages(like by the shrinker) we should try to be more
> > +		 * aggressive and release the pages immediately.
> > +		 */
> > +		shmem_truncate_range(file_inode(i915_tt->filp),
> > +				     0, (loff_t)-1);
> > +		fput(fetch_and_zero(&i915_tt->filp));
> > +	}
> > +
> > +	obj->write_domain = 0;
> > +	obj->read_domains = 0;
> > +	i915_ttm_adjust_gem_after_move(obj);
> > +	i915_ttm_free_cached_io_st(obj);
> > +	obj->mm.madv = __I915_MADV_PURGED;
> > +	return 0;
> > +}
> > +
> > +static void i915_ttm_purge(struct drm_i915_gem_object *obj) {
> > +	__i915_ttm_purge(obj);
> > +}
> > +
> > +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object
> *obj,
> > +					   bool should_writeback)
> > +{
> > +	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> > +	struct i915_ttm_tt *i915_tt =
> > +		container_of(bo->ttm, typeof(*i915_tt), ttm);
> > +	struct ttm_operation_ctx ctx = {
> > +		.interruptible = true,
> > +		.no_wait_gpu = false,
> > +	};
> > +	struct ttm_placement place = {};
> > +	int ret;
> > +
> > +	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
> > +		return 0;
> > +
> > +	GEM_BUG_ON(!i915_tt->is_shmem);
> > +
> > +	if (!i915_tt->filp)
> > +		return 0;
> > +
> > +	switch (obj->mm.madv) {
> > +	case I915_MADV_DONTNEED:
> > +		return __i915_ttm_purge(obj);
> > +	case __I915_MADV_PURGED:
> > +		return 0;
> > +	}
> > +
> > +	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> > +		return 0;
> > +
> > +	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
> > +	ret = ttm_bo_validate(bo, &place, &ctx);
> > +	if (ret) {
> > +		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> > +		return ret;
> >   	}
> > +
> > +	if (should_writeback)
> > +		__shmem_writeback(obj->base.size, i915_tt->filp-
> >f_mapping);
> > +
> > +	return 0;
> >   }
> >
> >   static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -
> 618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct
> ttm_buffer_object *bo,
> >
> >   static struct ttm_device_funcs i915_ttm_bo_driver = {
> >   	.ttm_tt_create = i915_ttm_tt_create,
> > +	.ttm_tt_populate = i915_ttm_tt_populate,
> >   	.ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
> >   	.ttm_tt_destroy = i915_ttm_tt_destroy,
> >   	.eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17
> @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
> >   	}
> >
> >   	if (!i915_gem_object_has_pages(obj)) {
> > +		struct i915_ttm_tt *i915_tt =
> > +			container_of(bo->ttm, typeof(*i915_tt), ttm);
> > +
> >   		/* Object either has a page vector or is an iomem object */
> >   		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj-
> >ttm.cached_io_st;
> >   		if (IS_ERR(st))
> >   			return PTR_ERR(st);
> >
> >   		__i915_gem_object_set_pages(obj, st,
> i915_sg_dma_sizes(st->sgl));
> > +		if (!bo->ttm || !i915_tt->is_shmem)
> > +			i915_gem_object_make_unshrinkable(obj);
> >   	}
> >
> >   	return ret;
> > @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct
> drm_i915_gem_object *obj,  static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)  {
> >   	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> > +	struct i915_ttm_tt *i915_tt =
> > +		container_of(bo->ttm, typeof(*i915_tt), ttm);
> >
> >   	/*
> >   	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
> > @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)
> >   	 * Put on the correct LRU list depending on the MADV status
> >   	 */
> >   	spin_lock(&bo->bdev->lru_lock);
> > -	if (obj->mm.madv != I915_MADV_WILLNEED) {
> > +	if (bo->ttm && i915_tt->filp) {
> > +		/* Try to keep shmem_tt from being considered for shrinking.
> */
> > +		bo->priority = TTM_MAX_BO_PRIORITY - 1;
> > +	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
> >   		bo->priority = I915_TTM_PRIO_PURGE;
> >   	} else if (!i915_gem_object_has_pages(obj)) {
> >   		if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9
> +1079,12 @@ static const struct drm_i915_gem_object_ops
> i915_gem_ttm_obj_ops = {
> >   	.get_pages = i915_ttm_get_pages,
> >   	.put_pages = i915_ttm_put_pages,
> >   	.truncate = i915_ttm_purge,
> > +	.shrinker_release_pages = i915_ttm_shrinker_release_pages,
> > +
> >   	.adjust_lru = i915_ttm_adjust_lru,
> >   	.delayed_free = i915_ttm_delayed_free,
> >   	.migrate = i915_ttm_migrate,
> > +
> >   	.mmap_offset = i915_ttm_mmap_offset,
> >   	.mmap_ops = &vm_ops_ttm,
> >   };
> > @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct
> intel_memory_region *mem,
> >   	drm_gem_private_object_init(&i915->drm, &obj->base, size);
> >   	i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class,
> flags);
> >   	i915_gem_object_init_memory_region(obj, mem);
> > -	i915_gem_object_make_unshrinkable(obj);
> >   	INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL |
> __GFP_NOWARN);
> >   	mutex_init(&obj->ttm.get_io_page.lock);
> >   	bo_type = (obj->flags & I915_BO_ALLOC_USER) ?
> ttm_bo_type_device :
> > --
> > 2.26.3
> >

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
@ 2021-10-05 14:23         ` Zeng, Oak
  0 siblings, 0 replies; 60+ messages in thread
From: Zeng, Oak @ 2021-10-05 14:23 UTC (permalink / raw)
  To: Thomas Hellström, Auld, Matthew, intel-gfx
  Cc: dri-devel, Christian König



Regards,
Oak

> -----Original Message-----
> From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Sent: October 5, 2021 9:48 AM
> To: Zeng, Oak <oak.zeng@intel.com>; Auld, Matthew
> <matthew.auld@intel.com>; intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org; Christian König
> <christian.koenig@amd.com>
> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> backend
> 
> 
> On 10/5/21 04:05, Zeng, Oak wrote:
> > Hi Matthew/Thomas,
> >
> > See one question inline
> >
> > Regards,
> > Oak
> >
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Matthew Auld
> > Sent: September 27, 2021 7:41 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org; Thomas Hellström
> <thomas.hellstrom@linux.intel.com>; Christian König
> <christian.koenig@amd.com>
> > Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
> >
> > For cached objects we can allocate our pages directly in shmem. This should
> make it possible(in a later patch) to utilise the existing i915-gem shrinker code
> for such objects. For now this is still disabled.
> >
> > v2(Thomas):
> >    - Add optional try_to_writeback hook for objects. Importantly we need
> >      to check if the object is even still shrinkable; in between us
> >      dropping the shrinker LRU lock and acquiring the object lock it could for
> >      example have been moved. Also we need to differentiate between
> >      "lazy" shrinking and the immediate writeback mode. Also later we need
> to
> >      handle objects which don't even have mm.pages, so bundling this into
> >      put_pages() would require somehow handling that edge case, hence
> >      just letting the ttm backend handle everything in try_to_writeback
> >      doesn't seem too bad.
> > v3(Thomas):
> >    - Likely a bad idea to touch the object from the unpopulate hook,
> >      since it's not possible to hold a reference, without also creating
> >      circular dependency, so likely this is too fragile. For now just
> >      ensure we at least mark the pages as dirty/accessed when called from the
> >      shrinker on WILLNEED objects.
> >    - s/try_to_writeback/shrinker_release_pages, since this can do more
> >      than just writeback.
> >    - Get rid of do_backup boolean and just set the SWAPPED flag prior to
> >      calling unpopulate.
> >    - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
> since
> >      these just get skipped anyway. We can try to come up with something
> >      better later.
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
> >   .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
> >   drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++-
> -
> >   5 files changed, 245 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > index 3043fcbd31bd..1c9a1d8d3434 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct
> drm_i915_gem_object *obj,  bool
> i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
> >   					enum intel_memory_type type);
> >
> > +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> > +				size_t size, struct intel_memory_region *mr,
> > +				struct address_space *mapping,
> > +				unsigned int max_segment);
> > +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> > +		   bool dirty, bool backup);
> > +void __shmem_writeback(size_t size, struct address_space *mapping);
> > +
> >   #ifdef CONFIG_MMU_NOTIFIER
> >   static inline bool
> >   i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git
> a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > index fa2ba9e2a4d0..f0fb17be2f7a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
> >   			  struct sg_table *pages);
> >   	void (*truncate)(struct drm_i915_gem_object *obj);
> >   	void (*writeback)(struct drm_i915_gem_object *obj);
> > +	int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
> > +				      bool should_writeback);
> >
> >   	int (*pread)(struct drm_i915_gem_object *obj,
> >   		     const struct drm_i915_gem_pread *arg); diff --git
> a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > index 36b711ae9e28..19e55cc29a15 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> > @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec
> *pvec)
> >   	cond_resched();
> >   }
> >
> > -static void shmem_free_st(struct sg_table *st, struct address_space
> *mapping,
> > -			  bool dirty, bool backup)
> > +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> > +		   bool dirty, bool backup)
> >   {
> >   	struct sgt_iter sgt_iter;
> >   	struct pagevec pvec;
> > @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct
> address_space *mapping,
> >   	kfree(st);
> >   }
> >
> > -static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> > -				       size_t size, struct intel_memory_region
> *mr,
> > -				       struct address_space *mapping,
> > -				       unsigned int max_segment)
> > +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> > +				size_t size, struct intel_memory_region *mr,
> > +				struct address_space *mapping,
> > +				unsigned int max_segment)
> >   {
> >   	const unsigned long page_count = size / PAGE_SIZE;
> >   	unsigned long i;
> > @@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
> >   	obj->mm.pages = ERR_PTR(-EFAULT);
> >   }
> >
> > -static void __shmem_writeback(size_t size, struct address_space
> *mapping)
> > +void __shmem_writeback(size_t size, struct address_space *mapping)
> >   {
> >   	struct writeback_control wbc = {
> >   		.sync_mode = WB_SYNC_NONE,
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > index e382b7f2353b..cc80bd23d323 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct
> drm_i915_gem_object *obj,
> >   	return false;
> >   }
> >
> > -static void try_to_writeback(struct drm_i915_gem_object *obj,
> > -			     unsigned int flags)
> > +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned
> > +int flags)
> >   {
> > +	if (obj->ops->shrinker_release_pages)
> > +		return obj->ops->shrinker_release_pages(obj,
> > +							flags &
> I915_SHRINK_WRITEBACK);
> > +
> >   	switch (obj->mm.madv) {
> >   	case I915_MADV_DONTNEED:
> >   		i915_gem_object_truncate(obj);
> > -		return;
> > +		return 0;
> >   	case __I915_MADV_PURGED:
> > -		return;
> > +		return 0;
> >   	}
> >
> >   	if (flags & I915_SHRINK_WRITEBACK)
> >   		i915_gem_object_writeback(obj);
> > +
> > +	return 0;
> >   }
> >
> >   /**
> > @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
> >   				}
> >
> >   				if (!__i915_gem_object_put_pages(obj)) {
> > -					try_to_writeback(obj, shrink);
> > -					count += obj->base.size >>
> PAGE_SHIFT;
> > +					if (!try_to_writeback(obj, shrink))
> > +						count += obj->base.size >>
> PAGE_SHIFT;
> >   				}
> >   				if (!ww)
> >   					i915_gem_object_unlock(obj);
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> > index a77e90f300fe..c7402995a8f9 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> > @@ -35,6 +35,8 @@
> >    * @ttm: The base TTM page vector.
> >    * @dev: The struct device used for dma mapping and unmapping.
> >    * @cached_st: The cached scatter-gather table.
> > + * @is_shmem: Set if using shmem.
> > + * @filp: The shmem file, if using shmem backend.
> >    *
> >    * Note that DMA may be going on right up to the point where the page-
> >    * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6
> +48,9 @@ struct i915_ttm_tt {
> >   	struct ttm_tt ttm;
> >   	struct device *dev;
> >   	struct sg_table *cached_st;
> > +
> > +	bool is_shmem;
> > +	struct file *filp;
> >   };
> >
> >   static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90
> @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
> >   	placement->busy_placement = busy;
> >   }
> >
> > +static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
> > +				      struct ttm_tt *ttm,
> > +				      struct ttm_operation_ctx *ctx) {
> > +	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915),
> bdev);
> > +	struct intel_memory_region *mr = i915-
> >mm.regions[INTEL_MEMORY_SYSTEM];
> > +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> > +	const unsigned int max_segment = i915_sg_segment_size();
> > +	const size_t size = ttm->num_pages << PAGE_SHIFT;
> > +	struct file *filp = i915_tt->filp;
> > +	struct sgt_iter sgt_iter;
> > +	struct sg_table *st;
> > +	struct page *page;
> > +	unsigned long i;
> > +	int err;
> > +
> > +	if (!filp) {
> > +		struct address_space *mapping;
> > +		gfp_t mask;
> > +
> > +		filp = shmem_file_setup("i915-shmem-tt", size,
> VM_NORESERVE);
> > +		if (IS_ERR(filp))
> > +			return PTR_ERR(filp);
> > +
> > +		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> > +
> > +		mapping = filp->f_mapping;
> > +		mapping_set_gfp_mask(mapping, mask);
> > +		GEM_BUG_ON(!(mapping_gfp_mask(mapping) &
> __GFP_RECLAIM));
> > +
> > +		i915_tt->filp = filp;
> > +	}
> > +
> > +	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
> > +	if (IS_ERR(st))
> > +		return PTR_ERR(st);
> > +
> > +	err = dma_map_sg_attrs(i915_tt->dev,
> > +			       st->sgl, st->nents,
> > +			       PCI_DMA_BIDIRECTIONAL,
> > +			       DMA_ATTR_SKIP_CPU_SYNC |
> > +			       DMA_ATTR_NO_KERNEL_MAPPING |
> > +			       DMA_ATTR_NO_WARN);
> > +	if (err <= 0) {
> > +		err = -EINVAL;
> > +		goto err_free_st;
> > +	}
> > +
> > +	i = 0;
> > +	for_each_sgt_page(page, sgt_iter, st)
> > +		ttm->pages[i++] = page;
> > +
> > +	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> > +		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> > +
> > +	i915_tt->cached_st = st;
> > +	return 0;
> > +
> > +err_free_st:
> > +	shmem_free_st(st, filp->f_mapping, false, false);
> > +	return err;
> > +}
> > +
> > +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {
> > +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> > +	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
> > +
> > +	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
> > +		     i915_tt->cached_st->nents,
> > +		     PCI_DMA_BIDIRECTIONAL);
> > +
> > +	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
> > +		      file_inode(i915_tt->filp)->i_mapping,
> > +		      backup, backup);
> >
> > Should we do something to undo the  shmem_file_setup operation here?
> From its implementation it does take a reference counter of inode and
> allocate file:
> https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084
> >
> > Regards,
> > Oak
> 
> Hi, Oak,
> 
> That's done in i915_ttm_tt_destroy() afaict.
> 
> /Thomas

Do we know whether this tt is backed by shmem at create time? If yes, I think a better place to do the shmem_file_setup is in i915_ttm_tt_create - this pairs with the fput in i915_ttm_tt_destroy. If we don't have such information at tt create time, I agree with the current approach.

Regards,
Oak

> 
> 
> 
> >
> > +}
> > +
> >   static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
> >   					 uint32_t page_flags)
> >   {
> >   	struct ttm_resource_manager *man =
> >   		ttm_manager_type(bo->bdev, bo->resource->mem_type);
> >   	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> > +	enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
> >   	struct i915_ttm_tt *i915_tt;
> >   	int ret;
> >
> > @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct
> ttm_buffer_object *bo,
> >   	    man->use_tt)
> >   		page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
> >
> > -	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
> > -			  i915_ttm_select_tt_caching(obj));
> > -	if (ret) {
> > -		kfree(i915_tt);
> > -		return NULL;
> > +	if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
> > +		page_flags |= TTM_TT_FLAG_EXTERNAL |
> > +			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
> > +		i915_tt->is_shmem = true;
> >   	}
> >
> > +	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
> > +	if (ret)
> > +		goto err_free;
> > +
> >   	i915_tt->dev = obj->base.dev->dev;
> >
> >   	return &i915_tt->ttm;
> > +
> > +err_free:
> > +	kfree(i915_tt);
> > +	return NULL;
> > +}
> > +
> > +static int i915_ttm_tt_populate(struct ttm_device *bdev,
> > +				struct ttm_tt *ttm,
> > +				struct ttm_operation_ctx *ctx)
> > +{
> > +	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> > +ttm);
> > +
> > +	if (i915_tt->is_shmem)
> > +		return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
> > +
> > +	return ttm_pool_alloc(&bdev->pool, ttm, ctx);
> >   }
> >
> >   static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt
> *ttm)  {
> >   	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> >
> > -	if (i915_tt->cached_st) {
> > -		dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
> > -				  DMA_BIDIRECTIONAL, 0);
> > -		sg_free_table(i915_tt->cached_st);
> > -		kfree(i915_tt->cached_st);
> > -		i915_tt->cached_st = NULL;
> > +	if (i915_tt->is_shmem) {
> > +		i915_ttm_tt_shmem_unpopulate(ttm);
> > +	} else {
> > +		if (i915_tt->cached_st) {
> > +			dma_unmap_sgtable(i915_tt->dev, i915_tt-
> >cached_st,
> > +					  DMA_BIDIRECTIONAL, 0);
> > +			sg_free_table(i915_tt->cached_st);
> > +			kfree(i915_tt->cached_st);
> > +			i915_tt->cached_st = NULL;
> > +		}
> > +		ttm_pool_free(&bdev->pool, ttm);
> >   	}
> > -	ttm_pool_free(&bdev->pool, ttm);
> >   }
> >
> >   static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt
> *ttm)  {
> >   	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
> ttm);
> >
> > +	if (i915_tt->filp)
> > +		fput(i915_tt->filp);
> > +
> >   	ttm_tt_fini(ttm);
> >   	kfree(i915_tt);
> >   }
> > @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct
> ttm_buffer_object *bo,  {
> >   	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> >
> > +	/*
> > +	 * EXTERNAL objects should never be swapped out by TTM, instead
> we need
> > +	 * to handle that ourselves. TTM will already skip such objects for us,
> > +	 * but we would like to avoid grabbing locks for no good reason.
> > +	 */
> > +	if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
> > +		return -EBUSY;
> > +
> >   	/* Will do for now. Our pinned objects are still on TTM's LRU lists */
> >   	return i915_gem_object_evictable(obj);  } @@ -328,9 +445,11 @@
> static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object
> *obj)
> >   	i915_gem_object_set_cache_coherency(obj, cache_level);  }
> >
> > -static void i915_ttm_purge(struct drm_i915_gem_object *obj)
> > +static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
> >   {
> >   	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> > +	struct i915_ttm_tt *i915_tt =
> > +		container_of(bo->ttm, typeof(*i915_tt), ttm);
> >   	struct ttm_operation_ctx ctx = {
> >   		.interruptible = true,
> >   		.no_wait_gpu = false,
> > @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct
> drm_i915_gem_object *obj)
> >   	int ret;
> >
> >   	if (obj->mm.madv == __I915_MADV_PURGED)
> > -		return;
> > +		return 0;
> >
> > -	/* TTM's purge interface. Note that we might be reentering. */
> >   	ret = ttm_bo_validate(bo, &place, &ctx);
> > -	if (!ret) {
> > -		obj->write_domain = 0;
> > -		obj->read_domains = 0;
> > -		i915_ttm_adjust_gem_after_move(obj);
> > -		i915_ttm_free_cached_io_st(obj);
> > -		obj->mm.madv = __I915_MADV_PURGED;
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (bo->ttm && i915_tt->filp) {
> > +		/*
> > +		 * The below fput(which eventually calls shmem_truncate)
> might
> > +		 * be delayed by worker, so when directly called to purge the
> > +		 * pages(like by the shrinker) we should try to be more
> > +		 * aggressive and release the pages immediately.
> > +		 */
> > +		shmem_truncate_range(file_inode(i915_tt->filp),
> > +				     0, (loff_t)-1);
> > +		fput(fetch_and_zero(&i915_tt->filp));
> > +	}
> > +
> > +	obj->write_domain = 0;
> > +	obj->read_domains = 0;
> > +	i915_ttm_adjust_gem_after_move(obj);
> > +	i915_ttm_free_cached_io_st(obj);
> > +	obj->mm.madv = __I915_MADV_PURGED;
> > +	return 0;
> > +}
> > +
> > +static void i915_ttm_purge(struct drm_i915_gem_object *obj) {
> > +	__i915_ttm_purge(obj);
> > +}
> > +
> > +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object
> *obj,
> > +					   bool should_writeback)
> > +{
> > +	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> > +	struct i915_ttm_tt *i915_tt =
> > +		container_of(bo->ttm, typeof(*i915_tt), ttm);
> > +	struct ttm_operation_ctx ctx = {
> > +		.interruptible = true,
> > +		.no_wait_gpu = false,
> > +	};
> > +	struct ttm_placement place = {};
> > +	int ret;
> > +
> > +	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
> > +		return 0;
> > +
> > +	GEM_BUG_ON(!i915_tt->is_shmem);
> > +
> > +	if (!i915_tt->filp)
> > +		return 0;
> > +
> > +	switch (obj->mm.madv) {
> > +	case I915_MADV_DONTNEED:
> > +		return __i915_ttm_purge(obj);
> > +	case __I915_MADV_PURGED:
> > +		return 0;
> > +	}
> > +
> > +	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> > +		return 0;
> > +
> > +	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
> > +	ret = ttm_bo_validate(bo, &place, &ctx);
> > +	if (ret) {
> > +		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> > +		return ret;
> >   	}
> > +
> > +	if (should_writeback)
> > +		__shmem_writeback(obj->base.size, i915_tt->filp-
> >f_mapping);
> > +
> > +	return 0;
> >   }
> >
> >   static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -
> 618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct
> ttm_buffer_object *bo,
> >
> >   static struct ttm_device_funcs i915_ttm_bo_driver = {
> >   	.ttm_tt_create = i915_ttm_tt_create,
> > +	.ttm_tt_populate = i915_ttm_tt_populate,
> >   	.ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
> >   	.ttm_tt_destroy = i915_ttm_tt_destroy,
> >   	.eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17
> @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
> >   	}
> >
> >   	if (!i915_gem_object_has_pages(obj)) {
> > +		struct i915_ttm_tt *i915_tt =
> > +			container_of(bo->ttm, typeof(*i915_tt), ttm);
> > +
> >   		/* Object either has a page vector or is an iomem object */
> >   		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj-
> >ttm.cached_io_st;
> >   		if (IS_ERR(st))
> >   			return PTR_ERR(st);
> >
> >   		__i915_gem_object_set_pages(obj, st,
> i915_sg_dma_sizes(st->sgl));
> > +		if (!bo->ttm || !i915_tt->is_shmem)
> > +			i915_gem_object_make_unshrinkable(obj);
> >   	}
> >
> >   	return ret;
> > @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct
> drm_i915_gem_object *obj,  static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)  {
> >   	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> > +	struct i915_ttm_tt *i915_tt =
> > +		container_of(bo->ttm, typeof(*i915_tt), ttm);
> >
> >   	/*
> >   	 * Don't manipulate the TTM LRUs while in TTM bo destruction.
> > @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct
> drm_i915_gem_object *obj)
> >   	 * Put on the correct LRU list depending on the MADV status
> >   	 */
> >   	spin_lock(&bo->bdev->lru_lock);
> > -	if (obj->mm.madv != I915_MADV_WILLNEED) {
> > +	if (bo->ttm && i915_tt->filp) {
> > +		/* Try to keep shmem_tt from being considered for shrinking.
> */
> > +		bo->priority = TTM_MAX_BO_PRIORITY - 1;
> > +	} else if (obj->mm.madv != I915_MADV_WILLNEED) {
> >   		bo->priority = I915_TTM_PRIO_PURGE;
> >   	} else if (!i915_gem_object_has_pages(obj)) {
> >   		if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9
> +1079,12 @@ static const struct drm_i915_gem_object_ops
> i915_gem_ttm_obj_ops = {
> >   	.get_pages = i915_ttm_get_pages,
> >   	.put_pages = i915_ttm_put_pages,
> >   	.truncate = i915_ttm_purge,
> > +	.shrinker_release_pages = i915_ttm_shrinker_release_pages,
> > +
> >   	.adjust_lru = i915_ttm_adjust_lru,
> >   	.delayed_free = i915_ttm_delayed_free,
> >   	.migrate = i915_ttm_migrate,
> > +
> >   	.mmap_offset = i915_ttm_mmap_offset,
> >   	.mmap_ops = &vm_ops_ttm,
> >   };
> > @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct
> intel_memory_region *mem,
> >   	drm_gem_private_object_init(&i915->drm, &obj->base, size);
> >   	i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class,
> flags);
> >   	i915_gem_object_init_memory_region(obj, mem);
> > -	i915_gem_object_make_unshrinkable(obj);
> >   	INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL |
> __GFP_NOWARN);
> >   	mutex_init(&obj->ttm.get_io_page.lock);
> >   	bo_type = (obj->flags & I915_BO_ALLOC_USER) ?
> ttm_bo_type_device :
> > --
> > 2.26.3
> >

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
  2021-10-05 14:23         ` Zeng, Oak
  (?)
@ 2021-10-05 17:07         ` Matthew Auld
  2021-10-05 18:33             ` Zeng, Oak
  -1 siblings, 1 reply; 60+ messages in thread
From: Matthew Auld @ 2021-10-05 17:07 UTC (permalink / raw)
  To: Zeng, Oak, Thomas Hellström, intel-gfx
  Cc: dri-devel, Christian König

On 05/10/2021 15:23, Zeng, Oak wrote:
> 
> 
> Regards,
> Oak
> 
>> -----Original Message-----
>> From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Sent: October 5, 2021 9:48 AM
>> To: Zeng, Oak <oak.zeng@intel.com>; Auld, Matthew
>> <matthew.auld@intel.com>; intel-gfx@lists.freedesktop.org
>> Cc: dri-devel@lists.freedesktop.org; Christian König
>> <christian.koenig@amd.com>
>> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
>> backend
>>
>>
>> On 10/5/21 04:05, Zeng, Oak wrote:
>>> Hi Matthew/Thomas,
>>>
>>> See one question inline
>>>
>>> Regards,
>>> Oak
>>>
>>> -----Original Message-----
>>> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
>> Matthew Auld
>>> Sent: September 27, 2021 7:41 AM
>>> To: intel-gfx@lists.freedesktop.org
>>> Cc: dri-devel@lists.freedesktop.org; Thomas Hellström
>> <thomas.hellstrom@linux.intel.com>; Christian König
>> <christian.koenig@amd.com>
>>> Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
>>>
>>> For cached objects we can allocate our pages directly in shmem. This should
>> make it possible(in a later patch) to utilise the existing i915-gem shrinker code
>> for such objects. For now this is still disabled.
>>>
>>> v2(Thomas):
>>>     - Add optional try_to_writeback hook for objects. Importantly we need
>>>       to check if the object is even still shrinkable; in between us
>>>       dropping the shrinker LRU lock and acquiring the object lock it could for
>>>       example have been moved. Also we need to differentiate between
>>>       "lazy" shrinking and the immediate writeback mode. Also later we need
>> to
>>>       handle objects which don't even have mm.pages, so bundling this into
>>>       put_pages() would require somehow handling that edge case, hence
>>>       just letting the ttm backend handle everything in try_to_writeback
>>>       doesn't seem too bad.
>>> v3(Thomas):
>>>     - Likely a bad idea to touch the object from the unpopulate hook,
>>>       since it's not possible to hold a reference, without also creating
>>>       circular dependency, so likely this is too fragile. For now just
>>>       ensure we at least mark the pages as dirty/accessed when called from the
>>>       shrinker on WILLNEED objects.
>>>     - s/try_to_writeback/shrinker_release_pages, since this can do more
>>>       than just writeback.
>>>     - Get rid of do_backup boolean and just set the SWAPPED flag prior to
>>>       calling unpopulate.
>>>     - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
>> since
>>>       these just get skipped anyway. We can try to come up with something
>>>       better later.
>>>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> ---
>>>    drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
>>>    .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
>>>    drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
>>>    drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
>>>    drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++-
>> -
>>>    5 files changed, 245 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
>> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> index 3043fcbd31bd..1c9a1d8d3434 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct
>> drm_i915_gem_object *obj,  bool
>> i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
>>>                                      enum intel_memory_type type);
>>>
>>> +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
>>> +                           size_t size, struct intel_memory_region *mr,
>>> +                           struct address_space *mapping,
>>> +                           unsigned int max_segment);
>>> +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
>>> +              bool dirty, bool backup);
>>> +void __shmem_writeback(size_t size, struct address_space *mapping);
>>> +
>>>    #ifdef CONFIG_MMU_NOTIFIER
>>>    static inline bool
>>>    i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git
>> a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> index fa2ba9e2a4d0..f0fb17be2f7a 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
>>>                        struct sg_table *pages);
>>>      void (*truncate)(struct drm_i915_gem_object *obj);
>>>      void (*writeback)(struct drm_i915_gem_object *obj);
>>> +   int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
>>> +                                 bool should_writeback);
>>>
>>>      int (*pread)(struct drm_i915_gem_object *obj,
>>>                   const struct drm_i915_gem_pread *arg); diff --git
>> a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>>> index 36b711ae9e28..19e55cc29a15 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
>>> @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec
>> *pvec)
>>>      cond_resched();
>>>    }
>>>
>>> -static void shmem_free_st(struct sg_table *st, struct address_space
>> *mapping,
>>> -                     bool dirty, bool backup)
>>> +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
>>> +              bool dirty, bool backup)
>>>    {
>>>      struct sgt_iter sgt_iter;
>>>      struct pagevec pvec;
>>> @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct
>> address_space *mapping,
>>>      kfree(st);
>>>    }
>>>
>>> -static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
>>> -                                  size_t size, struct intel_memory_region
>> *mr,
>>> -                                  struct address_space *mapping,
>>> -                                  unsigned int max_segment)
>>> +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
>>> +                           size_t size, struct intel_memory_region *mr,
>>> +                           struct address_space *mapping,
>>> +                           unsigned int max_segment)
>>>    {
>>>      const unsigned long page_count = size / PAGE_SIZE;
>>>      unsigned long i;
>>> @@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
>>>      obj->mm.pages = ERR_PTR(-EFAULT);
>>>    }
>>>
>>> -static void __shmem_writeback(size_t size, struct address_space
>> *mapping)
>>> +void __shmem_writeback(size_t size, struct address_space *mapping)
>>>    {
>>>      struct writeback_control wbc = {
>>>              .sync_mode = WB_SYNC_NONE,
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>> b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>>> index e382b7f2353b..cc80bd23d323 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>>> @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct
>> drm_i915_gem_object *obj,
>>>      return false;
>>>    }
>>>
>>> -static void try_to_writeback(struct drm_i915_gem_object *obj,
>>> -                        unsigned int flags)
>>> +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned
>>> +int flags)
>>>    {
>>> +   if (obj->ops->shrinker_release_pages)
>>> +           return obj->ops->shrinker_release_pages(obj,
>>> +                                                   flags &
>> I915_SHRINK_WRITEBACK);
>>> +
>>>      switch (obj->mm.madv) {
>>>      case I915_MADV_DONTNEED:
>>>              i915_gem_object_truncate(obj);
>>> -           return;
>>> +           return 0;
>>>      case __I915_MADV_PURGED:
>>> -           return;
>>> +           return 0;
>>>      }
>>>
>>>      if (flags & I915_SHRINK_WRITEBACK)
>>>              i915_gem_object_writeback(obj);
>>> +
>>> +   return 0;
>>>    }
>>>
>>>    /**
>>> @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
>>>                              }
>>>
>>>                              if (!__i915_gem_object_put_pages(obj)) {
>>> -                                   try_to_writeback(obj, shrink);
>>> -                                   count += obj->base.size >>
>> PAGE_SHIFT;
>>> +                                   if (!try_to_writeback(obj, shrink))
>>> +                                           count += obj->base.size >>
>> PAGE_SHIFT;
>>>                              }
>>>                              if (!ww)
>>>                                      i915_gem_object_unlock(obj);
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> index a77e90f300fe..c7402995a8f9 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> @@ -35,6 +35,8 @@
>>>     * @ttm: The base TTM page vector.
>>>     * @dev: The struct device used for dma mapping and unmapping.
>>>     * @cached_st: The cached scatter-gather table.
>>> + * @is_shmem: Set if using shmem.
>>> + * @filp: The shmem file, if using shmem backend.
>>>     *
>>>     * Note that DMA may be going on right up to the point where the page-
>>>     * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6
>> +48,9 @@ struct i915_ttm_tt {
>>>      struct ttm_tt ttm;
>>>      struct device *dev;
>>>      struct sg_table *cached_st;
>>> +
>>> +   bool is_shmem;
>>> +   struct file *filp;
>>>    };
>>>
>>>    static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90
>> @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
>>>      placement->busy_placement = busy;
>>>    }
>>>
>>> +static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
>>> +                                 struct ttm_tt *ttm,
>>> +                                 struct ttm_operation_ctx *ctx) {
>>> +   struct drm_i915_private *i915 = container_of(bdev, typeof(*i915),
>> bdev);
>>> +   struct intel_memory_region *mr = i915-
>>> mm.regions[INTEL_MEMORY_SYSTEM];
>>> +   struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
>> ttm);
>>> +   const unsigned int max_segment = i915_sg_segment_size();
>>> +   const size_t size = ttm->num_pages << PAGE_SHIFT;
>>> +   struct file *filp = i915_tt->filp;
>>> +   struct sgt_iter sgt_iter;
>>> +   struct sg_table *st;
>>> +   struct page *page;
>>> +   unsigned long i;
>>> +   int err;
>>> +
>>> +   if (!filp) {
>>> +           struct address_space *mapping;
>>> +           gfp_t mask;
>>> +
>>> +           filp = shmem_file_setup("i915-shmem-tt", size,
>> VM_NORESERVE);
>>> +           if (IS_ERR(filp))
>>> +                   return PTR_ERR(filp);
>>> +
>>> +           mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
>>> +
>>> +           mapping = filp->f_mapping;
>>> +           mapping_set_gfp_mask(mapping, mask);
>>> +           GEM_BUG_ON(!(mapping_gfp_mask(mapping) &
>> __GFP_RECLAIM));
>>> +
>>> +           i915_tt->filp = filp;
>>> +   }
>>> +
>>> +   st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
>>> +   if (IS_ERR(st))
>>> +           return PTR_ERR(st);
>>> +
>>> +   err = dma_map_sg_attrs(i915_tt->dev,
>>> +                          st->sgl, st->nents,
>>> +                          PCI_DMA_BIDIRECTIONAL,
>>> +                          DMA_ATTR_SKIP_CPU_SYNC |
>>> +                          DMA_ATTR_NO_KERNEL_MAPPING |
>>> +                          DMA_ATTR_NO_WARN);
>>> +   if (err <= 0) {
>>> +           err = -EINVAL;
>>> +           goto err_free_st;
>>> +   }
>>> +
>>> +   i = 0;
>>> +   for_each_sgt_page(page, sgt_iter, st)
>>> +           ttm->pages[i++] = page;
>>> +
>>> +   if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
>>> +           ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
>>> +
>>> +   i915_tt->cached_st = st;
>>> +   return 0;
>>> +
>>> +err_free_st:
>>> +   shmem_free_st(st, filp->f_mapping, false, false);
>>> +   return err;
>>> +}
>>> +
>>> +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {
>>> +   struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
>> ttm);
>>> +   bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
>>> +
>>> +   dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
>>> +                i915_tt->cached_st->nents,
>>> +                PCI_DMA_BIDIRECTIONAL);
>>> +
>>> +   shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
>>> +                 file_inode(i915_tt->filp)->i_mapping,
>>> +                 backup, backup);
>>>
>>> Should we do something to undo the shmem_file_setup operation here?
>>> From its implementation it takes a reference on the inode and
>>> allocates a file:
>>> https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084
>>>
>>> Regards,
>>> Oak
>>
>> Hi, Oak,
>>
>> That's done in i915_ttm_tt_destroy() afaict.
>>
>> /Thomas
> 
> Do we know whether this tt is backed by shmem at create time? If yes, I think a better place to do the shmem_file_setup is in i915_ttm_tt_create - this pairs with the fput in i915_ttm_tt_destroy. If we don't have such information at tt create time, I agree with the current approach.

IIRC tt_create is called even if the object will most likely just 
end up being placed in VRAM, so calling shmem_file_setup in there seemed 
potentially wasteful, since the shmem file might never even be needed. 
Hence keeping it in populate instead.
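
So the shmem file only comes into existence the first time the tt is 
actually populated; condensed it looks roughly like the below (the 
i915_ttm_tt_shmem_get_file helper is made up here just to show the lazy 
flow, in the patch this is inline in i915_ttm_tt_shmem_populate):

/*
 * Condensed sketch of the lazy flow: VRAM-only objects that never get
 * populated in system memory never pay for the shmem file. Helper name
 * is invented for illustration only.
 */
static int i915_ttm_tt_shmem_get_file(struct i915_ttm_tt *i915_tt,
				      loff_t size)
{
	struct file *filp = i915_tt->filp;

	if (filp)
		return 0;	/* already backed by a shmem file */

	/* First populate: only now does the shmem file get created. */
	filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
	if (IS_ERR(filp))
		return PTR_ERR(filp);

	mapping_set_gfp_mask(filp->f_mapping,
			     GFP_HIGHUSER | __GFP_RECLAIMABLE);

	i915_tt->filp = filp;
	return 0;
}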

> 
> Regards,
> Oak
> 
>>
>>
>>
>>>
>>> +}
>>> +
>>>    static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
>>>                                       uint32_t page_flags)
>>>    {
>>>      struct ttm_resource_manager *man =
>>>              ttm_manager_type(bo->bdev, bo->resource->mem_type);
>>>      struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
>>> +   enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
>>>      struct i915_ttm_tt *i915_tt;
>>>      int ret;
>>>
>>> @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct
>> ttm_buffer_object *bo,
>>>          man->use_tt)
>>>              page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
>>>
>>> -   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
>>> -                     i915_ttm_select_tt_caching(obj));
>>> -   if (ret) {
>>> -           kfree(i915_tt);
>>> -           return NULL;
>>> +   if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
>>> +           page_flags |= TTM_TT_FLAG_EXTERNAL |
>>> +                         TTM_TT_FLAG_EXTERNAL_MAPPABLE;
>>> +           i915_tt->is_shmem = true;
>>>      }
>>>
>>> +   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
>>> +   if (ret)
>>> +           goto err_free;
>>> +
>>>      i915_tt->dev = obj->base.dev->dev;
>>>
>>>      return &i915_tt->ttm;
>>> +
>>> +err_free:
>>> +   kfree(i915_tt);
>>> +   return NULL;
>>> +}
>>> +
>>> +static int i915_ttm_tt_populate(struct ttm_device *bdev,
>>> +                           struct ttm_tt *ttm,
>>> +                           struct ttm_operation_ctx *ctx)
>>> +{
>>> +   struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
>>> +ttm);
>>> +
>>> +   if (i915_tt->is_shmem)
>>> +           return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
>>> +
>>> +   return ttm_pool_alloc(&bdev->pool, ttm, ctx);
>>>    }
>>>
>>>    static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt
>> *ttm)  {
>>>      struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
>> ttm);
>>>
>>> -   if (i915_tt->cached_st) {
>>> -           dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
>>> -                             DMA_BIDIRECTIONAL, 0);
>>> -           sg_free_table(i915_tt->cached_st);
>>> -           kfree(i915_tt->cached_st);
>>> -           i915_tt->cached_st = NULL;
>>> +   if (i915_tt->is_shmem) {
>>> +           i915_ttm_tt_shmem_unpopulate(ttm);
>>> +   } else {
>>> +           if (i915_tt->cached_st) {
>>> +                   dma_unmap_sgtable(i915_tt->dev, i915_tt-
>>> cached_st,
>>> +                                     DMA_BIDIRECTIONAL, 0);
>>> +                   sg_free_table(i915_tt->cached_st);
>>> +                   kfree(i915_tt->cached_st);
>>> +                   i915_tt->cached_st = NULL;
>>> +           }
>>> +           ttm_pool_free(&bdev->pool, ttm);
>>>      }
>>> -   ttm_pool_free(&bdev->pool, ttm);
>>>    }
>>>
>>>    static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt
>> *ttm)  {
>>>      struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
>> ttm);
>>>
>>> +   if (i915_tt->filp)
>>> +           fput(i915_tt->filp);
>>> +
>>>      ttm_tt_fini(ttm);
>>>      kfree(i915_tt);
>>>    }
>>> @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct
>> ttm_buffer_object *bo,  {
>>>      struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
>>>
>>> +   /*
>>> +    * EXTERNAL objects should never be swapped out by TTM, instead
>> we need
>>> +    * to handle that ourselves. TTM will already skip such objects for us,
>>> +    * but we would like to avoid grabbing locks for no good reason.
>>> +    */
>>> +   if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
>>> +           return -EBUSY;
>>> +
>>>      /* Will do for now. Our pinned objects are still on TTM's LRU lists */
>>>      return i915_gem_object_evictable(obj);  } @@ -328,9 +445,11 @@
>> static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object
>> *obj)
>>>      i915_gem_object_set_cache_coherency(obj, cache_level);  }
>>>
>>> -static void i915_ttm_purge(struct drm_i915_gem_object *obj)
>>> +static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
>>>    {
>>>      struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>>> +   struct i915_ttm_tt *i915_tt =
>>> +           container_of(bo->ttm, typeof(*i915_tt), ttm);
>>>      struct ttm_operation_ctx ctx = {
>>>              .interruptible = true,
>>>              .no_wait_gpu = false,
>>> @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct
>> drm_i915_gem_object *obj)
>>>      int ret;
>>>
>>>      if (obj->mm.madv == __I915_MADV_PURGED)
>>> -           return;
>>> +           return 0;
>>>
>>> -   /* TTM's purge interface. Note that we might be reentering. */
>>>      ret = ttm_bo_validate(bo, &place, &ctx);
>>> -   if (!ret) {
>>> -           obj->write_domain = 0;
>>> -           obj->read_domains = 0;
>>> -           i915_ttm_adjust_gem_after_move(obj);
>>> -           i915_ttm_free_cached_io_st(obj);
>>> -           obj->mm.madv = __I915_MADV_PURGED;
>>> +   if (ret)
>>> +           return ret;
>>> +
>>> +   if (bo->ttm && i915_tt->filp) {
>>> +           /*
>>> +            * The below fput(which eventually calls shmem_truncate)
>> might
>>> +            * be delayed by worker, so when directly called to purge the
>>> +            * pages(like by the shrinker) we should try to be more
>>> +            * aggressive and release the pages immediately.
>>> +            */
>>> +           shmem_truncate_range(file_inode(i915_tt->filp),
>>> +                                0, (loff_t)-1);
>>> +           fput(fetch_and_zero(&i915_tt->filp));
>>> +   }
>>> +
>>> +   obj->write_domain = 0;
>>> +   obj->read_domains = 0;
>>> +   i915_ttm_adjust_gem_after_move(obj);
>>> +   i915_ttm_free_cached_io_st(obj);
>>> +   obj->mm.madv = __I915_MADV_PURGED;
>>> +   return 0;
>>> +}
>>> +
>>> +static void i915_ttm_purge(struct drm_i915_gem_object *obj) {
>>> +   __i915_ttm_purge(obj);
>>> +}
>>> +
>>> +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object
>> *obj,
>>> +                                      bool should_writeback)
>>> +{
>>> +   struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>>> +   struct i915_ttm_tt *i915_tt =
>>> +           container_of(bo->ttm, typeof(*i915_tt), ttm);
>>> +   struct ttm_operation_ctx ctx = {
>>> +           .interruptible = true,
>>> +           .no_wait_gpu = false,
>>> +   };
>>> +   struct ttm_placement place = {};
>>> +   int ret;
>>> +
>>> +   if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
>>> +           return 0;
>>> +
>>> +   GEM_BUG_ON(!i915_tt->is_shmem);
>>> +
>>> +   if (!i915_tt->filp)
>>> +           return 0;
>>> +
>>> +   switch (obj->mm.madv) {
>>> +   case I915_MADV_DONTNEED:
>>> +           return __i915_ttm_purge(obj);
>>> +   case __I915_MADV_PURGED:
>>> +           return 0;
>>> +   }
>>> +
>>> +   if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
>>> +           return 0;
>>> +
>>> +   bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
>>> +   ret = ttm_bo_validate(bo, &place, &ctx);
>>> +   if (ret) {
>>> +           bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
>>> +           return ret;
>>>      }
>>> +
>>> +   if (should_writeback)
>>> +           __shmem_writeback(obj->base.size, i915_tt->filp-
>>> f_mapping);
>>> +
>>> +   return 0;
>>>    }
>>>
>>>    static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -
>> 618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct
>> ttm_buffer_object *bo,
>>>
>>>    static struct ttm_device_funcs i915_ttm_bo_driver = {
>>>      .ttm_tt_create = i915_ttm_tt_create,
>>> +   .ttm_tt_populate = i915_ttm_tt_populate,
>>>      .ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
>>>      .ttm_tt_destroy = i915_ttm_tt_destroy,
>>>      .eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17
>> @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
>>>      }
>>>
>>>      if (!i915_gem_object_has_pages(obj)) {
>>> +           struct i915_ttm_tt *i915_tt =
>>> +                   container_of(bo->ttm, typeof(*i915_tt), ttm);
>>> +
>>>              /* Object either has a page vector or is an iomem object */
>>>              st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj-
>>> ttm.cached_io_st;
>>>              if (IS_ERR(st))
>>>                      return PTR_ERR(st);
>>>
>>>              __i915_gem_object_set_pages(obj, st,
>> i915_sg_dma_sizes(st->sgl));
>>> +           if (!bo->ttm || !i915_tt->is_shmem)
>>> +                   i915_gem_object_make_unshrinkable(obj);
>>>      }
>>>
>>>      return ret;
>>> @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct
>> drm_i915_gem_object *obj,  static void i915_ttm_adjust_lru(struct
>> drm_i915_gem_object *obj)  {
>>>      struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>>> +   struct i915_ttm_tt *i915_tt =
>>> +           container_of(bo->ttm, typeof(*i915_tt), ttm);
>>>
>>>      /*
>>>       * Don't manipulate the TTM LRUs while in TTM bo destruction.
>>> @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct
>> drm_i915_gem_object *obj)
>>>       * Put on the correct LRU list depending on the MADV status
>>>       */
>>>      spin_lock(&bo->bdev->lru_lock);
>>> -   if (obj->mm.madv != I915_MADV_WILLNEED) {
>>> +   if (bo->ttm && i915_tt->filp) {
>>> +           /* Try to keep shmem_tt from being considered for shrinking.
>> */
>>> +           bo->priority = TTM_MAX_BO_PRIORITY - 1;
>>> +   } else if (obj->mm.madv != I915_MADV_WILLNEED) {
>>>              bo->priority = I915_TTM_PRIO_PURGE;
>>>      } else if (!i915_gem_object_has_pages(obj)) {
>>>              if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9
>> +1079,12 @@ static const struct drm_i915_gem_object_ops
>> i915_gem_ttm_obj_ops = {
>>>      .get_pages = i915_ttm_get_pages,
>>>      .put_pages = i915_ttm_put_pages,
>>>      .truncate = i915_ttm_purge,
>>> +   .shrinker_release_pages = i915_ttm_shrinker_release_pages,
>>> +
>>>      .adjust_lru = i915_ttm_adjust_lru,
>>>      .delayed_free = i915_ttm_delayed_free,
>>>      .migrate = i915_ttm_migrate,
>>> +
>>>      .mmap_offset = i915_ttm_mmap_offset,
>>>      .mmap_ops = &vm_ops_ttm,
>>>    };
>>> @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct
>> intel_memory_region *mem,
>>>      drm_gem_private_object_init(&i915->drm, &obj->base, size);
>>>      i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class,
>> flags);
>>>      i915_gem_object_init_memory_region(obj, mem);
>>> -   i915_gem_object_make_unshrinkable(obj);
>>>      INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
>>>      mutex_init(&obj->ttm.get_io_page.lock);
>>>      bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
>>> --
>>> 2.26.3
>>>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
  2021-10-05 17:07         ` Matthew Auld
@ 2021-10-05 18:33             ` Zeng, Oak
  0 siblings, 0 replies; 60+ messages in thread
From: Zeng, Oak @ 2021-10-05 18:33 UTC (permalink / raw)
  To: Auld, Matthew, Thomas Hellström, intel-gfx
  Cc: dri-devel, Christian König

Thanks for the explanation. This patch is Acked-by: Oak Zeng <Oak.Zeng@intel.com>

Regards,
Oak

> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: October 5, 2021 1:07 PM
> To: Zeng, Oak <oak.zeng@intel.com>; Thomas Hellström
> <thomas.hellstrom@linux.intel.com>; intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org; Christian König
> <christian.koenig@amd.com>
> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> backend
> 
> On 05/10/2021 15:23, Zeng, Oak wrote:
> >
> >
> > Regards,
> > Oak
> >
> >> -----Original Message-----
> >> From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> >> Sent: October 5, 2021 9:48 AM
> >> To: Zeng, Oak <oak.zeng@intel.com>; Auld, Matthew
> >> <matthew.auld@intel.com>; intel-gfx@lists.freedesktop.org
> >> Cc: dri-devel@lists.freedesktop.org; Christian König
> >> <christian.koenig@amd.com>
> >> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> >> backend
> >>
> >>
> >> On 10/5/21 04:05, Zeng, Oak wrote:
> >>> Hi Matthew/Thomas,
> >>>
> >>> See one question inline
> >>>
> >>> Regards,
> >>> Oak
> >>>
> >>> -----Original Message-----
> >>> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> >> Matthew Auld
> >>> Sent: September 27, 2021 7:41 AM
> >>> To: intel-gfx@lists.freedesktop.org
> >>> Cc: dri-devel@lists.freedesktop.org; Thomas Hellström
> >> <thomas.hellstrom@linux.intel.com>; Christian König
> >> <christian.koenig@amd.com>
> >>> Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> backend
> >>>
> >>> For cached objects we can allocate our pages directly in shmem. This
> should
> >> make it possible(in a later patch) to utilise the existing i915-gem shrinker
> code
> >> for such objects. For now this is still disabled.
> >>>
> >>> v2(Thomas):
> >>>     - Add optional try_to_writeback hook for objects. Importantly we need
> >>>       to check if the object is even still shrinkable; in between us
> >>>       dropping the shrinker LRU lock and acquiring the object lock it could for
> >>>       example have been moved. Also we need to differentiate between
> >>>       "lazy" shrinking and the immediate writeback mode. Also later we
> need
> >> to
> >>>       handle objects which don't even have mm.pages, so bundling this into
> >>>       put_pages() would require somehow handling that edge case, hence
> >>>       just letting the ttm backend handle everything in try_to_writeback
> >>>       doesn't seem too bad.
> >>> v3(Thomas):
> >>>     - Likely a bad idea to touch the object from the unpopulate hook,
> >>>       since it's not possible to hold a reference, without also creating
> >>>       circular dependency, so likely this is too fragile. For now just
> >>>       ensure we at least mark the pages as dirty/accessed when called from
> the
> >>>       shrinker on WILLNEED objects.
> >>>     - s/try_to_writeback/shrinker_release_pages, since this can do more
> >>>       than just writeback.
> >>>     - Get rid of do_backup boolean and just set the SWAPPED flag prior to
> >>>       calling unpopulate.
> >>>     - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
> >> since
> >>>       these just get skipped anyway. We can try to come up with something
> >>>       better later.
> >>>
> >>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> >>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> >>> Cc: Christian König <christian.koenig@amd.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
> >>>    .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
> >>>    drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
> >>>    drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
> >>>    drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++--
> >>>    5 files changed, 245 insertions(+), 36 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> >>> index 3043fcbd31bd..1c9a1d8d3434 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> >>> @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct
> >> drm_i915_gem_object *obj,  bool
> >> i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
> >>>                                      enum intel_memory_type type);
> >>>
> >>> +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> >>> +                           size_t size, struct intel_memory_region *mr,
> >>> +                           struct address_space *mapping,
> >>> +                           unsigned int max_segment);
> >>> +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> >>> +              bool dirty, bool backup);
> >>> +void __shmem_writeback(size_t size, struct address_space *mapping);
> >>> +
> >>>    #ifdef CONFIG_MMU_NOTIFIER
> >>>    static inline bool
> >>>    i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> >>> index fa2ba9e2a4d0..f0fb17be2f7a 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> >>> @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
> >>>                        struct sg_table *pages);
> >>>      void (*truncate)(struct drm_i915_gem_object *obj);
> >>>      void (*writeback)(struct drm_i915_gem_object *obj);
> >>> +   int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
> >>> +                                 bool should_writeback);
> >>>
> >>>      int (*pread)(struct drm_i915_gem_object *obj,
> >>>                   const struct drm_i915_gem_pread *arg);
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> >>> index 36b711ae9e28..19e55cc29a15 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> >>> @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
> >>>      cond_resched();
> >>>    }
> >>>
> >>> -static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> >>> -                     bool dirty, bool backup)
> >>> +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> >>> +              bool dirty, bool backup)
> >>>    {
> >>>      struct sgt_iter sgt_iter;
> >>>      struct pagevec pvec;
> >>> @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> >>>      kfree(st);
> >>>    }
> >>>
> >>> -static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> >>> -                                  size_t size, struct intel_memory_region *mr,
> >>> -                                  struct address_space *mapping,
> >>> -                                  unsigned int max_segment)
> >>> +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> >>> +                           size_t size, struct intel_memory_region *mr,
> >>> +                           struct address_space *mapping,
> >>> +                           unsigned int max_segment)
> >>>    {
> >>>      const unsigned long page_count = size / PAGE_SIZE;
> >>>      unsigned long i;
> >>> @@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj)
> >>>      obj->mm.pages = ERR_PTR(-EFAULT);
> >>>    }
> >>>
> >>> -static void __shmem_writeback(size_t size, struct address_space *mapping)
> >>> +void __shmem_writeback(size_t size, struct address_space *mapping)
> >>>    {
> >>>      struct writeback_control wbc = {
> >>>              .sync_mode = WB_SYNC_NONE,
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> >>> index e382b7f2353b..cc80bd23d323 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> >>> @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
> >>>      return false;
> >>>    }
> >>>
> >>> -static void try_to_writeback(struct drm_i915_gem_object *obj,
> >>> -                        unsigned int flags)
> >>> +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags)
> >>>    {
> >>> +   if (obj->ops->shrinker_release_pages)
> >>> +           return obj->ops->shrinker_release_pages(obj,
> >>> +                                                   flags & I915_SHRINK_WRITEBACK);
> >>> +
> >>>      switch (obj->mm.madv) {
> >>>      case I915_MADV_DONTNEED:
> >>>              i915_gem_object_truncate(obj);
> >>> -           return;
> >>> +           return 0;
> >>>      case __I915_MADV_PURGED:
> >>> -           return;
> >>> +           return 0;
> >>>      }
> >>>
> >>>      if (flags & I915_SHRINK_WRITEBACK)
> >>>              i915_gem_object_writeback(obj);
> >>> +
> >>> +   return 0;
> >>>    }
> >>>
> >>>    /**
> >>> @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
> >>>                              }
> >>>
> >>>                              if (!__i915_gem_object_put_pages(obj)) {
> >>> -                                   try_to_writeback(obj, shrink);
> >>> -                                   count += obj->base.size >> PAGE_SHIFT;
> >>> +                                   if (!try_to_writeback(obj, shrink))
> >>> +                                           count += obj->base.size >> PAGE_SHIFT;
> >>>                              }
> >>>                              if (!ww)
> >>>                                      i915_gem_object_unlock(obj);
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> >>> index a77e90f300fe..c7402995a8f9 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> >>> @@ -35,6 +35,8 @@
> >>>     * @ttm: The base TTM page vector.
> >>>     * @dev: The struct device used for dma mapping and unmapping.
> >>>     * @cached_st: The cached scatter-gather table.
> >>> + * @is_shmem: Set if using shmem.
> >>> + * @filp: The shmem file, if using shmem backend.
> >>>     *
> >>>     * Note that DMA may be going on right up to the point where the page-
> >>>     * vector is unpopulated in delayed destroy. Hence keep the
> >>> @@ -46,6 +48,9 @@ struct i915_ttm_tt {
> >>>      struct ttm_tt ttm;
> >>>      struct device *dev;
> >>>      struct sg_table *cached_st;
> >>> +
> >>> +   bool is_shmem;
> >>> +   struct file *filp;
> >>>    };
> >>>
> >>>    static const struct ttm_place sys_placement_flags = {
> >>> @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
> >>>      placement->busy_placement = busy;
> >>>    }
> >>>
> >>> +static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
> >>> +                                 struct ttm_tt *ttm,
> >>> +                                 struct ttm_operation_ctx *ctx) {
> >>> +   struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
> >>> +   struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
> >>> +   struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
> >>> +   const unsigned int max_segment = i915_sg_segment_size();
> >>> +   const size_t size = ttm->num_pages << PAGE_SHIFT;
> >>> +   struct file *filp = i915_tt->filp;
> >>> +   struct sgt_iter sgt_iter;
> >>> +   struct sg_table *st;
> >>> +   struct page *page;
> >>> +   unsigned long i;
> >>> +   int err;
> >>> +
> >>> +   if (!filp) {
> >>> +           struct address_space *mapping;
> >>> +           gfp_t mask;
> >>> +
> >>> +           filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
> >>> +           if (IS_ERR(filp))
> >>> +                   return PTR_ERR(filp);
> >>> +
> >>> +           mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> >>> +
> >>> +           mapping = filp->f_mapping;
> >>> +           mapping_set_gfp_mask(mapping, mask);
> >>> +           GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
> >>> +
> >>> +           i915_tt->filp = filp;
> >>> +   }
> >>> +
> >>> +   st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
> >>> +   if (IS_ERR(st))
> >>> +           return PTR_ERR(st);
> >>> +
> >>> +   err = dma_map_sg_attrs(i915_tt->dev,
> >>> +                          st->sgl, st->nents,
> >>> +                          PCI_DMA_BIDIRECTIONAL,
> >>> +                          DMA_ATTR_SKIP_CPU_SYNC |
> >>> +                          DMA_ATTR_NO_KERNEL_MAPPING |
> >>> +                          DMA_ATTR_NO_WARN);
> >>> +   if (err <= 0) {
> >>> +           err = -EINVAL;
> >>> +           goto err_free_st;
> >>> +   }
> >>> +
> >>> +   i = 0;
> >>> +   for_each_sgt_page(page, sgt_iter, st)
> >>> +           ttm->pages[i++] = page;
> >>> +
> >>> +   if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> >>> +           ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> >>> +
> >>> +   i915_tt->cached_st = st;
> >>> +   return 0;
> >>> +
> >>> +err_free_st:
> >>> +   shmem_free_st(st, filp->f_mapping, false, false);
> >>> +   return err;
> >>> +}
> >>> +
> >>> +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {
> >>> +   struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
> >>> +   bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
> >>> +
> >>> +   dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
> >>> +                i915_tt->cached_st->nents,
> >>> +                PCI_DMA_BIDIRECTIONAL);
> >>> +
> >>> +   shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
> >>> +                 file_inode(i915_tt->filp)->i_mapping,
> >>> +                 backup, backup);
> >>>
> >>> Should we do something to undo the shmem_file_setup operation here?
> >> From its implementation it takes a reference on the inode and
> >> allocates a file:
> >> https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084
> >>>
> >>> Regards,
> >>> Oak
> >>
> >> Hi, Oak,
> >>
> >> That's done in i915_ttm_tt_destroy() afaict.
> >>
> >> /Thomas
> >
> > Do we know whether this tt is backed by shmem at create time? If yes, I
> > think a better place to do the shmem_file_setup is in i915_ttm_tt_create -
> > this pairs with the fput in i915_ttm_tt_destroy. If we don't have such
> > information at tt create time, I agree with the current approach.
> 
> IIRC the tt_create is called even if the object will most likely just
> end up being placed in VRAM, so calling shmem_file_setup in there seemed
> potentially wasteful, since the shmem file might never even be needed.
> Hence keeping it in populate instead.
> 
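To make that lifetime concrete, here is a rough, hypothetical sketch of the lazy pattern being described (struct my_tt and the my_tt_* helpers are made up for illustration; the real code is the i915_ttm_tt_* hooks quoted in the patch above). The shmem file is only created the first time system pages are actually needed, and the matching fput() lives in the destroy hook, so an object that spends its whole life in VRAM never allocates a shmem file:

#include <linux/err.h>
#include <linux/file.h>
#include <linux/mm.h>
#include <linux/shmem_fs.h>

struct my_tt {
        struct file *filp;      /* lazily created shmem file, may stay NULL */
};

static int my_tt_populate(struct my_tt *tt, loff_t size)
{
        if (!tt->filp) {
                /* First time we actually need system pages: create the file. */
                struct file *filp = shmem_file_setup("my-tt", size, VM_NORESERVE);

                if (IS_ERR(filp))
                        return PTR_ERR(filp);

                tt->filp = filp;
        }

        /* ... allocate and dma-map pages from tt->filp->f_mapping ... */
        return 0;
}

static void my_tt_destroy(struct my_tt *tt)
{
        /* Pairs with the lazy shmem_file_setup() above, if it ever ran. */
        if (tt->filp)
                fput(tt->filp);
}

The create hook stays cheap, and the error path is simple: if populate never creates the file, destroy has nothing to undo.
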
> >
> > Regards,
> > Oak
> >
> >>
> >>
> >>
> >>>
> >>> +}
> >>> +
> >>>    static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
> >>>                                       uint32_t page_flags)
> >>>    {
> >>>      struct ttm_resource_manager *man =
> >>>              ttm_manager_type(bo->bdev, bo->resource->mem_type);
> >>>      struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> >>> +   enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
> >>>      struct i915_ttm_tt *i915_tt;
> >>>      int ret;
> >>>
> >>> @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
> >>>          man->use_tt)
> >>>              page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
> >>>
> >>> -   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
> >>> -                     i915_ttm_select_tt_caching(obj));
> >>> -   if (ret) {
> >>> -           kfree(i915_tt);
> >>> -           return NULL;
> >>> +   if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
> >>> +           page_flags |= TTM_TT_FLAG_EXTERNAL |
> >>> +                         TTM_TT_FLAG_EXTERNAL_MAPPABLE;
> >>> +           i915_tt->is_shmem = true;
> >>>      }
> >>>
> >>> +   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
> >>> +   if (ret)
> >>> +           goto err_free;
> >>> +
> >>>      i915_tt->dev = obj->base.dev->dev;
> >>>
> >>>      return &i915_tt->ttm;
> >>> +
> >>> +err_free:
> >>> +   kfree(i915_tt);
> >>> +   return NULL;
> >>> +}
> >>> +
> >>> +static int i915_ttm_tt_populate(struct ttm_device *bdev,
> >>> +                           struct ttm_tt *ttm,
> >>> +                           struct ttm_operation_ctx *ctx)
> >>> +{
> >>> +   struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
> >>> +
> >>> +   if (i915_tt->is_shmem)
> >>> +           return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
> >>> +
> >>> +   return ttm_pool_alloc(&bdev->pool, ttm, ctx);
> >>>    }
> >>>
> >>>    static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)  {
> >>>      struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
> >>>
> >>> -   if (i915_tt->cached_st) {
> >>> -           dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
> >>> -                             DMA_BIDIRECTIONAL, 0);
> >>> -           sg_free_table(i915_tt->cached_st);
> >>> -           kfree(i915_tt->cached_st);
> >>> -           i915_tt->cached_st = NULL;
> >>> +   if (i915_tt->is_shmem) {
> >>> +           i915_ttm_tt_shmem_unpopulate(ttm);
> >>> +   } else {
> >>> +           if (i915_tt->cached_st) {
> >>> +                   dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
> >>> +                                     DMA_BIDIRECTIONAL, 0);
> >>> +                   sg_free_table(i915_tt->cached_st);
> >>> +                   kfree(i915_tt->cached_st);
> >>> +                   i915_tt->cached_st = NULL;
> >>> +           }
> >>> +           ttm_pool_free(&bdev->pool, ttm);
> >>>      }
> >>> -   ttm_pool_free(&bdev->pool, ttm);
> >>>    }
> >>>
> >>>    static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)  {
> >>>      struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
> >>>
> >>> +   if (i915_tt->filp)
> >>> +           fput(i915_tt->filp);
> >>> +
> >>>      ttm_tt_fini(ttm);
> >>>      kfree(i915_tt);
> >>>    }
> >>> @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo,  {
> >>>      struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> >>>
> >>> +   /*
> >>> +    * EXTERNAL objects should never be swapped out by TTM, instead we need
> >>> +    * to handle that ourselves. TTM will already skip such objects for us,
> >>> +    * but we would like to avoid grabbing locks for no good reason.
> >>> +    */
> >>> +   if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
> >>> +           return -EBUSY;
> >>> +
> >>>      /* Will do for now. Our pinned objects are still on TTM's LRU lists */
> >>>      return i915_gem_object_evictable(obj);  } @@ -328,9 +445,11 @@
> >> static void i915_ttm_adjust_gem_after_move(struct
> drm_i915_gem_object
> >> *obj)
> >>>      i915_gem_object_set_cache_coherency(obj, cache_level);  }
> >>>
> >>> -static void i915_ttm_purge(struct drm_i915_gem_object *obj)
> >>> +static int __i915_ttm_purge(struct drm_i915_gem_object *obj)
> >>>    {
> >>>      struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> >>> +   struct i915_ttm_tt *i915_tt =
> >>> +           container_of(bo->ttm, typeof(*i915_tt), ttm);
> >>>      struct ttm_operation_ctx ctx = {
> >>>              .interruptible = true,
> >>>              .no_wait_gpu = false,
> >>> @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)
> >>>      int ret;
> >>>
> >>>      if (obj->mm.madv == __I915_MADV_PURGED)
> >>> -           return;
> >>> +           return 0;
> >>>
> >>> -   /* TTM's purge interface. Note that we might be reentering. */
> >>>      ret = ttm_bo_validate(bo, &place, &ctx);
> >>> -   if (!ret) {
> >>> -           obj->write_domain = 0;
> >>> -           obj->read_domains = 0;
> >>> -           i915_ttm_adjust_gem_after_move(obj);
> >>> -           i915_ttm_free_cached_io_st(obj);
> >>> -           obj->mm.madv = __I915_MADV_PURGED;
> >>> +   if (ret)
> >>> +           return ret;
> >>> +
> >>> +   if (bo->ttm && i915_tt->filp) {
> >>> +           /*
> >>> +            * The below fput(which eventually calls shmem_truncate) might
> >>> +            * be delayed by worker, so when directly called to purge the
> >>> +            * pages(like by the shrinker) we should try to be more
> >>> +            * aggressive and release the pages immediately.
> >>> +            */
> >>> +           shmem_truncate_range(file_inode(i915_tt->filp),
> >>> +                                0, (loff_t)-1);
> >>> +           fput(fetch_and_zero(&i915_tt->filp));
> >>> +   }
> >>> +
> >>> +   obj->write_domain = 0;
> >>> +   obj->read_domains = 0;
> >>> +   i915_ttm_adjust_gem_after_move(obj);
> >>> +   i915_ttm_free_cached_io_st(obj);
> >>> +   obj->mm.madv = __I915_MADV_PURGED;
> >>> +   return 0;
> >>> +}
> >>> +
> >>> +static void i915_ttm_purge(struct drm_i915_gem_object *obj) {
> >>> +   __i915_ttm_purge(obj);
> >>> +}
> >>> +
> >>> +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
> >>> +                                      bool should_writeback)
> >>> +{
> >>> +   struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> >>> +   struct i915_ttm_tt *i915_tt =
> >>> +           container_of(bo->ttm, typeof(*i915_tt), ttm);
> >>> +   struct ttm_operation_ctx ctx = {
> >>> +           .interruptible = true,
> >>> +           .no_wait_gpu = false,
> >>> +   };
> >>> +   struct ttm_placement place = {};
> >>> +   int ret;
> >>> +
> >>> +   if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
> >>> +           return 0;
> >>> +
> >>> +   GEM_BUG_ON(!i915_tt->is_shmem);
> >>> +
> >>> +   if (!i915_tt->filp)
> >>> +           return 0;
> >>> +
> >>> +   switch (obj->mm.madv) {
> >>> +   case I915_MADV_DONTNEED:
> >>> +           return __i915_ttm_purge(obj);
> >>> +   case __I915_MADV_PURGED:
> >>> +           return 0;
> >>> +   }
> >>> +
> >>> +   if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> >>> +           return 0;
> >>> +
> >>> +   bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
> >>> +   ret = ttm_bo_validate(bo, &place, &ctx);
> >>> +   if (ret) {
> >>> +           bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> >>> +           return ret;
> >>>      }
> >>> +
> >>> +   if (should_writeback)
> >>> +           __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
> >>> +
> >>> +   return 0;
> >>>    }
> >>>
> >>>    static void i915_ttm_swap_notify(struct ttm_buffer_object *bo)
> >>> @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,
> >>>
> >>>    static struct ttm_device_funcs i915_ttm_bo_driver = {
> >>>      .ttm_tt_create = i915_ttm_tt_create,
> >>> +   .ttm_tt_populate = i915_ttm_tt_populate,
> >>>      .ttm_tt_unpopulate = i915_ttm_tt_unpopulate,
> >>>      .ttm_tt_destroy = i915_ttm_tt_destroy,
> >>>      .eviction_valuable = i915_ttm_eviction_valuable,
> >>> @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
> >>>      }
> >>>
> >>>      if (!i915_gem_object_has_pages(obj)) {
> >>> +           struct i915_ttm_tt *i915_tt =
> >>> +                   container_of(bo->ttm, typeof(*i915_tt), ttm);
> >>> +
> >>>              /* Object either has a page vector or is an iomem object */
> >>>              st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
> >>>              if (IS_ERR(st))
> >>>                      return PTR_ERR(st);
> >>>
> >>>              __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
> >>> +           if (!bo->ttm || !i915_tt->is_shmem)
> >>> +                   i915_gem_object_make_unshrinkable(obj);
> >>>      }
> >>>
> >>>      return ret;
> >>> @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
> >>>  static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)  {
> >>>      struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
> >>> +   struct i915_ttm_tt *i915_tt =
> >>> +           container_of(bo->ttm, typeof(*i915_tt), ttm);
> >>>
> >>>      /*
> >>>       * Don't manipulate the TTM LRUs while in TTM bo destruction.
> >>> @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
> >>>       * Put on the correct LRU list depending on the MADV status
> >>>       */
> >>>      spin_lock(&bo->bdev->lru_lock);
> >>> -   if (obj->mm.madv != I915_MADV_WILLNEED) {
> >>> +   if (bo->ttm && i915_tt->filp) {
> >>> +           /* Try to keep shmem_tt from being considered for shrinking. */
> >>> +           bo->priority = TTM_MAX_BO_PRIORITY - 1;
> >>> +   } else if (obj->mm.madv != I915_MADV_WILLNEED) {
> >>>              bo->priority = I915_TTM_PRIO_PURGE;
> >>>      } else if (!i915_gem_object_has_pages(obj)) {
> >>>              if (bo->priority < I915_TTM_PRIO_HAS_PAGES)
> >>> @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
> >>>      .get_pages = i915_ttm_get_pages,
> >>>      .put_pages = i915_ttm_put_pages,
> >>>      .truncate = i915_ttm_purge,
> >>> +   .shrinker_release_pages = i915_ttm_shrinker_release_pages,
> >>> +
> >>>      .adjust_lru = i915_ttm_adjust_lru,
> >>>      .delayed_free = i915_ttm_delayed_free,
> >>>      .migrate = i915_ttm_migrate,
> >>> +
> >>>      .mmap_offset = i915_ttm_mmap_offset,
> >>>      .mmap_ops = &vm_ops_ttm,
> >>>    };
> >>> @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
> >>>      drm_gem_private_object_init(&i915->drm, &obj->base, size);
> >>>      i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
> >>>      i915_gem_object_init_memory_region(obj, mem);
> >>> -   i915_gem_object_make_unshrinkable(obj);
> >>>      INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
> >>>      mutex_init(&obj->ttm.get_io_page.lock);
> >>>      bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :
> >>> --
> >>> 2.26.3
> >>>

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2021-10-05 18:33 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-27 11:41 [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access Matthew Auld
2021-09-27 11:41 ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 02/13] drm/ttm: stop setting page->index for the ttm_tt Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 03/13] drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 04/13] drm/ttm: remove TTM_PAGE_FLAG_NO_RETRY Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 05/13] drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/ Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 06/13] drm/ttm: add some kernel-doc for TTM_TT_FLAG_* Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 07/13] drm/ttm: add TTM_TT_FLAG_EXTERNAL_MAPPABLE Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 08/13] drm/i915/gem: Break out some shmem backend utils Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-27 11:41 ` [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-29 11:07   ` Thomas Hellström
2021-09-29 11:07     ` [Intel-gfx] " Thomas Hellström
2021-10-05  2:05   ` Zeng, Oak
2021-10-05  2:05     ` Zeng, Oak
2021-10-05 13:48     ` Thomas Hellström
2021-10-05 14:23       ` Zeng, Oak
2021-10-05 14:23         ` Zeng, Oak
2021-10-05 17:07         ` Matthew Auld
2021-10-05 18:33           ` Zeng, Oak
2021-10-05 18:33             ` Zeng, Oak
2021-09-27 11:41 ` [PATCH v5 10/13] drm/i915: try to simplify make_{un}shrinkable Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-29 13:00   ` Thomas Hellström
2021-09-29 13:00     ` [Intel-gfx] " Thomas Hellström
2021-09-27 11:41 ` [PATCH v5 11/13] drm/i915/ttm: make evicted shmem pages visible to the shrinker Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-29 11:47   ` Thomas Hellström
2021-09-29 11:47     ` [Intel-gfx] " Thomas Hellström
2021-09-27 11:41 ` [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-29 11:54   ` Thomas Hellström
2021-09-29 11:54     ` [Intel-gfx] " Thomas Hellström
2021-09-30 10:04     ` Michel Dänzer
2021-09-30 10:04       ` [Intel-gfx] " Michel Dänzer
2021-09-30 12:27       ` Matthew Auld
2021-09-30 12:27         ` [Intel-gfx] " Matthew Auld
2021-09-30 12:55         ` Michel Dänzer
2021-09-30 12:55           ` [Intel-gfx] " Michel Dänzer
2021-09-27 11:41 ` [PATCH v5 13/13] drm/i915/ttm: enable shmem tt backend Matthew Auld
2021-09-27 11:41   ` [Intel-gfx] " Matthew Auld
2021-09-29 12:00   ` Thomas Hellström
2021-09-29 12:00     ` [Intel-gfx] " Thomas Hellström
2021-09-27 11:47 ` [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access Christian König
2021-09-27 11:47   ` [Intel-gfx] " Christian König
2021-09-27 16:14   ` Matthew Auld
2021-09-29 12:01     ` Christian König
2021-09-29 13:45       ` Matthew Auld
2021-09-27 17:31 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [v5,01/13] " Patchwork
2021-09-27 17:34 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-09-27 18:01 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-09-27 21:29 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2021-10-05  2:17 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [v5,01/13] drm/ttm: stop calling tt_swapin in vm_access (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.