* [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref
@ 2023-02-14 11:21 Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar Matthew Auld
                   ` (13 more replies)
  0 siblings, 14 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

Make sure we properly release the forcewake ref on all error paths.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_mmio.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 8a953df2b468..497f643271ae 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -426,12 +426,16 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 	if (args->flags & DRM_XE_MMIO_WRITE) {
 		switch (bits_flag) {
 		case DRM_XE_MMIO_8BIT:
-			return -EINVAL; /* TODO */
+			ret = -EINVAL; /* TODO */
+			goto exit;
 		case DRM_XE_MMIO_16BIT:
-			return -EINVAL; /* TODO */
+			ret = -EINVAL; /* TODO */
+			goto exit;
 		case DRM_XE_MMIO_32BIT:
-			if (XE_IOCTL_ERR(xe, args->value > U32_MAX))
-				return -EINVAL;
+			if (XE_IOCTL_ERR(xe, args->value > U32_MAX)) {
+				ret = -EINVAL;
+				goto exit;
+			}
 			xe_mmio_write32(to_gt(xe), args->addr, args->value);
 			break;
 		case DRM_XE_MMIO_64BIT:
@@ -447,9 +451,11 @@ int xe_mmio_ioctl(struct drm_device *dev, void *data,
 	if (args->flags & DRM_XE_MMIO_READ) {
 		switch (bits_flag) {
 		case DRM_XE_MMIO_8BIT:
-			return -EINVAL; /* TODO */
+			ret = -EINVAL; /* TODO */
+			break;
 		case DRM_XE_MMIO_16BIT:
-			return -EINVAL; /* TODO */
+			ret = -EINVAL; /* TODO */
+			break;
 		case DRM_XE_MMIO_32BIT:
 			args->value = xe_mmio_read32(to_gt(xe), args->addr);
 			break;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread
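The fix above follows the kernel's usual centralized-exit pattern: once the forcewake reference is taken, every error path must funnel through a single cleanup label instead of returning directly. A minimal standalone sketch of that pattern (the names and the ref counter are illustrative, not the actual driver code):

```c
#include <errno.h>

/* Stand-ins for the forcewake get/put pair, with a counter so a leak
 * would be observable. */
static int refs_held;
static void fw_get(void) { refs_held++; }
static void fw_put(void) { refs_held--; }

/* Every path after fw_get() funnels through the exit label, so the
 * reference cannot be leaked on an early error return. */
static int do_mmio_op(int width)
{
	int ret = 0;

	fw_get();
	switch (width) {
	case 8:
	case 16:
		ret = -EINVAL;	/* unsupported width: still releases the ref */
		goto exit;
	case 32:
		ret = 0;	/* the actual register access would go here */
		break;
	default:
		ret = -EINVAL;
		goto exit;
	}
exit:
	fw_put();
	return ret;
}
```

Compare with the pre-patch code, where the `return -EINVAL;` statements in the switch skipped the cleanup entirely.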

* [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 16:11   ` Lucas De Marchi
  2023-02-14 11:21 ` [Intel-xe] [PATCH 03/15] drm/xe/mmio: s/lmem/vram/ Matthew Auld
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe; +Cc: Lucas De Marchi

The driver should still be functional with small-bar, just that the vram
size is clamped to the BAR size (until we add proper support for tiered
vram). For stolen we shouldn't iomap anything if the BAR size doesn't
also contain the stolen portion. Stolen should still be functional,
just that allocating CPU visible io memory will always return an error.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/209
Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
index b4e9c88644e4..e5a992ce432a 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
@@ -150,7 +150,7 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
 		    stolen_size);
 
-	if (!xe_ttm_stolen_inaccessible(xe))
+	if (mgr->io_base && !xe_ttm_stolen_inaccessible(xe))
 		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, stolen_size);
 }
 
-- 
2.39.1


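The one-line change boils down to a two-condition guard before mapping: there must be an io_base at all, and the region must actually be CPU accessible. A hedged sketch of just that decision (function and parameter names are illustrative):

```c
#include <stdbool.h>
#include <stdint.h>

/* Map stolen memory only when the BAR gives us an IO base AND the
 * stolen range is reachable through it; on small-bar the BAR may not
 * cover stolen at all, in which case io_base is zero. */
static bool should_map_stolen(uint64_t io_base, bool inaccessible)
{
	return io_base != 0 && !inaccessible;
}
```

When this returns false, allocations still work; only requests for CPU-visible io memory fail, matching the commit message.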

* [Intel-xe] [PATCH 03/15] drm/xe/mmio: s/lmem/vram/
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 04/15] drm/xe: prefer xe_bo_create_pin_map() Matthew Auld
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

This seems to be the preferred nomenclature in xe.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_mmio.c   | 40 +++++++++++++++++-----------------
 drivers/gpu/drm/xe/xe_module.c |  6 ++---
 drivers/gpu/drm/xe/xe_module.h |  2 +-
 3 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 497f643271ae..e9ff5b9577fa 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -69,7 +69,7 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
 	return 1;
 }
 
-static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
+static int xe_resize_vram_bar(struct xe_device *xe, resource_size_t vram_size)
 {
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	struct pci_bus *root = pdev->bus;
@@ -79,31 +79,31 @@ static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
 	u32 pci_cmd;
 	int i;
 	int ret;
-	u64 force_lmem_bar_size = xe_force_lmem_bar_size;
+	u64 force_vram_bar_size = xe_force_vram_bar_size;
 
 	current_size = roundup_pow_of_two(pci_resource_len(pdev, GEN12_LMEM_BAR));
 
-	if (force_lmem_bar_size) {
+	if (force_vram_bar_size) {
 		u32 bar_sizes;
 
-		rebar_size = force_lmem_bar_size * (resource_size_t)SZ_1M;
+		rebar_size = force_vram_bar_size * (resource_size_t)SZ_1M;
 		bar_sizes = pci_rebar_get_possible_sizes(pdev, GEN12_LMEM_BAR);
 
 		if (rebar_size == current_size)
 			return 0;
 
 		if (!(bar_sizes & BIT(pci_rebar_bytes_to_size(rebar_size))) ||
-		    rebar_size >= roundup_pow_of_two(lmem_size)) {
-			rebar_size = lmem_size;
+		    rebar_size >= roundup_pow_of_two(vram_size)) {
+			rebar_size = vram_size;
 			drm_info(&xe->drm,
 				 "Given bar size is not within supported size, setting it to default: %llu\n",
-				 (u64)lmem_size >> 20);
+				 (u64)vram_size >> 20);
 		}
 	} else {
 		rebar_size = current_size;
 
-		if (rebar_size != roundup_pow_of_two(lmem_size))
-			rebar_size = lmem_size;
+		if (rebar_size != roundup_pow_of_two(vram_size))
+			rebar_size = vram_size;
 		else
 			return 0;
 	}
@@ -118,7 +118,7 @@ static int xe_resize_lmem_bar(struct xe_device *xe, resource_size_t lmem_size)
 	}
 
 	if (!root_res) {
-		drm_info(&xe->drm, "Can't resize LMEM BAR - platform support is missing\n");
+		drm_info(&xe->drm, "Can't resize VRAM BAR - platform support is missing\n");
 		return -1;
 	}
 
@@ -169,7 +169,7 @@ int xe_mmio_total_vram_size(struct xe_device *xe, u64 *vram_size, u64 *usable_si
 	if (usable_size) {
 		reg = xe_gt_mcr_unicast_read_any(gt, XEHP_FLAT_CCS_BASE_ADDR);
 		*usable_size = (u64)REG_FIELD_GET(GENMASK(31, 8), reg) * SZ_64K;
-		drm_info(&xe->drm, "lmem_size: 0x%llx usable_size: 0x%llx\n",
+		drm_info(&xe->drm, "vram_size: 0x%llx usable_size: 0x%llx\n",
 			 *vram_size, *usable_size);
 	}
 
@@ -181,7 +181,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
 	struct xe_gt *gt;
 	u8 id;
-	u64 lmem_size;
+	u64 vram_size;
 	u64 original_size;
 	u64 current_size;
 	u64 usable_size;
@@ -208,29 +208,29 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	gt = xe_device_get_gt(xe, 0);
 	original_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
 
-	err = xe_mmio_total_vram_size(xe, &lmem_size, &usable_size);
+	err = xe_mmio_total_vram_size(xe, &vram_size, &usable_size);
 	if (err)
 		return err;
 
-	resize_result = xe_resize_lmem_bar(xe, lmem_size);
+	resize_result = xe_resize_vram_bar(xe, vram_size);
 	current_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
 	xe->mem.vram.io_start = pci_resource_start(pdev, GEN12_LMEM_BAR);
 
-	xe->mem.vram.size = min(current_size, lmem_size);
+	xe->mem.vram.size = min(current_size, vram_size);
 
 	if (!xe->mem.vram.size)
 		return -EIO;
 
 	if (resize_result > 0)
-		drm_info(&xe->drm, "Successfully resize LMEM from %lluMiB to %lluMiB\n",
+		drm_info(&xe->drm, "Successfully resize VRAM from %lluMiB to %lluMiB\n",
 			 (u64)original_size >> 20,
 			 (u64)current_size >> 20);
-	else if (xe->mem.vram.size < lmem_size && !xe_force_lmem_bar_size)
+	else if (xe->mem.vram.size < vram_size && !xe_force_vram_bar_size)
 		drm_info(&xe->drm, "Using a reduced BAR size of %lluMiB. Consider enabling 'Resizable BAR' support in your BIOS.\n",
 			 (u64)xe->mem.vram.size >> 20);
-	if (xe->mem.vram.size < lmem_size)
+	if (xe->mem.vram.size < vram_size)
 		drm_warn(&xe->drm, "Restricting VRAM size to PCI resource size (0x%llx->0x%llx)\n",
-			 lmem_size, (u64)xe->mem.vram.size);
+			 vram_size, (u64)xe->mem.vram.size);
 
 	xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.size);
 	xe->mem.vram.size = min_t(u64, xe->mem.vram.size, usable_size);
@@ -361,7 +361,7 @@ int xe_mmio_init(struct xe_device *xe)
 	 * and we should not continue with driver initialization.
 	 */
 	if (IS_DGFX(xe) && !(xe_mmio_read32(gt, GU_CNTL.reg) & LMEM_INIT)) {
-		drm_err(&xe->drm, "LMEM not initialized by firmware\n");
+		drm_err(&xe->drm, "VRAM not initialized by firmware\n");
 		return -ENODEV;
 	}
 
diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
index 9cd1663f83f6..d014fd3325db 100644
--- a/drivers/gpu/drm/xe/xe_module.c
+++ b/drivers/gpu/drm/xe/xe_module.c
@@ -20,9 +20,9 @@ bool enable_display = true;
 module_param_named(enable_display, enable_display, bool, 0444);
 MODULE_PARM_DESC(enable_display, "Enable display");
 
-u32 xe_force_lmem_bar_size;
-module_param_named(lmem_bar_size, xe_force_lmem_bar_size, uint, 0600);
-MODULE_PARM_DESC(lmem_bar_size, "Set the lmem bar size(in MiB)");
+u32 xe_force_vram_bar_size;
+module_param_named(vram_bar_size, xe_force_vram_bar_size, uint, 0600);
+MODULE_PARM_DESC(vram_bar_size, "Set the vram bar size(in MiB)");
 
 int xe_guc_log_level = 5;
 module_param_named(guc_log_level, xe_guc_log_level, int, 0600);
diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
index 2c6ee46f5595..86916c176382 100644
--- a/drivers/gpu/drm/xe/xe_module.h
+++ b/drivers/gpu/drm/xe/xe_module.h
@@ -8,6 +8,6 @@
 /* Module modprobe variables */
 extern bool enable_guc;
 extern bool enable_display;
-extern u32 xe_force_lmem_bar_size;
+extern u32 xe_force_vram_bar_size;
 extern int xe_guc_log_level;
 extern char *xe_param_force_probe;
-- 
2.39.1



* [Intel-xe] [PATCH 04/15] drm/xe: prefer xe_bo_create_pin_map()
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 03/15] drm/xe/mmio: s/lmem/vram/ Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 05/15] drm/xe/bo: explicitly reject zero sized BO Matthew Auld
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

With small-bar we likely want to annotate all the kernel users that
require CPU access to vram. If xe_bo_create_pin_map() is the central
creation path for such objects then we have a single place to add that
annotation.

This also simplifies the code and fixes what appears to be a double
xe_bo_put(hwe->hwsp) in the error handling.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 23 +++-----------
 drivers/gpu/drm/xe/xe_lrc.c       | 53 +++++++++++--------------------
 drivers/gpu/drm/xe/xe_lrc_types.h |  1 -
 3 files changed, 22 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
index fd89dd90131c..6700d5e4f173 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -387,24 +387,14 @@ static int hw_engine_init(struct xe_gt *gt, struct xe_hw_engine *hwe,
 	xe_reg_sr_apply_mmio(&hwe->reg_sr, gt);
 	xe_reg_sr_apply_whitelist(&hwe->reg_whitelist, hwe->mmio_base, gt);
 
-	hwe->hwsp = xe_bo_create_locked(xe, gt, NULL, SZ_4K, ttm_bo_type_kernel,
-					XE_BO_CREATE_VRAM_IF_DGFX(gt) |
-					XE_BO_CREATE_GGTT_BIT);
+	hwe->hwsp = xe_bo_create_pin_map(xe, gt, NULL, SZ_4K, ttm_bo_type_kernel,
+					 XE_BO_CREATE_VRAM_IF_DGFX(gt) |
+					 XE_BO_CREATE_GGTT_BIT);
 	if (IS_ERR(hwe->hwsp)) {
 		err = PTR_ERR(hwe->hwsp);
 		goto err_name;
 	}
 
-	err = xe_bo_pin(hwe->hwsp);
-	if (err)
-		goto err_unlock_put_hwsp;
-
-	err = xe_bo_vmap(hwe->hwsp);
-	if (err)
-		goto err_unpin_hwsp;
-
-	xe_bo_unlock_no_vm(hwe->hwsp);
-
 	err = xe_lrc_init(&hwe->kernel_lrc, hwe, NULL, NULL, SZ_16K);
 	if (err)
 		goto err_hwsp;
@@ -430,15 +420,10 @@ static int hw_engine_init(struct xe_gt *gt, struct xe_hw_engine *hwe,
 
 	return 0;
 
-err_unpin_hwsp:
-	xe_bo_unpin(hwe->hwsp);
-err_unlock_put_hwsp:
-	xe_bo_unlock_no_vm(hwe->hwsp);
-	xe_bo_put(hwe->hwsp);
 err_kernel_lrc:
 	xe_lrc_finish(&hwe->kernel_lrc);
 err_hwsp:
-	xe_bo_put(hwe->hwsp);
+	xe_bo_unpin_map_no_vm(hwe->hwsp);
 err_name:
 	hwe->name = NULL;
 
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 347ff9b34494..e4333ae9cc5e 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -616,7 +616,11 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 
 	lrc->flags = 0;
 
-	lrc->bo = xe_bo_create_locked(xe, hwe->gt, vm,
+	/*
+	 * FIXME: Perma-pinning LRC as we don't yet support moving GGTT address
+	 * via VM bind calls.
+	 */
+	lrc->bo = xe_bo_create_pin_map(xe, hwe->gt, vm,
 				      ring_size + xe_lrc_size(xe, hwe->class),
 				      ttm_bo_type_kernel,
 				      XE_BO_CREATE_VRAM_IF_DGFX(hwe->gt) |
@@ -629,21 +633,6 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	else
 		lrc->full_gt = hwe->gt;
 
-	/*
-	 * FIXME: Perma-pinning LRC as we don't yet support moving GGTT address
-	 * via VM bind calls.
-	 */
-	err = xe_bo_pin(lrc->bo);
-	if (err)
-		goto err_unlock_put_bo;
-	lrc->flags |= XE_LRC_PINNED;
-
-	err = xe_bo_vmap(lrc->bo);
-	if (err)
-		goto err_unpin_bo;
-
-	xe_bo_unlock_vm_held(lrc->bo);
-
 	lrc->ring.size = ring_size;
 	lrc->ring.tail = 0;
 
@@ -653,8 +642,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 	if (!gt->default_lrc[hwe->class]) {
 		init_data = empty_lrc_data(hwe);
 		if (!init_data) {
-			xe_lrc_finish(lrc);
-			return -ENOMEM;
+			err = -ENOMEM;
+			goto err_lrc_finish;
 		}
 	}
 
@@ -711,12 +700,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 
 	return 0;
 
-err_unpin_bo:
-	if (lrc->flags & XE_LRC_PINNED)
-		xe_bo_unpin(lrc->bo);
-err_unlock_put_bo:
-	xe_bo_unlock_vm_held(lrc->bo);
-	xe_bo_put(lrc->bo);
+err_lrc_finish:
+	xe_lrc_finish(lrc);
 	return err;
 }
 
@@ -725,17 +710,15 @@ void xe_lrc_finish(struct xe_lrc *lrc)
 	struct ww_acquire_ctx ww;
 
 	xe_hw_fence_ctx_finish(&lrc->fence_ctx);
-	if (lrc->flags & XE_LRC_PINNED) {
-		if (lrc->bo->vm)
-			xe_vm_lock(lrc->bo->vm, &ww, 0, false);
-		else
-			xe_bo_lock_no_vm(lrc->bo, NULL);
-		xe_bo_unpin(lrc->bo);
-		if (lrc->bo->vm)
-			xe_vm_unlock(lrc->bo->vm, &ww);
-		else
-			xe_bo_unlock_no_vm(lrc->bo);
-	}
+	if (lrc->bo->vm)
+		xe_vm_lock(lrc->bo->vm, &ww, 0, false);
+	else
+		xe_bo_lock_no_vm(lrc->bo, NULL);
+	xe_bo_unpin(lrc->bo);
+	if (lrc->bo->vm)
+		xe_vm_unlock(lrc->bo->vm, &ww);
+	else
+		xe_bo_unlock_no_vm(lrc->bo);
 	xe_bo_put(lrc->bo);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
index 2827efa2091d..8fe08535873d 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -25,7 +25,6 @@ struct xe_lrc {
 
 	/** @flags: LRC flags */
 	u32 flags;
-#define XE_LRC_PINNED BIT(1)
 
 	/** @ring: submission ring state */
 	struct {
-- 
2.39.1


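The refactor replaces three separate steps (create, pin, vmap) with one helper that unwinds in reverse order on failure, which is what removes the double-put in the old error handling. A minimal userspace model of that compose-and-unwind shape (all names and the failure injection are illustrative, not the driver's API):

```c
#include <errno.h>

/* Observable state so the unwind behaviour can be checked. */
static int created, pinned, mapped;

static int bo_create(void)        { created = 1; return 0; }
static int bo_pin(void)           { pinned = 1; return 0; }
static int bo_vmap(int fail)      { if (fail) return -ENOMEM; mapped = 1; return 0; }
static void bo_unpin(void)        { pinned = 0; }
static void bo_put(void)          { created = 0; }

/* Single creation path: on failure each step undoes exactly the steps
 * that succeeded before it, once, in reverse order. */
static int bo_create_pin_map(int fail_vmap)
{
	int err;

	err = bo_create();
	if (err)
		return err;
	err = bo_pin();
	if (err)
		goto err_put;
	err = bo_vmap(fail_vmap);
	if (err)
		goto err_unpin;
	return 0;

err_unpin:
	bo_unpin();
err_put:
	bo_put();
	return err;
}
```

With the old open-coded version, a failure after `xe_bo_create_locked()` could reach both an explicit `xe_bo_put()` and the later `err_hwsp` put, which is the double-put the commit message mentions.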

* [Intel-xe] [PATCH 05/15] drm/xe/bo: explicitly reject zero sized BO
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (2 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 04/15] drm/xe: prefer xe_bo_create_pin_map() Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 06/15] drm/xe/query: zero the region info Matthew Auld
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

In the depths of ttm, allocating the vma node for a zero sized BO
seemingly results in -ENOSPC. However, we should rather reject a zero
size as part of our own ioctl sanity checking, and then treat it as a
programmer error in the lower levels.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 9b7b9c8f84be..98716b56498f 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -962,6 +962,9 @@ struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 	/* Only kernel objects should set GT */
 	XE_BUG_ON(gt && type != ttm_bo_type_kernel);
 
+	if (XE_WARN_ON(!size))
+		return ERR_PTR(-EINVAL);
+
 	if (!bo) {
 		bo = xe_bo_alloc();
 		if (IS_ERR(bo))
@@ -1516,6 +1519,9 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 	if (XE_IOCTL_ERR(xe, args->handle))
 		return -EINVAL;
 
+	if (XE_IOCTL_ERR(xe, !args->size))
+		return -EINVAL;
+
 	if (XE_IOCTL_ERR(xe, args->size > SIZE_MAX))
 		return -EINVAL;
 
-- 
2.39.1


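The pattern here is validating user input at the ioctl boundary so deeper layers can treat bad values as programmer errors. A small sketch of such boundary checks (a simplified stand-in, not the real ioctl handler; the handle check mirrors the existing one in the diff):

```c
#include <errno.h>
#include <stdint.h>

/* Reject obviously invalid creation requests up front, instead of
 * letting a zero size surface later as a confusing -ENOSPC deep in
 * the allocator. */
static int gem_create_check(uint64_t size, uint32_t handle_in)
{
	if (handle_in)		/* handle is an out parameter; must be zero */
		return -EINVAL;
	if (!size)		/* zero sized BO: reject at the boundary */
		return -EINVAL;
	return 0;
}
```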

* [Intel-xe] [PATCH 06/15] drm/xe/query: zero the region info
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (3 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 05/15] drm/xe/bo: explicitly reject zero sized BO Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 07/15] drm/xe/vram: start tracking the io_size Matthew Auld
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

There are also some reserved fields in here which are not currently
cleared when handing back to userspace.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_query.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index 6e904e97f456..f9f21bd1bfd7 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -115,16 +115,13 @@ static int query_memory_usage(struct xe_device *xe,
 		return -EINVAL;
 	}
 
-	usage = kmalloc(size, GFP_KERNEL);
+	usage = kzalloc(size, GFP_KERNEL);
 	if (XE_IOCTL_ERR(xe, !usage))
 		return -ENOMEM;
 
-	usage->pad = 0;
-
 	man = ttm_manager_type(&xe->ttm, XE_PL_TT);
 	usage->regions[0].mem_class = XE_MEM_REGION_CLASS_SYSMEM;
 	usage->regions[0].instance = 0;
-	usage->regions[0].pad = 0;
 	usage->regions[0].min_page_size = PAGE_SIZE;
 	usage->regions[0].max_page_size = PAGE_SIZE;
 	usage->regions[0].total_size = man->size << PAGE_SHIFT;
@@ -138,7 +135,6 @@ static int query_memory_usage(struct xe_device *xe,
 				XE_MEM_REGION_CLASS_VRAM;
 			usage->regions[usage->num_regions].instance =
 				usage->num_regions;
-			usage->regions[usage->num_regions].pad = 0;
 			usage->regions[usage->num_regions].min_page_size =
 				xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ?
 				SZ_64K : PAGE_SIZE;
-- 
2.39.1


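Switching kmalloc to kzalloc matters because any reserved or padding bytes not explicitly written would otherwise carry stale kernel memory back to userspace. A hedged userspace analogue using calloc (struct and field names are illustrative):

```c
#include <stdint.h>
#include <stdlib.h>

/* Userspace-visible struct with a reserved field, mirroring the query
 * region info shape. */
struct region_info {
	uint32_t mem_class;
	uint32_t pad;		/* reserved: must read back as zero */
	uint64_t total_size;
};

/* Zeroed allocation guarantees every byte, including reserved fields
 * the code never touches explicitly, is cleared before being copied
 * out to the caller. */
static struct region_info *alloc_region_info(void)
{
	return calloc(1, sizeof(struct region_info));
}

static uint32_t pad_after_alloc(void)
{
	struct region_info *info = alloc_region_info();
	uint32_t pad = info ? info->pad : 1;

	free(info);
	return pad;
}
```

This is also why the patch can delete the scattered `->pad = 0` assignments: the zeroed allocation covers them all, including any fields added later.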

* [Intel-xe] [PATCH 07/15] drm/xe/vram: start tracking the io_size
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (4 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 06/15] drm/xe/query: zero the region info Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 08/15] drm/xe/buddy: remove the virtualized start Matthew Auld
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

First step towards supporting small-bar is to track the io_size for
vram. We can no longer assume that the io_size == vram size. This way we
know how much is CPU accessible via the BAR, and how much is not.
Effectively giving us a two tiered vram, where in some later patches we
can support different allocation strategies depending on if the memory
needs to be CPU accessible or not.

Note that at this stage we still clamp the vram size to the usable vram size.
Only in the final patch do we turn this on for real, and allow distinct
io_size and vram_size.

v2: (Lucas):
  - Improve the commit message, plus improve the kernel-doc for the
    io_size to give a better sense of what it actually is.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_device_types.h | 14 +++++++--
 drivers/gpu/drm/xe/xe_gt_types.h     | 14 +++++++--
 drivers/gpu/drm/xe/xe_mmio.c         | 44 ++++++++++++++++++++--------
 3 files changed, 55 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 6c71e1b2dbf4..9d5a21252da3 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -175,9 +175,19 @@ struct xe_device {
 	struct {
 		/** @vram: VRAM info for device */
 		struct {
-			/** @io_start: start address of VRAM */
+			/** @io_start: IO start address of VRAM */
 			resource_size_t io_start;
-			/** @size: size of VRAM */
+			/**
+			 * @io_size: IO size of VRAM.
+			 *
+			 * This represents how much of VRAM we can access via
+			 * the CPU through the VRAM BAR. This can be smaller
+			 * than @size, in which case only part of VRAM is CPU
+			 * accessible (typically the first 256M). This
+			 * configuration is known as small-bar.
+			 */
+			resource_size_t io_size;
+			/** @size: Total size of VRAM */
 			resource_size_t size;
 			/** @mapping: pointer to VRAM mappable space */
 			void *__iomem mapping;
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index b01edd3fdc4d..00c43f3a33a2 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -143,9 +143,19 @@ struct xe_gt {
 		 * (virtual split), can be subset of global device VRAM
 		 */
 		struct {
-			/** @io_start: start address of VRAM */
+			/** @io_start: IO start address of this VRAM instance */
 			resource_size_t io_start;
-			/** @size: size of VRAM */
+			/**
+			 * @io_size: IO size of this VRAM instance
+			 *
+			 * This represents how much of this VRAM we can access
+			 * via the CPU through the VRAM BAR. This can be smaller
+			 * than @size, in which case only part of VRAM is CPU
+			 * accessible (typically the first 256M). This
+			 * configuration is known as small-bar.
+			 */
+			resource_size_t io_size;
+			/** @size: size of VRAM. */
 			resource_size_t size;
 			/** @mapping: pointer to VRAM mappable space */
 			void *__iomem mapping;
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index e9ff5b9577fa..4c8b77caeb43 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -183,7 +183,6 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	u8 id;
 	u64 vram_size;
 	u64 original_size;
-	u64 current_size;
 	u64 usable_size;
 	int resize_result, err;
 
@@ -191,11 +190,13 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 		xe->mem.vram.mapping = 0;
 		xe->mem.vram.size = 0;
 		xe->mem.vram.io_start = 0;
+		xe->mem.vram.io_size = 0;
 
 		for_each_gt(gt, xe, id) {
 			gt->mem.vram.mapping = 0;
 			gt->mem.vram.size = 0;
 			gt->mem.vram.io_start = 0;
+			gt->mem.vram.io_size = 0;
 		}
 		return 0;
 	}
@@ -213,10 +214,10 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 		return err;
 
 	resize_result = xe_resize_vram_bar(xe, vram_size);
-	current_size = pci_resource_len(pdev, GEN12_LMEM_BAR);
 	xe->mem.vram.io_start = pci_resource_start(pdev, GEN12_LMEM_BAR);
-
-	xe->mem.vram.size = min(current_size, vram_size);
+	xe->mem.vram.io_size = min(usable_size,
+				   pci_resource_len(pdev, GEN12_LMEM_BAR));
+	xe->mem.vram.size = xe->mem.vram.io_size;
 
 	if (!xe->mem.vram.size)
 		return -EIO;
@@ -224,15 +225,15 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	if (resize_result > 0)
 		drm_info(&xe->drm, "Successfully resize VRAM from %lluMiB to %lluMiB\n",
 			 (u64)original_size >> 20,
-			 (u64)current_size >> 20);
-	else if (xe->mem.vram.size < vram_size && !xe_force_vram_bar_size)
+			 (u64)xe->mem.vram.io_size >> 20);
+	else if (xe->mem.vram.io_size < usable_size && !xe_force_vram_bar_size)
 		drm_info(&xe->drm, "Using a reduced BAR size of %lluMiB. Consider enabling 'Resizable BAR' support in your BIOS.\n",
 			 (u64)xe->mem.vram.size >> 20);
 	if (xe->mem.vram.size < vram_size)
 		drm_warn(&xe->drm, "Restricting VRAM size to PCI resource size (0x%llx->0x%llx)\n",
 			 vram_size, (u64)xe->mem.vram.size);
 
-	xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.size);
+	xe->mem.vram.mapping = ioremap_wc(xe->mem.vram.io_start, xe->mem.vram.io_size);
 	xe->mem.vram.size = min_t(u64, xe->mem.vram.size, usable_size);
 
 	drm_info(&xe->drm, "TOTAL VRAM: %pa, %pa\n", &xe->mem.vram.io_start, &xe->mem.vram.size);
@@ -240,7 +241,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	/* FIXME: Assuming equally partitioned VRAM, incorrect */
 	if (xe->info.tile_count > 1) {
 		u8 adj_tile_count = xe->info.tile_count;
-		resource_size_t size, io_start;
+		resource_size_t size, io_start, io_size;
 
 		for_each_gt(gt, xe, id)
 			if (xe_gt_is_media_type(gt))
@@ -250,15 +251,31 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 
 		size = xe->mem.vram.size / adj_tile_count;
 		io_start = xe->mem.vram.io_start;
+		io_size = xe->mem.vram.io_size;
 
 		for_each_gt(gt, xe, id) {
-			if (id && !xe_gt_is_media_type(gt))
-				io_start += size;
+			if (id && !xe_gt_is_media_type(gt)) {
+				io_size -= min(io_size, size);
+				io_start += io_size;
+			}
 
 			gt->mem.vram.size = size;
-			gt->mem.vram.io_start = io_start;
-			gt->mem.vram.mapping = xe->mem.vram.mapping +
-				(io_start - xe->mem.vram.io_start);
+
+			/*
+			 * XXX: multi-tile small-bar might be wild. Hopefully
+			 * full tile without any mappable vram is not something
+			 * we care about.
+			 */
+
+			gt->mem.vram.io_size = min(size, io_size);
+			if (io_size) {
+				gt->mem.vram.io_start = io_start;
+				gt->mem.vram.mapping = xe->mem.vram.mapping +
+					(io_start - xe->mem.vram.io_start);
+			} else {
+				drm_err(&xe->drm, "Tile without any CPU visible VRAM. Aborting.\n");
+				return -ENODEV;
+			}
 
 			drm_info(&xe->drm, "VRAM[%u, %u]: %pa, %pa\n",
 				 id, gt->info.vram_id, &gt->mem.vram.io_start,
@@ -267,6 +284,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	} else {
 		gt->mem.vram.size = xe->mem.vram.size;
 		gt->mem.vram.io_start = xe->mem.vram.io_start;
+		gt->mem.vram.io_size = xe->mem.vram.io_size;
 		gt->mem.vram.mapping = xe->mem.vram.mapping;
 
 		drm_info(&xe->drm, "VRAM: %pa\n", &gt->mem.vram.size);
-- 
2.39.1


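The core of the probe change is that io_size becomes the minimum of what vram reports and what the BAR exposes, and (for now) the usable size is clamped to that. A compact model of the clamping (a sketch of the arithmetic only, with illustrative names):

```c
#include <stdint.h>

/* Small-bar model: io_size is what the CPU can reach through the BAR,
 * which may be smaller than the total vram size. */
struct vram_info {
	uint64_t size;		/* vram size reported to the rest of the driver */
	uint64_t io_size;	/* CPU-visible portion */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

static struct vram_info probe_vram(uint64_t usable_size, uint64_t bar_len)
{
	struct vram_info v;

	v.io_size = min_u64(usable_size, bar_len);
	/* At this stage of the series, size is still clamped to the
	 * CPU-visible part; a later patch lets them diverge. */
	v.size = v.io_size;
	return v;
}
```

For example, 16 GiB of vram behind a 256 MiB BAR yields io_size of 256 MiB, the classic small-bar configuration.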

* [Intel-xe] [PATCH 08/15] drm/xe/buddy: remove the virtualized start
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (5 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 07/15] drm/xe/vram: start tracking the io_size Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 09/15] drm/xe/buddy: add visible tracking Matthew Auld
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

Hopefully not needed anymore. We can add a .compatible() hook once we
need to differentiate between mappable and non-mappable vram. If the
allocation is not contiguous then the start value is kind of
meaningless, so rather just mark it as invalid.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c           |  5 +++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 29 ++++++++++++++--------------
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 98716b56498f..ef8c65df7659 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -663,6 +663,11 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 				void *new_addr = gt->mem.vram.mapping +
 					(new_mem->start << PAGE_SHIFT);
 
+				if (XE_WARN_ON(new_mem->start == XE_BO_INVALID_OFFSET)) {
+					ret = -EINVAL;
+					goto out;
+				}
+
 				XE_BUG_ON(new_mem->start !=
 					  bo->placements->fpfn);
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index c7e21673b8fd..b70ce3a48905 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -54,7 +54,6 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	struct xe_ttm_vram_mgr_resource *vres;
 	u64 size, remaining_size, lpfn, fpfn;
 	struct drm_buddy *mm = &mgr->mm;
-	struct drm_buddy_block *block;
 	unsigned long pages_per_block;
 	int r;
 
@@ -186,24 +185,24 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 			list_splice_tail(trim_list, &vres->blocks);
 	}
 
-	vres->base.start = 0;
-	list_for_each_entry(block, &vres->blocks, link) {
-		unsigned long start;
+	if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
+	    xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
+		vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
 
-		start = drm_buddy_block_offset(block) +
-			drm_buddy_block_size(mm, block);
-		start >>= PAGE_SHIFT;
+	/*
+	 * For some kernel objects we still rely on the start when io mapping
+	 * the object.
+	 */
+	if (vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) {
+		struct drm_buddy_block *block = list_first_entry(&vres->blocks,
+								 typeof(*block),
+								 link);
 
-		if (start > PFN_UP(vres->base.size))
-			start -= PFN_UP(vres->base.size);
-		else
-			start = 0;
-		vres->base.start = max(vres->base.start, start);
+		vres->base.start = drm_buddy_block_offset(block) >> PAGE_SHIFT;
+	} else {
+		vres->base.start = XE_BO_INVALID_OFFSET;
 	}
 
-	if (xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
-		vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
-
 	*res = &vres->base;
 	return 0;
 
-- 
2.39.1



* [Intel-xe] [PATCH 09/15] drm/xe/buddy: add visible tracking
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (6 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 08/15] drm/xe/buddy: remove the virtualized start Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 10/15] drm/xe/buddy: add compatible and intersects hooks Matthew Auld
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

Replace the allocation code with the i915 version. This simplifies the
code a little, and importantly we get the accounting at the mgr level,
which is useful for debug (and maybe userspace), plus per resource
tracking so we can easily check if a resource is using one or more pages in
the mappable part of vram (useful for eviction), or if the resource is
completely within the mappable portion (useful for checking if the
resource can be safely CPU mapped).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c     |  17 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c       | 201 ++++++++++-----------
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h       |   3 +-
 drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h |   6 +
 4 files changed, 117 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
index e5a992ce432a..9249ce917f5a 100644
--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
@@ -121,7 +121,7 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 {
 	struct xe_ttm_stolen_mgr *mgr = drmm_kzalloc(&xe->drm, sizeof(*mgr), GFP_KERNEL);
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
-	u64 stolen_size, pgsize;
+	u64 stolen_size, io_size, pgsize;
 	int err;
 
 	if (IS_DGFX(xe))
@@ -140,8 +140,17 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	if (pgsize < PAGE_SIZE)
 		pgsize = PAGE_SIZE;
 
+	/*
+	 * We don't attempt partial visible support for stolen vram, since
+	 * stolen is always at the end of vram, and the BAR size is pretty
+	 * much always 256M with small-bar.
+	 */
+	io_size = 0;
+	if (mgr->io_base && !xe_ttm_stolen_inaccessible(xe))
+		io_size = stolen_size;
 
-	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size, pgsize);
+	err = __xe_ttm_vram_mgr_init(xe, &mgr->base, XE_PL_STOLEN, stolen_size,
+				     io_size, pgsize);
 	if (err) {
 		drm_dbg_kms(&xe->drm, "Stolen mgr init failed: %i\n", err);
 		return;
@@ -150,8 +159,8 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
 	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
 		    stolen_size);
 
-	if (mgr->io_base && !xe_ttm_stolen_inaccessible(xe))
-		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, stolen_size);
+	if (io_size)
+		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, io_size);
 }
 
 u64 xe_ttm_stolen_io_offset(struct xe_bo *bo, u32 offset)
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index b70ce3a48905..acde9e7564e1 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -49,45 +49,26 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 			       const struct ttm_place *place,
 			       struct ttm_resource **res)
 {
-	u64 max_bytes, cur_size, min_block_size;
 	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
 	struct xe_ttm_vram_mgr_resource *vres;
-	u64 size, remaining_size, lpfn, fpfn;
 	struct drm_buddy *mm = &mgr->mm;
-	unsigned long pages_per_block;
-	int r;
-
-	lpfn = (u64)place->lpfn << PAGE_SHIFT;
-	if (!lpfn || lpfn > man->size)
-		lpfn = man->size;
-
-	fpfn = (u64)place->fpfn << PAGE_SHIFT;
+	u64 size, remaining_size, min_page_size;
+	unsigned long lpfn;
+	int err;
 
-	max_bytes = mgr->manager.size;
-	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-		pages_per_block = ~0ul;
-	} else {
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-		pages_per_block = HPAGE_PMD_NR;
-#else
-		/* default to 2MB */
-		pages_per_block = 2UL << (20UL - PAGE_SHIFT);
-#endif
-
-		pages_per_block = max_t(uint32_t, pages_per_block,
-					tbo->page_alignment);
-	}
+	lpfn = place->lpfn;
+	if (!lpfn || lpfn > man->size >> PAGE_SHIFT)
+		lpfn = man->size >> PAGE_SHIFT;
 
 	vres = kzalloc(sizeof(*vres), GFP_KERNEL);
 	if (!vres)
 		return -ENOMEM;
 
 	ttm_resource_init(tbo, place, &vres->base);
-	remaining_size = vres->base.size;
 
 	/* bail out quickly if there's likely not enough VRAM for this BO */
-	if (ttm_resource_manager_usage(man) > max_bytes) {
-		r = -ENOSPC;
+	if (ttm_resource_manager_usage(man) > man->size) {
+		err = -ENOSPC;
 		goto error_fini;
 	}
 
@@ -96,95 +77,91 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	if (place->flags & TTM_PL_FLAG_TOPDOWN)
 		vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
 
-	if (fpfn || lpfn != man->size)
-		/* Allocate blocks in desired range */
+	if (place->fpfn || lpfn != man->size >> PAGE_SHIFT)
 		vres->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
-	mutex_lock(&mgr->lock);
-	while (remaining_size) {
-		if (tbo->page_alignment)
-			min_block_size = tbo->page_alignment << PAGE_SHIFT;
-		else
-			min_block_size = mgr->default_page_size;
-
-		XE_BUG_ON(min_block_size < mm->chunk_size);
-
-		/* Limit maximum size to 2GiB due to SG table limitations */
-		size = min(remaining_size, 2ULL << 30);
-
-		if (size >= pages_per_block << PAGE_SHIFT)
-			min_block_size = pages_per_block << PAGE_SHIFT;
-
-		cur_size = size;
-
-		if (fpfn + size != place->lpfn << PAGE_SHIFT) {
-			/*
-			 * Except for actual range allocation, modify the size and
-			 * min_block_size conforming to continuous flag enablement
-			 */
-			if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-				size = roundup_pow_of_two(size);
-				min_block_size = size;
-			/*
-			 * Modify the size value if size is not
-			 * aligned with min_block_size
-			 */
-			} else if (!IS_ALIGNED(size, min_block_size)) {
-				size = round_up(size, min_block_size);
-			}
-		}
+	XE_BUG_ON(!vres->base.size);
+	size = vres->base.size;
 
-		r = drm_buddy_alloc_blocks(mm, fpfn,
-					   lpfn,
-					   size,
-					   min_block_size,
-					   &vres->blocks,
-					   vres->flags);
-		if (unlikely(r))
-			goto error_free_blocks;
+	min_page_size = mgr->default_page_size;
+	if (tbo->page_alignment)
+		min_page_size = tbo->page_alignment << PAGE_SHIFT;
 
-		if (size > remaining_size)
-			remaining_size = 0;
-		else
-			remaining_size -= size;
+	XE_BUG_ON(min_page_size < mm->chunk_size);
+	XE_BUG_ON(min_page_size > SZ_2G); /* FIXME: sg limit */
+	XE_BUG_ON(size > SZ_2G &&
+		  (vres->base.placement & TTM_PL_FLAG_CONTIGUOUS));
+	XE_BUG_ON(!IS_ALIGNED(size, min_page_size));
+
+	if (place->fpfn + (vres->base.size >> PAGE_SHIFT) != place->lpfn &&
+	    place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+		unsigned long pages;
+
+		size = roundup_pow_of_two(size);
+		min_page_size = size;
+
+		pages = size >> ilog2(mm->chunk_size);
+		if (pages > lpfn)
+			lpfn = pages;
 	}
-	mutex_unlock(&mgr->lock);
 
-	if (cur_size != size) {
-		struct drm_buddy_block *block;
-		struct list_head *trim_list;
-		u64 original_size;
-		LIST_HEAD(temp);
+	if (size > lpfn << PAGE_SHIFT) {
+		err = -E2BIG; /* don't trigger eviction */
+		goto error_fini;
+	}
 
-		trim_list = &vres->blocks;
-		original_size = vres->base.size;
+	mutex_lock(&mgr->lock);
+	if (lpfn <= mgr->visible_size && size > mgr->visible_avail) {
+		mutex_unlock(&mgr->lock);
+		err = -ENOSPC;
+		goto error_fini;
+	}
 
+	remaining_size = size;
+	do {
 		/*
-		 * If size value is rounded up to min_block_size, trim the last
-		 * block to the required size
+		 * Limit maximum size to 2GiB due to SG table limitations.
+		 * FIXME: Should maybe be handled as part of sg construction.
 		 */
-		if (!list_is_singular(&vres->blocks)) {
-			block = list_last_entry(&vres->blocks, typeof(*block), link);
-			list_move_tail(&block->link, &temp);
-			trim_list = &temp;
-			/*
-			 * Compute the original_size value by subtracting the
-			 * last block size with (aligned size - original size)
-			 */
-			original_size = drm_buddy_block_size(mm, block) -
-				(size - cur_size);
-		}
+		u64 alloc_size = min_t(u64, remaining_size, SZ_2G);
+
+		err = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT,
+					     (u64)lpfn << PAGE_SHIFT,
+					     alloc_size,
+					     min_page_size,
+					     &vres->blocks,
+					     vres->flags);
+		if (err)
+			goto error_free_blocks;
 
-		mutex_lock(&mgr->lock);
-		drm_buddy_block_trim(mm,
-				     original_size,
-				     trim_list);
-		mutex_unlock(&mgr->lock);
+		remaining_size -= alloc_size;
+	} while (remaining_size);
 
-		if (!list_empty(&temp))
-			list_splice_tail(trim_list, &vres->blocks);
+	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+		if (!drm_buddy_block_trim(mm, vres->base.size, &vres->blocks))
+			size = vres->base.size;
 	}
 
+	if (lpfn <= mgr->visible_size >> PAGE_SHIFT) {
+		vres->used_visible_size = size;
+	} else {
+		struct drm_buddy_block *block;
+
+		list_for_each_entry(block, &vres->blocks, link) {
+			u64 start = drm_buddy_block_offset(block);
+
+			if (start < mgr->visible_size) {
+				u64 end = start + drm_buddy_block_size(mm, block);
+
+				vres->used_visible_size +=
+					min(end, mgr->visible_size) - start;
+			}
+		}
+	}
+
+	mgr->visible_avail -= vres->used_visible_size;
+	mutex_unlock(&mgr->lock);
+
 	if (!(vres->base.placement & TTM_PL_FLAG_CONTIGUOUS) &&
 	    xe_is_vram_mgr_blocks_contiguous(mm, &vres->blocks))
 		vres->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
@@ -213,7 +190,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
 	ttm_resource_fini(man, &vres->base);
 	kfree(vres);
 
-	return r;
+	return err;
 }
 
 static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
@@ -226,6 +203,7 @@ static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man,
 
 	mutex_lock(&mgr->lock);
 	drm_buddy_free_list(mm, &vres->blocks);
+	mgr->visible_avail += vres->used_visible_size;
 	mutex_unlock(&mgr->lock);
 
 	ttm_resource_fini(man, res);
@@ -240,6 +218,13 @@ static void xe_ttm_vram_mgr_debug(struct ttm_resource_manager *man,
 	struct drm_buddy *mm = &mgr->mm;
 
 	mutex_lock(&mgr->lock);
+	drm_printf(printer, "default_page_size: %lluKiB\n",
+		   mgr->default_page_size >> 10);
+	drm_printf(printer, "visible_avail: %lluMiB\n",
+		   (u64)mgr->visible_avail >> 20);
+	drm_printf(printer, "visible_size: %lluMiB\n",
+		   (u64)mgr->visible_size >> 20);
+
 	drm_buddy_print(mm, printer);
 	mutex_unlock(&mgr->lock);
 	drm_printf(printer, "man size:%llu\n", man->size);
@@ -262,6 +247,8 @@ static void ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
 	if (ttm_resource_manager_evict_all(&xe->ttm, man))
 		return;
 
+	WARN_ON_ONCE(mgr->visible_avail != mgr->visible_size);
+
 	drm_buddy_fini(&mgr->mm);
 
 	ttm_resource_manager_cleanup(&mgr->manager);
@@ -270,7 +257,8 @@ static void ttm_vram_mgr_fini(struct drm_device *dev, void *arg)
 }
 
 int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
-			   u32 mem_type, u64 size, u64 default_page_size)
+			   u32 mem_type, u64 size, u64 io_size,
+			   u64 default_page_size)
 {
 	struct ttm_resource_manager *man = &mgr->manager;
 	int err;
@@ -279,6 +267,8 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	mgr->mem_type = mem_type;
 	mutex_init(&mgr->lock);
 	mgr->default_page_size = default_page_size;
+	mgr->visible_size = io_size;
+	mgr->visible_avail = io_size;
 
 	ttm_resource_manager_init(man, &xe->ttm, size);
 	err = drm_buddy_init(&mgr->mm, man->size, default_page_size);
@@ -298,7 +288,8 @@ int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr)
 	mgr->gt = gt;
 
 	return __xe_ttm_vram_mgr_init(xe, mgr, XE_PL_VRAM0 + gt->info.vram_id,
-				      gt->mem.vram.size, PAGE_SIZE);
+				      gt->mem.vram.size, gt->mem.vram.io_size,
+				      PAGE_SIZE);
 }
 
 int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
index 78f332d26224..35e5367a79fb 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
@@ -13,7 +13,8 @@ struct xe_device;
 struct xe_gt;
 
 int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
-			   u32 mem_type, u64 size, u64 default_page_size);
+			   u32 mem_type, u64 size, u64 io_size,
+			   u64 default_page_size);
 int xe_ttm_vram_mgr_init(struct xe_gt *gt, struct xe_ttm_vram_mgr *mgr);
 int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
 			      struct ttm_resource *res,
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
index 39aa2ec1b968..3d9417ff7434 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h
@@ -23,6 +23,10 @@ struct xe_ttm_vram_mgr {
 	struct ttm_resource_manager manager;
 	/** @mm: DRM buddy allocator which manages the VRAM */
 	struct drm_buddy mm;
+	/** @visible_size: Probed size of the CPU visible portion */
+	u64 visible_size;
+	/** @visible_avail: CPU visible portion still unallocated */
+	u64 visible_avail;
 	/** @default_page_size: default page size */
 	u64 default_page_size;
 	/** @lock: protects allocations of VRAM */
@@ -39,6 +43,8 @@ struct xe_ttm_vram_mgr_resource {
 	struct ttm_resource base;
 	/** @blocks: list of DRM buddy blocks */
 	struct list_head blocks;
+	/** @used_visible_size: How many CPU visible bytes this resource is using */
+	u64 used_visible_size;
 	/** @flags: flags associated with the resource */
 	unsigned long flags;
 };
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Intel-xe] [PATCH 10/15] drm/xe/buddy: add compatible and intersects hooks
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (7 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 09/15] drm/xe/buddy: add visible tracking Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 11/15] drm/xe/bo: support tiered vram allocation for small-bar Matthew Auld
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

Copy this from i915. We need .compatible for lmem -> lmem transfers, so
they don't just get nooped by ttm, if we need to move something from
mappable to non-mappable or vice versa. The .intersects is needed for
eviction, to determine if a victim resource is worth evicting, e.g. if
we need mappable space there is no point in evicting a resource that
has zero mappable pages.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 62 ++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index acde9e7564e1..7e4daab2ac7f 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -230,9 +230,71 @@ static void xe_ttm_vram_mgr_debug(struct ttm_resource_manager *man,
 	drm_printf(printer, "man size:%llu\n", man->size);
 }
 
+static bool xe_ttm_vram_mgr_intersects(struct ttm_resource_manager *man,
+				       struct ttm_resource *res,
+				       const struct ttm_place *place,
+				       size_t size)
+{
+	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
+	struct xe_ttm_vram_mgr_resource *vres =
+		to_xe_ttm_vram_mgr_resource(res);
+	struct drm_buddy *mm = &mgr->mm;
+	struct drm_buddy_block *block;
+
+	if (!place->fpfn && !place->lpfn)
+		return true;
+
+	if (!place->fpfn && place->lpfn == mgr->visible_size >> PAGE_SHIFT)
+		return vres->used_visible_size > 0;
+
+	list_for_each_entry(block, &vres->blocks, link) {
+		unsigned long fpfn =
+			drm_buddy_block_offset(block) >> PAGE_SHIFT;
+		unsigned long lpfn = fpfn +
+			(drm_buddy_block_size(mm, block) >> PAGE_SHIFT);
+
+		if (place->fpfn < lpfn && place->lpfn > fpfn)
+			return true;
+	}
+
+	return false;
+}
+
+static bool xe_ttm_vram_mgr_compatible(struct ttm_resource_manager *man,
+				       struct ttm_resource *res,
+				       const struct ttm_place *place,
+				       size_t size)
+{
+	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
+	struct xe_ttm_vram_mgr_resource *vres =
+		to_xe_ttm_vram_mgr_resource(res);
+	struct drm_buddy *mm = &mgr->mm;
+	struct drm_buddy_block *block;
+
+	if (!place->fpfn && !place->lpfn)
+		return true;
+
+	if (!place->fpfn && place->lpfn == mgr->visible_size >> PAGE_SHIFT)
+		return vres->used_visible_size == size;
+
+	list_for_each_entry(block, &vres->blocks, link) {
+		unsigned long fpfn =
+			drm_buddy_block_offset(block) >> PAGE_SHIFT;
+		unsigned long lpfn = fpfn +
+			(drm_buddy_block_size(mm, block) >> PAGE_SHIFT);
+
+		if (fpfn < place->fpfn || lpfn > place->lpfn)
+			return false;
+	}
+
+	return true;
+}
+
 static const struct ttm_resource_manager_func xe_ttm_vram_mgr_func = {
 	.alloc	= xe_ttm_vram_mgr_new,
 	.free	= xe_ttm_vram_mgr_del,
+	.intersects = xe_ttm_vram_mgr_intersects,
+	.compatible = xe_ttm_vram_mgr_compatible,
 	.debug	= xe_ttm_vram_mgr_debug
 };
 
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Intel-xe] [PATCH 11/15] drm/xe/bo: support tiered vram allocation for small-bar
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (8 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 10/15] drm/xe/buddy: add compatible and intersects hooks Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 12/15] drm/xe/migrate: retain CCS aux state for vram -> vram Matthew Auld
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

Add the new flag XE_BO_NEEDS_CPU_ACCESS, to force allocating in the
mappable part of lmem. If no flag is specified we do a topdown
allocation, to limit the chances of stealing the precious mappable part
when we don't need it. If this is a full-bar system, then this all gets
nooped.

For kernel users, it looks like xe_bo_create_pin_map() is the central
place which users should call if they want CPU access to the object, so
add the flag there.

We still need to plumb this through for userspace allocations. Also it
looks like page-tables are using pin_map(), which is less than ideal. If
we can already use the GPU to do page-table management, then maybe we
should just force that for small-bar.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/tests/xe_migrate.c |  3 +-
 drivers/gpu/drm/xe/xe_bo.c            | 84 ++++++++++++++++++---------
 drivers/gpu/drm/xe/xe_bo.h            |  1 +
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c  |  4 ++
 4 files changed, 63 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
index 03a60d5b42f1..bf9d3967a571 100644
--- a/drivers/gpu/drm/xe/tests/xe_migrate.c
+++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
@@ -93,7 +93,8 @@ static void test_copy(struct xe_migrate *m, struct xe_bo *bo,
 	struct xe_bo *sysmem = xe_bo_create_locked(xe, m->gt, NULL,
 						   bo->size,
 						   ttm_bo_type_kernel,
-						   XE_BO_CREATE_SYSTEM_BIT);
+						   XE_BO_CREATE_SYSTEM_BIT |
+						   XE_BO_NEEDS_CPU_ACCESS);
 	if (IS_ERR(sysmem)) {
 		KUNIT_FAIL(test, "Failed to allocate sysmem bo for %s: %li\n",
 			   str, PTR_ERR(sysmem));
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index ef8c65df7659..6b34009af345 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -97,22 +97,30 @@ static void try_add_system(struct xe_bo *bo, struct ttm_place *places,
 static void try_add_vram0(struct xe_device *xe, struct xe_bo *bo,
 			  struct ttm_place *places, u32 bo_flags, u32 *c)
 {
-	struct xe_gt *gt;
-
 	if (bo_flags & XE_BO_CREATE_VRAM0_BIT) {
+		struct ttm_place place = {};
+		struct xe_gt *gt;
+		u64 io_size;
+
 		gt = mem_type_to_gt(xe, XE_PL_VRAM0);
+		io_size = gt->mem.vram.io_size;
 		XE_BUG_ON(!gt->mem.vram.size);
 
-		places[*c] = (struct ttm_place) {
-			.mem_type = XE_PL_VRAM0,
-			/*
-			 * For eviction / restore on suspend / resume objects
-			 * pinned in VRAM must be contiguous
-			 */
-			.flags = bo_flags & (XE_BO_CREATE_PINNED_BIT |
-					     XE_BO_CREATE_GGTT_BIT) ?
-				TTM_PL_FLAG_CONTIGUOUS : 0,
-		};
+		place.mem_type = XE_PL_VRAM0;
+
+		if (bo_flags & (XE_BO_CREATE_PINNED_BIT |
+				XE_BO_CREATE_GGTT_BIT))
+			place.flags |= TTM_PL_FLAG_CONTIGUOUS;
+
+		if (io_size < gt->mem.vram.size) {
+			if (bo_flags & XE_BO_NEEDS_CPU_ACCESS) {
+				place.fpfn = 0;
+				place.lpfn = io_size >> PAGE_SHIFT;
+			} else {
+				place.flags |= TTM_PL_FLAG_TOPDOWN;
+			}
+		}
+		places[*c] = place;
 		*c += 1;
 
 		if (bo->props.preferred_mem_type == XE_BO_PROPS_INVALID)
@@ -123,22 +131,30 @@ static void try_add_vram0(struct xe_device *xe, struct xe_bo *bo,
 static void try_add_vram1(struct xe_device *xe, struct xe_bo *bo,
 			  struct ttm_place *places, u32 bo_flags, u32 *c)
 {
-	struct xe_gt *gt;
-
 	if (bo_flags & XE_BO_CREATE_VRAM1_BIT) {
+		struct ttm_place place = {};
+		struct xe_gt *gt;
+		u64 io_size;
+
 		gt = mem_type_to_gt(xe, XE_PL_VRAM1);
+		io_size = gt->mem.vram.io_size;
 		XE_BUG_ON(!gt->mem.vram.size);
 
-		places[*c] = (struct ttm_place) {
-			.mem_type = XE_PL_VRAM1,
-			/*
-			 * For eviction / restore on suspend / resume objects
-			 * pinned in VRAM must be contiguous
-			 */
-			.flags = bo_flags & (XE_BO_CREATE_PINNED_BIT |
-					     XE_BO_CREATE_GGTT_BIT) ?
-				TTM_PL_FLAG_CONTIGUOUS : 0,
-		};
+		place.mem_type = XE_PL_VRAM1;
+
+		if (bo_flags & (XE_BO_CREATE_PINNED_BIT |
+				XE_BO_CREATE_GGTT_BIT))
+			place.flags |= TTM_PL_FLAG_CONTIGUOUS;
+
+		if (io_size < gt->mem.vram.size) {
+			if (bo_flags & XE_BO_NEEDS_CPU_ACCESS) {
+				place.fpfn = 0;
+				place.lpfn = io_size >> PAGE_SHIFT;
+			} else {
+				place.flags |= TTM_PL_FLAG_TOPDOWN;
+			}
+		}
+		places[*c] = place;
 		*c += 1;
 
 		if (bo->props.preferred_mem_type == XE_BO_PROPS_INVALID)
@@ -370,15 +386,22 @@ static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
 				 struct ttm_resource *mem)
 {
 	struct xe_device *xe = ttm_to_xe_device(bdev);
-	struct xe_gt *gt;
 
 	switch (mem->mem_type) {
 	case XE_PL_SYSTEM:
 	case XE_PL_TT:
 		return 0;
 	case XE_PL_VRAM0:
-	case XE_PL_VRAM1:
+	case XE_PL_VRAM1: {
+		struct xe_ttm_vram_mgr_resource *vres =
+			to_xe_ttm_vram_mgr_resource(mem);
+		struct xe_gt *gt;
+
+		if (vres->used_visible_size < mem->size)
+			return -EINVAL;
+
 		gt = mem_type_to_gt(xe, mem->mem_type);
+
 		mem->bus.offset = mem->start << PAGE_SHIFT;
 
 		if (gt->mem.vram.mapping &&
@@ -393,7 +416,7 @@ static int xe_ttm_io_mem_reserve(struct ttm_device *bdev,
 		mem->bus.caching = ttm_write_combined;
 #endif
 		return 0;
-	case XE_PL_STOLEN:
+	} case XE_PL_STOLEN:
 		return xe_ttm_stolen_io_mem_reserve(xe, mem);
 	default:
 		return -EINVAL;
@@ -665,6 +688,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 
 				if (XE_WARN_ON(new_mem->start == XE_BO_INVALID_OFFSET)) {
 					ret = -EINVAL;
+					xe_device_mem_access_put(xe);
 					goto out;
 				}
 
@@ -1160,7 +1184,8 @@ struct xe_bo *xe_bo_create_pin_map_at(struct xe_device *xe, struct xe_gt *gt,
 	    xe_ttm_stolen_inaccessible(xe))
 		flags |= XE_BO_CREATE_GGTT_BIT;
 
-	bo = xe_bo_create_locked_range(xe, gt, vm, size, start, end, type, flags);
+	bo = xe_bo_create_locked_range(xe, gt, vm, size, start, end, type,
+				       flags | XE_BO_NEEDS_CPU_ACCESS);
 	if (IS_ERR(bo))
 		return bo;
 
@@ -1458,6 +1483,9 @@ int xe_bo_vmap(struct xe_bo *bo)
 
 	xe_bo_assert_held(bo);
 
+	if (!(bo->flags & XE_BO_NEEDS_CPU_ACCESS))
+		return -EINVAL;
+
 	if (!iosys_map_is_null(&bo->vmap))
 		return 0;
 
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 0699b2b4c5ca..c937ef10fcf3 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -28,6 +28,7 @@
 #define XE_BO_DEFER_BACKING		BIT(8)
 #define XE_BO_SCANOUT_BIT		BIT(9)
 #define XE_BO_FIXED_PLACEMENT_BIT	BIT(10)
+#define XE_BO_NEEDS_CPU_ACCESS		BIT(11)
 /* this one is trigger internally only */
 #define XE_BO_INTERNAL_TEST		BIT(30)
 #define XE_BO_INTERNAL_64K		BIT(31)
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 7e4daab2ac7f..665073111e4a 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -361,12 +361,16 @@ int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
 			      enum dma_data_direction dir,
 			      struct sg_table **sgt)
 {
+	struct xe_ttm_vram_mgr_resource *vres = to_xe_ttm_vram_mgr_resource(res);
 	struct xe_gt *gt = xe_device_get_gt(xe, res->mem_type - XE_PL_VRAM0);
 	struct xe_res_cursor cursor;
 	struct scatterlist *sg;
 	int num_entries = 0;
 	int i, r;
 
+	if (vres->used_visible_size < res->size)
+		return -EOPNOTSUPP;
+
 	*sgt = kmalloc(sizeof(**sgt), GFP_KERNEL);
 	if (!*sgt)
 		return -ENOMEM;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Intel-xe] [PATCH 12/15] drm/xe/migrate: retain CCS aux state for vram -> vram
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (9 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 11/15] drm/xe/bo: support tiered vram allocation for small-bar Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 13/15] drm/xe/display: annotate CC buffers with NEEDS_CPU_ACCESS Matthew Auld
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

On small-bar we likely need to be able to deal with vram -> vram
transfers. During eviction, as an optimisation, we don't always
want/need to kick stuff into smem. Plus for some types of CCS surfaces,
where the clear color needs to be accessed from the CPU, we might need
to migrate it.

v2: (Lucas):
  - s/lmem/vram/ in the commit message
  - Tidy up the code a bit; use one emit_copy_ccs()

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_migrate.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index cbcc355cd391..418d4f900cdb 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -544,11 +544,19 @@ static u32 xe_migrate_ccs_copy(struct xe_migrate *m,
 
 	if (xe_device_has_flat_ccs(gt_to_xe(gt)) && !copy_ccs && dst_is_vram) {
 		/*
-		 * If the bo doesn't have any CCS metadata attached, we still
-		 * need to clear it for security reasons.
+		 * If the src is already in vram, then it should already
+		 * have been cleared by us, or has been populated by the
+		 * user. Make sure we copy the CCS aux state as-is.
+		 *
+		 * Otherwise if the bo doesn't have any CCS metadata attached,
+		 * we still need to clear it for security reasons.
 		 */
-		emit_copy_ccs(gt, bb, dst_ofs, true, m->cleared_vram_ofs, false,
-			      dst_size);
+		u64 ccs_src_ofs =  src_is_vram ? src_ofs : m->cleared_vram_ofs;
+
+		emit_copy_ccs(gt, bb,
+			      dst_ofs, true,
+			      ccs_src_ofs, src_is_vram, dst_size);
+
 		flush_flags = MI_FLUSH_DW_CCS;
 	} else if (copy_ccs) {
 		if (!src_is_vram)
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Intel-xe] [PATCH 13/15] drm/xe/display: annotate CC buffers with NEEDS_CPU_ACCESS
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (10 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 12/15] drm/xe/migrate: retain CCS aux state for vram -> vram Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 14/15] drm/xe/uapi: add the userspace bits for small-bar Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 15/15] drm/xe: fully turn on small-bar support Matthew Auld
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

The display code wants to read the clear color value from the buffer.
However if the buffer is the non-mappable part of lmem then we fail the
kmap. The simplest solution is to just mark the buffer with
XE_BO_NEEDS_CPU_ACCESS, which will either allocate the buffer in the
mappable part of lmem, or migrate it there.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/display/xe_fb_pin.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
index 65c0bc28a3d1..66e1309e21d8 100644
--- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
+++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
@@ -203,6 +203,14 @@ static struct i915_vma *__xe_pin_fb_vma(struct intel_framebuffer *fb,
 	if (ret)
 		goto err;
 
+	/*
+	 * For this type of buffer we need to be able to read from the CPU the
+	 * clear color value found in the buffer. This doesn't do anything on
+	 * non small-bar devices.
+	 */
+	if (intel_fb_rc_ccs_cc_plane(&fb->base) >= 0)
+		bo->flags |= XE_BO_NEEDS_CPU_ACCESS;
+
 	ret = xe_bo_validate(bo, NULL, true);
 	if (!ret)
 		ttm_bo_pin(&bo->ttm);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Intel-xe] [PATCH 14/15] drm/xe/uapi: add the userspace bits for small-bar
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (11 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 13/15] drm/xe/display: annotate CC buffers with NEEDS_CPU_ACCESS Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  2023-02-14 11:21 ` [Intel-xe] [PATCH 15/15] drm/xe: fully turn on small-bar support Matthew Auld
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

Pretty much all the same as i915. We add a new hint for userspace to force an
object into the mappable part of vram. Like i915 we also require smem as
a placement in case we need to spill.

We also need to tell userspace how large the mappable part is. In Vulkan,
for example, there will be two vram heaps for small-bar systems, and the
size of each heap needs to be known. Likewise the used/avail tracking
needs to account for the mappable part.

We also limit the used/available tracking to privileged users only going
forward, since these values are system wide and are technically
considered an info leak.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
---
 drivers/gpu/drm/xe/xe_bo.c           | 17 +++++++++++++++--
 drivers/gpu/drm/xe/xe_query.c        | 26 +++++++++++++++++++++-----
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 18 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.h |  4 ++++
 include/uapi/drm/xe_drm.h            |  5 ++++-
 5 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 6b34009af345..10df49c122e3 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -915,7 +915,6 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
 			ret = ttm_bo_vm_fault_reserved(vmf,
 						       vmf->vma->vm_page_prot,
 						       TTM_BO_VM_NUM_PREFAULT);
-
 		drm_dev_exit(idx);
 	} else {
 		ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
@@ -1542,6 +1541,7 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 	if (XE_IOCTL_ERR(xe, args->flags &
 			 ~(XE_GEM_CREATE_FLAG_DEFER_BACKING |
 			   XE_GEM_CREATE_FLAG_SCANOUT |
+			   XE_GEM_CREATE_FLAG_NEEDS_CPU_ACCESS |
 			   xe->info.mem_region_mask)))
 		return -EINVAL;
 
@@ -1579,6 +1579,18 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data,
 		bo_flags |= XE_BO_SCANOUT_BIT;
 
 	bo_flags |= args->flags << (ffs(XE_BO_CREATE_SYSTEM_BIT) - 1);
+
+	if (args->flags & XE_GEM_CREATE_FLAG_NEEDS_CPU_ACCESS) {
+		if (XE_IOCTL_ERR(xe, !(bo_flags & (XE_BO_CREATE_VRAM0_BIT |
+						   XE_BO_CREATE_VRAM1_BIT))))
+			return -EINVAL;
+
+		if (XE_IOCTL_ERR(xe, !(bo_flags & XE_BO_CREATE_SYSTEM_BIT)))
+			return -EINVAL;
+
+		bo_flags |= XE_BO_NEEDS_CPU_ACCESS;
+	}
+
 	bo = xe_bo_create(xe, NULL, vm, args->size, ttm_bo_type_device,
 			  bo_flags);
 	if (vm) {
@@ -1842,7 +1854,8 @@ int xe_bo_dumb_create(struct drm_file *file_priv,
 
 	bo = xe_bo_create(xe, NULL, NULL, args->size, ttm_bo_type_device,
 			  XE_BO_CREATE_VRAM_IF_DGFX(to_gt(xe)) |
-			  XE_BO_CREATE_USER_BIT | XE_BO_SCANOUT_BIT);
+			  XE_BO_CREATE_USER_BIT | XE_BO_SCANOUT_BIT |
+			  XE_BO_NEEDS_CPU_ACCESS);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
 
diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
index f9f21bd1bfd7..52d5f67bd6f2 100644
--- a/drivers/gpu/drm/xe/xe_query.c
+++ b/drivers/gpu/drm/xe/xe_query.c
@@ -9,11 +9,12 @@
 
 #include "xe_bo.h"
 #include "xe_device.h"
+#include "xe_ggtt.h"
 #include "xe_gt.h"
+#include "xe_guc_hwconfig.h"
 #include "xe_macros.h"
+#include "xe_ttm_vram_mgr.h"
 #include "xe_query.h"
-#include "xe_ggtt.h"
-#include "xe_guc_hwconfig.h"
 
 static const enum xe_engine_class xe_to_user_engine_class[] = {
 	[XE_ENGINE_CLASS_RENDER] = DRM_XE_ENGINE_CLASS_RENDER,
@@ -125,7 +126,10 @@ static int query_memory_usage(struct xe_device *xe,
 	usage->regions[0].min_page_size = PAGE_SIZE;
 	usage->regions[0].max_page_size = PAGE_SIZE;
 	usage->regions[0].total_size = man->size << PAGE_SHIFT;
-	usage->regions[0].used = ttm_resource_manager_usage(man);
+	if (perfmon_capable())
+		usage->regions[0].used = ttm_resource_manager_usage(man);
+	else
+		usage->regions[0].used = usage->regions[0].total_size;
 	usage->num_regions = 1;
 
 	for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) {
@@ -142,8 +146,20 @@ static int query_memory_usage(struct xe_device *xe,
 				SZ_1G;
 			usage->regions[usage->num_regions].total_size =
 				man->size;
-			usage->regions[usage->num_regions++].used =
-				ttm_resource_manager_usage(man);
+
+			if (perfmon_capable()) {
+				xe_ttm_vram_get_used(man,
+						     &usage->regions[usage->num_regions].used,
+						     &usage->regions[usage->num_regions].cpu_visible_used);
+			} else {
+				usage->regions[usage->num_regions].used = man->size;
+				usage->regions[usage->num_regions].cpu_visible_used =
+					xe_ttm_vram_get_cpu_visible_size(man);
+			}
+
+			usage->regions[usage->num_regions].cpu_visible_size =
+				xe_ttm_vram_get_cpu_visible_size(man);
+			usage->num_regions++;
 		}
 	}
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 665073111e4a..2891e79b8e48 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -446,3 +446,21 @@ void xe_ttm_vram_mgr_free_sgt(struct device *dev, enum dma_data_direction dir,
 	sg_free_table(sgt);
 	kfree(sgt);
 }
+
+u64 xe_ttm_vram_get_cpu_visible_size(struct ttm_resource_manager *man)
+{
+	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
+
+	return mgr->visible_size;
+}
+
+void xe_ttm_vram_get_used(struct ttm_resource_manager *man,
+			  u64 *used, u64 *used_visible)
+{
+	struct xe_ttm_vram_mgr *mgr = to_xe_ttm_vram_mgr(man);
+
+	mutex_lock(&mgr->lock);
+	*used = mgr->mm.size - mgr->mm.avail;
+	*used_visible = mgr->visible_size - mgr->visible_avail;
+	mutex_unlock(&mgr->lock);
+}
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
index 35e5367a79fb..27f43490fa11 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.h
@@ -25,6 +25,10 @@ int xe_ttm_vram_mgr_alloc_sgt(struct xe_device *xe,
 void xe_ttm_vram_mgr_free_sgt(struct device *dev, enum dma_data_direction dir,
 			      struct sg_table *sgt);
 
+u64 xe_ttm_vram_get_cpu_visible_size(struct ttm_resource_manager *man);
+void xe_ttm_vram_get_used(struct ttm_resource_manager *man,
+			  u64 *used, u64 *used_visible);
+
 static inline struct xe_ttm_vram_mgr_resource *
 to_xe_ttm_vram_mgr_resource(struct ttm_resource *res)
 {
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 593b01ba5919..525ab9b6b282 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -169,7 +169,9 @@ struct drm_xe_query_mem_usage {
 		__u32 max_page_size;
 		__u64 total_size;
 		__u64 used;
-		__u64 reserved[8];
+		__u64 cpu_visible_size;
+		__u64 cpu_visible_used;
+		__u64 reserved[6];
 	} regions[];
 };
 
@@ -270,6 +272,7 @@ struct drm_xe_gem_create {
 	 */
 #define XE_GEM_CREATE_FLAG_DEFER_BACKING	(0x1 << 24)
 #define XE_GEM_CREATE_FLAG_SCANOUT		(0x1 << 25)
+#define XE_GEM_CREATE_FLAG_NEEDS_CPU_ACCESS	(0x1 << 26)
 	__u32 flags;
 
 	/**
-- 
2.39.1

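As an aside for reviewers, the perfmon_capable() gating in query_memory_usage() can be modelled with a small standalone sketch. Note this is an illustrative userspace model, not kernel code: struct vram_mgr and the helpers below are hypothetical stand-ins for the xe_ttm_vram_mgr types, and the locking/TTM plumbing is elided.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xe_ttm_vram_mgr: only the fields the
 * accounting needs. Field names mirror the patch, but this is a model,
 * not the kernel structure. */
struct vram_mgr {
	uint64_t size;          /* total VRAM managed by the buddy */
	uint64_t avail;         /* bytes still free */
	uint64_t visible_size;  /* CPU-visible (BAR) portion */
	uint64_t visible_avail; /* CPU-visible bytes still free */
};

/* Mirrors xe_ttm_vram_get_used(): used is size minus what is still
 * available, for both the whole region and the CPU-visible slice. */
static void vram_get_used(const struct vram_mgr *mgr,
			  uint64_t *used, uint64_t *used_visible)
{
	*used = mgr->size - mgr->avail;
	*used_visible = mgr->visible_size - mgr->visible_avail;
}

/* Mirrors the gating in query_memory_usage(): an unprivileged caller
 * sees the region as fully used, so system-wide usage is not leaked,
 * while the static sizes are still reported accurately. */
static void fill_region(const struct vram_mgr *mgr, int privileged,
			uint64_t *used, uint64_t *used_visible)
{
	if (privileged) {
		vram_get_used(mgr, used, used_visible);
	} else {
		*used = mgr->size;
		*used_visible = mgr->visible_size;
	}
}
```

With, say, a region of 100 units (40 free) and a 50-unit visible window (10 free), a privileged query reports 60 used / 40 visible-used, while an unprivileged one just sees 100 / 50.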


* [Intel-xe] [PATCH 15/15] drm/xe: fully turn on small-bar support
  2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
                   ` (12 preceding siblings ...)
  2023-02-14 11:21 ` [Intel-xe] [PATCH 14/15] drm/xe/uapi: add the userspace bits for small-bar Matthew Auld
@ 2023-02-14 11:21 ` Matthew Auld
  13 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 11:21 UTC (permalink / raw)
  To: intel-xe

This allows vram_size > io_size, instead of just clamping the vram size
to the BAR size, now that the driver supports it.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/xe/xe_mmio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 4c8b77caeb43..0a0958393a4b 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -217,7 +217,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
 	xe->mem.vram.io_start = pci_resource_start(pdev, GEN12_LMEM_BAR);
 	xe->mem.vram.io_size = min(usable_size,
 				   pci_resource_len(pdev, GEN12_LMEM_BAR));
-	xe->mem.vram.size = xe->mem.vram.io_size;
+	xe->mem.vram.size = vram_size;
 
 	if (!xe->mem.vram.size)
 		return -EIO;
-- 
2.39.1

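The one-liner above boils down to: io_size stays clamped to the BAR, while vram.size now reports the full probed VRAM, so size > io_size becomes a valid (small-BAR) state. A hedged standalone model of that relationship — probe_vram() here is a hypothetical reduction of xe_mmio_probe_vram(), ignoring the stolen/DSM carve-outs:

```c
#include <assert.h>
#include <stdint.h>

/* io_size is limited by the PCI BAR; with this patch the vram size is
 * the full usable size rather than the clamped value, so on small-BAR
 * systems size ends up larger than io_size. */
static void probe_vram(uint64_t usable_size, uint64_t bar_len,
		       uint64_t *vram_size, uint64_t *io_size)
{
	*io_size = usable_size < bar_len ? usable_size : bar_len;
	*vram_size = usable_size;	/* previously: *vram_size = *io_size */
}
```

E.g. 16G of usable VRAM behind a 256M BAR now yields io_size = 256M and size = 16G, where before both would have been 256M.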


* Re: [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar
  2023-02-14 11:21 ` [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar Matthew Auld
@ 2023-02-14 16:11   ` Lucas De Marchi
  2023-02-14 17:51     ` Matthew Auld
  0 siblings, 1 reply; 17+ messages in thread
From: Lucas De Marchi @ 2023-02-14 16:11 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-xe

On Tue, Feb 14, 2023 at 11:21:23AM +0000, Matthew Auld wrote:
>The driver should still be functional with small-bar, just that the vram
>size is clamped to the BAR size (until we add proper support for tiered
>vram). For stolen we shouldn't iomap anything if the BAR size doesn't
>also contain the stolen portion. Stolen should still be functional,

maybe a sentence that for the known platforms stolen is on top of
the device memory, so if BAR2 doesn't cover all the memory, it's not
accessible?


>just that allocating CPU visible io memory will always return an error.
>
>Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/209
>Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>---
> drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>index b4e9c88644e4..e5a992ce432a 100644
>--- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>+++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>@@ -150,7 +150,7 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
> 	drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu bytes\n",
> 		    stolen_size);
>
>-	if (!xe_ttm_stolen_inaccessible(xe))
>+	if (mgr->io_base && !xe_ttm_stolen_inaccessible(xe))


what I wrote above made me think:  shouldn't the check for mgr->io_base
be in the xe_ttm_stolen_inaccessible()? or should we differentiate
between CPU vs GPU accessible?

Lucas De Marchi

> 		mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, stolen_size);
> }
>
>-- 
>2.39.1
>


* Re: [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar
  2023-02-14 16:11   ` Lucas De Marchi
@ 2023-02-14 17:51     ` Matthew Auld
  0 siblings, 0 replies; 17+ messages in thread
From: Matthew Auld @ 2023-02-14 17:51 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: intel-xe

On 14/02/2023 16:11, Lucas De Marchi wrote:
> On Tue, Feb 14, 2023 at 11:21:23AM +0000, Matthew Auld wrote:
>> The driver should still be functional with small-bar, just that the vram
>> size is clamped to the BAR size (until we add proper support for tiered
>> vram). For stolen we shouldn't iomap anything if the BAR size doesn't
>> also contain the stolen portion. Stolen should still be functional,
> 
> maybe a sentence that for the known platforms stolen is on top of
> the device memory, so if BAR2 doesn't cover all the memory, it's not
> accessible?

Ok.

> 
> 
>> just that allocating CPU visible io memory will always return an error.
>>
>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/209
>> Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c 
>> b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> index b4e9c88644e4..e5a992ce432a 100644
>> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
>> @@ -150,7 +150,7 @@ void xe_ttm_stolen_mgr_init(struct xe_device *xe)
>>     drm_dbg_kms(&xe->drm, "Initialized stolen memory support with %llu 
>> bytes\n",
>>             stolen_size);
>>
>> -    if (!xe_ttm_stolen_inaccessible(xe))
>> +    if (mgr->io_base && !xe_ttm_stolen_inaccessible(xe))
> 
> 
> what I wrote above made me think:  shouldn't the check for mgr->io_base
> be in the xe_ttm_stolen_inaccessible()? or should we differentiate
> between CPU vs GPU accessible?

Yeah, I think making it clear that this is talking about CPU access makes 
sense. Also GPU access should always be possible. I'll try to move the 
io_base check into xe_ttm_stolen_inaccessible() in the next version.

> 
> Lucas De Marchi
> 
>>         mgr->mapping = devm_ioremap_wc(&pdev->dev, mgr->io_base, 
>> stolen_size);
>> }
>>
>> -- 
>> 2.39.1
>>

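A rough sketch of the refactor being agreed on here — folding the io_base check into a CPU-accessibility predicate so callers need only one condition. Names and fields below are illustrative stand-ins, not the final kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal hypothetical stand-in for the stolen manager state. */
struct stolen_mgr {
	uint64_t io_base;	/* 0 when BAR2 doesn't cover stolen */
};

/* CPU access needs a valid io base; GPU access is assumed to always be
 * possible, so only the CPU side is gated here. */
static bool stolen_cpu_inaccessible(const struct stolen_mgr *mgr)
{
	return mgr->io_base == 0;
}

/* The caller side then collapses to a single check, as suggested in the
 * review, instead of testing io_base separately. */
static bool should_map_stolen(const struct stolen_mgr *mgr)
{
	return !stolen_cpu_inaccessible(mgr);
}
```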

end of thread, other threads:[~2023-02-14 17:51 UTC | newest]

Thread overview: 17+ messages
2023-02-14 11:21 [Intel-xe] [PATCH 01/15] drm/xe/mmio: don't leak the forcewake ref Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 02/15] drm/xe/stolen: don't map stolen on small-bar Matthew Auld
2023-02-14 16:11   ` Lucas De Marchi
2023-02-14 17:51     ` Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 03/15] drm/xe/mmio: s/lmem/vram/ Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 04/15] drm/xe: prefer xe_bo_create_pin_map() Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 05/15] drm/xe/bo: explicitly reject zero sized BO Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 06/15] drm/xe/query: zero the region info Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 07/15] drm/xe/vram: start tracking the io_size Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 08/15] drm/xe/buddy: remove the virtualized start Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 09/15] drm/xe/buddy: add visible tracking Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 10/15] drm/xe/buddy: add compatible and intersects hooks Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 11/15] drm/xe/bo: support tiered vram allocation for small-bar Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 12/15] drm/xe/migrate: retain CCS aux state for vram -> vram Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 13/15] drm/xe/display: annotate CC buffers with NEEDS_CPU_ACCESS Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 14/15] drm/xe/uapi: add the userspace bits for small-bar Matthew Auld
2023-02-14 11:21 ` [Intel-xe] [PATCH 15/15] drm/xe: fully turn on small-bar support Matthew Auld
