linux-kernel.vger.kernel.org archive mirror
* [RESEND PATCH 0/3] drm/i915: Replace kmap() with kmap_local_page()
@ 2022-10-16 18:09 Fabio M. De Francesco
  2022-10-16 18:09 ` [RESEND PATCH 1/3] " Fabio M. De Francesco
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Fabio M. De Francesco @ 2022-10-16 18:09 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, Zhenyu Wang, Zhi Wang, Matthew Auld,
	Thomas Hellström, Nirmoy Das, Maarten Lankhorst,
	Christian König, Matt Roper, John Harrison,
	Daniele Ceraolo Spurio, Lucas De Marchi, intel-gfx, dri-devel,
	linux-kernel, intel-gvt-dev
  Cc: Fabio M. De Francesco, Venkataramanan, Anirudh, Ira Weiny

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) it incurs overhead, because
the mapping space is limited and protected by a global lock for
synchronization, and (2) it requires global TLB invalidation when the
kmap pool wraps, and it may block when the mapping space is fully
utilized, until a slot becomes available.

With kmap_local_page(), the mappings are per-thread and CPU-local, code
holding them can take page faults, and the function can be called from
any context (including interrupts). It is faster than kmap() in kernels
with HIGHMEM enabled. Furthermore, tasks can be preempted and, when they
are scheduled to run again, the kernel virtual addresses are restored
and still valid.

Since its use in drm/i915 is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in drm/i915.
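
For readers unfamiliar with the conversion, the pattern can be sketched
in userspace as follows. This is only an illustration under stated
assumptions: struct page, kmap_local_page() and kunmap_local() below are
hypothetical stand-ins that model a per-context stack of mappings, not
the kernel implementations, and read_from_page(), read_u32_at() and
copy_between_pages() are made-up names. The sketch shows the two points
that differ from kmap()/kunmap(): the unmap takes the mapped address
(any address inside the page is accepted) rather than the struct page,
and nested mappings are released in reverse order.

```c
/* Userspace sketch only: every kernel-looking name here is a stand-in. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define MAX_LOCAL_MAPS 16

/* Stand-in page: page-aligned so that masking an address finds its base. */
struct page { _Alignas(4096) unsigned char data[PAGE_SIZE]; };

/* Model of the local mapping stack (an assumption of this sketch). */
static void *map_stack[MAX_LOCAL_MAPS];
static int map_depth;

static void *kmap_local_page(struct page *page)
{
	assert(map_depth < MAX_LOCAL_MAPS);
	map_stack[map_depth++] = page->data;
	return page->data;
}

/* Takes an address within the mapped page, not the struct page. */
static void kunmap_local(const void *addr)
{
	void *base = (void *)((uintptr_t)addr & PAGE_MASK);

	/* Local mappings must be released in reverse (LIFO) order. */
	assert(map_depth > 0 && map_stack[map_depth - 1] == base);
	map_depth--;
}

/* Shape of the conversion: map, access, unmap with the address. */
static int read_from_page(struct page *page, size_t offset, size_t len,
			  char *dst)
{
	char *vaddr = kmap_local_page(page);

	memcpy(dst, vaddr + offset, len);
	kunmap_local(vaddr);		/* was: kunmap(page) */
	return 0;
}

/* Unmapping through a pointer that points into the middle of the page. */
static uint32_t read_u32_at(struct page *page, size_t offset)
{
	char *cpu = (char *)kmap_local_page(page) + offset;
	uint32_t val;

	memcpy(&val, cpu, sizeof(val));
	kunmap_local(cpu);		/* interior address is accepted */
	return val;
}

/* Two nested mappings: the last one mapped is the first one unmapped. */
static void copy_between_pages(struct page *dst, struct page *src)
{
	char *s = kmap_local_page(src);
	char *d = kmap_local_page(dst);

	memcpy(d, s, PAGE_SIZE);
	kunmap_local(d);
	kunmap_local(s);
}
```

In this model, releasing the two mappings in copy_between_pages() in
the wrong order would trip the assertion in kunmap_local().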

These changes should be tested on a 32-bit system, booting a kernel
with HIGHMEM enabled. Unfortunately, I have no i915-based hardware,
so any help with testing would be greatly appreciated.

I'm resending this little series because I suspect that it has been
lost since it was submitted on Aug 11, 2022. In the meantime, I'm
adding one more recipient (Anirudh) who, along with others, is helping
Ira and me with these conversions / removals of kmap() and kmap_atomic().

Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>

Fabio M. De Francesco (3):
  drm/i915: Replace kmap() with kmap_local_page()
  drm/i915/gt: Replace kmap() with kmap_local_page()
  drm/i915/gem: Replace kmap() with kmap_local_page()

 drivers/gpu/drm/i915/gem/i915_gem_shmem.c          |  6 ++----
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |  8 ++++----
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c       |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c              | 11 ++++-------
 drivers/gpu/drm/i915/i915_gem.c                    |  8 ++++----
 5 files changed, 16 insertions(+), 21 deletions(-)

-- 
2.37.1



* [RESEND PATCH 1/3] drm/i915: Replace kmap() with kmap_local_page()
  2022-10-16 18:09 [RESEND PATCH 0/3] drm/i915: Replace kmap() with kmap_local_page() Fabio M. De Francesco
@ 2022-10-16 18:09 ` Fabio M. De Francesco
  2022-10-16 18:09 ` [RESEND PATCH 2/3] drm/i915/gt: " Fabio M. De Francesco
  2022-10-16 18:09 ` [RESEND PATCH 3/3] drm/i915/gem: " Fabio M. De Francesco
  2 siblings, 0 replies; 4+ messages in thread
From: Fabio M. De Francesco @ 2022-10-16 18:09 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, Zhenyu Wang, Zhi Wang, Matthew Auld,
	Thomas Hellström, Nirmoy Das, Maarten Lankhorst,
	Christian König, Matt Roper, John Harrison,
	Daniele Ceraolo Spurio, Lucas De Marchi, intel-gfx, dri-devel,
	linux-kernel, intel-gvt-dev
  Cc: Fabio M. De Francesco, Venkataramanan, Anirudh, Ira Weiny

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) it incurs overhead, because
the mapping space is limited and protected by a global lock for
synchronization, and (2) it requires global TLB invalidation when the
kmap pool wraps, and it may block when the mapping space is fully
utilized, until a slot becomes available.

With kmap_local_page(), the mappings are per-thread and CPU-local, code
holding them can take page faults, and the function can be called from
any context (including interrupts). It is faster than kmap() in kernels
with HIGHMEM enabled. Furthermore, tasks can be preempted and, when they
are scheduled to run again, the kernel virtual addresses are restored
and still valid.

Since its use in i915_gem.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915_gem.c.

Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 702e5b89be22..43effce60e1b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -212,14 +212,14 @@ shmem_pread(struct page *page, int offset, int len, char __user *user_data,
 	char *vaddr;
 	int ret;
 
-	vaddr = kmap(page);
+	vaddr = kmap_local_page(page);
 
 	if (needs_clflush)
 		drm_clflush_virt_range(vaddr + offset, len);
 
 	ret = __copy_to_user(user_data, vaddr + offset, len);
 
-	kunmap(page);
+	kunmap_local(vaddr);
 
 	return ret ? -EFAULT : 0;
 }
@@ -634,7 +634,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
 	char *vaddr;
 	int ret;
 
-	vaddr = kmap(page);
+	vaddr = kmap_local_page(page);
 
 	if (needs_clflush_before)
 		drm_clflush_virt_range(vaddr + offset, len);
@@ -643,7 +643,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
 	if (!ret && needs_clflush_after)
 		drm_clflush_virt_range(vaddr + offset, len);
 
-	kunmap(page);
+	kunmap_local(vaddr);
 
 	return ret ? -EFAULT : 0;
 }
-- 
2.37.1



* [RESEND PATCH 2/3] drm/i915/gt: Replace kmap() with kmap_local_page()
  2022-10-16 18:09 [RESEND PATCH 0/3] drm/i915: Replace kmap() with kmap_local_page() Fabio M. De Francesco
  2022-10-16 18:09 ` [RESEND PATCH 1/3] " Fabio M. De Francesco
@ 2022-10-16 18:09 ` Fabio M. De Francesco
  2022-10-16 18:09 ` [RESEND PATCH 3/3] drm/i915/gem: " Fabio M. De Francesco
  2 siblings, 0 replies; 4+ messages in thread
From: Fabio M. De Francesco @ 2022-10-16 18:09 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, Zhenyu Wang, Zhi Wang, Matthew Auld,
	Thomas Hellström, Nirmoy Das, Maarten Lankhorst,
	Christian König, Matt Roper, John Harrison,
	Daniele Ceraolo Spurio, Lucas De Marchi, intel-gfx, dri-devel,
	linux-kernel, intel-gvt-dev
  Cc: Fabio M. De Francesco, Venkataramanan, Anirudh, Ira Weiny

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) it incurs overhead, because
the mapping space is limited and protected by a global lock for
synchronization, and (2) it requires global TLB invalidation when the
kmap pool wraps, and it may block when the mapping space is fully
utilized, until a slot becomes available.

With kmap_local_page(), the mappings are per-thread and CPU-local, code
holding them can take page faults, and the function can be called from
any context (including interrupts). It is faster than kmap() in kernels
with HIGHMEM enabled. Furthermore, tasks can be preempted and, when they
are scheduled to run again, the kernel virtual addresses are restored
and still valid.

Since its use in i915/gt is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915/gt. Instead of
open-coding local mappings + memcpy() + local unmappings, use
the memcpy_{from,to}_page() helpers where these are better suited.
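
As a rough userspace illustration of why the helpers can replace the
open-coded sequence, the sketch below models them on top of trivial
stand-ins. The helper signatures follow include/linux/highmem.h, but
the bodies here are simplified assumptions: struct page,
kmap_local_page() and kunmap_local() are hypothetical mock-ups, the
cache flush done by the kernel's memcpy_to_page() is omitted, and
write_open_coded() is a made-up name for the pattern being replaced.

```c
/* Userspace sketch only: simplified stand-ins, not the kernel code. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL

struct page { unsigned char data[PAGE_SIZE]; };

/* Stand-ins: a "mapping" is simply a pointer to the page's bytes. */
static void *kmap_local_page(struct page *page) { return page->data; }
static void kunmap_local(const void *addr) { (void)addr; }

/* The open-coded pattern: map, memcpy, unmap. */
static void write_open_coded(struct page *page, size_t offset,
			     const char *from, size_t len)
{
	char *vaddr = kmap_local_page(page);

	memcpy(vaddr + offset, from, len);
	kunmap_local(vaddr);
}

/* Helper form: the same three steps behind one call. */
static void memcpy_to_page(struct page *page, size_t offset,
			   const char *from, size_t len)
{
	char *to = kmap_local_page(page);

	assert(offset + len <= PAGE_SIZE);	/* must stay within one page */
	memcpy(to + offset, from, len);
	/* The kernel helper also flushes the data cache for the page here. */
	kunmap_local(to);
}

static void memcpy_from_page(char *to, struct page *page,
			     size_t offset, size_t len)
{
	char *from = kmap_local_page(page);

	assert(offset + len <= PAGE_SIZE);
	memcpy(to, from + offset, len);
	kunmap_local(from);
}
```

Under these assumptions both forms leave the page contents identical;
the helper form just removes the local vaddr variable and the explicit
unmap from the caller.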

Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
---
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c |  4 ++--
 drivers/gpu/drm/i915/gt/shmem_utils.c        | 11 ++++-------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index 6ebda3d65086..21d8ce40b897 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -747,7 +747,7 @@ static void swizzle_page(struct page *page)
 	char *vaddr;
 	int i;
 
-	vaddr = kmap(page);
+	vaddr = kmap_local_page(page);
 
 	for (i = 0; i < PAGE_SIZE; i += 128) {
 		memcpy(temp, &vaddr[i], 64);
@@ -755,7 +755,7 @@ static void swizzle_page(struct page *page)
 		memcpy(&vaddr[i + 64], temp, 64);
 	}
 
-	kunmap(page);
+	kunmap_local(vaddr);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 402f085f3a02..48edbb8a33e5 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -98,22 +98,19 @@ static int __shmem_rw(struct file *file, loff_t off,
 		unsigned int this =
 			min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
 		struct page *page;
-		void *vaddr;
 
 		page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
 						   GFP_KERNEL);
 		if (IS_ERR(page))
 			return PTR_ERR(page);
 
-		vaddr = kmap(page);
 		if (write) {
-			memcpy(vaddr + offset_in_page(off), ptr, this);
+			memcpy_to_page(page, offset_in_page(off), ptr, this);
 			set_page_dirty(page);
 		} else {
-			memcpy(ptr, vaddr + offset_in_page(off), this);
+			memcpy_from_page(ptr, page, offset_in_page(off), this);
 		}
 		mark_page_accessed(page);
-		kunmap(page);
 		put_page(page);
 
 		len -= this;
@@ -140,11 +137,11 @@ int shmem_read_to_iosys_map(struct file *file, loff_t off,
 		if (IS_ERR(page))
 			return PTR_ERR(page);
 
-		vaddr = kmap(page);
+		vaddr = kmap_local_page(page);
 		iosys_map_memcpy_to(map, map_off, vaddr + offset_in_page(off),
 				    this);
 		mark_page_accessed(page);
-		kunmap(page);
+		kunmap_local(vaddr);
 		put_page(page);
 
 		len -= this;
-- 
2.37.1



* [RESEND PATCH 3/3] drm/i915/gem: Replace kmap() with kmap_local_page()
  2022-10-16 18:09 [RESEND PATCH 0/3] drm/i915: Replace kmap() with kmap_local_page() Fabio M. De Francesco
  2022-10-16 18:09 ` [RESEND PATCH 1/3] " Fabio M. De Francesco
  2022-10-16 18:09 ` [RESEND PATCH 2/3] drm/i915/gt: " Fabio M. De Francesco
@ 2022-10-16 18:09 ` Fabio M. De Francesco
  2 siblings, 0 replies; 4+ messages in thread
From: Fabio M. De Francesco @ 2022-10-16 18:09 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, Zhenyu Wang, Zhi Wang, Matthew Auld,
	Thomas Hellström, Nirmoy Das, Maarten Lankhorst,
	Christian König, Matt Roper, John Harrison,
	Daniele Ceraolo Spurio, Lucas De Marchi, intel-gfx, dri-devel,
	linux-kernel, intel-gvt-dev
  Cc: Fabio M. De Francesco, Venkataramanan, Anirudh, Ira Weiny

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) it incurs overhead, because
the mapping space is limited and protected by a global lock for
synchronization, and (2) it requires global TLB invalidation when the
kmap pool wraps, and it may block when the mapping space is fully
utilized, until a slot becomes available.

With kmap_local_page(), the mappings are per-thread and CPU-local, code
holding them can take page faults, and the function can be called from
any context (including interrupts). It is faster than kmap() in kernels
with HIGHMEM enabled. Furthermore, tasks can be preempted and, when they
are scheduled to run again, the kernel virtual addresses are restored
and still valid.

Since its use in i915/gem is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915/gem. Instead of
open-coding local map + memcpy + local unmap, use memcpy_to_page() at a
suitable call site.

Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c          | 6 ++----
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 8 ++++----
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 4eed3dd90ba8..2bc6ab9964ff 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -640,16 +640,14 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
 	do {
 		unsigned int len = min_t(typeof(size), size, PAGE_SIZE);
 		struct page *page;
-		void *pgdata, *vaddr;
+		void *pgdata;
 
 		err = aops->write_begin(file, file->f_mapping, offset, len,
 					&page, &pgdata);
 		if (err < 0)
 			goto fail;
 
-		vaddr = kmap(page);
-		memcpy(vaddr, data, len);
-		kunmap(page);
+		memcpy_to_page(page, 0, data, len);
 
 		err = aops->write_end(file, file->f_mapping, offset, len, len,
 				      page, pgdata);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 3ced9948a331..bb25b50b5688 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -153,7 +153,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 	intel_gt_flush_ggtt_writes(to_gt(i915));
 
 	p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-	cpu = kmap(p) + offset_in_page(offset);
+	cpu = kmap_local_page(p) + offset_in_page(offset);
 	drm_clflush_virt_range(cpu, sizeof(*cpu));
 	if (*cpu != (u32)page) {
 		pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
@@ -171,7 +171,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 	}
 	*cpu = 0;
 	drm_clflush_virt_range(cpu, sizeof(*cpu));
-	kunmap(p);
+	kunmap_local(cpu);
 
 out:
 	i915_gem_object_lock(obj, NULL);
@@ -249,7 +249,7 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj,
 		intel_gt_flush_ggtt_writes(to_gt(i915));
 
 		p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
-		cpu = kmap(p) + offset_in_page(offset);
+		cpu = kmap_local_page(p) + offset_in_page(offset);
 		drm_clflush_virt_range(cpu, sizeof(*cpu));
 		if (*cpu != (u32)page) {
 			pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
@@ -267,7 +267,7 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj,
 		}
 		*cpu = 0;
 		drm_clflush_virt_range(cpu, sizeof(*cpu));
-		kunmap(p);
+		kunmap_local(cpu);
 		if (err)
 			return err;
 
-- 
2.37.1


