* [PATCH 0/5] drm/ttm, drm/vmwgfx: Use kmap_atomic() for cpu blit
From: Thomas Hellstrom @ 2018-01-16 13:56 UTC
  To: dri-devel

The purpose of this series is to remove the often full-screen vmwgfx sequence
vmap()
blit()
vunmap()
and replace it with kmap_atomic()-style per-page maps.
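
A rough sketch of the new shape, for illustration only (assuming x86, where
kmap_atomic_prot() is available; blit_per_page() and its parameters are made
up for this example, the real code is introduced in patch 4/5):

static void blit_per_page(struct page **dst_pages, u32 num_pages,
			  pgprot_t prot, const u8 *src, u32 len)
{
	u32 i, copied = 0;

	for (i = 0; i < num_pages && copied < len; ++i) {
		u32 chunk = min_t(u32, len - copied, PAGE_SIZE);
		/* Map, copy and unmap one page at a time. */
		u8 *dst = kmap_atomic_prot(dst_pages[i], prot);

		memcpy(dst, src + copied, chunk);
		kunmap_atomic(dst);
		copied += chunk;
	}
}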

Although 32-bit VMs are somewhat rare nowadays, they restrict the vmap space
so that huge vmaps may sometimes fail. Also, large vmaps lead to frequent
global TLB flushes.

On the downside, using kmap_atomic() makes the blit code a lot more complex,
but hopefully we shouldn't need to revisit that code. On 64-bit architectures,
kmap_atomic() is essentially free as long as the mapping is cached, since then
the linear kernel map is reused. On 32-bit architectures the same holds for
lowmem pages, and for highmem pages the TLB flushes are local and per-page.
Preliminary findings on 32-bit VMs show that while the blit itself consumes
more CPU with the new approach, the system idle time is about the same or
slightly increased.

The CPU blit was originally written as a TTM utility, for use also with CPU
buffer-object moves, but until that code is written and tested we're keeping
it in vmwgfx.

In addition, we've added the ability to compute the bounding box of the
differences between the source and the destination. The overhead of computing
that bounding box is small when integrated with a CPU blit that has to be
performed anyway, and it comes in very handy for remoting of 2D-only VMs where
the damage rects are broken (Ubuntu / compiz) or for page-flipping
(gnome-shell).
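
The idea, reduced to one dimension for illustration (the real code in patch
4/5 tracks a 2D rect and compares with wide integer loads; diff_copy_line()
is a made-up name):

/* Copy a line, recording the span that actually changed. */
static void diff_copy_line(u8 *dst, const u8 *src, size_t len,
			   size_t *first, size_t *last)
{
	size_t i;

	for (i = 0; i < len; ++i) {
		if (dst[i] != src[i]) {
			*first = min(*first, i);
			*last = max(*last, i + 1);
			dst[i] = src[i];
		}
	}
}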

Note that a slightly modified version of patch 3/5 is already upstream in 4.15.
This series is against drm-next, and that patch is included here for
completeness.



* [PATCH 1/5] drm/ttm: Clean up kmap_atomic_prot selection code
From: Thomas Hellstrom @ 2018-01-16 13:56 UTC
  To: dri-devel; +Cc: Thomas Hellstrom, Christian König

Use helpers to perform the kmap_atomic_prot() functionality to
a) Avoid in-function ifdefs that violate the kernel coding policy,
b) Facilitate exporting the functionality.

This commit should not change any functionality.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 64 +++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 153de1b..6d6d939 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -255,6 +255,33 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
 	return 0;
 }
 
+#ifdef CONFIG_X86
+#define __ttm_kmap_atomic_prot(__page, __prot) kmap_atomic_prot(__page, __prot)
+#define __ttm_kunmap_atomic(__addr) kunmap_atomic(__addr)
+#else
+#define __ttm_kmap_atomic_prot(__page, __prot) vmap(&__page, 1, 0,  __prot)
+#define __ttm_kunmap_atomic(__addr) vunmap(__addr)
+#endif
+
+static void *ttm_kmap_atomic_prot(struct page *page,
+				  pgprot_t prot)
+{
+	if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
+		return kmap_atomic(page);
+	else
+		return __ttm_kmap_atomic_prot(page, prot);
+}
+
+
+static void ttm_kunmap_atomic_prot(void *addr,
+				   pgprot_t prot)
+{
+	if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
+		kunmap_atomic(addr);
+	else
+		__ttm_kunmap_atomic(addr);
+}
+
 static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
 				unsigned long page,
 				pgprot_t prot)
@@ -266,28 +293,13 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
 		return -ENOMEM;
 
 	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));
-
-#ifdef CONFIG_X86
-	dst = kmap_atomic_prot(d, prot);
-#else
-	if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL))
-		dst = vmap(&d, 1, 0, prot);
-	else
-		dst = kmap(d);
-#endif
+	dst = ttm_kmap_atomic_prot(d, prot);
 	if (!dst)
 		return -ENOMEM;
 
 	memcpy_fromio(dst, src, PAGE_SIZE);
 
-#ifdef CONFIG_X86
-	kunmap_atomic(dst);
-#else
-	if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL))
-		vunmap(dst);
-	else
-		kunmap(d);
-#endif
+	ttm_kunmap_atomic_prot(dst, prot);
 
 	return 0;
 }
@@ -303,27 +315,13 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
 		return -ENOMEM;
 
 	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));
-#ifdef CONFIG_X86
-	src = kmap_atomic_prot(s, prot);
-#else
-	if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL))
-		src = vmap(&s, 1, 0, prot);
-	else
-		src = kmap(s);
-#endif
+	src = ttm_kmap_atomic_prot(s, prot);
 	if (!src)
 		return -ENOMEM;
 
 	memcpy_toio(dst, src, PAGE_SIZE);
 
-#ifdef CONFIG_X86
-	kunmap_atomic(src);
-#else
-	if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL))
-		vunmap(src);
-	else
-		kunmap(s);
-#endif
+	ttm_kunmap_atomic_prot(src, prot);
 
 	return 0;
 }
-- 
2.7.4


* [PATCH 2/5] drm/ttm: Export the ttm_k[un]map_atomic_prot API.
From: Thomas Hellstrom @ 2018-01-16 13:56 UTC
  To: dri-devel; +Cc: Thomas Hellstrom, Christian König

It will be used by the vmwgfx CPU blit.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 31 ++++++++++++++++++++++++++-----
 include/drm/ttm/ttm_bo_api.h      |  4 ++++
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 6d6d939..9d4c7f8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -263,24 +263,45 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page)
 #define __ttm_kunmap_atomic(__addr) vunmap(__addr)
 #endif
 
-static void *ttm_kmap_atomic_prot(struct page *page,
-				  pgprot_t prot)
+
+/**
+ * ttm_kmap_atomic_prot - Efficient kernel map of a single page with
+ * specified page protection.
+ *
+ * @page: The page to map.
+ * @prot: The page protection.
+ *
+ * This function maps a TTM page using the kmap_atomic api if available,
+ * otherwise falls back to vmap. The user must make sure that the
+ * specified page does not have an aliased mapping with a different caching
+ * policy unless the architecture explicitly allows it. Also mapping and
+ * unmapping using this api must be correctly nested. Unmapping should
+ * occur in the reverse order of mapping.
+ */
+void *ttm_kmap_atomic_prot(struct page *page, pgprot_t prot)
 {
 	if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
 		return kmap_atomic(page);
 	else
 		return __ttm_kmap_atomic_prot(page, prot);
 }
+EXPORT_SYMBOL(ttm_kmap_atomic_prot);
 
-
-static void ttm_kunmap_atomic_prot(void *addr,
-				   pgprot_t prot)
+/**
+ * ttm_kunmap_atomic_prot - Unmap a page that was mapped using
+ * ttm_kmap_atomic_prot.
+ *
+ * @addr: The virtual address from the map.
+ * @prot: The page protection.
+ */
+void ttm_kunmap_atomic_prot(void *addr, pgprot_t prot)
 {
 	if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
 		kunmap_atomic(addr);
 	else
 		__ttm_kunmap_atomic(addr);
 }
+EXPORT_SYMBOL(ttm_kunmap_atomic_prot);
 
 static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
 				unsigned long page,
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index f1c74c2..936e5d5 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -728,6 +728,10 @@ unsigned long ttm_bo_default_io_mem_pfn(struct ttm_buffer_object *bo,
 int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma,
 		struct ttm_bo_device *bdev);
 
+void *ttm_kmap_atomic_prot(struct page *page, pgprot_t prot);
+
+void ttm_kunmap_atomic_prot(void *addr, pgprot_t prot);
+
 /**
  * ttm_bo_io
  *
-- 
2.7.4
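
A minimal usage sketch of the exported pair, observing the nesting rule from
the kernel-doc above (dst_page and src_page are hypothetical struct page
pointers, and mem is the relevant struct ttm_mem_reg):

	pgprot_t prot = ttm_io_prot(mem->placement, PAGE_KERNEL);
	void *dst, *src;

	dst = ttm_kmap_atomic_prot(dst_page, prot);
	src = ttm_kmap_atomic_prot(src_page, prot);
	if (dst && src)
		memcpy(dst, src, PAGE_SIZE);
	/* Unmap in the reverse order of mapping. */
	if (src)
		ttm_kunmap_atomic_prot(src, prot);
	if (dst)
		ttm_kunmap_atomic_prot(dst, prot);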


* [PATCH 3/5] drm/vmwgfx: Don't cache framebuffer maps
From: Thomas Hellstrom @ 2018-01-16 13:56 UTC
  To: dri-devel; +Cc: Thomas Hellstrom

Buffer objects need to be either pinned or reserved while a map is active,
and that's not the case here, so avoid caching the framebuffer map.
This will cause increased mapping activity, mainly when we don't do
page flipping.

This fixes occasional garbage-filled screens when the framebuffer has been
evicted after the map.

Since in-kernel mapping of whole buffer objects is error-prone on 32-bit
architectures and also quite inefficient, we will revisit this later.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c  |  6 ------
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.h  |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 40 +++++++++++-------------------------
 3 files changed, 13 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 34ee856..96737d5 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -694,7 +694,6 @@ vmw_du_plane_duplicate_state(struct drm_plane *plane)
 	vps->pinned = 0;
 
 	/* Mapping is managed by prepare_fb/cleanup_fb */
-	memset(&vps->guest_map, 0, sizeof(vps->guest_map));
 	memset(&vps->host_map, 0, sizeof(vps->host_map));
 	vps->cpp = 0;
 
@@ -757,11 +756,6 @@ vmw_du_plane_destroy_state(struct drm_plane *plane,
 
 
 	/* Should have been freed by cleanup_fb */
-	if (vps->guest_map.virtual) {
-		DRM_ERROR("Guest mapping not freed\n");
-		ttm_bo_kunmap(&vps->guest_map);
-	}
-
 	if (vps->host_map.virtual) {
 		DRM_ERROR("Host mapping not freed\n");
 		ttm_bo_kunmap(&vps->host_map);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index 70a7be2..42b0f15 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -177,7 +177,7 @@ struct vmw_plane_state {
 	int pinned;
 
 	/* For CPU Blit */
-	struct ttm_bo_kmap_obj host_map, guest_map;
+	struct ttm_bo_kmap_obj host_map;
 	unsigned int cpp;
 };
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 0e771d6..6de2874 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -114,7 +114,7 @@ struct vmw_screen_target_display_unit {
 	bool defined;
 
 	/* For CPU Blit */
-	struct ttm_bo_kmap_obj host_map, guest_map;
+	struct ttm_bo_kmap_obj host_map;
 	unsigned int cpp;
 };
 
@@ -641,7 +641,8 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 	s32 src_pitch, dst_pitch;
 	u8 *src, *dst;
 	bool not_used;
-
+	struct ttm_bo_kmap_obj guest_map;
+	int ret;
 
 	if (!dirty->num_hits)
 		return;
@@ -652,6 +653,13 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 	if (width == 0 || height == 0)
 		return;
 
+	ret = ttm_bo_kmap(&ddirty->buf->base, 0, ddirty->buf->base.num_pages,
+			  &guest_map);
+	if (ret) {
+		DRM_ERROR("Failed mapping framebuffer for blit: %d\n",
+			  ret);
+		goto out_cleanup;
+	}
 
 	/* Assume we are blitting from Host (display_srf) to Guest (dmabuf) */
 	src_pitch = stdu->display_srf->base_size.width * stdu->cpp;
@@ -659,7 +667,7 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 	src += ddirty->top * src_pitch + ddirty->left * stdu->cpp;
 
 	dst_pitch = ddirty->pitch;
-	dst = ttm_kmap_obj_virtual(&stdu->guest_map, &not_used);
+	dst = ttm_kmap_obj_virtual(&guest_map, &not_used);
 	dst += ddirty->fb_top * dst_pitch + ddirty->fb_left * stdu->cpp;
 
 
@@ -718,6 +726,7 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 		vmw_fifo_commit(dev_priv, sizeof(*cmd));
 	}
 
+	ttm_bo_kunmap(&guest_map);
 out_cleanup:
 	ddirty->left = ddirty->top = ddirty->fb_left = ddirty->fb_top = S32_MAX;
 	ddirty->right = ddirty->bottom = S32_MIN;
@@ -1062,9 +1071,6 @@ vmw_stdu_primary_plane_cleanup_fb(struct drm_plane *plane,
 {
 	struct vmw_plane_state *vps = vmw_plane_state_to_vps(old_state);
 
-	if (vps->guest_map.virtual)
-		ttm_bo_kunmap(&vps->guest_map);
-
 	if (vps->host_map.virtual)
 		ttm_bo_kunmap(&vps->host_map);
 
@@ -1230,33 +1236,11 @@ vmw_stdu_primary_plane_prepare_fb(struct drm_plane *plane,
 	 */
 	if (vps->content_fb_type == SEPARATE_DMA &&
 	    !(dev_priv->capabilities & SVGA_CAP_3D)) {
-
-		struct vmw_framebuffer_dmabuf *new_vfbd;
-
-		new_vfbd = vmw_framebuffer_to_vfbd(new_fb);
-
-		ret = ttm_bo_reserve(&new_vfbd->buffer->base, false, false,
-				     NULL);
-		if (ret)
-			goto out_srf_unpin;
-
-		ret = ttm_bo_kmap(&new_vfbd->buffer->base, 0,
-				  new_vfbd->buffer->base.num_pages,
-				  &vps->guest_map);
-
-		ttm_bo_unreserve(&new_vfbd->buffer->base);
-
-		if (ret) {
-			DRM_ERROR("Failed to map content buffer to CPU\n");
-			goto out_srf_unpin;
-		}
-
 		ret = ttm_bo_kmap(&vps->surf->res.backup->base, 0,
 				  vps->surf->res.backup->base.num_pages,
 				  &vps->host_map);
 		if (ret) {
 			DRM_ERROR("Failed to map display buffer to CPU\n");
-			ttm_bo_kunmap(&vps->guest_map);
 			goto out_srf_unpin;
 		}
 
-- 
2.7.4
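
After this change the map is transient, held only for the duration of the
blit, roughly as follows (a sketch; blit_once() is a made-up name and the
caller is assumed to hold the bo reservation):

static void blit_once(struct vmw_dma_buffer *buf, const u8 *src, size_t size)
{
	struct ttm_bo_kmap_obj map;
	bool is_iomem;
	int ret;

	ret = ttm_bo_kmap(&buf->base, 0, buf->base.num_pages, &map);
	if (ret) {
		DRM_ERROR("Failed mapping framebuffer for blit: %d\n", ret);
		return;
	}

	memcpy(ttm_kmap_obj_virtual(&map, &is_iomem), src, size);
	ttm_bo_kunmap(&map);
}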


* [PATCH 4/5] drm/vmwgfx: Add a cpu blit utility that can be used for page-backed bos
From: Thomas Hellstrom @ 2018-01-16 13:56 UTC
  To: dri-devel; +Cc: Thomas Hellstrom

The utility uses kmap_atomic() instead of vmapping the whole buffer
object. As a result there is more book-keeping, but on some
architectures this helps avoid exhausting vmalloc space and also
avoids expensive TLB flushes.

The blit utility also adds a provision to compute a bounding box of
changed content, which is very useful for optimizing presentation speed
of ill-behaved applications that don't supply proper damage regions, and
for page-flips. Computing the bounding box is cheap when done as part of
a CPU-blit utility like this.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
---
 drivers/gpu/drm/vmwgfx/Makefile      |   2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 506 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h  |  48 ++++
 3 files changed, 555 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c

diff --git a/drivers/gpu/drm/vmwgfx/Makefile b/drivers/gpu/drm/vmwgfx/Makefile
index ad80211..794cc9d 100644
--- a/drivers/gpu/drm/vmwgfx/Makefile
+++ b/drivers/gpu/drm/vmwgfx/Makefile
@@ -7,6 +7,6 @@ vmwgfx-y := vmwgfx_execbuf.o vmwgfx_gmr.o vmwgfx_kms.o vmwgfx_drv.o \
 	    vmwgfx_surface.o vmwgfx_prime.o vmwgfx_mob.o vmwgfx_shader.o \
 	    vmwgfx_cmdbuf_res.o vmwgfx_cmdbuf.o vmwgfx_stdu.o \
 	    vmwgfx_cotable.o vmwgfx_so.o vmwgfx_binding.o vmwgfx_msg.o \
-	    vmwgfx_simple_resource.o vmwgfx_va.o
+	    vmwgfx_simple_resource.o vmwgfx_va.o vmwgfx_blit.o
 
 obj-$(CONFIG_DRM_VMWGFX) := vmwgfx.o
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
new file mode 100644
index 0000000..2730403
--- /dev/null
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -0,0 +1,506 @@
+/**************************************************************************
+ *
+ * Copyright © 2017 VMware, Inc., Palo Alto, CA., USA
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#include "vmwgfx_drv.h"
+
+/*
+ * Template that implements find_first_diff() for a generic
+ * unsigned integer type. @size and return value are in bytes.
+ */
+#define VMW_FIND_FIRST_DIFF(_type)			 \
+static size_t vmw_find_first_diff_ ## _type		 \
+	(const _type * dst, const _type * src, size_t size)\
+{							 \
+	size_t i;					 \
+							 \
+	for (i = 0; i < size; i += sizeof(_type)) {	 \
+		if (*dst++ != *src++)			 \
+			break;				 \
+	}						 \
+							 \
+	return i;					 \
+}
+
+
+/*
+ * Template that implements find_last_diff() for a generic
+ * unsigned integer type. Pointers point to the item following the
+ * *end* of the area to be examined. @size and return value are in
+ * bytes.
+ */
+#define VMW_FIND_LAST_DIFF(_type)					\
+static ssize_t vmw_find_last_diff_ ## _type(				\
+	const _type * dst, const _type * src, size_t size)		\
+{									\
+	while (size) {							\
+		if (*--dst != *--src)					\
+			break;						\
+									\
+		size -= sizeof(_type);					\
+	}								\
+	return size;							\
+}
+
+
+/*
+ * Instantiate find diff functions for relevant unsigned integer sizes,
+ * assuming that wider integers are faster (including aligning) up to the
+ * architecture native width, which is assumed to be 32 bit unless
+ * CONFIG_64BIT is defined.
+ */
+VMW_FIND_FIRST_DIFF(u8);
+VMW_FIND_LAST_DIFF(u8);
+
+VMW_FIND_FIRST_DIFF(u16);
+VMW_FIND_LAST_DIFF(u16);
+
+VMW_FIND_FIRST_DIFF(u32);
+VMW_FIND_LAST_DIFF(u32);
+
+#ifdef CONFIG_64BIT
+VMW_FIND_FIRST_DIFF(u64);
+VMW_FIND_LAST_DIFF(u64);
+#endif
+
+
+/* We use size aligned copies. This computes (addr - align(addr)) */
+#define SPILL(_var, _type) ((unsigned long) _var & (sizeof(_type) - 1))
+
+
+/*
+ * Template to compute find_first_diff() for a certain integer type
+ * including a head copy for alignment, and adjustment of parameters
+ * for tail find or increased resolution find using an unsigned integer find
+ * of smaller width. If finding is complete, and resolution is sufficient,
+ * the macro executes a return statement. Otherwise it falls through.
+ */
+#define VMW_TRY_FIND_FIRST_DIFF(_type)					\
+do {									\
+	unsigned int spill = SPILL(dst, _type);				\
+	size_t diff_offs;						\
+									\
+	if (spill && spill == SPILL(src, _type) &&			\
+	    sizeof(_type) - spill <= size) {				\
+		spill = sizeof(_type) - spill;				\
+		diff_offs = vmw_find_first_diff_u8(dst, src, spill);	\
+		if (diff_offs < spill)					\
+			return round_down(offset + diff_offs, granularity); \
+									\
+		dst += spill;						\
+		src += spill;						\
+		size -= spill;						\
+		offset += spill;					\
+		spill = 0;						\
+	}								\
+	if (!spill && !SPILL(src, _type)) {				\
+		size_t to_copy = size &	 ~(sizeof(_type) - 1);		\
+									\
+		diff_offs = vmw_find_first_diff_ ## _type		\
+			((_type *) dst, (_type *) src, to_copy);	\
+		if (diff_offs >= size || granularity == sizeof(_type))	\
+			return (offset + diff_offs);			\
+									\
+		dst += diff_offs;					\
+		src += diff_offs;					\
+		size -= diff_offs;					\
+		offset += diff_offs;					\
+	}								\
+} while (0)								\
+
+
+/**
+ * vmw_find_first_diff - find the first difference between dst and src
+ *
+ * @dst: The destination address
+ * @src: The source address
+ * @size: Number of bytes to compare
+ * @granularity: The granularity needed for the return value in bytes.
+ * return: The offset from find start where the first difference was
+ * encountered in bytes. If no difference was found, the function returns
+ * a value >= @size.
+ */
+static size_t vmw_find_first_diff(const u8 *dst, const u8 *src, size_t size,
+				  size_t granularity)
+{
+	size_t offset = 0;
+
+	/*
+	 * Try finding with large integers if alignment allows, or we can
+	 * fix it. Fall through if we need better resolution or alignment
+	 * was bad.
+	 */
+#ifdef CONFIG_64BIT
+	VMW_TRY_FIND_FIRST_DIFF(u64);
+#endif
+	VMW_TRY_FIND_FIRST_DIFF(u32);
+	VMW_TRY_FIND_FIRST_DIFF(u16);
+
+	return round_down(offset + vmw_find_first_diff_u8(dst, src, size),
+			  granularity);
+}
+
+
+/*
+ * Template to compute find_last_diff() for a certain integer type
+ * including a tail copy for alignment, and adjustment of parameters
+ * for head find or increased resolution find using an unsigned integer find
+ * of smaller width. If finding is complete, and resolution is sufficient,
+ * the macro executes a return statement. Otherwise it falls through.
+ */
+#define VMW_TRY_FIND_LAST_DIFF(_type)					\
+do {									\
+	unsigned int spill = SPILL(dst, _type);				\
+	ssize_t location;						\
+	ssize_t diff_offs;						\
+									\
+	if (spill && spill <= size && spill == SPILL(src, _type)) {	\
+		diff_offs = vmw_find_last_diff_u8(dst, src, spill);	\
+		if (diff_offs) {					\
+			location = size - spill + diff_offs - 1;	\
+			return round_down(location, granularity);	\
+		}							\
+									\
+		dst -= spill;						\
+		src -= spill;						\
+		size -= spill;						\
+		spill = 0;						\
+	}								\
+	if (!spill && !SPILL(src, _type)) {				\
+		size_t to_copy = round_down(size, sizeof(_type));	\
+									\
+		diff_offs = vmw_find_last_diff_ ## _type		\
+			((_type *) dst, (_type *) src, to_copy);	\
+		location = size - to_copy + diff_offs - sizeof(_type);	\
+		if (location < 0 || granularity == sizeof(_type))	\
+			return location;				\
+									\
+		dst -= to_copy - diff_offs;				\
+		src -= to_copy - diff_offs;				\
+		size -= to_copy - diff_offs;				\
+	}								\
+} while (0)
+
+
+/**
+ * vmw_find_last_diff - find the last difference between dst and src
+ *
+ * @dst: The destination address
+ * @src: The source address
+ * @size: Number of bytes to compare
+ * @granularity: The granularity needed for the return value in bytes.
+ * return: The offset from find start where the last difference was
+ * encountered in bytes, or a negative value if no difference was found.
+ */
+static ssize_t vmw_find_last_diff(const u8 *dst, const u8 *src, size_t size,
+				  size_t granularity)
+{
+	dst += size;
+	src += size;
+
+#ifdef CONFIG_64BIT
+	VMW_TRY_FIND_LAST_DIFF(u64);
+#endif
+	VMW_TRY_FIND_LAST_DIFF(u32);
+	VMW_TRY_FIND_LAST_DIFF(u16);
+
+	return round_down(vmw_find_last_diff_u8(dst, src, size) - 1,
+			  granularity);
+}
+
+
+/**
+ * vmw_memcpy - A wrapper around kernel memcpy that allows plugging it into a
+ * struct vmw_diff_cpy.
+ *
+ * @diff: The struct vmw_diff_cpy closure argument (unused).
+ * @dest: The copy destination.
+ * @src: The copy source.
+ * @n: Number of bytes to copy.
+ */
+void vmw_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src, size_t n)
+{
+	memcpy(dest, src, n);
+}
+
+
+/**
+ * vmw_adjust_rect - Adjust rectangle coordinates for newly found difference
+ *
+ * @diff: The struct vmw_diff_cpy used to track the modified bounding box.
+ * @diff_offs: The offset from @diff->line_offset where the difference was
+ * found.
+ */
+static void vmw_adjust_rect(struct vmw_diff_cpy *diff, size_t diff_offs)
+{
+	size_t offs = (diff_offs + diff->line_offset) / diff->cpp;
+	struct drm_rect *rect = &diff->rect;
+
+	rect->x1 = min_t(int, rect->x1, offs);
+	rect->x2 = max_t(int, rect->x2, offs + 1);
+	rect->y1 = min_t(int, rect->y1, diff->line);
+	rect->y2 = max_t(int, rect->y2, diff->line + 1);
+}
+
+/**
+ * vmw_diff_memcpy - memcpy that creates a bounding box of modified content.
+ *
+ * @diff: The struct vmw_diff_cpy used to track the modified bounding box.
+ * @dest: The copy destination.
+ * @src: The copy source.
+ * @n: Number of bytes to copy.
+ *
+ * In order to correctly track the modified content, the field @diff->line must
+ * be pre-loaded with the current line number, the field @diff->line_offset must
+ * be pre-loaded with the line offset in bytes where the copy starts, and
+ * finally the field @diff->cpp needs to be preloaded with the number of bytes
+ * per unit in the horizontal direction of the area we're examining.
+ * Typically bytes per pixel.
+ * This is needed to know the needed granularity of the difference computing
+ * operations. A higher cpp generally leads to faster execution at the cost of
+ * bounding box width precision.
+ */
+void vmw_diff_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+		     size_t n)
+{
+	ssize_t csize, byte_len;
+
+	if (WARN_ON_ONCE(round_down(n, diff->cpp) != n))
+		return;
+
+	/* TODO: Possibly use a single vmw_find_first_diff per line? */
+	csize = vmw_find_first_diff(dest, src, n, diff->cpp);
+	if (csize < n) {
+		vmw_adjust_rect(diff, csize);
+		byte_len = diff->cpp;
+
+		/*
+		 * Starting from where first difference was found, find
+		 * location of last difference, and then copy.
+		 */
+		diff->line_offset += csize;
+		dest += csize;
+		src += csize;
+		n -= csize;
+		csize = vmw_find_last_diff(dest, src, n, diff->cpp);
+		if (csize >= 0) {
+			byte_len += csize;
+			vmw_adjust_rect(diff, csize);
+		}
+		memcpy(dest, src, byte_len);
+	}
+	diff->line_offset += n;
+}
+
+/**
+ * struct vmw_bo_blit_line_data - Convenience argument to vmw_bo_cpu_blit_line
+ *
+ * @mapped_dst: Already mapped destination page index in @dst_pages.
+ * @dst_addr: Kernel virtual address of mapped destination page.
+ * @dst_pages: Array of destination bo pages.
+ * @dst_num_pages: Number of destination bo pages.
+ * @dst_prot: Destination bo page protection.
+ * @mapped_src: Already mapped source page index in @src_pages.
+ * @src_addr: Kernel virtual address of mapped source page.
+ * @src_pages: Array of source bo pages.
+ * @src_num_pages: Number of source bo pages.
+ * @src_prot: Source bo page protection.
+ * @diff: Struct vmw_diff_cpy, in the end forwarded to the memcpy routine.
+ */
+struct vmw_bo_blit_line_data {
+	u32 mapped_dst;
+	u8 *dst_addr;
+	struct page **dst_pages;
+	u32 dst_num_pages;
+	pgprot_t dst_prot;
+	u32 mapped_src;
+	u8 *src_addr;
+	struct page **src_pages;
+	u32 src_num_pages;
+	pgprot_t src_prot;
+	struct vmw_diff_cpy *diff;
+};
+
+/**
+ * vmw_bo_cpu_blit_line - Blit part of a line from one bo to another.
+ *
+ * @d: Blit data as described above.
+ * @dst_offset: Destination copy start offset from start of bo.
+ * @src_offset: Source copy start offset from start of bo.
+ * @bytes_to_copy: Number of bytes to copy in this line.
+ */
+static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d,
+				u32 dst_offset,
+				u32 src_offset,
+				u32 bytes_to_copy)
+{
+	struct vmw_diff_cpy *diff = d->diff;
+
+	while (bytes_to_copy) {
+		u32 copy_size = bytes_to_copy;
+		u32 dst_page = dst_offset >> PAGE_SHIFT;
+		u32 src_page = src_offset >> PAGE_SHIFT;
+		u32 dst_page_offset = dst_offset & ~PAGE_MASK;
+		u32 src_page_offset = src_offset & ~PAGE_MASK;
+		bool unmap_dst = d->dst_addr && dst_page != d->mapped_dst;
+		bool unmap_src = d->src_addr && (src_page != d->mapped_src ||
+						 unmap_dst);
+
+		copy_size = min_t(u32, copy_size, PAGE_SIZE - dst_page_offset);
+		copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
+
+		if (unmap_src) {
+			ttm_kunmap_atomic_prot(d->src_addr, d->src_prot);
+			d->src_addr = NULL;
+		}
+
+		if (unmap_dst) {
+			ttm_kunmap_atomic_prot(d->dst_addr, d->dst_prot);
+			d->dst_addr = NULL;
+		}
+
+		if (!d->dst_addr) {
+			if (WARN_ON_ONCE(dst_page >= d->dst_num_pages))
+				return -EINVAL;
+
+			d->dst_addr =
+				ttm_kmap_atomic_prot(d->dst_pages[dst_page],
+						     d->dst_prot);
+			if (!d->dst_addr)
+				return -ENOMEM;
+
+			d->mapped_dst = dst_page;
+		}
+
+		if (!d->src_addr) {
+			if (WARN_ON_ONCE(src_page >= d->src_num_pages))
+				return -EINVAL;
+
+			d->src_addr =
+				ttm_kmap_atomic_prot(d->src_pages[src_page],
+						     d->src_prot);
+			if (!d->src_addr)
+				return -ENOMEM;
+
+			d->mapped_src = src_page;
+		}
+		diff->memcpy(diff, d->dst_addr + dst_page_offset,
+			     d->src_addr + src_page_offset, copy_size);
+
+		bytes_to_copy -= copy_size;
+		dst_offset += copy_size;
+		src_offset += copy_size;
+	}
+
+	return 0;
+}
+
+/**
+ * vmw_bo_cpu_blit - in-kernel cpu blit.
+ *
+ * @dst: Destination buffer object.
+ * @dst_offset: Destination offset of blit start in bytes.
+ * @dst_stride: Destination stride in bytes.
+ * @src: Source buffer object.
+ * @src_offset: Source offset of blit start in bytes.
+ * @src_stride: Source stride in bytes.
+ * @w: Width of blit.
+ * @h: Height of blit.
+ * @diff: The struct vmw_diff_cpy used to track the bounding box of changed
+ * content; in the end forwarded to the memcpy routine.
+ * return: Zero on success. Negative error value on failure. Will print out
+ * kernel warnings on caller bugs.
+ *
+ * Performs a CPU blit from one buffer object to another avoiding a full
+ * bo vmap, which may exhaust or fragment vmalloc space.
+ * On supported architectures (x86), we're using kmap_atomic, which avoids
+ * cross-processor TLB and cache flushes and may, on non-HIGHMEM systems,
+ * reference already set-up mappings.
+ *
+ * Neither of the buffer objects may be placed in PCI memory
+ * (Fixed memory in TTM terminology) when using this function.
+ */
+int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
+		    u32 dst_offset, u32 dst_stride,
+		    struct ttm_buffer_object *src,
+		    u32 src_offset, u32 src_stride,
+		    u32 w, u32 h,
+		    struct vmw_diff_cpy *diff)
+{
+	struct ttm_operation_ctx ctx = {
+		.interruptible = false,
+		.no_wait_gpu = false
+	};
+	u32 j, initial_line = dst_offset / dst_stride;
+	struct vmw_bo_blit_line_data d;
+	int ret = 0;
+
+	/* Buffer objects need to be either pinned or reserved: */
+	if (!(dst->mem.placement & TTM_PL_FLAG_NO_EVICT))
+		lockdep_assert_held(&dst->resv->lock.base);
+	if (!(src->mem.placement & TTM_PL_FLAG_NO_EVICT))
+		lockdep_assert_held(&src->resv->lock.base);
+
+	if (dst->ttm->state == tt_unpopulated) {
+		ret = dst->ttm->bdev->driver->ttm_tt_populate(dst->ttm, &ctx);
+		if (ret)
+			return ret;
+	}
+
+	if (src->ttm->state == tt_unpopulated) {
+		ret = src->ttm->bdev->driver->ttm_tt_populate(src->ttm, &ctx);
+		if (ret)
+			return ret;
+	}
+
+	d.mapped_dst = 0;
+	d.mapped_src = 0;
+	d.dst_addr = NULL;
+	d.src_addr = NULL;
+	d.dst_pages = dst->ttm->pages;
+	d.src_pages = src->ttm->pages;
+	d.dst_num_pages = dst->num_pages;
+	d.src_num_pages = src->num_pages;
+	d.dst_prot = ttm_io_prot(dst->mem.placement, PAGE_KERNEL);
+	d.src_prot = ttm_io_prot(src->mem.placement, PAGE_KERNEL);
+	d.diff = diff;
+
+	for (j = 0; j < h; ++j) {
+		diff->line = j + initial_line;
+		diff->line_offset = dst_offset % dst_stride;
+		ret = vmw_bo_cpu_blit_line(&d, dst_offset, src_offset, w);
+		if (ret)
+			goto out;
+
+		dst_offset += dst_stride;
+		src_offset += src_stride;
+	}
+out:
+	if (d.src_addr)
+		ttm_kunmap_atomic_prot(d.src_addr, d.src_prot);
+	if (d.dst_addr)
+		ttm_kunmap_atomic_prot(d.dst_addr, d.dst_prot);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 7e5f30e..99e7e426 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -678,6 +678,7 @@ extern void vmw_fence_single_bo(struct ttm_buffer_object *bo,
 				struct vmw_fence_obj *fence);
 extern void vmw_resource_evict_all(struct vmw_private *dev_priv);
 
+
 /**
  * DMA buffer helper routines - vmwgfx_dmabuf.c
  */
@@ -1165,6 +1166,53 @@ extern int vmw_cmdbuf_cur_flush(struct vmw_cmdbuf_man *man,
 				bool interruptible);
 extern void vmw_cmdbuf_irqthread(struct vmw_cmdbuf_man *man);
 
+/* CPU blit utilities - vmwgfx_blit.c */
+
+/**
+ * struct vmw_diff_cpy - CPU blit information structure
+ *
+ * @rect: The output bounding box rectangle.
+ * @line: The current line of the blit.
+ * @line_offset: Offset of the current line segment.
+ * @cpp: Bytes per pixel (granularity information).
+ * @memcpy: Which memcpy function to use.
+ */
+struct vmw_diff_cpy {
+	struct drm_rect rect;
+	size_t line;
+	size_t line_offset;
+	int cpp;
+	void (*memcpy)(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+		       size_t n);
+};
+
+#define VMW_CPU_BLIT_INITIALIZER {	\
+	.memcpy = vmw_memcpy,		\
+}
+
+#define VMW_CPU_BLIT_DIFF_INITIALIZER(_cpp) {	  \
+	.line = 0,				  \
+	.line_offset = 0,			  \
+	.rect = { .x1 = INT_MAX/2,		  \
+		  .y1 = INT_MAX/2,		  \
+		  .x2 = INT_MIN/2,		  \
+		  .y2 = INT_MIN/2		  \
+	},					  \
+	.cpp = _cpp,				  \
+	.memcpy = vmw_diff_memcpy,		  \
+}
+
+void vmw_diff_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+		     size_t n);
+
+void vmw_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src, size_t n);
+
+int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
+		    u32 dst_offset, u32 dst_stride,
+		    struct ttm_buffer_object *src,
+		    u32 src_offset, u32 src_stride,
+		    u32 w, u32 h,
+		    struct vmw_diff_cpy *diff);
 
 /**
  * Inline helper functions
-- 
2.7.4
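
A usage sketch of the new utility, based on the interfaces added above
(dst_bo, src_bo, the pitches and cpp are assumed to be set up by the caller;
flush_to_host() is hypothetical):

	struct vmw_diff_cpy diff = VMW_CPU_BLIT_DIFF_INITIALIZER(cpp);
	int ret;

	/* Both bos must be reserved or pinned, and not in fixed (PCI) memory. */
	ret = vmw_bo_cpu_blit(dst_bo, dst_offset, dst_pitch,
			      src_bo, src_offset, src_pitch,
			      width * cpp, height, &diff);
	if (ret == 0 && drm_rect_visible(&diff.rect)) {
		/*
		 * diff.rect now bounds the content that actually changed;
		 * only that region needs to be flushed to the host.
		 */
		flush_to_host(&diff.rect);
	}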


* [PATCH 5/5] drm/vmwgfx: Use the cpu blit utility for framebuffer to screen target blits
From: Thomas Hellstrom @ 2018-01-16 13:56 UTC
  To: dri-devel; +Cc: Thomas Hellstrom

This blit was previously performed using two large vmaps, one of which
was torn down and remapped on each blit. Use the more resource-conserving
CPU blit utility instead.

The blit is used in bounding-box computing mode, which makes it possible
to minimize the bounding box used in host operations.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c |  23 ++++++++
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h    |   1 +
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c    |  50 ++++++++++------
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.h    |   4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c   |   5 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c   | 105 +++++++++++----------------------
 6 files changed, 97 insertions(+), 91 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
index 22231bc..2fe099a 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
@@ -185,6 +185,22 @@ static const struct ttm_place evictable_placement_flags[] = {
 	}
 };
 
+static const struct ttm_place nonfixed_placement_flags[] = {
+	{
+		.fpfn = 0,
+		.lpfn = 0,
+		.flags = TTM_PL_FLAG_SYSTEM | TTM_PL_FLAG_CACHED
+	}, {
+		.fpfn = 0,
+		.lpfn = 0,
+		.flags = VMW_PL_FLAG_GMR | TTM_PL_FLAG_CACHED
+	}, {
+		.fpfn = 0,
+		.lpfn = 0,
+		.flags = VMW_PL_FLAG_MOB | TTM_PL_FLAG_CACHED
+	}
+};
+
 struct ttm_placement vmw_evictable_placement = {
 	.num_placement = 4,
 	.placement = evictable_placement_flags,
@@ -213,6 +229,13 @@ struct ttm_placement vmw_mob_ne_placement = {
 	.busy_placement = &mob_ne_placement_flags
 };
 
+struct ttm_placement vmw_nonfixed_placement = {
+	.num_placement = 3,
+	.placement = nonfixed_placement_flags,
+	.num_busy_placement = 1,
+	.busy_placement = &sys_placement_flags
+};
+
 struct vmw_ttm_tt {
 	struct ttm_dma_tt dma_ttm;
 	struct vmw_private *dev_priv;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 99e7e426..62e567a 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -767,6 +767,7 @@ extern struct ttm_placement vmw_evictable_placement;
 extern struct ttm_placement vmw_srf_placement;
 extern struct ttm_placement vmw_mob_placement;
 extern struct ttm_placement vmw_mob_ne_placement;
+extern struct ttm_placement vmw_nonfixed_placement;
 extern struct ttm_bo_driver vmw_bo_driver;
 extern int vmw_dma_quiescent(struct drm_device *dev);
 extern int vmw_bo_map_dma(struct ttm_buffer_object *bo);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 96737d5..a62d0f8 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -692,9 +692,6 @@ vmw_du_plane_duplicate_state(struct drm_plane *plane)
 		return NULL;
 
 	vps->pinned = 0;
-
-	/* Mapping is managed by prepare_fb/cleanup_fb */
-	memset(&vps->host_map, 0, sizeof(vps->host_map));
 	vps->cpp = 0;
 
 	/* Each ref counted resource needs to be acquired again */
@@ -756,11 +753,6 @@ vmw_du_plane_destroy_state(struct drm_plane *plane,
 
 
 	/* Should have been freed by cleanup_fb */
-	if (vps->host_map.virtual) {
-		DRM_ERROR("Host mapping not freed\n");
-		ttm_bo_kunmap(&vps->host_map);
-	}
-
 	if (vps->surf)
 		vmw_surface_unreference(&vps->surf);
 
@@ -1139,12 +1131,14 @@ static const struct drm_framebuffer_funcs vmw_framebuffer_dmabuf_funcs = {
 };
 
 /**
- * Pin the dmabuffer to the start of vram.
+ * Pin the dmabuffer in a location suitable for access by the
+ * display system.
  */
 static int vmw_framebuffer_pin(struct vmw_framebuffer *vfb)
 {
 	struct vmw_private *dev_priv = vmw_priv(vfb->base.dev);
 	struct vmw_dma_buffer *buf;
+	struct ttm_placement *placement;
 	int ret;
 
 	buf = vfb->dmabuf ?  vmw_framebuffer_to_vfbd(&vfb->base)->buffer :
@@ -1161,12 +1155,24 @@ static int vmw_framebuffer_pin(struct vmw_framebuffer *vfb)
 		break;
 	case vmw_du_screen_object:
 	case vmw_du_screen_target:
-		if (vfb->dmabuf)
-			return vmw_dmabuf_pin_in_vram_or_gmr(dev_priv, buf,
-							     false);
+		if (vfb->dmabuf) {
+			if (dev_priv->capabilities & SVGA_CAP_3D) {
+				/*
+				 * Use surface DMA to get content to
+				 * screen target surface.
+				 */
+				placement = &vmw_vram_gmr_placement;
+			} else {
+				/* Use CPU blit. */
+				placement = &vmw_sys_placement;
+			}
+		} else {
+			/* Use surface / image update */
+			placement = &vmw_mob_placement;
+		}
 
-		return vmw_dmabuf_pin_in_placement(dev_priv, buf,
-						   &vmw_mob_placement, false);
+		return vmw_dmabuf_pin_in_placement(dev_priv, buf, placement,
+						   false);
 	default:
 		return -EINVAL;
 	}
@@ -2429,14 +2435,21 @@ int vmw_kms_helper_dirty(struct vmw_private *dev_priv,
 int vmw_kms_helper_buffer_prepare(struct vmw_private *dev_priv,
 				  struct vmw_dma_buffer *buf,
 				  bool interruptible,
-				  bool validate_as_mob)
+				  bool validate_as_mob,
+				  bool for_cpu_blit)
 {
+	struct ttm_operation_ctx ctx = {
+		.interruptible = interruptible,
+		.no_wait_gpu = false};
 	struct ttm_buffer_object *bo = &buf->base;
 	int ret;
 
 	ttm_bo_reserve(bo, false, false, NULL);
-	ret = vmw_validate_single_buffer(dev_priv, bo, interruptible,
-					 validate_as_mob);
+	if (for_cpu_blit)
+		ret = ttm_bo_validate(bo, &vmw_nonfixed_placement, &ctx);
+	else
+		ret = vmw_validate_single_buffer(dev_priv, bo, interruptible,
+						 validate_as_mob);
 	if (ret)
 		ttm_bo_unreserve(bo);
 
@@ -2548,7 +2561,8 @@ int vmw_kms_helper_resource_prepare(struct vmw_resource *res,
 	if (res->backup) {
 		ret = vmw_kms_helper_buffer_prepare(res->dev_priv, res->backup,
 						    interruptible,
-						    res->dev_priv->has_mob);
+						    res->dev_priv->has_mob,
+						    false);
 		if (ret)
 			goto out_unreserve;
 	}
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index 42b0f15..4e8749a 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -177,7 +177,6 @@ struct vmw_plane_state {
 	int pinned;
 
 	/* For CPU Blit */
-	struct ttm_bo_kmap_obj host_map;
 	unsigned int cpp;
 };
 
@@ -289,7 +288,8 @@ int vmw_kms_helper_dirty(struct vmw_private *dev_priv,
 int vmw_kms_helper_buffer_prepare(struct vmw_private *dev_priv,
 				  struct vmw_dma_buffer *buf,
 				  bool interruptible,
-				  bool validate_as_mob);
+				  bool validate_as_mob,
+				  bool for_cpu_blit);
 void vmw_kms_helper_buffer_revert(struct vmw_dma_buffer *buf);
 void vmw_kms_helper_buffer_finish(struct vmw_private *dev_priv,
 				  struct drm_file *file_priv,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
index 7164724..8a43ba8 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
@@ -1032,7 +1032,7 @@ int vmw_kms_sou_do_dmabuf_dirty(struct vmw_private *dev_priv,
 	int ret;
 
 	ret = vmw_kms_helper_buffer_prepare(dev_priv, buf, interruptible,
-					    false);
+					    false, false);
 	if (ret)
 		return ret;
 
@@ -1130,7 +1130,8 @@ int vmw_kms_sou_readback(struct vmw_private *dev_priv,
 	struct vmw_kms_dirty dirty;
 	int ret;
 
-	ret = vmw_kms_helper_buffer_prepare(dev_priv, buf, true, false);
+	ret = vmw_kms_helper_buffer_prepare(dev_priv, buf, true, false,
+					    false);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 6de2874..8eec889 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -114,7 +114,6 @@ struct vmw_screen_target_display_unit {
 	bool defined;
 
 	/* For CPU Blit */
-	struct ttm_bo_kmap_obj host_map;
 	unsigned int cpp;
 };
 
@@ -639,10 +638,9 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 		container_of(dirty->unit, typeof(*stdu), base);
 	s32 width, height;
 	s32 src_pitch, dst_pitch;
-	u8 *src, *dst;
-	bool not_used;
-	struct ttm_bo_kmap_obj guest_map;
-	int ret;
+	struct ttm_buffer_object *src_bo, *dst_bo;
+	u32 src_offset, dst_offset;
+	struct vmw_diff_cpy diff = VMW_CPU_BLIT_DIFF_INITIALIZER(stdu->cpp);
 
 	if (!dirty->num_hits)
 		return;
@@ -653,57 +651,38 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 	if (width == 0 || height == 0)
 		return;
 
-	ret = ttm_bo_kmap(&ddirty->buf->base, 0, ddirty->buf->base.num_pages,
-			  &guest_map);
-	if (ret) {
-		DRM_ERROR("Failed mapping framebuffer for blit: %d\n",
-			  ret);
-		goto out_cleanup;
-	}
-
-	/* Assume we are blitting from Host (display_srf) to Guest (dmabuf) */
-	src_pitch = stdu->display_srf->base_size.width * stdu->cpp;
-	src = ttm_kmap_obj_virtual(&stdu->host_map, &not_used);
-	src += ddirty->top * src_pitch + ddirty->left * stdu->cpp;
-
-	dst_pitch = ddirty->pitch;
-	dst = ttm_kmap_obj_virtual(&guest_map, &not_used);
-	dst += ddirty->fb_top * dst_pitch + ddirty->fb_left * stdu->cpp;
+	/* Assume we are blitting from Guest (dmabuf) to Host (display_srf) */
+	dst_pitch = stdu->display_srf->base_size.width * stdu->cpp;
+	dst_bo = &stdu->display_srf->res.backup->base;
+	dst_offset = ddirty->top * dst_pitch + ddirty->left * stdu->cpp;
 
+	src_pitch = ddirty->pitch;
+	src_bo = &ddirty->buf->base;
+	src_offset = ddirty->fb_top * src_pitch + ddirty->fb_left * stdu->cpp;
 
-	/* Figure out the real direction */
-	if (ddirty->transfer == SVGA3D_WRITE_HOST_VRAM) {
-		u8 *tmp;
-		s32 tmp_pitch;
-
-		tmp = src;
-		tmp_pitch = src_pitch;
-
-		src = dst;
-		src_pitch = dst_pitch;
-
-		dst = tmp;
-		dst_pitch = tmp_pitch;
+	/* Swap src and dst if the assumption was wrong. */
+	if (ddirty->transfer != SVGA3D_WRITE_HOST_VRAM) {
+		swap(dst_pitch, src_pitch);
+		swap(dst_bo, src_bo);
+		swap(src_offset, dst_offset);
 	}
 
-	/* CPU Blit */
-	while (height-- > 0) {
-		memcpy(dst, src, width * stdu->cpp);
-		dst += dst_pitch;
-		src += src_pitch;
-	}
+	(void) vmw_bo_cpu_blit(dst_bo, dst_offset, dst_pitch,
+			       src_bo, src_offset, src_pitch,
+			       width * stdu->cpp, height, &diff);
 
-	if (ddirty->transfer == SVGA3D_WRITE_HOST_VRAM) {
+	if (ddirty->transfer == SVGA3D_WRITE_HOST_VRAM &&
+	    drm_rect_visible(&diff.rect)) {
 		struct vmw_private *dev_priv;
 		struct vmw_stdu_update *cmd;
 		struct drm_clip_rect region;
 		int ret;
 
 		/* We are updating the actual surface, not a proxy */
-		region.x1 = ddirty->left;
-		region.x2 = ddirty->right;
-		region.y1 = ddirty->top;
-		region.y2 = ddirty->bottom;
+		region.x1 = diff.rect.x1;
+		region.x2 = diff.rect.x2;
+		region.y1 = diff.rect.y1;
+		region.y2 = diff.rect.y2;
 		ret = vmw_kms_update_proxy(
 			(struct vmw_resource *) &stdu->display_srf->res,
 			(const struct drm_clip_rect *) &region, 1, 1);
@@ -720,13 +699,12 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 		}
 
 		vmw_stdu_populate_update(cmd, stdu->base.unit,
-					 ddirty->left, ddirty->right,
-					 ddirty->top, ddirty->bottom);
+					 region.x1, region.x2,
+					 region.y1, region.y2);
 
 		vmw_fifo_commit(dev_priv, sizeof(*cmd));
 	}
 
-	ttm_bo_kunmap(&guest_map);
 out_cleanup:
 	ddirty->left = ddirty->top = ddirty->fb_left = ddirty->fb_top = S32_MAX;
 	ddirty->right = ddirty->bottom = S32_MIN;
@@ -772,9 +750,15 @@ int vmw_kms_stdu_dma(struct vmw_private *dev_priv,
 		container_of(vfb, struct vmw_framebuffer_dmabuf, base)->buffer;
 	struct vmw_stdu_dirty ddirty;
 	int ret;
+	bool cpu_blit = !(dev_priv->capabilities & SVGA_CAP_3D);
 
+	/*
+	 * VMs without 3D support don't have the surface DMA command and
+	 * we'll be using a CPU blit, and the framebuffer should be moved out
+	 * of VRAM.
+	 */
 	ret = vmw_kms_helper_buffer_prepare(dev_priv, buf, interruptible,
-					    false);
+					    false, cpu_blit);
 	if (ret)
 		return ret;
 
@@ -793,8 +777,8 @@ int vmw_kms_stdu_dma(struct vmw_private *dev_priv,
 	if (to_surface)
 		ddirty.base.fifo_reserve_size += sizeof(struct vmw_stdu_update);
 
-	/* 2D VMs cannot use SVGA_3D_CMD_SURFACE_DMA so do CPU blit instead */
-	if (!(dev_priv->capabilities & SVGA_CAP_3D)) {
+
+	if (cpu_blit) {
 		ddirty.base.fifo_commit = vmw_stdu_dmabuf_cpu_commit;
 		ddirty.base.clip = vmw_stdu_dmabuf_cpu_clip;
 		ddirty.base.fifo_reserve_size = 0;
@@ -1071,9 +1055,6 @@ vmw_stdu_primary_plane_cleanup_fb(struct drm_plane *plane,
 {
 	struct vmw_plane_state *vps = vmw_plane_state_to_vps(old_state);
 
-	if (vps->host_map.virtual)
-		ttm_bo_kunmap(&vps->host_map);
-
 	if (vps->surf)
 		WARN_ON(!vps->pinned);
 
@@ -1235,24 +1216,11 @@ vmw_stdu_primary_plane_prepare_fb(struct drm_plane *plane,
 	 * so cache these mappings
 	 */
 	if (vps->content_fb_type == SEPARATE_DMA &&
-	    !(dev_priv->capabilities & SVGA_CAP_3D)) {
-		ret = ttm_bo_kmap(&vps->surf->res.backup->base, 0,
-				  vps->surf->res.backup->base.num_pages,
-				  &vps->host_map);
-		if (ret) {
-			DRM_ERROR("Failed to map display buffer to CPU\n");
-			goto out_srf_unpin;
-		}
-
+	    !(dev_priv->capabilities & SVGA_CAP_3D))
 		vps->cpp = new_fb->pitches[0] / new_fb->width;
-	}
 
 	return 0;
 
-out_srf_unpin:
-	vmw_resource_unpin(&vps->surf->res);
-	vps->pinned--;
-
 out_srf_unref:
 	vmw_surface_unreference(&vps->surf);
 	return ret;
@@ -1296,7 +1264,6 @@ vmw_stdu_primary_plane_atomic_update(struct drm_plane *plane,
 		stdu->display_srf = vps->surf;
 		stdu->content_fb_type = vps->content_fb_type;
 		stdu->cpp = vps->cpp;
-		memcpy(&stdu->host_map, &vps->host_map, sizeof(vps->host_map));
 
 		vclips.x = crtc->x;
 		vclips.y = crtc->y;
-- 
2.7.4


* Re: [PATCH 1/5] drm/ttm: Clean up kmap_atomic_prot selection code
From: Christian König @ 2018-01-16 14:05 UTC
  To: Thomas Hellstrom, dri-devel

Am 16.01.2018 um 14:56 schrieb Thomas Hellstrom:
> Use helpers to perform the kmap_atomic_prot() functionality to
> a) Avoid in-function ifdefs that violate the kernel coding policy,
> b) Facilitate exporting the functionality.
>
> This commit should not change any functionality.
>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

Reviewed-by: Christian König <christian.koenig@amd.com>



* Re: [PATCH 2/5] drm/ttm: Export the ttm_k[un]map_atomic_prot API.
From: Christian König @ 2018-01-16 14:08 UTC
  To: Thomas Hellstrom, dri-devel

Am 16.01.2018 um 14:56 schrieb Thomas Hellstrom:
> It will be used by vmwgfx cpu blit.
>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
> Reviewed-by: Brian Paul <brianp@vmware.com>

Reviewed-by: Christian König <christian.koenig@amd.com>


