* [RFC][PATCH 0/5] Support GEM object mappings from I/O memory
@ 2020-07-29 13:41 ` Thomas Zimmermann
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

DRM's fbdev console uses regular load and store operations to update
framebuffer memory. The bochs driver on sparc64 requires the use of
I/O-specific load and store operations. We have a workaround, but need
a long-term solution to the problem.

This patchset adds a GEM object function that returns framebuffers as
I/O memory. It further updates fbdev to use the new functionality and
implements vmap_iomem for VRAM helpers.
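
To illustrate the distinction between regular and I/O-specific stores,
here is a minimal userspace sketch, not kernel code and not part of the
patchset. It assumes a hypothetical struct sketch_fb and a stand-in
sketch_memcpy_toio() that only models the kernel's memcpy_toio() with
volatile byte stores; the writer picks the accessor by memory type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the kernel's memcpy_toio(): every byte goes
 * through a volatile store, which models an I/O-specific store operation.
 */
static void sketch_memcpy_toio(volatile uint8_t *dst, const uint8_t *src,
			       size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = src[i];
}

/* Hypothetical framebuffer descriptor; not a kernel structure. */
struct sketch_fb {
	uint8_t *mem;	/* backing storage */
	size_t size;	/* size in bytes */
	bool is_iomem;	/* true if mem models I/O memory */
};

/*
 * Copy up to len bytes into the framebuffer at offset off and return the
 * number of bytes written. The copy is clamped to the buffer size.
 */
static size_t sketch_fb_write(struct sketch_fb *fb, size_t off,
			      const uint8_t *src, size_t len)
{
	assert(fb && src);

	if (off >= fb->size)
		return 0;
	if (len > fb->size - off)
		len = fb->size - off;

	if (fb->is_iomem)
		sketch_memcpy_toio(fb->mem + off, src, len);
	else
		memcpy(fb->mem + off, src, len);

	return len;
}
```

On most platforms both branches behave the same; the point of the sketch
is only that the caller has to know which kind of memory it holds.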

This is an RFC patchset to discuss the approach. It still needs testing
and polish. Patch #4 has to be rebased onto the recent workaround.

Thomas Zimmermann (5):
  fbdev: Remove trailing whitespace
  fbdev/core: Export framebuffer read and write code as cfb_ function
  drm: Add infrastructure for vmap operations of I/O memory
  drm/fb_helper: Use I/O-memory mappings if available
  drm/vram_helper: Implement struct drm_gem_object_funcs.vmap_iomem

 drivers/gpu/drm/ast/ast_cursor.c      |  12 ++-
 drivers/gpu/drm/drm_client.c          |  52 +++++++++-
 drivers/gpu/drm/drm_fb_helper.c       | 132 ++++++++++++++++++++++----
 drivers/gpu/drm/drm_gem.c             |  19 ++++
 drivers/gpu/drm/drm_gem_vram_helper.c | 106 +++++++++++++++++++--
 drivers/gpu/drm/drm_internal.h        |   1 +
 drivers/video/fbdev/core/fbmem.c      |  61 ++++++++----
 include/drm/drm_client.h              |   8 +-
 include/drm/drm_gem.h                 |  17 +++-
 include/drm/drm_gem_vram_helper.h     |   1 +
 include/linux/fb.h                    |  23 +++--
 11 files changed, 369 insertions(+), 63 deletions(-)

--
2.27.0

* [PATCH 1/5] fbdev: Remove trailing whitespace
  2020-07-29 13:41 ` Thomas Zimmermann
@ 2020-07-29 13:41   ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

Remove trailing whitespace in several places.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/video/fbdev/core/fbmem.c | 10 +++++-----
 include/linux/fb.h               | 18 +++++++++---------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 30e73ec4ad5c..dd0ccf35f7b7 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -777,7 +777,7 @@ fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 
 	if (info->fbops->fb_read)
 		return info->fbops->fb_read(info, buf, count, ppos);
-	
+
 	total_size = info->screen_size;
 
 	if (total_size == 0)
@@ -842,7 +842,7 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
 
 	if (info->fbops->fb_write)
 		return info->fbops->fb_write(info, buf, count, ppos);
-	
+
 	total_size = info->screen_size;
 
 	if (total_size == 0)
@@ -1061,7 +1061,7 @@ EXPORT_SYMBOL(fb_set_var);
 
 int
 fb_blank(struct fb_info *info, int blank)
-{	
+{
 	struct fb_event event;
 	int ret = -EINVAL;
 
@@ -1437,7 +1437,7 @@ __releases(&info->lock)
 	return res;
 }
 
-static int 
+static int
 fb_release(struct inode *inode, struct file *file)
 __acquires(&info->lock)
 __releases(&info->lock)
@@ -1627,7 +1627,7 @@ static int do_register_framebuffer(struct fb_info *fb_info)
 			fb_info->pixmap.access_align = 32;
 			fb_info->pixmap.flags = FB_PIXMAP_DEFAULT;
 		}
-	}	
+	}
 	fb_info->pixmap.offset = 0;
 
 	if (!fb_info->pixmap.blit_x)
diff --git a/include/linux/fb.h b/include/linux/fb.h
index 2b530e6d86e4..714187bc13ac 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -124,7 +124,7 @@ struct fb_cursor_user {
  * Register/unregister for framebuffer events
  */
 
-/*	The resolution of the passed in fb_info about to change */ 
+/*	The resolution of the passed in fb_info about to change */
 #define FB_EVENT_MODE_CHANGE		0x01
 
 #ifdef CONFIG_GUMSTIX_AM200EPD
@@ -459,12 +459,12 @@ struct fb_info {
 
 #if IS_ENABLED(CONFIG_FB_BACKLIGHT)
 	/* assigned backlight device */
-	/* set before framebuffer registration, 
+	/* set before framebuffer registration,
 	   remove after unregister */
 	struct backlight_device *bl_dev;
 
 	/* Backlight level curve */
-	struct mutex bl_curve_mutex;	
+	struct mutex bl_curve_mutex;
 	u8 bl_curve[FB_BACKLIGHT_LEVELS];
 #endif
 #ifdef CONFIG_FB_DEFERRED_IO
@@ -483,8 +483,8 @@ struct fb_info {
 		char __iomem *screen_base;	/* Virtual address */
 		char *screen_buffer;
 	};
-	unsigned long screen_size;	/* Amount of ioremapped VRAM or 0 */ 
-	void *pseudo_palette;		/* Fake palette of 16 colors */ 
+	unsigned long screen_size;	/* Amount of ioremapped VRAM or 0 */
+	void *pseudo_palette;		/* Fake palette of 16 colors */
 #define FBINFO_STATE_RUNNING	0
 #define FBINFO_STATE_SUSPENDED	1
 	u32 state;			/* Hardware state i.e suspend */
@@ -587,11 +587,11 @@ static inline struct apertures_struct *alloc_apertures(unsigned int max_num) {
      *  `Generic' versions of the frame buffer device operations
      */
 
-extern int fb_set_var(struct fb_info *info, struct fb_var_screeninfo *var); 
-extern int fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var); 
+extern int fb_set_var(struct fb_info *info, struct fb_var_screeninfo *var);
+extern int fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var);
 extern int fb_blank(struct fb_info *info, int blank);
-extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect); 
-extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area); 
+extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect);
+extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area);
 extern void cfb_imageblit(struct fb_info *info, const struct fb_image *image);
 /*
  * Drawing operations where framebuffer is in system RAM
-- 
2.27.0

* [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-07-29 13:41 ` Thomas Zimmermann
@ 2020-07-29 13:41   ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

DRM fb helpers require read and write functions for framebuffer
memory. Export the existing code from fbdev.
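
The resulting split can be sketched in userspace as follows. This is an
illustration under assumptions, not the fbdev code: struct sketch_info,
struct sketch_ops, sketch_cfb_read() and sketch_read() are hypothetical
names that only mirror the roles of struct fb_info, struct fb_ops,
fb_cfb_read() and fb_read(), and a plain memcpy() stands in for the
copy to user space.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t, off_t */

struct sketch_info;

/* Optional driver callbacks; models the role of struct fb_ops. */
struct sketch_ops {
	ssize_t (*read)(struct sketch_info *info, char *buf, size_t count,
			off_t *ppos);
};

/* Hypothetical device descriptor; models the role of struct fb_info. */
struct sketch_info {
	const struct sketch_ops *ops;
	const char *screen_base;
	size_t screen_size;
};

/* Default reader that any caller may use directly (role of fb_cfb_read). */
static ssize_t sketch_cfb_read(struct sketch_info *info, char *buf,
			       size_t count, off_t *ppos)
{
	size_t p;

	assert(*ppos >= 0);
	p = (size_t)*ppos;

	if (p >= info->screen_size)
		return 0;
	if (count > info->screen_size - p)
		count = info->screen_size - p;

	memcpy(buf, info->screen_base + p, count);
	*ppos += (off_t)count;

	return (ssize_t)count;
}

/* File-level entry point: validate, then dispatch (role of fb_read). */
static ssize_t sketch_read(struct sketch_info *info, char *buf, size_t count,
			   off_t *ppos)
{
	if (!info || !info->screen_base)
		return -ENODEV;

	if (info->ops && info->ops->read)
		return info->ops->read(info, buf, count, ppos);

	return sketch_cfb_read(info, buf, count, ppos);
}
```

Keeping the validation in the wrapper and the copy loop in the exported
helper is what lets another subsystem reuse the helper without going
through a struct file.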

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/video/fbdev/core/fbmem.c | 53 ++++++++++++++++++++++----------
 include/linux/fb.h               |  5 +++
 2 files changed, 41 insertions(+), 17 deletions(-)

diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index dd0ccf35f7b7..b496ff90db3e 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -759,25 +759,18 @@ static struct fb_info *file_fb_info(struct file *file)
 	return info;
 }
 
-static ssize_t
-fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
+ssize_t fb_cfb_read(struct fb_info *info, char __user *buf, size_t count,
+		    loff_t *ppos)
 {
 	unsigned long p = *ppos;
-	struct fb_info *info = file_fb_info(file);
 	u8 *buffer, *dst;
 	u8 __iomem *src;
 	int c, cnt = 0, err = 0;
 	unsigned long total_size;
 
-	if (!info || ! info->screen_base)
-		return -ENODEV;
-
 	if (info->state != FBINFO_STATE_RUNNING)
 		return -EPERM;
 
-	if (info->fbops->fb_read)
-		return info->fbops->fb_read(info, buf, count, ppos);
-
 	total_size = info->screen_size;
 
 	if (total_size == 0)
@@ -823,16 +816,12 @@ fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 
 	return (err) ? err : cnt;
 }
+EXPORT_SYMBOL(fb_cfb_read);
 
 static ssize_t
-fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
+fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 {
-	unsigned long p = *ppos;
 	struct fb_info *info = file_fb_info(file);
-	u8 *buffer, *src;
-	u8 __iomem *dst;
-	int c, cnt = 0, err = 0;
-	unsigned long total_size;
 
 	if (!info || !info->screen_base)
 		return -ENODEV;
@@ -840,8 +829,20 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
 	if (info->state != FBINFO_STATE_RUNNING)
 		return -EPERM;
 
-	if (info->fbops->fb_write)
-		return info->fbops->fb_write(info, buf, count, ppos);
+	if (info->fbops->fb_read)
+		return info->fbops->fb_read(info, buf, count, ppos);
+	else
+		return fb_cfb_read(info, buf, count, ppos);
+}
+
+ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
+		     size_t count, loff_t *ppos)
+{
+	unsigned long p = *ppos;
+	u8 *buffer, *src;
+	u8 __iomem *dst;
+	int c, cnt = 0, err = 0;
+	unsigned long total_size;
 
 	total_size = info->screen_size;
 
@@ -895,6 +896,24 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
 
 	return (cnt) ? cnt : err;
 }
+EXPORT_SYMBOL(fb_cfb_write);
+
+static ssize_t
+fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
+{
+	struct fb_info *info = file_fb_info(file);
+
+	if (!info || !info->screen_base)
+		return -ENODEV;
+
+	if (info->state != FBINFO_STATE_RUNNING)
+		return -EPERM;
+
+	if (info->fbops->fb_write)
+		return info->fbops->fb_write(info, buf, count, ppos);
+	else
+		return fb_cfb_write(info, buf, count, ppos);
+}
 
 int
 fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var)
diff --git a/include/linux/fb.h b/include/linux/fb.h
index 714187bc13ac..12ad83963db5 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -593,6 +593,11 @@ extern int fb_blank(struct fb_info *info, int blank);
 extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect);
 extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area);
 extern void cfb_imageblit(struct fb_info *info, const struct fb_image *image);
+extern ssize_t fb_cfb_read(struct fb_info *info, char __user *buf,
+			   size_t count, loff_t *ppos);
+extern ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
+			    size_t count, loff_t *ppos);
+
 /*
  * Drawing operations where framebuffer is in system RAM
  */
-- 
2.27.0

* [PATCH 3/5] drm: Add infrastructure for vmap operations of I/O memory
  2020-07-29 13:41 ` Thomas Zimmermann
@ 2020-07-29 13:41   ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

Most platforms allow for accessing framebuffer I/O memory with regular
load and store operations. Some platforms, such as sparc64, require
the use of special instructions instead.

This patch adds vmap_iomem to struct drm_gem_object_funcs. The new
interface drm_client_buffer_vmap_iomem() gives DRM clients access to the
I/O memory buffer. The semantics of struct drm_gem_object_funcs.vmap
change slightly. It used to return system or I/O memory. Now it is
expected to return memory addresses that can be accessed with regular
load and store operations. So nothing changes for existing implementations
of GEM objects. If the GEM object also implements vmap_iomem, a call
to vmap shall only return system memory, even if I/O memory could be
accessed with loads and stores.

The existing interface drm_client_buffer_vmap() shall only return memory
as given by drm_gem_vmap() (i.e., memory that is accessible via regular
load and store operations). The new interface
drm_client_buffer_vmap_iomem() shall only return I/O memory.

DRM clients must map buffers by calling drm_client_buffer_vmap_iomem()
and drm_client_buffer_vmap() to get the buffer in I/O or system memory.
Each function returns NULL if the buffer is in the other memory area.
Depending on the type of the returned memory, clients must access the
framebuffer with the appropriate operations.
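
A client following this contract might look like the userspace sketch
below. It is a simplified model, not the DRM API: struct sketch_buffer,
sketch_vmap(), sketch_vmap_iomem() and sketch_client_fill() are
hypothetical, the volatile store loop merely stands in for an I/O
accessor such as memset_io(), and error-pointer returns are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer; in_iomem says which memory area backs it. */
struct sketch_buffer {
	uint8_t *mem;
	size_t size;
	bool in_iomem;
};

/* Returns the mapping only if the buffer is in system memory. */
static void *sketch_vmap(struct sketch_buffer *b)
{
	return b->in_iomem ? NULL : b->mem;
}

/* Returns the mapping only if the buffer is in I/O memory. */
static volatile void *sketch_vmap_iomem(struct sketch_buffer *b)
{
	return b->in_iomem ? b->mem : NULL;
}

enum sketch_path { SKETCH_NONE, SKETCH_SYSMEM, SKETCH_IOMEM };

/*
 * Fill the buffer with value. Try the I/O mapping first, then the
 * system-memory mapping, and use the matching kind of store for each.
 */
static enum sketch_path sketch_client_fill(struct sketch_buffer *b,
					   uint8_t value)
{
	volatile uint8_t *io = sketch_vmap_iomem(b);
	uint8_t *sys;
	size_t i;

	if (io) {
		for (i = 0; i < b->size; i++)
			io[i] = value;	/* models an I/O store */
		return SKETCH_IOMEM;
	}

	sys = sketch_vmap(b);
	if (sys) {
		memset(sys, value, b->size);	/* regular stores */
		return SKETCH_SYSMEM;
	}

	return SKETCH_NONE;
}
```

Exactly one of the two mapping calls yields a usable pointer for a given
buffer, so the client never has to guess which accessors are safe.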

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/drm_client.c   | 52 ++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/drm_gem.c      | 19 +++++++++++++
 drivers/gpu/drm/drm_internal.h |  1 +
 include/drm/drm_client.h       |  8 +++++-
 include/drm/drm_gem.h          | 17 +++++++++--
 5 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index 495f47d23d87..b5bbe089a41e 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -327,6 +327,46 @@ void *drm_client_buffer_vmap(struct drm_client_buffer *buffer)
 }
 EXPORT_SYMBOL(drm_client_buffer_vmap);
 
+/**
+ * drm_client_buffer_vmap_iomem - Map DRM client buffer into address space
+ * @buffer: DRM client buffer
+ *
+ * This function maps a client buffer into kernel address space. If the
+ * buffer is already mapped, it returns the mapping's address.
+ *
+ * Client buffer mappings are not ref'counted. Each call to
+ * drm_client_buffer_vmap_iomem() should be followed by a call to
+ * drm_client_buffer_vunmap(); or the client buffer should be mapped
+ * throughout its lifetime.
+ *
+ * Returns:
+ *	The mapped memory's address
+ */
+void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer)
+{
+	void __iomem *vaddr_iomem;
+
+	if (buffer->vaddr_iomem)
+		return buffer->vaddr_iomem;
+
+	/*
+	 * FIXME: The dependency on GEM here isn't required, we could
+	 * convert the driver handle to a dma-buf instead and use the
+	 * backend-agnostic dma-buf vmap support instead. This would
+	 * require that the handle2fd prime ioctl is reworked to pull the
+	 * fd_install step out of the driver backend hooks, to make that
+	 * final step optional for internal users.
+	 */
+	vaddr_iomem = drm_gem_vmap_iomem(buffer->gem);
+	if (IS_ERR(vaddr_iomem))
+		return vaddr_iomem;
+
+	buffer->vaddr_iomem = vaddr_iomem;
+
+	return vaddr_iomem;
+}
+EXPORT_SYMBOL(drm_client_buffer_vmap_iomem);
+
 /**
  * drm_client_buffer_vunmap - Unmap DRM client buffer
  * @buffer: DRM client buffer
@@ -337,8 +377,16 @@ EXPORT_SYMBOL(drm_client_buffer_vmap);
  */
 void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
 {
-	drm_gem_vunmap(buffer->gem, buffer->vaddr);
-	buffer->vaddr = NULL;
+	drm_WARN_ON(buffer->client->dev, buffer->vaddr && buffer->vaddr_iomem);
+
+	if (buffer->vaddr) {
+		drm_gem_vunmap(buffer->gem, buffer->vaddr);
+		buffer->vaddr = NULL;
+	}
+	if (buffer->vaddr_iomem) {
+		drm_gem_vunmap(buffer->gem, (void *)buffer->vaddr_iomem);
+		buffer->vaddr_iomem = NULL;
+	}
 }
 EXPORT_SYMBOL(drm_client_buffer_vunmap);
 
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index a57f5379fc08..a001be8c0965 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1227,6 +1227,25 @@ void *drm_gem_vmap(struct drm_gem_object *obj)
 		vaddr = obj->funcs->vmap(obj);
 	else if (obj->dev->driver->gem_prime_vmap)
 		vaddr = obj->dev->driver->gem_prime_vmap(obj);
+	else if (obj->funcs && obj->funcs->vmap_iomem)
+		vaddr = NULL; /* requires mapping as I/O memory */
+	else
+		vaddr = ERR_PTR(-EOPNOTSUPP);
+
+	if (!vaddr)
+		vaddr = ERR_PTR(-ENOMEM);
+
+	return vaddr;
+}
+
+void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj)
+{
+	void __iomem *vaddr;
+
+	if (obj->funcs && obj->funcs->vmap_iomem)
+		vaddr = obj->funcs->vmap_iomem(obj);
+	else if (obj->funcs && obj->funcs->vmap)
+		vaddr = NULL; /* requires mapping as system memory */
 	else
 		vaddr = ERR_PTR(-EOPNOTSUPP);
 
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 8e01caaf95cc..aa1a3d4f9223 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -187,6 +187,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
 int drm_gem_pin(struct drm_gem_object *obj);
 void drm_gem_unpin(struct drm_gem_object *obj);
 void *drm_gem_vmap(struct drm_gem_object *obj);
+void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj);
 void drm_gem_vunmap(struct drm_gem_object *obj, void *vaddr);
 
 /* drm_debugfs.c drm_debugfs_crc.c */
diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
index 7aaea665bfc2..94aa075ee4b6 100644
--- a/include/drm/drm_client.h
+++ b/include/drm/drm_client.h
@@ -141,10 +141,15 @@ struct drm_client_buffer {
 	struct drm_gem_object *gem;
 
 	/**
-	 * @vaddr: Virtual address for the buffer
+	 * @vaddr: Virtual address for the buffer in system memory
 	 */
 	void *vaddr;
 
+	/**
+	 * @vaddr_iomem: Virtual address for the buffer in I/O memory
+	 */
+	void *vaddr_iomem;
+
 	/**
 	 * @fb: DRM framebuffer
 	 */
@@ -156,6 +161,7 @@ drm_client_framebuffer_create(struct drm_client_dev *client, u32 width, u32 heig
 void drm_client_framebuffer_delete(struct drm_client_buffer *buffer);
 int drm_client_framebuffer_flush(struct drm_client_buffer *buffer, struct drm_rect *rect);
 void *drm_client_buffer_vmap(struct drm_client_buffer *buffer);
+void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer);
 void drm_client_buffer_vunmap(struct drm_client_buffer *buffer);
 
 int drm_client_modeset_create(struct drm_client_dev *client);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 337a48321705..bc735ff522a8 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -134,17 +134,28 @@ struct drm_gem_object_funcs {
 	 * @vmap:
 	 *
 	 * Returns a virtual address for the buffer. Used by the
-	 * drm_gem_dmabuf_vmap() helper.
+	 * drm_gem_dmabuf_vmap() helper. If the buffer is not
+	 * located in system memory, the function returns NULL.
 	 *
 	 * This callback is optional.
 	 */
 	void *(*vmap)(struct drm_gem_object *obj);
 
+	/**
+	 * @vmap_iomem:
+	 *
+	 * Returns a virtual address for the buffer. If the buffer is not
+	 * located in I/O memory, the function returns NULL.
+	 *
+	 * This callback is optional.
+	 */
+	void __iomem *(*vmap_iomem)(struct drm_gem_object *obj);
+
 	/**
 	 * @vunmap:
 	 *
-	 * Releases the address previously returned by @vmap. Used by the
-	 * drm_gem_dmabuf_vunmap() helper.
+	 * Releases the address previously returned by @vmap or @vmap_iomem.
+	 * Used by the drm_gem_dmabuf_vunmap() helper.
 	 *
 	 * This callback is optional.
 	 */
-- 
2.27.0

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 3/5] drm: Add infrastructure for vmap operations of I/O memory
@ 2020-07-29 13:41   ` Thomas Zimmermann
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

Most platforms allow for accessing framebuffer I/O memory with regular
load and store operations. Some platforms, such as sparc64, require
the use of special instructions instead.

This patch adds vmap_iomem to struct drm_gem_object_funcs. The new
interface drm_client_buffer_vmap_iomem() gives DRM clients access to the
I/O memory buffer. The semantics of struct drm_gem_object_funcs.vmap
change slightly. It used to return system or I/O memory. Now it is
expected to return memory addresses that can be accessed with regular
load and store operations. So nothing changes for existing implementations
of GEM objects. If the GEM object also implements vmap_iomem, a call
to vmap shall only return system memory, even if I/O memory could be
accessed with loads and stores.

The existing interface drm_client_buffer_vmap() shall only return memory
as given by drm_gem_vmap() (i.e., memory that is accessible via regular
load and store operations). The new interface
drm_client_buffer_vmap_iomem() shall only return I/O memory.

DRM clients must map buffers by calling drm_client_buffer_vmap_iomem()
and drm_client_buffer_vmap() to get the buffer in I/O or system memory.
Each function returns NULL if the buffer is in the other memory area.
Depending on the type of the returned memory, clients must access the
framebuffer with the appropriate operations.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/drm_client.c   | 52 ++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/drm_gem.c      | 19 +++++++++++++
 drivers/gpu/drm/drm_internal.h |  1 +
 include/drm/drm_client.h       |  8 +++++-
 include/drm/drm_gem.h          | 17 +++++++++--
 5 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index 495f47d23d87..b5bbe089a41e 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -327,6 +327,46 @@ void *drm_client_buffer_vmap(struct drm_client_buffer *buffer)
 }
 EXPORT_SYMBOL(drm_client_buffer_vmap);
 
+/**
+ * drm_client_buffer_vmap_iomem - Map DRM client buffer into address space
+ * @buffer: DRM client buffer
+ *
+ * This function maps a client buffer into kernel address space. If the
+ * buffer is already mapped, it returns the mapping's address.
+ *
+ * Client buffer mappings are not ref'counted. Each call to
+ * drm_client_buffer_vmap() should be followed by a call to
+ * drm_client_buffer_vunmap(); or the client buffer should be mapped
+ * throughout its lifetime.
+ *
+ * Returns:
+ *	The mapped memory's address
+ */
+void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer)
+{
+	void __iomem *vaddr_iomem;
+
+	if (buffer->vaddr_iomem)
+		return buffer->vaddr_iomem;
+
+	/*
+	 * FIXME: The dependency on GEM here isn't required, we could
+	 * convert the driver handle to a dma-buf instead and use the
+	 * backend-agnostic dma-buf vmap support instead. This would
+	 * require that the handle2fd prime ioctl is reworked to pull the
+	 * fd_install step out of the driver backend hooks, to make that
+	 * final step optional for internal users.
+	 */
+	vaddr_iomem = drm_gem_vmap_iomem(buffer->gem);
+	if (IS_ERR(vaddr_iomem))
+		return vaddr_iomem;
+
+	buffer->vaddr_iomem = vaddr_iomem;
+
+	return vaddr_iomem;
+}
+EXPORT_SYMBOL(drm_client_buffer_vmap_iomem);
+
 /**
  * drm_client_buffer_vunmap - Unmap DRM client buffer
  * @buffer: DRM client buffer
@@ -337,8 +377,16 @@ EXPORT_SYMBOL(drm_client_buffer_vmap);
  */
 void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
 {
-	drm_gem_vunmap(buffer->gem, buffer->vaddr);
-	buffer->vaddr = NULL;
+	drm_WARN_ON(buffer->client->dev, buffer->vaddr && buffer->vaddr_iomem);
+
+	if (buffer->vaddr) {
+		drm_gem_vunmap(buffer->gem, buffer->vaddr);
+		buffer->vaddr = NULL;
+	}
+	if (buffer->vaddr_iomem) {
+		drm_gem_vunmap(buffer->gem, (void *)buffer->vaddr_iomem);
+		buffer->vaddr_iomem = NULL;
+	}
 }
 EXPORT_SYMBOL(drm_client_buffer_vunmap);
 
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index a57f5379fc08..a001be8c0965 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1227,6 +1227,25 @@ void *drm_gem_vmap(struct drm_gem_object *obj)
 		vaddr = obj->funcs->vmap(obj);
 	else if (obj->dev->driver->gem_prime_vmap)
 		vaddr = obj->dev->driver->gem_prime_vmap(obj);
+	else if (obj->funcs && obj->funcs->vmap_iomem)
+		vaddr = NULL; /* requires mapping as I/O memory */
+	else
+		vaddr = ERR_PTR(-EOPNOTSUPP);
+
+	if (!vaddr)
+		vaddr = ERR_PTR(-ENOMEM);
+
+	return vaddr;
+}
+
+void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj)
+{
+	void __iomem *vaddr;
+
+	if (obj->funcs && obj->funcs->vmap_iomem)
+		vaddr = obj->funcs->vmap_iomem(obj);
+	else if (obj->funcs && obj->funcs->vmap)
+		vaddr = NULL; /* requires mapping as system memory */
 	else
 		vaddr = ERR_PTR(-EOPNOTSUPP);
 
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 8e01caaf95cc..aa1a3d4f9223 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -187,6 +187,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
 int drm_gem_pin(struct drm_gem_object *obj);
 void drm_gem_unpin(struct drm_gem_object *obj);
 void *drm_gem_vmap(struct drm_gem_object *obj);
+void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj);
 void drm_gem_vunmap(struct drm_gem_object *obj, void *vaddr);
 
 /* drm_debugfs.c drm_debugfs_crc.c */
diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
index 7aaea665bfc2..94aa075ee4b6 100644
--- a/include/drm/drm_client.h
+++ b/include/drm/drm_client.h
@@ -141,10 +141,15 @@ struct drm_client_buffer {
 	struct drm_gem_object *gem;
 
 	/**
-	 * @vaddr: Virtual address for the buffer
+	 * @vaddr: Virtual address for the buffer in system memory
 	 */
 	void *vaddr;
 
+	/**
+	 * @vaddr_iomem: Virtual address for the buffer in I/O memory
+	 */
+	void *vaddr_iomem;
+
 	/**
 	 * @fb: DRM framebuffer
 	 */
@@ -156,6 +161,7 @@ drm_client_framebuffer_create(struct drm_client_dev *client, u32 width, u32 heig
 void drm_client_framebuffer_delete(struct drm_client_buffer *buffer);
 int drm_client_framebuffer_flush(struct drm_client_buffer *buffer, struct drm_rect *rect);
 void *drm_client_buffer_vmap(struct drm_client_buffer *buffer);
+void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer);
 void drm_client_buffer_vunmap(struct drm_client_buffer *buffer);
 
 int drm_client_modeset_create(struct drm_client_dev *client);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 337a48321705..bc735ff522a8 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -134,17 +134,28 @@ struct drm_gem_object_funcs {
 	 * @vmap:
 	 *
 	 * Returns a virtual address for the buffer. Used by the
-	 * drm_gem_dmabuf_vmap() helper.
+	 * drm_gem_dmabuf_vmap() helper. If the buffer is not
+	 * located in system memory, the function returns NULL.
 	 *
 	 * This callback is optional.
 	 */
 	void *(*vmap)(struct drm_gem_object *obj);
 
+	/**
+	 * @vmap_iomem:
+	 *
+	 * Returns a virtual address for the buffer. If the buffer is not
+	 * located in I/O memory, the function returns NULL.
+	 *
+	 * This callback is optional.
+	 */
+	void __iomem *(*vmap_iomem)(struct drm_gem_object *obj);
+
 	/**
 	 * @vunmap:
 	 *
-	 * Releases the address previously returned by @vmap. Used by the
-	 * drm_gem_dmabuf_vunmap() helper.
+	 * Releases the address previously returned by @vmap or @vmap_iomem.
+	 * Used by the drm_gem_dmabuf_vunmap() helper.
 	 *
 	 * This callback is optional.
 	 */
-- 
2.27.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 4/5] drm/fb_helper: Use I/O-memory mappings if available
  2020-07-29 13:41 ` Thomas Zimmermann
@ 2020-07-29 13:41   ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

At least sparc64 requires I/O-specific access to framebuffers. This
patch prepares the fbdev console accordingly.

For drivers with direct access to the framebuffer memory, the callback
functions test for the type of memory and call the respective fb_sys_ or fb_cfb_
functions. For drivers that employ a shadow buffer, fbdev's blit function
maps the framebuffer as either I/O or system memory, and uses the correct
memcpy function.
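
As a rough userspace sketch of the shadow-buffer blit: the copy walks the
clip rectangle row by row, starting at offset y1 * pitch + x1 * cpp and
copying (x2 - x1) * cpp bytes per row. The mock_ names are hypothetical, and
mock_memcpy_toio() only imitates memcpy_toio(); on sparc64 the real function
uses I/O-specific instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical clip rectangle; x2 and y2 are exclusive, as in the patch. */
struct mock_clip {
	unsigned int x1, y1, x2, y2;
};

/* Imitates memcpy_toio() by copying bytewise through a volatile pointer. */
static void mock_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/* Copies the clipped region from the shadow buffer to the framebuffer. */
static void mock_blit(void *dst_base, const void *src_base, int dst_is_iomem,
		      size_t pitch, unsigned int cpp,
		      const struct mock_clip *clip)
{
	size_t offset = clip->y1 * pitch + clip->x1 * cpp;
	size_t len = (size_t)(clip->x2 - clip->x1) * cpp;
	const unsigned char *src = (const unsigned char *)src_base + offset;
	unsigned char *dst = (unsigned char *)dst_base + offset;
	unsigned int y;

	assert(clip->x1 <= clip->x2 && clip->y1 <= clip->y2);

	for (y = clip->y1; y < clip->y2; y++) {
		if (dst_is_iomem)
			mock_memcpy_toio(dst, src, len);
		else
			memcpy(dst, src, len);
		src += pitch;
		dst += pitch;
	}
}
```

The sketch tests the memory type inside the loop for brevity; the patch
itself selects the branch once and runs a separate loop for each case.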

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/drm_fb_helper.c | 132 ++++++++++++++++++++++++++++----
 1 file changed, 116 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 88146f7245c5..7d928f92cba7 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -394,14 +394,25 @@ static void drm_fb_helper_dirty_blit_real(struct drm_fb_helper *fb_helper,
 	unsigned int cpp = fb->format->cpp[0];
 	size_t offset = clip->y1 * fb->pitches[0] + clip->x1 * cpp;
 	void *src = fb_helper->fbdev->screen_buffer + offset;
-	void *dst = fb_helper->buffer->vaddr + offset;
 	size_t len = (clip->x2 - clip->x1) * cpp;
 	unsigned int y;
 
-	for (y = clip->y1; y < clip->y2; y++) {
-		memcpy(dst, src, len);
-		src += fb->pitches[0];
-		dst += fb->pitches[0];
+	if (fb_helper->buffer->vaddr_iomem) {
+		void __iomem *dst = fb_helper->buffer->vaddr_iomem + offset;
+
+		for (y = clip->y1; y < clip->y2; y++) {
+			memcpy_toio(dst, src, len);
+			src += fb->pitches[0];
+			dst += fb->pitches[0];
+		}
+	} else {
+		void *dst = fb_helper->buffer->vaddr + offset;
+
+		for (y = clip->y1; y < clip->y2; y++) {
+			memcpy(dst, src, len);
+			src += fb->pitches[0];
+			dst += fb->pitches[0];
+		}
 	}
 }
 
@@ -412,6 +423,7 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)
 	struct drm_clip_rect *clip = &helper->dirty_clip;
 	struct drm_clip_rect clip_copy;
 	unsigned long flags;
+	void __iomem *vaddr_iomem;
 	void *vaddr;
 
 	spin_lock_irqsave(&helper->dirty_lock, flags);
@@ -425,9 +437,15 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)
 
 		/* Generic fbdev uses a shadow buffer */
 		if (helper->buffer) {
-			vaddr = drm_client_buffer_vmap(helper->buffer);
-			if (IS_ERR(vaddr))
+			vaddr_iomem =
+				drm_client_buffer_vmap_iomem(helper->buffer);
+			if (IS_ERR(vaddr_iomem)) {
 				return;
+			} else if (!vaddr_iomem) {
+				vaddr = drm_client_buffer_vmap(helper->buffer);
+				if (IS_ERR(vaddr))
+					return;
+			}
 			drm_fb_helper_dirty_blit_real(helper, &clip_copy);
 		}
 		if (helper->fb->funcs->dirty)
@@ -766,6 +784,25 @@ void drm_fb_helper_sys_imageblit(struct fb_info *info,
 }
 EXPORT_SYMBOL(drm_fb_helper_sys_imageblit);
 
+static ssize_t drm_fb_helper_cfb_read(struct fb_info *info, char __user *buf,
+				      size_t count, loff_t *ppos)
+{
+	return fb_cfb_read(info, buf, count, ppos);
+}
+
+static ssize_t drm_fb_helper_cfb_write(struct fb_info *info, const char __user *buf,
+				       size_t count, loff_t *ppos)
+{
+	ssize_t ret;
+
+	ret = fb_cfb_write(info, buf, count, ppos);
+	if (ret > 0)
+		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
+				    info->var.yres);
+
+	return ret;
+}
+
 /**
  * drm_fb_helper_cfb_fillrect - wrapper around cfb_fillrect
  * @info: fbdev registered by the helper
@@ -2038,6 +2075,61 @@ static int drm_fbdev_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
 		return -ENODEV;
 }
 
+static ssize_t drm_fbdev_fb_read(struct fb_info *info, char __user *buf,
+				 size_t count, loff_t *ppos)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+
+	if (fb_helper->buffer->vaddr_iomem)
+		return drm_fb_helper_cfb_read(info, buf, count, ppos);
+	else
+		return drm_fb_helper_sys_read(info, buf, count, ppos);
+}
+
+static ssize_t drm_fbdev_fb_write(struct fb_info *info, const char __user *buf,
+				  size_t count, loff_t *ppos)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+
+	if (fb_helper->buffer->vaddr_iomem)
+		return drm_fb_helper_cfb_write(info, buf, count, ppos);
+	else
+		return drm_fb_helper_sys_write(info, buf, count, ppos);
+}
+
+static void drm_fbdev_fb_fillrect(struct fb_info *info,
+				  const struct fb_fillrect *rect)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+
+	if (fb_helper->buffer->vaddr_iomem)
+		drm_fb_helper_cfb_fillrect(info, rect);
+	else
+		drm_fb_helper_sys_fillrect(info, rect);
+}
+
+static void drm_fbdev_fb_copyarea(struct fb_info *info,
+				  const struct fb_copyarea *area)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+
+	if (fb_helper->buffer->vaddr_iomem)
+		drm_fb_helper_cfb_copyarea(info, area);
+	else
+		drm_fb_helper_sys_copyarea(info, area);
+}
+
+static void drm_fbdev_fb_imageblit(struct fb_info *info,
+				   const struct fb_image *image)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+
+	if (fb_helper->buffer->vaddr_iomem)
+		drm_fb_helper_cfb_imageblit(info, image);
+	else
+		drm_fb_helper_sys_imageblit(info, image);
+}
+
 static const struct fb_ops drm_fbdev_fb_ops = {
 	.owner		= THIS_MODULE,
 	DRM_FB_HELPER_DEFAULT_OPS,
@@ -2045,11 +2137,11 @@ static const struct fb_ops drm_fbdev_fb_ops = {
 	.fb_release	= drm_fbdev_fb_release,
 	.fb_destroy	= drm_fbdev_fb_destroy,
 	.fb_mmap	= drm_fbdev_fb_mmap,
-	.fb_read	= drm_fb_helper_sys_read,
-	.fb_write	= drm_fb_helper_sys_write,
-	.fb_fillrect	= drm_fb_helper_sys_fillrect,
-	.fb_copyarea	= drm_fb_helper_sys_copyarea,
-	.fb_imageblit	= drm_fb_helper_sys_imageblit,
+	.fb_read	= drm_fbdev_fb_read,
+	.fb_write	= drm_fbdev_fb_write,
+	.fb_fillrect	= drm_fbdev_fb_fillrect,
+	.fb_copyarea	= drm_fbdev_fb_copyarea,
+	.fb_imageblit	= drm_fbdev_fb_imageblit,
 };
 
 static struct fb_deferred_io drm_fbdev_defio = {
@@ -2072,6 +2164,7 @@ static int drm_fb_helper_generic_probe(struct drm_fb_helper *fb_helper,
 	struct drm_framebuffer *fb;
 	struct fb_info *fbi;
 	u32 format;
+	void __iomem *vaddr_iomem;
 	void *vaddr;
 
 	drm_dbg_kms(dev, "surface width(%d), height(%d) and bpp(%d)\n",
@@ -2108,11 +2201,18 @@ static int drm_fb_helper_generic_probe(struct drm_fb_helper *fb_helper,
 		fb_deferred_io_init(fbi);
 	} else {
 		/* buffer is mapped for HW framebuffer */
-		vaddr = drm_client_buffer_vmap(fb_helper->buffer);
-		if (IS_ERR(vaddr))
-			return PTR_ERR(vaddr);
+		vaddr_iomem = drm_client_buffer_vmap_iomem(fb_helper->buffer);
+		if (IS_ERR(vaddr_iomem)) {
+			return PTR_ERR(vaddr_iomem);
+		} else if (!vaddr_iomem) {
+			vaddr = drm_client_buffer_vmap(fb_helper->buffer);
+			if (IS_ERR(vaddr))
+				return PTR_ERR(vaddr);
+			fbi->screen_buffer = vaddr;
+		} else {
+			fbi->screen_buffer = vaddr_iomem;
+		}
 
-		fbi->screen_buffer = vaddr;
 		/* Shamelessly leak the physical address to user-space */
 #if IS_ENABLED(CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM)
 		if (drm_leak_fbdev_smem && fbi->fix.smem_start == 0)
-- 
2.27.0

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 5/5] drm/vram_helper: Implement struct drm_gem_object_funcs.vmap_iomem
  2020-07-29 13:41 ` Thomas Zimmermann
@ 2020-07-29 13:41   ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

The vmap_iomem function in struct drm_gem_object_funcs returns the
memory of the buffer if located in I/O memory, or NULL if it isn't.
The patch also updates the semantics of the vmap function to return
NULL if the buffer is not in system memory.

The main user is the fb-helper's console, which is prepared for this
change. The other user is the ast cursor code. It maintains two GEM
buffers in VRAM. The patch changes ast to map the buffers by calling
drm_gem_vram_vmap_iomem().
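
The three possible results of the mapping call (a valid address, NULL for
"wrong memory type", or an encoded error) can be sketched in userspace C as
follows. All mock_ names are hypothetical; the mock_err_ptr() helpers are
simplified imitations of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and
the -ENOMEM pin failure is merely an assumed example error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095

/* Simplified imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long mock_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

enum mock_location { MOCK_IN_SYSTEM, MOCK_IN_IOMEM };

/* Hypothetical stand-in for a GEM VRAM object. */
struct mock_gbo {
	enum mock_location where;
	int pin_fails;		/* simulate a failing pin operation */
	int pin_count;
	unsigned char storage[8];
};

/* Plays the role of drm_gem_vram_vmap_iomem(). */
static void *mock_gbo_vmap_iomem(struct mock_gbo *gbo)
{
	if (gbo->pin_fails)
		return mock_err_ptr(-ENOMEM);
	gbo->pin_count++;

	if (gbo->where != MOCK_IN_IOMEM) {
		/* Not in I/O memory: unpin and return NULL, not an error. */
		gbo->pin_count--;
		return NULL;
	}
	return gbo->storage;
}

/* A caller in the style of ast_cursor_init(): NULL is treated as a bug. */
static int mock_map_cursor(struct mock_gbo *gbo, void **out)
{
	void *vaddr;

	assert(out != NULL);

	vaddr = mock_gbo_vmap_iomem(gbo);
	if (mock_is_err(vaddr))
		return (int)mock_ptr_err(vaddr);
	if (!vaddr)
		return -ENODEV;

	*out = vaddr;
	return 0;
}
```

A caller that accepts either memory type, such as the fbdev console, would
instead fall back to the system-memory mapping on NULL rather than failing.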

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/ast/ast_cursor.c      |  12 ++-
 drivers/gpu/drm/drm_gem_vram_helper.c | 106 +++++++++++++++++++++++---
 include/drm/drm_gem_vram_helper.h     |   1 +
 3 files changed, 108 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_cursor.c b/drivers/gpu/drm/ast/ast_cursor.c
index acf0d23514e8..fac0079d213f 100644
--- a/drivers/gpu/drm/ast/ast_cursor.c
+++ b/drivers/gpu/drm/ast/ast_cursor.c
@@ -77,12 +77,22 @@ int ast_cursor_init(struct ast_private *ast)
 			drm_gem_vram_put(gbo);
 			goto err_drm_gem_vram_put;
 		}
-		vaddr = drm_gem_vram_vmap(gbo);
+		vaddr = drm_gem_vram_vmap_iomem(gbo);
 		if (IS_ERR(vaddr)) {
 			ret = PTR_ERR(vaddr);
 			drm_gem_vram_unpin(gbo);
 			drm_gem_vram_put(gbo);
 			goto err_drm_gem_vram_put;
+		} else if (drm_WARN_ON(dev, !vaddr)) {
+			/*
+			 * BUG: We successfully pinned the cursor to VRAM,
+			 *      but cannot get an I/O memory mapping. This
+			 *      is a bug in the VRAM helpers.
+			 */
+			ret = -ENODEV;
+			drm_gem_vram_unpin(gbo);
+			drm_gem_vram_put(gbo);
+			goto err_drm_gem_vram_put;
 		}
 
 		ast->cursor.gbo[i] = gbo;
diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
index 3296ed3df358..d36e8ba0b5e0 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -505,8 +505,11 @@ EXPORT_SYMBOL(drm_gem_vram_kunmap);
  *                       space
  * @gbo:	The GEM VRAM object to map
  *
- * The vmap function pins a GEM VRAM object to its current location, either
- * system or video memory, and maps its buffer into kernel address space.
+ * The vmap function pins a GEM VRAM object to its current location in
+ * system memory, and maps its buffer into kernel address space. The function
+ * returns NULL if the buffer is not located in system memory. You should
+ * call drm_gem_vram_vmap_iomem() in this case.
+ *
  * As pinned object cannot be relocated, you should avoid pinning objects
  * permanently. Call drm_gem_vram_vunmap() with the returned address to
  * unmap and unpin the GEM VRAM object.
@@ -515,13 +518,15 @@ EXPORT_SYMBOL(drm_gem_vram_kunmap);
  * call drm_gem_vram_pin() and drm_gem_vram_kmap() directly.
  *
  * Returns:
- * The buffer's virtual address on success, or
+ * The buffer's virtual address on success,
+ * NULL if the buffer is located in I/O memory, or
  * an ERR_PTR()-encoded error code otherwise.
  */
 void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)
 {
 	int ret;
 	void *base;
+	bool is_iomem;
 
 	ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
 	if (ret)
@@ -530,10 +535,18 @@ void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)
 	ret = drm_gem_vram_pin_locked(gbo, 0);
 	if (ret)
 		goto err_ttm_bo_unreserve;
-	base = drm_gem_vram_kmap_locked(gbo, true, NULL);
+	base = drm_gem_vram_kmap_locked(gbo, true, &is_iomem);
 	if (IS_ERR(base)) {
 		ret = PTR_ERR(base);
 		goto err_drm_gem_vram_unpin_locked;
+	} else if (is_iomem) {
+		/*
+		 * The buffer is located in I/O memory. Unpin and
+		 * return NULL. This is not an error.
+		 */
+		drm_gem_vram_kunmap_locked(gbo);
+		drm_gem_vram_unpin_locked(gbo);
+		base = NULL;
 	}
 
 	ttm_bo_unreserve(&gbo->bo);
@@ -548,6 +561,67 @@ void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)
 }
 EXPORT_SYMBOL(drm_gem_vram_vmap);
 
+/**
+ * drm_gem_vram_vmap_iomem() - Pins and maps a GEM VRAM object into kernel
+ *                             address space
+ * @gbo:	The GEM VRAM object to map
+ *
+ * The vmap_iomem function pins a GEM VRAM object to its current location in
+ * I/O memory, and maps its buffer into kernel address space. The function
+ * returns NULL if the buffer is not located in I/O memory. You should call
+ * drm_gem_vram_vmap() in this case.
+ *
+ * If you have special requirements for the pinning or mapping operations,
+ * call drm_gem_vram_pin() and drm_gem_vram_kmap() directly.
+ *
+ * As pinned object cannot be relocated, you should avoid pinning objects
+ * permanently. Call drm_gem_vram_vunmap() with the returned address to
+ * unmap and unpin the GEM VRAM object.
+ *
+ * Returns:
+ * The buffer's virtual address on success, or
+ * NULL if the buffer is not located in I/O memory, or
+ * an ERR_PTR()-encoded error code otherwise.
+ */
+void __iomem *drm_gem_vram_vmap_iomem(struct drm_gem_vram_object *gbo)
+{
+	int ret;
+	void __iomem *base;
+	bool is_iomem;
+
+	ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
+	if (ret)
+		return ERR_PTR(ret);
+
+	ret = drm_gem_vram_pin_locked(gbo, 0);
+	if (ret)
+		goto err_ttm_bo_unreserve;
+	base = drm_gem_vram_kmap_locked(gbo, true, &is_iomem);
+	if (IS_ERR(base)) {
+		ret = PTR_ERR(base);
+		goto err_drm_gem_vram_unpin_locked;
+	} else if (!is_iomem) {
+		/*
+		 * The buffer is not located in I/O memory. Unpin and
+		 * return NULL. This is not an error.
+		 */
+		drm_gem_vram_kunmap_locked(gbo);
+		drm_gem_vram_unpin_locked(gbo);
+		base = NULL;
+	}
+
+	ttm_bo_unreserve(&gbo->bo);
+
+	return base;
+
+err_drm_gem_vram_unpin_locked:
+	drm_gem_vram_unpin_locked(gbo);
+err_ttm_bo_unreserve:
+	ttm_bo_unreserve(&gbo->bo);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gem_vram_vmap_iomem);
+
 /**
  * drm_gem_vram_vunmap() - Unmaps and unpins a GEM VRAM object
  * @gbo:	The GEM VRAM object to unmap
@@ -931,6 +1005,17 @@ static void *drm_gem_vram_object_vmap(struct drm_gem_object *gem)
 	return base;
 }
 
+static void __iomem *drm_gem_vram_object_vmap_iomem(struct drm_gem_object *gem)
+{
+	struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
+	void __iomem *base;
+
+	base = drm_gem_vram_vmap_iomem(gbo);
+	if (IS_ERR(base))
+		return NULL;
+	return base;
+}
+
 /**
  * drm_gem_vram_object_vunmap() - \
 	Implements &struct drm_gem_object_funcs.vunmap
@@ -950,12 +1035,13 @@ static void drm_gem_vram_object_vunmap(struct drm_gem_object *gem,
  */
 
 static const struct drm_gem_object_funcs drm_gem_vram_object_funcs = {
-	.free	= drm_gem_vram_object_free,
-	.pin	= drm_gem_vram_object_pin,
-	.unpin	= drm_gem_vram_object_unpin,
-	.vmap	= drm_gem_vram_object_vmap,
-	.vunmap	= drm_gem_vram_object_vunmap,
-	.mmap   = drm_gem_ttm_mmap,
+	.free	    = drm_gem_vram_object_free,
+	.pin	    = drm_gem_vram_object_pin,
+	.unpin	    = drm_gem_vram_object_unpin,
+	.vmap	    = drm_gem_vram_object_vmap,
+	.vmap_iomem = drm_gem_vram_object_vmap_iomem,
+	.vunmap	    = drm_gem_vram_object_vunmap,
+	.mmap	    = drm_gem_ttm_mmap,
 	.print_info = drm_gem_ttm_print_info,
 };
 
diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
index 035332f3723f..ea53a475edc9 100644
--- a/include/drm/drm_gem_vram_helper.h
+++ b/include/drm/drm_gem_vram_helper.h
@@ -105,6 +105,7 @@ void *drm_gem_vram_kmap(struct drm_gem_vram_object *gbo, bool map,
 			bool *is_iomem);
 void drm_gem_vram_kunmap(struct drm_gem_vram_object *gbo);
 void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo);
+void __iomem *drm_gem_vram_vmap_iomem(struct drm_gem_vram_object *gbo);
 void drm_gem_vram_vunmap(struct drm_gem_vram_object *gbo, void *vaddr);
 
 int drm_gem_vram_fill_create_dumb(struct drm_file *file,
-- 
2.27.0

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 5/5] drm/vram_helper: Implement struct drm_gem_object_funcs.vmap_iomem
@ 2020-07-29 13:41   ` Thomas Zimmermann
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-29 13:41 UTC (permalink / raw)
  To: airlied, daniel, sam, kraxel, b.zolnierkie, maarten.lankhorst,
	mripard, jani.nikula, peda, dan.carpenter, natechancellor
  Cc: linux-fbdev, Thomas Zimmermann, dri-devel

The vmap_iomem function in struct drm_gem_object_funcs returns the
memory of the buffer if located in I/O memory, or NULL if it isn't.
The patch also updates the semantics of the vmap function to return
NULL if the buffer is not in system memory.

The main user is the fb-helper's console, which is prepared for this
change. The other user is the ast cursor code. It maintains two GEM
buffers in VRAM. The patch changes ast to map the buffers by calling
drm_gem_vram_vmap_iomem().

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
---
 drivers/gpu/drm/ast/ast_cursor.c      |  12 ++-
 drivers/gpu/drm/drm_gem_vram_helper.c | 106 +++++++++++++++++++++++---
 include/drm/drm_gem_vram_helper.h     |   1 +
 3 files changed, 108 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_cursor.c b/drivers/gpu/drm/ast/ast_cursor.c
index acf0d23514e8..fac0079d213f 100644
--- a/drivers/gpu/drm/ast/ast_cursor.c
+++ b/drivers/gpu/drm/ast/ast_cursor.c
@@ -77,12 +77,22 @@ int ast_cursor_init(struct ast_private *ast)
 			drm_gem_vram_put(gbo);
 			goto err_drm_gem_vram_put;
 		}
-		vaddr = drm_gem_vram_vmap(gbo);
+		vaddr = drm_gem_vram_vmap_iomem(gbo);
 		if (IS_ERR(vaddr)) {
 			ret = PTR_ERR(vaddr);
 			drm_gem_vram_unpin(gbo);
 			drm_gem_vram_put(gbo);
 			goto err_drm_gem_vram_put;
+		} else if (drm_WARN_ON(dev, !vaddr)) {
+			/*
+			 * BUG: We successfully pinned the cursor to VRAM,
+			 *      but cannot get an I/O memory mapping. This
+			 *      is a bug in the VRAM helpers.
+			 */
+			ret = -ENODEV;
+			drm_gem_vram_unpin(gbo);
+			drm_gem_vram_put(gbo);
+			goto err_drm_gem_vram_put;
 		}
 
 		ast->cursor.gbo[i] = gbo;
diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
index 3296ed3df358..d36e8ba0b5e0 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -505,8 +505,11 @@ EXPORT_SYMBOL(drm_gem_vram_kunmap);
  *                       space
  * @gbo:	The GEM VRAM object to map
  *
- * The vmap function pins a GEM VRAM object to its current location, either
- * system or video memory, and maps its buffer into kernel address space.
+ * The vmap function pins a GEM VRAM object to its current location in
+ * system memory, and maps its buffer into kernel address space. The function
+ * returns NULL if the buffer is not located in system memory. You should
+ * call drm_gem_vram_vmap_iomem() in this case.
+ *
  * As pinned object cannot be relocated, you should avoid pinning objects
  * permanently. Call drm_gem_vram_vunmap() with the returned address to
  * unmap and unpin the GEM VRAM object.
@@ -515,13 +518,15 @@ EXPORT_SYMBOL(drm_gem_vram_kunmap);
  * call drm_gem_vram_pin() and drm_gem_vram_kmap() directly.
  *
  * Returns:
- * The buffer's virtual address on success, or
+ * The buffer's virtual address on success,
+ * NULL if the buffer is located in I/O memory, or
  * an ERR_PTR()-encoded error code otherwise.
  */
 void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)
 {
 	int ret;
 	void *base;
+	bool is_iomem;
 
 	ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
 	if (ret)
@@ -530,10 +535,18 @@ void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)
 	ret = drm_gem_vram_pin_locked(gbo, 0);
 	if (ret)
 		goto err_ttm_bo_unreserve;
-	base = drm_gem_vram_kmap_locked(gbo, true, NULL);
+	base = drm_gem_vram_kmap_locked(gbo, true, &is_iomem);
 	if (IS_ERR(base)) {
 		ret = PTR_ERR(base);
 		goto err_drm_gem_vram_unpin_locked;
+	} else if (is_iomem) {
+		/*
+		 * The buffer is located in I/O memory. Unpin and
+		 * return NULL. This is not an error.
+		 */
+		drm_gem_vram_kunmap_locked(gbo);
+		drm_gem_vram_unpin_locked(gbo);
+		base = NULL;
 	}
 
 	ttm_bo_unreserve(&gbo->bo);
@@ -548,6 +561,67 @@ void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)
 }
 EXPORT_SYMBOL(drm_gem_vram_vmap);
 
+/**
+ * drm_gem_vram_vmap_iomem() - Pins and maps a GEM VRAM object into kernel
+ *                             address space
+ * @gbo:	The GEM VRAM object to map
+ *
+ * The vmap_iomem function pins a GEM VRAM object to its current location in
+ * I/O memory, and maps its buffer into kernel address space. The function
+ * returns NULL if the buffer is not located in I/O memory. You should call
+ * drm_gem_vram_vmap() in this case.
+ *
+ * If you have special requirements for the pinning or mapping operations,
+ * call drm_gem_vram_pin() and drm_gem_vram_kmap() directly.
+ *
+ * As pinned object cannot be relocated, you should avoid pinning objects
+ * permanently. Call drm_gem_vram_vunmap() with the returned address to
+ * unmap and unpin the GEM VRAM object.
+ *
+ * Returns:
+ * The buffer's virtual address on success, or
+ * NULL if the buffer is not located in I/O memory, or
+ * an ERR_PTR()-encoded error code otherwise.
+ */
+void __iomem *drm_gem_vram_vmap_iomem(struct drm_gem_vram_object *gbo)
+{
+	int ret;
+	void __iomem *base;
+	bool is_iomem;
+
+	ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
+	if (ret)
+		return ERR_PTR(ret);
+
+	ret = drm_gem_vram_pin_locked(gbo, 0);
+	if (ret)
+		goto err_ttm_bo_unreserve;
+	base = drm_gem_vram_kmap_locked(gbo, true, &is_iomem);
+	if (IS_ERR(base)) {
+		ret = PTR_ERR(base);
+		goto err_drm_gem_vram_unpin_locked;
+	} else if (!is_iomem) {
+		/*
+		 * The buffer is not located in I/O memory. Unpin and
+		 * return NULL. This is not an error.
+		 */
+		drm_gem_vram_kunmap_locked(gbo);
+		drm_gem_vram_unpin_locked(gbo);
+		base = NULL;
+	}
+
+	ttm_bo_unreserve(&gbo->bo);
+
+	return base;
+
+err_drm_gem_vram_unpin_locked:
+	drm_gem_vram_unpin_locked(gbo);
+err_ttm_bo_unreserve:
+	ttm_bo_unreserve(&gbo->bo);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gem_vram_vmap_iomem);
+
 /**
  * drm_gem_vram_vunmap() - Unmaps and unpins a GEM VRAM object
  * @gbo:	The GEM VRAM object to unmap
@@ -931,6 +1005,17 @@ static void *drm_gem_vram_object_vmap(struct drm_gem_object *gem)
 	return base;
 }
 
+static void __iomem *drm_gem_vram_object_vmap_iomem(struct drm_gem_object *gem)
+{
+	struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
+	void __iomem *base;
+
+	base = drm_gem_vram_vmap_iomem(gbo);
+	if (IS_ERR(base))
+		return NULL;
+	return base;
+}
+
 /**
  * drm_gem_vram_object_vunmap() - \
 	Implements &struct drm_gem_object_funcs.vunmap
@@ -950,12 +1035,13 @@ static void drm_gem_vram_object_vunmap(struct drm_gem_object *gem,
  */
 
 static const struct drm_gem_object_funcs drm_gem_vram_object_funcs = {
-	.free	= drm_gem_vram_object_free,
-	.pin	= drm_gem_vram_object_pin,
-	.unpin	= drm_gem_vram_object_unpin,
-	.vmap	= drm_gem_vram_object_vmap,
-	.vunmap	= drm_gem_vram_object_vunmap,
-	.mmap   = drm_gem_ttm_mmap,
+	.free	    = drm_gem_vram_object_free,
+	.pin	    = drm_gem_vram_object_pin,
+	.unpin	    = drm_gem_vram_object_unpin,
+	.vmap	    = drm_gem_vram_object_vmap,
+	.vmap_iomem = drm_gem_vram_object_vmap_iomem,
+	.vunmap	    = drm_gem_vram_object_vunmap,
+	.mmap	    = drm_gem_ttm_mmap,
 	.print_info = drm_gem_ttm_print_info,
 };
 
diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
index 035332f3723f..ea53a475edc9 100644
--- a/include/drm/drm_gem_vram_helper.h
+++ b/include/drm/drm_gem_vram_helper.h
@@ -105,6 +105,7 @@ void *drm_gem_vram_kmap(struct drm_gem_vram_object *gbo, bool map,
 			bool *is_iomem);
 void drm_gem_vram_kunmap(struct drm_gem_vram_object *gbo);
 void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo);
+void __iomem *drm_gem_vram_vmap_iomem(struct drm_gem_vram_object *gbo);
 void drm_gem_vram_vunmap(struct drm_gem_vram_object *gbo, void *vaddr);
 
 int drm_gem_vram_fill_create_dumb(struct drm_file *file,
-- 
2.27.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 34+ messages in thread
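
The kernel-doc in the patch above gives drm_gem_vram_vmap() and
drm_gem_vram_vmap_iomem() three possible results: an ERR_PTR()-encoded
error, NULL if the buffer is located in the other memory area, or a usable
address. What follows is a minimal userspace sketch of how a caller might
dispatch on that convention. It is not kernel code: err_ptr(), is_err(),
ptr_err(), struct fake_bo, fake_vmap(), fake_vmap_iomem() and map_either()
are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO 4095

static void *err_ptr(intptr_t error) { return (void *)error; }
static intptr_t ptr_err(const void *ptr) { return (intptr_t)ptr; }
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical buffer object; in_iomem models where the buffer lives. */
struct fake_bo {
	bool in_iomem;
	bool fail;		/* force a mapping error */
	char storage[16];
};

/* Models vmap(): address for system memory, NULL for I/O memory. */
static void *fake_vmap(struct fake_bo *bo)
{
	if (bo->fail)
		return err_ptr(-ENOMEM);
	return bo->in_iomem ? NULL : bo->storage;
}

/* Models vmap_iomem(): address for I/O memory, NULL for system memory. */
static void *fake_vmap_iomem(struct fake_bo *bo)
{
	if (bo->fail)
		return err_ptr(-ENOMEM);
	return bo->in_iomem ? bo->storage : NULL;
}

/* Try the I/O mapping first, then fall back to system memory. */
static int map_either(struct fake_bo *bo, void **vaddr, bool *is_iomem)
{
	void *p = fake_vmap_iomem(bo);

	if (is_err(p))
		return (int)ptr_err(p);	/* real error, do not fall back */
	if (p) {
		*vaddr = p;
		*is_iomem = true;
		return 0;
	}

	p = fake_vmap(bo);		/* NULL above meant "other area" */
	if (is_err(p))
		return (int)ptr_err(p);
	if (!p)
		return -ENODEV;		/* neither area produced a mapping */
	*vaddr = p;
	*is_iomem = false;
	return 0;
}

/* Exercises all three outcomes of the convention. */
static void demo_map_either(void)
{
	struct fake_bo bo = { .in_iomem = true };
	void *vaddr = NULL;
	bool is_iomem = false;

	assert(map_either(&bo, &vaddr, &is_iomem) == 0);
	assert(is_iomem && vaddr == bo.storage);

	bo.in_iomem = false;
	assert(map_either(&bo, &vaddr, &is_iomem) == 0);
	assert(!is_iomem && vaddr == bo.storage);

	bo.fail = true;
	assert(map_either(&bo, &vaddr, &is_iomem) == -ENOMEM);
}
```

The point of the sketch is that NULL and ERR_PTR() need separate checks:
NULL means "try the other function", while an encoded error should be
propagated as is.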

* Re: [PATCH 1/5] fbdev: Remove trailing whitespace
  2020-07-29 13:41   ` Thomas Zimmermann
@ 2020-07-29 13:47     ` daniel
  -1 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-07-29 13:47 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, sam, peda, dan.carpenter

On Wed, Jul 29, 2020 at 03:41:44PM +0200, Thomas Zimmermann wrote:
> Removes trailing whitespaces in several places.
> 
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

checkpatch patch for fbdev, I'm blown :-)

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/video/fbdev/core/fbmem.c | 10 +++++-----
>  include/linux/fb.h               | 18 +++++++++---------
>  2 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> index 30e73ec4ad5c..dd0ccf35f7b7 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -777,7 +777,7 @@ fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
>  
>  	if (info->fbops->fb_read)
>  		return info->fbops->fb_read(info, buf, count, ppos);
> -	
> +
>  	total_size = info->screen_size;
>  
> 	if (total_size == 0)
> @@ -842,7 +842,7 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
>  
>  	if (info->fbops->fb_write)
>  		return info->fbops->fb_write(info, buf, count, ppos);
> -	
> +
>  	total_size = info->screen_size;
>  
> 	if (total_size == 0)
> @@ -1061,7 +1061,7 @@ EXPORT_SYMBOL(fb_set_var);
>  
>  int
>  fb_blank(struct fb_info *info, int blank)
> -{	
> +{
>  	struct fb_event event;
>  	int ret = -EINVAL;
>  
> @@ -1437,7 +1437,7 @@ __releases(&info->lock)
>  	return res;
>  }
>  
> -static int 
> +static int
>  fb_release(struct inode *inode, struct file *file)
>  __acquires(&info->lock)
>  __releases(&info->lock)
> @@ -1627,7 +1627,7 @@ static int do_register_framebuffer(struct fb_info *fb_info)
>  			fb_info->pixmap.access_align = 32;
>  			fb_info->pixmap.flags = FB_PIXMAP_DEFAULT;
>  		}
> -	}	
> +	}
>  	fb_info->pixmap.offset = 0;
>  
>  	if (!fb_info->pixmap.blit_x)
> diff --git a/include/linux/fb.h b/include/linux/fb.h
> index 2b530e6d86e4..714187bc13ac 100644
> --- a/include/linux/fb.h
> +++ b/include/linux/fb.h
> @@ -124,7 +124,7 @@ struct fb_cursor_user {
>   * Register/unregister for framebuffer events
>   */
>  
> -/*	The resolution of the passed in fb_info about to change */ 
> +/*	The resolution of the passed in fb_info about to change */
>  #define FB_EVENT_MODE_CHANGE		0x01
>  
>  #ifdef CONFIG_GUMSTIX_AM200EPD
> @@ -459,12 +459,12 @@ struct fb_info {
>  
>  #if IS_ENABLED(CONFIG_FB_BACKLIGHT)
>  	/* assigned backlight device */
> -	/* set before framebuffer registration, 
> +	/* set before framebuffer registration,
>  	   remove after unregister */
>  	struct backlight_device *bl_dev;
>  
>  	/* Backlight level curve */
> -	struct mutex bl_curve_mutex;	
> +	struct mutex bl_curve_mutex;
>  	u8 bl_curve[FB_BACKLIGHT_LEVELS];
>  #endif
>  #ifdef CONFIG_FB_DEFERRED_IO
> @@ -483,8 +483,8 @@ struct fb_info {
>  		char __iomem *screen_base;	/* Virtual address */
>  		char *screen_buffer;
>  	};
> -	unsigned long screen_size;	/* Amount of ioremapped VRAM or 0 */ 
> -	void *pseudo_palette;		/* Fake palette of 16 colors */ 
> +	unsigned long screen_size;	/* Amount of ioremapped VRAM or 0 */
> +	void *pseudo_palette;		/* Fake palette of 16 colors */
>  #define FBINFO_STATE_RUNNING	0
>  #define FBINFO_STATE_SUSPENDED	1
>  	u32 state;			/* Hardware state i.e suspend */
> @@ -587,11 +587,11 @@ static inline struct apertures_struct *alloc_apertures(unsigned int max_num) {
>       *  `Generic' versions of the frame buffer device operations
>       */
>  
> -extern int fb_set_var(struct fb_info *info, struct fb_var_screeninfo *var); 
> -extern int fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var); 
> +extern int fb_set_var(struct fb_info *info, struct fb_var_screeninfo *var);
> +extern int fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var);
>  extern int fb_blank(struct fb_info *info, int blank);
> -extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect); 
> -extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area); 
> +extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect);
> +extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area);
>  extern void cfb_imageblit(struct fb_info *info, const struct fb_image *image);
>  /*
>   * Drawing operations where framebuffer is in system RAM
> -- 
> 2.27.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-07-29 13:41   ` Thomas Zimmermann
@ 2020-07-29 13:53     ` daniel
  -1 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-07-29 13:53 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, sam, peda, dan.carpenter

On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
> DRM fb helpers require read and write functions for framebuffer
> memory. Export the existing code from fbdev.
> 
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

Hm I'm not super sure whether we want to actually reuse this stuff ... We
kinda don't care about the sparc special case, and just having an fbdev
implementation which has the switch between memcpy and memcpy_to/from_io
in one single place sounds a lot simpler ...

This way we can have a clean split between the old horrors of real fbdev
drivers, and a much cleaner world in drm. It would mean a bit of
copypasting, but I think that's actually a good thing.

In general my idea for drm fbdev emulation is that for any area we have a
problem we just ignore the entire fbmem.c code and write our own: mmap,
backlight handling (still unsolved, and horrible), cfb vs sys here. This
entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
duplication here doesn't seem worth it imo.

Thoughts?
-Daniel

> ---
>  drivers/video/fbdev/core/fbmem.c | 53 ++++++++++++++++++++++----------
>  include/linux/fb.h               |  5 +++
>  2 files changed, 41 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> index dd0ccf35f7b7..b496ff90db3e 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -759,25 +759,18 @@ static struct fb_info *file_fb_info(struct file *file)
>  	return info;
>  }
>  
> -static ssize_t
> -fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> +ssize_t fb_cfb_read(struct fb_info *info, char __user *buf, size_t count,
> +		    loff_t *ppos)
>  {
>  	unsigned long p = *ppos;
> -	struct fb_info *info = file_fb_info(file);
>  	u8 *buffer, *dst;
>  	u8 __iomem *src;
>  	int c, cnt = 0, err = 0;
>  	unsigned long total_size;
>  
> -	if (!info || ! info->screen_base)
> -		return -ENODEV;
> -
>  	if (info->state != FBINFO_STATE_RUNNING)
>  		return -EPERM;
>  
> -	if (info->fbops->fb_read)
> -		return info->fbops->fb_read(info, buf, count, ppos);
> -
>  	total_size = info->screen_size;
>  
> 	if (total_size == 0)
> @@ -823,16 +816,12 @@ fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
>  
>  	return (err) ? err : cnt;
>  }
> +EXPORT_SYMBOL(fb_cfb_read);
>  
>  static ssize_t
> -fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> +fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
>  {
> -	unsigned long p = *ppos;
>  	struct fb_info *info = file_fb_info(file);
> -	u8 *buffer, *src;
> -	u8 __iomem *dst;
> -	int c, cnt = 0, err = 0;
> -	unsigned long total_size;
>  
>  	if (!info || !info->screen_base)
>  		return -ENODEV;
> @@ -840,8 +829,20 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
>  	if (info->state != FBINFO_STATE_RUNNING)
>  		return -EPERM;
>  
> -	if (info->fbops->fb_write)
> -		return info->fbops->fb_write(info, buf, count, ppos);
> +	if (info->fbops->fb_read)
> +		return info->fbops->fb_read(info, buf, count, ppos);
> +	else
> +		return fb_cfb_read(info, buf, count, ppos);
> +}
> +
> +ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
> +		     size_t count, loff_t *ppos)
> +{
> +	unsigned long p = *ppos;
> +	u8 *buffer, *src;
> +	u8 __iomem *dst;
> +	int c, cnt = 0, err = 0;
> +	unsigned long total_size;
>  
>  	total_size = info->screen_size;
>  
> @@ -895,6 +896,24 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
>  
>  	return (cnt) ? cnt : err;
>  }
> +EXPORT_SYMBOL(fb_cfb_write);
> +
> +static ssize_t
> +fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> +{
> +	struct fb_info *info = file_fb_info(file);
> +
> +	if (!info || !info->screen_base)
> +		return -ENODEV;
> +
> +	if (info->state != FBINFO_STATE_RUNNING)
> +		return -EPERM;
> +
> +	if (info->fbops->fb_write)
> +		return info->fbops->fb_write(info, buf, count, ppos);
> +	else
> +		return fb_cfb_write(info, buf, count, ppos);
> +}
>  
>  int
>  fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var)
> diff --git a/include/linux/fb.h b/include/linux/fb.h
> index 714187bc13ac..12ad83963db5 100644
> --- a/include/linux/fb.h
> +++ b/include/linux/fb.h
> @@ -593,6 +593,11 @@ extern int fb_blank(struct fb_info *info, int blank);
>  extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect);
>  extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area);
>  extern void cfb_imageblit(struct fb_info *info, const struct fb_image *image);
> +extern ssize_t fb_cfb_read(struct fb_info *info, char __user *buf,
> +			   size_t count, loff_t *ppos);
> +extern ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
> +			    size_t count, loff_t *ppos);
> +
>  /*
>   * Drawing operations where framebuffer is in system RAM
>   */
> -- 
> 2.27.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/5] drm: Add infrastructure for vmap operations of I/O memory
  2020-07-29 13:41   ` Thomas Zimmermann
@ 2020-07-29 13:57     ` daniel
  -1 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-07-29 13:57 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, sam, peda, dan.carpenter

On Wed, Jul 29, 2020 at 03:41:46PM +0200, Thomas Zimmermann wrote:
> Most platforms allow for accessing framebuffer I/O memory with regular
> load and store operations. Some platforms, such as sparc64, require
> the use of special instructions instead.
> 
> This patch adds vmap_iomem to struct drm_gem_object_funcs. The new
> interface drm_client_buffer_vmap_iomem() gives DRM clients access to the
> I/O memory buffer. The semantics of struct drm_gem_object_funcs.vmap
> change slightly. It used to return system or I/O memory. Now it is
> expected to return memory addresses that can be accessed with regular
> load and store operations. So nothing changes for existing implementations
> of GEM objects. If the GEM object also implements vmap_iomem, a call
> to vmap shall only return system memory, even if I/O memory could be
> accessed with loads and stores.
> 
> The existing interface drm_client_buffer_vmap() shall only return memory
> as given by drm_gem_vmap (i.e., that is accessible via regular load and
> store). The new interface drm_client_buffer_vmap_iomem() shall only
> return I/O memory.
> 
> DRM clients must map buffers by calling drm_client_buffer_vmap_iomem()
> and drm_client_buffer_vmap() to get the buffer in I/O or system memory.
> Each function returns NULL if the buffer is in the other memory area.
> Depending on the type of the returned memory, clients must access the
> framebuffer with the appropriate operations.
> 
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

Hm I don't think this works, since for more dynamic framebuffers (like
real big gpu ttm drivers) this is a dynamic thing, which can change every
time we do an mmap. So I think the ttm approach of having an is_iomem flag
is a lot better.

The trouble with that is that you don't have correct checking of sparse
mappings, but oh well :-/ The one idea I've had to address that is using
something like this

typedef struct {
	bool is_iomem;
	union {
		void __iomem *vaddr_iomem;
		void *vaddr;
	};
} dma_buf_addr_t;

And then having a wrapper for memcpy_from_dma_buf_addr and
memcpy_to_dma_buf_addr, which switches between memcpy and memcpy_from/toio
depending upon the is_iomem flag.

But it's a lot more invasive unfortunately :-/
-Daniel
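
The tagged-address idea above can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the proposed kernel
interface: struct buf_addr, fake_memcpy_toio(), fake_memcpy_fromio(),
memcpy_to_buf_addr() and memcpy_from_buf_addr() are hypothetical names, and
a volatile pointer stands in for the kernel's __iomem annotation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Address plus a flag that records which kind of memory it refers to. */
struct buf_addr {
	bool is_iomem;
	union {
		volatile void *vaddr_iomem;	/* stand-in for void __iomem * */
		void *vaddr;
	};
};

/* Stand-in for memcpy_toio(): byte-wise volatile stores. */
static void fake_memcpy_toio(volatile void *dst, const void *src, size_t len)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/* Stand-in for memcpy_fromio(): byte-wise volatile loads. */
static void fake_memcpy_fromio(void *dst, const volatile void *src, size_t len)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/* The only place that decides between regular and I/O copies on write. */
static void memcpy_to_buf_addr(const struct buf_addr *dst, size_t offset,
			       const void *src, size_t len)
{
	if (dst->is_iomem)
		fake_memcpy_toio((volatile unsigned char *)dst->vaddr_iomem +
				 offset, src, len);
	else
		memcpy((unsigned char *)dst->vaddr + offset, src, len);
}

/* The matching decision point for reads. */
static void memcpy_from_buf_addr(void *dst, const struct buf_addr *src,
				 size_t offset, size_t len)
{
	if (src->is_iomem)
		fake_memcpy_fromio(dst,
				   (const volatile unsigned char *)src->vaddr_iomem +
				   offset, len);
	else
		memcpy(dst, (const unsigned char *)src->vaddr + offset, len);
}

/* Round-trips a few bytes through both kinds of mapping. */
static void demo_buf_addr(void)
{
	unsigned char vram[8] = { 0 }, sysmem[8] = { 0 }, out[3];
	struct buf_addr io = { .is_iomem = true, .vaddr_iomem = vram };
	struct buf_addr sys = { .is_iomem = false, .vaddr = sysmem };

	memcpy_to_buf_addr(&io, 2, "abc", 3);
	memcpy_from_buf_addr(out, &io, 2, 3);
	assert(memcmp(out, "abc", 3) == 0);

	memcpy_to_buf_addr(&sys, 1, "xyz", 3);
	memcpy_from_buf_addr(out, &sys, 1, 3);
	assert(memcmp(out, "xyz", 3) == 0);
}
```

Because the flag travels with the address, a buffer that moves between
system and I/O memory only needs a fresh struct buf_addr on the next map;
callers keep using the same two wrappers.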

> ---
>  drivers/gpu/drm/drm_client.c   | 52 ++++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/drm_gem.c      | 19 +++++++++++++
>  drivers/gpu/drm/drm_internal.h |  1 +
>  include/drm/drm_client.h       |  8 +++++-
>  include/drm/drm_gem.h          | 17 +++++++++--
>  5 files changed, 91 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
> index 495f47d23d87..b5bbe089a41e 100644
> --- a/drivers/gpu/drm/drm_client.c
> +++ b/drivers/gpu/drm/drm_client.c
> @@ -327,6 +327,46 @@ void *drm_client_buffer_vmap(struct drm_client_buffer *buffer)
>  }
>  EXPORT_SYMBOL(drm_client_buffer_vmap);
>  
> +/**
> + * drm_client_buffer_vmap_iomem - Map DRM client buffer into address space
> + * @buffer: DRM client buffer
> + *
> + * This function maps a client buffer into kernel address space. If the
> + * buffer is already mapped, it returns the mapping's address.
> + *
> + * Client buffer mappings are not ref'counted. Each call to
> + * drm_client_buffer_vmap() should be followed by a call to
> + * drm_client_buffer_vunmap(); or the client buffer should be mapped
> + * throughout its lifetime.
> + *
> + * Returns:
> + *	The mapped memory's address
> + */
> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer)
> +{
> +	void __iomem *vaddr_iomem;
> +
> +	if (buffer->vaddr_iomem)
> +		return buffer->vaddr_iomem;
> +
> +	/*
> +	 * FIXME: The dependency on GEM here isn't required, we could
> +	 * convert the driver handle to a dma-buf instead and use the
> +	 * backend-agnostic dma-buf vmap support instead. This would
> +	 * require that the handle2fd prime ioctl is reworked to pull the
> +	 * fd_install step out of the driver backend hooks, to make that
> +	 * final step optional for internal users.
> +	 */
> +	vaddr_iomem = drm_gem_vmap_iomem(buffer->gem);
> +	if (IS_ERR(vaddr_iomem))
> +		return vaddr_iomem;
> +
> +	buffer->vaddr_iomem = vaddr_iomem;
> +
> +	return vaddr_iomem;
> +}
> +EXPORT_SYMBOL(drm_client_buffer_vmap_iomem);
> +
>  /**
>   * drm_client_buffer_vunmap - Unmap DRM client buffer
>   * @buffer: DRM client buffer
> @@ -337,8 +377,16 @@ EXPORT_SYMBOL(drm_client_buffer_vmap);
>   */
>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
>  {
> -	drm_gem_vunmap(buffer->gem, buffer->vaddr);
> -	buffer->vaddr = NULL;
> +	drm_WARN_ON(buffer->client->dev, buffer->vaddr && buffer->vaddr_iomem);
> +
> +	if (buffer->vaddr) {
> +		drm_gem_vunmap(buffer->gem, buffer->vaddr);
> +		buffer->vaddr = NULL;
> +	}
> +	if (buffer->vaddr_iomem) {
> +		drm_gem_vunmap(buffer->gem, (void *)buffer->vaddr_iomem);
> +		buffer->vaddr_iomem = NULL;
> +	}
>  }
>  EXPORT_SYMBOL(drm_client_buffer_vunmap);
>  
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index a57f5379fc08..a001be8c0965 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1227,6 +1227,25 @@ void *drm_gem_vmap(struct drm_gem_object *obj)
>  		vaddr = obj->funcs->vmap(obj);
>  	else if (obj->dev->driver->gem_prime_vmap)
>  		vaddr = obj->dev->driver->gem_prime_vmap(obj);
> +	else if (obj->funcs && obj->funcs->vmap_iomem)
> +		vaddr = NULL; /* requires mapping as I/O memory */
> +	else
> +		vaddr = ERR_PTR(-EOPNOTSUPP);
> +
> +	if (!vaddr)
> +		vaddr = ERR_PTR(-ENOMEM);
> +
> +	return vaddr;
> +}
> +
> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj)
> +{
> +	void __iomem *vaddr;
> +
> +	if (obj->funcs && obj->funcs->vmap_iomem)
> +		vaddr = obj->funcs->vmap_iomem(obj);
> +	else if (obj->funcs && obj->funcs->vmap)
> +		vaddr = NULL; /* requires mapping as system memory */
>  	else
>  		vaddr = ERR_PTR(-EOPNOTSUPP);
>  
> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> index 8e01caaf95cc..aa1a3d4f9223 100644
> --- a/drivers/gpu/drm/drm_internal.h
> +++ b/drivers/gpu/drm/drm_internal.h
> @@ -187,6 +187,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
>  int drm_gem_pin(struct drm_gem_object *obj);
>  void drm_gem_unpin(struct drm_gem_object *obj);
>  void *drm_gem_vmap(struct drm_gem_object *obj);
> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj);
>  void drm_gem_vunmap(struct drm_gem_object *obj, void *vaddr);
>  
>  /* drm_debugfs.c drm_debugfs_crc.c */
> diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
> index 7aaea665bfc2..94aa075ee4b6 100644
> --- a/include/drm/drm_client.h
> +++ b/include/drm/drm_client.h
> @@ -141,10 +141,15 @@ struct drm_client_buffer {
>  	struct drm_gem_object *gem;
>  
>  	/**
> -	 * @vaddr: Virtual address for the buffer
> +	 * @vaddr: Virtual address for the buffer in system memory
>  	 */
>  	void *vaddr;
>  
> +	/**
> +	 * @vaddr_iomem: Virtual address for the buffer in I/O memory
> +	 */
> +	void __iomem *vaddr_iomem;
> +
>  	/**
>  	 * @fb: DRM framebuffer
>  	 */
> @@ -156,6 +161,7 @@ drm_client_framebuffer_create(struct drm_client_dev *client, u32 width, u32 heig
>  void drm_client_framebuffer_delete(struct drm_client_buffer *buffer);
>  int drm_client_framebuffer_flush(struct drm_client_buffer *buffer, struct drm_rect *rect);
>  void *drm_client_buffer_vmap(struct drm_client_buffer *buffer);
> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer);
>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer);
>  
>  int drm_client_modeset_create(struct drm_client_dev *client);
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index 337a48321705..bc735ff522a8 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -134,17 +134,28 @@ struct drm_gem_object_funcs {
>  	 * @vmap:
>  	 *
>  	 * Returns a virtual address for the buffer. Used by the
> -	 * drm_gem_dmabuf_vmap() helper.
> +	 * drm_gem_dmabuf_vmap() helper. If the buffer is not
> +	 * located in system memory, the function returns NULL.
>  	 *
>  	 * This callback is optional.
>  	 */
>  	void *(*vmap)(struct drm_gem_object *obj);
>  
> +	/**
> +	 * @vmap_iomem:
> +	 *
> +	 * Returns a virtual address for the buffer. If the buffer is not
> +	 * located in I/O memory, the function returns NULL.
> +	 *
> +	 * This callback is optional.
> +	 */
> +	void __iomem *(*vmap_iomem)(struct drm_gem_object *obj);
> +
>  	/**
>  	 * @vunmap:
>  	 *
> -	 * Releases the address previously returned by @vmap. Used by the
> -	 * drm_gem_dmabuf_vunmap() helper.
> +	 * Releases the address previously returned by @vmap or @vmap_iomem.
> +	 * Used by the drm_gem_dmabuf_vunmap() helper.
>  	 *
>  	 * This callback is optional.
>  	 */
> -- 
> 2.27.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/5] drm: Add infrastructure for vmap operations of I/O memory
@ 2020-07-29 13:57     ` daniel
  0 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-07-29 13:57 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, sam, peda, dan.carpenter

On Wed, Jul 29, 2020 at 03:41:46PM +0200, Thomas Zimmermann wrote:
> Most platforms allow for accessing framebuffer I/O memory with regular
> load and store operations. Some platforms, such as sparc64, require
> the use of special instructions instead.
> 
> This patch adds vmap_iomem to struct drm_gem_object_funcs. The new
> interface drm_client_buffer_vmap_iomem() gives DRM clients access to the
> I/O memory buffer. The semantics of struct drm_gem_object_funcs.vmap
> change slightly. It used to return system or I/O memory. Now it is
> expected to return memory addresses that can be accessed with regular
> load and store operations. So nothing changes for existing implementations
> of GEM objects. If the GEM object also implements vmap_iomem, a call
> to vmap shall only return system memory, even if I/O memory could be
> accessed with loads and stores.
> 
> The existing interface drm_client_buffer_vmap() shall only return memory
> as given by drm_gem_vmap() (i.e., that is accessible via regular load and
> store). The new interface drm_client_buffer_vmap_iomem() shall only
> return I/O memory.
> 
> DRM clients must map buffers by calling drm_client_buffer_vmap_iomem()
> and drm_client_buffer_vmap() to get the buffer in I/O or system memory.
> Each function returns NULL if the buffer is in the other memory area.
> Depending on the type of the returned memory, clients must access the
> framebuffer with the appropriate operations.
> 
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

Hm I don't think this works, since for more dynamic framebuffers (like
real big gpu ttm drivers) this is a dynamic thing, which can change every
time we do an mmap. So I think the ttm approach of having an is_iomem flag
is a lot better.

The trouble with that is that you don't have correct checking of sparse
mappings, but oh well :-/ The one idea I've had to address that is using
something like this

typedef struct {
	bool is_iomem;
	union {
		void __iomem *vaddr_iomem;
		void *vaddr;
	};
} dma_buf_addr_t;

And then having a wrapper for memcpy_from_dma_buf_addr and
memcpy_to_dma_buf_addr, which switches between memcpy and memcpy_from/toio
depending upon the is_iomem flag.
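
A minimal userspace sketch of such wrappers, assuming a struct along the
lines of the one above. The __iomem define and the memcpy_toio() and
memcpy_fromio() stubs are stand-ins so that the example builds outside the
kernel; the io_copies counter and the offset parameters are hypothetical
additions for illustration, not an existing dma-buf API:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-ins (assumptions): in the kernel, __iomem is a sparse
 * annotation and memcpy_toio()/memcpy_fromio() come from <linux/io.h>.
 */
#define __iomem
static int io_copies; /* counts copies taking the I/O path, demo only */

static void memcpy_toio(volatile void __iomem *dst, const void *src, size_t n)
{
	memcpy((void *)dst, src, n);
	io_copies++;
}

static void memcpy_fromio(void *dst, const volatile void __iomem *src, size_t n)
{
	memcpy(dst, (const void *)src, n);
	io_copies++;
}

/* Tagged address: is_iomem selects which union member is valid. */
struct dma_buf_addr {
	bool is_iomem;
	union {
		void __iomem *vaddr_iomem;
		void *vaddr;
	};
};

/* Copy n bytes from src into the mapping at byte offset off. */
static void memcpy_to_dma_buf_addr(const struct dma_buf_addr *dst, size_t off,
				   const void *src, size_t n)
{
	if (dst->is_iomem)
		memcpy_toio((char __iomem *)dst->vaddr_iomem + off, src, n);
	else
		memcpy((char *)dst->vaddr + off, src, n);
}

/* Copy n bytes from the mapping at byte offset off into dst. */
static void memcpy_from_dma_buf_addr(void *dst, const struct dma_buf_addr *src,
				     size_t off, size_t n)
{
	if (src->is_iomem)
		memcpy_fromio(dst, (const char __iomem *)src->vaddr_iomem + off, n);
	else
		memcpy(dst, (const char *)src->vaddr + off, n);
}
```

Callers would only ever go through the two wrappers, so the choice between
memcpy and memcpy_to/fromio lives in one place and can follow the buffer
if it moves between system and I/O memory.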

But it's a lot more invasive unfortunately :-/
-Daniel

> ---
>  drivers/gpu/drm/drm_client.c   | 52 ++++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/drm_gem.c      | 19 +++++++++++++
>  drivers/gpu/drm/drm_internal.h |  1 +
>  include/drm/drm_client.h       |  8 +++++-
>  include/drm/drm_gem.h          | 17 +++++++++--
>  5 files changed, 91 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
> index 495f47d23d87..b5bbe089a41e 100644
> --- a/drivers/gpu/drm/drm_client.c
> +++ b/drivers/gpu/drm/drm_client.c
> @@ -327,6 +327,46 @@ void *drm_client_buffer_vmap(struct drm_client_buffer *buffer)
>  }
>  EXPORT_SYMBOL(drm_client_buffer_vmap);
>  
> +/**
> + * drm_client_buffer_vmap_iomem - Map DRM client buffer into address space
> + * @buffer: DRM client buffer
> + *
> + * This function maps a client buffer into kernel address space. If the
> + * buffer is already mapped, it returns the mapping's address.
> + *
> + * Client buffer mappings are not ref'counted. Each call to
> + * drm_client_buffer_vmap() should be followed by a call to
> + * drm_client_buffer_vunmap(); or the client buffer should be mapped
> + * throughout its lifetime.
> + *
> + * Returns:
> + *	The mapped memory's address
> + */
> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer)
> +{
> +	void __iomem *vaddr_iomem;
> +
> +	if (buffer->vaddr_iomem)
> +		return buffer->vaddr_iomem;
> +
> +	/*
> +	 * FIXME: The dependency on GEM here isn't required, we could
> +	 * convert the driver handle to a dma-buf instead and use the
> +	 * backend-agnostic dma-buf vmap support instead. This would
> +	 * require that the handle2fd prime ioctl is reworked to pull the
> +	 * fd_install step out of the driver backend hooks, to make that
> +	 * final step optional for internal users.
> +	 */
> +	vaddr_iomem = drm_gem_vmap_iomem(buffer->gem);
> +	if (IS_ERR(vaddr_iomem))
> +		return vaddr_iomem;
> +
> +	buffer->vaddr_iomem = vaddr_iomem;
> +
> +	return vaddr_iomem;
> +}
> +EXPORT_SYMBOL(drm_client_buffer_vmap_iomem);
> +
>  /**
>   * drm_client_buffer_vunmap - Unmap DRM client buffer
>   * @buffer: DRM client buffer
> @@ -337,8 +377,16 @@ EXPORT_SYMBOL(drm_client_buffer_vmap);
>   */
>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
>  {
> -	drm_gem_vunmap(buffer->gem, buffer->vaddr);
> -	buffer->vaddr = NULL;
> +	drm_WARN_ON(buffer->client->dev, buffer->vaddr && buffer->vaddr_iomem);
> +
> +	if (buffer->vaddr) {
> +		drm_gem_vunmap(buffer->gem, buffer->vaddr);
> +		buffer->vaddr = NULL;
> +	}
> +	if (buffer->vaddr_iomem) {
> +		drm_gem_vunmap(buffer->gem, (void *)buffer->vaddr_iomem);
> +		buffer->vaddr_iomem = NULL;
> +	}
>  }
>  EXPORT_SYMBOL(drm_client_buffer_vunmap);
>  
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index a57f5379fc08..a001be8c0965 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1227,6 +1227,25 @@ void *drm_gem_vmap(struct drm_gem_object *obj)
>  		vaddr = obj->funcs->vmap(obj);
>  	else if (obj->dev->driver->gem_prime_vmap)
>  		vaddr = obj->dev->driver->gem_prime_vmap(obj);
> +	else if (obj->funcs && obj->funcs->vmap_iomem)
> +		vaddr = NULL; /* requires mapping as I/O memory */
> +	else
> +		vaddr = ERR_PTR(-EOPNOTSUPP);
> +
> +	if (!vaddr)
> +		vaddr = ERR_PTR(-ENOMEM);
> +
> +	return vaddr;
> +}
> +
> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj)
> +{
> +	void __iomem *vaddr;
> +
> +	if (obj->funcs && obj->funcs->vmap_iomem)
> +		vaddr = obj->funcs->vmap_iomem(obj);
> +	else if (obj->funcs && obj->funcs->vmap)
> +		vaddr = NULL; /* requires mapping as system memory */
>  	else
>  		vaddr = ERR_PTR(-EOPNOTSUPP);
>  
> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> index 8e01caaf95cc..aa1a3d4f9223 100644
> --- a/drivers/gpu/drm/drm_internal.h
> +++ b/drivers/gpu/drm/drm_internal.h
> @@ -187,6 +187,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
>  int drm_gem_pin(struct drm_gem_object *obj);
>  void drm_gem_unpin(struct drm_gem_object *obj);
>  void *drm_gem_vmap(struct drm_gem_object *obj);
> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj);
>  void drm_gem_vunmap(struct drm_gem_object *obj, void *vaddr);
>  
>  /* drm_debugfs.c drm_debugfs_crc.c */
> diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
> index 7aaea665bfc2..94aa075ee4b6 100644
> --- a/include/drm/drm_client.h
> +++ b/include/drm/drm_client.h
> @@ -141,10 +141,15 @@ struct drm_client_buffer {
>  	struct drm_gem_object *gem;
>  
>  	/**
> -	 * @vaddr: Virtual address for the buffer
> +	 * @vaddr: Virtual address for the buffer in system memory
>  	 */
>  	void *vaddr;
>  
> +	/**
> +	 * @vaddr_iomem: Virtual address for the buffer in I/O memory
> +	 */
> +	void __iomem *vaddr_iomem;
> +
>  	/**
>  	 * @fb: DRM framebuffer
>  	 */
> @@ -156,6 +161,7 @@ drm_client_framebuffer_create(struct drm_client_dev *client, u32 width, u32 heig
>  void drm_client_framebuffer_delete(struct drm_client_buffer *buffer);
>  int drm_client_framebuffer_flush(struct drm_client_buffer *buffer, struct drm_rect *rect);
>  void *drm_client_buffer_vmap(struct drm_client_buffer *buffer);
> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer);
>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer);
>  
>  int drm_client_modeset_create(struct drm_client_dev *client);
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index 337a48321705..bc735ff522a8 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -134,17 +134,28 @@ struct drm_gem_object_funcs {
>  	 * @vmap:
>  	 *
>  	 * Returns a virtual address for the buffer. Used by the
> -	 * drm_gem_dmabuf_vmap() helper.
> +	 * drm_gem_dmabuf_vmap() helper. If the buffer is not
> +	 * located in system memory, the function returns NULL.
>  	 *
>  	 * This callback is optional.
>  	 */
>  	void *(*vmap)(struct drm_gem_object *obj);
>  
> +	/**
> +	 * @vmap_iomem:
> +	 *
> +	 * Returns a virtual address for the buffer. If the buffer is not
> +	 * located in I/O memory, the function returns NULL.
> +	 *
> +	 * This callback is optional.
> +	 */
> +	void __iomem *(*vmap_iomem)(struct drm_gem_object *obj);
> +
>  	/**
>  	 * @vunmap:
>  	 *
> -	 * Releases the address previously returned by @vmap. Used by the
> -	 * drm_gem_dmabuf_vunmap() helper.
> +	 * Releases the address previously returned by @vmap or @vmap_iomem.
> +	 * Used by the drm_gem_dmabuf_vunmap() helper.
>  	 *
>  	 * This callback is optional.
>  	 */
> -- 
> 2.27.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-07-29 13:53     ` daniel
@ 2020-07-29 16:36       ` Sam Ravnborg
  -1 siblings, 0 replies; 34+ messages in thread
From: Sam Ravnborg @ 2020-07-29 16:36 UTC (permalink / raw)
  To: daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, peda, dan.carpenter

Hi Daniel.

On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
> On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
> > DRM fb helpers require read and write functions for framebuffer
> > memory. Export the existing code from fbdev.
> > 
> > Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> 
> Hm I'm not super sure whether we want to actually reuse this stuff ... We
> kinda don't care about the sparc special case, and just having an fbdev
> implementation which has the switch between memcpy and memcpy_to/from_io
> in one single place sounds a lot simpler ...
> 
> This way we can have a clean split between the old horrors of real fbdev
> drivers, and a much cleaner world in drm. It would mean a bit of
> copypasting, but I think that's actually a good thing.
> 
> In general my idea for drm fbdev emulation is that for any area we have a
> problem we just ignore the entire fbmem.c code and write our own: mmap,
> backlight handling (still unsolved, and horrible), cfb vs sys here. This
> entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
> duplication here doesn't seem worth it imo.
> 
> Thoughts?


I can see that fbmem is a mix of ioctl support and other stuff.
We could factor out all the ioctl parts of fbmem.c to a new file
named fbioctl.c.

And then let the ioctl parts call down into drm stuff and avoid reusing
the fbdev code when we first reach drm code.
This would require local copies of:
sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
and more I think which I missed.

With local copies we could avoid some of the special cases and trim the
functions to what is required by drm only.
And then no more fbmem dependencies and no dependencies on several of
the small helper functions. So less entanglement with fbdev core.

This all sounds simple so I am surely missing a lot of ugly details here.

And should we touch this anyway we need a test suite to verify not too
much breaks. To the best of my knowledge there is not yet such a test
suite :-( Maybe because people caring about fbdev are limited.

	Sam





> -Daniel
> 
> > ---
> >  drivers/video/fbdev/core/fbmem.c | 53 ++++++++++++++++++++++----------
> >  include/linux/fb.h               |  5 +++
> >  2 files changed, 41 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> > index dd0ccf35f7b7..b496ff90db3e 100644
> > --- a/drivers/video/fbdev/core/fbmem.c
> > +++ b/drivers/video/fbdev/core/fbmem.c
> > @@ -759,25 +759,18 @@ static struct fb_info *file_fb_info(struct file *file)
> >  	return info;
> >  }
> >  
> > -static ssize_t
> > -fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> > +ssize_t fb_cfb_read(struct fb_info *info, char __user *buf, size_t count,
> > +		    loff_t *ppos)
> >  {
> >  	unsigned long p = *ppos;
> > -	struct fb_info *info = file_fb_info(file);
> >  	u8 *buffer, *dst;
> >  	u8 __iomem *src;
> >  	int c, cnt = 0, err = 0;
> >  	unsigned long total_size;
> >  
> > -	if (!info || ! info->screen_base)
> > -		return -ENODEV;
> > -
> >  	if (info->state != FBINFO_STATE_RUNNING)
> >  		return -EPERM;
> >  
> > -	if (info->fbops->fb_read)
> > -		return info->fbops->fb_read(info, buf, count, ppos);
> > -
> >  	total_size = info->screen_size;
> >  
> >  	if (total_size = 0)
> > @@ -823,16 +816,12 @@ fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> >  
> >  	return (err) ? err : cnt;
> >  }
> > +EXPORT_SYMBOL(fb_cfb_read);
> >  
> >  static ssize_t
> > -fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> > +fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> >  {
> > -	unsigned long p = *ppos;
> >  	struct fb_info *info = file_fb_info(file);
> > -	u8 *buffer, *src;
> > -	u8 __iomem *dst;
> > -	int c, cnt = 0, err = 0;
> > -	unsigned long total_size;
> >  
> >  	if (!info || !info->screen_base)
> >  		return -ENODEV;
> > @@ -840,8 +829,20 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> >  	if (info->state != FBINFO_STATE_RUNNING)
> >  		return -EPERM;
> >  
> > -	if (info->fbops->fb_write)
> > -		return info->fbops->fb_write(info, buf, count, ppos);
> > +	if (info->fbops->fb_read)
> > +		return info->fbops->fb_read(info, buf, count, ppos);
> > +	else
> > +		return fb_cfb_read(info, buf, count, ppos);
> > +}
> > +
> > +ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
> > +		     size_t count, loff_t *ppos)
> > +{
> > +	unsigned long p = *ppos;
> > +	u8 *buffer, *src;
> > +	u8 __iomem *dst;
> > +	int c, cnt = 0, err = 0;
> > +	unsigned long total_size;
> >  
> >  	total_size = info->screen_size;
> >  
> > @@ -895,6 +896,24 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> >  
> >  	return (cnt) ? cnt : err;
> >  }
> > +EXPORT_SYMBOL(fb_cfb_write);
> > +
> > +static ssize_t
> > +fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> > +{
> > +	struct fb_info *info = file_fb_info(file);
> > +
> > +	if (!info || !info->screen_base)
> > +		return -ENODEV;
> > +
> > +	if (info->state != FBINFO_STATE_RUNNING)
> > +		return -EPERM;
> > +
> > +	if (info->fbops->fb_write)
> > +		return info->fbops->fb_write(info, buf, count, ppos);
> > +	else
> > +		return fb_cfb_write(info, buf, count, ppos);
> > +}
> >  
> >  int
> >  fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var)
> > diff --git a/include/linux/fb.h b/include/linux/fb.h
> > index 714187bc13ac..12ad83963db5 100644
> > --- a/include/linux/fb.h
> > +++ b/include/linux/fb.h
> > @@ -593,6 +593,11 @@ extern int fb_blank(struct fb_info *info, int blank);
> >  extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect);
> >  extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area);
> >  extern void cfb_imageblit(struct fb_info *info, const struct fb_image *image);
> > +extern ssize_t fb_cfb_read(struct fb_info *info, char __user *buf,
> > +			   size_t count, loff_t *ppos);
> > +extern ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
> > +			    size_t count, loff_t *ppos);
> > +
> >  /*
> >   * Drawing operations where framebuffer is in system RAM
> >   */
> > -- 
> > 2.27.0
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/5] drm: Add infrastructure for vmap operations of I/O memory
  2020-07-29 13:57     ` daniel
@ 2020-07-30  8:14       ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-07-30  8:14 UTC (permalink / raw)
  To: daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, sam, peda, dan.carpenter


Hi

On 29.07.20 at 15:57, daniel@ffwll.ch wrote:
> On Wed, Jul 29, 2020 at 03:41:46PM +0200, Thomas Zimmermann wrote:
>> Most platforms allow for accessing framebuffer I/O memory with regular
>> load and store operations. Some platforms, such as sparc64, require
>> the use of special instructions instead.
>>
>> This patch adds vmap_iomem to struct drm_gem_object_funcs. The new
>> interface drm_client_buffer_vmap_iomem() gives DRM clients access to the
>> I/O memory buffer. The semantics of struct drm_gem_object_funcs.vmap
>> change slightly. It used to return system or I/O memory. Now it is
>> expected to return memory addresses that can be accessed with regular
>> load and store operations. So nothing changes for existing implementations
>> of GEM objects. If the GEM object also implements vmap_iomem, a call
>> to vmap shall only return system memory, even if I/O memory could be
>> accessed with loads and stores.
>>
>> The existing interface drm_client_buffer_vmap() shall only return memory
>> as given by drm_gem_vmap() (i.e., that is accessible via regular load and
>> store). The new interface drm_client_buffer_vmap_iomem() shall only
>> return I/O memory.
>>
>> DRM clients must map buffers by calling drm_client_buffer_vmap_iomem()
>> and drm_client_buffer_vmap() to get the buffer in I/O or system memory.
>> Each function returns NULL if the buffer is in the other memory area.
>> Depending on the type of the returned memory, clients must access the
>> framebuffer with the appropriate operations.
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> 
> Hm I don't think this works, since for more dynamic framebuffers (like
> real big gpu ttm drivers) this is a dynamic thing, which can change every
> time we do an mmap. So I think the ttm approach of having an is_iomem flag
> is a lot better.
> 
> The trouble with that is that you don't have correct checking of sparse
> mappings, but oh well :-/ The one idea I've had to address that is using
> something like this
> 
> typedef struct {
> 	bool is_iomem;
> 	union {
> 		void __iomem *vaddr_iomem;
> 		void *vaddr;
> 	};
> } dma_buf_addr_t;
> 
> And then having a wrapper for memcpy_from_dma_buf_addr and
> memcpy_to_dma_buf_addr, which switches between memcpy and memcpy_from/toio
> depending upon the is_iomem flag.
> 
> But it's a lot more invasive unfortunately :-/

What do you think about introducing read and write callbacks for GEM
objects? Like this:

  int drm_gem_read(struct drm_gem_object *gbo, size_t off, size_t len,
                   void *buf);

  int drm_gem_write(struct drm_gem_object *gbo, size_t off, size_t len,
                    const void *buf);

The common case would be memcpy, but GEM implementations could provide
their own thing. The fbdev blit function would look like

  vaddr = drm_gem_vmap(gbo);
  if (IS_ERR(vaddr))
    return;

  for (each line) {
    drm_gem_write(gbo, gbo_line_offset, line_size, src);
    gbo_line_offset = /* next line */;
    src = /* next line */;
  }

  drm_gem_vunmap(gbo, vaddr);

The whole mess about I/O access would be self-contained.
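
To make that a bit more concrete, here is a rough userspace sketch of the
idea. Everything in it is hypothetical: struct gem_object, the read and
write callbacks, and gem_blit() are made-up stand-ins rather than existing
DRM interfaces, and the mapping is assumed to be set up already. The
default path is a plain memcpy; an implementation that needs I/O accessors
would supply its own callbacks.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct gem_object;

/* Hypothetical per-object callbacks; both are optional. */
struct gem_object_funcs {
	int (*read)(struct gem_object *gbo, size_t off, size_t len, void *buf);
	int (*write)(struct gem_object *gbo, size_t off, size_t len,
		     const void *buf);
};

/* Stand-in for a GEM object with an already established mapping. */
struct gem_object {
	const struct gem_object_funcs *funcs;
	void *vaddr;
	size_t size;
};

/* Write len bytes at offset off; falls back to memcpy without a callback. */
static int gem_write(struct gem_object *gbo, size_t off, size_t len,
		     const void *buf)
{
	if (off > gbo->size || len > gbo->size - off)
		return -EINVAL;
	if (gbo->funcs && gbo->funcs->write)
		return gbo->funcs->write(gbo, off, len, buf);
	memcpy((char *)gbo->vaddr + off, buf, len);
	return 0;
}

/* Read len bytes at offset off; falls back to memcpy without a callback. */
static int gem_read(struct gem_object *gbo, size_t off, size_t len, void *buf)
{
	if (off > gbo->size || len > gbo->size - off)
		return -EINVAL;
	if (gbo->funcs && gbo->funcs->read)
		return gbo->funcs->read(gbo, off, len, buf);
	memcpy(buf, (const char *)gbo->vaddr + off, len);
	return 0;
}

/* fbdev-style blit: copy height lines of line_size bytes each. */
static int gem_blit(struct gem_object *gbo, size_t dst_pitch, const void *src,
		    size_t src_pitch, size_t line_size, size_t height)
{
	const char *line = src;
	size_t off = 0;

	for (size_t y = 0; y < height; y++) {
		int ret = gem_write(gbo, off, line_size, line);

		if (ret)
			return ret;
		off += dst_pitch;   /* next line in the buffer object */
		line += src_pitch;  /* next line in the source */
	}
	return 0;
}
```

The blit code never touches the mapping directly, so it does not need to
know whether the buffer lives in system or I/O memory.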

Best regards
Thomas

> -Daniel
> 
>> ---
>>  drivers/gpu/drm/drm_client.c   | 52 ++++++++++++++++++++++++++++++++--
>>  drivers/gpu/drm/drm_gem.c      | 19 +++++++++++++
>>  drivers/gpu/drm/drm_internal.h |  1 +
>>  include/drm/drm_client.h       |  8 +++++-
>>  include/drm/drm_gem.h          | 17 +++++++++--
>>  5 files changed, 91 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
>> index 495f47d23d87..b5bbe089a41e 100644
>> --- a/drivers/gpu/drm/drm_client.c
>> +++ b/drivers/gpu/drm/drm_client.c
>> @@ -327,6 +327,46 @@ void *drm_client_buffer_vmap(struct drm_client_buffer *buffer)
>>  }
>>  EXPORT_SYMBOL(drm_client_buffer_vmap);
>>  
>> +/**
>> + * drm_client_buffer_vmap_iomem - Map DRM client buffer into address space
>> + * @buffer: DRM client buffer
>> + *
>> + * This function maps a client buffer into kernel address space. If the
>> + * buffer is already mapped, it returns the mapping's address.
>> + *
>> + * Client buffer mappings are not ref'counted. Each call to
>> + * drm_client_buffer_vmap() should be followed by a call to
>> + * drm_client_buffer_vunmap(); or the client buffer should be mapped
>> + * throughout its lifetime.
>> + *
>> + * Returns:
>> + *	The mapped memory's address
>> + */
>> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer)
>> +{
>> +	void __iomem *vaddr_iomem;
>> +
>> +	if (buffer->vaddr_iomem)
>> +		return buffer->vaddr_iomem;
>> +
>> +	/*
>> +	 * FIXME: The dependency on GEM here isn't required, we could
>> +	 * convert the driver handle to a dma-buf instead and use the
>> +	 * backend-agnostic dma-buf vmap support instead. This would
>> +	 * require that the handle2fd prime ioctl is reworked to pull the
>> +	 * fd_install step out of the driver backend hooks, to make that
>> +	 * final step optional for internal users.
>> +	 */
>> +	vaddr_iomem = drm_gem_vmap_iomem(buffer->gem);
>> +	if (IS_ERR(vaddr_iomem))
>> +		return vaddr_iomem;
>> +
>> +	buffer->vaddr_iomem = vaddr_iomem;
>> +
>> +	return vaddr_iomem;
>> +}
>> +EXPORT_SYMBOL(drm_client_buffer_vmap_iomem);
>> +
>>  /**
>>   * drm_client_buffer_vunmap - Unmap DRM client buffer
>>   * @buffer: DRM client buffer
>> @@ -337,8 +377,16 @@ EXPORT_SYMBOL(drm_client_buffer_vmap);
>>   */
>>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
>>  {
>> -	drm_gem_vunmap(buffer->gem, buffer->vaddr);
>> -	buffer->vaddr = NULL;
>> +	drm_WARN_ON(buffer->client->dev, buffer->vaddr && buffer->vaddr_iomem);
>> +
>> +	if (buffer->vaddr) {
>> +		drm_gem_vunmap(buffer->gem, buffer->vaddr);
>> +		buffer->vaddr = NULL;
>> +	}
>> +	if (buffer->vaddr_iomem) {
>> +		drm_gem_vunmap(buffer->gem, (void *)buffer->vaddr_iomem);
>> +		buffer->vaddr_iomem = NULL;
>> +	}
>>  }
>>  EXPORT_SYMBOL(drm_client_buffer_vunmap);
>>  
>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>> index a57f5379fc08..a001be8c0965 100644
>> --- a/drivers/gpu/drm/drm_gem.c
>> +++ b/drivers/gpu/drm/drm_gem.c
>> @@ -1227,6 +1227,25 @@ void *drm_gem_vmap(struct drm_gem_object *obj)
>>  		vaddr = obj->funcs->vmap(obj);
>>  	else if (obj->dev->driver->gem_prime_vmap)
>>  		vaddr = obj->dev->driver->gem_prime_vmap(obj);
>> +	else if (obj->funcs && obj->funcs->vmap_iomem)
>> +		vaddr = NULL; /* requires mapping as I/O memory */
>> +	else
>> +		vaddr = ERR_PTR(-EOPNOTSUPP);
>> +
>> +	if (!vaddr)
>> +		vaddr = ERR_PTR(-ENOMEM);
>> +
>> +	return vaddr;
>> +}
>> +
>> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj)
>> +{
>> +	void __iomem *vaddr;
>> +
>> +	if (obj->funcs && obj->funcs->vmap_iomem)
>> +		vaddr = obj->funcs->vmap_iomem(obj);
>> +	else if (obj->funcs && obj->funcs->vmap)
>> +		vaddr = NULL; /* requires mapping as system memory */
>>  	else
>>  		vaddr = ERR_PTR(-EOPNOTSUPP);
>>  
>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>> index 8e01caaf95cc..aa1a3d4f9223 100644
>> --- a/drivers/gpu/drm/drm_internal.h
>> +++ b/drivers/gpu/drm/drm_internal.h
>> @@ -187,6 +187,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
>>  int drm_gem_pin(struct drm_gem_object *obj);
>>  void drm_gem_unpin(struct drm_gem_object *obj);
>>  void *drm_gem_vmap(struct drm_gem_object *obj);
>> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj);
>>  void drm_gem_vunmap(struct drm_gem_object *obj, void *vaddr);
>>  
>>  /* drm_debugfs.c drm_debugfs_crc.c */
>> diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
>> index 7aaea665bfc2..94aa075ee4b6 100644
>> --- a/include/drm/drm_client.h
>> +++ b/include/drm/drm_client.h
>> @@ -141,10 +141,15 @@ struct drm_client_buffer {
>>  	struct drm_gem_object *gem;
>>  
>>  	/**
>> -	 * @vaddr: Virtual address for the buffer
>> +	 * @vaddr: Virtual address for the buffer in system memory
>>  	 */
>>  	void *vaddr;
>>  
>> +	/**
>> +	 * @vaddr: Virtual address for the buffer in I/O memory
>> +	 */
>> +	void *vaddr_iomem;
>> +
>>  	/**
>>  	 * @fb: DRM framebuffer
>>  	 */
>> @@ -156,6 +161,7 @@ drm_client_framebuffer_create(struct drm_client_dev *client, u32 width, u32 heig
>>  void drm_client_framebuffer_delete(struct drm_client_buffer *buffer);
>>  int drm_client_framebuffer_flush(struct drm_client_buffer *buffer, struct drm_rect *rect);
>>  void *drm_client_buffer_vmap(struct drm_client_buffer *buffer);
>> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer);
>>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer);
>>  
>>  int drm_client_modeset_create(struct drm_client_dev *client);
>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
>> index 337a48321705..bc735ff522a8 100644
>> --- a/include/drm/drm_gem.h
>> +++ b/include/drm/drm_gem.h
>> @@ -134,17 +134,28 @@ struct drm_gem_object_funcs {
>>  	 * @vmap:
>>  	 *
>>  	 * Returns a virtual address for the buffer. Used by the
>> -	 * drm_gem_dmabuf_vmap() helper.
>> +	 * drm_gem_dmabuf_vmap() helper. If the buffer is not
>> +	 * located in system memory, the function returns NULL.
>>  	 *
>>  	 * This callback is optional.
>>  	 */
>>  	void *(*vmap)(struct drm_gem_object *obj);
>>  
>> +	/**
>> +	 * @vmap_iomem:
>> +	 *
>> +	 * Returns a virtual address for the buffer. If the buffer is not
>> +	 * located in I/O memory, the function returns NULL.
>> +	 *
>> +	 * This callback is optional.
>> +	 */
>> +	void __iomem *(*vmap_iomem)(struct drm_gem_object *obj);
>> +
>>  	/**
>>  	 * @vunmap:
>>  	 *
>> -	 * Releases the address previously returned by @vmap. Used by the
>> -	 * drm_gem_dmabuf_vunmap() helper.
>> +	 * Releases the address previously returned by @vmap or @vmap_iomem.
>> +	 * Used by the drm_gem_dmabuf_vunmap() helper.
>>  	 *
>>  	 * This callback is optional.
>>  	 */
>> -- 
>> 2.27.0
>>
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 516 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-07-29 16:36       ` Sam Ravnborg
@ 2020-07-31  9:20         ` daniel
  -1 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-07-31  9:20 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, peda, dan.carpenter

On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
> Hi Daniel.
> 
> On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
> > On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
> > > DRM fb helpers require read and write functions for framebuffer
> > > memory. Export the existing code from fbdev.
> > > 
> > > Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> > 
> > Hm I'm not super sure whether we want to actually reuse this stuff ... We
> > kinda don't care about the sparc special case, and just having an fbdev
> > implementation which has the switch between memcpy and memcpy_to/from_io
> > in one single place sounds a lot simpler ...
> > 
> > This way we can have a clean split between the old horrors of real fbdev
> > drivers, and a much cleaner world in drm. It would mean a bit of
> > copypasting, but I think that's actually a good thing.
> > 
> > In general my idea for drm fbdev emulation is that for any area we have a
> > problem we just ignore the entire fbmem.c code and write our own: mmap,
> > backlight handling (still unsolved, and horrible), cfb vs sys here. This
> > entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
> > duplication here doesn't seem worth it imo.
> > 
> > Thoughts?
> 
> 
> I can see that fbmem is a mix of ioctl support and other stuff.
> We could factor out all the ioctl parts of fbmem.c to a new file
> named fbioctl.c.
> 
> And then let the ioctl parts call down into drm stuff and avoid reusing
> the fbdev code when we first reach drm code.
> This would require local copies of:
> sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
> and more I think which I missed.
> 
> With local copies we could avoid some of the special cases and trim the
> functions to what is required by drm only.
> And then no more fbmem dependencies and no dependencies to several of
> the small helper functions. So less entanglement with fbdev core.
> 
> This all sounds simple so I am surely missing a lot of ugly details here.
> 
> And should we touch this anyway we need a test suite to verify not too
> much breaks. To the best of my knowledge there is not yet such a test
> suite :-( Maybe because people caring about fbdev are limited.

Well my idea was to not refactor anything, but just have drm copies of the
various fb_ops callbacks. Definitely not even more refactoring :-)
-Daniel
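
As a rough illustration of keeping the switch between memcpy and the I/O
accessors in one place, here is a userspace sketch of a DRM-local write
helper. Everything in it is an assumption made for the example: struct
fbdev_emul, fbdev_emul_write() and mock_memcpy_toio() are hypothetical names,
the mock merely stands in for the kernel's memcpy_toio(), and the error
handling of the real fb_write() path is simplified to clamping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's memcpy_toio(); copies byte by byte. */
void mock_memcpy_toio(volatile void *dst, const void *src, size_t len)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/* Hypothetical framebuffer description; not struct fb_info. */
struct fbdev_emul {
	void *screen;		/* mapped framebuffer */
	size_t screen_size;	/* size of the mapping in bytes */
	bool is_iomem;		/* true if screen needs I/O accessors */
};

/*
 * Write up to count bytes from buf at *ppos, clamped to the framebuffer
 * size. Returns the number of bytes written and advances *ppos.
 */
size_t fbdev_emul_write(struct fbdev_emul *fb, const void *buf,
			size_t count, size_t *ppos)
{
	size_t pos;
	unsigned char *dst;

	assert(fb && buf && ppos);

	pos = *ppos;
	if (pos >= fb->screen_size)
		return 0;
	if (count > fb->screen_size - pos)
		count = fb->screen_size - pos;

	dst = (unsigned char *)fb->screen + pos;
	if (fb->is_iomem)
		mock_memcpy_toio(dst, buf, count);	/* I/O memory path */
	else
		memcpy(dst, buf, count);		/* system memory path */

	*ppos = pos + count;
	return count;
}
```

The only memory-type decision sits in the single if/else near the end, so the
rest of the helper is identical for both kinds of framebuffer.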

> 
> 	Sam
> 
> 
> 
> 
> 
> > -Daniel
> > 
> > > ---
> > >  drivers/video/fbdev/core/fbmem.c | 53 ++++++++++++++++++++++----------
> > >  include/linux/fb.h               |  5 +++
> > >  2 files changed, 41 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> > > index dd0ccf35f7b7..b496ff90db3e 100644
> > > --- a/drivers/video/fbdev/core/fbmem.c
> > > +++ b/drivers/video/fbdev/core/fbmem.c
> > > @@ -759,25 +759,18 @@ static struct fb_info *file_fb_info(struct file *file)
> > >  	return info;
> > >  }
> > >  
> > > -static ssize_t
> > > -fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> > > +ssize_t fb_cfb_read(struct fb_info *info, char __user *buf, size_t count,
> > > +		    loff_t *ppos)
> > >  {
> > >  	unsigned long p = *ppos;
> > > -	struct fb_info *info = file_fb_info(file);
> > >  	u8 *buffer, *dst;
> > >  	u8 __iomem *src;
> > >  	int c, cnt = 0, err = 0;
> > >  	unsigned long total_size;
> > >  
> > > -	if (!info || ! info->screen_base)
> > > -		return -ENODEV;
> > > -
> > >  	if (info->state != FBINFO_STATE_RUNNING)
> > >  		return -EPERM;
> > >  
> > > -	if (info->fbops->fb_read)
> > > -		return info->fbops->fb_read(info, buf, count, ppos);
> > > -
> > >  	total_size = info->screen_size;
> > >  
> > >  	if (total_size == 0)
> > > @@ -823,16 +816,12 @@ fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> > >  
> > >  	return (err) ? err : cnt;
> > >  }
> > > +EXPORT_SYMBOL(fb_cfb_read);
> > >  
> > >  static ssize_t
> > > -fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> > > +fb_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> > >  {
> > > -	unsigned long p = *ppos;
> > >  	struct fb_info *info = file_fb_info(file);
> > > -	u8 *buffer, *src;
> > > -	u8 __iomem *dst;
> > > -	int c, cnt = 0, err = 0;
> > > -	unsigned long total_size;
> > >  
> > >  	if (!info || !info->screen_base)
> > >  		return -ENODEV;
> > > @@ -840,8 +829,20 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> > >  	if (info->state != FBINFO_STATE_RUNNING)
> > >  		return -EPERM;
> > >  
> > > -	if (info->fbops->fb_write)
> > > -		return info->fbops->fb_write(info, buf, count, ppos);
> > > +	if (info->fbops->fb_read)
> > > +		return info->fbops->fb_read(info, buf, count, ppos);
> > > +	else
> > > +		return fb_cfb_read(info, buf, count, ppos);
> > > +}
> > > +
> > > +ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
> > > +		     size_t count, loff_t *ppos)
> > > +{
> > > +	unsigned long p = *ppos;
> > > +	u8 *buffer, *src;
> > > +	u8 __iomem *dst;
> > > +	int c, cnt = 0, err = 0;
> > > +	unsigned long total_size;
> > >  
> > >  	total_size = info->screen_size;
> > >  
> > > @@ -895,6 +896,24 @@ fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> > >  
> > >  	return (cnt) ? cnt : err;
> > >  }
> > > +EXPORT_SYMBOL(fb_cfb_write);
> > > +
> > > +static ssize_t
> > > +fb_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
> > > +{
> > > +	struct fb_info *info = file_fb_info(file);
> > > +
> > > +	if (!info || !info->screen_base)
> > > +		return -ENODEV;
> > > +
> > > +	if (info->state != FBINFO_STATE_RUNNING)
> > > +		return -EPERM;
> > > +
> > > +	if (info->fbops->fb_write)
> > > +		return info->fbops->fb_write(info, buf, count, ppos);
> > > +	else
> > > +		return fb_cfb_write(info, buf, count, ppos);
> > > +}
> > >  
> > >  int
> > >  fb_pan_display(struct fb_info *info, struct fb_var_screeninfo *var)
> > > diff --git a/include/linux/fb.h b/include/linux/fb.h
> > > index 714187bc13ac..12ad83963db5 100644
> > > --- a/include/linux/fb.h
> > > +++ b/include/linux/fb.h
> > > @@ -593,6 +593,11 @@ extern int fb_blank(struct fb_info *info, int blank);
> > >  extern void cfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect);
> > >  extern void cfb_copyarea(struct fb_info *info, const struct fb_copyarea *area);
> > >  extern void cfb_imageblit(struct fb_info *info, const struct fb_image *image);
> > > +extern ssize_t fb_cfb_read(struct fb_info *info, char __user *buf,
> > > +			   size_t count, loff_t *ppos);
> > > +extern ssize_t fb_cfb_write(struct fb_info *info, const char __user *buf,
> > > +			    size_t count, loff_t *ppos);
> > > +
> > >  /*
> > >   * Drawing operations where framebuffer is in system RAM
> > >   */
> > > -- 
> > > 2.27.0
> > > 
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/5] drm: Add infrastructure for vmap operations of I/O memory
  2020-07-30  8:14       ` Thomas Zimmermann
@ 2020-07-31  9:22         ` daniel
  -1 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-07-31  9:22 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, sam, peda, dan.carpenter

On Thu, Jul 30, 2020 at 10:14:43AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 29.07.20 um 15:57 schrieb daniel@ffwll.ch:
> > On Wed, Jul 29, 2020 at 03:41:46PM +0200, Thomas Zimmermann wrote:
> >> Most platforms allow for accessing framebuffer I/O memory with regular
> >> load and store operations. Some platforms, such as sparc64, require
> >> the use of special instructions instead.
> >>
> >> This patch adds vmap_iomem to struct drm_gem_object_funcs. The new
> >> interface drm_client_buffer_vmap_iomem() gives DRM clients access to the
> > >> I/O memory buffer. The semantics of struct drm_gem_object_funcs.vmap
> >> change slightly. It used to return system or I/O memory. Now it is
> >> expected to return memory addresses that can be accessed with regular
> >> load and store operations. So nothing changes for existing implementations
> >> of GEM objects. If the GEM object also implements vmap_iomem, a call
> >> to vmap shall only return system memory, even if I/O memory could be
> >> accessed with loads and stores.
> >>
> >> The existing interface drm_client_buffer_vmap() shall only return memory
> > >> as given by drm_gem_vmap() (i.e., that is accessible via regular load and
> >> store). The new interface drm_client_buffer_vmap_iomem() shall only
> >> return I/O memory.
> >>
> >> DRM clients must map buffers by calling drm_client_buffer_vmap_iomem()
> >> and drm_client_buffer_vmap() to get the buffer in I/O or system memory.
> >> Each function returns NULL if the buffer is in the other memory area.
> >> Depending on the type of the returned memory, clients must access the
> >> framebuffer with the appropriate operations.
> >>
> >> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> > 
> > Hm I don't think this works, since for more dynamic framebuffers (like
> > real big gpu ttm drivers) this is a dynamic thing, which can change every
> > time we do an mmap. So I think the ttm approach of having an is_iomem flag
> > is a lot better.
> > 
> > The trouble with that is that you don't have correct checking of sparse
> > mappings, but oh well :-/ The one idea I've had to address that is using
> > something like this
> > 
> > > typedef struct {
> > > 	bool is_iomem;
> > > 	union {
> > > 		void __iomem *vaddr_iomem;
> > > 		void *vaddr;
> > > 	};
> > > } dma_buf_addr_t;
> > 
> > And then having a wrapper for memcpy_from_dma_buf_addr and
> > memcpy_to_dma_buf_addr, which switches between memcpy and memcpy_from/toio
> > depending upon the is_iomem flag.
> > 
> > But it's a lot more invasive unfortunately :-/
> 
> What do you think about introducing read and write callbacks for GEM
> objects? Like this:
> 
>   int drm_gem_read(struct drm_gem_object *gbo, size_t off, size_t len,
> void *buf);
> 
>   int drm_gem_write(struct drm_gem_object *gbo, size_t off, size_t len,
> const void *buf);
> 
> The common case would be memcpy, but GEM implementations could provide
> their own thing. The fbdev blit function would look like
> 
>   vaddr = drm_gem_vmap(gbo)
>   if (IS_ERR(vaddr))
>     return
> 
>   for (each line) {
>     drm_gem_write(gbo, gbo_line_offset, line_size, src)
>     gbo_line_offset = /* next line */
>     src = /* next line */
>   }
> 
>   drm_gem_vunmap(gbo);
> 
> The whole mess about I/O access would be self-contained.

Copying the IRC discussion over: We had that idea floating around years
ago; i915-gem even implemented it in the form of pwrite/pread for userspace.
But all userspace has since moved over to mmap, so read/write has fallen
out of favour.

I'm also not sure whether we really need to fix more than just fbcon on
the fbdev-on-drm emulation, and it feels a bit silly to add read/write just
for that. Also, the is_iomem flag on the vmap (and maybe eventually on
mmap, no idea) might let us fix this for real eventually.

Cheers, Daniel

> 
> Best regards
> Thomas
> 
> > -Daniel
> > 
> >> ---
> >>  drivers/gpu/drm/drm_client.c   | 52 ++++++++++++++++++++++++++++++++--
> >>  drivers/gpu/drm/drm_gem.c      | 19 +++++++++++++
> >>  drivers/gpu/drm/drm_internal.h |  1 +
> >>  include/drm/drm_client.h       |  8 +++++-
> >>  include/drm/drm_gem.h          | 17 +++++++++--
> >>  5 files changed, 91 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
> >> index 495f47d23d87..b5bbe089a41e 100644
> >> --- a/drivers/gpu/drm/drm_client.c
> >> +++ b/drivers/gpu/drm/drm_client.c
> >> @@ -327,6 +327,46 @@ void *drm_client_buffer_vmap(struct drm_client_buffer *buffer)
> >>  }
> >>  EXPORT_SYMBOL(drm_client_buffer_vmap);
> >>  
> >> +/**
> >> + * drm_client_buffer_vmap_iomem - Map DRM client buffer into address space
> >> + * @buffer: DRM client buffer
> >> + *
> >> + * This function maps a client buffer into kernel address space. If the
> >> + * buffer is already mapped, it returns the mapping's address.
> >> + *
> >> + * Client buffer mappings are not ref'counted. Each call to
> > >> + * drm_client_buffer_vmap_iomem() should be followed by a call to
> >> + * drm_client_buffer_vunmap(); or the client buffer should be mapped
> >> + * throughout its lifetime.
> >> + *
> >> + * Returns:
> >> + *	The mapped memory's address
> >> + */
> >> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer)
> >> +{
> >> +	void __iomem *vaddr_iomem;
> >> +
> >> +	if (buffer->vaddr_iomem)
> >> +		return buffer->vaddr_iomem;
> >> +
> >> +	/*
> >> +	 * FIXME: The dependency on GEM here isn't required, we could
> >> +	 * convert the driver handle to a dma-buf instead and use the
> >> +	 * backend-agnostic dma-buf vmap support instead. This would
> >> +	 * require that the handle2fd prime ioctl is reworked to pull the
> >> +	 * fd_install step out of the driver backend hooks, to make that
> >> +	 * final step optional for internal users.
> >> +	 */
> >> +	vaddr_iomem = drm_gem_vmap_iomem(buffer->gem);
> >> +	if (IS_ERR(vaddr_iomem))
> >> +		return vaddr_iomem;
> >> +
> >> +	buffer->vaddr_iomem = vaddr_iomem;
> >> +
> >> +	return vaddr_iomem;
> >> +}
> >> +EXPORT_SYMBOL(drm_client_buffer_vmap_iomem);
> >> +
> >>  /**
> >>   * drm_client_buffer_vunmap - Unmap DRM client buffer
> >>   * @buffer: DRM client buffer
> >> @@ -337,8 +377,16 @@ EXPORT_SYMBOL(drm_client_buffer_vmap);
> >>   */
> >>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
> >>  {
> >> -	drm_gem_vunmap(buffer->gem, buffer->vaddr);
> >> -	buffer->vaddr = NULL;
> >> +	drm_WARN_ON(buffer->client->dev, buffer->vaddr && buffer->vaddr_iomem);
> >> +
> >> +	if (buffer->vaddr) {
> >> +		drm_gem_vunmap(buffer->gem, buffer->vaddr);
> >> +		buffer->vaddr = NULL;
> >> +	}
> >> +	if (buffer->vaddr_iomem) {
> >> +		drm_gem_vunmap(buffer->gem, (void *)buffer->vaddr_iomem);
> >> +		buffer->vaddr_iomem = NULL;
> >> +	}
> >>  }
> >>  EXPORT_SYMBOL(drm_client_buffer_vunmap);
> >>  
> >> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> >> index a57f5379fc08..a001be8c0965 100644
> >> --- a/drivers/gpu/drm/drm_gem.c
> >> +++ b/drivers/gpu/drm/drm_gem.c
> >> @@ -1227,6 +1227,25 @@ void *drm_gem_vmap(struct drm_gem_object *obj)
> >>  		vaddr = obj->funcs->vmap(obj);
> >>  	else if (obj->dev->driver->gem_prime_vmap)
> >>  		vaddr = obj->dev->driver->gem_prime_vmap(obj);
> >> +	else if (obj->funcs && obj->funcs->vmap_iomem)
> >> +		vaddr = NULL; /* requires mapping as I/O memory */
> >> +	else
> >> +		vaddr = ERR_PTR(-EOPNOTSUPP);
> >> +
> >> +	if (!vaddr)
> >> +		vaddr = ERR_PTR(-ENOMEM);
> >> +
> >> +	return vaddr;
> >> +}
> >> +
> >> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj)
> >> +{
> >> +	void __iomem *vaddr;
> >> +
> >> +	if (obj->funcs && obj->funcs->vmap_iomem)
> >> +		vaddr = obj->funcs->vmap_iomem(obj);
> >> +	else if (obj->funcs && obj->funcs->vmap)
> >> +		vaddr = NULL; /* requires mapping as system memory */
> >>  	else
> >>  		vaddr = ERR_PTR(-EOPNOTSUPP);
> >>  
> >> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> >> index 8e01caaf95cc..aa1a3d4f9223 100644
> >> --- a/drivers/gpu/drm/drm_internal.h
> >> +++ b/drivers/gpu/drm/drm_internal.h
> >> @@ -187,6 +187,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
> >>  int drm_gem_pin(struct drm_gem_object *obj);
> >>  void drm_gem_unpin(struct drm_gem_object *obj);
> >>  void *drm_gem_vmap(struct drm_gem_object *obj);
> >> +void __iomem *drm_gem_vmap_iomem(struct drm_gem_object *obj);
> >>  void drm_gem_vunmap(struct drm_gem_object *obj, void *vaddr);
> >>  
> >>  /* drm_debugfs.c drm_debugfs_crc.c */
> >> diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
> >> index 7aaea665bfc2..94aa075ee4b6 100644
> >> --- a/include/drm/drm_client.h
> >> +++ b/include/drm/drm_client.h
> >> @@ -141,10 +141,15 @@ struct drm_client_buffer {
> >>  	struct drm_gem_object *gem;
> >>  
> >>  	/**
> >> -	 * @vaddr: Virtual address for the buffer
> >> +	 * @vaddr: Virtual address for the buffer in system memory
> >>  	 */
> >>  	void *vaddr;
> >>  
> >> +	/**
> >> +	 * @vaddr: Virtual address for the buffer in I/O memory
> >> +	 */
> >> +	void *vaddr_iomem;
> >> +
> >>  	/**
> >>  	 * @fb: DRM framebuffer
> >>  	 */
> >> @@ -156,6 +161,7 @@ drm_client_framebuffer_create(struct drm_client_dev *client, u32 width, u32 heig
> >>  void drm_client_framebuffer_delete(struct drm_client_buffer *buffer);
> >>  int drm_client_framebuffer_flush(struct drm_client_buffer *buffer, struct drm_rect *rect);
> >>  void *drm_client_buffer_vmap(struct drm_client_buffer *buffer);
> >> +void __iomem *drm_client_buffer_vmap_iomem(struct drm_client_buffer *buffer);
> >>  void drm_client_buffer_vunmap(struct drm_client_buffer *buffer);
> >>  
> >>  int drm_client_modeset_create(struct drm_client_dev *client);
> >> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> >> index 337a48321705..bc735ff522a8 100644
> >> --- a/include/drm/drm_gem.h
> >> +++ b/include/drm/drm_gem.h
> >> @@ -134,17 +134,28 @@ struct drm_gem_object_funcs {
> >>  	 * @vmap:
> >>  	 *
> >>  	 * Returns a virtual address for the buffer. Used by the
> >> -	 * drm_gem_dmabuf_vmap() helper.
> >> +	 * drm_gem_dmabuf_vmap() helper. If the buffer is not
> >> +	 * located in system memory, the function returns NULL.
> >>  	 *
> >>  	 * This callback is optional.
> >>  	 */
> >>  	void *(*vmap)(struct drm_gem_object *obj);
> >>  
> >> +	/**
> >> +	 * @vmap_iomem:
> >> +	 *
> >> +	 * Returns a virtual address for the buffer. If the buffer is not
> >> +	 * located in I/O memory, the function returns NULL.
> >> +	 *
> >> +	 * This callback is optional.
> >> +	 */
> >> +	void __iomem *(*vmap_iomem)(struct drm_gem_object *obj);
> >> +
> >>  	/**
> >>  	 * @vunmap:
> >>  	 *
> >> -	 * Releases the address previously returned by @vmap. Used by the
> >> -	 * drm_gem_dmabuf_vunmap() helper.
> >> +	 * Releases the address previously returned by @vmap or @vmap_iomem.
> >> +	 * Used by the drm_gem_dmabuf_vunmap() helper.
> >>  	 *
> >>  	 * This callback is optional.
> >>  	 */
> >> -- 
> >> 2.27.0
> >>
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
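
The is_iomem idea discussed in this message can be sketched in plain C.
This is a minimal userspace illustration under stated assumptions, not
code from the patch set: struct dma_buf_addr, the two
memcpy_*_dma_buf_addr() wrappers and the demo_memcpy_toio() /
demo_memcpy_fromio() stand-ins are hypothetical names. In the kernel, the
I/O path would call memcpy_toio()/memcpy_fromio() from <linux/io.h>, and
__iomem would be the sparse annotation rather than an empty macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Outside the kernel there is no sparse; make the annotation a no-op. */
#ifndef __iomem
#define __iomem
#endif

/* Demo-only counter: how often the I/O-specific path was taken. */
static unsigned int io_copies;

/* Hypothetical stand-in for the kernel's memcpy_toio(). */
static void demo_memcpy_toio(volatile void __iomem *dst, const void *src,
			     size_t len)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	for (i = 0; i < len; i++)
		d[i] = s[i];
	io_copies++;
}

/* Hypothetical stand-in for the kernel's memcpy_fromio(). */
static void demo_memcpy_fromio(void *dst, const volatile void __iomem *src,
			       size_t len)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;
	size_t i;

	for (i = 0; i < len; i++)
		d[i] = s[i];
	io_copies++;
}

/* An address tagged with the kind of memory it points to. */
struct dma_buf_addr {
	bool is_iomem;
	union {
		void __iomem *vaddr_iomem;	/* valid if is_iomem is set */
		void *vaddr;			/* valid otherwise */
	};
};

/* Copy into the buffer; the flag selects the access method in one place. */
static void memcpy_to_dma_buf_addr(const struct dma_buf_addr *dst, size_t off,
				   const void *src, size_t len)
{
	assert(dst && src);
	if (dst->is_iomem)
		demo_memcpy_toio((unsigned char __iomem *)dst->vaddr_iomem + off,
				 src, len);
	else
		memcpy((unsigned char *)dst->vaddr + off, src, len);
}

/* Copy out of the buffer, again dispatching on the flag. */
static void memcpy_from_dma_buf_addr(void *dst, const struct dma_buf_addr *src,
				     size_t off, size_t len)
{
	assert(dst && src);
	if (src->is_iomem)
		demo_memcpy_fromio(dst,
				   (unsigned char __iomem *)src->vaddr_iomem + off,
				   len);
	else
		memcpy(dst, (unsigned char *)src->vaddr + off, len);
}
```

With this shape, a caller such as an fbdev blit loop would only ever see
the tagged address and the two wrappers; because the flag travels with the
address, it can differ from one mapping to the next.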

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-07-31  9:20         ` daniel
@ 2020-08-02 20:01           ` Sam Ravnborg
  -1 siblings, 0 replies; 34+ messages in thread
From: Sam Ravnborg @ 2020-08-02 20:01 UTC (permalink / raw)
  To: daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, peda, dan.carpenter

On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
> > Hi Daniel.
> > 
> > On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
> > > On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
> > > > DRM fb helpers require read and write functions for framebuffer
> > > > memory. Export the existing code from fbdev.
> > > > 
> > > > Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> > > 
> > > Hm I'm not super sure whether we want to actually reuse this stuff ... We
> > > kinda don't care about the sparc special case, and just having an fbdev
> > > > implementation which has the switch between memcpy and memcpy_to/from_io
> > > in one single place sounds a lot simpler ...
> > > 
> > > This way we can have a clean split between the old horrors of real fbdev
> > > drivers, and a much cleaner world in drm. It would mean a bit of
> > > copypasting, but I think that's actually a good thing.
> > > 
> > > In general my idea for drm fbdev emulation is that for any area we have a
> > > problem we just ignore the entire fbmem.c code and write our own: mmap,
> > > backlight handling (still unsolved, and horrible), cfb vs sys here. This
> > > entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
> > > duplication here doesn't seem worth it imo.
> > > 
> > > Thoughts?
> > 
> > 
> > I can see that fbmem is a mix of ioctl support and other stuff.
> > We could factor out all the ioctl parts of fbmem.c to a new file
> > named fbioctl.c.
> > 
> > And then let the ioctl parts call down into drm stuff and avoid reusing
> > the fbdev code when we first reach drm code.
> > This would require local copies of:
> > sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
> > and more I think which I missed.
> > 
> > With local copies we could avoid some of the special cases and trim the
> > functions to what is required by drm only.
> > And then no more fbmem dependencies and no dependencies to several of
> > the small helper functions. So less entanglement with fbdev core.
> > 
> > This all sounds simple so I am surely missing a lot of ugly details here.
> > 
> > And should we touch this anyway we need a test suite to verify not too
> > much breaks. To the best of my knowledge there is not yet such a test
> > suite :-( Maybe because people caring about fbdev are limited.
> 
> Well my idea was to not refactor anything, but just have drm copies of the
> various fb_ops callbacks. Definitely not even more refactoring :-)

$ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
1212 drivers/gpu/drm/drm_fb_fbdev_helper.c

This is a straight copy of:
$ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
EXPORT_SYMBOL(drm_copyarea);
EXPORT_SYMBOL(drm_fillrect);
EXPORT_SYMBOL(drm_imageblit);
EXPORT_SYMBOL_GPL(drm_fb_read);
EXPORT_SYMBOL_GPL(drm_fb_write);

copyarea, fillrect and imageblit are a copy of the sys variants.
read and write are from fb_sys_fops.c

Patch builds but I am not a big fan yet - right now it gains us very
little. What would the next step be?

I looked briefly at the few drivers that have not migrated to
the generic fbdev emulation (yet?).
  msm
  omapdrm
  gma500

all use the sys_ variants - they already use the migrated variants, as
they call the drm_fb helper.


  exynos
  gma500
  i915
  nouveau
  radeon
  rockchip
  vmwgfx

all use the cfb variants and cannot use the migrated variants as-is.

	Sam

From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
From: Sam Ravnborg <sam@ravnborg.org>
Date: Fri, 31 Jul 2020 23:36:01 +0200
Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm

Copy the fbdev functions used by fb_ops to drm.
Having local copies makes it possible to adjust the functions
so they are tailored to the needs of DRM.
At the same time there is one less dependency between DRM and fbdev.

The files were copied verbatim and only minimal changes were made.

As some drivers use the fbdev functions, they continue to be built, but
the generic fbdev emulation no longer uses the fbdev functions.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/Kconfig               |    1 -
 drivers/gpu/drm/Makefile              |    1 +
 drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
 drivers/gpu/drm/drm_fb_helper.c       |   30 +-
 include/drm/drm_fb_helper.h           |    6 +
 5 files changed, 1234 insertions(+), 16 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index c4fd57d8b717..11f41bc31f05 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
 	select FB
 	select FRAMEBUFFER_CONSOLE if !EXPERT
 	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
-	select FB_SYS_FOPS
 	select FB_SYS_FILLRECT
 	select FB_SYS_COPYAREA
 	select FB_SYS_IMAGEBLIT
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 02ee5faf1a92..398cb4746762 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
 
 drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
 drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
+drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
 drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
 drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
 drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
new file mode 100644
index 000000000000..dda60f85b9d5
--- /dev/null
+++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
@@ -0,0 +1,1212 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file COPYING in the main directory of this archive for
+ * more details.
+ *
+ * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
+ *
+ * drm_copyarea:
+ *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
+ *   on Geert Uytterhoeven's copyarea routine)
+ *
+ *  drm_fillrect:
+ *    Based almost entirely from cfbfillrect.c (which is based almost entirely
+ *    on Geert Uytterhoeven's fillrect routine)
+ *
+ *  drm_imageblit:
+ *    Based almost entirely on cfbimgblt.c
+ *
+ */
+
+#include <linux/export.h>
+#include <linux/fb.h>
+#include <linux/io.h>
+
+
+/*
+*  Compose two values, using a bitmask as decision value
+*  This is equivalent to (a & mask) | (b & ~mask)
+*/
+static inline unsigned long
+comp(unsigned long a, unsigned long b, unsigned long mask)
+{
+    return ((a ^ b) & mask) ^ b;
+}
+
+/*
+ *  Create a pattern with the given pixel's color
+ */
+#if BITS_PER_LONG == 64
+static inline unsigned long
+pixel_to_pat( u32 bpp, u32 pixel)
+{
+	switch (bpp) {
+	case 1:
+		return 0xfffffffffffffffful*pixel;
+	case 2:
+		return 0x5555555555555555ul*pixel;
+	case 4:
+		return 0x1111111111111111ul*pixel;
+	case 8:
+		return 0x0101010101010101ul*pixel;
+	case 12:
+		return 0x1001001001001001ul*pixel;
+	case 16:
+		return 0x0001000100010001ul*pixel;
+	case 24:
+		return 0x0001000001000001ul*pixel;
+	case 32:
+		return 0x0000000100000001ul*pixel;
+	default:
+		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
+		return 0;
+    }
+}
+#else
+static inline unsigned long
+pixel_to_pat( u32 bpp, u32 pixel)
+{
+	switch (bpp) {
+	case 1:
+		return 0xfffffffful*pixel;
+	case 2:
+		return 0x55555555ul*pixel;
+	case 4:
+		return 0x11111111ul*pixel;
+	case 8:
+		return 0x01010101ul*pixel;
+	case 12:
+		return 0x01001001ul*pixel;
+	case 16:
+		return 0x00010001ul*pixel;
+	case 24:
+		return 0x01000001ul*pixel;
+	case 32:
+		return 0x00000001ul*pixel;
+	default:
+		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
+		return 0;
+    }
+}
+#endif
+
+#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
+#if BITS_PER_LONG == 64
+#define REV_PIXELS_MASK1 0x5555555555555555ul
+#define REV_PIXELS_MASK2 0x3333333333333333ul
+#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
+#else
+#define REV_PIXELS_MASK1 0x55555555ul
+#define REV_PIXELS_MASK2 0x33333333ul
+#define REV_PIXELS_MASK4 0x0f0f0f0ful
+#endif
+
+static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
+						  u32 bswapmask)
+{
+	if (bswapmask & 1)
+		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
+	if (bswapmask & 2)
+		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
+	if (bswapmask & 3)
+		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
+	return val;
+}
+
+static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
+					     u32 bswapmask)
+{
+	u32 mask;
+
+	if (!bswapmask) {
+		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
+	} else {
+		mask = 0xff << FB_LEFT_POS(p, 8);
+		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
+		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
+#if defined(__i386__) || defined(__x86_64__)
+		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
+		if(index + bswapmask < 32)
+#endif
+			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
+					(index + bswapmask) & ~(bswapmask));
+	}
+	return mask;
+}
+
+static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
+							u32 index,
+							u32 bswapmask)
+{
+	unsigned long mask;
+
+	if (!bswapmask) {
+		mask = FB_SHIFT_HIGH(p, ~0UL, index);
+	} else {
+		mask = 0xff << FB_LEFT_POS(p, 8);
+		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
+		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
+#if defined(__i386__) || defined(__x86_64__)
+		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
+		if(index + bswapmask < BITS_PER_LONG)
+#endif
+			mask |= FB_SHIFT_HIGH(p, ~0UL,
+					(index + bswapmask) & ~(bswapmask));
+	}
+	return mask;
+}
+
+
+static inline u32 fb_compute_bswapmask(struct fb_info *info)
+{
+	u32 bswapmask = 0;
+	unsigned bpp = info->var.bits_per_pixel;
+
+	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
+		/*
+		 * Reversed order of pixel layout in bytes
+		 * works only for 1, 2 and 4 bpp
+		 */
+		bswapmask = 7 - bpp + 1;
+	}
+	return bswapmask;
+}
+
+#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
+
+static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
+						  u32 bswapmask)
+{
+	return val;
+}
+
+#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
+#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
+#define fb_compute_bswapmask(...) 0
+
+#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
+
+#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
+#define _cpu_to_le_long(x) __cpu_to_le_long(x)
+#define __cpu_to_le_long(x) cpu_to_le##x
+
+#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
+#define _le_long_to_cpu(x) __le_long_to_cpu(x)
+#define __le_long_to_cpu(x) le##x##_to_cpu
+
+static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
+{
+	return (word << shift) | (word >> (x - shift));
+}
+
+
+/*
+ *  Generic bitwise copy algorithm
+ */
+static void
+bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
+	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
+{
+	unsigned long first, last;
+	int const shift = dst_idx-src_idx;
+	int left, right;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (!shift) {
+		/* Same alignment for source and dest */
+		if (dst_idx+n <= bits) {
+			/* Single word */
+			if (last)
+				first &= last;
+			*dst = comp(*src, *dst, first);
+		} else {
+			/* Multiple destination words */
+			/* Leading bits */
+			if (first != ~0UL) {
+				*dst = comp(*src, *dst, first);
+				dst++;
+				src++;
+				n -= bits - dst_idx;
+			}
+
+			/* Main chunk */
+			n /= bits;
+			while (n >= 8) {
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				n -= 8;
+			}
+			while (n--)
+				*dst++ = *src++;
+
+			/* Trailing bits */
+			if (last)
+				*dst = comp(*src, *dst, last);
+		}
+	} else {
+		unsigned long d0, d1;
+		int m;
+
+		/* Different alignment for source and dest */
+		right = shift & (bits - 1);
+		left = -shift & (bits - 1);
+
+		if (dst_idx+n <= bits) {
+			/* Single destination word */
+			if (last)
+				first &= last;
+			if (shift > 0) {
+				/* Single source word */
+				*dst = comp(*src << left, *dst, first);
+			} else if (src_idx+n <= bits) {
+				/* Single source word */
+				*dst = comp(*src >> right, *dst, first);
+			} else {
+				/* 2 source words */
+				d0 = *src++;
+				d1 = *src;
+				*dst = comp(d0 >> right | d1 << left, *dst,
+					    first);
+			}
+		} else {
+			/* Multiple destination words */
+			/** We must always remember the last value read,
+			    because in case SRC and DST overlap bitwise (e.g.
+			    when moving just one pixel in 1bpp), we always
+			    collect one full long for DST and that might
+			    overlap with the current long from SRC. We store
+			    this value in 'd0'. */
+			d0 = *src++;
+			/* Leading bits */
+			if (shift > 0) {
+				/* Single source word */
+				*dst = comp(d0 << left, *dst, first);
+				dst++;
+				n -= bits - dst_idx;
+			} else {
+				/* 2 source words */
+				d1 = *src++;
+				*dst = comp(d0 >> right | d1 << left, *dst,
+					    first);
+				d0 = d1;
+				dst++;
+				n -= bits - dst_idx;
+			}
+
+			/* Main chunk */
+			m = n % bits;
+			n /= bits;
+			while (n >= 4) {
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				n -= 4;
+			}
+			while (n--) {
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+			}
+
+			/* Trailing bits */
+			if (m) {
+				if (m <= bits - right) {
+					/* Single source word */
+					d0 >>= right;
+				} else {
+					/* 2 source words */
+ 					d1 = *src;
+					d0 = d0 >> right | d1 << left;
+				}
+				*dst = comp(d0, *dst, last);
+			}
+		}
+	}
+}
+
+/*
+ *  Generic bitwise copy algorithm, operating backward
+ */
+static void
+bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
+	   const unsigned long *src, unsigned src_idx, unsigned bits,
+	   unsigned n)
+{
+	unsigned long first, last;
+	int shift;
+
+	dst += (dst_idx + n - 1) / bits;
+	src += (src_idx + n - 1) / bits;
+	dst_idx = (dst_idx + n - 1) % bits;
+	src_idx = (src_idx + n - 1) % bits;
+
+	shift = dst_idx-src_idx;
+
+	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
+	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
+
+	if (!shift) {
+		/* Same alignment for source and dest */
+		if ((unsigned long)dst_idx+1 >= n) {
+			/* Single word */
+			if (first)
+				last &= first;
+			*dst = comp(*src, *dst, last);
+		} else {
+			/* Multiple destination words */
+
+			/* Leading bits */
+			if (first) {
+				*dst = comp(*src, *dst, first);
+				dst--;
+				src--;
+				n -= dst_idx+1;
+			}
+
+			/* Main chunk */
+			n /= bits;
+			while (n >= 8) {
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				n -= 8;
+			}
+			while (n--)
+				*dst-- = *src--;
+			/* Trailing bits */
+			if (last != -1UL)
+				*dst = comp(*src, *dst, last);
+		}
+	} else {
+		/* Different alignment for source and dest */
+
+		int const left = shift & (bits-1);
+		int const right = -shift & (bits-1);
+
+		if ((unsigned long)dst_idx+1 >= n) {
+			/* Single destination word */
+			if (first)
+				last &= first;
+			if (shift < 0) {
+				/* Single source word */
+				*dst = comp(*src >> right, *dst, last);
+			} else if (1+(unsigned long)src_idx >= n) {
+				/* Single source word */
+				*dst = comp(*src << left, *dst, last);
+			} else {
+				/* 2 source words */
+				*dst = comp(*src << left | *(src-1) >> right,
+					    *dst, last);
+			}
+		} else {
+			/* Multiple destination words */
+			/** We must always remember the last value read,
+			    because in case SRC and DST overlap bitwise (e.g.
+			    when moving just one pixel in 1bpp), we always
+			    collect one full long for DST and that might
+			    overlap with the current long from SRC. We store
+			    this value in 'd0'. */
+			unsigned long d0, d1;
+			int m;
+
+			d0 = *src--;
+			/* Leading bits */
+			if (shift < 0) {
+				/* Single source word */
+				d1 = d0;
+				d0 >>= right;
+			} else {
+				/* 2 source words */
+				d1 = *src--;
+				d0 = d0 << left | d1 >> right;
+			}
+			if (!first)
+				*dst = d0;
+			else
+				*dst = comp(d0, *dst, first);
+			d0 = d1;
+			dst--;
+			n -= dst_idx+1;
+
+			/* Main chunk */
+			m = n % bits;
+			n /= bits;
+			while (n >= 4) {
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				n -= 4;
+			}
+			while (n--) {
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+			}
+
+			/* Trailing bits */
+			if (m) {
+				if (m <= bits - left) {
+					/* Single source word */
+					d0 <<= left;
+				} else {
+					/* 2 source words */
+					d1 = *src;
+					d0 = d0 << left | d1 >> right;
+				}
+				*dst = comp(d0, *dst, last);
+			}
+		}
+	}
+}
+
+/*
+ * Generic Bit Block Transfer for frame buffers located in system RAM with
+ * packed pixels of any depth.
+ */
+void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
+{
+	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
+	u32 height = area->height, width = area->width;
+	unsigned long const bits_per_line = p->fix.line_length*8u;
+	unsigned long *base = NULL;
+	int bits = BITS_PER_LONG, bytes = bits >> 3;
+	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
+
+	if (p->state != FBINFO_STATE_RUNNING)
+		return;
+
+	/* if the beginning of the target area might overlap with the end of
+	the source area, we have to copy the area in reverse. */
+	if ((dy == sy && dx > sx) || (dy > sy)) {
+		dy += height;
+		sy += height;
+		rev_copy = 1;
+	}
+
+	/* split the base of the framebuffer into a long-aligned address and
+	   the index of the first bit */
+	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
+	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
+	/* add offset of source and target area */
+	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
+	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
+
+	if (p->fbops->fb_sync)
+		p->fbops->fb_sync(p);
+
+	if (rev_copy) {
+		while (height--) {
+			dst_idx -= bits_per_line;
+			src_idx -= bits_per_line;
+			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
+				base + (src_idx / bits), src_idx % bits, bits,
+				width*p->var.bits_per_pixel);
+		}
+	} else {
+		while (height--) {
+			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
+				base + (src_idx / bits), src_idx % bits, bits,
+				width*p->var.bits_per_pixel);
+			dst_idx += bits_per_line;
+			src_idx += bits_per_line;
+		}
+	}
+}
+EXPORT_SYMBOL(drm_copyarea);
+
+/*
+ *  Aligned pattern fill using 32/64-bit memory accesses
+ */
+static void
+bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
+		unsigned long pat, unsigned n, int bits)
+{
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(pat, *dst, first);
+	} else {
+		/* Multiple destination words */
+
+		/* Leading bits */
+		if (first != ~0UL) {
+			*dst = comp(pat, *dst, first);
+			dst++;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 8) {
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			n -= 8;
+		}
+		while (n--)
+			*dst++ = pat;
+		/* Trailing bits */
+		if (last)
+			*dst = comp(pat, *dst, last);
+	}
+}
+
+
+/*
+ *  Unaligned generic pattern fill using 32/64-bit memory accesses
+ *  The pattern must have been expanded to a full 32/64-bit value
+ *  Left/right are the appropriate shifts to convert to the pattern to be
+ *  used for the next 32/64-bit word
+ */
+static void
+bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
+		  unsigned long pat, int left, int right, unsigned n, int bits)
+{
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(pat, *dst, first);
+	} else {
+		/* Multiple destination words */
+		/* Leading bits */
+		if (first) {
+			*dst = comp(pat, *dst, first);
+			dst++;
+			pat = pat << left | pat >> right;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 4) {
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			n -= 4;
+		}
+		while (n--) {
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+		}
+
+		/* Trailing bits */
+		if (last)
+			*dst = comp(pat, *dst, last);
+	}
+}
+
+/*
+ *  Aligned pattern invert using 32/64-bit memory accesses
+ */
+static void
+bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
+		    unsigned long pat, unsigned n, int bits)
+{
+	unsigned long val = pat;
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(*dst ^ val, *dst, first);
+	} else {
+		/* Multiple destination words */
+		/* Leading bits */
+		if (first!=0UL) {
+			*dst = comp(*dst ^ val, *dst, first);
+			dst++;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 8) {
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			n -= 8;
+		}
+		while (n--)
+			*dst++ ^= val;
+		/* Trailing bits */
+		if (last)
+			*dst = comp(*dst ^ val, *dst, last);
+	}
+}
+
+
+/*
+ *  Unaligned generic pattern invert using 32/64-bit memory accesses
+ *  The pattern must have been expanded to a full 32/64-bit value
+ *  Left/right are the appropriate shifts to convert to the pattern to be
+ *  used for the next 32/64-bit word
+ */
+static void
+bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
+		      unsigned long pat, int left, int right, unsigned n,
+		      int bits)
+{
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(*dst ^ pat, *dst, first);
+	} else {
+		/* Multiple destination words */
+
+		/* Leading bits */
+		if (first != 0UL) {
+			*dst = comp(*dst ^ pat, *dst, first);
+			dst++;
+			pat = pat << left | pat >> right;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 4) {
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			n -= 4;
+		}
+		while (n--) {
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+		}
+
+		/* Trailing bits */
+		if (last)
+			*dst = comp(*dst ^ pat, *dst, last);
+	}
+}
+
+/*
+ * Generic fillrect for frame buffers in system RAM with packed pixels of
+ * any depth.
+ */
+void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
+{
+	unsigned long pat, pat2, fg;
+	unsigned long width = rect->width, height = rect->height;
+	int bits = BITS_PER_LONG, bytes = bits >> 3;
+	u32 bpp = p->var.bits_per_pixel;
+	unsigned long *dst;
+	int dst_idx, left;
+
+	if (p->state != FBINFO_STATE_RUNNING)
+		return;
+
+	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+	    p->fix.visual == FB_VISUAL_DIRECTCOLOR)
+		fg = ((u32 *) (p->pseudo_palette))[rect->color];
+	else
+		fg = rect->color;
+
+	pat = pixel_to_pat( bpp, fg);
+
+	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
+	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
+	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
+	/* FIXME For now we support 1-32 bpp only */
+	left = bits % bpp;
+	if (p->fbops->fb_sync)
+		p->fbops->fb_sync(p);
+	if (!left) {
+		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
+				  int dst_idx, unsigned long pat, unsigned n,
+				  int bits) = NULL;
+
+		switch (rect->rop) {
+		case ROP_XOR:
+			fill_op32 = bitfill_aligned_rev;
+			break;
+		case ROP_COPY:
+			fill_op32 = bitfill_aligned;
+			break;
+		default:
+			printk(KERN_ERR "drm_fillrect(): unknown rop, "
+				"defaulting to ROP_COPY\n");
+			fill_op32 = bitfill_aligned;
+			break;
+		}
+		while (height--) {
+			dst += dst_idx >> (ffs(bits) - 1);
+			dst_idx &= (bits - 1);
+			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
+			dst_idx += p->fix.line_length*8;
+		}
+	} else {
+		int right, r;
+		void (*fill_op)(struct fb_info *p, unsigned long *dst,
+				int dst_idx, unsigned long pat, int left,
+				int right, unsigned n, int bits) = NULL;
+#ifdef __LITTLE_ENDIAN
+		right = left;
+		left = bpp - right;
+#else
+		right = bpp - left;
+#endif
+		switch (rect->rop) {
+		case ROP_XOR:
+			fill_op = bitfill_unaligned_rev;
+			break;
+		case ROP_COPY:
+			fill_op = bitfill_unaligned;
+			break;
+		default:
+			printk(KERN_ERR "drm_fillrect(): unknown rop, "
+				"defaulting to ROP_COPY\n");
+			fill_op = bitfill_unaligned;
+			break;
+		}
+		while (height--) {
+			dst += dst_idx / bits;
+			dst_idx &= (bits - 1);
+			r = dst_idx % bpp;
+			/* rotate pattern to the correct start position */
+			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
+			fill_op(p, dst, dst_idx, pat2, left, right,
+				width*bpp, bits);
+			dst_idx += p->fix.line_length*8;
+		}
+	}
+}
+EXPORT_SYMBOL(drm_fillrect);
+
+static const u32 cfb_tab8_be[] = {
+    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
+    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
+    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
+    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
+};
+
+static const u32 cfb_tab8_le[] = {
+    0x00000000,0xff000000,0x00ff0000,0xffff0000,
+    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
+    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
+    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
+};
+
+static const u32 cfb_tab16_be[] = {
+    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
+};
+
+static const u32 cfb_tab16_le[] = {
+    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
+};
+
+static const u32 cfb_tab32[] = {
+	0x00000000, 0xffffffff
+};
+
+static void color_imageblit(const struct fb_image *image, struct fb_info *p,
+			    void *dst1, u32 start_index, u32 pitch_index)
+{
+	/* Draw the penguin */
+	u32 *dst, *dst2;
+	u32 color = 0, val, shift;
+	int i, n, bpp = p->var.bits_per_pixel;
+	u32 null_bits = 32 - bpp;
+	u32 *palette = (u32 *) p->pseudo_palette;
+	const u8 *src = image->data;
+
+	dst2 = dst1;
+	for (i = image->height; i--; ) {
+		n = image->width;
+		dst = dst1;
+		shift = 0;
+		val = 0;
+
+		if (start_index) {
+			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
+							 start_index));
+			val = *dst & start_mask;
+			shift = start_index;
+		}
+		while (n--) {
+			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+			    p->fix.visual == FB_VISUAL_DIRECTCOLOR)
+				color = palette[*src];
+			else
+				color = *src;
+			color <<= FB_LEFT_POS(p, bpp);
+			val |= FB_SHIFT_HIGH(p, color, shift);
+			if (shift >= null_bits) {
+				*dst++ = val;
+
+				val = (shift == null_bits) ? 0 :
+					FB_SHIFT_LOW(p, color, 32 - shift);
+			}
+			shift += bpp;
+			shift &= (32 - 1);
+			src++;
+		}
+		if (shift) {
+			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
+
+			*dst &= end_mask;
+			*dst |= val;
+		}
+		dst1 += p->fix.line_length;
+		if (pitch_index) {
+			dst2 += p->fix.line_length;
+			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
+
+			start_index += pitch_index;
+			start_index &= 32 - 1;
+		}
+	}
+}
+
+static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
+				  void *dst1, u32 fgcolor, u32 bgcolor,
+				  u32 start_index, u32 pitch_index)
+{
+	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
+	u32 *dst, *dst2;
+	u32 val, pitch = p->fix.line_length;
+	u32 null_bits = 32 - bpp;
+	u32 spitch = (image->width+7)/8;
+	const u8 *src = image->data, *s;
+	u32 i, j, l;
+
+	dst2 = dst1;
+	fgcolor <<= FB_LEFT_POS(p, bpp);
+	bgcolor <<= FB_LEFT_POS(p, bpp);
+
+	for (i = image->height; i--; ) {
+		shift = val = 0;
+		l = 8;
+		j = image->width;
+		dst = dst1;
+		s = src;
+
+		/* write leading bits */
+		if (start_index) {
+			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
+							 start_index));
+			val = *dst & start_mask;
+			shift = start_index;
+		}
+
+		while (j--) {
+			l--;
+			color = (*s & (1 << l)) ? fgcolor : bgcolor;
+			val |= FB_SHIFT_HIGH(p, color, shift);
+
+			/* Did the bitshift spill bits to the next long? */
+			if (shift >= null_bits) {
+				*dst++ = val;
+				val = (shift == null_bits) ? 0 :
+					FB_SHIFT_LOW(p, color, 32 - shift);
+			}
+			shift += bpp;
+			shift &= (32 - 1);
+			if (!l) { l = 8; s++; }
+		}
+
+		/* write trailing bits */
+ 		if (shift) {
+			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
+
+			*dst &= end_mask;
+			*dst |= val;
+		}
+
+		dst1 += pitch;
+		src += spitch;
+		if (pitch_index) {
+			dst2 += pitch;
+			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
+			start_index += pitch_index;
+			start_index &= 32 - 1;
+		}
+
+	}
+}
+
+/*
+ * fast_imageblit - optimized monochrome color expansion
+ *
+ * Only if:  bits_per_pixel = 8, 16, or 32
+ *           image->width is divisible by pixel/dword (ppw);
+ *           fix->line_length is divisible by 4;
+ *           beginning and end of a scanline is dword aligned
+ */
+static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
+				  void *dst1, u32 fgcolor, u32 bgcolor)
+{
+	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
+	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
+	u32 bit_mask, end_mask, eorx, shift;
+	const char *s = image->data, *src;
+	u32 *dst;
+	const u32 *tab = NULL;
+	int i, j, k;
+
+	switch (bpp) {
+	case 8:
+		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
+		break;
+	case 16:
+		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
+		break;
+	case 32:
+	default:
+		tab = cfb_tab32;
+		break;
+	}
+
+	for (i = ppw-1; i--; ) {
+		fgx <<= bpp;
+		bgx <<= bpp;
+		fgx |= fgcolor;
+		bgx |= bgcolor;
+	}
+
+	bit_mask = (1 << ppw) - 1;
+	eorx = fgx ^ bgx;
+	k = image->width/ppw;
+
+	for (i = image->height; i--; ) {
+		dst = dst1;
+		shift = 8;
+		src = s;
+
+		for (j = k; j--; ) {
+			shift -= ppw;
+			end_mask = tab[(*src >> shift) & bit_mask];
+			*dst++ = (end_mask & eorx) ^ bgx;
+			if (!shift) {
+				shift = 8;
+				src++;
+			}
+		}
+		dst1 += p->fix.line_length;
+		s += spitch;
+	}
+}
+
+/*
+ * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
+ * for frame buffer located in system RAM with packed pixels of any depth.
+ */
+void drm_imageblit(struct fb_info *p, const struct fb_image *image)
+{
+	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
+	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
+	u32 width = image->width;
+	u32 dx = image->dx, dy = image->dy;
+	void *dst1;
+
+	if (p->state != FBINFO_STATE_RUNNING)
+		return;
+
+	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
+	start_index = bitstart & (32 - 1);
+	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
+
+	bitstart /= 8;
+	bitstart &= ~(bpl - 1);
+	dst1 = (void __force *)p->screen_base + bitstart;
+
+	if (p->fbops->fb_sync)
+		p->fbops->fb_sync(p);
+
+	if (image->depth == 1) {
+		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
+			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
+			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
+		} else {
+			fgcolor = image->fg_color;
+			bgcolor = image->bg_color;
+		}
+
+		if (32 % bpp == 0 && !start_index && !pitch_index &&
+		    ((width & (32/bpp-1)) == 0) &&
+		    bpp >= 8 && bpp <= 32)
+			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
+		else
+			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
+					start_index, pitch_index);
+	} else
+		color_imageblit(image, p, dst1, start_index, pitch_index);
+}
+EXPORT_SYMBOL(drm_imageblit);
+
+/*
+ * Generic file operations where framebuffer is in system RAM
+ */
+ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
+{
+	unsigned long p = *ppos;
+	void *src;
+	int err = 0;
+	unsigned long total_size;
+
+	if (info->state != FBINFO_STATE_RUNNING)
+		return -EPERM;
+
+	total_size = info->screen_size;
+
+	if (total_size == 0)
+		total_size = info->fix.smem_len;
+
+	if (p >= total_size)
+		return 0;
+
+	if (count >= total_size)
+		count = total_size;
+
+	if (count + p > total_size)
+		count = total_size - p;
+
+	src = (void __force *)(info->screen_base + p);
+
+	if (info->fbops->fb_sync)
+		info->fbops->fb_sync(info);
+
+	if (copy_to_user(buf, src, count))
+		err = -EFAULT;
+
+	if  (!err)
+		*ppos += count;
+
+	return (err) ? err : count;
+}
+EXPORT_SYMBOL_GPL(drm_fb_read);
+
+/*
+ * Generic file operations where framebuffer is in system RAM
+ */
+ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
+{
+	unsigned long p = *ppos;
+	void *dst;
+	int err = 0;
+	unsigned long total_size;
+
+	if (info->state != FBINFO_STATE_RUNNING)
+		return -EPERM;
+
+	total_size = info->screen_size;
+
+	if (total_size == 0)
+		total_size = info->fix.smem_len;
+
+	if (p > total_size)
+		return -EFBIG;
+
+	if (count > total_size) {
+		err = -EFBIG;
+		count = total_size;
+	}
+
+	if (count + p > total_size) {
+		if (!err)
+			err = -ENOSPC;
+
+		count = total_size - p;
+	}
+
+	dst = (void __force *) (info->screen_base + p);
+
+	if (info->fbops->fb_sync)
+		info->fbops->fb_sync(info);
+
+	if (copy_from_user(dst, buf, count))
+		err = -EFAULT;
+
+	if  (!err)
+		*ppos += count;
+
+	return (err) ? err : count;
+}
+EXPORT_SYMBOL_GPL(drm_fb_write);
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index da0d96a69570..18fba9818635 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
 EXPORT_SYMBOL(drm_fb_helper_deferred_io);
 
 /**
- * drm_fb_helper_sys_read - wrapper around fb_sys_read
+ * drm_fb_helper_sys_read - wrapper around drm_fb_read
  * @info: fb_info struct pointer
  * @buf: userspace buffer to read from framebuffer memory
  * @count: number of bytes to read from framebuffer memory
  * @ppos: read offset within framebuffer memory
  *
- * A wrapper around fb_sys_read implemented by fbdev core
+ * A wrapper around drm_fb_read
  */
 ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
 			       size_t count, loff_t *ppos)
 {
-	return fb_sys_read(info, buf, count, ppos);
+	return drm_fb_read(info, buf, count, ppos);
 }
 EXPORT_SYMBOL(drm_fb_helper_sys_read);
 
 /**
- * drm_fb_helper_sys_write - wrapper around fb_sys_write
+ * drm_fb_helper_sys_write - wrapper around drm_fb_write
  * @info: fb_info struct pointer
  * @buf: userspace buffer to write to framebuffer memory
  * @count: number of bytes to write to framebuffer memory
  * @ppos: write offset within framebuffer memory
  *
- * A wrapper around fb_sys_write implemented by fbdev core
+ * A wrapper around drm_fb_write
  */
 ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
 				size_t count, loff_t *ppos)
 {
 	ssize_t ret;
 
-	ret = fb_sys_write(info, buf, count, ppos);
+	ret = drm_fb_write(info, buf, count, ppos);
 	if (ret > 0)
 		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
 				    info->var.yres);
@@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
 EXPORT_SYMBOL(drm_fb_helper_sys_write);
 
 /**
- * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
+ * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
  * @info: fbdev registered by the helper
  * @rect: info about rectangle to fill
  *
- * A wrapper around sys_fillrect implemented by fbdev core
+ * A wrapper around drm_fillrect
  */
 void drm_fb_helper_sys_fillrect(struct fb_info *info,
 				const struct fb_fillrect *rect)
 {
-	sys_fillrect(info, rect);
+	drm_fillrect(info, rect);
 	drm_fb_helper_dirty(info, rect->dx, rect->dy,
 			    rect->width, rect->height);
 }
 EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
 
 /**
- * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
+ * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
  * @info: fbdev registered by the helper
  * @area: info about area to copy
  *
- * A wrapper around sys_copyarea implemented by fbdev core
+ * A wrapper around drm_copyarea
  */
 void drm_fb_helper_sys_copyarea(struct fb_info *info,
 				const struct fb_copyarea *area)
 {
-	sys_copyarea(info, area);
+	drm_copyarea(info, area);
 	drm_fb_helper_dirty(info, area->dx, area->dy,
 			    area->width, area->height);
 }
 EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
 
 /**
- * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
+ * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
  * @info: fbdev registered by the helper
  * @image: info about image to blit
  *
- * A wrapper around sys_imageblit implemented by fbdev core
+ * A wrapper around drm_imageblit
  */
 void drm_fb_helper_sys_imageblit(struct fb_info *info,
 				 const struct fb_image *image)
 {
-	sys_imageblit(info, image);
+	drm_imageblit(info, image);
 	drm_fb_helper_dirty(info, image->dx, image->dy,
 			    image->width, image->height);
 }
diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
index 306aa3a60be9..d38005f6128a 100644
--- a/include/drm/drm_fb_helper.h
+++ b/include/drm/drm_fb_helper.h
@@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
 
 void drm_fbdev_generic_setup(struct drm_device *dev,
 			     unsigned int preferred_bpp);
+
+void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
+void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
+void drm_imageblit(struct fb_info *p, const struct fb_image *image);
+ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
+ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
 #else
 static inline void drm_fb_helper_prepare(struct drm_device *dev,
 					struct drm_fb_helper *helper,
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread
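
The comment on comp() in the patch above states that the XOR form is
equivalent to (a & mask) | (b & ~mask). A minimal userspace sketch of that
identity follows. The names comp_xor(), comp_ref() and comp_forms_agree()
are hypothetical and not part of the patch; only the expression inside
comp_xor() is taken from it.

```c
#include <assert.h>
#include <limits.h>

/* Same expression as comp() in the patch: bits of a where mask is 1,
 * bits of b where mask is 0. */
static inline unsigned long
comp_xor(unsigned long a, unsigned long b, unsigned long mask)
{
	return ((a ^ b) & mask) ^ b;
}

/* Reference form quoted in the patch comment. */
static inline unsigned long
comp_ref(unsigned long a, unsigned long b, unsigned long mask)
{
	return (a & mask) | (b & ~mask);
}

/*
 * Hypothetical helper (assumption, not in the patch): returns 1 if both
 * forms agree for every mask of the shape ~0UL << bit, which is the kind
 * of "first"/"last" edge mask the bitcpy/bitfill code builds.
 */
static int comp_forms_agree(unsigned long a, unsigned long b)
{
	unsigned int bit;

	for (bit = 0; bit < sizeof(unsigned long) * CHAR_BIT; bit++) {
		unsigned long mask = ~0UL << bit;

		if (comp_xor(a, b, mask) != comp_ref(a, b, mask))
			return 0;
	}
	return 1;
}
```

For example, comp_xor(0xAA, 0x55, 0x0F) keeps the low nibble of a and the
high nibble of b, giving 0x5A. The XOR form needs one AND instead of two,
which is presumably why the drawing code prefers it.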

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
@ 2020-08-02 20:01           ` Sam Ravnborg
  0 siblings, 0 replies; 34+ messages in thread
From: Sam Ravnborg @ 2020-08-02 20:01 UTC (permalink / raw)
  To: daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, peda, dan.carpenter

On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
> > Hi Daniel.
> > 
> > On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
> > > On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
> > > > DRM fb helpers require read and write functions for framebuffer
> > > > memory. Export the existing code from fbdev.
> > > > 
> > > > Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> > > 
> > > Hm I'm not super sure whether we want to actually reuse this stuff ... We
> > > kinda don't care about the sparc special case, and just having an fbdev
> > > implementation which has the switch between memcpy and memcpy_to/from_io
> > > in one single place sounds a lot simpler ...
> > > 
> > > This way we can have a clean split between the old horrors of real fbdev
> > > drivers, and a much cleaner world in drm. It would mean a bit of
> > > copypasting, but I think that's actually a good thing.
> > > 
> > > In general my idea for drm fbdev emulation is that for any area we have a
> > > problem we just ignore the entire fbmem.c code and write our own: mmap,
> > > backlight handling (still unsolved, and horrible), cfb vs sys here. This
> > > entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
> > > duplication here doesn't seem worth it imo.
> > > 
> > > Thoughts?
> > 
> > 
> > I can see that fbmem is a mix of ioctl support and other stuff.
> > We could factor out all the ioctl parts of fbmem.c to a new file
> > named fbioctl.c.
> > 
> > And then let the ioctl parts call down into drm stuff and avoid reusing
> > the fbdev code when we first reach drm code.
> > This would require local copies of:
> > sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
> > and more I think which I missed.
> > 
> > With local copies we could avoid some of the special cases and trim the
> > functions to what is required by drm only.
> > And then no more fbmem dependencies and no dependencies to several of
> > the small helper functions. So less entanglement with fbdev core.
> > 
> > This all sounds simple so I am surely missing a lot of ugly details here.
> > 
> > And should we touch this anyway we need a test suite to verify not too
> > much breaks. To the best of my knowledge there is not yet such a test
> > suite :-( Maybe because people caring about fbdev are limited.
> 
> Well my idea was to not refactor anything, but just have drm copies of the
> various fb_ops callbacks. Definitely not even more refactoring :-)

$ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
1212 drivers/gpu/drm/drm_fb_fbdev_helper.c

This is a straight copy of:
$ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
EXPORT_SYMBOL(drm_copyarea);
EXPORT_SYMBOL(drm_fillrect);
EXPORT_SYMBOL(drm_imageblit);
EXPORT_SYMBOL_GPL(drm_fb_read);
EXPORT_SYMBOL_GPL(drm_fb_write);

copyarea, fillrect and imageblit are a copy of the sys variants.
read and write are from fb_sys_fops.c
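
As a rough illustration of what the copied read path does, here is a minimal
userspace sketch of the bounds handling in drm_fb_read(). The helper name
fb_clamp_read() is hypothetical and does not exist in the patch; it only
mirrors the three checks on the offset and the count.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: returns how many bytes a
 * read of `count` bytes at offset `pos` may copy out of a buffer that is
 * `total_size` bytes long. The checks mirror those in drm_fb_read().
 */
static size_t fb_clamp_read(size_t total_size, size_t pos, size_t count)
{
	if (pos >= total_size)
		return 0;			/* at or past the end: nothing to read */

	if (count >= total_size)
		count = total_size;		/* never more than the whole buffer */

	if (count + pos > total_size)
		count = total_size - pos;	/* trim to the bytes that remain */

	return count;
}
```

With a 4096-byte buffer, a 100-byte read at offset 4000 is trimmed to 96
bytes, and a read at offset 4096 returns 0.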

Patch builds but I am not a big fan yet - right now it gains us very
little. What would the next step be?

I looked briefly at the few drivers that have not migrated to
the generic fbdev emulation (yet?).
  msm
  omapdrm
  gma500

all use the sys_ variants - they already use the migrated variants as
they call the drm_fb helpers.


  exynos
  gma500
  i915
  nouveau
  radeon
  rockchip
  vmwgfx

all use the cfb variants and cannot use the migrated variants as-is.
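
The difference is in how framebuffer memory is accessed: the sys_ variants
use plain loads and stores, whereas the cfb_ users need I/O accessors. A
minimal userspace sketch of the "one single place" switch Daniel describes
above could look like the following. struct fb_mapping, fb_mapping_write()
and fake_memcpy_toio() are made-up names for illustration only;
fake_memcpy_toio() merely stands in for the kernel's memcpy_toio().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: where the framebuffer lives and how to reach it. */
struct fb_mapping {
	void *vaddr;	/* start of the mapping */
	bool is_iomem;	/* true if I/O accessors are required */
};

/*
 * Userspace stand-in for the kernel's memcpy_toio(). A byte-wise volatile
 * store loop only approximates what an architecture's accessor really does.
 */
static void fake_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/* The single place that chooses between the two kinds of store. */
static void fb_mapping_write(struct fb_mapping *map, size_t offset,
			     const void *src, size_t n)
{
	unsigned char *dst = (unsigned char *)map->vaddr + offset;

	if (map->is_iomem)
		fake_memcpy_toio(dst, src, n);
	else
		memcpy(dst, src, n);
}
```

Callers never look at is_iomem themselves, so the drawing helpers would not
need separate sys_ and cfb_ flavours under this assumption.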

	Sam

From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
From: Sam Ravnborg <sam@ravnborg.org>
Date: Fri, 31 Jul 2020 23:36:01 +0200
Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm

Copy the fbdev functions used by fb_ops to drm.
Having local copies makes it possible
to adjust the functions so they are tailored to the needs of DRM.
At the same time there is one less dependency between DRM and fbdev.

The files were copied verbatim and only minimal changes were made.

As some drivers use the fbdev functions they continue to be built, but
the generic fbdev emulation no longer uses the fbdev functions.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/Kconfig               |    1 -
 drivers/gpu/drm/Makefile              |    1 +
 drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
 drivers/gpu/drm/drm_fb_helper.c       |   30 +-
 include/drm/drm_fb_helper.h           |    6 +
 5 files changed, 1234 insertions(+), 16 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index c4fd57d8b717..11f41bc31f05 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
 	select FB
 	select FRAMEBUFFER_CONSOLE if !EXPERT
 	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
-	select FB_SYS_FOPS
 	select FB_SYS_FILLRECT
 	select FB_SYS_COPYAREA
 	select FB_SYS_IMAGEBLIT
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 02ee5faf1a92..398cb4746762 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
 
 drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
 drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
+drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
 drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
 drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
 drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
new file mode 100644
index 000000000000..dda60f85b9d5
--- /dev/null
+++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
@@ -0,0 +1,1212 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file COPYING in the main directory of this archive for
+ * more details.
+ *
+ * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
+ *
+ * drm_copyarea:
+ *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
+ *   on Geert Uytterhoeven's copyarea routine)
+ *
+ *  drm_fillrect:
+ *    Based almost entirely from cfbfillrect.c (which is based almost entirely
+ *    on Geert Uytterhoeven's fillrect routine)
+ *
+ *  drm_imageblit:
+ *    Based almost entirely on cfbimgblt.c
+ *
+ */
+
+#include <linux/export.h>
+#include <linux/fb.h>
+#include <linux/io.h>
+
+
+/*
+*  Compose two values, using a bitmask as decision value
+*  This is equivalent to (a & mask) | (b & ~mask)
+*/
+static inline unsigned long
+comp(unsigned long a, unsigned long b, unsigned long mask)
+{
+    return ((a ^ b) & mask) ^ b;
+}
+
+/*
+ *  Create a pattern with the given pixel's color
+ */
+#if BITS_PER_LONG == 64
+static inline unsigned long
+pixel_to_pat( u32 bpp, u32 pixel)
+{
+	switch (bpp) {
+	case 1:
+		return 0xfffffffffffffffful*pixel;
+	case 2:
+		return 0x5555555555555555ul*pixel;
+	case 4:
+		return 0x1111111111111111ul*pixel;
+	case 8:
+		return 0x0101010101010101ul*pixel;
+	case 12:
+		return 0x1001001001001001ul*pixel;
+	case 16:
+		return 0x0001000100010001ul*pixel;
+	case 24:
+		return 0x0001000001000001ul*pixel;
+	case 32:
+		return 0x0000000100000001ul*pixel;
+	default:
+		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
+		return 0;
+    }
+}
+#else
+static inline unsigned long
+pixel_to_pat( u32 bpp, u32 pixel)
+{
+	switch (bpp) {
+	case 1:
+		return 0xfffffffful*pixel;
+	case 2:
+		return 0x55555555ul*pixel;
+	case 4:
+		return 0x11111111ul*pixel;
+	case 8:
+		return 0x01010101ul*pixel;
+	case 12:
+		return 0x01001001ul*pixel;
+	case 16:
+		return 0x00010001ul*pixel;
+	case 24:
+		return 0x01000001ul*pixel;
+	case 32:
+		return 0x00000001ul*pixel;
+	default:
+		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
+		return 0;
+    }
+}
+#endif
+
+#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
+#if BITS_PER_LONG == 64
+#define REV_PIXELS_MASK1 0x5555555555555555ul
+#define REV_PIXELS_MASK2 0x3333333333333333ul
+#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
+#else
+#define REV_PIXELS_MASK1 0x55555555ul
+#define REV_PIXELS_MASK2 0x33333333ul
+#define REV_PIXELS_MASK4 0x0f0f0f0ful
+#endif
+
+static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
+						  u32 bswapmask)
+{
+	if (bswapmask & 1)
+		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
+	if (bswapmask & 2)
+		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
+	if (bswapmask & 3)
+		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
+	return val;
+}
+
+static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
+					     u32 bswapmask)
+{
+	u32 mask;
+
+	if (!bswapmask) {
+		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
+	} else {
+		mask = 0xff << FB_LEFT_POS(p, 8);
+		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
+		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
+#if defined(__i386__) || defined(__x86_64__)
+		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
+		if(index + bswapmask < 32)
+#endif
+			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
+					(index + bswapmask) & ~(bswapmask));
+	}
+	return mask;
+}
+
+static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
+							u32 index,
+							u32 bswapmask)
+{
+	unsigned long mask;
+
+	if (!bswapmask) {
+		mask = FB_SHIFT_HIGH(p, ~0UL, index);
+	} else {
+		mask = 0xff << FB_LEFT_POS(p, 8);
+		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
+		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
+#if defined(__i386__) || defined(__x86_64__)
+		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
+		if(index + bswapmask < BITS_PER_LONG)
+#endif
+			mask |= FB_SHIFT_HIGH(p, ~0UL,
+					(index + bswapmask) & ~(bswapmask));
+	}
+	return mask;
+}
+
+
+static inline u32 fb_compute_bswapmask(struct fb_info *info)
+{
+	u32 bswapmask = 0;
+	unsigned bpp = info->var.bits_per_pixel;
+
+	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
+		/*
+		 * Reversed order of pixel layout in bytes
+		 * works only for 1, 2 and 4 bpp
+		 */
+		bswapmask = 7 - bpp + 1;
+	}
+	return bswapmask;
+}
+
+#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
+
+static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
+						  u32 bswapmask)
+{
+	return val;
+}
+
+#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
+#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
+#define fb_compute_bswapmask(...) 0
+
+#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
+
+#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
+#define _cpu_to_le_long(x) __cpu_to_le_long(x)
+#define __cpu_to_le_long(x) cpu_to_le##x
+
+#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
+#define _le_long_to_cpu(x) __le_long_to_cpu(x)
+#define __le_long_to_cpu(x) le##x##_to_cpu
+
+static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
+{
+	return (word << shift) | (word >> (x - shift));
+}
+
+
+/*
+ *  Generic bitwise copy algorithm
+ */
+static void
+bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
+	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
+{
+	unsigned long first, last;
+	int const shift = dst_idx-src_idx;
+	int left, right;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (!shift) {
+		/* Same alignment for source and dest */
+		if (dst_idx+n <= bits) {
+			/* Single word */
+			if (last)
+				first &= last;
+			*dst = comp(*src, *dst, first);
+		} else {
+			/* Multiple destination words */
+			/* Leading bits */
+ 			if (first != ~0UL) {
+				*dst = comp(*src, *dst, first);
+				dst++;
+				src++;
+				n -= bits - dst_idx;
+			}
+
+			/* Main chunk */
+			n /= bits;
+			while (n >= 8) {
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				*dst++ = *src++;
+				n -= 8;
+			}
+			while (n--)
+				*dst++ = *src++;
+
+			/* Trailing bits */
+			if (last)
+				*dst = comp(*src, *dst, last);
+		}
+	} else {
+		unsigned long d0, d1;
+		int m;
+
+		/* Different alignment for source and dest */
+		right = shift & (bits - 1);
+		left = -shift & (bits - 1);
+
+		if (dst_idx+n <= bits) {
+			/* Single destination word */
+			if (last)
+				first &= last;
+			if (shift > 0) {
+				/* Single source word */
+				*dst = comp(*src << left, *dst, first);
+			} else if (src_idx+n <= bits) {
+				/* Single source word */
+				*dst = comp(*src >> right, *dst, first);
+			} else {
+				/* 2 source words */
+				d0 = *src++;
+				d1 = *src;
+				*dst = comp(d0 >> right | d1 << left, *dst,
+					    first);
+			}
+		} else {
+			/* Multiple destination words */
+			/** We must always remember the last value read,
+			    because in case SRC and DST overlap bitwise (e.g.
+			    when moving just one pixel in 1bpp), we always
+			    collect one full long for DST and that might
+			    overlap with the current long from SRC. We store
+			    this value in 'd0'. */
+			d0 = *src++;
+			/* Leading bits */
+			if (shift > 0) {
+				/* Single source word */
+				*dst = comp(d0 << left, *dst, first);
+				dst++;
+				n -= bits - dst_idx;
+			} else {
+				/* 2 source words */
+				d1 = *src++;
+				*dst = comp(d0 >> right | d1 << left, *dst,
+					    first);
+				d0 = d1;
+				dst++;
+				n -= bits - dst_idx;
+			}
+
+			/* Main chunk */
+			m = n % bits;
+			n /= bits;
+			while (n >= 4) {
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+				n -= 4;
+			}
+			while (n--) {
+				d1 = *src++;
+				*dst++ = d0 >> right | d1 << left;
+				d0 = d1;
+			}
+
+			/* Trailing bits */
+			if (m) {
+				if (m <= bits - right) {
+					/* Single source word */
+					d0 >>= right;
+				} else {
+					/* 2 source words */
+ 					d1 = *src;
+					d0 = d0 >> right | d1 << left;
+				}
+				*dst = comp(d0, *dst, last);
+			}
+		}
+	}
+}
+
+/*
+ *  Generic bitwise copy algorithm, operating backward
+ */
+static void
+bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
+	   const unsigned long *src, unsigned src_idx, unsigned bits,
+	   unsigned n)
+{
+	unsigned long first, last;
+	int shift;
+
+	dst += (dst_idx + n - 1) / bits;
+	src += (src_idx + n - 1) / bits;
+	dst_idx = (dst_idx + n - 1) % bits;
+	src_idx = (src_idx + n - 1) % bits;
+
+	shift = dst_idx-src_idx;
+
+	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
+	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
+
+	if (!shift) {
+		/* Same alignment for source and dest */
+		if ((unsigned long)dst_idx+1 >= n) {
+			/* Single word */
+			if (first)
+				last &= first;
+			*dst = comp(*src, *dst, last);
+		} else {
+			/* Multiple destination words */
+
+			/* Leading bits */
+			if (first) {
+				*dst = comp(*src, *dst, first);
+				dst--;
+				src--;
+				n -= dst_idx+1;
+			}
+
+			/* Main chunk */
+			n /= bits;
+			while (n >= 8) {
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				*dst-- = *src--;
+				n -= 8;
+			}
+			while (n--)
+				*dst-- = *src--;
+			/* Trailing bits */
+			if (last != -1UL)
+				*dst = comp(*src, *dst, last);
+		}
+	} else {
+		/* Different alignment for source and dest */
+
+		int const left = shift & (bits-1);
+		int const right = -shift & (bits-1);
+
+		if ((unsigned long)dst_idx+1 >= n) {
+			/* Single destination word */
+			if (first)
+				last &= first;
+			if (shift < 0) {
+				/* Single source word */
+				*dst = comp(*src >> right, *dst, last);
+			} else if (1+(unsigned long)src_idx >= n) {
+				/* Single source word */
+				*dst = comp(*src << left, *dst, last);
+			} else {
+				/* 2 source words */
+				*dst = comp(*src << left | *(src-1) >> right,
+					    *dst, last);
+			}
+		} else {
+			/* Multiple destination words */
+			/** We must always remember the last value read,
+			    because in case SRC and DST overlap bitwise (e.g.
+			    when moving just one pixel in 1bpp), we always
+			    collect one full long for DST and that might
+			    overlap with the current long from SRC. We store
+			    this value in 'd0'. */
+			unsigned long d0, d1;
+			int m;
+
+			d0 = *src--;
+			/* Leading bits */
+			if (shift < 0) {
+				/* Single source word */
+				d1 = d0;
+				d0 >>= right;
+			} else {
+				/* 2 source words */
+				d1 = *src--;
+				d0 = d0 << left | d1 >> right;
+			}
+			if (!first)
+				*dst = d0;
+			else
+				*dst = comp(d0, *dst, first);
+			d0 = d1;
+			dst--;
+			n -= dst_idx+1;
+
+			/* Main chunk */
+			m = n % bits;
+			n /= bits;
+			while (n >= 4) {
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+				n -= 4;
+			}
+			while (n--) {
+				d1 = *src--;
+				*dst-- = d0 << left | d1 >> right;
+				d0 = d1;
+			}
+
+			/* Trailing bits */
+			if (m) {
+				if (m <= bits - left) {
+					/* Single source word */
+					d0 <<= left;
+				} else {
+					/* 2 source words */
+					d1 = *src;
+					d0 = d0 << left | d1 >> right;
+				}
+				*dst = comp(d0, *dst, last);
+			}
+		}
+	}
+}
+
+/*
+ * Generic Bit Block Transfer for frame buffers located in system RAM with
+ * packed pixels of any depth.
+ */
+void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
+{
+	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
+	u32 height = area->height, width = area->width;
+	unsigned long const bits_per_line = p->fix.line_length*8u;
+	unsigned long *base = NULL;
+	int bits = BITS_PER_LONG, bytes = bits >> 3;
+	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
+
+	if (p->state != FBINFO_STATE_RUNNING)
+		return;
+
+	/* if the beginning of the target area might overlap with the end of
+	the source area, we have to copy the area in reverse. */
+	if ((dy == sy && dx > sx) || (dy > sy)) {
+		dy += height;
+		sy += height;
+		rev_copy = 1;
+	}
+
+	/* split the base of the framebuffer into a long-aligned address and
+	   the index of the first bit */
+	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
+	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
+	/* add offset of source and target area */
+	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
+	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
+
+	if (p->fbops->fb_sync)
+		p->fbops->fb_sync(p);
+
+	if (rev_copy) {
+		while (height--) {
+			dst_idx -= bits_per_line;
+			src_idx -= bits_per_line;
+			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
+				base + (src_idx / bits), src_idx % bits, bits,
+				width*p->var.bits_per_pixel);
+		}
+	} else {
+		while (height--) {
+			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
+				base + (src_idx / bits), src_idx % bits, bits,
+				width*p->var.bits_per_pixel);
+			dst_idx += bits_per_line;
+			src_idx += bits_per_line;
+		}
+	}
+}
+EXPORT_SYMBOL(drm_copyarea);
+
+/*
+ *  Aligned pattern fill using 32/64-bit memory accesses
+ */
+static void
+bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
+		unsigned long pat, unsigned n, int bits)
+{
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(pat, *dst, first);
+	} else {
+		/* Multiple destination words */
+
+		/* Leading bits */
+ 		if (first!= ~0UL) {
+			*dst = comp(pat, *dst, first);
+			dst++;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 8) {
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			*dst++ = pat;
+			n -= 8;
+		}
+		while (n--)
+			*dst++ = pat;
+		/* Trailing bits */
+		if (last)
+			*dst = comp(pat, *dst, last);
+	}
+}
+
+
+/*
+ *  Unaligned generic pattern fill using 32/64-bit memory accesses
+ *  The pattern must have been expanded to a full 32/64-bit value
+ *  Left/right are the appropriate shifts to convert to the pattern to be
+ *  used for the next 32/64-bit word
+ */
+static void
+bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
+		  unsigned long pat, int left, int right, unsigned n, int bits)
+{
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(pat, *dst, first);
+	} else {
+		/* Multiple destination words */
+		/* Leading bits */
+		if (first) {
+			*dst = comp(pat, *dst, first);
+			dst++;
+			pat = pat << left | pat >> right;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 4) {
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+			n -= 4;
+		}
+		while (n--) {
+			*dst++ = pat;
+			pat = pat << left | pat >> right;
+		}
+
+		/* Trailing bits */
+		if (last)
+			*dst = comp(pat, *dst, last);
+	}
+}
+
+/*
+ *  Aligned pattern invert using 32/64-bit memory accesses
+ */
+static void
+bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
+		    unsigned long pat, unsigned n, int bits)
+{
+	unsigned long val = pat;
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(*dst ^ val, *dst, first);
+	} else {
+		/* Multiple destination words */
+		/* Leading bits */
+		if (first!=0UL) {
+			*dst = comp(*dst ^ val, *dst, first);
+			dst++;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 8) {
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			*dst++ ^= val;
+			n -= 8;
+		}
+		while (n--)
+			*dst++ ^= val;
+		/* Trailing bits */
+		if (last)
+			*dst = comp(*dst ^ val, *dst, last);
+	}
+}
+
+
+/*
+ *  Unaligned generic pattern invert using 32/64-bit memory accesses
+ *  The pattern must have been expanded to a full 32/64-bit value
+ *  Left/right are the appropriate shifts to convert to the pattern to be
+ *  used for the next 32/64-bit word
+ */
+static void
+bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
+		      unsigned long pat, int left, int right, unsigned n,
+		      int bits)
+{
+	unsigned long first, last;
+
+	if (!n)
+		return;
+
+	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
+	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
+
+	if (dst_idx+n <= bits) {
+		/* Single word */
+		if (last)
+			first &= last;
+		*dst = comp(*dst ^ pat, *dst, first);
+	} else {
+		/* Multiple destination words */
+
+		/* Leading bits */
+		if (first != 0UL) {
+			*dst = comp(*dst ^ pat, *dst, first);
+			dst++;
+			pat = pat << left | pat >> right;
+			n -= bits - dst_idx;
+		}
+
+		/* Main chunk */
+		n /= bits;
+		while (n >= 4) {
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+			n -= 4;
+		}
+		while (n--) {
+			*dst++ ^= pat;
+			pat = pat << left | pat >> right;
+		}
+
+		/* Trailing bits */
+		if (last)
+			*dst = comp(*dst ^ pat, *dst, last);
+	}
+}
+
+/*
+ * Generic fillrect for frame buffers in system RAM with packed pixels of
+ * any depth.
+ */
+void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
+{
+	unsigned long pat, pat2, fg;
+	unsigned long width = rect->width, height = rect->height;
+	int bits = BITS_PER_LONG, bytes = bits >> 3;
+	u32 bpp = p->var.bits_per_pixel;
+	unsigned long *dst;
+	int dst_idx, left;
+
+	if (p->state != FBINFO_STATE_RUNNING)
+		return;
+
+	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
+		fg = ((u32 *) (p->pseudo_palette))[rect->color];
+	else
+		fg = rect->color;
+
+	pat = pixel_to_pat( bpp, fg);
+
+	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
+	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
+	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
+	/* FIXME For now we support 1-32 bpp only */
+	left = bits % bpp;
+	if (p->fbops->fb_sync)
+		p->fbops->fb_sync(p);
+	if (!left) {
+		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
+				  int dst_idx, unsigned long pat, unsigned n,
+				  int bits) = NULL;
+
+		switch (rect->rop) {
+		case ROP_XOR:
+			fill_op32 = bitfill_aligned_rev;
+			break;
+		case ROP_COPY:
+			fill_op32 = bitfill_aligned;
+			break;
+		default:
+			printk( KERN_ERR "drm_fillrect(): unknown rop, "
+				"defaulting to ROP_COPY\n");
+			fill_op32 = bitfill_aligned;
+			break;
+		}
+		while (height--) {
+			dst += dst_idx >> (ffs(bits) - 1);
+			dst_idx &= (bits - 1);
+			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
+			dst_idx += p->fix.line_length*8;
+		}
+	} else {
+		int right, r;
+		void (*fill_op)(struct fb_info *p, unsigned long *dst,
+				int dst_idx, unsigned long pat, int left,
+				int right, unsigned n, int bits) = NULL;
+#ifdef __LITTLE_ENDIAN
+		right = left;
+		left = bpp - right;
+#else
+		right = bpp - left;
+#endif
+		switch (rect->rop) {
+		case ROP_XOR:
+			fill_op = bitfill_unaligned_rev;
+			break;
+		case ROP_COPY:
+			fill_op = bitfill_unaligned;
+			break;
+		default:
+			printk(KERN_ERR "drm_fillrect(): unknown rop, "
+				"defaulting to ROP_COPY\n");
+			fill_op = bitfill_unaligned;
+			break;
+		}
+		while (height--) {
+			dst += dst_idx / bits;
+			dst_idx &= (bits - 1);
+			r = dst_idx % bpp;
+			/* rotate pattern to the correct start position */
+			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
+			fill_op(p, dst, dst_idx, pat2, left, right,
+				width*bpp, bits);
+			dst_idx += p->fix.line_length*8;
+		}
+	}
+}
+EXPORT_SYMBOL(drm_fillrect);
+
+static const u32 cfb_tab8_be[] = {
+    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
+    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
+    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
+    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
+};
+
+static const u32 cfb_tab8_le[] = {
+    0x00000000,0xff000000,0x00ff0000,0xffff0000,
+    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
+    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
+    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
+};
+
+static const u32 cfb_tab16_be[] = {
+    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
+};
+
+static const u32 cfb_tab16_le[] = {
+    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
+};
+
+static const u32 cfb_tab32[] = {
+	0x00000000, 0xffffffff
+};
+
+static void color_imageblit(const struct fb_image *image, struct fb_info *p,
+			    void *dst1, u32 start_index, u32 pitch_index)
+{
+	/* Draw the penguin */
+	u32 *dst, *dst2;
+	u32 color = 0, val, shift;
+	int i, n, bpp = p->var.bits_per_pixel;
+	u32 null_bits = 32 - bpp;
+	u32 *palette = (u32 *) p->pseudo_palette;
+	const u8 *src = image->data;
+
+	dst2 = dst1;
+	for (i = image->height; i--; ) {
+		n = image->width;
+		dst = dst1;
+		shift = 0;
+		val = 0;
+
+		if (start_index) {
+			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
+							 start_index));
+			val = *dst & start_mask;
+			shift = start_index;
+		}
+		while (n--) {
+			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
+				color = palette[*src];
+			else
+				color = *src;
+			color <<= FB_LEFT_POS(p, bpp);
+			val |= FB_SHIFT_HIGH(p, color, shift);
+			if (shift >= null_bits) {
+				*dst++ = val;
+
+				val = (shift == null_bits) ? 0 :
+					FB_SHIFT_LOW(p, color, 32 - shift);
+			}
+			shift += bpp;
+			shift &= (32 - 1);
+			src++;
+		}
+		if (shift) {
+			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
+
+			*dst &= end_mask;
+			*dst |= val;
+		}
+		dst1 += p->fix.line_length;
+		if (pitch_index) {
+			dst2 += p->fix.line_length;
+			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
+
+			start_index += pitch_index;
+			start_index &= 32 - 1;
+		}
+	}
+}
+
+static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
+				  void *dst1, u32 fgcolor, u32 bgcolor,
+				  u32 start_index, u32 pitch_index)
+{
+	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
+	u32 *dst, *dst2;
+	u32 val, pitch = p->fix.line_length;
+	u32 null_bits = 32 - bpp;
+	u32 spitch = (image->width+7)/8;
+	const u8 *src = image->data, *s;
+	u32 i, j, l;
+
+	dst2 = dst1;
+	fgcolor <<= FB_LEFT_POS(p, bpp);
+	bgcolor <<= FB_LEFT_POS(p, bpp);
+
+	for (i = image->height; i--; ) {
+		shift = val = 0;
+		l = 8;
+		j = image->width;
+		dst = dst1;
+		s = src;
+
+		/* write leading bits */
+		if (start_index) {
+			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
+							 start_index));
+			val = *dst & start_mask;
+			shift = start_index;
+		}
+
+		while (j--) {
+			l--;
+			color = (*s & (1 << l)) ? fgcolor : bgcolor;
+			val |= FB_SHIFT_HIGH(p, color, shift);
+
+			/* Did the bitshift spill bits to the next long? */
+			if (shift >= null_bits) {
+				*dst++ = val;
+				val = (shift == null_bits) ? 0 :
+					FB_SHIFT_LOW(p, color, 32 - shift);
+			}
+			shift += bpp;
+			shift &= (32 - 1);
+			if (!l) { l = 8; s++; }
+		}
+
+		/* write trailing bits */
+ 		if (shift) {
+			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
+
+			*dst &= end_mask;
+			*dst |= val;
+		}
+
+		dst1 += pitch;
+		src += spitch;
+		if (pitch_index) {
+			dst2 += pitch;
+			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
+			start_index += pitch_index;
+			start_index &= 32 - 1;
+		}
+
+	}
+}
+
+/*
+ * fast_imageblit - optimized monochrome color expansion
+ *
+ * Only if:  bits_per_pixel == 8, 16, or 32
+ *           image->width is divisible by pixel/dword (ppw);
+ *           fix->line_length is divisible by 4;
+ *           beginning and end of a scanline is dword aligned
+ */
+static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
+				  void *dst1, u32 fgcolor, u32 bgcolor)
+{
+	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
+	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
+	u32 bit_mask, end_mask, eorx, shift;
+	const char *s = image->data, *src;
+	u32 *dst;
+	const u32 *tab = NULL;
+	int i, j, k;
+
+	switch (bpp) {
+	case 8:
+		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
+		break;
+	case 16:
+		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
+		break;
+	case 32:
+	default:
+		tab = cfb_tab32;
+		break;
+	}
+
+	for (i = ppw-1; i--; ) {
+		fgx <<= bpp;
+		bgx <<= bpp;
+		fgx |= fgcolor;
+		bgx |= bgcolor;
+	}
+
+	bit_mask = (1 << ppw) - 1;
+	eorx = fgx ^ bgx;
+	k = image->width/ppw;
+
+	for (i = image->height; i--; ) {
+		dst = dst1;
+		shift = 8;
+		src = s;
+
+		for (j = k; j--; ) {
+			shift -= ppw;
+			end_mask = tab[(*src >> shift) & bit_mask];
+			*dst++ = (end_mask & eorx) ^ bgx;
+			if (!shift) {
+				shift = 8;
+				src++;
+			}
+		}
+		dst1 += p->fix.line_length;
+		s += spitch;
+	}
+}
+
+/*
+ * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
+ * for frame buffer located in system RAM with packed pixels of any depth.
+ */
+void drm_imageblit(struct fb_info *p, const struct fb_image *image)
+{
+	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
+	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
+	u32 width = image->width;
+	u32 dx = image->dx, dy = image->dy;
+	void *dst1;
+
+	if (p->state != FBINFO_STATE_RUNNING)
+		return;
+
+	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
+	start_index = bitstart & (32 - 1);
+	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
+
+	bitstart /= 8;
+	bitstart &= ~(bpl - 1);
+	dst1 = (void __force *)p->screen_base + bitstart;
+
+	if (p->fbops->fb_sync)
+		p->fbops->fb_sync(p);
+
+	if (image->depth == 1) {
+		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
+		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
+			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
+			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
+		} else {
+			fgcolor = image->fg_color;
+			bgcolor = image->bg_color;
+		}
+
+		if (32 % bpp == 0 && !start_index && !pitch_index &&
+		    ((width & (32/bpp-1)) == 0) &&
+		    bpp >= 8 && bpp <= 32)
+			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
+		else
+			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
+					start_index, pitch_index);
+	} else
+		color_imageblit(image, p, dst1, start_index, pitch_index);
+}
+EXPORT_SYMBOL(drm_imageblit);
+
+/*
+ * Generic file operations where framebuffer is in system RAM
+ */
+ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
+{
+	unsigned long p = *ppos;
+	void *src;
+	int err = 0;
+	unsigned long total_size;
+
+	if (info->state != FBINFO_STATE_RUNNING)
+		return -EPERM;
+
+	total_size = info->screen_size;
+
+	if (total_size == 0)
+		total_size = info->fix.smem_len;
+
+	if (p >= total_size)
+		return 0;
+
+	if (count >= total_size)
+		count = total_size;
+
+	if (count + p > total_size)
+		count = total_size - p;
+
+	src = (void __force *)(info->screen_base + p);
+
+	if (info->fbops->fb_sync)
+		info->fbops->fb_sync(info);
+
+	if (copy_to_user(buf, src, count))
+		err = -EFAULT;
+
+	if  (!err)
+		*ppos += count;
+
+	return (err) ? err : count;
+}
+EXPORT_SYMBOL_GPL(drm_fb_read);
+
+/*
+ * Generic file operations where framebuffer is in system RAM
+ */
+ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
+{
+	unsigned long p = *ppos;
+	void *dst;
+	int err = 0;
+	unsigned long total_size;
+
+	if (info->state != FBINFO_STATE_RUNNING)
+		return -EPERM;
+
+	total_size = info->screen_size;
+
+	if (total_size == 0)
+		total_size = info->fix.smem_len;
+
+	if (p > total_size)
+		return -EFBIG;
+
+	if (count > total_size) {
+		err = -EFBIG;
+		count = total_size;
+	}
+
+	if (count + p > total_size) {
+		if (!err)
+			err = -ENOSPC;
+
+		count = total_size - p;
+	}
+
+	dst = (void __force *) (info->screen_base + p);
+
+	if (info->fbops->fb_sync)
+		info->fbops->fb_sync(info);
+
+	if (copy_from_user(dst, buf, count))
+		err = -EFAULT;
+
+	if  (!err)
+		*ppos += count;
+
+	return (err) ? err : count;
+}
+EXPORT_SYMBOL_GPL(drm_fb_write);
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index da0d96a69570..18fba9818635 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
 EXPORT_SYMBOL(drm_fb_helper_deferred_io);
 
 /**
- * drm_fb_helper_sys_read - wrapper around fb_sys_read
+ * drm_fb_helper_sys_read - wrapper around drm_fb_read
  * @info: fb_info struct pointer
  * @buf: userspace buffer to read from framebuffer memory
  * @count: number of bytes to read from framebuffer memory
  * @ppos: read offset within framebuffer memory
  *
- * A wrapper around fb_sys_read implemented by fbdev core
+ * A wrapper around drm_fb_read
  */
 ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
 			       size_t count, loff_t *ppos)
 {
-	return fb_sys_read(info, buf, count, ppos);
+	return drm_fb_read(info, buf, count, ppos);
 }
 EXPORT_SYMBOL(drm_fb_helper_sys_read);
 
 /**
- * drm_fb_helper_sys_write - wrapper around fb_sys_write
+ * drm_fb_helper_sys_write - wrapper around drm_fb_write
  * @info: fb_info struct pointer
  * @buf: userspace buffer to write to framebuffer memory
  * @count: number of bytes to write to framebuffer memory
  * @ppos: write offset within framebuffer memory
  *
- * A wrapper around fb_sys_write implemented by fbdev core
+ * A wrapper around drm_fb_write
  */
 ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
 				size_t count, loff_t *ppos)
 {
 	ssize_t ret;
 
-	ret = fb_sys_write(info, buf, count, ppos);
+	ret = drm_fb_write(info, buf, count, ppos);
 	if (ret > 0)
 		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
 				    info->var.yres);
@@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
 EXPORT_SYMBOL(drm_fb_helper_sys_write);
 
 /**
- * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
+ * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
  * @info: fbdev registered by the helper
  * @rect: info about rectangle to fill
  *
- * A wrapper around sys_fillrect implemented by fbdev core
+ * A wrapper around drm_fillrect
  */
 void drm_fb_helper_sys_fillrect(struct fb_info *info,
 				const struct fb_fillrect *rect)
 {
-	sys_fillrect(info, rect);
+	drm_fillrect(info, rect);
 	drm_fb_helper_dirty(info, rect->dx, rect->dy,
 			    rect->width, rect->height);
 }
 EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
 
 /**
- * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
+ * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
  * @info: fbdev registered by the helper
  * @area: info about area to copy
  *
- * A wrapper around sys_copyarea implemented by fbdev core
+ * A wrapper around drm_copyarea
  */
 void drm_fb_helper_sys_copyarea(struct fb_info *info,
 				const struct fb_copyarea *area)
 {
-	sys_copyarea(info, area);
+	drm_copyarea(info, area);
 	drm_fb_helper_dirty(info, area->dx, area->dy,
 			    area->width, area->height);
 }
 EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
 
 /**
- * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
+ * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
  * @info: fbdev registered by the helper
  * @image: info about image to blit
  *
- * A wrapper around sys_imageblit implemented by fbdev core
+ * A wrapper around drm_imageblit
  */
 void drm_fb_helper_sys_imageblit(struct fb_info *info,
 				 const struct fb_image *image)
 {
-	sys_imageblit(info, image);
+	drm_imageblit(info, image);
 	drm_fb_helper_dirty(info, image->dx, image->dy,
 			    image->width, image->height);
 }
diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
index 306aa3a60be9..d38005f6128a 100644
--- a/include/drm/drm_fb_helper.h
+++ b/include/drm/drm_fb_helper.h
@@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
 
 void drm_fbdev_generic_setup(struct drm_device *dev,
 			     unsigned int preferred_bpp);
+
+void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
+void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
+void drm_imageblit(struct fb_info *p, const struct fb_image *image);
+ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
+ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
 #else
 static inline void drm_fb_helper_prepare(struct drm_device *dev,
 					struct drm_fb_helper *helper,
-- 
2.25.1
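
The offset and length handling in drm_fb_read() and drm_fb_write() above differs in one way: a read that overruns the end is shortened silently, while a write that overruns is truncated and still reports an error. A minimal userspace sketch of just that arithmetic follows. It assumes the state check, fb_sync call and user-copy fault handling are left out, and struct fb_span, fb_read_span() and fb_write_span() are hypothetical names that are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
struct fb_span {
	size_t count;	/* bytes that would actually be copied */
	long ret;	/* value handed back: byte count or negative errno */
};

/* Follows the clamping in drm_fb_read(): a read at or past the end
 * yields 0, a read that overruns the end is shortened without error. */
static struct fb_span fb_read_span(size_t total, size_t pos, size_t count)
{
	struct fb_span s = { 0, 0 };

	if (pos >= total)
		return s;
	if (count >= total)
		count = total;
	if (count + pos > total)
		count = total - pos;

	s.count = count;
	s.ret = (long)count;
	return s;
}

/* Follows the clamping in drm_fb_write(): an overrunning write is
 * truncated, but the error code takes precedence over the byte count. */
static struct fb_span fb_write_span(size_t total, size_t pos, size_t count)
{
	struct fb_span s = { 0, 0 };
	long err = 0;

	if (pos > total) {
		s.ret = -EFBIG;
		return s;
	}
	if (count > total) {
		err = -EFBIG;
		count = total;
	}
	if (count + pos > total) {
		if (!err)
			err = -ENOSPC;
		count = total - pos;
	}

	s.count = count;
	s.ret = err ? err : (long)count;
	return s;
}
```

With a 100-byte buffer, a 20-byte read at offset 90 returns 10, whereas a 20-byte write at offset 90 would copy 10 bytes and return -ENOSPC.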

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-08-02 20:01           ` Sam Ravnborg
@ 2020-08-03  6:46             ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-08-03  6:46 UTC (permalink / raw)
  To: Sam Ravnborg, daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, peda, dan.carpenter


Hi

Am 02.08.20 um 22:01 schrieb Sam Ravnborg:
> On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
>> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
>>> Hi Daniel.
>>>
>>> On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
>>>> On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
>>>>> DRM fb helpers require read and write functions for framebuffer
>>>>> memory. Export the existing code from fbdev.
>>>>>
>>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>>>>
>>>> Hm I'm not super sure whether we want to actually reuse this stuff ... We
>>>> kinda don't care about the sparc special case, and just having an fbdev
>>>> implementation which has the switch between memcpy and memcpy_to/from_io
>>>> in one single place sounds a lot simpler ...
>>>>
>>>> This way we can have a clean split between the old horrors of real fbdev
>>>> drivers, and a much cleaner world in drm. It would mean a bit of
>>>> copypasting, but I think that's actually a good thing.
>>>>
>>>> In general my idea for drm fbdev emulation is that for any area we have a
>>>> problem we just ignore the entire fbmem.c code and write our own: mmap,
>>>> backlight handling (still unsolved, and horrible), cfb vs sys here. This
>>>> entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
>>>> duplication here doesn't seem worth it imo.
>>>>
>>>> Thoughts?
>>>
>>>
>>> I can see that fbmem is a mix of ioctl support and other stuff.
>>> We could factor out all the ioctl parts of fbmem.c to a new file
>>> named fbioctl.c.
>>>
>>> And then let the ioctl parts call down into drm stuff and avoid reusing
>>> the fbdev code when we first reach drm code.
>>> This would require local copies of:
>>> sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
>>> and more I think which I missed.
>>>
>>> With local copies we could avoid some of the special cases and trim the
>>> functions to what is required by drm only.
>>> And then no more fbmem dependencies and no dependencies to several of
>>> the small helper functions. So less entanglement with fbdev core.
>>>
>>> This all sounds simple so I am surely missing a lot of ugly details here.
>>>
>>> And should we touch this anyway we need a test suite to verify not too
>>> much breaks. To the best of my knowledge there is not yet such a test
>>> suite :-( Maybe because people caring about fbdev are limited.
>>
>> Well my idea was to not refactor anything, but just have drm copies of the
>> various fb_ops callbacks. Definitely not even more refactoring :-)

Thanks for making a prototype, Sam.

But do we really want to duplicate all this code from fbdev? It's not
actually pretty and there's little value in rewriting it. If anything,
I can see us extending drm_format_helper.c and building upon this.

Best regards
Thomas
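
Daniel's suggestion quoted above, an fbdev implementation that switches between memcpy and the I/O copy helpers in one single place, could look roughly like the following userspace sketch. Everything in it is an assumption made for illustration: struct fb_map, its is_iomem flag, fb_map_write() and fb_map_read() are hypothetical names, and the sketch_memcpy_toio()/sketch_memcpy_fromio() functions are byte-wise stand-ins for the kernel's memcpy_toio() and memcpy_fromio(), which cannot be called outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: a mapping plus a flag naming its memory type. */
struct fb_map {
	void *vaddr;
	bool is_iomem;
};

/* Userspace stand-in for memcpy_toio(): byte-wise volatile stores. */
static void sketch_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/* Userspace stand-in for memcpy_fromio(): byte-wise volatile loads. */
static void sketch_memcpy_fromio(void *dst, const volatile void *src, size_t n)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/* The single place that decides how framebuffer memory is written. */
static void fb_map_write(const struct fb_map *map, size_t offset,
			 const void *src, size_t n)
{
	unsigned char *dst = (unsigned char *)map->vaddr + offset;

	if (map->is_iomem)
		sketch_memcpy_toio(dst, src, n);
	else
		memcpy(dst, src, n);
}

/* The single place that decides how framebuffer memory is read. */
static void fb_map_read(const struct fb_map *map, size_t offset,
			void *dst, size_t n)
{
	const unsigned char *src = (const unsigned char *)map->vaddr + offset;

	if (map->is_iomem)
		sketch_memcpy_fromio(dst, src, n);
	else
		memcpy(dst, src, n);
}
```

Callers such as read, write, fillrect or imageblit helpers would then go through these two functions only, so the system-RAM and I/O-memory cases would not need separate sys_ and cfb_ copies of the drawing code.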

> 
> $ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
> 1212 drivers/gpu/drm/drm_fb_fbdev_helper.c
> 
> This is a straight copy of:
> $ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
> EXPORT_SYMBOL(drm_copyarea);
> EXPORT_SYMBOL(drm_fillrect);
> EXPORT_SYMBOL(drm_imageblit);
> EXPORT_SYMBOL_GPL(drm_fb_read);
> EXPORT_SYMBOL_GPL(drm_fb_write);
> 
> copyarea, fillrect and imageblit are a copy of the sys variants.
> read and write are from fb_sys_fops.c
> 
> Patch builds but I am not a big fan yet - right now it gains us very
> little. What would the next step be?
> 
> I looked briefly at the few drivers that have not migrated to
> the generic fbdev emulation (yet?).
>   msm
>   omapdrm
>   gma500
> 
> all use sys_ variants - they already use the migrated variants as
> they call the drm_fb helper.
> 
> 
>   exynos
>   gma500
>   i915
>   nouveau
>   radeon
>   rockchip
>   vmwgfx
> 
> all use the cfb variants and cannot use the migrated variants as-is.
> 
> 	Sam
> 
> From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
> From: Sam Ravnborg <sam@ravnborg.org>
> Date: Fri, 31 Jul 2020 23:36:01 +0200
> Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm
> 
> Copy the fbdev functions used by fb_ops to drm.
> Having local copies makes it possible
> to adjust the functions so they are tailored to the needs of DRM.
> At the same time there is one less dependency between DRM and fbdev.
> 
> The files were copied verbatim and only minimal changes were made.
> 
> As some drivers use the fbdev functions, they continue to be built, but
> the generic fbdev emulation no longer uses the fbdev functions.
> 
> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/Kconfig               |    1 -
>  drivers/gpu/drm/Makefile              |    1 +
>  drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
>  drivers/gpu/drm/drm_fb_helper.c       |   30 +-
>  include/drm/drm_fb_helper.h           |    6 +
>  5 files changed, 1234 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index c4fd57d8b717..11f41bc31f05 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
>  	select FB
>  	select FRAMEBUFFER_CONSOLE if !EXPERT
>  	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
> -	select FB_SYS_FOPS
>  	select FB_SYS_FILLRECT
>  	select FB_SYS_COPYAREA
>  	select FB_SYS_IMAGEBLIT
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 02ee5faf1a92..398cb4746762 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
>  
>  drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
>  drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
> +drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
>  drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
>  drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
>  drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
> diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> new file mode 100644
> index 000000000000..dda60f85b9d5
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> @@ -0,0 +1,1212 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This file is subject to the terms and conditions of the GNU General Public
> + * License.  See the file COPYING in the main directory of this archive for
> + * more details.
> + *
> + * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
> + *
> + * drm_copyarea:
> + *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
> + *   on Geert Uytterhoeven's copyarea routine)
> + *
> + *  drm_fillrect:
> + *    Based almost entirely from cfbfillrect.c (which is based almost entirely
> + *    on Geert Uytterhoeven's fillrect routine)
> + *
> + *  drm_imageblit:
> + *    Based almost entirely on cfbimgblt.c
> + *
> + */
> +
> +#include <linux/export.h>
> +#include <linux/fb.h>
> +#include <linux/io.h>
> +
> +
> +/*
> +*  Compose two values, using a bitmask as decision value
> +*  This is equivalent to (a & mask) | (b & ~mask)
> +*/
> +static inline unsigned long
> +comp(unsigned long a, unsigned long b, unsigned long mask)
> +{
> +    return ((a ^ b) & mask) ^ b;
> +}
> +
> +/*
> + *  Create a pattern with the given pixel's color
> + */
> +#if BITS_PER_LONG == 64
> +static inline unsigned long
> +pixel_to_pat( u32 bpp, u32 pixel)
> +{
> +	switch (bpp) {
> +	case 1:
> +		return 0xfffffffffffffffful*pixel;
> +	case 2:
> +		return 0x5555555555555555ul*pixel;
> +	case 4:
> +		return 0x1111111111111111ul*pixel;
> +	case 8:
> +		return 0x0101010101010101ul*pixel;
> +	case 12:
> +		return 0x1001001001001001ul*pixel;
> +	case 16:
> +		return 0x0001000100010001ul*pixel;
> +	case 24:
> +		return 0x0001000001000001ul*pixel;
> +	case 32:
> +		return 0x0000000100000001ul*pixel;
> +	default:
> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> +		return 0;
> +    }
> +}
> +#else
> +static inline unsigned long
> +pixel_to_pat( u32 bpp, u32 pixel)
> +{
> +	switch (bpp) {
> +	case 1:
> +		return 0xfffffffful*pixel;
> +	case 2:
> +		return 0x55555555ul*pixel;
> +	case 4:
> +		return 0x11111111ul*pixel;
> +	case 8:
> +		return 0x01010101ul*pixel;
> +	case 12:
> +		return 0x01001001ul*pixel;
> +	case 16:
> +		return 0x00010001ul*pixel;
> +	case 24:
> +		return 0x01000001ul*pixel;
> +	case 32:
> +		return 0x00000001ul*pixel;
> +	default:
> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> +		return 0;
> +    }
> +}
> +#endif
> +
> +#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
> +#if BITS_PER_LONG == 64
> +#define REV_PIXELS_MASK1 0x5555555555555555ul
> +#define REV_PIXELS_MASK2 0x3333333333333333ul
> +#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
> +#else
> +#define REV_PIXELS_MASK1 0x55555555ul
> +#define REV_PIXELS_MASK2 0x33333333ul
> +#define REV_PIXELS_MASK4 0x0f0f0f0ful
> +#endif
> +
> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> +						  u32 bswapmask)
> +{
> +	if (bswapmask & 1)
> +		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
> +	if (bswapmask & 2)
> +		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
> +	if (bswapmask & 3)
> +		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
> +	return val;
> +}
> +
> +static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
> +					     u32 bswapmask)
> +{
> +	u32 mask;
> +
> +	if (!bswapmask) {
> +		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
> +	} else {
> +		mask = 0xff << FB_LEFT_POS(p, 8);
> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> +#if defined(__i386__) || defined(__x86_64__)
> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> +		if(index + bswapmask < 32)
> +#endif
> +			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
> +					(index + bswapmask) & ~(bswapmask));
> +	}
> +	return mask;
> +}
> +
> +static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
> +							u32 index,
> +							u32 bswapmask)
> +{
> +	unsigned long mask;
> +
> +	if (!bswapmask) {
> +		mask = FB_SHIFT_HIGH(p, ~0UL, index);
> +	} else {
> +		mask = 0xff << FB_LEFT_POS(p, 8);
> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> +#if defined(__i386__) || defined(__x86_64__)
> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> +		if(index + bswapmask < BITS_PER_LONG)
> +#endif
> +			mask |= FB_SHIFT_HIGH(p, ~0UL,
> +					(index + bswapmask) & ~(bswapmask));
> +	}
> +	return mask;
> +}
> +
> +
> +static inline u32 fb_compute_bswapmask(struct fb_info *info)
> +{
> +	u32 bswapmask = 0;
> +	unsigned bpp = info->var.bits_per_pixel;
> +
> +	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
> +		/*
> +		 * Reversed order of pixel layout in bytes
> +		 * works only for 1, 2 and 4 bpp
> +		 */
> +		bswapmask = 7 - bpp + 1;
> +	}
> +	return bswapmask;
> +}
> +
> +#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> +
> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> +						  u32 bswapmask)
> +{
> +	return val;
> +}
> +
> +#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
> +#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
> +#define fb_compute_bswapmask(...) 0
> +
> +#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> +
> +#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
> +#define _cpu_to_le_long(x) __cpu_to_le_long(x)
> +#define __cpu_to_le_long(x) cpu_to_le##x
> +
> +#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
> +#define _le_long_to_cpu(x) __le_long_to_cpu(x)
> +#define __le_long_to_cpu(x) le##x##_to_cpu
> +
> +static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
> +{
> +	return (word << shift) | (word >> (x - shift));
> +}
> +
> +
> +/*
> + *  Generic bitwise copy algorithm
> + */
> +static void
> +bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> +	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
> +{
> +	unsigned long first, last;
> +	int const shift = dst_idx-src_idx;
> +	int left, right;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (!shift) {
> +		/* Same alignment for source and dest */
> +		if (dst_idx+n <= bits) {
> +			/* Single word */
> +			if (last)
> +				first &= last;
> +			*dst = comp(*src, *dst, first);
> +		} else {
> +			/* Multiple destination words */
> +			/* Leading bits */
> + 			if (first != ~0UL) {
> +				*dst = comp(*src, *dst, first);
> +				dst++;
> +				src++;
> +				n -= bits - dst_idx;
> +			}
> +
> +			/* Main chunk */
> +			n /= bits;
> +			while (n >= 8) {
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				n -= 8;
> +			}
> +			while (n--)
> +				*dst++ = *src++;
> +
> +			/* Trailing bits */
> +			if (last)
> +				*dst = comp(*src, *dst, last);
> +		}
> +	} else {
> +		unsigned long d0, d1;
> +		int m;
> +
> +		/* Different alignment for source and dest */
> +		right = shift & (bits - 1);
> +		left = -shift & (bits - 1);
> +
> +		if (dst_idx+n <= bits) {
> +			/* Single destination word */
> +			if (last)
> +				first &= last;
> +			if (shift > 0) {
> +				/* Single source word */
> +				*dst = comp(*src << left, *dst, first);
> +			} else if (src_idx+n <= bits) {
> +				/* Single source word */
> +				*dst = comp(*src >> right, *dst, first);
> +			} else {
> +				/* 2 source words */
> +				d0 = *src++;
> +				d1 = *src;
> +				*dst = comp(d0 >> right | d1 << left, *dst,
> +					    first);
> +			}
> +		} else {
> +			/* Multiple destination words */
> +			/** We must always remember the last value read,
> +			    because in case SRC and DST overlap bitwise (e.g.
> +			    when moving just one pixel in 1bpp), we always
> +			    collect one full long for DST and that might
> +			    overlap with the current long from SRC. We store
> +			    this value in 'd0'. */
> +			d0 = *src++;
> +			/* Leading bits */
> +			if (shift > 0) {
> +				/* Single source word */
> +				*dst = comp(d0 << left, *dst, first);
> +				dst++;
> +				n -= bits - dst_idx;
> +			} else {
> +				/* 2 source words */
> +				d1 = *src++;
> +				*dst = comp(d0 >> right | d1 << left, *dst,
> +					    first);
> +				d0 = d1;
> +				dst++;
> +				n -= bits - dst_idx;
> +			}
> +
> +			/* Main chunk */
> +			m = n % bits;
> +			n /= bits;
> +			while (n >= 4) {
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				n -= 4;
> +			}
> +			while (n--) {
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +			}
> +
> +			/* Trailing bits */
> +			if (m) {
> +				if (m <= bits - right) {
> +					/* Single source word */
> +					d0 >>= right;
> +				} else {
> +					/* 2 source words */
> + 					d1 = *src;
> +					d0 = d0 >> right | d1 << left;
> +				}
> +				*dst = comp(d0, *dst, last);
> +			}
> +		}
> +	}
> +}
> +
> +/*
> + *  Generic bitwise copy algorithm, operating backward
> + */
> +static void
> +bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> +	   const unsigned long *src, unsigned src_idx, unsigned bits,
> +	   unsigned n)
> +{
> +	unsigned long first, last;
> +	int shift;
> +
> +	dst += (dst_idx + n - 1) / bits;
> +	src += (src_idx + n - 1) / bits;
> +	dst_idx = (dst_idx + n - 1) % bits;
> +	src_idx = (src_idx + n - 1) % bits;
> +
> +	shift = dst_idx-src_idx;
> +
> +	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
> +	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
> +
> +	if (!shift) {
> +		/* Same alignment for source and dest */
> +		if ((unsigned long)dst_idx+1 >= n) {
> +			/* Single word */
> +			if (first)
> +				last &= first;
> +			*dst = comp(*src, *dst, last);
> +		} else {
> +			/* Multiple destination words */
> +
> +			/* Leading bits */
> +			if (first) {
> +				*dst = comp(*src, *dst, first);
> +				dst--;
> +				src--;
> +				n -= dst_idx+1;
> +			}
> +
> +			/* Main chunk */
> +			n /= bits;
> +			while (n >= 8) {
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				n -= 8;
> +			}
> +			while (n--)
> +				*dst-- = *src--;
> +			/* Trailing bits */
> +			if (last != -1UL)
> +				*dst = comp(*src, *dst, last);
> +		}
> +	} else {
> +		/* Different alignment for source and dest */
> +
> +		int const left = shift & (bits-1);
> +		int const right = -shift & (bits-1);
> +
> +		if ((unsigned long)dst_idx+1 >= n) {
> +			/* Single destination word */
> +			if (first)
> +				last &= first;
> +			if (shift < 0) {
> +				/* Single source word */
> +				*dst = comp(*src >> right, *dst, last);
> +			} else if (1+(unsigned long)src_idx >= n) {
> +				/* Single source word */
> +				*dst = comp(*src << left, *dst, last);
> +			} else {
> +				/* 2 source words */
> +				*dst = comp(*src << left | *(src-1) >> right,
> +					    *dst, last);
> +			}
> +		} else {
> +			/* Multiple destination words */
> +			/** We must always remember the last value read,
> +			    because in case SRC and DST overlap bitwise (e.g.
> +			    when moving just one pixel in 1bpp), we always
> +			    collect one full long for DST and that might
> +			    overlap with the current long from SRC. We store
> +			    this value in 'd0'. */
> +			unsigned long d0, d1;
> +			int m;
> +
> +			d0 = *src--;
> +			/* Leading bits */
> +			if (shift < 0) {
> +				/* Single source word */
> +				d1 = d0;
> +				d0 >>= right;
> +			} else {
> +				/* 2 source words */
> +				d1 = *src--;
> +				d0 = d0 << left | d1 >> right;
> +			}
> +			if (!first)
> +				*dst = d0;
> +			else
> +				*dst = comp(d0, *dst, first);
> +			d0 = d1;
> +			dst--;
> +			n -= dst_idx+1;
> +
> +			/* Main chunk */
> +			m = n % bits;
> +			n /= bits;
> +			while (n >= 4) {
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				n -= 4;
> +			}
> +			while (n--) {
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +			}
> +
> +			/* Trailing bits */
> +			if (m) {
> +				if (m <= bits - left) {
> +					/* Single source word */
> +					d0 <<= left;
> +				} else {
> +					/* 2 source words */
> +					d1 = *src;
> +					d0 = d0 << left | d1 >> right;
> +				}
> +				*dst = comp(d0, *dst, last);
> +			}
> +		}
> +	}
> +}
> +
> +/*
> + * Generic Bit Block Transfer for frame buffers located in system RAM with
> + * packed pixels of any depth.
> + */
> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
> +{
> +	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
> +	u32 height = area->height, width = area->width;
> +	unsigned long const bits_per_line = p->fix.line_length*8u;
> +	unsigned long *base = NULL;
> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> +	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
> +
> +	if (p->state != FBINFO_STATE_RUNNING)
> +		return;
> +
> +	/* if the beginning of the target area might overlap with the end of
> +	the source area, we have to copy the area in reverse. */
> +	if ((dy == sy && dx > sx) || (dy > sy)) {
> +		dy += height;
> +		sy += height;
> +		rev_copy = 1;
> +	}
> +
> +	/* split the base of the framebuffer into a long-aligned address and
> +	   the index of the first bit */
> +	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> +	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
> +	/* add offset of source and target area */
> +	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
> +	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
> +
> +	if (p->fbops->fb_sync)
> +		p->fbops->fb_sync(p);
> +
> +	if (rev_copy) {
> +		while (height--) {
> +			dst_idx -= bits_per_line;
> +			src_idx -= bits_per_line;
> +			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
> +				base + (src_idx / bits), src_idx % bits, bits,
> +				width*p->var.bits_per_pixel);
> +		}
> +	} else {
> +		while (height--) {
> +			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
> +				base + (src_idx / bits), src_idx % bits, bits,
> +				width*p->var.bits_per_pixel);
> +			dst_idx += bits_per_line;
> +			src_idx += bits_per_line;
> +		}
> +	}
> +}
> +EXPORT_SYMBOL(drm_copyarea);
> +
> +/*
> + *  Aligned pattern fill using 32/64-bit memory accesses
> + */
> +static void
> +bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		unsigned long pat, unsigned n, int bits)
> +{
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(pat, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +
> +		/* Leading bits */
> + 		if (first!= ~0UL) {
> +			*dst = comp(pat, *dst, first);
> +			dst++;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 8) {
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			n -= 8;
> +		}
> +		while (n--)
> +			*dst++ = pat;
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(pat, *dst, last);
> +	}
> +}
> +
> +
> +/*
> + *  Unaligned generic pattern fill using 32/64-bit memory accesses
> + *  The pattern must have been expanded to a full 32/64-bit value
> + *  Left/right are the appropriate shifts to convert to the pattern to be
> + *  used for the next 32/64-bit word
> + */
> +static void
> +bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		  unsigned long pat, int left, int right, unsigned n, int bits)
> +{
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(pat, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +		/* Leading bits */
> +		if (first) {
> +			*dst = comp(pat, *dst, first);
> +			dst++;
> +			pat = pat << left | pat >> right;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 4) {
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			n -= 4;
> +		}
> +		while (n--) {
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +		}
> +
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(pat, *dst, last);
> +	}
> +}
> +
> +/*
> + *  Aligned pattern invert using 32/64-bit memory accesses
> + */
> +static void
> +bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		    unsigned long pat, unsigned n, int bits)
> +{
> +	unsigned long val = pat;
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(*dst ^ val, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +		/* Leading bits */
> +		if (first!=0UL) {
> +			*dst = comp(*dst ^ val, *dst, first);
> +			dst++;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 8) {
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			n -= 8;
> +		}
> +		while (n--)
> +			*dst++ ^= val;
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(*dst ^ val, *dst, last);
> +	}
> +}
> +
> +
> +/*
> + *  Unaligned generic pattern invert using 32/64-bit memory accesses
> + *  The pattern must have been expanded to a full 32/64-bit value
> + *  Left/right are the appropriate shifts to convert to the pattern to be
> + *  used for the next 32/64-bit word
> + */
> +static void
> +bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		      unsigned long pat, int left, int right, unsigned n,
> +		      int bits)
> +{
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(*dst ^ pat, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +
> +		/* Leading bits */
> +		if (first != 0UL) {
> +			*dst = comp(*dst ^ pat, *dst, first);
> +			dst++;
> +			pat = pat << left | pat >> right;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 4) {
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			n -= 4;
> +		}
> +		while (n--) {
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +		}
> +
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(*dst ^ pat, *dst, last);
> +	}
> +}
> +
> +/*
> + * Generic fillrect for frame buffers in system RAM with packed pixels of
> + * any depth.
> + */
> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
> +{
> +	unsigned long pat, pat2, fg;
> +	unsigned long width = rect->width, height = rect->height;
> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> +	u32 bpp = p->var.bits_per_pixel;
> +	unsigned long *dst;
> +	int dst_idx, left;
> +
> +	if (p->state != FBINFO_STATE_RUNNING)
> +		return;
> +
> +	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> +	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> +		fg = ((u32 *) (p->pseudo_palette))[rect->color];
> +	else
> +		fg = rect->color;
> +
> +	pat = pixel_to_pat( bpp, fg);
> +
> +	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> +	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
> +	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
> +	/* FIXME For now we support 1-32 bpp only */
> +	left = bits % bpp;
> +	if (p->fbops->fb_sync)
> +		p->fbops->fb_sync(p);
> +	if (!left) {
> +		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
> +				  int dst_idx, unsigned long pat, unsigned n,
> +				  int bits) = NULL;
> +
> +		switch (rect->rop) {
> +		case ROP_XOR:
> +			fill_op32 = bitfill_aligned_rev;
> +			break;
> +		case ROP_COPY:
> +			fill_op32 = bitfill_aligned;
> +			break;
> +		default:
> +			printk( KERN_ERR "drm_fillrect(): unknown rop, "
> +				"defaulting to ROP_COPY\n");
> +			fill_op32 = bitfill_aligned;
> +			break;
> +		}
> +		while (height--) {
> +			dst += dst_idx >> (ffs(bits) - 1);
> +			dst_idx &= (bits - 1);
> +			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
> +			dst_idx += p->fix.line_length*8;
> +		}
> +	} else {
> +		int right, r;
> +		void (*fill_op)(struct fb_info *p, unsigned long *dst,
> +				int dst_idx, unsigned long pat, int left,
> +				int right, unsigned n, int bits) = NULL;
> +#ifdef __LITTLE_ENDIAN
> +		right = left;
> +		left = bpp - right;
> +#else
> +		right = bpp - left;
> +#endif
> +		switch (rect->rop) {
> +		case ROP_XOR:
> +			fill_op = bitfill_unaligned_rev;
> +			break;
> +		case ROP_COPY:
> +			fill_op = bitfill_unaligned;
> +			break;
> +		default:
> +			printk(KERN_ERR "drm_fillrect(): unknown rop, "
> +				"defaulting to ROP_COPY\n");
> +			fill_op = bitfill_unaligned;
> +			break;
> +		}
> +		while (height--) {
> +			dst += dst_idx / bits;
> +			dst_idx &= (bits - 1);
> +			r = dst_idx % bpp;
> +			/* rotate pattern to the correct start position */
> +			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
> +			fill_op(p, dst, dst_idx, pat2, left, right,
> +				width*bpp, bits);
> +			dst_idx += p->fix.line_length*8;
> +		}
> +	}
> +}
> +EXPORT_SYMBOL(drm_fillrect);
> +
> +static const u32 cfb_tab8_be[] = {
> +    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
> +    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
> +    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
> +    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
> +};
> +
> +static const u32 cfb_tab8_le[] = {
> +    0x00000000,0xff000000,0x00ff0000,0xffff0000,
> +    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
> +    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
> +    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
> +};
> +
> +static const u32 cfb_tab16_be[] = {
> +    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
> +};
> +
> +static const u32 cfb_tab16_le[] = {
> +    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
> +};
> +
> +static const u32 cfb_tab32[] = {
> +	0x00000000, 0xffffffff
> +};
> +
> +static void color_imageblit(const struct fb_image *image, struct fb_info *p,
> +			    void *dst1, u32 start_index, u32 pitch_index)
> +{
> +	/* Draw the penguin */
> +	u32 *dst, *dst2;
> +	u32 color = 0, val, shift;
> +	int i, n, bpp = p->var.bits_per_pixel;
> +	u32 null_bits = 32 - bpp;
> +	u32 *palette = (u32 *) p->pseudo_palette;
> +	const u8 *src = image->data;
> +
> +	dst2 = dst1;
> +	for (i = image->height; i--; ) {
> +		n = image->width;
> +		dst = dst1;
> +		shift = 0;
> +		val = 0;
> +
> +		if (start_index) {
> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> +							 start_index));
> +			val = *dst & start_mask;
> +			shift = start_index;
> +		}
> +		while (n--) {
> +			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> +			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> +				color = palette[*src];
> +			else
> +				color = *src;
> +			color <<= FB_LEFT_POS(p, bpp);
> +			val |= FB_SHIFT_HIGH(p, color, shift);
> +			if (shift >= null_bits) {
> +				*dst++ = val;
> +
> +				val = (shift == null_bits) ? 0 :
> +					FB_SHIFT_LOW(p, color, 32 - shift);
> +			}
> +			shift += bpp;
> +			shift &= (32 - 1);
> +			src++;
> +		}
> +		if (shift) {
> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> +
> +			*dst &= end_mask;
> +			*dst |= val;
> +		}
> +		dst1 += p->fix.line_length;
> +		if (pitch_index) {
> +			dst2 += p->fix.line_length;
> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> +
> +			start_index += pitch_index;
> +			start_index &= 32 - 1;
> +		}
> +	}
> +}
> +
> +static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
> +				  void *dst1, u32 fgcolor, u32 bgcolor,
> +				  u32 start_index, u32 pitch_index)
> +{
> +	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
> +	u32 *dst, *dst2;
> +	u32 val, pitch = p->fix.line_length;
> +	u32 null_bits = 32 - bpp;
> +	u32 spitch = (image->width+7)/8;
> +	const u8 *src = image->data, *s;
> +	u32 i, j, l;
> +
> +	dst2 = dst1;
> +	fgcolor <<= FB_LEFT_POS(p, bpp);
> +	bgcolor <<= FB_LEFT_POS(p, bpp);
> +
> +	for (i = image->height; i--; ) {
> +		shift = val = 0;
> +		l = 8;
> +		j = image->width;
> +		dst = dst1;
> +		s = src;
> +
> +		/* write leading bits */
> +		if (start_index) {
> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> +							 start_index));
> +			val = *dst & start_mask;
> +			shift = start_index;
> +		}
> +
> +		while (j--) {
> +			l--;
> +			color = (*s & (1 << l)) ? fgcolor : bgcolor;
> +			val |= FB_SHIFT_HIGH(p, color, shift);
> +
> +			/* Did the bitshift spill bits to the next long? */
> +			if (shift >= null_bits) {
> +				*dst++ = val;
> +				val = (shift == null_bits) ? 0 :
> +					FB_SHIFT_LOW(p, color, 32 - shift);
> +			}
> +			shift += bpp;
> +			shift &= (32 - 1);
> +			if (!l) { l = 8; s++; }
> +		}
> +
> +		/* write trailing bits */
> +		if (shift) {
> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> +
> +			*dst &= end_mask;
> +			*dst |= val;
> +		}
> +
> +		dst1 += pitch;
> +		src += spitch;
> +		if (pitch_index) {
> +			dst2 += pitch;
> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> +			start_index += pitch_index;
> +			start_index &= 32 - 1;
> +		}
> +
> +	}
> +}
> +
> +/*
> + * fast_imageblit - optimized monochrome color expansion
> + *
> + * Only if:  bits_per_pixel == 8, 16, or 32
> + *           image->width is divisible by pixel/dword (ppw);
> + *           fix->line_length is divisible by 4;
> + *           beginning and end of a scanline is dword aligned
> + */
> +static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
> +				  void *dst1, u32 fgcolor, u32 bgcolor)
> +{
> +	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
> +	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
> +	u32 bit_mask, end_mask, eorx, shift;
> +	const char *s = image->data, *src;
> +	u32 *dst;
> +	const u32 *tab = NULL;
> +	int i, j, k;
> +
> +	switch (bpp) {
> +	case 8:
> +		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
> +		break;
> +	case 16:
> +		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
> +		break;
> +	case 32:
> +	default:
> +		tab = cfb_tab32;
> +		break;
> +	}
> +
> +	for (i = ppw-1; i--; ) {
> +		fgx <<= bpp;
> +		bgx <<= bpp;
> +		fgx |= fgcolor;
> +		bgx |= bgcolor;
> +	}
> +
> +	bit_mask = (1 << ppw) - 1;
> +	eorx = fgx ^ bgx;
> +	k = image->width/ppw;
> +
> +	for (i = image->height; i--; ) {
> +		dst = dst1;
> +		shift = 8;
> +		src = s;
> +
> +		for (j = k; j--; ) {
> +			shift -= ppw;
> +			end_mask = tab[(*src >> shift) & bit_mask];
> +			*dst++ = (end_mask & eorx) ^ bgx;
> +			if (!shift) {
> +				shift = 8;
> +				src++;
> +			}
> +		}
> +		dst1 += p->fix.line_length;
> +		s += spitch;
> +	}
> +}
> +
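
The table lookup in fast_imageblit() is easier to follow with numbers. A
minimal userspace sketch for the 8 bpp little-endian case, assuming four
pixels per 32-bit word; expand_nibble_8bpp() is a made-up helper name, and
the multiplication by 0x01010101 stands in for the shift-and-or loop that
builds fgx and bgx in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian 8 bpp table, copied from cfb_tab8_le in the patch. */
static const uint32_t tab8_le[16] = {
	0x00000000, 0xff000000, 0x00ff0000, 0xffff0000,
	0x0000ff00, 0xff00ff00, 0x00ffff00, 0xffffff00,
	0x000000ff, 0xff0000ff, 0x00ff00ff, 0xffff00ff,
	0x0000ffff, 0xff00ffff, 0x00ffffff, 0xffffffff
};

/*
 * Hypothetical helper: expand four monochrome bits (MSB = leftmost pixel)
 * into one 32-bit word holding four 8 bpp pixels.
 */
static uint32_t expand_nibble_8bpp(unsigned int nibble, uint8_t fg, uint8_t bg)
{
	uint32_t fgx = 0x01010101u * fg;	/* fg replicated into every pixel */
	uint32_t bgx = 0x01010101u * bg;	/* bg replicated into every pixel */
	uint32_t eorx = fgx ^ bgx;

	assert(nibble < 16);
	/* Where the mask is set, bg ^ (fg ^ bg) yields fg; elsewhere bg stays. */
	return (tab8_le[nibble] & eorx) ^ bgx;
}
```

With fg = 0xaa and bg = 0x55, the nibble 0xa (binary 1010) expands to
0x55aa55aa, which on a little-endian machine is the byte sequence
aa 55 aa 55 in memory, i.e. fg, bg, fg, bg from left to right.
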
> +/*
> + * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
> + * for frame buffer located in system RAM with packed pixels of any depth.
> + */
> +void drm_imageblit(struct fb_info *p, const struct fb_image *image)
> +{
> +	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
> +	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
> +	u32 width = image->width;
> +	u32 dx = image->dx, dy = image->dy;
> +	void *dst1;
> +
> +	if (p->state != FBINFO_STATE_RUNNING)
> +		return;
> +
> +	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
> +	start_index = bitstart & (32 - 1);
> +	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
> +
> +	bitstart /= 8;
> +	bitstart &= ~(bpl - 1);
> +	dst1 = (void __force *)p->screen_base + bitstart;
> +
> +	if (p->fbops->fb_sync)
> +		p->fbops->fb_sync(p);
> +
> +	if (image->depth == 1) {
> +		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> +		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
> +			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
> +			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
> +		} else {
> +			fgcolor = image->fg_color;
> +			bgcolor = image->bg_color;
> +		}
> +
> +		if (32 % bpp == 0 && !start_index && !pitch_index &&
> +		    ((width & (32/bpp-1)) == 0) &&
> +		    bpp >= 8 && bpp <= 32)
> +			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
> +		else
> +			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
> +					start_index, pitch_index);
> +	} else
> +		color_imageblit(image, p, dst1, start_index, pitch_index);
> +}
> +EXPORT_SYMBOL(drm_imageblit);
> +
> +/*
> + * Generic file operations where framebuffer is in system RAM
> + */
> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
> +{
> +	unsigned long p = *ppos;
> +	void *src;
> +	int err = 0;
> +	unsigned long total_size;
> +
> +	if (info->state != FBINFO_STATE_RUNNING)
> +		return -EPERM;
> +
> +	total_size = info->screen_size;
> +
> +	if (total_size == 0)
> +		total_size = info->fix.smem_len;
> +
> +	if (p >= total_size)
> +		return 0;
> +
> +	if (count >= total_size)
> +		count = total_size;
> +
> +	if (count + p > total_size)
> +		count = total_size - p;
> +
> +	src = (void __force *)(info->screen_base + p);
> +
> +	if (info->fbops->fb_sync)
> +		info->fbops->fb_sync(info);
> +
> +	if (copy_to_user(buf, src, count))
> +		err = -EFAULT;
> +
> +	if  (!err)
> +		*ppos += count;
> +
> +	return (err) ? err : count;
> +}
> +EXPORT_SYMBOL_GPL(drm_fb_read);
> +
> +/*
> + * Generic file operations where framebuffer is in system RAM
> + */
> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
> +{
> +	unsigned long p = *ppos;
> +	void *dst;
> +	int err = 0;
> +	unsigned long total_size;
> +
> +	if (info->state != FBINFO_STATE_RUNNING)
> +		return -EPERM;
> +
> +	total_size = info->screen_size;
> +
> +	if (total_size == 0)
> +		total_size = info->fix.smem_len;
> +
> +	if (p > total_size)
> +		return -EFBIG;
> +
> +	if (count > total_size) {
> +		err = -EFBIG;
> +		count = total_size;
> +	}
> +
> +	if (count + p > total_size) {
> +		if (!err)
> +			err = -ENOSPC;
> +
> +		count = total_size - p;
> +	}
> +
> +	dst = (void __force *) (info->screen_base + p);
> +
> +	if (info->fbops->fb_sync)
> +		info->fbops->fb_sync(info);
> +
> +	if (copy_from_user(dst, buf, count))
> +		err = -EFAULT;
> +
> +	if  (!err)
> +		*ppos += count;
> +
> +	return (err) ? err : count;
> +}
> +EXPORT_SYMBOL_GPL(drm_fb_write);
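
The bounds handling in drm_fb_read() and drm_fb_write() differs in a few
small ways, which the following userspace model tries to make visible. It is
only a sketch: fb_read_clamp() and fb_write_clamp() are made-up names, the
copy itself is left out, and total/pos/count stand for total_size, *ppos and
count in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of the read path: returns the number of bytes that
 * would be copied for a request of count bytes at offset pos.
 */
static size_t fb_read_clamp(size_t total, size_t pos, size_t count)
{
	if (pos >= total)
		return 0;		/* at or past the end: EOF */
	if (count >= total)
		count = total;
	if (count + pos > total)
		count = total - pos;	/* short read up to the end */
	return count;
}

/*
 * Hypothetical model of the write path: stores the clamped length in *len
 * and returns 0 or the negative errno the write would report.
 */
static int fb_write_clamp(size_t total, size_t pos, size_t count, size_t *len)
{
	int err = 0;

	assert(len != NULL);
	if (pos > total)
		return -EFBIG;		/* nothing is copied, *len untouched */
	if (count > total) {
		err = -EFBIG;
		count = total;
	}
	if (count + pos > total) {
		if (!err)
			err = -ENOSPC;
		count = total - pos;
	}
	*len = count;			/* bytes copied even when err is set */
	return err;
}
```

Note that, as in the quoted code, a write that runs past the end still copies
the clamped amount but reports the error rather than the byte count.
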
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index da0d96a69570..18fba9818635 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
>  EXPORT_SYMBOL(drm_fb_helper_deferred_io);
>  
>  /**
> - * drm_fb_helper_sys_read - wrapper around fb_sys_read
> + * drm_fb_helper_sys_read - wrapper around drm_fb_read
>   * @info: fb_info struct pointer
>   * @buf: userspace buffer to read from framebuffer memory
>   * @count: number of bytes to read from framebuffer memory
>   * @ppos: read offset within framebuffer memory
>   *
> - * A wrapper around fb_sys_read implemented by fbdev core
> + * A wrapper around drm_fb_read
>   */
>  ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
>  			       size_t count, loff_t *ppos)
>  {
> -	return fb_sys_read(info, buf, count, ppos);
> +	return drm_fb_read(info, buf, count, ppos);
>  }
>  EXPORT_SYMBOL(drm_fb_helper_sys_read);
>  
>  /**
> - * drm_fb_helper_sys_write - wrapper around fb_sys_write
> + * drm_fb_helper_sys_write - wrapper around drm_fb_write
>   * @info: fb_info struct pointer
>   * @buf: userspace buffer to write to framebuffer memory
>   * @count: number of bytes to write to framebuffer memory
>   * @ppos: write offset within framebuffer memory
>   *
> - * A wrapper around fb_sys_write implemented by fbdev core
> + * A wrapper around drm_fb_write
>   */
>  ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>  				size_t count, loff_t *ppos)
>  {
>  	ssize_t ret;
>  
> -	ret = fb_sys_write(info, buf, count, ppos);
> +	ret = drm_fb_write(info, buf, count, ppos);
>  	if (ret > 0)
>  		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
>  				    info->var.yres);
> @@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>  EXPORT_SYMBOL(drm_fb_helper_sys_write);
>  
>  /**
> - * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
> + * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
>   * @info: fbdev registered by the helper
>   * @rect: info about rectangle to fill
>   *
> - * A wrapper around sys_fillrect implemented by fbdev core
> + * A wrapper around drm_fillrect
>   */
>  void drm_fb_helper_sys_fillrect(struct fb_info *info,
>  				const struct fb_fillrect *rect)
>  {
> -	sys_fillrect(info, rect);
> +	drm_fillrect(info, rect);
>  	drm_fb_helper_dirty(info, rect->dx, rect->dy,
>  			    rect->width, rect->height);
>  }
>  EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
>  
>  /**
> - * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
> + * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
>   * @info: fbdev registered by the helper
>   * @area: info about area to copy
>   *
> - * A wrapper around sys_copyarea implemented by fbdev core
> + * A wrapper around drm_copyarea
>   */
>  void drm_fb_helper_sys_copyarea(struct fb_info *info,
>  				const struct fb_copyarea *area)
>  {
> -	sys_copyarea(info, area);
> +	drm_copyarea(info, area);
>  	drm_fb_helper_dirty(info, area->dx, area->dy,
>  			    area->width, area->height);
>  }
>  EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
>  
>  /**
> - * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
> + * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
>   * @info: fbdev registered by the helper
>   * @image: info about image to blit
>   *
> - * A wrapper around sys_imageblit implemented by fbdev core
> + * A wrapper around drm_imageblit
>   */
>  void drm_fb_helper_sys_imageblit(struct fb_info *info,
>  				 const struct fb_image *image)
>  {
> -	sys_imageblit(info, image);
> +	drm_imageblit(info, image);
>  	drm_fb_helper_dirty(info, image->dx, image->dy,
>  			    image->width, image->height);
>  }
> diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
> index 306aa3a60be9..d38005f6128a 100644
> --- a/include/drm/drm_fb_helper.h
> +++ b/include/drm/drm_fb_helper.h
> @@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
>  
>  void drm_fbdev_generic_setup(struct drm_device *dev,
>  			     unsigned int preferred_bpp);
> +
> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
> +void drm_imageblit(struct fb_info *p, const struct fb_image *image);
> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
>  #else
>  static inline void drm_fb_helper_prepare(struct drm_device *dev,
>  					struct drm_fb_helper *helper,
> 
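
For anyone reading the fill helpers for the first time, the "single word"
branch is the core of the masking scheme. Below is a rough userspace model
on a 32-bit word. It assumes the little-endian behaviour of FB_SHIFT_HIGH
(a plain left shift); fill_single_word() is a made-up name, while comp() is
the same expression as in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* comp() as in the patch: take a where mask is set, b elsewhere. */
static uint32_t comp(uint32_t a, uint32_t b, uint32_t mask)
{
	return ((a ^ b) & mask) ^ b;
}

/*
 * Hypothetical model of the single-word branch: fill n bits of dst with
 * pat, starting at bit dst_idx, and leave all other bits untouched.
 */
static uint32_t fill_single_word(uint32_t dst, uint32_t pat,
				 unsigned int dst_idx, unsigned int n)
{
	uint32_t first, last;

	assert(n > 0 && dst_idx + n <= 32);
	/* Bits at or above the start position. */
	first = ~(uint32_t)0 << dst_idx;
	/* Bits below the end position; 0 if the fill ends at the word edge. */
	last = ~(~(uint32_t)0 << ((dst_idx + n) % 32));
	if (last)
		first &= last;
	return comp(pat, dst, first);
}
```

Filling 16 bits at index 8 of an all-zero word with an all-ones pattern gives
0x00ffff00. Clearing the top 8 bits of 0x12345678 (index 24, n = 8) gives
0x00345678; there last is 0, so only the first mask applies.
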

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
@ 2020-08-03  6:46             ` Thomas Zimmermann
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-08-03  6:46 UTC (permalink / raw)
  To: Sam Ravnborg, daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, peda, dan.carpenter


Hi

Am 02.08.20 um 22:01 schrieb Sam Ravnborg:
> On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
>> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
>>> Hi Daniel.
>>>
>>> On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
>>>> On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
>>>>> DRM fb helpers require read and write functions for framebuffer
>>>>> memory. Export the existing code from fbdev.
>>>>>
>>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>>>>
>>>> Hm I'm not super sure whether we want to actually reuse this stuff ... We
>>>> kinda don't care about the sparc special case, and just having an fbdev
>>>> implementation which has the switch between memcpy and memcpy_to/from_io
>>>> in one single place sounds a lot simpler ...
>>>>
>>>> This way we can have a clean split between the old horrors of real fbdev
>>>> drivers, and a much cleaner world in drm. It would mean a bit of
>>>> copypasting, but I think that's actually a good thing.
>>>>
>>>> In general my idea for drm fbdev emulation is that for any area we have a
>>>> problem we just ignore the entire fbmem.c code and write our own: mmap,
>>>> backlight handling (still unsolved, and horrible), cfb vs sys here. This
>>>> entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
>>>> duplication here doesn't seem worth it imo.
>>>>
>>>> Thoughts?
>>>
>>>
>>> I can see that fbmem is a mix of ioctl support and other stuff.
>>> We could factor out all the ioctl parts of fbmem.c to a new file
>>> named fbioctl.c.
>>>
>>> And then let the ioctl parts call down into drm stuff and avoid reusing
>>> the fbdev code when we first reach drm code.
>>> This would require local copies of:
>>> sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
>>> and more I think which I missed.
>>>
>>> With local copies we could avoid some of the special cases and trim the
>>> functions to what is required by drm only.
>>> And then no more fbmem dependencies and no dependencies to several of
>>> the small helper functions. So less entanglement with fbdev core.
>>>
>>> This all sounds simple so I am surely missing a lot of ugly details here.
>>>
>>> And should we touch this anyway we need a test suite to verify not too
>>> much breaks. To the best of my knowledge there is not yet such a test
>>> suite :-( Maybe because people caring about fbdev are limited.
>>
>> Well my idea was to not refactor anything, but just have drm copies of the
>> various fb_ops callbacks. Definitely not even more refactoring :-)

Thanks for making a prototype, Sam.

But do we really want to duplicate all this code from fbdev? It's not
actually pretty and there's little value in rewriting it. If anything,
I can see us extending drm_format_helper.c and building upon this.
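
To make the alternative concrete: the single switch between memcpy and the
I/O variants that Daniel describes could be modelled roughly as below. This
is a userspace sketch, not a proposal for the actual interface. struct
fb_mapping, fb_mapping_write() and model_memcpy_toio() are made-up names,
and the latter only stands in for the kernel's memcpy_toio().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for a mapped framebuffer. */
struct fb_mapping {
	void *vaddr;
	bool is_iomem;	/* true if vaddr refers to I/O memory */
};

/*
 * Userspace stand-in for memcpy_toio(): in the kernel this would be the
 * real I/O accessor, here it only copies byte by byte.
 */
static void model_memcpy_toio(volatile void *dst, const void *src, size_t len)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/* The one place that decides between regular and I/O stores. */
static void fb_mapping_write(struct fb_mapping *map, size_t offset,
			     const void *src, size_t len)
{
	unsigned char *dst;

	assert(map != NULL && map->vaddr != NULL);
	dst = (unsigned char *)map->vaddr + offset;
	if (map->is_iomem)
		model_memcpy_toio(dst, src, len);
	else
		memcpy(dst, src, len);
}
```

Callers would never test is_iomem themselves, so the drawing and file
operations stay identical for both kinds of memory.
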

Best regards
Thomas

> 
> $ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
> 1212 drivers/gpu/drm/drm_fb_fbdev_helper.c
> 
> This is a straight copy of:
> $ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
> EXPORT_SYMBOL(drm_copyarea);
> EXPORT_SYMBOL(drm_fillrect);
> EXPORT_SYMBOL(drm_imageblit);
> EXPORT_SYMBOL_GPL(drm_fb_read);
> EXPORT_SYMBOL_GPL(drm_fb_write);
> 
> copyarea, fillrect and imageblit are a copy of the sys variants.
> read and write are from fb_sys_fops.c
> 
> Patch builds but I am not a big fan yet - right now it gains us very
> little. What would the next step be?
> 
> I looked briefly at the few drivers that have not migrated to
> the generic fbdev emulation (yet?).
>   msm
>   omapdrm
>   gma500
> 
> all use sys_ variants - they already use the migrated variants as
> they call the drm_fb helper.
> 
> 
>   exynos
>   gma500
>   i915
>   nouveau
>   radeon
>   rockchip
>   vmwgfx
> 
> all use the cfb variants and cannot use the migrated variants as-is.
> 
> 	Sam
> 
> From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
> From: Sam Ravnborg <sam@ravnborg.org>
> Date: Fri, 31 Jul 2020 23:36:01 +0200
> Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm
> 
> Copy the fbdev functions used by fb_ops to drm.
> Having local copies makes it possible
> to adjust the functions so they are tailored to the needs of DRM.
> At the same time there is one less dependency between DRM and fbdev.
> 
> The files were copied verbatim and only minimal changes were made.
> 
> As some drivers use the fbdev functions they continue to be built, but
> the generic fbdev emulation no longer uses the fbdev functions.
> 
> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/Kconfig               |    1 -
>  drivers/gpu/drm/Makefile              |    1 +
>  drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
>  drivers/gpu/drm/drm_fb_helper.c       |   30 +-
>  include/drm/drm_fb_helper.h           |    6 +
>  5 files changed, 1234 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index c4fd57d8b717..11f41bc31f05 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
>  	select FB
>  	select FRAMEBUFFER_CONSOLE if !EXPERT
>  	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
> -	select FB_SYS_FOPS
>  	select FB_SYS_FILLRECT
>  	select FB_SYS_COPYAREA
>  	select FB_SYS_IMAGEBLIT
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 02ee5faf1a92..398cb4746762 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
>  
>  drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
>  drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
> +drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
>  drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
>  drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
>  drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
> diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> new file mode 100644
> index 000000000000..dda60f85b9d5
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> @@ -0,0 +1,1212 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This file is subject to the terms and conditions of the GNU General Public
> + * License.  See the file COPYING in the main directory of this archive for
> + * more details.
> + *
> + * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
> + *
> + * drm_copyarea:
> + *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
> + *   on Geert Uytterhoeven's copyarea routine)
> + *
> + *  drm_fillrect:
> + *    Based almost entirely from cfbfillrect.c (which is based almost entirely
> + *    on Geert Uytterhoeven's fillrect routine)
> + *
> + *  drm_imageblit:
> + *    Based almost entirely on cfbimgblt.c
> + *
> + */
> +
> +#include <linux/export.h>
> +#include <linux/fb.h>
> +#include <linux/io.h>
> +
> +
> +/*
> +*  Compose two values, using a bitmask as decision value
> +*  This is equivalent to (a & mask) | (b & ~mask)
> +*/
> +static inline unsigned long
> +comp(unsigned long a, unsigned long b, unsigned long mask)
> +{
> +    return ((a ^ b) & mask) ^ b;
> +}
> +
> +/*
> + *  Create a pattern with the given pixel's color
> + */
> +#if BITS_PER_LONG == 64
> +static inline unsigned long
> +pixel_to_pat( u32 bpp, u32 pixel)
> +{
> +	switch (bpp) {
> +	case 1:
> +		return 0xfffffffffffffffful*pixel;
> +	case 2:
> +		return 0x5555555555555555ul*pixel;
> +	case 4:
> +		return 0x1111111111111111ul*pixel;
> +	case 8:
> +		return 0x0101010101010101ul*pixel;
> +	case 12:
> +		return 0x1001001001001001ul*pixel;
> +	case 16:
> +		return 0x0001000100010001ul*pixel;
> +	case 24:
> +		return 0x0001000001000001ul*pixel;
> +	case 32:
> +		return 0x0000000100000001ul*pixel;
> +	default:
> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> +		return 0;
> +    }
> +}
> +#else
> +static inline unsigned long
> +pixel_to_pat( u32 bpp, u32 pixel)
> +{
> +	switch (bpp) {
> +	case 1:
> +		return 0xfffffffful*pixel;
> +	case 2:
> +		return 0x55555555ul*pixel;
> +	case 4:
> +		return 0x11111111ul*pixel;
> +	case 8:
> +		return 0x01010101ul*pixel;
> +	case 12:
> +		return 0x01001001ul*pixel;
> +	case 16:
> +		return 0x00010001ul*pixel;
> +	case 24:
> +		return 0x01000001ul*pixel;
> +	case 32:
> +		return 0x00000001ul*pixel;
> +	default:
> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> +		return 0;
> +    }
> +}
> +#endif
> +
> +#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
> +#if BITS_PER_LONG == 64
> +#define REV_PIXELS_MASK1 0x5555555555555555ul
> +#define REV_PIXELS_MASK2 0x3333333333333333ul
> +#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
> +#else
> +#define REV_PIXELS_MASK1 0x55555555ul
> +#define REV_PIXELS_MASK2 0x33333333ul
> +#define REV_PIXELS_MASK4 0x0f0f0f0ful
> +#endif
> +
> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> +						  u32 bswapmask)
> +{
> +	if (bswapmask & 1)
> +		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
> +	if (bswapmask & 2)
> +		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
> +	if (bswapmask & 3)
> +		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
> +	return val;
> +}
> +
> +static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
> +					     u32 bswapmask)
> +{
> +	u32 mask;
> +
> +	if (!bswapmask) {
> +		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
> +	} else {
> +		mask = 0xff << FB_LEFT_POS(p, 8);
> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> +#if defined(__i386__) || defined(__x86_64__)
> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> +		if(index + bswapmask < 32)
> +#endif
> +			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
> +					(index + bswapmask) & ~(bswapmask));
> +	}
> +	return mask;
> +}
> +
> +static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
> +							u32 index,
> +							u32 bswapmask)
> +{
> +	unsigned long mask;
> +
> +	if (!bswapmask) {
> +		mask = FB_SHIFT_HIGH(p, ~0UL, index);
> +	} else {
> +		mask = 0xff << FB_LEFT_POS(p, 8);
> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> +#if defined(__i386__) || defined(__x86_64__)
> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> +		if(index + bswapmask < BITS_PER_LONG)
> +#endif
> +			mask |= FB_SHIFT_HIGH(p, ~0UL,
> +					(index + bswapmask) & ~(bswapmask));
> +	}
> +	return mask;
> +}
> +
> +
> +static inline u32 fb_compute_bswapmask(struct fb_info *info)
> +{
> +	u32 bswapmask = 0;
> +	unsigned bpp = info->var.bits_per_pixel;
> +
> +	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
> +		/*
> +		 * Reversed order of pixel layout in bytes
> +		 * works only for 1, 2 and 4 bpp
> +		 */
> +		bswapmask = 7 - bpp + 1;
> +	}
> +	return bswapmask;
> +}
> +
> +#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> +
> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> +						  u32 bswapmask)
> +{
> +	return val;
> +}
> +
> +#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
> +#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
> +#define fb_compute_bswapmask(...) 0
> +
> +#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> +
> +#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
> +#define _cpu_to_le_long(x) __cpu_to_le_long(x)
> +#define __cpu_to_le_long(x) cpu_to_le##x
> +
> +#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
> +#define _le_long_to_cpu(x) __le_long_to_cpu(x)
> +#define __le_long_to_cpu(x) le##x##_to_cpu
> +
> +static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
> +{
> +	return (word << shift) | (word >> (x - shift));
> +}
> +
> +
> +/*
> + *  Generic bitwise copy algorithm
> + */
> +static void
> +bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> +	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
> +{
> +	unsigned long first, last;
> +	int const shift = dst_idx-src_idx;
> +	int left, right;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (!shift) {
> +		/* Same alignment for source and dest */
> +		if (dst_idx+n <= bits) {
> +			/* Single word */
> +			if (last)
> +				first &= last;
> +			*dst = comp(*src, *dst, first);
> +		} else {
> +			/* Multiple destination words */
> +			/* Leading bits */
> +			if (first != ~0UL) {
> +				*dst = comp(*src, *dst, first);
> +				dst++;
> +				src++;
> +				n -= bits - dst_idx;
> +			}
> +
> +			/* Main chunk */
> +			n /= bits;
> +			while (n >= 8) {
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				*dst++ = *src++;
> +				n -= 8;
> +			}
> +			while (n--)
> +				*dst++ = *src++;
> +
> +			/* Trailing bits */
> +			if (last)
> +				*dst = comp(*src, *dst, last);
> +		}
> +	} else {
> +		unsigned long d0, d1;
> +		int m;
> +
> +		/* Different alignment for source and dest */
> +		right = shift & (bits - 1);
> +		left = -shift & (bits - 1);
> +
> +		if (dst_idx+n <= bits) {
> +			/* Single destination word */
> +			if (last)
> +				first &= last;
> +			if (shift > 0) {
> +				/* Single source word */
> +				*dst = comp(*src << left, *dst, first);
> +			} else if (src_idx+n <= bits) {
> +				/* Single source word */
> +				*dst = comp(*src >> right, *dst, first);
> +			} else {
> +				/* 2 source words */
> +				d0 = *src++;
> +				d1 = *src;
> +				*dst = comp(d0 >> right | d1 << left, *dst,
> +					    first);
> +			}
> +		} else {
> +			/* Multiple destination words */
> +			/** We must always remember the last value read,
> +			    because in case SRC and DST overlap bitwise (e.g.
> +			    when moving just one pixel in 1bpp), we always
> +			    collect one full long for DST and that might
> +			    overlap with the current long from SRC. We store
> +			    this value in 'd0'. */
> +			d0 = *src++;
> +			/* Leading bits */
> +			if (shift > 0) {
> +				/* Single source word */
> +				*dst = comp(d0 << left, *dst, first);
> +				dst++;
> +				n -= bits - dst_idx;
> +			} else {
> +				/* 2 source words */
> +				d1 = *src++;
> +				*dst = comp(d0 >> right | d1 << left, *dst,
> +					    first);
> +				d0 = d1;
> +				dst++;
> +				n -= bits - dst_idx;
> +			}
> +
> +			/* Main chunk */
> +			m = n % bits;
> +			n /= bits;
> +			while (n >= 4) {
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +				n -= 4;
> +			}
> +			while (n--) {
> +				d1 = *src++;
> +				*dst++ = d0 >> right | d1 << left;
> +				d0 = d1;
> +			}
> +
> +			/* Trailing bits */
> +			if (m) {
> +				if (m <= bits - right) {
> +					/* Single source word */
> +					d0 >>= right;
> +				} else {
> +					/* 2 source words */
> + 					d1 = *src;
> +					d0 = d0 >> right | d1 << left;
> +				}
> +				*dst = comp(d0, *dst, last);
> +			}
> +		}
> +	}
> +}
> +
> +/*
> + *  Generic bitwise copy algorithm, operating backward
> + */
> +static void
> +bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> +	   const unsigned long *src, unsigned src_idx, unsigned bits,
> +	   unsigned n)
> +{
> +	unsigned long first, last;
> +	int shift;
> +
> +	dst += (dst_idx + n - 1) / bits;
> +	src += (src_idx + n - 1) / bits;
> +	dst_idx = (dst_idx + n - 1) % bits;
> +	src_idx = (src_idx + n - 1) % bits;
> +
> +	shift = dst_idx-src_idx;
> +
> +	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
> +	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
> +
> +	if (!shift) {
> +		/* Same alignment for source and dest */
> +		if ((unsigned long)dst_idx+1 >= n) {
> +			/* Single word */
> +			if (first)
> +				last &= first;
> +			*dst = comp(*src, *dst, last);
> +		} else {
> +			/* Multiple destination words */
> +
> +			/* Leading bits */
> +			if (first) {
> +				*dst = comp(*src, *dst, first);
> +				dst--;
> +				src--;
> +				n -= dst_idx+1;
> +			}
> +
> +			/* Main chunk */
> +			n /= bits;
> +			while (n >= 8) {
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				*dst-- = *src--;
> +				n -= 8;
> +			}
> +			while (n--)
> +				*dst-- = *src--;
> +			/* Trailing bits */
> +			if (last != -1UL)
> +				*dst = comp(*src, *dst, last);
> +		}
> +	} else {
> +		/* Different alignment for source and dest */
> +
> +		int const left = shift & (bits-1);
> +		int const right = -shift & (bits-1);
> +
> +		if ((unsigned long)dst_idx+1 >= n) {
> +			/* Single destination word */
> +			if (first)
> +				last &= first;
> +			if (shift < 0) {
> +				/* Single source word */
> +				*dst = comp(*src >> right, *dst, last);
> +			} else if (1+(unsigned long)src_idx >= n) {
> +				/* Single source word */
> +				*dst = comp(*src << left, *dst, last);
> +			} else {
> +				/* 2 source words */
> +				*dst = comp(*src << left | *(src-1) >> right,
> +					    *dst, last);
> +			}
> +		} else {
> +			/* Multiple destination words */
> +			/** We must always remember the last value read,
> +			    because in case SRC and DST overlap bitwise (e.g.
> +			    when moving just one pixel in 1bpp), we always
> +			    collect one full long for DST and that might
> +			    overlap with the current long from SRC. We store
> +			    this value in 'd0'. */
> +			unsigned long d0, d1;
> +			int m;
> +
> +			d0 = *src--;
> +			/* Leading bits */
> +			if (shift < 0) {
> +				/* Single source word */
> +				d1 = d0;
> +				d0 >>= right;
> +			} else {
> +				/* 2 source words */
> +				d1 = *src--;
> +				d0 = d0 << left | d1 >> right;
> +			}
> +			if (!first)
> +				*dst = d0;
> +			else
> +				*dst = comp(d0, *dst, first);
> +			d0 = d1;
> +			dst--;
> +			n -= dst_idx+1;
> +
> +			/* Main chunk */
> +			m = n % bits;
> +			n /= bits;
> +			while (n >= 4) {
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +				n -= 4;
> +			}
> +			while (n--) {
> +				d1 = *src--;
> +				*dst-- = d0 << left | d1 >> right;
> +				d0 = d1;
> +			}
> +
> +			/* Trailing bits */
> +			if (m) {
> +				if (m <= bits - left) {
> +					/* Single source word */
> +					d0 <<= left;
> +				} else {
> +					/* 2 source words */
> +					d1 = *src;
> +					d0 = d0 << left | d1 >> right;
> +				}
> +				*dst = comp(d0, *dst, last);
> +			}
> +		}
> +	}
> +}
> +
> +/*
> + * Generic Bit Block Transfer for frame buffers located in system RAM with
> + * packed pixels of any depth.
> + */
> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
> +{
> +	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
> +	u32 height = area->height, width = area->width;
> +	unsigned long const bits_per_line = p->fix.line_length*8u;
> +	unsigned long *base = NULL;
> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> +	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
> +
> +	if (p->state != FBINFO_STATE_RUNNING)
> +		return;
> +
> +	/* if the beginning of the target area might overlap with the end of
> +	the source area, we have to copy the area in reverse. */
> +	if ((dy == sy && dx > sx) || (dy > sy)) {
> +		dy += height;
> +		sy += height;
> +		rev_copy = 1;
> +	}
> +
> +	/* split the base of the framebuffer into a long-aligned address and
> +	   the index of the first bit */
> +	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> +	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
> +	/* add offset of source and target area */
> +	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
> +	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
> +
> +	if (p->fbops->fb_sync)
> +		p->fbops->fb_sync(p);
> +
> +	if (rev_copy) {
> +		while (height--) {
> +			dst_idx -= bits_per_line;
> +			src_idx -= bits_per_line;
> +			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
> +				base + (src_idx / bits), src_idx % bits, bits,
> +				width*p->var.bits_per_pixel);
> +		}
> +	} else {
> +		while (height--) {
> +			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
> +				base + (src_idx / bits), src_idx % bits, bits,
> +				width*p->var.bits_per_pixel);
> +			dst_idx += bits_per_line;
> +			src_idx += bits_per_line;
> +		}
> +	}
> +}
> +EXPORT_SYMBOL(drm_copyarea);
> +
> +/*
> + *  Aligned pattern fill using 32/64-bit memory accesses
> + */
> +static void
> +bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		unsigned long pat, unsigned n, int bits)
> +{
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(pat, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +
> +		/* Leading bits */
> + 		if (first!= ~0UL) {
> +			*dst = comp(pat, *dst, first);
> +			dst++;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 8) {
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			*dst++ = pat;
> +			n -= 8;
> +		}
> +		while (n--)
> +			*dst++ = pat;
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(pat, *dst, last);
> +	}
> +}
> +
> +
> +/*
> + *  Unaligned generic pattern fill using 32/64-bit memory accesses
> + *  The pattern must have been expanded to a full 32/64-bit value
> + *  Left/right are the appropriate shifts to convert to the pattern to be
> + *  used for the next 32/64-bit word
> + */
> +static void
> +bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		  unsigned long pat, int left, int right, unsigned n, int bits)
> +{
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(pat, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +		/* Leading bits */
> +		if (first) {
> +			*dst = comp(pat, *dst, first);
> +			dst++;
> +			pat = pat << left | pat >> right;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 4) {
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +			n -= 4;
> +		}
> +		while (n--) {
> +			*dst++ = pat;
> +			pat = pat << left | pat >> right;
> +		}
> +
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(pat, *dst, last);
> +	}
> +}
> +
> +/*
> + *  Aligned pattern invert using 32/64-bit memory accesses
> + */
> +static void
> +bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		    unsigned long pat, unsigned n, int bits)
> +{
> +	unsigned long val = pat;
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(*dst ^ val, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +		/* Leading bits */
> +		if (first!=0UL) {
> +			*dst = comp(*dst ^ val, *dst, first);
> +			dst++;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 8) {
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			*dst++ ^= val;
> +			n -= 8;
> +		}
> +		while (n--)
> +			*dst++ ^= val;
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(*dst ^ val, *dst, last);
> +	}
> +}
> +
> +
> +/*
> + *  Unaligned generic pattern invert using 32/64-bit memory accesses
> + *  The pattern must have been expanded to a full 32/64-bit value
> + *  Left/right are the appropriate shifts to convert to the pattern to be
> + *  used for the next 32/64-bit word
> + */
> +static void
> +bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> +		      unsigned long pat, int left, int right, unsigned n,
> +		      int bits)
> +{
> +	unsigned long first, last;
> +
> +	if (!n)
> +		return;
> +
> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> +
> +	if (dst_idx+n <= bits) {
> +		/* Single word */
> +		if (last)
> +			first &= last;
> +		*dst = comp(*dst ^ pat, *dst, first);
> +	} else {
> +		/* Multiple destination words */
> +
> +		/* Leading bits */
> +		if (first != 0UL) {
> +			*dst = comp(*dst ^ pat, *dst, first);
> +			dst++;
> +			pat = pat << left | pat >> right;
> +			n -= bits - dst_idx;
> +		}
> +
> +		/* Main chunk */
> +		n /= bits;
> +		while (n >= 4) {
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +			n -= 4;
> +		}
> +		while (n--) {
> +			*dst++ ^= pat;
> +			pat = pat << left | pat >> right;
> +		}
> +
> +		/* Trailing bits */
> +		if (last)
> +			*dst = comp(*dst ^ pat, *dst, last);
> +	}
> +}
> +
> +/*
> + * Generic fillrect for frame buffers in system RAM with packed pixels of
> + * any depth.
> + */
> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
> +{
> +	unsigned long pat, pat2, fg;
> +	unsigned long width = rect->width, height = rect->height;
> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> +	u32 bpp = p->var.bits_per_pixel;
> +	unsigned long *dst;
> +	int dst_idx, left;
> +
> +	if (p->state != FBINFO_STATE_RUNNING)
> +		return;
> +
> +	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> +	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> +		fg = ((u32 *) (p->pseudo_palette))[rect->color];
> +	else
> +		fg = rect->color;
> +
> +	pat = pixel_to_pat( bpp, fg);
> +
> +	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> +	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
> +	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
> +	/* FIXME For now we support 1-32 bpp only */
> +	left = bits % bpp;
> +	if (p->fbops->fb_sync)
> +		p->fbops->fb_sync(p);
> +	if (!left) {
> +		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
> +				  int dst_idx, unsigned long pat, unsigned n,
> +				  int bits) = NULL;
> +
> +		switch (rect->rop) {
> +		case ROP_XOR:
> +			fill_op32 = bitfill_aligned_rev;
> +			break;
> +		case ROP_COPY:
> +			fill_op32 = bitfill_aligned;
> +			break;
> +		default:
> +			printk( KERN_ERR "drm_fillrect(): unknown rop, "
> +				"defaulting to ROP_COPY\n");
> +			fill_op32 = bitfill_aligned;
> +			break;
> +		}
> +		while (height--) {
> +			dst += dst_idx >> (ffs(bits) - 1);
> +			dst_idx &= (bits - 1);
> +			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
> +			dst_idx += p->fix.line_length*8;
> +		}
> +	} else {
> +		int right, r;
> +		void (*fill_op)(struct fb_info *p, unsigned long *dst,
> +				int dst_idx, unsigned long pat, int left,
> +				int right, unsigned n, int bits) = NULL;
> +#ifdef __LITTLE_ENDIAN
> +		right = left;
> +		left = bpp - right;
> +#else
> +		right = bpp - left;
> +#endif
> +		switch (rect->rop) {
> +		case ROP_XOR:
> +			fill_op = bitfill_unaligned_rev;
> +			break;
> +		case ROP_COPY:
> +			fill_op = bitfill_unaligned;
> +			break;
> +		default:
> +			printk(KERN_ERR "drm_fillrect(): unknown rop, "
> +				"defaulting to ROP_COPY\n");
> +			fill_op = bitfill_unaligned;
> +			break;
> +		}
> +		while (height--) {
> +			dst += dst_idx / bits;
> +			dst_idx &= (bits - 1);
> +			r = dst_idx % bpp;
> +			/* rotate pattern to the correct start position */
> +			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
> +			fill_op(p, dst, dst_idx, pat2, left, right,
> +				width*bpp, bits);
> +			dst_idx += p->fix.line_length*8;
> +		}
> +	}
> +}
> +EXPORT_SYMBOL(drm_fillrect);
> +
> +static const u32 cfb_tab8_be[] = {
> +    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
> +    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
> +    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
> +    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
> +};
> +
> +static const u32 cfb_tab8_le[] = {
> +    0x00000000,0xff000000,0x00ff0000,0xffff0000,
> +    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
> +    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
> +    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
> +};
> +
> +static const u32 cfb_tab16_be[] = {
> +    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
> +};
> +
> +static const u32 cfb_tab16_le[] = {
> +    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
> +};
> +
> +static const u32 cfb_tab32[] = {
> +	0x00000000, 0xffffffff
> +};
> +
> +static void color_imageblit(const struct fb_image *image, struct fb_info *p,
> +			    void *dst1, u32 start_index, u32 pitch_index)
> +{
> +	/* Draw the penguin */
> +	u32 *dst, *dst2;
> +	u32 color = 0, val, shift;
> +	int i, n, bpp = p->var.bits_per_pixel;
> +	u32 null_bits = 32 - bpp;
> +	u32 *palette = (u32 *) p->pseudo_palette;
> +	const u8 *src = image->data;
> +
> +	dst2 = dst1;
> +	for (i = image->height; i--; ) {
> +		n = image->width;
> +		dst = dst1;
> +		shift = 0;
> +		val = 0;
> +
> +		if (start_index) {
> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> +							 start_index));
> +			val = *dst & start_mask;
> +			shift = start_index;
> +		}
> +		while (n--) {
> +			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> +			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> +				color = palette[*src];
> +			else
> +				color = *src;
> +			color <<= FB_LEFT_POS(p, bpp);
> +			val |= FB_SHIFT_HIGH(p, color, shift);
> +			if (shift >= null_bits) {
> +				*dst++ = val;
> +
> +				val = (shift == null_bits) ? 0 :
> +					FB_SHIFT_LOW(p, color, 32 - shift);
> +			}
> +			shift += bpp;
> +			shift &= (32 - 1);
> +			src++;
> +		}
> +		if (shift) {
> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> +
> +			*dst &= end_mask;
> +			*dst |= val;
> +		}
> +		dst1 += p->fix.line_length;
> +		if (pitch_index) {
> +			dst2 += p->fix.line_length;
> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> +
> +			start_index += pitch_index;
> +			start_index &= 32 - 1;
> +		}
> +	}
> +}
> +
> +static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
> +				  void *dst1, u32 fgcolor, u32 bgcolor,
> +				  u32 start_index, u32 pitch_index)
> +{
> +	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
> +	u32 *dst, *dst2;
> +	u32 val, pitch = p->fix.line_length;
> +	u32 null_bits = 32 - bpp;
> +	u32 spitch = (image->width+7)/8;
> +	const u8 *src = image->data, *s;
> +	u32 i, j, l;
> +
> +	dst2 = dst1;
> +	fgcolor <<= FB_LEFT_POS(p, bpp);
> +	bgcolor <<= FB_LEFT_POS(p, bpp);
> +
> +	for (i = image->height; i--; ) {
> +		shift = val = 0;
> +		l = 8;
> +		j = image->width;
> +		dst = dst1;
> +		s = src;
> +
> +		/* write leading bits */
> +		if (start_index) {
> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> +							 start_index));
> +			val = *dst & start_mask;
> +			shift = start_index;
> +		}
> +
> +		while (j--) {
> +			l--;
> +			color = (*s & (1 << l)) ? fgcolor : bgcolor;
> +			val |= FB_SHIFT_HIGH(p, color, shift);
> +
> +			/* Did the bitshift spill bits to the next long? */
> +			if (shift >= null_bits) {
> +				*dst++ = val;
> +				val = (shift == null_bits) ? 0 :
> +					FB_SHIFT_LOW(p, color, 32 - shift);
> +			}
> +			shift += bpp;
> +			shift &= (32 - 1);
> +			if (!l) { l = 8; s++; }
> +		}
> +
> +		/* write trailing bits */
> + 		if (shift) {
> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> +
> +			*dst &= end_mask;
> +			*dst |= val;
> +		}
> +
> +		dst1 += pitch;
> +		src += spitch;
> +		if (pitch_index) {
> +			dst2 += pitch;
> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> +			start_index += pitch_index;
> +			start_index &= 32 - 1;
> +		}
> +
> +	}
> +}
> +
> +/*
> + * fast_imageblit - optimized monochrome color expansion
> + *
> + * Only if:  bits_per_pixel == 8, 16, or 32
> + *           image->width is divisible by pixel/dword (ppw);
> + *           fix->line_length is divisible by 4;
> + *           beginning and end of a scanline is dword aligned
> + */
> +static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
> +				  void *dst1, u32 fgcolor, u32 bgcolor)
> +{
> +	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
> +	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
> +	u32 bit_mask, end_mask, eorx, shift;
> +	const char *s = image->data, *src;
> +	u32 *dst;
> +	const u32 *tab = NULL;
> +	int i, j, k;
> +
> +	switch (bpp) {
> +	case 8:
> +		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
> +		break;
> +	case 16:
> +		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
> +		break;
> +	case 32:
> +	default:
> +		tab = cfb_tab32;
> +		break;
> +	}
> +
> +	for (i = ppw-1; i--; ) {
> +		fgx <<= bpp;
> +		bgx <<= bpp;
> +		fgx |= fgcolor;
> +		bgx |= bgcolor;
> +	}
> +
> +	bit_mask = (1 << ppw) - 1;
> +	eorx = fgx ^ bgx;
> +	k = image->width/ppw;
> +
> +	for (i = image->height; i--; ) {
> +		dst = dst1;
> +		shift = 8;
> +		src = s;
> +
> +		for (j = k; j--; ) {
> +			shift -= ppw;
> +			end_mask = tab[(*src >> shift) & bit_mask];
> +			*dst++ = (end_mask & eorx) ^ bgx;
> +			if (!shift) {
> +				shift = 8;
> +				src++;
> +			}
> +		}
> +		dst1 += p->fix.line_length;
> +		s += spitch;
> +	}
> +}
> +
> +/*
> + * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
> + * for frame buffer located in system RAM with packed pixels of any depth.
> + */
> +void drm_imageblit(struct fb_info *p, const struct fb_image *image)
> +{
> +	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
> +	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
> +	u32 width = image->width;
> +	u32 dx = image->dx, dy = image->dy;
> +	void *dst1;
> +
> +	if (p->state != FBINFO_STATE_RUNNING)
> +		return;
> +
> +	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
> +	start_index = bitstart & (32 - 1);
> +	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
> +
> +	bitstart /= 8;
> +	bitstart &= ~(bpl - 1);
> +	dst1 = (void __force *)p->screen_base + bitstart;
> +
> +	if (p->fbops->fb_sync)
> +		p->fbops->fb_sync(p);
> +
> +	if (image->depth == 1) {
> +		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> +		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
> +			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
> +			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
> +		} else {
> +			fgcolor = image->fg_color;
> +			bgcolor = image->bg_color;
> +		}
> +
> +		if (32 % bpp == 0 && !start_index && !pitch_index &&
> +		    ((width & (32/bpp-1)) == 0) &&
> +		    bpp >= 8 && bpp <= 32)
> +			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
> +		else
> +			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
> +					start_index, pitch_index);
> +	} else
> +		color_imageblit(image, p, dst1, start_index, pitch_index);
> +}
> +EXPORT_SYMBOL(drm_imageblit);
> +
> +/*
> + * Generic file operations where framebuffer is in system RAM
> + */
> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
> +{
> +	unsigned long p = *ppos;
> +	void *src;
> +	int err = 0;
> +	unsigned long total_size;
> +
> +	if (info->state != FBINFO_STATE_RUNNING)
> +		return -EPERM;
> +
> +	total_size = info->screen_size;
> +
> +	if (total_size == 0)
> +		total_size = info->fix.smem_len;
> +
> +	if (p >= total_size)
> +		return 0;
> +
> +	if (count >= total_size)
> +		count = total_size;
> +
> +	if (count + p > total_size)
> +		count = total_size - p;
> +
> +	src = (void __force *)(info->screen_base + p);
> +
> +	if (info->fbops->fb_sync)
> +		info->fbops->fb_sync(info);
> +
> +	if (copy_to_user(buf, src, count))
> +		err = -EFAULT;
> +
> +	if  (!err)
> +		*ppos += count;
> +
> +	return (err) ? err : count;
> +}
> +EXPORT_SYMBOL_GPL(drm_fb_read);
> +
> +/*
> + * Generic file operations where framebuffer is in system RAM
> + */
> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
> +{
> +	unsigned long p = *ppos;
> +	void *dst;
> +	int err = 0;
> +	unsigned long total_size;
> +
> +	if (info->state != FBINFO_STATE_RUNNING)
> +		return -EPERM;
> +
> +	total_size = info->screen_size;
> +
> +	if (total_size == 0)
> +		total_size = info->fix.smem_len;
> +
> +	if (p > total_size)
> +		return -EFBIG;
> +
> +	if (count > total_size) {
> +		err = -EFBIG;
> +		count = total_size;
> +	}
> +
> +	if (count + p > total_size) {
> +		if (!err)
> +			err = -ENOSPC;
> +
> +		count = total_size - p;
> +	}
> +
> +	dst = (void __force *) (info->screen_base + p);
> +
> +	if (info->fbops->fb_sync)
> +		info->fbops->fb_sync(info);
> +
> +	if (copy_from_user(dst, buf, count))
> +		err = -EFAULT;
> +
> +	if  (!err)
> +		*ppos += count;
> +
> +	return (err) ? err : count;
> +}
> +EXPORT_SYMBOL_GPL(drm_fb_write);
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index da0d96a69570..18fba9818635 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
>  EXPORT_SYMBOL(drm_fb_helper_deferred_io);
>  
>  /**
> - * drm_fb_helper_sys_read - wrapper around fb_sys_read
> + * drm_fb_helper_sys_read - wrapper around drm_fb_read
>   * @info: fb_info struct pointer
>   * @buf: userspace buffer to read from framebuffer memory
>   * @count: number of bytes to read from framebuffer memory
>   * @ppos: read offset within framebuffer memory
>   *
> - * A wrapper around fb_sys_read implemented by fbdev core
> + * A wrapper around drm_fb_read
>   */
>  ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
>  			       size_t count, loff_t *ppos)
>  {
> -	return fb_sys_read(info, buf, count, ppos);
> +	return drm_fb_read(info, buf, count, ppos);
>  }
>  EXPORT_SYMBOL(drm_fb_helper_sys_read);
>  
>  /**
> - * drm_fb_helper_sys_write - wrapper around fb_sys_write
> + * drm_fb_helper_sys_write - wrapper around drm_fb_write
>   * @info: fb_info struct pointer
>   * @buf: userspace buffer to write to framebuffer memory
>   * @count: number of bytes to write to framebuffer memory
>   * @ppos: write offset within framebuffer memory
>   *
> - * A wrapper around fb_sys_write implemented by fbdev core
> + * A wrapper around drm_fb_write
>   */
>  ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>  				size_t count, loff_t *ppos)
>  {
>  	ssize_t ret;
>  
> -	ret = fb_sys_write(info, buf, count, ppos);
> +	ret = drm_fb_write(info, buf, count, ppos);
>  	if (ret > 0)
>  		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
>  				    info->var.yres);
> @@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>  EXPORT_SYMBOL(drm_fb_helper_sys_write);
>  
>  /**
> - * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
> + * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
>   * @info: fbdev registered by the helper
>   * @rect: info about rectangle to fill
>   *
> - * A wrapper around sys_fillrect implemented by fbdev core
> + * A wrapper around drm_fillrect
>   */
>  void drm_fb_helper_sys_fillrect(struct fb_info *info,
>  				const struct fb_fillrect *rect)
>  {
> -	sys_fillrect(info, rect);
> +	drm_fillrect(info, rect);
>  	drm_fb_helper_dirty(info, rect->dx, rect->dy,
>  			    rect->width, rect->height);
>  }
>  EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
>  
>  /**
> - * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
> + * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
>   * @info: fbdev registered by the helper
>   * @area: info about area to copy
>   *
> - * A wrapper around sys_copyarea implemented by fbdev core
> + * A wrapper around drm_copyarea
>   */
>  void drm_fb_helper_sys_copyarea(struct fb_info *info,
>  				const struct fb_copyarea *area)
>  {
> -	sys_copyarea(info, area);
> +	drm_copyarea(info, area);
>  	drm_fb_helper_dirty(info, area->dx, area->dy,
>  			    area->width, area->height);
>  }
>  EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
>  
>  /**
> - * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
> + * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
>   * @info: fbdev registered by the helper
>   * @image: info about image to blit
>   *
> - * A wrapper around sys_imageblit implemented by fbdev core
> + * A wrapper around drm_imageblit
>   */
>  void drm_fb_helper_sys_imageblit(struct fb_info *info,
>  				 const struct fb_image *image)
>  {
> -	sys_imageblit(info, image);
> +	drm_imageblit(info, image);
>  	drm_fb_helper_dirty(info, image->dx, image->dy,
>  			    image->width, image->height);
>  }
> diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
> index 306aa3a60be9..d38005f6128a 100644
> --- a/include/drm/drm_fb_helper.h
> +++ b/include/drm/drm_fb_helper.h
> @@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
>  
>  void drm_fbdev_generic_setup(struct drm_device *dev,
>  			     unsigned int preferred_bpp);
> +
> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
> +void drm_imageblit(struct fb_info *p, const struct fb_image *image);
> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
>  #else
>  static inline void drm_fb_helper_prepare(struct drm_device *dev,
>  					struct drm_fb_helper *helper,
> 
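
For illustration, the direction test in the quoted drm_copyarea() can be
modelled in user space. This is only a sketch under simplifying assumptions:
one byte per pixel, and copy_area() and needs_reverse_copy() are hypothetical
names that do not exist in the kernel. It shows why the copy has to run from
the last line to the first when the destination lies after the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Same condition as in the quoted drm_copyarea(): copy backward when the
 * destination starts after the source in scan order.
 */
static bool needs_reverse_copy(uint32_t dx, uint32_t dy,
			       uint32_t sx, uint32_t sy)
{
	return (dy == sy && dx > sx) || dy > sy;
}

/*
 * Hypothetical area copy for a byte-per-pixel buffer; pitch is the number
 * of bytes per line. memmove() covers overlap within a line, the line
 * order covers overlap between lines.
 */
static void copy_area(uint8_t *fb, size_t pitch,
		      uint32_t dx, uint32_t dy, uint32_t sx, uint32_t sy,
		      uint32_t width, uint32_t height)
{
	uint32_t row;

	assert(fb != NULL && dx + width <= pitch && sx + width <= pitch);

	if (needs_reverse_copy(dx, dy, sx, sy)) {
		/* Last line first, so unread source lines are not overwritten. */
		for (row = height; row-- > 0; )
			memmove(fb + (dy + row) * pitch + dx,
				fb + (sy + row) * pitch + sx, width);
	} else {
		/* First line first; the destination trails the source. */
		for (row = 0; row < height; row++)
			memmove(fb + (dy + row) * pitch + dx,
				fb + (sy + row) * pitch + sx, width);
	}
}
```

Scrolling a 4x4 buffer down by one line, i.e. copy_area(fb, 4, 0, 1, 0, 0, 4, 3),
takes the reverse path; a forward loop would smear line 0 over all lines below.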

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-08-03  6:46             ` Thomas Zimmermann
@ 2020-08-04  9:27               ` daniel
  -1 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-08-04  9:27 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, Sam Ravnborg, peda, dan.carpenter

On Mon, Aug 03, 2020 at 08:46:34AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> On 02.08.20 at 22:01, Sam Ravnborg wrote:
> > On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
> >> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
> >>> Hi Daniel.
> >>>
> >>> On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
> >>>> On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
> >>>>> DRM fb helpers require read and write functions for framebuffer
> >>>>> memory. Export the existing code from fbdev.
> >>>>>
> >>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> >>>>
> >>>> Hm I'm not super sure whether we want to actually reuse this stuff ... We
> >>>> kinda don't care about the sparc special case, and just having an fbdev
> >>>> implementation which has the switch between memcpy and memcpy_to/from_io
> >>>> in one single place sounds a lot simpler ...
> >>>>
> >>>> This way we can have a clean split between the old horrors of real fbdev
> >>>> drivers, and a much cleaner world in drm. It would mean a bit of
> >>>> copypasting, but I think that's actually a good thing.
> >>>>
> >>>> In general my idea for drm fbdev emulation is that for any area we have a
> >>>> problem we just ignore the entire fbmem.c code and write our own: mmap,
> >>>> backlight handling (still unsolved, and horrible), cfb vs sys here. This
> >>>> entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
> >>>> duplication here doesn't seem worth it imo.
> >>>>
> >>>> Thoughts?
> >>>
> >>>
> >>> I can see that fbmem is a mix of ioctl support and other stuff.
> >>> We could factor out all the ioctl parts of fbmem.c to a new file
> >>> named fbioctl.c.
> >>>
> >>> And then let the ioctl parts call down into drm stuff and avoid reusing
> >>> the fbdev code when we first reach drm code.
> >>> This would require local copies of:
> >>> sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
> >>> and more I think which I missed.
> >>>
> >>> With local copies we could avoid some of the special cases and trim the
> >>> functions to what is required by drm only.
> >>> And then no more fbmem dependencies and no dependencies to several of
> >>> the small helper functions. So less entanglement with fbdev core.
> >>>
> >>> This all sounds simple so I am surely missing a lot of ugly details here.
> >>>
> >>> And should we touch this anyway we need a test suite to verify not too
> >>> much breaks. To the best of my knowledge there is not yet such a test
> >>> suite :-( Maybe because people caring about fbdev are limited.
> >>
> >> Well my idea was to not refactor anything, but just have drm copies of the
> >> various fb_ops callbacks. Definitely not even more refactoring :-)
> 
> Thanks for making a prototype, Sam.
> 
> But do we really want to duplicate all this code from fbdev? It's not
> actually pretty and there's little value in rewriting it. If anything,
> I can see us extending drm_format_helpers.c and building upon this.

Hm yeah this is a lot more horrible than I thought, I was kinda just
looking at read/write functions, maybe also mmap, and figured we could
simplify this a lot if we just inline and done.

But all these image/fill/copy functions are pretty bad, and they seem to
care about a lot of corner cases that just don't matter anymore on modern
hw. Also agreeing with Thomas that if we do copy this, then building on top
of format helpers would make more sense. At that point we probably need a
test suite for fbdev, and this entire endeavour becomes rather silly.
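
For a sense of what these helpers do under the hood, here is a small
userspace sketch of the two building blocks from the quoted patch: merging
two words under a bitmask, and replicating a pixel across a machine word.
It assumes a fixed 64-bit word (uint64_t instead of unsigned long) and only
handles 8/16/32 bpp, so it is an illustration rather than the kernel code.

```c
#include <stdint.h>

/*
 * Merge under a mask: bits set in mask come from a, the rest from b.
 * Equivalent to (a & mask) | (b & ~mask).
 */
static inline uint64_t comp(uint64_t a, uint64_t b, uint64_t mask)
{
	return ((a ^ b) & mask) ^ b;
}

/*
 * Replicate one pixel value across a 64-bit word.
 * Only a subset of depths is handled in this sketch.
 */
static inline uint64_t pixel_to_pat(uint32_t bpp, uint32_t pixel)
{
	switch (bpp) {
	case 8:
		return 0x0101010101010101ull * pixel;
	case 16:
		return 0x0001000100010001ull * pixel;
	case 32:
		return 0x0000000100000001ull * pixel;
	default:
		return 0; /* other depths omitted here */
	}
}
```

The fill and copy routines then use comp() with "first" and "last" masks to
patch the partial words at either end of a scanline, which is where most of
the corner cases come from.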

So maybe compromise approach? E.g. we reuse fbdev helpers with a switch
between sys and cfb if it would drag in a huge pile of code for format
handling silliness. But when all it would drag in is a memcpy_to/fromio
or similar, we inline to reduce a bit the midlayer and make it clearer
what's going on. Strictly speaking would probably still need a test suite,
but I guess we can delay that until the first bug report or so :-)
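
Roughly what I mean by having the switch in one place, as a userspace
sketch. The struct fb_mapping descriptor, fb_mapping_write() and
fake_memcpy_toio() are made-up names for illustration; in the kernel the
I/O branch would presumably call memcpy_toio() on an __iomem pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: a mapping plus a flag saying what kind it is. */
struct fb_mapping {
	void *vaddr;
	bool is_iomem;
};

/* Userspace stand-in for memcpy_toio(): byte-wise volatile stores. */
static void fake_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	while (n--)
		*d++ = *s++;
}

/* The single place that chooses between the two copy flavours. */
static void fb_mapping_write(const struct fb_mapping *map, size_t offset,
			     const void *src, size_t n)
{
	uint8_t *dst = (uint8_t *)map->vaddr + offset;

	if (map->is_iomem)
		fake_memcpy_toio(dst, src, n);
	else
		memcpy(dst, src, n);
}
```

Callers never need to know which kind of memory they are writing to; they
just pass the mapping along.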

Cheers, Daniel


> 
> Best regards
> Thomas
> 
> > 
> > $ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
> > 1212 drivers/gpu/drm/drm_fb_fbdev_helper.c
> > 
> > This is a straight copy of:
> > $ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
> > EXPORT_SYMBOL(drm_copyarea);
> > EXPORT_SYMBOL(drm_fillrect);
> > EXPORT_SYMBOL(drm_imageblit);
> > EXPORT_SYMBOL_GPL(drm_fb_read);
> > EXPORT_SYMBOL_GPL(drm_fb_write);
> > 
> > copyarea, fillrect and imageblit are a copy of the sys variants.
> > read and write are from fb_sys_fops.c
> > 
> > Patch builds but I am not a big fan yet - right now it gains us very
> > little. What would the next step be?
> > 
> > I looked briefly at the few drivers that have not migrated to
> > the generic fbdev emulation (yet?).
> >   msm
> >   omapdrm
> >   gma500
> > 
> > all use the sys_ variants - they already use the migrated variants, as
> > they call the drm_fb helper.
> > 
> > 
> >   exynos
> >   gma500
> >   i915
> >   nouveau
> >   radeon
> >   rockchip
> >   vmwgfx
> > 
> > all use the cfb variants and cannot use the migrated variants as-is.
> > 
> > 	Sam
> > 
> > From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
> > From: Sam Ravnborg <sam@ravnborg.org>
> > Date: Fri, 31 Jul 2020 23:36:01 +0200
> > Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm
> > 
> > Copy the fbdev functions used by fb_ops to drm.
> > Having local copies makes it possible
> > to adjust the functions so they are tailored to the needs of DRM.
> > At the same time there is one less dependency between DRM and fbdev.
> > 
> > The files were copied verbatim and only minimal changes were made.
> > 
> > As some drivers use the fbdev functions, they continue to be built, but
> > the generic fbdev emulation no longer uses the fbdev functions.
> > 
> > Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
> > Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/Kconfig               |    1 -
> >  drivers/gpu/drm/Makefile              |    1 +
> >  drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
> >  drivers/gpu/drm/drm_fb_helper.c       |   30 +-
> >  include/drm/drm_fb_helper.h           |    6 +
> >  5 files changed, 1234 insertions(+), 16 deletions(-)
> >  create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c
> > 
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index c4fd57d8b717..11f41bc31f05 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
> >  	select FB
> >  	select FRAMEBUFFER_CONSOLE if !EXPERT
> >  	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
> > -	select FB_SYS_FOPS
> >  	select FB_SYS_FILLRECT
> >  	select FB_SYS_COPYAREA
> >  	select FB_SYS_IMAGEBLIT
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index 02ee5faf1a92..398cb4746762 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
> >  
> >  drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
> >  drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
> > +drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
> >  drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
> >  drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
> >  drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
> > diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> > new file mode 100644
> > index 000000000000..dda60f85b9d5
> > --- /dev/null
> > +++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> > @@ -0,0 +1,1212 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * This file is subject to the terms and conditions of the GNU General Public
> > + * License.  See the file COPYING in the main directory of this archive for
> > + * more details.
> > + *
> > + * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
> > + *
> > + * drm_copyarea:
> > + *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
> > + *   on Geert Uytterhoeven's copyarea routine)
> > + *
> > + *  drm_fillrect:
> > + *    Based almost entirely from cfbfillrect.c (which is based almost entirely
> > + *    on Geert Uytterhoeven's fillrect routine)
> > + *
> > + *  drm_imageblit:
> > + *    Based almost entirely on cfbimgblt.c
> > + *
> > + */
> > +
> > +#include <linux/export.h>
> > +#include <linux/fb.h>
> > +#include <linux/io.h>
> > +
> > +
> > +/*
> > +*  Compose two values, using a bitmask as decision value
> > +*  This is equivalent to (a & mask) | (b & ~mask)
> > +*/
> > +static inline unsigned long
> > +comp(unsigned long a, unsigned long b, unsigned long mask)
> > +{
> > +    return ((a ^ b) & mask) ^ b;
> > +}
> > +
> > +/*
> > + *  Create a pattern with the given pixel's color
> > + */
> > +#if BITS_PER_LONG == 64
> > +static inline unsigned long
> > +pixel_to_pat( u32 bpp, u32 pixel)
> > +{
> > +	switch (bpp) {
> > +	case 1:
> > +		return 0xfffffffffffffffful*pixel;
> > +	case 2:
> > +		return 0x5555555555555555ul*pixel;
> > +	case 4:
> > +		return 0x1111111111111111ul*pixel;
> > +	case 8:
> > +		return 0x0101010101010101ul*pixel;
> > +	case 12:
> > +		return 0x1001001001001001ul*pixel;
> > +	case 16:
> > +		return 0x0001000100010001ul*pixel;
> > +	case 24:
> > +		return 0x0001000001000001ul*pixel;
> > +	case 32:
> > +		return 0x0000000100000001ul*pixel;
> > +	default:
> > +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> > +		return 0;
> > +    }
> > +}
> > +#else
> > +static inline unsigned long
> > +pixel_to_pat( u32 bpp, u32 pixel)
> > +{
> > +	switch (bpp) {
> > +	case 1:
> > +		return 0xfffffffful*pixel;
> > +	case 2:
> > +		return 0x55555555ul*pixel;
> > +	case 4:
> > +		return 0x11111111ul*pixel;
> > +	case 8:
> > +		return 0x01010101ul*pixel;
> > +	case 12:
> > +		return 0x01001001ul*pixel;
> > +	case 16:
> > +		return 0x00010001ul*pixel;
> > +	case 24:
> > +		return 0x01000001ul*pixel;
> > +	case 32:
> > +		return 0x00000001ul*pixel;
> > +	default:
> > +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> > +		return 0;
> > +    }
> > +}
> > +#endif
> > +
> > +#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
> > +#if BITS_PER_LONG == 64
> > +#define REV_PIXELS_MASK1 0x5555555555555555ul
> > +#define REV_PIXELS_MASK2 0x3333333333333333ul
> > +#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
> > +#else
> > +#define REV_PIXELS_MASK1 0x55555555ul
> > +#define REV_PIXELS_MASK2 0x33333333ul
> > +#define REV_PIXELS_MASK4 0x0f0f0f0ful
> > +#endif
> > +
> > +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> > +						  u32 bswapmask)
> > +{
> > +	if (bswapmask & 1)
> > +		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
> > +	if (bswapmask & 2)
> > +		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
> > +	if (bswapmask & 3)
> > +		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
> > +	return val;
> > +}
> > +
> > +static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
> > +					     u32 bswapmask)
> > +{
> > +	u32 mask;
> > +
> > +	if (!bswapmask) {
> > +		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
> > +	} else {
> > +		mask = 0xff << FB_LEFT_POS(p, 8);
> > +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> > +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> > +#if defined(__i386__) || defined(__x86_64__)
> > +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> > +		if(index + bswapmask < 32)
> > +#endif
> > +			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
> > +					(index + bswapmask) & ~(bswapmask));
> > +	}
> > +	return mask;
> > +}
> > +
> > +static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
> > +							u32 index,
> > +							u32 bswapmask)
> > +{
> > +	unsigned long mask;
> > +
> > +	if (!bswapmask) {
> > +		mask = FB_SHIFT_HIGH(p, ~0UL, index);
> > +	} else {
> > +		mask = 0xff << FB_LEFT_POS(p, 8);
> > +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> > +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> > +#if defined(__i386__) || defined(__x86_64__)
> > +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> > +		if(index + bswapmask < BITS_PER_LONG)
> > +#endif
> > +			mask |= FB_SHIFT_HIGH(p, ~0UL,
> > +					(index + bswapmask) & ~(bswapmask));
> > +	}
> > +	return mask;
> > +}
> > +
> > +
> > +static inline u32 fb_compute_bswapmask(struct fb_info *info)
> > +{
> > +	u32 bswapmask = 0;
> > +	unsigned bpp = info->var.bits_per_pixel;
> > +
> > +	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
> > +		/*
> > +		 * Reversed order of pixel layout in bytes
> > +		 * works only for 1, 2 and 4 bpp
> > +		 */
> > +		bswapmask = 7 - bpp + 1;
> > +	}
> > +	return bswapmask;
> > +}
> > +
> > +#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> > +
> > +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> > +						  u32 bswapmask)
> > +{
> > +	return val;
> > +}
> > +
> > +#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
> > +#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
> > +#define fb_compute_bswapmask(...) 0
> > +
> > +#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> > +
> > +#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
> > +#define _cpu_to_le_long(x) __cpu_to_le_long(x)
> > +#define __cpu_to_le_long(x) cpu_to_le##x
> > +
> > +#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
> > +#define _le_long_to_cpu(x) __le_long_to_cpu(x)
> > +#define __le_long_to_cpu(x) le##x##_to_cpu
> > +
> > +static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
> > +{
> > +	return (word << shift) | (word >> (x - shift));
> > +}
> > +
> > +
> > +/*
> > + *  Generic bitwise copy algorithm
> > + */
> > +static void
> > +bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> > +	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
> > +{
> > +	unsigned long first, last;
> > +	int const shift = dst_idx-src_idx;
> > +	int left, right;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (!shift) {
> > +		/* Same alignment for source and dest */
> > +		if (dst_idx+n <= bits) {
> > +			/* Single word */
> > +			if (last)
> > +				first &= last;
> > +			*dst = comp(*src, *dst, first);
> > +		} else {
> > +			/* Multiple destination words */
> > +			/* Leading bits */
> > + 			if (first != ~0UL) {
> > +				*dst = comp(*src, *dst, first);
> > +				dst++;
> > +				src++;
> > +				n -= bits - dst_idx;
> > +			}
> > +
> > +			/* Main chunk */
> > +			n /= bits;
> > +			while (n >= 8) {
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				n -= 8;
> > +			}
> > +			while (n--)
> > +				*dst++ = *src++;
> > +
> > +			/* Trailing bits */
> > +			if (last)
> > +				*dst = comp(*src, *dst, last);
> > +		}
> > +	} else {
> > +		unsigned long d0, d1;
> > +		int m;
> > +
> > +		/* Different alignment for source and dest */
> > +		right = shift & (bits - 1);
> > +		left = -shift & (bits - 1);
> > +
> > +		if (dst_idx+n <= bits) {
> > +			/* Single destination word */
> > +			if (last)
> > +				first &= last;
> > +			if (shift > 0) {
> > +				/* Single source word */
> > +				*dst = comp(*src << left, *dst, first);
> > +			} else if (src_idx+n <= bits) {
> > +				/* Single source word */
> > +				*dst = comp(*src >> right, *dst, first);
> > +			} else {
> > +				/* 2 source words */
> > +				d0 = *src++;
> > +				d1 = *src;
> > +				*dst = comp(d0 >> right | d1 << left, *dst,
> > +					    first);
> > +			}
> > +		} else {
> > +			/* Multiple destination words */
> > +			/** We must always remember the last value read,
> > +			    because in case SRC and DST overlap bitwise (e.g.
> > +			    when moving just one pixel in 1bpp), we always
> > +			    collect one full long for DST and that might
> > +			    overlap with the current long from SRC. We store
> > +			    this value in 'd0'. */
> > +			d0 = *src++;
> > +			/* Leading bits */
> > +			if (shift > 0) {
> > +				/* Single source word */
> > +				*dst = comp(d0 << left, *dst, first);
> > +				dst++;
> > +				n -= bits - dst_idx;
> > +			} else {
> > +				/* 2 source words */
> > +				d1 = *src++;
> > +				*dst = comp(d0 >> right | d1 << left, *dst,
> > +					    first);
> > +				d0 = d1;
> > +				dst++;
> > +				n -= bits - dst_idx;
> > +			}
> > +
> > +			/* Main chunk */
> > +			m = n % bits;
> > +			n /= bits;
> > +			while (n >= 4) {
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				n -= 4;
> > +			}
> > +			while (n--) {
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +			}
> > +
> > +			/* Trailing bits */
> > +			if (m) {
> > +				if (m <= bits - right) {
> > +					/* Single source word */
> > +					d0 >>= right;
> > +				} else {
> > +					/* 2 source words */
> > + 					d1 = *src;
> > +					d0 = d0 >> right | d1 << left;
> > +				}
> > +				*dst = comp(d0, *dst, last);
> > +			}
> > +		}
> > +	}
> > +}
> > +
> > +/*
> > + *  Generic bitwise copy algorithm, operating backward
> > + */
> > +static void
> > +bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> > +	   const unsigned long *src, unsigned src_idx, unsigned bits,
> > +	   unsigned n)
> > +{
> > +	unsigned long first, last;
> > +	int shift;
> > +
> > +	dst += (dst_idx + n - 1) / bits;
> > +	src += (src_idx + n - 1) / bits;
> > +	dst_idx = (dst_idx + n - 1) % bits;
> > +	src_idx = (src_idx + n - 1) % bits;
> > +
> > +	shift = dst_idx-src_idx;
> > +
> > +	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
> > +	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
> > +
> > +	if (!shift) {
> > +		/* Same alignment for source and dest */
> > +		if ((unsigned long)dst_idx+1 >= n) {
> > +			/* Single word */
> > +			if (first)
> > +				last &= first;
> > +			*dst = comp(*src, *dst, last);
> > +		} else {
> > +			/* Multiple destination words */
> > +
> > +			/* Leading bits */
> > +			if (first) {
> > +				*dst = comp(*src, *dst, first);
> > +				dst--;
> > +				src--;
> > +				n -= dst_idx+1;
> > +			}
> > +
> > +			/* Main chunk */
> > +			n /= bits;
> > +			while (n >= 8) {
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				n -= 8;
> > +			}
> > +			while (n--)
> > +				*dst-- = *src--;
> > +			/* Trailing bits */
> > +			if (last != -1UL)
> > +				*dst = comp(*src, *dst, last);
> > +		}
> > +	} else {
> > +		/* Different alignment for source and dest */
> > +
> > +		int const left = shift & (bits-1);
> > +		int const right = -shift & (bits-1);
> > +
> > +		if ((unsigned long)dst_idx+1 >= n) {
> > +			/* Single destination word */
> > +			if (first)
> > +				last &= first;
> > +			if (shift < 0) {
> > +				/* Single source word */
> > +				*dst = comp(*src >> right, *dst, last);
> > +			} else if (1+(unsigned long)src_idx >= n) {
> > +				/* Single source word */
> > +				*dst = comp(*src << left, *dst, last);
> > +			} else {
> > +				/* 2 source words */
> > +				*dst = comp(*src << left | *(src-1) >> right,
> > +					    *dst, last);
> > +			}
> > +		} else {
> > +			/* Multiple destination words */
> > +			/** We must always remember the last value read,
> > +			    because in case SRC and DST overlap bitwise (e.g.
> > +			    when moving just one pixel in 1bpp), we always
> > +			    collect one full long for DST and that might
> > +			    overlap with the current long from SRC. We store
> > +			    this value in 'd0'. */
> > +			unsigned long d0, d1;
> > +			int m;
> > +
> > +			d0 = *src--;
> > +			/* Leading bits */
> > +			if (shift < 0) {
> > +				/* Single source word */
> > +				d1 = d0;
> > +				d0 >>= right;
> > +			} else {
> > +				/* 2 source words */
> > +				d1 = *src--;
> > +				d0 = d0 << left | d1 >> right;
> > +			}
> > +			if (!first)
> > +				*dst = d0;
> > +			else
> > +				*dst = comp(d0, *dst, first);
> > +			d0 = d1;
> > +			dst--;
> > +			n -= dst_idx+1;
> > +
> > +			/* Main chunk */
> > +			m = n % bits;
> > +			n /= bits;
> > +			while (n >= 4) {
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				n -= 4;
> > +			}
> > +			while (n--) {
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +			}
> > +
> > +			/* Trailing bits */
> > +			if (m) {
> > +				if (m <= bits - left) {
> > +					/* Single source word */
> > +					d0 <<= left;
> > +				} else {
> > +					/* 2 source words */
> > +					d1 = *src;
> > +					d0 = d0 << left | d1 >> right;
> > +				}
> > +				*dst = comp(d0, *dst, last);
> > +			}
> > +		}
> > +	}
> > +}
> > +
> > +/*
> > + * Generic Bit Block Transfer for frame buffers located in system RAM with
> > + * packed pixels of any depth.
> > + */
> > +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
> > +{
> > +	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
> > +	u32 height = area->height, width = area->width;
> > +	unsigned long const bits_per_line = p->fix.line_length*8u;
> > +	unsigned long *base = NULL;
> > +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> > +	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
> > +
> > +	if (p->state != FBINFO_STATE_RUNNING)
> > +		return;
> > +
> > +	/* if the beginning of the target area might overlap with the end of
> > +	the source area, we have to copy the area reverse. */
> > +	if ((dy == sy && dx > sx) || (dy > sy)) {
> > +		dy += height;
> > +		sy += height;
> > +		rev_copy = 1;
> > +	}
> > +
> > +	/* split the base of the framebuffer into a long-aligned address and
> > +	   the index of the first bit */
> > +	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> > +	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
> > +	/* add offset of source and target area */
> > +	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
> > +	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
> > +
> > +	if (p->fbops->fb_sync)
> > +		p->fbops->fb_sync(p);
> > +
> > +	if (rev_copy) {
> > +		while (height--) {
> > +			dst_idx -= bits_per_line;
> > +			src_idx -= bits_per_line;
> > +			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
> > +				base + (src_idx / bits), src_idx % bits, bits,
> > +				width*p->var.bits_per_pixel);
> > +		}
> > +	} else {
> > +		while (height--) {
> > +			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
> > +				base + (src_idx / bits), src_idx % bits, bits,
> > +				width*p->var.bits_per_pixel);
> > +			dst_idx += bits_per_line;
> > +			src_idx += bits_per_line;
> > +		}
> > +	}
> > +}
> > +EXPORT_SYMBOL(drm_copyarea);
> > +
> > +/*
> > + *  Aligned pattern fill using 32/64-bit memory accesses
> > + */
> > +static void
> > +bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		unsigned long pat, unsigned n, int bits)
> > +{
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(pat, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +
> > +		/* Leading bits */
> > + 		if (first!= ~0UL) {
> > +			*dst = comp(pat, *dst, first);
> > +			dst++;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 8) {
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			n -= 8;
> > +		}
> > +		while (n--)
> > +			*dst++ = pat;
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(pat, *dst, last);
> > +	}
> > +}
> > +
> > +
> > +/*
> > + *  Unaligned generic pattern fill using 32/64-bit memory accesses
> > + *  The pattern must have been expanded to a full 32/64-bit value
> > + *  Left/right are the appropriate shifts to convert to the pattern to be
> > + *  used for the next 32/64-bit word
> > + */
> > +static void
> > +bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		  unsigned long pat, int left, int right, unsigned n, int bits)
> > +{
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(pat, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +		/* Leading bits */
> > +		if (first) {
> > +			*dst = comp(pat, *dst, first);
> > +			dst++;
> > +			pat = pat << left | pat >> right;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 4) {
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			n -= 4;
> > +		}
> > +		while (n--) {
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +		}
> > +
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(pat, *dst, last);
> > +	}
> > +}
> > +
> > +/*
> > + *  Aligned pattern invert using 32/64-bit memory accesses
> > + */
> > +static void
> > +bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		    unsigned long pat, unsigned n, int bits)
> > +{
> > +	unsigned long val = pat;
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(*dst ^ val, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +		/* Leading bits */
> > +		if (first!=0UL) {
> > +			*dst = comp(*dst ^ val, *dst, first);
> > +			dst++;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 8) {
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			n -= 8;
> > +		}
> > +		while (n--)
> > +			*dst++ ^= val;
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(*dst ^ val, *dst, last);
> > +	}
> > +}
> > +
> > +
> > +/*
> > + *  Unaligned generic pattern invert using 32/64-bit memory accesses
> > + *  The pattern must have been expanded to a full 32/64-bit value
> > + *  Left/right are the appropriate shifts to convert to the pattern to be
> > + *  used for the next 32/64-bit word
> > + */
> > +static void
> > +bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		      unsigned long pat, int left, int right, unsigned n,
> > +		      int bits)
> > +{
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(*dst ^ pat, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +
> > +		/* Leading bits */
> > +		if (first != 0UL) {
> > +			*dst = comp(*dst ^ pat, *dst, first);
> > +			dst++;
> > +			pat = pat << left | pat >> right;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 4) {
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			n -= 4;
> > +		}
> > +		while (n--) {
> > +			*dst ^= pat;
> > +			pat = pat << left | pat >> right;
> > +		}
> > +
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(*dst ^ pat, *dst, last);
> > +	}
> > +}
> > +
> > +/*
> > + * Generic fillrect for frame buffers in system RAM with packed pixels of
> > + * any depth.
> > + */
> > +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
> > +{
> > +	unsigned long pat, pat2, fg;
> > +	unsigned long width = rect->width, height = rect->height;
> > +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> > +	u32 bpp = p->var.bits_per_pixel;
> > +	unsigned long *dst;
> > +	int dst_idx, left;
> > +
> > +	if (p->state != FBINFO_STATE_RUNNING)
> > +		return;
> > +
> > +	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> > +	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> > +		fg = ((u32 *) (p->pseudo_palette))[rect->color];
> > +	else
> > +		fg = rect->color;
> > +
> > +	pat = pixel_to_pat( bpp, fg);
> > +
> > +	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> > +	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
> > +	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
> > +	/* FIXME For now we support 1-32 bpp only */
> > +	left = bits % bpp;
> > +	if (p->fbops->fb_sync)
> > +		p->fbops->fb_sync(p);
> > +	if (!left) {
> > +		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
> > +				  int dst_idx, unsigned long pat, unsigned n,
> > +				  int bits) = NULL;
> > +
> > +		switch (rect->rop) {
> > +		case ROP_XOR:
> > +			fill_op32 = bitfill_aligned_rev;
> > +			break;
> > +		case ROP_COPY:
> > +			fill_op32 = bitfill_aligned;
> > +			break;
> > +		default:
> > +			printk( KERN_ERR "cfb_fillrect(): unknown rop, "
> > +				"defaulting to ROP_COPY\n");
> > +			fill_op32 = bitfill_aligned;
> > +			break;
> > +		}
> > +		while (height--) {
> > +			dst += dst_idx >> (ffs(bits) - 1);
> > +			dst_idx &= (bits - 1);
> > +			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
> > +			dst_idx += p->fix.line_length*8;
> > +		}
> > +	} else {
> > +		int right, r;
> > +		void (*fill_op)(struct fb_info *p, unsigned long *dst,
> > +				int dst_idx, unsigned long pat, int left,
> > +				int right, unsigned n, int bits) = NULL;
> > +#ifdef __LITTLE_ENDIAN
> > +		right = left;
> > +		left = bpp - right;
> > +#else
> > +		right = bpp - left;
> > +#endif
> > +		switch (rect->rop) {
> > +		case ROP_XOR:
> > +			fill_op = bitfill_unaligned_rev;
> > +			break;
> > +		case ROP_COPY:
> > +			fill_op = bitfill_unaligned;
> > +			break;
> > +		default:
> > +			printk(KERN_ERR "sys_fillrect(): unknown rop, "
> > +				"defaulting to ROP_COPY\n");
> > +			fill_op = bitfill_unaligned;
> > +			break;
> > +		}
> > +		while (height--) {
> > +			dst += dst_idx / bits;
> > +			dst_idx &= (bits - 1);
> > +			r = dst_idx % bpp;
> > +			/* rotate pattern to the correct start position */
> > +			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
> > +			fill_op(p, dst, dst_idx, pat2, left, right,
> > +				width*bpp, bits);
> > +			dst_idx += p->fix.line_length*8;
> > +		}
> > +	}
> > +}
> > +EXPORT_SYMBOL(drm_fillrect);
> > +
> > +static const u32 cfb_tab8_be[] = {
> > +    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
> > +    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
> > +    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
> > +    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab8_le[] = {
> > +    0x00000000,0xff000000,0x00ff0000,0xffff0000,
> > +    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
> > +    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
> > +    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab16_be[] = {
> > +    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab16_le[] = {
> > +    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab32[] = {
> > +	0x00000000, 0xffffffff
> > +};
> > +
> > +static void color_imageblit(const struct fb_image *image, struct fb_info *p,
> > +			    void *dst1, u32 start_index, u32 pitch_index)
> > +{
> > +	/* Draw the penguin */
> > +	u32 *dst, *dst2;
> > +	u32 color = 0, val, shift;
> > +	int i, n, bpp = p->var.bits_per_pixel;
> > +	u32 null_bits = 32 - bpp;
> > +	u32 *palette = (u32 *) p->pseudo_palette;
> > +	const u8 *src = image->data;
> > +
> > +	dst2 = dst1;
> > +	for (i = image->height; i--; ) {
> > +		n = image->width;
> > +		dst = dst1;
> > +		shift = 0;
> > +		val = 0;
> > +
> > +		if (start_index) {
> > +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> > +							 start_index));
> > +			val = *dst & start_mask;
> > +			shift = start_index;
> > +		}
> > +		while (n--) {
> > +			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> > +			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> > +				color = palette[*src];
> > +			else
> > +				color = *src;
> > +			color <<= FB_LEFT_POS(p, bpp);
> > +			val |= FB_SHIFT_HIGH(p, color, shift);
> > +			if (shift >= null_bits) {
> > +				*dst++ = val;
> > +
> > +				val = (shift == null_bits) ? 0 :
> > +					FB_SHIFT_LOW(p, color, 32 - shift);
> > +			}
> > +			shift += bpp;
> > +			shift &= (32 - 1);
> > +			src++;
> > +		}
> > +		if (shift) {
> > +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> > +
> > +			*dst &= end_mask;
> > +			*dst |= val;
> > +		}
> > +		dst1 += p->fix.line_length;
> > +		if (pitch_index) {
> > +			dst2 += p->fix.line_length;
> > +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> > +
> > +			start_index += pitch_index;
> > +			start_index &= 32 - 1;
> > +		}
> > +	}
> > +}
> > +
> > +static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
> > +				  void *dst1, u32 fgcolor, u32 bgcolor,
> > +				  u32 start_index, u32 pitch_index)
> > +{
> > +	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
> > +	u32 *dst, *dst2;
> > +	u32 val, pitch = p->fix.line_length;
> > +	u32 null_bits = 32 - bpp;
> > +	u32 spitch = (image->width+7)/8;
> > +	const u8 *src = image->data, *s;
> > +	u32 i, j, l;
> > +
> > +	dst2 = dst1;
> > +	fgcolor <<= FB_LEFT_POS(p, bpp);
> > +	bgcolor <<= FB_LEFT_POS(p, bpp);
> > +
> > +	for (i = image->height; i--; ) {
> > +		shift = val = 0;
> > +		l = 8;
> > +		j = image->width;
> > +		dst = dst1;
> > +		s = src;
> > +
> > +		/* write leading bits */
> > +		if (start_index) {
> > +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> > +							 start_index));
> > +			val = *dst & start_mask;
> > +			shift = start_index;
> > +		}
> > +
> > +		while (j--) {
> > +			l--;
> > +			color = (*s & (1 << l)) ? fgcolor : bgcolor;
> > +			val |= FB_SHIFT_HIGH(p, color, shift);
> > +
> > +			/* Did the bitshift spill bits to the next long? */
> > +			if (shift >= null_bits) {
> > +				*dst++ = val;
> > +				val = (shift == null_bits) ? 0 :
> > +					FB_SHIFT_LOW(p, color, 32 - shift);
> > +			}
> > +			shift += bpp;
> > +			shift &= (32 - 1);
> > +			if (!l) { l = 8; s++; }
> > +		}
> > +
> > +		/* write trailing bits */
> > + 		if (shift) {
> > +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> > +
> > +			*dst &= end_mask;
> > +			*dst |= val;
> > +		}
> > +
> > +		dst1 += pitch;
> > +		src += spitch;
> > +		if (pitch_index) {
> > +			dst2 += pitch;
> > +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> > +			start_index += pitch_index;
> > +			start_index &= 32 - 1;
> > +		}
> > +
> > +	}
> > +}
> > +
> > +/*
> > + * fast_imageblit - optimized monochrome color expansion
> > + *
> > + * Only if:  bits_per_pixel == 8, 16, or 32
> > + *           image->width is divisible by pixel/dword (ppw);
> > + *           fix->line_length is divisible by 4;
> > + *           beginning and end of a scanline is dword aligned
> > + */
> > +static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
> > +				  void *dst1, u32 fgcolor, u32 bgcolor)
> > +{
> > +	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
> > +	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
> > +	u32 bit_mask, end_mask, eorx, shift;
> > +	const char *s = image->data, *src;
> > +	u32 *dst;
> > +	const u32 *tab = NULL;
> > +	int i, j, k;
> > +
> > +	switch (bpp) {
> > +	case 8:
> > +		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
> > +		break;
> > +	case 16:
> > +		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
> > +		break;
> > +	case 32:
> > +	default:
> > +		tab = cfb_tab32;
> > +		break;
> > +	}
> > +
> > +	for (i = ppw-1; i--; ) {
> > +		fgx <<= bpp;
> > +		bgx <<= bpp;
> > +		fgx |= fgcolor;
> > +		bgx |= bgcolor;
> > +	}
> > +
> > +	bit_mask = (1 << ppw) - 1;
> > +	eorx = fgx ^ bgx;
> > +	k = image->width/ppw;
> > +
> > +	for (i = image->height; i--; ) {
> > +		dst = dst1;
> > +		shift = 8;
> > +		src = s;
> > +
> > +		for (j = k; j--; ) {
> > +			shift -= ppw;
> > +			end_mask = tab[(*src >> shift) & bit_mask];
> > +			*dst++ = (end_mask & eorx) ^ bgx;
> > +			if (!shift) {
> > +				shift = 8;
> > +				src++;
> > +			}
> > +		}
> > +		dst1 += p->fix.line_length;
> > +		s += spitch;
> > +	}
> > +}
> > +
> > +/*
> > + * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
> > + * for frame buffer located in system RAM with packed pixels of any depth.
> > + */
> > +void drm_imageblit(struct fb_info *p, const struct fb_image *image)
> > +{
> > +	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
> > +	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
> > +	u32 width = image->width;
> > +	u32 dx = image->dx, dy = image->dy;
> > +	void *dst1;
> > +
> > +	if (p->state != FBINFO_STATE_RUNNING)
> > +		return;
> > +
> > +	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
> > +	start_index = bitstart & (32 - 1);
> > +	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
> > +
> > +	bitstart /= 8;
> > +	bitstart &= ~(bpl - 1);
> > +	dst1 = (void __force *)p->screen_base + bitstart;
> > +
> > +	if (p->fbops->fb_sync)
> > +		p->fbops->fb_sync(p);
> > +
> > +	if (image->depth == 1) {
> > +		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> > +		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
> > +			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
> > +			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
> > +		} else {
> > +			fgcolor = image->fg_color;
> > +			bgcolor = image->bg_color;
> > +		}
> > +
> > +		if (32 % bpp == 0 && !start_index && !pitch_index &&
> > +		    ((width & (32/bpp-1)) == 0) &&
> > +		    bpp >= 8 && bpp <= 32)
> > +			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
> > +		else
> > +			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
> > +					start_index, pitch_index);
> > +	} else
> > +		color_imageblit(image, p, dst1, start_index, pitch_index);
> > +}
> > +EXPORT_SYMBOL(drm_imageblit);
> > +
> > +/*
> > + * Generic file operations where framebuffer is in system RAM
> > + */
> > +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
> > +{
> > +	unsigned long p = *ppos;
> > +	void *src;
> > +	int err = 0;
> > +	unsigned long total_size;
> > +
> > +	if (info->state != FBINFO_STATE_RUNNING)
> > +		return -EPERM;
> > +
> > +	total_size = info->screen_size;
> > +
> > +	if (total_size == 0)
> > +		total_size = info->fix.smem_len;
> > +
> > +	if (p >= total_size)
> > +		return 0;
> > +
> > +	if (count >= total_size)
> > +		count = total_size;
> > +
> > +	if (count + p > total_size)
> > +		count = total_size - p;
> > +
> > +	src = (void __force *)(info->screen_base + p);
> > +
> > +	if (info->fbops->fb_sync)
> > +		info->fbops->fb_sync(info);
> > +
> > +	if (copy_to_user(buf, src, count))
> > +		err = -EFAULT;
> > +
> > +	if  (!err)
> > +		*ppos += count;
> > +
> > +	return (err) ? err : count;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_fb_read);
> > +
> > +/*
> > + * Generic file operations where framebuffer is in system RAM
> > + */
> > +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
> > +{
> > +	unsigned long p = *ppos;
> > +	void *dst;
> > +	int err = 0;
> > +	unsigned long total_size;
> > +
> > +	if (info->state != FBINFO_STATE_RUNNING)
> > +		return -EPERM;
> > +
> > +	total_size = info->screen_size;
> > +
> > +	if (total_size == 0)
> > +		total_size = info->fix.smem_len;
> > +
> > +	if (p > total_size)
> > +		return -EFBIG;
> > +
> > +	if (count > total_size) {
> > +		err = -EFBIG;
> > +		count = total_size;
> > +	}
> > +
> > +	if (count + p > total_size) {
> > +		if (!err)
> > +			err = -ENOSPC;
> > +
> > +		count = total_size - p;
> > +	}
> > +
> > +	dst = (void __force *) (info->screen_base + p);
> > +
> > +	if (info->fbops->fb_sync)
> > +		info->fbops->fb_sync(info);
> > +
> > +	if (copy_from_user(dst, buf, count))
> > +		err = -EFAULT;
> > +
> > +	if  (!err)
> > +		*ppos += count;
> > +
> > +	return (err) ? err : count;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_fb_write);
> > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> > index da0d96a69570..18fba9818635 100644
> > --- a/drivers/gpu/drm/drm_fb_helper.c
> > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > @@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
> >  EXPORT_SYMBOL(drm_fb_helper_deferred_io);
> >  
> >  /**
> > - * drm_fb_helper_sys_read - wrapper around fb_sys_read
> > + * drm_fb_helper_sys_read - wrapper around drm_fb_read
> >   * @info: fb_info struct pointer
> >   * @buf: userspace buffer to read from framebuffer memory
> >   * @count: number of bytes to read from framebuffer memory
> >   * @ppos: read offset within framebuffer memory
> >   *
> > - * A wrapper around fb_sys_read implemented by fbdev core
> > + * A wrapper around drm_fb_read
> >   */
> >  ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
> >  			       size_t count, loff_t *ppos)
> >  {
> > -	return fb_sys_read(info, buf, count, ppos);
> > +	return drm_fb_read(info, buf, count, ppos);
> >  }
> >  EXPORT_SYMBOL(drm_fb_helper_sys_read);
> >  
> >  /**
> > - * drm_fb_helper_sys_write - wrapper around fb_sys_write
> > + * drm_fb_helper_sys_write - wrapper around drm_fb_write
> >   * @info: fb_info struct pointer
> >   * @buf: userspace buffer to write to framebuffer memory
> >   * @count: number of bytes to write to framebuffer memory
> >   * @ppos: write offset within framebuffer memory
> >   *
> > - * A wrapper around fb_sys_write implemented by fbdev core
> > + * A wrapper around drm_fb_write
> >   */
> >  ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
> >  				size_t count, loff_t *ppos)
> >  {
> >  	ssize_t ret;
> >  
> > -	ret = fb_sys_write(info, buf, count, ppos);
> > +	ret = drm_fb_write(info, buf, count, ppos);
> >  	if (ret > 0)
> >  		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
> >  				    info->var.yres);
> > @@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
> >  EXPORT_SYMBOL(drm_fb_helper_sys_write);
> >  
> >  /**
> > - * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
> > + * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
> >   * @info: fbdev registered by the helper
> >   * @rect: info about rectangle to fill
> >   *
> > - * A wrapper around sys_fillrect implemented by fbdev core
> > + * A wrapper around drm_fillrect
> >   */
> >  void drm_fb_helper_sys_fillrect(struct fb_info *info,
> >  				const struct fb_fillrect *rect)
> >  {
> > -	sys_fillrect(info, rect);
> > +	drm_fillrect(info, rect);
> >  	drm_fb_helper_dirty(info, rect->dx, rect->dy,
> >  			    rect->width, rect->height);
> >  }
> >  EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
> >  
> >  /**
> > - * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
> > + * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
> >   * @info: fbdev registered by the helper
> >   * @area: info about area to copy
> >   *
> > - * A wrapper around sys_copyarea implemented by fbdev core
> > + * A wrapper around drm_copyarea
> >   */
> >  void drm_fb_helper_sys_copyarea(struct fb_info *info,
> >  				const struct fb_copyarea *area)
> >  {
> > -	sys_copyarea(info, area);
> > +	drm_copyarea(info, area);
> >  	drm_fb_helper_dirty(info, area->dx, area->dy,
> >  			    area->width, area->height);
> >  }
> >  EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
> >  
> >  /**
> > - * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
> > + * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
> >   * @info: fbdev registered by the helper
> >   * @image: info about image to blit
> >   *
> > - * A wrapper around sys_imageblit implemented by fbdev core
> > + * A wrapper around drm_imageblit
> >   */
> >  void drm_fb_helper_sys_imageblit(struct fb_info *info,
> >  				 const struct fb_image *image)
> >  {
> > -	sys_imageblit(info, image);
> > +	drm_imageblit(info, image);
> >  	drm_fb_helper_dirty(info, image->dx, image->dy,
> >  			    image->width, image->height);
> >  }
> > diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
> > index 306aa3a60be9..d38005f6128a 100644
> > --- a/include/drm/drm_fb_helper.h
> > +++ b/include/drm/drm_fb_helper.h
> > @@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
> >  
> >  void drm_fbdev_generic_setup(struct drm_device *dev,
> >  			     unsigned int preferred_bpp);
> > +
> > +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
> > +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
> > +void drm_imageblit(struct fb_info *p, const struct fb_image *image);
> > +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
> > +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
> >  #else
> >  static inline void drm_fb_helper_prepare(struct drm_device *dev,
> >  					struct drm_fb_helper *helper,
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 
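
For reference, the core of the quoted fast_imageblit() at 8 bpp boils down
to a table lookup plus two bitwise operations. Below is a minimal userspace
sketch, assuming a little-endian framebuffer. The name expand4_8bpp_le() is
hypothetical and not part of any patch in this thread; the table is copied
from cfb_tab8_le in the quoted code.

```c
#include <stdint.h>

/* Per-pixel byte masks for 4 monochrome bits, little-endian layout
 * (copied from cfb_tab8_le in the quoted patch). */
static const uint32_t tab8_le[16] = {
	0x00000000, 0xff000000, 0x00ff0000, 0xffff0000,
	0x0000ff00, 0xff00ff00, 0x00ffff00, 0xffffff00,
	0x000000ff, 0xff0000ff, 0x00ff00ff, 0xffff00ff,
	0x0000ffff, 0xff00ffff, 0x00ffffff, 0xffffffff
};

/*
 * Hypothetical helper: expand 4 monochrome source bits into four
 * 8-bit pixels packed into one 32-bit word.
 */
static uint32_t expand4_8bpp_le(unsigned int nibble, uint8_t fg, uint8_t bg)
{
	/* Replicate each color into all four pixel slots. */
	uint32_t fgx = fg * 0x01010101u;
	uint32_t bgx = bg * 0x01010101u;
	uint32_t mask = tab8_le[nibble & 0xf];

	/* Where mask is set this yields fgx, elsewhere bgx. */
	return (mask & (fgx ^ bgx)) ^ bgx;
}
```

The XOR form selects between the two colors without a branch per pixel,
which is the reason the fast path only applies to 8, 16 and 32 bpp with
aligned scanlines.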




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
@ 2020-08-04  9:27               ` daniel
  0 siblings, 0 replies; 34+ messages in thread
From: daniel @ 2020-08-04  9:27 UTC (permalink / raw)
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, Sam Ravnborg, peda, dan.carpenter

On Mon, Aug 03, 2020 at 08:46:34AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 02.08.20 um 22:01 schrieb Sam Ravnborg:
> > On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
> >> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
> >>> Hi Daniel.
> >>>
> >>> On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
> >>>> On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
> >>>>> DRM fb helpers require read and write functions for framebuffer
> >>>>> memory. Export the existing code from fbdev.
> >>>>>
> >>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> >>>>
> >>>> Hm I'm not super sure whether we want to actually reuse this stuff ... We
> >>>> kinda don't care about the sparc special case, and just having an fbdev
> >>>> implementation which has the switch between memcpy and memcpy_to/from_io
> >>>> in one single place sounds a lot simpler ...
> >>>>
> >>>> This way we can have a clean split between the old horrors of real fbdev
> >>>> drivers, and a much cleaner world in drm. It would mean a bit of
> >>>> copypasting, but I think that's actually a good thing.
> >>>>
> >>>> In general my idea for drm fbdev emulation is that for any area we have a
> >>>> problem we just ignore the entire fbmem.c code and write our own: mmap,
> >>>> backlight handling (still unsolved, and horrible), cfb vs sys here. This
> >>>> entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
> >>>> duplication here doesn't seem worth it imo.
> >>>>
> >>>> Thoughts?
> >>>
> >>>
> >>> I can see that fbmem is a mix of ioctl support and other stuff.
> >>> We could factor out all the ioctl parts of fbmem.c to a new file
> >>> named fbioctl.c.
> >>>
> >>> And then let the ioctl parts call down into drm stuff and avoid reusing
> >>> the fbdev code when we first reach drm code.
> >>> This would require local copies of:
> >>> sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
> >>> and more I think which I missed.
> >>>
> >>> With local copies we could avoid some of the special cases and trim the
> >>> functions to what is required by drm only.
> >>> And then no more fbmem dependencies and no dependencies to several of
> >>> the small helper functions. So less entanglement with fbdev core.
> >>>
> >>> This all sounds simple so I am surely missing a lot of ugly details here.
> >>>
> >>> And should we touch this anyway we need a test suite to verify not too
> >>> much breaks. To the best of my knowledge there is not yet such a test
> >>> suite :-( Maybe because people caring about fbdev are limited.
> >>
> >> Well my idea was to not refactor anything, but just have drm copies of the
> >> various fb_ops callbacks. Definitely not even more refactoring :-)
> 
> Thanks for making a prototype, Sam.
> 
> But do we really want to duplicate all this code from fbdev? It's not
> actually pretty and there's little value in rewriting it. If anything,
> I can see us extending drm_format_helpers.c and building upon this.

Hm yeah this is a lot more horrible than I thought, I was kinda just
looking at read/write functions, maybe also mmap, and figured we could
simplify this a lot if we just inline and be done.

But all these image/fill/copy functions are pretty bad, and they seem to
care about a lot of corner cases that just don't matter anymore on modern
hw. Also agreeing with Thomas that if we do copy this, then building on top
of format helpers would make more sense. At that point we probably need a
test suite for fbdev, and this entire endeavour becomes rather silly.

So maybe compromise approach? E.g. we reuse fbdev helpers with a switch
between sys and cfb if it would drag in a huge pile of code for format
handling silliness. But when all it would drag in is a memcpy_to/fromio
or similar, we inline to reduce a bit the midlayer and make it clearer
what's going on. Strictly speaking would probably still need a test suite,
but I guess we can delay that until the first bug report or so :-)
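
To illustrate the "switch in one single place" idea: a minimal userspace
sketch, assuming a hypothetical struct fb_mapping that records whether the
framebuffer lives in I/O memory. model_memcpy_toio() is only a stand-in for
the kernel's memcpy_toio(); none of these names come from the patches in
this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a framebuffer mapping; not a DRM structure. */
struct fb_mapping {
	void *vaddr;    /* start of the mapped framebuffer */
	size_t size;    /* size of the mapping in bytes */
	bool is_iomem;  /* true if vaddr refers to I/O memory */
};

/*
 * Userspace stand-in for memcpy_toio(): byte-wise stores through a
 * volatile pointer, so the compiler cannot merge or drop them.
 */
static void model_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/*
 * Copy up to len bytes into the framebuffer at byte offset off.
 * The length is clamped to the mapping; returns the bytes copied.
 */
static size_t fb_mapping_write(struct fb_mapping *map, size_t off,
			       const void *src, size_t len)
{
	unsigned char *dst;

	if (off >= map->size)
		return 0;
	if (len > map->size - off)
		len = map->size - off;

	dst = (unsigned char *)map->vaddr + off;

	/* The only place that cares about the kind of memory. */
	if (map->is_iomem)
		model_memcpy_toio(dst, src, len);
	else
		memcpy(dst, src, len);

	return len;
}
```

With the decision confined to one helper, callers such as a write or read
file operation would not need separate sys and cfb variants.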

Cheers, Daniel


> 
> Best regards
> Thomas
> 
> > 
> > $ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
> > 1212 drivers/gpu/drm/drm_fb_fbdev_helper.c
> > 
> > This is a straight copy of:
> > $ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
> > EXPORT_SYMBOL(drm_copyarea);
> > EXPORT_SYMBOL(drm_fillrect);
> > EXPORT_SYMBOL(drm_imageblit);
> > EXPORT_SYMBOL_GPL(drm_fb_read);
> > EXPORT_SYMBOL_GPL(drm_fb_write);
> > 
> > copyarea, fillrect and imageblit are a copy of the sys variants.
> > read and write are from fb_sys_fops.c
> > 
> > Patch builds but I am not a big fan yet - right now it gains us very
> > little. What would the next step be?
> > 
> > I looked briefly at the few drivers that have not migrated to
> > the generic fbdev emulation (yet?).
> >   msm
> >   omapdrm
> >   gma500
> > 
> > all use the sys_ variants - they already use the migrated variants as
> > they call the drm_fb helper.
> > 
> > 
> >   exynos
> >   gma500
> >   i915
> >   nouveau
> >   radeon
> >   rockchip
> >   vmwgfx
> > 
> > all use the cfb variants and cannot use the migrated variants as-is.
> > 
> > 	Sam
> > 
> > From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
> > From: Sam Ravnborg <sam@ravnborg.org>
> > Date: Fri, 31 Jul 2020 23:36:01 +0200
> > Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm
> > 
> > Copy the fbdev functions used by fb_ops to drm.
> > Having local copies enables the possibility
> > to adjust the functions so they are tailored to the needs of DRM.
> > At the same time there is one less dependency between DRM and fbdev.
> > 
> > The files were copied verbatim and only minimal changes were made.
> > 
> > As some drivers use the fbdev functions they continue to be built but
> > the generic fbdev emulation no longer uses the fbdev functions.
> > 
> > Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
> > Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/Kconfig               |    1 -
> >  drivers/gpu/drm/Makefile              |    1 +
> >  drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
> >  drivers/gpu/drm/drm_fb_helper.c       |   30 +-
> >  include/drm/drm_fb_helper.h           |    6 +
> >  5 files changed, 1234 insertions(+), 16 deletions(-)
> >  create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c
> > 
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index c4fd57d8b717..11f41bc31f05 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
> >  	select FB
> >  	select FRAMEBUFFER_CONSOLE if !EXPERT
> >  	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
> > -	select FB_SYS_FOPS
> >  	select FB_SYS_FILLRECT
> >  	select FB_SYS_COPYAREA
> >  	select FB_SYS_IMAGEBLIT
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index 02ee5faf1a92..398cb4746762 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
> >  
> >  drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
> >  drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
> > +drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
> >  drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
> >  drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
> >  drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
> > diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> > new file mode 100644
> > index 000000000000..dda60f85b9d5
> > --- /dev/null
> > +++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
> > @@ -0,0 +1,1212 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * This file is subject to the terms and conditions of the GNU General Public
> > + * License.  See the file COPYING in the main directory of this archive for
> > + * more details.
> > + *
> > + * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
> > + *
> > + * drm_copyarea:
> > + *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
> > + *   on Geert Uytterhoeven's copyarea routine)
> > + *
> > + *  drm_fillrect:
> > + *    Based almost entirely from cfbfillrect.c (which is based almost entirely
> > + *    on Geert Uytterhoeven's fillrect routine)
> > + *
> > + *  drm_imageblit:
> > + *    Based almost entirely on cfbimgblt.c
> > + *
> > + */
> > +
> > +#include <linux/export.h>
> > +#include <linux/fb.h>
> > +#include <linux/io.h>
> > +
> > +
> > +/*
> > +*  Compose two values, using a bitmask as decision value
> > +*  This is equivalent to (a & mask) | (b & ~mask)
> > +*/
> > +static inline unsigned long
> > +comp(unsigned long a, unsigned long b, unsigned long mask)
> > +{
> > +    return ((a ^ b) & mask) ^ b;
> > +}
> > +
> > +/*
> > + *  Create a pattern with the given pixel's color
> > + */
> > +#if BITS_PER_LONG == 64
> > +static inline unsigned long
> > +pixel_to_pat( u32 bpp, u32 pixel)
> > +{
> > +	switch (bpp) {
> > +	case 1:
> > +		return 0xfffffffffffffffful*pixel;
> > +	case 2:
> > +		return 0x5555555555555555ul*pixel;
> > +	case 4:
> > +		return 0x1111111111111111ul*pixel;
> > +	case 8:
> > +		return 0x0101010101010101ul*pixel;
> > +	case 12:
> > +		return 0x1001001001001001ul*pixel;
> > +	case 16:
> > +		return 0x0001000100010001ul*pixel;
> > +	case 24:
> > +		return 0x0001000001000001ul*pixel;
> > +	case 32:
> > +		return 0x0000000100000001ul*pixel;
> > +	default:
> > +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> > +		return 0;
> > +    }
> > +}
> > +#else
> > +static inline unsigned long
> > +pixel_to_pat( u32 bpp, u32 pixel)
> > +{
> > +	switch (bpp) {
> > +	case 1:
> > +		return 0xfffffffful*pixel;
> > +	case 2:
> > +		return 0x55555555ul*pixel;
> > +	case 4:
> > +		return 0x11111111ul*pixel;
> > +	case 8:
> > +		return 0x01010101ul*pixel;
> > +	case 12:
> > +		return 0x01001001ul*pixel;
> > +	case 16:
> > +		return 0x00010001ul*pixel;
> > +	case 24:
> > +		return 0x01000001ul*pixel;
> > +	case 32:
> > +		return 0x00000001ul*pixel;
> > +	default:
> > +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
> > +		return 0;
> > +    }
> > +}
> > +#endif
> > +
> > +#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
> > +#if BITS_PER_LONG == 64
> > +#define REV_PIXELS_MASK1 0x5555555555555555ul
> > +#define REV_PIXELS_MASK2 0x3333333333333333ul
> > +#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
> > +#else
> > +#define REV_PIXELS_MASK1 0x55555555ul
> > +#define REV_PIXELS_MASK2 0x33333333ul
> > +#define REV_PIXELS_MASK4 0x0f0f0f0ful
> > +#endif
> > +
> > +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> > +						  u32 bswapmask)
> > +{
> > +	if (bswapmask & 1)
> > +		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
> > +	if (bswapmask & 2)
> > +		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
> > +	if (bswapmask & 3)
> > +		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
> > +	return val;
> > +}
> > +
> > +static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
> > +					     u32 bswapmask)
> > +{
> > +	u32 mask;
> > +
> > +	if (!bswapmask) {
> > +		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
> > +	} else {
> > +		mask = 0xff << FB_LEFT_POS(p, 8);
> > +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> > +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> > +#if defined(__i386__) || defined(__x86_64__)
> > +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> > +		if(index + bswapmask < 32)
> > +#endif
> > +			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
> > +					(index + bswapmask) & ~(bswapmask));
> > +	}
> > +	return mask;
> > +}
> > +
> > +static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
> > +							u32 index,
> > +							u32 bswapmask)
> > +{
> > +	unsigned long mask;
> > +
> > +	if (!bswapmask) {
> > +		mask = FB_SHIFT_HIGH(p, ~0UL, index);
> > +	} else {
> > +		mask = 0xff << FB_LEFT_POS(p, 8);
> > +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
> > +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
> > +#if defined(__i386__) || defined(__x86_64__)
> > +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
> > +		if(index + bswapmask < BITS_PER_LONG)
> > +#endif
> > +			mask |= FB_SHIFT_HIGH(p, ~0UL,
> > +					(index + bswapmask) & ~(bswapmask));
> > +	}
> > +	return mask;
> > +}
> > +
> > +
> > +static inline u32 fb_compute_bswapmask(struct fb_info *info)
> > +{
> > +	u32 bswapmask = 0;
> > +	unsigned bpp = info->var.bits_per_pixel;
> > +
> > +	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
> > +		/*
> > +		 * Reversed order of pixel layout in bytes
> > +		 * works only for 1, 2 and 4 bpp
> > +		 */
> > +		bswapmask = 7 - bpp + 1;
> > +	}
> > +	return bswapmask;
> > +}
> > +
> > +#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> > +
> > +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
> > +						  u32 bswapmask)
> > +{
> > +	return val;
> > +}
> > +
> > +#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
> > +#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
> > +#define fb_compute_bswapmask(...) 0
> > +
> > +#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
> > +
> > +#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
> > +#define _cpu_to_le_long(x) __cpu_to_le_long(x)
> > +#define __cpu_to_le_long(x) cpu_to_le##x
> > +
> > +#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
> > +#define _le_long_to_cpu(x) __le_long_to_cpu(x)
> > +#define __le_long_to_cpu(x) le##x##_to_cpu
> > +
> > +static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
> > +{
> > +	return (word << shift) | (word >> (x - shift));
> > +}
> > +
> > +
> > +/*
> > + *  Generic bitwise copy algorithm
> > + */
> > +static void
> > +bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> > +	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
> > +{
> > +	unsigned long first, last;
> > +	int const shift = dst_idx-src_idx;
> > +	int left, right;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (!shift) {
> > +		/* Same alignment for source and dest */
> > +		if (dst_idx+n <= bits) {
> > +			/* Single word */
> > +			if (last)
> > +				first &= last;
> > +			*dst = comp(*src, *dst, first);
> > +		} else {
> > +			/* Multiple destination words */
> > +			/* Leading bits */
> > + 			if (first != ~0UL) {
> > +				*dst = comp(*src, *dst, first);
> > +				dst++;
> > +				src++;
> > +				n -= bits - dst_idx;
> > +			}
> > +
> > +			/* Main chunk */
> > +			n /= bits;
> > +			while (n >= 8) {
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				*dst++ = *src++;
> > +				n -= 8;
> > +			}
> > +			while (n--)
> > +				*dst++ = *src++;
> > +
> > +			/* Trailing bits */
> > +			if (last)
> > +				*dst = comp(*src, *dst, last);
> > +		}
> > +	} else {
> > +		unsigned long d0, d1;
> > +		int m;
> > +
> > +		/* Different alignment for source and dest */
> > +		right = shift & (bits - 1);
> > +		left = -shift & (bits - 1);
> > +
> > +		if (dst_idx+n <= bits) {
> > +			/* Single destination word */
> > +			if (last)
> > +				first &= last;
> > +			if (shift > 0) {
> > +				/* Single source word */
> > +				*dst = comp(*src << left, *dst, first);
> > +			} else if (src_idx+n <= bits) {
> > +				/* Single source word */
> > +				*dst = comp(*src >> right, *dst, first);
> > +			} else {
> > +				/* 2 source words */
> > +				d0 = *src++;
> > +				d1 = *src;
> > +				*dst = comp(d0 >> right | d1 << left, *dst,
> > +					    first);
> > +			}
> > +		} else {
> > +			/* Multiple destination words */
> > +			/** We must always remember the last value read,
> > +			    because in case SRC and DST overlap bitwise (e.g.
> > +			    when moving just one pixel in 1bpp), we always
> > +			    collect one full long for DST and that might
> > +			    overlap with the current long from SRC. We store
> > +			    this value in 'd0'. */
> > +			d0 = *src++;
> > +			/* Leading bits */
> > +			if (shift > 0) {
> > +				/* Single source word */
> > +				*dst = comp(d0 << left, *dst, first);
> > +				dst++;
> > +				n -= bits - dst_idx;
> > +			} else {
> > +				/* 2 source words */
> > +				d1 = *src++;
> > +				*dst = comp(d0 >> right | d1 << left, *dst,
> > +					    first);
> > +				d0 = d1;
> > +				dst++;
> > +				n -= bits - dst_idx;
> > +			}
> > +
> > +			/* Main chunk */
> > +			m = n % bits;
> > +			n /= bits;
> > +			while (n >= 4) {
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +				n -= 4;
> > +			}
> > +			while (n--) {
> > +				d1 = *src++;
> > +				*dst++ = d0 >> right | d1 << left;
> > +				d0 = d1;
> > +			}
> > +
> > +			/* Trailing bits */
> > +			if (m) {
> > +				if (m <= bits - right) {
> > +					/* Single source word */
> > +					d0 >>= right;
> > +				} else {
> > +					/* 2 source words */
> > + 					d1 = *src;
> > +					d0 = d0 >> right | d1 << left;
> > +				}
> > +				*dst = comp(d0, *dst, last);
> > +			}
> > +		}
> > +	}
> > +}
> > +
> > +/*
> > + *  Generic bitwise copy algorithm, operating backward
> > + */
> > +static void
> > +bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
> > +	   const unsigned long *src, unsigned src_idx, unsigned bits,
> > +	   unsigned n)
> > +{
> > +	unsigned long first, last;
> > +	int shift;
> > +
> > +	dst += (dst_idx + n - 1) / bits;
> > +	src += (src_idx + n - 1) / bits;
> > +	dst_idx = (dst_idx + n - 1) % bits;
> > +	src_idx = (src_idx + n - 1) % bits;
> > +
> > +	shift = dst_idx-src_idx;
> > +
> > +	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
> > +	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
> > +
> > +	if (!shift) {
> > +		/* Same alignment for source and dest */
> > +		if ((unsigned long)dst_idx+1 >= n) {
> > +			/* Single word */
> > +			if (first)
> > +				last &= first;
> > +			*dst = comp(*src, *dst, last);
> > +		} else {
> > +			/* Multiple destination words */
> > +
> > +			/* Leading bits */
> > +			if (first) {
> > +				*dst = comp(*src, *dst, first);
> > +				dst--;
> > +				src--;
> > +				n -= dst_idx+1;
> > +			}
> > +
> > +			/* Main chunk */
> > +			n /= bits;
> > +			while (n >= 8) {
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				*dst-- = *src--;
> > +				n -= 8;
> > +			}
> > +			while (n--)
> > +				*dst-- = *src--;
> > +			/* Trailing bits */
> > +			if (last != -1UL)
> > +				*dst = comp(*src, *dst, last);
> > +		}
> > +	} else {
> > +		/* Different alignment for source and dest */
> > +
> > +		int const left = shift & (bits-1);
> > +		int const right = -shift & (bits-1);
> > +
> > +		if ((unsigned long)dst_idx+1 >= n) {
> > +			/* Single destination word */
> > +			if (first)
> > +				last &= first;
> > +			if (shift < 0) {
> > +				/* Single source word */
> > +				*dst = comp(*src >> right, *dst, last);
> > +			} else if (1+(unsigned long)src_idx >= n) {
> > +				/* Single source word */
> > +				*dst = comp(*src << left, *dst, last);
> > +			} else {
> > +				/* 2 source words */
> > +				*dst = comp(*src << left | *(src-1) >> right,
> > +					    *dst, last);
> > +			}
> > +		} else {
> > +			/* Multiple destination words */
> > +			/** We must always remember the last value read,
> > +			    because in case SRC and DST overlap bitwise (e.g.
> > +			    when moving just one pixel in 1bpp), we always
> > +			    collect one full long for DST and that might
> > +			    overlap with the current long from SRC. We store
> > +			    this value in 'd0'. */
> > +			unsigned long d0, d1;
> > +			int m;
> > +
> > +			d0 = *src--;
> > +			/* Leading bits */
> > +			if (shift < 0) {
> > +				/* Single source word */
> > +				d1 = d0;
> > +				d0 >>= right;
> > +			} else {
> > +				/* 2 source words */
> > +				d1 = *src--;
> > +				d0 = d0 << left | d1 >> right;
> > +			}
> > +			if (!first)
> > +				*dst = d0;
> > +			else
> > +				*dst = comp(d0, *dst, first);
> > +			d0 = d1;
> > +			dst--;
> > +			n -= dst_idx+1;
> > +
> > +			/* Main chunk */
> > +			m = n % bits;
> > +			n /= bits;
> > +			while (n >= 4) {
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +				n -= 4;
> > +			}
> > +			while (n--) {
> > +				d1 = *src--;
> > +				*dst-- = d0 << left | d1 >> right;
> > +				d0 = d1;
> > +			}
> > +
> > +			/* Trailing bits */
> > +			if (m) {
> > +				if (m <= bits - left) {
> > +					/* Single source word */
> > +					d0 <<= left;
> > +				} else {
> > +					/* 2 source words */
> > +					d1 = *src;
> > +					d0 = d0 << left | d1 >> right;
> > +				}
> > +				*dst = comp(d0, *dst, last);
> > +			}
> > +		}
> > +	}
> > +}
> > +
> > +/*
> > + * Generic Bit Block Transfer for frame buffers located in system RAM with
> > + * packed pixels of any depth.
> > + */
> > +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
> > +{
> > +	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
> > +	u32 height = area->height, width = area->width;
> > +	unsigned long const bits_per_line = p->fix.line_length*8u;
> > +	unsigned long *base = NULL;
> > +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> > +	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
> > +
> > +	if (p->state != FBINFO_STATE_RUNNING)
> > +		return;
> > +
> > +	/* if the beginning of the target area might overlap with the end of
> > +	the source area, we have to copy the area in reverse. */
> > +	if ((dy == sy && dx > sx) || (dy > sy)) {
> > +		dy += height;
> > +		sy += height;
> > +		rev_copy = 1;
> > +	}
> > +
> > +	/* split the base of the framebuffer into a long-aligned address and
> > +	   the index of the first bit */
> > +	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> > +	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
> > +	/* add offset of source and target area */
> > +	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
> > +	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
> > +
> > +	if (p->fbops->fb_sync)
> > +		p->fbops->fb_sync(p);
> > +
> > +	if (rev_copy) {
> > +		while (height--) {
> > +			dst_idx -= bits_per_line;
> > +			src_idx -= bits_per_line;
> > +			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
> > +				base + (src_idx / bits), src_idx % bits, bits,
> > +				width*p->var.bits_per_pixel);
> > +		}
> > +	} else {
> > +		while (height--) {
> > +			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
> > +				base + (src_idx / bits), src_idx % bits, bits,
> > +				width*p->var.bits_per_pixel);
> > +			dst_idx += bits_per_line;
> > +			src_idx += bits_per_line;
> > +		}
> > +	}
> > +}
> > +EXPORT_SYMBOL(drm_copyarea);
> > +
> > +/*
> > + *  Aligned pattern fill using 32/64-bit memory accesses
> > + */
> > +static void
> > +bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		unsigned long pat, unsigned n, int bits)
> > +{
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(pat, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +
> > +		/* Leading bits */
> > +		if (first != ~0UL) {
> > +			*dst = comp(pat, *dst, first);
> > +			dst++;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 8) {
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			*dst++ = pat;
> > +			n -= 8;
> > +		}
> > +		while (n--)
> > +			*dst++ = pat;
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(pat, *dst, last);
> > +	}
> > +}
> > +
> > +
> > +/*
> > + *  Unaligned generic pattern fill using 32/64-bit memory accesses
> > + *  The pattern must have been expanded to a full 32/64-bit value
> > + *  Left/right are the appropriate shifts to convert to the pattern to be
> > + *  used for the next 32/64-bit word
> > + */
> > +static void
> > +bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		  unsigned long pat, int left, int right, unsigned n, int bits)
> > +{
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(pat, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +		/* Leading bits */
> > +		if (first) {
> > +			*dst = comp(pat, *dst, first);
> > +			dst++;
> > +			pat = pat << left | pat >> right;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 4) {
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +			n -= 4;
> > +		}
> > +		while (n--) {
> > +			*dst++ = pat;
> > +			pat = pat << left | pat >> right;
> > +		}
> > +
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(pat, *dst, last);
> > +	}
> > +}
> > +
> > +/*
> > + *  Aligned pattern invert using 32/64-bit memory accesses
> > + */
> > +static void
> > +bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		    unsigned long pat, unsigned n, int bits)
> > +{
> > +	unsigned long val = pat;
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(*dst ^ val, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +		/* Leading bits */
> > +		if (first != 0UL) {
> > +			*dst = comp(*dst ^ val, *dst, first);
> > +			dst++;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 8) {
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			*dst++ ^= val;
> > +			n -= 8;
> > +		}
> > +		while (n--)
> > +			*dst++ ^= val;
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(*dst ^ val, *dst, last);
> > +	}
> > +}
> > +
> > +
> > +/*
> > + *  Unaligned generic pattern invert using 32/64-bit memory accesses
> > + *  The pattern must have been expanded to a full 32/64-bit value
> > + *  Left/right are the appropriate shifts to convert to the pattern to be
> > + *  used for the next 32/64-bit word
> > + */
> > +static void
> > +bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
> > +		      unsigned long pat, int left, int right, unsigned n,
> > +		      int bits)
> > +{
> > +	unsigned long first, last;
> > +
> > +	if (!n)
> > +		return;
> > +
> > +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
> > +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
> > +
> > +	if (dst_idx+n <= bits) {
> > +		/* Single word */
> > +		if (last)
> > +			first &= last;
> > +		*dst = comp(*dst ^ pat, *dst, first);
> > +	} else {
> > +		/* Multiple destination words */
> > +
> > +		/* Leading bits */
> > +		if (first != 0UL) {
> > +			*dst = comp(*dst ^ pat, *dst, first);
> > +			dst++;
> > +			pat = pat << left | pat >> right;
> > +			n -= bits - dst_idx;
> > +		}
> > +
> > +		/* Main chunk */
> > +		n /= bits;
> > +		while (n >= 4) {
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +			n -= 4;
> > +		}
> > +		while (n--) {
> > +			*dst++ ^= pat;
> > +			pat = pat << left | pat >> right;
> > +		}
> > +
> > +		/* Trailing bits */
> > +		if (last)
> > +			*dst = comp(*dst ^ pat, *dst, last);
> > +	}
> > +}
> > +
> > +/*
> > + * Generic fillrect for frame buffers in system RAM with packed pixels of
> > + * any depth.
> > + */
> > +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
> > +{
> > +	unsigned long pat, pat2, fg;
> > +	unsigned long width = rect->width, height = rect->height;
> > +	int bits = BITS_PER_LONG, bytes = bits >> 3;
> > +	u32 bpp = p->var.bits_per_pixel;
> > +	unsigned long *dst;
> > +	int dst_idx, left;
> > +
> > +	if (p->state != FBINFO_STATE_RUNNING)
> > +		return;
> > +
> > +	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> > +	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> > +		fg = ((u32 *) (p->pseudo_palette))[rect->color];
> > +	else
> > +		fg = rect->color;
> > +
> > +	pat = pixel_to_pat( bpp, fg);
> > +
> > +	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
> > +	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
> > +	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
> > +	/* FIXME For now we support 1-32 bpp only */
> > +	left = bits % bpp;
> > +	if (p->fbops->fb_sync)
> > +		p->fbops->fb_sync(p);
> > +	if (!left) {
> > +		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
> > +				  int dst_idx, unsigned long pat, unsigned n,
> > +				  int bits) = NULL;
> > +
> > +		switch (rect->rop) {
> > +		case ROP_XOR:
> > +			fill_op32 = bitfill_aligned_rev;
> > +			break;
> > +		case ROP_COPY:
> > +			fill_op32 = bitfill_aligned;
> > +			break;
> > +		default:
> > +			printk(KERN_ERR "drm_fillrect(): unknown rop, "
> > +				"defaulting to ROP_COPY\n");
> > +			fill_op32 = bitfill_aligned;
> > +			break;
> > +		}
> > +		while (height--) {
> > +			dst += dst_idx >> (ffs(bits) - 1);
> > +			dst_idx &= (bits - 1);
> > +			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
> > +			dst_idx += p->fix.line_length*8;
> > +		}
> > +	} else {
> > +		int right, r;
> > +		void (*fill_op)(struct fb_info *p, unsigned long *dst,
> > +				int dst_idx, unsigned long pat, int left,
> > +				int right, unsigned n, int bits) = NULL;
> > +#ifdef __LITTLE_ENDIAN
> > +		right = left;
> > +		left = bpp - right;
> > +#else
> > +		right = bpp - left;
> > +#endif
> > +		switch (rect->rop) {
> > +		case ROP_XOR:
> > +			fill_op = bitfill_unaligned_rev;
> > +			break;
> > +		case ROP_COPY:
> > +			fill_op = bitfill_unaligned;
> > +			break;
> > +		default:
> > +			printk(KERN_ERR "drm_fillrect(): unknown rop, "
> > +				"defaulting to ROP_COPY\n");
> > +			fill_op = bitfill_unaligned;
> > +			break;
> > +		}
> > +		while (height--) {
> > +			dst += dst_idx / bits;
> > +			dst_idx &= (bits - 1);
> > +			r = dst_idx % bpp;
> > +			/* rotate pattern to the correct start position */
> > +			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
> > +			fill_op(p, dst, dst_idx, pat2, left, right,
> > +				width*bpp, bits);
> > +			dst_idx += p->fix.line_length*8;
> > +		}
> > +	}
> > +}
> > +EXPORT_SYMBOL(drm_fillrect);
> > +
> > +static const u32 cfb_tab8_be[] = {
> > +    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
> > +    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
> > +    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
> > +    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab8_le[] = {
> > +    0x00000000,0xff000000,0x00ff0000,0xffff0000,
> > +    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
> > +    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
> > +    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab16_be[] = {
> > +    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab16_le[] = {
> > +    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
> > +};
> > +
> > +static const u32 cfb_tab32[] = {
> > +	0x00000000, 0xffffffff
> > +};
> > +
> > +static void color_imageblit(const struct fb_image *image, struct fb_info *p,
> > +			    void *dst1, u32 start_index, u32 pitch_index)
> > +{
> > +	/* Draw the penguin */
> > +	u32 *dst, *dst2;
> > +	u32 color = 0, val, shift;
> > +	int i, n, bpp = p->var.bits_per_pixel;
> > +	u32 null_bits = 32 - bpp;
> > +	u32 *palette = (u32 *) p->pseudo_palette;
> > +	const u8 *src = image->data;
> > +
> > +	dst2 = dst1;
> > +	for (i = image->height; i--; ) {
> > +		n = image->width;
> > +		dst = dst1;
> > +		shift = 0;
> > +		val = 0;
> > +
> > +		if (start_index) {
> > +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> > +							 start_index));
> > +			val = *dst & start_mask;
> > +			shift = start_index;
> > +		}
> > +		while (n--) {
> > +			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> > +			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
> > +				color = palette[*src];
> > +			else
> > +				color = *src;
> > +			color <<= FB_LEFT_POS(p, bpp);
> > +			val |= FB_SHIFT_HIGH(p, color, shift);
> > +			if (shift >= null_bits) {
> > +				*dst++ = val;
> > +
> > +				val = (shift == null_bits) ? 0 :
> > +					FB_SHIFT_LOW(p, color, 32 - shift);
> > +			}
> > +			shift += bpp;
> > +			shift &= (32 - 1);
> > +			src++;
> > +		}
> > +		if (shift) {
> > +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> > +
> > +			*dst &= end_mask;
> > +			*dst |= val;
> > +		}
> > +		dst1 += p->fix.line_length;
> > +		if (pitch_index) {
> > +			dst2 += p->fix.line_length;
> > +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> > +
> > +			start_index += pitch_index;
> > +			start_index &= 32 - 1;
> > +		}
> > +	}
> > +}
> > +
> > +static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
> > +				  void *dst1, u32 fgcolor, u32 bgcolor,
> > +				  u32 start_index, u32 pitch_index)
> > +{
> > +	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
> > +	u32 *dst, *dst2;
> > +	u32 val, pitch = p->fix.line_length;
> > +	u32 null_bits = 32 - bpp;
> > +	u32 spitch = (image->width+7)/8;
> > +	const u8 *src = image->data, *s;
> > +	u32 i, j, l;
> > +
> > +	dst2 = dst1;
> > +	fgcolor <<= FB_LEFT_POS(p, bpp);
> > +	bgcolor <<= FB_LEFT_POS(p, bpp);
> > +
> > +	for (i = image->height; i--; ) {
> > +		shift = val = 0;
> > +		l = 8;
> > +		j = image->width;
> > +		dst = dst1;
> > +		s = src;
> > +
> > +		/* write leading bits */
> > +		if (start_index) {
> > +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
> > +							 start_index));
> > +			val = *dst & start_mask;
> > +			shift = start_index;
> > +		}
> > +
> > +		while (j--) {
> > +			l--;
> > +			color = (*s & (1 << l)) ? fgcolor : bgcolor;
> > +			val |= FB_SHIFT_HIGH(p, color, shift);
> > +
> > +			/* Did the bitshift spill bits to the next long? */
> > +			if (shift >= null_bits) {
> > +				*dst++ = val;
> > +				val = (shift == null_bits) ? 0 :
> > +					FB_SHIFT_LOW(p, color, 32 - shift);
> > +			}
> > +			shift += bpp;
> > +			shift &= (32 - 1);
> > +			if (!l) { l = 8; s++; }
> > +		}
> > +
> > +		/* write trailing bits */
> > +		if (shift) {
> > +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
> > +
> > +			*dst &= end_mask;
> > +			*dst |= val;
> > +		}
> > +
> > +		dst1 += pitch;
> > +		src += spitch;
> > +		if (pitch_index) {
> > +			dst2 += pitch;
> > +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
> > +			start_index += pitch_index;
> > +			start_index &= 32 - 1;
> > +		}
> > +
> > +	}
> > +}
> > +
> > +/*
> > + * fast_imageblit - optimized monochrome color expansion
> > + *
> > + * Only if:  bits_per_pixel == 8, 16, or 32
> > + *           image->width is divisible by pixel/dword (ppw);
> > + *           fix->line_length is divisible by 4;
> > + *           beginning and end of a scanline is dword aligned
> > + */
> > +static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
> > +				  void *dst1, u32 fgcolor, u32 bgcolor)
> > +{
> > +	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
> > +	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
> > +	u32 bit_mask, end_mask, eorx, shift;
> > +	const char *s = image->data, *src;
> > +	u32 *dst;
> > +	const u32 *tab = NULL;
> > +	int i, j, k;
> > +
> > +	switch (bpp) {
> > +	case 8:
> > +		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
> > +		break;
> > +	case 16:
> > +		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
> > +		break;
> > +	case 32:
> > +	default:
> > +		tab = cfb_tab32;
> > +		break;
> > +	}
> > +
> > +	for (i = ppw-1; i--; ) {
> > +		fgx <<= bpp;
> > +		bgx <<= bpp;
> > +		fgx |= fgcolor;
> > +		bgx |= bgcolor;
> > +	}
> > +
> > +	bit_mask = (1 << ppw) - 1;
> > +	eorx = fgx ^ bgx;
> > +	k = image->width/ppw;
> > +
> > +	for (i = image->height; i--; ) {
> > +		dst = dst1;
> > +		shift = 8;
> > +		src = s;
> > +
> > +		for (j = k; j--; ) {
> > +			shift -= ppw;
> > +			end_mask = tab[(*src >> shift) & bit_mask];
> > +			*dst++ = (end_mask & eorx) ^ bgx;
> > +			if (!shift) {
> > +				shift = 8;
> > +				src++;
> > +			}
> > +		}
> > +		dst1 += p->fix.line_length;
> > +		s += spitch;
> > +	}
> > +}
> > +
> > +/*
> > + * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
> > + * for frame buffer located in system RAM with packed pixels of any depth.
> > + */
> > +void drm_imageblit(struct fb_info *p, const struct fb_image *image)
> > +{
> > +	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
> > +	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
> > +	u32 width = image->width;
> > +	u32 dx = image->dx, dy = image->dy;
> > +	void *dst1;
> > +
> > +	if (p->state != FBINFO_STATE_RUNNING)
> > +		return;
> > +
> > +	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
> > +	start_index = bitstart & (32 - 1);
> > +	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
> > +
> > +	bitstart /= 8;
> > +	bitstart &= ~(bpl - 1);
> > +	dst1 = (void __force *)p->screen_base + bitstart;
> > +
> > +	if (p->fbops->fb_sync)
> > +		p->fbops->fb_sync(p);
> > +
> > +	if (image->depth == 1) {
> > +		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
> > +		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
> > +			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
> > +			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
> > +		} else {
> > +			fgcolor = image->fg_color;
> > +			bgcolor = image->bg_color;
> > +		}
> > +
> > +		if (32 % bpp == 0 && !start_index && !pitch_index &&
> > +		    ((width & (32/bpp-1)) == 0) &&
> > +		    bpp >= 8 && bpp <= 32)
> > +			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
> > +		else
> > +			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
> > +					start_index, pitch_index);
> > +	} else
> > +		color_imageblit(image, p, dst1, start_index, pitch_index);
> > +}
> > +EXPORT_SYMBOL(drm_imageblit);
> > +
> > +/*
> > + * Generic file operations where framebuffer is in system RAM
> > + */
> > +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
> > +{
> > +	unsigned long p = *ppos;
> > +	void *src;
> > +	int err = 0;
> > +	unsigned long total_size;
> > +
> > +	if (info->state != FBINFO_STATE_RUNNING)
> > +		return -EPERM;
> > +
> > +	total_size = info->screen_size;
> > +
> > +	if (total_size == 0)
> > +		total_size = info->fix.smem_len;
> > +
> > +	if (p >= total_size)
> > +		return 0;
> > +
> > +	if (count >= total_size)
> > +		count = total_size;
> > +
> > +	if (count + p > total_size)
> > +		count = total_size - p;
> > +
> > +	src = (void __force *)(info->screen_base + p);
> > +
> > +	if (info->fbops->fb_sync)
> > +		info->fbops->fb_sync(info);
> > +
> > +	if (copy_to_user(buf, src, count))
> > +		err = -EFAULT;
> > +
> > +	if (!err)
> > +		*ppos += count;
> > +
> > +	return (err) ? err : count;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_fb_read);
> > +
> > +/*
> > + * Generic file operations where framebuffer is in system RAM
> > + */
> > +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
> > +{
> > +	unsigned long p = *ppos;
> > +	void *dst;
> > +	int err = 0;
> > +	unsigned long total_size;
> > +
> > +	if (info->state != FBINFO_STATE_RUNNING)
> > +		return -EPERM;
> > +
> > +	total_size = info->screen_size;
> > +
> > +	if (total_size == 0)
> > +		total_size = info->fix.smem_len;
> > +
> > +	if (p > total_size)
> > +		return -EFBIG;
> > +
> > +	if (count > total_size) {
> > +		err = -EFBIG;
> > +		count = total_size;
> > +	}
> > +
> > +	if (count + p > total_size) {
> > +		if (!err)
> > +			err = -ENOSPC;
> > +
> > +		count = total_size - p;
> > +	}
> > +
> > +	dst = (void __force *) (info->screen_base + p);
> > +
> > +	if (info->fbops->fb_sync)
> > +		info->fbops->fb_sync(info);
> > +
> > +	if (copy_from_user(dst, buf, count))
> > +		err = -EFAULT;
> > +
> > +	if (!err)
> > +		*ppos += count;
> > +
> > +	return (err) ? err : count;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_fb_write);
> > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> > index da0d96a69570..18fba9818635 100644
> > --- a/drivers/gpu/drm/drm_fb_helper.c
> > +++ b/drivers/gpu/drm/drm_fb_helper.c
> > @@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
> >  EXPORT_SYMBOL(drm_fb_helper_deferred_io);
> >  
> >  /**
> > - * drm_fb_helper_sys_read - wrapper around fb_sys_read
> > + * drm_fb_helper_sys_read - wrapper around drm_fb_read
> >   * @info: fb_info struct pointer
> >   * @buf: userspace buffer to read from framebuffer memory
> >   * @count: number of bytes to read from framebuffer memory
> >   * @ppos: read offset within framebuffer memory
> >   *
> > - * A wrapper around fb_sys_read implemented by fbdev core
> > + * A wrapper around drm_fb_read
> >   */
> >  ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
> >  			       size_t count, loff_t *ppos)
> >  {
> > -	return fb_sys_read(info, buf, count, ppos);
> > +	return drm_fb_read(info, buf, count, ppos);
> >  }
> >  EXPORT_SYMBOL(drm_fb_helper_sys_read);
> >  
> >  /**
> > - * drm_fb_helper_sys_write - wrapper around fb_sys_write
> > + * drm_fb_helper_sys_write - wrapper around drm_fb_write
> >   * @info: fb_info struct pointer
> >   * @buf: userspace buffer to write to framebuffer memory
> >   * @count: number of bytes to write to framebuffer memory
> >   * @ppos: write offset within framebuffer memory
> >   *
> > - * A wrapper around fb_sys_write implemented by fbdev core
> > + * A wrapper around drm_fb_write
> >   */
> >  ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
> >  				size_t count, loff_t *ppos)
> >  {
> >  	ssize_t ret;
> >  
> > -	ret = fb_sys_write(info, buf, count, ppos);
> > +	ret = drm_fb_write(info, buf, count, ppos);
> >  	if (ret > 0)
> >  		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
> >  				    info->var.yres);
> > @@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
> >  EXPORT_SYMBOL(drm_fb_helper_sys_write);
> >  
> >  /**
> > - * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
> > + * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
> >   * @info: fbdev registered by the helper
> >   * @rect: info about rectangle to fill
> >   *
> > - * A wrapper around sys_fillrect implemented by fbdev core
> > + * A wrapper around drm_fillrect
> >   */
> >  void drm_fb_helper_sys_fillrect(struct fb_info *info,
> >  				const struct fb_fillrect *rect)
> >  {
> > -	sys_fillrect(info, rect);
> > +	drm_fillrect(info, rect);
> >  	drm_fb_helper_dirty(info, rect->dx, rect->dy,
> >  			    rect->width, rect->height);
> >  }
> >  EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
> >  
> >  /**
> > - * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
> > + * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
> >   * @info: fbdev registered by the helper
> >   * @area: info about area to copy
> >   *
> > - * A wrapper around sys_copyarea implemented by fbdev core
> > + * A wrapper around drm_copyarea
> >   */
> >  void drm_fb_helper_sys_copyarea(struct fb_info *info,
> >  				const struct fb_copyarea *area)
> >  {
> > -	sys_copyarea(info, area);
> > +	drm_copyarea(info, area);
> >  	drm_fb_helper_dirty(info, area->dx, area->dy,
> >  			    area->width, area->height);
> >  }
> >  EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
> >  
> >  /**
> > - * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
> > + * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
> >   * @info: fbdev registered by the helper
> >   * @image: info about image to blit
> >   *
> > - * A wrapper around sys_imageblit implemented by fbdev core
> > + * A wrapper around drm_imageblit
> >   */
> >  void drm_fb_helper_sys_imageblit(struct fb_info *info,
> >  				 const struct fb_image *image)
> >  {
> > -	sys_imageblit(info, image);
> > +	drm_imageblit(info, image);
> >  	drm_fb_helper_dirty(info, image->dx, image->dy,
> >  			    image->width, image->height);
> >  }
> > diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
> > index 306aa3a60be9..d38005f6128a 100644
> > --- a/include/drm/drm_fb_helper.h
> > +++ b/include/drm/drm_fb_helper.h
> > @@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
> >  
> >  void drm_fbdev_generic_setup(struct drm_device *dev,
> >  			     unsigned int preferred_bpp);
> > +
> > +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
> > +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
> > +void drm_imageblit(struct fb_info *p, const struct fb_image *image);
> > +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
> > +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
> >  #else
> >  static inline void drm_fb_helper_prepare(struct drm_device *dev,
> >  					struct drm_fb_helper *helper,
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
  2020-08-04  9:27               ` daniel
@ 2020-08-04  9:41                 ` Thomas Zimmermann
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-08-04  9:41 UTC (permalink / raw)
  To: daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, Sam Ravnborg, peda, dan.carpenter


[-- Attachment #1.1: Type: text/plain, Size: 49170 bytes --]

Hi

On 04.08.20 at 11:27, daniel@ffwll.ch wrote:
> On Mon, Aug 03, 2020 at 08:46:34AM +0200, Thomas Zimmermann wrote:
>> Hi
>>
>> On 02.08.20 at 22:01, Sam Ravnborg wrote:
>>> On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
>>>> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
>>>>> Hi Daniel.
>>>>>
>>>>> On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
>>>>>> On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
>>>>>>> DRM fb helpers require read and write functions for framebuffer
>>>>>>> memory. Export the existing code from fbdev.
>>>>>>>
>>>>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>>>>>>
>>>>>> Hm I'm not super sure whether we want to actually reuse this stuff ... We
>>>>>> kinda don't care about the sparc special case, and just having an fbdev
>>>>>> implementation which has the switch between memcpy and memcpy_to/from_io
>>>>>> in one single place sounds a lot simpler ...
>>>>>>
>>>>>> This way we can have a clean split between the old horrors of real fbdev
>>>>>> drivers, and a much cleaner world in drm. It would mean a bit of
>>>>>> copypasting, but I think that's actually a good thing.
>>>>>>
>>>>>> In general my idea for drm fbdev emulation is that for any area we have a
>>>>>> problem we just ignore the entire fbmem.c code and write our own: mmap,
>>>>>> backlight handling (still unsolved, and horrible), cfb vs sys here. This
>>>>>> entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
>>>>>> duplication here doesn't seem worth it imo.
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>>
>>>>> I can see that fbmem is a mix of ioctl support and other stuff.
>>>>> We could factor out all the ioctl parts of fbmem.c to a new file
>>>>> named fbioctl.c.
>>>>>
>>>>> And then let the ioctl parts call down into drm stuff and avoid reusing
>>>>> the fbdev code when we first reach drm code.
>>>>> This would require local copies of:
>>>>> sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
>>>>> and more I think which I missed.
>>>>>
>>>>> With local copies we could avoid some of the special cases and trim the
>>>>> functions to what is required by drm only.
>>>>> And then no more fbmem dependencies and no dependencies to several of
>>>>> the small helper functions. So less entanglement with fbdev core.
>>>>>
>>>>> This all sounds simple so I am surely missing a lot of ugly details here.
>>>>>
>>>>> And should we touch this anyway we need a test suite to verify not too
>>>>> much breaks. To the best of my knowledge there is not yet such a test
>>>>> suite :-( Maybe because people caring about fbdev are limited.
>>>>
>>>> Well my idea was to not refactor anything, but just have drm copies of the
>>>> various fb_ops callbacks. Definitely not even more refactoring :-)
>>
>> Thanks for making a prototype, Sam.
>>
>> But do we really want to duplicate all this code from fbdev? It's not
>> actually pretty and there's little value in rewriting it. If anything,
>> I can see us extending drm_format_helpers.c and building upon this.
> 
> Hm yeah this is a lot more horrible than I thought, I was kinda just
> looking at read/write functions, maybe also mmap, and figured we could
> simplify this a lot if we just inline and done.
> 
> But all these image/fill/copy functions are pretty bad, and they seem to
> care about a lot of corner cases that just don't matter anymore on modern
> hw. Also agreeing with Thomas that if we do copy this, then building on top
> of format helpers would make more sense. At that point we probably need a
> test suite for fbdev, and this entire endeavour becomes rather silly.
> 
> So maybe compromise approach? E.g. we reuse fbdev helpers with a switch
> between sys and cfb if it would drag in a huge pile of code for format
> handling silliness. But when all it would drag in is a memcpy_to/fromio
> or similar, we inline to reduce a bit the midlayer and make it clearer
> what's going on. Strictly speaking would probably still need a test suite,
> but I guess we can delay that until the first bug report or so :-)

Makes sense. I've been working on the wrapped-pointer approach a bit.
Rewriting cfb read and write within DRM is not much effort.

Format helpers appear to have I/O and non-I/O versions of several
conversion functions. If we go with the wrapped pointer, we could later
change the format helpers to accept the wrapper structure and handle
iomem internally.
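
To make the wrapped-pointer idea concrete, here is a minimal userspace
sketch, not kernel code. struct fb_map, fb_map_memcpy_to() and io_copy()
(a stand-in for memcpy_toio()) are hypothetical names used only for
illustration; the real structure may well end up looking different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper: one address plus a flag describing the memory type. */
struct fb_map {
	void *vaddr;
	bool is_iomem;
};

/* Counts I/O-style copies so a caller can observe which path was taken. */
static unsigned int io_copy_calls;

/* Userspace stand-in for memcpy_toio(): byte-wise volatile stores. */
static inline void io_copy(volatile uint8_t *dst, const uint8_t *src,
			   size_t len)
{
	io_copy_calls++;
	while (len--)
		*dst++ = *src++;
}

/* The only place that switches between the two access methods. */
static inline void fb_map_memcpy_to(const struct fb_map *map, size_t offset,
				    const void *src, size_t len)
{
	uint8_t *dst;

	assert(map && map->vaddr && src);
	dst = (uint8_t *)map->vaddr + offset;

	if (map->is_iomem)
		io_copy(dst, src, len);
	else
		memcpy(dst, src, len);
}
```

Callers would pass the wrapper around and never test is_iomem themselves,
which is what would let the format helpers take the same structure later.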

Best regards
Thomas

> 
> Cheers, Daniel
> 
> 
>>
>> Best regards
>> Thomas
>>
>>>
>>> $ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> 1212 drivers/gpu/drm/drm_fb_fbdev_helper.c
>>>
>>> This is a straight copy of:
>>> $ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> EXPORT_SYMBOL(drm_copyarea);
>>> EXPORT_SYMBOL(drm_fillrect);
>>> EXPORT_SYMBOL(drm_imageblit);
>>> EXPORT_SYMBOL_GPL(drm_fb_read);
>>> EXPORT_SYMBOL_GPL(drm_fb_write);
>>>
>>> copyarea, fillrect and imageblit are a copy of the sys variants.
>>> read and write are from fb_sys_fops.c
>>>
>>> Patch builds but I am not a big fan yet - right now it gains us very
>>> little. What would the next step be?
>>>
>>> I looked briefly at the few drivers that have not migrated to
>>> the generic fbdev emulation (yet?).
>>>   msm
>>>   omapdrm
>>>   gma500
>>>
>>> all use the sys_ variants - they already use the migrated variants as
>>> they call the drm_fb helper.
>>>
>>>
>>>   exynos
>>>   gma500
>>>   i915
>>>   nouveau
>>>   radeon
>>>   rockchip
>>>   vmwgfx
>>>
>>> all use the cfb variants and cannot use the migrated variants as-is.
>>>
>>> 	Sam
>>>
>>> From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
>>> From: Sam Ravnborg <sam@ravnborg.org>
>>> Date: Fri, 31 Jul 2020 23:36:01 +0200
>>> Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm
>>>
>>> Copy the fbdev functions used by fb_ops to drm.
>>> Having local copies makes it possible
>>> to adjust the functions so they are tailored to the needs of DRM.
>>> At the same time there is one less dependency between DRM and fbdev.
>>>
>>> The files were copied verbatim and only minimal changes were made.
>>>
>>> As some drivers use the fbdev functions they continue to be built but
>>> the generic fbdev emulation no longer uses the fbdev functions.
>>>
>>> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
>>> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> ---
>>>  drivers/gpu/drm/Kconfig               |    1 -
>>>  drivers/gpu/drm/Makefile              |    1 +
>>>  drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
>>>  drivers/gpu/drm/drm_fb_helper.c       |   30 +-
>>>  include/drm/drm_fb_helper.h           |    6 +
>>>  5 files changed, 1234 insertions(+), 16 deletions(-)
>>>  create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c
>>>
>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>> index c4fd57d8b717..11f41bc31f05 100644
>>> --- a/drivers/gpu/drm/Kconfig
>>> +++ b/drivers/gpu/drm/Kconfig
>>> @@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
>>>  	select FB
>>>  	select FRAMEBUFFER_CONSOLE if !EXPERT
>>>  	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
>>> -	select FB_SYS_FOPS
>>>  	select FB_SYS_FILLRECT
>>>  	select FB_SYS_COPYAREA
>>>  	select FB_SYS_IMAGEBLIT
>>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
>>> index 02ee5faf1a92..398cb4746762 100644
>>> --- a/drivers/gpu/drm/Makefile
>>> +++ b/drivers/gpu/drm/Makefile
>>> @@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
>>>  
>>>  drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
>>>  drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
>>> +drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
>>>  drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
>>>  drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
>>>  drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
>>> diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> new file mode 100644
>>> index 000000000000..dda60f85b9d5
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> @@ -0,0 +1,1212 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * This file is subject to the terms and conditions of the GNU General Public
>>> + * License.  See the file COPYING in the main directory of this archive for
>>> + * more details.
>>> + *
>>> + * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
>>> + *
>>> + * drm_copyarea:
>>> + *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
>>> + *   on Geert Uytterhoeven's copyarea routine)
>>> + *
>>> + *  drm_fillrect:
>>> + *    Based almost entirely from cfbfillrect.c (which is based almost entirely
>>> + *    on Geert Uytterhoeven's fillrect routine)
>>> + *
>>> + *  drm_imageblit:
>>> + *    Based almost entirely on cfbimgblt.c
>>> + *
>>> + */
>>> +
>>> +#include <linux/export.h>
>>> +#include <linux/fb.h>
>>> +#include <linux/io.h>
>>> +
>>> +
>>> +/*
>>> +*  Compose two values, using a bitmask as decision value
>>> +*  This is equivalent to (a & mask) | (b & ~mask)
>>> +*/
>>> +static inline unsigned long
>>> +comp(unsigned long a, unsigned long b, unsigned long mask)
>>> +{
>>> +    return ((a ^ b) & mask) ^ b;
>>> +}
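
The equivalence claimed in the comment above comp() is easy to check in
isolation. The sketch below is plain userspace C; comp_ref() and
comp_check() are hypothetical helper names, not part of the patch.

```c
#include <assert.h>

/* Same expression as comp() in the quoted patch. */
static inline unsigned long comp(unsigned long a, unsigned long b,
				 unsigned long mask)
{
	return ((a ^ b) & mask) ^ b;
}

/* Textbook form from the comment: take a where mask is set, b elsewhere. */
static inline unsigned long comp_ref(unsigned long a, unsigned long b,
				     unsigned long mask)
{
	return (a & mask) | (b & ~mask);
}

/* Aborts if the two forms disagree for the given inputs. */
static inline void comp_check(unsigned long a, unsigned long b,
			      unsigned long mask)
{
	assert(comp(a, b, mask) == comp_ref(a, b, mask));
}
```

With a = 0xAAAA, b = 0x5555 and mask = 0x0FF0 both forms give 0x5AA5; the
XOR form simply needs one operation less.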
>>> +
>>> +/*
>>> + *  Create a pattern with the given pixel's color
>>> + */
>>> +#if BITS_PER_LONG == 64
>>> +static inline unsigned long
>>> +pixel_to_pat( u32 bpp, u32 pixel)
>>> +{
>>> +	switch (bpp) {
>>> +	case 1:
>>> +		return 0xfffffffffffffffful*pixel;
>>> +	case 2:
>>> +		return 0x5555555555555555ul*pixel;
>>> +	case 4:
>>> +		return 0x1111111111111111ul*pixel;
>>> +	case 8:
>>> +		return 0x0101010101010101ul*pixel;
>>> +	case 12:
>>> +		return 0x1001001001001001ul*pixel;
>>> +	case 16:
>>> +		return 0x0001000100010001ul*pixel;
>>> +	case 24:
>>> +		return 0x0001000001000001ul*pixel;
>>> +	case 32:
>>> +		return 0x0000000100000001ul*pixel;
>>> +	default:
>>> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
>>> +		return 0;
>>> +    }
>>> +}
>>> +#else
>>> +static inline unsigned long
>>> +pixel_to_pat( u32 bpp, u32 pixel)
>>> +{
>>> +	switch (bpp) {
>>> +	case 1:
>>> +		return 0xfffffffful*pixel;
>>> +	case 2:
>>> +		return 0x55555555ul*pixel;
>>> +	case 4:
>>> +		return 0x11111111ul*pixel;
>>> +	case 8:
>>> +		return 0x01010101ul*pixel;
>>> +	case 12:
>>> +		return 0x01001001ul*pixel;
>>> +	case 16:
>>> +		return 0x00010001ul*pixel;
>>> +	case 24:
>>> +		return 0x01000001ul*pixel;
>>> +	case 32:
>>> +		return 0x00000001ul*pixel;
>>> +	default:
>>> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
>>> +		return 0;
>>> +    }
>>> +}
>>> +#endif
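
The constants in pixel_to_pat() all follow one rule: a 1 bit at every
multiple of bpp, so that the multiplication replicates the pixel across the
word. A minimal sketch of that rule for the 64-bit case, assuming the pixel
value fits in bpp bits; replicate_pixel() is a hypothetical name, not part
of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replicate a bpp-wide pixel across 64 bits by multiplying with a constant
 * that has a 1 bit at every multiple of bpp. When bpp does not divide 64
 * (12 and 24 bpp), the topmost copy is truncated.
 */
static inline uint64_t replicate_pixel(unsigned int bpp, uint64_t pixel)
{
	uint64_t ones = 0;
	unsigned int pos;

	assert(bpp >= 1 && bpp <= 32);
	assert(pixel < (UINT64_C(1) << bpp));

	for (pos = 0; pos < 64; pos += bpp)
		ones |= UINT64_C(1) << pos;

	return ones * pixel;
}
```

For example, replicate_pixel(8, 0xAB) yields 0xABABABABABABABAB, and
replicate_pixel(24, 1) reproduces the 0x0001000001000001 constant from the
switch statement.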
>>> +
>>> +#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
>>> +#if BITS_PER_LONG == 64
>>> +#define REV_PIXELS_MASK1 0x5555555555555555ul
>>> +#define REV_PIXELS_MASK2 0x3333333333333333ul
>>> +#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
>>> +#else
>>> +#define REV_PIXELS_MASK1 0x55555555ul
>>> +#define REV_PIXELS_MASK2 0x33333333ul
>>> +#define REV_PIXELS_MASK4 0x0f0f0f0ful
>>> +#endif
>>> +
>>> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
>>> +						  u32 bswapmask)
>>> +{
>>> +	if (bswapmask & 1)
>>> +		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
>>> +	if (bswapmask & 2)
>>> +		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
>>> +	if (bswapmask & 3)
>>> +		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
>>> +	return val;
>>> +}
>>> +
>>> +static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
>>> +					     u32 bswapmask)
>>> +{
>>> +	u32 mask;
>>> +
>>> +	if (!bswapmask) {
>>> +		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
>>> +	} else {
>>> +		mask = 0xff << FB_LEFT_POS(p, 8);
>>> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
>>> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
>>> +#if defined(__i386__) || defined(__x86_64__)
>>> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
>>> +		if(index + bswapmask < 32)
>>> +#endif
>>> +			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
>>> +					(index + bswapmask) & ~(bswapmask));
>>> +	}
>>> +	return mask;
>>> +}
>>> +
>>> +static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
>>> +							u32 index,
>>> +							u32 bswapmask)
>>> +{
>>> +	unsigned long mask;
>>> +
>>> +	if (!bswapmask) {
>>> +		mask = FB_SHIFT_HIGH(p, ~0UL, index);
>>> +	} else {
>>> +		mask = 0xff << FB_LEFT_POS(p, 8);
>>> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
>>> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
>>> +#if defined(__i386__) || defined(__x86_64__)
>>> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
>>> +		if(index + bswapmask < BITS_PER_LONG)
>>> +#endif
>>> +			mask |= FB_SHIFT_HIGH(p, ~0UL,
>>> +					(index + bswapmask) & ~(bswapmask));
>>> +	}
>>> +	return mask;
>>> +}
>>> +
>>> +
>>> +static inline u32 fb_compute_bswapmask(struct fb_info *info)
>>> +{
>>> +	u32 bswapmask = 0;
>>> +	unsigned bpp = info->var.bits_per_pixel;
>>> +
>>> +	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
>>> +		/*
>>> +		 * Reversed order of pixel layout in bytes
>>> +		 * works only for 1, 2 and 4 bpp
>>> +		 */
>>> +		bswapmask = 7 - bpp + 1;
>>> +	}
>>> +	return bswapmask;
>>> +}
>>> +
>>> +#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
>>> +
>>> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
>>> +						  u32 bswapmask)
>>> +{
>>> +	return val;
>>> +}
>>> +
>>> +#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
>>> +#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
>>> +#define fb_compute_bswapmask(...) 0
>>> +
>>> +#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
>>> +
>>> +#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
>>> +#define _cpu_to_le_long(x) __cpu_to_le_long(x)
>>> +#define __cpu_to_le_long(x) cpu_to_le##x
>>> +
>>> +#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
>>> +#define _le_long_to_cpu(x) __le_long_to_cpu(x)
>>> +#define __le_long_to_cpu(x) le##x##_to_cpu
>>> +
>>> +static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
>>> +{
>>> +	return (word << shift) | (word >> (x - shift));
>>> +}
>>> +
>>> +
>>> +/*
>>> + *  Generic bitwise copy algorithm
>>> + */
>>> +static void
>>> +bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
>>> +	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
>>> +{
>>> +	unsigned long first, last;
>>> +	int const shift = dst_idx-src_idx;
>>> +	int left, right;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (!shift) {
>>> +		/* Same alignment for source and dest */
>>> +		if (dst_idx+n <= bits) {
>>> +			/* Single word */
>>> +			if (last)
>>> +				first &= last;
>>> +			*dst = comp(*src, *dst, first);
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +			/* Leading bits */
>>> +			if (first != ~0UL) {
>>> +				*dst = comp(*src, *dst, first);
>>> +				dst++;
>>> +				src++;
>>> +				n -= bits - dst_idx;
>>> +			}
>>> +
>>> +			/* Main chunk */
>>> +			n /= bits;
>>> +			while (n >= 8) {
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				n -= 8;
>>> +			}
>>> +			while (n--)
>>> +				*dst++ = *src++;
>>> +
>>> +			/* Trailing bits */
>>> +			if (last)
>>> +				*dst = comp(*src, *dst, last);
>>> +		}
>>> +	} else {
>>> +		unsigned long d0, d1;
>>> +		int m;
>>> +
>>> +		/* Different alignment for source and dest */
>>> +		right = shift & (bits - 1);
>>> +		left = -shift & (bits - 1);
>>> +
>>> +		if (dst_idx+n <= bits) {
>>> +			/* Single destination word */
>>> +			if (last)
>>> +				first &= last;
>>> +			if (shift > 0) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src << left, *dst, first);
>>> +			} else if (src_idx+n <= bits) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src >> right, *dst, first);
>>> +			} else {
>>> +				/* 2 source words */
>>> +				d0 = *src++;
>>> +				d1 = *src;
>>> +				*dst = comp(d0 >> right | d1 << left, *dst,
>>> +					    first);
>>> +			}
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +			/** We must always remember the last value read,
>>> +			    because in case SRC and DST overlap bitwise (e.g.
>>> +			    when moving just one pixel in 1bpp), we always
>>> +			    collect one full long for DST and that might
>>> +			    overlap with the current long from SRC. We store
>>> +			    this value in 'd0'. */
>>> +			d0 = *src++;
>>> +			/* Leading bits */
>>> +			if (shift > 0) {
>>> +				/* Single source word */
>>> +				*dst = comp(d0 << left, *dst, first);
>>> +				dst++;
>>> +				n -= bits - dst_idx;
>>> +			} else {
>>> +				/* 2 source words */
>>> +				d1 = *src++;
>>> +				*dst = comp(d0 >> right | d1 << left, *dst,
>>> +					    first);
>>> +				d0 = d1;
>>> +				dst++;
>>> +				n -= bits - dst_idx;
>>> +			}
>>> +
>>> +			/* Main chunk */
>>> +			m = n % bits;
>>> +			n /= bits;
>>> +			while (n >= 4) {
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				n -= 4;
>>> +			}
>>> +			while (n--) {
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +			}
>>> +
>>> +			/* Trailing bits */
>>> +			if (m) {
>>> +				if (m <= bits - right) {
>>> +					/* Single source word */
>>> +					d0 >>= right;
>>> +				} else {
>>> +					/* 2 source words */
>>> +					d1 = *src;
>>> +					d0 = d0 >> right | d1 << left;
>>> +				}
>>> +				*dst = comp(d0, *dst, last);
>>> +			}
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + *  Generic bitwise copy algorithm, operating backward
>>> + */
>>> +static void
>>> +bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
>>> +	   const unsigned long *src, unsigned src_idx, unsigned bits,
>>> +	   unsigned n)
>>> +{
>>> +	unsigned long first, last;
>>> +	int shift;
>>> +
>>> +	dst += (dst_idx + n - 1) / bits;
>>> +	src += (src_idx + n - 1) / bits;
>>> +	dst_idx = (dst_idx + n - 1) % bits;
>>> +	src_idx = (src_idx + n - 1) % bits;
>>> +
>>> +	shift = dst_idx-src_idx;
>>> +
>>> +	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
>>> +	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
>>> +
>>> +	if (!shift) {
>>> +		/* Same alignment for source and dest */
>>> +		if ((unsigned long)dst_idx+1 >= n) {
>>> +			/* Single word */
>>> +			if (first)
>>> +				last &= first;
>>> +			*dst = comp(*src, *dst, last);
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +
>>> +			/* Leading bits */
>>> +			if (first) {
>>> +				*dst = comp(*src, *dst, first);
>>> +				dst--;
>>> +				src--;
>>> +				n -= dst_idx+1;
>>> +			}
>>> +
>>> +			/* Main chunk */
>>> +			n /= bits;
>>> +			while (n >= 8) {
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				n -= 8;
>>> +			}
>>> +			while (n--)
>>> +				*dst-- = *src--;
>>> +			/* Trailing bits */
>>> +			if (last != -1UL)
>>> +				*dst = comp(*src, *dst, last);
>>> +		}
>>> +	} else {
>>> +		/* Different alignment for source and dest */
>>> +
>>> +		int const left = shift & (bits-1);
>>> +		int const right = -shift & (bits-1);
>>> +
>>> +		if ((unsigned long)dst_idx+1 >= n) {
>>> +			/* Single destination word */
>>> +			if (first)
>>> +				last &= first;
>>> +			if (shift < 0) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src >> right, *dst, last);
>>> +			} else if (1+(unsigned long)src_idx >= n) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src << left, *dst, last);
>>> +			} else {
>>> +				/* 2 source words */
>>> +				*dst = comp(*src << left | *(src-1) >> right,
>>> +					    *dst, last);
>>> +			}
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +			/** We must always remember the last value read,
>>> +			    because in case SRC and DST overlap bitwise (e.g.
>>> +			    when moving just one pixel in 1bpp), we always
>>> +			    collect one full long for DST and that might
>>> +			    overlap with the current long from SRC. We store
>>> +			    this value in 'd0'. */
>>> +			unsigned long d0, d1;
>>> +			int m;
>>> +
>>> +			d0 = *src--;
>>> +			/* Leading bits */
>>> +			if (shift < 0) {
>>> +				/* Single source word */
>>> +				d1 = d0;
>>> +				d0 >>= right;
>>> +			} else {
>>> +				/* 2 source words */
>>> +				d1 = *src--;
>>> +				d0 = d0 << left | d1 >> right;
>>> +			}
>>> +			if (!first)
>>> +				*dst = d0;
>>> +			else
>>> +				*dst = comp(d0, *dst, first);
>>> +			d0 = d1;
>>> +			dst--;
>>> +			n -= dst_idx+1;
>>> +
>>> +			/* Main chunk */
>>> +			m = n % bits;
>>> +			n /= bits;
>>> +			while (n >= 4) {
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				n -= 4;
>>> +			}
>>> +			while (n--) {
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +			}
>>> +
>>> +			/* Trailing bits */
>>> +			if (m) {
>>> +				if (m <= bits - left) {
>>> +					/* Single source word */
>>> +					d0 <<= left;
>>> +				} else {
>>> +					/* 2 source words */
>>> +					d1 = *src;
>>> +					d0 = d0 << left | d1 >> right;
>>> +				}
>>> +				*dst = comp(d0, *dst, last);
>>> +			}
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Generic Bit Block Transfer for frame buffers located in system RAM with
>>> + * packed pixels of any depth.
>>> + */
>>> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
>>> +{
>>> +	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
>>> +	u32 height = area->height, width = area->width;
>>> +	unsigned long const bits_per_line = p->fix.line_length*8u;
>>> +	unsigned long *base = NULL;
>>> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
>>> +	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
>>> +
>>> +	if (p->state != FBINFO_STATE_RUNNING)
>>> +		return;
>>> +
>>> +	/* if the beginning of the target area might overlap with the end of
>>> +	the source area, we have to copy the area in reverse. */
>>> +	if ((dy == sy && dx > sx) || (dy > sy)) {
>>> +		dy += height;
>>> +		sy += height;
>>> +		rev_copy = 1;
>>> +	}
>>> +
>>> +	/* split the base of the framebuffer into a long-aligned address and
>>> +	   the index of the first bit */
>>> +	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
>>> +	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
>>> +	/* add offset of source and target area */
>>> +	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
>>> +	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
>>> +
>>> +	if (p->fbops->fb_sync)
>>> +		p->fbops->fb_sync(p);
>>> +
>>> +	if (rev_copy) {
>>> +		while (height--) {
>>> +			dst_idx -= bits_per_line;
>>> +			src_idx -= bits_per_line;
>>> +			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
>>> +				base + (src_idx / bits), src_idx % bits, bits,
>>> +				width*p->var.bits_per_pixel);
>>> +		}
>>> +	} else {
>>> +		while (height--) {
>>> +			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
>>> +				base + (src_idx / bits), src_idx % bits, bits,
>>> +				width*p->var.bits_per_pixel);
>>> +			dst_idx += bits_per_line;
>>> +			src_idx += bits_per_line;
>>> +		}
>>> +	}
>>> +}
>>> +EXPORT_SYMBOL(drm_copyarea);
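
The rev_copy decision in drm_copyarea() is the usual memmove-style rule. A
byte-granular sketch of the same idea, far simpler than the bit-granular
code in the patch; overlap_copy() is a hypothetical name used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy len bytes inside one buffer; source and destination may overlap. */
static inline void overlap_copy(uint8_t *buf, size_t dst, size_t src,
				size_t len)
{
	size_t i;

	assert(buf);

	if (dst > src) {
		/*
		 * Destination starts after source: walk backward so that no
		 * source byte is overwritten before it has been read.
		 */
		for (i = len; i-- > 0; )
			buf[dst + i] = buf[src + i];
	} else {
		for (i = 0; i < len; i++)
			buf[dst + i] = buf[src + i];
	}
}
```

The patch applies the same rule per scanline: it reverses the row order and
uses bitcpy_rev() within each row when the destination lies after the
source.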
>>> +
>>> +/*
>>> + *  Aligned pattern fill using 32/64-bit memory accesses
>>> + */
>>> +static void
>>> +bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		unsigned long pat, unsigned n, int bits)
>>> +{
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(pat, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +
>>> +		/* Leading bits */
>>> +		if (first != ~0UL) {
>>> +			*dst = comp(pat, *dst, first);
>>> +			dst++;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 8) {
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			n -= 8;
>>> +		}
>>> +		while (n--)
>>> +			*dst++ = pat;
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(pat, *dst, last);
>>> +	}
>>> +}
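
The first/last mask handling is the core of all the bitfill helpers, so here
is a reduced sketch of it with fixed 32-bit words. It assumes the
little-endian case where FB_SHIFT_HIGH() is a left shift; first_mask(),
last_mask(), merge() and fill_bits() are hypothetical names, and merge()
repeats the comp() expression.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 32u

/* Bits at positions >= idx: the part of the first word that gets written. */
static inline uint32_t first_mask(unsigned int idx)
{
	assert(idx < WORD_BITS);
	return UINT32_MAX << idx;
}

/* Bits below the end of the run in its last word; 0 if it ends on a boundary. */
static inline uint32_t last_mask(unsigned int idx, unsigned int n)
{
	return ~(UINT32_MAX << ((idx + n) % WORD_BITS));
}

/* Take a where mask is set and b elsewhere. */
static inline uint32_t merge(uint32_t a, uint32_t b, uint32_t mask)
{
	return ((a ^ b) & mask) ^ b;
}

/* Write n bits of pat starting at bit idx of dst[0]; other bits are kept. */
static inline void fill_bits(uint32_t *dst, unsigned int idx, uint32_t pat,
			     unsigned int n)
{
	uint32_t first, last;

	assert(dst && idx < WORD_BITS);
	if (!n)
		return;

	first = first_mask(idx);
	last = last_mask(idx, n);

	if (idx + n <= WORD_BITS) {
		/* Run fits in one word: intersect both masks. */
		if (last)
			first &= last;
		*dst = merge(pat, *dst, first);
		return;
	}

	/* Partial leading word. */
	if (first != UINT32_MAX) {
		*dst = merge(pat, *dst, first);
		dst++;
		n -= WORD_BITS - idx;
	}

	/* Whole words. */
	for (n /= WORD_BITS; n; n--)
		*dst++ = pat;

	/* Partial trailing word. */
	if (last)
		*dst = merge(pat, *dst, last);
}
```

Filling 40 bits from bit 4 with an all-ones pattern touches bits 4..31 of
the first word and bits 0..11 of the second, and nothing else.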
>>> +
>>> +
>>> +/*
>>> + *  Unaligned generic pattern fill using 32/64-bit memory accesses
>>> + *  The pattern must have been expanded to a full 32/64-bit value
>>> + *  Left/right are the appropriate shifts to convert to the pattern to be
>>> + *  used for the next 32/64-bit word
>>> + */
>>> +static void
>>> +bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		  unsigned long pat, int left, int right, unsigned n, int bits)
>>> +{
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(pat, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +		/* Leading bits */
>>> +		if (first) {
>>> +			*dst = comp(pat, *dst, first);
>>> +			dst++;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 4) {
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= 4;
>>> +		}
>>> +		while (n--) {
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +		}
>>> +
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(pat, *dst, last);
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + *  Aligned pattern invert using 32/64-bit memory accesses
>>> + */
>>> +static void
>>> +bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		    unsigned long pat, unsigned n, int bits)
>>> +{
>>> +	unsigned long val = pat;
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(*dst ^ val, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +		/* Leading bits */
>>> +		if (first!=0UL) {
>>> +			*dst = comp(*dst ^ val, *dst, first);
>>> +			dst++;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 8) {
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			n -= 8;
>>> +		}
>>> +		while (n--)
>>> +			*dst++ ^= val;
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(*dst ^ val, *dst, last);
>>> +	}
>>> +}
>>> +
>>> +
>>> +/*
>>> + *  Unaligned generic pattern invert using 32/64-bit memory accesses
>>> + *  The pattern must have been expanded to a full 32/64-bit value
>>> + *  Left/right are the appropriate shifts to convert to the pattern to be
>>> + *  used for the next 32/64-bit word
>>> + */
>>> +static void
>>> +bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		      unsigned long pat, int left, int right, unsigned n,
>>> +		      int bits)
>>> +{
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(*dst ^ pat, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +
>>> +		/* Leading bits */
>>> +		if (first != 0UL) {
>>> +			*dst = comp(*dst ^ pat, *dst, first);
>>> +			dst++;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 4) {
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= 4;
>>> +		}
>>> +		while (n--) {
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +		}
>>> +
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(*dst ^ pat, *dst, last);
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Generic fillrect for frame buffers in system RAM with packed pixels of
>>> + * any depth.
>>> + */
>>> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
>>> +{
>>> +	unsigned long pat, pat2, fg;
>>> +	unsigned long width = rect->width, height = rect->height;
>>> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
>>> +	u32 bpp = p->var.bits_per_pixel;
>>> +	unsigned long *dst;
>>> +	int dst_idx, left;
>>> +
>>> +	if (p->state != FBINFO_STATE_RUNNING)
>>> +		return;
>>> +
>>> +	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
>>> +	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
>>> +		fg = ((u32 *) (p->pseudo_palette))[rect->color];
>>> +	else
>>> +		fg = rect->color;
>>> +
>>> +	pat = pixel_to_pat( bpp, fg);
>>> +
>>> +	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
>>> +	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
>>> +	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
>>> +	/* FIXME For now we support 1-32 bpp only */
>>> +	left = bits % bpp;
>>> +	if (p->fbops->fb_sync)
>>> +		p->fbops->fb_sync(p);
>>> +	if (!left) {
>>> +		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
>>> +				  int dst_idx, unsigned long pat, unsigned n,
>>> +				  int bits) = NULL;
>>> +
>>> +		switch (rect->rop) {
>>> +		case ROP_XOR:
>>> +			fill_op32 = bitfill_aligned_rev;
>>> +			break;
>>> +		case ROP_COPY:
>>> +			fill_op32 = bitfill_aligned;
>>> +			break;
>>> +		default:
>>> +			printk( KERN_ERR "drm_fillrect(): unknown rop, "
>>> +				"defaulting to ROP_COPY\n");
>>> +			fill_op32 = bitfill_aligned;
>>> +			break;
>>> +		}
>>> +		while (height--) {
>>> +			dst += dst_idx >> (ffs(bits) - 1);
>>> +			dst_idx &= (bits - 1);
>>> +			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
>>> +			dst_idx += p->fix.line_length*8;
>>> +		}
>>> +	} else {
>>> +		int right, r;
>>> +		void (*fill_op)(struct fb_info *p, unsigned long *dst,
>>> +				int dst_idx, unsigned long pat, int left,
>>> +				int right, unsigned n, int bits) = NULL;
>>> +#ifdef __LITTLE_ENDIAN
>>> +		right = left;
>>> +		left = bpp - right;
>>> +#else
>>> +		right = bpp - left;
>>> +#endif
>>> +		switch (rect->rop) {
>>> +		case ROP_XOR:
>>> +			fill_op = bitfill_unaligned_rev;
>>> +			break;
>>> +		case ROP_COPY:
>>> +			fill_op = bitfill_unaligned;
>>> +			break;
>>> +		default:
>>> +			printk(KERN_ERR "drm_fillrect(): unknown rop, "
>>> +				"defaulting to ROP_COPY\n");
>>> +			fill_op = bitfill_unaligned;
>>> +			break;
>>> +		}
>>> +		while (height--) {
>>> +			dst += dst_idx / bits;
>>> +			dst_idx &= (bits - 1);
>>> +			r = dst_idx % bpp;
>>> +			/* rotate pattern to the correct start position */
>>> +			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
>>> +			fill_op(p, dst, dst_idx, pat2, left, right,
>>> +				width*bpp, bits);
>>> +			dst_idx += p->fix.line_length*8;
>>> +		}
>>> +	}
>>> +}
>>> +EXPORT_SYMBOL(drm_fillrect);
>>> +
>>> +static const u32 cfb_tab8_be[] = {
>>> +    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
>>> +    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
>>> +    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
>>> +    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab8_le[] = {
>>> +    0x00000000,0xff000000,0x00ff0000,0xffff0000,
>>> +    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
>>> +    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
>>> +    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab16_be[] = {
>>> +    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab16_le[] = {
>>> +    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab32[] = {
>>> +	0x00000000, 0xffffffff
>>> +};
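
The 8 bpp tables above expand a 4-bit index into four byte-wide masks, one
byte per index bit. As far as I can tell they serve the monochrome blit path,
which is not part of the quoted section, so treat that reading as an
assumption. The sketch below regenerates the table entries;
expand4_low_first() and expand4_high_first() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Index bit 0 selects the lowest byte: same values as cfb_tab8_be[]. */
static inline uint32_t expand4_low_first(unsigned int idx)
{
	uint32_t mask = 0;
	unsigned int bit;

	assert(idx < 16);
	for (bit = 0; bit < 4; bit++)
		if (idx & (1u << bit))
			mask |= UINT32_C(0xff) << (8 * bit);
	return mask;
}

/* Index bit 0 selects the highest byte: same values as cfb_tab8_le[]. */
static inline uint32_t expand4_high_first(unsigned int idx)
{
	uint32_t mask = 0;
	unsigned int bit;

	assert(idx < 16);
	for (bit = 0; bit < 4; bit++)
		if (idx & (1u << bit))
			mask |= UINT32_C(0xff) << (8 * (3 - bit));
	return mask;
}
```

A lookup table avoids this loop on every group of four pixels, at the cost
of 16 words per byte order.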
>>> +
>>> +static void color_imageblit(const struct fb_image *image, struct fb_info *p,
>>> +			    void *dst1, u32 start_index, u32 pitch_index)
>>> +{
>>> +	/* Draw the penguin */
>>> +	u32 *dst, *dst2;
>>> +	u32 color = 0, val, shift;
>>> +	int i, n, bpp = p->var.bits_per_pixel;
>>> +	u32 null_bits = 32 - bpp;
>>> +	u32 *palette = (u32 *) p->pseudo_palette;
>>> +	const u8 *src = image->data;
>>> +
>>> +	dst2 = dst1;
>>> +	for (i = image->height; i--; ) {
>>> +		n = image->width;
>>> +		dst = dst1;
>>> +		shift = 0;
>>> +		val = 0;
>>> +
>>> +		if (start_index) {
>>> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
>>> +							 start_index));
>>> +			val = *dst & start_mask;
>>> +			shift = start_index;
>>> +		}
>>> +		while (n--) {
>>> +			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
>>> +			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
>>> +				color = palette[*src];
>>> +			else
>>> +				color = *src;
>>> +			color <<= FB_LEFT_POS(p, bpp);
>>> +			val |= FB_SHIFT_HIGH(p, color, shift);
>>> +			if (shift >= null_bits) {
>>> +				*dst++ = val;
>>> +
>>> +				val = (shift == null_bits) ? 0 :
>>> +					FB_SHIFT_LOW(p, color, 32 - shift);
>>> +			}
>>> +			shift += bpp;
>>> +			shift &= (32 - 1);
>>> +			src++;
>>> +		}
>>> +		if (shift) {
>>> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
>>> +
>>> +			*dst &= end_mask;
>>> +			*dst |= val;
>>> +		}
>>> +		dst1 += p->fix.line_length;
>>> +		if (pitch_index) {
>>> +			dst2 += p->fix.line_length;
>>> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
>>> +
>>> +			start_index += pitch_index;
>>> +			start_index &= 32 - 1;
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
>>> +				  void *dst1, u32 fgcolor, u32 bgcolor,
>>> +				  u32 start_index, u32 pitch_index)
>>> +{
>>> +	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
>>> +	u32 *dst, *dst2;
>>> +	u32 val, pitch = p->fix.line_length;
>>> +	u32 null_bits = 32 - bpp;
>>> +	u32 spitch = (image->width+7)/8;
>>> +	const u8 *src = image->data, *s;
>>> +	u32 i, j, l;
>>> +
>>> +	dst2 = dst1;
>>> +	fgcolor <<= FB_LEFT_POS(p, bpp);
>>> +	bgcolor <<= FB_LEFT_POS(p, bpp);
>>> +
>>> +	for (i = image->height; i--; ) {
>>> +		shift = val = 0;
>>> +		l = 8;
>>> +		j = image->width;
>>> +		dst = dst1;
>>> +		s = src;
>>> +
>>> +		/* write leading bits */
>>> +		if (start_index) {
>>> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
>>> +							 start_index));
>>> +			val = *dst & start_mask;
>>> +			shift = start_index;
>>> +		}
>>> +
>>> +		while (j--) {
>>> +			l--;
>>> +			color = (*s & (1 << l)) ? fgcolor : bgcolor;
>>> +			val |= FB_SHIFT_HIGH(p, color, shift);
>>> +
>>> +			/* Did the bitshift spill bits to the next long? */
>>> +			if (shift >= null_bits) {
>>> +				*dst++ = val;
>>> +				val = (shift == null_bits) ? 0 :
>>> +					FB_SHIFT_LOW(p, color, 32 - shift);
>>> +			}
>>> +			shift += bpp;
>>> +			shift &= (32 - 1);
>>> +			if (!l) { l = 8; s++; }
>>> +		}
>>> +
>>> +		/* write trailing bits */
>>> + 		if (shift) {
>>> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
>>> +
>>> +			*dst &= end_mask;
>>> +			*dst |= val;
>>> +		}
>>> +
>>> +		dst1 += pitch;
>>> +		src += spitch;
>>> +		if (pitch_index) {
>>> +			dst2 += pitch;
>>> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
>>> +			start_index += pitch_index;
>>> +			start_index &= 32 - 1;
>>> +		}
>>> +
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * fast_imageblit - optimized monochrome color expansion
>>> + *
>>> + * Only if:  bits_per_pixel == 8, 16, or 32
>>> + *           image->width is divisible by pixel/dword (ppw);
>>> + *           fix->line_length is divisible by 4;
>>> + *           beginning and end of a scanline is dword aligned
>>> + */
>>> +static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
>>> +				  void *dst1, u32 fgcolor, u32 bgcolor)
>>> +{
>>> +	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
>>> +	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
>>> +	u32 bit_mask, end_mask, eorx, shift;
>>> +	const char *s = image->data, *src;
>>> +	u32 *dst;
>>> +	const u32 *tab = NULL;
>>> +	int i, j, k;
>>> +
>>> +	switch (bpp) {
>>> +	case 8:
>>> +		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
>>> +		break;
>>> +	case 16:
>>> +		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
>>> +		break;
>>> +	case 32:
>>> +	default:
>>> +		tab = cfb_tab32;
>>> +		break;
>>> +	}
>>> +
>>> +	for (i = ppw-1; i--; ) {
>>> +		fgx <<= bpp;
>>> +		bgx <<= bpp;
>>> +		fgx |= fgcolor;
>>> +		bgx |= bgcolor;
>>> +	}
>>> +
>>> +	bit_mask = (1 << ppw) - 1;
>>> +	eorx = fgx ^ bgx;
>>> +	k = image->width/ppw;
>>> +
>>> +	for (i = image->height; i--; ) {
>>> +		dst = dst1;
>>> +		shift = 8;
>>> +		src = s;
>>> +
>>> +		for (j = k; j--; ) {
>>> +			shift -= ppw;
>>> +			end_mask = tab[(*src >> shift) & bit_mask];
>>> +			*dst++ = (end_mask & eorx) ^ bgx;
>>> +			if (!shift) {
>>> +				shift = 8;
>>> +				src++;
>>> +			}
>>> +		}
>>> +		dst1 += p->fix.line_length;
>>> +		s += spitch;
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
>>> + * for frame buffer located in system RAM with packed pixels of any depth.
>>> + */
>>> +void drm_imageblit(struct fb_info *p, const struct fb_image *image)
>>> +{
>>> +	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
>>> +	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
>>> +	u32 width = image->width;
>>> +	u32 dx = image->dx, dy = image->dy;
>>> +	void *dst1;
>>> +
>>> +	if (p->state != FBINFO_STATE_RUNNING)
>>> +		return;
>>> +
>>> +	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
>>> +	start_index = bitstart & (32 - 1);
>>> +	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
>>> +
>>> +	bitstart /= 8;
>>> +	bitstart &= ~(bpl - 1);
>>> +	dst1 = (void __force *)p->screen_base + bitstart;
>>> +
>>> +	if (p->fbops->fb_sync)
>>> +		p->fbops->fb_sync(p);
>>> +
>>> +	if (image->depth == 1) {
>>> +		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
>>> +		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
>>> +			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
>>> +			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
>>> +		} else {
>>> +			fgcolor = image->fg_color;
>>> +			bgcolor = image->bg_color;
>>> +		}
>>> +
>>> +		if (32 % bpp == 0 && !start_index && !pitch_index &&
>>> +		    ((width & (32/bpp-1)) == 0) &&
>>> +		    bpp >= 8 && bpp <= 32)
>>> +			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
>>> +		else
>>> +			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
>>> +					start_index, pitch_index);
>>> +	} else
>>> +		color_imageblit(image, p, dst1, start_index, pitch_index);
>>> +}
>>> +EXPORT_SYMBOL(drm_imageblit);
>>> +
>>> +/*
>>> + * Generic file operations where framebuffer is in system RAM
>>> + */
>>> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
>>> +{
>>> +	unsigned long p = *ppos;
>>> +	void *src;
>>> +	int err = 0;
>>> +	unsigned long total_size;
>>> +
>>> +	if (info->state != FBINFO_STATE_RUNNING)
>>> +		return -EPERM;
>>> +
>>> +	total_size = info->screen_size;
>>> +
>>> +	if (total_size == 0)
>>> +		total_size = info->fix.smem_len;
>>> +
>>> +	if (p >= total_size)
>>> +		return 0;
>>> +
>>> +	if (count >= total_size)
>>> +		count = total_size;
>>> +
>>> +	if (count + p > total_size)
>>> +		count = total_size - p;
>>> +
>>> +	src = (void __force *)(info->screen_base + p);
>>> +
>>> +	if (info->fbops->fb_sync)
>>> +		info->fbops->fb_sync(info);
>>> +
>>> +	if (copy_to_user(buf, src, count))
>>> +		err = -EFAULT;
>>> +
>>> +	if  (!err)
>>> +		*ppos += count;
>>> +
>>> +	return (err) ? err : count;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_fb_read);
>>> +
>>> +/*
>>> + * Generic file operations where framebuffer is in system RAM
>>> + */
>>> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
>>> +{
>>> +	unsigned long p = *ppos;
>>> +	void *dst;
>>> +	int err = 0;
>>> +	unsigned long total_size;
>>> +
>>> +	if (info->state != FBINFO_STATE_RUNNING)
>>> +		return -EPERM;
>>> +
>>> +	total_size = info->screen_size;
>>> +
>>> +	if (total_size == 0)
>>> +		total_size = info->fix.smem_len;
>>> +
>>> +	if (p > total_size)
>>> +		return -EFBIG;
>>> +
>>> +	if (count > total_size) {
>>> +		err = -EFBIG;
>>> +		count = total_size;
>>> +	}
>>> +
>>> +	if (count + p > total_size) {
>>> +		if (!err)
>>> +			err = -ENOSPC;
>>> +
>>> +		count = total_size - p;
>>> +	}
>>> +
>>> +	dst = (void __force *) (info->screen_base + p);
>>> +
>>> +	if (info->fbops->fb_sync)
>>> +		info->fbops->fb_sync(info);
>>> +
>>> +	if (copy_from_user(dst, buf, count))
>>> +		err = -EFAULT;
>>> +
>>> +	if  (!err)
>>> +		*ppos += count;
>>> +
>>> +	return (err) ? err : count;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_fb_write);
>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>>> index da0d96a69570..18fba9818635 100644
>>> --- a/drivers/gpu/drm/drm_fb_helper.c
>>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>>> @@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
>>>  EXPORT_SYMBOL(drm_fb_helper_deferred_io);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_read - wrapper around fb_sys_read
>>> + * drm_fb_helper_sys_read - wrapper around drm_fb_read
>>>   * @info: fb_info struct pointer
>>>   * @buf: userspace buffer to read from framebuffer memory
>>>   * @count: number of bytes to read from framebuffer memory
>>>   * @ppos: read offset within framebuffer memory
>>>   *
>>> - * A wrapper around fb_sys_read implemented by fbdev core
>>> + * A wrapper around drm_fb_read
>>>   */
>>>  ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
>>>  			       size_t count, loff_t *ppos)
>>>  {
>>> -	return fb_sys_read(info, buf, count, ppos);
>>> +	return drm_fb_read(info, buf, count, ppos);
>>>  }
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_read);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_write - wrapper around fb_sys_write
>>> + * drm_fb_helper_sys_write - wrapper around drm_fb_write
>>>   * @info: fb_info struct pointer
>>>   * @buf: userspace buffer to write to framebuffer memory
>>>   * @count: number of bytes to write to framebuffer memory
>>>   * @ppos: write offset within framebuffer memory
>>>   *
>>> - * A wrapper around fb_sys_write implemented by fbdev core
>>> + * A wrapper around drm_fb_write
>>>   */
>>>  ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>>>  				size_t count, loff_t *ppos)
>>>  {
>>>  	ssize_t ret;
>>>  
>>> -	ret = fb_sys_write(info, buf, count, ppos);
>>> +	ret = drm_fb_write(info, buf, count, ppos);
>>>  	if (ret > 0)
>>>  		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
>>>  				    info->var.yres);
>>> @@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_write);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
>>> + * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
>>>   * @info: fbdev registered by the helper
>>>   * @rect: info about rectangle to fill
>>>   *
>>> - * A wrapper around sys_fillrect implemented by fbdev core
>>> + * A wrapper around drm_fillrect
>>>   */
>>>  void drm_fb_helper_sys_fillrect(struct fb_info *info,
>>>  				const struct fb_fillrect *rect)
>>>  {
>>> -	sys_fillrect(info, rect);
>>> +	drm_fillrect(info, rect);
>>>  	drm_fb_helper_dirty(info, rect->dx, rect->dy,
>>>  			    rect->width, rect->height);
>>>  }
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
>>> + * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
>>>   * @info: fbdev registered by the helper
>>>   * @area: info about area to copy
>>>   *
>>> - * A wrapper around sys_copyarea implemented by fbdev core
>>> + * A wrapper around drm_copyarea
>>>   */
>>>  void drm_fb_helper_sys_copyarea(struct fb_info *info,
>>>  				const struct fb_copyarea *area)
>>>  {
>>> -	sys_copyarea(info, area);
>>> +	drm_copyarea(info, area);
>>>  	drm_fb_helper_dirty(info, area->dx, area->dy,
>>>  			    area->width, area->height);
>>>  }
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
>>> + * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
>>>   * @info: fbdev registered by the helper
>>>   * @image: info about image to blit
>>>   *
>>> - * A wrapper around sys_imageblit implemented by fbdev core
>>> + * A wrapper around drm_imageblit
>>>   */
>>>  void drm_fb_helper_sys_imageblit(struct fb_info *info,
>>>  				 const struct fb_image *image)
>>>  {
>>> -	sys_imageblit(info, image);
>>> +	drm_imageblit(info, image);
>>>  	drm_fb_helper_dirty(info, image->dx, image->dy,
>>>  			    image->width, image->height);
>>>  }
>>> diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
>>> index 306aa3a60be9..d38005f6128a 100644
>>> --- a/include/drm/drm_fb_helper.h
>>> +++ b/include/drm/drm_fb_helper.h
>>> @@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
>>>  
>>>  void drm_fbdev_generic_setup(struct drm_device *dev,
>>>  			     unsigned int preferred_bpp);
>>> +
>>> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
>>> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
>>> +void drm_imageblit(struct fb_info *p, const struct fb_image *image);
>>> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
>>> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
>>>  #else
>>>  static inline void drm_fb_helper_prepare(struct drm_device *dev,
>>>  					struct drm_fb_helper *helper,
>>>
>>
>> -- 
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Software Solutions Germany GmbH
>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>> (HRB 36809, AG Nürnberg)
>> Geschäftsführer: Felix Imendörffer
>>
> 
> 
> 
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



* Re: [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function
@ 2020-08-04  9:41                 ` Thomas Zimmermann
  0 siblings, 0 replies; 34+ messages in thread
From: Thomas Zimmermann @ 2020-08-04  9:41 UTC (permalink / raw)
  To: daniel
  Cc: linux-fbdev, b.zolnierkie, jani.nikula, dri-devel, kraxel,
	airlied, natechancellor, Sam Ravnborg, peda, dan.carpenter


Hi

Am 04.08.20 um 11:27 schrieb daniel@ffwll.ch:
> On Mon, Aug 03, 2020 at 08:46:34AM +0200, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 02.08.20 um 22:01 schrieb Sam Ravnborg:
>>> On Fri, Jul 31, 2020 at 11:20:33AM +0200, daniel@ffwll.ch wrote:
>>>> On Wed, Jul 29, 2020 at 06:36:03PM +0200, Sam Ravnborg wrote:
>>>>> Hi Daniel.
>>>>>
>>>>> On Wed, Jul 29, 2020 at 03:53:28PM +0200, daniel@ffwll.ch wrote:
>>>>>> On Wed, Jul 29, 2020 at 03:41:45PM +0200, Thomas Zimmermann wrote:
>>>>>>> DRM fb helpers require read and write functions for framebuffer
>>>>>>> memory. Export the existing code from fbdev.
>>>>>>>
>>>>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>>>>>>
>>>>>> Hm I'm not super sure whether we want to actually reuse this stuff ... We
>>>>>> kinda don't care about the sparc special case, and just having an fbdev
>>>>>> implementation which has the switch between memcpy and memcpy_to/from_io
>>>>>> in one single place sounds a lot simpler ...
>>>>>>
>>>>>> This way we can have a clean split between the old horrors of real fbdev
>>>>>> drivers, and a much cleaner world in drm. It would mean a bit of
>>>>>> copypasting, but I think that's actually a good thing.
>>>>>>
>>>>>> In general my idea for drm fbdev emulation is that for any area we have a
>>>>>> problem we just ignore the entire fbmem.c code and write our own: mmap,
>>>>>> backlight handling (still unsolved, and horrible), cfb vs sys here. This
>>>>>> entire fbmem.c stuff is pretty bad midlayer, trying to avoid code
>>>>>> duplication here doesn't seem worth it imo.
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>>
>>>>> I can see that fbmem is a mix of ioctl support and other stuff.
>>>>> We could factor out all the ioctl parts of fbmem.c to a new file
>>>>> named fbioctl.c.
>>>>>
>>>>> And then let the ioctl parts call down into drm stuff and avoid reusing
>>>>> the fbdev code when we first reach drm code.
>>>>> This would require local copies of:
>>>>> sys_read, sys_write, sys_fillrect, sys_copyarea, sys_imageblit
>>>>> and more I think which I missed.
>>>>>
>>>>> With local copies we could avoid some of the special cases and trim the
>>>>> functions to what is required by drm only.
>>>>> And then no more fbmem dependencies and no dependencies to several of
>>>>> the small helper functions. So less entanglement with fbdev core.
>>>>>
>>>>> This all sounds simple so I am surely missing a lot of ugly details here.
>>>>>
>>>>> And should we touch this anyway we need a test suite to verify not too
>>>>> much breaks. To the best of my knowledge there is not yet such a test
>>>>> suite :-( Maybe because people caring about fbdev are limited.
>>>>
>>>> Well my idea was to not refactor anything, but just have drm copies of the
>>>> various fb_ops callbacks. Definitely not even more refactoring :-)
>>
>> Thanks for making a prototype, Sam.
>>
>> But do we really want to duplicate all this code from fbdev? It's not
>> actually pretty and there's little value in rewriting it. If anything,
>> I can see us extending drm_format_helpers.c and building upon this.
> 
> Hm yeah this is a lot more horrible than I thought, I was kinda just
> looking at read/write functions, maybe also mmap, and figured we could
> simplify this a lot if we just inline and done.
> 
> But all these image/fill/copy functions are pretty bad, and they seem to
> care about a lot of corner cases that just don't matter anymore on modern
> hw. Also agreeing with Thomas that if we do copy this, then building on top
> of format helpers would make more sense. At that point we probably need a
> test suite for fbdev, and this entire endeavour becomes rather silly.
> 
> So maybe compromise approach? E.g. we reuse fbdev helpers with a switch
> between sys and cfb if it would drag in a huge pile of code for format
> handling silliness. But when all it would drag in is a memcpy_to/fromio
> or similar, we inline to reduce a bit the midlayer and make it clearer
> what's going on. Strictly speaking would probably still need a test suite,
> but I guess we can delay that until the first bug report or so :-)

Makes sense. I've been working on the wrapped-pointer approach a bit.
Rewriting cfb read and write within DRM is not much effort.
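
To give an idea of the size, here is a rough userspace sketch of such a
read path. It is not the actual patch: all sketch_ names are made up for
illustration, and the byte-wise volatile loop only stands in for
memcpy_fromio(). The clamping follows the same checks as the quoted
drm_fb_read().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: clamp a read of 'count' bytes at offset 'pos' to a
 * framebuffer of 'total' bytes. Returns the number of bytes that may be
 * copied; 0 means the offset is at or past the end.
 */
size_t sketch_clamp_read(size_t pos, size_t count, size_t total)
{
	if (pos >= total)
		return 0;
	if (count > total - pos)
		count = total - pos;
	return count;
}

/*
 * Userspace stand-in for memcpy_fromio(): byte-wise loads through a
 * volatile pointer, so every access is actually performed.
 */
void sketch_copy_fromio(void *dst, const volatile void *src, size_t n)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/*
 * Read up to 'count' bytes at offset 'pos' from a framebuffer that is
 * treated as I/O memory. Returns the number of bytes copied to 'buf'.
 */
size_t sketch_fb_read(void *buf, const volatile void *fb,
		      size_t total, size_t count, size_t pos)
{
	assert(buf && fb);

	count = sketch_clamp_read(pos, count, total);
	if (count)
		sketch_copy_fromio(buf,
				   (const volatile unsigned char *)fb + pos,
				   count);
	return count;
}
```

In the kernel the data would additionally have to go through a bounce
buffer and copy_to_user(); the sketch leaves that part out.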

Format helpers appear to have I/O and non-I/O versions of several
conversion functions. If we go with the wrapped pointer, we could later
change the format helpers to accept the wrapper structure and handle
iomem internally.
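
A minimal userspace sketch of what I mean by a wrapped pointer, under the
assumption that the wrapper is just an address plus a flag. The names
(struct sketch_fb_map, sketch_fb_map_write, sketch_copy_toio) are
hypothetical and not from any patch; the volatile byte loop stands in for
memcpy_toio().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: one address plus a flag that says how to access it. */
struct sketch_fb_map {
	union {
		void *vaddr;			/* system memory */
		volatile void *vaddr_iomem;	/* I/O memory (__iomem in the kernel) */
	};
	bool is_iomem;
};

/* Userspace stand-in for memcpy_toio(): byte-wise volatile stores. */
void sketch_copy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}

/*
 * Write 'len' bytes from 'src' at 'offset' into the mapping. The switch
 * between the regular and the I/O copy lives in this one place, so
 * callers never look at the flag themselves.
 */
void sketch_fb_map_write(const struct sketch_fb_map *map, size_t offset,
			 const void *src, size_t len)
{
	assert(map && src);

	if (map->is_iomem)
		sketch_copy_toio((volatile unsigned char *)map->vaddr_iomem + offset,
				 src, len);
	else
		memcpy((unsigned char *)map->vaddr + offset, src, len);
}
```

A format helper that takes such a structure instead of a raw pointer would
only need one version of each conversion function.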

Best regards
Thomas

> 
> Cheers, Daniel
> 
> 
>>
>> Best regards
>> Thomas
>>
>>>
>>> $ wc -l drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> 1212 drivers/gpu/drm/drm_fb_fbdev_helper.c
>>>
>>> This is a straight copy of:
>>> $ grep EXPORT drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> EXPORT_SYMBOL(drm_copyarea);
>>> EXPORT_SYMBOL(drm_fillrect);
>>> EXPORT_SYMBOL(drm_imageblit);
>>> EXPORT_SYMBOL_GPL(drm_fb_read);
>>> EXPORT_SYMBOL_GPL(drm_fb_write);
>>>
>>> copyarea, fillrect and imageblit are a copy of the sys variants.
>>> read and write are from fb_sys_fops.c
>>>
>>> Patch builds but I am not a big fan yet - right now it gains us very
>>> little. What would the next step be?
>>>
>>> I looked briefly at the few drivers that have not migrated to
>>> the generic fbdev emulation (yet?).
>>>   msm
>>>   omapdrm
>>>   gma500
>>>
>>> all use the sys_ variants - they already use the migrated variants as
>>> they call the drm_fb helper.
>>>
>>>
>>>   exynos
>>>   gma500
>>>   i915
>>>   nouveau
>>>   radeon
>>>   rockchip
>>>   vmwgfx
>>>
>>> all use the cfb variants and cannot use the migrated variants as-is.
>>>
>>> 	Sam
>>>
>>> From c554e8b9ec96e06a5933ed5bd834a928c977bcdf Mon Sep 17 00:00:00 2001
>>> From: Sam Ravnborg <sam@ravnborg.org>
>>> Date: Fri, 31 Jul 2020 23:36:01 +0200
>>> Subject: [PATCH 1/3] drm: drm_fb_helper: copy fb_ops functions to drm
>>>
>>> Copy the fbdev functions used by fb_ops to drm.
>>> Having local copies makes it possible
>>> to adjust the functions so they are tailored to the needs of DRM.
>>> At the same time there is one less dependency between DRM and fbdev.
>>>
>>> The files were copied verbatim and only minimal changes were made.
>>>
>>> As some drivers use the fbdev functions they continue to be built, but
>>> the generic fbdev emulation no longer uses the fbdev functions.
>>>
>>> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
>>> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> ---
>>>  drivers/gpu/drm/Kconfig               |    1 -
>>>  drivers/gpu/drm/Makefile              |    1 +
>>>  drivers/gpu/drm/drm_fb_fbdev_helper.c | 1212 +++++++++++++++++++++++++
>>>  drivers/gpu/drm/drm_fb_helper.c       |   30 +-
>>>  include/drm/drm_fb_helper.h           |    6 +
>>>  5 files changed, 1234 insertions(+), 16 deletions(-)
>>>  create mode 100644 drivers/gpu/drm/drm_fb_fbdev_helper.c
>>>
>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>> index c4fd57d8b717..11f41bc31f05 100644
>>> --- a/drivers/gpu/drm/Kconfig
>>> +++ b/drivers/gpu/drm/Kconfig
>>> @@ -83,7 +83,6 @@ config DRM_KMS_FB_HELPER
>>>  	select FB
>>>  	select FRAMEBUFFER_CONSOLE if !EXPERT
>>>  	select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE
>>> -	select FB_SYS_FOPS
>>>  	select FB_SYS_FILLRECT
>>>  	select FB_SYS_COPYAREA
>>>  	select FB_SYS_IMAGEBLIT
>>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
>>> index 02ee5faf1a92..398cb4746762 100644
>>> --- a/drivers/gpu/drm/Makefile
>>> +++ b/drivers/gpu/drm/Makefile
>>> @@ -50,6 +50,7 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
>>>  
>>>  drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
>>>  drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
>>> +drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_fbdev_helper.o
>>>  drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
>>>  drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o
>>>  drm_kms_helper-$(CONFIG_DRM_DP_CEC) += drm_dp_cec.o
>>> diff --git a/drivers/gpu/drm/drm_fb_fbdev_helper.c b/drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> new file mode 100644
>>> index 000000000000..dda60f85b9d5
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/drm_fb_fbdev_helper.c
>>> @@ -0,0 +1,1212 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * This file is subject to the terms and conditions of the GNU General Public
>>> + * License.  See the file COPYING in the main directory of this archive for
>>> + * more details.
>>> + *
>>> + * Copyright (C)  2007 Antonino Daplas <adaplas@pol.net>
>>> + *
>>> + * drm_copyarea:
>>> + *   Based almost entirely from cfbcopyarea.c (which is based almost entirely
>>> + *   on Geert Uytterhoeven's copyarea routine)
>>> + *
>>> + *  drm_fillrect:
>>> + *    Based almost entirely from cfbfillrect.c (which is based almost entirely
>>> + *    on Geert Uytterhoeven's fillrect routine)
>>> + *
>>> + *  drm_imageblit:
>>> + *    Based almost entirely on cfbimgblt.c
>>> + *
>>> + */
>>> +
>>> +#include <linux/export.h>
>>> +#include <linux/fb.h>
>>> +#include <linux/io.h>
>>> +
>>> +
>>> +/*
>>> +*  Compose two values, using a bitmask as decision value
>>> +*  This is equivalent to (a & mask) | (b & ~mask)
>>> +*/
>>> +static inline unsigned long
>>> +comp(unsigned long a, unsigned long b, unsigned long mask)
>>> +{
>>> +    return ((a ^ b) & mask) ^ b;
>>> +}
>>> +
>>> +/*
>>> + *  Create a pattern with the given pixel's color
>>> + */
>>> +#if BITS_PER_LONG == 64
>>> +static inline unsigned long
>>> +pixel_to_pat( u32 bpp, u32 pixel)
>>> +{
>>> +	switch (bpp) {
>>> +	case 1:
>>> +		return 0xfffffffffffffffful*pixel;
>>> +	case 2:
>>> +		return 0x5555555555555555ul*pixel;
>>> +	case 4:
>>> +		return 0x1111111111111111ul*pixel;
>>> +	case 8:
>>> +		return 0x0101010101010101ul*pixel;
>>> +	case 12:
>>> +		return 0x1001001001001001ul*pixel;
>>> +	case 16:
>>> +		return 0x0001000100010001ul*pixel;
>>> +	case 24:
>>> +		return 0x0001000001000001ul*pixel;
>>> +	case 32:
>>> +		return 0x0000000100000001ul*pixel;
>>> +	default:
>>> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
>>> +		return 0;
>>> +    }
>>> +}
>>> +#else
>>> +static inline unsigned long
>>> +pixel_to_pat( u32 bpp, u32 pixel)
>>> +{
>>> +	switch (bpp) {
>>> +	case 1:
>>> +		return 0xfffffffful*pixel;
>>> +	case 2:
>>> +		return 0x55555555ul*pixel;
>>> +	case 4:
>>> +		return 0x11111111ul*pixel;
>>> +	case 8:
>>> +		return 0x01010101ul*pixel;
>>> +	case 12:
>>> +		return 0x01001001ul*pixel;
>>> +	case 16:
>>> +		return 0x00010001ul*pixel;
>>> +	case 24:
>>> +		return 0x01000001ul*pixel;
>>> +	case 32:
>>> +		return 0x00000001ul*pixel;
>>> +	default:
>>> +		WARN(1, "pixel_to_pat(): unsupported pixelformat %d\n", bpp);
>>> +		return 0;
>>> +    }
>>> +}
>>> +#endif
>>> +
>>> +#ifdef CONFIG_FB_CFB_REV_PIXELS_IN_BYTE
>>> +#if BITS_PER_LONG == 64
>>> +#define REV_PIXELS_MASK1 0x5555555555555555ul
>>> +#define REV_PIXELS_MASK2 0x3333333333333333ul
>>> +#define REV_PIXELS_MASK4 0x0f0f0f0f0f0f0f0ful
>>> +#else
>>> +#define REV_PIXELS_MASK1 0x55555555ul
>>> +#define REV_PIXELS_MASK2 0x33333333ul
>>> +#define REV_PIXELS_MASK4 0x0f0f0f0ful
>>> +#endif
>>> +
>>> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
>>> +						  u32 bswapmask)
>>> +{
>>> +	if (bswapmask & 1)
>>> +		val = comp(val >> 1, val << 1, REV_PIXELS_MASK1);
>>> +	if (bswapmask & 2)
>>> +		val = comp(val >> 2, val << 2, REV_PIXELS_MASK2);
>>> +	if (bswapmask & 3)
>>> +		val = comp(val >> 4, val << 4, REV_PIXELS_MASK4);
>>> +	return val;
>>> +}
>>> +
>>> +static inline u32 fb_shifted_pixels_mask_u32(struct fb_info *p, u32 index,
>>> +					     u32 bswapmask)
>>> +{
>>> +	u32 mask;
>>> +
>>> +	if (!bswapmask) {
>>> +		mask = FB_SHIFT_HIGH(p, ~(u32)0, index);
>>> +	} else {
>>> +		mask = 0xff << FB_LEFT_POS(p, 8);
>>> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
>>> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
>>> +#if defined(__i386__) || defined(__x86_64__)
>>> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
>>> +		if(index + bswapmask < 32)
>>> +#endif
>>> +			mask |= FB_SHIFT_HIGH(p, ~(u32)0,
>>> +					(index + bswapmask) & ~(bswapmask));
>>> +	}
>>> +	return mask;
>>> +}
>>> +
>>> +static inline unsigned long fb_shifted_pixels_mask_long(struct fb_info *p,
>>> +							u32 index,
>>> +							u32 bswapmask)
>>> +{
>>> +	unsigned long mask;
>>> +
>>> +	if (!bswapmask) {
>>> +		mask = FB_SHIFT_HIGH(p, ~0UL, index);
>>> +	} else {
>>> +		mask = 0xff << FB_LEFT_POS(p, 8);
>>> +		mask = FB_SHIFT_LOW(p, mask, index & (bswapmask)) & mask;
>>> +		mask = FB_SHIFT_HIGH(p, mask, index & ~(bswapmask));
>>> +#if defined(__i386__) || defined(__x86_64__)
>>> +		/* Shift argument is limited to 0 - 31 on x86 based CPU's */
>>> +		if(index + bswapmask < BITS_PER_LONG)
>>> +#endif
>>> +			mask |= FB_SHIFT_HIGH(p, ~0UL,
>>> +					(index + bswapmask) & ~(bswapmask));
>>> +	}
>>> +	return mask;
>>> +}
>>> +
>>> +
>>> +static inline u32 fb_compute_bswapmask(struct fb_info *info)
>>> +{
>>> +	u32 bswapmask = 0;
>>> +	unsigned bpp = info->var.bits_per_pixel;
>>> +
>>> +	if ((bpp < 8) && (info->var.nonstd & FB_NONSTD_REV_PIX_IN_B)) {
>>> +		/*
>>> +		 * Reversed order of pixel layout in bytes
>>> +		 * works only for 1, 2 and 4 bpp
>>> +		 */
>>> +		bswapmask = 7 - bpp + 1;
>>> +	}
>>> +	return bswapmask;
>>> +}
>>> +
>>> +#else /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
>>> +
>>> +static inline unsigned long fb_rev_pixels_in_long(unsigned long val,
>>> +						  u32 bswapmask)
>>> +{
>>> +	return val;
>>> +}
>>> +
>>> +#define fb_shifted_pixels_mask_u32(p, i, b) FB_SHIFT_HIGH((p), ~(u32)0, (i))
>>> +#define fb_shifted_pixels_mask_long(p, i, b) FB_SHIFT_HIGH((p), ~0UL, (i))
>>> +#define fb_compute_bswapmask(...) 0
>>> +
>>> +#endif  /* CONFIG_FB_CFB_REV_PIXELS_IN_BYTE */
>>> +
>>> +#define cpu_to_le_long _cpu_to_le_long(BITS_PER_LONG)
>>> +#define _cpu_to_le_long(x) __cpu_to_le_long(x)
>>> +#define __cpu_to_le_long(x) cpu_to_le##x
>>> +
>>> +#define le_long_to_cpu _le_long_to_cpu(BITS_PER_LONG)
>>> +#define _le_long_to_cpu(x) __le_long_to_cpu(x)
>>> +#define __le_long_to_cpu(x) le##x##_to_cpu
>>> +
>>> +static inline unsigned long rolx(unsigned long word, unsigned int shift, unsigned int x)
>>> +{
>>> +	return (word << shift) | (word >> (x - shift));
>>> +}
>>> +
>>> +
>>> +/*
>>> + *  Generic bitwise copy algorithm
>>> + */
>>> +static void
>>> +bitcpy(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
>>> +	const unsigned long *src, unsigned src_idx, int bits, unsigned n)
>>> +{
>>> +	unsigned long first, last;
>>> +	int const shift = dst_idx-src_idx;
>>> +	int left, right;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (!shift) {
>>> +		/* Same alignment for source and dest */
>>> +		if (dst_idx+n <= bits) {
>>> +			/* Single word */
>>> +			if (last)
>>> +				first &= last;
>>> +			*dst = comp(*src, *dst, first);
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +			/* Leading bits */
>>> + 			if (first != ~0UL) {
>>> +				*dst = comp(*src, *dst, first);
>>> +				dst++;
>>> +				src++;
>>> +				n -= bits - dst_idx;
>>> +			}
>>> +
>>> +			/* Main chunk */
>>> +			n /= bits;
>>> +			while (n >= 8) {
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				*dst++ = *src++;
>>> +				n -= 8;
>>> +			}
>>> +			while (n--)
>>> +				*dst++ = *src++;
>>> +
>>> +			/* Trailing bits */
>>> +			if (last)
>>> +				*dst = comp(*src, *dst, last);
>>> +		}
>>> +	} else {
>>> +		unsigned long d0, d1;
>>> +		int m;
>>> +
>>> +		/* Different alignment for source and dest */
>>> +		right = shift & (bits - 1);
>>> +		left = -shift & (bits - 1);
>>> +
>>> +		if (dst_idx+n <= bits) {
>>> +			/* Single destination word */
>>> +			if (last)
>>> +				first &= last;
>>> +			if (shift > 0) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src << left, *dst, first);
>>> +			} else if (src_idx+n <= bits) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src >> right, *dst, first);
>>> +			} else {
>>> +				/* 2 source words */
>>> +				d0 = *src++;
>>> +				d1 = *src;
>>> +				*dst = comp(d0 >> right | d1 << left, *dst,
>>> +					    first);
>>> +			}
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +			/** We must always remember the last value read,
>>> +			    because in case SRC and DST overlap bitwise (e.g.
>>> +			    when moving just one pixel in 1bpp), we always
>>> +			    collect one full long for DST and that might
>>> +			    overlap with the current long from SRC. We store
>>> +			    this value in 'd0'. */
>>> +			d0 = *src++;
>>> +			/* Leading bits */
>>> +			if (shift > 0) {
>>> +				/* Single source word */
>>> +				*dst = comp(d0 << left, *dst, first);
>>> +				dst++;
>>> +				n -= bits - dst_idx;
>>> +			} else {
>>> +				/* 2 source words */
>>> +				d1 = *src++;
>>> +				*dst = comp(d0 >> right | d1 << left, *dst,
>>> +					    first);
>>> +				d0 = d1;
>>> +				dst++;
>>> +				n -= bits - dst_idx;
>>> +			}
>>> +
>>> +			/* Main chunk */
>>> +			m = n % bits;
>>> +			n /= bits;
>>> +			while (n >= 4) {
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +				n -= 4;
>>> +			}
>>> +			while (n--) {
>>> +				d1 = *src++;
>>> +				*dst++ = d0 >> right | d1 << left;
>>> +				d0 = d1;
>>> +			}
>>> +
>>> +			/* Trailing bits */
>>> +			if (m) {
>>> +				if (m <= bits - right) {
>>> +					/* Single source word */
>>> +					d0 >>= right;
>>> +				} else {
>>> +					/* 2 source words */
>>> + 					d1 = *src;
>>> +					d0 = d0 >> right | d1 << left;
>>> +				}
>>> +				*dst = comp(d0, *dst, last);
>>> +			}
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + *  Generic bitwise copy algorithm, operating backward
>>> + */
>>> +static void
>>> +bitcpy_rev(struct fb_info *p, unsigned long *dst, unsigned dst_idx,
>>> +	   const unsigned long *src, unsigned src_idx, unsigned bits,
>>> +	   unsigned n)
>>> +{
>>> +	unsigned long first, last;
>>> +	int shift;
>>> +
>>> +	dst += (dst_idx + n - 1) / bits;
>>> +	src += (src_idx + n - 1) / bits;
>>> +	dst_idx = (dst_idx + n - 1) % bits;
>>> +	src_idx = (src_idx + n - 1) % bits;
>>> +
>>> +	shift = dst_idx-src_idx;
>>> +
>>> +	first = ~FB_SHIFT_HIGH(p, ~0UL, (dst_idx + 1) % bits);
>>> +	last = FB_SHIFT_HIGH(p, ~0UL, (bits + dst_idx + 1 - n) % bits);
>>> +
>>> +	if (!shift) {
>>> +		/* Same alignment for source and dest */
>>> +		if ((unsigned long)dst_idx+1 >= n) {
>>> +			/* Single word */
>>> +			if (first)
>>> +				last &= first;
>>> +			*dst = comp(*src, *dst, last);
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +
>>> +			/* Leading bits */
>>> +			if (first) {
>>> +				*dst = comp(*src, *dst, first);
>>> +				dst--;
>>> +				src--;
>>> +				n -= dst_idx+1;
>>> +			}
>>> +
>>> +			/* Main chunk */
>>> +			n /= bits;
>>> +			while (n >= 8) {
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				*dst-- = *src--;
>>> +				n -= 8;
>>> +			}
>>> +			while (n--)
>>> +				*dst-- = *src--;
>>> +			/* Trailing bits */
>>> +			if (last != -1UL)
>>> +				*dst = comp(*src, *dst, last);
>>> +		}
>>> +	} else {
>>> +		/* Different alignment for source and dest */
>>> +
>>> +		int const left = shift & (bits-1);
>>> +		int const right = -shift & (bits-1);
>>> +
>>> +		if ((unsigned long)dst_idx+1 >= n) {
>>> +			/* Single destination word */
>>> +			if (first)
>>> +				last &= first;
>>> +			if (shift < 0) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src >> right, *dst, last);
>>> +			} else if (1+(unsigned long)src_idx >= n) {
>>> +				/* Single source word */
>>> +				*dst = comp(*src << left, *dst, last);
>>> +			} else {
>>> +				/* 2 source words */
>>> +				*dst = comp(*src << left | *(src-1) >> right,
>>> +					    *dst, last);
>>> +			}
>>> +		} else {
>>> +			/* Multiple destination words */
>>> +			/** We must always remember the last value read,
>>> +			    because in case SRC and DST overlap bitwise (e.g.
>>> +			    when moving just one pixel in 1bpp), we always
>>> +			    collect one full long for DST and that might
>>> +			    overlap with the current long from SRC. We store
>>> +			    this value in 'd0'. */
>>> +			unsigned long d0, d1;
>>> +			int m;
>>> +
>>> +			d0 = *src--;
>>> +			/* Leading bits */
>>> +			if (shift < 0) {
>>> +				/* Single source word */
>>> +				d1 = d0;
>>> +				d0 >>= right;
>>> +			} else {
>>> +				/* 2 source words */
>>> +				d1 = *src--;
>>> +				d0 = d0 << left | d1 >> right;
>>> +			}
>>> +			if (!first)
>>> +				*dst = d0;
>>> +			else
>>> +				*dst = comp(d0, *dst, first);
>>> +			d0 = d1;
>>> +			dst--;
>>> +			n -= dst_idx+1;
>>> +
>>> +			/* Main chunk */
>>> +			m = n % bits;
>>> +			n /= bits;
>>> +			while (n >= 4) {
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +				n -= 4;
>>> +			}
>>> +			while (n--) {
>>> +				d1 = *src--;
>>> +				*dst-- = d0 << left | d1 >> right;
>>> +				d0 = d1;
>>> +			}
>>> +
>>> +			/* Trailing bits */
>>> +			if (m) {
>>> +				if (m <= bits - left) {
>>> +					/* Single source word */
>>> +					d0 <<= left;
>>> +				} else {
>>> +					/* 2 source words */
>>> +					d1 = *src;
>>> +					d0 = d0 << left | d1 >> right;
>>> +				}
>>> +				*dst = comp(d0, *dst, last);
>>> +			}
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Generic Bit Block Transfer for frame buffers located in system RAM with
>>> + * packed pixels of any depth.
>>> + */
>>> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area)
>>> +{
>>> +	u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
>>> +	u32 height = area->height, width = area->width;
>>> +	unsigned long const bits_per_line = p->fix.line_length*8u;
>>> +	unsigned long *base = NULL;
>>> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
>>> +	unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
>>> +
>>> +	if (p->state != FBINFO_STATE_RUNNING)
>>> +		return;
>>> +
>>> +	/* if the beginning of the target area might overlap with the end of
>>> +	the source area, we have to copy the area in reverse. */
>>> +	if ((dy == sy && dx > sx) || (dy > sy)) {
>>> +		dy += height;
>>> +		sy += height;
>>> +		rev_copy = 1;
>>> +	}
>>> +
>>> +	/* split the base of the framebuffer into a long-aligned address and
>>> +	   the index of the first bit */
>>> +	base = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
>>> +	dst_idx = src_idx = 8*((unsigned long)p->screen_base & (bytes-1));
>>> +	/* add offset of source and target area */
>>> +	dst_idx += dy*bits_per_line + dx*p->var.bits_per_pixel;
>>> +	src_idx += sy*bits_per_line + sx*p->var.bits_per_pixel;
>>> +
>>> +	if (p->fbops->fb_sync)
>>> +		p->fbops->fb_sync(p);
>>> +
>>> +	if (rev_copy) {
>>> +		while (height--) {
>>> +			dst_idx -= bits_per_line;
>>> +			src_idx -= bits_per_line;
>>> +			bitcpy_rev(p, base + (dst_idx / bits), dst_idx % bits,
>>> +				base + (src_idx / bits), src_idx % bits, bits,
>>> +				width*p->var.bits_per_pixel);
>>> +		}
>>> +	} else {
>>> +		while (height--) {
>>> +			bitcpy(p, base + (dst_idx / bits), dst_idx % bits,
>>> +				base + (src_idx / bits), src_idx % bits, bits,
>>> +				width*p->var.bits_per_pixel);
>>> +			dst_idx += bits_per_line;
>>> +			src_idx += bits_per_line;
>>> +		}
>>> +	}
>>> +}
>>> +EXPORT_SYMBOL(drm_copyarea);
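
Two small steps in drm_copyarea() are easy to check in isolation: the decision to copy backwards when the destination lies after the source in scan order, and the split of the framebuffer base into a long-aligned address plus a bit index. The following is only a sketch under the assumption that the word size in bytes is a power of two. Both function names are hypothetical and do not exist in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if the destination starts after the source in scan order. */
static int needs_reverse_copy(uint32_t dx, uint32_t dy,
			      uint32_t sx, uint32_t sy)
{
	return (dy == sy && dx > sx) || dy > sy;
}

/*
 * Split an address into a word-aligned base and the bit index of the
 * first byte within that word. 'bytes' is the word size in bytes.
 */
static uintptr_t split_base(uintptr_t addr, unsigned int bytes,
			    unsigned int *bit_idx)
{
	assert(bytes && (bytes & (bytes - 1)) == 0);	/* power of two */

	*bit_idx = 8 * (unsigned int)(addr & (bytes - 1));
	return addr & ~(uintptr_t)(bytes - 1);
}
```

With 8-byte words, an address ending in 0x3 yields a bit index of 24 into the aligned word. That corresponds to the starting value of dst_idx and src_idx before the area offsets are added.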
>>> +
>>> +/*
>>> + *  Aligned pattern fill using 32/64-bit memory accesses
>>> + */
>>> +static void
>>> +bitfill_aligned(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		unsigned long pat, unsigned n, int bits)
>>> +{
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(pat, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +
>>> +		/* Leading bits */
>>> +		if (first != ~0UL) {
>>> +			*dst = comp(pat, *dst, first);
>>> +			dst++;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 8) {
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			*dst++ = pat;
>>> +			n -= 8;
>>> +		}
>>> +		while (n--)
>>> +			*dst++ = pat;
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(pat, *dst, last);
>>> +	}
>>> +}
>>> +
>>> +
>>> +/*
>>> + *  Unaligned generic pattern fill using 32/64-bit memory accesses
>>> + *  The pattern must have been expanded to a full 32/64-bit value
>>> + *  Left/right are the appropriate shifts to convert to the pattern to be
>>> + *  used for the next 32/64-bit word
>>> + */
>>> +static void
>>> +bitfill_unaligned(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		  unsigned long pat, int left, int right, unsigned n, int bits)
>>> +{
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(pat, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +		/* Leading bits */
>>> +		if (first) {
>>> +			*dst = comp(pat, *dst, first);
>>> +			dst++;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 4) {
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= 4;
>>> +		}
>>> +		while (n--) {
>>> +			*dst++ = pat;
>>> +			pat = pat << left | pat >> right;
>>> +		}
>>> +
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(pat, *dst, last);
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + *  Aligned pattern invert using 32/64-bit memory accesses
>>> + */
>>> +static void
>>> +bitfill_aligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		    unsigned long pat, unsigned n, int bits)
>>> +{
>>> +	unsigned long val = pat;
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(*dst ^ val, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +		/* Leading bits */
>>> +		if (first != 0UL) {
>>> +			*dst = comp(*dst ^ val, *dst, first);
>>> +			dst++;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 8) {
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			*dst++ ^= val;
>>> +			n -= 8;
>>> +		}
>>> +		while (n--)
>>> +			*dst++ ^= val;
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(*dst ^ val, *dst, last);
>>> +	}
>>> +}
>>> +
>>> +
>>> +/*
>>> + *  Unaligned generic pattern invert using 32/64-bit memory accesses
>>> + *  The pattern must have been expanded to a full 32/64-bit value
>>> + *  Left/right are the appropriate shifts to convert to the pattern to be
>>> + *  used for the next 32/64-bit word
>>> + */
>>> +static void
>>> +bitfill_unaligned_rev(struct fb_info *p, unsigned long *dst, int dst_idx,
>>> +		      unsigned long pat, int left, int right, unsigned n,
>>> +		      int bits)
>>> +{
>>> +	unsigned long first, last;
>>> +
>>> +	if (!n)
>>> +		return;
>>> +
>>> +	first = FB_SHIFT_HIGH(p, ~0UL, dst_idx);
>>> +	last = ~(FB_SHIFT_HIGH(p, ~0UL, (dst_idx+n) % bits));
>>> +
>>> +	if (dst_idx+n <= bits) {
>>> +		/* Single word */
>>> +		if (last)
>>> +			first &= last;
>>> +		*dst = comp(*dst ^ pat, *dst, first);
>>> +	} else {
>>> +		/* Multiple destination words */
>>> +
>>> +		/* Leading bits */
>>> +		if (first != 0UL) {
>>> +			*dst = comp(*dst ^ pat, *dst, first);
>>> +			dst++;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= bits - dst_idx;
>>> +		}
>>> +
>>> +		/* Main chunk */
>>> +		n /= bits;
>>> +		while (n >= 4) {
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +			n -= 4;
>>> +		}
>>> +		while (n--) {
>>> +			*dst++ ^= pat;
>>> +			pat = pat << left | pat >> right;
>>> +		}
>>> +
>>> +		/* Trailing bits */
>>> +		if (last)
>>> +			*dst = comp(*dst ^ pat, *dst, last);
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Generic fillrect for frame buffers in system RAM with packed pixels of
>>> + * any depth.
>>> + */
>>> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect)
>>> +{
>>> +	unsigned long pat, pat2, fg;
>>> +	unsigned long width = rect->width, height = rect->height;
>>> +	int bits = BITS_PER_LONG, bytes = bits >> 3;
>>> +	u32 bpp = p->var.bits_per_pixel;
>>> +	unsigned long *dst;
>>> +	int dst_idx, left;
>>> +
>>> +	if (p->state != FBINFO_STATE_RUNNING)
>>> +		return;
>>> +
>>> +	if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
>>> +	    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
>>> +		fg = ((u32 *) (p->pseudo_palette))[rect->color];
>>> +	else
>>> +		fg = rect->color;
>>> +
>>> +	pat = pixel_to_pat( bpp, fg);
>>> +
>>> +	dst = (unsigned long *)((unsigned long)p->screen_base & ~(bytes-1));
>>> +	dst_idx = ((unsigned long)p->screen_base & (bytes - 1))*8;
>>> +	dst_idx += rect->dy*p->fix.line_length*8+rect->dx*bpp;
>>> +	/* FIXME For now we support 1-32 bpp only */
>>> +	left = bits % bpp;
>>> +	if (p->fbops->fb_sync)
>>> +		p->fbops->fb_sync(p);
>>> +	if (!left) {
>>> +		void (*fill_op32)(struct fb_info *p, unsigned long *dst,
>>> +				  int dst_idx, unsigned long pat, unsigned n,
>>> +				  int bits) = NULL;
>>> +
>>> +		switch (rect->rop) {
>>> +		case ROP_XOR:
>>> +			fill_op32 = bitfill_aligned_rev;
>>> +			break;
>>> +		case ROP_COPY:
>>> +			fill_op32 = bitfill_aligned;
>>> +			break;
>>> +		default:
>>> +			printk(KERN_ERR "drm_fillrect(): unknown rop, "
>>> +				"defaulting to ROP_COPY\n");
>>> +			fill_op32 = bitfill_aligned;
>>> +			break;
>>> +		}
>>> +		while (height--) {
>>> +			dst += dst_idx >> (ffs(bits) - 1);
>>> +			dst_idx &= (bits - 1);
>>> +			fill_op32(p, dst, dst_idx, pat, width*bpp, bits);
>>> +			dst_idx += p->fix.line_length*8;
>>> +		}
>>> +	} else {
>>> +		int right, r;
>>> +		void (*fill_op)(struct fb_info *p, unsigned long *dst,
>>> +				int dst_idx, unsigned long pat, int left,
>>> +				int right, unsigned n, int bits) = NULL;
>>> +#ifdef __LITTLE_ENDIAN
>>> +		right = left;
>>> +		left = bpp - right;
>>> +#else
>>> +		right = bpp - left;
>>> +#endif
>>> +		switch (rect->rop) {
>>> +		case ROP_XOR:
>>> +			fill_op = bitfill_unaligned_rev;
>>> +			break;
>>> +		case ROP_COPY:
>>> +			fill_op = bitfill_unaligned;
>>> +			break;
>>> +		default:
>>> +			printk(KERN_ERR "drm_fillrect(): unknown rop, "
>>> +				"defaulting to ROP_COPY\n");
>>> +			fill_op = bitfill_unaligned;
>>> +			break;
>>> +		}
>>> +		while (height--) {
>>> +			dst += dst_idx / bits;
>>> +			dst_idx &= (bits - 1);
>>> +			r = dst_idx % bpp;
>>> +			/* rotate pattern to the correct start position */
>>> +			pat2 = le_long_to_cpu(rolx(cpu_to_le_long(pat), r, bpp));
>>> +			fill_op(p, dst, dst_idx, pat2, left, right,
>>> +				width*bpp, bits);
>>> +			dst_idx += p->fix.line_length*8;
>>> +		}
>>> +	}
>>> +}
>>> +EXPORT_SYMBOL(drm_fillrect);
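
The fill helpers above all derive a first and a last mask from dst_idx and n and merge partial words with comp(). Here is a small sketch of the single-word case, assuming 32-bit words and little-endian bit order, so that FB_SHIFT_HIGH behaves like a plain left shift. comp() itself is not shown in this patch, so comp32() below is my reading of it (take bits of the first argument where the mask is set, bits of the second elsewhere). fill_single_word() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics of comp(): take 'a' where mask is set, else 'b'. */
static uint32_t comp32(uint32_t a, uint32_t b, uint32_t mask)
{
	return ((a ^ b) & mask) ^ b;
}

/* Fill bits [idx, idx + n) of one word with the pattern 'pat'. */
static uint32_t fill_single_word(uint32_t dst, uint32_t pat,
				 unsigned int idx, unsigned int n)
{
	uint32_t first, last;

	assert(n > 0 && idx < 32 && idx + n <= 32);

	first = ~UINT32_C(0) << idx;
	last = ~(~UINT32_C(0) << ((idx + n) % 32));
	if (last)	/* last == 0: the run ends on the word boundary */
		first &= last;

	return comp32(pat, dst, first);
}
```

Filling 8 bits at index 4 gives a combined mask of 0x00000ff0. When the run ends exactly on the word boundary, last is zero and only the first mask applies, which is why the quoted code tests last before combining the two.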
>>> +
>>> +static const u32 cfb_tab8_be[] = {
>>> +    0x00000000,0x000000ff,0x0000ff00,0x0000ffff,
>>> +    0x00ff0000,0x00ff00ff,0x00ffff00,0x00ffffff,
>>> +    0xff000000,0xff0000ff,0xff00ff00,0xff00ffff,
>>> +    0xffff0000,0xffff00ff,0xffffff00,0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab8_le[] = {
>>> +    0x00000000,0xff000000,0x00ff0000,0xffff0000,
>>> +    0x0000ff00,0xff00ff00,0x00ffff00,0xffffff00,
>>> +    0x000000ff,0xff0000ff,0x00ff00ff,0xffff00ff,
>>> +    0x0000ffff,0xff00ffff,0x00ffffff,0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab16_be[] = {
>>> +    0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab16_le[] = {
>>> +    0x00000000, 0xffff0000, 0x0000ffff, 0xffffffff
>>> +};
>>> +
>>> +static const u32 cfb_tab32[] = {
>>> +	0x00000000, 0xffffffff
>>> +};
>>> +
>>> +static void color_imageblit(const struct fb_image *image, struct fb_info *p,
>>> +			    void *dst1, u32 start_index, u32 pitch_index)
>>> +{
>>> +	/* Draw the penguin */
>>> +	u32 *dst, *dst2;
>>> +	u32 color = 0, val, shift;
>>> +	int i, n, bpp = p->var.bits_per_pixel;
>>> +	u32 null_bits = 32 - bpp;
>>> +	u32 *palette = (u32 *) p->pseudo_palette;
>>> +	const u8 *src = image->data;
>>> +
>>> +	dst2 = dst1;
>>> +	for (i = image->height; i--; ) {
>>> +		n = image->width;
>>> +		dst = dst1;
>>> +		shift = 0;
>>> +		val = 0;
>>> +
>>> +		if (start_index) {
>>> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
>>> +							 start_index));
>>> +			val = *dst & start_mask;
>>> +			shift = start_index;
>>> +		}
>>> +		while (n--) {
>>> +			if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
>>> +			    p->fix.visual == FB_VISUAL_DIRECTCOLOR )
>>> +				color = palette[*src];
>>> +			else
>>> +				color = *src;
>>> +			color <<= FB_LEFT_POS(p, bpp);
>>> +			val |= FB_SHIFT_HIGH(p, color, shift);
>>> +			if (shift >= null_bits) {
>>> +				*dst++ = val;
>>> +
>>> +				val = (shift == null_bits) ? 0 :
>>> +					FB_SHIFT_LOW(p, color, 32 - shift);
>>> +			}
>>> +			shift += bpp;
>>> +			shift &= (32 - 1);
>>> +			src++;
>>> +		}
>>> +		if (shift) {
>>> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
>>> +
>>> +			*dst &= end_mask;
>>> +			*dst |= val;
>>> +		}
>>> +		dst1 += p->fix.line_length;
>>> +		if (pitch_index) {
>>> +			dst2 += p->fix.line_length;
>>> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
>>> +
>>> +			start_index += pitch_index;
>>> +			start_index &= 32 - 1;
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +static void slow_imageblit(const struct fb_image *image, struct fb_info *p,
>>> +				  void *dst1, u32 fgcolor, u32 bgcolor,
>>> +				  u32 start_index, u32 pitch_index)
>>> +{
>>> +	u32 shift, color = 0, bpp = p->var.bits_per_pixel;
>>> +	u32 *dst, *dst2;
>>> +	u32 val, pitch = p->fix.line_length;
>>> +	u32 null_bits = 32 - bpp;
>>> +	u32 spitch = (image->width+7)/8;
>>> +	const u8 *src = image->data, *s;
>>> +	u32 i, j, l;
>>> +
>>> +	dst2 = dst1;
>>> +	fgcolor <<= FB_LEFT_POS(p, bpp);
>>> +	bgcolor <<= FB_LEFT_POS(p, bpp);
>>> +
>>> +	for (i = image->height; i--; ) {
>>> +		shift = val = 0;
>>> +		l = 8;
>>> +		j = image->width;
>>> +		dst = dst1;
>>> +		s = src;
>>> +
>>> +		/* write leading bits */
>>> +		if (start_index) {
>>> +			u32 start_mask = ~(FB_SHIFT_HIGH(p, ~(u32)0,
>>> +							 start_index));
>>> +			val = *dst & start_mask;
>>> +			shift = start_index;
>>> +		}
>>> +
>>> +		while (j--) {
>>> +			l--;
>>> +			color = (*s & (1 << l)) ? fgcolor : bgcolor;
>>> +			val |= FB_SHIFT_HIGH(p, color, shift);
>>> +
>>> +			/* Did the bitshift spill bits to the next long? */
>>> +			if (shift >= null_bits) {
>>> +				*dst++ = val;
>>> +				val = (shift == null_bits) ? 0 :
>>> +					FB_SHIFT_LOW(p, color, 32 - shift);
>>> +			}
>>> +			shift += bpp;
>>> +			shift &= (32 - 1);
>>> +			if (!l) { l = 8; s++; }
>>> +		}
>>> +
>>> +		/* write trailing bits */
>>> +		if (shift) {
>>> +			u32 end_mask = FB_SHIFT_HIGH(p, ~(u32)0, shift);
>>> +
>>> +			*dst &= end_mask;
>>> +			*dst |= val;
>>> +		}
>>> +
>>> +		dst1 += pitch;
>>> +		src += spitch;
>>> +		if (pitch_index) {
>>> +			dst2 += pitch;
>>> +			dst1 = (u8 *)((long)dst2 & ~(sizeof(u32) - 1));
>>> +			start_index += pitch_index;
>>> +			start_index &= 32 - 1;
>>> +		}
>>> +
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * fast_imageblit - optimized monochrome color expansion
>>> + *
>>> + * Only if:  bits_per_pixel == 8, 16, or 32
>>> + *           image->width is divisible by pixel/dword (ppw);
>>> + *           fix->line_length is divisible by 4;
>>> + *           beginning and end of a scanline is dword aligned
>>> + */
>>> +static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
>>> +				  void *dst1, u32 fgcolor, u32 bgcolor)
>>> +{
>>> +	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
>>> +	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
>>> +	u32 bit_mask, end_mask, eorx, shift;
>>> +	const char *s = image->data, *src;
>>> +	u32 *dst;
>>> +	const u32 *tab = NULL;
>>> +	int i, j, k;
>>> +
>>> +	switch (bpp) {
>>> +	case 8:
>>> +		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
>>> +		break;
>>> +	case 16:
>>> +		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
>>> +		break;
>>> +	case 32:
>>> +	default:
>>> +		tab = cfb_tab32;
>>> +		break;
>>> +	}
>>> +
>>> +	for (i = ppw-1; i--; ) {
>>> +		fgx <<= bpp;
>>> +		bgx <<= bpp;
>>> +		fgx |= fgcolor;
>>> +		bgx |= bgcolor;
>>> +	}
>>> +
>>> +	bit_mask = (1 << ppw) - 1;
>>> +	eorx = fgx ^ bgx;
>>> +	k = image->width/ppw;
>>> +
>>> +	for (i = image->height; i--; ) {
>>> +		dst = dst1;
>>> +		shift = 8;
>>> +		src = s;
>>> +
>>> +		for (j = k; j--; ) {
>>> +			shift -= ppw;
>>> +			end_mask = tab[(*src >> shift) & bit_mask];
>>> +			*dst++ = (end_mask & eorx) ^ bgx;
>>> +			if (!shift) {
>>> +				shift = 8;
>>> +				src++;
>>> +			}
>>> +		}
>>> +		dst1 += p->fix.line_length;
>>> +		s += spitch;
>>> +	}
>>> +}
>>> +
>>> +/*
>>> + * Generic 1-bit or 8-bit source to 1-32 bit destination expansion
>>> + * for frame buffer located in system RAM with packed pixels of any depth.
>>> + */
>>> +void drm_imageblit(struct fb_info *p, const struct fb_image *image)
>>> +{
>>> +	u32 fgcolor, bgcolor, start_index, bitstart, pitch_index = 0;
>>> +	u32 bpl = sizeof(u32), bpp = p->var.bits_per_pixel;
>>> +	u32 width = image->width;
>>> +	u32 dx = image->dx, dy = image->dy;
>>> +	void *dst1;
>>> +
>>> +	if (p->state != FBINFO_STATE_RUNNING)
>>> +		return;
>>> +
>>> +	bitstart = (dy * p->fix.line_length * 8) + (dx * bpp);
>>> +	start_index = bitstart & (32 - 1);
>>> +	pitch_index = (p->fix.line_length & (bpl - 1)) * 8;
>>> +
>>> +	bitstart /= 8;
>>> +	bitstart &= ~(bpl - 1);
>>> +	dst1 = (void __force *)p->screen_base + bitstart;
>>> +
>>> +	if (p->fbops->fb_sync)
>>> +		p->fbops->fb_sync(p);
>>> +
>>> +	if (image->depth == 1) {
>>> +		if (p->fix.visual == FB_VISUAL_TRUECOLOR ||
>>> +		    p->fix.visual == FB_VISUAL_DIRECTCOLOR) {
>>> +			fgcolor = ((u32*)(p->pseudo_palette))[image->fg_color];
>>> +			bgcolor = ((u32*)(p->pseudo_palette))[image->bg_color];
>>> +		} else {
>>> +			fgcolor = image->fg_color;
>>> +			bgcolor = image->bg_color;
>>> +		}
>>> +
>>> +		if (32 % bpp == 0 && !start_index && !pitch_index &&
>>> +		    ((width & (32/bpp-1)) == 0) &&
>>> +		    bpp >= 8 && bpp <= 32)
>>> +			fast_imageblit(image, p, dst1, fgcolor, bgcolor);
>>> +		else
>>> +			slow_imageblit(image, p, dst1, fgcolor, bgcolor,
>>> +					start_index, pitch_index);
>>> +	} else
>>> +		color_imageblit(image, p, dst1, start_index, pitch_index);
>>> +}
>>> +EXPORT_SYMBOL(drm_imageblit);
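
The fast path for monochrome images relies on two tricks: the foreground and background colors are replicated across a full 32-bit word, and a per-pixel mask from the lookup table then selects between them with a single XOR/AND. What follows is a sketch of both steps, assuming bpp is 8, 16 or 32 and that the color value fits into bpp bits. replicate_pixel() and expand_word() are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate one pixel value across a 32-bit word (bpp is 8, 16 or 32). */
static uint32_t replicate_pixel(uint32_t color, unsigned int bpp)
{
	unsigned int ppw, i;
	uint32_t word = color;

	assert(bpp == 8 || bpp == 16 || bpp == 32);
	ppw = 32 / bpp;			/* pixels per word */

	for (i = 1; i < ppw; i++)
		word = word << bpp | color;
	return word;
}

/* Foreground where mask bits are set, background elsewhere. */
static uint32_t expand_word(uint32_t mask, uint32_t fgx, uint32_t bgx)
{
	return (mask & (fgx ^ bgx)) ^ bgx;
}
```

In the quoted code the mask comes from the cfb_tab8/16/32 tables, indexed by the next ppw bits of the source bitmap. The sketch takes the mask as a parameter instead of reproducing the tables.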
>>> +
>>> +/*
>>> + * Generic file operations where framebuffer is in system RAM
>>> + */
>>> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos)
>>> +{
>>> +	unsigned long p = *ppos;
>>> +	void *src;
>>> +	int err = 0;
>>> +	unsigned long total_size;
>>> +
>>> +	if (info->state != FBINFO_STATE_RUNNING)
>>> +		return -EPERM;
>>> +
>>> +	total_size = info->screen_size;
>>> +
>>> +	if (total_size == 0)
>>> +		total_size = info->fix.smem_len;
>>> +
>>> +	if (p >= total_size)
>>> +		return 0;
>>> +
>>> +	if (count >= total_size)
>>> +		count = total_size;
>>> +
>>> +	if (count + p > total_size)
>>> +		count = total_size - p;
>>> +
>>> +	src = (void __force *)(info->screen_base + p);
>>> +
>>> +	if (info->fbops->fb_sync)
>>> +		info->fbops->fb_sync(info);
>>> +
>>> +	if (copy_to_user(buf, src, count))
>>> +		err = -EFAULT;
>>> +
>>> +	if  (!err)
>>> +		*ppos += count;
>>> +
>>> +	return (err) ? err : count;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_fb_read);
>>> +
>>> +/*
>>> + * Generic file operations where framebuffer is in system RAM
>>> + */
>>> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos)
>>> +{
>>> +	unsigned long p = *ppos;
>>> +	void *dst;
>>> +	int err = 0;
>>> +	unsigned long total_size;
>>> +
>>> +	if (info->state != FBINFO_STATE_RUNNING)
>>> +		return -EPERM;
>>> +
>>> +	total_size = info->screen_size;
>>> +
>>> +	if (total_size == 0)
>>> +		total_size = info->fix.smem_len;
>>> +
>>> +	if (p > total_size)
>>> +		return -EFBIG;
>>> +
>>> +	if (count > total_size) {
>>> +		err = -EFBIG;
>>> +		count = total_size;
>>> +	}
>>> +
>>> +	if (count + p > total_size) {
>>> +		if (!err)
>>> +			err = -ENOSPC;
>>> +
>>> +		count = total_size - p;
>>> +	}
>>> +
>>> +	dst = (void __force *) (info->screen_base + p);
>>> +
>>> +	if (info->fbops->fb_sync)
>>> +		info->fbops->fb_sync(info);
>>> +
>>> +	if (copy_from_user(dst, buf, count))
>>> +		err = -EFAULT;
>>> +
>>> +	if  (!err)
>>> +		*ppos += count;
>>> +
>>> +	return (err) ? err : count;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_fb_write);
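
The range checks in the write path can be modelled without any framebuffer. The sketch below mirrors the order of the checks as quoted above and reports both the error code and the number of bytes that would be copied. fb_write_clamp() is a hypothetical name, and the model ignores the copy_from_user() failure case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of the range checks in the write path: returns 0 or a
 * negative errno and stores the clamped byte count in *to_copy.
 */
static int fb_write_clamp(unsigned long pos, size_t count,
			  unsigned long total_size, size_t *to_copy)
{
	int err = 0;

	assert(to_copy != NULL);
	*to_copy = 0;

	if (pos > total_size)
		return -EFBIG;

	if (count > total_size) {
		err = -EFBIG;
		count = total_size;
	}

	if (count + pos > total_size) {
		if (!err)
			err = -ENOSPC;
		count = total_size - pos;
	}

	*to_copy = count;
	return err;
}
```

Note that the quoted function returns err rather than the short count whenever clamping happened, and it does not advance *ppos in that case, even though the clamped number of bytes has been copied.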
>>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>>> index da0d96a69570..18fba9818635 100644
>>> --- a/drivers/gpu/drm/drm_fb_helper.c
>>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>>> @@ -664,36 +664,36 @@ void drm_fb_helper_deferred_io(struct fb_info *info,
>>>  EXPORT_SYMBOL(drm_fb_helper_deferred_io);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_read - wrapper around fb_sys_read
>>> + * drm_fb_helper_sys_read - wrapper around drm_fb_read
>>>   * @info: fb_info struct pointer
>>>   * @buf: userspace buffer to read from framebuffer memory
>>>   * @count: number of bytes to read from framebuffer memory
>>>   * @ppos: read offset within framebuffer memory
>>>   *
>>> - * A wrapper around fb_sys_read implemented by fbdev core
>>> + * A wrapper around drm_fb_read
>>>   */
>>>  ssize_t drm_fb_helper_sys_read(struct fb_info *info, char __user *buf,
>>>  			       size_t count, loff_t *ppos)
>>>  {
>>> -	return fb_sys_read(info, buf, count, ppos);
>>> +	return drm_fb_read(info, buf, count, ppos);
>>>  }
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_read);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_write - wrapper around fb_sys_write
>>> + * drm_fb_helper_sys_write - wrapper around drm_fb_write
>>>   * @info: fb_info struct pointer
>>>   * @buf: userspace buffer to write to framebuffer memory
>>>   * @count: number of bytes to write to framebuffer memory
>>>   * @ppos: write offset within framebuffer memory
>>>   *
>>> - * A wrapper around fb_sys_write implemented by fbdev core
>>> + * A wrapper around drm_fb_write
>>>   */
>>>  ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>>>  				size_t count, loff_t *ppos)
>>>  {
>>>  	ssize_t ret;
>>>  
>>> -	ret = fb_sys_write(info, buf, count, ppos);
>>> +	ret = drm_fb_write(info, buf, count, ppos);
>>>  	if (ret > 0)
>>>  		drm_fb_helper_dirty(info, 0, 0, info->var.xres,
>>>  				    info->var.yres);
>>> @@ -703,48 +703,48 @@ ssize_t drm_fb_helper_sys_write(struct fb_info *info, const char __user *buf,
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_write);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_fillrect - wrapper around sys_fillrect
>>> + * drm_fb_helper_sys_fillrect - wrapper around drm_fillrect
>>>   * @info: fbdev registered by the helper
>>>   * @rect: info about rectangle to fill
>>>   *
>>> - * A wrapper around sys_fillrect implemented by fbdev core
>>> + * A wrapper around drm_fillrect
>>>   */
>>>  void drm_fb_helper_sys_fillrect(struct fb_info *info,
>>>  				const struct fb_fillrect *rect)
>>>  {
>>> -	sys_fillrect(info, rect);
>>> +	drm_fillrect(info, rect);
>>>  	drm_fb_helper_dirty(info, rect->dx, rect->dy,
>>>  			    rect->width, rect->height);
>>>  }
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_fillrect);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_copyarea - wrapper around sys_copyarea
>>> + * drm_fb_helper_sys_copyarea - wrapper around drm_copyarea
>>>   * @info: fbdev registered by the helper
>>>   * @area: info about area to copy
>>>   *
>>> - * A wrapper around sys_copyarea implemented by fbdev core
>>> + * A wrapper around drm_copyarea
>>>   */
>>>  void drm_fb_helper_sys_copyarea(struct fb_info *info,
>>>  				const struct fb_copyarea *area)
>>>  {
>>> -	sys_copyarea(info, area);
>>> +	drm_copyarea(info, area);
>>>  	drm_fb_helper_dirty(info, area->dx, area->dy,
>>>  			    area->width, area->height);
>>>  }
>>>  EXPORT_SYMBOL(drm_fb_helper_sys_copyarea);
>>>  
>>>  /**
>>> - * drm_fb_helper_sys_imageblit - wrapper around sys_imageblit
>>> + * drm_fb_helper_sys_imageblit - wrapper around drm_imageblit
>>>   * @info: fbdev registered by the helper
>>>   * @image: info about image to blit
>>>   *
>>> - * A wrapper around sys_imageblit implemented by fbdev core
>>> + * A wrapper around drm_imageblit
>>>   */
>>>  void drm_fb_helper_sys_imageblit(struct fb_info *info,
>>>  				 const struct fb_image *image)
>>>  {
>>> -	sys_imageblit(info, image);
>>> +	drm_imageblit(info, image);
>>>  	drm_fb_helper_dirty(info, image->dx, image->dy,
>>>  			    image->width, image->height);
>>>  }
>>> diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
>>> index 306aa3a60be9..d38005f6128a 100644
>>> --- a/include/drm/drm_fb_helper.h
>>> +++ b/include/drm/drm_fb_helper.h
>>> @@ -271,6 +271,12 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev);
>>>  
>>>  void drm_fbdev_generic_setup(struct drm_device *dev,
>>>  			     unsigned int preferred_bpp);
>>> +
>>> +void drm_copyarea(struct fb_info *p, const struct fb_copyarea *area);
>>> +void drm_fillrect(struct fb_info *p, const struct fb_fillrect *rect);
>>> +void drm_imageblit(struct fb_info *p, const struct fb_image *image);
>>> +ssize_t drm_fb_read(struct fb_info *info, char __user *buf, size_t count, loff_t *ppos);
>>> +ssize_t drm_fb_write(struct fb_info *info, const char __user *buf, size_t count, loff_t *ppos);
>>>  #else
>>>  static inline void drm_fb_helper_prepare(struct drm_device *dev,
>>>  					struct drm_fb_helper *helper,
>>>
>>
>> -- 
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Software Solutions Germany GmbH
>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>> (HRB 36809, AG Nürnberg)
>> Geschäftsführer: Felix Imendörffer
>>
> 
> 
> 
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



end of thread, other threads:[~2020-08-04  9:41 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-29 13:41 [RFC][PATCH 0/5] Support GEM object mappings from I/O memory Thomas Zimmermann
2020-07-29 13:41 ` Thomas Zimmermann
2020-07-29 13:41 ` [PATCH 1/5] fbdev: Remove trailing whitespace Thomas Zimmermann
2020-07-29 13:41   ` Thomas Zimmermann
2020-07-29 13:47   ` daniel
2020-07-29 13:47     ` daniel
2020-07-29 13:41 ` [PATCH 2/5] fbdev/core: Export framebuffer read and write code as cfb_ function Thomas Zimmermann
2020-07-29 13:41   ` Thomas Zimmermann
2020-07-29 13:53   ` daniel
2020-07-29 13:53     ` daniel
2020-07-29 16:36     ` Sam Ravnborg
2020-07-29 16:36       ` Sam Ravnborg
2020-07-31  9:20       ` daniel
2020-07-31  9:20         ` daniel
2020-08-02 20:01         ` Sam Ravnborg
2020-08-02 20:01           ` Sam Ravnborg
2020-08-03  6:46           ` Thomas Zimmermann
2020-08-03  6:46             ` Thomas Zimmermann
2020-08-04  9:27             ` daniel
2020-08-04  9:27               ` daniel
2020-08-04  9:41               ` Thomas Zimmermann
2020-08-04  9:41                 ` Thomas Zimmermann
2020-07-29 13:41 ` [PATCH 3/5] drm: Add infrastructure for vmap operations of I/O memory Thomas Zimmermann
2020-07-29 13:41   ` Thomas Zimmermann
2020-07-29 13:57   ` daniel
2020-07-29 13:57     ` daniel
2020-07-30  8:14     ` Thomas Zimmermann
2020-07-30  8:14       ` Thomas Zimmermann
2020-07-31  9:22       ` daniel
2020-07-31  9:22         ` daniel
2020-07-29 13:41 ` [PATCH 4/5] drm/fb_helper: Use I/O-memory mappings if available Thomas Zimmermann
2020-07-29 13:41   ` Thomas Zimmermann
2020-07-29 13:41 ` [PATCH 5/5] drm/vram_helper: Implement struct drm_gem_object_funcs.vmap_iomem Thomas Zimmermann
2020-07-29 13:41   ` Thomas Zimmermann
