All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/21] USB DisplayLink patches
@ 2018-06-03 14:40 ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Hi

Here I'm sending bug fixes and performance improvements for the USB
DisplayLink framebuffer and modesetting drivers for this merge window.

Mikulas


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 00/21] USB DisplayLink patches
@ 2018-06-03 14:40 ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Hi

Here I'm sending bug fixes and performance improvements for the USB
DisplayLink framebuffer and modesetting drivers for this merge window.

Mikulas

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 01/21] udl-kms: fix display corruption of the last line
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:40   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The displaylink hardware has such a peculiarity that it doesn't render a
command until next command is received. This produces occasional
corruption, such as when setting 22x11 font on the console, only the first
line of the cursor will be blinking if the cursor is located at some
specific columns.

When we end up with a repeating pixel, the driver has a bug that it leaves
one uninitialized byte after the command (and this byte is enough to flush
the command and render it - thus it fixes the screen corruption), however
whe we end up with a non-repeating pixel, there is no byte appended and
this results in temporary screen corruption.

This patch fixes the screen corruption by always appending a byte 0xAF at
the end of URB. It also removes the uninitialized byte.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_fb.c       |    5 ++++-
 drivers/gpu/drm/udl/udl_transfer.c |   11 +++++++----
 2 files changed, 11 insertions(+), 5 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:47:07.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:53:31.000000000 +0200
@@ -153,11 +153,11 @@ static void udl_compress_hline16(
 		raw_pixels_count_byte = cmd++; /*  we'll know this later */
 		raw_pixel_start = pixel;
 
-		cmd_pixel_end = pixel + (min(MAX_CMD_PIXELS + 1,
-			min((int)(pixel_end - pixel) / bpp,
-			    (int)(cmd_buffer_end - cmd) / 2))) * bpp;
+		cmd_pixel_end = pixel + min3(MAX_CMD_PIXELS + 1UL,
+					(unsigned long)(pixel_end - pixel) / bpp,
+					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) * bpp;
 
-		prefetch_range((void *) pixel, (cmd_pixel_end - pixel) * bpp);
+		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
 		pixel_val16 = get_pixel_val16(pixel, bpp);
 
 		while (pixel < cmd_pixel_end) {
@@ -193,6 +193,9 @@ static void udl_compress_hline16(
 		if (pixel > raw_pixel_start) {
 			/* finalize last RAW span */
 			*raw_pixels_count_byte = ((pixel-raw_pixel_start) / bpp) & 0xFF;
+		} else {
+			/* undo unused byte */
+			cmd--;
 		}
 
 		*cmd_pixels_count_byte = ((pixel - cmd_pixel_start) / bpp) & 0xFF;
Index: linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_fb.c	2018-05-31 14:47:07.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c	2018-05-31 14:53:32.000000000 +0200
@@ -137,7 +137,10 @@ int udl_handle_damage(struct udl_framebu
 
 	if (cmd > (char *) urb->transfer_buffer) {
 		/* Send partial buffer remaining before exiting */
-		int len = cmd - (char *) urb->transfer_buffer;
+		int len;
+		if (cmd < (char *) urb->transfer_buffer + urb->transfer_buffer_length)
+			*cmd++ = 0xAF;
+		len = cmd - (char *) urb->transfer_buffer;
 		ret = udl_submit_urb(dev, urb, len);
 		bytes_sent += len;
 	} else


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 01/21] udl-kms: fix display corruption of the last line
@ 2018-06-03 14:40   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-append-zero-4.patch --]
[-- Type: text/plain, Size: 3186 bytes --]

The displaylink hardware has such a peculiarity that it doesn't render a
command until next command is received. This produces occasional
corruption, such as when setting 22x11 font on the console, only the first
line of the cursor will be blinking if the cursor is located at some
specific columns.

When we end up with a repeating pixel, the driver has a bug that it leaves
one uninitialized byte after the command (and this byte is enough to flush
the command and render it - thus it fixes the screen corruption), however
whe we end up with a non-repeating pixel, there is no byte appended and
this results in temporary screen corruption.

This patch fixes the screen corruption by always appending a byte 0xAF at
the end of URB. It also removes the uninitialized byte.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_fb.c       |    5 ++++-
 drivers/gpu/drm/udl/udl_transfer.c |   11 +++++++----
 2 files changed, 11 insertions(+), 5 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:47:07.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:53:31.000000000 +0200
@@ -153,11 +153,11 @@ static void udl_compress_hline16(
 		raw_pixels_count_byte = cmd++; /*  we'll know this later */
 		raw_pixel_start = pixel;
 
-		cmd_pixel_end = pixel + (min(MAX_CMD_PIXELS + 1,
-			min((int)(pixel_end - pixel) / bpp,
-			    (int)(cmd_buffer_end - cmd) / 2))) * bpp;
+		cmd_pixel_end = pixel + min3(MAX_CMD_PIXELS + 1UL,
+					(unsigned long)(pixel_end - pixel) / bpp,
+					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) * bpp;
 
-		prefetch_range((void *) pixel, (cmd_pixel_end - pixel) * bpp);
+		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
 		pixel_val16 = get_pixel_val16(pixel, bpp);
 
 		while (pixel < cmd_pixel_end) {
@@ -193,6 +193,9 @@ static void udl_compress_hline16(
 		if (pixel > raw_pixel_start) {
 			/* finalize last RAW span */
 			*raw_pixels_count_byte = ((pixel-raw_pixel_start) / bpp) & 0xFF;
+		} else {
+			/* undo unused byte */
+			cmd--;
 		}
 
 		*cmd_pixels_count_byte = ((pixel - cmd_pixel_start) / bpp) & 0xFF;
Index: linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_fb.c	2018-05-31 14:47:07.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c	2018-05-31 14:53:32.000000000 +0200
@@ -137,7 +137,10 @@ int udl_handle_damage(struct udl_framebu
 
 	if (cmd > (char *) urb->transfer_buffer) {
 		/* Send partial buffer remaining before exiting */
-		int len = cmd - (char *) urb->transfer_buffer;
+		int len;
+		if (cmd < (char *) urb->transfer_buffer + urb->transfer_buffer_length)
+			*cmd++ = 0xAF;
+		len = cmd - (char *) urb->transfer_buffer;
 		ret = udl_submit_urb(dev, urb, len);
 		bytes_sent += len;
 	} else

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 02/21] udl-kms: change down_interruptible to down
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:40   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

If we leave urbs around, it causes not only leak, but also memory
corruption. This patch fixes the function udl_free_urb_list, so that it
always waits for all urbs that are in progress.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_main.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_main.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_main.c	2018-05-31 10:23:42.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_main.c	2018-05-31 10:28:11.000000000 +0200
@@ -170,18 +170,13 @@ static void udl_free_urb_list(struct drm
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	int ret;
 	unsigned long flags;
 
 	DRM_DEBUG("Waiting for completes and freeing all render urbs\n");
 
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
-
-		/* Getting interrupted means a leak, but ok at shutdown*/
-		ret = down_interruptible(&udl->urbs.limit_sem);
-		if (ret)
-			break;
+		down(&udl->urbs.limit_sem);
 
 		spin_lock_irqsave(&udl->urbs.lock, flags);
 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 02/21] udl-kms: change down_interruptible to down
@ 2018-06-03 14:40   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-down-interruptible.patch --]
[-- Type: text/plain, Size: 1360 bytes --]

If we leave urbs around, it causes not only leak, but also memory
corruption. This patch fixes the function udl_free_urb_list, so that it
always waits for all urbs that are in progress.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_main.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_main.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_main.c	2018-05-31 10:23:42.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_main.c	2018-05-31 10:28:11.000000000 +0200
@@ -170,18 +170,13 @@ static void udl_free_urb_list(struct drm
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	int ret;
 	unsigned long flags;
 
 	DRM_DEBUG("Waiting for completes and freeing all render urbs\n");
 
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
-
-		/* Getting interrupted means a leak, but ok at shutdown*/
-		ret = down_interruptible(&udl->urbs.limit_sem);
-		if (ret)
-			break;
+		down(&udl->urbs.limit_sem);
 
 		spin_lock_irqsave(&udl->urbs.lock, flags);
 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 03/21] udl-kms: handle allocation failure
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:40   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Allocations larger than PAGE_ALLOC_COSTLY_ORDER are unreliable and they
may fail anytime. This patch fixes the udl kms driver so that when a large
alloactions fails, it tries to do multiple smaller allocations.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_main.c |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_main.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:16:15.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:16:15.000000000 +0200
@@ -200,17 +200,22 @@ static void udl_free_urb_list(struct drm
 static int udl_alloc_urb_list(struct drm_device *dev, int count, size_t size)
 {
 	struct udl_device *udl = dev->dev_private;
-	int i = 0;
 	struct urb *urb;
 	struct urb_node *unode;
 	char *buf;
+	size_t wanted_size = count * size;
 
 	spin_lock_init(&udl->urbs.lock);
 
+retry:
 	udl->urbs.size = size;
 	INIT_LIST_HEAD(&udl->urbs.list);
 
-	while (i < count) {
+	sema_init(&udl->urbs.limit_sem, 0);
+	udl->urbs.count = 0;
+	udl->urbs.available = 0;
+
+	while (udl->urbs.count * size < wanted_size) {
 		unode = kzalloc(sizeof(struct urb_node), GFP_KERNEL);
 		if (!unode)
 			break;
@@ -226,11 +231,16 @@ static int udl_alloc_urb_list(struct drm
 		}
 		unode->urb = urb;
 
-		buf = usb_alloc_coherent(udl->udev, MAX_TRANSFER, GFP_KERNEL,
+		buf = usb_alloc_coherent(udl->udev, size, GFP_KERNEL,
 					 &urb->transfer_dma);
 		if (!buf) {
 			kfree(unode);
 			usb_free_urb(urb);
+			if (size > PAGE_SIZE) {
+				size /= 2;
+				udl_free_urb_list(dev);
+				goto retry;
+			}
 			break;
 		}
 
@@ -241,16 +251,14 @@ static int udl_alloc_urb_list(struct drm
 
 		list_add_tail(&unode->entry, &udl->urbs.list);
 
-		i++;
+		up(&udl->urbs.limit_sem);
+		udl->urbs.count++;
+		udl->urbs.available++;
 	}
 
-	sema_init(&udl->urbs.limit_sem, i);
-	udl->urbs.count = i;
-	udl->urbs.available = i;
-
-	DRM_DEBUG("allocated %d %d byte urbs\n", i, (int) size);
+	DRM_DEBUG("allocated %d %d byte urbs\n", udl->urbs.count, (int) size);
 
-	return i;
+	return udl->urbs.count;
 }
 
 struct urb *udl_get_urb(struct drm_device *dev)


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 03/21] udl-kms: handle allocation failure
@ 2018-06-03 14:40   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-allocation-failure.patch --]
[-- Type: text/plain, Size: 2459 bytes --]

Allocations larger than PAGE_ALLOC_COSTLY_ORDER are unreliable and they
may fail anytime. This patch fixes the udl kms driver so that when a large
alloactions fails, it tries to do multiple smaller allocations.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_main.c |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_main.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:16:15.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:16:15.000000000 +0200
@@ -200,17 +200,22 @@ static void udl_free_urb_list(struct drm
 static int udl_alloc_urb_list(struct drm_device *dev, int count, size_t size)
 {
 	struct udl_device *udl = dev->dev_private;
-	int i = 0;
 	struct urb *urb;
 	struct urb_node *unode;
 	char *buf;
+	size_t wanted_size = count * size;
 
 	spin_lock_init(&udl->urbs.lock);
 
+retry:
 	udl->urbs.size = size;
 	INIT_LIST_HEAD(&udl->urbs.list);
 
-	while (i < count) {
+	sema_init(&udl->urbs.limit_sem, 0);
+	udl->urbs.count = 0;
+	udl->urbs.available = 0;
+
+	while (udl->urbs.count * size < wanted_size) {
 		unode = kzalloc(sizeof(struct urb_node), GFP_KERNEL);
 		if (!unode)
 			break;
@@ -226,11 +231,16 @@ static int udl_alloc_urb_list(struct drm
 		}
 		unode->urb = urb;
 
-		buf = usb_alloc_coherent(udl->udev, MAX_TRANSFER, GFP_KERNEL,
+		buf = usb_alloc_coherent(udl->udev, size, GFP_KERNEL,
 					 &urb->transfer_dma);
 		if (!buf) {
 			kfree(unode);
 			usb_free_urb(urb);
+			if (size > PAGE_SIZE) {
+				size /= 2;
+				udl_free_urb_list(dev);
+				goto retry;
+			}
 			break;
 		}
 
@@ -241,16 +251,14 @@ static int udl_alloc_urb_list(struct drm
 
 		list_add_tail(&unode->entry, &udl->urbs.list);
 
-		i++;
+		up(&udl->urbs.limit_sem);
+		udl->urbs.count++;
+		udl->urbs.available++;
 	}
 
-	sema_init(&udl->urbs.limit_sem, i);
-	udl->urbs.count = i;
-	udl->urbs.available = i;
-
-	DRM_DEBUG("allocated %d %d byte urbs\n", i, (int) size);
+	DRM_DEBUG("allocated %d %d byte urbs\n", udl->urbs.count, (int) size);
 
-	return i;
+	return udl->urbs.count;
 }
 
 struct urb *udl_get_urb(struct drm_device *dev)

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 04/21] udl-kms: fix crash due to uninitialized memory
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:40   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

We must use kzalloc when allocating the fb_deferred_io structure.
Otherwise, the field first_io is undefined and it causes a crash.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_fb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_fb.c	2018-05-29 17:55:39.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c	2018-05-29 17:55:39.000000000 +0200
@@ -221,7 +221,7 @@ static int udl_fb_open(struct fb_info *i
 
 		struct fb_deferred_io *fbdefio;
 
-		fbdefio = kmalloc(sizeof(struct fb_deferred_io), GFP_KERNEL);
+		fbdefio = kzalloc(sizeof(struct fb_deferred_io), GFP_KERNEL);
 
 		if (fbdefio) {
 			fbdefio->delay = DL_DEFIO_WRITE_DELAY;


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 04/21] udl-kms: fix crash due to uninitialized memory
@ 2018-06-03 14:40   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-defio-kzalloc.patch --]
[-- Type: text/plain, Size: 1049 bytes --]

We must use kzalloc when allocating the fb_deferred_io structure.
Otherwise, the field first_io is undefined and it causes a crash.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_fb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_fb.c	2018-05-29 17:55:39.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_fb.c	2018-05-29 17:55:39.000000000 +0200
@@ -221,7 +221,7 @@ static int udl_fb_open(struct fb_info *i
 
 		struct fb_deferred_io *fbdefio;
 
-		fbdefio = kmalloc(sizeof(struct fb_deferred_io), GFP_KERNEL);
+		fbdefio = kzalloc(sizeof(struct fb_deferred_io), GFP_KERNEL);
 
 		if (fbdefio) {
 			fbdefio->delay = DL_DEFIO_WRITE_DELAY;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 05/21] udl-kms: fix a linked-list corruption when using fbdefio
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:40   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The udl driver crashes when fbdefio is used. The crash can be reproduced
with this sequence:
1. echo 1 >/sys/module/udl/parameters/fb_defio
2. run some program that maps the framebuffer, such as 'links -g' or 'fbi'
3. allocate memory to the point where the machine starts swapping

The reason for the crash is that udl_gem_get_pages calls drm_gem_get_pages
and drm_gem_get_pages allocates the pages using shmem_read_mapping_page.
The shmem pages are kept on the memory management lists using the
page->lru entry.

However, fbdefio reuses the page->lru entry for the list of pages that
were modified, so the memory management lists are corrupted and the
machine crashes when vmscan starts to scan memory.

I fixed this crash by allocating pages with "alloc_page" instead. The
pages allocated with "alloc_page" have page->lru unused, and thus the
system doesn't crash when fbdefio uses it.

Unable to handle kernel paging request at virtual address dead000000000200
Mem abort info:
  ESR = 0x96000044
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000044
  CM = 0, WnR = 1
[dead000000000200] address between user and kernel address ranges
Internal error: Oops: 96000044 [#1] PREEMPT SMP
Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet autofs4 udl drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb font drm drm_panel_orientation_quirks mousedev hid_generic usbhid hid binfmt_misc snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_pcm snd_timer snd soundcore ipv6 aes_ce_blk crypto_simd cryptd aes_ce_cipher crc32_ce ghash_ce gf128mul aes_arm64 sha2_ce sha256_arm64 sha1_ce sha1_generic xhci_plat_hcd xhci_hcd sd_mod usbcore usb_common mvpp2 unix
CPU: 0 PID: 39 Comm: kswapd0 Not tainted 4.16.12 #3
Hardware name: Marvell 8040 MACHIATOBin (DT)
pstate: 00000085 (nzcv daIf -PAN -UAO)
pc : isolate_lru_pages.isra.16+0x23c/0x2b0
lr : isolate_lru_pages.isra.16+0x104/0x2b0
sp : ffffffc13a897ac0
x29: ffffffc13a897ac0 x28: 0000000000000003
x27: 0000000000000003 x26: 0000000000000004
x25: ffffff80087e84a0 x24: ffffffc13a897b68
x23: ffffffbf04cefc60 x22: ffffffc13a897e44
x21: 0000000000000009 x20: ffffffc13a897c00
x19: ffffffc13a897b40 x18: ffffffbf04d39000
x17: 00000000fffffff8 x16: ffffffbf00000000
x15: 0000000000000006 x14: 0000000000000000
x13: 00000000000001aa x12: 400000000004001c
x11: 0000000000000001 x10: 0000000000000001
x9 : 00000000000ee3ac x8 : 00000000000004a0
x7 : ffffff80087e84a0 x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000002
x3 : ffffff80087e84a0 x2 : 0000000000000200
x1 : dead000000000200 x0 : 400000000004001c
Process kswapd0 (pid: 39, stack limit = 0x0000000097f25571)
Call trace:
 isolate_lru_pages.isra.16+0x23c/0x2b0
 shrink_inactive_list+0xe4/0x3b0
 shrink_node_memcg.constprop.19+0x374/0x630
 shrink_node+0x64/0x1c8
 kswapd+0x340/0x568
 kthread+0x118/0x120
 ret_from_fork+0x10/0x18
Code: d2804002 f85e02e0 f85e02ec f9000461 (f9000023)
---[ end trace f9f3ad3856cb2ef3 ]---
note: kswapd0[39] exited with preempt_count 1

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_gem.c |   31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_gem.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_gem.c	2018-01-10 09:31:23.000000000 +0100
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_gem.c	2018-05-29 17:46:10.000000000 +0200
@@ -130,28 +130,51 @@ int udl_gem_fault(struct vm_fault *vmf)
 int udl_gem_get_pages(struct udl_gem_object *obj)
 {
 	struct page **pages;
+	int npages, i;
 
 	if (obj->pages)
 		return 0;
 
-	pages = drm_gem_get_pages(&obj->base);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
+	npages = obj->base.size >> PAGE_SHIFT;
+
+	pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
+	for (i = 0; i < npages; i++) {
+		struct page *p = alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!p)
+			goto fail;
+		pages[i] = p;
+	}
 
 	obj->pages = pages;
 
 	return 0;
+
+fail:
+	while (i--)
+		put_page(pages[i]);
+	kvfree(pages);
+	return -ENOMEM;
 }
 
 void udl_gem_put_pages(struct udl_gem_object *obj)
 {
+	int npages, i;
+
 	if (obj->base.import_attach) {
 		kvfree(obj->pages);
 		obj->pages = NULL;
 		return;
 	}
 
-	drm_gem_put_pages(&obj->base, obj->pages, false, false);
+	npages = obj->base.size >> PAGE_SHIFT;
+
+	for (i = 0; i < npages; i++)
+		put_page(obj->pages[i]);
+
+	kvfree(obj->pages);
 	obj->pages = NULL;
 }
 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 05/21] udl-kms: fix a linked-list corruption when using fbdefio
@ 2018-06-03 14:40   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-alloc-pages.patch --]
[-- Type: text/plain, Size: 4866 bytes --]

The udl driver crashes when fbdefio is used. The crash can be reproduced
with this sequence:
1. echo 1 >/sys/module/udl/parameters/fb_defio
2. run some program that maps the framebuffer, such as 'links -g' or 'fbi'
3. allocate memory to the point where the machine starts swapping

The reason for the crash is that udl_gem_get_pages calls drm_gem_get_pages
and drm_gem_get_pages allocates the pages using shmem_read_mapping_page.
The shmem pages are kept on the memory management lists using the
page->lru entry.

However, fbdefio reuses the page->lru entry for the list of pages that
were modified, so the memory management lists are corrupted and the
machine crashes when vmscan starts to scan memory.

I fixed this crash by allocating pages with "alloc_page" instead. The
pages allocated with "alloc_page" have page->lru unused, and thus the
system doesn't crash when fbdefio uses it.

Unable to handle kernel paging request at virtual address dead000000000200
Mem abort info:
  ESR = 0x96000044
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000044
  CM = 0, WnR = 1
[dead000000000200] address between user and kernel address ranges
Internal error: Oops: 96000044 [#1] PREEMPT SMP
Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet autofs4 udl drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb font drm drm_panel_orientation_quirks mousedev hid_generic usbhid hid binfmt_misc snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_pcm snd_timer snd soundcore ipv6 aes_ce_blk crypto_simd cryptd aes_ce_cipher crc32_ce ghash_ce gf128mul aes_arm64 sha2_ce sha256_arm64 sha1_ce sha1_generic xhci_plat_hcd xhci_hcd sd_mod usbcore usb_common mvpp2 unix
CPU: 0 PID: 39 Comm: kswapd0 Not tainted 4.16.12 #3
Hardware name: Marvell 8040 MACHIATOBin (DT)
pstate: 00000085 (nzcv daIf -PAN -UAO)
pc : isolate_lru_pages.isra.16+0x23c/0x2b0
lr : isolate_lru_pages.isra.16+0x104/0x2b0
sp : ffffffc13a897ac0
x29: ffffffc13a897ac0 x28: 0000000000000003
x27: 0000000000000003 x26: 0000000000000004
x25: ffffff80087e84a0 x24: ffffffc13a897b68
x23: ffffffbf04cefc60 x22: ffffffc13a897e44
x21: 0000000000000009 x20: ffffffc13a897c00
x19: ffffffc13a897b40 x18: ffffffbf04d39000
x17: 00000000fffffff8 x16: ffffffbf00000000
x15: 0000000000000006 x14: 0000000000000000
x13: 00000000000001aa x12: 400000000004001c
x11: 0000000000000001 x10: 0000000000000001
x9 : 00000000000ee3ac x8 : 00000000000004a0
x7 : ffffff80087e84a0 x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000002
x3 : ffffff80087e84a0 x2 : 0000000000000200
x1 : dead000000000200 x0 : 400000000004001c
Process kswapd0 (pid: 39, stack limit = 0x0000000097f25571)
Call trace:
 isolate_lru_pages.isra.16+0x23c/0x2b0
 shrink_inactive_list+0xe4/0x3b0
 shrink_node_memcg.constprop.19+0x374/0x630
 shrink_node+0x64/0x1c8
 kswapd+0x340/0x568
 kthread+0x118/0x120
 ret_from_fork+0x10/0x18
Code: d2804002 f85e02e0 f85e02ec f9000461 (f9000023)
---[ end trace f9f3ad3856cb2ef3 ]---
note: kswapd0[39] exited with preempt_count 1

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_gem.c |   31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_gem.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_gem.c	2018-01-10 09:31:23.000000000 +0100
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_gem.c	2018-05-29 17:46:10.000000000 +0200
@@ -130,28 +130,51 @@ int udl_gem_fault(struct vm_fault *vmf)
 int udl_gem_get_pages(struct udl_gem_object *obj)
 {
 	struct page **pages;
+	int npages, i;
 
 	if (obj->pages)
 		return 0;
 
-	pages = drm_gem_get_pages(&obj->base);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
+	npages = obj->base.size >> PAGE_SHIFT;
+
+	pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
+	for (i = 0; i < npages; i++) {
+		struct page *p = alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!p)
+			goto fail;
+		pages[i] = p;
+	}
 
 	obj->pages = pages;
 
 	return 0;
+
+fail:
+	while (i--)
+		put_page(pages[i]);
+	kvfree(pages);
+	return -ENOMEM;
 }
 
 void udl_gem_put_pages(struct udl_gem_object *obj)
 {
+	int npages, i;
+
 	if (obj->base.import_attach) {
 		kvfree(obj->pages);
 		obj->pages = NULL;
 		return;
 	}
 
-	drm_gem_put_pages(&obj->base, obj->pages, false, false);
+	npages = obj->base.size >> PAGE_SHIFT;
+
+	for (i = 0; i < npages; i++)
+		put_page(obj->pages[i]);
+
+	kvfree(obj->pages);
 	obj->pages = NULL;
 }
 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 06/21] udl-kms: make a local copy of fb_ops
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:40   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The defio subsystem overwrites the method fb_osp->mmap. That method is
stored in module's static data - and that means that if we have multiple
diplaylink adapters, they will over write each other's method.

In order to avoid interference between multiple adapters, we copy the
fb_ops structure to a device-local memory.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_fb.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c
=================================--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:05:20.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:08:03.000000000 +0200
@@ -34,6 +34,7 @@ module_param(fb_defio, int, S_IWUSR | S_
 struct udl_fbdev {
 	struct drm_fb_helper helper;
 	struct udl_framebuffer ufb;
+	struct fb_ops fb_ops;
 	int fb_count;
 };
 
@@ -405,7 +406,8 @@ static int udlfb_create(struct drm_fb_he
 	info->fix.smem_len = size;
 	info->fix.smem_start = (unsigned long)ufbdev->ufb.obj->vmapping;
 
-	info->fbops = &udlfb_ops;
+	ufbdev->fb_ops = udlfb_ops;
+	info->fbops = &ufbdev->fb_ops;
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
 	drm_fb_helper_fill_var(info, &ufbdev->helper, sizes->fb_width, sizes->fb_height);
 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 06/21] udl-kms: make a local copy of fb_ops
@ 2018-06-03 14:40   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:40 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-copy-fbops.patch --]
[-- Type: text/plain, Size: 1540 bytes --]

The defio subsystem overwrites the method fb_osp->mmap. That method is
stored in module's static data - and that means that if we have multiple
diplaylink adapters, they will over write each other's method.

In order to avoid interference between multiple adapters, we copy the
fb_ops structure to a device-local memory.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_fb.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:05:20.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:08:03.000000000 +0200
@@ -34,6 +34,7 @@ module_param(fb_defio, int, S_IWUSR | S_
 struct udl_fbdev {
 	struct drm_fb_helper helper;
 	struct udl_framebuffer ufb;
+	struct fb_ops fb_ops;
 	int fb_count;
 };
 
@@ -405,7 +406,8 @@ static int udlfb_create(struct drm_fb_he
 	info->fix.smem_len = size;
 	info->fix.smem_start = (unsigned long)ufbdev->ufb.obj->vmapping;
 
-	info->fbops = &udlfb_ops;
+	ufbdev->fb_ops = udlfb_ops;
+	info->fbops = &ufbdev->fb_ops;
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
 	drm_fb_helper_fill_var(info, &ufbdev->helper, sizes->fb_width, sizes->fb_height);
 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 07/21] udl-kms: avoid division
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Division is slow, so it shouldn't be done by the pixel generating code.
The driver supports only 2 or 4 bytes per pixel, so we can replace
division with a shift.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_drv.h      |    2 -
 drivers/gpu/drm/udl/udl_fb.c       |   15 ++++++++------
 drivers/gpu/drm/udl/udl_transfer.c |   39 ++++++++++++++++++-------------------
 3 files changed, 30 insertions(+), 26 deletions(-)

Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_drv.h
=================================--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_drv.h	2018-06-03 13:15:01.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_drv.h	2018-06-03 13:15:01.000000000 +0200
@@ -110,7 +110,7 @@ udl_fb_user_fb_create(struct drm_device
 		      struct drm_file *file,
 		      const struct drm_mode_fb_cmd2 *mode_cmd);
 
-int udl_render_hline(struct drm_device *dev, int bpp, struct urb **urb_ptr,
+int udl_render_hline(struct drm_device *dev, int log_bpp, struct urb **urb_ptr,
 		     const char *front, char **urb_buf_ptr,
 		     u32 byte_offset, u32 device_byte_offset, u32 byte_width,
 		     int *ident_ptr, int *sent_ptr);
Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c
=================================--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:15:01.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:15:01.000000000 +0200
@@ -91,7 +91,10 @@ int udl_handle_damage(struct udl_framebu
 	int bytes_identical = 0;
 	struct urb *urb;
 	int aligned_x;
-	int bpp = fb->base.format->cpp[0];
+	int log_bpp;
+
+	BUG_ON(!is_power_of_2(fb->base.format->cpp[0]));
+	log_bpp = __ffs(fb->base.format->cpp[0]);
 
 	if (!fb->active_16)
 		return 0;
@@ -126,12 +129,12 @@ int udl_handle_damage(struct udl_framebu
 
 	for (i = y; i < y + height ; i++) {
 		const int line_offset = fb->base.pitches[0] * i;
-		const int byte_offset = line_offset + (x * bpp);
-		const int dev_byte_offset = (fb->base.width * bpp * i) + (x * bpp);
-		if (udl_render_hline(dev, bpp, &urb,
+		const int byte_offset = line_offset + (x << log_bpp);
+		const int dev_byte_offset = (fb->base.width * i + x) << log_bpp;
+		if (udl_render_hline(dev, log_bpp, &urb,
 				     (char *) fb->obj->vmapping,
 				     &cmd, byte_offset, dev_byte_offset,
-				     width * bpp,
+				     width << log_bpp,
 				     &bytes_identical, &bytes_sent))
 			goto error;
 	}
@@ -150,7 +153,7 @@ int udl_handle_damage(struct udl_framebu
 error:
 	atomic_add(bytes_sent, &udl->bytes_sent);
 	atomic_add(bytes_identical, &udl->bytes_identical);
-	atomic_add(width*height*bpp, &udl->bytes_rendered);
+	atomic_add((width * height) << log_bpp, &udl->bytes_rendered);
 	end_cycles = get_cycles();
 	atomic_add(((unsigned int) ((end_cycles - start_cycles)
 		    >> 10)), /* Kcycles */
Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_transfer.c
=================================--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-06-03 13:15:01.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_transfer.c	2018-06-03 13:15:01.000000000 +0200
@@ -83,12 +83,12 @@ static inline u16 pixel32_to_be16(const
 		((pixel >> 8) & 0xf800));
 }
 
-static inline u16 get_pixel_val16(const uint8_t *pixel, int bpp)
+static inline u16 get_pixel_val16(const uint8_t *pixel, int log_bpp)
 {
-	u16 pixel_val16 = 0;
-	if (bpp = 2)
+	u16 pixel_val16;
+	if (log_bpp = 1)
 		pixel_val16 = *(const uint16_t *)pixel;
-	else if (bpp = 4)
+	else
 		pixel_val16 = pixel32_to_be16(*(const uint32_t *)pixel);
 	return pixel_val16;
 }
@@ -125,8 +125,9 @@ static void udl_compress_hline16(
 	const u8 *const pixel_end,
 	uint32_t *device_address_ptr,
 	uint8_t **command_buffer_ptr,
-	const uint8_t *const cmd_buffer_end, int bpp)
+	const uint8_t *const cmd_buffer_end, int log_bpp)
 {
+	const int bpp = 1 << log_bpp;
 	const u8 *pixel = *pixel_start_ptr;
 	uint32_t dev_addr  = *device_address_ptr;
 	uint8_t *cmd = *command_buffer_ptr;
@@ -153,12 +154,12 @@ static void udl_compress_hline16(
 		raw_pixels_count_byte = cmd++; /*  we'll know this later */
 		raw_pixel_start = pixel;
 
-		cmd_pixel_end = pixel + min3(MAX_CMD_PIXELS + 1UL,
-					(unsigned long)(pixel_end - pixel) / bpp,
-					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) * bpp;
+		cmd_pixel_end = pixel + (min3(MAX_CMD_PIXELS + 1UL,
+					(unsigned long)(pixel_end - pixel) >> log_bpp,
+					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
 
 		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
-		pixel_val16 = get_pixel_val16(pixel, bpp);
+		pixel_val16 = get_pixel_val16(pixel, log_bpp);
 
 		while (pixel < cmd_pixel_end) {
 			const u8 *const start = pixel;
@@ -170,7 +171,7 @@ static void udl_compress_hline16(
 			pixel += bpp;
 
 			while (pixel < cmd_pixel_end) {
-				pixel_val16 = get_pixel_val16(pixel, bpp);
+				pixel_val16 = get_pixel_val16(pixel, log_bpp);
 				if (pixel_val16 != repeating_pixel_val16)
 					break;
 				pixel += bpp;
@@ -179,10 +180,10 @@ static void udl_compress_hline16(
 			if (unlikely(pixel > start + bpp)) {
 				/* go back and fill in raw pixel count */
 				*raw_pixels_count_byte = (((start -
-						raw_pixel_start) / bpp) + 1) & 0xFF;
+						raw_pixel_start) >> log_bpp) + 1) & 0xFF;
 
 				/* immediately after raw data is repeat byte */
-				*cmd++ = (((pixel - start) / bpp) - 1) & 0xFF;
+				*cmd++ = (((pixel - start) >> log_bpp) - 1) & 0xFF;
 
 				/* Then start another raw pixel span */
 				raw_pixel_start = pixel;
@@ -192,14 +193,14 @@ static void udl_compress_hline16(
 
 		if (pixel > raw_pixel_start) {
 			/* finalize last RAW span */
-			*raw_pixels_count_byte = ((pixel-raw_pixel_start) / bpp) & 0xFF;
+			*raw_pixels_count_byte = ((pixel - raw_pixel_start) >> log_bpp) & 0xFF;
 		} else {
 			/* undo unused byte */
 			cmd--;
 		}
 
-		*cmd_pixels_count_byte = ((pixel - cmd_pixel_start) / bpp) & 0xFF;
-		dev_addr += ((pixel - cmd_pixel_start) / bpp) * 2;
+		*cmd_pixels_count_byte = ((pixel - cmd_pixel_start) >> log_bpp) & 0xFF;
+		dev_addr += ((pixel - cmd_pixel_start) >> log_bpp) * 2;
 	}
 
 	if (cmd_buffer_end <= MIN_RLX_CMD_BYTES + cmd) {
@@ -222,19 +223,19 @@ static void udl_compress_hline16(
  * (that we can only write to, slowly, and can never read), and (optionally)
  * our shadow copy that tracks what's been sent to that hardware buffer.
  */
-int udl_render_hline(struct drm_device *dev, int bpp, struct urb **urb_ptr,
+int udl_render_hline(struct drm_device *dev, int log_bpp, struct urb **urb_ptr,
 		     const char *front, char **urb_buf_ptr,
 		     u32 byte_offset, u32 device_byte_offset,
 		     u32 byte_width,
 		     int *ident_ptr, int *sent_ptr)
 {
 	const u8 *line_start, *line_end, *next_pixel;
-	u32 base16 = 0 + (device_byte_offset / bpp) * 2;
+	u32 base16 = 0 + (device_byte_offset >> log_bpp) * 2;
 	struct urb *urb = *urb_ptr;
 	u8 *cmd = *urb_buf_ptr;
 	u8 *cmd_end = (u8 *) urb->transfer_buffer + urb->transfer_buffer_length;
 
-	BUG_ON(!(bpp = 2 || bpp = 4));
+	BUG_ON(!(log_bpp = 1 || log_bpp = 2));
 
 	line_start = (u8 *) (front + byte_offset);
 	next_pixel = line_start;
@@ -244,7 +245,7 @@ int udl_render_hline(struct drm_device *
 
 		udl_compress_hline16(&next_pixel,
 			     line_end, &base16,
-			     (u8 **) &cmd, (u8 *) cmd_end, bpp);
+			     (u8 **) &cmd, (u8 *) cmd_end, log_bpp);
 
 		if (cmd >= cmd_end) {
 			int len = cmd - (u8 *) urb->transfer_buffer;


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 07/21] udl-kms: avoid division
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-avoid-division.patch --]
[-- Type: text/plain, Size: 7697 bytes --]

Division is slow, so it shouldn't be done by the pixel generating code.
The driver supports only 2 or 4 bytes per pixel, so we can replace
division with a shift.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/gpu/drm/udl/udl_drv.h      |    2 -
 drivers/gpu/drm/udl/udl_fb.c       |   15 ++++++++------
 drivers/gpu/drm/udl/udl_transfer.c |   39 ++++++++++++++++++-------------------
 3 files changed, 30 insertions(+), 26 deletions(-)

Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_drv.h
===================================================================
--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_drv.h	2018-06-03 13:15:01.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_drv.h	2018-06-03 13:15:01.000000000 +0200
@@ -110,7 +110,7 @@ udl_fb_user_fb_create(struct drm_device
 		      struct drm_file *file,
 		      const struct drm_mode_fb_cmd2 *mode_cmd);
 
-int udl_render_hline(struct drm_device *dev, int bpp, struct urb **urb_ptr,
+int udl_render_hline(struct drm_device *dev, int log_bpp, struct urb **urb_ptr,
 		     const char *front, char **urb_buf_ptr,
 		     u32 byte_offset, u32 device_byte_offset, u32 byte_width,
 		     int *ident_ptr, int *sent_ptr);
Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:15:01.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:15:01.000000000 +0200
@@ -91,7 +91,10 @@ int udl_handle_damage(struct udl_framebu
 	int bytes_identical = 0;
 	struct urb *urb;
 	int aligned_x;
-	int bpp = fb->base.format->cpp[0];
+	int log_bpp;
+
+	BUG_ON(!is_power_of_2(fb->base.format->cpp[0]));
+	log_bpp = __ffs(fb->base.format->cpp[0]);
 
 	if (!fb->active_16)
 		return 0;
@@ -126,12 +129,12 @@ int udl_handle_damage(struct udl_framebu
 
 	for (i = y; i < y + height ; i++) {
 		const int line_offset = fb->base.pitches[0] * i;
-		const int byte_offset = line_offset + (x * bpp);
-		const int dev_byte_offset = (fb->base.width * bpp * i) + (x * bpp);
-		if (udl_render_hline(dev, bpp, &urb,
+		const int byte_offset = line_offset + (x << log_bpp);
+		const int dev_byte_offset = (fb->base.width * i + x) << log_bpp;
+		if (udl_render_hline(dev, log_bpp, &urb,
 				     (char *) fb->obj->vmapping,
 				     &cmd, byte_offset, dev_byte_offset,
-				     width * bpp,
+				     width << log_bpp,
 				     &bytes_identical, &bytes_sent))
 			goto error;
 	}
@@ -150,7 +153,7 @@ int udl_handle_damage(struct udl_framebu
 error:
 	atomic_add(bytes_sent, &udl->bytes_sent);
 	atomic_add(bytes_identical, &udl->bytes_identical);
-	atomic_add(width*height*bpp, &udl->bytes_rendered);
+	atomic_add((width * height) << log_bpp, &udl->bytes_rendered);
 	end_cycles = get_cycles();
 	atomic_add(((unsigned int) ((end_cycles - start_cycles)
 		    >> 10)), /* Kcycles */
Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_transfer.c
===================================================================
--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-06-03 13:15:01.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_transfer.c	2018-06-03 13:15:01.000000000 +0200
@@ -83,12 +83,12 @@ static inline u16 pixel32_to_be16(const
 		((pixel >> 8) & 0xf800));
 }
 
-static inline u16 get_pixel_val16(const uint8_t *pixel, int bpp)
+static inline u16 get_pixel_val16(const uint8_t *pixel, int log_bpp)
 {
-	u16 pixel_val16 = 0;
-	if (bpp == 2)
+	u16 pixel_val16;
+	if (log_bpp == 1)
 		pixel_val16 = *(const uint16_t *)pixel;
-	else if (bpp == 4)
+	else
 		pixel_val16 = pixel32_to_be16(*(const uint32_t *)pixel);
 	return pixel_val16;
 }
@@ -125,8 +125,9 @@ static void udl_compress_hline16(
 	const u8 *const pixel_end,
 	uint32_t *device_address_ptr,
 	uint8_t **command_buffer_ptr,
-	const uint8_t *const cmd_buffer_end, int bpp)
+	const uint8_t *const cmd_buffer_end, int log_bpp)
 {
+	const int bpp = 1 << log_bpp;
 	const u8 *pixel = *pixel_start_ptr;
 	uint32_t dev_addr  = *device_address_ptr;
 	uint8_t *cmd = *command_buffer_ptr;
@@ -153,12 +154,12 @@ static void udl_compress_hline16(
 		raw_pixels_count_byte = cmd++; /*  we'll know this later */
 		raw_pixel_start = pixel;
 
-		cmd_pixel_end = pixel + min3(MAX_CMD_PIXELS + 1UL,
-					(unsigned long)(pixel_end - pixel) / bpp,
-					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) * bpp;
+		cmd_pixel_end = pixel + (min3(MAX_CMD_PIXELS + 1UL,
+					(unsigned long)(pixel_end - pixel) >> log_bpp,
+					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
 
 		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
-		pixel_val16 = get_pixel_val16(pixel, bpp);
+		pixel_val16 = get_pixel_val16(pixel, log_bpp);
 
 		while (pixel < cmd_pixel_end) {
 			const u8 *const start = pixel;
@@ -170,7 +171,7 @@ static void udl_compress_hline16(
 			pixel += bpp;
 
 			while (pixel < cmd_pixel_end) {
-				pixel_val16 = get_pixel_val16(pixel, bpp);
+				pixel_val16 = get_pixel_val16(pixel, log_bpp);
 				if (pixel_val16 != repeating_pixel_val16)
 					break;
 				pixel += bpp;
@@ -179,10 +180,10 @@ static void udl_compress_hline16(
 			if (unlikely(pixel > start + bpp)) {
 				/* go back and fill in raw pixel count */
 				*raw_pixels_count_byte = (((start -
-						raw_pixel_start) / bpp) + 1) & 0xFF;
+						raw_pixel_start) >> log_bpp) + 1) & 0xFF;
 
 				/* immediately after raw data is repeat byte */
-				*cmd++ = (((pixel - start) / bpp) - 1) & 0xFF;
+				*cmd++ = (((pixel - start) >> log_bpp) - 1) & 0xFF;
 
 				/* Then start another raw pixel span */
 				raw_pixel_start = pixel;
@@ -192,14 +193,14 @@ static void udl_compress_hline16(
 
 		if (pixel > raw_pixel_start) {
 			/* finalize last RAW span */
-			*raw_pixels_count_byte = ((pixel-raw_pixel_start) / bpp) & 0xFF;
+			*raw_pixels_count_byte = ((pixel - raw_pixel_start) >> log_bpp) & 0xFF;
 		} else {
 			/* undo unused byte */
 			cmd--;
 		}
 
-		*cmd_pixels_count_byte = ((pixel - cmd_pixel_start) / bpp) & 0xFF;
-		dev_addr += ((pixel - cmd_pixel_start) / bpp) * 2;
+		*cmd_pixels_count_byte = ((pixel - cmd_pixel_start) >> log_bpp) & 0xFF;
+		dev_addr += ((pixel - cmd_pixel_start) >> log_bpp) * 2;
 	}
 
 	if (cmd_buffer_end <= MIN_RLX_CMD_BYTES + cmd) {
@@ -222,19 +223,19 @@ static void udl_compress_hline16(
  * (that we can only write to, slowly, and can never read), and (optionally)
  * our shadow copy that tracks what's been sent to that hardware buffer.
  */
-int udl_render_hline(struct drm_device *dev, int bpp, struct urb **urb_ptr,
+int udl_render_hline(struct drm_device *dev, int log_bpp, struct urb **urb_ptr,
 		     const char *front, char **urb_buf_ptr,
 		     u32 byte_offset, u32 device_byte_offset,
 		     u32 byte_width,
 		     int *ident_ptr, int *sent_ptr)
 {
 	const u8 *line_start, *line_end, *next_pixel;
-	u32 base16 = 0 + (device_byte_offset / bpp) * 2;
+	u32 base16 = 0 + (device_byte_offset >> log_bpp) * 2;
 	struct urb *urb = *urb_ptr;
 	u8 *cmd = *urb_buf_ptr;
 	u8 *cmd_end = (u8 *) urb->transfer_buffer + urb->transfer_buffer_length;
 
-	BUG_ON(!(bpp == 2 || bpp == 4));
+	BUG_ON(!(log_bpp == 1 || log_bpp == 2));
 
 	line_start = (u8 *) (front + byte_offset);
 	next_pixel = line_start;
@@ -244,7 +245,7 @@ int udl_render_hline(struct drm_device *
 
 		udl_compress_hline16(&next_pixel,
 			     line_end, &base16,
-			     (u8 **) &cmd, (u8 *) cmd_end, bpp);
+			     (u8 **) &cmd, (u8 *) cmd_end, log_bpp);
 
 		if (cmd >= cmd_end) {
 			int len = cmd - (u8 *) urb->transfer_buffer;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 08/21] udl-kms: avoid prefetch
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Modern processors can detect linear memory accesses and prefetch data
automatically, so there's no need to use prefetch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/gpu/drm/udl/udl_transfer.c |    7 -------
 1 file changed, 7 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
@@ -13,7 +13,6 @@
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/fb.h>
-#include <linux/prefetch.h>
 #include <asm/unaligned.h>
 
 #include <drm/drmP.h>
@@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
 	int start = width;
 	int end = width;
 
-	prefetch((void *) front);
-	prefetch((void *) back);
-
 	for (j = 0; j < width; j++) {
 		if (back[j] != front[j]) {
 			start = j;
@@ -140,8 +136,6 @@ static void udl_compress_hline16(
 		const u8 *cmd_pixel_start, *cmd_pixel_end = NULL;
 		uint16_t pixel_val16;
 
-		prefetchw((void *) cmd); /* pull in one cache line at least */
-
 		*cmd++ = 0xaf;
 		*cmd++ = 0x6b;
 		*cmd++ = (uint8_t) ((dev_addr >> 16) & 0xFF);
@@ -158,7 +152,6 @@ static void udl_compress_hline16(
 					(unsigned long)(pixel_end - pixel) >> log_bpp,
 					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
 
-		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
 		pixel_val16 = get_pixel_val16(pixel, log_bpp);
 
 		while (pixel < cmd_pixel_end) {


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-avoid-prefetch.patch --]
[-- Type: text/plain, Size: 1749 bytes --]

Modern processors can detect linear memory accesses and prefetch data
automatically, so there's no need to use prefetch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/gpu/drm/udl/udl_transfer.c |    7 -------
 1 file changed, 7 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
@@ -13,7 +13,6 @@
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/fb.h>
-#include <linux/prefetch.h>
 #include <asm/unaligned.h>
 
 #include <drm/drmP.h>
@@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
 	int start = width;
 	int end = width;
 
-	prefetch((void *) front);
-	prefetch((void *) back);
-
 	for (j = 0; j < width; j++) {
 		if (back[j] != front[j]) {
 			start = j;
@@ -140,8 +136,6 @@ static void udl_compress_hline16(
 		const u8 *cmd_pixel_start, *cmd_pixel_end = NULL;
 		uint16_t pixel_val16;
 
-		prefetchw((void *) cmd); /* pull in one cache line at least */
-
 		*cmd++ = 0xaf;
 		*cmd++ = 0x6b;
 		*cmd++ = (uint8_t) ((dev_addr >> 16) & 0xFF);
@@ -158,7 +152,6 @@ static void udl_compress_hline16(
 					(unsigned long)(pixel_end - pixel) >> log_bpp,
 					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
 
-		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
 		pixel_val16 = get_pixel_val16(pixel, log_bpp);
 
 		while (pixel < cmd_pixel_end) {

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 09/21] udl-kms: use spin_lock_irq instead of spin_lock_irqsave
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

spin_lock_irqsave and spin_unlock_irqrestore is inteded to be called from
a context where it is unknown if interrupts are enabled or disabled (such
as interrupt handlers). From a process context, we should call
spin_lock_irq and spin_unlock_irq, that avoids the costly pushf and popf
instructions.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/gpu/drm/udl/udl_main.c    |   10 ++++------
 drivers/gpu/drm/udl/udl_modeset.c |    5 ++---
 2 files changed, 6 insertions(+), 9 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_main.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:17:01.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:17:01.000000000 +0200
@@ -170,7 +170,6 @@ static void udl_free_urb_list(struct drm
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	unsigned long flags;
 
 	DRM_DEBUG("Waiting for completes and freeing all render urbs\n");
 
@@ -178,12 +177,12 @@ static void udl_free_urb_list(struct drm
 	while (count--) {
 		down(&udl->urbs.limit_sem);
 
-		spin_lock_irqsave(&udl->urbs.lock, flags);
+		spin_lock_irq(&udl->urbs.lock);
 
 		node = udl->urbs.list.next; /* have reserved one with sem */
 		list_del_init(node);
 
-		spin_unlock_irqrestore(&udl->urbs.lock, flags);
+		spin_unlock_irq(&udl->urbs.lock);
 
 		unode = list_entry(node, struct urb_node, entry);
 		urb = unode->urb;
@@ -268,7 +267,6 @@ struct urb *udl_get_urb(struct drm_devic
 	struct list_head *entry;
 	struct urb_node *unode;
 	struct urb *urb = NULL;
-	unsigned long flags;
 
 	/* Wait for an in-flight buffer to complete and get re-queued */
 	ret = down_timeout(&udl->urbs.limit_sem, GET_URB_TIMEOUT);
@@ -279,14 +277,14 @@ struct urb *udl_get_urb(struct drm_devic
 		goto error;
 	}
 
-	spin_lock_irqsave(&udl->urbs.lock, flags);
+	spin_lock_irq(&udl->urbs.lock);
 
 	BUG_ON(list_empty(&udl->urbs.list)); /* reserved one with limit_sem */
 	entry = udl->urbs.list.next;
 	list_del_init(entry);
 	udl->urbs.available--;
 
-	spin_unlock_irqrestore(&udl->urbs.lock, flags);
+	spin_unlock_irq(&udl->urbs.lock);
 
 	unode = list_entry(entry, struct urb_node, entry);
 	urb = unode->urb;
Index: linux-4.16.12/drivers/gpu/drm/udl/udl_modeset.c
=================================--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_modeset.c	2018-05-31 11:17:01.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_modeset.c	2018-05-31 11:17:01.000000000 +0200
@@ -366,7 +366,6 @@ static int udl_crtc_page_flip(struct drm
 {
 	struct udl_framebuffer *ufb = to_udl_fb(fb);
 	struct drm_device *dev = crtc->dev;
-	unsigned long flags;
 
 	struct drm_framebuffer *old_fb = crtc->primary->fb;
 	if (old_fb) {
@@ -377,10 +376,10 @@ static int udl_crtc_page_flip(struct drm
 
 	udl_handle_damage(ufb, 0, 0, fb->width, fb->height);
 
-	spin_lock_irqsave(&dev->event_lock, flags);
+	spin_lock_irq(&dev->event_lock);
 	if (event)
 		drm_crtc_send_vblank_event(crtc, event);
-	spin_unlock_irqrestore(&dev->event_lock, flags);
+	spin_unlock_irq(&dev->event_lock);
 	crtc->primary->fb = fb;
 
 	return 0;


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 09/21] udl-kms: use spin_lock_irq instead of spin_lock_irqsave
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-spin-lock-irq.patch --]
[-- Type: text/plain, Size: 3355 bytes --]

spin_lock_irqsave and spin_unlock_irqrestore is inteded to be called from
a context where it is unknown if interrupts are enabled or disabled (such
as interrupt handlers). From a process context, we should call
spin_lock_irq and spin_unlock_irq, that avoids the costly pushf and popf
instructions.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/gpu/drm/udl/udl_main.c    |   10 ++++------
 drivers/gpu/drm/udl/udl_modeset.c |    5 ++---
 2 files changed, 6 insertions(+), 9 deletions(-)

Index: linux-4.16.12/drivers/gpu/drm/udl/udl_main.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:17:01.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_main.c	2018-05-31 11:17:01.000000000 +0200
@@ -170,7 +170,6 @@ static void udl_free_urb_list(struct drm
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	unsigned long flags;
 
 	DRM_DEBUG("Waiting for completes and freeing all render urbs\n");
 
@@ -178,12 +177,12 @@ static void udl_free_urb_list(struct drm
 	while (count--) {
 		down(&udl->urbs.limit_sem);
 
-		spin_lock_irqsave(&udl->urbs.lock, flags);
+		spin_lock_irq(&udl->urbs.lock);
 
 		node = udl->urbs.list.next; /* have reserved one with sem */
 		list_del_init(node);
 
-		spin_unlock_irqrestore(&udl->urbs.lock, flags);
+		spin_unlock_irq(&udl->urbs.lock);
 
 		unode = list_entry(node, struct urb_node, entry);
 		urb = unode->urb;
@@ -268,7 +267,6 @@ struct urb *udl_get_urb(struct drm_devic
 	struct list_head *entry;
 	struct urb_node *unode;
 	struct urb *urb = NULL;
-	unsigned long flags;
 
 	/* Wait for an in-flight buffer to complete and get re-queued */
 	ret = down_timeout(&udl->urbs.limit_sem, GET_URB_TIMEOUT);
@@ -279,14 +277,14 @@ struct urb *udl_get_urb(struct drm_devic
 		goto error;
 	}
 
-	spin_lock_irqsave(&udl->urbs.lock, flags);
+	spin_lock_irq(&udl->urbs.lock);
 
 	BUG_ON(list_empty(&udl->urbs.list)); /* reserved one with limit_sem */
 	entry = udl->urbs.list.next;
 	list_del_init(entry);
 	udl->urbs.available--;
 
-	spin_unlock_irqrestore(&udl->urbs.lock, flags);
+	spin_unlock_irq(&udl->urbs.lock);
 
 	unode = list_entry(entry, struct urb_node, entry);
 	urb = unode->urb;
Index: linux-4.16.12/drivers/gpu/drm/udl/udl_modeset.c
===================================================================
--- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_modeset.c	2018-05-31 11:17:01.000000000 +0200
+++ linux-4.16.12/drivers/gpu/drm/udl/udl_modeset.c	2018-05-31 11:17:01.000000000 +0200
@@ -366,7 +366,6 @@ static int udl_crtc_page_flip(struct drm
 {
 	struct udl_framebuffer *ufb = to_udl_fb(fb);
 	struct drm_device *dev = crtc->dev;
-	unsigned long flags;
 
 	struct drm_framebuffer *old_fb = crtc->primary->fb;
 	if (old_fb) {
@@ -377,10 +376,10 @@ static int udl_crtc_page_flip(struct drm
 
 	udl_handle_damage(ufb, 0, 0, fb->width, fb->height);
 
-	spin_lock_irqsave(&dev->event_lock, flags);
+	spin_lock_irq(&dev->event_lock);
 	if (event)
 		drm_crtc_send_vblank_event(crtc, event);
-	spin_unlock_irqrestore(&dev->event_lock, flags);
+	spin_unlock_irq(&dev->event_lock);
 	crtc->primary->fb = fb;
 
 	return 0;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 10/21] udl-kms: dont spam the syslog with debug messages
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The udl kms driver writes messages to the syslog whenever some application
opens or closes /dev/fb0 and whenever the user switches between the
Xserver and the console.

This patch changes the priority of these messages to debug.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/gpu/drm/udl/udl_fb.c      |    6 +++---
 drivers/gpu/drm/udl/udl_modeset.c |    2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c
=================================--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:17:58.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:42:48.000000000 +0200
@@ -179,7 +179,7 @@ static int udl_fb_mmap(struct fb_info *i
 
 	pos = (unsigned long)info->fix.smem_start + offset;
 
-	pr_notice("mmap() framebuffer addr:%lu size:%lu\n",
+	pr_debug("mmap() framebuffer addr:%lu size:%lu\n",
 		  pos, size);
 
 	/* We don't want the framebuffer to be mapped encrypted */
@@ -237,7 +237,7 @@ static int udl_fb_open(struct fb_info *i
 	}
 #endif
 
-	pr_notice("open /dev/fb%d user=%d fb_info=%p count=%d\n",
+	pr_debug("open /dev/fb%d user=%d fb_info=%p count=%d\n",
 		  info->node, user, info, ufbdev->fb_count);
 
 	return 0;
@@ -262,7 +262,7 @@ static int udl_fb_release(struct fb_info
 	}
 #endif
 
-	pr_warn("released /dev/fb%d user=%d count=%d\n",
+	pr_debug("released /dev/fb%d user=%d count=%d\n",
 		info->node, user, ufbdev->fb_count);
 
 	return 0;
Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_modeset.c
=================================--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_modeset.c	2018-06-03 13:17:58.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_modeset.c	2018-06-03 13:41:56.000000000 +0200
@@ -243,7 +243,7 @@ static int udl_crtc_write_mode_to_hw(str
 
 	memcpy(buf, udl->mode_buf, udl->mode_buf_len);
 	retval = udl_submit_urb(dev, urb, udl->mode_buf_len);
-	DRM_INFO("write mode info %d\n", udl->mode_buf_len);
+	DRM_DEBUG("write mode info %d\n", udl->mode_buf_len);
 	return retval;
 }
 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 10/21] udl-kms: dont spam the syslog with debug messages
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udlkms-quiet.patch --]
[-- Type: text/plain, Size: 2288 bytes --]

The udl kms driver writes messages to the syslog whenever some application
opens or closes /dev/fb0 and whenever the user switches between the
Xserver and the console.

This patch changes the priority of these messages to debug.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/gpu/drm/udl/udl_fb.c      |    6 +++---
 drivers/gpu/drm/udl/udl_modeset.c |    2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:17:58.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_fb.c	2018-06-03 13:42:48.000000000 +0200
@@ -179,7 +179,7 @@ static int udl_fb_mmap(struct fb_info *i
 
 	pos = (unsigned long)info->fix.smem_start + offset;
 
-	pr_notice("mmap() framebuffer addr:%lu size:%lu\n",
+	pr_debug("mmap() framebuffer addr:%lu size:%lu\n",
 		  pos, size);
 
 	/* We don't want the framebuffer to be mapped encrypted */
@@ -237,7 +237,7 @@ static int udl_fb_open(struct fb_info *i
 	}
 #endif
 
-	pr_notice("open /dev/fb%d user=%d fb_info=%p count=%d\n",
+	pr_debug("open /dev/fb%d user=%d fb_info=%p count=%d\n",
 		  info->node, user, info, ufbdev->fb_count);
 
 	return 0;
@@ -262,7 +262,7 @@ static int udl_fb_release(struct fb_info
 	}
 #endif
 
-	pr_warn("released /dev/fb%d user=%d count=%d\n",
+	pr_debug("released /dev/fb%d user=%d count=%d\n",
 		info->node, user, ufbdev->fb_count);
 
 	return 0;
Index: linux-4.17-rc7/drivers/gpu/drm/udl/udl_modeset.c
===================================================================
--- linux-4.17-rc7.orig/drivers/gpu/drm/udl/udl_modeset.c	2018-06-03 13:17:58.000000000 +0200
+++ linux-4.17-rc7/drivers/gpu/drm/udl/udl_modeset.c	2018-06-03 13:41:56.000000000 +0200
@@ -243,7 +243,7 @@ static int udl_crtc_write_mode_to_hw(str
 
 	memcpy(buf, udl->mode_buf, udl->mode_buf_len);
 	retval = udl_submit_urb(dev, urb, udl->mode_buf_len);
-	DRM_INFO("write mode info %d\n", udl->mode_buf_len);
+	DRM_DEBUG("write mode info %d\n", udl->mode_buf_len);
 	return retval;
 }
 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 11/21] udlfb: fix semaphore value leak
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

I observed that the performance of the udl fb driver degrades over time.
On a freshly booted machine, it takes 6 seconds to do "ls -la /usr/bin";
after some time of use, the same operation takes 14 seconds.

The reason is that the value of "limit_sem" decays over time.

The udl driver uses a semaphore "limit_set" to specify how many free urbs
are there on dlfb->urbs.list. If the count is zero, the "down" operation
will sleep until some urbs are added to the freelist.

In order to avoid some hypothetical deadlock, the driver will not call
"up" immediatelly, but it will offload it to a workqueue. The problem is
that if we call "schedule_delayed_work" on the same work item multiple
times, the work item may only be executed once.

This is happening:
* some urb completes
* dlfb_urb_completion adds it to the free list
* dlfb_urb_completion calls schedule_delayed_work to schedule the function
  dlfb_release_urb_work to increase the semaphore count
* as the urb is on the free list, some other task grabs it and submits it
* the submitted urb completes, dlfb_urb_completion is called again
* dlfb_urb_completion calls schedule_delayed_work, but the work is already
  scheduled, so it does nothing
* finally, dlfb_release_urb_work is called, it increases the semaphore
  count by 1, although it should increase it by 2

So, the semaphore count is decreasing over time, and this causes gradual
performance degradation.

Note that in the current kernel, the "up" function may be called from
interrupt and it may race with the "down" function called by another
thread, so we don't have to offload the call of "up" to a workqueue at
all. This patch removes the workqueue code. The patch also changes
"down_interruptible" to "down" in dlfb_free_urb_list, so that we will
clean up the driver properly even if a signal arrives.

With this patch, the performance of udlfb no longer degrades.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   27 ++-------------------------
 include/video/udlfb.h       |    1 -
 2 files changed, 2 insertions(+), 26 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 12:31:04.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 12:31:04.000000000 +0200
@@ -922,14 +922,6 @@ static void dlfb_free(struct kref *kref)
 	kfree(dlfb);
 }
 
-static void dlfb_release_urb_work(struct work_struct *work)
-{
-	struct urb_node *unode = container_of(work, struct urb_node,
-					      release_urb_work.work);
-
-	up(&unode->dlfb->urbs.limit_sem);
-}
-
 static void dlfb_free_framebuffer(struct dlfb_data *dlfb)
 {
 	struct fb_info *info = dlfb->info;
@@ -1789,14 +1781,7 @@ static void dlfb_urb_completion(struct u
 	dlfb->urbs.available++;
 	spin_unlock_irqrestore(&dlfb->urbs.lock, flags);
 
-	/*
-	 * When using fb_defio, we deadlock if up() is called
-	 * while another is waiting. So queue to another process.
-	 */
-	if (fb_defio)
-		schedule_delayed_work(&unode->release_urb_work, 0);
-	else
-		up(&dlfb->urbs.limit_sem);
+	up(&dlfb->urbs.limit_sem);
 }
 
 static void dlfb_free_urb_list(struct dlfb_data *dlfb)
@@ -1805,16 +1790,11 @@ static void dlfb_free_urb_list(struct dl
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	int ret;
 	unsigned long flags;
 
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
-
-		/* Getting interrupted means a leak, but ok at disconnect */
-		ret = down_interruptible(&dlfb->urbs.limit_sem);
-		if (ret)
-			break;
+		down(&dlfb->urbs.limit_sem);
 
 		spin_lock_irqsave(&dlfb->urbs.lock, flags);
 
@@ -1854,9 +1834,6 @@ static int dlfb_alloc_urb_list(struct dl
 			break;
 		unode->dlfb = dlfb;
 
-		INIT_DELAYED_WORK(&unode->release_urb_work,
-			  dlfb_release_urb_work);
-
 		urb = usb_alloc_urb(0, GFP_KERNEL);
 		if (!urb) {
 			kfree(unode);
Index: linux-4.17-rc7/include/video/udlfb.h
=================================--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-05-31 12:31:04.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-05-31 12:31:04.000000000 +0200
@@ -20,7 +20,6 @@ struct dloarea {
 struct urb_node {
 	struct list_head entry;
 	struct dlfb_data *dlfb;
-	struct delayed_work release_urb_work;
 	struct urb *urb;
 };
 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 11/21] udlfb: fix semaphore value leak
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-avoid-delayed-work.patch --]
[-- Type: text/plain, Size: 4600 bytes --]

I observed that the performance of the udl fb driver degrades over time.
On a freshly booted machine, it takes 6 seconds to do "ls -la /usr/bin";
after some time of use, the same operation takes 14 seconds.

The reason is that the value of "limit_sem" decays over time.

The udl driver uses a semaphore "limit_set" to specify how many free urbs
are there on dlfb->urbs.list. If the count is zero, the "down" operation
will sleep until some urbs are added to the freelist.

In order to avoid some hypothetical deadlock, the driver will not call
"up" immediatelly, but it will offload it to a workqueue. The problem is
that if we call "schedule_delayed_work" on the same work item multiple
times, the work item may only be executed once.

This is happening:
* some urb completes
* dlfb_urb_completion adds it to the free list
* dlfb_urb_completion calls schedule_delayed_work to schedule the function
  dlfb_release_urb_work to increase the semaphore count
* as the urb is on the free list, some other task grabs it and submits it
* the submitted urb completes, dlfb_urb_completion is called again
* dlfb_urb_completion calls schedule_delayed_work, but the work is already
  scheduled, so it does nothing
* finally, dlfb_release_urb_work is called, it increases the semaphore
  count by 1, although it should increase it by 2

So, the semaphore count is decreasing over time, and this causes gradual
performance degradation.

Note that in the current kernel, the "up" function may be called from
interrupt and it may race with the "down" function called by another
thread, so we don't have to offload the call of "up" to a workqueue at
all. This patch removes the workqueue code. The patch also changes
"down_interruptible" to "down" in dlfb_free_urb_list, so that we will
clean up the driver properly even if a signal arrives.

With this patch, the performance of udlfb no longer degrades.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   27 ++-------------------------
 include/video/udlfb.h       |    1 -
 2 files changed, 2 insertions(+), 26 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 12:31:04.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 12:31:04.000000000 +0200
@@ -922,14 +922,6 @@ static void dlfb_free(struct kref *kref)
 	kfree(dlfb);
 }
 
-static void dlfb_release_urb_work(struct work_struct *work)
-{
-	struct urb_node *unode = container_of(work, struct urb_node,
-					      release_urb_work.work);
-
-	up(&unode->dlfb->urbs.limit_sem);
-}
-
 static void dlfb_free_framebuffer(struct dlfb_data *dlfb)
 {
 	struct fb_info *info = dlfb->info;
@@ -1789,14 +1781,7 @@ static void dlfb_urb_completion(struct u
 	dlfb->urbs.available++;
 	spin_unlock_irqrestore(&dlfb->urbs.lock, flags);
 
-	/*
-	 * When using fb_defio, we deadlock if up() is called
-	 * while another is waiting. So queue to another process.
-	 */
-	if (fb_defio)
-		schedule_delayed_work(&unode->release_urb_work, 0);
-	else
-		up(&dlfb->urbs.limit_sem);
+	up(&dlfb->urbs.limit_sem);
 }
 
 static void dlfb_free_urb_list(struct dlfb_data *dlfb)
@@ -1805,16 +1790,11 @@ static void dlfb_free_urb_list(struct dl
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	int ret;
 	unsigned long flags;
 
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
-
-		/* Getting interrupted means a leak, but ok at disconnect */
-		ret = down_interruptible(&dlfb->urbs.limit_sem);
-		if (ret)
-			break;
+		down(&dlfb->urbs.limit_sem);
 
 		spin_lock_irqsave(&dlfb->urbs.lock, flags);
 
@@ -1854,9 +1834,6 @@ static int dlfb_alloc_urb_list(struct dl
 			break;
 		unode->dlfb = dlfb;
 
-		INIT_DELAYED_WORK(&unode->release_urb_work,
-			  dlfb_release_urb_work);
-
 		urb = usb_alloc_urb(0, GFP_KERNEL);
 		if (!urb) {
 			kfree(unode);
Index: linux-4.17-rc7/include/video/udlfb.h
===================================================================
--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-05-31 12:31:04.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-05-31 12:31:04.000000000 +0200
@@ -20,7 +20,6 @@ struct dloarea {
 struct urb_node {
 	struct list_head entry;
 	struct dlfb_data *dlfb;
-	struct delayed_work release_urb_work;
 	struct urb *urb;
 };
 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 12/21] udlfb: fix display corruption of the last line
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The displaylink hardware has such a peculiarity that it doesn't render a
command until next command is received. This produces occasional
corruption, such as when setting 22x11 font on the console, only the first
line of the cursor will be blinking if the cursor is located at some
specific columns.

When we end up with a repeating pixel, the driver has a bug that it leaves
one uninitialized byte after the command (and this byte is enough to flush
the command and render it - thus it fixes the screen corruption), however
whe we end up with a non-repeating pixel, there is no byte appended and
this results in temporary screen corruption.

This patch fixes the screen corruption by always appending a byte 0xAF at
the end of URB. It also removes the uninitialized byte.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:43:13.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:46:03.000000000 +0200
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/prefetch.h>
 #include <linux/delay.h>
+#include <asm/unaligned.h>
 #include <video/udlfb.h>
 #include "edid.h"
 
@@ -450,17 +451,17 @@ static void dlfb_compress_hline(
 		raw_pixels_count_byte = cmd++; /*  we'll know this later */
 		raw_pixel_start = pixel;
 
-		cmd_pixel_end = pixel + min(MAX_CMD_PIXELS + 1,
-			min((int)(pixel_end - pixel),
-			    (int)(cmd_buffer_end - cmd) / BPP));
+		cmd_pixel_end = pixel + min3(MAX_CMD_PIXELS + 1UL,
+					(unsigned long)(pixel_end - pixel),
+					(unsigned long)(cmd_buffer_end - 1 - cmd) / BPP);
 
-		prefetch_range((void *) pixel, (cmd_pixel_end - pixel) * BPP);
+		prefetch_range((void *) pixel, (u8 *)cmd_pixel_end - (u8 *)pixel);
 
 		while (pixel < cmd_pixel_end) {
 			const uint16_t * const repeating_pixel = pixel;
 
-			*cmd++ = *pixel >> 8;
-			*cmd++ = *pixel;
+			put_unaligned_be16(*pixel, cmd);
+			cmd += 2;
 			pixel++;
 
 			if (unlikely((pixel < cmd_pixel_end) &&
@@ -486,13 +487,16 @@ static void dlfb_compress_hline(
 		if (pixel > raw_pixel_start) {
 			/* finalize last RAW span */
 			*raw_pixels_count_byte = (pixel-raw_pixel_start) & 0xFF;
+		} else {
+			/* undo unused byte */
+			cmd--;
 		}
 
 		*cmd_pixels_count_byte = (pixel - cmd_pixel_start) & 0xFF;
-		dev_addr += (pixel - cmd_pixel_start) * BPP;
+		dev_addr += (u8 *)pixel - (u8 *)cmd_pixel_start;
 	}
 
-	if (cmd_buffer_end <= MIN_RLX_CMD_BYTES + cmd) {
+	if (cmd_buffer_end - MIN_RLX_CMD_BYTES <= cmd) {
 		/* Fill leftover bytes with no-ops */
 		if (cmd_buffer_end > cmd)
 			memset(cmd, 0xAF, cmd_buffer_end - cmd);
@@ -610,8 +614,11 @@ static int dlfb_handle_damage(struct dlf
 	}
 
 	if (cmd > (char *) urb->transfer_buffer) {
+		int len;
+		if (cmd < (char *) urb->transfer_buffer + urb->transfer_buffer_length)
+			*cmd++ = 0xAF;
 		/* Send partial buffer remaining before exiting */
-		int len = cmd - (char *) urb->transfer_buffer;
+		len = cmd - (char *) urb->transfer_buffer;
 		ret = dlfb_submit_urb(dlfb, urb, len);
 		bytes_sent += len;
 	} else
@@ -735,8 +742,11 @@ static void dlfb_dpy_deferred_io(struct
 	}
 
 	if (cmd > (char *) urb->transfer_buffer) {
+		int len;
+		if (cmd < (char *) urb->transfer_buffer + urb->transfer_buffer_length)
+			*cmd++ = 0xAF;
 		/* Send partial buffer remaining before exiting */
-		int len = cmd - (char *) urb->transfer_buffer;
+		len = cmd - (char *) urb->transfer_buffer;
 		dlfb_submit_urb(dlfb, urb, len);
 		bytes_sent += len;
 	} else


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 12/21] udlfb: fix display corruption of the last line
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-append-zero-4.patch --]
[-- Type: text/plain, Size: 3917 bytes --]

The displaylink hardware has such a peculiarity that it doesn't render a
command until next command is received. This produces occasional
corruption, such as when setting 22x11 font on the console, only the first
line of the cursor will be blinking if the cursor is located at some
specific columns.

When we end up with a repeating pixel, the driver has a bug that it leaves
one uninitialized byte after the command (and this byte is enough to flush
the command and render it - thus it fixes the screen corruption), however
whe we end up with a non-repeating pixel, there is no byte appended and
this results in temporary screen corruption.

This patch fixes the screen corruption by always appending a byte 0xAF at
the end of URB. It also removes the uninitialized byte.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:43:13.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:46:03.000000000 +0200
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/prefetch.h>
 #include <linux/delay.h>
+#include <asm/unaligned.h>
 #include <video/udlfb.h>
 #include "edid.h"
 
@@ -450,17 +451,17 @@ static void dlfb_compress_hline(
 		raw_pixels_count_byte = cmd++; /*  we'll know this later */
 		raw_pixel_start = pixel;
 
-		cmd_pixel_end = pixel + min(MAX_CMD_PIXELS + 1,
-			min((int)(pixel_end - pixel),
-			    (int)(cmd_buffer_end - cmd) / BPP));
+		cmd_pixel_end = pixel + min3(MAX_CMD_PIXELS + 1UL,
+					(unsigned long)(pixel_end - pixel),
+					(unsigned long)(cmd_buffer_end - 1 - cmd) / BPP);
 
-		prefetch_range((void *) pixel, (cmd_pixel_end - pixel) * BPP);
+		prefetch_range((void *) pixel, (u8 *)cmd_pixel_end - (u8 *)pixel);
 
 		while (pixel < cmd_pixel_end) {
 			const uint16_t * const repeating_pixel = pixel;
 
-			*cmd++ = *pixel >> 8;
-			*cmd++ = *pixel;
+			put_unaligned_be16(*pixel, cmd);
+			cmd += 2;
 			pixel++;
 
 			if (unlikely((pixel < cmd_pixel_end) &&
@@ -486,13 +487,16 @@ static void dlfb_compress_hline(
 		if (pixel > raw_pixel_start) {
 			/* finalize last RAW span */
 			*raw_pixels_count_byte = (pixel-raw_pixel_start) & 0xFF;
+		} else {
+			/* undo unused byte */
+			cmd--;
 		}
 
 		*cmd_pixels_count_byte = (pixel - cmd_pixel_start) & 0xFF;
-		dev_addr += (pixel - cmd_pixel_start) * BPP;
+		dev_addr += (u8 *)pixel - (u8 *)cmd_pixel_start;
 	}
 
-	if (cmd_buffer_end <= MIN_RLX_CMD_BYTES + cmd) {
+	if (cmd_buffer_end - MIN_RLX_CMD_BYTES <= cmd) {
 		/* Fill leftover bytes with no-ops */
 		if (cmd_buffer_end > cmd)
 			memset(cmd, 0xAF, cmd_buffer_end - cmd);
@@ -610,8 +614,11 @@ static int dlfb_handle_damage(struct dlf
 	}
 
 	if (cmd > (char *) urb->transfer_buffer) {
+		int len;
+		if (cmd < (char *) urb->transfer_buffer + urb->transfer_buffer_length)
+			*cmd++ = 0xAF;
 		/* Send partial buffer remaining before exiting */
-		int len = cmd - (char *) urb->transfer_buffer;
+		len = cmd - (char *) urb->transfer_buffer;
 		ret = dlfb_submit_urb(dlfb, urb, len);
 		bytes_sent += len;
 	} else
@@ -735,8 +742,11 @@ static void dlfb_dpy_deferred_io(struct
 	}
 
 	if (cmd > (char *) urb->transfer_buffer) {
+		int len;
+		if (cmd < (char *) urb->transfer_buffer + urb->transfer_buffer_length)
+			*cmd++ = 0xAF;
 		/* Send partial buffer remaining before exiting */
-		int len = cmd - (char *) urb->transfer_buffer;
+		len = cmd - (char *) urb->transfer_buffer;
 		dlfb_submit_urb(dlfb, urb, len);
 		bytes_sent += len;
 	} else

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 13/21] udlfb: dont switch if we are switching to the same videomode
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The udlfb driver reprograms the hardware everytime the user switches the
console, that makes quite unusable when working on the console.

This patch makes the driver remember the videomode we are in and avoid
reprogramming the hardware if we switch to the same videomode.

We mask the "activate" field and the "FB_VMODE_SMOOTH_XPAN" flag when
comparing the videomode, because they cause spurious switches when
switching to and from the Xserver.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   18 ++++++++++++++++--
 include/video/udlfb.h       |    1 +
 2 files changed, 17 insertions(+), 2 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:49:52.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:49:52.000000000 +0200
@@ -1041,10 +1041,24 @@ static int dlfb_ops_set_par(struct fb_in
 	int result;
 	u16 *pix_framebuffer;
 	int i;
+	struct fb_var_screeninfo fvs;
+
+	/* clear the activate field because it causes spurious miscompares */
+	fvs = info->var;
+	fvs.activate = 0;
+	fvs.vmode &= ~FB_VMODE_SMOOTH_XPAN;
+
+	if (!memcmp(&dlfb->current_mode, &fvs, sizeof(struct fb_var_screeninfo)))
+		return 0;
 
 	result = dlfb_set_video_mode(dlfb, &info->var);
 
-	if ((result = 0) && (dlfb->fb_count = 0)) {
+	if (result)
+		return result;
+
+	dlfb->current_mode = fvs;
+
+	if (dlfb->fb_count = 0) {
 
 		/* paint greenscreen */
 
@@ -1056,7 +1070,7 @@ static int dlfb_ops_set_par(struct fb_in
 				   info->screen_base);
 	}
 
-	return result;
+	return 0;
 }
 
 /* To fonzi the jukebox (e.g. make blanking changes take effect) */
Index: linux-4.17-rc7/include/video/udlfb.h
=================================--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-05-31 14:49:52.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-05-31 14:49:52.000000000 +0200
@@ -56,6 +56,7 @@ struct dlfb_data {
 	atomic_t bytes_identical; /* saved effort with backbuffer comparison */
 	atomic_t bytes_sent; /* to usb, after compression including overhead */
 	atomic_t cpu_kcycles_used; /* transpired during pixel processing */
+	struct fb_var_screeninfo current_mode;
 };
 
 #define NR_USB_REQUEST_I2C_SUB_IO 0x02


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 13/21] udlfb: dont switch if we are switching to the same videomode
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-cache-videomode-2.patch --]
[-- Type: text/plain, Size: 2551 bytes --]

The udlfb driver reprograms the hardware everytime the user switches the
console, that makes quite unusable when working on the console.

This patch makes the driver remember the videomode we are in and avoid
reprogramming the hardware if we switch to the same videomode.

We mask the "activate" field and the "FB_VMODE_SMOOTH_XPAN" flag when
comparing the videomode, because they cause spurious switches when
switching to and from the Xserver.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   18 ++++++++++++++++--
 include/video/udlfb.h       |    1 +
 2 files changed, 17 insertions(+), 2 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:49:52.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:49:52.000000000 +0200
@@ -1041,10 +1041,24 @@ static int dlfb_ops_set_par(struct fb_in
 	int result;
 	u16 *pix_framebuffer;
 	int i;
+	struct fb_var_screeninfo fvs;
+
+	/* clear the activate field because it causes spurious miscompares */
+	fvs = info->var;
+	fvs.activate = 0;
+	fvs.vmode &= ~FB_VMODE_SMOOTH_XPAN;
+
+	if (!memcmp(&dlfb->current_mode, &fvs, sizeof(struct fb_var_screeninfo)))
+		return 0;
 
 	result = dlfb_set_video_mode(dlfb, &info->var);
 
-	if ((result == 0) && (dlfb->fb_count == 0)) {
+	if (result)
+		return result;
+
+	dlfb->current_mode = fvs;
+
+	if (dlfb->fb_count == 0) {
 
 		/* paint greenscreen */
 
@@ -1056,7 +1070,7 @@ static int dlfb_ops_set_par(struct fb_in
 				   info->screen_base);
 	}
 
-	return result;
+	return 0;
 }
 
 /* To fonzi the jukebox (e.g. make blanking changes take effect) */
Index: linux-4.17-rc7/include/video/udlfb.h
===================================================================
--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-05-31 14:49:52.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-05-31 14:49:52.000000000 +0200
@@ -56,6 +56,7 @@ struct dlfb_data {
 	atomic_t bytes_identical; /* saved effort with backbuffer comparison */
 	atomic_t bytes_sent; /* to usb, after compression including overhead */
 	atomic_t cpu_kcycles_used; /* transpired during pixel processing */
+	struct fb_var_screeninfo current_mode;
 };
 
 #define NR_USB_REQUEST_I2C_SUB_IO 0x02

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 14/21] udlfb: make a local copy of fb_ops
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The defio subsystem overwrites the method fb_osp->mmap. That method is
stored in module's static data - and that means that if we have multiple
diplaylink adapters, they will over write each other's method.

In order to avoid interference between multiple adapters, we copy the
fb_ops structure to a device-local memory.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |    3 ++-
 include/video/udlfb.h       |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:33.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:33.000000000 +0200
@@ -1665,7 +1665,8 @@ static void dlfb_init_framebuffer_work(s
 	dlfb->info = info;
 	info->par = dlfb;
 	info->pseudo_palette = dlfb->pseudo_palette;
-	info->fbops = &dlfb_ops;
+	dlfb->ops = dlfb_ops;
+	info->fbops = &dlfb->ops;
 
 	retval = fb_alloc_cmap(&info->cmap, 256, 0);
 	if (retval < 0) {
Index: linux-4.17-rc7/include/video/udlfb.h
=================================--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-06-03 13:17:33.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-06-03 13:17:33.000000000 +0200
@@ -51,6 +51,7 @@ struct dlfb_data {
 	int base8;
 	u32 pseudo_palette[256];
 	int blank_mode; /*one of FB_BLANK_ */
+	struct fb_ops ops;
 	/* blit-only rendering path metrics, exposed through sysfs */
 	atomic_t bytes_rendered; /* raw pixel-bytes driver asked to render */
 	atomic_t bytes_identical; /* saved effort with backbuffer comparison */


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 14/21] udlfb: make a local copy of fb_ops
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-copy-fbops.patch --]
[-- Type: text/plain, Size: 1903 bytes --]

The defio subsystem overwrites the method fb_osp->mmap. That method is
stored in module's static data - and that means that if we have multiple
diplaylink adapters, they will over write each other's method.

In order to avoid interference between multiple adapters, we copy the
fb_ops structure to a device-local memory.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |    3 ++-
 include/video/udlfb.h       |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:33.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:33.000000000 +0200
@@ -1665,7 +1665,8 @@ static void dlfb_init_framebuffer_work(s
 	dlfb->info = info;
 	info->par = dlfb;
 	info->pseudo_palette = dlfb->pseudo_palette;
-	info->fbops = &dlfb_ops;
+	dlfb->ops = dlfb_ops;
+	info->fbops = &dlfb->ops;
 
 	retval = fb_alloc_cmap(&info->cmap, 256, 0);
 	if (retval < 0) {
Index: linux-4.17-rc7/include/video/udlfb.h
===================================================================
--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-06-03 13:17:33.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-06-03 13:17:33.000000000 +0200
@@ -51,6 +51,7 @@ struct dlfb_data {
 	int base8;
 	u32 pseudo_palette[256];
 	int blank_mode; /*one of FB_BLANK_ */
+	struct fb_ops ops;
 	/* blit-only rendering path metrics, exposed through sysfs */
 	atomic_t bytes_rendered; /* raw pixel-bytes driver asked to render */
 	atomic_t bytes_identical; /* saved effort with backbuffer comparison */

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 15/21] udlfb: set optimal write delay
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

The default delay 5 jiffies is too much when the kernel is compiled with
HZ\x100 - it results in jumpy cursor in Xwindow.

In order to find out the optimal delay, I benchmarked the driver on
1280x720x30fps video. I found out that with HZ\x1000, 10ms is acceptable,
but with HZ%0 or HZ00, we need 4ms, so that the video is played
without any frame skips.

This patch changes the delay to this value.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 include/video/udlfb.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-4.17-rc7/include/video/udlfb.h
=================================--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-06-03 13:17:37.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-06-03 13:17:37.000000000 +0200
@@ -88,7 +88,7 @@ struct dlfb_data {
 #define MIN_RAW_PIX_BYTES	2
 #define MIN_RAW_CMD_BYTES	(RAW_HEADER_BYTES + MIN_RAW_PIX_BYTES)
 
-#define DL_DEFIO_WRITE_DELAY    5 /* fb_deferred_io.delay in jiffies */
+#define DL_DEFIO_WRITE_DELAY    msecs_to_jiffies(HZ <= 300 ? 4 : 10) /* optimal value for 720p video */
 #define DL_DEFIO_WRITE_DISABLE  (HZ*60) /* "disable" with long delay */
 
 /* remove these once align.h patch is taken into kernel */


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 15/21] udlfb: set optimal write delay
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-defio-delay.patch --]
[-- Type: text/plain, Size: 1447 bytes --]

The default delay 5 jiffies is too much when the kernel is compiled with
HZ=100 - it results in jumpy cursor in Xwindow.

In order to find out the optimal delay, I benchmarked the driver on
1280x720x30fps video. I found out that with HZ=1000, 10ms is acceptable,
but with HZ=250 or HZ=300, we need 4ms, so that the video is played
without any frame skips.

This patch changes the delay to this value.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 include/video/udlfb.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-4.17-rc7/include/video/udlfb.h
===================================================================
--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-06-03 13:17:37.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-06-03 13:17:37.000000000 +0200
@@ -88,7 +88,7 @@ struct dlfb_data {
 #define MIN_RAW_PIX_BYTES	2
 #define MIN_RAW_CMD_BYTES	(RAW_HEADER_BYTES + MIN_RAW_PIX_BYTES)
 
-#define DL_DEFIO_WRITE_DELAY    5 /* fb_deferred_io.delay in jiffies */
+#define DL_DEFIO_WRITE_DELAY    msecs_to_jiffies(HZ <= 300 ? 4 : 10) /* optimal value for 720p video */
 #define DL_DEFIO_WRITE_DISABLE  (HZ*60) /* "disable" with long delay */
 
 /* remove these once align.h patch is taken into kernel */

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 16/21] udlfb: handle allocation failure
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Allocations larger than PAGE_ALLOC_COSTLY_ORDER are unreliable and they
may fail anytime. This patch fixes the udlfb driver so that when a large
alloactions fails, it tries to do multiple smaller allocations.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:38.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:38.000000000 +0200
@@ -1843,17 +1843,22 @@ static void dlfb_free_urb_list(struct dl
 
 static int dlfb_alloc_urb_list(struct dlfb_data *dlfb, int count, size_t size)
 {
-	int i = 0;
 	struct urb *urb;
 	struct urb_node *unode;
 	char *buf;
+	size_t wanted_size = count * size;
 
 	spin_lock_init(&dlfb->urbs.lock);
 
+retry:
 	dlfb->urbs.size = size;
 	INIT_LIST_HEAD(&dlfb->urbs.list);
 
-	while (i < count) {
+	sema_init(&dlfb->urbs.limit_sem, 0);
+	dlfb->urbs.count = 0;
+	dlfb->urbs.available = 0;
+
+	while (dlfb->urbs.count * size < wanted_size) {
 		unode = kzalloc(sizeof(*unode), GFP_KERNEL);
 		if (!unode)
 			break;
@@ -1866,11 +1871,16 @@ static int dlfb_alloc_urb_list(struct dl
 		}
 		unode->urb = urb;
 
-		buf = usb_alloc_coherent(dlfb->udev, MAX_TRANSFER, GFP_KERNEL,
+		buf = usb_alloc_coherent(dlfb->udev, size, GFP_KERNEL,
 					 &urb->transfer_dma);
 		if (!buf) {
 			kfree(unode);
 			usb_free_urb(urb);
+			if (size > PAGE_SIZE) {
+				size /= 2;
+				dlfb_free_urb_list(dlfb);
+				goto retry;
+			}
 			break;
 		}
 
@@ -1881,14 +1891,12 @@ static int dlfb_alloc_urb_list(struct dl
 
 		list_add_tail(&unode->entry, &dlfb->urbs.list);
 
-		i++;
+		up(&dlfb->urbs.limit_sem);
+		dlfb->urbs.count++;
+		dlfb->urbs.available++;
 	}
 
-	sema_init(&dlfb->urbs.limit_sem, i);
-	dlfb->urbs.count = i;
-	dlfb->urbs.available = i;
-
-	return i;
+	return dlfb->urbs.count;
 }
 
 static struct urb *dlfb_get_urb(struct dlfb_data *dlfb)


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 16/21] udlfb: handle allocation failure
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-allocation-failure.patch --]
[-- Type: text/plain, Size: 2293 bytes --]

Allocations larger than PAGE_ALLOC_COSTLY_ORDER are unreliable and they
may fail anytime. This patch fixes the udlfb driver so that when a large
alloactions fails, it tries to do multiple smaller allocations.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |   26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:38.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:38.000000000 +0200
@@ -1843,17 +1843,22 @@ static void dlfb_free_urb_list(struct dl
 
 static int dlfb_alloc_urb_list(struct dlfb_data *dlfb, int count, size_t size)
 {
-	int i = 0;
 	struct urb *urb;
 	struct urb_node *unode;
 	char *buf;
+	size_t wanted_size = count * size;
 
 	spin_lock_init(&dlfb->urbs.lock);
 
+retry:
 	dlfb->urbs.size = size;
 	INIT_LIST_HEAD(&dlfb->urbs.list);
 
-	while (i < count) {
+	sema_init(&dlfb->urbs.limit_sem, 0);
+	dlfb->urbs.count = 0;
+	dlfb->urbs.available = 0;
+
+	while (dlfb->urbs.count * size < wanted_size) {
 		unode = kzalloc(sizeof(*unode), GFP_KERNEL);
 		if (!unode)
 			break;
@@ -1866,11 +1871,16 @@ static int dlfb_alloc_urb_list(struct dl
 		}
 		unode->urb = urb;
 
-		buf = usb_alloc_coherent(dlfb->udev, MAX_TRANSFER, GFP_KERNEL,
+		buf = usb_alloc_coherent(dlfb->udev, size, GFP_KERNEL,
 					 &urb->transfer_dma);
 		if (!buf) {
 			kfree(unode);
 			usb_free_urb(urb);
+			if (size > PAGE_SIZE) {
+				size /= 2;
+				dlfb_free_urb_list(dlfb);
+				goto retry;
+			}
 			break;
 		}
 
@@ -1881,14 +1891,12 @@ static int dlfb_alloc_urb_list(struct dl
 
 		list_add_tail(&unode->entry, &dlfb->urbs.list);
 
-		i++;
+		up(&dlfb->urbs.limit_sem);
+		dlfb->urbs.count++;
+		dlfb->urbs.available++;
 	}
 
-	sema_init(&dlfb->urbs.limit_sem, i);
-	dlfb->urbs.count = i;
-	dlfb->urbs.available = i;
-
-	return i;
+	return dlfb->urbs.count;
 }
 
 static struct urb *dlfb_get_urb(struct dlfb_data *dlfb)

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 17/21] udlfb: set line_length in dlfb_ops_set_par
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Set the variable "line_length" in the function dlfb_ops_set_par. Without
this, we get garbage if we select different videomode with fbset.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:50:00.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:50:00.000000000 +0200
@@ -1057,6 +1057,7 @@ static int dlfb_ops_set_par(struct fb_in
 		return result;
 
 	dlfb->current_mode = fvs;
+	info->fix.line_length = info->var.xres * (info->var.bits_per_pixel / 8);
 
 	if (dlfb->fb_count = 0) {
 


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 17/21] udlfb: set line_length in dlfb_ops_set_par
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-line-length.patch --]
[-- Type: text/plain, Size: 965 bytes --]

Set the variable "line_length" in the function dlfb_ops_set_par. Without
this, we get garbage if we select different videomode with fbset.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/video/fbdev/udlfb.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:50:00.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:50:00.000000000 +0200
@@ -1057,6 +1057,7 @@ static int dlfb_ops_set_par(struct fb_in
 		return result;
 
 	dlfb->current_mode = fvs;
+	info->fix.line_length = info->var.xres * (info->var.bits_per_pixel / 8);
 
 	if (dlfb->fb_count == 0) {
 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 18/21] udlfb: allow reallocating the framebuffer
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

This patch changes udlfb so that it may reallocate the framebuffer when
setting higher-resolution mode. If we boot the system without monitor
attached, udlfb creates a framebuffer with the size 800x600. This patch
makes it possible to select higher videomode with the fbset command when
a monitor is attached.

Note that there is no reliable way to prevent the system from touching the
old framebuffer, so we must not free it. We add it to the list
dlfb->deferred_free and free it when the driver is unloaded.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |   70 +++++++++++++++++++++++++++++---------------
 include/video/udlfb.h       |    1 
 2 files changed, 48 insertions(+), 23 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:41.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:41.000000000 +0200
@@ -73,6 +73,13 @@ static bool fb_defio = 1;  /* Detect mma
 static bool shadow = 1; /* Optionally disable shadow framebuffer */
 static int pixel_limit; /* Optionally force a pixel resolution limit */
 
+struct dlfb_deferred_free {
+	struct list_head list;
+	void *mem;
+};
+
+static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len);
+
 /* dlfb keeps a list of urbs for efficient bulk transfers */
 static void dlfb_urb_completion(struct urb *urb);
 static struct urb *dlfb_get_urb(struct dlfb_data *dlfb);
@@ -927,6 +934,12 @@ static void dlfb_free(struct kref *kref)
 {
 	struct dlfb_data *dlfb = container_of(kref, struct dlfb_data, kref);
 
+	while (!list_empty(&dlfb->deferred_free)) {
+		struct dlfb_deferred_free *d = list_entry(dlfb->deferred_free.next, struct dlfb_deferred_free, list);
+		list_del(&d->list);
+		vfree(d->mem);
+		kfree(d);
+	}
 	vfree(dlfb->backing_buffer);
 	kfree(dlfb->edid);
 	kfree(dlfb);
@@ -1020,10 +1033,6 @@ static int dlfb_ops_check_var(struct fb_
 	struct fb_videomode mode;
 	struct dlfb_data *dlfb = info->par;
 
-	/* TODO: support dynamically changing framebuffer size */
-	if ((var->xres * var->yres * 2) > info->fix.smem_len)
-		return -EINVAL;
-
 	/* set device-specific elements of var unrelated to mode */
 	dlfb_var_color_format(var);
 
@@ -1042,6 +1051,7 @@ static int dlfb_ops_set_par(struct fb_in
 	u16 *pix_framebuffer;
 	int i;
 	struct fb_var_screeninfo fvs;
+	u32 line_length = info->var.xres * (info->var.bits_per_pixel / 8);
 
 	/* clear the activate field because it causes spurious miscompares */
 	fvs = info->var;
@@ -1051,13 +1061,17 @@ static int dlfb_ops_set_par(struct fb_in
 	if (!memcmp(&dlfb->current_mode, &fvs, sizeof(struct fb_var_screeninfo)))
 		return 0;
 
+	result = dlfb_realloc_framebuffer(dlfb, info, info->var.yres * line_length);
+	if (result)
+		return result;
+
 	result = dlfb_set_video_mode(dlfb, &info->var);
 
 	if (result)
 		return result;
 
 	dlfb->current_mode = fvs;
-	info->fix.line_length = info->var.xres * (info->var.bits_per_pixel / 8);
+	info->fix.line_length = line_length;
 
 	if (dlfb->fb_count = 0) {
 
@@ -1066,11 +1080,11 @@ static int dlfb_ops_set_par(struct fb_in
 		pix_framebuffer = (u16 *) info->screen_base;
 		for (i = 0; i < info->fix.smem_len / 2; i++)
 			pix_framebuffer[i] = 0x37e6;
-
-		dlfb_handle_damage(dlfb, 0, 0, info->var.xres, info->var.yres,
-				   info->screen_base);
 	}
 
+	dlfb_handle_damage(dlfb, 0, 0, info->var.xres, info->var.yres,
+			   info->screen_base);
+
 	return 0;
 }
 
@@ -1146,21 +1160,29 @@ static struct fb_ops dlfb_ops = {
 };
 
 
+static void dlfb_deferred_vfree(struct dlfb_data *dlfb, void *mem)
+{
+	struct dlfb_deferred_free *d = kmalloc(sizeof(struct dlfb_deferred_free), GFP_KERNEL);
+	if (!d)
+		return;
+	d->mem = mem;
+	list_add(&d->list, &dlfb->deferred_free);
+}
+
 /*
  * Assumes &info->lock held by caller
  * Assumes no active clients have framebuffer open
  */
-static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info)
+static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len)
 {
-	int old_len = info->fix.smem_len;
-	int new_len;
+	u32 old_len = info->fix.smem_len;
 	unsigned char *old_fb = info->screen_base;
 	unsigned char *new_fb;
 	unsigned char *new_back = NULL;
 
-	new_len = info->fix.line_length * info->var.yres;
+	new_len = PAGE_ALIGN(new_len);
 
-	if (PAGE_ALIGN(new_len) > old_len) {
+	if (new_len > old_len) {
 		/*
 		 * Alloc system memory for virtual framebuffer
 		 */
@@ -1169,14 +1191,15 @@ static int dlfb_realloc_framebuffer(stru
 			dev_err(info->dev, "Virtual framebuffer alloc failed\n");
 			return -ENOMEM;
 		}
+		memset(new_fb, 0xff, new_len);
 
 		if (info->screen_base) {
 			memcpy(new_fb, old_fb, old_len);
-			vfree(info->screen_base);
+			dlfb_deferred_vfree(dlfb, info->screen_base);
 		}
 
 		info->screen_base = new_fb;
-		info->fix.smem_len = PAGE_ALIGN(new_len);
+		info->fix.smem_len = new_len;
 		info->fix.smem_start = (unsigned long) new_fb;
 		info->flags = udlfb_info_flags;
 
@@ -1192,7 +1215,7 @@ static int dlfb_realloc_framebuffer(stru
 			dev_info(info->dev,
 				 "No shadow/backing buffer allocated\n");
 		else {
-			vfree(dlfb->backing_buffer);
+			dlfb_deferred_vfree(dlfb, dlfb->backing_buffer);
 			dlfb->backing_buffer = new_back;
 		}
 	}
@@ -1344,11 +1367,6 @@ static int dlfb_setup_modes(struct dlfb_
 		 * with mode size info, we can now alloc our framebuffer.
 		 */
 		memcpy(&info->fix, &dlfb_fix, sizeof(dlfb_fix));
-		info->fix.line_length = info->var.xres *
-			(info->var.bits_per_pixel / 8);
-
-		result = dlfb_realloc_framebuffer(dlfb, info);
-
 	} else
 		result = -EINVAL;
 
@@ -1436,7 +1454,10 @@ static ssize_t edid_store(
 	if (!dlfb->edid || memcmp(src, dlfb->edid, src_size))
 		return -EINVAL;
 
-	dlfb_ops_set_par(fb_info);
+	ret = dlfb_ops_set_par(fb_info);
+	if (ret)
+		return ret;
+
 	return src_size;
 }
 
@@ -1596,6 +1617,7 @@ static int dlfb_usb_probe(struct usb_int
 	}
 
 	kref_init(&dlfb->kref); /* matching kref_put in usb .disconnect fn */
+	INIT_LIST_HEAD(&dlfb->deferred_free);
 
 	dlfb->udev = usbdev;
 	usb_set_intfdata(intf, dlfb);
@@ -1693,7 +1715,9 @@ static void dlfb_init_framebuffer_work(s
 	dlfb_select_std_channel(dlfb);
 
 	dlfb_ops_check_var(&info->var, info);
-	dlfb_ops_set_par(info);
+	retval = dlfb_ops_set_par(info);
+	if (retval)
+		goto error;
 
 	retval = register_framebuffer(info);
 	if (retval < 0) {
Index: linux-4.17-rc7/include/video/udlfb.h
=================================--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-06-03 13:17:41.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-06-03 13:17:41.000000000 +0200
@@ -58,6 +58,7 @@ struct dlfb_data {
 	atomic_t bytes_sent; /* to usb, after compression including overhead */
 	atomic_t cpu_kcycles_used; /* transpired during pixel processing */
 	struct fb_var_screeninfo current_mode;
+	struct list_head deferred_free;
 };
 
 #define NR_USB_REQUEST_I2C_SUB_IO 0x02


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 18/21] udlfb: allow reallocating the framebuffer
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-realloc-framebuffer.patch --]
[-- Type: text/plain, Size: 7249 bytes --]

This patch changes udlfb so that it may reallocate the framebuffer when
setting higher-resolution mode. If we boot the system without monitor
attached, udlfb creates a framebuffer with the size 800x600. This patch
makes it possible to select higher videomode with the fbset command when
a monitor is attached.

Note that there is no reliable way to prevent the system from touching the
old framebuffer, so we must not free it. We add it to the list
dlfb->deferred_free and free it when the driver is unloaded.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |   70 +++++++++++++++++++++++++++++---------------
 include/video/udlfb.h       |    1 
 2 files changed, 48 insertions(+), 23 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:41.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:41.000000000 +0200
@@ -73,6 +73,13 @@ static bool fb_defio = 1;  /* Detect mma
 static bool shadow = 1; /* Optionally disable shadow framebuffer */
 static int pixel_limit; /* Optionally force a pixel resolution limit */
 
+struct dlfb_deferred_free {
+	struct list_head list;
+	void *mem;
+};
+
+static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len);
+
 /* dlfb keeps a list of urbs for efficient bulk transfers */
 static void dlfb_urb_completion(struct urb *urb);
 static struct urb *dlfb_get_urb(struct dlfb_data *dlfb);
@@ -927,6 +934,12 @@ static void dlfb_free(struct kref *kref)
 {
 	struct dlfb_data *dlfb = container_of(kref, struct dlfb_data, kref);
 
+	while (!list_empty(&dlfb->deferred_free)) {
+		struct dlfb_deferred_free *d = list_entry(dlfb->deferred_free.next, struct dlfb_deferred_free, list);
+		list_del(&d->list);
+		vfree(d->mem);
+		kfree(d);
+	}
 	vfree(dlfb->backing_buffer);
 	kfree(dlfb->edid);
 	kfree(dlfb);
@@ -1020,10 +1033,6 @@ static int dlfb_ops_check_var(struct fb_
 	struct fb_videomode mode;
 	struct dlfb_data *dlfb = info->par;
 
-	/* TODO: support dynamically changing framebuffer size */
-	if ((var->xres * var->yres * 2) > info->fix.smem_len)
-		return -EINVAL;
-
 	/* set device-specific elements of var unrelated to mode */
 	dlfb_var_color_format(var);
 
@@ -1042,6 +1051,7 @@ static int dlfb_ops_set_par(struct fb_in
 	u16 *pix_framebuffer;
 	int i;
 	struct fb_var_screeninfo fvs;
+	u32 line_length = info->var.xres * (info->var.bits_per_pixel / 8);
 
 	/* clear the activate field because it causes spurious miscompares */
 	fvs = info->var;
@@ -1051,13 +1061,17 @@ static int dlfb_ops_set_par(struct fb_in
 	if (!memcmp(&dlfb->current_mode, &fvs, sizeof(struct fb_var_screeninfo)))
 		return 0;
 
+	result = dlfb_realloc_framebuffer(dlfb, info, info->var.yres * line_length);
+	if (result)
+		return result;
+
 	result = dlfb_set_video_mode(dlfb, &info->var);
 
 	if (result)
 		return result;
 
 	dlfb->current_mode = fvs;
-	info->fix.line_length = info->var.xres * (info->var.bits_per_pixel / 8);
+	info->fix.line_length = line_length;
 
 	if (dlfb->fb_count == 0) {
 
@@ -1066,11 +1080,11 @@ static int dlfb_ops_set_par(struct fb_in
 		pix_framebuffer = (u16 *) info->screen_base;
 		for (i = 0; i < info->fix.smem_len / 2; i++)
 			pix_framebuffer[i] = 0x37e6;
-
-		dlfb_handle_damage(dlfb, 0, 0, info->var.xres, info->var.yres,
-				   info->screen_base);
 	}
 
+	dlfb_handle_damage(dlfb, 0, 0, info->var.xres, info->var.yres,
+			   info->screen_base);
+
 	return 0;
 }
 
@@ -1146,21 +1160,29 @@ static struct fb_ops dlfb_ops = {
 };
 
 
+static void dlfb_deferred_vfree(struct dlfb_data *dlfb, void *mem)
+{
+	struct dlfb_deferred_free *d = kmalloc(sizeof(struct dlfb_deferred_free), GFP_KERNEL);
+	if (!d)
+		return;
+	d->mem = mem;
+	list_add(&d->list, &dlfb->deferred_free);
+}
+
 /*
  * Assumes &info->lock held by caller
  * Assumes no active clients have framebuffer open
  */
-static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info)
+static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len)
 {
-	int old_len = info->fix.smem_len;
-	int new_len;
+	u32 old_len = info->fix.smem_len;
 	unsigned char *old_fb = info->screen_base;
 	unsigned char *new_fb;
 	unsigned char *new_back = NULL;
 
-	new_len = info->fix.line_length * info->var.yres;
+	new_len = PAGE_ALIGN(new_len);
 
-	if (PAGE_ALIGN(new_len) > old_len) {
+	if (new_len > old_len) {
 		/*
 		 * Alloc system memory for virtual framebuffer
 		 */
@@ -1169,14 +1191,15 @@ static int dlfb_realloc_framebuffer(stru
 			dev_err(info->dev, "Virtual framebuffer alloc failed\n");
 			return -ENOMEM;
 		}
+		memset(new_fb, 0xff, new_len);
 
 		if (info->screen_base) {
 			memcpy(new_fb, old_fb, old_len);
-			vfree(info->screen_base);
+			dlfb_deferred_vfree(dlfb, info->screen_base);
 		}
 
 		info->screen_base = new_fb;
-		info->fix.smem_len = PAGE_ALIGN(new_len);
+		info->fix.smem_len = new_len;
 		info->fix.smem_start = (unsigned long) new_fb;
 		info->flags = udlfb_info_flags;
 
@@ -1192,7 +1215,7 @@ static int dlfb_realloc_framebuffer(stru
 			dev_info(info->dev,
 				 "No shadow/backing buffer allocated\n");
 		else {
-			vfree(dlfb->backing_buffer);
+			dlfb_deferred_vfree(dlfb, dlfb->backing_buffer);
 			dlfb->backing_buffer = new_back;
 		}
 	}
@@ -1344,11 +1367,6 @@ static int dlfb_setup_modes(struct dlfb_
 		 * with mode size info, we can now alloc our framebuffer.
 		 */
 		memcpy(&info->fix, &dlfb_fix, sizeof(dlfb_fix));
-		info->fix.line_length = info->var.xres *
-			(info->var.bits_per_pixel / 8);
-
-		result = dlfb_realloc_framebuffer(dlfb, info);
-
 	} else
 		result = -EINVAL;
 
@@ -1436,7 +1454,10 @@ static ssize_t edid_store(
 	if (!dlfb->edid || memcmp(src, dlfb->edid, src_size))
 		return -EINVAL;
 
-	dlfb_ops_set_par(fb_info);
+	ret = dlfb_ops_set_par(fb_info);
+	if (ret)
+		return ret;
+
 	return src_size;
 }
 
@@ -1596,6 +1617,7 @@ static int dlfb_usb_probe(struct usb_int
 	}
 
 	kref_init(&dlfb->kref); /* matching kref_put in usb .disconnect fn */
+	INIT_LIST_HEAD(&dlfb->deferred_free);
 
 	dlfb->udev = usbdev;
 	usb_set_intfdata(intf, dlfb);
@@ -1693,7 +1715,9 @@ static void dlfb_init_framebuffer_work(s
 	dlfb_select_std_channel(dlfb);
 
 	dlfb_ops_check_var(&info->var, info);
-	dlfb_ops_set_par(info);
+	retval = dlfb_ops_set_par(info);
+	if (retval)
+		goto error;
 
 	retval = register_framebuffer(info);
 	if (retval < 0) {
Index: linux-4.17-rc7/include/video/udlfb.h
===================================================================
--- linux-4.17-rc7.orig/include/video/udlfb.h	2018-06-03 13:17:41.000000000 +0200
+++ linux-4.17-rc7/include/video/udlfb.h	2018-06-03 13:17:41.000000000 +0200
@@ -58,6 +58,7 @@ struct dlfb_data {
 	atomic_t bytes_sent; /* to usb, after compression including overhead */
 	atomic_t cpu_kcycles_used; /* transpired during pixel processing */
 	struct fb_var_screeninfo current_mode;
+	struct list_head deferred_free;
 };
 
 #define NR_USB_REQUEST_I2C_SUB_IO 0x02

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 19/21] udlfb: optimization - test the backing buffer
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Currently, the udlfb driver only tests for identical bytes at the
beginning or at the end of a page and renders anything between the first
and last mismatching pixel. But pages are not the same as lines, so this
is quite suboptimal - if there is something modified at the beginning of a
page and at the end of a page, the whole page is rendered, even if most of
the page is not modified.

This patch makes it test for identical pixels at the beginning and end of
each rendering command. This patch improves identical byte detection by
41% when playing video in a window.

This patch also fixes a possible screen corruption if the user is writing
to the framebuffer while dlfb_render_hline is in progress - the pixel data
that is copied to the backbuffer with memcpy may be different from the
pixel data that is actually rendered to the hardware (because the content
of the framebuffer may change between memcpy and the rendering command).
We must make sure that we copy exactly the same pixel as the pixel that is
being rendered.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |   45 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 11 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:51:43.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:51:43.000000000 +0200
@@ -431,7 +431,9 @@ static void dlfb_compress_hline(
 	const uint16_t *const pixel_end,
 	uint32_t *device_address_ptr,
 	uint8_t **command_buffer_ptr,
-	const uint8_t *const cmd_buffer_end)
+	const uint8_t *const cmd_buffer_end,
+	unsigned long back_buffer_offset,
+	int *ident_ptr)
 {
 	const uint16_t *pixel = *pixel_start_ptr;
 	uint32_t dev_addr  = *device_address_ptr;
@@ -444,6 +446,14 @@ static void dlfb_compress_hline(
 		const uint16_t *raw_pixel_start = NULL;
 		const uint16_t *cmd_pixel_start, *cmd_pixel_end = NULL;
 
+		if (back_buffer_offset &&
+		    *pixel = *(u16 *)((u8 *)pixel + back_buffer_offset)) {
+			pixel++;
+			dev_addr += BPP;
+			(*ident_ptr)++;
+			continue;
+		}
+
 		prefetchw((void *) cmd); /* pull in one cache line at least */
 
 		*cmd++ = 0xAF;
@@ -462,25 +472,37 @@ static void dlfb_compress_hline(
 					(unsigned long)(pixel_end - pixel),
 					(unsigned long)(cmd_buffer_end - 1 - cmd) / BPP);
 
+		if (back_buffer_offset) {
+			/* note: the framebuffer may change under us, so we must test for underflow */
+			while (cmd_pixel_end - 1 > pixel &&
+			       *(cmd_pixel_end - 1) = *(u16 *)((u8 *)(cmd_pixel_end - 1) + back_buffer_offset))
+				cmd_pixel_end--;
+		}
+
 		prefetch_range((void *) pixel, (u8 *)cmd_pixel_end - (u8 *)pixel);
 
 		while (pixel < cmd_pixel_end) {
 			const uint16_t * const repeating_pixel = pixel;
+			u16 pixel_value = *pixel;
 
-			put_unaligned_be16(*pixel, cmd);
+			put_unaligned_be16(pixel_value, cmd);
+			if (back_buffer_offset)
+				*(u16 *)((u8 *)pixel + back_buffer_offset) = pixel_value;
 			cmd += 2;
 			pixel++;
 
 			if (unlikely((pixel < cmd_pixel_end) &&
-				     (*pixel = *repeating_pixel))) {
+				     (*pixel = pixel_value))) {
 				/* go back and fill in raw pixel count */
 				*raw_pixels_count_byte = ((repeating_pixel -
 						raw_pixel_start) + 1) & 0xFF;
 
-				while ((pixel < cmd_pixel_end)
-				       && (*pixel = *repeating_pixel)) {
-					pixel++;
-				}
+				do {
+					if (back_buffer_offset)
+						*(u16 *)((u8 *)pixel + back_buffer_offset) = pixel_value;
+ 					pixel++;
+				} while ((pixel < cmd_pixel_end) &&
+					 (*pixel = pixel_value));
 
 				/* immediately after raw data is repeat byte */
 				*cmd++ = ((pixel - repeating_pixel) - 1) & 0xFF;
@@ -531,6 +553,7 @@ static int dlfb_render_hline(struct dlfb
 	struct urb *urb = *urb_ptr;
 	u8 *cmd = *urb_buf_ptr;
 	u8 *cmd_end = (u8 *) urb->transfer_buffer + urb->transfer_buffer_length;
+	unsigned long back_buffer_offset = 0;
 
 	line_start = (u8 *) (front + byte_offset);
 	next_pixel = line_start;
@@ -541,6 +564,8 @@ static int dlfb_render_hline(struct dlfb
 		const u8 *back_start = (u8 *) (dlfb->backing_buffer
 						+ byte_offset);
 
+		back_buffer_offset = (unsigned long)back_start - (unsigned long)line_start;
+
 		*ident_ptr += dlfb_trim_hline(back_start, &next_pixel,
 			&byte_width);
 
@@ -549,16 +574,14 @@ static int dlfb_render_hline(struct dlfb
 		dev_addr += offset;
 		back_start += offset;
 		line_start += offset;
-
-		memcpy((char *)back_start, (char *) line_start,
-		       byte_width);
 	}
 
 	while (next_pixel < line_end) {
 
 		dlfb_compress_hline((const uint16_t **) &next_pixel,
 			     (const uint16_t *) line_end, &dev_addr,
-			(u8 **) &cmd, (u8 *) cmd_end);
+			(u8 **) &cmd, (u8 *) cmd_end, back_buffer_offset,
+			ident_ptr);
 
 		if (cmd >= cmd_end) {
 			int len = cmd - (u8 *) urb->transfer_buffer;


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 19/21] udlfb: optimization - test the backing buffer
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-test-backing-buffer.patch --]
[-- Type: text/plain, Size: 5093 bytes --]

Currently, the udlfb driver only tests for identical bytes at the
beginning or at the end of a page and renders anything between the first
and last mismatching pixel. But pages are not the same as lines, so this
is quite suboptimal - if there is something modified at the beginning of a
page and at the end of a page, the whole page is rendered, even if most of
the page is not modified.

This patch makes it test for identical pixels at the beginning and end of
each rendering command. This patch improves identical byte detection by
41% when playing video in a window.

This patch also fixes a possible screen corruption if the user is writing
to the framebuffer while dlfb_render_hline is in progress - the pixel data
that is copied to the backbuffer with memcpy may be different from the
pixel data that is actually rendered to the hardware (because the content
of the framebuffer may change between memcpy and the rendering command).
We must make sure that we copy exactly the same pixel as the pixel that is
being rendered.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |   45 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 11 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 14:51:43.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 14:51:43.000000000 +0200
@@ -431,7 +431,9 @@ static void dlfb_compress_hline(
 	const uint16_t *const pixel_end,
 	uint32_t *device_address_ptr,
 	uint8_t **command_buffer_ptr,
-	const uint8_t *const cmd_buffer_end)
+	const uint8_t *const cmd_buffer_end,
+	unsigned long back_buffer_offset,
+	int *ident_ptr)
 {
 	const uint16_t *pixel = *pixel_start_ptr;
 	uint32_t dev_addr  = *device_address_ptr;
@@ -444,6 +446,14 @@ static void dlfb_compress_hline(
 		const uint16_t *raw_pixel_start = NULL;
 		const uint16_t *cmd_pixel_start, *cmd_pixel_end = NULL;
 
+		if (back_buffer_offset &&
+		    *pixel == *(u16 *)((u8 *)pixel + back_buffer_offset)) {
+			pixel++;
+			dev_addr += BPP;
+			(*ident_ptr)++;
+			continue;
+		}
+
 		prefetchw((void *) cmd); /* pull in one cache line at least */
 
 		*cmd++ = 0xAF;
@@ -462,25 +472,37 @@ static void dlfb_compress_hline(
 					(unsigned long)(pixel_end - pixel),
 					(unsigned long)(cmd_buffer_end - 1 - cmd) / BPP);
 
+		if (back_buffer_offset) {
+			/* note: the framebuffer may change under us, so we must test for underflow */
+			while (cmd_pixel_end - 1 > pixel &&
+			       *(cmd_pixel_end - 1) == *(u16 *)((u8 *)(cmd_pixel_end - 1) + back_buffer_offset))
+				cmd_pixel_end--;
+		}
+
 		prefetch_range((void *) pixel, (u8 *)cmd_pixel_end - (u8 *)pixel);
 
 		while (pixel < cmd_pixel_end) {
 			const uint16_t * const repeating_pixel = pixel;
+			u16 pixel_value = *pixel;
 
-			put_unaligned_be16(*pixel, cmd);
+			put_unaligned_be16(pixel_value, cmd);
+			if (back_buffer_offset)
+				*(u16 *)((u8 *)pixel + back_buffer_offset) = pixel_value;
 			cmd += 2;
 			pixel++;
 
 			if (unlikely((pixel < cmd_pixel_end) &&
-				     (*pixel == *repeating_pixel))) {
+				     (*pixel == pixel_value))) {
 				/* go back and fill in raw pixel count */
 				*raw_pixels_count_byte = ((repeating_pixel -
 						raw_pixel_start) + 1) & 0xFF;
 
-				while ((pixel < cmd_pixel_end)
-				       && (*pixel == *repeating_pixel)) {
-					pixel++;
-				}
+				do {
+					if (back_buffer_offset)
+						*(u16 *)((u8 *)pixel + back_buffer_offset) = pixel_value;
+ 					pixel++;
+				} while ((pixel < cmd_pixel_end) &&
+					 (*pixel == pixel_value));
 
 				/* immediately after raw data is repeat byte */
 				*cmd++ = ((pixel - repeating_pixel) - 1) & 0xFF;
@@ -531,6 +553,7 @@ static int dlfb_render_hline(struct dlfb
 	struct urb *urb = *urb_ptr;
 	u8 *cmd = *urb_buf_ptr;
 	u8 *cmd_end = (u8 *) urb->transfer_buffer + urb->transfer_buffer_length;
+	unsigned long back_buffer_offset = 0;
 
 	line_start = (u8 *) (front + byte_offset);
 	next_pixel = line_start;
@@ -541,6 +564,8 @@ static int dlfb_render_hline(struct dlfb
 		const u8 *back_start = (u8 *) (dlfb->backing_buffer
 						+ byte_offset);
 
+		back_buffer_offset = (unsigned long)back_start - (unsigned long)line_start;
+
 		*ident_ptr += dlfb_trim_hline(back_start, &next_pixel,
 			&byte_width);
 
@@ -549,16 +574,14 @@ static int dlfb_render_hline(struct dlfb
 		dev_addr += offset;
 		back_start += offset;
 		line_start += offset;
-
-		memcpy((char *)back_start, (char *) line_start,
-		       byte_width);
 	}
 
 	while (next_pixel < line_end) {
 
 		dlfb_compress_hline((const uint16_t **) &next_pixel,
 			     (const uint16_t *) line_end, &dev_addr,
-			(u8 **) &cmd, (u8 *) cmd_end);
+			(u8 **) &cmd, (u8 *) cmd_end, back_buffer_offset,
+			ident_ptr);
 
 		if (cmd >= cmd_end) {
 			int len = cmd - (u8 *) urb->transfer_buffer;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 20/21] udlfb: avoid prefetch
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

Modern processors can detect linear memory accesses and prefetch data
automatically, so there's no need to use prefetch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |    8 --------
 1 file changed, 8 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 12:48:35.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 12:48:35.000000000 +0200
@@ -25,7 +25,6 @@
 #include <linux/fb.h>
 #include <linux/vmalloc.h>
 #include <linux/slab.h>
-#include <linux/prefetch.h>
 #include <linux/delay.h>
 #include <asm/unaligned.h>
 #include <video/udlfb.h>
@@ -375,9 +374,6 @@ static int dlfb_trim_hline(const u8 *bba
 	int start = width;
 	int end = width;
 
-	prefetch((void *) front);
-	prefetch((void *) back);
-
 	for (j = 0; j < width; j++) {
 		if (back[j] != front[j]) {
 			start = j;
@@ -454,8 +450,6 @@ static void dlfb_compress_hline(
 			continue;
 		}
 
-		prefetchw((void *) cmd); /* pull in one cache line at least */
-
 		*cmd++ = 0xAF;
 		*cmd++ = 0x6B;
 		*cmd++ = dev_addr >> 16;
@@ -479,8 +473,6 @@ static void dlfb_compress_hline(
 				cmd_pixel_end--;
 		}
 
-		prefetch_range((void *) pixel, (u8 *)cmd_pixel_end - (u8 *)pixel);
-
 		while (pixel < cmd_pixel_end) {
 			const uint16_t * const repeating_pixel = pixel;
 			u16 pixel_value = *pixel;


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 20/21] udlfb: avoid prefetch
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-avoid-prefetch.patch --]
[-- Type: text/plain, Size: 1625 bytes --]

Modern processors can detect linear memory accesses and prefetch data
automatically, so there's no need to use prefetch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |    8 --------
 1 file changed, 8 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-05-31 12:48:35.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-05-31 12:48:35.000000000 +0200
@@ -25,7 +25,6 @@
 #include <linux/fb.h>
 #include <linux/vmalloc.h>
 #include <linux/slab.h>
-#include <linux/prefetch.h>
 #include <linux/delay.h>
 #include <asm/unaligned.h>
 #include <video/udlfb.h>
@@ -375,9 +374,6 @@ static int dlfb_trim_hline(const u8 *bba
 	int start = width;
 	int end = width;
 
-	prefetch((void *) front);
-	prefetch((void *) back);
-
 	for (j = 0; j < width; j++) {
 		if (back[j] != front[j]) {
 			start = j;
@@ -454,8 +450,6 @@ static void dlfb_compress_hline(
 			continue;
 		}
 
-		prefetchw((void *) cmd); /* pull in one cache line at least */
-
 		*cmd++ = 0xAF;
 		*cmd++ = 0x6B;
 		*cmd++ = dev_addr >> 16;
@@ -479,8 +473,6 @@ static void dlfb_compress_hline(
 				cmd_pixel_end--;
 		}
 
-		prefetch_range((void *) pixel, (u8 *)cmd_pixel_end - (u8 *)pixel);
-
 		while (pixel < cmd_pixel_end) {
 			const uint16_t * const repeating_pixel = pixel;
 			u16 pixel_value = *pixel;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 21/21] udlfb: use spin_lock_irq instead of spin_lock_irqsave
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-03 14:41   ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

spin_lock_irqsave and spin_unlock_irqrestore is inteded to be called from
a context where it is unknown if interrupts are enabled or disabled (such
as interrupt handlers). From a process context, we should call
spin_lock_irq and spin_unlock_irq, that avoids the costly pushf and popf
instructions.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
=================================--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:46.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:46.000000000 +0200
@@ -1855,18 +1855,17 @@ static void dlfb_free_urb_list(struct dl
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	unsigned long flags;
 
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
 		down(&dlfb->urbs.limit_sem);
 
-		spin_lock_irqsave(&dlfb->urbs.lock, flags);
+		spin_lock_irq(&dlfb->urbs.lock);
 
 		node = dlfb->urbs.list.next; /* have reserved one with sem */
 		list_del_init(node);
 
-		spin_unlock_irqrestore(&dlfb->urbs.lock, flags);
+		spin_unlock_irq(&dlfb->urbs.lock);
 
 		unode = list_entry(node, struct urb_node, entry);
 		urb = unode->urb;
@@ -1944,7 +1943,6 @@ static struct urb *dlfb_get_urb(struct d
 	int ret;
 	struct list_head *entry;
 	struct urb_node *unode;
-	unsigned long flags;
 
 	/* Wait for an in-flight buffer to complete and get re-queued */
 	ret = down_timeout(&dlfb->urbs.limit_sem, GET_URB_TIMEOUT);
@@ -1956,14 +1954,14 @@ static struct urb *dlfb_get_urb(struct d
 		return NULL;
 	}
 
-	spin_lock_irqsave(&dlfb->urbs.lock, flags);
+	spin_lock_irq(&dlfb->urbs.lock);
 
 	BUG_ON(list_empty(&dlfb->urbs.list)); /* reserved one with limit_sem */
 	entry = dlfb->urbs.list.next;
 	list_del_init(entry);
 	dlfb->urbs.available--;
 
-	spin_unlock_irqrestore(&dlfb->urbs.lock, flags);
+	spin_unlock_irq(&dlfb->urbs.lock);
 
 	unode = list_entry(entry, struct urb_node, entry);
 	return unode->urb;


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 21/21] udlfb: use spin_lock_irq instead of spin_lock_irqsave
@ 2018-06-03 14:41   ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-03 14:41 UTC (permalink / raw)
  To: Mikulas Patocka, Bartlomiej Zolnierkiewicz, Dave Airlie,
	Bernie Thompson, Ladislav Michl
  Cc: linux-fbdev, dri-devel

[-- Attachment #1: udl-spin-lock-irq.patch --]
[-- Type: text/plain, Size: 2292 bytes --]

spin_lock_irqsave and spin_unlock_irqrestore is inteded to be called from
a context where it is unknown if interrupts are enabled or disabled (such
as interrupt handlers). From a process context, we should call
spin_lock_irq and spin_unlock_irq, that avoids the costly pushf and popf
instructions.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/video/fbdev/udlfb.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Index: linux-4.17-rc7/drivers/video/fbdev/udlfb.c
===================================================================
--- linux-4.17-rc7.orig/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:46.000000000 +0200
+++ linux-4.17-rc7/drivers/video/fbdev/udlfb.c	2018-06-03 13:17:46.000000000 +0200
@@ -1855,18 +1855,17 @@ static void dlfb_free_urb_list(struct dl
 	struct list_head *node;
 	struct urb_node *unode;
 	struct urb *urb;
-	unsigned long flags;
 
 	/* keep waiting and freeing, until we've got 'em all */
 	while (count--) {
 		down(&dlfb->urbs.limit_sem);
 
-		spin_lock_irqsave(&dlfb->urbs.lock, flags);
+		spin_lock_irq(&dlfb->urbs.lock);
 
 		node = dlfb->urbs.list.next; /* have reserved one with sem */
 		list_del_init(node);
 
-		spin_unlock_irqrestore(&dlfb->urbs.lock, flags);
+		spin_unlock_irq(&dlfb->urbs.lock);
 
 		unode = list_entry(node, struct urb_node, entry);
 		urb = unode->urb;
@@ -1944,7 +1943,6 @@ static struct urb *dlfb_get_urb(struct d
 	int ret;
 	struct list_head *entry;
 	struct urb_node *unode;
-	unsigned long flags;
 
 	/* Wait for an in-flight buffer to complete and get re-queued */
 	ret = down_timeout(&dlfb->urbs.limit_sem, GET_URB_TIMEOUT);
@@ -1956,14 +1954,14 @@ static struct urb *dlfb_get_urb(struct d
 		return NULL;
 	}
 
-	spin_lock_irqsave(&dlfb->urbs.lock, flags);
+	spin_lock_irq(&dlfb->urbs.lock);
 
 	BUG_ON(list_empty(&dlfb->urbs.list)); /* reserved one with limit_sem */
 	entry = dlfb->urbs.list.next;
 	list_del_init(entry);
 	dlfb->urbs.available--;
 
-	spin_unlock_irqrestore(&dlfb->urbs.lock, flags);
+	spin_unlock_irq(&dlfb->urbs.lock);
 
 	unode = list_entry(entry, struct urb_node, entry);
 	return unode->urb;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 18/21] udlfb: allow reallocating the framebuffer
  2018-06-03 14:41   ` Mikulas Patocka
@ 2018-06-03 19:24     ` kbuild test robot
  -1 siblings, 0 replies; 75+ messages in thread
From: kbuild test robot @ 2018-06-03 19:24 UTC (permalink / raw)
  To: linux-fbdev

Hi Mikulas,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm/drm-next]
[also build test WARNING on v4.17-rc7 next-20180601]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Mikulas-Patocka/USB-DisplayLink-patches/20180603-233013
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


vim +1198 drivers/video/fbdev/udlfb.c

  1171	
  1172	/*
  1173	 * Assumes &info->lock held by caller
  1174	 * Assumes no active clients have framebuffer open
  1175	 */
  1176	static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len)
  1177	{
  1178		u32 old_len = info->fix.smem_len;
> 1179		unsigned char *old_fb = info->screen_base;
  1180		unsigned char *new_fb;
  1181		unsigned char *new_back = NULL;
  1182	
  1183		new_len = PAGE_ALIGN(new_len);
  1184	
  1185		if (new_len > old_len) {
  1186			/*
  1187			 * Alloc system memory for virtual framebuffer
  1188			 */
  1189			new_fb = vmalloc(new_len);
  1190			if (!new_fb) {
  1191				dev_err(info->dev, "Virtual framebuffer alloc failed\n");
  1192				return -ENOMEM;
  1193			}
  1194			memset(new_fb, 0xff, new_len);
  1195	
  1196			if (info->screen_base) {
  1197				memcpy(new_fb, old_fb, old_len);
> 1198				dlfb_deferred_vfree(dlfb, info->screen_base);
  1199			}
  1200	
  1201			info->screen_base = new_fb;
  1202			info->fix.smem_len = new_len;
  1203			info->fix.smem_start = (unsigned long) new_fb;
  1204			info->flags = udlfb_info_flags;
  1205	
  1206			/*
  1207			 * Second framebuffer copy to mirror the framebuffer state
  1208			 * on the physical USB device. We can function without this.
  1209			 * But with imperfect damage info we may send pixels over USB
  1210			 * that were, in fact, unchanged - wasting limited USB bandwidth
  1211			 */
  1212			if (shadow)
  1213				new_back = vzalloc(new_len);
  1214			if (!new_back)
  1215				dev_info(info->dev,
  1216					 "No shadow/backing buffer allocated\n");
  1217			else {
  1218				dlfb_deferred_vfree(dlfb, dlfb->backing_buffer);
  1219				dlfb->backing_buffer = new_back;
  1220			}
  1221		}
  1222		return 0;
  1223	}
  1224	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 18/21] udlfb: allow reallocating the framebuffer
@ 2018-06-03 19:24     ` kbuild test robot
  0 siblings, 0 replies; 75+ messages in thread
From: kbuild test robot @ 2018-06-03 19:24 UTC (permalink / raw)
  Cc: linux-fbdev, Ladislav Michl, Bartlomiej Zolnierkiewicz,
	Bernie Thompson, dri-devel, Mikulas Patocka, kbuild-all,
	Dave Airlie

Hi Mikulas,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm/drm-next]
[also build test WARNING on v4.17-rc7 next-20180601]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Mikulas-Patocka/USB-DisplayLink-patches/20180603-233013
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


vim +1198 drivers/video/fbdev/udlfb.c

  1171	
  1172	/*
  1173	 * Assumes &info->lock held by caller
  1174	 * Assumes no active clients have framebuffer open
  1175	 */
  1176	static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len)
  1177	{
  1178		u32 old_len = info->fix.smem_len;
> 1179		unsigned char *old_fb = info->screen_base;
  1180		unsigned char *new_fb;
  1181		unsigned char *new_back = NULL;
  1182	
  1183		new_len = PAGE_ALIGN(new_len);
  1184	
  1185		if (new_len > old_len) {
  1186			/*
  1187			 * Alloc system memory for virtual framebuffer
  1188			 */
  1189			new_fb = vmalloc(new_len);
  1190			if (!new_fb) {
  1191				dev_err(info->dev, "Virtual framebuffer alloc failed\n");
  1192				return -ENOMEM;
  1193			}
  1194			memset(new_fb, 0xff, new_len);
  1195	
  1196			if (info->screen_base) {
  1197				memcpy(new_fb, old_fb, old_len);
> 1198				dlfb_deferred_vfree(dlfb, info->screen_base);
  1199			}
  1200	
  1201			info->screen_base = new_fb;
  1202			info->fix.smem_len = new_len;
  1203			info->fix.smem_start = (unsigned long) new_fb;
  1204			info->flags = udlfb_info_flags;
  1205	
  1206			/*
  1207			 * Second framebuffer copy to mirror the framebuffer state
  1208			 * on the physical USB device. We can function without this.
  1209			 * But with imperfect damage info we may send pixels over USB
  1210			 * that were, in fact, unchanged - wasting limited USB bandwidth
  1211			 */
  1212			if (shadow)
  1213				new_back = vzalloc(new_len);
  1214			if (!new_back)
  1215				dev_info(info->dev,
  1216					 "No shadow/backing buffer allocated\n");
  1217			else {
  1218				dlfb_deferred_vfree(dlfb, dlfb->backing_buffer);
  1219				dlfb->backing_buffer = new_back;
  1220			}
  1221		}
  1222		return 0;
  1223	}
  1224	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-04  1:25   ` Dave Airlie
  -1 siblings, 0 replies; 75+ messages in thread
From: Dave Airlie @ 2018-06-04  1:25 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Linux Fbdev development list, Ladislav Michl,
	Bartlomiej Zolnierkiewicz, Bernie Thompson, dri-devel,
	Dave Airlie

On 4 June 2018 at 00:40, Mikulas Patocka <mpatocka@redhat.com> wrote:
> Hi
>
> Here I'm sending bug fixes and performance improvements for the USB
> DisplayLink framebuffer and modesetting drivers for this merge window.
>

Hi,

You probably want to split these up into separate series for the kms and fbdev
drivers.

Otherwise at least for drm you've missed this merge window, since it
closes around rc6 of the previous kernel, did you use git send-email
for these patches, at least some of them viewed funny on my phone,
I'll try and look over the kms ones soon. Do you have any numbers for
improvements to the kms ones?

Dave.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
@ 2018-06-04  1:25   ` Dave Airlie
  0 siblings, 0 replies; 75+ messages in thread
From: Dave Airlie @ 2018-06-04  1:25 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Linux Fbdev development list, Ladislav Michl,
	Bartlomiej Zolnierkiewicz, Bernie Thompson, dri-devel,
	Dave Airlie

On 4 June 2018 at 00:40, Mikulas Patocka <mpatocka@redhat.com> wrote:
> Hi
>
> Here I'm sending bug fixes and performance improvements for the USB
> DisplayLink framebuffer and modesetting drivers for this merge window.
>

Hi,

You probably want to split these up into separate series for the kms and fbdev
drivers.

Otherwise at least for drm you've missed this merge window, since it
closes around rc6 of the previous kernel, did you use git send-email
for these patches, at least some of them viewed funny on my phone,
I'll try and look over the kms ones soon. Do you have any numbers for
improvements to the kms ones?

Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
  2018-06-04  1:25   ` Dave Airlie
@ 2018-06-04 14:14     ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-04 14:14 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Linux Fbdev development list, Ladislav Michl,
	Bartlomiej Zolnierkiewicz, Bernie Thompson, dri-devel,
	Dave Airlie



On Mon, 4 Jun 2018, Dave Airlie wrote:

> On 4 June 2018 at 00:40, Mikulas Patocka <mpatocka@redhat.com> wrote:
> > Hi
> >
> > Here I'm sending bug fixes and performance improvements for the USB
> > DisplayLink framebuffer and modesetting drivers for this merge window.
> >
> 
> Hi,
> 
> You probably want to split these up into separate series for the kms and fbdev
> drivers.
> 
> Otherwise at least for drm you've missed this merge window, since it
> closes around rc6 of the previous kernel,

Could you apply at least the fbdefio patches (without them, fbdefio is 
unusable due to crashes) and the display corruption of the last line 
(because most people will hit it)?

> did you use git send-email
> for these patches, at least some of them viewed funny on my phone,

I used the command "quilt mail". I use quilt, not git, for management of 
my patches.

> I'll try and look over the kms ones soon. Do you have any numbers for
> improvements to the kms ones?
> 
> Dave.

I measured performance improvement on the framebuffer patches. The kms 
driver already performs well, there's not much to do.

I'd like to as you if you could review the patch "udl-kms: fix a 
linked-list corruption when using fbdefio" - for me it fixes the crashes, 
but I am not expert in modesetting drivers and I don't know if some other 
part of the kernel assumes that the framebuffer pages must be allocated 
with drm_gem_get_pages.


BTW. When I unplug the USB adapter while using the modesetting driver, I 
get this warning. Do you have an idea how to fix it?

WARNING: CPU: 0 PID: 61 at drivers/gpu/drm/drm_mode_config.c:439 drm_mode_config_cleanup+0x250/0x2b8 [drm]
Modules linked in: udlfb hid_generic usbhid hid tun bridge stp llc autofs4 binfmt_misc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_conntrack xt_multiport iptable_filter iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables x_tables pppoe pppox af_packet ppp_generic slhc udl drm_kms_helper cfbfillrect cfbimgblt cfbcopyarea drm drm_panel_orientation_quirks syscopyarea sysfillrect sysimgblt fb_sys_fops fb font snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_pcm snd_timer snd soundcore nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack sd_mod ipv6 aes_ce_blk crypto_simd cryptd aes_ce_cipher crc32_ce ghash_ce gf128mul aes_arm64 sha2_ce
 sha256_arm64 sha1_ce xhci_plat_hcd xhci_hcd sha1_generic usbcore usb_common ahci_platform libahci_platform libahci mvpp2 unix
CPU: 0 PID: 61 Comm: kworker/0:2 Not tainted 4.17.0-rc7 #1
Hardware name: Marvell 8040 MACCHIATOBin (DT)
Workqueue: usb_hub_wq hub_event [usbcore]
pstate: 80000005 (Nzcv daif -PAN -UAO)
pc : drm_mode_config_cleanup+0x250/0x2b8 [drm]
lr : drm_mode_config_cleanup+0x88/0x2b8 [drm]
sp : ffffffc13a643920
x29: ffffffc13a643920 x28: ffffffc13a63ac00
x27: ffffffc1380962c0 x26: ffffffc11b95f898
x25: ffffff8000afd1d8 x24: ffffffc11b95f800
x23: 0000000000000060 x22: ffffff8000afd240
x21: ffffffc11b95eb38 x20: ffffffc11b95e800
x19: ffffffc11b95eb30 x18: ffffffc11b95ea7c
x17: 0000007fa3591b60 x16: ffffff80081dc178
x15: ffffffc11b95ea78 x14: 0000000000000000
x13: ffffffc12b3fc000 x12: ffffffc12b3fc028
x11: ffffffc12b3fc119 x10: 000000000000001f
x9 : 0000000000000028 x8 : ffffff8000a84000
x7 : 0000000000000000 x6 : 0000000000000001
x5 : 0000000000000002 x4 : 0000000000000001
x3 : 0000000000000002 x2 : 000000000000002f
x1 : ffffffc11b95eaf8 x0 : ffffffc11b978818
Call trace:
 drm_mode_config_cleanup+0x250/0x2b8 [drm]
 udl_modeset_cleanup+0xc/0x18 [udl]
 udl_driver_unload+0x30/0x50 [udl]
 drm_dev_unregister+0x3c/0xe8 [drm]
 drm_dev_unplug+0x18/0x70 [drm]
 udl_usb_disconnect+0x30/0x40 [udl]
 usb_unbind_interface+0x6c/0x290 [usbcore]
 device_release_driver_internal+0x170/0x200
 device_release_driver+0x14/0x20
 bus_remove_device+0x118/0x128
 device_del+0x110/0x308
 usb_disable_device+0x8c/0x1f8 [usbcore]
 usb_disconnect+0xb4/0x218 [usbcore]
 usb_disconnect+0x9c/0x218 [usbcore]
 usb_disconnect+0x9c/0x218 [usbcore]
 hub_event+0xf20/0x1020 [usbcore]
 process_one_work+0x1c8/0x310
 worker_thread+0x44/0x450
 kthread+0x118/0x120
 ret_from_fork+0x10/0x18
---[ end trace 978a27ff198f1268 ]---
[drm:drm_mode_config_cleanup [drm]] *ERROR* connector DVI-I-1 leaked!

Mikulas

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
@ 2018-06-04 14:14     ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-04 14:14 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Linux Fbdev development list, Ladislav Michl,
	Bartlomiej Zolnierkiewicz, Bernie Thompson, dri-devel,
	Dave Airlie



On Mon, 4 Jun 2018, Dave Airlie wrote:

> On 4 June 2018 at 00:40, Mikulas Patocka <mpatocka@redhat.com> wrote:
> > Hi
> >
> > Here I'm sending bug fixes and performance improvements for the USB
> > DisplayLink framebuffer and modesetting drivers for this merge window.
> >
> 
> Hi,
> 
> You probably want to split these up into separate series for the kms and fbdev
> drivers.
> 
> Otherwise at least for drm you've missed this merge window, since it
> closes around rc6 of the previous kernel,

Could you apply at least the fbdefio patches (without them, fbdefio is 
unusable due to crashes) and the display corruption of the last line 
(because most people will hit it)?

> did you use git send-email
> for these patches, at least some of them viewed funny on my phone,

I used the command "quilt mail". I use quilt, not git, for management of 
my patches.

> I'll try and look over the kms ones soon. Do you have any numbers for
> improvements to the kms ones?
> 
> Dave.

I measured performance improvement on the framebuffer patches. The kms 
driver already performs well, there's not much to do.

I'd like to as you if you could review the patch "udl-kms: fix a 
linked-list corruption when using fbdefio" - for me it fixes the crashes, 
but I am not expert in modesetting drivers and I don't know if some other 
part of the kernel assumes that the framebuffer pages must be allocated 
with drm_gem_get_pages.


BTW. When I unplug the USB adapter while using the modesetting driver, I 
get this warning. Do you have an idea how to fix it?

WARNING: CPU: 0 PID: 61 at drivers/gpu/drm/drm_mode_config.c:439 drm_mode_config_cleanup+0x250/0x2b8 [drm]
Modules linked in: udlfb hid_generic usbhid hid tun bridge stp llc autofs4 binfmt_misc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_conntrack xt_multiport iptable_filter iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables x_tables pppoe pppox af_packet ppp_generic slhc udl drm_kms_helper cfbfillrect cfbimgblt cfbcopyarea drm drm_panel_orientation_quirks syscopyarea sysfillrect sysimgblt fb_sys_fops fb font snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_pcm snd_timer snd soundcore nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack sd_mod ipv6 aes_ce_blk crypto_simd cryptd aes_ce_cipher crc32_ce ghash_ce gf128mul aes_arm64 sha2_ce
 sha256_arm64 sha1_ce xhci_plat_hcd xhci_hcd sha1_generic usbcore usb_common ahci_platform libahci_platform libahci mvpp2 unix
CPU: 0 PID: 61 Comm: kworker/0:2 Not tainted 4.17.0-rc7 #1
Hardware name: Marvell 8040 MACCHIATOBin (DT)
Workqueue: usb_hub_wq hub_event [usbcore]
pstate: 80000005 (Nzcv daif -PAN -UAO)
pc : drm_mode_config_cleanup+0x250/0x2b8 [drm]
lr : drm_mode_config_cleanup+0x88/0x2b8 [drm]
sp : ffffffc13a643920
x29: ffffffc13a643920 x28: ffffffc13a63ac00
x27: ffffffc1380962c0 x26: ffffffc11b95f898
x25: ffffff8000afd1d8 x24: ffffffc11b95f800
x23: 0000000000000060 x22: ffffff8000afd240
x21: ffffffc11b95eb38 x20: ffffffc11b95e800
x19: ffffffc11b95eb30 x18: ffffffc11b95ea7c
x17: 0000007fa3591b60 x16: ffffff80081dc178
x15: ffffffc11b95ea78 x14: 0000000000000000
x13: ffffffc12b3fc000 x12: ffffffc12b3fc028
x11: ffffffc12b3fc119 x10: 000000000000001f
x9 : 0000000000000028 x8 : ffffff8000a84000
x7 : 0000000000000000 x6 : 0000000000000001
x5 : 0000000000000002 x4 : 0000000000000001
x3 : 0000000000000002 x2 : 000000000000002f
x1 : ffffffc11b95eaf8 x0 : ffffffc11b978818
Call trace:
 drm_mode_config_cleanup+0x250/0x2b8 [drm]
 udl_modeset_cleanup+0xc/0x18 [udl]
 udl_driver_unload+0x30/0x50 [udl]
 drm_dev_unregister+0x3c/0xe8 [drm]
 drm_dev_unplug+0x18/0x70 [drm]
 udl_usb_disconnect+0x30/0x40 [udl]
 usb_unbind_interface+0x6c/0x290 [usbcore]
 device_release_driver_internal+0x170/0x200
 device_release_driver+0x14/0x20
 bus_remove_device+0x118/0x128
 device_del+0x110/0x308
 usb_disable_device+0x8c/0x1f8 [usbcore]
 usb_disconnect+0xb4/0x218 [usbcore]
 usb_disconnect+0x9c/0x218 [usbcore]
 usb_disconnect+0x9c/0x218 [usbcore]
 hub_event+0xf20/0x1020 [usbcore]
 process_one_work+0x1c8/0x310
 worker_thread+0x44/0x450
 kthread+0x118/0x120
 ret_from_fork+0x10/0x18
---[ end trace 978a27ff198f1268 ]---
[drm:drm_mode_config_cleanup [drm]] *ERROR* connector DVI-I-1 leaked!

Mikulas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
  2018-06-03 14:40 ` Mikulas Patocka
@ 2018-06-05  9:47   ` Alexey Brodkin
  -1 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-05  9:47 UTC (permalink / raw)
  To: mpatocka; +Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied

SGkgTWlrdWxhcywNCg0KT24gU3VuLCAyMDE4LTA2LTAzIGF0IDE2OjQwICswMjAwLCBNaWt1bGFz
IFBhdG9ja2Egd3JvdGU6DQo+IEhpDQo+IA0KPiBIZXJlIEknbSBzZW5kaW5nIGJ1ZyBmaXhlcyBh
bmQgcGVyZm9ybWFuY2UgaW1wcm92ZW1lbnRzIGZvciB0aGUgVVNCDQo+IERpc3BsYXlMaW5rIGZy
YW1lYnVmZmVyIGFuZCBtb2Rlc2V0dGluZyBkcml2ZXJzIGZvciB0aGlzIG1lcmdlIHdpbmRvdy4N
Cg0KRm9yIHN1Y2ggYSBsb25nIHNlcmllcyBpdCB3b3VsZCBiZSB2ZXJ5IG5pY2UgdG8gcG9zdCBh
IGxpbmsgdG8geW91ciBnaXQNCnRyZWUgc28gcGVvcGxlIG1heSBwbGF5IHdpdGggeW91ciBjaGFu
Z2VzIGVhc2lseS4NCg0KQ291bGQgeW91IHBsZWFzZSBwcmVwYXJlIG9uZT8NCg0KLUFsZXhleQ=

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
@ 2018-06-05  9:47   ` Alexey Brodkin
  0 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-05  9:47 UTC (permalink / raw)
  To: mpatocka; +Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied

Hi Mikulas,

On Sun, 2018-06-03 at 16:40 +0200, Mikulas Patocka wrote:
> Hi
> 
> Here I'm sending bug fixes and performance improvements for the USB
> DisplayLink framebuffer and modesetting drivers for this merge window.

For such a long series it would be very nice to post a link to your git
tree so people may play with your changes easily.

Could you please prepare one?

-Alexey
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
  2018-06-03 14:41   ` Mikulas Patocka
@ 2018-06-05 10:08     ` Alexey Brodkin
  -1 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-05 10:08 UTC (permalink / raw)
  To: mpatocka; +Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied

SGkgTWlrdWxhcywNCg0KT24gU3VuLCAyMDE4LTA2LTAzIGF0IDE2OjQxICswMjAwLCBNaWt1bGFz
IFBhdG9ja2Egd3JvdGU6DQo+IE1vZGVybiBwcm9jZXNzb3JzIGNhbiBkZXRlY3QgbGluZWFyIG1l
bW9yeSBhY2Nlc3NlcyBhbmQgcHJlZmV0Y2ggZGF0YQ0KPiBhdXRvbWF0aWNhbGx5LCBzbyB0aGVy
ZSdzIG5vIG5lZWQgdG8gdXNlIHByZWZldGNoLg0KDQpOb3QgZWFjaCBhbmQgZXZlcnkgQ1BVIHRo
YXQncyBjYXBhYmxlIG9mIHJ1bm5pbmcgTGludXggaGFzIHByZWZldGNoDQpmdW5jdGlvbmFsaXR5
IDopDQoNClN0aWxsIHJlYWQtb24uLi4NCg0KPiBTaWduZWQtb2ZmLWJ5OiBNaWt1bGFzIFBhdG9j
a2EgPG1wYXRvY2thQHJlZGhhdC5jb20+DQo+IA0KPiAtLS0NCj4gIGRyaXZlcnMvZ3B1L2RybS91
ZGwvdWRsX3RyYW5zZmVyLmMgfCAgICA3IC0tLS0tLS0NCj4gIDEgZmlsZSBjaGFuZ2VkLCA3IGRl
bGV0aW9ucygtKQ0KPiANCj4gSW5kZXg6IGxpbnV4LTQuMTYuMTIvZHJpdmVycy9ncHUvZHJtL3Vk
bC91ZGxfdHJhbnNmZXIuYw0KPiA9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09DQo+IC0tLSBsaW51eC00LjE2LjEyLm9yaWcv
ZHJpdmVycy9ncHUvZHJtL3VkbC91ZGxfdHJhbnNmZXIuYwkyMDE4LTA1LTMxIDE0OjQ4OjEyLjAw
MDAwMDAwMCArMDIwMA0KPiArKysgbGludXgtNC4xNi4xMi9kcml2ZXJzL2dwdS9kcm0vdWRsL3Vk
bF90cmFuc2Zlci5jCTIwMTgtMDUtMzEgMTQ6NDg6MTIuMDAwMDAwMDAwICswMjAwDQo+IEBAIC0x
Myw3ICsxMyw2IEBADQo+ICAjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQo+ICAjaW5jbHVkZSA8
bGludXgvc2xhYi5oPg0KPiAgI2luY2x1ZGUgPGxpbnV4L2ZiLmg+DQo+IC0jaW5jbHVkZSA8bGlu
dXgvcHJlZmV0Y2guaD4NCj4gICNpbmNsdWRlIDxhc20vdW5hbGlnbmVkLmg+DQo+ICANCj4gICNp
bmNsdWRlIDxkcm0vZHJtUC5oPg0KPiBAQCAtNTEsOSArNTAsNiBAQCBzdGF0aWMgaW50IHVkbF90
cmltX2hsaW5lKGNvbnN0IHU4ICpiYmFjDQo+ICAJaW50IHN0YXJ0ID0gd2lkdGg7DQo+ICAJaW50
IGVuZCA9IHdpZHRoOw0KPiAgDQo+IC0JcHJlZmV0Y2goKHZvaWQgKikgZnJvbnQpOw0KPiAtCXBy
ZWZldGNoKCh2b2lkICopIGJhY2spOw0KDQpBRkFJSyBwcmVmZXRjaGVyIGZldGNoZXMgbmV3IGRh
dGEgYWNjb3JkaW5nIHRvIGEga25vd24gaGlzdG9yeS4uLiBpLmUuIGJhc2VkIG9uIHByZXZpb3Vz
bHkNCnVzZWQgcGF0dGVybiB3ZSdsbCB0cnlpbmcgdG8gZ2V0IHRoZSBuZXh0IGJhdGNoIG9mIGRh
dGEuDQoNCkJ1dCB0aGUgY29kZSBhYm92ZSBpcyBpbiB0aGUgdmVyeSBiZWdpbm5pbmcgb2YgdGhl
IGRhdGEgcHJvY2Vzc2luZyByb3V0aW5lIHdoZXJlDQpwcmVmZXRjaGVyIGRvZXNuJ3QgeWV0IGhh
dmUgYW55IGhpc3RvcnkgdG8ga25vdyB3aGF0IGFuZCB3aGVyZSB0byBwcmVmZXRjaC4NCg0KU28g
SSdkIHNheSB0aGlzIHBhcnRpY3VsYXIgdXNhZ2UgaXMgZ29vZC4NCkF0IGxlYXN0IHRob3NlIHBy
ZWZldGNoZXMgc2hvdWxkbid0IGh1cnQgYmVjYXVzZSB0eXBpY2FsbHkgaXQNCndvdWxkIGJlIGp1
c3QgMSBpbnN0cnVjdGlvbiBpZiB0aG9zZSBleGlzdCBvciBub3RoaW5nIGlmIENQVS9jb21waWxl
ciBkb2Vzbid0DQpzdXBwb3J0IGl0Lg0KDQo+ICAJZm9yIChqID0gMDsgaiA8IHdpZHRoOyBqKysp
IHsNCj4gIAkJaWYgKGJhY2tbal0gIT0gZnJvbnRbal0pIHsNCj4gIAkJCXN0YXJ0ID0gajsNCj4g
QEAgLTE0MCw4ICsxMzYsNiBAQCBzdGF0aWMgdm9pZCB1ZGxfY29tcHJlc3NfaGxpbmUxNigNCj4g
IAkJY29uc3QgdTggKmNtZF9waXhlbF9zdGFydCwgKmNtZF9waXhlbF9lbmQgPSBOVUxMOw0KPiAg
CQl1aW50MTZfdCBwaXhlbF92YWwxNjsNCj4gIA0KPiAtCQlwcmVmZXRjaHcoKHZvaWQgKikgY21k
KTsgLyogcHVsbCBpbiBvbmUgY2FjaGUgbGluZSBhdCBsZWFzdCAqLw0KPiAtDQoNClByZXR0eSBt
dWNoIHRoZSBzYW1lIGlzIGhlcmUuIE9idmlvdXNseSBvbiB0aGUgZmlyc3QgaXRlcmF0aW9uIHBy
ZWZldGNoZXIgZG9zbid0DQprbm93IHdoZXJlIHRvIGdldCAiY21kIi4gT24gdGhlIG5leHQgaXRl
cmF0aW9ucyBpdCBtaWdodCBiZSBiZXR0ZXIgYnV0IGdpdmVuIGFtb3VudA0Kb2Ygb3BlcmF0aW9u
IGhhcHBlbnMgZnVydGhlciBpbiB0aGUgY3ljbGUgKGFuZCBpbiBpbm5lciBjeWNsZXMpIEkgd29u
J3QgYmUgY29tcGxldGVseQ0Kc3VyZSB0aGF0IGVhY2ggYW5kIGV2ZXJ5IHByZWZldGNoZXIgd2ls
bCBzdGlsbCBrZWVwIHRyYWNrIG9mICJjbWQiLg0KDQo+ICAJCSpjbWQrKyA9IDB4YWY7DQo+ICAJ
CSpjbWQrKyA9IDB4NmI7DQo+ICAJCSpjbWQrKyA9ICh1aW50OF90KSAoKGRldl9hZGRyID4+IDE2
KSAmIDB4RkYpOw0KPiBAQCAtMTU4LDcgKzE1Miw2IEBAIHN0YXRpYyB2b2lkIHVkbF9jb21wcmVz
c19obGluZTE2KA0KPiAgCQkJCQkodW5zaWduZWQgbG9uZykocGl4ZWxfZW5kIC0gcGl4ZWwpID4+
IGxvZ19icHAsDQo+ICAJCQkJCSh1bnNpZ25lZCBsb25nKShjbWRfYnVmZmVyX2VuZCAtIDEgLSBj
bWQpIC8gMikgPDwgbG9nX2JwcCk7DQo+ICANCj4gLQkJcHJlZmV0Y2hfcmFuZ2UoKHZvaWQgKikg
cGl4ZWwsIGNtZF9waXhlbF9lbmQgLSBwaXhlbCk7DQoNCkFnYWluIEknbSBub3Qgc3VyZSB3aGF0
IHdlIGdhaW4gcmVtb3ZpbmcgdGhhdCBjb2RlIGluIGNvbXBhcmlzb24gb2YgcG9zc2libGUNCnBl
cmZvcm1hbmNlIGRlZ3JhZGF0aW9uIG9uIHNpbXBsZXIgQ1BVcy4NCg0KQW5kIGVzc2VudGlhbGx5
IGFsbCB0aGUgc2FtZSBpcyBhcHBsaWNhYmxlIHRvIFVETEZCIHBhdGNoLg0KDQotQWxleGV5

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-05 10:08     ` Alexey Brodkin
  0 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-05 10:08 UTC (permalink / raw)
  To: mpatocka; +Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied

Hi Mikulas,

On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> Modern processors can detect linear memory accesses and prefetch data
> automatically, so there's no need to use prefetch.

Not each and every CPU that's capable of running Linux has prefetch
functionality :)

Still read-on...

> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> 
> ---
>  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
>  1 file changed, 7 deletions(-)
> 
> Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> ===================================================================
> --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> @@ -13,7 +13,6 @@
>  #include <linux/module.h>
>  #include <linux/slab.h>
>  #include <linux/fb.h>
> -#include <linux/prefetch.h>
>  #include <asm/unaligned.h>
>  
>  #include <drm/drmP.h>
> @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
>  	int start = width;
>  	int end = width;
>  
> -	prefetch((void *) front);
> -	prefetch((void *) back);

AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
used pattern we'll trying to get the next batch of data.

But the code above is in the very beginning of the data processing routine where
prefetcher doesn't yet have any history to know what and where to prefetch.

So I'd say this particular usage is good.
At least those prefetches shouldn't hurt because typically it
would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
support it.

>  	for (j = 0; j < width; j++) {
>  		if (back[j] != front[j]) {
>  			start = j;
> @@ -140,8 +136,6 @@ static void udl_compress_hline16(
>  		const u8 *cmd_pixel_start, *cmd_pixel_end = NULL;
>  		uint16_t pixel_val16;
>  
> -		prefetchw((void *) cmd); /* pull in one cache line at least */
> -

Pretty much the same is here. Obviously on the first iteration prefetcher dosn't
know where to get "cmd". On the next iterations it might be better but given amount
of operation happens further in the cycle (and in inner cycles) I won't be completely
sure that each and every prefetcher will still keep track of "cmd".

>  		*cmd++ = 0xaf;
>  		*cmd++ = 0x6b;
>  		*cmd++ = (uint8_t) ((dev_addr >> 16) & 0xFF);
> @@ -158,7 +152,6 @@ static void udl_compress_hline16(
>  					(unsigned long)(pixel_end - pixel) >> log_bpp,
>  					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
>  
> -		prefetch_range((void *) pixel, cmd_pixel_end - pixel);

Again I'm not sure what we gain removing that code in comparison of possible
performance degradation on simpler CPUs.

And essentially all the same is applicable to UDLFB patch.

-Alexey
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
  2018-06-05 10:08     ` Alexey Brodkin
@ 2018-06-05 10:48       ` Ladislav Michl
  -1 siblings, 0 replies; 75+ messages in thread
From: Ladislav Michl @ 2018-06-05 10:48 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, b.zolnierkie, bernie, dri-devel, mpatocka, airlied

On Tue, Jun 05, 2018 at 10:08:02AM +0000, Alexey Brodkin wrote:
> Hi Mikulas,
> 
> On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > Modern processors can detect linear memory accesses and prefetch data
> > automatically, so there's no need to use prefetch.
> 
> Not each and every CPU that's capable of running Linux has prefetch
> functionality :)
> 
> Still read-on...
> 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > ---
> >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> >  1 file changed, 7 deletions(-)
> > 
> > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > =================================> > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > @@ -13,7 +13,6 @@
> >  #include <linux/module.h>
> >  #include <linux/slab.h>
> >  #include <linux/fb.h>
> > -#include <linux/prefetch.h>
> >  #include <asm/unaligned.h>
> >  
> >  #include <drm/drmP.h>
> > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> >  	int start = width;
> >  	int end = width;
> >  
> > -	prefetch((void *) front);
> > -	prefetch((void *) back);
> 
> AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> used pattern we'll trying to get the next batch of data.
> 
> But the code above is in the very beginning of the data processing routine where
> prefetcher doesn't yet have any history to know what and where to prefetch.
> 
> So I'd say this particular usage is good.
> At least those prefetches shouldn't hurt because typically it
> would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> support it.
> 
> >  	for (j = 0; j < width; j++) {
> >  		if (back[j] != front[j]) {
> >  			start = j;
> > @@ -140,8 +136,6 @@ static void udl_compress_hline16(
> >  		const u8 *cmd_pixel_start, *cmd_pixel_end = NULL;
> >  		uint16_t pixel_val16;
> >  
> > -		prefetchw((void *) cmd); /* pull in one cache line at least */
> > -
> 
> Pretty much the same is here. Obviously on the first iteration prefetcher dosn't
> know where to get "cmd". On the next iterations it might be better but given amount
> of operation happens further in the cycle (and in inner cycles) I won't be completely
> sure that each and every prefetcher will still keep track of "cmd".
> 
> >  		*cmd++ = 0xaf;
> >  		*cmd++ = 0x6b;
> >  		*cmd++ = (uint8_t) ((dev_addr >> 16) & 0xFF);
> > @@ -158,7 +152,6 @@ static void udl_compress_hline16(
> >  					(unsigned long)(pixel_end - pixel) >> log_bpp,
> >  					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
> >  
> > -		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
> 
> Again I'm not sure what we gain removing that code in comparison of possible
> performance degradation on simpler CPUs.
> 
> And essentially all the same is applicable to UDLFB patch.

Tested this one on AT91SAM9G20 SoC, but couldn't get any measurable difference.
Part of problem is probably that full speed USB is the bottleneck here.
However, the same applies on OMAP3630 based board with high speed USB.

As a side note, I didn't experience any problem those paches are fixing, so
perhaps testcases could be described briefly, preferably with some numbers
(I'm not running console on udlfb, but try to give it a try next week).

Thank you,
	ladis

> -Alexey

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-05 10:48       ` Ladislav Michl
  0 siblings, 0 replies; 75+ messages in thread
From: Ladislav Michl @ 2018-06-05 10:48 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, b.zolnierkie, bernie, dri-devel, mpatocka, airlied

On Tue, Jun 05, 2018 at 10:08:02AM +0000, Alexey Brodkin wrote:
> Hi Mikulas,
> 
> On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > Modern processors can detect linear memory accesses and prefetch data
> > automatically, so there's no need to use prefetch.
> 
> Not each and every CPU that's capable of running Linux has prefetch
> functionality :)
> 
> Still read-on...
> 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > ---
> >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> >  1 file changed, 7 deletions(-)
> > 
> > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > ===================================================================
> > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > @@ -13,7 +13,6 @@
> >  #include <linux/module.h>
> >  #include <linux/slab.h>
> >  #include <linux/fb.h>
> > -#include <linux/prefetch.h>
> >  #include <asm/unaligned.h>
> >  
> >  #include <drm/drmP.h>
> > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> >  	int start = width;
> >  	int end = width;
> >  
> > -	prefetch((void *) front);
> > -	prefetch((void *) back);
> 
> AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> used pattern we'll trying to get the next batch of data.
> 
> But the code above is in the very beginning of the data processing routine where
> prefetcher doesn't yet have any history to know what and where to prefetch.
> 
> So I'd say this particular usage is good.
> At least those prefetches shouldn't hurt because typically it
> would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> support it.
> 
> >  	for (j = 0; j < width; j++) {
> >  		if (back[j] != front[j]) {
> >  			start = j;
> > @@ -140,8 +136,6 @@ static void udl_compress_hline16(
> >  		const u8 *cmd_pixel_start, *cmd_pixel_end = NULL;
> >  		uint16_t pixel_val16;
> >  
> > -		prefetchw((void *) cmd); /* pull in one cache line at least */
> > -
> 
> Pretty much the same is here. Obviously on the first iteration prefetcher dosn't
> know where to get "cmd". On the next iterations it might be better but given amount
> of operation happens further in the cycle (and in inner cycles) I won't be completely
> sure that each and every prefetcher will still keep track of "cmd".
> 
> >  		*cmd++ = 0xaf;
> >  		*cmd++ = 0x6b;
> >  		*cmd++ = (uint8_t) ((dev_addr >> 16) & 0xFF);
> > @@ -158,7 +152,6 @@ static void udl_compress_hline16(
> >  					(unsigned long)(pixel_end - pixel) >> log_bpp,
> >  					(unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
> >  
> > -		prefetch_range((void *) pixel, cmd_pixel_end - pixel);
> 
> Again I'm not sure what we gain removing that code in comparison of possible
> performance degradation on simpler CPUs.
> 
> And essentially all the same is applicable to UDLFB patch.

Tested this one on AT91SAM9G20 SoC, but couldn't get any measurable difference.
Part of problem is probably that full speed USB is the bottleneck here.
However, the same applies on OMAP3630 based board with high speed USB.

As a side note, I didn't experience any problem those paches are fixing, so
perhaps testcases could be described briefly, preferably with some numbers
(I'm not running console on udlfb, but try to give it a try next week).

Thank you,
	ladis

> -Alexey
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
  2018-06-05 10:08     ` Alexey Brodkin
@ 2018-06-05 15:30       ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-05 15:30 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied



On Tue, 5 Jun 2018, Alexey Brodkin wrote:

> Hi Mikulas,
> 
> On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > Modern processors can detect linear memory accesses and prefetch data
> > automatically, so there's no need to use prefetch.
> 
> Not each and every CPU that's capable of running Linux has prefetch
> functionality :)
> 
> Still read-on...
> 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > ---
> >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> >  1 file changed, 7 deletions(-)
> > 
> > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > =================================> > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > @@ -13,7 +13,6 @@
> >  #include <linux/module.h>
> >  #include <linux/slab.h>
> >  #include <linux/fb.h>
> > -#include <linux/prefetch.h>
> >  #include <asm/unaligned.h>
> >  
> >  #include <drm/drmP.h>
> > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> >  	int start = width;
> >  	int end = width;
> >  
> > -	prefetch((void *) front);
> > -	prefetch((void *) back);
> 
> AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> used pattern we'll trying to get the next batch of data.
> 
> But the code above is in the very beginning of the data processing routine where
> prefetcher doesn't yet have any history to know what and where to prefetch.
> 
> So I'd say this particular usage is good.
> At least those prefetches shouldn't hurt because typically it
> would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> support it.

See this post https://lwn.net/Articles/444336/ where they measured that 
prefetch hurts performance. Prefetch shouldn't be used unless you have a 
proof that it improves performance.

The problem is that the prefetch instruction causes stalls in the pipeline 
when it encounters TLB miss and the automatic prefetcher doesn't.

Mikulas

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-05 15:30       ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-05 15:30 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied



On Tue, 5 Jun 2018, Alexey Brodkin wrote:

> Hi Mikulas,
> 
> On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > Modern processors can detect linear memory accesses and prefetch data
> > automatically, so there's no need to use prefetch.
> 
> Not each and every CPU that's capable of running Linux has prefetch
> functionality :)
> 
> Still read-on...
> 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > ---
> >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> >  1 file changed, 7 deletions(-)
> > 
> > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > ===================================================================
> > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > @@ -13,7 +13,6 @@
> >  #include <linux/module.h>
> >  #include <linux/slab.h>
> >  #include <linux/fb.h>
> > -#include <linux/prefetch.h>
> >  #include <asm/unaligned.h>
> >  
> >  #include <drm/drmP.h>
> > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> >  	int start = width;
> >  	int end = width;
> >  
> > -	prefetch((void *) front);
> > -	prefetch((void *) back);
> 
> AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> used pattern we'll trying to get the next batch of data.
> 
> But the code above is in the very beginning of the data processing routine where
> prefetcher doesn't yet have any history to know what and where to prefetch.
> 
> So I'd say this particular usage is good.
> At least those prefetches shouldn't hurt because typically it
> would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> support it.

See this post https://lwn.net/Articles/444336/ where they measured that 
prefetch hurts performance. Prefetch shouldn't be used unless you have a 
proof that it improves performance.

The problem is that the prefetch instruction causes stalls in the pipeline 
when it encounters TLB miss and the automatic prefetcher doesn't.

Mikulas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
  2018-06-05  9:47   ` Alexey Brodkin
@ 2018-06-05 15:34     ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-05 15:34 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied



On Tue, 5 Jun 2018, Alexey Brodkin wrote:

> Hi Mikulas,
> 
> On Sun, 2018-06-03 at 16:40 +0200, Mikulas Patocka wrote:
> > Hi
> > 
> > Here I'm sending bug fixes and performance improvements for the USB
> > DisplayLink framebuffer and modesetting drivers for this merge window.
> 
> For such a long series it would be very nice to post a link to your git
> tree so people may play with your changes easily.
> 
> Could you please prepare one?
> 
> -Alexey

I don't use git to manage my patches, I use quilt.

I uploaded the patches from quilt here: 
http://people.redhat.com/~mpatocka/patches/kernel/udl/series.html

Mikulas

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
@ 2018-06-05 15:34     ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-05 15:34 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied



On Tue, 5 Jun 2018, Alexey Brodkin wrote:

> Hi Mikulas,
> 
> On Sun, 2018-06-03 at 16:40 +0200, Mikulas Patocka wrote:
> > Hi
> > 
> > Here I'm sending bug fixes and performance improvements for the USB
> > DisplayLink framebuffer and modesetting drivers for this merge window.
> 
> For such a long series it would be very nice to post a link to your git
> tree so people may play with your changes easily.
> 
> Could you please prepare one?
> 
> -Alexey

I don't use git to manage my patches, I use quilt.

I uploaded the patches from quilt here: 
http://people.redhat.com/~mpatocka/patches/kernel/udl/series.html

Mikulas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
  2018-06-05 15:30       ` Mikulas Patocka
@ 2018-06-06 12:04         ` Alexey Brodkin
  -1 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-06 12:04 UTC (permalink / raw)
  To: mpatocka; +Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied

SGkgTWlrdWxhcywNCg0KT24gVHVlLCAyMDE4LTA2LTA1IGF0IDExOjMwIC0wNDAwLCBNaWt1bGFz
IFBhdG9ja2Egd3JvdGU6DQo+IA0KPiBPbiBUdWUsIDUgSnVuIDIwMTgsIEFsZXhleSBCcm9ka2lu
IHdyb3RlOg0KPiANCj4gPiBIaSBNaWt1bGFzLA0KPiA+IA0KPiA+IE9uIFN1biwgMjAxOC0wNi0w
MyBhdCAxNjo0MSArMDIwMCwgTWlrdWxhcyBQYXRvY2thIHdyb3RlOg0KPiA+ID4gTW9kZXJuIHBy
b2Nlc3NvcnMgY2FuIGRldGVjdCBsaW5lYXIgbWVtb3J5IGFjY2Vzc2VzIGFuZCBwcmVmZXRjaCBk
YXRhDQo+ID4gPiBhdXRvbWF0aWNhbGx5LCBzbyB0aGVyZSdzIG5vIG5lZWQgdG8gdXNlIHByZWZl
dGNoLg0KPiA+IA0KPiA+IE5vdCBlYWNoIGFuZCBldmVyeSBDUFUgdGhhdCdzIGNhcGFibGUgb2Yg
cnVubmluZyBMaW51eCBoYXMgcHJlZmV0Y2gNCj4gPiBmdW5jdGlvbmFsaXR5IDopDQo+ID4gDQo+
ID4gU3RpbGwgcmVhZC1vbi4uLg0KPiA+IA0KPiA+ID4gU2lnbmVkLW9mZi1ieTogTWlrdWxhcyBQ
YXRvY2thIDxtcGF0b2NrYUByZWRoYXQuY29tPg0KPiA+ID4gDQo+ID4gPiAtLS0NCj4gPiA+ICBk
cml2ZXJzL2dwdS9kcm0vdWRsL3VkbF90cmFuc2Zlci5jIHwgICAgNyAtLS0tLS0tDQo+ID4gPiAg
MSBmaWxlIGNoYW5nZWQsIDcgZGVsZXRpb25zKC0pDQo+ID4gPiANCj4gPiA+IEluZGV4OiBsaW51
eC00LjE2LjEyL2RyaXZlcnMvZ3B1L2RybS91ZGwvdWRsX3RyYW5zZmVyLmMNCj4gPiA+ID09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT0NCj4gPiA+IC0tLSBsaW51eC00LjE2LjEyLm9yaWcvZHJpdmVycy9ncHUvZHJtL3VkbC91
ZGxfdHJhbnNmZXIuYwkyMDE4LTA1LTMxIDE0OjQ4OjEyLjAwMDAwMDAwMCArMDIwMA0KPiA+ID4g
KysrIGxpbnV4LTQuMTYuMTIvZHJpdmVycy9ncHUvZHJtL3VkbC91ZGxfdHJhbnNmZXIuYwkyMDE4
LTA1LTMxIDE0OjQ4OjEyLjAwMDAwMDAwMCArMDIwMA0KPiA+ID4gQEAgLTEzLDcgKzEzLDYgQEAN
Cj4gPiA+ICAjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQo+ID4gPiAgI2luY2x1ZGUgPGxpbnV4
L3NsYWIuaD4NCj4gPiA+ICAjaW5jbHVkZSA8bGludXgvZmIuaD4NCj4gPiA+IC0jaW5jbHVkZSA8
bGludXgvcHJlZmV0Y2guaD4NCj4gPiA+ICAjaW5jbHVkZSA8YXNtL3VuYWxpZ25lZC5oPg0KPiA+
ID4gIA0KPiA+ID4gICNpbmNsdWRlIDxkcm0vZHJtUC5oPg0KPiA+ID4gQEAgLTUxLDkgKzUwLDYg
QEAgc3RhdGljIGludCB1ZGxfdHJpbV9obGluZShjb25zdCB1OCAqYmJhYw0KPiA+ID4gIAlpbnQg
c3RhcnQgPSB3aWR0aDsNCj4gPiA+ICAJaW50IGVuZCA9IHdpZHRoOw0KPiA+ID4gIA0KPiA+ID4g
LQlwcmVmZXRjaCgodm9pZCAqKSBmcm9udCk7DQo+ID4gPiAtCXByZWZldGNoKCh2b2lkICopIGJh
Y2spOw0KPiA+IA0KPiA+IEFGQUlLIHByZWZldGNoZXIgZmV0Y2hlcyBuZXcgZGF0YSBhY2NvcmRp
bmcgdG8gYSBrbm93biBoaXN0b3J5Li4uIGkuZS4gYmFzZWQgb24gcHJldmlvdXNseQ0KPiA+IHVz
ZWQgcGF0dGVybiB3ZSdsbCB0cnlpbmcgdG8gZ2V0IHRoZSBuZXh0IGJhdGNoIG9mIGRhdGEuDQo+
ID4gDQo+ID4gQnV0IHRoZSBjb2RlIGFib3ZlIGlzIGluIHRoZSB2ZXJ5IGJlZ2lubmluZyBvZiB0
aGUgZGF0YSBwcm9jZXNzaW5nIHJvdXRpbmUgd2hlcmUNCj4gPiBwcmVmZXRjaGVyIGRvZXNuJ3Qg
eWV0IGhhdmUgYW55IGhpc3RvcnkgdG8ga25vdyB3aGF0IGFuZCB3aGVyZSB0byBwcmVmZXRjaC4N
Cj4gPiANCj4gPiBTbyBJJ2Qgc2F5IHRoaXMgcGFydGljdWxhciB1c2FnZSBpcyBnb29kLg0KPiA+
IEF0IGxlYXN0IHRob3NlIHByZWZldGNoZXMgc2hvdWxkbid0IGh1cnQgYmVjYXVzZSB0eXBpY2Fs
bHkgaXQNCj4gPiB3b3VsZCBiZSBqdXN0IDEgaW5zdHJ1Y3Rpb24gaWYgdGhvc2UgZXhpc3Qgb3Ig
bm90aGluZyBpZiBDUFUvY29tcGlsZXIgZG9lc24ndA0KPiA+IHN1cHBvcnQgaXQuDQo+IA0KPiBT
ZWUgdGhpcyBwb3N0IGh0dHBzOi8vdXJsZGVmZW5zZS5wcm9vZnBvaW50LmNvbS92Mi91cmw/dT1o
dHRwcy0zQV9fbHduLm5ldF9BcnRpY2xlc180NDQzMzZfJmQ9RHdJQkFnJmM9RFBMNl9YXzZKa1hG
eDdBWFdxQjB0ZyZyPWxxZGVlU1NFZXMwR0ZERGw2NTZlDQo+IFZpWE83YnJlUzU1eXRXa2hwazVS
ODFJJm09YTVSYXFKdHZhakZrTTFoTDdiT0tENWpWN2NwRmZUdkcyWTFjWUNkQlBkMCZzPXcwVzh3
RnRBZ0VOcDhURTZSemRQR2hkS1Jhc0pjX290SW4wOFYwRWtnclkmZT0gd2hlcmUgdGhleSBtZWFz
dXJlZCB0aGF0IA0KPiBwcmVmZXRjaCBodXJ0cyBwZXJmb3JtYW5jZS4gUHJlZmV0Y2ggc2hvdWxk
bid0IGJlIHVzZWQgdW5sZXNzIHlvdSBoYXZlIGEgDQo+IHByb29mIHRoYXQgaXQgaW1wcm92ZXMg
cGVyZm9ybWFuY2UuDQo+IA0KPiBUaGUgcHJvYmxlbSBpcyB0aGF0IHRoZSBwcmVmZXRjaCBpbnN0
cnVjdGlvbiBjYXVzZXMgc3RhbGxzIGluIHRoZSBwaXBlbGluZSANCj4gd2hlbiBpdCBlbmNvdW50
ZXJzIFRMQiBtaXNzIGFuZCB0aGUgYXV0b21hdGljIHByZWZldGNoZXIgZG9lc24ndC4NCg0KV293
LCB0aGFua3MgZm9yIHRoZSBsaW5rLg0KSSBkaWRuJ3Qga25vdyBhYm91dCB0aGF0IHN1YnRsZSBp
c3N1ZSB3aXRoIHByZWZldGNoIGluc3RydWN0aW9ucyBvbiBBUk0gYW5kIHg4Ni4NCg0KU28gT0sg
aW4gY2FzZSBvZiBVREwgdGhlc2UgcHJlZmV0Y2hlcyBhbnl3YXlzIG1ha2Ugbm90IG5vdCBtdWNo
IHNlbnNlIEkgZ3Vlc3MgYW5kIHRoZXJlJ3MNCnNvbWV0aGluZyB3b3JzZSBzdGlsbCwgc2VlIHdo
YXQgSSd2ZSBnb3QgZnJvbSBXYW5kQm9hcmQgUXVhZCBydW5uaW5nIGttc2N1YmUgWzFdIGFwcGxp
Y2F0aW9uDQp3aXRoIGhlbHAgb2YgcGVyZiB1dGlsaXR5Og0KLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tPjgtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQojIE92ZXJoZWFkICBDb21tYW5kICBT
aGFyZWQgT2JqZWN0ICAgICAgICAgICAgU3ltYm9sIA0KIyAuLi4uLi4uLiAgLi4uLi4uLiAgLi4u
Li4uLi4uLi4uLi4uLi4uLi4uLi4gIC4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uLi4u
Li4uLi4NCiMNCiAgICA5Mi45MyUgIGttc2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBb
a10gdWRsX3JlbmRlcl9obGluZQ0KICAgICAyLjUxJSAga21zY3ViZSAgW2tlcm5lbC5rYWxsc3lt
c10gICAgICAgIFtrXSBfX2RpdnNpMw0KICAgICAwLjMzJSAga21zY3ViZSAgW2tlcm5lbC5rYWxs
c3ltc10gICAgICAgIFtrXSBfcmF3X3NwaW5fdW5sb2NrX2lycXJlc3RvcmUNCiAgICAgMC4yMiUg
IGttc2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gbG9ja19hY3F1aXJlDQogICAg
IDAuMTklICBrbXNjdWJlICBba2VybmVsLmthbGxzeW1zXSAgICAgICAgW2tdIF9yYXdfc3Bpbl91
bmxvY2tfaXJxDQogICAgIDAuMTclICBrbXNjdWJlICBba2VybmVsLmthbGxzeW1zXSAgICAgICAg
W2tdIHVkbF9oYW5kbGVfZGFtYWdlDQogICAgIDAuMTIlICBrbXNjdWJlICBba2VybmVsLmthbGxz
eW1zXSAgICAgICAgW2tdIHY3X2RtYV9jbGVhbl9yYW5nZQ0KICAgICAwLjExJSAga21zY3ViZSAg
W2tlcm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSBsMmMyMTBfY2xlYW5fcmFuZ2UNCiAgICAgMC4w
NiUgIGttc2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gX19tZW16ZXJvDQotLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0+OC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCg0KVGhh
dCBzYWlkIGl0J3Mgbm90IGV2ZW4gVVNCIDIuMCB3aGljaCBpcyBhIGJvdHRsZS1uZWNrIGJ1dA0K
Y29tcHV0YXRpb25zIGluIHRoZSB1ZGxfcmVuZGVyX2hsaW5lKCkuDQoNCg0KWzFdIGh0dHBzOi8v
Y2dpdC5mcmVlZGVza3RvcC5vcmcvbWVzYS9rbXNjdWJlLw0KDQotQWxleGV5

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-06 12:04         ` Alexey Brodkin
  0 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-06 12:04 UTC (permalink / raw)
  To: mpatocka; +Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied

Hi Mikulas,

On Tue, 2018-06-05 at 11:30 -0400, Mikulas Patocka wrote:
> 
> On Tue, 5 Jun 2018, Alexey Brodkin wrote:
> 
> > Hi Mikulas,
> > 
> > On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > > Modern processors can detect linear memory accesses and prefetch data
> > > automatically, so there's no need to use prefetch.
> > 
> > Not each and every CPU that's capable of running Linux has prefetch
> > functionality :)
> > 
> > Still read-on...
> > 
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > 
> > > ---
> > >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> > >  1 file changed, 7 deletions(-)
> > > 
> > > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > > ===================================================================
> > > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > @@ -13,7 +13,6 @@
> > >  #include <linux/module.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/fb.h>
> > > -#include <linux/prefetch.h>
> > >  #include <asm/unaligned.h>
> > >  
> > >  #include <drm/drmP.h>
> > > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> > >  	int start = width;
> > >  	int end = width;
> > >  
> > > -	prefetch((void *) front);
> > > -	prefetch((void *) back);
> > 
> > AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> > used pattern we'll trying to get the next batch of data.
> > 
> > But the code above is in the very beginning of the data processing routine where
> > prefetcher doesn't yet have any history to know what and where to prefetch.
> > 
> > So I'd say this particular usage is good.
> > At least those prefetches shouldn't hurt because typically it
> > would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> > support it.
> 
> See this post https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_444336_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl656e
> ViXO7breS55ytWkhpk5R81I&m=a5RaqJtvajFkM1hL7bOKD5jV7cpFfTvG2Y1cYCdBPd0&s=w0W8wFtAgENp8TE6RzdPGhdKRasJc_otIn08V0EkgrY&e= where they measured that 
> prefetch hurts performance. Prefetch shouldn't be used unless you have a 
> proof that it improves performance.
> 
> The problem is that the prefetch instruction causes stalls in the pipeline 
> when it encounters TLB miss and the automatic prefetcher doesn't.

Wow, thanks for the link.
I didn't know about that subtle issue with prefetch instructions on ARM and x86.

So OK in case of UDL these prefetches anyways make not not much sense I guess and there's
something worse still, see what I've got from WandBoard Quad running kmscube [1] application
with help of perf utility:
--------------------------->8-------------------------
# Overhead  Command  Shared Object            Symbol 
# ........  .......  .......................  ........................................
#
    92.93%  kmscube  [kernel.kallsyms]        [k] udl_render_hline
     2.51%  kmscube  [kernel.kallsyms]        [k] __divsi3
     0.33%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
     0.22%  kmscube  [kernel.kallsyms]        [k] lock_acquire
     0.19%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
     0.17%  kmscube  [kernel.kallsyms]        [k] udl_handle_damage
     0.12%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
     0.11%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
     0.06%  kmscube  [kernel.kallsyms]        [k] __memzero
--------------------------->8-------------------------

That said it's not even USB 2.0 which is a bottle-neck but
computations in the udl_render_hline().


[1] https://cgit.freedesktop.org/mesa/kmscube/

-Alexey
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
  2018-06-06 12:04         ` Alexey Brodkin
@ 2018-06-06 15:46           ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-06 15:46 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied



On Wed, 6 Jun 2018, Alexey Brodkin wrote:

> Hi Mikulas,
> 
> On Tue, 2018-06-05 at 11:30 -0400, Mikulas Patocka wrote:
> > 
> > On Tue, 5 Jun 2018, Alexey Brodkin wrote:
> > 
> > > Hi Mikulas,
> > > 
> > > On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > > > Modern processors can detect linear memory accesses and prefetch data
> > > > automatically, so there's no need to use prefetch.
> > > 
> > > Not each and every CPU that's capable of running Linux has prefetch
> > > functionality :)
> > > 
> > > Still read-on...
> > > 
> > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > 
> > > > ---
> > > >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> > > >  1 file changed, 7 deletions(-)
> > > > 
> > > > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > > > =================================> > > > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > @@ -13,7 +13,6 @@
> > > >  #include <linux/module.h>
> > > >  #include <linux/slab.h>
> > > >  #include <linux/fb.h>
> > > > -#include <linux/prefetch.h>
> > > >  #include <asm/unaligned.h>
> > > >  
> > > >  #include <drm/drmP.h>
> > > > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> > > >  	int start = width;
> > > >  	int end = width;
> > > >  
> > > > -	prefetch((void *) front);
> > > > -	prefetch((void *) back);
> > > 
> > > AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> > > used pattern we'll trying to get the next batch of data.
> > > 
> > > But the code above is in the very beginning of the data processing routine where
> > > prefetcher doesn't yet have any history to know what and where to prefetch.
> > > 
> > > So I'd say this particular usage is good.
> > > At least those prefetches shouldn't hurt because typically it
> > > would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> > > support it.
> > 
> > See this post https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_444336_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl656e
> > ViXO7breS55ytWkhpk5R81I&m¥RaqJtvajFkM1hL7bOKD5jV7cpFfTvG2Y1cYCdBPd0&s=w0W8wFtAgENp8TE6RzdPGhdKRasJc_otIn08V0EkgrY&e= where they measured that 
> > prefetch hurts performance. Prefetch shouldn't be used unless you have a 
> > proof that it improves performance.
> > 
> > The problem is that the prefetch instruction causes stalls in the pipeline 
> > when it encounters TLB miss and the automatic prefetcher doesn't.
> 
> Wow, thanks for the link.
> I didn't know about that subtle issue with prefetch instructions on ARM and x86.
> 
> So OK in case of UDL these prefetches anyways make not not much sense I guess and there's
> something worse still, see what I've got from WandBoard Quad running kmscube [1] application
> with help of perf utility:
> --------------------------->8-------------------------
> # Overhead  Command  Shared Object            Symbol 
> # ........  .......  .......................  ........................................
> #
>     92.93%  kmscube  [kernel.kallsyms]        [k] udl_render_hline
>      2.51%  kmscube  [kernel.kallsyms]        [k] __divsi3
>      0.33%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
>      0.22%  kmscube  [kernel.kallsyms]        [k] lock_acquire
>      0.19%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
>      0.17%  kmscube  [kernel.kallsyms]        [k] udl_handle_damage
>      0.12%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
>      0.11%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
>      0.06%  kmscube  [kernel.kallsyms]        [k] __memzero
> --------------------------->8-------------------------
> 
> That said it's not even USB 2.0 which is a bottle-neck but
> computations in the udl_render_hline().
> 
> 
> [1] https://cgit.freedesktop.org/mesa/kmscube/
> 
> -Alexey

Try this patch 
http://people.redhat.com/~mpatocka/patches/kernel/udl/udlkms-avoid-division.patch

It is doing a lot of divisions - and WandBoard has Cortex-A9, that doesn't 
have division instruction.

BTW. the framebuffer UDL driver (not the modesetting driver) has 
performance counters in sysfs. Their location depends on the system, you 
can find them with find /sys -name "*metrics*"

The file "metrics_reset" resets the counters, so you can measure if the 
prefetch instructions improve performance or not.

Mikulas

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-06 15:46           ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-06 15:46 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel, airlied



On Wed, 6 Jun 2018, Alexey Brodkin wrote:

> Hi Mikulas,
> 
> On Tue, 2018-06-05 at 11:30 -0400, Mikulas Patocka wrote:
> > 
> > On Tue, 5 Jun 2018, Alexey Brodkin wrote:
> > 
> > > Hi Mikulas,
> > > 
> > > On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > > > Modern processors can detect linear memory accesses and prefetch data
> > > > automatically, so there's no need to use prefetch.
> > > 
> > > Not each and every CPU that's capable of running Linux has prefetch
> > > functionality :)
> > > 
> > > Still read-on...
> > > 
> > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > 
> > > > ---
> > > >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> > > >  1 file changed, 7 deletions(-)
> > > > 
> > > > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > > > ===================================================================
> > > > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > @@ -13,7 +13,6 @@
> > > >  #include <linux/module.h>
> > > >  #include <linux/slab.h>
> > > >  #include <linux/fb.h>
> > > > -#include <linux/prefetch.h>
> > > >  #include <asm/unaligned.h>
> > > >  
> > > >  #include <drm/drmP.h>
> > > > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> > > >  	int start = width;
> > > >  	int end = width;
> > > >  
> > > > -	prefetch((void *) front);
> > > > -	prefetch((void *) back);
> > > 
> > > AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> > > used pattern we'll trying to get the next batch of data.
> > > 
> > > But the code above is in the very beginning of the data processing routine where
> > > prefetcher doesn't yet have any history to know what and where to prefetch.
> > > 
> > > So I'd say this particular usage is good.
> > > At least those prefetches shouldn't hurt because typically it
> > > would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> > > support it.
> > 
> > See this post https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_444336_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl656e
> > ViXO7breS55ytWkhpk5R81I&m=a5RaqJtvajFkM1hL7bOKD5jV7cpFfTvG2Y1cYCdBPd0&s=w0W8wFtAgENp8TE6RzdPGhdKRasJc_otIn08V0EkgrY&e= where they measured that 
> > prefetch hurts performance. Prefetch shouldn't be used unless you have a 
> > proof that it improves performance.
> > 
> > The problem is that the prefetch instruction causes stalls in the pipeline 
> > when it encounters TLB miss and the automatic prefetcher doesn't.
> 
> Wow, thanks for the link.
> I didn't know about that subtle issue with prefetch instructions on ARM and x86.
> 
> So OK in case of UDL these prefetches anyways make not not much sense I guess and there's
> something worse still, see what I've got from WandBoard Quad running kmscube [1] application
> with help of perf utility:
> --------------------------->8-------------------------
> # Overhead  Command  Shared Object            Symbol 
> # ........  .......  .......................  ........................................
> #
>     92.93%  kmscube  [kernel.kallsyms]        [k] udl_render_hline
>      2.51%  kmscube  [kernel.kallsyms]        [k] __divsi3
>      0.33%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
>      0.22%  kmscube  [kernel.kallsyms]        [k] lock_acquire
>      0.19%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
>      0.17%  kmscube  [kernel.kallsyms]        [k] udl_handle_damage
>      0.12%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
>      0.11%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
>      0.06%  kmscube  [kernel.kallsyms]        [k] __memzero
> --------------------------->8-------------------------
> 
> That said it's not even USB 2.0 which is a bottle-neck but
> computations in the udl_render_hline().
> 
> 
> [1] https://cgit.freedesktop.org/mesa/kmscube/
> 
> -Alexey

Try this patch 
http://people.redhat.com/~mpatocka/patches/kernel/udl/udlkms-avoid-division.patch

It is doing a lot of divisions - and WandBoard has Cortex-A9, that doesn't 
have division instruction.

BTW. the framebuffer UDL driver (not the modesetting driver) has 
performance counters in sysfs. Their location depends on the system, you 
can find them with find /sys -name "*metrics*"

The file "metrics_reset" resets the counters, so you can measure if the 
prefetch instructions improve performance or not.

Mikulas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 18/21] udlfb: allow reallocating the framebuffer
  2018-06-03 19:24     ` kbuild test robot
@ 2018-06-12 16:32       ` Mikulas Patocka
  -1 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-12 16:32 UTC (permalink / raw)
  To: kbuild test robot
  Cc: linux-fbdev, Ladislav Michl, Bartlomiej Zolnierkiewicz,
	Bernie Thompson, dri-devel, kbuild-all, Dave Airlie



On Mon, 4 Jun 2018, kbuild test robot wrote:

> Hi Mikulas,
> 
> I love your patch! Perhaps something to improve:
> 
> [auto build test WARNING on drm/drm-next]
> [also build test WARNING on v4.17-rc7 next-20180601]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Mikulas-Patocka/USB-DisplayLink-patches/20180603-233013

What is it really complaining about? That URL shows 404 Not Found and this 
email has no warnings at all.

> base:   git://people.freedesktop.org/~airlied/linux.git drm-next
> reproduce:
>         # apt-get install sparse
>         make ARCH=x86_64 allmodconfig
>         make C=1 CF=-D__CHECK_ENDIAN__
> 
> 
> sparse warnings: (new ones prefixed by >>)
> 
> 
> vim +1198 drivers/video/fbdev/udlfb.c
> 
>   1171	
>   1172	/*
>   1173	 * Assumes &info->lock held by caller
>   1174	 * Assumes no active clients have framebuffer open
>   1175	 */
>   1176	static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len)
>   1177	{
>   1178		u32 old_len = info->fix.smem_len;
> > 1179		unsigned char *old_fb = info->screen_base;
>   1180		unsigned char *new_fb;
>   1181		unsigned char *new_back = NULL;
>   1182	
>   1183		new_len = PAGE_ALIGN(new_len);
>   1184	
>   1185		if (new_len > old_len) {
>   1186			/*
>   1187			 * Alloc system memory for virtual framebuffer
>   1188			 */
>   1189			new_fb = vmalloc(new_len);
>   1190			if (!new_fb) {
>   1191				dev_err(info->dev, "Virtual framebuffer alloc failed\n");
>   1192				return -ENOMEM;
>   1193			}
>   1194			memset(new_fb, 0xff, new_len);
>   1195	
>   1196			if (info->screen_base) {
>   1197				memcpy(new_fb, old_fb, old_len);
> > 1198				dlfb_deferred_vfree(dlfb, info->screen_base);
>   1199			}
>   1200	
>   1201			info->screen_base = new_fb;
>   1202			info->fix.smem_len = new_len;
>   1203			info->fix.smem_start = (unsigned long) new_fb;
>   1204			info->flags = udlfb_info_flags;
>   1205	
>   1206			/*
>   1207			 * Second framebuffer copy to mirror the framebuffer state
>   1208			 * on the physical USB device. We can function without this.
>   1209			 * But with imperfect damage info we may send pixels over USB
>   1210			 * that were, in fact, unchanged - wasting limited USB bandwidth
>   1211			 */
>   1212			if (shadow)
>   1213				new_back = vzalloc(new_len);
>   1214			if (!new_back)
>   1215				dev_info(info->dev,
>   1216					 "No shadow/backing buffer allocated\n");
>   1217			else {
>   1218				dlfb_deferred_vfree(dlfb, dlfb->backing_buffer);
>   1219				dlfb->backing_buffer = new_back;
>   1220			}
>   1221		}
>   1222		return 0;
>   1223	}
>   1224	
> 
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 18/21] udlfb: allow reallocating the framebuffer
@ 2018-06-12 16:32       ` Mikulas Patocka
  0 siblings, 0 replies; 75+ messages in thread
From: Mikulas Patocka @ 2018-06-12 16:32 UTC (permalink / raw)
  To: kbuild test robot
  Cc: linux-fbdev, Ladislav Michl, Bartlomiej Zolnierkiewicz,
	Bernie Thompson, dri-devel, kbuild-all, Dave Airlie



On Mon, 4 Jun 2018, kbuild test robot wrote:

> Hi Mikulas,
> 
> I love your patch! Perhaps something to improve:
> 
> [auto build test WARNING on drm/drm-next]
> [also build test WARNING on v4.17-rc7 next-20180601]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Mikulas-Patocka/USB-DisplayLink-patches/20180603-233013

What is it really complaining about? That URL shows 404 Not Found and this 
email has no warnings at all.

> base:   git://people.freedesktop.org/~airlied/linux.git drm-next
> reproduce:
>         # apt-get install sparse
>         make ARCH=x86_64 allmodconfig
>         make C=1 CF=-D__CHECK_ENDIAN__
> 
> 
> sparse warnings: (new ones prefixed by >>)
> 
> 
> vim +1198 drivers/video/fbdev/udlfb.c
> 
>   1171	
>   1172	/*
>   1173	 * Assumes &info->lock held by caller
>   1174	 * Assumes no active clients have framebuffer open
>   1175	 */
>   1176	static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len)
>   1177	{
>   1178		u32 old_len = info->fix.smem_len;
> > 1179		unsigned char *old_fb = info->screen_base;
>   1180		unsigned char *new_fb;
>   1181		unsigned char *new_back = NULL;
>   1182	
>   1183		new_len = PAGE_ALIGN(new_len);
>   1184	
>   1185		if (new_len > old_len) {
>   1186			/*
>   1187			 * Alloc system memory for virtual framebuffer
>   1188			 */
>   1189			new_fb = vmalloc(new_len);
>   1190			if (!new_fb) {
>   1191				dev_err(info->dev, "Virtual framebuffer alloc failed\n");
>   1192				return -ENOMEM;
>   1193			}
>   1194			memset(new_fb, 0xff, new_len);
>   1195	
>   1196			if (info->screen_base) {
>   1197				memcpy(new_fb, old_fb, old_len);
> > 1198				dlfb_deferred_vfree(dlfb, info->screen_base);
>   1199			}
>   1200	
>   1201			info->screen_base = new_fb;
>   1202			info->fix.smem_len = new_len;
>   1203			info->fix.smem_start = (unsigned long) new_fb;
>   1204			info->flags = udlfb_info_flags;
>   1205	
>   1206			/*
>   1207			 * Second framebuffer copy to mirror the framebuffer state
>   1208			 * on the physical USB device. We can function without this.
>   1209			 * But with imperfect damage info we may send pixels over USB
>   1210			 * that were, in fact, unchanged - wasting limited USB bandwidth
>   1211			 */
>   1212			if (shadow)
>   1213				new_back = vzalloc(new_len);
>   1214			if (!new_back)
>   1215				dev_info(info->dev,
>   1216					 "No shadow/backing buffer allocated\n");
>   1217			else {
>   1218				dlfb_deferred_vfree(dlfb, dlfb->backing_buffer);
>   1219				dlfb->backing_buffer = new_back;
>   1220			}
>   1221		}
>   1222		return 0;
>   1223	}
>   1224	
> 
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
  2018-06-06 15:46           ` Mikulas Patocka
  (?)
@ 2018-06-15 16:30             ` Alexey Brodkin
  -1 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-15 16:30 UTC (permalink / raw)
  To: mpatocka
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel,
	Evgeniy Didin, airlied, linux-snps-arc

SGkgTWlrdWxhcywNCg0KT24gV2VkLCAyMDE4LTA2LTA2IGF0IDExOjQ2IC0wNDAwLCBNaWt1bGFz
IFBhdG9ja2Egd3JvdGU6DQo+IA0KPiBPbiBXZWQsIDYgSnVuIDIwMTgsIEFsZXhleSBCcm9ka2lu
IHdyb3RlOg0KPiANCj4gPiBIaSBNaWt1bGFzLA0KPiA+IA0KPiA+IE9uIFR1ZSwgMjAxOC0wNi0w
NSBhdCAxMTozMCAtMDQwMCwgTWlrdWxhcyBQYXRvY2thIHdyb3RlOg0KPiA+ID4gDQo+ID4gPiBP
biBUdWUsIDUgSnVuIDIwMTgsIEFsZXhleSBCcm9ka2luIHdyb3RlOg0KPiA+ID4gDQo+ID4gPiA+
IEhpIE1pa3VsYXMsDQo+ID4gPiA+IA0KPiA+ID4gPiBPbiBTdW4sIDIwMTgtMDYtMDMgYXQgMTY6
NDEgKzAyMDAsIE1pa3VsYXMgUGF0b2NrYSB3cm90ZToNCj4gPiA+ID4gPiBNb2Rlcm4gcHJvY2Vz
c29ycyBjYW4gZGV0ZWN0IGxpbmVhciBtZW1vcnkgYWNjZXNzZXMgYW5kIHByZWZldGNoIGRhdGEN
Cj4gPiA+ID4gPiBhdXRvbWF0aWNhbGx5LCBzbyB0aGVyZSdzIG5vIG5lZWQgdG8gdXNlIHByZWZl
dGNoLg0KPiA+ID4gPiANCj4gPiA+ID4gTm90IGVhY2ggYW5kIGV2ZXJ5IENQVSB0aGF0J3MgY2Fw
YWJsZSBvZiBydW5uaW5nIExpbnV4IGhhcyBwcmVmZXRjaA0KPiA+ID4gPiBmdW5jdGlvbmFsaXR5
IDopDQo+ID4gPiA+IA0KPiA+ID4gPiBTdGlsbCByZWFkLW9uLi4uDQo+ID4gPiA+IA0KPiA+ID4g
PiA+IFNpZ25lZC1vZmYtYnk6IE1pa3VsYXMgUGF0b2NrYSA8bXBhdG9ja2FAcmVkaGF0LmNvbT4N
Cj4gPiA+ID4gPiANCj4gPiA+ID4gPiAtLS0NCj4gPiA+ID4gPiAgZHJpdmVycy9ncHUvZHJtL3Vk
bC91ZGxfdHJhbnNmZXIuYyB8ICAgIDcgLS0tLS0tLQ0KPiA+ID4gPiA+ICAxIGZpbGUgY2hhbmdl
ZCwgNyBkZWxldGlvbnMoLSkNCj4gPiA+ID4gPiANCj4gPiA+ID4gPiBJbmRleDogbGludXgtNC4x
Ni4xMi9kcml2ZXJzL2dwdS9kcm0vdWRsL3VkbF90cmFuc2Zlci5jDQo+ID4gPiA+ID4gPT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PQ0KPiA+ID4gPiA+IC0tLSBsaW51eC00LjE2LjEyLm9yaWcvZHJpdmVycy9ncHUvZHJtL3Vk
bC91ZGxfdHJhbnNmZXIuYwkyMDE4LTA1LTMxIDE0OjQ4OjEyLjAwMDAwMDAwMCArMDIwMA0KPiA+
ID4gPiA+ICsrKyBsaW51eC00LjE2LjEyL2RyaXZlcnMvZ3B1L2RybS91ZGwvdWRsX3RyYW5zZmVy
LmMJMjAxOC0wNS0zMSAxNDo0ODoxMi4wMDAwMDAwMDAgKzAyMDANCj4gPiA+ID4gPiBAQCAtMTMs
NyArMTMsNiBAQA0KPiA+ID4gPiA+ICAjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQo+ID4gPiA+
ID4gICNpbmNsdWRlIDxsaW51eC9zbGFiLmg+DQo+ID4gPiA+ID4gICNpbmNsdWRlIDxsaW51eC9m
Yi5oPg0KPiA+ID4gPiA+IC0jaW5jbHVkZSA8bGludXgvcHJlZmV0Y2guaD4NCj4gPiA+ID4gPiAg
I2luY2x1ZGUgPGFzbS91bmFsaWduZWQuaD4NCj4gPiA+ID4gPiAgDQo+ID4gPiA+ID4gICNpbmNs
dWRlIDxkcm0vZHJtUC5oPg0KPiA+ID4gPiA+IEBAIC01MSw5ICs1MCw2IEBAIHN0YXRpYyBpbnQg
dWRsX3RyaW1faGxpbmUoY29uc3QgdTggKmJiYWMNCj4gPiA+ID4gPiAgCWludCBzdGFydCA9IHdp
ZHRoOw0KPiA+ID4gPiA+ICAJaW50IGVuZCA9IHdpZHRoOw0KPiA+ID4gPiA+ICANCj4gPiA+ID4g
PiAtCXByZWZldGNoKCh2b2lkICopIGZyb250KTsNCj4gPiA+ID4gPiAtCXByZWZldGNoKCh2b2lk
ICopIGJhY2spOw0KPiA+ID4gPiANCj4gPiA+ID4gQUZBSUsgcHJlZmV0Y2hlciBmZXRjaGVzIG5l
dyBkYXRhIGFjY29yZGluZyB0byBhIGtub3duIGhpc3RvcnkuLi4gaS5lLiBiYXNlZCBvbiBwcmV2
aW91c2x5DQo+ID4gPiA+IHVzZWQgcGF0dGVybiB3ZSdsbCB0cnlpbmcgdG8gZ2V0IHRoZSBuZXh0
IGJhdGNoIG9mIGRhdGEuDQo+ID4gPiA+IA0KPiA+ID4gPiBCdXQgdGhlIGNvZGUgYWJvdmUgaXMg
aW4gdGhlIHZlcnkgYmVnaW5uaW5nIG9mIHRoZSBkYXRhIHByb2Nlc3Npbmcgcm91dGluZSB3aGVy
ZQ0KPiA+ID4gPiBwcmVmZXRjaGVyIGRvZXNuJ3QgeWV0IGhhdmUgYW55IGhpc3RvcnkgdG8ga25v
dyB3aGF0IGFuZCB3aGVyZSB0byBwcmVmZXRjaC4NCj4gPiA+ID4gDQo+ID4gPiA+IFNvIEknZCBz
YXkgdGhpcyBwYXJ0aWN1bGFyIHVzYWdlIGlzIGdvb2QuDQo+ID4gPiA+IEF0IGxlYXN0IHRob3Nl
IHByZWZldGNoZXMgc2hvdWxkbid0IGh1cnQgYmVjYXVzZSB0eXBpY2FsbHkgaXQNCj4gPiA+ID4g
d291bGQgYmUganVzdCAxIGluc3RydWN0aW9uIGlmIHRob3NlIGV4aXN0IG9yIG5vdGhpbmcgaWYg
Q1BVL2NvbXBpbGVyIGRvZXNuJ3QNCj4gPiA+ID4gc3VwcG9ydCBpdC4NCj4gPiA+IA0KPiA+ID4g
U2VlIHRoaXMgcG9zdCBodHRwczovL3VybGRlZmVuc2UucHJvb2Zwb2ludC5jb20vdjIvdXJsP3U9
aHR0cHMtM0FfX2x3bi5uZXRfQXJ0aWNsZXNfNDQ0MzM2XyZkPUR3SUJBZyZjPURQTDZfWF82SmtY
Rng3QVhXcUIwdGcmcj1scWRlZVNTRWVzMEdGRERsDQo+ID4gPiA2NTZlDQo+ID4gPiBWaVhPN2Jy
ZVM1NXl0V2tocGs1UjgxSSZtPWE1UmFxSnR2YWpGa00xaEw3Yk9LRDVqVjdjcEZmVHZHMlkxY1lD
ZEJQZDAmcz13MFc4d0Z0QWdFTnA4VEU2UnpkUEdoZEtSYXNKY19vdEluMDhWMEVrZ3JZJmU9IHdo
ZXJlIHRoZXkgbWVhc3VyZWQgdGhhdCANCj4gPiA+IHByZWZldGNoIGh1cnRzIHBlcmZvcm1hbmNl
LiBQcmVmZXRjaCBzaG91bGRuJ3QgYmUgdXNlZCB1bmxlc3MgeW91IGhhdmUgYSANCj4gPiA+IHBy
b29mIHRoYXQgaXQgaW1wcm92ZXMgcGVyZm9ybWFuY2UuDQo+ID4gPiANCj4gPiA+IFRoZSBwcm9i
bGVtIGlzIHRoYXQgdGhlIHByZWZldGNoIGluc3RydWN0aW9uIGNhdXNlcyBzdGFsbHMgaW4gdGhl
IHBpcGVsaW5lIA0KPiA+ID4gd2hlbiBpdCBlbmNvdW50ZXJzIFRMQiBtaXNzIGFuZCB0aGUgYXV0
b21hdGljIHByZWZldGNoZXIgZG9lc24ndC4NCj4gPiANCj4gPiBXb3csIHRoYW5rcyBmb3IgdGhl
IGxpbmsuDQo+ID4gSSBkaWRuJ3Qga25vdyBhYm91dCB0aGF0IHN1YnRsZSBpc3N1ZSB3aXRoIHBy
ZWZldGNoIGluc3RydWN0aW9ucyBvbiBBUk0gYW5kIHg4Ni4NCj4gPiANCj4gPiBTbyBPSyBpbiBj
YXNlIG9mIFVETCB0aGVzZSBwcmVmZXRjaGVzIGFueXdheXMgbWFrZSBub3Qgbm90IG11Y2ggc2Vu
c2UgSSBndWVzcyBhbmQgdGhlcmUncw0KPiA+IHNvbWV0aGluZyB3b3JzZSBzdGlsbCwgc2VlIHdo
YXQgSSd2ZSBnb3QgZnJvbSBXYW5kQm9hcmQgUXVhZCBydW5uaW5nIGttc2N1YmUgWzFdIGFwcGxp
Y2F0aW9uDQo+ID4gd2l0aCBoZWxwIG9mIHBlcmYgdXRpbGl0eToNCj4gPiAtLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0+OC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCj4gPiAjIE92ZXJoZWFk
ICBDb21tYW5kICBTaGFyZWQgT2JqZWN0ICAgICAgICAgICAgU3ltYm9sIA0KPiA+ICMgLi4uLi4u
Li4gIC4uLi4uLi4gIC4uLi4uLi4uLi4uLi4uLi4uLi4uLi4uICAuLi4uLi4uLi4uLi4uLi4uLi4u
Li4uLi4uLi4uLi4uLi4uLi4uLi4uDQo+ID4gIw0KPiA+ICAgICA5Mi45MyUgIGttc2N1YmUgIFtr
ZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gdWRsX3JlbmRlcl9obGluZQ0KPiA+ICAgICAgMi41
MSUgIGttc2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gX19kaXZzaTMNCj4gPiAg
ICAgIDAuMzMlICBrbXNjdWJlICBba2VybmVsLmthbGxzeW1zXSAgICAgICAgW2tdIF9yYXdfc3Bp
bl91bmxvY2tfaXJxcmVzdG9yZQ0KPiA+ICAgICAgMC4yMiUgIGttc2N1YmUgIFtrZXJuZWwua2Fs
bHN5bXNdICAgICAgICBba10gbG9ja19hY3F1aXJlDQo+ID4gICAgICAwLjE5JSAga21zY3ViZSAg
W2tlcm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSBfcmF3X3NwaW5fdW5sb2NrX2lycQ0KPiA+ICAg
ICAgMC4xNyUgIGttc2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gdWRsX2hhbmRs
ZV9kYW1hZ2UNCj4gPiAgICAgIDAuMTIlICBrbXNjdWJlICBba2VybmVsLmthbGxzeW1zXSAgICAg
ICAgW2tdIHY3X2RtYV9jbGVhbl9yYW5nZQ0KPiA+ICAgICAgMC4xMSUgIGttc2N1YmUgIFtrZXJu
ZWwua2FsbHN5bXNdICAgICAgICBba10gbDJjMjEwX2NsZWFuX3JhbmdlDQo+ID4gICAgICAwLjA2
JSAga21zY3ViZSAgW2tlcm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSBfX21lbXplcm8NCj4gPiAt
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0+OC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCj4g
PiANCj4gPiBUaGF0IHNhaWQgaXQncyBub3QgZXZlbiBVU0IgMi4wIHdoaWNoIGlzIGEgYm90dGxl
LW5lY2sgYnV0DQo+ID4gY29tcHV0YXRpb25zIGluIHRoZSB1ZGxfcmVuZGVyX2hsaW5lKCkuDQo+
ID4gDQo+ID4gDQo+ID4gWzFdIGh0dHBzOi8vdXJsZGVmZW5zZS5wcm9vZnBvaW50LmNvbS92Mi91
cmw/dT1odHRwcy0zQV9fY2dpdC5mcmVlZGVza3RvcC5vcmdfbWVzYV9rbXNjdWJlXyZkPUR3SUJB
ZyZjPURQTDZfWF82SmtYRng3QVhXcUIwdGcmcj1scWRlZVNTRWVzMEdGRERsNjUNCj4gPiA2ZVZp
WE83YnJlUzU1eXRXa2hwazVSODFJJm09cGo2MC0yYUxRZWVQdXpLX2xvMGxRMWo5R2Vud1Nualo2
VW1JM3JfbmJCVSZzPUpNVWszX1lkT3BFUVRiSXlBczBoR3ZiVWdOUmhuNHl0bElhSjlpRV9MYmsm
ZT0NCj4gPiANCj4gPiAtQWxleGV5DQo+IA0KPiBUcnkgdGhpcyBwYXRjaCANCj4gaHR0cHM6Ly91
cmxkZWZlbnNlLnByb29mcG9pbnQuY29tL3YyL3VybD91PWh0dHAtM0FfX3Blb3BsZS5yZWRoYXQu
Y29tXy03RW1wYXRvY2thX3BhdGNoZXNfa2VybmVsX3VkbF91ZGxrbXMtMkRhdm9pZC0yRGRpdmlz
aW9uLnBhdGNoJmQ9RHdJQkFnJmM9RFBMNg0KPiBfWF82SmtYRng3QVhXcUIwdGcmcj1scWRlZVNT
RWVzMEdGRERsNjU2ZVZpWE83YnJlUzU1eXRXa2hwazVSODFJJm09cGo2MC0NCj4gMmFMUWVlUHV6
S19sbzBsUTFqOUdlbndTbmpaNlVtSTNyX25iQlUmcz1ITkptQWFEaDJ1bzl3dFlzQ2h3dV9UcGVP
WTVQbWtVRHBpY0ZBcnpLM3dFJmU9DQo+IA0KPiBJdCBpcyBkb2luZyBhIGxvdCBvZiBkaXZpc2lv
bnMgLSBhbmQgV2FuZEJvYXJkIGhhcyBDb3J0ZXgtQTksIHRoYXQgZG9lc24ndCANCj4gaGF2ZSBk
aXZpc2lvbiBpbnN0cnVjdGlvbi4NCg0KTG9va3MgbGlrZSB0aGF0IHBhdGNoIG1ha2VzIG5vdCB0
aGF0IG11Y2ggb2YgYSBkaWZmZXJlbmNlLg0KDQpCZWxvdyBhcmUgcmVzdWx0cyBvZiBwZXJmLg0K
DQpXaXRob3V0IHBhdGNoOg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0+OC0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCiA5MS40NiUga21zY3ViZSAgW2tl
cm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSB1ZGxfcmVuZGVyX2hsaW5lDQogMi4xNSUgIGttc2N1
YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gX19kaXZzaTMNCiAxLjA5JSAga21zY3Vi
ZSAgW2tlcm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSBtbWlvc2V0DQogMC40OCUgIGttc2N1YmUg
IFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gX3Jhd19zcGluX3VubG9ja19pcnENCiAwLjQ4
JSAga21zY3ViZSAgW2tlcm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSB2N19kbWFfY2xlYW5fcmFu
Z2UNCiAwLjM0JSAga21zY3ViZSAgW2tlcm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSBfcmF3X3Nw
aW5fdW5sb2NrX2lycXJlc3RvcmUNCiAwLjI4JSAga21zY3ViZSAgW2tlcm5lbC5rYWxsc3ltc10g
ICAgICAgIFtrXSBsMmMyMTBfY2xlYW5fcmFuZ2UNCiAwLjI1JSAga21zY3ViZSAgW2tlcm5lbC5r
YWxsc3ltc10gICAgICAgIFtrXSBzaG1lbV9nZXRwYWdlX2dmcC5jb25zdHByb3AuNA0KIDAuMTgl
ICBrbXNjdWJlICBba2VybmVsLmthbGxzeW1zXSAgICAgICAgW2tdIGxvY2tfYWNxdWlyZQ0KDQpX
aXRoIHBhdGNoOg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0+OC0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCiA5NC44MSUga21zY3ViZSAgW2tlcm5lbC5r
YWxsc3ltc10gICAgICAgIFtrXSB1ZGxfcmVuZGVyX2hsaW5lDQogMC44OCUgIGttc2N1YmUgIFtr
ZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gbW1pb3NldA0KIDAuNDQlICBrbXNjdWJlICBba2Vy
bmVsLmthbGxzeW1zXSAgICAgICAgW2tdIF9yYXdfc3Bpbl91bmxvY2tfaXJxDQogMC4zOCUgIGtt
c2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gdjdfZG1hX2NsZWFuX3JhbmdlDQog
MC4zMyUgIGttc2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAgICBba10gX3Jhd19zcGluX3Vu
bG9ja19pcnFyZXN0b3JlDQogMC4yMiUgIGttc2N1YmUgIFtrZXJuZWwua2FsbHN5bXNdICAgICAg
ICBba10gbDJjMjEwX2NsZWFuX3JhbmdlDQogMC4xNiUgIGttc2N1YmUgIFtrZXJuZWwua2FsbHN5
bXNdICAgICAgICBba10gc2htZW1fZ2V0cGFnZV9nZnAuY29uc3Rwcm9wLjQNCiAwLjE2JSAga21z
Y3ViZSAgW2tlcm5lbC5rYWxsc3ltc10gICAgICAgIFtrXSB1ZGxfaGFuZGxlX2RhbWFnZQ0KIDAu
MTYlICBrbXNjdWJlICBba2VybmVsLmthbGxzeW1zXSAgICAgICAgW2tdIGxvY2tfYWNxdWlyZQ0K
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0+OC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0NCg0KVGhlcmUgaXMgbm8gbW9kZSBfX2RpdnNpMyBmdW5jdGlvbiBp
biBwZXJmIHJlc3VsdHMsIHNvIHBhdGNoIHdvcmtzLg0KQnV0IHRoaXMgZnVuY3Rpb24gd2FzIGp1
c3QgfjIlIG9mIG92ZXJhbGwgb3ZlcmhlYWQuDQoNCi1BbGV4ZXk

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-15 16:30             ` Alexey Brodkin
  0 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-15 16:30 UTC (permalink / raw)
  To: linux-snps-arc

Hi Mikulas,

On Wed, 2018-06-06@11:46 -0400, Mikulas Patocka wrote:
> 
> On Wed, 6 Jun 2018, Alexey Brodkin wrote:
> 
> > Hi Mikulas,
> > 
> > On Tue, 2018-06-05@11:30 -0400, Mikulas Patocka wrote:
> > > 
> > > On Tue, 5 Jun 2018, Alexey Brodkin wrote:
> > > 
> > > > Hi Mikulas,
> > > > 
> > > > On Sun, 2018-06-03@16:41 +0200, Mikulas Patocka wrote:
> > > > > Modern processors can detect linear memory accesses and prefetch data
> > > > > automatically, so there's no need to use prefetch.
> > > > 
> > > > Not each and every CPU that's capable of running Linux has prefetch
> > > > functionality :)
> > > > 
> > > > Still read-on...
> > > > 
> > > > > Signed-off-by: Mikulas Patocka <mpatocka at redhat.com>
> > > > > 
> > > > > ---
> > > > >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> > > > >  1 file changed, 7 deletions(-)
> > > > > 
> > > > > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > > > > ===================================================================
> > > > > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > > @@ -13,7 +13,6 @@
> > > > >  #include <linux/module.h>
> > > > >  #include <linux/slab.h>
> > > > >  #include <linux/fb.h>
> > > > > -#include <linux/prefetch.h>
> > > > >  #include <asm/unaligned.h>
> > > > >  
> > > > >  #include <drm/drmP.h>
> > > > > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> > > > >  	int start = width;
> > > > >  	int end = width;
> > > > >  
> > > > > -	prefetch((void *) front);
> > > > > -	prefetch((void *) back);
> > > > 
> > > > AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> > > > used pattern we'll trying to get the next batch of data.
> > > > 
> > > > But the code above is in the very beginning of the data processing routine where
> > > > prefetcher doesn't yet have any history to know what and where to prefetch.
> > > > 
> > > > So I'd say this particular usage is good.
> > > > At least those prefetches shouldn't hurt because typically it
> > > > would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> > > > support it.
> > > 
> > > See this post https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_444336_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl
> > > 656e
> > > ViXO7breS55ytWkhpk5R81I&m=a5RaqJtvajFkM1hL7bOKD5jV7cpFfTvG2Y1cYCdBPd0&s=w0W8wFtAgENp8TE6RzdPGhdKRasJc_otIn08V0EkgrY&e= where they measured that 
> > > prefetch hurts performance. Prefetch shouldn't be used unless you have a 
> > > proof that it improves performance.
> > > 
> > > The problem is that the prefetch instruction causes stalls in the pipeline 
> > > when it encounters TLB miss and the automatic prefetcher doesn't.
> > 
> > Wow, thanks for the link.
> > I didn't know about that subtle issue with prefetch instructions on ARM and x86.
> > 
> > So OK in case of UDL these prefetches anyways make not not much sense I guess and there's
> > something worse still, see what I've got from WandBoard Quad running kmscube [1] application
> > with help of perf utility:
> > --------------------------->8-------------------------
> > # Overhead  Command  Shared Object            Symbol 
> > # ........  .......  .......................  ........................................
> > #
> >     92.93%  kmscube  [kernel.kallsyms]        [k] udl_render_hline
> >      2.51%  kmscube  [kernel.kallsyms]        [k] __divsi3
> >      0.33%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
> >      0.22%  kmscube  [kernel.kallsyms]        [k] lock_acquire
> >      0.19%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
> >      0.17%  kmscube  [kernel.kallsyms]        [k] udl_handle_damage
> >      0.12%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
> >      0.11%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
> >      0.06%  kmscube  [kernel.kallsyms]        [k] __memzero
> > --------------------------->8-------------------------
> > 
> > That said it's not even USB 2.0 which is a bottle-neck but
> > computations in the udl_render_hline().
> > 
> > 
> > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__cgit.freedesktop.org_mesa_kmscube_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl65
> > 6eViXO7breS55ytWkhpk5R81I&m=pj60-2aLQeePuzK_lo0lQ1j9GenwSnjZ6UmI3r_nbBU&s=JMUk3_YdOpEQTbIyAs0hGvbUgNRhn4ytlIaJ9iE_Lbk&e=
> > 
> > -Alexey
> 
> Try this patch 
> https://urldefense.proofpoint.com/v2/url?u=http-3A__people.redhat.com_-7Empatocka_patches_kernel_udl_udlkms-2Davoid-2Ddivision.patch&d=DwIBAg&c=DPL6
> _X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl656eViXO7breS55ytWkhpk5R81I&m=pj60-
> 2aLQeePuzK_lo0lQ1j9GenwSnjZ6UmI3r_nbBU&s=HNJmAaDh2uo9wtYsChwu_TpeOY5PmkUDpicFArzK3wE&e=
> 
> It is doing a lot of divisions - and WandBoard has Cortex-A9, that doesn't 
> have division instruction.

Looks like that patch makes not that much of a difference.

Below are results of perf.

Without patch:
----------------------------->8-------------------------------------------
 91.46% kmscube  [kernel.kallsyms]        [k] udl_render_hline
 2.15%  kmscube  [kernel.kallsyms]        [k] __divsi3
 1.09%  kmscube  [kernel.kallsyms]        [k] mmioset
 0.48%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
 0.48%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
 0.34%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
 0.28%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
 0.25%  kmscube  [kernel.kallsyms]        [k] shmem_getpage_gfp.constprop.4
 0.18%  kmscube  [kernel.kallsyms]        [k] lock_acquire

With patch:
----------------------------->8-------------------------------------------
 94.81% kmscube  [kernel.kallsyms]        [k] udl_render_hline
 0.88%  kmscube  [kernel.kallsyms]        [k] mmioset
 0.44%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
 0.38%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
 0.33%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
 0.22%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
 0.16%  kmscube  [kernel.kallsyms]        [k] shmem_getpage_gfp.constprop.4
 0.16%  kmscube  [kernel.kallsyms]        [k] udl_handle_damage
 0.16%  kmscube  [kernel.kallsyms]        [k] lock_acquire
----------------------------->8-------------------------------------------

There is no mode __divsi3 function in perf results, so patch works.
But this function was just ~2% of overall overhead.

-Alexey

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/21] udl-kms: avoid prefetch
@ 2018-06-15 16:30             ` Alexey Brodkin
  0 siblings, 0 replies; 75+ messages in thread
From: Alexey Brodkin @ 2018-06-15 16:30 UTC (permalink / raw)
  To: mpatocka
  Cc: linux-fbdev, ladis, b.zolnierkie, bernie, dri-devel,
	Evgeniy Didin, airlied, linux-snps-arc

Hi Mikulas,

On Wed, 2018-06-06 at 11:46 -0400, Mikulas Patocka wrote:
> 
> On Wed, 6 Jun 2018, Alexey Brodkin wrote:
> 
> > Hi Mikulas,
> > 
> > On Tue, 2018-06-05 at 11:30 -0400, Mikulas Patocka wrote:
> > > 
> > > On Tue, 5 Jun 2018, Alexey Brodkin wrote:
> > > 
> > > > Hi Mikulas,
> > > > 
> > > > On Sun, 2018-06-03 at 16:41 +0200, Mikulas Patocka wrote:
> > > > > Modern processors can detect linear memory accesses and prefetch data
> > > > > automatically, so there's no need to use prefetch.
> > > > 
> > > > Not each and every CPU that's capable of running Linux has prefetch
> > > > functionality :)
> > > > 
> > > > Still read-on...
> > > > 
> > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > > 
> > > > > ---
> > > > >  drivers/gpu/drm/udl/udl_transfer.c |    7 -------
> > > > >  1 file changed, 7 deletions(-)
> > > > > 
> > > > > Index: linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c
> > > > > ===================================================================
> > > > > --- linux-4.16.12.orig/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > > +++ linux-4.16.12/drivers/gpu/drm/udl/udl_transfer.c	2018-05-31 14:48:12.000000000 +0200
> > > > > @@ -13,7 +13,6 @@
> > > > >  #include <linux/module.h>
> > > > >  #include <linux/slab.h>
> > > > >  #include <linux/fb.h>
> > > > > -#include <linux/prefetch.h>
> > > > >  #include <asm/unaligned.h>
> > > > >  
> > > > >  #include <drm/drmP.h>
> > > > > @@ -51,9 +50,6 @@ static int udl_trim_hline(const u8 *bbac
> > > > >  	int start = width;
> > > > >  	int end = width;
> > > > >  
> > > > > -	prefetch((void *) front);
> > > > > -	prefetch((void *) back);
> > > > 
> > > > AFAIK prefetcher fetches new data according to a known history... i.e. based on previously
> > > > used pattern we'll trying to get the next batch of data.
> > > > 
> > > > But the code above is in the very beginning of the data processing routine where
> > > > prefetcher doesn't yet have any history to know what and where to prefetch.
> > > > 
> > > > So I'd say this particular usage is good.
> > > > At least those prefetches shouldn't hurt because typically it
> > > > would be just 1 instruction if those exist or nothing if CPU/compiler doesn't
> > > > support it.
> > > 
> > > See this post https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_444336_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl
> > > 656e
> > > ViXO7breS55ytWkhpk5R81I&m=a5RaqJtvajFkM1hL7bOKD5jV7cpFfTvG2Y1cYCdBPd0&s=w0W8wFtAgENp8TE6RzdPGhdKRasJc_otIn08V0EkgrY&e= where they measured that 
> > > prefetch hurts performance. Prefetch shouldn't be used unless you have a 
> > > proof that it improves performance.
> > > 
> > > The problem is that the prefetch instruction causes stalls in the pipeline 
> > > when it encounters TLB miss and the automatic prefetcher doesn't.
> > 
> > Wow, thanks for the link.
> > I didn't know about that subtle issue with prefetch instructions on ARM and x86.
> > 
> > So OK in case of UDL these prefetches anyways make not not much sense I guess and there's
> > something worse still, see what I've got from WandBoard Quad running kmscube [1] application
> > with help of perf utility:
> > --------------------------->8-------------------------
> > # Overhead  Command  Shared Object            Symbol 
> > # ........  .......  .......................  ........................................
> > #
> >     92.93%  kmscube  [kernel.kallsyms]        [k] udl_render_hline
> >      2.51%  kmscube  [kernel.kallsyms]        [k] __divsi3
> >      0.33%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
> >      0.22%  kmscube  [kernel.kallsyms]        [k] lock_acquire
> >      0.19%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
> >      0.17%  kmscube  [kernel.kallsyms]        [k] udl_handle_damage
> >      0.12%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
> >      0.11%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
> >      0.06%  kmscube  [kernel.kallsyms]        [k] __memzero
> > --------------------------->8-------------------------
> > 
> > That said it's not even USB 2.0 which is a bottle-neck but
> > computations in the udl_render_hline().
> > 
> > 
> > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__cgit.freedesktop.org_mesa_kmscube_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl65
> > 6eViXO7breS55ytWkhpk5R81I&m=pj60-2aLQeePuzK_lo0lQ1j9GenwSnjZ6UmI3r_nbBU&s=JMUk3_YdOpEQTbIyAs0hGvbUgNRhn4ytlIaJ9iE_Lbk&e=
> > 
> > -Alexey
> 
> Try this patch 
> https://urldefense.proofpoint.com/v2/url?u=http-3A__people.redhat.com_-7Empatocka_patches_kernel_udl_udlkms-2Davoid-2Ddivision.patch&d=DwIBAg&c=DPL6
> _X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl656eViXO7breS55ytWkhpk5R81I&m=pj60-
> 2aLQeePuzK_lo0lQ1j9GenwSnjZ6UmI3r_nbBU&s=HNJmAaDh2uo9wtYsChwu_TpeOY5PmkUDpicFArzK3wE&e=
> 
> It is doing a lot of divisions - and WandBoard has Cortex-A9, that doesn't 
> have division instruction.

Looks like that patch makes not that much of a difference.

Below are results of perf.

Without patch:
----------------------------->8-------------------------------------------
 91.46% kmscube  [kernel.kallsyms]        [k] udl_render_hline
 2.15%  kmscube  [kernel.kallsyms]        [k] __divsi3
 1.09%  kmscube  [kernel.kallsyms]        [k] mmioset
 0.48%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
 0.48%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
 0.34%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
 0.28%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
 0.25%  kmscube  [kernel.kallsyms]        [k] shmem_getpage_gfp.constprop.4
 0.18%  kmscube  [kernel.kallsyms]        [k] lock_acquire

With patch:
----------------------------->8-------------------------------------------
 94.81% kmscube  [kernel.kallsyms]        [k] udl_render_hline
 0.88%  kmscube  [kernel.kallsyms]        [k] mmioset
 0.44%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irq
 0.38%  kmscube  [kernel.kallsyms]        [k] v7_dma_clean_range
 0.33%  kmscube  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
 0.22%  kmscube  [kernel.kallsyms]        [k] l2c210_clean_range
 0.16%  kmscube  [kernel.kallsyms]        [k] shmem_getpage_gfp.constprop.4
 0.16%  kmscube  [kernel.kallsyms]        [k] udl_handle_damage
 0.16%  kmscube  [kernel.kallsyms]        [k] lock_acquire
----------------------------->8-------------------------------------------

There is no mode __divsi3 function in perf results, so patch works.
But this function was just ~2% of overall overhead.

-Alexey

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 18/21] udlfb: allow reallocating the framebuffer
  2018-06-12 16:32       ` Mikulas Patocka
@ 2018-07-03 14:58         ` Bartlomiej Zolnierkiewicz
  -1 siblings, 0 replies; 75+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2018-07-03 14:58 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: linux-fbdev, Ladislav Michl, kbuild test robot, Bernie Thompson,
	dri-devel, kbuild-all, Dave Airlie

On Tuesday, June 12, 2018 12:32:34 PM Mikulas Patocka wrote:
> 
> On Mon, 4 Jun 2018, kbuild test robot wrote:
> 
> > Hi Mikulas,
> > 
> > I love your patch! Perhaps something to improve:
> > 
> > [auto build test WARNING on drm/drm-next]
> > [also build test WARNING on v4.17-rc7 next-20180601]
> > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> > 
> > url:    https://github.com/0day-ci/linux/commits/Mikulas-Patocka/USB-DisplayLink-patches/20180603-233013
> 
> What is it really complaining about? That URL shows 404 Not Found and this 
> email has no warnings at all.

screen_base in struct fb_info is annotated with __iomem tag:

...
		char __iomem *screen_base;	/* Virtual address */
...

and this tag should be preserved (or explicitly casted).

> > base:   git://people.freedesktop.org/~airlied/linux.git drm-next
> > reproduce:
> >         # apt-get install sparse
> >         make ARCH=x86_64 allmodconfig
> >         make C=1 CF=-D__CHECK_ENDIAN__

You should be able to reproduce the issue with the above sequence.

> > sparse warnings: (new ones prefixed by >>)

[...]

> >   1178		u32 old_len = info->fix.smem_len;
> > > 1179		unsigned char *old_fb = info->screen_base;
> >   1180		unsigned char *new_fb;

[...]

> >   1196			if (info->screen_base) {
> >   1197				memcpy(new_fb, old_fb, old_len);
> > > 1198				dlfb_deferred_vfree(dlfb, info->screen_base);
> >   1199			}

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 18/21] udlfb: allow reallocating the framebuffer
@ 2018-07-03 14:58         ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 75+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2018-07-03 14:58 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: linux-fbdev, Ladislav Michl, kbuild test robot, Bernie Thompson,
	dri-devel, kbuild-all, Dave Airlie

On Tuesday, June 12, 2018 12:32:34 PM Mikulas Patocka wrote:
> 
> On Mon, 4 Jun 2018, kbuild test robot wrote:
> 
> > Hi Mikulas,
> > 
> > I love your patch! Perhaps something to improve:
> > 
> > [auto build test WARNING on drm/drm-next]
> > [also build test WARNING on v4.17-rc7 next-20180601]
> > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> > 
> > url:    https://github.com/0day-ci/linux/commits/Mikulas-Patocka/USB-DisplayLink-patches/20180603-233013
> 
> What is it really complaining about? That URL shows 404 Not Found and this 
> email has no warnings at all.

screen_base in struct fb_info is annotated with __iomem tag:

...
		char __iomem *screen_base;	/* Virtual address */
...

and this tag should be preserved (or explicitly casted).

> > base:   git://people.freedesktop.org/~airlied/linux.git drm-next
> > reproduce:
> >         # apt-get install sparse
> >         make ARCH=x86_64 allmodconfig
> >         make C=1 CF=-D__CHECK_ENDIAN__

You should be able to reproduce the issue with the above sequence.

> > sparse warnings: (new ones prefixed by >>)

[...]

> >   1178		u32 old_len = info->fix.smem_len;
> > > 1179		unsigned char *old_fb = info->screen_base;
> >   1180		unsigned char *new_fb;

[...]

> >   1196			if (info->screen_base) {
> >   1197				memcpy(new_fb, old_fb, old_len);
> > > 1198				dlfb_deferred_vfree(dlfb, info->screen_base);
> >   1199			}

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
  2018-06-04 14:14     ` Mikulas Patocka
@ 2018-07-04  8:04       ` Daniel Vetter
  -1 siblings, 0 replies; 75+ messages in thread
From: Daniel Vetter @ 2018-07-04  8:04 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Linux Fbdev development list, Ladislav Michl,
	Bartlomiej Zolnierkiewicz, Bernie Thompson, dri-devel,
	Dave Airlie

On Mon, Jun 04, 2018 at 10:14:02AM -0400, Mikulas Patocka wrote:
> 
> 
> On Mon, 4 Jun 2018, Dave Airlie wrote:
> 
> > On 4 June 2018 at 00:40, Mikulas Patocka <mpatocka@redhat.com> wrote:
> > > Hi
> > >
> > > Here I'm sending bug fixes and performance improvements for the USB
> > > DisplayLink framebuffer and modesetting drivers for this merge window.
> > >
> > 
> > Hi,
> > 
> > You probably want to split these up into separate series for the kms and fbdev
> > drivers.
> > 
> > Otherwise at least for drm you've missed this merge window, since it
> > closes around rc6 of the previous kernel,
> 
> Could you apply at least the fbdefio patches (without them, fbdefio is 
> unusable due to crashes) and the display corruption of the last line 
> (because most people will hit it)?
> 
> > did you use git send-email
> > for these patches, at least some of them viewed funny on my phone,
> 
> I used the command "quilt mail". I use quilt, not git, for management of 
> my patches.
> 
> > I'll try and look over the kms ones soon. Do you have any numbers for
> > improvements to the kms ones?
> > 
> > Dave.
> 
> I measured performance improvement on the framebuffer patches. The kms 
> driver already performs well, there's not much to do.
> 
> I'd like to as you if you could review the patch "udl-kms: fix a 
> linked-list corruption when using fbdefio" - for me it fixes the crashes, 
> but I am not expert in modesetting drivers and I don't know if some other 
> part of the kernel assumes that the framebuffer pages must be allocated 
> with drm_gem_get_pages.
> 
> 
> BTW. When I unplug the USB adapter while using the modesetting driver, I 
> get this warning. Do you have an idea how to fix it?

Probably the driver is missing a proper shutdown call (for atomic drivers
this would be drm_atomic_helper_shutdown), leaving the connector active,
which leaves it's reference count elevated. Or something like that.

Aside: Should we maintain uld as part of drm-misc under the small drivers
topic? Or is this patch pile here more a one-shot effort? See

https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-misc.html#small-drivers

Cheers, Daniel

> 
> WARNING: CPU: 0 PID: 61 at drivers/gpu/drm/drm_mode_config.c:439 drm_mode_config_cleanup+0x250/0x2b8 [drm]
> Modules linked in: udlfb hid_generic usbhid hid tun bridge stp llc autofs4 binfmt_misc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_conntrack xt_multiport iptable_filter iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables x_tables pppoe pppox af_packet ppp_generic slhc udl drm_kms_helper cfbfillrect cfbimgblt cfbcopyarea drm drm_panel_orientation_quirks syscopyarea sysfillrect sysimgblt fb_sys_fops fb font snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_pcm snd_timer snd soundcore nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack sd_mod ipv6 aes_ce_blk crypto_simd cryptd aes_ce_cipher crc32_ce ghash_ce gf128mul aes_arm64 sha2_ce
>  sha256_arm64 sha1_ce xhci_plat_hcd xhci_hcd sha1_generic usbcore usb_common ahci_platform libahci_platform libahci mvpp2 unix
> CPU: 0 PID: 61 Comm: kworker/0:2 Not tainted 4.17.0-rc7 #1
> Hardware name: Marvell 8040 MACCHIATOBin (DT)
> Workqueue: usb_hub_wq hub_event [usbcore]
> pstate: 80000005 (Nzcv daif -PAN -UAO)
> pc : drm_mode_config_cleanup+0x250/0x2b8 [drm]
> lr : drm_mode_config_cleanup+0x88/0x2b8 [drm]
> sp : ffffffc13a643920
> x29: ffffffc13a643920 x28: ffffffc13a63ac00
> x27: ffffffc1380962c0 x26: ffffffc11b95f898
> x25: ffffff8000afd1d8 x24: ffffffc11b95f800
> x23: 0000000000000060 x22: ffffff8000afd240
> x21: ffffffc11b95eb38 x20: ffffffc11b95e800
> x19: ffffffc11b95eb30 x18: ffffffc11b95ea7c
> x17: 0000007fa3591b60 x16: ffffff80081dc178
> x15: ffffffc11b95ea78 x14: 0000000000000000
> x13: ffffffc12b3fc000 x12: ffffffc12b3fc028
> x11: ffffffc12b3fc119 x10: 000000000000001f
> x9 : 0000000000000028 x8 : ffffff8000a84000
> x7 : 0000000000000000 x6 : 0000000000000001
> x5 : 0000000000000002 x4 : 0000000000000001
> x3 : 0000000000000002 x2 : 000000000000002f
> x1 : ffffffc11b95eaf8 x0 : ffffffc11b978818
> Call trace:
>  drm_mode_config_cleanup+0x250/0x2b8 [drm]
>  udl_modeset_cleanup+0xc/0x18 [udl]
>  udl_driver_unload+0x30/0x50 [udl]
>  drm_dev_unregister+0x3c/0xe8 [drm]
>  drm_dev_unplug+0x18/0x70 [drm]
>  udl_usb_disconnect+0x30/0x40 [udl]
>  usb_unbind_interface+0x6c/0x290 [usbcore]
>  device_release_driver_internal+0x170/0x200
>  device_release_driver+0x14/0x20
>  bus_remove_device+0x118/0x128
>  device_del+0x110/0x308
>  usb_disable_device+0x8c/0x1f8 [usbcore]
>  usb_disconnect+0xb4/0x218 [usbcore]
>  usb_disconnect+0x9c/0x218 [usbcore]
>  usb_disconnect+0x9c/0x218 [usbcore]
>  hub_event+0xf20/0x1020 [usbcore]
>  process_one_work+0x1c8/0x310
>  worker_thread+0x44/0x450
>  kthread+0x118/0x120
>  ret_from_fork+0x10/0x18
> ---[ end trace 978a27ff198f1268 ]---
> [drm:drm_mode_config_cleanup [drm]] *ERROR* connector DVI-I-1 leaked!
> 
> Mikulas
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
@ 2018-07-04  8:04       ` Daniel Vetter
  0 siblings, 0 replies; 75+ messages in thread
From: Daniel Vetter @ 2018-07-04  8:04 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Linux Fbdev development list, Ladislav Michl,
	Bartlomiej Zolnierkiewicz, Bernie Thompson, dri-devel,
	Dave Airlie

On Mon, Jun 04, 2018 at 10:14:02AM -0400, Mikulas Patocka wrote:
> 
> 
> On Mon, 4 Jun 2018, Dave Airlie wrote:
> 
> > On 4 June 2018 at 00:40, Mikulas Patocka <mpatocka@redhat.com> wrote:
> > > Hi
> > >
> > > Here I'm sending bug fixes and performance improvements for the USB
> > > DisplayLink framebuffer and modesetting drivers for this merge window.
> > >
> > 
> > Hi,
> > 
> > You probably want to split these up into separate series for the kms and fbdev
> > drivers.
> > 
> > Otherwise at least for drm you've missed this merge window, since it
> > closes around rc6 of the previous kernel,
> 
> Could you apply at least the fbdefio patches (without them, fbdefio is 
> unusable due to crashes) and the display corruption of the last line 
> (because most people will hit it)?
> 
> > did you use git send-email
> > for these patches, at least some of them viewed funny on my phone,
> 
> I used the command "quilt mail". I use quilt, not git, for management of 
> my patches.
> 
> > I'll try and look over the kms ones soon. Do you have any numbers for
> > improvements to the kms ones?
> > 
> > Dave.
> 
> I measured performance improvement on the framebuffer patches. The kms 
> driver already performs well, there's not much to do.
> 
> I'd like to as you if you could review the patch "udl-kms: fix a 
> linked-list corruption when using fbdefio" - for me it fixes the crashes, 
> but I am not expert in modesetting drivers and I don't know if some other 
> part of the kernel assumes that the framebuffer pages must be allocated 
> with drm_gem_get_pages.
> 
> 
> BTW. When I unplug the USB adapter while using the modesetting driver, I 
> get this warning. Do you have an idea how to fix it?

Probably the driver is missing a proper shutdown call (for atomic drivers
this would be drm_atomic_helper_shutdown), leaving the connector active,
which leaves it's reference count elevated. Or something like that.

Aside: Should we maintain uld as part of drm-misc under the small drivers
topic? Or is this patch pile here more a one-shot effort? See

https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-misc.html#small-drivers

Cheers, Daniel

> 
> WARNING: CPU: 0 PID: 61 at drivers/gpu/drm/drm_mode_config.c:439 drm_mode_config_cleanup+0x250/0x2b8 [drm]
> Modules linked in: udlfb hid_generic usbhid hid tun bridge stp llc autofs4 binfmt_misc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_conntrack xt_multiport iptable_filter iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables x_tables pppoe pppox af_packet ppp_generic slhc udl drm_kms_helper cfbfillrect cfbimgblt cfbcopyarea drm drm_panel_orientation_quirks syscopyarea sysfillrect sysimgblt fb_sys_fops fb font snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_pcm snd_timer snd soundcore nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack sd_mod ipv6 aes_ce_blk crypto_simd cryptd aes_ce_cipher crc32_ce ghash_ce gf128mul aes_arm64 sha2_ce
>  sha256_arm64 sha1_ce xhci_plat_hcd xhci_hcd sha1_generic usbcore usb_common ahci_platform libahci_platform libahci mvpp2 unix
> CPU: 0 PID: 61 Comm: kworker/0:2 Not tainted 4.17.0-rc7 #1
> Hardware name: Marvell 8040 MACCHIATOBin (DT)
> Workqueue: usb_hub_wq hub_event [usbcore]
> pstate: 80000005 (Nzcv daif -PAN -UAO)
> pc : drm_mode_config_cleanup+0x250/0x2b8 [drm]
> lr : drm_mode_config_cleanup+0x88/0x2b8 [drm]
> sp : ffffffc13a643920
> x29: ffffffc13a643920 x28: ffffffc13a63ac00
> x27: ffffffc1380962c0 x26: ffffffc11b95f898
> x25: ffffff8000afd1d8 x24: ffffffc11b95f800
> x23: 0000000000000060 x22: ffffff8000afd240
> x21: ffffffc11b95eb38 x20: ffffffc11b95e800
> x19: ffffffc11b95eb30 x18: ffffffc11b95ea7c
> x17: 0000007fa3591b60 x16: ffffff80081dc178
> x15: ffffffc11b95ea78 x14: 0000000000000000
> x13: ffffffc12b3fc000 x12: ffffffc12b3fc028
> x11: ffffffc12b3fc119 x10: 000000000000001f
> x9 : 0000000000000028 x8 : ffffff8000a84000
> x7 : 0000000000000000 x6 : 0000000000000001
> x5 : 0000000000000002 x4 : 0000000000000001
> x3 : 0000000000000002 x2 : 000000000000002f
> x1 : ffffffc11b95eaf8 x0 : ffffffc11b978818
> Call trace:
>  drm_mode_config_cleanup+0x250/0x2b8 [drm]
>  udl_modeset_cleanup+0xc/0x18 [udl]
>  udl_driver_unload+0x30/0x50 [udl]
>  drm_dev_unregister+0x3c/0xe8 [drm]
>  drm_dev_unplug+0x18/0x70 [drm]
>  udl_usb_disconnect+0x30/0x40 [udl]
>  usb_unbind_interface+0x6c/0x290 [usbcore]
>  device_release_driver_internal+0x170/0x200
>  device_release_driver+0x14/0x20
>  bus_remove_device+0x118/0x128
>  device_del+0x110/0x308
>  usb_disable_device+0x8c/0x1f8 [usbcore]
>  usb_disconnect+0xb4/0x218 [usbcore]
>  usb_disconnect+0x9c/0x218 [usbcore]
>  usb_disconnect+0x9c/0x218 [usbcore]
>  hub_event+0xf20/0x1020 [usbcore]
>  process_one_work+0x1c8/0x310
>  worker_thread+0x44/0x450
>  kthread+0x118/0x120
>  ret_from_fork+0x10/0x18
> ---[ end trace 978a27ff198f1268 ]---
> [drm:drm_mode_config_cleanup [drm]] *ERROR* connector DVI-I-1 leaked!
> 
> Mikulas
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
  2018-06-05 15:34     ` Mikulas Patocka
@ 2018-07-25 13:40       ` Bartlomiej Zolnierkiewicz
  -1 siblings, 0 replies; 75+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2018-07-25 13:40 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: linux-fbdev, ladis, Alexey Brodkin, bernie, dri-devel, airlied

On Tuesday, June 05, 2018 11:34:42 AM Mikulas Patocka wrote:
> 
> On Tue, 5 Jun 2018, Alexey Brodkin wrote:
> 
> > Hi Mikulas,
> > 
> > On Sun, 2018-06-03 at 16:40 +0200, Mikulas Patocka wrote:
> > > Hi
> > > 
> > > Here I'm sending bug fixes and performance improvements for the USB
> > > DisplayLink framebuffer and modesetting drivers for this merge window.
> > 
> > For such a long series it would be very nice to post a link to your git
> > tree so people may play with your changes easily.
> > 
> > Could you please prepare one?
> > 
> > -Alexey
> 
> I don't use git to manage my patches, I use quilt.
> 
> I uploaded the patches from quilt here: 
> http://people.redhat.com/~mpatocka/patches/kernel/udl/series.html

I've queued fbdev ones for 4.19 (w/ some minor fixes), thanks.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 00/21] USB DisplayLink patches
@ 2018-07-25 13:40       ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 75+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2018-07-25 13:40 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: linux-fbdev, ladis, Alexey Brodkin, bernie, dri-devel, airlied

On Tuesday, June 05, 2018 11:34:42 AM Mikulas Patocka wrote:
> 
> On Tue, 5 Jun 2018, Alexey Brodkin wrote:
> 
> > Hi Mikulas,
> > 
> > On Sun, 2018-06-03 at 16:40 +0200, Mikulas Patocka wrote:
> > > Hi
> > > 
> > > Here I'm sending bug fixes and performance improvements for the USB
> > > DisplayLink framebuffer and modesetting drivers for this merge window.
> > 
> > For such a long series it would be very nice to post a link to your git
> > tree so people may play with your changes easily.
> > 
> > Could you please prepare one?
> > 
> > -Alexey
> 
> I don't use git to manage my patches, I use quilt.
> 
> I uploaded the patches from quilt here: 
> http://people.redhat.com/~mpatocka/patches/kernel/udl/series.html

I've queued fbdev ones for 4.19 (w/ some minor fixes), thanks.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 75+ messages in thread

end of thread, other threads:[~2018-07-25 13:41 UTC | newest]

Thread overview: 75+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-03 14:40 [PATCH 00/21] USB DisplayLink patches Mikulas Patocka
2018-06-03 14:40 ` Mikulas Patocka
2018-06-03 14:40 ` [PATCH 01/21] udl-kms: fix display corruption of the last line Mikulas Patocka
2018-06-03 14:40   ` Mikulas Patocka
2018-06-03 14:40 ` [PATCH 02/21] udl-kms: change down_interruptible to down Mikulas Patocka
2018-06-03 14:40   ` Mikulas Patocka
2018-06-03 14:40 ` [PATCH 03/21] udl-kms: handle allocation failure Mikulas Patocka
2018-06-03 14:40   ` Mikulas Patocka
2018-06-03 14:40 ` [PATCH 04/21] udl-kms: fix crash due to uninitialized memory Mikulas Patocka
2018-06-03 14:40   ` Mikulas Patocka
2018-06-03 14:40 ` [PATCH 05/21] udl-kms: fix a linked-list corruption when using fbdefio Mikulas Patocka
2018-06-03 14:40   ` Mikulas Patocka
2018-06-03 14:40 ` [PATCH 06/21] udl-kms: make a local copy of fb_ops Mikulas Patocka
2018-06-03 14:40   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 07/21] udl-kms: avoid division Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 08/21] udl-kms: avoid prefetch Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-05 10:08   ` Alexey Brodkin
2018-06-05 10:08     ` Alexey Brodkin
2018-06-05 10:48     ` Ladislav Michl
2018-06-05 10:48       ` Ladislav Michl
2018-06-05 15:30     ` Mikulas Patocka
2018-06-05 15:30       ` Mikulas Patocka
2018-06-06 12:04       ` Alexey Brodkin
2018-06-06 12:04         ` Alexey Brodkin
2018-06-06 15:46         ` Mikulas Patocka
2018-06-06 15:46           ` Mikulas Patocka
2018-06-15 16:30           ` Alexey Brodkin
2018-06-15 16:30             ` Alexey Brodkin
2018-06-15 16:30             ` Alexey Brodkin
2018-06-03 14:41 ` [PATCH 09/21] udl-kms: use spin_lock_irq instead of spin_lock_irqsave Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 10/21] udl-kms: dont spam the syslog with debug messages Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 11/21] udlfb: fix semaphore value leak Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 12/21] udlfb: fix display corruption of the last line Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 13/21] udlfb: dont switch if we are switching to the same videomode Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 14/21] udlfb: make a local copy of fb_ops Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 15/21] udlfb: set optimal write delay Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 16/21] udlfb: handle allocation failure Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 17/21] udlfb: set line_length in dlfb_ops_set_par Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 18/21] udlfb: allow reallocating the framebuffer Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 19:24   ` kbuild test robot
2018-06-03 19:24     ` kbuild test robot
2018-06-12 16:32     ` Mikulas Patocka
2018-06-12 16:32       ` Mikulas Patocka
2018-07-03 14:58       ` Bartlomiej Zolnierkiewicz
2018-07-03 14:58         ` Bartlomiej Zolnierkiewicz
2018-06-03 14:41 ` [PATCH 19/21] udlfb: optimization - test the backing buffer Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 20/21] udlfb: avoid prefetch Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-03 14:41 ` [PATCH 21/21] udlfb: use spin_lock_irq instead of spin_lock_irqsave Mikulas Patocka
2018-06-03 14:41   ` Mikulas Patocka
2018-06-04  1:25 ` [PATCH 00/21] USB DisplayLink patches Dave Airlie
2018-06-04  1:25   ` Dave Airlie
2018-06-04 14:14   ` Mikulas Patocka
2018-06-04 14:14     ` Mikulas Patocka
2018-07-04  8:04     ` Daniel Vetter
2018-07-04  8:04       ` Daniel Vetter
2018-06-05  9:47 ` Alexey Brodkin
2018-06-05  9:47   ` Alexey Brodkin
2018-06-05 15:34   ` Mikulas Patocka
2018-06-05 15:34     ` Mikulas Patocka
2018-07-25 13:40     ` Bartlomiej Zolnierkiewicz
2018-07-25 13:40       ` Bartlomiej Zolnierkiewicz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.