qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 9/9] sm501: Fix and optimize overlap check
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
                   ` (2 preceding siblings ...)
  2020-06-20 20:56 ` [PATCH v3 1/9] sm501: Fix bounds checks BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-22 19:38   ` Peter Maydell
  2020-06-24 16:42   ` [PATCH v4] " BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 3/9] sm501: Ignore no-op blits BALATON Zoltan
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

When doing reverse blit we need to check if source and dest overlap
but it is not trivial due to possible different base and pitch of
source and dest. Do rectangle overlap if base and pitch match,
otherwise just check if memory area containing the rects overlaps so
rects could possibly overlap.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
---
 hw/display/sm501.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 2db347dcbc..e7c69bf7fd 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -690,6 +690,7 @@ static void sm501_2d_operation(SM501State *s)
     unsigned int dst_pitch = (s->twoD_pitch >> 16) & 0x1FFF;
     int crt = (s->dc_crt_control & SM501_DC_CRT_CONTROL_SEL) ? 1 : 0;
     int fb_len = get_width(s, crt) * get_height(s, crt) * get_bpp(s, crt);
+    bool overlap = false;
 
     if ((s->twoD_stretch >> 16) & 0xF) {
         qemu_log_mask(LOG_UNIMP, "sm501: only XY addressing is supported.\n");
@@ -784,16 +785,21 @@ static void sm501_2d_operation(SM501State *s)
                          ldn_he_p(&s->local_mem[src_base + si], bypp));
                 break;
             }
-            /* Check for overlaps, this could be made more exact */
-            uint32_t sb, se, db, de;
-            sb = src_base + src_x + src_y * (width + src_pitch);
-            se = sb + width + height * (width + src_pitch);
-            db = dst_base + dst_x + dst_y * (width + dst_pitch);
-            de = db + width + height * (width + dst_pitch);
-            if (rtl && ((db >= sb && db <= se) || (de >= sb && de <= se))) {
-                /* regions may overlap: copy via temporary */
-                int llb = width * bypp;
-                int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
+            /* If reverse blit do simple check for overlaps */
+            if (rtl && src_base == dst_base && src_pitch == dst_pitch) {
+                overlap = (src_x < dst_x + width && src_x + width > dst_x &&
+                           src_y < dst_y + height && src_y + height > dst_y);
+            } else if (rtl) {
+                unsigned int sb, se, db, de;
+                sb = src_base + (src_x + src_y * src_pitch) * bypp;
+                se = sb + (width + height * src_pitch) * bypp;
+                db = dst_base + (dst_x + dst_y * dst_pitch) * bypp;
+                de = db + (width + height * dst_pitch) * bypp;
+                overlap = (db >= sb && db <= se) || (de >= sb && de <= se);
+            }
+            if (overlap) {
+                /* pixman can't do reverse blit: copy via temporary */
+                int tmp_stride = DIV_ROUND_UP(width * bypp, sizeof(uint32_t));
                 uint32_t *tmp = tmp_buf;
 
                 if (tmp_stride * sizeof(uint32_t) * height > sizeof(tmp_buf)) {
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 1/9] sm501: Fix bounds checks
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 7/9] sm501: Do not allow guest to set invalid format BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 5/9] sm501: Optimise 1 pixel 2d ops BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 9/9] sm501: Fix and optimize overlap check BALATON Zoltan
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

We don't need to add width to pitch when calculating last point, that
would reject valid ops within the card's local_mem.

Fixes: b15a22bbcbe6a78dc3d88fe3134985e4cdd87de4
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/sm501.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index a7fc08c52b..5ceee4166f 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -723,8 +723,8 @@ static void sm501_2d_operation(SM501State *s)
         dst_y -= height - 1;
     }
 
-    if (dst_base >= get_local_mem_size(s) || dst_base +
-        (dst_x + width + (dst_y + height) * (dst_pitch + width)) *
+    if (dst_base >= get_local_mem_size(s) ||
+        dst_base + (dst_x + width + (dst_y + height) * dst_pitch) *
         (1 << format) >= get_local_mem_size(s)) {
         qemu_log_mask(LOG_GUEST_ERROR, "sm501: 2D op dest is outside vram.\n");
         return;
@@ -749,8 +749,8 @@ static void sm501_2d_operation(SM501State *s)
             src_y -= height - 1;
         }
 
-        if (src_base >= get_local_mem_size(s) || src_base +
-            (src_x + width + (src_y + height) * (src_pitch + width)) *
+        if (src_base >= get_local_mem_size(s) ||
+            src_base + (src_x + width + (src_y + height) * src_pitch) *
             (1 << format) >= get_local_mem_size(s)) {
             qemu_log_mask(LOG_GUEST_ERROR,
                           "sm501: 2D op src is outside vram.\n");
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 7/9] sm501: Do not allow guest to set invalid format
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 5/9] sm501: Optimise 1 pixel 2d ops BALATON Zoltan
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

Prevent guest setting invalid format value that might trip checks in
sm501_2d_operation().

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/sm501.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 6349f31e64..7e4c042d52 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -1503,6 +1503,9 @@ static void sm501_2d_engine_write(void *opaque, hwaddr addr,
         s->twoD_background = value;
         break;
     case SM501_2D_STRETCH:
+        if (((value >> 20) & 3) == 3) {
+            value &= ~BIT(20);
+        }
         s->twoD_stretch = value;
         break;
     case SM501_2D_COLOR_COMPARE:
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 8/9] sm501: Convert debug printfs to traces
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
                   ` (5 preceding siblings ...)
  2020-06-20 20:56 ` [PATCH v3 4/9] sm501: Introduce variable for commonly used value for better readability BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 6/9] sm501: Use stn_he_p/ldn_he_p instead of switch/case BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 2/9] sm501: Drop unneded variable BALATON Zoltan
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/sm501.c      | 50 +++++++++++------------------------------
 hw/display/trace-events | 12 ++++++++++
 2 files changed, 25 insertions(+), 37 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 7e4c042d52..2db347dcbc 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -39,15 +39,7 @@
 #include "qemu/range.h"
 #include "ui/pixel_ops.h"
 #include "qemu/bswap.h"
-
-/*#define DEBUG_SM501*/
-/*#define DEBUG_BITBLT*/
-
-#ifdef DEBUG_SM501
-#define SM501_DPRINTF(fmt, ...) printf(fmt, ## __VA_ARGS__)
-#else
-#define SM501_DPRINTF(fmt, ...) do {} while (0)
-#endif
+#include "trace.h"
 
 #define MMIO_BASE_OFFSET 0x3e00000
 #define MMIO_SIZE 0x200000
@@ -871,7 +863,6 @@ static uint64_t sm501_system_config_read(void *opaque, hwaddr addr,
 {
     SM501State *s = (SM501State *)opaque;
     uint32_t ret = 0;
-    SM501_DPRINTF("sm501 system config regs : read addr=%x\n", (int)addr);
 
     switch (addr) {
     case SM501_SYSTEM_CONTROL:
@@ -923,7 +914,7 @@ static uint64_t sm501_system_config_read(void *opaque, hwaddr addr,
         qemu_log_mask(LOG_UNIMP, "sm501: not implemented system config"
                       "register read. addr=%" HWADDR_PRIx "\n", addr);
     }
-
+    trace_sm501_system_config_read(addr, ret);
     return ret;
 }
 
@@ -931,9 +922,8 @@ static void sm501_system_config_write(void *opaque, hwaddr addr,
                                       uint64_t value, unsigned size)
 {
     SM501State *s = (SM501State *)opaque;
-    SM501_DPRINTF("sm501 system config regs : write addr=%x, val=%x\n",
-                  (uint32_t)addr, (uint32_t)value);
 
+    trace_sm501_system_config_write((uint32_t)addr, (uint32_t)value);
     switch (addr) {
     case SM501_SYSTEM_CONTROL:
         s->system_control &= 0x10DB0000;
@@ -1019,9 +1009,7 @@ static uint64_t sm501_i2c_read(void *opaque, hwaddr addr, unsigned size)
         qemu_log_mask(LOG_UNIMP, "sm501 i2c : not implemented register read."
                       " addr=0x%" HWADDR_PRIx "\n", addr);
     }
-
-    SM501_DPRINTF("sm501 i2c regs : read addr=%" HWADDR_PRIx " val=%x\n",
-                  addr, ret);
+    trace_sm501_i2c_read((uint32_t)addr, ret);
     return ret;
 }
 
@@ -1029,9 +1017,8 @@ static void sm501_i2c_write(void *opaque, hwaddr addr, uint64_t value,
                             unsigned size)
 {
     SM501State *s = (SM501State *)opaque;
-    SM501_DPRINTF("sm501 i2c regs : write addr=%" HWADDR_PRIx
-                  " val=%" PRIx64 "\n", addr, value);
 
+    trace_sm501_i2c_write((uint32_t)addr, (uint32_t)value);
     switch (addr) {
     case SM501_I2C_BYTE_COUNT:
         s->i2c_byte_count = value & 0xf;
@@ -1045,25 +1032,19 @@ static void sm501_i2c_write(void *opaque, hwaddr addr, uint64_t value,
                 s->i2c_status |= (res ? SM501_I2C_STATUS_ERROR : 0);
                 if (!res) {
                     int i;
-                    SM501_DPRINTF("sm501 i2c : transferring %d bytes to 0x%x\n",
-                                  s->i2c_byte_count + 1, s->i2c_addr >> 1);
                     for (i = 0; i <= s->i2c_byte_count; i++) {
                         res = i2c_send_recv(s->i2c_bus, &s->i2c_data[i],
                                             !(s->i2c_addr & 1));
                         if (res) {
-                            SM501_DPRINTF("sm501 i2c : transfer failed"
-                                          " i=%d, res=%d\n", i, res);
                             s->i2c_status |= SM501_I2C_STATUS_ERROR;
                             return;
                         }
                     }
                     if (i) {
-                        SM501_DPRINTF("sm501 i2c : transferred %d bytes\n", i);
                         s->i2c_status = SM501_I2C_STATUS_COMPLETE;
                     }
                 }
             } else {
-                SM501_DPRINTF("sm501 i2c : end transfer\n");
                 i2c_end_transfer(s->i2c_bus);
                 s->i2c_status &= ~SM501_I2C_STATUS_ERROR;
             }
@@ -1103,7 +1084,8 @@ static const MemoryRegionOps sm501_i2c_ops = {
 static uint32_t sm501_palette_read(void *opaque, hwaddr addr)
 {
     SM501State *s = (SM501State *)opaque;
-    SM501_DPRINTF("sm501 palette read addr=%x\n", (int)addr);
+
+    trace_sm501_palette_read((uint32_t)addr);
 
     /* TODO : consider BYTE/WORD access */
     /* TODO : consider endian */
@@ -1116,8 +1098,8 @@ static void sm501_palette_write(void *opaque, hwaddr addr,
                                 uint32_t value)
 {
     SM501State *s = (SM501State *)opaque;
-    SM501_DPRINTF("sm501 palette write addr=%x, val=%x\n",
-                  (int)addr, value);
+
+    trace_sm501_palette_write((uint32_t)addr, value);
 
     /* TODO : consider BYTE/WORD access */
     /* TODO : consider endian */
@@ -1132,7 +1114,6 @@ static uint64_t sm501_disp_ctrl_read(void *opaque, hwaddr addr,
 {
     SM501State *s = (SM501State *)opaque;
     uint32_t ret = 0;
-    SM501_DPRINTF("sm501 disp ctrl regs : read addr=%x\n", (int)addr);
 
     switch (addr) {
 
@@ -1237,7 +1218,7 @@ static uint64_t sm501_disp_ctrl_read(void *opaque, hwaddr addr,
         qemu_log_mask(LOG_UNIMP, "sm501: not implemented disp ctrl register "
                       "read. addr=%" HWADDR_PRIx "\n", addr);
     }
-
+    trace_sm501_disp_ctrl_read((uint32_t)addr, ret);
     return ret;
 }
 
@@ -1245,9 +1226,8 @@ static void sm501_disp_ctrl_write(void *opaque, hwaddr addr,
                                   uint64_t value, unsigned size)
 {
     SM501State *s = (SM501State *)opaque;
-    SM501_DPRINTF("sm501 disp ctrl regs : write addr=%x, val=%x\n",
-                  (unsigned)addr, (unsigned)value);
 
+    trace_sm501_disp_ctrl_write((uint32_t)addr, (uint32_t)value);
     switch (addr) {
     case SM501_DC_PANEL_CONTROL:
         s->dc_panel_control = value & 0x0FFF73FF;
@@ -1392,7 +1372,6 @@ static uint64_t sm501_2d_engine_read(void *opaque, hwaddr addr,
 {
     SM501State *s = (SM501State *)opaque;
     uint32_t ret = 0;
-    SM501_DPRINTF("sm501 2d engine regs : read addr=%x\n", (int)addr);
 
     switch (addr) {
     case SM501_2D_SOURCE:
@@ -1462,7 +1441,7 @@ static uint64_t sm501_2d_engine_read(void *opaque, hwaddr addr,
         qemu_log_mask(LOG_UNIMP, "sm501: not implemented disp ctrl register "
                       "read. addr=%" HWADDR_PRIx "\n", addr);
     }
-
+    trace_sm501_2d_engine_read((uint32_t)addr, ret);
     return ret;
 }
 
@@ -1470,9 +1449,8 @@ static void sm501_2d_engine_write(void *opaque, hwaddr addr,
                                   uint64_t value, unsigned size)
 {
     SM501State *s = (SM501State *)opaque;
-    SM501_DPRINTF("sm501 2d engine regs : write addr=%x, val=%x\n",
-                  (unsigned)addr, (unsigned)value);
 
+    trace_sm501_2d_engine_write((uint32_t)addr, (uint32_t)value);
     switch (addr) {
     case SM501_2D_SOURCE:
         s->twoD_source = value;
@@ -1830,8 +1808,6 @@ static void sm501_init(SM501State *s, DeviceState *dev,
                        uint32_t local_mem_bytes)
 {
     s->local_mem_size_index = get_local_mem_size_index(local_mem_bytes);
-    SM501_DPRINTF("sm501 local mem size=%x. index=%d\n", get_local_mem_size(s),
-                  s->local_mem_size_index);
 
     /* local memory */
     memory_region_init_ram(&s->local_mem_region, OBJECT(dev), "sm501.local",
diff --git a/hw/display/trace-events b/hw/display/trace-events
index 72d4c9812c..970d6bac5d 100644
--- a/hw/display/trace-events
+++ b/hw/display/trace-events
@@ -161,3 +161,15 @@ cg3_write(uint32_t addr, uint32_t val, unsigned size) "write addr:0x%06"PRIx32"
 # dpcd.c
 dpcd_read(uint32_t addr, uint8_t val) "read addr:0x%"PRIx32" val:0x%02x"
 dpcd_write(uint32_t addr, uint8_t val) "write addr:0x%"PRIx32" val:0x%02x"
+
+# sm501.c
+sm501_system_config_read(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
+sm501_system_config_write(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
+sm501_i2c_read(uint32_t addr, uint8_t val) "addr=0x%x, val=0x%x"
+sm501_i2c_write(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
+sm501_palette_read(uint32_t addr) "addr=0x%x"
+sm501_palette_write(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
+sm501_disp_ctrl_read(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
+sm501_disp_ctrl_write(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
+sm501_2d_engine_read(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
+sm501_2d_engine_write(uint32_t addr, uint32_t val) "addr=0x%x, val=0x%x"
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 2/9] sm501: Drop unneded variable
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
                   ` (7 preceding siblings ...)
  2020-06-20 20:56 ` [PATCH v3 6/9] sm501: Use stn_he_p/ldn_he_p instead of switch/case BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

We don't need a separate variable to keep track if we allocated memory
that needs to be freed as we can test the pointer itself.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/sm501.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 5ceee4166f..5d2e019575 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -796,13 +796,12 @@ static void sm501_2d_operation(SM501State *s)
             de = db + width + height * (width + dst_pitch);
             if (rtl && ((db >= sb && db <= se) || (de >= sb && de <= se))) {
                 /* regions may overlap: copy via temporary */
-                int free_buf = 0, llb = width * (1 << format);
+                int llb = width * (1 << format);
                 int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
                 uint32_t *tmp = tmp_buf;
 
                 if (tmp_stride * sizeof(uint32_t) * height > sizeof(tmp_buf)) {
                     tmp = g_malloc(tmp_stride * sizeof(uint32_t) * height);
-                    free_buf = 1;
                 }
                 pixman_blt((uint32_t *)&s->local_mem[src_base], tmp,
                            src_pitch * (1 << format) / sizeof(uint32_t),
@@ -813,7 +812,7 @@ static void sm501_2d_operation(SM501State *s)
                            dst_pitch * (1 << format) / sizeof(uint32_t),
                            8 * (1 << format), 8 * (1 << format),
                            0, 0, dst_x, dst_y, width, height);
-                if (free_buf) {
+                if (tmp != tmp_buf) {
                     g_free(tmp);
                 }
             } else {
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 0/9] More sm501 fixes and optimisations
@ 2020-06-20 20:56 BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 7/9] sm501: Do not allow guest to set invalid format BALATON Zoltan
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

Version 3: added R-b tags and new patch to fix overlap checks.
Hopefully I got it right this time, unless there are any more comments
this should be good to go.

BALATON Zoltan (9):
  sm501: Fix bounds checks
  sm501: Drop unneded variable
  sm501: Ignore no-op blits
  sm501: Introduce variable for commonly used value for better
    readability
  sm501: Optimise 1 pixel 2d ops
  sm501: Use stn_he_p/ldn_he_p instead of switch/case
  sm501: Do not allow guest to set invalid format
  sm501: Convert debug printfs to traces
  sm501: Fix and optimize overlap check

 hw/display/sm501.c      | 157 +++++++++++++++++++---------------------
 hw/display/trace-events |  12 +++
 2 files changed, 87 insertions(+), 82 deletions(-)

-- 
2.21.3



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 5/9] sm501: Optimise 1 pixel 2d ops
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 7/9] sm501: Do not allow guest to set invalid format BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 1/9] sm501: Fix bounds checks BALATON Zoltan
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

Some guests do 1x1 blits which is faster to do directly than calling a
function for it so avoid overhead in this case.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/sm501.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 3ced2c42db..2098e69810 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -794,6 +794,14 @@ static void sm501_2d_operation(SM501State *s)
                 src_x == dst_x && src_y == dst_y) {
                 break;
             }
+            /* Some clients also do 1 pixel blits, avoid overhead for these */
+            if (width == 1 && height == 1) {
+                unsigned int si = (src_x + src_y * src_pitch) * bypp;
+                unsigned int di = (dst_x + dst_y * dst_pitch) * bypp;
+                stn_he_p(&s->local_mem[dst_base + di], bypp,
+                         ldn_he_p(&s->local_mem[src_base + si], bypp));
+                break;
+            }
             /* Check for overlaps, this could be made more exact */
             uint32_t sb, se, db, de;
             sb = src_base + src_x + src_y * (width + src_pitch);
@@ -842,9 +850,14 @@ static void sm501_2d_operation(SM501State *s)
             color = cpu_to_le16(color);
         }
 
-        pixman_fill((uint32_t *)&s->local_mem[dst_base],
-                    dst_pitch * bypp / sizeof(uint32_t),
-                    8 * bypp, dst_x, dst_y, width, height, color);
+        if (width == 1 && height == 1) {
+            unsigned int i = (dst_x + dst_y * dst_pitch) * bypp;
+            stn_he_p(&s->local_mem[dst_base + i], bypp, color);
+        } else {
+            pixman_fill((uint32_t *)&s->local_mem[dst_base],
+                        dst_pitch * bypp / sizeof(uint32_t),
+                        8 * bypp, dst_x, dst_y, width, height, color);
+        }
         break;
     }
     default:
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 4/9] sm501: Introduce variable for commonly used value for better readability
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
                   ` (4 preceding siblings ...)
  2020-06-20 20:56 ` [PATCH v3 3/9] sm501: Ignore no-op blits BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 8/9] sm501: Convert debug printfs to traces BALATON Zoltan
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

The bytes per pixel value can be calculated from format but it's used
freqently enough (and will be used more in subseqent patches) so store
it in a variable for better readabilty. Also drop some unneded 0x
prefix around where new variable is defined.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/sm501.c | 41 +++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index ad5a62bfab..3ced2c42db 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -684,10 +684,11 @@ static void sm501_2d_operation(SM501State *s)
 {
     int cmd = (s->twoD_control >> 16) & 0x1F;
     int rtl = s->twoD_control & BIT(27);
-    int format = (s->twoD_stretch >> 20) & 0x3;
-    int rop_mode = (s->twoD_control >> 15) & 0x1; /* 1 for rop2, else rop3 */
+    int format = (s->twoD_stretch >> 20) & 3;
+    int bypp = 1 << format; /* bytes per pixel */
+    int rop_mode = (s->twoD_control >> 15) & 1; /* 1 for rop2, else rop3 */
     /* 1 if rop2 source is the pattern, otherwise the source is the bitmap */
-    int rop2_source_is_pattern = (s->twoD_control >> 14) & 0x1;
+    int rop2_source_is_pattern = (s->twoD_control >> 14) & 1;
     int rop = s->twoD_control & 0xFF;
     unsigned int dst_x = (s->twoD_destination >> 16) & 0x01FFF;
     unsigned int dst_y = s->twoD_destination & 0xFFFF;
@@ -724,8 +725,8 @@ static void sm501_2d_operation(SM501State *s)
     }
 
     if (dst_base >= get_local_mem_size(s) ||
-        dst_base + (dst_x + width + (dst_y + height) * dst_pitch) *
-        (1 << format) >= get_local_mem_size(s)) {
+        dst_base + (dst_x + width + (dst_y + height) * dst_pitch) * bypp >=
+        get_local_mem_size(s)) {
         qemu_log_mask(LOG_GUEST_ERROR, "sm501: 2D op dest is outside vram.\n");
         return;
     }
@@ -750,8 +751,8 @@ static void sm501_2d_operation(SM501State *s)
         }
 
         if (src_base >= get_local_mem_size(s) ||
-            src_base + (src_x + width + (src_y + height) * src_pitch) *
-            (1 << format) >= get_local_mem_size(s)) {
+            src_base + (src_x + width + (src_y + height) * src_pitch) * bypp >=
+            get_local_mem_size(s)) {
             qemu_log_mask(LOG_GUEST_ERROR,
                           "sm501: 2D op src is outside vram.\n");
             return;
@@ -763,8 +764,8 @@ static void sm501_2d_operation(SM501State *s)
             uint8_t *d = s->local_mem + dst_base;
 
             for (y = 0; y < height; y++) {
-                i = (dst_x + (dst_y + y) * dst_pitch) * (1 << format);
-                for (x = 0; x < width; x++, i += (1 << format)) {
+                i = (dst_x + (dst_y + y) * dst_pitch) * bypp;
+                for (x = 0; x < width; x++, i += bypp) {
                     switch (format) {
                     case 0:
                         d[i] = ~d[i];
@@ -801,7 +802,7 @@ static void sm501_2d_operation(SM501State *s)
             de = db + width + height * (width + dst_pitch);
             if (rtl && ((db >= sb && db <= se) || (de >= sb && de <= se))) {
                 /* regions may overlap: copy via temporary */
-                int llb = width * (1 << format);
+                int llb = width * bypp;
                 int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
                 uint32_t *tmp = tmp_buf;
 
@@ -809,13 +810,13 @@ static void sm501_2d_operation(SM501State *s)
                     tmp = g_malloc(tmp_stride * sizeof(uint32_t) * height);
                 }
                 pixman_blt((uint32_t *)&s->local_mem[src_base], tmp,
-                           src_pitch * (1 << format) / sizeof(uint32_t),
-                           tmp_stride, 8 * (1 << format), 8 * (1 << format),
+                           src_pitch * bypp / sizeof(uint32_t),
+                           tmp_stride, 8 * bypp, 8 * bypp,
                            src_x, src_y, 0, 0, width, height);
                 pixman_blt(tmp, (uint32_t *)&s->local_mem[dst_base],
                            tmp_stride,
-                           dst_pitch * (1 << format) / sizeof(uint32_t),
-                           8 * (1 << format), 8 * (1 << format),
+                           dst_pitch * bypp / sizeof(uint32_t),
+                           8 * bypp, 8 * bypp,
                            0, 0, dst_x, dst_y, width, height);
                 if (tmp != tmp_buf) {
                     g_free(tmp);
@@ -823,9 +824,9 @@ static void sm501_2d_operation(SM501State *s)
             } else {
                 pixman_blt((uint32_t *)&s->local_mem[src_base],
                            (uint32_t *)&s->local_mem[dst_base],
-                           src_pitch * (1 << format) / sizeof(uint32_t),
-                           dst_pitch * (1 << format) / sizeof(uint32_t),
-                           8 * (1 << format), 8 * (1 << format),
+                           src_pitch * bypp / sizeof(uint32_t),
+                           dst_pitch * bypp / sizeof(uint32_t),
+                           8 * bypp, 8 * bypp,
                            src_x, src_y, dst_x, dst_y, width, height);
             }
         }
@@ -842,8 +843,8 @@ static void sm501_2d_operation(SM501State *s)
         }
 
         pixman_fill((uint32_t *)&s->local_mem[dst_base],
-                    dst_pitch * (1 << format) / sizeof(uint32_t),
-                    8 * (1 << format), dst_x, dst_y, width, height, color);
+                    dst_pitch * bypp / sizeof(uint32_t),
+                    8 * bypp, dst_x, dst_y, width, height, color);
         break;
     }
     default:
@@ -855,7 +856,7 @@ static void sm501_2d_operation(SM501State *s)
     if (dst_base >= get_fb_addr(s, crt) &&
         dst_base <= get_fb_addr(s, crt) + fb_len) {
         int dst_len = MIN(fb_len, ((dst_y + height - 1) * dst_pitch +
-                          dst_x + width) * (1 << format));
+                          dst_x + width) * bypp);
         if (dst_len) {
             memory_region_set_dirty(&s->local_mem_region, dst_base, dst_len);
         }
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 6/9] sm501: Use stn_he_p/ldn_he_p instead of switch/case
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
                   ` (6 preceding siblings ...)
  2020-06-20 20:56 ` [PATCH v3 8/9] sm501: Convert debug printfs to traces BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 2/9] sm501: Drop unneded variable BALATON Zoltan
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

Instead of open coding op with different sizes using a switch and type
casting it can be written more compactly using stn_he_p/ldn_he_p.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 hw/display/sm501.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 2098e69810..6349f31e64 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -766,17 +766,7 @@ static void sm501_2d_operation(SM501State *s)
             for (y = 0; y < height; y++) {
                 i = (dst_x + (dst_y + y) * dst_pitch) * bypp;
                 for (x = 0; x < width; x++, i += bypp) {
-                    switch (format) {
-                    case 0:
-                        d[i] = ~d[i];
-                        break;
-                    case 1:
-                        *(uint16_t *)&d[i] = ~*(uint16_t *)&d[i];
-                        break;
-                    case 2:
-                        *(uint32_t *)&d[i] = ~*(uint32_t *)&d[i];
-                        break;
-                    }
+                    stn_he_p(&d[i], bypp, ~ldn_he_p(&d[i], bypp));
                 }
             }
         } else {
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 3/9] sm501: Ignore no-op blits
  2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
                   ` (3 preceding siblings ...)
  2020-06-20 20:56 ` [PATCH v3 9/9] sm501: Fix and optimize overlap check BALATON Zoltan
@ 2020-06-20 20:56 ` BALATON Zoltan
  2020-06-20 20:56 ` [PATCH v3 4/9] sm501: Introduce variable for commonly used value for better readability BALATON Zoltan
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-20 20:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

Some guests seem to try source copy blits with same source and dest
which are no-op so avoid calling pixman for these.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/sm501.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 5d2e019575..ad5a62bfab 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -788,6 +788,11 @@ static void sm501_2d_operation(SM501State *s)
                               (rop2_source_is_pattern ?
                                   " with pattern source" : ""));
             }
+            /* Ignore no-op blits, some guests seem to do this */
+            if (src_base == dst_base && src_pitch == dst_pitch &&
+                src_x == dst_x && src_y == dst_y) {
+                break;
+            }
             /* Check for overlaps, this could be made more exact */
             uint32_t sb, se, db, de;
             sb = src_base + src_x + src_y * (width + src_pitch);
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 9/9] sm501: Fix and optimize overlap check
  2020-06-20 20:56 ` [PATCH v3 9/9] sm501: Fix and optimize overlap check BALATON Zoltan
@ 2020-06-22 19:38   ` Peter Maydell
  2020-06-22 23:07     ` BALATON Zoltan
  2020-06-24 16:42   ` [PATCH v4] " BALATON Zoltan
  1 sibling, 1 reply; 15+ messages in thread
From: Peter Maydell @ 2020-06-22 19:38 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: Sebastian Bauer, Magnus Damm, QEMU Developers, Aurelien Jarno,
	Gerd Hoffmann

On Sat, 20 Jun 2020 at 22:04, BALATON Zoltan <balaton@eik.bme.hu> wrote:
>
> When doing reverse blit we need to check if source and dest overlap
> but it is not trivial due to possible different base and pitch of
> source and dest. Do rectangle overlap if base and pitch match,
> otherwise just check if memory area containing the rects overlaps so
> rects could possibly overlap.
>
> Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
> ---
>  hw/display/sm501.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/hw/display/sm501.c b/hw/display/sm501.c
> index 2db347dcbc..e7c69bf7fd 100644
> --- a/hw/display/sm501.c
> +++ b/hw/display/sm501.c
> @@ -690,6 +690,7 @@ static void sm501_2d_operation(SM501State *s)
>      unsigned int dst_pitch = (s->twoD_pitch >> 16) & 0x1FFF;
>      int crt = (s->dc_crt_control & SM501_DC_CRT_CONTROL_SEL) ? 1 : 0;
>      int fb_len = get_width(s, crt) * get_height(s, crt) * get_bpp(s, crt);
> +    bool overlap = false;
>
>      if ((s->twoD_stretch >> 16) & 0xF) {
>          qemu_log_mask(LOG_UNIMP, "sm501: only XY addressing is supported.\n");
> @@ -784,16 +785,21 @@ static void sm501_2d_operation(SM501State *s)
>                           ldn_he_p(&s->local_mem[src_base + si], bypp));
>                  break;
>              }
> -            /* Check for overlaps, this could be made more exact */
> -            uint32_t sb, se, db, de;
> -            sb = src_base + src_x + src_y * (width + src_pitch);
> -            se = sb + width + height * (width + src_pitch);
> -            db = dst_base + dst_x + dst_y * (width + dst_pitch);
> -            de = db + width + height * (width + dst_pitch);
> -            if (rtl && ((db >= sb && db <= se) || (de >= sb && de <= se))) {
> -                /* regions may overlap: copy via temporary */
> -                int llb = width * bypp;
> -                int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
> +            /* If reverse blit do simple check for overlaps */
> +            if (rtl && src_base == dst_base && src_pitch == dst_pitch) {
> +                overlap = (src_x < dst_x + width && src_x + width > dst_x &&
> +                           src_y < dst_y + height && src_y + height > dst_y);

This part looks good...

> +            } else if (rtl) {
> +                unsigned int sb, se, db, de;
> +                sb = src_base + (src_x + src_y * src_pitch) * bypp;
> +                se = sb + (width + height * src_pitch) * bypp;
> +                db = dst_base + (dst_x + dst_y * dst_pitch) * bypp;
> +                de = db + (width + height * dst_pitch) * bypp;
> +                overlap = (db >= sb && db <= se) || (de >= sb && de <= se);

...but this part I think the overlap calculation isn't right. Consider
 db=5, de=15, sb=10, se=12. This gives overlap=false but
the two regions do overlap because [sb,se] is entirely inside [db,de].
I think you want
  overlap = (db < se && sb < de);
(this is the same logic as each of the x/y range checks in the rectangle
overlap test. put another way, if !(db<se) then we can't have an overlap
because the dest range starts after the source range ends; similarly if
!(sb<de) then the source range begins after the dest range ends and
there's no overlap. So for an overlap to be possible we must have both
db<se && sb<de.)
Here I'm using a definition of the "end" de and se which is that they point
to the byte *after* the last one used (ie that we're really working with
"half-open" ranges [db, de)  and [sb, se) where de and se aren't in the
range), because that's easier to calculate given that we need to account
for bypp and it's more natural when dealing with "start, length" pairs.

Also and less importantly (because it's wrong in the "safe" direction) I think
your se and de are overestimates, because one-past-the-last-used-byte in each
case is
   sb + (width + (height-1) * src_pitch) * bypp
(consider width=1 height=1, where one-past-the-last-used-byte is sb + bypp
because there's only one pixel involved).

> +            }
> +            if (overlap) {
> +                /* pixman can't do reverse blit: copy via temporary */
> +                int tmp_stride = DIV_ROUND_UP(width * bypp, sizeof(uint32_t));
>                  uint32_t *tmp = tmp_buf;
>
>                  if (tmp_stride * sizeof(uint32_t) * height > sizeof(tmp_buf)) {

PS: why do we care about overlap only for right-to-left blits and not
left-to-right blits ?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 9/9] sm501: Fix and optimize overlap check
  2020-06-22 19:38   ` Peter Maydell
@ 2020-06-22 23:07     ` BALATON Zoltan
  0 siblings, 0 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-22 23:07 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Sebastian Bauer, Magnus Damm, QEMU Developers, Aurelien Jarno,
	Gerd Hoffmann

On Mon, 22 Jun 2020, Peter Maydell wrote:
> On Sat, 20 Jun 2020 at 22:04, BALATON Zoltan <balaton@eik.bme.hu> wrote:
>>
>> When doing reverse blit we need to check if source and dest overlap
>> but it is not trivial due to possible different base and pitch of
>> source and dest. Do rectangle overlap if base and pitch match,
>> otherwise just check if memory area containing the rects overlaps so
>> rects could possibly overlap.
>>
>> Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
>> ---
>>  hw/display/sm501.c | 26 ++++++++++++++++----------
>>  1 file changed, 16 insertions(+), 10 deletions(-)
>>
>> diff --git a/hw/display/sm501.c b/hw/display/sm501.c
>> index 2db347dcbc..e7c69bf7fd 100644
>> --- a/hw/display/sm501.c
>> +++ b/hw/display/sm501.c
>> @@ -690,6 +690,7 @@ static void sm501_2d_operation(SM501State *s)
>>      unsigned int dst_pitch = (s->twoD_pitch >> 16) & 0x1FFF;
>>      int crt = (s->dc_crt_control & SM501_DC_CRT_CONTROL_SEL) ? 1 : 0;
>>      int fb_len = get_width(s, crt) * get_height(s, crt) * get_bpp(s, crt);
>> +    bool overlap = false;
>>
>>      if ((s->twoD_stretch >> 16) & 0xF) {
>>          qemu_log_mask(LOG_UNIMP, "sm501: only XY addressing is supported.\n");
>> @@ -784,16 +785,21 @@ static void sm501_2d_operation(SM501State *s)
>>                           ldn_he_p(&s->local_mem[src_base + si], bypp));
>>                  break;
>>              }
>> -            /* Check for overlaps, this could be made more exact */
>> -            uint32_t sb, se, db, de;
>> -            sb = src_base + src_x + src_y * (width + src_pitch);
>> -            se = sb + width + height * (width + src_pitch);
>> -            db = dst_base + dst_x + dst_y * (width + dst_pitch);
>> -            de = db + width + height * (width + dst_pitch);
>> -            if (rtl && ((db >= sb && db <= se) || (de >= sb && de <= se))) {
>> -                /* regions may overlap: copy via temporary */
>> -                int llb = width * bypp;
>> -                int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
>> +            /* If reverse blit do simple check for overlaps */
>> +            if (rtl && src_base == dst_base && src_pitch == dst_pitch) {
>> +                overlap = (src_x < dst_x + width && src_x + width > dst_x &&
>> +                           src_y < dst_y + height && src_y + height > dst_y);
>
> This part looks good...
>
>> +            } else if (rtl) {
>> +                unsigned int sb, se, db, de;
>> +                sb = src_base + (src_x + src_y * src_pitch) * bypp;
>> +                se = sb + (width + height * src_pitch) * bypp;
>> +                db = dst_base + (dst_x + dst_y * dst_pitch) * bypp;
>> +                de = db + (width + height * dst_pitch) * bypp;
>> +                overlap = (db >= sb && db <= se) || (de >= sb && de <= se);
>
> ...but this part I think the overlap calculation isn't right. Consider
> db=5, de=15, sb=10, se=12. This gives overlap=false but
> the two regions do overlap because [sb,se] is entirely inside [db,de].
> I think you want
>  overlap = (db < se && sb < de);
> (this is the same logic as each of the x/y range checks in the rectangle
> overlap test. put another way, if !(db<se) then we can't have an overlap
> because the dest range starts after the source range ends; similarly if
> !(sb<de) then the source range begins after the dest range ends and
> there's no overlap. So for an overlap to be possible we must have both
> db<se && sb<de.)

Thanks for checking it, you're right, I'll need to think about this again 
for a bit longer.

> Here I'm using a definition of the "end" de and se which is that they point
> to the byte *after* the last one used (ie that we're really working with
> "half-open" ranges [db, de)  and [sb, se) where de and se aren't in the
> range), because that's easier to calculate given that we need to account
> for bypp and it's more natural when dealing with "start, length" pairs.
>
> Also and less importantly (because it's wrong in the "safe" direction) I think
> your se and de are overestimates, because one-past-the-last-used-byte in each
> case is
>   sb + (width + (height-1) * src_pitch) * bypp
> (consider width=1 height=1, where one-past-the-last-used-byte is sb + bypp
> because there's only one pixel involved).

I'm not sure I follow the above but I had >= in the check to account for 
this although I've got the check wrong so that doesn't mean much.

What I've used is something like:

base               pitch
+------------------------------------------+
|        ^                                 |
|      y |                                 |
|   x    v      w                          |
|< - - ->+-------------+                   |
|        |             |                   |
|       h|             |                   |
|        |             |                   |
|        +-------------+                   |
|                                          |
+------------------------------------------+

then y * pitch * bypp is the index of the line + x * bypp is index of 
first point which gives:

sb = base + (x + y * pitch) * bypp

then add width * bypp to get past the rect (index of byte after as you say 
above), then moved h lines down that's h * pitch * bypp but maybe that's 
too much because we only want to get to the last line not past that. Then 
using h - 1 is enough.

>> +            }
>> +            if (overlap) {
>> +                /* pixman can't do reverse blit: copy via temporary */
>> +                int tmp_stride = DIV_ROUND_UP(width * bypp, sizeof(uint32_t));
>>                  uint32_t *tmp = tmp_buf;
>>
>>                  if (tmp_stride * sizeof(uint32_t) * height > sizeof(tmp_buf)) {
>
> PS: why do we care about overlap only for right-to-left blits and not
> left-to-right blits ?

I think I've already answered that here:

https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg07650.html

since pixman does left to right that should match the intended operation 
even with overlap when rtl is not set. (Rtl also seems to mean bottom to 
top as well because there's only this one bit for direction.)

Regards,
BALATON Zoltan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4] sm501: Fix and optimize overlap check
  2020-06-20 20:56 ` [PATCH v3 9/9] sm501: Fix and optimize overlap check BALATON Zoltan
  2020-06-22 19:38   ` Peter Maydell
@ 2020-06-24 16:42   ` BALATON Zoltan
  2020-06-25 10:11     ` Peter Maydell
  2020-06-30 20:52     ` Gerd Hoffmann
  1 sibling, 2 replies; 15+ messages in thread
From: BALATON Zoltan @ 2020-06-24 16:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, Gerd Hoffmann,
	Aurelien Jarno

When doing reverse blit we need to check if source and dest overlap
but it is not trivial due to possible different base and pitch of
source and dest. Do rectangle overlap if base and pitch match,
otherwise just check if memory area containing the rects overlaps so
rects could possibly overlap.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
---
This version fixes overlap check according to Peter's suggestion, only
resending this patch as v4, others still valid from v3. Let me know if
you want the whole series resent instead.

 hw/display/sm501.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 2db347dcbc..9cccc68c35 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -690,6 +690,7 @@ static void sm501_2d_operation(SM501State *s)
     unsigned int dst_pitch = (s->twoD_pitch >> 16) & 0x1FFF;
     int crt = (s->dc_crt_control & SM501_DC_CRT_CONTROL_SEL) ? 1 : 0;
     int fb_len = get_width(s, crt) * get_height(s, crt) * get_bpp(s, crt);
+    bool overlap = false;
 
     if ((s->twoD_stretch >> 16) & 0xF) {
         qemu_log_mask(LOG_UNIMP, "sm501: only XY addressing is supported.\n");
@@ -784,16 +785,21 @@ static void sm501_2d_operation(SM501State *s)
                          ldn_he_p(&s->local_mem[src_base + si], bypp));
                 break;
             }
-            /* Check for overlaps, this could be made more exact */
-            uint32_t sb, se, db, de;
-            sb = src_base + src_x + src_y * (width + src_pitch);
-            se = sb + width + height * (width + src_pitch);
-            db = dst_base + dst_x + dst_y * (width + dst_pitch);
-            de = db + width + height * (width + dst_pitch);
-            if (rtl && ((db >= sb && db <= se) || (de >= sb && de <= se))) {
-                /* regions may overlap: copy via temporary */
-                int llb = width * bypp;
-                int tmp_stride = DIV_ROUND_UP(llb, sizeof(uint32_t));
+            /* If reverse blit do simple check for overlaps */
+            if (rtl && src_base == dst_base && src_pitch == dst_pitch) {
+                overlap = (src_x < dst_x + width && src_x + width > dst_x &&
+                           src_y < dst_y + height && src_y + height > dst_y);
+            } else if (rtl) {
+                unsigned int sb, se, db, de;
+                sb = src_base + (src_x + src_y * src_pitch) * bypp;
+                se = sb + (width + (height - 1) * src_pitch) * bypp;
+                db = dst_base + (dst_x + dst_y * dst_pitch) * bypp;
+                de = db + (width + (height - 1) * dst_pitch) * bypp;
+                overlap = (db < se && sb < de);
+            }
+            if (overlap) {
+                /* pixman can't do reverse blit: copy via temporary */
+                int tmp_stride = DIV_ROUND_UP(width * bypp, sizeof(uint32_t));
                 uint32_t *tmp = tmp_buf;
 
                 if (tmp_stride * sizeof(uint32_t) * height > sizeof(tmp_buf)) {
-- 
2.21.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] sm501: Fix and optimize overlap check
  2020-06-24 16:42   ` [PATCH v4] " BALATON Zoltan
@ 2020-06-25 10:11     ` Peter Maydell
  2020-06-30 20:52     ` Gerd Hoffmann
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2020-06-25 10:11 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: Sebastian Bauer, Magnus Damm, QEMU Developers, Aurelien Jarno,
	Gerd Hoffmann

On Wed, 24 Jun 2020 at 17:47, BALATON Zoltan <balaton@eik.bme.hu> wrote:
>
> When doing reverse blit we need to check if source and dest overlap
> but it is not trivial due to possible different base and pitch of
> source and dest. Do rectangle overlap if base and pitch match,
> otherwise just check if memory area containing the rects overlaps so
> rects could possibly overlap.
>
> Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
> ---
> This version fixes overlap check according to Peter's suggestion, only
> resending this patch as v4, others still valid from v3. Let me know if
> you want the whole series resent instead.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] sm501: Fix and optimize overlap check
  2020-06-24 16:42   ` [PATCH v4] " BALATON Zoltan
  2020-06-25 10:11     ` Peter Maydell
@ 2020-06-30 20:52     ` Gerd Hoffmann
  1 sibling, 0 replies; 15+ messages in thread
From: Gerd Hoffmann @ 2020-06-30 20:52 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: Peter Maydell, Sebastian Bauer, Magnus Damm, qemu-devel, Aurelien Jarno

On Wed, Jun 24, 2020 at 06:42:18PM +0200, BALATON Zoltan wrote:
> When doing reverse blit we need to check if source and dest overlap
> but it is not trivial due to possible different base and pitch of
> source and dest. Do rectangle overlap if base and pitch match,
> otherwise just check if memory area containing the rects overlaps so
> rects could possibly overlap.
> 
> Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
> ---
> This version fixes overlap check according to Peter's suggestion, only
> resending this patch as v4, others still valid from v3. Let me know if
> you want the whole series resent instead.

Queued up now,  Last patch in a series is easy to replace.

thanks,
  Gerd



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-06-30 20:53 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-20 20:56 [PATCH v3 0/9] More sm501 fixes and optimisations BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 7/9] sm501: Do not allow guest to set invalid format BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 5/9] sm501: Optimise 1 pixel 2d ops BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 1/9] sm501: Fix bounds checks BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 9/9] sm501: Fix and optimize overlap check BALATON Zoltan
2020-06-22 19:38   ` Peter Maydell
2020-06-22 23:07     ` BALATON Zoltan
2020-06-24 16:42   ` [PATCH v4] " BALATON Zoltan
2020-06-25 10:11     ` Peter Maydell
2020-06-30 20:52     ` Gerd Hoffmann
2020-06-20 20:56 ` [PATCH v3 3/9] sm501: Ignore no-op blits BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 4/9] sm501: Introduce variable for commonly used value for better readability BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 8/9] sm501: Convert debug printfs to traces BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 6/9] sm501: Use stn_he_p/ldn_he_p instead of switch/case BALATON Zoltan
2020-06-20 20:56 ` [PATCH v3 2/9] sm501: Drop unneded variable BALATON Zoltan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).