All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
@ 2014-06-17  3:07 Benjamin Herrenschmidt
  2014-06-17  4:40 ` Paolo Bonzini
  2014-06-17 10:45 ` Gerd Hoffmann
  0 siblings, 2 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17  3:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Gerd Hoffmann, Alexander Graf

Hi folks !

WARNING: The patch below is NOT intended for merging for rather obvious
reasons in its current form :-)

So basically, we have a problem we need to solve in qemu powerpc related
to the emulated VGA, as I mentioned in an earlier mail yesterday.

The core issue stems from the fact that graphic stacks including X and a
number of other components along the way are terminally broken if the
framebuffer endianness doesn't match the guest endianness. We can
discuss the historical reasons for that fact if you wish and we can
discuss why it is almost unfixable separately.

This is an obvious problem for qemu on ppc64 since we now support having
either BE or LE guests.

At the moment, we have a broken patch in offb that makes the guest
sort-of limp along with the reversed framebuffer but that patch is
broken in several ways so I've reverted it upstream and X wouldn't work
anyway.

So we need to dynamically be able to change the byte order of the VGA
framebuffer in qemu.

This email is NOT about discussing the mechanisms on how we trigger that
byteswap, ie whether we emulate a new register for the guest to
configure its endian or we use the existing PAPR H_SET_MODE, or a
combination of these etc... I will discuss that separately.

This is purely about discussing the changes to vga.c and vga-templates.h
to be able to handle the swapping in a runtime-decided way rather than
compile time.

I've come up with the proof of concept below which changes the various
line drawing functions in vga-template to select the right ld/st
function. However that adds a per-line test (overhead I can hear some
people shouting ! :-).

A better approach might be to just add new set of functions to
vga_draw_line_table[]. Ie change:

enum {
    VGA_DRAW_LINE2,
    VGA_DRAW_LINE2D2,
    VGA_DRAW_LINE4,
    VGA_DRAW_LINE4D2,
    VGA_DRAW_LINE8D2,
    VGA_DRAW_LINE8,
    VGA_DRAW_LINE15,
    VGA_DRAW_LINE16,
    VGA_DRAW_LINE24,
    VGA_DRAW_LINE32,
    VGA_DRAW_LINE_NB,
};

to something like

enum {
    VGA_DRAW_LINE2,
    VGA_DRAW_LINE2D2,
    VGA_DRAW_LINE4,
    VGA_DRAW_LINE4D2,
    VGA_DRAW_LINE8D2,
    VGA_DRAW_LINE8,
    VGA_DRAW_LINE15_LE,
    VGA_DRAW_LINE16_LE,
    VGA_DRAW_LINE24_LE,
    VGA_DRAW_LINE32_LE,
    VGA_DRAW_LINE15_BE,
    VGA_DRAW_LINE16_BE,
    VGA_DRAW_LINE24_BE,
    VGA_DRAW_LINE32_BE,
    VGA_DRAW_LINE_NB,
};

And add some more macros to generate them.

That means less runtime overhead (though I don't think the overhead is
huge) at the expense of bigger code size.

What do you prefer ?

Proof-of-concept code here:

diff --git a/hw/display/vga.c b/hw/display/vga.c
index 8cd6afe..c6e2e96 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -983,27 +983,72 @@ typedef void vga_draw_glyph9_func(uint8_t *d, int linesize,
 typedef void vga_draw_line_func(VGACommonState *s1, uint8_t *d,
                                 const uint8_t *s, int width);
 
+#if 0
+
+static inline bool vga_is_big_endian(VGACommonState *s)
+{
+#ifdef TARGET_WORDS_BIGENDIAN
+    return true;
+#else
+    return true;
+#endif
+}
+
+#else
+
+#ifdef TARGET_WORDS_BIGENDIAN
+static bool vga_is_be = true;
+#else
+static bool vga_is_be = false;
+#endif
+
+void vga_set_big_endian(bool be)
+{
+    vga_is_be = be;
+}
+
+static inline bool vga_is_big_endian(VGACommonState *s)
+{
+    return vga_is_be;
+}
+
+#endif
+
+static inline bool vga_need_swap(VGACommonState *s, bool target_be)
+{
+#ifdef HOST_WORDS_BIGENDIAN
+    return !target_be;
+#else
+    return target_be;
+#endif
+}
+
+
+#define BGR_FORMAT 0
 #define DEPTH 8
 #include "vga_template.h"
 
+#define BGR_FORMAT 0
 #define DEPTH 15
 #include "vga_template.h"
 
-#define BGR_FORMAT
+#define BGR_FORMAT 1
 #define DEPTH 15
 #include "vga_template.h"
 
+#define BGR_FORMAT 0
 #define DEPTH 16
 #include "vga_template.h"
 
-#define BGR_FORMAT
+#define BGR_FORMAT 1
 #define DEPTH 16
 #include "vga_template.h"
 
+#define BGR_FORMAT 0
 #define DEPTH 32
 #include "vga_template.h"
 
-#define BGR_FORMAT
+#define BGR_FORMAT 1
 #define DEPTH 32
 #include "vga_template.h"
 
@@ -1645,11 +1690,9 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
     uint8_t *d;
     uint32_t v, addr1, addr;
     vga_draw_line_func *vga_draw_line;
-#if defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
-    static const bool byteswap = false;
-#else
-    static const bool byteswap = true;
-#endif
+    bool byteswap;
+
+    byteswap = vga_need_swap(s, vga_is_big_endian(s));
 
     full_update |= update_basic_params(s);
 
@@ -1691,7 +1734,8 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
     if (s->line_offset != s->last_line_offset ||
         disp_width != s->last_width ||
         height != s->last_height ||
-        s->last_depth != depth) {
+        s->last_depth != depth ||
+        s->last_bswap != byteswap) {
         if (depth == 32 || (depth == 16 && !byteswap)) {
             surface = qemu_create_displaysurface_from(disp_width,
                     height, depth, s->line_offset,
@@ -1707,6 +1751,7 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
         s->last_height = height;
         s->last_line_offset = s->line_offset;
         s->last_depth = depth;
+	s->last_bswap = byteswap;
         full_update = 1;
     } else if (is_buffer_shared(surface) &&
                (full_update || surface_data(surface) != s->vram_ptr
diff --git a/hw/display/vga_int.h b/hw/display/vga_int.h
index 5320abd..129360f 100644
--- a/hw/display/vga_int.h
+++ b/hw/display/vga_int.h
@@ -148,6 +148,7 @@ typedef struct VGACommonState {
     uint32_t last_width, last_height; /* in chars or pixels */
     uint32_t last_scr_width, last_scr_height; /* in pixels */
     uint32_t last_depth; /* in bits */
+    bool last_bswap;
     uint8_t cursor_start, cursor_end;
     bool cursor_visible_phase;
     int64_t cursor_blink_time;
@@ -205,6 +206,8 @@ uint32_t vbe_ioport_read_data(void *opaque, uint32_t addr);
 void vbe_ioport_write_index(void *opaque, uint32_t addr, uint32_t val);
 void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val);
 
+void vga_set_big_endian(bool be);
+
 extern const uint8_t sr_mask[8];
 extern const uint8_t gr_mask[16];
 
diff --git a/hw/display/vga_template.h b/hw/display/vga_template.h
index 90ec9c2..320a2a0 100644
--- a/hw/display/vga_template.h
+++ b/hw/display/vga_template.h
@@ -35,13 +35,13 @@
 #error unsupport depth
 #endif
 
-#ifdef BGR_FORMAT
+#if BGR_FORMAT
 #define PIXEL_NAME glue(DEPTH, bgr)
 #else
 #define PIXEL_NAME DEPTH
 #endif /* BGR_FORMAT */
 
-#if DEPTH != 15 && !defined(BGR_FORMAT)
+#if DEPTH != 15 && !BGR_FORMAT
 
 static inline void glue(vga_draw_glyph_line_, DEPTH)(uint8_t *d,
                                                      uint32_t font_data,
@@ -353,23 +353,37 @@ static void glue(vga_draw_line8_, DEPTH)(VGACommonState *s1, uint8_t *d,
 static void glue(vga_draw_line15_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
                                           const uint8_t *s, int width)
 {
-#if DEPTH == 15 && defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
-    memcpy(d, s, width * 2);
-#else
-    int w;
-    uint32_t v, r, g, b;
-
-    w = width;
-    do {
-        v = lduw_p((void *)s);
-        r = (v >> 7) & 0xf8;
-        g = (v >> 2) & 0xf8;
-        b = (v << 3) & 0xf8;
-        ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
-        s += 2;
-        d += BPP;
-    } while (--w != 0);
-#endif
+    bool is_be = vga_is_big_endian(s1);
+
+    if (DEPTH == 15 && !vga_need_swap(s1, is_be)) {
+        memcpy(d, s, width * 2);
+    } else {
+        int w;
+        uint32_t v, r, g, b;
+
+        w = width;
+        if (is_be) {
+            do {
+                v = lduw_be_p((void *)s);
+                r = (v >> 7) & 0xf8;
+                g = (v >> 2) & 0xf8;
+                b = (v << 3) & 0xf8;
+                ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+                s += 2;
+                d += BPP;
+            } while (--w != 0);
+        } else {
+            do {
+                v = lduw_le_p((void *)s);
+                r = (v >> 7) & 0xf8;
+                g = (v >> 2) & 0xf8;
+                b = (v << 3) & 0xf8;
+                ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+                s += 2;
+                d += BPP;
+            } while (--w != 0);
+        }
+    }
 }
 
 /*
@@ -378,23 +392,37 @@ static void glue(vga_draw_line15_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 static void glue(vga_draw_line16_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
                                           const uint8_t *s, int width)
 {
-#if DEPTH == 16 && defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
-    memcpy(d, s, width * 2);
-#else
-    int w;
-    uint32_t v, r, g, b;
-
-    w = width;
-    do {
-        v = lduw_p((void *)s);
-        r = (v >> 8) & 0xf8;
-        g = (v >> 3) & 0xfc;
-        b = (v << 3) & 0xf8;
-        ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
-        s += 2;
-        d += BPP;
-    } while (--w != 0);
-#endif
+    bool is_be = vga_is_big_endian(s1);
+
+    if (DEPTH == 16 && !vga_need_swap(s1, is_be)) {
+        memcpy(d, s, width * 2);
+    } else {
+        int w;
+        uint32_t v, r, g, b;
+
+        w = width;
+        if (is_be) {
+            do {
+                v = lduw_be_p((void *)s);
+                r = (v >> 8) & 0xf8;
+                g = (v >> 3) & 0xfc;
+                b = (v << 3) & 0xf8;
+                ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+                s += 2;
+                d += BPP;
+            } while (--w != 0);
+	} else {
+            do {
+                v = lduw_le_p((void *)s);
+                r = (v >> 8) & 0xf8;
+                g = (v >> 3) & 0xfc;
+                b = (v << 3) & 0xf8;
+                ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+                s += 2;
+                d += BPP;
+            } while (--w != 0);
+	}
+    }
 }
 
 /*
@@ -407,20 +435,25 @@ static void glue(vga_draw_line24_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
     uint32_t r, g, b;
 
     w = width;
-    do {
-#if defined(TARGET_WORDS_BIGENDIAN)
-        r = s[0];
-        g = s[1];
-        b = s[2];
-#else
-        b = s[0];
-        g = s[1];
-        r = s[2];
-#endif
-        ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
-        s += 3;
-        d += BPP;
-    } while (--w != 0);
+    if (vga_is_big_endian(s1)) {
+        do {
+            r = s[0];
+            g = s[1];
+            b = s[2];
+            ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+            s += 3;
+            d += BPP;
+        } while (--w != 0);
+    } else {
+        do {
+            b = s[0];
+            g = s[1];
+            r = s[2];
+            ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+            s += 3;
+            d += BPP;
+        } while (--w != 0);
+    }
 }
 
 /*
@@ -429,28 +462,35 @@ static void glue(vga_draw_line24_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 static void glue(vga_draw_line32_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
                                           const uint8_t *s, int width)
 {
-#if DEPTH == 32 && defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN) && !defined(BGR_FORMAT)
-    memcpy(d, s, width * 4);
-#else
-    int w;
-    uint32_t r, g, b;
-
-    w = width;
-    do {
-#if defined(TARGET_WORDS_BIGENDIAN)
-        r = s[1];
-        g = s[2];
-        b = s[3];
-#else
-        b = s[0];
-        g = s[1];
-        r = s[2];
-#endif
-        ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
-        s += 4;
-        d += BPP;
-    } while (--w != 0);
-#endif
+    bool is_be = vga_is_big_endian(s1);
+
+    if (DEPTH == 32 && !BGR_FORMAT && !vga_need_swap(s1, is_be)) {
+        memcpy(d, s, width * 4);
+    } else {
+        int w;
+        uint32_t r, g, b;
+
+        w = width;
+        if (is_be) {
+            do {
+                r = s[1];
+                g = s[2];
+                b = s[3];
+                ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+                s += 4;
+                d += BPP;
+            } while (--w != 0);
+        } else {
+            do {
+                b = s[0];
+                g = s[1];
+                r = s[2];
+                ((PIXEL_TYPE *)d)[0] = glue(rgb_to_pixel, PIXEL_NAME)(r, g, b);
+                s += 4;
+                d += BPP;
+            } while (--w != 0);
+        }
+    }
 }
 
 #undef PUT_PIXEL2
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 0bae053..c8aaf3b 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -709,6 +709,8 @@ static target_ulong h_logical_dcbf(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+extern void vga_set_big_endian(bool trueis_be);
+
 static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                                target_ulong opcode, target_ulong *args)
 {
@@ -733,6 +735,7 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
             CPU_FOREACH(cs) {
                 set_spr(cs, SPR_LPCR, 0, LPCR_ILE);
             }
+            vga_set_big_endian(true);
             ret = H_SUCCESS;
             break;
 
@@ -740,6 +743,7 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
             CPU_FOREACH(cs) {
                 set_spr(cs, SPR_LPCR, LPCR_ILE, LPCR_ILE);
             }
+            vga_set_big_endian(false);
             ret = H_SUCCESS;
             break;
 

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  3:07 [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes Benjamin Herrenschmidt
@ 2014-06-17  4:40 ` Paolo Bonzini
  2014-06-17  4:59   ` Benjamin Herrenschmidt
  2014-06-17 10:45 ` Gerd Hoffmann
  1 sibling, 1 reply; 66+ messages in thread
From: Paolo Bonzini @ 2014-06-17  4:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, qemu-devel
  Cc: Alexey Kardashevskiy, Gerd Hoffmann, Alexander Graf

Il 17/06/2014 05:07, Benjamin Herrenschmidt ha scritto:
> I've come up with the proof of concept below which changes the various
> line drawing functions in vga-template to select the right ld/st
> function. However that adds a per-line test (overhead I can hear some
> people shouting ! :-).

Sounds perfectly reasonable.

Paolo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  4:40 ` Paolo Bonzini
@ 2014-06-17  4:59   ` Benjamin Herrenschmidt
  2014-06-17  5:36     ` Paolo Bonzini
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17  4:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel, Gerd Hoffmann

On Tue, 2014-06-17 at 06:40 +0200, Paolo Bonzini wrote:
> Il 17/06/2014 05:07, Benjamin Herrenschmidt ha scritto:
> > I've come up with the proof of concept below which changes the various
> > line drawing functions in vga-template to select the right ld/st
> > function. However that adds a per-line test (overhead I can hear some
> > people shouting ! :-).
> 
> Sounds perfectly reasonable.

Thanks. I've tried the other approach of adding new functions which
means no overhead (hopefully) for the non-swap case and less invasive
changes to vga_template.c.

Patch below. What do you think ? This or the previous approach ? Then we
can discuss how we actually trigger the endian change and where we store
the state :-)

diff --git a/hw/display/vga.c b/hw/display/vga.c
index 8cd6afe..eaa6791 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -983,27 +983,89 @@ typedef void vga_draw_glyph9_func(uint8_t *d, int linesize,
 typedef void vga_draw_line_func(VGACommonState *s1, uint8_t *d,
                                 const uint8_t *s, int width);
 
+#ifdef TARGET_WORDS_BIGENDIAN
+static bool vga_is_be = true;
+#else
+static bool vga_is_be = false;
+#endif
+
+void vga_set_big_endian(bool be)
+{
+    vga_is_be = be;
+}
+
+static inline bool vga_is_big_endian(VGACommonState *s)
+{
+    return vga_is_be;
+}
+
+#define BGR_FORMAT 0
+#define PIX_BE 0
 #define DEPTH 8
 #include "vga_template.h"
 
+#define BGR_FORMAT 0
+#define PIX_BE 0
 #define DEPTH 15
 #include "vga_template.h"
 
-#define BGR_FORMAT
+#define BGR_FORMAT 1
+#define PIX_BE 0
 #define DEPTH 15
 #include "vga_template.h"
 
+#define BGR_FORMAT 0
+#define PIX_BE 0
 #define DEPTH 16
 #include "vga_template.h"
 
-#define BGR_FORMAT
+#define BGR_FORMAT 1
+#define PIX_BE 0
 #define DEPTH 16
 #include "vga_template.h"
 
+#define BGR_FORMAT 0
+#define PIX_BE 0
 #define DEPTH 32
 #include "vga_template.h"
 
-#define BGR_FORMAT
+#define BGR_FORMAT 1
+#define PIX_BE 0
+#define DEPTH 32
+#include "vga_template.h"
+
+#define BGR_FORMAT 0
+#define PIX_BE 1
+#define DEPTH 8
+#include "vga_template.h"
+
+#define BGR_FORMAT 0
+#define PIX_BE 1
+#define DEPTH 15
+#include "vga_template.h"
+
+#define BGR_FORMAT 1
+#define PIX_BE 1
+#define DEPTH 15
+#include "vga_template.h"
+
+#define BGR_FORMAT 0
+#define PIX_BE 1
+#define DEPTH 16
+#include "vga_template.h"
+
+#define BGR_FORMAT 1
+#define PIX_BE 1
+#define DEPTH 16
+#include "vga_template.h"
+
+#define BGR_FORMAT 0
+#define PIX_BE 1
+#define DEPTH 32
+#include "vga_template.h"
+
+#define BGR_FORMAT 1
+#define PIX_BE 1
 #define DEPTH 32
 #include "vga_template.h"
 
@@ -1486,10 +1548,14 @@ enum {
     VGA_DRAW_LINE4D2,
     VGA_DRAW_LINE8D2,
     VGA_DRAW_LINE8,
-    VGA_DRAW_LINE15,
-    VGA_DRAW_LINE16,
-    VGA_DRAW_LINE24,
-    VGA_DRAW_LINE32,
+    VGA_DRAW_LINE15_LE,
+    VGA_DRAW_LINE16_LE,
+    VGA_DRAW_LINE24_LE,
+    VGA_DRAW_LINE32_LE,
+    VGA_DRAW_LINE15_BE,
+    VGA_DRAW_LINE16_BE,
+    VGA_DRAW_LINE24_BE,
+    VGA_DRAW_LINE32_BE,
     VGA_DRAW_LINE_NB,
 };
 
@@ -1542,37 +1608,69 @@ static vga_draw_line_func * const vga_draw_line_table[NB_DEPTHS * VGA_DRAW_LINE_
     vga_draw_line8_16,
     vga_draw_line8_16,
 
-    vga_draw_line15_8,
-    vga_draw_line15_15,
-    vga_draw_line15_16,
-    vga_draw_line15_32,
-    vga_draw_line15_32bgr,
-    vga_draw_line15_15bgr,
-    vga_draw_line15_16bgr,
-
-    vga_draw_line16_8,
-    vga_draw_line16_15,
-    vga_draw_line16_16,
-    vga_draw_line16_32,
-    vga_draw_line16_32bgr,
-    vga_draw_line16_15bgr,
-    vga_draw_line16_16bgr,
-
-    vga_draw_line24_8,
-    vga_draw_line24_15,
-    vga_draw_line24_16,
-    vga_draw_line24_32,
-    vga_draw_line24_32bgr,
-    vga_draw_line24_15bgr,
-    vga_draw_line24_16bgr,
-
-    vga_draw_line32_8,
-    vga_draw_line32_15,
-    vga_draw_line32_16,
-    vga_draw_line32_32,
-    vga_draw_line32_32bgr,
-    vga_draw_line32_15bgr,
-    vga_draw_line32_16bgr,
+    vga_draw_line15_8_le,
+    vga_draw_line15_15_le,
+    vga_draw_line15_16_le,
+    vga_draw_line15_32_le,
+    vga_draw_line15_32bgr_le,
+    vga_draw_line15_15bgr_le,
+    vga_draw_line15_16bgr_le,
+
+    vga_draw_line16_8_le,
+    vga_draw_line16_15_le,
+    vga_draw_line16_16_le,
+    vga_draw_line16_32_le,
+    vga_draw_line16_32bgr_le,
+    vga_draw_line16_15bgr_le,
+    vga_draw_line16_16bgr_le,
+
+    vga_draw_line24_8_le,
+    vga_draw_line24_15_le,
+    vga_draw_line24_16_le,
+    vga_draw_line24_32_le,
+    vga_draw_line24_32bgr_le,
+    vga_draw_line24_15bgr_le,
+    vga_draw_line24_16bgr_le,
+
+    vga_draw_line32_8_le,
+    vga_draw_line32_15_le,
+    vga_draw_line32_16_le,
+    vga_draw_line32_32_le,
+    vga_draw_line32_32bgr_le,
+    vga_draw_line32_15bgr_le,
+    vga_draw_line32_16bgr_le,
+
+    vga_draw_line15_8_be,
+    vga_draw_line15_15_be,
+    vga_draw_line15_16_be,
+    vga_draw_line15_32_be,
+    vga_draw_line15_32bgr_be,
+    vga_draw_line15_15bgr_be,
+    vga_draw_line15_16bgr_be,
+
+    vga_draw_line16_8_be,
+    vga_draw_line16_15_be,
+    vga_draw_line16_16_be,
+    vga_draw_line16_32_be,
+    vga_draw_line16_32bgr_be,
+    vga_draw_line16_15bgr_be,
+    vga_draw_line16_16bgr_be,
+
+    vga_draw_line24_8_be,
+    vga_draw_line24_15_be,
+    vga_draw_line24_16_be,
+    vga_draw_line24_32_be,
+    vga_draw_line24_32bgr_be,
+    vga_draw_line24_15bgr_be,
+    vga_draw_line24_16bgr_be,
+
+    vga_draw_line32_8_be,
+    vga_draw_line32_15_be,
+    vga_draw_line32_16_be,
+    vga_draw_line32_32_be,
+    vga_draw_line32_32bgr_be,
+    vga_draw_line32_15bgr_be,
+    vga_draw_line32_16bgr_be,
 };
 
 static int vga_get_bpp(VGACommonState *s)
@@ -1645,10 +1743,14 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
     uint8_t *d;
     uint32_t v, addr1, addr;
     vga_draw_line_func *vga_draw_line;
-#if defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
-    static const bool byteswap = false;
+    bool byteswap, is_be;
+
+    is_be = vga_is_big_endian(s);
+
+#ifdef HOST_WORDS_BIGENDIAN
+    byteswap = !is_be;
 #else
-    static const bool byteswap = true;
+    byteswap = is_be;
 #endif
 
     full_update |= update_basic_params(s);
@@ -1691,7 +1793,8 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
     if (s->line_offset != s->last_line_offset ||
         disp_width != s->last_width ||
         height != s->last_height ||
-        s->last_depth != depth) {
+        s->last_depth != depth ||
+        s->last_bswap != byteswap) {
         if (depth == 32 || (depth == 16 && !byteswap)) {
             surface = qemu_create_displaysurface_from(disp_width,
                     height, depth, s->line_offset,
@@ -1707,6 +1810,7 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
         s->last_height = height;
         s->last_line_offset = s->line_offset;
         s->last_depth = depth;
+	s->last_bswap = byteswap;
         full_update = 1;
     } else if (is_buffer_shared(surface) &&
                (full_update || surface_data(surface) != s->vram_ptr
@@ -1750,19 +1854,19 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
             bits = 8;
             break;
         case 15:
-            v = VGA_DRAW_LINE15;
+            v = is_be ? VGA_DRAW_LINE15_BE : VGA_DRAW_LINE15_LE;
             bits = 16;
             break;
         case 16:
-            v = VGA_DRAW_LINE16;
+            v = is_be ? VGA_DRAW_LINE16_BE : VGA_DRAW_LINE16_LE;
             bits = 16;
             break;
         case 24:
-            v = VGA_DRAW_LINE24;
+            v = is_be ? VGA_DRAW_LINE24_BE : VGA_DRAW_LINE24_LE;
             bits = 24;
             break;
         case 32:
-            v = VGA_DRAW_LINE32;
+            v = is_be ? VGA_DRAW_LINE32_BE : VGA_DRAW_LINE32_LE;
             bits = 32;
             break;
         }
diff --git a/hw/display/vga_int.h b/hw/display/vga_int.h
index 5320abd..129360f 100644
--- a/hw/display/vga_int.h
+++ b/hw/display/vga_int.h
@@ -148,6 +148,7 @@ typedef struct VGACommonState {
     uint32_t last_width, last_height; /* in chars or pixels */
     uint32_t last_scr_width, last_scr_height; /* in pixels */
     uint32_t last_depth; /* in bits */
+    bool last_bswap;
     uint8_t cursor_start, cursor_end;
     bool cursor_visible_phase;
     int64_t cursor_blink_time;
@@ -205,6 +206,8 @@ uint32_t vbe_ioport_read_data(void *opaque, uint32_t addr);
 void vbe_ioport_write_index(void *opaque, uint32_t addr, uint32_t val);
 void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val);
 
+void vga_set_big_endian(bool be);
+
 extern const uint8_t sr_mask[8];
 extern const uint8_t gr_mask[16];
 
diff --git a/hw/display/vga_template.h b/hw/display/vga_template.h
index 90ec9c2..f8ea15f 100644
--- a/hw/display/vga_template.h
+++ b/hw/display/vga_template.h
@@ -35,13 +35,24 @@
 #error unsupport depth
 #endif
 
-#ifdef BGR_FORMAT
-#define PIXEL_NAME glue(DEPTH, bgr)
-#else
+#if BGR_FORMAT
+#if PIX_BE
+#define PIXEL_FNAME glue(DEPTH,bgr_be)
+#else /* PIX_BE */
+#define PIXEL_FNAME glue(DEPTH,bgr_le)
+#endif /* PIX_BE */ 
+#define PIXEL_NAME glue(DEPTH,bgr)
+#else /* BGR_FORMAT */
+#if PIX_BE
+#define PIXEL_FNAME glue(DEPTH,_be)
+#else /* PIX_BE */
+#define PIXEL_FNAME glue(DEPTH,_le)
+#endif /* PIX_BE */
 #define PIXEL_NAME DEPTH
 #endif /* BGR_FORMAT */
 
-#if DEPTH != 15 && !defined(BGR_FORMAT)
+
+#if DEPTH != 15 && !BGR_FORMAT && !PIX_BE
 
 static inline void glue(vga_draw_glyph_line_, DEPTH)(uint8_t *d,
                                                      uint32_t font_data,
@@ -350,10 +361,10 @@ static void glue(vga_draw_line8_, DEPTH)(VGACommonState *s1, uint8_t *d,
 /*
  * 15 bit color
  */
-static void glue(vga_draw_line15_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
+static void glue(vga_draw_line15_, PIXEL_FNAME)(VGACommonState *s1, uint8_t *d,
                                           const uint8_t *s, int width)
 {
-#if DEPTH == 15 && defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
+#if DEPTH == 15 && PIX_BE == defined(HOST_WORDS_BIGENDIAN)
     memcpy(d, s, width * 2);
 #else
     int w;
@@ -361,7 +372,11 @@ static void glue(vga_draw_line15_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 
     w = width;
     do {
-        v = lduw_p((void *)s);
+#if PIX_BE
+        v = lduw_be_p((void *)s);
+#else
+        v = lduw_le_p((void *)s);
+#endif
         r = (v >> 7) & 0xf8;
         g = (v >> 2) & 0xf8;
         b = (v << 3) & 0xf8;
@@ -375,10 +390,10 @@ static void glue(vga_draw_line15_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 /*
  * 16 bit color
  */
-static void glue(vga_draw_line16_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
+static void glue(vga_draw_line16_, PIXEL_FNAME)(VGACommonState *s1, uint8_t *d,
                                           const uint8_t *s, int width)
 {
-#if DEPTH == 16 && defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
+#if DEPTH == 16 && PIX_BE == defined(HOST_WORDS_BIGENDIAN)
     memcpy(d, s, width * 2);
 #else
     int w;
@@ -386,7 +401,11 @@ static void glue(vga_draw_line16_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 
     w = width;
     do {
-        v = lduw_p((void *)s);
+#if PIX_BE
+        v = lduw_be_p((void *)s);
+#else
+        v = lduw_le_p((void *)s);
+#endif
         r = (v >> 8) & 0xf8;
         g = (v >> 3) & 0xfc;
         b = (v << 3) & 0xf8;
@@ -400,7 +419,7 @@ static void glue(vga_draw_line16_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 /*
  * 24 bit color
  */
-static void glue(vga_draw_line24_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
+static void glue(vga_draw_line24_, PIXEL_FNAME)(VGACommonState *s1, uint8_t *d,
                                           const uint8_t *s, int width)
 {
     int w;
@@ -408,7 +427,7 @@ static void glue(vga_draw_line24_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 
     w = width;
     do {
-#if defined(TARGET_WORDS_BIGENDIAN)
+#if PIX_BE
         r = s[0];
         g = s[1];
         b = s[2];
@@ -426,10 +445,10 @@ static void glue(vga_draw_line24_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 /*
  * 32 bit color
  */
-static void glue(vga_draw_line32_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
+static void glue(vga_draw_line32_, PIXEL_FNAME)(VGACommonState *s1, uint8_t *d,
                                           const uint8_t *s, int width)
 {
-#if DEPTH == 32 && defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN) && !defined(BGR_FORMAT)
+#if DEPTH == 32 && !BGR_FORMAT && PIX_BE == defined(HOST_WORDS_BIGENDIAN)
     memcpy(d, s, width * 4);
 #else
     int w;
@@ -437,7 +456,7 @@ static void glue(vga_draw_line32_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 
     w = width;
     do {
-#if defined(TARGET_WORDS_BIGENDIAN)
+#if PIX_BE
         r = s[1];
         g = s[2];
         b = s[3];
@@ -458,4 +477,7 @@ static void glue(vga_draw_line32_, PIXEL_NAME)(VGACommonState *s1, uint8_t *d,
 #undef BPP
 #undef PIXEL_TYPE
 #undef PIXEL_NAME
+#undef PIXEL_FNAME
 #undef BGR_FORMAT
+#undef PIX_BE
+
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 0bae053..c8aaf3b 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -709,6 +709,8 @@ static target_ulong h_logical_dcbf(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+extern void vga_set_big_endian(bool trueis_be);
+
 static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                                target_ulong opcode, target_ulong *args)
 {
@@ -733,6 +735,7 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
             CPU_FOREACH(cs) {
                 set_spr(cs, SPR_LPCR, 0, LPCR_ILE);
             }
+            vga_set_big_endian(true);
             ret = H_SUCCESS;
             break;
 
@@ -740,6 +743,7 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, sPAPREnvironment *spapr,
             CPU_FOREACH(cs) {
                 set_spr(cs, SPR_LPCR, LPCR_ILE, LPCR_ILE);
             }
+            vga_set_big_endian(false);
             ret = H_SUCCESS;
             break;
 

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  4:59   ` Benjamin Herrenschmidt
@ 2014-06-17  5:36     ` Paolo Bonzini
  2014-06-17  6:06       ` Benjamin Herrenschmidt
  2014-06-17  9:18       ` Alexey Kardashevskiy
  0 siblings, 2 replies; 66+ messages in thread
From: Paolo Bonzini @ 2014-06-17  5:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel, Gerd Hoffmann

Il 17/06/2014 06:59, Benjamin Herrenschmidt ha scritto:
> Thanks. I've tried the other approach of adding new functions which
> means no overhead (hopefully) for the non-swap case and less invasive
> changes to vga_template.c.
>
> Patch below. What do you think ? This or the previous approach ? Then we
> can discuss how we actually trigger the endian change and where we store
> the state :-)

This is definitely more readable.  Anyway Gerd is the VGA guy. :)

Paolo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  5:36     ` Paolo Bonzini
@ 2014-06-17  6:06       ` Benjamin Herrenschmidt
  2014-06-17  9:18       ` Alexey Kardashevskiy
  1 sibling, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17  6:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel, Gerd Hoffmann

On Tue, 2014-06-17 at 07:36 +0200, Paolo Bonzini wrote:
> Il 17/06/2014 06:59, Benjamin Herrenschmidt ha scritto:
> > Thanks. I've tried the other approach of adding new functions which
> > means no overhead (hopefully) for the non-swap case and less invasive
> > changes to vga_template.c.
> >
> > Patch below. What do you think ? This or the previous approach ? Then we
> > can discuss how we actually trigger the endian change and where we store
> > the state :-)
> 
> This is definitely more readable.  Anyway Gerd is the VGA guy. :)

Allright, let's wait for his opinion.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  5:36     ` Paolo Bonzini
  2014-06-17  6:06       ` Benjamin Herrenschmidt
@ 2014-06-17  9:18       ` Alexey Kardashevskiy
  2014-06-17  9:26         ` Alexander Graf
  2014-06-17 10:00         ` Greg Kurz
  1 sibling, 2 replies; 66+ messages in thread
From: Alexey Kardashevskiy @ 2014-06-17  9:18 UTC (permalink / raw)
  To: Paolo Bonzini, Benjamin Herrenschmidt
  Cc: Peter Maydell, Alexander Graf, Greg Kurz, qemu-devel, Gerd Hoffmann

On 06/17/2014 03:36 PM, Paolo Bonzini wrote:
> Il 17/06/2014 06:59, Benjamin Herrenschmidt ha scritto:
>> Thanks. I've tried the other approach of adding new functions which
>> means no overhead (hopefully) for the non-swap case and less invasive
>> changes to vga_template.c.
>>
>> Patch below. What do you think ? This or the previous approach ? Then we
>> can discuss how we actually trigger the endian change and where we store
>> the state :-)
> 
> This is definitely more readable.  Anyway Gerd is the VGA guy. :)

I am that lucky person who got to do endianness-on-fly-switching
QOM'fication :)

We have 2 ways of doing this:

1. implement a VGA register in QEMU and use it from the guest;
there is a try to discuss it in "[Qemu-devel] Endian control register";
requires guest changes;

2. on SPAPR we have H_SET_MODE hypercall which guest uses when it switches
endianness and current hack is to change the flag directly in VGA device
from this hypercall handler. Instead of that hack, we could have added a
device callback:

void DeviceClass::endianness_notify(DeviceState *dev,
		enum device_endian endianness);

or even an Interface (with the same method alone). And in H_SET_MODE we
could walk through all devices in the system and poke ones which implement
a callback or an interface. virtio could benefit from this as well.

Ideas?


-- 
Alexey

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  9:18       ` Alexey Kardashevskiy
@ 2014-06-17  9:26         ` Alexander Graf
  2014-06-17 10:00         ` Greg Kurz
  1 sibling, 0 replies; 66+ messages in thread
From: Alexander Graf @ 2014-06-17  9:26 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Paolo Bonzini, Benjamin Herrenschmidt
  Cc: Peter Maydell, Greg Kurz, qemu-devel, Gerd Hoffmann


On 17.06.14 11:18, Alexey Kardashevskiy wrote:
> On 06/17/2014 03:36 PM, Paolo Bonzini wrote:
>> Il 17/06/2014 06:59, Benjamin Herrenschmidt ha scritto:
>>> Thanks. I've tried the other approach of adding new functions which
>>> means no overhead (hopefully) for the non-swap case and less invasive
>>> changes to vga_template.c.
>>>
>>> Patch below. What do you think ? This or the previous approach ? Then we
>>> can discuss how we actually trigger the endian change and where we store
>>> the state :-)
>> This is definitely more readable.  Anyway Gerd is the VGA guy. :)
> I am that lucky person who got to do endianness-on-fly-switching
> QOM'fication :)
>
> We have 2 ways of doing this:
>
> 1. implement a VGA register in QEMU and use it from the guest;
> there is a try to discuss it in "[Qemu-devel] Endian control register";
> requires guest changes;
>
> 2. on SPAPR we have H_SET_MODE hypercall which guest uses when it switches
> endianness and current hack is to change the flag directly in VGA device
> from this hypercall handler. Instead of that hack, we could have added a
> device callback:
>
> void DeviceClass::endianness_notify(DeviceState *dev,
> 		enum device_endian endianness);
>
> or even an Interface (with the same method alone). And in H_SET_MODE we
> could walk through all devices in the system and poke ones which implement
> a callback or an interface. virtio could benefit from this as well.
>
> Ideas?

The register approach is nicer. Let's wait for Gerd to see what he thinks.


Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  9:18       ` Alexey Kardashevskiy
  2014-06-17  9:26         ` Alexander Graf
@ 2014-06-17 10:00         ` Greg Kurz
  2014-06-17 10:09           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 66+ messages in thread
From: Greg Kurz @ 2014-06-17 10:00 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Peter Maydell, qemu-devel, Alexander Graf, Gerd Hoffmann, Paolo Bonzini

On Tue, 17 Jun 2014 19:18:11 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 06/17/2014 03:36 PM, Paolo Bonzini wrote:
> > Il 17/06/2014 06:59, Benjamin Herrenschmidt ha scritto:
> >> Thanks. I've tried the other approach of adding new functions which
> >> means no overhead (hopefully) for the non-swap case and less invasive
> >> changes to vga_template.c.
> >>
> >> Patch below. What do you think ? This or the previous approach ? Then we
> >> can discuss how we actually trigger the endian change and where we store
> >> the state :-)
> > 
> > This is definitely more readable.  Anyway Gerd is the VGA guy. :)
> 
> I am that lucky person who got to do endianness-on-fly-switching
> QOM'fication :)
> 
> We have 2 ways of doing this:
> 
> 1. implement a VGA register in QEMU and use it from the guest;
> there is a try to discuss it in "[Qemu-devel] Endian control register";
> requires guest changes;
> 
> 2. on SPAPR we have H_SET_MODE hypercall which guest uses when it switches
> endianness and current hack is to change the flag directly in VGA device
> from this hypercall handler. Instead of that hack, we could have added a
> device callback:
> 
> void DeviceClass::endianness_notify(DeviceState *dev,
> 		enum device_endian endianness);
> 
> or even an Interface (with the same method alone). And in H_SET_MODE we
> could walk through all devices in the system and poke ones which implement
> a callback or an interface. virtio could benefit from this as well.
> 

There has been a discussion already about virtio endianness: relying on
a guest system wide setting such as LPCR_ILE has been strongly rejected
at the time... The consensus for virtio is "device endianness is the
endianness of the CPU that does the reset" (hence MSR_LE for PPC).

I don't know the details for VGA, but virtio would not benefit from
such callback.

> Ideas?
> 
> 



-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 10:00         ` Greg Kurz
@ 2014-06-17 10:09           ` Benjamin Herrenschmidt
  2014-06-17 10:19             ` Peter Maydell
  2014-06-17 11:40             ` Greg Kurz
  0 siblings, 2 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17 10:09 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Peter Maydell, Alexey Kardashevskiy, qemu-devel, Alexander Graf,
	Gerd Hoffmann, Paolo Bonzini

On Tue, 2014-06-17 at 12:00 +0200, Greg Kurz wrote:
> There has been a discussion already about virtio endianness: relying on
> a guest system wide setting such as LPCR_ILE has been strongly rejected
> at the time... The consensus for virtio is "device endianness is the
> endianness of the CPU that does the reset" (hence MSR_LE for PPC).

How on earth did anybody reach such a conclusion ? Of all the possible
options I can think of this is the one that makes the *less* sense !

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 10:09           ` Benjamin Herrenschmidt
@ 2014-06-17 10:19             ` Peter Maydell
  2014-06-17 10:57               ` Benjamin Herrenschmidt
  2014-06-17 11:40             ` Greg Kurz
  1 sibling, 1 reply; 66+ messages in thread
From: Peter Maydell @ 2014-06-17 10:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, qemu-devel, Alexander Graf, Gerd Hoffmann,
	Paolo Bonzini, Greg Kurz

On 17 June 2014 11:09, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> On Tue, 2014-06-17 at 12:00 +0200, Greg Kurz wrote:
>> There has been a discussion already about virtio endianness: relying on
>> a guest system wide setting such as LPCR_ILE has been strongly rejected
>> at the time... The consensus for virtio is "device endianness is the
>> endianness of the CPU that does the reset" (hence MSR_LE for PPC).
>
> How on earth did anybody reach such a conclusion ? Of all the possible
> options I can think of this is the one that makes the *less* sense !

Well, the right conclusion is "virtio should have specified
endianness sanely, ie to be independent of the CPU". However
we can't rewind time to make that decision correctly, so this
is the best we can usefully do without breaking existing
working guests.

It's absolutely a virtio-specific thing, though, given that
virtio has the weird "endianness of the guest" semantics in
it, so it's not a good model for anything else.

My personal opinion here is that device models should just
have a fixed byte order, and the guest should deal with it
(except in the cases where we're modelling real hardware
which has a real config register for flipping byte order,
in which case the answer is "work like that hardware").
So I'm still really dubious about adding endian swapping to
the VGA model at all. Why can't you just have the guest
do the right thing?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17  3:07 [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes Benjamin Herrenschmidt
  2014-06-17  4:40 ` Paolo Bonzini
@ 2014-06-17 10:45 ` Gerd Hoffmann
  2014-06-17 11:08   ` Peter Maydell
  2014-06-17 11:15   ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 66+ messages in thread
From: Gerd Hoffmann @ 2014-06-17 10:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

  Hi,

> The core issue stems from the fact that graphic stacks including X and a
> number of other components along the way are terminally broken if the
> framebuffer endianness doesn't match the guest endianness. We can
> discuss the historical reasons for that fact if you wish and we can
> discuss why it is almost unfixable separately.

"framebuffer" as in "linux framebuffer device" or as in "whatever memory
area X11 renders into" ?

> This is an obvious problem for qemu on ppc64 since we now support having
> either BE or LE guests.
> 
> At the moment, we have a broken patch in offb that makes the guest
> sort-of limp along with the reversed framebuffer but that patch is
> broken in several ways so I've reverted it upstream and X wouldn't work
> anyway.

Side note: tried "CONFIG_DRM_BOCHS=m"?

In theory it should just work.  I've never actually tested it in
practice on bigendian.

Not sure it helps in any way with the byteorder issue though.  Probably
not.  Could help with the "how do we flip the framebuffer byteorder"
though, as we can simply define a new register and let the driver set
it.

> This is purely about discussing the changes to vga.c and vga-templates.h
> to be able to handle the swapping in a runtime-decided way rather than
> compile time.

> What do you prefer ?

Let pixman handle it?  Well, except that pixman can't handle byteswapped
16bpp formats.  Too bad :(

In any case cleaning up the whole mess is overdue and doing that
beforehand should simplify the job alot.  For quite some time qemu uses
32bpp internally no matter what, so the functions for converting the
guest's framebuffer into anything but 32bpp should be dead code.

And as long as the guest uses 32bpp too there is nothing converted
anyway, we just setup pixman image in the correct format, backed by the
guests vga memory.  So this ...

> +#ifdef TARGET_WORDS_BIGENDIAN
> +static bool vga_is_be = true;
> +#else
> +static bool vga_is_be = false;
> +#endif
> +
> +void vga_set_big_endian(bool be)
> +{
> +    vga_is_be = be;
> +}
> +
> +static inline bool vga_is_big_endian(VGACommonState *s)
> +{
> +    return vga_is_be;
> +}
> +
> +static inline bool vga_need_swap(VGACommonState *s, bool target_be)
> +{
> +#ifdef HOST_WORDS_BIGENDIAN
> +    return !target_be;
> +#else
> +    return target_be;
> +#endif
> +}

> @@ -1645,11 +1690,9 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
>      uint8_t *d;
>      uint32_t v, addr1, addr;
>      vga_draw_line_func *vga_draw_line;
> -#if defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
> -    static const bool byteswap = false;
> -#else
> -    static const bool byteswap = true;
> -#endif
> +    bool byteswap;
> +
> +    byteswap = vga_need_swap(s, vga_is_big_endian(s));
>  
>      full_update |= update_basic_params(s);
>  
> @@ -1691,7 +1734,8 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
>      if (s->line_offset != s->last_line_offset ||
>          disp_width != s->last_width ||
>          height != s->last_height ||
> -        s->last_depth != depth) {
> +        s->last_depth != depth ||
> +        s->last_bswap != byteswap) {
>          if (depth == 32 || (depth == 16 && !byteswap)) {
>              surface = qemu_create_displaysurface_from(disp_width,
>                      height, depth, s->line_offset,
> @@ -1707,6 +1751,7 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
>          s->last_height = height;
>          s->last_line_offset = s->line_offset;
>          s->last_depth = depth;
> +	s->last_bswap = byteswap;
>          full_update = 1;
>      } else if (is_buffer_shared(surface) &&
>                 (full_update || surface_data(surface) != s->vram_ptr

... is all you need to make things fly for the 32bpp case.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 10:19             ` Peter Maydell
@ 2014-06-17 10:57               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17 10:57 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexey Kardashevskiy, qemu-devel, Alexander Graf, Gerd Hoffmann,
	Paolo Bonzini, Greg Kurz

On Tue, 2014-06-17 at 11:19 +0100, Peter Maydell wrote:
> On 17 June 2014 11:09, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > On Tue, 2014-06-17 at 12:00 +0200, Greg Kurz wrote:
> >> There has been a discussion already about virtio endianness: relying on
> >> a guest system wide setting such as LPCR_ILE has been strongly rejected
> >> at the time... The consensus for virtio is "device endianness is the
> >> endianness of the CPU that does the reset" (hence MSR_LE for PPC).
> >
> > How on earth did anybody reach such a conclusion ? Of all the possible
> > options I can think of this is the one that makes the *less* sense !
> 
> Well, the right conclusion is "virtio should have specified
> endianness sanely, ie to be independent of the CPU". However
> we can't rewind time to make that decision correctly, so this
> is the best we can usefully do without breaking existing
> working guests.

Agreed about original breakage.

> It's absolutely a virtio-specific thing, though, given that
> virtio has the weird "endianness of the guest" semantics in
> it, so it's not a good model for anything else.
> 
> My personal opinion here is that device models should just
> have a fixed byte order, and the guest should deal with it
> (except in the cases where we're modelling real hardware
> which has a real config register for flipping byte order,
> in which case the answer is "work like that hardware").
> So I'm still really dubious about adding endian swapping to
> the VGA model at all. Why can't you just have the guest
> do the right thing?

I absolutely agree with you on the fixed byte order model.

Sadly, graphics carries a very long legacy of brokenness in that area
that we really can't fix easily.

X assumes at compile time pretty much a native byte order and expresses
color components locations as masks and shifts inside the N-bit "words"
based on the pixel BPP. There are ways to play with the masks and shifts
for 32bpp to work but 15/16 is always busted when the green gets split,
X really can't deal with it.

But that's the tip of the iceberg. Those assumptions about pixel formats
percolate all the way through the gfx stack, it's a real mess. For GL,
for example, an ARGB format is not A,R,G,B in memory in that order but
"ARGB" from MSB to LSB in the smallest word loaded that can encompass
the pixel size (Funnily enough, I heard Direct X got that right !) and
layers of code exist out there that just can't deal with "the other
way".

This is why graphics card have historically carried all sort of weird
byte swapping hardware, more or less working, trying to solve that at
the HW level because SW is basically a trainwreck :-)

Now, Egbert is still trying to make base X and qt5 at least somewhat
work with the reverse order at least for 32bpp, but that's mostly an
academic exercise at this stage.

So yes, you are absolutely right for the general case. But graphics
sadly has to be the exception to the rule.

Cheers,
Ben.

> thanks
> -- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 10:45 ` Gerd Hoffmann
@ 2014-06-17 11:08   ` Peter Maydell
  2014-06-17 11:24     ` Gerd Hoffmann
  2014-06-17 11:15   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 66+ messages in thread
From: Peter Maydell @ 2014-06-17 11:08 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel, Paolo Bonzini

On 17 June 2014 11:45, Gerd Hoffmann <kraxel@redhat.com> wrote:
> Benjamin Herrenschmidt wrote:
>> What do you prefer ?
>
> Let pixman handle it?  Well, except that pixman can't handle byteswapped
> 16bpp formats.  Too bad :(
>
> In any case cleaning up the whole mess is overdue and doing that
> beforehand should simplify the job alot.  For quite some time qemu uses
> 32bpp internally no matter what, so the functions for converting the
> guest's framebuffer into anything but 32bpp should be dead code.

Slight tangent, but is there an example in-tree of a framebuffer
model which uses pixman for the pixel conversion? The -template.h
stuff looks quite ugly, not to mention repetitive across devices,
but I don't really feel confident about trying to update existing
ARM devices (eg the pl110) without a sample to use as inspiration...

thanks
-- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 10:45 ` Gerd Hoffmann
  2014-06-17 11:08   ` Peter Maydell
@ 2014-06-17 11:15   ` Benjamin Herrenschmidt
  2014-06-17 11:57     ` Gerd Hoffmann
  1 sibling, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17 11:15 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Tue, 2014-06-17 at 12:45 +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > The core issue stems from the fact that graphic stacks including X and a
> > number of other components along the way are terminally broken if the
> > framebuffer endianness doesn't match the guest endianness. We can
> > discuss the historical reasons for that fact if you wish and we can
> > discuss why it is almost unfixable separately.
> 
> "framebuffer" as in "linux framebuffer device" or as in "whatever memory
> area X11 renders into" ?

Linux fbdev has some ability to deal with foreign endian but at a
performance cost.

Xorg has much more serious issues, unless we start using something like
shadowfb and byteswap everything on the way out, but again, at a
significant performance cost.

> > This is an obvious problem for qemu on ppc64 since we now support having
> > either BE or LE guests.
> > 
> > At the moment, we have a broken patch in offb that makes the guest
> > sort-of limp along with the reversed framebuffer but that patch is
> > broken in several ways so I've reverted it upstream and X wouldn't work
> > anyway.
> 
> Side note: tried "CONFIG_DRM_BOCHS=m"?
> 
> In theory it should just work.  I've never actually tested it in
> practice on bigendian.

It should provided we have the right byte order in qemu for the guest
endian :-) But if it's the KMS driver I think it is yes, I did try an
earlier version, and it should work with the same problem that qemu
needs to change it's layout with the guest endian.

> Not sure it helps in any way with the byteorder issue though.  Probably
> not.  Could help with the "how do we flip the framebuffer byteorder"
> though, as we can simply define a new register and let the driver set
> it.

Right.

> > This is purely about discussing the changes to vga.c and vga-templates.h
> > to be able to handle the swapping in a runtime-decided way rather than
> > compile time.
> 
> > What do you prefer ?
> 
> Let pixman handle it?  Well, except that pixman can't handle byteswapped
> 16bpp formats.  Too bad :(

Right :) As I said, it's a trainwreck along the whole stack. I think my
second patch is reasonably non-invasive and shouldn't affect performance
of the existing path though.

> In any case cleaning up the whole mess is overdue and doing that
> beforehand should simplify the job alot.  For quite some time qemu uses
> 32bpp internally no matter what, so the functions for converting the
> guest's framebuffer into anything but 32bpp should be dead code.

I'm happy for that to happen upstream though I think I might go with my
current hack for powerkvm so we have a more immediate solution for
ubuntu and possibly sles12.

> And as long as the guest uses 32bpp too there is nothing converted
> anyway, we just setup pixman image in the correct format, backed by the
> guests vga memory.  So this ...

pixman byteswaps 32bpp ? Good, I'd rather leave the work to it since it
has vector accelerations and other similar niceties which would make it
a better place than qemu.

However we still need to deal with 15/16bpp guest side fb

> > @@ -1645,11 +1690,9 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
> >      uint8_t *d;
> >      uint32_t v, addr1, addr;
> >      vga_draw_line_func *vga_draw_line;
> > -#if defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
> > -    static const bool byteswap = false;
> > -#else
> > -    static const bool byteswap = true;
> > -#endif
> > +    bool byteswap;
> > +
> > +    byteswap = vga_need_swap(s, vga_is_big_endian(s));
> >  
> >      full_update |= update_basic_params(s);
> >  
> > @@ -1691,7 +1734,8 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
> >      if (s->line_offset != s->last_line_offset ||
> >          disp_width != s->last_width ||
> >          height != s->last_height ||
> > -        s->last_depth != depth) {
> > +        s->last_depth != depth ||
> > +        s->last_bswap != byteswap) {
> >          if (depth == 32 || (depth == 16 && !byteswap)) {
> >              surface = qemu_create_displaysurface_from(disp_width,
> >                      height, depth, s->line_offset,
> > @@ -1707,6 +1751,7 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
> >          s->last_height = height;
> >          s->last_line_offset = s->line_offset;
> >          s->last_depth = depth;
> > +	s->last_bswap = byteswap;
> >          full_update = 1;
> >      } else if (is_buffer_shared(surface) &&
> >                 (full_update || surface_data(surface) != s->vram_ptr
> 
> ... is all you need to make things fly for the 32bpp case.

Provider we also give pixman the right pixel format for "reverse endian"
which we don't do today and I yet have to investigate it :-) The pixmap
stuff in qemu to be honest is news to me, last time I looked it was SDL
and hand made vnc. Is the vnc server going though pixman as well ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 11:08   ` Peter Maydell
@ 2014-06-17 11:24     ` Gerd Hoffmann
  2014-06-17 11:42       ` Peter Maydell
  0 siblings, 1 reply; 66+ messages in thread
From: Gerd Hoffmann @ 2014-06-17 11:24 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel, Paolo Bonzini

  Hi,

> > Let pixman handle it?  Well, except that pixman can't handle byteswapped
> > 16bpp formats.  Too bad :(

> Slight tangent, but is there an example in-tree of a framebuffer
> model which uses pixman for the pixel conversion? The -template.h
> stuff looks quite ugly, not to mention repetitive across devices,
> but I don't really feel confident about trying to update existing
> ARM devices (eg the pl110) without a sample to use as inspiration...

I'm not aware of any.

pl110 looks easy enough, so maybe I should just go convert it (and
sprinkle in comments) so there actually is a good example to follow.
Which arm board has one?  Suggestions on how to test it?

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 10:09           ` Benjamin Herrenschmidt
  2014-06-17 10:19             ` Peter Maydell
@ 2014-06-17 11:40             ` Greg Kurz
  2014-06-17 11:53               ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 66+ messages in thread
From: Greg Kurz @ 2014-06-17 11:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Alexey Kardashevskiy, qemu-devel, Alexander Graf,
	Gerd Hoffmann, Paolo Bonzini

On Tue, 17 Jun 2014 20:09:18 +1000
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Tue, 2014-06-17 at 12:00 +0200, Greg Kurz wrote:
> > There has been a discussion already about virtio endianness: relying on
> > a guest system wide setting such as LPCR_ILE has been strongly rejected
> > at the time... The consensus for virtio is "device endianness is the
> > endianness of the CPU that does the reset" (hence MSR_LE for PPC).
> 
> How on earth did anybody reach such a conclusion ? Of all the possible
> options I can think of this is the one that makes the *less* sense !
> 

This conclusion was reached while discussing dump support where we
use LPCR_ILE, and I wanted to add a common helper to be used by
virtio. Please find the thread in the link below:

https://lists.nongnu.org/archive/html/qemu-devel/2014-05/msg00415.html

Can you elaborate on the "less sense" opinion please ?

> Cheers,
> Ben.
> 
> 

Thanks.
-- 
Gregory Kurz                                     kurzgreg@fr.ibm.com
                                                 gkurz@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 11:24     ` Gerd Hoffmann
@ 2014-06-17 11:42       ` Peter Maydell
  2014-06-19 12:33         ` Gerd Hoffmann
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Maydell @ 2014-06-17 11:42 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel, Paolo Bonzini

On 17 June 2014 12:24, Gerd Hoffmann <kraxel@redhat.com> wrote:
> pl110 looks easy enough, so maybe I should just go convert it (and
> sprinkle in comments) so there actually is a good example to follow.
> Which arm board has one?  Suggestions on how to test it?

It's present in a number of them; the simplest one to test is
probably versatilepb, because you can get some prebuilt images
from Aurelien's website:
http://people.debian.org/~aurel32/qemu/armel/

thanks
-- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 11:40             ` Greg Kurz
@ 2014-06-17 11:53               ` Benjamin Herrenschmidt
  2014-06-17 12:05                 ` Peter Maydell
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17 11:53 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Peter Maydell, Alexey Kardashevskiy, qemu-devel, Alexander Graf,
	Gerd Hoffmann, Paolo Bonzini

On Tue, 2014-06-17 at 13:40 +0200, Greg Kurz wrote:
> This conclusion was reached while discussing dump support where we
> use LPCR_ILE, and I wanted to add a common helper to be used by
> virtio. Please find the thread in the link below:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-05/msg00415.html
> 
> Can you elaborate on the "less sense" opinion please ?

I can't fathom how the endianness of a specific CPU at the point of the
reset makes any sense vs. the overall endianness of the system.

But this is all very theorical, fact is they will always match on
powerpc, so I don't care that much :)

In any case, for VGA, the right long term approach is a register in
the adapter. Though we might want to keep the trick of forcing it when
a PAPR guest does the H_SET_MODE hcall for general compatibility with
existing guests, after all it's a paravirt interface, we can have side
effects like that....

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 11:15   ` Benjamin Herrenschmidt
@ 2014-06-17 11:57     ` Gerd Hoffmann
  2014-06-17 21:32       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Gerd Hoffmann @ 2014-06-17 11:57 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

  Hi,

> > Let pixman handle it?  Well, except that pixman can't handle byteswapped
> > 16bpp formats.  Too bad :(
> 
> Right :) As I said, it's a trainwreck along the whole stack. I think my
> second patch is reasonably non-invasive and shouldn't affect performance
> of the existing path though.

Yep, that looked sane on a quick scan.

> > And as long as the guest uses 32bpp too there is nothing converted
> > anyway, we just setup pixman image in the correct format, backed by the
> > guests vga memory.  So this ...
> 
> pixman byteswaps 32bpp ? Good, I'd rather leave the work to it since it
> has vector accelerations and other similar niceties which would make it
> a better place than qemu.

Indeed, that is the reason why I've made qemu start using pixman ;)

> However we still need to deal with 15/16bpp guest side fb

Yes.

> Provider we also give pixman the right pixel format for "reverse endian"
> which we don't do today and I yet have to investigate it :-)

Oh.  Check ui/qemu-pixman.c.  Probably something goes wrong when
converting qemu's PixelFormat into a pixman format.

> The pixmap
> stuff in qemu to be honest is news to me, last time I looked it was SDL
> and hand made vnc. Is the vnc server going though pixman as well ?

Ok, it is supposed to work this way:

The virtual graphics card creates a DisplaySurface, which is a pixman
image under the hood.  There are basically two ways to do that:

  (1) Use qemu_create_displaysurface().  Allocates host memory.
      Returns 32bpp framebuffer in host byte order.  virtual graphics
      card is supposed to convert the guests framebuffer into that
      format.
  (2) Use qemu_create_displaysurface_from().  DisplaySurface (and pixman
      image) is backed by guest display memory then.  pixman format
      obviously must match the guests framebuffer format.

The ui (gtk / sdl / vnc / screendump via monitor) is supposed to deal
with whatever it gets.  Typically the ui checks whenever it can use the
format directly, and if not it converts using pixman (see gd_switch in
ui/gtk.c for example).  vnc and screendump use pixman too.  sdl feeds
SDL_CreateRGBSurfaceFrom with the shifts of the pixelformat instead.

HTH,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 11:53               ` Benjamin Herrenschmidt
@ 2014-06-17 12:05                 ` Peter Maydell
  0 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2014-06-17 12:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, qemu-devel, Alexander Graf, Gerd Hoffmann,
	Paolo Bonzini, Greg Kurz

On 17 June 2014 12:53, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> On Tue, 2014-06-17 at 13:40 +0200, Greg Kurz wrote:
>> This conclusion was reached while discussing dump support where we
>> use LPCR_ILE, and I wanted to add a common helper to be used by
>> virtio. Please find the thread in the link below:
>>
>> https://lists.nongnu.org/archive/html/qemu-devel/2014-05/msg00415.html
>>
>> Can you elaborate on the "less sense" opinion please ?
>
> I can't fathom how the endianness of a specific CPU at the point of the
> reset makes any sense vs. the overall endianness of the system.
>
> But this is all very theorical, fact is they will always match on
> powerpc, so I don't care that much :)

Well, the point is that "overall endianness of the system"
is hopelessly ill-defined. Does it mean "the endianness that
the CPU is currently configured for" (bad idea if you allow
guest userspace to be LE while guest kernel is BE), "the endianness
that the CPU is currently configured to take exceptions in"
(probably more likely to be the guest kernel endianness, but
when do you sample this value, and what do you do on architectures
where there's no such concept?), or "the endianness of the
external-to-the-CPU bus" (solidly defined but probably not
what you wanted since that means ppcle would still be defined
as 'big-endian')?

What we ended up with was deciding that the only point you
could reasonably sample the CPU state to figure out what
endianness the guest was actually in would be the point
where it actually first touched the virtio device, ie when
it resets it by prodding the relevant registers. That gets
you compatibility with existing guest kernels and avoids
weird issues if you have a BE bootloader that starts a
LE kernel or vice-versa.

As you say, having a register on the device to let the guest
explicitly set the endianness it wants is much nicer and
avoids weird backdoor "look at the CPU state" behaviour.
Unfortunately we're stuck with supporting the existing
virtio kernel drivers.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 11:57     ` Gerd Hoffmann
@ 2014-06-17 21:32       ` Benjamin Herrenschmidt
  2014-06-17 22:12         ` Peter Maydell
  2014-06-18 11:18         ` Gerd Hoffmann
  0 siblings, 2 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17 21:32 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Tue, 2014-06-17 at 13:57 +0200, Gerd Hoffmann wrote:

> Ok, it is supposed to work this way:
> 
> The virtual graphics card creates a DisplaySurface, which is a pixman
> image under the hood.  There are basically two ways to do that:
> 
>   (1) Use qemu_create_displaysurface().  Allocates host memory.
>       Returns 32bpp framebuffer in host byte order.  virtual graphics
>       card is supposed to convert the guests framebuffer into that
>       format.
>   (2) Use qemu_create_displaysurface_from().  DisplaySurface (and pixman
>       image) is backed by guest display memory then.  pixman format
>       obviously must match the guests framebuffer format.
> 
> The ui (gtk / sdl / vnc / screendump via monitor) is supposed to deal
> with whatever it gets.  Typically the ui checks whenever it can use the
> format directly, and if not it converts using pixman (see gd_switch in
> ui/gtk.c for example).  vnc and screendump use pixman too.  sdl feeds
> SDL_CreateRGBSurfaceFrom with the shifts of the pixelformat instead.

Allright, I'll play with this, it makes sense to get rid of most of
the conversion functions if we know the surface will always be 32-bit
indeed.

I think we can work step by step:

 - Remove all the non-32bpp targets from vga_template

 - Add the 32-bit target only version of my byteswap

This gives us something that works quickly. We also need to focus on how
to trigger the endian mode etc...

Then we can

 - Look into doing at least some of the conversions in pixman

Which would be a subsequent optimization.

Any objection ? I'll spend some time in the next few days getting myself
a bit more familiar with that pixman stuff (I need to fix the ppc vector
stuff for LE in there anyway) and come up with patches.

Another step I hinted to earlier is how we do the actual endian
transition. My idea is to add a new register to the BOCHS DISPI VBE
list and bump the PCI revision ID to advertise its existence.

Additionally, I wouldn't mind of we did a quick "trick" equivalent (but
cleaner) to what I did in my patch which is when the pseries guest calls
the hypervisor call to change the interrupt endian mode, we notify VGA
and switch its endian mode, so we work "by default" with kernels not
updated to know about that register. But this is open for debate. It's
somewhat "acceptable" in the context of our hypercall being a
"paravirtualized" interface, so it can be argued that the hypercall
poking at the VGA chip is equivalent to some FW doing so :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 21:32       ` Benjamin Herrenschmidt
@ 2014-06-17 22:12         ` Peter Maydell
  2014-06-17 22:55           ` Benjamin Herrenschmidt
  2014-06-18 11:18         ` Gerd Hoffmann
  1 sibling, 1 reply; 66+ messages in thread
From: Peter Maydell @ 2014-06-17 22:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf,
	Gerd Hoffmann, qemu-devel

On 17 June 2014 22:32, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> Additionally, I wouldn't mind of we did a quick "trick" equivalent (but
> cleaner) to what I did in my patch which is when the pseries guest calls
> the hypervisor call to change the interrupt endian mode, we notify VGA
> and switch its endian mode, so we work "by default" with kernels not
> updated to know about that register. But this is open for debate. It's
> somewhat "acceptable" in the context of our hypercall being a
> "paravirtualized" interface, so it can be argued that the hypercall
> poking at the VGA chip is equivalent to some FW doing so :-)

I'm definitely against this. Have your guest change the behaviour
of the VGA device by explicitly prodding the VGA device, not by
back-door side-effects of something else that it happens to do
on one particular guest for one particular architecture, please.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 22:12         ` Peter Maydell
@ 2014-06-17 22:55           ` Benjamin Herrenschmidt
  2014-06-17 23:05             ` Alexander Graf
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17 22:55 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf,
	Gerd Hoffmann, qemu-devel

On Tue, 2014-06-17 at 23:12 +0100, Peter Maydell wrote:
> On 17 June 2014 22:32, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> > Additionally, I wouldn't mind of we did a quick "trick" equivalent (but
> > cleaner) to what I did in my patch which is when the pseries guest calls
> > the hypervisor call to change the interrupt endian mode, we notify VGA
> > and switch its endian mode, so we work "by default" with kernels not
> > updated to know about that register. But this is open for debate. It's
> > somewhat "acceptable" in the context of our hypercall being a
> > "paravirtualized" interface, so it can be argued that the hypercall
> > poking at the VGA chip is equivalent to some FW doing so :-)
> 
> I'm definitely against this. Have your guest change the behaviour
> of the VGA device by explicitly prodding the VGA device, not by
> back-door side-effects of something else that it happens to do
> on one particular guest for one particular architecture, please.

But that means modifying guests ... obviously you live in a world where
you don't have to live with existing enterprise distros backward
compatibility :-)

I understand the reluctance against backdoor side effects in general,
but as I said, this is a hypervisor call that basically says "change
system endianness". It does make some amount of sense to have in that
case the hypervisor (which here is qemu) go adjust the endianness
setting in some devices as well as the cores.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 22:55           ` Benjamin Herrenschmidt
@ 2014-06-17 23:05             ` Alexander Graf
  2014-06-17 23:16               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Alexander Graf @ 2014-06-17 23:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Peter Maydell
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Gerd Hoffmann, qemu-devel


On 18.06.14 00:55, Benjamin Herrenschmidt wrote:
> On Tue, 2014-06-17 at 23:12 +0100, Peter Maydell wrote:
>> On 17 June 2014 22:32, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>>> Additionally, I wouldn't mind of we did a quick "trick" equivalent (but
>>> cleaner) to what I did in my patch which is when the pseries guest calls
>>> the hypervisor call to change the interrupt endian mode, we notify VGA
>>> and switch its endian mode, so we work "by default" with kernels not
>>> updated to know about that register. But this is open for debate. It's
>>> somewhat "acceptable" in the context of our hypercall being a
>>> "paravirtualized" interface, so it can be argued that the hypercall
>>> poking at the VGA chip is equivalent to some FW doing so :-)
>> I'm definitely against this. Have your guest change the behaviour
>> of the VGA device by explicitly prodding the VGA device, not by
>> back-door side-effects of something else that it happens to do
>> on one particular guest for one particular architecture, please.
> But that means modifying guests ... obviously you live in a world where
> you don't have to live with existing enterprise distros backward
> compatibility :-)
>
> I understand the reluctance against backdoor side effects in general,
> but as I said, this is a hypervisor call that basically says "change
> system endianness". It does make some amount of sense to have in that
> case the hypervisor (which here is qemu) go adjust the endianness
> setting in some devices as well as the cores.

Semantically the H_SET_MODE(LE) hypercall is on the same level as PSCI. 
Some code somewhere (Trustzone in the guest on ARM or QEMU) runs some 
code to "do magic".

So for the sake of the discussion imagine this code would live inside of 
guest context. It can't for us, as hypercalls always trap into KVM or 
QEMU, but semantically the code shouldn't be able to do too much more 
than anything coming from the guest should be able to do.

Imagine the code that runs here loops through all PCI devices, looks for 
BOCHS VGA adapters and pokes that endian register.

This is essentially the interface that Ben is suggesting here.

Given that, the prerequisite to that interface is to have a guest 
exposed register to flip the endianness in the first place. I think it 
makes most sense to settle on that register first, then talk about 
potential backwards compat hacks that we could do on hypercalls with 
that register.

Of course, the side effect that H_SET_MODE(LE) flips the VGA endianness 
needs to be documented in sPAPR as well if we want to go down that path. 
sPAPR is the hypercall specification we implement with -M pseries.


Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 23:05             ` Alexander Graf
@ 2014-06-17 23:16               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-17 23:16 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Alexey Kardashevskiy, Peter Maydell, qemu-devel, Gerd Hoffmann,
	Paolo Bonzini

On Wed, 2014-06-18 at 01:05 +0200, Alexander Graf wrote:
> Given that, the prerequisite to that interface is to have a guest 
> exposed register to flip the endianness in the first place. I think it 
> makes most sense to settle on that register first, then talk about 
> potential backwards compat hacks that we could do on hypercalls with 
> that register.
> 
> Of course, the side effect that H_SET_MODE(LE) flips the VGA endianness 
> needs to be documented in sPAPR as well if we want to go down that path. 
> sPAPR is the hypercall specification we implement with -M pseries.

Yeah well, let's first get the base code done as discussed with Gerd
with the register, and we can keep that "optimization" separate and talk
again about it later.

I'll come up with patches in the next few days.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 21:32       ` Benjamin Herrenschmidt
  2014-06-17 22:12         ` Peter Maydell
@ 2014-06-18 11:18         ` Gerd Hoffmann
  2014-06-18 13:03           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 66+ messages in thread
From: Gerd Hoffmann @ 2014-06-18 11:18 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

  Hi,

> Any objection ? I'll spend some time in the next few days getting myself
> a bit more familiar with that pixman stuff (I need to fix the ppc vector
> stuff for LE in there anyway) and come up with patches.

Sounds good.

Have a look at
https://www.kraxel.org/cgit/qemu/log/?h=rebase/console-wip

Has some console patches to make pixman the primary format specifier
(instead of pixman_format_code and PixelFormat living side-by-side).

That should make the control flow a bit easier to follow.  Also gives
some more flexibility to qemu_create_displaysurface_from() as it accepts
pixman format codes directly now.

/me goes continue look at pl110.  On a second look it isn't as easy as
it seemed initially.  But as it supports all sorts of funky formats it
serves well as example and also nicely shows what would be useful as
generic infrastructure ...

> Another step I hinted to earlier is how we do the actual endian
> transition. My idea is to add a new register to the BOCHS DISPI VBE
> list and bump the PCI revision ID to advertise its existence.

The dispi interface has a versioned id too (VBE_DISPI_IDx) which we
could use too.  Which makes sense IMO if we add the register to the
bochs dispi registers.

We could also place the register in pci config space, then indicate it
exists by either creating it a vendor pci capability or by pci revision.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-18 11:18         ` Gerd Hoffmann
@ 2014-06-18 13:03           ` Benjamin Herrenschmidt
  2014-06-19  9:36             ` Gerd Hoffmann
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-18 13:03 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Wed, 2014-06-18 at 13:18 +0200, Gerd Hoffmann wrote:
> The dispi interface has a versioned id too (VBE_DISPI_IDx) which we
> could use too.  Which makes sense IMO if we add the register to the
> bochs dispi registers.
> 
> We could also place the register in pci config space, then indicate it
> exists by either creating it a vendor pci capability or by pci
> revision.

I'm not fan of the config space option .. I don't like vendor specific
stuff in PCI config space. I prefer adding a DISPI VBE register since
we never compile that out (despite the ifdef's being still around) do
we ?

I've tried to engage with the Bochs folks on their mailing list to make
sure they agree in principle at least but have had no reply yet.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-18 13:03           ` Benjamin Herrenschmidt
@ 2014-06-19  9:36             ` Gerd Hoffmann
  2014-06-21  5:37               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Gerd Hoffmann @ 2014-06-19  9:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Mi, 2014-06-18 at 23:03 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2014-06-18 at 13:18 +0200, Gerd Hoffmann wrote:
> > The dispi interface has a versioned id too (VBE_DISPI_IDx) which we
> > could use too.  Which makes sense IMO if we add the register to the
> > bochs dispi registers.
> > 
> > We could also place the register in pci config space, then indicate it
> > exists by either creating it a vendor pci capability or by pci
> > revision.
> 
> I'm not fan of the config space option .. I don't like vendor specific
> stuff in PCI config space. I prefer adding a DISPI VBE register since
> we never compile that out (despite the ifdef's being still around) do
> we ?

Yes, the #ifdefs are a leftover, it is always compiled in.

> I've tried to engage with the Bochs folks on their mailing list to make
> sure they agree in principle at least but have had no reply yet.

If we can get to an agreement here -- perfectly fine.

If not -- then I prefer to play save and not use a vbe register to avoid
ending up with two incompatible bochs interface revisions.  If you don't
like the pci config space option -- the mmio bar has plenty of free
space left ;)

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-17 11:42       ` Peter Maydell
@ 2014-06-19 12:33         ` Gerd Hoffmann
  2014-06-19 13:01           ` Paolo Bonzini
  0 siblings, 1 reply; 66+ messages in thread
From: Gerd Hoffmann @ 2014-06-19 12:33 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Di, 2014-06-17 at 12:42 +0100, Peter Maydell wrote:
> On 17 June 2014 12:24, Gerd Hoffmann <kraxel@redhat.com> wrote:
> > pl110 looks easy enough, so maybe I should just go convert it (and
> > sprinkle in comments) so there actually is a good example to follow.
> > Which arm board has one?  Suggestions on how to test it?
> 
> It's present in a number of them; the simplest one to test is
> probably versatilepb, because you can get some prebuilt images
> from Aurelien's website:
> http://people.debian.org/~aurel32/qemu/armel/

Thanks.  The linux framebuffer driver uses only the little endian modes
though.  And it looks like it doesn't support 32bpp.

Is there some way to test the bigendian modes too?

Also:  How does 16bpp on bigendian looks like?  According to both specs
and emulation code the pl110 always swaps 4 bytes (32bit words).  As far
I know 16bpp on bigendian usually is an array of 16bit words (each in
bigendian order) though.  Which implies even with pl110 in bigendian
mode on a bigendian host I can't feed the data into pixman as-is because
I'll end up with swapped pixels then.  Correct?

I'm thinking about splitting the conversion (for truecolor modes, still
need to think about how to do paletted modes best) into two steps:
First byteswap to bring the data into a format pixman can handle, then
feed into pixman for further conversion.  Which probably comes with a
performance penalty (but maybe not due to pixman performing better than
our self-made conversion code).  It will only hit the more unusual guest
framebuffer formats which I can't feed into pixman directly.  Acceptable
or not?

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-19 12:33         ` Gerd Hoffmann
@ 2014-06-19 13:01           ` Paolo Bonzini
  0 siblings, 0 replies; 66+ messages in thread
From: Paolo Bonzini @ 2014-06-19 13:01 UTC (permalink / raw)
  To: Gerd Hoffmann, Peter Maydell
  Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel

Il 19/06/2014 14:33, Gerd Hoffmann ha scritto:
>
> I'm thinking about splitting the conversion (for truecolor modes, still
> need to think about how to do paletted modes best) into two steps:
> First byteswap to bring the data into a format pixman can handle, then
> feed into pixman for further conversion.  Which probably comes with a
> performance penalty (but maybe not due to pixman performing better than
> our self-made conversion code).  It will only hit the more unusual guest
> framebuffer formats which I can't feed into pixman directly.  Acceptable
> or not?

It's probably not going to have a performance penalty if you vectorize 
the word-swap.  (Which can be done later).

Paolo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-19  9:36             ` Gerd Hoffmann
@ 2014-06-21  5:37               ` Benjamin Herrenschmidt
  2014-06-22  2:10                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-21  5:37 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Thu, 2014-06-19 at 11:36 +0200, Gerd Hoffmann wrote:
> If not -- then I prefer to play save and not use a vbe register to avoid
> ending up with two incompatible bochs interface revisions.  If you don't
> like the pci config space option -- the mmio bar has plenty of free
> space left ;)

I'll find something :-)

Another advantage I realize in going down that path is that's yet another
use of target endian removed from hw/ which gets us a little step closer
to having to make the whole hw/ stuff target neutral.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-21  5:37               ` Benjamin Herrenschmidt
@ 2014-06-22  2:10                 ` Benjamin Herrenschmidt
  2014-06-23  1:13                   ` Benjamin Herrenschmidt
  2014-06-30 11:14                   ` Gerd Hoffmann
  0 siblings, 2 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-22  2:10 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Sat, 2014-06-21 at 15:37 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2014-06-19 at 11:36 +0200, Gerd Hoffmann wrote:
> > If not -- then I prefer to play save and not use a vbe register to avoid
> > ending up with two incompatible bochs interface revisions.  If you don't
> > like the pci config space option -- the mmio bar has plenty of free
> > space left ;)
> 
> I'll find something :-)
> 
> Another advantage I realize in going down that path is that's yet another
> use of target endian removed from hw/ which gets us a little step closer
> to having to make the whole hw/ stuff target neutral.

Ok, I hit a (small) problem ...

So I've cut out a ton of cruft form vga (and a bit from cirrus). However
there is one bit that worries me: cirrus cursor emulation.

>From what I can tell, we only ever call the cursor drawing callback on
non-shared surfaces. Should I deduce that the HW cursor emulation simply
doesn't work when using shared surfaces ? Or is there another path I
have missed to handle it ?

It makes sense in a way since we never want to draw the cursor in the
shared framebuffer, but we could probably handle it by having a small
separate pixman surface which we paint on top of the final render or
something like that (or link to a host side HW cursor if any) but I
can't quite see anything in the code.

I'm asking because I'm increasing the cases where we share the surface,
from 16 non-swapped and 32 to 15 and 16 non-swapped, 24 and 32bpp, which
means that practically speaking the HW cursor painting code in cirrus
will never be called unless we are in 8bpp, reverse endian, or force
surface unsharing somewhat.

Or did I miss something ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-22  2:10                 ` Benjamin Herrenschmidt
@ 2014-06-23  1:13                   ` Benjamin Herrenschmidt
  2014-06-30 11:14                   ` Gerd Hoffmann
  1 sibling, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-23  1:13 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Sun, 2014-06-22 at 12:10 +1000, Benjamin Herrenschmidt wrote:
> On Sat, 2014-06-21 at 15:37 +1000, Benjamin Herrenschmidt wrote:
> > On Thu, 2014-06-19 at 11:36 +0200, Gerd Hoffmann wrote:
> > > If not -- then I prefer to play save and not use a vbe register to avoid
> > > ending up with two incompatible bochs interface revisions.  If you don't
> > > like the pci config space option -- the mmio bar has plenty of free
> > > space left ;)
> > 
> > I'll find something :-)
> > 
> > Another advantage I realize in going down that path is that's yet another
> > use of target endian removed from hw/ which gets us a little step closer
> > to having to make the whole hw/ stuff target neutral.
> 
> Ok, I hit a (small) problem ...

BTW, here's my current work:

https://github.com/ozbenh/qemu/commits/vga-work

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-22  2:10                 ` Benjamin Herrenschmidt
  2014-06-23  1:13                   ` Benjamin Herrenschmidt
@ 2014-06-30 11:14                   ` Gerd Hoffmann
  2014-06-30 12:32                     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 66+ messages in thread
From: Gerd Hoffmann @ 2014-06-30 11:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

  Hi,

> From what I can tell, we only ever call the cursor drawing callback on
> non-shared surfaces. Should I deduce that the HW cursor emulation simply
> doesn't work when using shared surfaces ? Or is there another path I
> have missed to handle it ?

Hmm.  Looks like hw-cursor-on-shared-surface broken indeed.  Need to dig
out a guest which actually uses it & go figure when testing your patch
series ...

> It makes sense in a way since we never want to draw the cursor in the
> shared framebuffer, but we could probably handle it by having a small
> separate pixman surface which we paint on top of the final render or
> something like that (or link to a host side HW cursor if any) but I
> can't quite see anything in the code.

There is infrastructure to inform the ui code how the cursor should look
like (grep for "dpy_cursor_define"), so we actually can use the hosts
hardware cursor support.  cirrus doesn't use it though.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-30 11:14                   ` Gerd Hoffmann
@ 2014-06-30 12:32                     ` Benjamin Herrenschmidt
  2014-07-01  8:20                       ` Gerd Hoffmann
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-30 12:32 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Mon, 2014-06-30 at 13:14 +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > From what I can tell, we only ever call the cursor drawing callback on
> > non-shared surfaces. Should I deduce that the HW cursor emulation simply
> > doesn't work when using shared surfaces ? Or is there another path I
> > have missed to handle it ?
> 
> Hmm.  Looks like hw-cursor-on-shared-surface broken indeed.  Need to dig
> out a guest which actually uses it & go figure when testing your patch
> series ...

I don't think I broke it much more than it already was but then I
couldn't find a guest using it. I've tried the plain cirrus DDX in X and
it didn't have any problem... maybe windows ?

> > It makes sense in a way since we never want to draw the cursor in the
> > shared framebuffer, but we could probably handle it by having a small
> > separate pixman surface which we paint on top of the final render or
> > something like that (or link to a host side HW cursor if any) but I
> > can't quite see anything in the code.
> 
> There is infrastructure to inform the ui code how the cursor should look
> like (grep for "dpy_cursor_define"), so we actually can use the hosts
> hardware cursor support.  cirrus doesn't use it though.

Right. A quick fix would be to add a flag to force always using a shadow
surface and set it in cirrus ... I'm not sure anybody will notice the
performance difference.

My main problem is whatever more complicated than that would require
testing and I yet have to find something to run in the guest that uses
that cirrus HW cursor.

Cheers,
Ben.

> cheers,
>   Gerd
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-06-30 12:32                     ` Benjamin Herrenschmidt
@ 2014-07-01  8:20                       ` Gerd Hoffmann
  2014-07-01  8:26                         ` Alexander Graf
  2014-07-01  9:33                         ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 66+ messages in thread
From: Gerd Hoffmann @ 2014-07-01  8:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Mo, 2014-06-30 at 22:32 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2014-06-30 at 13:14 +0200, Gerd Hoffmann wrote:
> >   Hi,
> > 
> > > From what I can tell, we only ever call the cursor drawing callback on
> > > non-shared surfaces. Should I deduce that the HW cursor emulation simply
> > > doesn't work when using shared surfaces ? Or is there another path I
> > > have missed to handle it ?
> > 
> > Hmm.  Looks like hw-cursor-on-shared-surface broken indeed.  Need to dig
> > out a guest which actually uses it & go figure when testing your patch
> > series ...
> 
> I don't think I broke it much more than it already was but then I
> couldn't find a guest using it. I've tried the plain cirrus DDX in X and
> it didn't have any problem... maybe windows ?

Nope.  windows xp doesn't use it.  Anything newer doesn't ship with
cirrus drivers any more (and uses vesa bios support).

Looking at the code the cirrus hardware cursor supports two colors only
(and some funky xor mode).  Guess it simply doesn't cut it as you can't
have your cursors drop shadows with that, so guests are ignoring it.

> Right. A quick fix would be to add a flag to force always using a shadow
> surface and set it in cirrus ... I'm not sure anybody will notice the
> performance difference.

I suspect we can rip out hw cursor emulation and nobody will notice the
difference either ...

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  8:20                       ` Gerd Hoffmann
@ 2014-07-01  8:26                         ` Alexander Graf
  2014-07-01  8:31                           ` Paolo Bonzini
                                             ` (2 more replies)
  2014-07-01  9:33                         ` Benjamin Herrenschmidt
  1 sibling, 3 replies; 66+ messages in thread
From: Alexander Graf @ 2014-07-01  8:26 UTC (permalink / raw)
  To: Gerd Hoffmann, Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel


On 01.07.14 10:20, Gerd Hoffmann wrote:
> On Mo, 2014-06-30 at 22:32 +1000, Benjamin Herrenschmidt wrote:
>> On Mon, 2014-06-30 at 13:14 +0200, Gerd Hoffmann wrote:
>>>    Hi,
>>>
>>>>  From what I can tell, we only ever call the cursor drawing callback on
>>>> non-shared surfaces. Should I deduce that the HW cursor emulation simply
>>>> doesn't work when using shared surfaces ? Or is there another path I
>>>> have missed to handle it ?
>>> Hmm.  Looks like hw-cursor-on-shared-surface broken indeed.  Need to dig
>>> out a guest which actually uses it & go figure when testing your patch
>>> series ...
>> I don't think I broke it much more than it already was but then I
>> couldn't find a guest using it. I've tried the plain cirrus DDX in X and
>> it didn't have any problem... maybe windows ?
> Nope.  windows xp doesn't use it.  Anything newer doesn't ship with
> cirrus drivers any more (and uses vesa bios support).
>
> Looking at the code the cirrus hardware cursor supports two colors only
> (and some funky xor mode).  Guess it simply doesn't cut it as you can't
> have your cursors drop shadows with that, so guests are ignoring it.

Windows NT 4 might use it. I remember that I had issues running NT4 with 
Cirrus emulation a while back.

>
>> Right. A quick fix would be to add a flag to force always using a shadow
>> surface and set it in cirrus ... I'm not sure anybody will notice the
>> performance difference.
> I suspect we can rip out hw cursor emulation and nobody will notice the
> difference either ...

Very likely ;). Though I think we're better off keeping it around to 
make sure we're still compatible with ancient guests (Windows 3.1 might 
use it too). Making it slow however shouldn't make any difference at all.


Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  8:26                         ` Alexander Graf
@ 2014-07-01  8:31                           ` Paolo Bonzini
  2014-07-01  9:07                             ` Gerd Hoffmann
  2014-07-01  8:59                           ` Gerd Hoffmann
  2014-07-01  9:35                           ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 66+ messages in thread
From: Paolo Bonzini @ 2014-07-01  8:31 UTC (permalink / raw)
  To: Alexander Graf, Gerd Hoffmann, Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, qemu-devel

Il 01/07/2014 10:26, Alexander Graf ha scritto:
>>
>>> Right. A quick fix would be to add a flag to force always using a shadow
>>> surface and set it in cirrus ... I'm not sure anybody will notice the
>>> performance difference.
>> I suspect we can rip out hw cursor emulation and nobody will notice the
>> difference either ...
>
> Very likely ;). Though I think we're better off keeping it around to
> make sure we're still compatible with ancient guests (Windows 3.1 might
> use it too). Making it slow however shouldn't make any difference at all.

If you tell me what to look at, I legally own a Windows 98 CD (also NT4 
but I have to dig it out) and can test it later this week.

Paolo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  8:26                         ` Alexander Graf
  2014-07-01  8:31                           ` Paolo Bonzini
@ 2014-07-01  8:59                           ` Gerd Hoffmann
  2014-07-01  9:35                           ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 66+ messages in thread
From: Gerd Hoffmann @ 2014-07-01  8:59 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Alexey Kardashevskiy, qemu-devel, Paolo Bonzini

On Di, 2014-07-01 at 10:26 +0200, Alexander Graf wrote:
> On 01.07.14 10:20, Gerd Hoffmann wrote:
> > On Mo, 2014-06-30 at 22:32 +1000, Benjamin Herrenschmidt wrote:
> >> On Mon, 2014-06-30 at 13:14 +0200, Gerd Hoffmann wrote:
> >>>    Hi,
> >>>
> >>>>  From what I can tell, we only ever call the cursor drawing callback on
> >>>> non-shared surfaces. Should I deduce that the HW cursor emulation simply
> >>>> doesn't work when using shared surfaces ? Or is there another path I
> >>>> have missed to handle it ?
> >>> Hmm.  Looks like hw-cursor-on-shared-surface broken indeed.  Need to dig
> >>> out a guest which actually uses it & go figure when testing your patch
> >>> series ...
> >> I don't think I broke it much more than it already was but then I
> >> couldn't find a guest using it. I've tried the plain cirrus DDX in X and
> >> it didn't have any problem... maybe windows ?
> > Nope.  windows xp doesn't use it.  Anything newer doesn't ship with
> > cirrus drivers any more (and uses vesa bios support).
> >
> > Looking at the code the cirrus hardware cursor supports two colors only
> > (and some funky xor mode).  Guess it simply doesn't cut it as you can't
> > have your cursors drop shadows with that, so guests are ignoring it.
> 
> Windows NT 4 might use it. I remember that I had issues running NT4 with 
> Cirrus emulation a while back.

That could be it indeed.

> >> Right. A quick fix would be to add a flag to force always using a shadow
> >> surface and set it in cirrus ... I'm not sure anybody will notice the
> >> performance difference.
> > I suspect we can rip out hw cursor emulation and nobody will notice the
> > difference either ...
> 
> Very likely ;). Though I think we're better off keeping it around to 
> make sure we're still compatible with ancient guests (Windows 3.1 might 
> use it too). Making it slow however shouldn't make any difference at all.

Especially as we can make the force-shadow mode depend on the hw cursor
enable bit, so we don't have any difference for the common case.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  8:31                           ` Paolo Bonzini
@ 2014-07-01  9:07                             ` Gerd Hoffmann
  2014-07-01  9:19                               ` Paolo Bonzini
  0 siblings, 1 reply; 66+ messages in thread
From: Gerd Hoffmann @ 2014-07-01  9:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel

On Di, 2014-07-01 at 10:31 +0200, Paolo Bonzini wrote:
> Il 01/07/2014 10:26, Alexander Graf ha scritto:
> >>
> >>> Right. A quick fix would be to add a flag to force always using a shadow
> >>> surface and set it in cirrus ... I'm not sure anybody will notice the
> >>> performance difference.
> >> I suspect we can rip out hw cursor emulation and nobody will notice the
> >> difference either ...
> >
> > Very likely ;). Though I think we're better off keeping it around to
> > make sure we're still compatible with ancient guests (Windows 3.1 might
> > use it too). Making it slow however shouldn't make any difference at all.
> 
> If you tell me what to look at, I legally own a Windows 98 CD (also NT4 
> but I have to dig it out) and can test it later this week.

/me has win98 too, that doesn't boot after install though.

So if you can try nt4 that would be great.  Test is simple: switch into
16bpp mode (that one uses shared surface in git master), check whenever
the mouse pointer is present.

Bonus points for compiling with DEBUG_CIRRUS and checking the logs to
see whenever nt4 actually enables the hw cursor (sr registers, index
0x12, bit 0).

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  9:07                             ` Gerd Hoffmann
@ 2014-07-01  9:19                               ` Paolo Bonzini
  2014-07-01 11:15                                 ` Gerd Hoffmann
  0 siblings, 1 reply; 66+ messages in thread
From: Paolo Bonzini @ 2014-07-01  9:19 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel

Il 01/07/2014 11:07, Gerd Hoffmann ha scritto:
> On Di, 2014-07-01 at 10:31 +0200, Paolo Bonzini wrote:
>> Il 01/07/2014 10:26, Alexander Graf ha scritto:
>>>>
>>>>> Right. A quick fix would be to add a flag to force always using a shadow
>>>>> surface and set it in cirrus ... I'm not sure anybody will notice the
>>>>> performance difference.
>>>> I suspect we can rip out hw cursor emulation and nobody will notice the
>>>> difference either ...
>>>
>>> Very likely ;). Though I think we're better off keeping it around to
>>> make sure we're still compatible with ancient guests (Windows 3.1 might
>>> use it too). Making it slow however shouldn't make any difference at all.
>>
>> If you tell me what to look at, I legally own a Windows 98 CD (also NT4
>> but I have to dig it out) and can test it later this week.
>
> /me has win98 too, that doesn't boot after install though.

Tried TCG?  There is a timing loop that makes Win98 fail on too-fast 
computers.

> So if you can try nt4 that would be great.  Test is simple: switch into
> 16bpp mode (that one uses shared surface in git master), check whenever
> the mouse pointer is present.
>
> Bonus points for compiling with DEBUG_CIRRUS and checking the logs to
> see whenever nt4 actually enables the hw cursor (sr registers, index
> 0x12, bit 0).

Thanks!

Paolo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  8:20                       ` Gerd Hoffmann
  2014-07-01  8:26                         ` Alexander Graf
@ 2014-07-01  9:33                         ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-01  9:33 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf

On Tue, 2014-07-01 at 10:20 +0200, Gerd Hoffmann wrote:
> > Right. A quick fix would be to add a flag to force always using a shadow
> > surface and set it in cirrus ... I'm not sure anybody will notice the
> > performance difference.
> 
> I suspect we can rip out hw cursor emulation and nobody will notice the
> difference either ...

So I'd say let's leave it as-is with the known breakage. Forcing shadows
means bringing back the drawing/conversions routines for 24 & 32 which
would suck.

If somebody manages to find a guest that's affected, they can tell us and
I'll look into fixing it.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  8:26                         ` Alexander Graf
  2014-07-01  8:31                           ` Paolo Bonzini
  2014-07-01  8:59                           ` Gerd Hoffmann
@ 2014-07-01  9:35                           ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-01  9:35 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Gerd Hoffmann, qemu-devel

On Tue, 2014-07-01 at 10:26 +0200, Alexander Graf wrote:
> 
> Windows NT 4 might use it. I remember that I had issues running NT4 with 
> Cirrus emulation a while back.

Any location where one can find images legally ?

> >> Right. A quick fix would be to add a flag to force always using a shadow
> >> surface and set it in cirrus ... I'm not sure anybody will notice the
> >> performance difference.
> > I suspect we can rip out hw cursor emulation and nobody will notice the
> > difference either ...
> 
> Very likely ;). Though I think we're better off keeping it around to 
> make sure we're still compatible with ancient guests (Windows 3.1 might 
> use it too). Making it slow however shouldn't make any difference at all.

Making it slower isn't completely pretty either. The patch series remove
all the conversion routines for 24 and 32bpp, relying instead on pixman.
We'd have to bring them back (even if they are just memcpy's).

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01  9:19                               ` Paolo Bonzini
@ 2014-07-01 11:15                                 ` Gerd Hoffmann
  2014-07-01 11:23                                   ` Benjamin Herrenschmidt
  2014-07-01 12:06                                   ` Paolo Bonzini
  0 siblings, 2 replies; 66+ messages in thread
From: Gerd Hoffmann @ 2014-07-01 11:15 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel

  Hi,

> >> If you tell me what to look at, I legally own a Windows 98 CD (also NT4
> >> but I have to dig it out) and can test it later this week.
> >
> > /me has win98 too, that doesn't boot after install though.
> 
> Tried TCG?  There is a timing loop that makes Win98 fail on too-fast 
> computers.

TCG works.  Any workaround to make it fly with kvm too?

And win98 actually uses the hardware cursor.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01 11:15                                 ` Gerd Hoffmann
@ 2014-07-01 11:23                                   ` Benjamin Herrenschmidt
  2014-07-02  9:19                                     ` Benjamin Herrenschmidt
  2014-07-01 12:06                                   ` Paolo Bonzini
  1 sibling, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-01 11:23 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Tue, 2014-07-01 at 13:15 +0200, Gerd Hoffmann wrote:
> TCG works.  Any workaround to make it fly with kvm too?
> 
> And win98 actually uses the hardware cursor.

Ah ok. I was trying to install NT4 here but was having problems,
the screen would go yellow and cirrus would stop displaying anything
when trying to "test" the resolution (without my patches, actually with
a slightly older qemu too).

I was planning on digging more tomorrow, but I can try Win98 if that's
easier.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01 11:15                                 ` Gerd Hoffmann
  2014-07-01 11:23                                   ` Benjamin Herrenschmidt
@ 2014-07-01 12:06                                   ` Paolo Bonzini
  1 sibling, 0 replies; 66+ messages in thread
From: Paolo Bonzini @ 2014-07-01 12:06 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Alexey Kardashevskiy, Alexander Graf, qemu-devel

Il 01/07/2014 13:15, Gerd Hoffmann ha scritto:
>   Hi,
>
>>>> If you tell me what to look at, I legally own a Windows 98 CD (also NT4
>>>> but I have to dig it out) and can test it later this week.
>>>
>>> /me has win98 too, that doesn't boot after install though.
>>
>> Tried TCG?  There is a timing loop that makes Win98 fail on too-fast
>> computers.
>
> TCG works.  Any workaround to make it fly with kvm too?

Not that I know of.

Paolo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-01 11:23                                   ` Benjamin Herrenschmidt
@ 2014-07-02  9:19                                     ` Benjamin Herrenschmidt
  2014-07-02  9:21                                       ` Benjamin Herrenschmidt
  2014-07-02 12:12                                       ` Gerd Hoffmann
  0 siblings, 2 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-02  9:19 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Tue, 2014-07-01 at 21:23 +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2014-07-01 at 13:15 +0200, Gerd Hoffmann wrote:
> > TCG works.  Any workaround to make it fly with kvm too?
> > 
> > And win98 actually uses the hardware cursor.
> 
> Ah ok. I was trying to install NT4 here but was having problems,
> the screen would go yellow and cirrus would stop displaying anything
> when trying to "test" the resolution (without my patches, actually with
> a slightly older qemu too).
> 
> I was planning on digging more tomorrow, but I can try Win98 if that's
> easier.

NT4 works a lot better without -enable-kvm :-) (file corruption, general
bad behaviour...).

I can confirm the cursor is missing in 16bpp with cirrus on NT4 as well,
I'll play with it later this week or week-end, unless you beat me to it.

[ Let me know if you start working on it so we don't duplicate work ]

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-02  9:19                                     ` Benjamin Herrenschmidt
@ 2014-07-02  9:21                                       ` Benjamin Herrenschmidt
  2014-07-02 12:12                                       ` Gerd Hoffmann
  1 sibling, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-02  9:21 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Wed, 2014-07-02 at 19:19 +1000, Benjamin Herrenschmidt wrote:

> NT4 works a lot better without -enable-kvm :-) (file corruption, general
> bad behaviour...).
> 
> I can confirm the cursor is missing in 16bpp with cirrus on NT4 as well,
> I'll play with it later this week or week-end, unless you beat me to it.
> 
> [ Let me know if you start working on it so we don't duplicate work ]

It's busted in 8 and 24bpp too :-)

So in 16bpp it doesn't appear at all (as expected). In 8 and 24, it's a
white square. I'll have to dig. Do we have the cirrus specs somewhere ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-02  9:19                                     ` Benjamin Herrenschmidt
  2014-07-02  9:21                                       ` Benjamin Herrenschmidt
@ 2014-07-02 12:12                                       ` Gerd Hoffmann
  2014-07-02 12:16                                         ` Benjamin Herrenschmidt
  2014-07-06  2:19                                         ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 66+ messages in thread
From: Gerd Hoffmann @ 2014-07-02 12:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Mi, 2014-07-02 at 19:19 +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2014-07-01 at 21:23 +1000, Benjamin Herrenschmidt wrote:
> > On Tue, 2014-07-01 at 13:15 +0200, Gerd Hoffmann wrote:
> > > TCG works.  Any workaround to make it fly with kvm too?
> > > 
> > > And win98 actually uses the hardware cursor.
> > 
> > Ah ok. I was trying to install NT4 here but was having problems,
> > the screen would go yellow and cirrus would stop displaying anything
> > when trying to "test" the resolution (without my patches, actually with
> > a slightly older qemu too).
> > 
> > I was planning on digging more tomorrow, but I can try Win98 if that's
> > easier.
> 
> NT4 works a lot better without -enable-kvm :-) (file corruption, general
> bad behaviour...).
> 
> I can confirm the cursor is missing in 16bpp with cirrus on NT4 as well,
> I'll play with it later this week or week-end, unless you beat me to it.
> 
> [ Let me know if you start working on it so we don't duplicate work ]

console/pixman bits are here:

https://www.kraxel.org/cgit/qemu/log/?h=rebase/console-wip

Added a patch for hardware cursor support via dpy_cursor_define().
Old hardware cursor code is still in, so in theory this gives you two
pointers.  In practice it only shows that cursor support in relative
mouse mode is (a) flaky and (b) our uis are very inconsistent ...

[ not hacking on it at the moment, busy with other stuff ]

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-02 12:12                                       ` Gerd Hoffmann
@ 2014-07-02 12:16                                         ` Benjamin Herrenschmidt
  2014-07-06  2:19                                         ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-02 12:16 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Wed, 2014-07-02 at 14:12 +0200, Gerd Hoffmann wrote:

> console/pixman bits are here:
> 
> https://www.kraxel.org/cgit/qemu/log/?h=rebase/console-wip
> 
> Added a patch for hardware cursor support via dpy_cursor_define().
> Old hardware cursor code is still in, so in theory this gives you two
> pointers.  In practice it only shows that cursor support in relative
> mouse mode is (a) flaky and (b) our uis are very inconsistent ...
> 
> [ not hacking on it at the moment, busy with other stuff ]

Ok, me too right now, I might get to it this week-end, we'll see.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-02 12:12                                       ` Gerd Hoffmann
  2014-07-02 12:16                                         ` Benjamin Herrenschmidt
@ 2014-07-06  2:19                                         ` Benjamin Herrenschmidt
  2014-07-06  5:49                                           ` Benjamin Herrenschmidt
  2014-07-06  5:53                                           ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06  2:19 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Wed, 2014-07-02 at 14:12 +0200, Gerd Hoffmann wrote:
> https://www.kraxel.org/cgit/qemu/log/?h=rebase/console-wip
> 
> Added a patch for hardware cursor support via dpy_cursor_define().
> Old hardware cursor code is still in, so in theory this gives you two
> pointers.  In practice it only shows that cursor support in relative
> mouse mode is (a) flaky and (b) our uis are very inconsistent ...
> 
> [ not hacking on it at the moment, busy with other stuff ]

I've started to look (and while at it added use of the dirty bitmap to
catch changes to the HW cursor image just because it looked easy).

One obvious issue: Your patch:

gtk: update mouse position in mouse_set()

Completely breaks cursor movement in NT4 (I haven't checked with other
guests). It works fine without the patch. This is after cherry-picking
on top of my series on github.

Now the cursor is still a white rectangle :-) I need to investigate that
one a bit more.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  2:19                                         ` Benjamin Herrenschmidt
@ 2014-07-06  5:49                                           ` Benjamin Herrenschmidt
  2014-07-06  6:46                                             ` Benjamin Herrenschmidt
  2014-07-06  5:53                                           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06  5:49 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Sun, 2014-07-06 at 12:19 +1000, Benjamin Herrenschmidt wrote:

> I've started to look (and while at it added use of the dirty bitmap to
> catch changes to the HW cursor image just because it looked easy).
> 
> One obvious issue: Your patch:
> 
> gtk: update mouse position in mouse_set()
> 
> Completely breaks cursor movement in NT4 (I haven't checked with other
> guests). It works fine without the patch. This is after cherry-picking
> on top of my series on github.
> 
> Now the cursor is still a white rectangle :-) I need to investigate that
> one a bit more.

Ok, I've found the cirrus bug that causes that white rectangle, but also
the broken icons in 8bpp mode.

We're basically tripping that test in cirrus_bitblt_rop_fwd_*

    if (dstpitch < 0 || srcpitch < 0) {
        /* is 0 valid? srcpitch == 0 could be useful */
        return;
    }

Because when called from cirrus_bitblt_cputovideo_next() we
always pass 0 for both pitch (we do one line at a time).

That done, we now get both HW "emulated" and HW "qemu" cursors
working in 8bpp.

The cursor is still absent in 16, probably a minor glitch, I'll
try to figure that out next.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  2:19                                         ` Benjamin Herrenschmidt
  2014-07-06  5:49                                           ` Benjamin Herrenschmidt
@ 2014-07-06  5:53                                           ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06  5:53 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Sun, 2014-07-06 at 12:19 +1000, Benjamin Herrenschmidt wrote:
> One obvious issue: Your patch:
> 
> gtk: update mouse position in mouse_set()
> 
> Completely breaks cursor movement in NT4 (I haven't checked with other
> guests). It works fine without the patch. This is after cherry-picking
> on top of my series on github.

Ok it's not clear cut. The cursor goes into lalaland without your patch,
though with the patch it goes bonkers immediately. So something is still
wrong (using gtk). Added to my list of things to look it :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  5:49                                           ` Benjamin Herrenschmidt
@ 2014-07-06  6:46                                             ` Benjamin Herrenschmidt
  2014-07-06  7:05                                               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06  6:46 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Sun, 2014-07-06 at 15:49 +1000, Benjamin Herrenschmidt wrote:
> We're basically tripping that test in cirrus_bitblt_rop_fwd_*
> 
>     if (dstpitch < 0 || srcpitch < 0) {
>         /* is 0 valid? srcpitch == 0 could be useful */
>         return;
>     }
> 
> Because when called from cirrus_bitblt_cputovideo_next() we
> always pass 0 for both pitch (we do one line at a time).

Ok, this is fun :-)

The above test was added supposedly as a fix for CVE2007-1320 in
2008, commit b2eb849d4b1fdb6f35d5c46958c7f703cf64cfef by "aurel32" (not
sure who that is).

However, looking at the description, it's a bit of a joke:

<<
    I have just noticed that patch for CVE-2007-1320 has never been applied
    to the QEMU CVS. Please find it below.
    
    | Multiple heap-based buffer overflows in the cirrus_invalidate_region
    | function in the Cirrus VGA extension in QEMU 0.8.2, as used in Xen and
    | possibly other products, might allow local users to execute arbitrary
    | code via unspecified vectors related to "attempting to mark
    | non-existent regions as dirty," aka the "bitblt" heap overflow.
>>

This is bogus for at least these reasons:

 - I don't see how adding that check inside one of the ROPs will have any
effect on the call to cirrus_invalidate_region()

 - Whoever wrote the patch obviously didn't "notice" the two substractions
to the pitches right before, since the comment about value "0" makes no sense
unless you ignore those. If you don't, then the comment is completely spurrious
as of course, the case of the pitch being 0 after susbtraction would be quite
common.

 - Whoever wrote the patch didn't notice that we do call it with 0 in the copy
from host path

 - Assuming doing the fix inside the ROP makes sense, then why fix only one
of them ? The exploit, if it exists, is still present in all the other ones...

 - Finally, assuming that bound-checking the pitch has some value to avoid
writing outside of the framebuffer allocated area, doing it inside the ROP
makes little sense, this should be done when setting up the blit.

At this point, I"m tempted to just revert that commit. What do you think Gerd ?

I've noticed a few reports over the last few years of breakage with Windows
NT that seem to stem from this.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  6:46                                             ` Benjamin Herrenschmidt
@ 2014-07-06  7:05                                               ` Benjamin Herrenschmidt
  2014-07-06  7:22                                                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06  7:05 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Sun, 2014-07-06 at 16:46 +1000, Benjamin Herrenschmidt wrote:
> At this point, I"m tempted to just revert that commit. What do you
> think Gerd ?

I mean that hunk of the commit... I missed that the commit itself
added a whole lot more bound checking.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  7:05                                               ` Benjamin Herrenschmidt
@ 2014-07-06  7:22                                                 ` Benjamin Herrenschmidt
  2014-07-06  8:15                                                   ` Benjamin Herrenschmidt
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06  7:22 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Sun, 2014-07-06 at 17:05 +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2014-07-06 at 16:46 +1000, Benjamin Herrenschmidt wrote:
> > At this point, I"m tempted to just revert that commit. What do you
> > think Gerd ?
> 
> I mean that hunk of the commit... I missed that the commit itself
> added a whole lot more bound checking.

So we have an number of other problems :-)

 - With GTK the qemu cursor constantly gets removed... it works when
established at boot but as soon as we take the grab, or change anything,
the GTK ui removes the custom cursor and it generally remains hidden
form then on until we release the grab. I am not familiar with that
code, I'll try to debug it but help is welcome. Basically most of the
time we only see the "emulated" HW cursor (or nothing with a shared
surface).

 - With SDL, the screen colors are all wrong :-) It looks like a
component is partially missing. Not sure what's up there, again,
something else to debug. Everything has a blue tint (this is full emu on
x86_64 host).

 - With SDL, the cursor quickly goes bonkers, starts jumping around all
over the place, I'm not sure what exactly is going on here. It starts ok
but as soon as one does a too fast movement, it's dead.

 - With VNC, we see both cursors as expected but ... with an offset,
more or less variable. Ie I would expect 3 cursors: The VNC host cursor,
the little round thing that VNC puts where it thinks the guest cursor
is, and the emulated guest cursor, but the little round thing, instead
of following the guest cursor (emulated), follows the host one (though
the clicks are recorded by windows at the position of the guest one). it
looks like the wrong positions are sent to VNC.

Now, it's all code I know nothing about :-) I'll spend some time
investigating in my little spare time, but hints are welcome.

Cheers,
Ben.
 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  7:22                                                 ` Benjamin Herrenschmidt
@ 2014-07-06  8:15                                                   ` Benjamin Herrenschmidt
  2014-07-06 10:13                                                   ` Benjamin Herrenschmidt
  2014-07-07  9:38                                                   ` Gerd Hoffmann
  2 siblings, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06  8:15 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Sun, 2014-07-06 at 17:22 +1000, Benjamin Herrenschmidt wrote:

>  - With SDL, the screen colors are all wrong :-) It looks like a
> component is partially missing. Not sure what's up there, again,
> something else to debug. Everything has a blue tint (this is full emu on
> x86_64 host).

That one is a typo in your "add qemu_pixelformat_from_pixman" that
you appear to have already fixed in your console-wip branch, so I'll
rebase my work on top of that (oh well, debugging it taught me a few
things ...).

One thing that I do worry a bit about is that by extending the cases
for shared pixmaps to 15 and 24bpp, I broke SDL since at least the
sdl2 code seems to only know about 16 (5:6:5) and 32 (8:8:8:8).

I'll probably have to fix these too (or bring back shadow pixmaps
for those formats which would be sad).

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  7:22                                                 ` Benjamin Herrenschmidt
  2014-07-06  8:15                                                   ` Benjamin Herrenschmidt
@ 2014-07-06 10:13                                                   ` Benjamin Herrenschmidt
  2014-07-06 11:08                                                     ` Alexander Graf
  2014-07-07  9:38                                                   ` Gerd Hoffmann
  2 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06 10:13 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On Sun, 2014-07-06 at 17:22 +1000, Benjamin Herrenschmidt wrote:
> 
>  - With SDL, the cursor quickly goes bonkers, starts jumping around all
> over the place, I'm not sure what exactly is going on here. It starts ok
> but as soon as one does a too fast movement, it's dead.

At least this one also happens with gtk and appears to be the guest (NT4)
"warping" the cursor to the edges of the screen as soon as you start moving
too quickly. I'm not quite sure what's going on yet, so far I've seen us
send reasonable deltas down. I *suspect* it might have to do with NT's
own mouse acceleration scheme.

The way we do the relative mouse stuff, we more/less assume that the diff
we send down gets applied "as is". It's not and thus we warp which causes
funny artifacts and when it gets too big, things get nasty inside NT4 itself
as far as I can tell (but I may well have missed something).

The problem is that when using relative mouses, we can't really assume that
there is any relationship between the absolute position of the host cursor
vs. the guest cursor, we should only operate in deltas and even then, we
probably want to dampen them to compensate for the guest own acceleration.

But that means that the guest HW cursor is never quite where the host cursor
is. So unless the guest draws its own (or something like VNC can draw one),
we have a problem.

I'm thinking that for relative mouse, we should probably draw a cursor ourselves
by moving / drawing the cursor pixmap on top of the display pixmap at the UI
backend (gtk/SDL) level... Or am I missing a big part of the puzzle ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06 10:13                                                   ` Benjamin Herrenschmidt
@ 2014-07-06 11:08                                                     ` Alexander Graf
  2014-07-06 11:13                                                       ` Peter Maydell
  2014-07-07 10:13                                                       ` Gerd Hoffmann
  0 siblings, 2 replies; 66+ messages in thread
From: Alexander Graf @ 2014-07-06 11:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Gerd Hoffmann
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel


On 06.07.14 12:13, Benjamin Herrenschmidt wrote:
> On Sun, 2014-07-06 at 17:22 +1000, Benjamin Herrenschmidt wrote:
>>   - With SDL, the cursor quickly goes bonkers, starts jumping around all
>> over the place, I'm not sure what exactly is going on here. It starts ok
>> but as soon as one does a too fast movement, it's dead.
> At least this one also happens with gtk and appears to be the guest (NT4)
> "warping" the cursor to the edges of the screen as soon as you start moving
> too quickly. I'm not quite sure what's going on yet, so far I've seen us
> send reasonable deltas down. I *suspect* it might have to do with NT's
> own mouse acceleration scheme.
>
> The way we do the relative mouse stuff, we more/less assume that the diff
> we send down gets applied "as is". It's not and thus we warp which causes
> funny artifacts and when it gets too big, things get nasty inside NT4 itself
> as far as I can tell (but I may well have missed something).
>
> The problem is that when using relative mouses, we can't really assume that
> there is any relationship between the absolute position of the host cursor
> vs. the guest cursor, we should only operate in deltas and even then, we
> probably want to dampen them to compensate for the guest own acceleration.

The guest's own acceleration can easily be non-linear, so we can't 
really tell. However, FWIW we basically have 2 modes

   1) absolute pointing device (usb tablet for example or vmmouse)
   2) relative pointing device

In case 1, we can keep using the host cursor, and just tell the guest 
where exactly the cursor is in absolute coordinates. This works very 
well with VNC too ;).

In case 2, we can't tell anything at all. We can calculate the delta and 
hope for the best. That's why with any backend that supports it, we 
enable mouse grabbing here. In mouse grabbing mode we behave like any 
game that may do whatever it likes with the mouse delta information.

> But that means that the guest HW cursor is never quite where the host cursor
> is. So unless the guest draws its own (or something like VNC can draw one),
> we have a problem.

VNC can explicitly draw the host cursor at specific locations IIRC. You 
can just send a packet where the cursor is at the moment. I don't know 
about SDL or GTK+ though.

> I'm thinking that for relative mouse, we should probably draw a cursor ourselves
> by moving / drawing the cursor pixmap on top of the display pixmap at the UI
> backend (gtk/SDL) level... Or am I missing a big part of the puzzle ?

Can't we just always draw it ourselves with a second surface on top of 
our normal guest screen? Then we can make the "real cursor" for GTK+ / 
SDL / VNC be a 100% alpha cursor as soon as we enable this self-drawn 
surface and can expose hardware pointers that the respective backend 
couldn't support.

For example, IIRC VNC only supports 1-bit cursors. We certainly want 
more fancy ones :).


Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06 11:08                                                     ` Alexander Graf
@ 2014-07-06 11:13                                                       ` Peter Maydell
  2014-07-06 11:23                                                         ` Benjamin Herrenschmidt
  2014-07-07 10:13                                                       ` Gerd Hoffmann
  1 sibling, 1 reply; 66+ messages in thread
From: Peter Maydell @ 2014-07-06 11:13 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Alexey Kardashevskiy, qemu-devel, Gerd Hoffmann, Paolo Bonzini

On 6 July 2014 12:08, Alexander Graf <agraf@suse.de> wrote
> The guest's own acceleration can easily be non-linear, so we can't really
> tell. However, FWIW we basically have 2 modes
>
>   1) absolute pointing device (usb tablet for example or vmmouse)
>   2) relative pointing device
>
> In case 1, we can keep using the host cursor, and just tell the guest where
> exactly the cursor is in absolute coordinates. This works very well with VNC
> too ;).

Well, you *can* use the host cursor, but by default you should not
(ie you still need to hide the host cursor and rely on the guest displaying
its own pointer). You should also honour command line -show-cursor
which overrides this to say "don't hide host cursor". Our SDL and Cocoa
UIs get this right, GTK doesn't currently.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06 11:13                                                       ` Peter Maydell
@ 2014-07-06 11:23                                                         ` Benjamin Herrenschmidt
  2014-07-06 13:09                                                           ` Peter Maydell
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06 11:23 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf,
	Gerd Hoffmann

On Sun, 2014-07-06 at 12:13 +0100, Peter Maydell wrote:
> On 6 July 2014 12:08, Alexander Graf <agraf@suse.de> wrote
> > The guest's own acceleration can easily be non-linear, so we can't really
> > tell. However, FWIW we basically have 2 modes
> >
> >   1) absolute pointing device (usb tablet for example or vmmouse)
> >   2) relative pointing device
> >
> > In case 1, we can keep using the host cursor, and just tell the guest where
> > exactly the cursor is in absolute coordinates. This works very well with VNC
> > too ;).
> 
> Well, you *can* use the host cursor, but by default you should not
> (ie you still need to hide the host cursor and rely on the guest displaying
> its own pointer). 

Well, it depends what you mean by the guest draws its own pointer ... we
are in the specific case of cirrusfb HW cursor emulation which Gerd and
I are trying to fix.

Basically the guest doesn't draw anything. Gerd initial implementation
just passes the pixels to the dpy/qemu cursor, but while that probably
works fine in absolute mode, it's somewhat busted in relative mode at
least with SDL and gtk. 

The question is whether we can make it work that way (by basically
having grab and moving the host cursor) or do we really need to add a
layer for painting the cursor on top of the surface.

> You should also honour command line -show-cursor
> which overrides this to say "don't hide host cursor". Our SDL and Cocoa
> UIs get this right, GTK doesn't currently.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06 11:23                                                         ` Benjamin Herrenschmidt
@ 2014-07-06 13:09                                                           ` Peter Maydell
  2014-07-06 20:56                                                             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Maydell @ 2014-07-06 13:09 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf,
	Gerd Hoffmann

On 6 July 2014 12:23, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> Well, it depends what you mean by the guest draws its own pointer ... we
> are in the specific case of cirrusfb HW cursor emulation which Gerd and
> I are trying to fix.
>
> Basically the guest doesn't draw anything. Gerd initial implementation
> just passes the pixels to the dpy/qemu cursor, but while that probably
> works fine in absolute mode, it's somewhat busted in relative mode at
> least with SDL and gtk.
>
> The question is whether we can make it work that way (by basically
> having grab and moving the host cursor) or do we really need to add a
> layer for painting the cursor on top of the surface.

I think this definitely needs to be done by the cirrusfb device model
(or possibly by the common display code). Otherwise you're going to
have to try to implement this in every single UI backend, and I bet
you need a fallback implementation anyway because there's going to
be at least one UI that can't implement it. And as Alex points out,
trying to use the host mouse cursor doesn't work in pretty much any
relative-mode mouse-grabbed setup.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06 13:09                                                           ` Peter Maydell
@ 2014-07-06 20:56                                                             ` Benjamin Herrenschmidt
  2014-07-07  0:08                                                               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-06 20:56 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf,
	Gerd Hoffmann

On Sun, 2014-07-06 at 14:09 +0100, Peter Maydell wrote:
> > The question is whether we can make it work that way (by basically
> > having grab and moving the host cursor) or do we really need to add
> a
> > layer for painting the cursor on top of the surface.
> 
> I think this definitely needs to be done by the cirrusfb device model
> (or possibly by the common display code). Otherwise you're going to
> have to try to implement this in every single UI backend, and I bet
> you need a fallback implementation anyway because there's going to
> be at least one UI that can't implement it. And as Alex points out,
> trying to use the host mouse cursor doesn't work in pretty much any
> relative-mode mouse-grabbed setup.

It would be nice to implement that cursor support in the common code,
but I don't see how to do it without explicit support in any backend,
since the pixman surface goes straight to the backend.

If we do it in the cirrus model itself, then we basically have to
use the shadow pixmap always (can't ever share) and that means bringing
back a pile of conversion functions to the VGA model that we were
getting rid of (not hard ones but still, ie, 15, 16, 24 and 32bpp
non-swapped to 32bpp) which is somewhat sad.

The other option is to add "non-host" cursor support (painted-on) to
each backend, but that is starting to look like a lot more work than
what I signed up for :-)

So there is no "good" solution here. I'll look into bringing back the
shadow conversion functions and forcing cirrus to shadow the fb when
the cursor is active.

Another question: In the case where the host cursor actually works (such
as when emulating a tablet and the backend supports a cursor), how can
cirrus know so it can switch back to direct surface and stop painting a
second copy ? This is not clear to me...

Cheers,
Ben.
 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06 20:56                                                             ` Benjamin Herrenschmidt
@ 2014-07-07  0:08                                                               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Herrenschmidt @ 2014-07-07  0:08 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alexey Kardashevskiy, Paolo Bonzini, qemu-devel, Alexander Graf,
	Gerd Hoffmann

On Mon, 2014-07-07 at 06:56 +1000, Benjamin Herrenschmidt wrote:
> If we do it in the cirrus model itself, then we basically have to
> use the shadow pixmap always (can't ever share) and that means bringing
> back a pile of conversion functions to the VGA model that we were
> getting rid of (not hard ones but still, ie, 15, 16, 24 and 32bpp
> non-swapped to 32bpp) which is somewhat sad.

I've decided to go down that path. The added conversion functions
aren't big, it remains pretty clean and in fact it allows me to
neatly separate the functions between BE and LE framebuffers instead
of "swap" and "non-swap" which avoids one of the macro tricks I had
to do before.

Additionally, we are missing some pixel formats in some of the UIs
for my extended cases of surface sharing so I'll work on that too.

(I wonder if we should add a UI callback to verify if a pixel format
is supports for sharing ... for now I'll just add all the missing
ones to SDL 1 & 2, but somebody will have to look at spice).

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06  7:22                                                 ` Benjamin Herrenschmidt
  2014-07-06  8:15                                                   ` Benjamin Herrenschmidt
  2014-07-06 10:13                                                   ` Benjamin Herrenschmidt
@ 2014-07-07  9:38                                                   ` Gerd Hoffmann
  2 siblings, 0 replies; 66+ messages in thread
From: Gerd Hoffmann @ 2014-07-07  9:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Paolo Bonzini, Alexander Graf, qemu-devel

On So, 2014-07-06 at 17:22 +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2014-07-06 at 17:05 +1000, Benjamin Herrenschmidt wrote:
> > On Sun, 2014-07-06 at 16:46 +1000, Benjamin Herrenschmidt wrote:
> > > At this point, I"m tempted to just revert that commit. What do you
> > > think Gerd ?
> > 
> > I mean that hunk of the commit... I missed that the commit itself
> > added a whole lot more bound checking.
> 
> So we have an number of other problems :-)

[ list snipped ]

Nobody really uses host cursor and relative mouse mode together today.  

UIs basically assume that the guest will render the mouse pointer when
in relative mouse mode, because that is what qxl is doing.  virtio-gpu
(not upstream) has simliar issues in relative mouse mode.

I will go look into these, but I suspect it will take quite some time to
sort as it involves quite some testing will all the UIs we have.

I think the best short-term solution for cirrus is to simply keep the
existing hwcursor emulation code and force shadow mode when the guest
enables the hwcursor.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes
  2014-07-06 11:08                                                     ` Alexander Graf
  2014-07-06 11:13                                                       ` Peter Maydell
@ 2014-07-07 10:13                                                       ` Gerd Hoffmann
  1 sibling, 0 replies; 66+ messages in thread
From: Gerd Hoffmann @ 2014-07-07 10:13 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Alexey Kardashevskiy, qemu-devel, Paolo Bonzini

  Hi,

> > The problem is that when using relative mouses, we can't really assume that
> > there is any relationship between the absolute position of the host cursor
> > vs. the guest cursor, we should only operate in deltas and even then, we
> > probably want to dampen them to compensate for the guest own acceleration.
> 
> The guest's own acceleration can easily be non-linear, so we can't 
> really tell. However, FWIW we basically have 2 modes
> 
>    1) absolute pointing device (usb tablet for example or vmmouse)
>    2) relative pointing device
> 
> In case 1, we can keep using the host cursor, and just tell the guest 
> where exactly the cursor is in absolute coordinates. This works very 
> well with VNC too ;).

Yep, #1 is the easy case and only that works reasonable well in qemu
today.

> In case 2, we can't tell anything at all. We can calculate the delta and 
> hope for the best. That's why with any backend that supports it, we 
> enable mouse grabbing here. In mouse grabbing mode we behave like any 
> game that may do whatever it likes with the mouse delta information.

There is dpy_mouse_set() which the gfx emulation can use to tell the UI
where the cursor actually is.  So we can move the host cursor to that
point, and sdl/gtk UIs attempt to do that.

Problem is that this has some latency due to the round-trip to the
guest.  Also things like mouse acceleration easily can make the mouse
pointer a bit jumpy.  I think it needs some experimentation to see
whenever operating in that mode can be made work well in practice.

On top of that I think we currently have some kind of feedback loop in
there, which makes the pointer quickly go totally wonky.  To be
debugged ...

> > But that means that the guest HW cursor is never quite where the host cursor
> > is. So unless the guest draws its own (or something like VNC can draw one),
> > we have a problem.
> 
> VNC can explicitly draw the host cursor at specific locations IIRC. You 
> can just send a packet where the cursor is at the moment.

Guess we should actually do that in vnc_mouse_set() then ;)

> I don't know 
> about SDL or GTK+ though.

See above.

> > I'm thinking that for relative mouse, we should probably draw a cursor ourselves
> > by moving / drawing the cursor pixmap on top of the display pixmap at the UI
> > backend (gtk/SDL) level... Or am I missing a big part of the puzzle ?
> 
> Can't we just always draw it ourselves with a second surface on top of 
> our normal guest screen?

We might want create some helper functions to do that, so UIs which need
it can use them.  I don't think we should do that unconditionally, UI
support for RGBA cursors isn't that bad.

> Then we can make the "real cursor" for GTK+ / 
> SDL / VNC be a 100% alpha cursor as soon as we enable this self-drawn 
> surface and can expose hardware pointers that the respective backend 
> couldn't support.
> 
> For example, IIRC VNC only supports 1-bit cursors. We certainly want 
> more fancy ones :).

SDL1 has 1bit cursors.
SDL2 has full RGBA cursors.
gtk has full RGBA cursors.
vnc has RGB cursors + 1-bit mask (with VNC_FEATURE_RICH_CURSOR).
spice has full RGBA cursors.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2014-07-07 10:13 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-17  3:07 [Qemu-devel] [RFC] qemu VGA endian swap low level drawing changes Benjamin Herrenschmidt
2014-06-17  4:40 ` Paolo Bonzini
2014-06-17  4:59   ` Benjamin Herrenschmidt
2014-06-17  5:36     ` Paolo Bonzini
2014-06-17  6:06       ` Benjamin Herrenschmidt
2014-06-17  9:18       ` Alexey Kardashevskiy
2014-06-17  9:26         ` Alexander Graf
2014-06-17 10:00         ` Greg Kurz
2014-06-17 10:09           ` Benjamin Herrenschmidt
2014-06-17 10:19             ` Peter Maydell
2014-06-17 10:57               ` Benjamin Herrenschmidt
2014-06-17 11:40             ` Greg Kurz
2014-06-17 11:53               ` Benjamin Herrenschmidt
2014-06-17 12:05                 ` Peter Maydell
2014-06-17 10:45 ` Gerd Hoffmann
2014-06-17 11:08   ` Peter Maydell
2014-06-17 11:24     ` Gerd Hoffmann
2014-06-17 11:42       ` Peter Maydell
2014-06-19 12:33         ` Gerd Hoffmann
2014-06-19 13:01           ` Paolo Bonzini
2014-06-17 11:15   ` Benjamin Herrenschmidt
2014-06-17 11:57     ` Gerd Hoffmann
2014-06-17 21:32       ` Benjamin Herrenschmidt
2014-06-17 22:12         ` Peter Maydell
2014-06-17 22:55           ` Benjamin Herrenschmidt
2014-06-17 23:05             ` Alexander Graf
2014-06-17 23:16               ` Benjamin Herrenschmidt
2014-06-18 11:18         ` Gerd Hoffmann
2014-06-18 13:03           ` Benjamin Herrenschmidt
2014-06-19  9:36             ` Gerd Hoffmann
2014-06-21  5:37               ` Benjamin Herrenschmidt
2014-06-22  2:10                 ` Benjamin Herrenschmidt
2014-06-23  1:13                   ` Benjamin Herrenschmidt
2014-06-30 11:14                   ` Gerd Hoffmann
2014-06-30 12:32                     ` Benjamin Herrenschmidt
2014-07-01  8:20                       ` Gerd Hoffmann
2014-07-01  8:26                         ` Alexander Graf
2014-07-01  8:31                           ` Paolo Bonzini
2014-07-01  9:07                             ` Gerd Hoffmann
2014-07-01  9:19                               ` Paolo Bonzini
2014-07-01 11:15                                 ` Gerd Hoffmann
2014-07-01 11:23                                   ` Benjamin Herrenschmidt
2014-07-02  9:19                                     ` Benjamin Herrenschmidt
2014-07-02  9:21                                       ` Benjamin Herrenschmidt
2014-07-02 12:12                                       ` Gerd Hoffmann
2014-07-02 12:16                                         ` Benjamin Herrenschmidt
2014-07-06  2:19                                         ` Benjamin Herrenschmidt
2014-07-06  5:49                                           ` Benjamin Herrenschmidt
2014-07-06  6:46                                             ` Benjamin Herrenschmidt
2014-07-06  7:05                                               ` Benjamin Herrenschmidt
2014-07-06  7:22                                                 ` Benjamin Herrenschmidt
2014-07-06  8:15                                                   ` Benjamin Herrenschmidt
2014-07-06 10:13                                                   ` Benjamin Herrenschmidt
2014-07-06 11:08                                                     ` Alexander Graf
2014-07-06 11:13                                                       ` Peter Maydell
2014-07-06 11:23                                                         ` Benjamin Herrenschmidt
2014-07-06 13:09                                                           ` Peter Maydell
2014-07-06 20:56                                                             ` Benjamin Herrenschmidt
2014-07-07  0:08                                                               ` Benjamin Herrenschmidt
2014-07-07 10:13                                                       ` Gerd Hoffmann
2014-07-07  9:38                                                   ` Gerd Hoffmann
2014-07-06  5:53                                           ` Benjamin Herrenschmidt
2014-07-01 12:06                                   ` Paolo Bonzini
2014-07-01  8:59                           ` Gerd Hoffmann
2014-07-01  9:35                           ` Benjamin Herrenschmidt
2014-07-01  9:33                         ` Benjamin Herrenschmidt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.