LKML Archive on lore.kernel.org
 help / Atom feed
* [PATCH v3 0/3] have the vt console preserve unicode characters
@ 2018-06-27  3:56 Nicolas Pitre
  2018-06-27  3:56 ` [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters Nicolas Pitre
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Nicolas Pitre @ 2018-06-27  3:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
	linux-kernel, linux-console

The vt code translates UTF-8 strings into glyph index values and stores 
those glyph values in the screen buffer. Because there can only be at 
most 512 glyphs at the moment, it is impossible to represent most 
unicode characters, in which case a default glyph (often '?') is 
displayed instead. The original unicode value is then lost.

The 512-glyph limitation is inherent to text-mode VGA displays after 
which the core console code was modelled. This also means that the 
/dev/vcs* devices only provide user space with glyph index values, and 
then user applications must get hold of the unicode-to-glyph table the 
kernel is using in order to back-translate those into actual characters. 
It is not possible to get back the original unicode value when multiple 
unicode characters map to the same glyph, especially for the vast 
majority that maps to the default replacement glyph.

Users of /dev/vcs* shouldn't have to be restricted to a narrow unicode 
space from lossy screen content because of that. This is especially true 
for accessibility applications such as BRLTTY that rely on /dev/vcs to 
render screen content onto braille terminals.

It was also argued that the VGA-centric glyph buffer should eventually 
go entirely. The current design made sense when hardware was slow and 
managing the screen directly into the VGA memory made a difference (i.e. 
25 years ago). Modern console display drivers no longer have to be 
   limited to 512 glyphs.
Quoting Alan Cox:

|The only driver that it suits is the VGA text mode driver, which at 
|2GHz+ is going to be fast enough whatever format you convert from. We 
|have the memory, the processor power and the fact almost all our 
|displays are bitmapped (or more complex still) all in favour of 
|throwing away that limit.

This patch series introduces unicode screen support to the core console 
code with /dev/vcs* as a first user. Memory is allocated, and possible 
CPU overhead introduced, only if /dev/vcsu is read at least once. For 
now both the glyph and unicode buffers are maintained in parallel to 
allow for a smooth transition.

I'm a prime user of this new /dev/vcsu interface, as well as the BRLTTY 
maintainer Dave Mielke who implemented support for this in BRLTTY. There 
is therefore a vested interest in maintaining this feature as necessary. 
And this received extensive testing as well at this point.

This is also available on top of v4.18-rc2 here:

  git://git.linaro.org/people/nicolas.pitre/linux vt-unicode

Changes from v2:

- Dropped patch #4 as it was useful only for initial debugging and it 
  attracted all the review comments so far -- actually more than the 
  patch is worth.
- Added Adam Borowski's ACK.

Changes from v1:

- Rebased to v4.18-rc1.
- Dropped first patch (now in mainline as commit 4b4ecd9cb8).
- Removed a printk instance from an error path easily triggerable
  from user space.
- Minor cleanup.

Diffstat:

 drivers/tty/vt/vc_screen.c     |  90 ++++++++--
 drivers/tty/vt/vt.c            | 308 +++++++++++++++++++++++++++++++--
 include/linux/console_struct.h |   2 +
 include/linux/selection.h      |   5 +
 4 files changed, 380 insertions(+), 25 deletions(-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters
  2018-06-27  3:56 [PATCH v3 0/3] have the vt console preserve unicode characters Nicolas Pitre
@ 2018-06-27  3:56 ` Nicolas Pitre
  2018-07-11  0:52   ` Kees Cook
  2018-06-27  3:56 ` [PATCH v3 2/3] vt: introduce unicode mode for /dev/vcs Nicolas Pitre
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2018-06-27  3:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
	linux-kernel, linux-console

The vt code translates UTF-8 strings into glyph index values and stores
those glyph values directly in the screen buffer. Because there can only
be at most 512 glyphs, it is impossible to represent most unicode
characters, in which case a default glyph (often '?') is displayed
instead. The original unicode value is then lost.

This patch implements the basic screen buffer handling to preserve unicode
values alongside corresponding display glyphs.  It is not activated by
default, meaning that people not relying on that functionality won't get
the implied overhead.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Dave Mielke <Dave@mielke.cc>
Acked-by: Adam Borowski <kilobyte@angband.pl>
---
 drivers/tty/vt/vt.c            | 220 +++++++++++++++++++++++++++++++--
 include/linux/console_struct.h |   2 +
 2 files changed, 211 insertions(+), 11 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 1eb1a376a0..7b636638b3 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -317,6 +317,171 @@ void schedule_console_callback(void)
 	schedule_work(&console_work);
 }
 
+/*
+ * Code to manage unicode-based screen buffers
+ */
+
+#ifdef NO_VC_UNI_SCREEN
+/* this disables and optimizes related code away at compile time */
+#define get_vc_uniscr(vc) NULL
+#else
+#define get_vc_uniscr(vc) vc->vc_uni_screen
+#endif
+
+typedef uint32_t char32_t;
+
+/*
+ * Our screen buffer is preceded by an array of line pointers so that
+ * scrolling only implies some pointer shuffling.
+ */
+struct uni_screen {
+	char32_t *lines[0];
+};
+
+static struct uni_screen *vc_uniscr_alloc(unsigned int cols, unsigned int rows)
+{
+	struct uni_screen *uniscr;
+	void *p;
+	unsigned int memsize, i;
+
+	/* allocate everything in one go */
+	memsize = cols * rows * sizeof(char32_t);
+	memsize += rows * sizeof(char32_t *);
+	p = kmalloc(memsize, GFP_KERNEL);
+	if (!p)
+		return NULL;
+
+	/* initial line pointers */
+	uniscr = p;
+	p = uniscr->lines + rows;
+	for (i = 0; i < rows; i++) {
+		uniscr->lines[i] = p;
+		p += cols * sizeof(char32_t);
+	}
+	return uniscr;
+}
+
+static void vc_uniscr_set(struct vc_data *vc, struct uni_screen *new_uniscr)
+{
+	kfree(vc->vc_uni_screen);
+	vc->vc_uni_screen = new_uniscr;
+}
+
+static void vc_uniscr_putc(struct vc_data *vc, char32_t uc)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	if (uniscr)
+		uniscr->lines[vc->vc_y][vc->vc_x] = uc;
+}
+
+static void vc_uniscr_insert(struct vc_data *vc, unsigned int nr)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	if (uniscr) {
+		char32_t *ln = uniscr->lines[vc->vc_y];
+		unsigned int x = vc->vc_x, cols = vc->vc_cols;
+
+		memmove(&ln[x + nr], &ln[x], (cols - x - nr) * sizeof(*ln));
+		memset32(&ln[x], ' ', nr);
+	}
+}
+
+static void vc_uniscr_delete(struct vc_data *vc, unsigned int nr)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	if (uniscr) {
+		char32_t *ln = uniscr->lines[vc->vc_y];
+		unsigned int x = vc->vc_x, cols = vc->vc_cols;
+
+		memcpy(&ln[x], &ln[x + nr], (cols - x - nr) * sizeof(*ln));
+		memset32(&ln[cols - nr], ' ', nr);
+	}
+}
+
+static void vc_uniscr_clear_line(struct vc_data *vc, unsigned int x,
+				 unsigned int nr)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	if (uniscr) {
+		char32_t *ln = uniscr->lines[vc->vc_y];
+
+		memset32(&ln[x], ' ', nr);
+	}
+}
+
+static void vc_uniscr_clear_lines(struct vc_data *vc, unsigned int y,
+				  unsigned int nr)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	if (uniscr) {
+		unsigned int cols = vc->vc_cols;
+
+		while (nr--)
+			memset32(uniscr->lines[y++], ' ', cols);
+	}
+}
+
+static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
+			     enum con_scroll dir, unsigned int nr)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	if (uniscr) {
+		unsigned int s, d, rescue, clear;
+		char32_t *save[nr];
+
+		s = clear = t;
+		d = t + nr;
+		rescue = b - nr;
+		if (dir == SM_UP) {
+			swap(s, d);
+			swap(clear, rescue);
+		}
+		memcpy(save, uniscr->lines + rescue, nr * sizeof(*save));
+		memmove(uniscr->lines + d, uniscr->lines + s,
+			(b - t - nr) * sizeof(*uniscr->lines));
+		memcpy(uniscr->lines + clear, save, nr * sizeof(*save));
+		vc_uniscr_clear_lines(vc, clear, nr);
+	}
+}
+
+static void vc_uniscr_copy_area(struct uni_screen *dst,
+				unsigned int dst_cols,
+				unsigned int dst_rows,
+				struct uni_screen *src,
+				unsigned int src_cols,
+				unsigned int src_top_row,
+				unsigned int src_bot_row)
+{
+	unsigned int dst_row = 0;
+
+	if (!dst)
+		return;
+
+	while (src_top_row < src_bot_row) {
+		char32_t *src_line = src->lines[src_top_row];
+		char32_t *dst_line = dst->lines[dst_row];
+
+		memcpy(dst_line, src_line, src_cols * sizeof(char32_t));
+		if (dst_cols - src_cols)
+			memset32(dst_line + src_cols, ' ', dst_cols - src_cols);
+		src_top_row++;
+		dst_row++;
+	}
+	while (dst_row < dst_rows) {
+		char32_t *dst_line = dst->lines[dst_row];
+
+		memset32(dst_line, ' ', dst_cols);
+		dst_row++;
+	}
+}
+
+
 static void con_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
 		enum con_scroll dir, unsigned int nr)
 {
@@ -326,6 +491,7 @@ static void con_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
 		nr = b - t - 1;
 	if (b > vc->vc_rows || t >= b || nr < 1)
 		return;
+	vc_uniscr_scroll(vc, t, b, dir, nr);
 	if (con_is_visible(vc) && vc->vc_sw->con_scroll(vc, t, b, dir, nr))
 		return;
 
@@ -533,6 +699,7 @@ static void insert_char(struct vc_data *vc, unsigned int nr)
 {
 	unsigned short *p = (unsigned short *) vc->vc_pos;
 
+	vc_uniscr_insert(vc, nr);
 	scr_memmovew(p + nr, p, (vc->vc_cols - vc->vc_x - nr) * 2);
 	scr_memsetw(p, vc->vc_video_erase_char, nr * 2);
 	vc->vc_need_wrap = 0;
@@ -545,6 +712,7 @@ static void delete_char(struct vc_data *vc, unsigned int nr)
 {
 	unsigned short *p = (unsigned short *) vc->vc_pos;
 
+	vc_uniscr_delete(vc, nr);
 	scr_memcpyw(p, p + nr, (vc->vc_cols - vc->vc_x - nr) * 2);
 	scr_memsetw(p + vc->vc_cols - vc->vc_x - nr, vc->vc_video_erase_char,
 			nr * 2);
@@ -845,10 +1013,11 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
 {
 	unsigned long old_origin, new_origin, new_scr_end, rlth, rrem, err = 0;
 	unsigned long end;
-	unsigned int old_rows, old_row_size;
+	unsigned int old_rows, old_row_size, first_copied_row;
 	unsigned int new_cols, new_rows, new_row_size, new_screen_size;
 	unsigned int user;
 	unsigned short *newscreen;
+	struct uni_screen *new_uniscr = NULL;
 
 	WARN_CONSOLE_UNLOCKED();
 
@@ -875,6 +1044,14 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
 	if (!newscreen)
 		return -ENOMEM;
 
+	if (get_vc_uniscr(vc)) {
+		new_uniscr = vc_uniscr_alloc(new_cols, new_rows);
+		if (!new_uniscr) {
+			kfree(newscreen);
+			return -ENOMEM;
+		}
+	}
+
 	if (vc == sel_cons)
 		clear_selection();
 
@@ -884,6 +1061,7 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
 	err = resize_screen(vc, new_cols, new_rows, user);
 	if (err) {
 		kfree(newscreen);
+		kfree(new_uniscr);
 		return err;
 	}
 
@@ -904,18 +1082,24 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
 			 * Cursor near the bottom, copy contents from the
 			 * bottom of buffer
 			 */
-			old_origin += (old_rows - new_rows) * old_row_size;
+			first_copied_row = (old_rows - new_rows);
 		} else {
 			/*
 			 * Cursor is in no man's land, copy 1/2 screenful
 			 * from the top and bottom of cursor position
 			 */
-			old_origin += (vc->vc_y - new_rows/2) * old_row_size;
+			first_copied_row = (vc->vc_y - new_rows/2);
 		}
-	}
-
+		old_origin += first_copied_row * old_row_size;
+	} else
+		first_copied_row = 0;
 	end = old_origin + old_row_size * min(old_rows, new_rows);
 
+	vc_uniscr_copy_area(new_uniscr, new_cols, new_rows,
+			    get_vc_uniscr(vc), rlth/2, first_copied_row,
+			    min(old_rows, new_rows));
+	vc_uniscr_set(vc, new_uniscr);
+
 	update_attr(vc);
 
 	while (old_origin < end) {
@@ -1013,6 +1197,7 @@ struct vc_data *vc_deallocate(unsigned int currcons)
 		vc->vc_sw->con_deinit(vc);
 		put_pid(vc->vt_pid);
 		module_put(vc->vc_sw->owner);
+		vc_uniscr_set(vc, NULL);
 		kfree(vc->vc_screenbuf);
 		vc_cons[currcons].d = NULL;
 	}
@@ -1171,15 +1356,22 @@ static void csi_J(struct vc_data *vc, int vpar)
 
 	switch (vpar) {
 		case 0:	/* erase from cursor to end of display */
+			vc_uniscr_clear_line(vc, vc->vc_x,
+					     vc->vc_cols - vc->vc_x);
+			vc_uniscr_clear_lines(vc, vc->vc_y + 1,
+					      vc->vc_rows - vc->vc_y - 1);
 			count = (vc->vc_scr_end - vc->vc_pos) >> 1;
 			start = (unsigned short *)vc->vc_pos;
 			break;
 		case 1:	/* erase from start to cursor */
+			vc_uniscr_clear_line(vc, 0, vc->vc_x + 1);
+			vc_uniscr_clear_lines(vc, 0, vc->vc_y);
 			count = ((vc->vc_pos - vc->vc_origin) >> 1) + 1;
 			start = (unsigned short *)vc->vc_origin;
 			break;
 		case 2: /* erase whole display */
 		case 3: /* (and scrollback buffer later) */
+			vc_uniscr_clear_lines(vc, 0, vc->vc_rows);
 			count = vc->vc_cols * vc->vc_rows;
 			start = (unsigned short *)vc->vc_origin;
 			break;
@@ -1200,25 +1392,27 @@ static void csi_J(struct vc_data *vc, int vpar)
 static void csi_K(struct vc_data *vc, int vpar)
 {
 	unsigned int count;
-	unsigned short * start;
+	unsigned short *start = (unsigned short *)vc->vc_pos;
+	int offset;
 
 	switch (vpar) {
 		case 0:	/* erase from cursor to end of line */
+			offset = 0;
 			count = vc->vc_cols - vc->vc_x;
-			start = (unsigned short *)vc->vc_pos;
 			break;
 		case 1:	/* erase from start of line to cursor */
-			start = (unsigned short *)(vc->vc_pos - (vc->vc_x << 1));
+			offset = -vc->vc_x;
 			count = vc->vc_x + 1;
 			break;
 		case 2: /* erase whole line */
-			start = (unsigned short *)(vc->vc_pos - (vc->vc_x << 1));
+			offset = -vc->vc_x;
 			count = vc->vc_cols;
 			break;
 		default:
 			return;
 	}
-	scr_memsetw(start, vc->vc_video_erase_char, 2 * count);
+	vc_uniscr_clear_line(vc, vc->vc_x + offset, count);
+	scr_memsetw(start + offset, vc->vc_video_erase_char, 2 * count);
 	vc->vc_need_wrap = 0;
 	if (con_should_update(vc))
 		do_update_region(vc, (unsigned long) start, count);
@@ -1232,6 +1426,7 @@ static void csi_X(struct vc_data *vc, int vpar) /* erase the following vpar posi
 		vpar++;
 	count = (vpar > vc->vc_cols - vc->vc_x) ? (vc->vc_cols - vc->vc_x) : vpar;
 
+	vc_uniscr_clear_line(vc, vc->vc_x, count);
 	scr_memsetw((unsigned short *)vc->vc_pos, vc->vc_video_erase_char, 2 * count);
 	if (con_should_update(vc))
 		vc->vc_sw->con_clear(vc, vc->vc_y, vc->vc_x, 1, count);
@@ -2188,7 +2383,7 @@ static void con_flush(struct vc_data *vc, unsigned long draw_from,
 /* acquires console_lock */
 static int do_con_write(struct tty_struct *tty, const unsigned char *buf, int count)
 {
-	int c, tc, ok, n = 0, draw_x = -1;
+	int c, next_c, tc, ok, n = 0, draw_x = -1;
 	unsigned int currcons;
 	unsigned long draw_from = 0, draw_to = 0;
 	struct vc_data *vc;
@@ -2382,6 +2577,7 @@ static int do_con_write(struct tty_struct *tty, const unsigned char *buf, int co
 				con_flush(vc, draw_from, draw_to, &draw_x);
 			}
 
+			next_c = c;
 			while (1) {
 				if (vc->vc_need_wrap || vc->vc_decim)
 					con_flush(vc, draw_from, draw_to,
@@ -2392,6 +2588,7 @@ static int do_con_write(struct tty_struct *tty, const unsigned char *buf, int co
 				}
 				if (vc->vc_decim)
 					insert_char(vc, 1);
+				vc_uniscr_putc(vc, next_c);
 				scr_writew(himask ?
 					     ((vc_attr << 8) & ~himask) + ((tc & 0x100) ? himask : 0) + (tc & 0xff) :
 					     (vc_attr << 8) + tc,
@@ -2412,6 +2609,7 @@ static int do_con_write(struct tty_struct *tty, const unsigned char *buf, int co
 
 				tc = conv_uni_to_pc(vc, ' '); /* A space is printed in the second column */
 				if (tc < 0) tc = ' ';
+				next_c = ' ';
 			}
 			notify_write(vc, c);
 
diff --git a/include/linux/console_struct.h b/include/linux/console_struct.h
index c0ec478ea5..2c8d323989 100644
--- a/include/linux/console_struct.h
+++ b/include/linux/console_struct.h
@@ -19,6 +19,7 @@
 
 struct vt_struct;
 struct uni_pagedir;
+struct uni_screen;
 
 #define NPAR 16
 
@@ -140,6 +141,7 @@ struct vc_data {
 	struct vc_data **vc_display_fg;		/* [!] Ptr to var holding fg console for this display */
 	struct uni_pagedir *vc_uni_pagedir;
 	struct uni_pagedir **vc_uni_pagedir_loc; /* [!] Location of uni_pagedir variable for this console */
+	struct uni_screen *vc_uni_screen;	/* unicode screen content */
 	bool vc_panic_force_write; /* when oops/panic this VC can accept forced output/blanking */
 	/* additional information is in vt_kern.h */
 };
-- 
2.17.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 2/3] vt: introduce unicode mode for /dev/vcs
  2018-06-27  3:56 [PATCH v3 0/3] have the vt console preserve unicode characters Nicolas Pitre
  2018-06-27  3:56 ` [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters Nicolas Pitre
@ 2018-06-27  3:56 ` Nicolas Pitre
  2018-06-29  8:15   ` Geert Uytterhoeven
  2018-06-27  3:56 ` [PATCH v3 3/3] vt: unicode fallback for scrollback Nicolas Pitre
  2018-06-28 12:38 ` [PATCH v3 0/3] have the vt console preserve unicode characters Greg Kroah-Hartman
  3 siblings, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2018-06-27  3:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
	linux-kernel, linux-console

Now that the core vt code knows how to preserve unicode values for each
displayed character, it is then possible to let user space access it via
/dev/vcs*.

Unicode characters are presented as 32 bit values in native endianity
via the /dev/vcsu* devices, mimicking the simple /dev/vcs* devices.
Unicode with attributes (similarly to /dev/vcsa*) is not supported at
the moment.

Data is available only as long as the console is in UTF-8 mode. ENODATA
is returned otherwise.

This was tested with the latest development version (to become
version 5.7) of BRLTTY. Amongst other things, this allows ⠋⠕⠗ ⠞⠓⠊⠎
⠃⠗⠁⠊⠇⠇⠑⠀⠞⠑⠭⠞⠀to appear directly on braille displays regardless of the
console font being used.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Dave Mielke <Dave@mielke.cc>
Acked-by: Adam Borowski <kilobyte@angband.pl>
---
 drivers/tty/vt/vc_screen.c | 89 ++++++++++++++++++++++++++++++++------
 drivers/tty/vt/vt.c        | 61 ++++++++++++++++++++++++++
 include/linux/selection.h  |  5 +++
 3 files changed, 141 insertions(+), 14 deletions(-)

diff --git a/drivers/tty/vt/vc_screen.c b/drivers/tty/vt/vc_screen.c
index e4a66e1fd0..9c44252e52 100644
--- a/drivers/tty/vt/vc_screen.c
+++ b/drivers/tty/vt/vc_screen.c
@@ -10,6 +10,12 @@
  *	Attribute/character pair is in native endianity.
  *            [minor: N+128]
  *
+ * /dev/vcsuN: similar to /dev/vcsaN but using 4-byte unicode values
+ *	instead of 1-byte screen glyph values.
+ *            [minor: N+64]
+ *
+ * /dev/vcsuaN: same idea as /dev/vcsaN for unicode (not yet implemented).
+ *
  * This replaces screendump and part of selection, so that the system
  * administrator can control access using file system permissions.
  *
@@ -51,6 +57,26 @@
 
 #define CON_BUF_SIZE (CONFIG_BASE_SMALL ? 256 : PAGE_SIZE)
 
+/*
+ * Our minor space:
+ *
+ *   0 ... 63	glyph mode without attributes
+ *  64 ... 127	unicode mode without attributes
+ * 128 ... 191	glyph mode with attributes
+ * 192 ... 255	unused (reserved for unicode with attributes)
+ *
+ * This relies on MAX_NR_CONSOLES being  <= 63, meaning 63 actual consoles
+ * with minors 0, 64, 128 and 192 being proxies for the foreground console.
+ */
+#if MAX_NR_CONSOLES > 63
+#warning "/dev/vcs* devices may not accommodate more than 63 consoles"
+#endif
+
+#define console(inode)		(iminor(inode) & 63)
+#define use_unicode(inode)	(iminor(inode) & 64)
+#define use_attributes(inode)	(iminor(inode) & 128)
+
+
 struct vcs_poll_data {
 	struct notifier_block notifier;
 	unsigned int cons_num;
@@ -102,7 +128,7 @@ vcs_poll_data_get(struct file *file)
 	poll = kzalloc(sizeof(*poll), GFP_KERNEL);
 	if (!poll)
 		return NULL;
-	poll->cons_num = iminor(file_inode(file)) & 127;
+	poll->cons_num = console(file_inode(file));
 	init_waitqueue_head(&poll->waitq);
 	poll->notifier.notifier_call = vcs_notifier;
 	if (register_vt_notifier(&poll->notifier) != 0) {
@@ -140,7 +166,7 @@ vcs_poll_data_get(struct file *file)
 static struct vc_data*
 vcs_vc(struct inode *inode, int *viewed)
 {
-	unsigned int currcons = iminor(inode) & 127;
+	unsigned int currcons = console(inode);
 
 	WARN_CONSOLE_UNLOCKED();
 
@@ -164,7 +190,6 @@ static int
 vcs_size(struct inode *inode)
 {
 	int size;
-	int minor = iminor(inode);
 	struct vc_data *vc;
 
 	WARN_CONSOLE_UNLOCKED();
@@ -175,8 +200,12 @@ vcs_size(struct inode *inode)
 
 	size = vc->vc_rows * vc->vc_cols;
 
-	if (minor & 128)
+	if (use_attributes(inode)) {
+		if (use_unicode(inode))
+			return -EOPNOTSUPP;
 		size = 2*size + HEADER_SIZE;
+	} else if (use_unicode(inode))
+		size *= 4;
 	return size;
 }
 
@@ -197,12 +226,10 @@ static ssize_t
 vcs_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 {
 	struct inode *inode = file_inode(file);
-	unsigned int currcons = iminor(inode);
 	struct vc_data *vc;
 	struct vcs_poll_data *poll;
-	long pos;
-	long attr, read;
-	int col, maxcol, viewed;
+	long pos, read;
+	int attr, uni_mode, row, col, maxcol, viewed;
 	unsigned short *org = NULL;
 	ssize_t ret;
 	char *con_buf;
@@ -218,7 +245,8 @@ vcs_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 	 */
 	console_lock();
 
-	attr = (currcons & 128);
+	uni_mode = use_unicode(inode);
+	attr = use_attributes(inode);
 	ret = -ENXIO;
 	vc = vcs_vc(inode, &viewed);
 	if (!vc)
@@ -227,6 +255,10 @@ vcs_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 	ret = -EINVAL;
 	if (pos < 0)
 		goto unlock_out;
+	/* we enforce 32-bit alignment for pos and count in unicode mode */
+	if (uni_mode && (pos | count) & 3)
+		goto unlock_out;
+
 	poll = file->private_data;
 	if (count && poll)
 		poll->seen_last_update = true;
@@ -266,7 +298,27 @@ vcs_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 		con_buf_start = con_buf0 = con_buf;
 		orig_count = this_round;
 		maxcol = vc->vc_cols;
-		if (!attr) {
+		if (uni_mode) {
+			unsigned int nr;
+
+			ret = vc_uniscr_check(vc);
+			if (ret)
+				break;
+			p /= 4;
+			row = p / vc->vc_cols;
+			col = p % maxcol;
+			nr = maxcol - col;
+			do {
+				if (nr > this_round/4)
+					nr = this_round/4;
+				vc_uniscr_copy_line(vc, con_buf0, row, col, nr);
+				con_buf0 += nr * 4;
+				this_round -= nr * 4;
+				row++;
+				col = 0;
+				nr = maxcol;
+			} while (this_round);
+		} else if (!attr) {
 			org = screen_pos(vc, p, viewed);
 			col = p % maxcol;
 			p += maxcol - col;
@@ -375,7 +427,6 @@ static ssize_t
 vcs_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
 {
 	struct inode *inode = file_inode(file);
-	unsigned int currcons = iminor(inode);
 	struct vc_data *vc;
 	long pos;
 	long attr, size, written;
@@ -396,7 +447,7 @@ vcs_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
 	 */
 	console_lock();
 
-	attr = (currcons & 128);
+	attr = use_attributes(inode);
 	ret = -ENXIO;
 	vc = vcs_vc(inode, &viewed);
 	if (!vc)
@@ -593,9 +644,15 @@ vcs_fasync(int fd, struct file *file, int on)
 static int
 vcs_open(struct inode *inode, struct file *filp)
 {
-	unsigned int currcons = iminor(inode) & 127;
+	unsigned int currcons = console(inode);
+	bool attr = use_attributes(inode);
+	bool uni_mode = use_unicode(inode);
 	int ret = 0;
-	
+
+	/* we currently don't support attributes in unicode mode */
+	if (attr && uni_mode)
+		return -EOPNOTSUPP;
+
 	console_lock();
 	if(currcons && !vc_cons_allocated(currcons-1))
 		ret = -ENXIO;
@@ -628,6 +685,8 @@ void vcs_make_sysfs(int index)
 {
 	device_create(vc_class, NULL, MKDEV(VCS_MAJOR, index + 1), NULL,
 		      "vcs%u", index + 1);
+	device_create(vc_class, NULL, MKDEV(VCS_MAJOR, index + 65), NULL,
+		      "vcsu%u", index + 1);
 	device_create(vc_class, NULL, MKDEV(VCS_MAJOR, index + 129), NULL,
 		      "vcsa%u", index + 1);
 }
@@ -635,6 +694,7 @@ void vcs_make_sysfs(int index)
 void vcs_remove_sysfs(int index)
 {
 	device_destroy(vc_class, MKDEV(VCS_MAJOR, index + 1));
+	device_destroy(vc_class, MKDEV(VCS_MAJOR, index + 65));
 	device_destroy(vc_class, MKDEV(VCS_MAJOR, index + 129));
 }
 
@@ -647,6 +707,7 @@ int __init vcs_init(void)
 	vc_class = class_create(THIS_MODULE, "vc");
 
 	device_create(vc_class, NULL, MKDEV(VCS_MAJOR, 0), NULL, "vcs");
+	device_create(vc_class, NULL, MKDEV(VCS_MAJOR, 64), NULL, "vcsu");
 	device_create(vc_class, NULL, MKDEV(VCS_MAJOR, 128), NULL, "vcsa");
 	for (i = 0; i < MIN_NR_CONSOLES; i++)
 		vcs_make_sysfs(i);
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 7b636638b3..062ce6be79 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -481,6 +481,67 @@ static void vc_uniscr_copy_area(struct uni_screen *dst,
 	}
 }
 
+/*
+ * Called from vcs_read() to make sure unicode screen retrieval is possible.
+ * This will initialize the unicode screen buffer if not already done.
+ * This returns 0 if OK, or a negative error code otherwise.
+ * In particular, -ENODATA is returned if the console is not in UTF-8 mode.
+ */
+int vc_uniscr_check(struct vc_data *vc)
+{
+	struct uni_screen *uniscr;
+	unsigned short *p;
+	int x, y, mask;
+
+	if (__is_defined(NO_VC_UNI_SCREEN))
+		return -EOPNOTSUPP;
+
+	WARN_CONSOLE_UNLOCKED();
+
+	if (!vc->vc_utf)
+		return -ENODATA;
+
+	if (vc->vc_uni_screen)
+		return 0;
+
+	uniscr = vc_uniscr_alloc(vc->vc_cols, vc->vc_rows);
+	if (!uniscr)
+		return -ENOMEM;
+
+	/*
+	 * Let's populate it initially with (imperfect) reverse translation.
+	 * This is the next best thing we can do short of having it enabled
+	 * from the start even when no users rely on this functionality. True
+	 * unicode content will be available after a complete screen refresh.
+	 */
+	p = (unsigned short *)vc->vc_origin;
+	mask = vc->vc_hi_font_mask | 0xff;
+	for (y = 0; y < vc->vc_rows; y++) {
+		char32_t *line = uniscr->lines[y];
+		for (x = 0; x < vc->vc_cols; x++) {
+			u16 glyph = scr_readw(p++) & mask;
+			line[x] = inverse_translate(vc, glyph, true);
+		}
+	}
+
+	vc->vc_uni_screen = uniscr;
+	return 0;
+}
+
+/*
+ * Called from vcs_read() to get the unicode data from the screen.
+ * This must be preceded by a successful call to vc_uniscr_check() once
+ * the console lock has been taken.
+ */
+void vc_uniscr_copy_line(struct vc_data *vc, void *dest,
+			 unsigned int row, unsigned int col, unsigned int nr)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	BUG_ON(!uniscr);
+	memcpy(dest, &uniscr->lines[row][col], nr * sizeof(char32_t));
+}
+
 
 static void con_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
 		enum con_scroll dir, unsigned int nr)
diff --git a/include/linux/selection.h b/include/linux/selection.h
index 5b278ce99d..2b34df9f1e 100644
--- a/include/linux/selection.h
+++ b/include/linux/selection.h
@@ -42,4 +42,9 @@ extern u16 vcs_scr_readw(struct vc_data *vc, const u16 *org);
 extern void vcs_scr_writew(struct vc_data *vc, u16 val, u16 *org);
 extern void vcs_scr_updated(struct vc_data *vc);
 
+extern int vc_uniscr_check(struct vc_data *vc);
+extern void vc_uniscr_copy_line(struct vc_data *vc, void *dest,
+				unsigned int row, unsigned int col,
+				unsigned int nr);
+
 #endif
-- 
2.17.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 3/3] vt: unicode fallback for scrollback
  2018-06-27  3:56 [PATCH v3 0/3] have the vt console preserve unicode characters Nicolas Pitre
  2018-06-27  3:56 ` [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters Nicolas Pitre
  2018-06-27  3:56 ` [PATCH v3 2/3] vt: introduce unicode mode for /dev/vcs Nicolas Pitre
@ 2018-06-27  3:56 ` Nicolas Pitre
  2018-06-28 12:38 ` [PATCH v3 0/3] have the vt console preserve unicode characters Greg Kroah-Hartman
  3 siblings, 0 replies; 13+ messages in thread
From: Nicolas Pitre @ 2018-06-27  3:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
	linux-kernel, linux-console

There is currently no provision for scrollback content in the core code,
leaving that to backend video drivers where this can be highly optimized.
There is currently no common method for those drivers to tell the core
what part of the scrollback is actually displayed and what size the
scrollback buffer is either. Because of that, the unicode screen buffer
has no provision for any scrollback.

At least we can provide backtranslated glyph values when the scrollback
is active which should be plenty good enough for now.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Dave Mielke <Dave@mielke.cc>
Acked-by: Adam Borowski <kilobyte@angband.pl>
---
 drivers/tty/vt/vc_screen.c |  3 ++-
 drivers/tty/vt/vt.c        | 31 +++++++++++++++++++++++++++++--
 include/linux/selection.h  |  2 +-
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/vt/vc_screen.c b/drivers/tty/vt/vc_screen.c
index 9c44252e52..2384ea85ff 100644
--- a/drivers/tty/vt/vc_screen.c
+++ b/drivers/tty/vt/vc_screen.c
@@ -311,7 +311,8 @@ vcs_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 			do {
 				if (nr > this_round/4)
 					nr = this_round/4;
-				vc_uniscr_copy_line(vc, con_buf0, row, col, nr);
+				vc_uniscr_copy_line(vc, con_buf0, viewed,
+						    row, col, nr);
 				con_buf0 += nr * 4;
 				this_round -= nr * 4;
 				row++;
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 062ce6be79..2d14bb195d 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -533,13 +533,40 @@ int vc_uniscr_check(struct vc_data *vc)
  * This must be preceded by a successful call to vc_uniscr_check() once
  * the console lock has been taken.
  */
-void vc_uniscr_copy_line(struct vc_data *vc, void *dest,
+void vc_uniscr_copy_line(struct vc_data *vc, void *dest, int viewed,
 			 unsigned int row, unsigned int col, unsigned int nr)
 {
 	struct uni_screen *uniscr = get_vc_uniscr(vc);
+	int offset = row * vc->vc_size_row + col * 2;
+	unsigned long pos;
 
 	BUG_ON(!uniscr);
-	memcpy(dest, &uniscr->lines[row][col], nr * sizeof(char32_t));
+
+	pos = (unsigned long)screenpos(vc, offset, viewed);
+	if (pos >= vc->vc_origin && pos < vc->vc_scr_end) {
+		/*
+		 * Desired position falls in the main screen buffer.
+		 * However the actual row/col might be different if
+		 * scrollback is active.
+		 */
+		row = (pos - vc->vc_origin) / vc->vc_size_row;
+		col = ((pos - vc->vc_origin) % vc->vc_size_row) / 2;
+		memcpy(dest, &uniscr->lines[row][col], nr * sizeof(char32_t));
+	} else {
+		/*
+		 * Scrollback is active. For now let's simply backtranslate
+		 * the screen glyphs until the unicode screen buffer does
+		 * synchronize with console display drivers for a scrollback
+		 * buffer of its own.
+		 */
+		u16 *p = (u16 *)pos;
+		int mask = vc->vc_hi_font_mask | 0xff;
+		char32_t *uni_buf = dest;
+		while (nr--) {
+			u16 glyph = scr_readw(p++) & mask;
+			*uni_buf++ = inverse_translate(vc, glyph, true);
+		}
+	}
 }
 
 
diff --git a/include/linux/selection.h b/include/linux/selection.h
index 2b34df9f1e..067d2e99c7 100644
--- a/include/linux/selection.h
+++ b/include/linux/selection.h
@@ -43,7 +43,7 @@ extern void vcs_scr_writew(struct vc_data *vc, u16 val, u16 *org);
 extern void vcs_scr_updated(struct vc_data *vc);
 
 extern int vc_uniscr_check(struct vc_data *vc);
-extern void vc_uniscr_copy_line(struct vc_data *vc, void *dest,
+extern void vc_uniscr_copy_line(struct vc_data *vc, void *dest, int viewed,
 				unsigned int row, unsigned int col,
 				unsigned int nr);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/3] have the vt console preserve unicode characters
  2018-06-27  3:56 [PATCH v3 0/3] have the vt console preserve unicode characters Nicolas Pitre
                   ` (2 preceding siblings ...)
  2018-06-27  3:56 ` [PATCH v3 3/3] vt: unicode fallback for scrollback Nicolas Pitre
@ 2018-06-28 12:38 ` Greg Kroah-Hartman
  2018-07-18  1:00   ` Nicolas Pitre
  3 siblings, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2018-06-28 12:38 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
	linux-kernel, linux-console

On Tue, Jun 26, 2018 at 11:56:39PM -0400, Nicolas Pitre wrote:
> The vt code translates UTF-8 strings into glyph index values and stores 
> those glyph values in the screen buffer. Because there can only be at 
> most 512 glyphs at the moment, it is impossible to represent most 
> unicode characters, in which case a default glyph (often '?') is 
> displayed instead. The original unicode value is then lost.
> 
> The 512-glyph limitation is inherent to text-mode VGA displays after 
> which the core console code was modelled. This also means that the 
> /dev/vcs* devices only provide user space with glyph index values, and 
> then user applications must get hold of the unicode-to-glyph table the 
> kernel is using in order to back-translate those into actual characters. 
> It is not possible to get back the original unicode value when multiple 
> unicode characters map to the same glyph, especially for the vast 
> majority that maps to the default replacement glyph.
> 
> Users of /dev/vcs* shouldn't have to be restricted to a narrow unicode 
> space from lossy screen content because of that. This is especially true 
> for accessibility applications such as BRLTTY that rely on /dev/vcs to 
> render screen content onto braille terminals.
> 
> It was also argued that the VGA-centric glyph buffer should eventually 
> go entirely. The current design made sense when hardware was slow and 
> managing the screen directly into the VGA memory made a difference (i.e. 
> 25 years ago). Modern console display drivers no longer have to be 
>    limited to 512 glyphs.
> Quoting Alan Cox:
> 
> |The only driver that it suits is the VGA text mode driver, which at 
> |2GHz+ is going to be fast enough whatever format you convert from. We 
> |have the memory, the processor power and the fact almost all our 
> |displays are bitmapped (or more complex still) all in favour of 
> |throwing away that limit.
> 
> This patch series introduces unicode screen support to the core console 
> code with /dev/vcs* as a first user. Memory is allocated, and possible 
> CPU overhead introduced, only if /dev/vcsu is read at least once. For 
> now both the glyph and unicode buffers are maintained in parallel to 
> allow for a smooth transition.
> 
> I'm a prime user of this new /dev/vcsu interface, as well as the BRLTTY 
> maintainer Dave Mielke who implemented support for this in BRLTTY. There 
> is therefore a vested interest in maintaining this feature as necessary. 
> And this received extensive testing as well at this point.
> 
> This is also available on top of v4.18-rc2 here:
> 
>   git://git.linaro.org/people/nicolas.pitre/linux vt-unicode
> 
> Changes from v2:
> 
> - Dropped patch #4 as it was useful only for initial debugging and it 
>   attracted all the review comments so far -- actually more than the 
>   patch is worth.

If you want this "feature" back, I'll be glad to take it, as odds are it
will help when any future person wants to test any changes in the code.

So feel free to resend it, I have no objection to it as-is.

And I've queued the other 3 up now, nice job.

greg k-h

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/3] vt: introduce unicode mode for /dev/vcs
  2018-06-27  3:56 ` [PATCH v3 2/3] vt: introduce unicode mode for /dev/vcs Nicolas Pitre
@ 2018-06-29  8:15   ` Geert Uytterhoeven
  0 siblings, 0 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2018-06-29  8:15 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Greg KH, Dave, Samuel Thibault, Adam Borowski,
	One Thousand Gnomes, Linux Kernel Mailing List, linux-console

Hi Nicolas,

On Wed, Jun 27, 2018 at 7:03 AM Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> Now that the core vt code knows how to preserve unicode values for each
> displayed character, it is then possible to let user space access it via
> /dev/vcs*.
>
> Unicode characters are presented as 32 bit values in native endianity
> via the /dev/vcsu* devices, mimicking the simple /dev/vcs* devices.
> Unicode with attributes (similarly to /dev/vcsa*) is not supported at
> the moment.
>
> Data is available only as long as the console is in UTF-8 mode. ENODATA
> is returned otherwise.
>
> This was tested with the latest development version (to become
> version 5.7) of BRLTTY. Amongst other things, this allows ⠋⠕⠗ ⠞⠓⠊⠎
> ⠃⠗⠁⠊⠇⠇⠑⠀⠞⠑⠭⠞⠀to appear directly on braille displays regardless of the
> console font being used.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>

Thanks for your patch (which is now in tty-next).

> --- a/drivers/tty/vt/vc_screen.c
> +++ b/drivers/tty/vt/vc_screen.c

> @@ -51,6 +57,26 @@
>
>  #define CON_BUF_SIZE (CONFIG_BASE_SMALL ? 256 : PAGE_SIZE)
>
> +/*
> + * Our minor space:
> + *
> + *   0 ... 63  glyph mode without attributes
> + *  64 ... 127 unicode mode without attributes
> + * 128 ... 191 glyph mode with attributes
> + * 192 ... 255 unused (reserved for unicode with attributes)
> + *
> + * This relies on MAX_NR_CONSOLES being  <= 63, meaning 63 actual consoles
> + * with minors 0, 64, 128 and 192 being proxies for the foreground console.
> + */
> +#if MAX_NR_CONSOLES > 63
> +#warning "/dev/vcs* devices may not accommodate more than 63 consoles"
> +#endif

MAX_NR_CONSOLES is part of UAPI, so it cannot change.
It doesn't hurt to have an explicit check here, though.

However, looking into that I noticed include/uapi/linux/vt.h says:

    #define MAX_NR_CONSOLES 63      /* serial lines start at 64 */

But that seems to apply to /dev/tty* (major 4), not /dev/vcs* (major 7),
so you're safe.

> +
> +#define console(inode)         (iminor(inode) & 63)
> +#define use_unicode(inode)     (iminor(inode) & 64)
> +#define use_attributes(inode)  (iminor(inode) & 128)

I guess you need to update Documentation/admin-guide/devices.txt, and
add /dev/vcsu*?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters
  2018-06-27  3:56 ` [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters Nicolas Pitre
@ 2018-07-11  0:52   ` Kees Cook
  2018-07-11  3:39     ` Nicolas Pitre
  2018-07-11  9:18     ` Greg Kroah-Hartman
  0 siblings, 2 replies; 13+ messages in thread
From: Kees Cook @ 2018-07-11  0:52 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Greg Kroah-Hartman, Dave Mielke, Samuel Thibault, Adam Borowski,
	Alan Cox, LKML, linux-console

On Tue, Jun 26, 2018 at 8:56 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> The vt code translates UTF-8 strings into glyph index values and stores
> those glyph values directly in the screen buffer. Because there can only
> be at most 512 glyphs, it is impossible to represent most unicode
> characters, in which case a default glyph (often '?') is displayed
> instead. The original unicode value is then lost.
>
> This patch implements the basic screen buffer handling to preserve unicode
> values alongside corresponding display glyphs.  It is not activated by
> default, meaning that people not relying on that functionality won't get
> the implied overhead.
>
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> Tested-by: Dave Mielke <Dave@mielke.cc>
> Acked-by: Adam Borowski <kilobyte@angband.pl>
> ---
>  drivers/tty/vt/vt.c            | 220 +++++++++++++++++++++++++++++++--
>  include/linux/console_struct.h |   2 +
>  2 files changed, 211 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index 1eb1a376a0..7b636638b3 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> [...]
> +static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
> +                            enum con_scroll dir, unsigned int nr)
> +{
> +       struct uni_screen *uniscr = get_vc_uniscr(vc);
> +
> +       if (uniscr) {
> +               unsigned int s, d, rescue, clear;
> +               char32_t *save[nr];

Can you adjust this to avoid the VLA here? I've almost gotten all VLAs
removed from the kernel[1], and this is introducing a new one. :)

Thanks!

-Kees

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

> +
> +               s = clear = t;
> +               d = t + nr;
> +               rescue = b - nr;
> +               if (dir == SM_UP) {
> +                       swap(s, d);
> +                       swap(clear, rescue);
> +               }
> +               memcpy(save, uniscr->lines + rescue, nr * sizeof(*save));
> +               memmove(uniscr->lines + d, uniscr->lines + s,
> +                       (b - t - nr) * sizeof(*uniscr->lines));
> +               memcpy(uniscr->lines + clear, save, nr * sizeof(*save));
> +               vc_uniscr_clear_lines(vc, clear, nr);
> +       }
> +}


-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters
  2018-07-11  0:52   ` Kees Cook
@ 2018-07-11  3:39     ` Nicolas Pitre
  2018-07-11 20:40       ` Kees Cook
  2018-07-11  9:18     ` Greg Kroah-Hartman
  1 sibling, 1 reply; 13+ messages in thread
From: Nicolas Pitre @ 2018-07-11  3:39 UTC (permalink / raw)
  To: Kees Cook
  Cc: Nicolas Pitre, Greg Kroah-Hartman, Dave Mielke, Samuel Thibault,
	Adam Borowski, Alan Cox, LKML, linux-console

I am on vacation away from an actual keyboard until next week. Will look at it then. 

> Le 10 juill. 2018 à 20:52, Kees Cook <keescook@chromium.org> a écrit :
> 
>> On Tue, Jun 26, 2018 at 8:56 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
>> The vt code translates UTF-8 strings into glyph index values and stores
>> those glyph values directly in the screen buffer. Because there can only
>> be at most 512 glyphs, it is impossible to represent most unicode
>> characters, in which case a default glyph (often '?') is displayed
>> instead. The original unicode value is then lost.
>> 
>> This patch implements the basic screen buffer handling to preserve unicode
>> values alongside corresponding display glyphs.  It is not activated by
>> default, meaning that people not relying on that functionality won't get
>> the implied overhead.
>> 
>> Signed-off-by: Nicolas Pitre <nico@linaro.org>
>> Tested-by: Dave Mielke <Dave@mielke.cc>
>> Acked-by: Adam Borowski <kilobyte@angband.pl>
>> ---
>> drivers/tty/vt/vt.c            | 220 +++++++++++++++++++++++++++++++--
>> include/linux/console_struct.h |   2 +
>> 2 files changed, 211 insertions(+), 11 deletions(-)
>> 
>> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
>> index 1eb1a376a0..7b636638b3 100644
>> --- a/drivers/tty/vt/vt.c
>> +++ b/drivers/tty/vt/vt.c
>> [...]
>> +static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
>> +                            enum con_scroll dir, unsigned int nr)
>> +{
>> +       struct uni_screen *uniscr = get_vc_uniscr(vc);
>> +
>> +       if (uniscr) {
>> +               unsigned int s, d, rescue, clear;
>> +               char32_t *save[nr];
> 
> Can you adjust this to avoid the VLA here? I've almost gotten all VLAs
> removed from the kernel[1], and this is introducing a new one. :)
> 
> Thanks!
> 
> -Kees
> 
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
> 
>> +
>> +               s = clear = t;
>> +               d = t + nr;
>> +               rescue = b - nr;
>> +               if (dir == SM_UP) {
>> +                       swap(s, d);
>> +                       swap(clear, rescue);
>> +               }
>> +               memcpy(save, uniscr->lines + rescue, nr * sizeof(*save));
>> +               memmove(uniscr->lines + d, uniscr->lines + s,
>> +                       (b - t - nr) * sizeof(*uniscr->lines));
>> +               memcpy(uniscr->lines + clear, save, nr * sizeof(*save));
>> +               vc_uniscr_clear_lines(vc, clear, nr);
>> +       }
>> +}
> 
> 
> -- 
> Kees Cook
> Pixel Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters
  2018-07-11  0:52   ` Kees Cook
  2018-07-11  3:39     ` Nicolas Pitre
@ 2018-07-11  9:18     ` Greg Kroah-Hartman
  2018-07-11 20:39       ` Kees Cook
  1 sibling, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-11  9:18 UTC (permalink / raw)
  To: Kees Cook
  Cc: Nicolas Pitre, Dave Mielke, Samuel Thibault, Adam Borowski,
	Alan Cox, LKML, linux-console

On Tue, Jul 10, 2018 at 05:52:01PM -0700, Kees Cook wrote:
> On Tue, Jun 26, 2018 at 8:56 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> > The vt code translates UTF-8 strings into glyph index values and stores
> > those glyph values directly in the screen buffer. Because there can only
> > be at most 512 glyphs, it is impossible to represent most unicode
> > characters, in which case a default glyph (often '?') is displayed
> > instead. The original unicode value is then lost.
> >
> > This patch implements the basic screen buffer handling to preserve unicode
> > values alongside corresponding display glyphs.  It is not activated by
> > default, meaning that people not relying on that functionality won't get
> > the implied overhead.
> >
> > Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > Tested-by: Dave Mielke <Dave@mielke.cc>
> > Acked-by: Adam Borowski <kilobyte@angband.pl>
> > ---
> >  drivers/tty/vt/vt.c            | 220 +++++++++++++++++++++++++++++++--
> >  include/linux/console_struct.h |   2 +
> >  2 files changed, 211 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> > index 1eb1a376a0..7b636638b3 100644
> > --- a/drivers/tty/vt/vt.c
> > +++ b/drivers/tty/vt/vt.c
> > [...]
> > +static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
> > +                            enum con_scroll dir, unsigned int nr)
> > +{
> > +       struct uni_screen *uniscr = get_vc_uniscr(vc);
> > +
> > +       if (uniscr) {
> > +               unsigned int s, d, rescue, clear;
> > +               char32_t *save[nr];
> 
> Can you adjust this to avoid the VLA here? I've almost gotten all VLAs
> removed from the kernel[1], and this is introducing a new one. :)

This is already in my tree, sorry :(

greg k-h

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters
  2018-07-11  9:18     ` Greg Kroah-Hartman
@ 2018-07-11 20:39       ` Kees Cook
  2018-07-11 21:29         ` Adam Borowski
  0 siblings, 1 reply; 13+ messages in thread
From: Kees Cook @ 2018-07-11 20:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Nicolas Pitre, Dave Mielke, Samuel Thibault, Adam Borowski,
	Alan Cox, LKML, linux-console

On Wed, Jul 11, 2018 at 2:18 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Tue, Jul 10, 2018 at 05:52:01PM -0700, Kees Cook wrote:
>> On Tue, Jun 26, 2018 at 8:56 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
>> > The vt code translates UTF-8 strings into glyph index values and stores
>> > those glyph values directly in the screen buffer. Because there can only
>> > be at most 512 glyphs, it is impossible to represent most unicode
>> > characters, in which case a default glyph (often '?') is displayed
>> > instead. The original unicode value is then lost.
>> >
>> > This patch implements the basic screen buffer handling to preserve unicode
>> > values alongside corresponding display glyphs.  It is not activated by
>> > default, meaning that people not relying on that functionality won't get
>> > the implied overhead.
>> >
>> > Signed-off-by: Nicolas Pitre <nico@linaro.org>
>> > Tested-by: Dave Mielke <Dave@mielke.cc>
>> > Acked-by: Adam Borowski <kilobyte@angband.pl>
>> > ---
>> >  drivers/tty/vt/vt.c            | 220 +++++++++++++++++++++++++++++++--
>> >  include/linux/console_struct.h |   2 +
>> >  2 files changed, 211 insertions(+), 11 deletions(-)
>> >
>> > diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
>> > index 1eb1a376a0..7b636638b3 100644
>> > --- a/drivers/tty/vt/vt.c
>> > +++ b/drivers/tty/vt/vt.c
>> > [...]
>> > +static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
>> > +                            enum con_scroll dir, unsigned int nr)
>> > +{
>> > +       struct uni_screen *uniscr = get_vc_uniscr(vc);
>> > +
>> > +       if (uniscr) {
>> > +               unsigned int s, d, rescue, clear;
>> > +               char32_t *save[nr];
>>
>> Can you adjust this to avoid the VLA here? I've almost gotten all VLAs
>> removed from the kernel[1], and this is introducing a new one. :)
>
> This is already in my tree, sorry :(

Yup, that's fine. (It's how I noticed it: linux-next VLA build tests.)
I'm just hoping it can get solved before the merge window opens. :)

There are still a bunch of VLAs I'm chipping away at, but this was a
newly added one, so I was hoping Nicolas (when he's back from
vacation) will have ideas on how to best avoid it.

Thanks!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters
  2018-07-11  3:39     ` Nicolas Pitre
@ 2018-07-11 20:40       ` Kees Cook
  0 siblings, 0 replies; 13+ messages in thread
From: Kees Cook @ 2018-07-11 20:40 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Greg Kroah-Hartman, Dave Mielke, Samuel Thibault, Adam Borowski,
	Alan Cox, LKML, linux-console

On Tue, Jul 10, 2018 at 8:39 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> I am on vacation away from an actual keyboard until next week. Will look at it then.

Awesome; thanks!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters
  2018-07-11 20:39       ` Kees Cook
@ 2018-07-11 21:29         ` Adam Borowski
  0 siblings, 0 replies; 13+ messages in thread
From: Adam Borowski @ 2018-07-11 21:29 UTC (permalink / raw)
  To: Kees Cook
  Cc: Greg Kroah-Hartman, Nicolas Pitre, Dave Mielke, Samuel Thibault,
	Alan Cox, LKML, linux-console

On Wed, Jul 11, 2018 at 01:39:56PM -0700, Kees Cook wrote:
> On Wed, Jul 11, 2018 at 2:18 AM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > On Tue, Jul 10, 2018 at 05:52:01PM -0700, Kees Cook wrote:
> >> On Tue, Jun 26, 2018 at 8:56 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> >> > +++ b/drivers/tty/vt/vt.c
> >> > [...]
> >> > +static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
> >> > +                            enum con_scroll dir, unsigned int nr)
> >> > +{
> >> > +       struct uni_screen *uniscr = get_vc_uniscr(vc);
> >> > +
> >> > +       if (uniscr) {
> >> > +               unsigned int s, d, rescue, clear;
> >> > +               char32_t *save[nr];
> >>
> >> Can you adjust this to avoid the VLA here? I've almost gotten all VLAs
> >> removed from the kernel[1], and this is introducing a new one. :)
> >
> Yup, that's fine. (It's how I noticed it: linux-next VLA build tests.)
> I'm just hoping it can get solved before the merge window opens. :)

This one is actually nasty: max console height is 32767, if you resize it
that big then issue a large scroll request, boom it goes.

> There are still a bunch of VLAs I'm chipping away at, but this was a
> newly added one, so I was hoping Nicolas (when he's back from
> vacation) will have ideas on how to best avoid it.

Nicolas: what about just moving line pointers one at a time?  Rotating an
array slice in-place isn't that slow -- and optimizing that much when
reasonable sizes don't exceed 100ish (depending on how good your eyes are)
is quite ridiculous.  Thanks to your change, we don't need to move actual
contents, just line pointers -- that's fast enough.


Meow!
-- 
// If you believe in so-called "intellectual property", please immediately
// cease using counterfeit alphabets.  Instead, contact the nearest temple
// of Amon, whose priests will provide you with scribal services for all
// your writing needs, for Reasonable And Non-Discriminatory prices.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/3] have the vt console preserve unicode characters
  2018-06-28 12:38 ` [PATCH v3 0/3] have the vt console preserve unicode characters Greg Kroah-Hartman
@ 2018-07-18  1:00   ` Nicolas Pitre
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Pitre @ 2018-07-18  1:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
	linux-kernel, linux-console

On Thu, 28 Jun 2018, Greg Kroah-Hartman wrote:

> On Tue, Jun 26, 2018 at 11:56:39PM -0400, Nicolas Pitre wrote:
> > The vt code translates UTF-8 strings into glyph index values and stores 
> > those glyph values in the screen buffer. Because there can only be at 
> > most 512 glyphs at the moment, it is impossible to represent most 
> > unicode characters, in which case a default glyph (often '?') is 
> > displayed instead. The original unicode value is then lost.
> > 
> > The 512-glyph limitation is inherent to text-mode VGA displays after 
> > which the core console code was modelled. This also means that the 
> > /dev/vcs* devices only provide user space with glyph index values, and 
> > then user applications must get hold of the unicode-to-glyph table the 
> > kernel is using in order to back-translate those into actual characters. 
> > It is not possible to get back the original unicode value when multiple 
> > unicode characters map to the same glyph, especially for the vast 
> > majority that maps to the default replacement glyph.
> > 
> > Users of /dev/vcs* shouldn't have to be restricted to a narrow unicode 
> > space from lossy screen content because of that. This is especially true 
> > for accessibility applications such as BRLTTY that rely on /dev/vcs to 
> > render screen content onto braille terminals.
> > 
> > It was also argued that the VGA-centric glyph buffer should eventually 
> > go entirely. The current design made sense when hardware was slow and 
> > managing the screen directly into the VGA memory made a difference (i.e. 
> > 25 years ago). Modern console display drivers no longer have to be 
> >    limited to 512 glyphs.
> > Quoting Alan Cox:
> > 
> > |The only driver that it suits is the VGA text mode driver, which at 
> > |2GHz+ is going to be fast enough whatever format you convert from. We 
> > |have the memory, the processor power and the fact almost all our 
> > |displays are bitmapped (or more complex still) all in favour of 
> > |throwing away that limit.
> > 
> > This patch series introduces unicode screen support to the core console 
> > code with /dev/vcs* as a first user. Memory is allocated, and possible 
> > CPU overhead introduced, only if /dev/vcsu is read at least once. For 
> > now both the glyph and unicode buffers are maintained in parallel to 
> > allow for a smooth transition.
> > 
> > I'm a prime user of this new /dev/vcsu interface, as well as the BRLTTY 
> > maintainer Dave Mielke who implemented support for this in BRLTTY. There 
> > is therefore a vested interest in maintaining this feature as necessary. 
> > And this received extensive testing as well at this point.
> > 
> > This is also available on top of v4.18-rc2 here:
> > 
> >   git://git.linaro.org/people/nicolas.pitre/linux vt-unicode
> > 
> > Changes from v2:
> > 
> > - Dropped patch #4 as it was useful only for initial debugging and it 
> >   attracted all the review comments so far -- actually more than the 
> >   patch is worth.
> 
> If you want this "feature" back, I'll be glad to take it, as odds are it
> will help when any future person wants to test any changes in the code.
> 
> So feel free to resend it, I have no objection to it as-is.
> 
> And I've queued the other 3 up now, nice job.

Thanks!

I'm about to send 3 more patches to put on top of what you already have: 
patch #1 is that debugging code (still disabled by default), patch #2 
removes the VLA, and patch #3 updates devices.txt.


Nicolas

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, back to index

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-27  3:56 [PATCH v3 0/3] have the vt console preserve unicode characters Nicolas Pitre
2018-06-27  3:56 ` [PATCH v3 1/3] vt: preserve unicode values corresponding to screen characters Nicolas Pitre
2018-07-11  0:52   ` Kees Cook
2018-07-11  3:39     ` Nicolas Pitre
2018-07-11 20:40       ` Kees Cook
2018-07-11  9:18     ` Greg Kroah-Hartman
2018-07-11 20:39       ` Kees Cook
2018-07-11 21:29         ` Adam Borowski
2018-06-27  3:56 ` [PATCH v3 2/3] vt: introduce unicode mode for /dev/vcs Nicolas Pitre
2018-06-29  8:15   ` Geert Uytterhoeven
2018-06-27  3:56 ` [PATCH v3 3/3] vt: unicode fallback for scrollback Nicolas Pitre
2018-06-28 12:38 ` [PATCH v3 0/3] have the vt console preserve unicode characters Greg Kroah-Hartman
2018-07-18  1:00   ` Nicolas Pitre

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox