* [PATCH 0/3] use unicode for vt mouse paste
@ 2018-07-18  2:06 Adam Borowski
  2018-07-18  2:10 ` [PATCH 1/3] vt: don't reinvent min() Adam Borowski
  2018-07-18  3:10 ` [PATCH 0/3] use unicode for vt mouse paste Nicolas Pitre
  0 siblings, 2 replies; 5+ messages in thread
From: Adam Borowski @ 2018-07-18  2:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Nicolas Pitre, linux-kernel,
	linux-console

Hi!
Based on Nicolas' nice work (in tty-next), let's avoid corrupting characters
that have been copy+pasted via mouse selection.  The uniscr array holds
their original identity even if they got mangled by glyph conversion.
Glyph conversion is lossy: it collapses similar-looking characters into a
single shared representation, and turns everything else into a replacement
character.

There's no proper handling for CJK (yet?) but anything of wcwidth()==1 will
work fine.

None of this takes effect until something reads from /dev/vcsuN for that
console, but let's test this code first before enabling it more widely.
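
To illustrate the lossy glyph conversion described above, here is a minimal
userspace sketch. Everything in it (the toy_* names, the four-entry font
table, the glyph numbers) is hypothetical and only stands in for the real
console maps; it is not the kernel's inverse_translate() or uniscr code.
It shows why reading back through the glyph loses the original character,
while a per-cell copy of the code point does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_UNKNOWN_GLYPH 0u      /* hypothetical slot for "no such glyph" */
#define TOY_REPLACEMENT   0xFFFDu /* U+FFFD REPLACEMENT CHARACTER */

/* Hypothetical font map: several code points share one glyph. */
struct toy_map {
	uint32_t ucs;
	uint16_t glyph;
};

static const struct toy_map toy_font[] = {
	{ 0x0041, 1 },	/* LATIN CAPITAL LETTER A */
	{ 0x0391, 1 },	/* GREEK CAPITAL LETTER ALPHA, same glyph */
	{ 0x0410, 1 },	/* CYRILLIC CAPITAL LETTER A, same glyph */
	{ 0x0042, 2 },	/* LATIN CAPITAL LETTER B */
};
#define TOY_FONT_LEN (sizeof(toy_font) / sizeof(toy_font[0]))

/* Forward direction: code point to glyph; unknown input loses its identity. */
static uint16_t toy_to_glyph(uint32_t ucs)
{
	for (size_t i = 0; i < TOY_FONT_LEN; i++)
		if (toy_font[i].ucs == ucs)
			return toy_font[i].glyph;
	return TOY_UNKNOWN_GLYPH;
}

/* Reverse direction: the first code point that maps to the glyph wins. */
static uint32_t toy_from_glyph(uint16_t glyph)
{
	for (size_t i = 0; i < TOY_FONT_LEN; i++)
		if (toy_font[i].glyph == glyph)
			return toy_font[i].ucs;
	return TOY_REPLACEMENT;
}

/* One screen cell: the glyph, plus the code point kept alongside it. */
struct toy_cell {
	uint16_t glyph;
	uint32_t ucs;
};

static struct toy_cell toy_write(uint32_t ucs)
{
	struct toy_cell cell = { toy_to_glyph(ucs), ucs };
	return cell;
}

/* Small demonstration of the two read-back paths. */
void toy_demo(void)
{
	struct toy_cell alpha = toy_write(0x0391);

	/* Going through the glyph turns Greek Alpha into Latin A. */
	assert(toy_from_glyph(alpha.glyph) == 0x0041);
	/* The stored code point is still the one that was written. */
	assert(alpha.ucs == 0x0391);
}
```

In this toy model, pasting from the glyph path yields U+0041 for a cell that
was written as U+0391, and U+FFFD for anything the font does not cover.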


Diffstat:
 drivers/tty/vt/selection.c | 48 +++++++++++++++++++++++++++++-------------------
 drivers/tty/vt/vt.c        | 10 ++++++++++
 include/linux/selection.h  |  1 +
 3 files changed, 40 insertions(+), 19 deletions(-)


-- 
// If you believe in so-called "intellectual property", please immediately
// cease using counterfeit alphabets.  Instead, contact the nearest temple
// of Amon, whose priests will provide you with scribal services for all
// your writing needs, for Reasonable And Non-Discriminatory prices.


* [PATCH 1/3] vt: don't reinvent min()
  2018-07-18  2:06 [PATCH 0/3] use unicode for vt mouse paste Adam Borowski
@ 2018-07-18  2:10 ` Adam Borowski
  2018-07-18  2:10   ` [PATCH 2/3] vt: selection: handle storing of characters above U+FFFF Adam Borowski
  2018-07-18  2:10   ` [PATCH 3/3] vt: selection: take screen contents from uniscr if available Adam Borowski
  2018-07-18  3:10 ` [PATCH 0/3] use unicode for vt mouse paste Nicolas Pitre
  1 sibling, 2 replies; 5+ messages in thread
From: Adam Borowski @ 2018-07-18  2:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Nicolas Pitre, linux-kernel,
	linux-console
  Cc: Adam Borowski

All the helper function saved us was a cast.

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
---
 drivers/tty/vt/selection.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
index 90ea1cc52b7a..34e7110f310d 100644
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -116,12 +116,6 @@ static inline int atedge(const int p, int size_row)
 	return (!(p % size_row)	|| !((p + 2) % size_row));
 }
 
-/* constrain v such that v <= u */
-static inline unsigned short limit(const unsigned short v, const unsigned short u)
-{
-	return (v > u) ? u : v;
-}
-
 /* stores the char in UTF8 and returns the number of bytes used (1-3) */
 static int store_utf8(u16 c, char *p)
 {
@@ -167,10 +161,10 @@ int set_selection(const struct tiocl_selection __user *sel, struct tty_struct *t
 	if (copy_from_user(&v, sel, sizeof(*sel)))
 		return -EFAULT;
 
-	v.xs = limit(v.xs - 1, vc->vc_cols - 1);
-	v.ys = limit(v.ys - 1, vc->vc_rows - 1);
-	v.xe = limit(v.xe - 1, vc->vc_cols - 1);
-	v.ye = limit(v.ye - 1, vc->vc_rows - 1);
+	v.xs = min_t(u16, v.xs - 1, vc->vc_cols - 1);
+	v.ys = min_t(u16, v.ys - 1, vc->vc_rows - 1);
+	v.xe = min_t(u16, v.xe - 1, vc->vc_cols - 1);
+	v.ye = min_t(u16, v.ye - 1, vc->vc_rows - 1);
 	ps = v.ys * vc->vc_size_row + (v.xs << 1);
 	pe = v.ye * vc->vc_size_row + (v.xe << 1);
 
-- 
2.18.0
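
For readers checking the claim that the helper only saved a cast, a small
userspace sketch follows. TOY_MIN_T is a simplified stand-in for the
kernel's min_t() (an assumption: unlike the real macro it evaluates its
arguments more than once), and clamp_old()/clamp_new() are hypothetical
wrappers around the two forms of the expression in set_selection(). The
interesting case is a coordinate of 0: subtracting 1 wraps to 0xFFFF once
truncated to 16 bits, and both forms clamp it to the last column.

```c
#include <assert.h>
#include <stdint.h>

/* The helper removed by this patch, reproduced for comparison. */
static inline unsigned short limit(const unsigned short v,
				   const unsigned short u)
{
	return (v > u) ? u : v;
}

/*
 * Simplified stand-in for the kernel's min_t(): cast both sides to
 * 'type', then take the smaller one.  Assumption: side-effect-free
 * arguments, since they are evaluated twice here.
 */
#define TOY_MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Old form: the cast happens implicitly at the function call. */
static uint16_t clamp_old(uint16_t coord, unsigned int dim)
{
	return limit(coord - 1, dim - 1);
}

/* New form: the cast is spelled out through the type argument. */
static uint16_t clamp_new(uint16_t coord, unsigned int dim)
{
	return TOY_MIN_T(uint16_t, coord - 1, dim - 1);
}

/* Check that both forms agree for every 16-bit input on an 80-col console. */
void clamp_demo(void)
{
	for (uint32_t c = 0; c <= 0xFFFF; c++)
		assert(clamp_old((uint16_t)c, 80) == clamp_new((uint16_t)c, 80));
}
```

The sketch assumes unsigned short is 16 bits wide, as it is on the
platforms Linux supports.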



* [PATCH 2/3] vt: selection: handle storing of characters above U+FFFF
  2018-07-18  2:10 ` [PATCH 1/3] vt: don't reinvent min() Adam Borowski
@ 2018-07-18  2:10   ` Adam Borowski
  2018-07-18  2:10   ` [PATCH 3/3] vt: selection: take screen contents from uniscr if available Adam Borowski
  1 sibling, 0 replies; 5+ messages in thread
From: Adam Borowski @ 2018-07-18  2:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Nicolas Pitre, linux-kernel,
	linux-console
  Cc: Adam Borowski

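store_utf8() now takes a u32 and emits 4-byte sequences for characters in
the U+10000..U+10FFFF range.  Values above U+10FFFF get replaced with U+FFFD.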

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
---
 drivers/tty/vt/selection.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
index 34e7110f310d..69ca337d3220 100644
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -116,8 +116,8 @@ static inline int atedge(const int p, int size_row)
 	return (!(p % size_row)	|| !((p + 2) % size_row));
 }
 
-/* stores the char in UTF8 and returns the number of bytes used (1-3) */
-static int store_utf8(u16 c, char *p)
+/* stores the char in UTF8 and returns the number of bytes used (1-4) */
+static int store_utf8(u32 c, char *p)
 {
 	if (c < 0x80) {
 		/*  0******* */
@@ -128,13 +128,26 @@ static int store_utf8(u16 c, char *p)
 		p[0] = 0xc0 | (c >> 6);
 		p[1] = 0x80 | (c & 0x3f);
 		return 2;
-    	} else {
+	} else if (c < 0x10000) {
 		/* 1110**** 10****** 10****** */
 		p[0] = 0xe0 | (c >> 12);
 		p[1] = 0x80 | ((c >> 6) & 0x3f);
 		p[2] = 0x80 | (c & 0x3f);
 		return 3;
-    	}
+	} else if (c < 0x110000) {
+		/* 11110*** 10****** 10****** 10****** */
+		p[0] = 0xf0 | (c >> 18);
+		p[1] = 0x80 | ((c >> 12) & 0x3f);
+		p[2] = 0x80 | ((c >> 6) & 0x3f);
+		p[3] = 0x80 | (c & 0x3f);
+		return 4;
+	} else {
+		/* outside Unicode, replace with U+FFFD */
+		p[0] = 0xef;
+		p[1] = 0xbf;
+		p[2] = 0xbd;
+		return 3;
+	}
 }
 
 /**
@@ -273,7 +286,7 @@ int set_selection(const struct tiocl_selection __user *sel, struct tty_struct *t
 	sel_end = new_sel_end;
 
 	/* Allocate a new buffer before freeing the old one ... */
-	multiplier = use_unicode ? 3 : 1;  /* chars can take up to 3 bytes */
+	multiplier = use_unicode ? 4 : 1;  /* chars can take up to 4 bytes */
 	bp = kmalloc_array((sel_end - sel_start) / 2 + 1, multiplier,
 			   GFP_KERNEL);
 	if (!bp) {
-- 
2.18.0
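
A userspace copy of the patched encoder can be handy for checking the byte
sequences by hand. The sketch below mirrors the branches of store_utf8() as
changed above; the toy_* names, the use of unsigned char and uint32_t in
place of the kernel types, and the toy_sel_buffer_bytes() helper are
assumptions made for illustration. Like the patch, it does not filter
surrogates (U+D800..U+DFFF); they go through the 3-byte branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores c as UTF-8 in p and returns the number of bytes used (1-4). */
static int toy_store_utf8(uint32_t c, unsigned char *p)
{
	assert(p != NULL);

	if (c < 0x80) {
		/* 0******* */
		p[0] = (unsigned char)c;
		return 1;
	} else if (c < 0x800) {
		/* 110***** 10****** */
		p[0] = (unsigned char)(0xc0 | (c >> 6));
		p[1] = (unsigned char)(0x80 | (c & 0x3f));
		return 2;
	} else if (c < 0x10000) {
		/* 1110**** 10****** 10****** */
		p[0] = (unsigned char)(0xe0 | (c >> 12));
		p[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3f));
		p[2] = (unsigned char)(0x80 | (c & 0x3f));
		return 3;
	} else if (c < 0x110000) {
		/* 11110*** 10****** 10****** 10****** */
		p[0] = (unsigned char)(0xf0 | (c >> 18));
		p[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3f));
		p[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3f));
		p[3] = (unsigned char)(0x80 | (c & 0x3f));
		return 4;
	}
	/* outside Unicode, replace with U+FFFD (EF BF BD) */
	p[0] = 0xef;
	p[1] = 0xbf;
	p[2] = 0xbd;
	return 3;
}

/*
 * Worst-case buffer size for a selection, following the kmalloc_array()
 * call in the patch: one slot per 2-byte screen cell, times the multiplier.
 */
static size_t toy_sel_buffer_bytes(int sel_start, int sel_end, int use_unicode)
{
	size_t multiplier = use_unicode ? 4 : 1;

	assert(sel_end >= sel_start);
	return (size_t)((sel_end - sel_start) / 2 + 1) * multiplier;
}
```

For example, a full 80-column row (byte offsets 0 through 158) needs up to
320 bytes with the new multiplier, against 240 with the old one.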



* [PATCH 3/3] vt: selection: take screen contents from uniscr if available
  2018-07-18  2:10 ` [PATCH 1/3] vt: don't reinvent min() Adam Borowski
  2018-07-18  2:10   ` [PATCH 2/3] vt: selection: handle storing of characters above U+FFFF Adam Borowski
@ 2018-07-18  2:10   ` Adam Borowski
  1 sibling, 0 replies; 5+ messages in thread
From: Adam Borowski @ 2018-07-18  2:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Nicolas Pitre, linux-kernel,
	linux-console
  Cc: Adam Borowski

This preserves whatever was written even if we can't currently display the
given glyph.  Mouse paste won't corrupt any character of wcwidth() == 1
anymore.

Note that for now uniscr doesn't get allocated until something reads
/dev/vcsuN for that console, making this code dormant for most users.

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
---
 drivers/tty/vt/selection.c | 11 +++++++----
 drivers/tty/vt/vt.c        | 10 ++++++++++
 include/linux/selection.h  |  1 +
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
index 69ca337d3220..07496c711d7d 100644
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -57,11 +57,13 @@ static inline void highlight_pointer(const int where)
 	complement_pos(sel_cons, where);
 }
 
-static u16
+static u32
 sel_pos(int n)
 {
+	if (use_unicode)
+		return screen_glyph_unicode(sel_cons, n / 2);
 	return inverse_translate(sel_cons, screen_glyph(sel_cons, n),
-				use_unicode);
+				0);
 }
 
 /**
@@ -90,7 +92,8 @@ static u32 inwordLut[]={
   0x07FFFFFE, /* lowercase         */
 };
 
-static inline int inword(const u16 c) {
+static inline int inword(const u32 c)
+{
 	return c > 0x7f || (( inwordLut[c>>5] >> (c & 0x1F) ) & 1);
 }
 
@@ -167,7 +170,7 @@ int set_selection(const struct tiocl_selection __user *sel, struct tty_struct *t
 	struct tiocl_selection v;
 	char *bp, *obp;
 	int i, ps, pe, multiplier;
-	u16 c;
+	u32 c;
 	int mode;
 
 	poke_blanked_console();
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 7fcb0ff2dccf..19da4c0b4b8e 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -4545,6 +4545,16 @@ u16 screen_glyph(struct vc_data *vc, int offset)
 }
 EXPORT_SYMBOL_GPL(screen_glyph);
 
+u32 screen_glyph_unicode(struct vc_data *vc, int n)
+{
+	struct uni_screen *uniscr = get_vc_uniscr(vc);
+
+	if (uniscr)
+		return uniscr->lines[n / vc->vc_cols][n % vc->vc_cols];
+	return inverse_translate(vc, screen_glyph(vc, n * 2), 1);
+}
+EXPORT_SYMBOL_GPL(screen_glyph_unicode);
+
 /* used by vcs - note the word offset */
 unsigned short *screen_pos(struct vc_data *vc, int w_offset, int viewed)
 {
diff --git a/include/linux/selection.h b/include/linux/selection.h
index 067d2e99c79f..a8f5b97b216f 100644
--- a/include/linux/selection.h
+++ b/include/linux/selection.h
@@ -32,6 +32,7 @@ extern unsigned char default_blu[];
 
 extern unsigned short *screen_pos(struct vc_data *vc, int w_offset, int viewed);
 extern u16 screen_glyph(struct vc_data *vc, int offset);
+extern u32 screen_glyph_unicode(struct vc_data *vc, int offset);
 extern void complement_pos(struct vc_data *vc, int offset);
 extern void invert_screen(struct vc_data *vc, int offset, int count, int shift);
 
-- 
2.18.0
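
The index arithmetic in this patch can be sketched in userspace as follows.
The selection code works with byte offsets into a screen buffer that uses
two bytes per cell, which is why sel_pos() passes n / 2 and the fallback
path multiplies by 2 again. The toy_* names and the struct are hypothetical,
and the sketch assumes a row is exactly 2 * cols bytes long.

```c
#include <assert.h>
#include <stdint.h>

/* Row and column of a cell in a uniscr-like array of lines. */
struct toy_pos {
	unsigned int row;
	unsigned int col;
};

/* Selection byte offset (2 bytes per cell) to linear cell index. */
static unsigned int toy_cell_index(unsigned int byte_offset)
{
	return byte_offset / 2;
}

/* Linear cell index to row and column, as in lines[n / cols][n % cols]. */
static struct toy_pos toy_cell_to_pos(unsigned int cell, unsigned int cols)
{
	struct toy_pos pos;

	assert(cols > 0);
	pos.row = cell / cols;
	pos.col = cell % cols;
	return pos;
}

/* Linear cell index back to a byte offset, as in the fallback path. */
static unsigned int toy_cell_to_byte_offset(unsigned int cell)
{
	return cell * 2;
}

/* Byte offset of (x, y) as set_selection() computes it. */
static unsigned int toy_sel_offset(unsigned int x, unsigned int y,
				   unsigned int cols)
{
	unsigned int size_row = cols * 2;	/* assumed row stride in bytes */

	return y * size_row + (x << 1);
}
```

On an 80-column console, column 5 of row 2 sits at byte offset 330, which
is cell 165, and 165 splits back into row 2, column 5.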



* Re: [PATCH 0/3] use unicode for vt mouse paste
  2018-07-18  2:06 [PATCH 0/3] use unicode for vt mouse paste Adam Borowski
  2018-07-18  2:10 ` [PATCH 1/3] vt: don't reinvent min() Adam Borowski
@ 2018-07-18  3:10 ` Nicolas Pitre
  1 sibling, 0 replies; 5+ messages in thread
From: Nicolas Pitre @ 2018-07-18  3:10 UTC (permalink / raw)
  To: Adam Borowski; +Cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel, linux-console

On Wed, 18 Jul 2018, Adam Borowski wrote:

> Hi!
> Based on Nicolas' nice work (in tty-next), let's avoid corrupting characters
> that have been copy+pasted via mouse selection.  The uniscr array holds
> their original identity even if they got mangled by glyph conversion.
> The glyph conversion lossily turns similar-looking characters into a
> representation, and everyone else into a replacement character.
> 
> There's no proper handling for CJK (yet?) but anything of wcwidth()==1 will
> work fine.
> 
> The whole thing doesn't get enabled until something reads from /dev/vcsu for
> that console, but let's test this code first before enabling it widely.

Glad to see this. For the whole set you may add:

Acked-by: Nicolas Pitre <nico@linaro.org>



