* [PATCH 1/2] unit: add utf8_validate_test80 test
@ 2019-03-20 23:31 Michael Tretter
  2019-03-20 23:31 ` [PATCH 2/2] utf8: Fix expected bytes in l_utf8_get_codepoint Michael Tretter
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Tretter @ 2019-03-20 23:31 UTC
  To: ell

UTF-8 requires the form 10xxxxxx for the second, third and fourth bytes
of a well-formed byte sequence.

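Add a test for the string "ße" encoded as Latin-1 (ISO-8859-1), i.e. the
bytes 0xdf 0x65. This is a relatively common German letter combination.
Both characters are valid Unicode, but the byte sequence is not valid
UTF-8: 0xdf announces a two-byte sequence, and 0x65 is not of the form
10xxxxxx.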
---
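Note (not part of the commit message): the continuation-byte rule the
test exercises can be sketched roughly as below. This is only an
illustration, assuming a purely structural check; the names
sketch_seq_len, sketch_utf8_well_formed and sketch_self_check are
hypothetical and are not ell API. The sketch deliberately ignores
overlong forms, surrogates and code points above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: length of the sequence announced by a lead byte,
 * or 0 if the byte cannot start a sequence.
 */
size_t sketch_seq_len(unsigned char lead)
{
	if (lead < 0x80)
		return 1;	/* 0xxxxxxx: ASCII */
	if ((lead & 0xe0) == 0xc0)
		return 2;	/* 110xxxxx */
	if ((lead & 0xf0) == 0xe0)
		return 3;	/* 1110xxxx */
	if ((lead & 0xf8) == 0xf0)
		return 4;	/* 11110xxx */

	return 0;		/* stray continuation byte or invalid lead */
}

/*
 * Structural check only: every lead byte must be followed by the
 * announced number of bytes of the form 10xxxxxx.
 */
bool sketch_utf8_well_formed(const unsigned char *s, size_t len)
{
	size_t i = 0;

	while (i < len) {
		size_t n = sketch_seq_len(s[i]);
		size_t k;

		/* Invalid lead byte, or sequence truncated by end of input */
		if (n == 0 || n > len - i)
			return false;

		for (k = 1; k < n; k++)
			if ((s[i + k] & 0xc0) != 0x80)
				return false;

		i += n;
	}

	return true;
}

/* The bytes from this patch, next to the UTF-8 encoding of the same text */
void sketch_self_check(void)
{
	const unsigned char latin1[] = { 0xdf, 0x65 };
	const unsigned char utf8[] = { 0xc3, 0x9f, 0x65 };

	assert(!sketch_utf8_well_formed(latin1, sizeof(latin1)));
	assert(sketch_utf8_well_formed(utf8, sizeof(utf8)));
}
```

With this sketch, the Latin-1 bytes 0xdf 0x65 are rejected because 0x65
is 01100101 and not 10xxxxxx, while the UTF-8 encoding of the same text,
0xc3 0x9f 0x65, passes.
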
 unit/test-utf8.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/unit/test-utf8.c b/unit/test-utf8.c
index 3768506..9b5bc6e 100644
--- a/unit/test-utf8.c
+++ b/unit/test-utf8.c
@@ -815,6 +815,17 @@ static struct utf8_validate_test utf8_validate_test79 = {
 	.ucs4_len = 5,
 };
 
+static const char utf8_80[] = { 0xdf, 0x65 };
+static const wchar_t ucs4_80[] = { 0xffff };
+
+static struct utf8_validate_test utf8_validate_test80 = {
+	.utf8 = utf8_80,
+	.utf8_len = 2,
+	.type = UTF8_VALIDATE_TYPE_NOTUNICODE,
+	.ucs4 = ucs4_80,
+	.ucs4_len = 1,
+};
+
 static void test_utf8_codepoint(const struct utf8_validate_test *test)
 {
 	unsigned int i, pos;
@@ -1085,6 +1096,8 @@ int main(int argc, char *argv[])
 					&utf8_validate_test78);
 	l_test_add("Validate UTF 79", test_utf8_validate,
 					&utf8_validate_test79);
+	l_test_add("Validate UTF 80", test_utf8_validate,
+					&utf8_validate_test80);
 
 	l_test_add("Strlen UTF 1", test_utf8_strlen,
 					&utf8_strlen_test1);
-- 
2.21.0


Thread overview: 3+ messages
2019-03-20 23:31 [PATCH 1/2] unit: add utf8_validate_test80 test Michael Tretter
2019-03-20 23:31 ` [PATCH 2/2] utf8: Fix expected bytes in l_utf8_get_codepoint Michael Tretter
2019-03-21  1:09   ` Denis Kenzior
