All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] t-ctype: simplify unit test definitions
@ 2024-02-25 11:27 René Scharfe
  2024-02-25 11:27 ` [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string René Scharfe
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: René Scharfe @ 2024-02-25 11:27 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

Simplify the ctype unit tests to allow combining specification strings
in any order and no longer require repeating class names.  Patch 3 gets
rid of aggregation by class and implements the tests in arguably the
most basic and straight-forward way for the unit test framework, at the
cost of producing raw and lengthy output.

  t-ctype: allow NUL anywhere in the specification string
  t-ctype: avoid duplicating class names
  t-ctype: do one test per class and char

 t/unit-tests/t-ctype.c | 73 ++++++++++++------------------------------
 1 file changed, 20 insertions(+), 53 deletions(-)

--
2.44.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string
  2024-02-25 11:27 [PATCH 0/3] t-ctype: simplify unit test definitions René Scharfe
@ 2024-02-25 11:27 ` René Scharfe
  2024-02-25 18:05   ` Eric Sunshine
  2024-02-25 11:27 ` [PATCH 2/3] t-ctype: avoid duplicating class names René Scharfe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: René Scharfe @ 2024-02-25 11:27 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

Replace the custom function is_in() for looking up a character in the
specification string with memchr(3) and sizeof.  This is shorter,
simpler and allows NUL anywhere in the string, which may come in handy
if we ever want to support more character classes that contain it.

Getting the string size using sizeof only works in a macro and with a
string constant, but that's exactly what we have and I don't see it
changing anytime soon.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/unit-tests/t-ctype.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index f315489984..64d7186258 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -1,23 +1,11 @@
 #include "test-lib.h"

-static int is_in(const char *s, int ch)
-{
-	/*
-	 * We can't find NUL using strchr. Accept it as the first
-	 * character in the spec -- there are no empty classes.
-	 */
-	if (ch == '\0')
-		return ch == *s;
-	if (*s == '\0')
-		s++;
-	return !!strchr(s, ch);
-}
-
 /* Macro to test a character type */
 #define TEST_CTYPE_FUNC(func, string) \
 static void test_ctype_##func(void) { \
 	for (int i = 0; i < 256; i++) { \
-		if (!check_int(func(i), ==, is_in(string, i))) \
+		int expect = !!memchr(string, i, sizeof(string) - 1); \
+		if (!check_int(func(i), ==, expect)) \
 			test_msg("       i: 0x%02x", i); \
 	} \
 	if (!check(!func(EOF))) \
--
2.44.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/3] t-ctype: avoid duplicating class names
  2024-02-25 11:27 [PATCH 0/3] t-ctype: simplify unit test definitions René Scharfe
  2024-02-25 11:27 ` [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string René Scharfe
@ 2024-02-25 11:27 ` René Scharfe
  2024-02-25 11:27 ` [PATCH 3/3] t-ctype: do one test per class and char René Scharfe
  2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
  3 siblings, 0 replies; 30+ messages in thread
From: René Scharfe @ 2024-02-25 11:27 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

TEST_CTYPE_FUNC defines a function for testing a character classifier,
TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice.

Avoid the need to define a class-specific function by letting
TEST_CHAR_CLASS do all the work.  This is done by using the internal
functions test__run_begin() and test__run_end(), but they do exist to be
used in test macros after all.

Alternatively we could unroll the loop to provide a very long expression
that tests all 256 characters and EOF and hand that to TEST, but that
seems awkward and hard to read.

No change of behavior or output intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/unit-tests/t-ctype.c | 68 ++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 42 deletions(-)

diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index 64d7186258..56dfefb68e 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -1,18 +1,18 @@
 #include "test-lib.h"

-/* Macro to test a character type */
-#define TEST_CTYPE_FUNC(func, string) \
-static void test_ctype_##func(void) { \
-	for (int i = 0; i < 256; i++) { \
-		int expect = !!memchr(string, i, sizeof(string) - 1); \
-		if (!check_int(func(i), ==, expect)) \
-			test_msg("       i: 0x%02x", i); \
-	} \
-	if (!check(!func(EOF))) \
+#define TEST_CHAR_CLASS(class, string) do { \
+	int skip = test__run_begin(); \
+	if (!skip) { \
+		for (int i = 0; i < 256; i++) { \
+			int expect = !!memchr(string, i, sizeof(string) - 1); \
+			if (!check_int(class(i), ==, expect)) \
+				test_msg("       i: 0x%02x", i); \
+		} \
+		if (!check(!class(EOF))) \
 			test_msg("      i: 0x%02x (EOF)", EOF); \
-}
-
-#define TEST_CHAR_CLASS(class) TEST(test_ctype_##class(), #class " works")
+	} \
+	test__run_end(!skip, TEST_LOCATION(), #class " works"); \
+} while (0)

 #define DIGIT "0123456789"
 #define LOWER "abcdefghijklmnopqrstuvwxyz"
@@ -32,37 +32,21 @@ static void test_ctype_##func(void) { \
 	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
 	"\x7f"

-TEST_CTYPE_FUNC(isdigit, DIGIT)
-TEST_CTYPE_FUNC(isspace, " \n\r\t")
-TEST_CTYPE_FUNC(isalpha, LOWER UPPER)
-TEST_CTYPE_FUNC(isalnum, LOWER UPPER DIGIT)
-TEST_CTYPE_FUNC(is_glob_special, "*?[\\")
-TEST_CTYPE_FUNC(is_regex_special, "$()*+.?[\\^{|")
-TEST_CTYPE_FUNC(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~")
-TEST_CTYPE_FUNC(isascii, ASCII)
-TEST_CTYPE_FUNC(islower, LOWER)
-TEST_CTYPE_FUNC(isupper, UPPER)
-TEST_CTYPE_FUNC(iscntrl, CNTRL)
-TEST_CTYPE_FUNC(ispunct, PUNCT)
-TEST_CTYPE_FUNC(isxdigit, DIGIT "abcdefABCDEF")
-TEST_CTYPE_FUNC(isprint, LOWER UPPER DIGIT PUNCT " ")
-
 int cmd_main(int argc, const char **argv) {
-	/* Run all character type tests */
-	TEST_CHAR_CLASS(isspace);
-	TEST_CHAR_CLASS(isdigit);
-	TEST_CHAR_CLASS(isalpha);
-	TEST_CHAR_CLASS(isalnum);
-	TEST_CHAR_CLASS(is_glob_special);
-	TEST_CHAR_CLASS(is_regex_special);
-	TEST_CHAR_CLASS(is_pathspec_magic);
-	TEST_CHAR_CLASS(isascii);
-	TEST_CHAR_CLASS(islower);
-	TEST_CHAR_CLASS(isupper);
-	TEST_CHAR_CLASS(iscntrl);
-	TEST_CHAR_CLASS(ispunct);
-	TEST_CHAR_CLASS(isxdigit);
-	TEST_CHAR_CLASS(isprint);
+	TEST_CHAR_CLASS(isspace, " \n\r\t");
+	TEST_CHAR_CLASS(isdigit, DIGIT);
+	TEST_CHAR_CLASS(isalpha, LOWER UPPER);
+	TEST_CHAR_CLASS(isalnum, LOWER UPPER DIGIT);
+	TEST_CHAR_CLASS(is_glob_special, "*?[\\");
+	TEST_CHAR_CLASS(is_regex_special, "$()*+.?[\\^{|");
+	TEST_CHAR_CLASS(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~");
+	TEST_CHAR_CLASS(isascii, ASCII);
+	TEST_CHAR_CLASS(islower, LOWER);
+	TEST_CHAR_CLASS(isupper, UPPER);
+	TEST_CHAR_CLASS(iscntrl, CNTRL);
+	TEST_CHAR_CLASS(ispunct, PUNCT);
+	TEST_CHAR_CLASS(isxdigit, DIGIT "abcdefABCDEF");
+	TEST_CHAR_CLASS(isprint, LOWER UPPER DIGIT PUNCT " ");

 	return test_done();
 }
--
2.44.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/3] t-ctype: do one test per class and char
  2024-02-25 11:27 [PATCH 0/3] t-ctype: simplify unit test definitions René Scharfe
  2024-02-25 11:27 ` [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string René Scharfe
  2024-02-25 11:27 ` [PATCH 2/3] t-ctype: avoid duplicating class names René Scharfe
@ 2024-02-25 11:27 ` René Scharfe
  2024-02-26  9:28   ` Christian Couder
  2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
  3 siblings, 1 reply; 30+ messages in thread
From: René Scharfe @ 2024-02-25 11:27 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

Simplify TEST_CHAR_CLASS by using TEST for each character separately.
This increases the number of tests to 3598, but avoids the need for
using internal functions and test_msg() for custom messages.  The
resulting macro has minimal test setup overhead.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/unit-tests/t-ctype.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index 56dfefb68e..aa728175a6 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -1,17 +1,12 @@
 #include "test-lib.h"

 #define TEST_CHAR_CLASS(class, string) do { \
-	int skip = test__run_begin(); \
-	if (!skip) { \
-		for (int i = 0; i < 256; i++) { \
-			int expect = !!memchr(string, i, sizeof(string) - 1); \
-			if (!check_int(class(i), ==, expect)) \
-				test_msg("       i: 0x%02x", i); \
-		} \
-		if (!check(!class(EOF))) \
-			test_msg("      i: 0x%02x (EOF)", EOF); \
+	for (int i = 0; i < 256; i++) { \
+		int expect = !!memchr(string, i, sizeof(string) - 1); \
+		TEST(check_int(class(i), ==, expect), \
+		     #class "(0x%02x) works", i); \
 	} \
-	test__run_end(!skip, TEST_LOCATION(), #class " works"); \
+	TEST(check(!class(EOF)), #class "(EOF) works"); \
 } while (0)

 #define DIGIT "0123456789"
--
2.44.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string
  2024-02-25 11:27 ` [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string René Scharfe
@ 2024-02-25 18:05   ` Eric Sunshine
  2024-02-25 18:28     ` René Scharfe
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Sunshine @ 2024-02-25 18:05 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

On Sun, Feb 25, 2024 at 6:27 AM René Scharfe <l.s.r@web.de> wrote:
> Replace the custom function is_in() for looking up a character in the
> specification string with memchr(3) and sizeof.  This is shorter,
> simpler and allows NUL anywhere in the string, which may come in handy
> if we ever want to support more character classes that contain it.
>
> Getting the string size using sizeof only works in a macro and with a
> string constant, but that's exactly what we have and I don't see it
> changing anytime soon.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
> @@ -1,23 +1,11 @@
>  /* Macro to test a character type */
>  #define TEST_CTYPE_FUNC(func, string) \

Taking into consideration the commit message warning about string
constants, would it make sense to update the comment to mention that
limitation?

    /* Test a character type. (Only use with string constants.) */
    #define TEST_CTYPE_FUNC(func, string) \

>  static void test_ctype_##func(void) { \
>         for (int i = 0; i < 256; i++) { \
> -               if (!check_int(func(i), ==, is_in(string, i))) \
> +               int expect = !!memchr(string, i, sizeof(string) - 1); \
> +               if (!check_int(func(i), ==, expect)) \
>                         test_msg("       i: 0x%02x", i); \
>         } \
>         if (!check(!func(EOF))) \

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string
  2024-02-25 18:05   ` Eric Sunshine
@ 2024-02-25 18:28     ` René Scharfe
  2024-02-25 18:41       ` Eric Sunshine
  0 siblings, 1 reply; 30+ messages in thread
From: René Scharfe @ 2024-02-25 18:28 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: git, Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

Am 25.02.24 um 19:05 schrieb Eric Sunshine:
> On Sun, Feb 25, 2024 at 6:27 AM René Scharfe <l.s.r@web.de> wrote:
>> Replace the custom function is_in() for looking up a character in the
>> specification string with memchr(3) and sizeof.  This is shorter,
>> simpler and allows NUL anywhere in the string, which may come in handy
>> if we ever want to support more character classes that contain it.
>>
>> Getting the string size using sizeof only works in a macro and with a
>> string constant, but that's exactly what we have and I don't see it
>> changing anytime soon.
>>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>> diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
>> @@ -1,23 +1,11 @@
>>  /* Macro to test a character type */
>>  #define TEST_CTYPE_FUNC(func, string) \
>
> Taking into consideration the commit message warning about string
> constants, would it make sense to update the comment to mention that
> limitation?

I think the temptation to pass a string pointer is low -- if only
because there aren't any in this file.  But adding such a warning
can't hurt, so yeah, why not?

>
>     /* Test a character type. (Only use with string constants.) */
>     #define TEST_CTYPE_FUNC(func, string) \
>
>>  static void test_ctype_##func(void) { \
>>         for (int i = 0; i < 256; i++) { \
>> -               if (!check_int(func(i), ==, is_in(string, i))) \
>> +               int expect = !!memchr(string, i, sizeof(string) - 1); \
>> +               if (!check_int(func(i), ==, expect)) \
>>                         test_msg("       i: 0x%02x", i); \
>>         } \
>>         if (!check(!func(EOF))) \

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string
  2024-02-25 18:28     ` René Scharfe
@ 2024-02-25 18:41       ` Eric Sunshine
  2024-02-25 21:00         ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Sunshine @ 2024-02-25 18:41 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

On Sun, Feb 25, 2024 at 1:28 PM René Scharfe <l.s.r@web.de> wrote:
> Am 25.02.24 um 19:05 schrieb Eric Sunshine:
> > On Sun, Feb 25, 2024 at 6:27 AM René Scharfe <l.s.r@web.de> wrote:
> >> Getting the string size using sizeof only works in a macro and with a
> >> string constant, but that's exactly what we have and I don't see it
> >> changing anytime soon.
> >>
> >>  /* Macro to test a character type */
> >>  #define TEST_CTYPE_FUNC(func, string) \
> >
> > Taking into consideration the commit message warning about string
> > constants, would it make sense to update the comment to mention that
> > limitation?
> >
> >     /* Test a character type. (Only use with string constants.) */
> >     #define TEST_CTYPE_FUNC(func, string) \
> >>  static void test_ctype_##func(void) { \
> >>         for (int i = 0; i < 256; i++) { \
> >> +               int expect = !!memchr(string, i, sizeof(string) - 1); \
>
> I think the temptation to pass a string pointer is low -- if only
> because there aren't any in this file.  But adding such a warning
> can't hurt, so yeah, why not?

The patch just posted[1] by SZEDER reminded me that, on this project,
we assume that the compiler is smart enough to replace
`strlen("string-literal")` with the constant `15`, so rather than
worrying about updating comment to mention the sizeof() limitation,
you could perhaps just use `strlen(string)` instead of
`sizeof(string)-1`?

[1]: https://lore.kernel.org/git/20240225183452.1939334-1-szeder.dev@gmail.com/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string
  2024-02-25 18:41       ` Eric Sunshine
@ 2024-02-25 21:00         ` Jeff King
  2024-02-25 21:02           ` Eric Sunshine
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2024-02-25 21:00 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: René Scharfe, git, Phillip Wood, Josh Steadmon, Achu Luma,
	Christian Couder

On Sun, Feb 25, 2024 at 01:41:30PM -0500, Eric Sunshine wrote:

> On Sun, Feb 25, 2024 at 1:28 PM René Scharfe <l.s.r@web.de> wrote:
> > Am 25.02.24 um 19:05 schrieb Eric Sunshine:
> > > On Sun, Feb 25, 2024 at 6:27 AM René Scharfe <l.s.r@web.de> wrote:
> > >> Getting the string size using sizeof only works in a macro and with a
> > >> string constant, but that's exactly what we have and I don't see it
> > >> changing anytime soon.
> > >>
> > >>  /* Macro to test a character type */
> > >>  #define TEST_CTYPE_FUNC(func, string) \
> > >
> > > Taking into consideration the commit message warning about string
> > > constants, would it make sense to update the comment to mention that
> > > limitation?
> > >
> > >     /* Test a character type. (Only use with string constants.) */
> > >     #define TEST_CTYPE_FUNC(func, string) \
> > >>  static void test_ctype_##func(void) { \
> > >>         for (int i = 0; i < 256; i++) { \
> > >> +               int expect = !!memchr(string, i, sizeof(string) - 1); \
> >
> > I think the temptation to pass a string pointer is low -- if only
> > because there aren't any in this file.  But adding such a warning
> > can't hurt, so yeah, why not?
> 
> The patch just posted[1] by SZEDER reminded me that, on this project,
> we assume that the compiler is smart enough to replace
> `strlen("string-literal")` with the constant `15`, so rather than
> worrying about updating comment to mention the sizeof() limitation,
> you could perhaps just use `strlen(string)` instead of
> `sizeof(string)-1`?

That would defeat the advertised purpose that we can handle embedded
NULs, though. Whereas with sizeof(), I think a literal like "foo\0bar"
would still have length 8.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string
  2024-02-25 21:00         ` Jeff King
@ 2024-02-25 21:02           ` Eric Sunshine
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Sunshine @ 2024-02-25 21:02 UTC (permalink / raw)
  To: Jeff King
  Cc: René Scharfe, git, Phillip Wood, Josh Steadmon, Achu Luma,
	Christian Couder

On Sun, Feb 25, 2024 at 4:00 PM Jeff King <peff@peff.net> wrote:
> On Sun, Feb 25, 2024 at 01:41:30PM -0500, Eric Sunshine wrote:
> > On Sun, Feb 25, 2024 at 1:28 PM René Scharfe <l.s.r@web.de> wrote:
> > > Am 25.02.24 um 19:05 schrieb Eric Sunshine:
> > > > Taking into consideration the commit message warning about string
> > > > constants, would it make sense to update the comment to mention that
> > > > limitation?
> > >
> > > I think the temptation to pass a string pointer is low -- if only
> > > because there aren't any in this file.  But adding such a warning
> > > can't hurt, so yeah, why not?
> >
> > The patch just posted[1] by SZEDER reminded me that, on this project,
> > we assume that the compiler is smart enough to replace
> > `strlen("string-literal")` with the constant `15`, so rather than
> > worrying about updating comment to mention the sizeof() limitation,
> > you could perhaps just use `strlen(string)` instead of
> > `sizeof(string)-1`?
>
> That would defeat the advertised purpose that we can handle embedded
> NULs, though. Whereas with sizeof(), I think a literal like "foo\0bar"
> would still have length 8.

True. Sorry for the noise.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-02-25 11:27 ` [PATCH 3/3] t-ctype: do one test per class and char René Scharfe
@ 2024-02-26  9:28   ` Christian Couder
  2024-02-26 17:26     ` René Scharfe
  0 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-02-26  9:28 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Phillip Wood, Josh Steadmon, Achu Luma

On Sun, Feb 25, 2024 at 12:27 PM René Scharfe <l.s.r@web.de> wrote:
>
> Simplify TEST_CHAR_CLASS by using TEST for each character separately.
> This increases the number of tests to 3598,

Does this mean that when all the tests pass there will be 3598 lines
of output on the terminal instead of 14 before this patch?

If that's the case, I don't like this.

> but avoids the need for
> using internal functions and test_msg() for custom messages.  The
> resulting macro has minimal test setup overhead.

Yeah, the code looks definitely cleaner, but a clean output is important too.

Thanks!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-02-26  9:28   ` Christian Couder
@ 2024-02-26 17:26     ` René Scharfe
  2024-02-26 17:44       ` Junio C Hamano
  2024-02-26 18:58       ` Josh Steadmon
  0 siblings, 2 replies; 30+ messages in thread
From: René Scharfe @ 2024-02-26 17:26 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Phillip Wood, Josh Steadmon, Achu Luma

Am 26.02.24 um 10:28 schrieb Christian Couder:
> On Sun, Feb 25, 2024 at 12:27 PM René Scharfe <l.s.r@web.de> wrote:
>>
>> Simplify TEST_CHAR_CLASS by using TEST for each character separately.
>> This increases the number of tests to 3598,
>
> Does this mean that when all the tests pass there will be 3598 lines
> of output on the terminal instead of 14 before this patch?

Yes.

> If that's the case, I don't like this.
>
>> but avoids the need for
>> using internal functions and test_msg() for custom messages.  The
>> resulting macro has minimal test setup overhead.
>
> Yeah, the code looks definitely cleaner, but a clean output is important too.

The output is clean as well, but there's a lot of it.  Perhaps too much.
The success messages are boring, though, and if all checks pass then the
only useful information is the status code.  A TAP harness like prove
summarizes that nicely:

   $ prove t/unit-tests/bin/t-ctype
   t/unit-tests/bin/t-ctype .. ok
   All tests successful.
   Files=1, Tests=3598,  0 wallclock secs ( 0.08 usr +  0.00 sys =  0.08 CPU)
   Result: PASS

Filtering out passing checks e.g. with "| grep -v ^ok" would help when
debugging a test failure. I vaguely miss the --immediate switch from the
regular test library, however.

René

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-02-26 17:26     ` René Scharfe
@ 2024-02-26 17:44       ` Junio C Hamano
  2024-02-26 18:58       ` Josh Steadmon
  1 sibling, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2024-02-26 17:44 UTC (permalink / raw)
  To: René Scharfe
  Cc: Christian Couder, git, Phillip Wood, Josh Steadmon, Achu Luma

René Scharfe <l.s.r@web.de> writes:

> Filtering out passing checks e.g. with "| grep -v ^ok" would help when
> debugging a test failure. I vaguely miss the --immediate switch from the
> regular test library, however.

Yeah, "-i" is one of the most useful switches during debugging
tests.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-02-26 17:26     ` René Scharfe
  2024-02-26 17:44       ` Junio C Hamano
@ 2024-02-26 18:58       ` Josh Steadmon
  2024-02-27 10:04         ` Christian Couder
  1 sibling, 1 reply; 30+ messages in thread
From: Josh Steadmon @ 2024-02-26 18:58 UTC (permalink / raw)
  To: René Scharfe; +Cc: Christian Couder, git, Phillip Wood, Achu Luma

On 2024.02.26 18:26, René Scharfe wrote:
> Am 26.02.24 um 10:28 schrieb Christian Couder:
> > On Sun, Feb 25, 2024 at 12:27 PM René Scharfe <l.s.r@web.de> wrote:
> >>
> >> Simplify TEST_CHAR_CLASS by using TEST for each character separately.
> >> This increases the number of tests to 3598,
> >
> > Does this mean that when all the tests pass there will be 3598 lines
> > of output on the terminal instead of 14 before this patch?
> 
> Yes.
> 
> > If that's the case, I don't like this.
> >
> >> but avoids the need for
> >> using internal functions and test_msg() for custom messages.  The
> >> resulting macro has minimal test setup overhead.
> >
> > Yeah, the code looks definitely cleaner, but a clean output is important too.
> 
> The output is clean as well, but there's a lot of it.  Perhaps too much.
> The success messages are boring, though, and if all checks pass then the
> only useful information is the status code.  A TAP harness like prove
> summarizes that nicely:
> 
>    $ prove t/unit-tests/bin/t-ctype
>    t/unit-tests/bin/t-ctype .. ok
>    All tests successful.
>    Files=1, Tests=3598,  0 wallclock secs ( 0.08 usr +  0.00 sys =  0.08 CPU)
>    Result: PASS
> 
> Filtering out passing checks e.g. with "| grep -v ^ok" would help when
> debugging a test failure. I vaguely miss the --immediate switch from the
> regular test library, however.

Yeah, I agree here. It's a lot of output but it's almost always going to
be consumed by a test harness rather than a human, and it's easy to
filter out the noise if someone does need to do some manual debugging.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-02-26 18:58       ` Josh Steadmon
@ 2024-02-27 10:04         ` Christian Couder
  2024-03-02 22:00           ` René Scharfe
  0 siblings, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-02-27 10:04 UTC (permalink / raw)
  To: Josh Steadmon, René Scharfe, Christian Couder, git,
	Phillip Wood, Achu Luma

On Mon, Feb 26, 2024 at 7:58 PM Josh Steadmon <steadmon@google.com> wrote:
>
> On 2024.02.26 18:26, René Scharfe wrote:

> > The output is clean as well, but there's a lot of it.  Perhaps too much.
> > The success messages are boring, though, and if all checks pass then the
> > only useful information is the status code.  A TAP harness like prove
> > summarizes that nicely:
> >
> >    $ prove t/unit-tests/bin/t-ctype
> >    t/unit-tests/bin/t-ctype .. ok
> >    All tests successful.
> >    Files=1, Tests=3598,  0 wallclock secs ( 0.08 usr +  0.00 sys =  0.08 CPU)
> >    Result: PASS
> >
> > Filtering out passing checks e.g. with "| grep -v ^ok" would help when
> > debugging a test failure. I vaguely miss the --immediate switch from the
> > regular test library, however.
>
> Yeah, I agree here. It's a lot of output but it's almost always going to
> be consumed by a test harness rather than a human, and it's easy to
> filter out the noise if someone does need to do some manual debugging.

Yeah, I know about TAP harnesses like prove, but the most
straightforward way to run the unit tests is still `make unit-tests`
in the t/ directory. Also when you add or change some tests, it's a
good idea to run `make unit-tests` to see what the output is, so you
still have to see that output quite often when you work on tests and
going through 3598 of mostly useless output instead of just 14 isn't
nice.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-02-27 10:04         ` Christian Couder
@ 2024-03-02 22:00           ` René Scharfe
  2024-03-04 10:00             ` Christian Couder
  2024-03-04 18:35             ` Josh Steadmon
  0 siblings, 2 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-02 22:00 UTC (permalink / raw)
  To: Christian Couder, Josh Steadmon, git, Phillip Wood, Achu Luma

Am 27.02.24 um 11:04 schrieb Christian Couder:
> On Mon, Feb 26, 2024 at 7:58 PM Josh Steadmon <steadmon@google.com> wrote:
>>
>> On 2024.02.26 18:26, René Scharfe wrote:
>
>>> The output is clean as well, but there's a lot of it.  Perhaps too much.
>>> The success messages are boring, though, and if all checks pass then the
>>> only useful information is the status code.  A TAP harness like prove
>>> summarizes that nicely:
>>>
>>>    $ prove t/unit-tests/bin/t-ctype
>>>    t/unit-tests/bin/t-ctype .. ok
>>>    All tests successful.
>>>    Files=1, Tests=3598,  0 wallclock secs ( 0.08 usr +  0.00 sys =  0.08 CPU)
>>>    Result: PASS
>>>
>>> Filtering out passing checks e.g. with "| grep -v ^ok" would help when
>>> debugging a test failure. I vaguely miss the --immediate switch from the
>>> regular test library, however.
>>
>> Yeah, I agree here. It's a lot of output but it's almost always going to
>> be consumed by a test harness rather than a human, and it's easy to
>> filter out the noise if someone does need to do some manual debugging.
>
> Yeah, I know about TAP harnesses like prove, but the most
> straightforward way to run the unit tests is still `make unit-tests`
> in the t/ directory. Also when you add or change some tests, it's a
> good idea to run `make unit-tests` to see what the output is, so you
> still have to see that output quite often when you work on tests and
> going through 3598 of mostly useless output instead of just 14 isn't
> nice.

I was starting the programs from t/unit-tests/bin/ individually because
I didn't know 'make unit-tests' exists.  This is much nicer, thank you!
Especially after adding 'DEFAULT_UNIT_TEST_TARGET = unit-tests-prove' to
config.mak to complement the 'DEFAULT_TEST_TARGET = prove' I added long
ago.  It would be even nicer if the former was the default when the
latter is set.

As unit tests are added, their output is surely going to grow to
multiple screens with or without prove, no?  So someone writing or
debugging tests will still go back to starting then individually
eventually.

The size of the output in itself is not a problem, I assume, but that
most of it is useless -- details of successful tests are uninteresting.
A test harness can aggregate the output, but prove annoyed me when used
with the regular tests by also aggregating error output and only showing
the numbers of failed tests.  Finding their names involved running the
test script again without prove.  Turns out it has an option for that.
Added 'GIT_PROVE_OPTS = --failures' to config.mak as well, will see if
it helps.

Is it too much to ask developers to use a test harness?  Perhaps: It's
yet another dependency and not enabled by default.

What's the right level of aggregation and how do we achieve it?
Grouping by class is natural and follows the test definition.  We
could stop after patch 2.  Dunno.

René

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 0/4] t-ctype: simplify unit test definitions
  2024-02-25 11:27 [PATCH 0/3] t-ctype: simplify unit test definitions René Scharfe
                   ` (2 preceding siblings ...)
  2024-02-25 11:27 ` [PATCH 3/3] t-ctype: do one test per class and char René Scharfe
@ 2024-03-03 10:13 ` René Scharfe
  2024-03-03 10:13   ` [PATCH v2 1/4] t-ctype: allow NUL anywhere in the specification string René Scharfe
                     ` (4 more replies)
  3 siblings, 5 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-03 10:13 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

Simplify the ctype unit tests to allow combining specification strings
in any order and no longer require repeating class names.

Changes since v1:
* Added checks to guard string length calculation using sizeof.
* Kept the definition string in the error output.
* Added patches 2 and 3 with further cosmetic changes.
* Dropped the last patch with the huge output for now, as it needs
  further thought.

  t-ctype: allow NUL anywhere in the specification string
  t-ctype: simplify EOF check
  t-ctype: align output of i
  t-ctype: avoid duplicating class names

 t/unit-tests/t-ctype.c | 81 ++++++++++++++----------------------------
 1 file changed, 27 insertions(+), 54 deletions(-)

Range-Diff gegen v1:
1:  ae9b5468db ! 1:  28fba8bda9 t-ctype: allow NUL anywhere in the specification string
    @@ Commit message
         if we ever want to support more character classes that contain it.

         Getting the string size using sizeof only works in a macro and with a
    -    string constant, but that's exactly what we have and I don't see it
    -    changing anytime soon.
    +    string constant.  Use ARRAY_SIZE and compile-time checks to make sure we
    +    are not passed a string pointer.

      ## t/unit-tests/t-ctype.c ##
     @@
    @@ t/unit-tests/t-ctype.c
      /* Macro to test a character type */
      #define TEST_CTYPE_FUNC(func, string) \
      static void test_ctype_##func(void) { \
    ++	size_t len = ARRAY_SIZE(string) - 1 + \
    ++		BUILD_ASSERT_OR_ZERO(ARRAY_SIZE(string) > 0) + \
    ++		BUILD_ASSERT_OR_ZERO(sizeof(string[0]) == sizeof(char)); \
      	for (int i = 0; i < 256; i++) { \
     -		if (!check_int(func(i), ==, is_in(string, i))) \
    -+		int expect = !!memchr(string, i, sizeof(string) - 1); \
    -+		if (!check_int(func(i), ==, expect)) \
    ++		if (!check_int(func(i), ==, !!memchr(string, i, len))) \
      			test_msg("       i: 0x%02x", i); \
      	} \
      	if (!check(!func(EOF))) \
-:  ---------- > 2:  1eab0d289c t-ctype: simplify EOF check
-:  ---------- > 3:  901312e980 t-ctype: align output of i
2:  688e6b6cb5 ! 4:  5dac51cfeb t-ctype: avoid duplicating class names
    @@ t/unit-tests/t-ctype.c
     -/* Macro to test a character type */
     -#define TEST_CTYPE_FUNC(func, string) \
     -static void test_ctype_##func(void) { \
    --	for (int i = 0; i < 256; i++) { \
    --		int expect = !!memchr(string, i, sizeof(string) - 1); \
    --		if (!check_int(func(i), ==, expect)) \
    --			test_msg("       i: 0x%02x", i); \
    --	} \
    --	if (!check(!func(EOF))) \
     +#define TEST_CHAR_CLASS(class, string) do { \
    + 	size_t len = ARRAY_SIZE(string) - 1 + \
    + 		BUILD_ASSERT_OR_ZERO(ARRAY_SIZE(string) > 0) + \
    + 		BUILD_ASSERT_OR_ZERO(sizeof(string[0]) == sizeof(char)); \
    +-	for (int i = 0; i < 256; i++) { \
    +-		if (!check_int(func(i), ==, !!memchr(string, i, len))) \
    +-			test_msg("      i: 0x%02x", i); \
     +	int skip = test__run_begin(); \
     +	if (!skip) { \
     +		for (int i = 0; i < 256; i++) { \
    -+			int expect = !!memchr(string, i, sizeof(string) - 1); \
    -+			if (!check_int(class(i), ==, expect)) \
    -+				test_msg("       i: 0x%02x", i); \
    ++			if (!check_int(class(i), ==, !!memchr(string, i, len)))\
    ++				test_msg("      i: 0x%02x", i); \
     +		} \
    -+		if (!check(!class(EOF))) \
    - 			test_msg("      i: 0x%02x (EOF)", EOF); \
    ++		check(!class(EOF)); \
    + 	} \
    +-	check(!func(EOF)); \
     -}
     -
     -#define TEST_CHAR_CLASS(class) TEST(test_ctype_##class(), #class " works")
    -+	} \
     +	test__run_end(!skip, TEST_LOCATION(), #class " works"); \
     +} while (0)

3:  caf2acbd09 < -:  ---------- t-ctype: do one test per class and char
--
2.44.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/4] t-ctype: allow NUL anywhere in the specification string
  2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
@ 2024-03-03 10:13   ` René Scharfe
  2024-03-03 10:13   ` [PATCH v2 2/4] t-ctype: simplify EOF check René Scharfe
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-03 10:13 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

Replace the custom function is_in() for looking up a character in the
specification string with memchr(3) and sizeof.  This is shorter,
simpler and allows NUL anywhere in the string, which may come in handy
if we ever want to support more character classes that contain it.

Getting the string size using sizeof only works in a macro and with a
string constant.  Use ARRAY_SIZE and compile-time checks to make sure we
are not passed a string pointer.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/unit-tests/t-ctype.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index f315489984..35473c41d8 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -1,23 +1,13 @@
 #include "test-lib.h"

-static int is_in(const char *s, int ch)
-{
-	/*
-	 * We can't find NUL using strchr. Accept it as the first
-	 * character in the spec -- there are no empty classes.
-	 */
-	if (ch == '\0')
-		return ch == *s;
-	if (*s == '\0')
-		s++;
-	return !!strchr(s, ch);
-}
-
 /* Macro to test a character type */
 #define TEST_CTYPE_FUNC(func, string) \
 static void test_ctype_##func(void) { \
+	size_t len = ARRAY_SIZE(string) - 1 + \
+		BUILD_ASSERT_OR_ZERO(ARRAY_SIZE(string) > 0) + \
+		BUILD_ASSERT_OR_ZERO(sizeof(string[0]) == sizeof(char)); \
 	for (int i = 0; i < 256; i++) { \
-		if (!check_int(func(i), ==, is_in(string, i))) \
+		if (!check_int(func(i), ==, !!memchr(string, i, len))) \
 			test_msg("       i: 0x%02x", i); \
 	} \
 	if (!check(!func(EOF))) \
--
2.44.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 2/4] t-ctype: simplify EOF check
  2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
  2024-03-03 10:13   ` [PATCH v2 1/4] t-ctype: allow NUL anywhere in the specification string René Scharfe
@ 2024-03-03 10:13   ` René Scharfe
  2024-03-03 10:13   ` [PATCH v2 3/4] t-ctype: align output of i René Scharfe
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-03 10:13 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

EOF is not a member of any character class.  If a classifier function
returns a non-zero result for it, presumably by mistake, then the unit
test check reports:

   # check "!iseof(EOF)" failed at t/unit-tests/t-ctype.c:53
   #       i: 0xffffffff (EOF)

The numeric value of EOF is not particularly interesting in this
context.  Stop printing the second line.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/unit-tests/t-ctype.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index 35473c41d8..f0d61d6eb2 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -10,8 +10,7 @@ static void test_ctype_##func(void) { \
 		if (!check_int(func(i), ==, !!memchr(string, i, len))) \
 			test_msg("       i: 0x%02x", i); \
 	} \
-	if (!check(!func(EOF))) \
-			test_msg("      i: 0x%02x (EOF)", EOF); \
+	check(!func(EOF)); \
 }

 #define TEST_CHAR_CLASS(class) TEST(test_ctype_##class(), #class " works")
--
2.44.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/4] t-ctype: align output of i
  2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
  2024-03-03 10:13   ` [PATCH v2 1/4] t-ctype: allow NUL anywhere in the specification string René Scharfe
  2024-03-03 10:13   ` [PATCH v2 2/4] t-ctype: simplify EOF check René Scharfe
@ 2024-03-03 10:13   ` René Scharfe
  2024-03-03 10:13   ` [PATCH v2 4/4] t-ctype: avoid duplicating class names René Scharfe
  2024-03-04  9:25   ` [PATCH v2 0/4] t-ctype: simplify unit test definitions Christian Couder
  4 siblings, 0 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-03 10:13 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

The unit test reports misclassified characters like this:

   # check "isdigit(i) == !!memchr("123456789", i, len)" failed at t/unit-tests/t-ctype.c:36
   #    left: 1
   #   right: 0
   #        i: 0x30

Reduce the indent of i to put its colon directly below the ones in the
preceding lines for consistency.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/unit-tests/t-ctype.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index f0d61d6eb2..02d8569aa3 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -8,7 +8,7 @@ static void test_ctype_##func(void) { \
 		BUILD_ASSERT_OR_ZERO(sizeof(string[0]) == sizeof(char)); \
 	for (int i = 0; i < 256; i++) { \
 		if (!check_int(func(i), ==, !!memchr(string, i, len))) \
-			test_msg("       i: 0x%02x", i); \
+			test_msg("      i: 0x%02x", i); \
 	} \
 	check(!func(EOF)); \
 }
--
2.44.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 4/4] t-ctype: avoid duplicating class names
  2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
                     ` (2 preceding siblings ...)
  2024-03-03 10:13   ` [PATCH v2 3/4] t-ctype: align output of i René Scharfe
@ 2024-03-03 10:13   ` René Scharfe
  2024-03-04  9:51     ` Phillip Wood
  2024-03-04  9:25   ` [PATCH v2 0/4] t-ctype: simplify unit test definitions Christian Couder
  4 siblings, 1 reply; 30+ messages in thread
From: René Scharfe @ 2024-03-03 10:13 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Josh Steadmon, Achu Luma, Christian Couder

TEST_CTYPE_FUNC defines a function for testing a character classifier,
TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice.

Avoid the need to define a class-specific function by letting
TEST_CHAR_CLASS do all the work.  This is done by using the internal
functions test__run_begin() and test__run_end(), but they do exist to be
used in test macros after all.

Alternatively we could unroll the loop to provide a very long expression
that tests all 256 characters and EOF and hand that to TEST, but that
seems awkward and hard to read.

No change of behavior or output intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/unit-tests/t-ctype.c | 64 ++++++++++++++++--------------------------
 1 file changed, 24 insertions(+), 40 deletions(-)

diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index 02d8569aa3..d6ac1fe678 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -1,19 +1,19 @@
 #include "test-lib.h"

-/* Macro to test a character type */
-#define TEST_CTYPE_FUNC(func, string) \
-static void test_ctype_##func(void) { \
+#define TEST_CHAR_CLASS(class, string) do { \
 	size_t len = ARRAY_SIZE(string) - 1 + \
 		BUILD_ASSERT_OR_ZERO(ARRAY_SIZE(string) > 0) + \
 		BUILD_ASSERT_OR_ZERO(sizeof(string[0]) == sizeof(char)); \
-	for (int i = 0; i < 256; i++) { \
-		if (!check_int(func(i), ==, !!memchr(string, i, len))) \
-			test_msg("      i: 0x%02x", i); \
+	int skip = test__run_begin(); \
+	if (!skip) { \
+		for (int i = 0; i < 256; i++) { \
+			if (!check_int(class(i), ==, !!memchr(string, i, len)))\
+				test_msg("      i: 0x%02x", i); \
+		} \
+		check(!class(EOF)); \
 	} \
-	check(!func(EOF)); \
-}
-
-#define TEST_CHAR_CLASS(class) TEST(test_ctype_##class(), #class " works")
+	test__run_end(!skip, TEST_LOCATION(), #class " works"); \
+} while (0)

 #define DIGIT "0123456789"
 #define LOWER "abcdefghijklmnopqrstuvwxyz"
@@ -33,37 +33,21 @@ static void test_ctype_##func(void) { \
 	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
 	"\x7f"

-TEST_CTYPE_FUNC(isdigit, DIGIT)
-TEST_CTYPE_FUNC(isspace, " \n\r\t")
-TEST_CTYPE_FUNC(isalpha, LOWER UPPER)
-TEST_CTYPE_FUNC(isalnum, LOWER UPPER DIGIT)
-TEST_CTYPE_FUNC(is_glob_special, "*?[\\")
-TEST_CTYPE_FUNC(is_regex_special, "$()*+.?[\\^{|")
-TEST_CTYPE_FUNC(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~")
-TEST_CTYPE_FUNC(isascii, ASCII)
-TEST_CTYPE_FUNC(islower, LOWER)
-TEST_CTYPE_FUNC(isupper, UPPER)
-TEST_CTYPE_FUNC(iscntrl, CNTRL)
-TEST_CTYPE_FUNC(ispunct, PUNCT)
-TEST_CTYPE_FUNC(isxdigit, DIGIT "abcdefABCDEF")
-TEST_CTYPE_FUNC(isprint, LOWER UPPER DIGIT PUNCT " ")
-
 int cmd_main(int argc, const char **argv) {
-	/* Run all character type tests */
-	TEST_CHAR_CLASS(isspace);
-	TEST_CHAR_CLASS(isdigit);
-	TEST_CHAR_CLASS(isalpha);
-	TEST_CHAR_CLASS(isalnum);
-	TEST_CHAR_CLASS(is_glob_special);
-	TEST_CHAR_CLASS(is_regex_special);
-	TEST_CHAR_CLASS(is_pathspec_magic);
-	TEST_CHAR_CLASS(isascii);
-	TEST_CHAR_CLASS(islower);
-	TEST_CHAR_CLASS(isupper);
-	TEST_CHAR_CLASS(iscntrl);
-	TEST_CHAR_CLASS(ispunct);
-	TEST_CHAR_CLASS(isxdigit);
-	TEST_CHAR_CLASS(isprint);
+	TEST_CHAR_CLASS(isspace, " \n\r\t");
+	TEST_CHAR_CLASS(isdigit, DIGIT);
+	TEST_CHAR_CLASS(isalpha, LOWER UPPER);
+	TEST_CHAR_CLASS(isalnum, LOWER UPPER DIGIT);
+	TEST_CHAR_CLASS(is_glob_special, "*?[\\");
+	TEST_CHAR_CLASS(is_regex_special, "$()*+.?[\\^{|");
+	TEST_CHAR_CLASS(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~");
+	TEST_CHAR_CLASS(isascii, ASCII);
+	TEST_CHAR_CLASS(islower, LOWER);
+	TEST_CHAR_CLASS(isupper, UPPER);
+	TEST_CHAR_CLASS(iscntrl, CNTRL);
+	TEST_CHAR_CLASS(ispunct, PUNCT);
+	TEST_CHAR_CLASS(isxdigit, DIGIT "abcdefABCDEF");
+	TEST_CHAR_CLASS(isprint, LOWER UPPER DIGIT PUNCT " ");

 	return test_done();
 }
--
2.44.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 0/4] t-ctype: simplify unit test definitions
  2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
                     ` (3 preceding siblings ...)
  2024-03-03 10:13   ` [PATCH v2 4/4] t-ctype: avoid duplicating class names René Scharfe
@ 2024-03-04  9:25   ` Christian Couder
  4 siblings, 0 replies; 30+ messages in thread
From: Christian Couder @ 2024-03-04  9:25 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Phillip Wood, Josh Steadmon, Achu Luma

On Sun, Mar 3, 2024 at 11:13 AM René Scharfe <l.s.r@web.de> wrote:
>
> Simplify the ctype unit tests to allow combining specification strings
> in any order and no longer require repeating class names.
>
> Changes since v1:
> * Added checks to guard string length calculation using sizeof.
> * Kept the definition string in the error output.
> * Added patches 2 and 3 with further cosmetic changes.
> * Dropped the last patch with the huge output for now, as it needs
>   further thought.

This V2 looks good to me. Acked.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] t-ctype: avoid duplicating class names
  2024-03-03 10:13   ` [PATCH v2 4/4] t-ctype: avoid duplicating class names René Scharfe
@ 2024-03-04  9:51     ` Phillip Wood
  2024-03-06 18:16       ` René Scharfe
  0 siblings, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2024-03-04  9:51 UTC (permalink / raw)
  To: René Scharfe, git; +Cc: Josh Steadmon, Achu Luma, Christian Couder

Hi René

On 03/03/2024 10:13, René Scharfe wrote:
> TEST_CTYPE_FUNC defines a function for testing a character classifier,
> TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice.
> 
> Avoid the need to define a class-specific function by letting
> TEST_CHAR_CLASS do all the work.  This is done by using the internal
> functions test__run_begin() and test__run_end(), but they do exist to be
> used in test macros after all.

Those internal functions exist to implement the TEST() macro, they are 
not really intended for use outside that (which is why they are marked 
as private in the header file). If we ever want to update the 
implementation of TEST() it will be a lot harder if we're using the 
internal implementation directly in test files. Unit tests should be 
wrapping TEST() if it is appropriate but not the internal implementation 
directly.

Ideally we wouldn't need TEST_CTYPE_FUNC as there would only be a single 
function that was passed a ctype predicate, an input array and an array 
of expected results. Unfortunately I don't think that is possible due 
the the way the ctype predicates are implemented. Having separate macros 
to define the test function and to run the test is annoying but I don't 
think it is really worth exposing the internal implementation just to 
avoid it.

The other patches here look like useful improvements - thanks.

Best Wishes

Phillip

> Alternatively we could unroll the loop to provide a very long expression
> that tests all 256 characters and EOF and hand that to TEST, but that
> seems awkward and hard to read.
> 
> No change of behavior or output intended.
> 
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>   t/unit-tests/t-ctype.c | 64 ++++++++++++++++--------------------------
>   1 file changed, 24 insertions(+), 40 deletions(-)
> 
> diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
> index 02d8569aa3..d6ac1fe678 100644
> --- a/t/unit-tests/t-ctype.c
> +++ b/t/unit-tests/t-ctype.c
> @@ -1,19 +1,19 @@
>   #include "test-lib.h"
> 
> -/* Macro to test a character type */
> -#define TEST_CTYPE_FUNC(func, string) \
> -static void test_ctype_##func(void) { \
> +#define TEST_CHAR_CLASS(class, string) do { \
>   	size_t len = ARRAY_SIZE(string) - 1 + \
>   		BUILD_ASSERT_OR_ZERO(ARRAY_SIZE(string) > 0) + \
>   		BUILD_ASSERT_OR_ZERO(sizeof(string[0]) == sizeof(char)); \
> -	for (int i = 0; i < 256; i++) { \
> -		if (!check_int(func(i), ==, !!memchr(string, i, len))) \
> -			test_msg("      i: 0x%02x", i); \
> +	int skip = test__run_begin(); \
> +	if (!skip) { \
> +		for (int i = 0; i < 256; i++) { \
> +			if (!check_int(class(i), ==, !!memchr(string, i, len)))\
> +				test_msg("      i: 0x%02x", i); \
> +		} \
> +		check(!class(EOF)); \
>   	} \
> -	check(!func(EOF)); \
> -}
> -
> -#define TEST_CHAR_CLASS(class) TEST(test_ctype_##class(), #class " works")
> +	test__run_end(!skip, TEST_LOCATION(), #class " works"); \
> +} while (0)
> 
>   #define DIGIT "0123456789"
>   #define LOWER "abcdefghijklmnopqrstuvwxyz"
> @@ -33,37 +33,21 @@ static void test_ctype_##func(void) { \
>   	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
>   	"\x7f"
> 
> -TEST_CTYPE_FUNC(isdigit, DIGIT)
> -TEST_CTYPE_FUNC(isspace, " \n\r\t")
> -TEST_CTYPE_FUNC(isalpha, LOWER UPPER)
> -TEST_CTYPE_FUNC(isalnum, LOWER UPPER DIGIT)
> -TEST_CTYPE_FUNC(is_glob_special, "*?[\\")
> -TEST_CTYPE_FUNC(is_regex_special, "$()*+.?[\\^{|")
> -TEST_CTYPE_FUNC(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~")
> -TEST_CTYPE_FUNC(isascii, ASCII)
> -TEST_CTYPE_FUNC(islower, LOWER)
> -TEST_CTYPE_FUNC(isupper, UPPER)
> -TEST_CTYPE_FUNC(iscntrl, CNTRL)
> -TEST_CTYPE_FUNC(ispunct, PUNCT)
> -TEST_CTYPE_FUNC(isxdigit, DIGIT "abcdefABCDEF")
> -TEST_CTYPE_FUNC(isprint, LOWER UPPER DIGIT PUNCT " ")
> -
>   int cmd_main(int argc, const char **argv) {
> -	/* Run all character type tests */
> -	TEST_CHAR_CLASS(isspace);
> -	TEST_CHAR_CLASS(isdigit);
> -	TEST_CHAR_CLASS(isalpha);
> -	TEST_CHAR_CLASS(isalnum);
> -	TEST_CHAR_CLASS(is_glob_special);
> -	TEST_CHAR_CLASS(is_regex_special);
> -	TEST_CHAR_CLASS(is_pathspec_magic);
> -	TEST_CHAR_CLASS(isascii);
> -	TEST_CHAR_CLASS(islower);
> -	TEST_CHAR_CLASS(isupper);
> -	TEST_CHAR_CLASS(iscntrl);
> -	TEST_CHAR_CLASS(ispunct);
> -	TEST_CHAR_CLASS(isxdigit);
> -	TEST_CHAR_CLASS(isprint);
> +	TEST_CHAR_CLASS(isspace, " \n\r\t");
> +	TEST_CHAR_CLASS(isdigit, DIGIT);
> +	TEST_CHAR_CLASS(isalpha, LOWER UPPER);
> +	TEST_CHAR_CLASS(isalnum, LOWER UPPER DIGIT);
> +	TEST_CHAR_CLASS(is_glob_special, "*?[\\");
> +	TEST_CHAR_CLASS(is_regex_special, "$()*+.?[\\^{|");
> +	TEST_CHAR_CLASS(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~");
> +	TEST_CHAR_CLASS(isascii, ASCII);
> +	TEST_CHAR_CLASS(islower, LOWER);
> +	TEST_CHAR_CLASS(isupper, UPPER);
> +	TEST_CHAR_CLASS(iscntrl, CNTRL);
> +	TEST_CHAR_CLASS(ispunct, PUNCT);
> +	TEST_CHAR_CLASS(isxdigit, DIGIT "abcdefABCDEF");
> +	TEST_CHAR_CLASS(isprint, LOWER UPPER DIGIT PUNCT " ");
> 
>   	return test_done();
>   }
> --
> 2.44.0
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-03-02 22:00           ` René Scharfe
@ 2024-03-04 10:00             ` Christian Couder
  2024-03-06 18:16               ` René Scharfe
  2024-03-04 18:35             ` Josh Steadmon
  1 sibling, 1 reply; 30+ messages in thread
From: Christian Couder @ 2024-03-04 10:00 UTC (permalink / raw)
  To: René Scharfe; +Cc: Josh Steadmon, git, Phillip Wood, Achu Luma

On Sat, Mar 2, 2024 at 11:00 PM René Scharfe <l.s.r@web.de> wrote:
>
> Am 27.02.24 um 11:04 schrieb Christian Couder:

> > Yeah, I know about TAP harnesses like prove, but the most
> > straightforward way to run the unit tests is still `make unit-tests`
> > in the t/ directory. Also when you add or change some tests, it's a
> > good idea to run `make unit-tests` to see what the output is, so you
> > still have to see that output quite often when you work on tests and
> > going through 3598 of mostly useless output instead of just 14 isn't
> > nice.
>
> I was starting the programs from t/unit-tests/bin/ individually because
> I didn't know 'make unit-tests' exists.  This is much nicer, thank you!
> Especially after adding 'DEFAULT_UNIT_TEST_TARGET = unit-tests-prove' to
> config.mak to complement the 'DEFAULT_TEST_TARGET = prove' I added long
> ago.  It would be even nicer if the former was the default when the
> latter is set.
>
> As unit tests are added, their output is surely going to grow to
> multiple screens with or without prove, no?  So someone writing or
> debugging tests will still go back to starting then individually
> eventually.

When t-ctype will be run individually from t/unit-tests/bin/, for
example when adding or debugging ctype tests, it would still be better
if there are only 14 lines in its output rather than 3598.

> The size of the output in itself is not a problem, I assume, but that
> most of it is useless -- details of successful tests are uninteresting.
> A test harness can aggregate the output, but prove annoyed me when used
> with the regular tests by also aggregating error output and only showing
> the numbers of failed tests.  Finding their names involved running the
> test script again without prove.  Turns out it has an option for that.
> Added 'GIT_PROVE_OPTS = --failures' to config.mak as well, will see if
> it helps.
>
> Is it too much to ask developers to use a test harness?  Perhaps: It's
> yet another dependency and not enabled by default.

Yeah, it's a dependency, and when running CI tests, it's sometimes
better and simpler to have the canonical output rather than having the
output processed by a test harness.

Also if we add some verbose or immediate modes (like -v and -i with
the shell test framework) or perhaps other kinds of modes (checking
for memory leaks or other errors using existing tools for example),
these modes might not interact nicely with test harnesses but still be
useful. Requiring to always use a test harness might restrict us for
no good reason.

And anyway it doesn't make sense to have meaningful messages as second
arguments to the TEST() macros if we always want to use a test harness
that just discards them. Either:

  - we decide that we will always use some test harness, and then we
might just want to remove that second argument and yeah we can have
thousands of tests output lines from a single binary, or
  - we acknowledge that we don't always want to use a test harness,
and then we want a relatively short and meaningful output from a
single binary.

> What's the right level of aggregation and how do we achieve it?
> Grouping by class is natural and follows the test definition.  We
> could stop after patch 2.  Dunno.

I am Ok with just removing this patch like you did in v2. Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-03-02 22:00           ` René Scharfe
  2024-03-04 10:00             ` Christian Couder
@ 2024-03-04 18:35             ` Josh Steadmon
  2024-03-04 18:46               ` Junio C Hamano
  1 sibling, 1 reply; 30+ messages in thread
From: Josh Steadmon @ 2024-03-04 18:35 UTC (permalink / raw)
  To: René Scharfe; +Cc: Christian Couder, git, Phillip Wood, Achu Luma

On 2024.03.02 23:00, René Scharfe wrote:
> Am 27.02.24 um 11:04 schrieb Christian Couder:
> > On Mon, Feb 26, 2024 at 7:58 PM Josh Steadmon <steadmon@google.com> wrote:
> >>
> >> On 2024.02.26 18:26, René Scharfe wrote:
> >
> >>> The output is clean as well, but there's a lot of it.  Perhaps too much.
> >>> The success messages are boring, though, and if all checks pass then the
> >>> only useful information is the status code.  A TAP harness like prove
> >>> summarizes that nicely:
> >>>
> >>>    $ prove t/unit-tests/bin/t-ctype
> >>>    t/unit-tests/bin/t-ctype .. ok
> >>>    All tests successful.
> >>>    Files=1, Tests=3598,  0 wallclock secs ( 0.08 usr +  0.00 sys =  0.08 CPU)
> >>>    Result: PASS
> >>>
> >>> Filtering out passing checks e.g. with "| grep -v ^ok" would help when
> >>> debugging a test failure. I vaguely miss the --immediate switch from the
> >>> regular test library, however.
> >>
> >> Yeah, I agree here. It's a lot of output but it's almost always going to
> >> be consumed by a test harness rather than a human, and it's easy to
> >> filter out the noise if someone does need to do some manual debugging.
> >
> > Yeah, I know about TAP harnesses like prove, but the most
> > straightforward way to run the unit tests is still `make unit-tests`
> > in the t/ directory. Also when you add or change some tests, it's a
> > good idea to run `make unit-tests` to see what the output is, so you
> > still have to see that output quite often when you work on tests and
> > going through 3598 of mostly useless output instead of just 14 isn't
> > nice.
> 
> I was starting the programs from t/unit-tests/bin/ individually because
> I didn't know 'make unit-tests' exists.  This is much nicer, thank you!
> Especially after adding 'DEFAULT_UNIT_TEST_TARGET = unit-tests-prove' to
> config.mak to complement the 'DEFAULT_TEST_TARGET = prove' I added long
> ago.  It would be even nicer if the former was the default when the
> latter is set.

After js/unit-test-suite-runner [1] is merged, then using
'DEFAULT_TEST_TARGET = prove' will also run the unit tests alongside the
shell test suite.

[1] https://lore.kernel.org/git/cover.1708728717.git.steadmon@google.com/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-03-04 18:35             ` Josh Steadmon
@ 2024-03-04 18:46               ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2024-03-04 18:46 UTC (permalink / raw)
  To: Josh Steadmon
  Cc: René Scharfe, Christian Couder, git, Phillip Wood, Achu Luma

Josh Steadmon <steadmon@google.com> writes:

> On 2024.03.02 23:00, René Scharfe wrote:
> ...
>> I was starting the programs from t/unit-tests/bin/ individually because
>> I didn't know 'make unit-tests' exists.  This is much nicer, thank you!
>> Especially after adding 'DEFAULT_UNIT_TEST_TARGET = unit-tests-prove' to
>> config.mak to complement the 'DEFAULT_TEST_TARGET = prove' I added long
>> ago.  It would be even nicer if the former was the default when the
>> latter is set.
>
> After js/unit-test-suite-runner [1] is merged, then using
> 'DEFAULT_TEST_TARGET = prove' will also run the unit tests alongside the
> shell test suite.
>
> [1] https://lore.kernel.org/git/cover.1708728717.git.steadmon@google.com/

Nice ;-).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] t-ctype: avoid duplicating class names
  2024-03-04  9:51     ` Phillip Wood
@ 2024-03-06 18:16       ` René Scharfe
  2024-03-08 14:05         ` Phillip Wood
  2024-03-09 11:28         ` Phillip Wood
  0 siblings, 2 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-06 18:16 UTC (permalink / raw)
  To: phillip.wood, git; +Cc: Josh Steadmon, Achu Luma, Christian Couder

Hello Phillip,

Am 04.03.24 um 10:51 schrieb Phillip Wood:
> On 03/03/2024 10:13, René Scharfe wrote:
>> TEST_CTYPE_FUNC defines a function for testing a character classifier,
>> TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice.
>>
>> Avoid the need to define a class-specific function by letting
>> TEST_CHAR_CLASS do all the work.  This is done by using the internal
>> functions test__run_begin() and test__run_end(), but they do exist to be
>> used in test macros after all.
>
> Those internal functions exist to implement the TEST() macro, they
> are not really intended for use outside that (which is why they are
> marked as private in the header file). If we ever want to update the
> implementation of TEST() it will be a lot harder if we're using the
> internal implementation directly in test files. Unit tests should be
> wrapping TEST() if it is appropriate but not the internal
> implementation directly.

forcing tests to be expressions and not allow them to use statements is
an unusual requirement.  I don't see how the added friction would make
tests any better.  It just requires more boilerplate code and annoying
repetition.  What kind of changes do you envision that would be
hindered by allowing statements?

> Ideally we wouldn't need TEST_CTYPE_FUNC as there would only be a
> single function that was passed a ctype predicate, an input array and
> an array of expected results. Unfortunately I don't think that is
> possible due the the way the ctype predicates are implemented. Having
> separate macros to define the test function and to run the test is
> annoying but I don't think it is really worth exposing the internal
> implementation just to avoid it.

The classifiers are currently implemented as macros.  We could turn them
into inline functions and would then be able to pass them to a test
function.  Improving testability is a good idea, but also somehow feels
like the tail wagging the dog.  It would be easy, though, I think.  And
less gross than:

>> Alternatively we could unroll the loop to provide a very long expression
>> that tests all 256 characters and EOF and hand that to TEST, but that
>> seems awkward and hard to read.

... which would yield unsightly test macros and huge test binaries.  But
it would certainly be possible, and keep the definitions of the actual
tests clean.

René

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/3] t-ctype: do one test per class and char
  2024-03-04 10:00             ` Christian Couder
@ 2024-03-06 18:16               ` René Scharfe
  0 siblings, 0 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-06 18:16 UTC (permalink / raw)
  To: Christian Couder; +Cc: Josh Steadmon, git, Phillip Wood, Achu Luma

Am 04.03.24 um 11:00 schrieb Christian Couder:
> And anyway it doesn't make sense to have meaningful messages as second
> arguments to the TEST() macros if we always want to use a test harness
> that just discards them. Either:
>
>   - we decide that we will always use some test harness, and then we
> might just want to remove that second argument and yeah we can have
> thousands of tests output lines from a single binary, or
>   - we acknowledge that we don't always want to use a test harness,
> and then we want a relatively short and meaningful output from a
> single binary.

Different situations require different levels of detail.  If all checks
pass, we just need that one bit of information.  If some fail, we need
as much helpful context as we can get.  Success messages are interesting
for someone who added a new test and as a progress indicator, but are
useless otherwise (to me at least).

Perhaps it would make sense to show ok message only after a delay if
writing to a terminal, similar to what progress.c does?  This would
effectively silence them, since our current unit tests currently take
only a fraction of a second to finish.

René

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] t-ctype: avoid duplicating class names
  2024-03-06 18:16       ` René Scharfe
@ 2024-03-08 14:05         ` Phillip Wood
  2024-03-09 11:28         ` Phillip Wood
  1 sibling, 0 replies; 30+ messages in thread
From: Phillip Wood @ 2024-03-08 14:05 UTC (permalink / raw)
  To: René Scharfe, phillip.wood, git
  Cc: Josh Steadmon, Achu Luma, Christian Couder

On 06/03/2024 18:16, René Scharfe wrote:

Sorry for the delay I somehow missed your reply

> Hello Phillip,
> 
> Am 04.03.24 um 10:51 schrieb Phillip Wood:
>> On 03/03/2024 10:13, René Scharfe wrote:
>>> TEST_CTYPE_FUNC defines a function for testing a character classifier,
>>> TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice.
>>>
>>> Avoid the need to define a class-specific function by letting
>>> TEST_CHAR_CLASS do all the work.  This is done by using the internal
>>> functions test__run_begin() and test__run_end(), but they do exist to be
>>> used in test macros after all.
>>
>> Those internal functions exist to implement the TEST() macro, they
>> are not really intended for use outside that (which is why they are
>> marked as private in the header file). If we ever want to update the
>> implementation of TEST() it will be a lot harder if we're using the
>> internal implementation directly in test files. Unit tests should be
>> wrapping TEST() if it is appropriate but not the internal
>> implementation directly.
> 
> forcing tests to be expressions and not allow them to use statements is
> an unusual requirement.

I don't think it is that unusual to require tests to be implemented as 
functions which more or less amounts to the same thing.

> I don't see how the added friction would make
> tests any better.  It just requires more boilerplate code and annoying
> repetition.  What kind of changes do you envision that would be
> hindered by allowing statements?

I'm worried about bugs being introduced by the internal functions being 
used incorrectly - it is not a user friendly API because it is designed 
around the limitations of implementing TEST(), not for general 
consumption. The unit test framework is very new so I don't think we can 
be sure that we wont need to change it and that will be more difficult 
if unit tests do not use TEST(). Maybe one of the changes we need is a 
better way of allowing statements?

>> Ideally we wouldn't need TEST_CTYPE_FUNC as there would only be a
>> single function that was passed a ctype predicate, an input array and
>> an array of expected results. Unfortunately I don't think that is
>> possible due the the way the ctype predicates are implemented. Having
>> separate macros to define the test function and to run the test is
>> annoying but I don't think it is really worth exposing the internal
>> implementation just to avoid it.
> 
> The classifiers are currently implemented as macros.  We could turn them
> into inline functions and would then be able to pass them to a test
> function.  Improving testability is a good idea, but also somehow feels
> like the tail wagging the dog.  It would be easy, though, I think.  And
> less gross than:

Making them functions would allow them to be passed as function 
arguments in our code as well, though I don't know if we have much use 
for that. I certainly agree it would be better than the alternative below.

Best Wishes

Phillip

>>> Alternatively we could unroll the loop to provide a very long expression
>>> that tests all 256 characters and EOF and hand that to TEST, but that
>>> seems awkward and hard to read.
> 
> ... which would yield unsightly test macros and huge test binaries.  But
> it would certainly be possible, and keep the definitions of the actual
> tests clean.
> 
> René
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] t-ctype: avoid duplicating class names
  2024-03-06 18:16       ` René Scharfe
  2024-03-08 14:05         ` Phillip Wood
@ 2024-03-09 11:28         ` Phillip Wood
  2024-03-10 12:48           ` René Scharfe
  1 sibling, 1 reply; 30+ messages in thread
From: Phillip Wood @ 2024-03-09 11:28 UTC (permalink / raw)
  To: René Scharfe, phillip.wood, git
  Cc: Josh Steadmon, Achu Luma, Christian Couder

Hi René

On 06/03/2024 18:16, René Scharfe wrote:
> Hello Phillip,
> 
> Am 04.03.24 um 10:51 schrieb Phillip Wood:
>> On 03/03/2024 10:13, René Scharfe wrote:
>>> TEST_CTYPE_FUNC defines a function for testing a character classifier,
>>> TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice.
>>>
>>> Avoid the need to define a class-specific function by letting
>>> TEST_CHAR_CLASS do all the work.  This is done by using the internal
>>> functions test__run_begin() and test__run_end(), but they do exist to be
>>> used in test macros after all.
>>
>> Those internal functions exist to implement the TEST() macro, they
>> are not really intended for use outside that (which is why they are
>> marked as private in the header file). If we ever want to update the
>> implementation of TEST() it will be a lot harder if we're using the
>> internal implementation directly in test files. Unit tests should be
>> wrapping TEST() if it is appropriate but not the internal
>> implementation directly.
> 
> forcing tests to be expressions and not allow them to use statements is
> an unusual requirement.  I don't see how the added friction would make
> tests any better.  It just requires more boilerplate code and annoying
> repetition.  What kind of changes do you envision that would be
> hindered by allowing statements?

On reflection I don't think I'm objecting to allowing statements, only 
the use of the private functions to do so. If we tweak test__run_begin() 
and test__run_end() so that the description is passed to 
test__run_begin() and we invert the return value of that function to 
match what test__run_end() is expecting then we can have

#define TEST_BEGIN(...)						\
	do {							\
		int run__ = test__run_begin(__VA_ARGS__);	\
		if (run__)

#define TEST_END				\
		test_run_end(run__);		\
	} while (0)

Which allow test authors to write

	TEST_BEGIN("my test") {
		/* test body here */
	} TEST_END;

The macros insulate the test code from having to worry about 
test_skip_all() and the "do { ... } while (0)" means that the compiler 
will complain if the author forgets TEST_END. I'm slightly on the fence 
about including the braces in the macros instead as that would make them 
harder to misuse but it would be less obvious that the test body is run 
in its own block. The compiler will allow the test author to 
accidentally nest two calls to TEST_BEGIN() but there is an assertion in 
test__run_begin() which will catch that at run time.

The slight downside compared to TEST() is that it is harder to return 
early if a check fails - we'd need something like

	TEST_BEGIN("my test") {
		if (!check(0))
			goto fail
		/* more checks */
	 fail:
		;
	} TEST_END;

Also unlike TEST(), TEST_END does not indicate to the caller whether the 
test failed or not but I'm not sure that matters in practice.

What do you think?

Best Wishes

Phillip


>> Ideally we wouldn't need TEST_CTYPE_FUNC as there would only be a
>> single function that was passed a ctype predicate, an input array and
>> an array of expected results. Unfortunately I don't think that is
>> possible due the the way the ctype predicates are implemented. Having
>> separate macros to define the test function and to run the test is
>> annoying but I don't think it is really worth exposing the internal
>> implementation just to avoid it.
> 
> The classifiers are currently implemented as macros.  We could turn them
> into inline functions and would then be able to pass them to a test
> function.  Improving testability is a good idea, but also somehow feels
> like the tail wagging the dog.  It would be easy, though, I think.  And
> less gross than:
> 
>>> Alternatively we could unroll the loop to provide a very long expression
>>> that tests all 256 characters and EOF and hand that to TEST, but that
>>> seems awkward and hard to read.
> 
> ... which would yield unsightly test macros and huge test binaries.  But
> it would certainly be possible, and keep the definitions of the actual
> tests clean.
> 
> René
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/4] t-ctype: avoid duplicating class names
  2024-03-09 11:28         ` Phillip Wood
@ 2024-03-10 12:48           ` René Scharfe
  0 siblings, 0 replies; 30+ messages in thread
From: René Scharfe @ 2024-03-10 12:48 UTC (permalink / raw)
  To: phillip.wood, git; +Cc: Josh Steadmon, Achu Luma, Christian Couder

Hi Phillip,

Am 09.03.24 um 12:28 schrieb Phillip Wood:
> On reflection I don't think I'm objecting to allowing statements,
> only the use of the private functions to do so. If we tweak
> test__run_begin() and test__run_end() so that the description is
> passed to test__run_begin() and we invert the return value of that
> function to match what test__run_end() is expecting then we can have
>
> #define TEST_BEGIN(...)                        \
>     do {                            \
>         int run__ = test__run_begin(__VA_ARGS__);    \
>         if (run__)
>
> #define TEST_END                \
>         test_run_end(run__);        \
>     } while (0)
>
> Which allow test authors to write
>
>     TEST_BEGIN("my test") {
>         /* test body here */
>     } TEST_END;
>
> The macros insulate the test code from having to worry about
> test_skip_all() and the "do { ... } while (0)" means that the
> compiler will complain if the author forgets TEST_END.

the location information is missing, but I get the idea.  That would
certainly work for t-ctype.

> I'm slightly on the fence about including the braces in the macros
> instead as that would make them harder to misuse but it would be less
> obvious that the test body is run in its own block. The compiler will
> allow the test author to accidentally nest two calls to TEST_BEGIN()
> but there is an assertion in test__run_begin() which will catch that
> at run time.

Thought about that as well, and I'd also be wary of including any of the
control statements.  Custom syntax requires learning, can have weird
side-effects and may be misunderstood by editors.

Below is a patch that adds the function test_start() and its companion
macro TEST_START, which allow defining tests with minimal ceremony,
similarly as in the shell-based test suite:

	#include "test-lib.h"

	int cmd_main(int argc, const char **argv)
	{
		if (TEST_START("something works"))
			check(something());
		if (TEST_START("something else works"))
			check(something_else());
		return test_done();
	}

It requires storing string copies and sits between the states of the
original test machinery, so it's a bit complicated.

The biggest downside so far, though, is that I couldn't find an example
in the other unit tests that it would simplify significantly.  At least
it would allow getting rid of the void pointers in t-strbuf, however.

> The slight downside compared to TEST() is that it is harder to return
> early if a check fails - we'd need something like
>
>     TEST_BEGIN("my test") {
>         if (!check(0))
>             goto fail
>         /* more checks */
>      fail:
>         ;
>     } TEST_END;

TEST is worse in this regard, as it doesn't allow "if" directly at all.
You can use short-circuiting, of course:

	TEST(check(!!ptr) && check(*ptr == value), "ptr points to value");

But you can do that with TEST_BEGIN as well.  In a function you can
return early, but you can use functions with both, too.

In your example you can use "continue" instead of "goto fail".  And with
"break" you can skip the test_run_end() call.  I consider both to be
downsides, though -- the abstraction is leaky.

> Also unlike TEST(), TEST_END does not indicate to the caller whether
> the test failed or not but I'm not sure that matters in practice.

Most TEST invocations out of t-basic ignore its return value so far.

ctx.result is left unchanged by test__run_end(), so we could still
access it if really needed.  Perhaps through a sanctioned function,
last_test_result() or similar.

René



diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
index d6ac1fe678..e7c846814e 100644
--- a/t/unit-tests/t-ctype.c
+++ b/t/unit-tests/t-ctype.c
@@ -4,15 +4,13 @@
 	size_t len = ARRAY_SIZE(string) - 1 + \
 		BUILD_ASSERT_OR_ZERO(ARRAY_SIZE(string) > 0) + \
 		BUILD_ASSERT_OR_ZERO(sizeof(string[0]) == sizeof(char)); \
-	int skip = test__run_begin(); \
-	if (!skip) { \
+	if (TEST_START(#class " works")) { \
 		for (int i = 0; i < 256; i++) { \
 			if (!check_int(class(i), ==, !!memchr(string, i, len)))\
 				test_msg("      i: 0x%02x", i); \
 		} \
 		check(!class(EOF)); \
 	} \
-	test__run_end(!skip, TEST_LOCATION(), #class " works"); \
 } while (0)

 #define DIGIT "0123456789"
diff --git a/t/unit-tests/test-lib.c b/t/unit-tests/test-lib.c
index 66d6980ffb..bc484c92da 100644
--- a/t/unit-tests/test-lib.c
+++ b/t/unit-tests/test-lib.c
@@ -16,6 +16,8 @@ static struct {
 	unsigned running :1;
 	unsigned skip_all :1;
 	unsigned todo :1;
+	char *desc;
+	char *location;
 } ctx = {
 	.lazy_plan = 1,
 	.result = RESULT_NONE,
@@ -123,9 +125,45 @@ void test_plan(int count)
 	ctx.lazy_plan = 0;
 }

+static void test_run_maybe_end(void)
+{
+	if (ctx.running) {
+		assert(ctx.location);
+		assert(ctx.desc);
+		test__run_end(0, ctx.location, "%s", ctx.desc);
+		FREE_AND_NULL(ctx.location);
+		FREE_AND_NULL(ctx.desc);
+	}
+	assert(!ctx.running);
+	assert(!ctx.location);
+	assert(!ctx.desc);
+}
+
+int test_start(const char *location, const char *format, ...)
+{
+	va_list ap;
+	char *desc;
+
+	test_run_maybe_end();
+
+	va_start(ap, format);
+	desc = xstrvfmt(format, ap);
+	va_end(ap);
+
+	if (test__run_begin()) {
+		test__run_end(1, location, "%s", desc);
+		free(desc);
+		return 0;
+	} else {
+		ctx.location = xstrdup(location);
+		ctx.desc = desc;
+		return 1;
+	}
+}
+
 int test_done(void)
 {
-	assert(!ctx.running);
+	test_run_maybe_end();

 	if (ctx.lazy_plan)
 		test_plan(ctx.count);
diff --git a/t/unit-tests/test-lib.h b/t/unit-tests/test-lib.h
index a8f07ae0b7..2be95b3ab8 100644
--- a/t/unit-tests/test-lib.h
+++ b/t/unit-tests/test-lib.h
@@ -21,6 +21,15 @@
  */
 void test_plan(int count);

+/*
+ * Start a test.  It ends when the next test starts or test_done()
+ * is called.  Returns 1 if the test was actually started, 0 if it was
+ * skipped because test_skip_all() had been called.
+ */
+int test_start(const char *location, const char *format, ...);
+
+#define TEST_START(...) test_start(TEST_LOCATION(), __VA_ARGS__)
+
 /*
  * test_done() must be called at the end of main(). It will print the
  * plan if plan() was not called at the beginning of the test program

^ permalink raw reply related	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2024-03-10 12:48 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-25 11:27 [PATCH 0/3] t-ctype: simplify unit test definitions René Scharfe
2024-02-25 11:27 ` [PATCH 1/3] t-ctype: allow NUL anywhere in the specification string René Scharfe
2024-02-25 18:05   ` Eric Sunshine
2024-02-25 18:28     ` René Scharfe
2024-02-25 18:41       ` Eric Sunshine
2024-02-25 21:00         ` Jeff King
2024-02-25 21:02           ` Eric Sunshine
2024-02-25 11:27 ` [PATCH 2/3] t-ctype: avoid duplicating class names René Scharfe
2024-02-25 11:27 ` [PATCH 3/3] t-ctype: do one test per class and char René Scharfe
2024-02-26  9:28   ` Christian Couder
2024-02-26 17:26     ` René Scharfe
2024-02-26 17:44       ` Junio C Hamano
2024-02-26 18:58       ` Josh Steadmon
2024-02-27 10:04         ` Christian Couder
2024-03-02 22:00           ` René Scharfe
2024-03-04 10:00             ` Christian Couder
2024-03-06 18:16               ` René Scharfe
2024-03-04 18:35             ` Josh Steadmon
2024-03-04 18:46               ` Junio C Hamano
2024-03-03 10:13 ` [PATCH v2 0/4] t-ctype: simplify unit test definitions René Scharfe
2024-03-03 10:13   ` [PATCH v2 1/4] t-ctype: allow NUL anywhere in the specification string René Scharfe
2024-03-03 10:13   ` [PATCH v2 2/4] t-ctype: simplify EOF check René Scharfe
2024-03-03 10:13   ` [PATCH v2 3/4] t-ctype: align output of i René Scharfe
2024-03-03 10:13   ` [PATCH v2 4/4] t-ctype: avoid duplicating class names René Scharfe
2024-03-04  9:51     ` Phillip Wood
2024-03-06 18:16       ` René Scharfe
2024-03-08 14:05         ` Phillip Wood
2024-03-09 11:28         ` Phillip Wood
2024-03-10 12:48           ` René Scharfe
2024-03-04  9:25   ` [PATCH v2 0/4] t-ctype: simplify unit test definitions Christian Couder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.