Re: [PATCH] Port helper/test-ctype.c to unit-tests/t-ctype.c

From: Taylor Blau <me@ttaylorr.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Achu Luma <ach.lumap@gmail.com>,
	christian.couder@gmail.com, git@vger.kernel.org,
	Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [PATCH] Port helper/test-ctype.c to unit-tests/t-ctype.c
Date: Tue, 2 Jan 2024 13:55:57 -0500
Message-ID: <ZZRcPapIHFnyZTYB@nand.local>
In-Reply-To: <xmqqsf3ohkew.fsf@gitster.g>

On Tue, Dec 26, 2023 at 10:45:59AM -0800, Junio C Hamano wrote:
> Achu Luma <ach.lumap@gmail.com> writes:
>
> > diff --git a/t/helper/test-ctype.c b/t/helper/test-ctype.c
> > deleted file mode 100644
> > index e5659df40b..0000000000
> > --- a/t/helper/test-ctype.c
> > +++ /dev/null
> > @@ -1,70 +0,0 @@
> > -#include "test-tool.h"
> > -
> > -static int rc;
> > -
> > -static void report_error(const char *class, int ch)
> > -{
> > -	printf("%s classifies char %d (0x%02x) wrongly\n", class, ch, ch);
> > -	rc = 1;
> > -}
>
> So, if we have a is_foo() that characterises a byte that ought to
> be "foo" but gets miscategorised not to be "foo", we used to
> pinpoint exactly the byte value that was an issue.  We did not do
> any early return ...
>
> > ...
> > -#define TEST_CLASS(t,s) {			\
> > -	int i;					\
> > -	for (i = 0; i < 256; i++) {		\
> > -		if (is_in(s, i) != t(i))	\
> > -			report_error(#t, i);	\
> > -	}					\
> > -	if (t(EOF))				\
> > -		report_error(#t, EOF);		\
> > -}
>
> ... and reported for all errors in the "class".
>
> > diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
> > new file mode 100644
> > index 0000000000..41189ba9f9
> > --- /dev/null
> > +++ b/t/unit-tests/t-ctype.c
> > @@ -0,0 +1,76 @@
> > +#include "test-lib.h"
> > +
> > +static int is_in(const char *s, int ch)
> > +{
> > +	/*
> > +	 * We can't find NUL using strchr. Accept it as the first
> > +	 * character in the spec -- there are no empty classes.
> > +	 */
> > +	if (ch == '\0')
> > +		return ch == *s;
> > +	if (*s == '\0')
> > +		s++;
> > +	return !!strchr(s, ch);
> > +}
> > +
> > +/* Macro to test a character type */
> > +#define TEST_CTYPE_FUNC(func, string)			\
> > +static void test_ctype_##func(void)				\
> > +{								\
> > +	int i;                                     	 	\
> > +	for (i = 0; i < 256; i++)                 		\
> > +		check_int(func(i), ==, is_in(string, i)); 	\
> > +}
>
> Now, we let check_int() to do the checking for each and every byte
> value for the class.  check_int() uses different reporting and shows
> the problematic value in a way that is more verbose and at the same
> time is a less specific and harder to understand:
>
> 		test_msg("   left: %"PRIdMAX, a);
> 		test_msg("  right: %"PRIdMAX, b);
>
> But that is probably the price to pay to use a more generic
> framework, I guess.

Perhaps I'm missing something here, since I haven't followed the
unit-test effort very closely, but this check_int() macro feels like it
might be overkill for what we're trying to do.

We know that the expected value is the result of is_in(string, i), so I
wonder if we might benefit from having an "assert_equals()" that looks
like:

    assert_equals(is_in(string, i), func(i));

Where we follow the usual convention of treating the first argument as
the expected value, and the second as the actual value. Then we could
format our error message to be more specific, like:

    test_msg("expected %d, got %d", expected, actual);

I think that this would be a little more readable, and still seems
flexible enough to support the kind of thing that check_int(..., ==,
...) is after.
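
To make the idea concrete, here is a rough standalone sketch of such a
helper. It is not wired into the real unit-test framework: test_msg()
below is a local stand-in that only mimics the name, and
assert_equals(), assert_equals_impl(), count_mismatches() and the two
demo_isdigit*() functions are hypothetical names made up for
illustration. Only is_in() is taken from the patch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Local stand-in for the framework's test_msg(); an assumption. */
static void test_msg(const char *fmt, ...)
{
	va_list ap;

	fputs("# ", stdout);
	va_start(ap, fmt);
	vprintf(fmt, ap);
	va_end(ap);
	putchar('\n');
}

/* Hypothetical helper: returns 1 on a match, else reports and returns 0. */
static int assert_equals_impl(int expected, int actual)
{
	if (expected == actual)
		return 1;
	test_msg("expected %d, got %d", expected, actual);
	return 0;
}

/* First argument is the expected value, second is the actual value. */
#define assert_equals(expected, actual) \
	assert_equals_impl((expected), (actual))

/* Copied from the patch under review. */
static int is_in(const char *s, int ch)
{
	if (ch == '\0')
		return ch == *s;
	if (*s == '\0')
		s++;
	return !!strchr(s, ch);
}

/* Toy classifiers for the demo; the second one deliberately misses '9'. */
static int demo_isdigit(int ch)
{
	return ch >= '0' && ch <= '9';
}

static int demo_isdigit_broken(int ch)
{
	return ch >= '0' && ch <= '8';
}

/* Check every byte value without returning early; count the failures. */
static int count_mismatches(int (*func)(int), const char *spec)
{
	int i, failed = 0;

	for (i = 0; i < 256; i++)
		if (!assert_equals(is_in(spec, i), func(i)))
			failed++;
	return failed;
}
```

Calling count_mismatches(demo_isdigit_broken, "0123456789") reports a
single mismatch, for the byte '9', using the "expected ..., got ..."
format. A real version would presumably also want to record the file
and line, and perhaps the byte being tested, the way the existing
check_*() helpers report their context.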

> > +int cmd_main(int argc, const char **argv) {
> > +	/* Run all character type tests */
> > +	TEST(test_ctype_isspace(), "isspace() works as we expect");
> > +	TEST(test_ctype_isdigit(), "isdigit() works as we expect");
> > +	TEST(test_ctype_isalpha(), "isalpha() works as we expect");
> > +	TEST(test_ctype_isalnum(), "isalnum() works as we expect");
> > +	TEST(test_ctype_is_glob_special(), "is_glob_special() works as we expect");
> > +	TEST(test_ctype_is_regex_special(), "is_regex_special() works as we expect");
> > +	TEST(test_ctype_is_pathspec_magic(), "is_pathspec_magic() works as we expect");
> > +	TEST(test_ctype_isascii(), "isascii() works as we expect");
> > +	TEST(test_ctype_islower(), "islower() works as we expect");
> > +	TEST(test_ctype_isupper(), "isupper() works as we expect");
> > +	TEST(test_ctype_iscntrl(), "iscntrl() works as we expect");
> > +	TEST(test_ctype_ispunct(), "ispunct() works as we expect");
> > +	TEST(test_ctype_isxdigit(), "isxdigit() works as we expect");
> > +	TEST(test_ctype_isprint(), "isprint() works as we expect");
> > +
> > +	return test_done();
> > +}
>
> As a practice to use the unit-tests framework, the patch looks OK.
> helper/test-ctype.c indeed is an oddball that runs once and checks
> everything it wants to check, for which the unit tests framework is
> much more suited.

As an aside, I don't think we need the "works as we expect" suffix in
each test description. I personally would be fine with something like:

    TEST(test_ctype_isspace(), "isspace()");
    TEST(test_ctype_isdigit(), "isdigit()");
    ...

But I don't feel strongly about it.

Thanks,
Taylor