All of lore.kernel.org
 help / color / mirror / Atom feed
* [LTP] [PATCH] lib: tst_bool_expr: Add support for strings
@ 2020-11-11 13:11 Cyril Hrubis
  2020-11-11 14:19 ` Richard Palethorpe
  0 siblings, 1 reply; 6+ messages in thread
From: Cyril Hrubis @ 2020-11-11 13:11 UTC (permalink / raw)
  To: ltp

There are cases where special characters, e.g. parentesis appear in the
value of a kernel config variable, so in order to be able to parse
boolean variables such as "CONFIG_DEFAULT_HOSTNAME=\"(none)\"" the
tokenizer must be able to parse strings.

The implementation is easy, when in string we do not split the input
into tokens.

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
---
 lib/newlib_tests/config01        |  1 +
 lib/newlib_tests/config02        |  1 +
 lib/newlib_tests/config03        |  1 +
 lib/newlib_tests/config04        |  1 +
 lib/newlib_tests/config05        |  1 +
 lib/newlib_tests/test_kconfig.c  |  1 +
 lib/newlib_tests/tst_bool_expr.c |  3 +++
 lib/tst_bool_expr.c              | 18 +++++++++++++-----
 8 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/lib/newlib_tests/config01 b/lib/newlib_tests/config01
index 96d68d836..1d94d810a 100644
--- a/lib/newlib_tests/config01
+++ b/lib/newlib_tests/config01
@@ -2,3 +2,4 @@
 CONFIG_MMU=y
 CONFIG_EXT4_FS=m
 CONFIG_PGTABLE_LEVELS=4
+CONFIG_DEFAULT_HOSTNAME="(none)"
diff --git a/lib/newlib_tests/config02 b/lib/newlib_tests/config02
index 2de45cff8..e1b0e8086 100644
--- a/lib/newlib_tests/config02
+++ b/lib/newlib_tests/config02
@@ -2,3 +2,4 @@
 # CONFIG_MMU is not set
 CONFIG_EXT4_FS=m
 CONFIG_PGTABLE_LEVELS=4
+CONFIG_DEFAULT_HOSTNAME="(none)"
diff --git a/lib/newlib_tests/config03 b/lib/newlib_tests/config03
index 1a3b9e648..05c8e194a 100644
--- a/lib/newlib_tests/config03
+++ b/lib/newlib_tests/config03
@@ -2,3 +2,4 @@
 CONFIG_MMU=y
 CONFIG_EXT4_FS=m
 CONFIG_PGTABLE_LEVELS=44
+CONFIG_DEFAULT_HOSTNAME="(none)"
diff --git a/lib/newlib_tests/config04 b/lib/newlib_tests/config04
index cce7051ae..da01579b6 100644
--- a/lib/newlib_tests/config04
+++ b/lib/newlib_tests/config04
@@ -2,3 +2,4 @@
 CONFIG_MMU=y
 CONFIG_EXT4_FS=y
 CONFIG_PGTABLE_LEVELS=4
+CONFIG_DEFAULT_HOSTNAME="(none)"
diff --git a/lib/newlib_tests/config05 b/lib/newlib_tests/config05
index a9d7bab4d..490f94fa6 100644
--- a/lib/newlib_tests/config05
+++ b/lib/newlib_tests/config05
@@ -1,3 +1,4 @@
 # Everything is wrong
 CONFIG_EXT4_FS=y
 CONFIG_PGTABLE_LEVELS=44
+CONFIG_DEFAULT_HOSTNAME=""
diff --git a/lib/newlib_tests/test_kconfig.c b/lib/newlib_tests/test_kconfig.c
index 1f659b95a..9280f07ca 100644
--- a/lib/newlib_tests/test_kconfig.c
+++ b/lib/newlib_tests/test_kconfig.c
@@ -16,6 +16,7 @@ static const char *kconfigs[] = {
 	"CONFIG_PGTABLE_LEVELS=4",
 	"CONFIG_MMU & CONFIG_EXT4_FS=m",
 	"CONFIG_EXT4_FS=m | CONFIG_MMU",
+	"CONFIG_DEFAULT_HOSTNAME=\"(none)\"",
 	NULL
 };
 
diff --git a/lib/newlib_tests/tst_bool_expr.c b/lib/newlib_tests/tst_bool_expr.c
index f9bb1780d..8f0929d35 100644
--- a/lib/newlib_tests/tst_bool_expr.c
+++ b/lib/newlib_tests/tst_bool_expr.c
@@ -102,6 +102,9 @@ static void do_test(void)
 	do_eval_test("False & A", 1, 0, 0, 0);
 	do_eval_test("! Undefined", 0, 0, 0, -1);
 
+	do_eval_test("\"(none)\"", 0, 0, 0, -1);
+	do_eval_test("\"(none)\" & \" \"", 0, 0, 0, -1);
+
 	parse_fail("A!");
 	parse_fail("A &");
 	parse_fail("A B");
diff --git a/lib/tst_bool_expr.c b/lib/tst_bool_expr.c
index dd147cde3..35ffa5a80 100644
--- a/lib/tst_bool_expr.c
+++ b/lib/tst_bool_expr.c
@@ -64,6 +64,7 @@ static unsigned int tokenize(const char *expr, struct tst_expr_tok *last)
 {
 	size_t i, j;
 	unsigned int token_cnt = 0;
+	int in_string = 0;
 
 	for (j = i = 0; expr[i]; i++) {
 		switch (expr[i]) {
@@ -72,14 +73,21 @@ static unsigned int tokenize(const char *expr, struct tst_expr_tok *last)
 		case '!':
 		case '&':
 		case '|':
-			token_cnt += new_tok(&last, &expr[j], i - j);
-			token_cnt += new_tok(&last, &expr[i], 1);
-			j = i+1;
+			if (!in_string) {
+				token_cnt += new_tok(&last, &expr[j], i - j);
+				token_cnt += new_tok(&last, &expr[i], 1);
+				j = i+1;
+			}
 		break;
 		case '\t':
 		case ' ':
-			token_cnt += new_tok(&last, &expr[j], i - j);
-			j = i+1;
+			if (!in_string) {
+				token_cnt += new_tok(&last, &expr[j], i - j);
+				j = i+1;
+			}
+		break;
+		case '"':
+			in_string = !in_string;
 		break;
 		default:
 		break;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [LTP] [PATCH] lib: tst_bool_expr: Add support for strings
  2020-11-11 13:11 [LTP] [PATCH] lib: tst_bool_expr: Add support for strings Cyril Hrubis
@ 2020-11-11 14:19 ` Richard Palethorpe
  2020-11-11 14:37   ` Cyril Hrubis
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Palethorpe @ 2020-11-11 14:19 UTC (permalink / raw)
  To: ltp

Hello,

Cyril Hrubis <chrubis@suse.cz> writes:

> diff --git a/lib/tst_bool_expr.c b/lib/tst_bool_expr.c
> index dd147cde3..35ffa5a80 100644
> --- a/lib/tst_bool_expr.c
> +++ b/lib/tst_bool_expr.c
> @@ -64,6 +64,7 @@ static unsigned int tokenize(const char *expr, struct tst_expr_tok *last)
>  {
>  	size_t i, j;
>  	unsigned int token_cnt = 0;
> +	int in_string = 0;
>  
>  	for (j = i = 0; expr[i]; i++) {


Why not skip the whole switch statement if in_string and just check for
the closing '"' instead?

>  		switch (expr[i]) {
> @@ -72,14 +73,21 @@ static unsigned int tokenize(const char *expr, struct tst_expr_tok *last)
>  		case '!':
>  		case '&':
>  		case '|':
> -			token_cnt += new_tok(&last, &expr[j], i - j);
> -			token_cnt += new_tok(&last, &expr[i], 1);
> -			j = i+1;
> +			if (!in_string) {
> +				token_cnt += new_tok(&last, &expr[j], i - j);
> +				token_cnt += new_tok(&last, &expr[i], 1);
> +				j = i+1;
> +			}
>  		break;
>  		case '\t':
>  		case ' ':
> -			token_cnt += new_tok(&last, &expr[j], i - j);
> -			j = i+1;
> +			if (!in_string) {
> +				token_cnt += new_tok(&last, &expr[j], i - j);
> +				j = i+1;
> +			}
> +		break;
> +		case '"':
> +			in_string = !in_string;
>  		break;
>  		default:
>  		break;
> -- 
> 2.26.2

It should probably be an error if tokenize exits with in_string=1?

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [LTP] [PATCH] lib: tst_bool_expr: Add support for strings
  2020-11-11 14:19 ` Richard Palethorpe
@ 2020-11-11 14:37   ` Cyril Hrubis
  2020-11-11 17:38     ` Richard Palethorpe
  0 siblings, 1 reply; 6+ messages in thread
From: Cyril Hrubis @ 2020-11-11 14:37 UTC (permalink / raw)
  To: ltp

Hi!
> Why not skip the whole switch statement if in_string and just check for
> the closing '"' instead?

I guess that we can instead do:

diff --git a/lib/tst_bool_expr.c b/lib/tst_bool_expr.c
index dd147cde3..387c38b91 100644
--- a/lib/tst_bool_expr.c
+++ b/lib/tst_bool_expr.c
@@ -81,6 +81,13 @@ static unsigned int tokenize(const char *expr, struct tst_expr_tok *last)
                        token_cnt += new_tok(&last, &expr[j], i - j);
                        j = i+1;
                break;
+               case '"':
+                       while (expr[i+1] != '"' && expr[i+1])
+                               i++;
+
+                       if (expr[i+1] == '"')
+                               i++;
+               break;
                default:
                break;
                }


Not sure if this is more readable.

> >  		break;
> >  		default:
> >  		break;
> > -- 
> > 2.26.2
> 
> It should probably be an error if tokenize exits with in_string=1?

Well my opinion is that we do not care since we will get error later on, in
case of the kernel config we would get obviously wrong variable value. But if
you think that this is important we can change the return type to int
and return -1 on tokenizer failure.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [LTP] [PATCH] lib: tst_bool_expr: Add support for strings
  2020-11-11 14:37   ` Cyril Hrubis
@ 2020-11-11 17:38     ` Richard Palethorpe
  2020-11-12 13:02       ` Cyril Hrubis
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Palethorpe @ 2020-11-11 17:38 UTC (permalink / raw)
  To: ltp

Hello,

Cyril Hrubis <chrubis@suse.cz> writes:

> Hi!
>> Why not skip the whole switch statement if in_string and just check for
>> the closing '"' instead?
>
> I guess that we can instead do:
>
> diff --git a/lib/tst_bool_expr.c b/lib/tst_bool_expr.c
> index dd147cde3..387c38b91 100644
> --- a/lib/tst_bool_expr.c
> +++ b/lib/tst_bool_expr.c
> @@ -81,6 +81,13 @@ static unsigned int tokenize(const char *expr, struct tst_expr_tok *last)
>                         token_cnt += new_tok(&last, &expr[j], i - j);
>                         j = i+1;
>                 break;
> +               case '"':
> +                       while (expr[i+1] != '"' && expr[i+1])
> +                               i++;
> +
> +                       if (expr[i+1] == '"')
> +                               i++;
> +               break;
>                 default:
>                 break;
>                 }
>
>
> Not sure if this is more readable.

It's probably easier to understand, it atleast keeps string processing
separate.

>
>> >  		break;
>> >  		default:
>> >  		break;
>> > -- 
>> > 2.26.2
>> 
>> It should probably be an error if tokenize exits with in_string=1?
>
> Well my opinion is that we do not care since we will get error later on, in
> case of the kernel config we would get obviously wrong variable value. But if
> you think that this is important we can change the return type to int
> and return -1 on tokenizer failure.

I suppose it depends how much later and how such an error would look. If
the error will happen directly after leaving this function then I don't
think it matters either, but if it is possible that it manages to
complete the whole evaluation process before failing with TCONF because
the vars don't match then this has the potential to waste some time.

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [LTP] [PATCH] lib: tst_bool_expr: Add support for strings
  2020-11-11 17:38     ` Richard Palethorpe
@ 2020-11-12 13:02       ` Cyril Hrubis
  2020-11-12 14:02         ` Richard Palethorpe
  0 siblings, 1 reply; 6+ messages in thread
From: Cyril Hrubis @ 2020-11-12 13:02 UTC (permalink / raw)
  To: ltp

Hi!
> I suppose it depends how much later and how such an error would look. If
> the error will happen directly after leaving this function then I don't
> think it matters either, but if it is possible that it manages to
> complete the whole evaluation process before failing with TCONF because
> the vars don't match then this has the potential to waste some time.

Well I can easily add this kind of check here, however out of all
possible mistakes that you can do with misplaced quotes this would only
catch the cases where the number of quotes is odd. I.e. only a subset of
all possible errors, everything else would be caught later on when we
attempt to evaluate the expression.

E.g.

	CONFIG_FOO"=val"
	"CONFIG_FOO=val & CONFIG_BAR"
	"CONFIG_FOO"
	etc.

are all valid in this context, but completely wrong when evaluated.

I guess that it would make more sense to check if variable token is sane
in the upper layer, i.e. in the kconfig parser since we do have the full
information about how it should look like there. And this check would
also cover this case as well.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [LTP] [PATCH] lib: tst_bool_expr: Add support for strings
  2020-11-12 13:02       ` Cyril Hrubis
@ 2020-11-12 14:02         ` Richard Palethorpe
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Palethorpe @ 2020-11-12 14:02 UTC (permalink / raw)
  To: ltp

Hello,

Cyril Hrubis <chrubis@suse.cz> writes:

> Hi!
>> I suppose it depends how much later and how such an error would look. If
>> the error will happen directly after leaving this function then I don't
>> think it matters either, but if it is possible that it manages to
>> complete the whole evaluation process before failing with TCONF because
>> the vars don't match then this has the potential to waste some time.
>
> Well I can easily add this kind of check here, however out of all
> possible mistakes that you can do with misplaced quotes this would only
> catch the cases where the number of quotes is odd. I.e. only a subset of
> all possible errors, everything else would be caught later on when we
> attempt to evaluate the expression.
>
> E.g.
>
> 	CONFIG_FOO"=val"
> 	"CONFIG_FOO=val & CONFIG_BAR"
> 	"CONFIG_FOO"
> 	etc.
>
> are all valid in this context, but completely wrong when evaluated.
>
> I guess that it would make more sense to check if variable token is sane
> in the upper layer, i.e. in the kconfig parser since we do have the full
> information about how it should look like there. And this check would
> also cover this case as well.

+1


-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-11-12 14:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-11 13:11 [LTP] [PATCH] lib: tst_bool_expr: Add support for strings Cyril Hrubis
2020-11-11 14:19 ` Richard Palethorpe
2020-11-11 14:37   ` Cyril Hrubis
2020-11-11 17:38     ` Richard Palethorpe
2020-11-12 13:02       ` Cyril Hrubis
2020-11-12 14:02         ` Richard Palethorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.