linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/17] detect invalid branches at evaluation time
@ 2020-04-13 16:15 Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 01/17] bad-goto: add testcase for 'jump inside discarded expression statement' Luc Van Oostenryck
                   ` (16 more replies)
  0 siblings, 17 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

It's not allowed to do a goto into an expression statement.
For exemple, it's not well defined what should happen if such
an expression is not otherwise reachable and/or can be optimized
away. For such situations GCC issues an error, clang doesn't
and produce a valid IR but Spare produce an invalid IR with
branches to unexisting BBs.

The goal of the patches in this series is:
*) to detect such gotos at evaluation time;
*) issue a sensible error message;
*) avoid the linearization of functions with
   invalid gotos.

This series is an alternative to the one named
"detect invalid branches in the IR" whch was posted
last week.

-- Luc

Luc Van Oostenryck (17):
  bad-goto: add testcase for 'jump inside discarded expression statement'
  bad-goto: add testcases for linearization of invalid labels
  bad-goto: add more testcases
  bad-goto: do not linearize if the IR will be invalid
  bad-goto: reorg test in evaluate_goto_statement()
  bad-goto: simplify testing of undeclared labels
  bad-goto: do not linearize function with undeclared labels
  bad-goto: catch labels with reserved names
  scope: no memset() needed after __alloc_scope()
  scope: move scope opening/ending inside compound_statement()
  scope: make function scope the same as the body block scope
  scope: s/{start,end}_symbol_scope/{start,end}_block_scope/
  scope: let labels have their own scope
  scope: add is_in_scope()
  scope: give a scope for labels & gotos
  bad-goto: catch gotos inside expression statements
  bad-goto: cleanup evaluate_goto()

 evaluate.c                              | 25 +++++++++++++--
 expression.c                            |  4 +--
 linearize.c                             |  2 +-
 parse.c                                 | 23 +++++++-------
 parse.h                                 |  1 +
 scope.c                                 | 33 +++++++++++++++----
 scope.h                                 | 10 ++++--
 symbol.h                                |  4 +++
 validation/label-scope1.c               | 42 +++++++++++++++++++++++++
 validation/label-stmt-expr1.c           | 29 +++++++++++++++++
 validation/linear/goto-and-expr-stmt0.c | 33 +++++++++++++++++++
 validation/linear/invalid-labels0.c     | 18 +++++++++++
 12 files changed, 199 insertions(+), 25 deletions(-)
 create mode 100644 validation/label-scope1.c
 create mode 100644 validation/label-stmt-expr1.c
 create mode 100644 validation/linear/goto-and-expr-stmt0.c
 create mode 100644 validation/linear/invalid-labels0.c


base-commit: 0f5a39dcea89c66236c04815b77b107763873431
-- 
2.26.0

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 01/17] bad-goto: add testcase for 'jump inside discarded expression statement'
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 02/17] bad-goto: add testcases for linearization of invalid labels Luc Van Oostenryck
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

A goto done into an piece of code discarded at expand or
linearize time will produce an invalid IR.

Add a testcase for it.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 validation/label-stmt-expr1.c           | 30 +++++++++++++++++++++++++
 validation/linear/goto-and-expr-stmt0.c | 28 +++++++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 validation/label-stmt-expr1.c
 create mode 100644 validation/linear/goto-and-expr-stmt0.c

diff --git a/validation/label-stmt-expr1.c b/validation/label-stmt-expr1.c
new file mode 100644
index 000000000000..47ba54ae7305
--- /dev/null
+++ b/validation/label-stmt-expr1.c
@@ -0,0 +1,30 @@
+static int foo(void)
+{
+	goto l;
+	return	({
+l:
+		0;
+	});
+}
+
+static void bar(void)
+{
+	int r;
+	r = ({
+l:
+		0;
+	});
+	goto l;
+}
+
+/*
+ * check-name: label-stmt-expr1
+ * check-known-to-fail
+ *
+ * check-error-start
+label-stmt-expr1.c:3:9: error: goto into statement expression
+label-stmt-expr1.c:5:1:    label 'l' is defined here
+label-stmt-expr1.c:17:9: error: goto into statement expression
+label-stmt-expr1.c:14:1:    label 'l' is defined here
+ * check-error-end
+ */
diff --git a/validation/linear/goto-and-expr-stmt0.c b/validation/linear/goto-and-expr-stmt0.c
new file mode 100644
index 000000000000..548813531779
--- /dev/null
+++ b/validation/linear/goto-and-expr-stmt0.c
@@ -0,0 +1,28 @@
+int t(void)
+{
+	goto inside;
+	return 1 ? 2 : ({
+inside:
+			return 3;
+			4;
+		    });
+}
+
+void f(int x, int y)
+{
+	1 ? x : ({
+a:
+		 y;
+	});
+	goto a;
+}
+
+/*
+ * check-name: goto-and-expr-stmt0
+ * check-command: test-linearize -Wno-decl $file
+ * check-known-to-fail
+ *
+ * check-output-ignore
+ * check-output-excludes: END
+ * check-error-ignore
+ */
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 02/17] bad-goto: add testcases for linearization of invalid labels
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 01/17] bad-goto: add testcase for 'jump inside discarded expression statement' Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 03/17] bad-goto: add more testcases Luc Van Oostenryck
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

A goto to a reserved or a undeclared label will generate
an IR with a branch to a non-existing BB. Bad.

Add a testcase for these.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 validation/linear/invalid-labels0.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 validation/linear/invalid-labels0.c

diff --git a/validation/linear/invalid-labels0.c b/validation/linear/invalid-labels0.c
new file mode 100644
index 000000000000..ae3bf7283fb8
--- /dev/null
+++ b/validation/linear/invalid-labels0.c
@@ -0,0 +1,19 @@
+static void foo(void)
+{
+	goto return;
+}
+
+void bar(void)
+{
+	goto neverland;
+}
+
+/*
+ * check-name: invalid-labels0
+ * check-command: test-linearize -Wno-decl $file
+ * check-known-to-fail
+ *
+ * check-output-ignore
+ * check-output-excludes: END
+ * check-error-ignore
+ */
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 03/17] bad-goto: add more testcases
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 01/17] bad-goto: add testcase for 'jump inside discarded expression statement' Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 02/17] bad-goto: add testcases for linearization of invalid labels Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 04/17] bad-goto: do not linearize if the IR will be invalid Luc Van Oostenryck
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

Add some testcases for gotos with undeclared labels & __label__.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 validation/label-scope1.c | 42 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 validation/label-scope1.c

diff --git a/validation/label-scope1.c b/validation/label-scope1.c
new file mode 100644
index 000000000000..f2b1ae9b158a
--- /dev/null
+++ b/validation/label-scope1.c
@@ -0,0 +1,42 @@
+static void ok_top(void)
+{
+	__label__ l;
+l:
+	goto l;
+}
+
+static void ko_undecl(void)
+{
+	__label__ l;
+	goto l;				// KO: undeclared
+}
+
+static void ok_local(void)
+{
+l:
+	{
+		__label__ l;
+l:
+		goto l;
+	}
+goto l;
+}
+
+static void ko_scope(void)
+{
+	{
+		__label__ l;
+l:
+		goto l;
+	}
+goto l;					// KO: undeclared
+}
+
+/*
+ * check-name: label-scope1
+ *
+ * check-error-start
+label-scope1.c:11:9: error: label 'l' was not declared
+label-scope1.c:32:1: error: label 'l' was not declared
+ * check-error-end
+ */
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 04/17] bad-goto: do not linearize if the IR will be invalid
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (2 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 03/17] bad-goto: add more testcases Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 05/17] bad-goto: reorg test in evaluate_goto_statement() Luc Van Oostenryck
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

In some error cases, it's not possible to produce a valid &
correct IR for the concerned function. For exemple, if the
AST contains invalid gotos, the CFG will either be invalid
or won't correspond to the erroneous source code.

So, refuse to linearize such functions.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 linearize.c | 2 +-
 symbol.h    | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/linearize.c b/linearize.c
index b040d345d469..222714564a3f 100644
--- a/linearize.c
+++ b/linearize.c
@@ -2480,7 +2480,7 @@ static struct entrypoint *linearize_fn(struct symbol *sym, struct symbol *base_t
 	pseudo_t result;
 	int i;
 
-	if (!stmt)
+	if (!stmt || base_type->bogus_linear)
 		return NULL;
 
 	ep = alloc_entrypoint();
diff --git a/symbol.h b/symbol.h
index c86dfb335e29..de13d60b8b75 100644
--- a/symbol.h
+++ b/symbol.h
@@ -171,6 +171,7 @@ struct symbol {
 			unsigned long	offset;
 			int		bit_size;
 			unsigned int	bit_offset:8,
+					bogus_linear:1,
 					variadic:1,
 					initialized:1,
 					examined:1,
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 05/17] bad-goto: reorg test in evaluate_goto_statement()
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (3 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 04/17] bad-goto: do not linearize if the IR will be invalid Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 06/17] bad-goto: simplify testing of undeclared labels Luc Van Oostenryck
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

No functional changes here, only changing the code structure
to prepare more incoming changes.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 evaluate.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/evaluate.c b/evaluate.c
index b7bb1f52aa91..a432d243610e 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -3741,10 +3741,14 @@ static void evaluate_goto_statement(struct statement *stmt)
 {
 	struct symbol *label = stmt->goto_label;
 
-	if (label && !label->stmt && label->ident && !lookup_keyword(label->ident, NS_KEYWORD))
+	if (!label) {
+		// no label associated, may be a computed goto
+		evaluate_expression(stmt->goto_expression);
+		return;
+	}
+	if (!label->stmt && label->ident && !lookup_keyword(label->ident, NS_KEYWORD)) {
 		sparse_error(stmt->pos, "label '%s' was not declared", show_ident(label->ident));

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 06/17] bad-goto: simplify testing of undeclared labels
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (4 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 05/17] bad-goto: reorg test in evaluate_goto_statement() Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 07/17] bad-goto: do not linearize function with " Luc Van Oostenryck
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

There is no need to do a lookup: checking if the label's
symbol is in the NS_LABEL namespace and is lacking an
associated statement is enough and much simpler.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 evaluate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/evaluate.c b/evaluate.c
index a432d243610e..dc66b2e6ad9a 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -3746,7 +3746,8 @@ static void evaluate_goto_statement(struct statement *stmt)
 		evaluate_expression(stmt->goto_expression);
 		return;
 	}
-	if (!label->stmt && label->ident && !lookup_keyword(label->ident, NS_KEYWORD)) {
+
+	if (label->namespace == NS_LABEL && !label->stmt) {
 		sparse_error(stmt->pos, "label '%s' was not declared", show_ident(label->ident));
 	}
 }
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 07/17] bad-goto: do not linearize function with undeclared labels
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (5 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 06/17] bad-goto: simplify testing of undeclared labels Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 08/17] bad-goto: catch labels with reserved names Luc Van Oostenryck
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

It's not possible to produce a valid & correct IR if
the function contains a goto to an undeclared label.

So, try to catch these situations and mark the function
as such, the linearization will then simply ignore it.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 evaluate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/evaluate.c b/evaluate.c
index dc66b2e6ad9a..14953f195fcc 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -3749,6 +3749,7 @@ static void evaluate_goto_statement(struct statement *stmt)
 
 	if (label->namespace == NS_LABEL && !label->stmt) {
 		sparse_error(stmt->pos, "label '%s' was not declared", show_ident(label->ident));
+		current_fn->bogus_linear = 1;
 	}
 }
 
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 08/17] bad-goto: catch labels with reserved names
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (6 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 07/17] bad-goto: do not linearize function with " Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 09/17] scope: no memset() needed after __alloc_scope() Luc Van Oostenryck
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

If a reserved name is used as the destination of a goto,
its associated label won't be valid and at linearization
time no BB will can be created for it, resulting in an
invalid IR.

So, catch such gotos at evaluation time and mark the
function to not be linearized.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 evaluate.c                          | 2 ++
 validation/linear/invalid-labels0.c | 1 -
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/evaluate.c b/evaluate.c
index 14953f195fcc..99a9ee72d11f 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -3751,6 +3751,8 @@ static void evaluate_goto_statement(struct statement *stmt)
 		sparse_error(stmt->pos, "label '%s' was not declared", show_ident(label->ident));
 		current_fn->bogus_linear = 1;
 	}
+	if (label->namespace == NS_NONE)
+		current_fn->bogus_linear = 1;
 }
 
 struct symbol *evaluate_statement(struct statement *stmt)
diff --git a/validation/linear/invalid-labels0.c b/validation/linear/invalid-labels0.c
index ae3bf7283fb8..a15e9d434011 100644
--- a/validation/linear/invalid-labels0.c
+++ b/validation/linear/invalid-labels0.c
@@ -11,7 +11,6 @@ void bar(void)
 /*
  * check-name: invalid-labels0
  * check-command: test-linearize -Wno-decl $file
- * check-known-to-fail
  *
  * check-output-ignore
  * check-output-excludes: END
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 09/17] scope: no memset() needed after __alloc_scope()
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (7 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 08/17] bad-goto: catch labels with reserved names Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 10/17] scope: move scope opening/ending inside compound_statement() Luc Van Oostenryck
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

When starting some scopes, the newly allocated struct is
memset'ed with zero but this is unneeded since the allocator
always returns zeroed memory.

Remove the unneeded call to memset().

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 scope.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/scope.c b/scope.c
index 420c0f5a3f51..0e4fb3b42150 100644
--- a/scope.c
+++ b/scope.c
@@ -68,7 +68,6 @@ void rebind_scope(struct symbol *sym, struct scope *new)
 static void start_scope(struct scope **s)
 {
 	struct scope *scope = __alloc_scope(0);
-	memset(scope, 0, sizeof(*scope));
 	scope->next = *s;
 	*s = scope;
 }
@@ -77,7 +76,6 @@ void start_file_scope(void)
 {
 	struct scope *scope = __alloc_scope(0);
 
-	memset(scope, 0, sizeof(*scope));
 	scope->next = &builtin_scope;
 	file_scope = scope;
 
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 10/17] scope: move scope opening/ending inside compound_statement()
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (8 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 09/17] scope: no memset() needed after __alloc_scope() Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:15 ` [PATCH 11/17] scope: make function scope the same as the body block scope Luc Van Oostenryck
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

A compound statement starts and ends a block scope, so
it's better to start & end this scope inside the function
parsing the statement: compound_statement.
The only exception is for the body of a function where
the scope also enclose the parameter declaration but that
is fine since the function is special anyway.

Move the calls to start & close the block scope inside
compound_statement() and directly call statement_list()
for the function body.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 expression.c |  2 --
 parse.c      | 13 ++++++-------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/expression.c b/expression.c
index 5b9bddfe456e..78e577cf10a1 100644
--- a/expression.c
+++ b/expression.c
@@ -71,9 +71,7 @@ struct token *parens_expression(struct token *token, struct expression **expr, c
 		struct statement *stmt = alloc_statement(token->pos, STMT_COMPOUND);
 		*expr = e;
 		e->statement = stmt;
-		start_symbol_scope();
 		token = compound_statement(token->next, stmt);
-		end_symbol_scope();
 		token = expect(token, '}', "at end of statement expression");
 	} else
 		token = parse_expression(token, expr);
diff --git a/parse.c b/parse.c
index a29c67c8cf41..5da314cd05ee 100644
--- a/parse.c
+++ b/parse.c
@@ -2547,11 +2547,7 @@ static struct token *statement(struct token *token, struct statement **tree)
 	}
 
 	if (match_op(token, '{')) {
-		stmt->type = STMT_COMPOUND;
-		start_symbol_scope();
 		token = compound_statement(token->next, stmt);
-		end_symbol_scope();
-		
 		return expect(token, '}', "at end of compound statement");
 	}
 			
@@ -2658,7 +2654,10 @@ static struct token *parameter_type_list(struct token *token, struct symbol *fn)
 
 struct token *compound_statement(struct token *token, struct statement *stmt)
 {
+	stmt->type = STMT_COMPOUND;
+	start_symbol_scope();
 	token = statement_list(token, &stmt->stmts);
+	end_symbol_scope();
 	return token;
 }
 
@@ -2810,15 +2809,15 @@ static struct token *parse_function_body(struct token *token, struct symbol *dec
 		decl->ctype.modifiers |= MOD_EXTERN;
 
 	stmt = start_function(decl);
-
 	*p = stmt;
+
 	FOR_EACH_PTR (base_type->arguments, arg) {
 		declare_argument(arg, base_type);
 	} END_FOR_EACH_PTR(arg);
 
-	token = compound_statement(token->next, stmt);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 11/17] scope: make function scope the same as the body block scope
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (9 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 10/17] scope: move scope opening/ending inside compound_statement() Luc Van Oostenryck
@ 2020-04-13 16:15 ` Luc Van Oostenryck
  2020-04-13 16:16 ` [PATCH 12/17] scope: s/{start,end}_symbol_scope/{start,end}_block_scope/ Luc Van Oostenryck
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:15 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

Currently, the function scope (only used for labels) and
the block scope of the function's body are distinct scopes,
none being a child from the other.

This is fine as these scopes are currently unrelated but:
* it's unneeded and somehow unintuitive
* checking that gotos doesn't jump inside and expression
  statement is easier if these scopes are properly nested.

So, make the function scope and the body's block scope one
single scope.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 scope.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scope.c b/scope.c
index 0e4fb3b42150..175d72c23762 100644
--- a/scope.c
+++ b/scope.c
@@ -91,8 +91,8 @@ void start_symbol_scope(void)
 
 void start_function_scope(void)
 {
-	start_scope(&function_scope);
 	start_scope(&block_scope);
+	function_scope = block_scope;
 }
 
 static void remove_symbol_scope(struct symbol *sym)
@@ -137,7 +137,7 @@ void end_symbol_scope(void)
 void end_function_scope(void)
 {
 	end_scope(&block_scope);
-	end_scope(&function_scope);
+	function_scope = block_scope;
 }
 
 int is_outer_scope(struct scope *scope)
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 12/17] scope: s/{start,end}_symbol_scope/{start,end}_block_scope/
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (10 preceding siblings ...)
  2020-04-13 16:15 ` [PATCH 11/17] scope: make function scope the same as the body block scope Luc Van Oostenryck
@ 2020-04-13 16:16 ` Luc Van Oostenryck
  2020-04-13 16:16 ` [PATCH 13/17] scope: let labels have their own scope Luc Van Oostenryck
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:16 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

The names {start,end}_symbol_scope() are misleading as these
function really start & end a block scope and not all symbols
are declared inside a block scope.

So, rename them to their more direct name: {start,end}_block_scope().

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 parse.c | 12 ++++++------
 scope.c |  4 ++--
 scope.h |  4 ++--
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/parse.c b/parse.c
index 5da314cd05ee..1a2c7af22ff4 100644
--- a/parse.c
+++ b/parse.c
@@ -2222,7 +2222,7 @@ static void start_iterator(struct statement *stmt)
 {
 	struct symbol *cont, *brk;
 
-	start_symbol_scope();
+	start_block_scope();
 	cont = alloc_symbol(stmt->pos, SYM_NODE);
 	bind_symbol(cont, &continue_ident, NS_ITERATOR);
 	brk = alloc_symbol(stmt->pos, SYM_NODE);
@@ -2237,7 +2237,7 @@ static void start_iterator(struct statement *stmt)
 
 static void end_iterator(struct statement *stmt)
 {
-	end_symbol_scope();
+	end_block_scope();
 }
 
 static struct statement *start_function(struct symbol *sym)
@@ -2282,7 +2282,7 @@ static void start_switch(struct statement *stmt)
 {
 	struct symbol *brk, *switch_case;
 
-	start_symbol_scope();
+	start_block_scope();
 	brk = alloc_symbol(stmt->pos, SYM_NODE);
 	bind_symbol(brk, &break_ident, NS_ITERATOR);
 
@@ -2302,7 +2302,7 @@ static void end_switch(struct statement *stmt)
 {
 	if (!stmt->switch_case->symbol_list)
 		warning(stmt->pos, "switch with no cases");
-	end_symbol_scope();
+	end_block_scope();
 }
 
 static void add_case_statement(struct statement *stmt)
@@ -2655,9 +2655,9 @@ static struct token *parameter_type_list(struct token *token, struct symbol *fn)
 struct token *compound_statement(struct token *token, struct statement *stmt)
 {
 	stmt->type = STMT_COMPOUND;
-	start_symbol_scope();
+	start_block_scope();
 	token = statement_list(token, &stmt->stmts);
-	end_symbol_scope();
+	end_block_scope();
 	return token;
 }
 
diff --git a/scope.c b/scope.c
index 175d72c23762..be042a45357d 100644
--- a/scope.c
+++ b/scope.c
@@ -84,7 +84,7 @@ void start_file_scope(void)
 	block_scope = scope;
 }
 
-void start_symbol_scope(void)
+void start_block_scope(void)
 {
 	start_scope(&block_scope);
 }
@@ -129,7 +129,7 @@ void new_file_scope(void)
 	start_file_scope();
 }
 
-void end_symbol_scope(void)
+void end_block_scope(void)
 {
 	end_scope(&block_scope);
 }
diff --git a/scope.h b/scope.h
index 3cad514ac128..83741459eb6a 100644
--- a/scope.h
+++ b/scope.h
@@ -47,8 +47,8 @@ extern void start_file_scope(void);
 extern void end_file_scope(void);
 extern void new_file_scope(void);
 
-extern void start_symbol_scope(void);
-extern void end_symbol_scope(void);
+extern void start_block_scope(void);
+extern void end_block_scope(void);
 
 extern void start_function_scope(void);
 extern void end_function_scope(void);
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 13/17] scope: let labels have their own scope
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (11 preceding siblings ...)
  2020-04-13 16:16 ` [PATCH 12/17] scope: s/{start,end}_symbol_scope/{start,end}_block_scope/ Luc Van Oostenryck
@ 2020-04-13 16:16 ` Luc Van Oostenryck
  2020-04-13 17:30   ` Linus Torvalds
  2020-04-13 16:16 ` [PATCH 14/17] scope: add is_in_scope() Luc Van Oostenryck
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:16 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

One way of detecting gotos inside an statement expression
is to use a new kind of scope for the gotos & labels.

So create this new scope and open/close them when
entering/leaving statement expressions.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 expression.c |  2 ++
 scope.c      | 14 ++++++++++++++
 scope.h      |  4 ++++
 3 files changed, 20 insertions(+)

diff --git a/expression.c b/expression.c
index 78e577cf10a1..08650724a988 100644
--- a/expression.c
+++ b/expression.c
@@ -71,7 +71,9 @@ struct token *parens_expression(struct token *token, struct expression **expr, c
 		struct statement *stmt = alloc_statement(token->pos, STMT_COMPOUND);
 		*expr = e;
 		e->statement = stmt;
+		start_label_scope();
 		token = compound_statement(token->next, stmt);
+		end_label_scope();
 		token = expect(token, '}', "at end of statement expression");
 	} else
 		token = parse_expression(token, expr);
diff --git a/scope.c b/scope.c
index be042a45357d..24c8a7a484f7 100644
--- a/scope.c
+++ b/scope.c
@@ -36,6 +36,7 @@
 static struct scope builtin_scope = { .next = &builtin_scope };
 
 struct scope	*block_scope = &builtin_scope,		// regular automatic variables etc
+		*label_scope = &builtin_scope,		// expr-stmt labels
 		*function_scope = &builtin_scope,	// labels, arguments etc
 		*file_scope = &builtin_scope,		// static
 		*global_scope = &builtin_scope;		// externally visible
@@ -81,6 +82,7 @@ void start_file_scope(void)
 
 	/* top-level stuff defaults to file scope, "extern" etc will choose global scope */
 	function_scope = scope;
+	label_scope = scope;
 	block_scope = scope;
 }
 
@@ -93,6 +95,7 @@ void start_function_scope(void)
 {
 	start_scope(&block_scope);
 	function_scope = block_scope;
+	label_scope = function_scope;
 }
 
 static void remove_symbol_scope(struct symbol *sym)
@@ -138,6 +141,17 @@ void end_function_scope(void)
 {
 	end_scope(&block_scope);
 	function_scope = block_scope;
+	label_scope = function_scope;
+}
+
+void start_label_scope(void)
+{
+	start_scope(&label_scope);
+}
+
+void end_label_scope(void)
+{
+	end_scope(&label_scope);
 }
 
 int is_outer_scope(struct scope *scope)
diff --git a/scope.h b/scope.h
index 83741459eb6a..ddcb90bd146b 100644
--- a/scope.h
+++ b/scope.h
@@ -34,6 +34,7 @@ struct scope {
 
 extern struct scope
 		*block_scope,
+		*label_scope,
 		*function_scope,
 		*file_scope,
 		*global_scope;
@@ -53,6 +54,9 @@ extern void end_block_scope(void);
 extern void start_function_scope(void);
 extern void end_function_scope(void);
 
+extern void start_label_scope(void);
+extern void end_label_scope(void);
+
 extern void set_current_scope(struct symbol *);
 extern void bind_scope(struct symbol *, struct scope *);
 extern void rebind_scope(struct symbol *, struct scope *);
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 14/17] scope: add is_in_scope()
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (12 preceding siblings ...)
  2020-04-13 16:16 ` [PATCH 13/17] scope: let labels have their own scope Luc Van Oostenryck
@ 2020-04-13 16:16 ` Luc Van Oostenryck
  2020-04-13 16:16 ` [PATCH 15/17] scope: give a scope for labels & gotos Luc Van Oostenryck
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:16 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

Add an helper to check if a scope is included into
another one.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 scope.c | 9 +++++++++
 scope.h | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/scope.c b/scope.c
index 24c8a7a484f7..b5c9e454b55d 100644
--- a/scope.c
+++ b/scope.c
@@ -163,3 +163,12 @@ int is_outer_scope(struct scope *scope)
 	return 1;
 }
 
+int is_in_scope(struct scope *outer, struct scope *inner)
+{
+	while (inner != outer) {
+		if (inner == &builtin_scope)
+			return 0;
+		inner = inner->next;
+	}
+	return 1;
+}
diff --git a/scope.h b/scope.h
index ddcb90bd146b..36a56d6adf1d 100644
--- a/scope.h
+++ b/scope.h
@@ -62,4 +62,6 @@ extern void bind_scope(struct symbol *, struct scope *);
 extern void rebind_scope(struct symbol *, struct scope *);
 
 extern int is_outer_scope(struct scope *);
+extern int is_in_scope(struct scope *outer, struct scope *inner);
+
 #endif
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (13 preceding siblings ...)
  2020-04-13 16:16 ` [PATCH 14/17] scope: add is_in_scope() Luc Van Oostenryck
@ 2020-04-13 16:16 ` Luc Van Oostenryck
  2020-04-13 17:52   ` Linus Torvalds
  2020-04-13 16:16 ` [PATCH 16/17] bad-goto: catch gotos inside expression statements Luc Van Oostenryck
  2020-04-13 16:16 ` [PATCH 17/17] bad-goto: cleanup evaluate_goto() Luc Van Oostenryck
  16 siblings, 1 reply; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:16 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

One way of detecting gotos inside an statement expression
is to use a new kind of scope for the gotos & labels.
Since gotos don't need to have their label predeclared,
nothing can be checked at parsing time but later it can
be checked that a goto doesn't jump inside one of the
label scope created by statement expressions.

So, add additional scope information to gotos and labels
to allow such check to be done.

Note: the label's symbols are still created in the function
      scope since they belong to a single namespace.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 parse.c  | 2 ++
 parse.h  | 1 +
 symbol.h | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/parse.c b/parse.c
index 1a2c7af22ff4..244991c1c4f5 100644
--- a/parse.c
+++ b/parse.c
@@ -2481,6 +2481,7 @@ static struct token *parse_goto_statement(struct token *token, struct statement
 		add_statement(&function_computed_goto_list, stmt);
 	} else if (token_type(token) == TOKEN_IDENT) {
 		stmt->goto_label = label_symbol(token);
+		stmt->goto_scope = label_scope;
 		token = token->next;
 	} else {
 		sparse_error(token->pos, "Expected identifier or goto expression");
@@ -2533,6 +2534,7 @@ static struct token *statement(struct token *token, struct statement **tree)
 
 		if (match_op(token->next, ':')) {
 			struct symbol *s = label_symbol(token);
+			s->label_scope = label_scope;
 			token = skip_attributes(token->next->next);
 			if (s->stmt) {
 				sparse_error(stmt->pos, "label '%s' redefined", show_ident(s->ident));
diff --git a/parse.h b/parse.h
index 0742a2a87e9d..5995eb56849f 100644
--- a/parse.h
+++ b/parse.h
@@ -99,6 +99,7 @@ struct statement {
 		};
 		struct /* goto_struct */ {
 			struct symbol *goto_label;
+			struct scope *goto_scope;
 
 			/* computed gotos have these: */
 			struct expression *goto_expression;
diff --git a/symbol.h b/symbol.h
index de13d60b8b75..f9d7bcaa997f 100644
--- a/symbol.h
+++ b/symbol.h
@@ -167,6 +167,9 @@ struct symbol {
 			int (*handler)(struct stream *, struct token **, struct token *);
 			int normal;
 		};
+		struct /* NS_LABEL */ {
+			struct scope *label_scope;
+		};
 		struct /* NS_SYMBOL */ {
 			unsigned long	offset;
 			int		bit_size;
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 16/17] bad-goto: catch gotos inside expression statements
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (14 preceding siblings ...)
  2020-04-13 16:16 ` [PATCH 15/17] scope: give a scope for labels & gotos Luc Van Oostenryck
@ 2020-04-13 16:16 ` Luc Van Oostenryck
  2020-04-13 16:16 ` [PATCH 17/17] bad-goto: cleanup evaluate_goto() Luc Van Oostenryck
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:16 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

It's not allowed to do a goto into an expression statement.
For exemple, it's not well defined what should happen if such
an expression is not evaluated because unnneded and optimized
away at expand time.

For such situations GCC issues an error, clang doesn't and
produces a valid IR. Spare produces an invalid IR with branches
to unexisting BBs.

Fix this by:
*) detecting this situation at evaluation time
*) issue an error
*) mark the function to not be linearized.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 evaluate.c                              | 9 +++++++++
 validation/label-stmt-expr1.c           | 1 -
 validation/linear/goto-and-expr-stmt0.c | 9 +++++++--
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/evaluate.c b/evaluate.c
index 99a9ee72d11f..2b845a301d6b 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -42,6 +42,7 @@
 #include "symbol.h"
 #include "target.h"
 #include "expression.h"
+#include "scope.h"
 
 struct symbol *current_fn;
 
@@ -3751,6 +3752,14 @@ static void evaluate_goto_statement(struct statement *stmt)
 		sparse_error(stmt->pos, "label '%s' was not declared", show_ident(label->ident));
 		current_fn->bogus_linear = 1;
 	}
+	if (label->namespace == NS_LABEL && label->stmt) {
+		if (is_in_scope(label->label_scope, stmt->goto_scope))
+			return;
+		sparse_error(stmt->pos, "goto into statement expression");
+		info(label->stmt->pos,"   label '%s' is defined here",
+					show_ident(label->ident));
+		current_fn->bogus_linear = 1;
+	}
 	if (label->namespace == NS_NONE)
 		current_fn->bogus_linear = 1;
 }
diff --git a/validation/label-stmt-expr1.c b/validation/label-stmt-expr1.c
index 47ba54ae7305..f4f178c9d951 100644
--- a/validation/label-stmt-expr1.c
+++ b/validation/label-stmt-expr1.c
@@ -19,7 +19,6 @@ l:
 
 /*
  * check-name: label-stmt-expr1
- * check-known-to-fail
  *
  * check-error-start
 label-stmt-expr1.c:3:9: error: goto into statement expression
diff --git a/validation/linear/goto-and-expr-stmt0.c b/validation/linear/goto-and-expr-stmt0.c
index 548813531779..c6b6621a6a81 100644
--- a/validation/linear/goto-and-expr-stmt0.c
+++ b/validation/linear/goto-and-expr-stmt0.c
@@ -20,9 +20,14 @@ a:
 /*
  * check-name: goto-and-expr-stmt0
  * check-command: test-linearize -Wno-decl $file
- * check-known-to-fail
  *
  * check-output-ignore
  * check-output-excludes: END
- * check-error-ignore
+ *
+ * check-error-start
+linear/goto-and-expr-stmt0.c:3:9: error: goto into statement expression
+linear/goto-and-expr-stmt0.c:5:1:    label 'inside' is defined here
+linear/goto-and-expr-stmt0.c:17:9: error: goto into statement expression
+linear/goto-and-expr-stmt0.c:14:1:    label 'a' is defined here
+ * check-error-end
  */
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 17/17] bad-goto: cleanup evaluate_goto()
  2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
                   ` (15 preceding siblings ...)
  2020-04-13 16:16 ` [PATCH 16/17] bad-goto: catch gotos inside expression statements Luc Van Oostenryck
@ 2020-04-13 16:16 ` Luc Van Oostenryck
  16 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 16:16 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

Reorganize the code to not repeat the test of the
label's namespace.

Also, make all namespaces other than NS_LABEL & NS_ITERATOR
as bogus and add some comments.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 evaluate.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/evaluate.c b/evaluate.c
index 2b845a301d6b..663540ff6445 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -3748,20 +3748,22 @@ static void evaluate_goto_statement(struct statement *stmt)
 		return;
 	}
 
-	if (label->namespace == NS_LABEL && !label->stmt) {
-		sparse_error(stmt->pos, "label '%s' was not declared", show_ident(label->ident));
-		current_fn->bogus_linear = 1;
-	}
-	if (label->namespace == NS_LABEL && label->stmt) {
-		if (is_in_scope(label->label_scope, stmt->goto_scope))
-			return;
-		sparse_error(stmt->pos, "goto into statement expression");
-		info(label->stmt->pos,"   label '%s' is defined here",
-					show_ident(label->ident));
+	switch (label->namespace) {
+	case NS_ITERATOR:	// break / continue
+		break;
+	case NS_LABEL:		// goto + ident
+		if (!label->stmt) {
+			sparse_error(stmt->pos, "label '%s' was not declared",
+						show_ident(label->ident));
+		} else if (!is_in_scope(label->label_scope, stmt->goto_scope)) {
+			sparse_error(stmt->pos, "goto into statement expression");
+			info(label->stmt->pos,"   label '%s' is defined here",
+						show_ident(label->ident));
+		} else
+			break;
+	default:
 		current_fn->bogus_linear = 1;
 	}
-	if (label->namespace == NS_NONE)
-		current_fn->bogus_linear = 1;
 }
 
 struct symbol *evaluate_statement(struct statement *stmt)
-- 
2.26.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 13/17] scope: let labels have their own scope
  2020-04-13 16:16 ` [PATCH 13/17] scope: let labels have their own scope Luc Van Oostenryck
@ 2020-04-13 17:30   ` Linus Torvalds
  0 siblings, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2020-04-13 17:30 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Mon, Apr 13, 2020 at 9:16 AM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> One way of detecting gotos inside an statement expression
> is to use a new kind of scope for the gotos & labels.

Ack. This is a much better approach than trying to figure things out later.

Very nice.

               Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 16:16 ` [PATCH 15/17] scope: give a scope for labels & gotos Luc Van Oostenryck
@ 2020-04-13 17:52   ` Linus Torvalds
  2020-04-13 18:54     ` Luc Van Oostenryck
  0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2020-04-13 17:52 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Mon, Apr 13, 2020 at 9:16 AM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> Note: the label's symbols are still created in the function
>       scope since they belong to a single namespace.

Oh, I saw that 13/17 and was like "yeah, this is the right thing to
do", because I thought you were going in a different direction.

But then here in 15/17 I think you're doing it wrong.

Just give the symbol the proper scope, and make it simply not even
_parse_ right if somebody tries to use a label from the wrong scope,
instead of making the symbol visible, and then having to check whether
the scopes nested right later.

So I think you should just bite the bullet, and change the
bind_symbol() function so that a NS_LABEL is bound to label scope.

Then you can remove this 15/17 entirely (and all the "is this scope
nesting" code - nesting is automatically enforced by the symbol
scope).

I think that's a much cleaner approach. Yes, it gives a different
error from gcc, but I think it's a *better* error.

This:

   int fn1(int arg)
   {
      target:
         return 0;
   }

   int fn2(int arg)
   {
      goto target;
   }

is invalid code, and 'target' isn't even visible in fn2, because it is
a local label to fn1.

I think the exact same thing is the right thing to do for expression
statements, so

   int fn(int arg)
   {
      goto inside;
      return ({ inside: 0; });
   }

should fail with the exact same error message of having an undefined
label (which sparse currently gets wrong too, but you're fixing that
elsewhere).

Because "inside" simply shouldn't be defined at all in the outer
scope, and you can only branch _within_ a statement expression, the
same way you can only branch within a function.

So I think statement expressions should basically work kind of like
local "nested functions": they have access to the state outside, but
the outside doesn't have access to the state inside that statement
expression.

           Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 17:52   ` Linus Torvalds
@ 2020-04-13 18:54     ` Luc Van Oostenryck
  2020-04-13 19:32       ` Linus Torvalds
  0 siblings, 1 reply; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 18:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Mon, Apr 13, 2020 at 10:52:19AM -0700, Linus Torvalds wrote:
> On Mon, Apr 13, 2020 at 9:16 AM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > Note: the label's symbols are still created in the function
> >       scope since they belong to a single namespace.
> 
> Oh, I saw that 13/17 and was like "yeah, this is the right thing to
> do", because I thought you were going in a different direction.
> 
> But then here in 15/17 I think you're doing it wrong.
> 
> Just give the symbol the proper scope, and make it simply not even
> _parse_ right if somebody tries to use a label from the wrong scope,
> instead of making the symbol visible, and then having to check whether
> the scopes nested right later.
> 
> So I think you should just bite the bullet, and change the
> bind_symbol() function so that a NS_LABEL is bound to label scope.
> 
> Then you can remove this 15/17 entirely (and all the "is this scope
> nesting" code - nesting is automatically enforced by the symbol
> scope).
> 
> I think that's a much cleaner approach. Yes, it gives a different
> error from gcc, but I think it's a *better* error.
> 
> This:
> 
>    int fn1(int arg)
>    {
>       target:
>          return 0;
>    }
> 
>    int fn2(int arg)
>    {
>       goto target;
>    }
> 
> is invalid code, and 'target' isn't even visible in fn2, because it is
> a local label to fn1.
> 
> I think the exact same thing is the right thing to do for expression
> statements, so
> 
>    int fn(int arg)
>    {
>       goto inside;
>       return ({ inside: 0; });
>    }
> 
> should fail with the exact same error message of having an undefined
> label (which sparse currently gets wrong too, but you're fixing that
> elsewhere).
> 
> Because "inside" simply shouldn't be defined at all in the outer
> scope, and you can only branch _within_ a statement expression, the
> same way you can only branch within a function.
> 
> So I think statement expressions should basically work kind of like
> local "nested functions": they have access to the state outside, but
> the outside doesn't have access to the state inside that statement
> expression.

Yes, I agree and in fact (if I understand you correctly) it was what
I tried first, mainly because it was "conceptualy neat" and simpler.
But then it wasn't working correctly in all situations and I
convinced myself it couldn't. The problem was with code like:
	void foo(void)
	{
		... = ({ ...  goto out; ... });

	out:
		...;
	}

In this case, when 'goto out' is parsed, the corresponding label
symbol would be created in the inner scope and later when the label
is defined the symbol lookup will only look in the outer scope, see
nothing and declare another symbol for it, then the obvious scope
check will complain that the goto's label is undeclared.
But this code is legit and both occurences of the ident 'out' should
refer to the same label, right?

I didn't saw a proper solution for this, hence the current patch 15
where I'm keeping all labels in the usual function scope but where
the new label scope is associated to the STMT_GOTO & STMT_LABEL
and where evaluate_goto_statement() check in the scope of the
goto is contained in the one of the label definition via the
new helper is_in_scope(). This is less elegant than I would have
liked but again I don't see a better solution.

-- Luc

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 18:54     ` Luc Van Oostenryck
@ 2020-04-13 19:32       ` Linus Torvalds
  2020-04-13 20:00         ` Luc Van Oostenryck
  2020-04-13 22:40         ` Linus Torvalds
  0 siblings, 2 replies; 30+ messages in thread
From: Linus Torvalds @ 2020-04-13 19:32 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Mon, Apr 13, 2020 at 11:54 AM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> Yes, I agree and in fact (if I understand you correctly) it was what
> I tried first, mainly because it was "conceptualy neat" and simpler.
> But then it wasn't working correctly in all situations and I
> convinced myself it couldn't. The problem was with code like:
>         void foo(void)
>         {
>                 ... = ({ ...  goto out; ... });
>
>         out:
>                 ...;
>         }
>
> In this case, when 'goto out' is parsed, the corresponding label
> symbol would be created in the inner scope and later when the label
> is defined the symbol lookup will only look in the outer scope, see
> nothing and declare another symbol for it

Oh, yeah, I see.

And that's just because of how we basically do the same thing at
"goto" time as we do at "label definition" time.

Both basically create the symbol label.

Which was simple, and worked, and meant that you never had to look
anything up, because they automatically just did the right thing - and
the use and scope is symmetric.

And the reason it does that, is that labels - unlike every other
symbol - aren't declared before use. So it's a hacky solution, and it
works.

And by "it works", I mean "doesn't really work all that well", because
clearly all the _other_ patches in your series were about the fact
that it also meant that we were horrible at the whole "label was never
defined in the first place" case.

But with the scoping change, the use and scope isn't symmetric any
more, and the "create symbol both at use and at definition" doesn't
work.

I _feel_ like the fix to that should be that the only thing that
creates the actual symbol is the label definition, and that the goto
should only ever use the 'ident' and we'd tie the two together later.

But yeah, that "tie the two together later" may not work, simply
because scoping is so tightly tied to parsing in sparse.

So maybe your approach is the best one.

It feels hacky and wrong, but maybe that just fundamentally comes from
labels having that very special "use = implicit declaration" thing.

              Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 19:32       ` Linus Torvalds
@ 2020-04-13 20:00         ` Luc Van Oostenryck
  2020-04-13 22:40         ` Linus Torvalds
  1 sibling, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 20:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Mon, Apr 13, 2020 at 12:32:41PM -0700, Linus Torvalds wrote:
> 
> I _feel_ like the fix to that should be that the only thing that
> creates the actual symbol is the label definition, and that the goto
> should only ever use the 'ident' and we'd tie the two together later.

Yes, I tried that too but it didn't worked because:
 
> But yeah, that "tie the two together later" may not work, simply
> because scoping is so tightly tied to parsing in sparse.
> 
> So maybe your approach is the best one.
> 
> It feels hacky and wrong, but maybe that just fundamentally comes from
> labels having that very special "use = implicit declaration" thing.

Yes, that and the way the symbol 'table' is done: very clever but
unusable for our problem here. But maybe there is something that can
be done there. Currently end_scope() sets scope->symbols to NULL but
as far as I can see, this is not really needed and, if left, the
"tie the two together later" could be done by doing a symbol lookup
via this list instead of the usual lookup via ident->symbols, much
like classical symbol tables are used. It should be quite easy.
I'll give it a try because I'm also not really satisfied with my
current solution giving a kind of secondary scope to the statements.

-- Luc

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 19:32       ` Linus Torvalds
  2020-04-13 20:00         ` Luc Van Oostenryck
@ 2020-04-13 22:40         ` Linus Torvalds
  2020-04-13 23:39           ` Luc Van Oostenryck
  1 sibling, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2020-04-13 22:40 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

[-- Attachment #1: Type: text/plain, Size: 1746 bytes --]

On Mon, Apr 13, 2020 at 12:32 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I _feel_ like the fix to that should be that the only thing that
> creates the actual symbol is the label definition, and that the goto
> should only ever use the 'ident' and we'd tie the two together later.

Actually, how about something like this?

I've not signed off on these patches, and the commit logs are
questionable, but part of that is that the two first ones are just
quick-and-dirty versions of your rename cleanups.

The third patch is the serious one, which shows what I think might be
the solution to the odd scoping rules for labels.

Basically, we always scope labels - but if a label is _used_ but not
defined in an inner label, when we close the label scope, we move it
out to the next level.

But a defined label is never moved out, and when we define it, we
require that any previous use was in the same scope (where "same
scope" might have been an inner scope that was moved out).

I think it gets the semantics right, and it's actually fairly simple.

But it has very little testing, so this is more of a "how about
something like this" than a serious submission.

If you test it, and fix up the warnings and error cases (like the
other patches in your series did), you are more than welcome to take
credit and authorship for this.

I just felt that the best way to describe (and do _some_ testing) my
idea was to have a quick implementation to show what I mean.

And by "_some_ testing" I literally mean "almost no testing at all". I
didn't even run this on the kernel tree. I just used one stipid small
test-case for this, and when it gave the warning I wanted, I said
"good enough" and sent this email out ;)

            Linus

[-- Attachment #2: 0001-Rename-symbol_scope-to-block_scope.patch --]
[-- Type: text/x-patch, Size: 3478 bytes --]

From 455a1c70c1b2d8ecfb267dda8a6a5f6f8baf4233 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 13 Apr 2020 14:06:59 -0700
Subject: [PATCH 1/3] Rename 'symbol_scope' to 'block_scope'

The actual scope variable was already named that way, but the begin/end
functions inexplicably weren't.
---
 expression.c |  4 ++--
 parse.c      | 12 ++++++------
 scope.c      |  4 ++--
 scope.h      |  4 ++--
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/expression.c b/expression.c
index 5b9bddfe..40141a5e 100644
--- a/expression.c
+++ b/expression.c
@@ -71,9 +71,9 @@ struct token *parens_expression(struct token *token, struct expression **expr, c
 		struct statement *stmt = alloc_statement(token->pos, STMT_COMPOUND);
 		*expr = e;
 		e->statement = stmt;
-		start_symbol_scope();
+		start_block_scope();
 		token = compound_statement(token->next, stmt);
-		end_symbol_scope();
+		end_block_scope();
 		token = expect(token, '}', "at end of statement expression");
 	} else
 		token = parse_expression(token, expr);
diff --git a/parse.c b/parse.c
index a29c67c8..9d11309e 100644
--- a/parse.c
+++ b/parse.c
@@ -2222,7 +2222,7 @@ static void start_iterator(struct statement *stmt)
 {
 	struct symbol *cont, *brk;
 
-	start_symbol_scope();
+	start_block_scope();
 	cont = alloc_symbol(stmt->pos, SYM_NODE);
 	bind_symbol(cont, &continue_ident, NS_ITERATOR);
 	brk = alloc_symbol(stmt->pos, SYM_NODE);
@@ -2237,7 +2237,7 @@ static void start_iterator(struct statement *stmt)
 
 static void end_iterator(struct statement *stmt)
 {
-	end_symbol_scope();
+	end_block_scope();
 }
 
 static struct statement *start_function(struct symbol *sym)
@@ -2282,7 +2282,7 @@ static void start_switch(struct statement *stmt)
 {
 	struct symbol *brk, *switch_case;
 
-	start_symbol_scope();
+	start_block_scope();
 	brk = alloc_symbol(stmt->pos, SYM_NODE);
 	bind_symbol(brk, &break_ident, NS_ITERATOR);
 
@@ -2302,7 +2302,7 @@ static void end_switch(struct statement *stmt)
 {
 	if (!stmt->switch_case->symbol_list)
 		warning(stmt->pos, "switch with no cases");
-	end_symbol_scope();
+	end_block_scope();
 }
 
 static void add_case_statement(struct statement *stmt)
@@ -2548,9 +2548,9 @@ static struct token *statement(struct token *token, struct statement **tree)
 
 	if (match_op(token, '{')) {
 		stmt->type = STMT_COMPOUND;
-		start_symbol_scope();
+		start_block_scope();
 		token = compound_statement(token->next, stmt);
-		end_symbol_scope();
+		end_block_scope();
 		
 		return expect(token, '}', "at end of compound statement");
 	}
diff --git a/scope.c b/scope.c
index 420c0f5a..a0de2e0d 100644
--- a/scope.c
+++ b/scope.c
@@ -86,7 +86,7 @@ void start_file_scope(void)
 	block_scope = scope;
 }
 
-void start_symbol_scope(void)
+void start_block_scope(void)
 {
 	start_scope(&block_scope);
 }
@@ -131,7 +131,7 @@ void new_file_scope(void)
 	start_file_scope();
 }
 
-void end_symbol_scope(void)
+void end_block_scope(void)
 {
 	end_scope(&block_scope);
 }
diff --git a/scope.h b/scope.h
index 3cad514a..83741459 100644
--- a/scope.h
+++ b/scope.h
@@ -47,8 +47,8 @@ extern void start_file_scope(void);
 extern void end_file_scope(void);
 extern void new_file_scope(void);
 
-extern void start_symbol_scope(void);
-extern void end_symbol_scope(void);
+extern void start_block_scope(void);
+extern void end_block_scope(void);
 
 extern void start_function_scope(void);
 extern void end_function_scope(void);
-- 
2.24.0.158.g3fed155289


[-- Attachment #3: 0002-Rename-function-scope-as-label-scope.patch --]
[-- Type: text/x-patch, Size: 3474 bytes --]

From f4d8234688c60bed83a3549072ac1709d942cdcf Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 13 Apr 2020 14:09:35 -0700
Subject: [PATCH 2/3] Rename 'function' scope as 'label' scope

The only thing that was scoped this way were labels.  And 'function'
scope will soon be a mis-namer.
---
 parse.c  |  4 ++--
 scope.c  | 14 +++++++-------
 scope.h  |  6 +++---
 symbol.c |  2 +-
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/parse.c b/parse.c
index 9d11309e..64aaf07b 100644
--- a/parse.c
+++ b/parse.c
@@ -2245,7 +2245,7 @@ static struct statement *start_function(struct symbol *sym)
 	struct symbol *ret;
 	struct statement *stmt = alloc_statement(sym->pos, STMT_COMPOUND);
 
-	start_function_scope();
+	start_label_scope();
 	ret = alloc_symbol(sym->pos, SYM_NODE);
 	ret->ctype = sym->ctype.base_type->ctype;
 	ret->ctype.modifiers &= ~(MOD_STORAGE | MOD_QUALIFIER | MOD_TLS | MOD_ACCESS | MOD_NOCAST | MOD_NODEREF);
@@ -2263,7 +2263,7 @@ static struct statement *start_function(struct symbol *sym)
 static void end_function(struct symbol *sym)
 {
 	current_fn = NULL;
-	end_function_scope();
+	end_label_scope();
 }
 
 /*
diff --git a/scope.c b/scope.c
index a0de2e0d..8df9a5e0 100644
--- a/scope.c
+++ b/scope.c
@@ -35,8 +35,8 @@
 
 static struct scope builtin_scope = { .next = &builtin_scope };
 
-struct scope	*block_scope = &builtin_scope,		// regular automatic variables etc
-		*function_scope = &builtin_scope,	// labels, arguments etc
+struct scope	*block_scope = &builtin_scope,		// regular variables etc
+		*label_scope = &builtin_scope,		// labels
 		*file_scope = &builtin_scope,		// static
 		*global_scope = &builtin_scope;		// externally visible
 
@@ -82,7 +82,7 @@ void start_file_scope(void)
 	file_scope = scope;
 
 	/* top-level stuff defaults to file scope, "extern" etc will choose global scope */
-	function_scope = scope;
+	label_scope = scope;
 	block_scope = scope;
 }
 
@@ -91,9 +91,9 @@ void start_block_scope(void)
 	start_scope(&block_scope);
 }
 
-void start_function_scope(void)
+void start_label_scope(void)
 {
-	start_scope(&function_scope);
+	start_scope(&label_scope);
 	start_scope(&block_scope);
 }
 
@@ -136,10 +136,10 @@ void end_block_scope(void)
 	end_scope(&block_scope);
 }
 
-void end_function_scope(void)
+void end_label_scope(void)
 {
 	end_scope(&block_scope);
-	end_scope(&function_scope);
+	end_scope(&label_scope);
 }
 
 int is_outer_scope(struct scope *scope)
diff --git a/scope.h b/scope.h
index 83741459..e7c5ba05 100644
--- a/scope.h
+++ b/scope.h
@@ -34,7 +34,7 @@ struct scope {
 
 extern struct scope
 		*block_scope,
-		*function_scope,
+		*label_scope,
 		*file_scope,
 		*global_scope;
 
@@ -50,8 +50,8 @@ extern void new_file_scope(void);
 extern void start_block_scope(void);
 extern void end_block_scope(void);
 
-extern void start_function_scope(void);
-extern void end_function_scope(void);
+extern void start_label_scope(void);
+extern void end_label_scope(void);
 
 extern void set_current_scope(struct symbol *);
 extern void bind_scope(struct symbol *, struct scope *);
diff --git a/symbol.c b/symbol.c
index c2e6f0b4..c3ded5ef 100644
--- a/symbol.c
+++ b/symbol.c
@@ -707,7 +707,7 @@ void bind_symbol(struct symbol *sym, struct ident *ident, enum namespace ns)
 	if (ns == NS_MACRO)
 		scope = file_scope;
 	if (ns == NS_LABEL)
-		scope = function_scope;
+		scope = label_scope;
 	bind_scope(sym, scope);
 }
 
-- 
2.24.0.158.g3fed155289


[-- Attachment #4: 0003-Label-scoping-is-special-and-expression-statements-a.patch --]
[-- Type: text/x-patch, Size: 3499 bytes --]

From 268d3b8ef45e1cce442b7ffe534715a5510cce67 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 13 Apr 2020 14:52:26 -0700
Subject: [PATCH 3/3] Label scoping is special, and expression statements are
 their own scope.

A label may be used before being defined, and you don't have to
pre-declare it.  And it may be used in an inner scope from the
declaration, but not the other way around.

We work around this by simply making the rule be that as we close the
scope of a label symbol, instead of killing the symbol, we will transfer
that symbol to the outer label scope if it hasn't been defined yet.

But a label that has been defined can never transfer out.

And when defining a label, test that there wasn't some earlier use
of it in another (outer) scope.
---
 expression.c |  4 ++--
 parse.c      | 11 +++++++++++
 scope.c      | 36 +++++++++++++++++++++++++++++++++++-
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/expression.c b/expression.c
index 40141a5e..08650724 100644
--- a/expression.c
+++ b/expression.c
@@ -71,9 +71,9 @@ struct token *parens_expression(struct token *token, struct expression **expr, c
 		struct statement *stmt = alloc_statement(token->pos, STMT_COMPOUND);
 		*expr = e;
 		e->statement = stmt;
-		start_block_scope();
+		start_label_scope();
 		token = compound_statement(token->next, stmt);
-		end_block_scope();
+		end_label_scope();
 		token = expect(token, '}', "at end of statement expression");
 	} else
 		token = parse_expression(token, expr);
diff --git a/parse.c b/parse.c
index 64aaf07b..8e238f59 100644
--- a/parse.c
+++ b/parse.c
@@ -2539,6 +2539,17 @@ static struct token *statement(struct token *token, struct statement **tree)
 				// skip the label to avoid multiple definitions
 				return statement(token, tree);
 			}
+			/*
+			 * If the scope of the label symbol is different
+			 * from the current label scope, that means that
+			 * it must have been used at an outer scope.
+			 *
+			 * That's not ok.
+			 */
+			if (s->scope != label_scope) {
+				sparse_error(stmt->pos, "label '%s' used outside label expression", show_ident(s->ident));
+				sparse_error(s->pos, "invalid use here");
+			}
 			stmt->type = STMT_LABEL;
 			stmt->label_identifier = s;
 			s->stmt = stmt;
diff --git a/scope.c b/scope.c
index 8df9a5e0..4b0f7947 100644
--- a/scope.c
+++ b/scope.c
@@ -136,10 +136,44 @@ void end_block_scope(void)
 	end_scope(&block_scope);
 }
 
+/*
+ * Label scope is special.
+ *
+ * The scope of a label is limited to its definition, but
+ * a label can be used in an inner scope before it's defined.
+ *
+ * So when we end the label scope, we move all the non-defined
+ * labels out by one level.
+ */
 void end_label_scope(void)
 {
+	struct symbol *sym;
+	struct symbol_list *labels = label_scope->symbols;
+
+	label_scope->symbols = NULL;
+	label_scope = label_scope->next;
+	FOR_EACH_PTR(labels, sym) {
+
+		/*
+		 * Do we have a definition for it?
+		 * Ok, we're done with this label
+		 */
+		if (sym->stmt) {
+			remove_symbol_scope(sym);
+			continue;
+		}
+
+		if (label_scope == &builtin_scope) {
+			warning(sym->pos, "Unused label '%s'", sym->ident->name);
+			remove_symbol_scope(sym);
+			continue;
+		}
+
+		/* Re-bind the symbol to the parent scope, we'll try again */
+		bind_scope(sym, label_scope);
+	} END_FOR_EACH_PTR(sym);
+
 	end_scope(&block_scope);
-	end_scope(&label_scope);
 }
 
 int is_outer_scope(struct scope *scope)
-- 
2.24.0.158.g3fed155289


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 22:40         ` Linus Torvalds
@ 2020-04-13 23:39           ` Luc Van Oostenryck
  2020-04-14  7:49             ` Luc Van Oostenryck
  0 siblings, 1 reply; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-13 23:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Mon, Apr 13, 2020 at 03:40:55PM -0700, Linus Torvalds wrote:
> On Mon, Apr 13, 2020 at 12:32 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I _feel_ like the fix to that should be that the only thing that
> > creates the actual symbol is the label definition, and that the goto
> > should only ever use the 'ident' and we'd tie the two together later.
> 
> Actually, how about something like this?
> 
> I've not signed off on these patches, and the commit logs are
> questionable, but part of that is that the two first ones are just
> quick-and-dirty versions of your rename cleanups.
> 
> The third patch is the serious one, which shows what I think might be
> the solution to the odd scoping rules for labels.
> 
> Basically, we always scope labels - but if a label is _used_ but not
> defined in an inner label, when we close the label scope, we move it
> out to the next level.
> 
> But a defined label is never moved out, and when we define it, we
> require that any previous use was in the same scope (where "same
> scope" might have been an inner scope that was moved out).
> 
> I think it gets the semantics right, and it's actually fairly simple.
> 
> But it has very little testing, so this is more of a "how about
> something like this" than a serious submission.
> 
> If you test it, and fix up the warnings and error cases (like the
> other patches in your series did), you are more than welcome to take
> credit and authorship for this.
> 
> I just felt that the best way to describe (and do _some_ testing) my
> idea was to have a quick implementation to show what I mean.
> 
> And by "_some_ testing" I literally mean "almost no testing at all". I
> didn't even run this on the kernel tree. I just used one stipid small
> test-case for this, and when it gave the warning I wanted, I said
> "good enough" and sent this email out ;)

I like the idea. I just gave it a very quick test with sparse's
"make check" (it covers a lot of simple but corner/dirty cases that
the kernel may/should not have). It seemed to pass all the tests but
the ones using __label__. For exemple, things like this complain:
	{
		__label__ l;

	l:
		goto l;
	}

I'll look more at it tommorow as it's a bit late here. I just fear
that __label__ will spoil things here or at least complicate them.

-- Luc

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-13 23:39           ` Luc Van Oostenryck
@ 2020-04-14  7:49             ` Luc Van Oostenryck
  2020-04-14 18:19               ` Linus Torvalds
  0 siblings, 1 reply; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-14  7:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Tue, Apr 14, 2020 at 01:39:00AM +0200, Luc Van Oostenryck wrote:
> 
> I like the idea. I just gave it a very quick test with sparse's
> "make check" (it covers a lot of simple but corner/dirty cases that
> the kernel may/should not have). It seemed to pass all the tests but
> the ones using __label__. For exemple, things like this complain:
> 	{
> 		__label__ l;
> 
> 	l:
> 		goto l;
> 	}
> 
> I'll look more at it tommorow as it's a bit late here. I just fear
> that __label__ will spoil things here or at least complicate them.

The problem is that now normal labels use the new label_scope
but the ones declared with __label__ use block_scope and these
2 scopes are kinda in a different namespace of scope.
It's easy to make it work here but the problem would remain
when extra block levels are present, like in:
	{
		__label__ l;
		{
		l:
			goto l;
		}
		goto l;
	}

It's surely salvageable in some ways but I'm not sure it's worth
the troubles.

-- Luc

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-14  7:49             ` Luc Van Oostenryck
@ 2020-04-14 18:19               ` Linus Torvalds
  2020-04-14 23:09                 ` Luc Van Oostenryck
  0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2020-04-14 18:19 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

[-- Attachment #1: Type: text/plain, Size: 827 bytes --]

On Tue, Apr 14, 2020 at 12:49 AM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> The problem is that now normal labels use the new label_scope
> but the ones declared with __label__ use block_scope and these
> 2 scopes are kinda in a different namespace of scope.

Oh, I forgot about the special __label__ thing that actually declares labels.

That one has an interesting behavior, in that the _lifetime_ of the
symbol is the block scope, but the *use* of the symbol must remain in
label scope.

The most obvious fix is probably something like the appended: make the
'sym->scope' remain the lifetime scope, but then attach a "must be
used in this scope' thing to any NS_LABEL case.

That fairly clearly separates the two issues.

Again, not actually tested outside of the obvious trivial case.

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 2503 bytes --]

 parse.c  | 16 +++++++++++-----
 scope.c  |  1 +
 symbol.h |  3 +++
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/parse.c b/parse.c
index 8e238f59..b6109516 100644
--- a/parse.c
+++ b/parse.c
@@ -728,6 +728,7 @@ struct symbol *label_symbol(struct token *token)
 	if (!sym) {
 		sym = alloc_symbol(token->pos, SYM_LABEL);
 		bind_symbol(sym, token->ident, NS_LABEL);
+		sym->declared_scope = label_scope;
 		fn_local_symbol(sym);
 	}
 	return sym;
@@ -2541,13 +2542,14 @@ static struct token *statement(struct token *token, struct statement **tree)
 			}
 			/*
 			 * If the scope of the label symbol is different
-			 * from the current label scope, that means that
-			 * it must have been used at an outer scope.
+			 * from the declared label scope, that means that
+			 * it must have been used or declared at an outer
+			 * scope.
 			 *
 			 * That's not ok.
 			 */
-			if (s->scope != label_scope) {
-				sparse_error(stmt->pos, "label '%s' used outside label expression", show_ident(s->ident));
+			if (s->scope != s->declared_scope) {
+				sparse_error(stmt->pos, "label '%s' used outside statement expression", show_ident(s->ident));
 				sparse_error(s->pos, "invalid use here");
 			}
 			stmt->type = STMT_LABEL;
@@ -2575,9 +2577,13 @@ static struct token *label_statement(struct token *token)
 {
 	while (token_type(token) == TOKEN_IDENT) {
 		struct symbol *sym = alloc_symbol(token->pos, SYM_LABEL);
-		/* it's block-scope, but we want label namespace */
+
+		/* it's lifetile is block-scope, but we want label namespace */
 		bind_symbol(sym, token->ident, NS_SYMBOL);
 		sym->namespace = NS_LABEL;
+
+		/* But we must define it in this label scope */
+		sym->declared_scope = label_scope;
 		fn_local_symbol(sym);
 		token = token->next;
 		if (!match_op(token, ','))
diff --git a/scope.c b/scope.c
index 4b0f7947..11792ec4 100644
--- a/scope.c
+++ b/scope.c
@@ -171,6 +171,7 @@ void end_label_scope(void)
 
 		/* Re-bind the symbol to the parent scope, we'll try again */
 		bind_scope(sym, label_scope);
+		sym->declared_scope = label_scope;
 	} END_FOR_EACH_PTR(sym);
 
 	end_scope(&block_scope);
diff --git a/symbol.h b/symbol.h
index 18476582..08e35438 100644
--- a/symbol.h
+++ b/symbol.h
@@ -167,6 +167,9 @@ struct symbol {
 			int (*handler)(struct stream *, struct token **, struct token *);
 			int normal;
 		};
+		struct /* NS_LABEL */ {
+			struct scope *declared_scope;
+		};
 		struct /* NS_SYMBOL */ {
 			unsigned long	offset;
 			int		bit_size;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-14 18:19               ` Linus Torvalds
@ 2020-04-14 23:09                 ` Luc Van Oostenryck
  2020-04-15  0:59                   ` Linus Torvalds
  0 siblings, 1 reply; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-04-14 23:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Tue, Apr 14, 2020 at 11:19:32AM -0700, Linus Torvalds wrote:
> On Tue, Apr 14, 2020 at 12:49 AM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > The problem is that now normal labels use the new label_scope
> > but the ones declared with __label__ use block_scope and these
> > 2 scopes are kinda in a different namespace of scope.
> 
> Oh, I forgot about the special __label__ thing that actually declares labels.
> 
> That one has an interesting behavior, in that the _lifetime_ of the
> symbol is the block scope, but the *use* of the symbol must remain in
> label scope.
> 
> The most obvious fix is probably something like the appended: make the
> 'sym->scope' remain the lifetime scope, but then attach a "must be
> used in this scope' thing to any NS_LABEL case.
> 
> That fairly clearly separates the two issues.

Yes, I see the principle but ... it doesn't work yet for __label__ 
and the more I think about it, the more I think it can't work.
The problem is that the block scopes and the label scopes are
never comparable. At the syntax level the label scope is a subset
of the block levels, less fine grained but for sparse 'struct scope'
they are never the same, even for the same { }.
So, in your patch (with things removed for simplicity):

diff --git a/parse.c b/parse.c
index 8e238f59..b6109516 100644
--- a/parse.c
+++ b/parse.c
@@ -2541,13 +2542,14 @@ static struct token *statement(struct token *token, struct statement **tree)
-			if (s->scope != label_scope) {
-				sparse_error(stmt->pos, "label '%s' used outside label expression", show_ident(s->ident));
+			if (s->scope != s->declared_scope) {

This comparison can never succeed for labels declared with __label__
because s->scope is a block scope and s->declared_scope a label one.

I'm sure it's fixeable in some way and problably it's just me
having a kind of 'mental blocage' but I don't see how the
normal scoping with lookup_symbol() can be really useful for
labels without losing the conceptual simplicity or without going
to something like my previous solution.

-- Luc

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-14 23:09                 ` Luc Van Oostenryck
@ 2020-04-15  0:59                   ` Linus Torvalds
  2020-05-14 22:22                     ` Luc Van Oostenryck
  0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2020-04-15  0:59 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Tue, Apr 14, 2020 at 4:09 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> +                       if (s->scope != s->declared_scope) {
>
> This comparison can never succeed for labels declared with __label__
> because s->scope is a block scope and s->declared_scope a label one.

Hold on.. I'm sure I tested it.

Oh.

What I tested wasn't what I sent you, and I'd fixed things due to the
testing but not updated the patch file.

Oops.

The test is supposed to be

                        if (s->declared_scope != label_scope) {

which is the whole point of that 'declared_scope'.

So the concept of the patch is that the 'declared_scope' (and
'label_scope') are the same kind of scope (and comparable): it is the
applicability of the label itself (either the whole function or some
sub-expression statement).

And the the visibility of the -symbol- ends up being different, and is
the s->scope thing.

But while my testing wasn't quite as limited as my wrong-version patch
implied, it _was_ limited. So it might miss some other case.

              Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/17] scope: give a scope for labels & gotos
  2020-04-15  0:59                   ` Linus Torvalds
@ 2020-05-14 22:22                     ` Luc Van Oostenryck
  0 siblings, 0 replies; 30+ messages in thread
From: Luc Van Oostenryck @ 2020-05-14 22:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Tue, Apr 14, 2020 at 05:59:18PM -0700, Linus Torvalds wrote:
> On Tue, Apr 14, 2020 at 4:09 PM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > +                       if (s->scope != s->declared_scope) {
> >
> > This comparison can never succeed for labels declared with __label__
> > because s->scope is a block scope and s->declared_scope a label one.
> 
> Hold on.. I'm sure I tested it.
> 
> Oh.
> 
> What I tested wasn't what I sent you, and I'd fixed things due to the
> testing but not updated the patch file.
> 
> Oops.
> 
> The test is supposed to be
> 
>                         if (s->declared_scope != label_scope) {
> 
> which is the whole point of that 'declared_scope'.
> 
> So the concept of the patch is that the 'declared_scope' (and
> 'label_scope') are the same kind of scope (and comparable): it is the
> applicability of the label itself (either the whole function or some
> sub-expression statement).
> 
> And the the visibility of the -symbol- ends up being different, and is
> the s->scope thing.
> 
> But while my testing wasn't quite as limited as my wrong-version patch
> implied, it _was_ limited. So it might miss some other case.

Sorry for the late reply.

Yes, it's working OK wth this change but there are several issues that
make me think that this approach is not ideal:

1) these 2 functions would give very different error message:
	void foo(void)
	{
		goto l;
		({ l: 0; });
	}

	void bar(void)
	{
		({ l: 0; });
		goto l;
	}

  The first one gives the 'warning: jumping inside statement expression'
  while the second one can only give 'warning: goto with undeclared label'
  because indeed the 'l' label inside the statement isn't visible anymore.
  This second warning is of coure less informative than the first one
  but what I really find abnormal is that the warning would be different
  depending on the fact that the goto comes before or after the label
  definition. This seems incorrect to me, confusing and is different
  from GCC (clang doesn't seems to mind).

2) in the following case, no warning can be given:
	void foo(void)
	{
	l:
		({
		l:
			goto l;
			0;
		});
		goto l;
	}
  
   In this case both label definition are in a different scope and each goto
   sees its own label. This is different than GCC which would complain
   about 'l' being redeclared.

   There was also another issue related to the fact that GCC put all
   labels in a single namespace bit I forgot the details.


These two problems are linked to the fact of using the local namespace
for labels while GCC use a single one for all of them.
But well, then again I can't say I'm fully happy with my solution
using the label scope not for the labels but for their definition
and uses (gotos & label expressions) and then comparing these with
the helper is_in_scope().

I dunno. I've fixed a number of details, I'll repost everything soon.

-- Luc

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2020-05-14 22:22 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-13 16:15 [PATCH 00/17] detect invalid branches at evaluation time Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 01/17] bad-goto: add testcase for 'jump inside discarded expression statement' Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 02/17] bad-goto: add testcases for linearization of invalid labels Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 03/17] bad-goto: add more testcases Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 04/17] bad-goto: do not linearize if the IR will be invalid Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 05/17] bad-goto: reorg test in evaluate_goto_statement() Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 06/17] bad-goto: simplify testing of undeclared labels Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 07/17] bad-goto: do not linearize function with " Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 08/17] bad-goto: catch labels with reserved names Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 09/17] scope: no memset() needed after __alloc_scope() Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 10/17] scope: move scope opening/ending inside compound_statement() Luc Van Oostenryck
2020-04-13 16:15 ` [PATCH 11/17] scope: make function scope the same as the body block scope Luc Van Oostenryck
2020-04-13 16:16 ` [PATCH 12/17] scope: s/{start,end}_symbol_scope/{start,end}_block_scope/ Luc Van Oostenryck
2020-04-13 16:16 ` [PATCH 13/17] scope: let labels have their own scope Luc Van Oostenryck
2020-04-13 17:30   ` Linus Torvalds
2020-04-13 16:16 ` [PATCH 14/17] scope: add is_in_scope() Luc Van Oostenryck
2020-04-13 16:16 ` [PATCH 15/17] scope: give a scope for labels & gotos Luc Van Oostenryck
2020-04-13 17:52   ` Linus Torvalds
2020-04-13 18:54     ` Luc Van Oostenryck
2020-04-13 19:32       ` Linus Torvalds
2020-04-13 20:00         ` Luc Van Oostenryck
2020-04-13 22:40         ` Linus Torvalds
2020-04-13 23:39           ` Luc Van Oostenryck
2020-04-14  7:49             ` Luc Van Oostenryck
2020-04-14 18:19               ` Linus Torvalds
2020-04-14 23:09                 ` Luc Van Oostenryck
2020-04-15  0:59                   ` Linus Torvalds
2020-05-14 22:22                     ` Luc Van Oostenryck
2020-04-13 16:16 ` [PATCH 16/17] bad-goto: catch gotos inside expression statements Luc Van Oostenryck
2020-04-13 16:16 ` [PATCH 17/17] bad-goto: cleanup evaluate_goto() Luc Van Oostenryck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).