linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] cfg: early CFG simplification
@ 2020-11-16 22:29 Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 1/8] testcase: avoid UNDEF Luc Van Oostenryck
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

This series is a preparatory step for some other optimzations
but is quite valuable in itself. It adds a new phase of CFG
CFG simplification before any other simplifications. To be
effective this pahse needs some other changes:
* phi-sources & phi-nodes must be handled specifically
  whne merinf BBs.
* some missing REPEAT_CFG_CLEANUP must be added
* simplify_memops() must be called unconditionally

These changes slightly decrease the number of 'context imbalance'
warnings (32 less on a total of 995 warnings) and have otherwise a
significant effect on the size of the generated IR (~0.4%).

Luc Van Oostenryck (8):
  testcase: avoid UNDEF
  cfg: add testcase for phi-adjusting during BB merge
  cfg: extract merge_bb() from pack_basic_blocks()
  cfg: adjust phi-sources when merging BBs
  cfg: adjust phi-nodes when merging BBs
  cfg: add missing REPEAT_CFG_CLEANUP
  cfg: call simplify_memops() unconditionally.
  cfg: early CFG simplification

 flow.c                                       | 201 ++++++++++++++++---
 flow.h                                       |   1 +
 linearize.c                                  |   1 +
 memops.c                                     |   2 +
 optimize.c                                   |   9 +-
 simplify.c                                   |   2 +-
 validation/call-inlined.c                    |   4 +
 validation/expand/builtin_constant_inline0.c |   1 +
 validation/inline_base0.c                    |   3 +
 validation/linear/builtin_unreachable0.c     |   4 +-
 validation/linear/builtin_unreachable1.c     |   4 +-
 validation/linear/call-inline.c              |   2 +-
 validation/mem2reg/cond-expr.c               |   4 +-
 validation/mem2reg/cond-expr5.c              |   5 +-
 validation/optim/cse-size.c                  |   5 +-
 validation/optim/memops-missed01.c           |  23 +++
 validation/optim/memops-missed02.c           |  14 ++
 validation/optim/merge_bbe-adjust_phi.c      |  23 +++
 18 files changed, 269 insertions(+), 39 deletions(-)
 create mode 100644 validation/optim/memops-missed01.c
 create mode 100644 validation/optim/memops-missed02.c
 create mode 100644 validation/optim/merge_bbe-adjust_phi.c

-- 
2.29.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/8] testcase: avoid UNDEF
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 2/8] cfg: add testcase for phi-adjusting during BB merge Luc Van Oostenryck
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

Reduced testcases (with creduce, of course) often needlessly have
undefined variables. Since these are untouched by the simplification
code and should not be present in source code, they should be avoided
in optimization testcases.

So, defines 'x' to some value other than 0.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 validation/optim/cse-size.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/validation/optim/cse-size.c b/validation/optim/cse-size.c
index 5b31420c84ad..e1a5d492a666 100644
--- a/validation/optim/cse-size.c
+++ b/validation/optim/cse-size.c
@@ -1,7 +1,7 @@
 static void foo(void)
 {
 	unsigned short p = 0;
-	int x;
+	int x = 1;
 
 	for (;;)
 		if (p)
@@ -13,5 +13,6 @@ static void foo(void)
  * check-command: test-linearize -Wno-decl $file
  *
  * check-output-ignore
- * check-output-pattern(2): phi\\.
+ * check-output-pattern(0,1): phi\\.
+ * check-output-excludes: cbr
  */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/8] cfg: add testcase for phi-adjusting during BB merge
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 1/8] testcase: avoid UNDEF Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 3/8] cfg: extract merge_bb() from pack_basic_blocks() Luc Van Oostenryck
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

When merging BBs, phi-sources from the bottom BB should 'overwrite'
the ones from the top BB which should be ignored.

Add a testcase from the incoming fix.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 validation/optim/merge_bbe-adjust_phi.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
 create mode 100644 validation/optim/merge_bbe-adjust_phi.c

diff --git a/validation/optim/merge_bbe-adjust_phi.c b/validation/optim/merge_bbe-adjust_phi.c
new file mode 100644
index 000000000000..de4c54cc6d49
--- /dev/null
+++ b/validation/optim/merge_bbe-adjust_phi.c
@@ -0,0 +1,24 @@
+extern int array[2];
+
+static inline int stupid_select(int idx)
+{
+	if (idx)
+		idx = 0;
+	return array[idx];
+}
+
+int select(void)
+{
+	int d = stupid_select(-1);
+	return d;
+}
+
+/*
+ * check-name: merge_bbe-adjust_phi
+ * check-command: test-linearize -Wno-decl $file
+ * check-known-to-fail
+ *
+ * check-output-ignore
+ * check-output-excludes: phisrc\\.
+ * check-output-excludes: phi\\.
+ */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/8] cfg: extract merge_bb() from pack_basic_blocks()
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 1/8] testcase: avoid UNDEF Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 2/8] cfg: add testcase for phi-adjusting during BB merge Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 4/8] cfg: adjust phi-sources when merging BBs Luc Van Oostenryck
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

Extract merge_bb() from pack_basic_blocks() in order to reuse this
part of the code in other simplification/finer grained version of
pack_basic_blocks().

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 flow.c | 58 +++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 35 insertions(+), 23 deletions(-)

diff --git a/flow.c b/flow.c
index ef8d04e5827f..9ae8612a2312 100644
--- a/flow.c
+++ b/flow.c
@@ -723,13 +723,46 @@ void vrfy_flow(struct entrypoint *ep)
 	assert(!entry);
 }
 
+///
+// merge two BBs
+// @top: the first BB to be merged
+// @bot: the second BB to be merged
+static int merge_bb(struct basic_block *top, struct basic_block *bot)
+{
+	struct instruction *insn;
+	struct basic_block *bb;
+
+	if (top == bot)
+		return 0;
+
+	top->children = bot->children;
+	bot->children = NULL;
+	bot->parents = NULL;
+
+	FOR_EACH_PTR(top->children, bb) {
+		replace_bb_in_list(&bb->parents, bot, top, 1);
+	} END_FOR_EACH_PTR(bb);
+
+	kill_instruction(delete_last_instruction(&top->insns));
+	FOR_EACH_PTR(bot->insns, insn) {
+		if (!insn->bb)
+			continue;
+		assert(insn->bb == bot);
+		insn->bb = top;
+		add_instruction(&top->insns, insn);
+	} END_FOR_EACH_PTR(insn);
+	bot->insns = NULL;
+	bot->ep = NULL;
+	return REPEAT_CFG_CLEANUP;
+}
+
 void pack_basic_blocks(struct entrypoint *ep)
 {
 	struct basic_block *bb;
 
 	/* See if we can merge a bb into another one.. */
 	FOR_EACH_PTR(ep->bbs, bb) {
-		struct instruction *first, *insn;
+		struct instruction *first;
 		struct basic_block *parent, *child, *last;
 
 		if (!bb_reachable(bb))
@@ -786,28 +819,7 @@ out:
 				goto no_merge;
 		} END_FOR_EACH_PTR(child);
 
-		/*
-		 * Merge the two.
-		 */
-		repeat_phase |= REPEAT_CFG_CLEANUP;
-
-		parent->children = bb->children;
-		bb->children = NULL;
-		bb->parents = NULL;
-
-		FOR_EACH_PTR(parent->children, child) {
-			replace_bb_in_list(&child->parents, bb, parent, 0);
-		} END_FOR_EACH_PTR(child);
-
-		kill_instruction(delete_last_instruction(&parent->insns));
-		FOR_EACH_PTR(bb->insns, insn) {
-			if (!insn->bb)
-				continue;
-			assert(insn->bb == bb);
-			insn->bb = parent;
-			add_instruction(&parent->insns, insn);
-		} END_FOR_EACH_PTR(insn);
-		bb->insns = NULL;
+		repeat_phase |= merge_bb(parent, bb);
 
 	no_merge:
 		/* nothing to do */;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/8] cfg: adjust phi-sources when merging BBs
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
                   ` (2 preceding siblings ...)
  2020-11-16 22:29 ` [PATCH 3/8] cfg: extract merge_bb() from pack_basic_blocks() Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  2020-11-16 22:53   ` Linus Torvalds
  2020-11-16 22:29 ` [PATCH 5/8] cfg: adjust phi-nodes " Luc Van Oostenryck
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

When merging two basic blocks, it may happen that both of theses
blocks contain a phi-source for the same phi-node. In this case,
only the phi-source from the bottom BB must be taken in account,
it kinda overwrites the value from the top BB and the phi-source
from the top BB must be ignored, in fact it must be removed.

However, it is not the case and this extra phi-source creates
different kind of problems. Among other things, it hinders
further simplifications. For example, the following code:

	extern int array[2];
	static inline int stupid_select(int idx)
	{
		if (idx)
			idx = 0;
		return array[idx];
	}
	int select(void)
	{
		int d = stupid_select(-1);
		return d;
	}

should boil down to a simple dereference of the array with
an index of zero, like:
	select:
		load.32     %r8 <- 0[array]
		ret.32      %r8

but currently gives:
	select:
		phisrc.32   %phi3(idx) <- $0xffffffff
		phisrc.32   %phi4(idx) <- $0
		phi.32      %r12(idx) <- %phi3(idx), %phi4(idx)
		sext.64     %r5 <- (32) %r12(idx)
		mul.64      %r6 <- %r5, $4
		add.64      %r7 <- %r6, array
		load.32     %r8 <- 0[%r7]
		ret.32      %r8

This patch takes care of the problem by:
* when merging 2 BBs, check when reaching a phi-source in the bottom BB
* if one is found, look after sibling phi-sources
* remove such sibling if belonging to the top BB.
With this change, the code above gives:
	select:
		phisrc.32   %phi4(idx) <- $0
		phi.32      %r12(idx) <- %phi4(idx)
		sext.64     %r5 <- (32) %r12(idx)
		mul.64      %r6 <- %r5, $4
		add.64      %r7 <- %r6, array
		load.32     %r8 <- 0[%r7]
		ret.32      %r8

which can the be simplified into the expected result.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 flow.c                                  | 42 +++++++++++++++++++++++++
 validation/optim/merge_bbe-adjust_phi.c |  1 -
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/flow.c b/flow.c
index 9ae8612a2312..e9c8f6b7355e 100644
--- a/flow.c
+++ b/flow.c
@@ -43,6 +43,25 @@ static int rewrite_branch(struct basic_block *bb,
 	return 1;
 }
 
+///
+// returns the phi-node corresponding to a phi-source
+static struct instruction *get_phinode(struct instruction *phisrc)
+{
+	struct pseudo_user *pu;
+
+	FOR_EACH_PTR(phisrc->target->users, pu) {
+		struct instruction *user;
+
+		if (!pu)
+			continue;
+		user = pu->insn;
+		assert(user->opcode == OP_PHI);
+		return user;
+	} END_FOR_EACH_PTR(pu);
+	assert(0);
+}
+
+
 /*
  * Return the known truth value of a pseudo, or -1 if
  * it's not known.
@@ -723,6 +742,24 @@ void vrfy_flow(struct entrypoint *ep)
 	assert(!entry);
 }
 
+static void adjust_phisrc(struct basic_block *top, struct instruction *insn)
+{
+	struct instruction *user = get_phinode(insn);
+	pseudo_t phi;
+
+	FOR_EACH_PTR(user->phi_list, phi) {
+		struct instruction *phisrc;
+
+		if (phi == VOID)
+			continue;
+		phisrc = phi->def;
+		if (phisrc->bb != top)
+			continue;
+		REPLACE_CURRENT_PTR(phi, VOID);
+		kill_instruction(phisrc);
+	} END_FOR_EACH_PTR(phi);
+}
+
 ///
 // merge two BBs
 // @top: the first BB to be merged
@@ -748,6 +785,11 @@ static int merge_bb(struct basic_block *top, struct basic_block *bot)
 		if (!insn->bb)
 			continue;
 		assert(insn->bb == bot);
+		switch (insn->opcode) {
+		case OP_PHISOURCE:
+			adjust_phisrc(top, insn);
+			break;
+		}
 		insn->bb = top;
 		add_instruction(&top->insns, insn);
 	} END_FOR_EACH_PTR(insn);
diff --git a/validation/optim/merge_bbe-adjust_phi.c b/validation/optim/merge_bbe-adjust_phi.c
index de4c54cc6d49..6a8ebb73a62d 100644
--- a/validation/optim/merge_bbe-adjust_phi.c
+++ b/validation/optim/merge_bbe-adjust_phi.c
@@ -16,7 +16,6 @@ int select(void)
 /*
  * check-name: merge_bbe-adjust_phi
  * check-command: test-linearize -Wno-decl $file
- * check-known-to-fail
  *
  * check-output-ignore
  * check-output-excludes: phisrc\\.
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/8] cfg: adjust phi-nodes when merging BBs
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
                   ` (3 preceding siblings ...)
  2020-11-16 22:29 ` [PATCH 4/8] cfg: adjust phi-sources when merging BBs Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 6/8] cfg: add missing REPEAT_CFG_CLEANUP Luc Van Oostenryck
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

When merging BBs, it's possible that the top BB feeds one or
several phi-nodes in the bottom BB. Since phi-nodes only make
sense for values incoming from the parent BBs, these phi-nodes
can and should be removed when merging the BBs.

So, when merging BBs, remove these related phi-nodes.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 flow.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/flow.c b/flow.c
index e9c8f6b7355e..23886f9d84fa 100644
--- a/flow.c
+++ b/flow.c
@@ -760,6 +760,26 @@ static void adjust_phisrc(struct basic_block *top, struct instruction *insn)
 	} END_FOR_EACH_PTR(phi);
 }
 
+static void adjust_phi(struct basic_block *top, struct instruction *insn)
+{
+	pseudo_t phi;
+
+	FOR_EACH_PTR(insn->phi_list, phi) {
+		struct instruction *def;
+
+		if (phi == VOID)
+			continue;
+
+		def = phi->def;
+		if (def->bb != top)
+			continue;
+
+		convert_instruction_target(insn, def->src);
+		kill_instruction(def);
+		kill_instruction(insn);
+	} END_FOR_EACH_PTR(phi);
+}
+
 ///
 // merge two BBs
 // @top: the first BB to be merged
@@ -786,6 +806,9 @@ static int merge_bb(struct basic_block *top, struct basic_block *bot)
 			continue;
 		assert(insn->bb == bot);
 		switch (insn->opcode) {
+		case OP_PHI:
+			adjust_phi(top, insn);
+			continue;
 		case OP_PHISOURCE:
 			adjust_phisrc(top, insn);
 			break;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/8] cfg: add missing REPEAT_CFG_CLEANUP
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
                   ` (4 preceding siblings ...)
  2020-11-16 22:29 ` [PATCH 5/8] cfg: adjust phi-nodes " Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 7/8] cfg: call simplify_memops() unconditionally Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 8/8] cfg: early CFG simplification Luc Van Oostenryck
  7 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

simplify_branch() & insert_branch() convert a conditional branch
into an unconditional one, removing a child which may then
become unreachable.

However, these function doesn't set REPEAT_CFG_CLEANUP and the
unreachable child may not be removed.

Fix this by setting the missing REPEAT_CFG_CLEANUP in these functions.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 linearize.c | 1 +
 simplify.c  | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/linearize.c b/linearize.c
index ab91113d00eb..5d800b7f4006 100644
--- a/linearize.c
+++ b/linearize.c
@@ -726,6 +726,7 @@ void insert_branch(struct basic_block *bb, struct instruction *jmp, struct basic
 		remove_parent(child, bb);
 	} END_FOR_EACH_PTR(child);
 	PACK_PTR_LIST(&bb->children);
+	repeat_phase |= REPEAT_CFG_CLEANUP;
 }
 	
 
diff --git a/simplify.c b/simplify.c
index e58fb6cf3941..a0e23d6de01f 100644
--- a/simplify.c
+++ b/simplify.c
@@ -2048,7 +2048,7 @@ static int simplify_branch(struct instruction *insn)
 		kill_use(&insn->cond);
 		insn->cond = NULL;
 		insn->opcode = OP_BR;
-		return REPEAT_CSE;
+		return REPEAT_CSE|REPEAT_CFG_CLEANUP;
 	}
 
 	/* Conditional on a SETNE $0 or SETEQ $0 */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/8] cfg: call simplify_memops() unconditionally.
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
                   ` (5 preceding siblings ...)
  2020-11-16 22:29 ` [PATCH 6/8] cfg: add missing REPEAT_CFG_CLEANUP Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  2020-11-16 22:29 ` [PATCH 8/8] cfg: early CFG simplification Luc Van Oostenryck
  7 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

Currently, in the main optimization loop, simplify_memops()
is only called if REPEAT_SYMBOL_CLEANUP have been set. This
has (at least) two problems:
1) simplify_memops() may itself create other 'symbol cleanup'
   opportunities. That's fine and when it happens REPEAT_SYMBOL_CLEANUP
   is correctly set but this is directly lost because repeat_phase
   is cleared just after. So, loads and stores are not always
   optimized away as they should.
2) Loads & stores are not always done directly on symbols, in
   fact, it often happens that they are done on some PSEUDO_REG.
   Here too, loads and stores are not always optimized away as
   they should.

So, call simplify_memops() unconditionally.

Note: this have only very small effects on the tests' running time:
	  before		after		  after too
	  real	 4m18.001s	real   4m18.655s  real	 4m19.4
	  user	71m32.911s	user  72m02.701s  user	72m06.6
	  sys	29m06.523s	sys   29m01.721s  sys	29m06.8
      which is under the noise I usually have.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 memops.c                           |  2 ++
 optimize.c                         |  4 +---
 validation/optim/memops-missed01.c | 23 +++++++++++++++++++++++
 validation/optim/memops-missed02.c | 14 ++++++++++++++
 4 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 validation/optim/memops-missed01.c
 create mode 100644 validation/optim/memops-missed02.c

diff --git a/memops.c b/memops.c
index f071e556da8a..badcdbbb9378 100644
--- a/memops.c
+++ b/memops.c
@@ -146,10 +146,12 @@ static void simplify_loads(struct basic_block *bb)
 				}
 				rewrite_load_instruction(insn, dominators);
 			} else {	// cleanup pending phi-sources
+				int repeat = repeat_phase;
 				pseudo_t phi;
 				FOR_EACH_PTR(dominators, phi) {
 					kill_instruction(phi->def);
 				} END_FOR_EACH_PTR(phi);
+				repeat_phase = repeat;
 			}
 		}
 next_load:
diff --git a/optimize.c b/optimize.c
index bfab74c01b62..8ab105bccdce 100644
--- a/optimize.c
+++ b/optimize.c
@@ -84,9 +84,7 @@ repeat:
 				kill_unreachable_bbs(ep);
 
 			cse_eliminate(ep);
-
-			if (repeat_phase & REPEAT_SYMBOL_CLEANUP)
-				simplify_memops(ep);
+			simplify_memops(ep);
 		} while (repeat_phase);
 		pack_basic_blocks(ep);
 		if (repeat_phase & REPEAT_CFG_CLEANUP)
diff --git a/validation/optim/memops-missed01.c b/validation/optim/memops-missed01.c
new file mode 100644
index 000000000000..fc616f1976a1
--- /dev/null
+++ b/validation/optim/memops-missed01.c
@@ -0,0 +1,23 @@
+void bar(int);
+
+void foo(void)
+{
+	char buf[1] = { 42 };
+	const char *p = buf;
+	const char **q = &p;
+	int ch = 0;
+	switch (**q) {
+	case 4:
+		ch = 2;
+	}
+	bar(ch);
+}
+
+/*
+ * check-name: memops-missed01
+ * check-command: test-linearize -Wno-decl $file
+ *
+ * check-output-ignore
+ * check-output-excludes: store\\.
+ * check-output-excludes: load\\.
+ */
diff --git a/validation/optim/memops-missed02.c b/validation/optim/memops-missed02.c
new file mode 100644
index 000000000000..6f0286497943
--- /dev/null
+++ b/validation/optim/memops-missed02.c
@@ -0,0 +1,14 @@
+void foo(int a[])
+{
+	int i, val;
+	for (;; i++)
+		val = a[i] ? a[i] : val;
+}
+
+/*
+ * check-name: memops-missed02
+ * check-command: test-linearize -Wno-decl $file
+ *
+ * check-output-ignore
+ * check-output-pattern(1): load\\.
+ */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 8/8] cfg: early CFG simplification
  2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
                   ` (6 preceding siblings ...)
  2020-11-16 22:29 ` [PATCH 7/8] cfg: call simplify_memops() unconditionally Luc Van Oostenryck
@ 2020-11-16 22:29 ` Luc Van Oostenryck
  7 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 22:29 UTC (permalink / raw)
  To: linux-sparse; +Cc: Luc Van Oostenryck

The linearization step sometimes creates a lot of intermediate
basic blocks, often containing just a branch. Their presence often
make things more complicated than needed (more work to do in later
phases, visual clutter when inspection the IR 'by hand') and they
can sometimes, indirectly hinder some optimizations.

Happily, most of them can trivially be optimized away.
So, add a CFG simplification phase running very early and doing:
*) jump threading (eliminate jump to jump)
*) merge single-child/sinle-parents basic blocks.

These changes slightly decrease the number of 'context imbalance'
warnings (32 less on a total of 995 warnings) and the size of
the generated IR (only ~0.4% but this is very significant relatively
to most other simplifications).

They also seem to improve the kernel tests' running time:
	before			after
	real   4m19.261s	real   4m17.548s
	user  72m03.634s	user  71m34.642s
	sys   29m05.573s	sys   29m01.856s
but it's probably just noise.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 flow.c                                       | 78 ++++++++++++++++++++
 flow.h                                       |  1 +
 optimize.c                                   |  5 ++
 validation/call-inlined.c                    |  4 +
 validation/expand/builtin_constant_inline0.c |  1 +
 validation/inline_base0.c                    |  3 +
 validation/linear/builtin_unreachable0.c     |  4 +-
 validation/linear/builtin_unreachable1.c     |  4 +-
 validation/linear/call-inline.c              |  2 +-
 validation/mem2reg/cond-expr.c               |  4 +-
 validation/mem2reg/cond-expr5.c              |  5 +-
 validation/optim/cse-size.c                  |  2 +-
 12 files changed, 102 insertions(+), 11 deletions(-)

diff --git a/flow.c b/flow.c
index 23886f9d84fa..f56e72db470c 100644
--- a/flow.c
+++ b/flow.c
@@ -121,6 +121,31 @@ static int bb_depends_on_phi(struct basic_block *target, struct basic_block *src
 	return 0;
 }
 
+///
+// does the BB contains ignorable instructions but a final branch?
+// :note: something could be done for phi-sources but ... we'll see.
+static bool bb_is_forwarder(struct basic_block *bb)
+{
+	struct instruction *insn;
+
+	FOR_EACH_PTR(bb->insns, insn) {
+		if (!insn->bb)
+			continue;
+		switch (insn->opcode) {
+		case OP_NOP:
+		case OP_INLINED_CALL:
+			continue;
+		case OP_CBR:
+		case OP_BR:
+			return true;
+		default:
+			goto out;
+		}
+	} END_FOR_EACH_PTR(insn);
+out:
+	return false;
+}
+
 /*
  * When we reach here, we have:
  *  - a basic block that ends in a conditional branch and
@@ -742,6 +767,22 @@ void vrfy_flow(struct entrypoint *ep)
 	assert(!entry);
 }
 
+static int retarget_parents(struct basic_block *bb, struct basic_block *target)
+{
+	struct basic_block *parent;
+
+	/*
+	 * We can't do FOR_EACH_PTR() here, because the parent list
+	 * may change when we rewrite the parent.
+	 */
+	while ((parent = first_basic_block(bb->parents))) {
+		if (!rewrite_parent_branch(parent, bb, target))
+			return 0;
+	}
+	kill_bb(bb);
+	return REPEAT_CFG_CLEANUP;
+}
+
 static void adjust_phisrc(struct basic_block *top, struct instruction *insn)
 {
 	struct instruction *user = get_phinode(insn);
@@ -821,6 +862,43 @@ static int merge_bb(struct basic_block *top, struct basic_block *bot)
 	return REPEAT_CFG_CLEANUP;
 }
 
+///
+// early simplification of the CFG
+// Three things are done here:
+//    # inactive BB are removed
+//    # branches to a 'forwarder' BB are redirected to the forwardee.
+//    # merge single-child/single-parent BBs.
+int simplify_cfg_early(struct entrypoint *ep)
+{
+	struct basic_block *bb;
+	int changed = 0;
+
+	FOR_EACH_PTR_REVERSE(ep->bbs, bb) {
+		struct instruction *insn;
+		struct basic_block *tgt;
+
+		if (!bb->ep) {
+			DELETE_CURRENT_PTR(bb);
+			changed = REPEAT_CFG_CLEANUP;
+			continue;
+		}
+
+		insn = last_instruction(bb->insns);
+		if (!insn)
+			continue;
+		switch (insn->opcode) {
+		case OP_BR:
+			tgt = insn->bb_true;
+			if (bb_is_forwarder(bb))
+				changed |= retarget_parents(bb, tgt);
+			else if (bb_list_size(tgt->parents) == 1)
+				changed |= merge_bb(bb, tgt);
+			break;
+		}
+	} END_FOR_EACH_PTR_REVERSE(bb);
+	return changed;
+}
+
 void pack_basic_blocks(struct entrypoint *ep)
 {
 	struct basic_block *bb;
diff --git a/flow.h b/flow.h
index 099767d408f5..19a743c83b94 100644
--- a/flow.h
+++ b/flow.h
@@ -18,6 +18,7 @@ extern void kill_dead_stores(struct entrypoint *ep, pseudo_t addr, int local);
 extern void simplify_symbol_usage(struct entrypoint *ep);
 extern void simplify_memops(struct entrypoint *ep);
 extern void pack_basic_blocks(struct entrypoint *ep);
+extern int simplify_cfg_early(struct entrypoint *ep);
 
 extern void convert_instruction_target(struct instruction *insn, pseudo_t src);
 extern void remove_dead_insns(struct entrypoint *);
diff --git a/optimize.c b/optimize.c
index 8ab105bccdce..338714c78e68 100644
--- a/optimize.c
+++ b/optimize.c
@@ -57,6 +57,11 @@ void optimize(struct entrypoint *ep)
 	kill_unreachable_bbs(ep);
 	ir_validate(ep);
 
+	cfg_postorder(ep);
+	if (simplify_cfg_early(ep))
+		kill_unreachable_bbs(ep);
+	ir_validate(ep);
+
 	domtree_build(ep);
 
 	/*
diff --git a/validation/call-inlined.c b/validation/call-inlined.c
index 3612c5c42690..a6cb4b5b0976 100644
--- a/validation/call-inlined.c
+++ b/validation/call-inlined.c
@@ -28,12 +28,14 @@ foo:
 	<entry-point>
 	add.32      %r3 <- %arg1, %arg2
 	add.32      %r5 <- %r3, $1
+	# call      %r6 <- add, %r3, $1
 	ret.32      %r5
 
 
 bar:
 .L3:
 	<entry-point>
+	# call      %r13 <- add, %r10, $1
 	ret
 
 
@@ -41,6 +43,7 @@ bas:
 .L6:
 	<entry-point>
 	add.64      %r16 <- "abc", $1
+	# call      %r17 <- lstrip, %r14
 	ret.64      %r16
 
 
@@ -48,6 +51,7 @@ qus:
 .L9:
 	<entry-point>
 	add.64      %r21 <- messg, $1
+	# call      %r22 <- lstrip, %r19
 	ret.64      %r21
 
 
diff --git a/validation/expand/builtin_constant_inline0.c b/validation/expand/builtin_constant_inline0.c
index a0057f2094b3..d72a211fd267 100644
--- a/validation/expand/builtin_constant_inline0.c
+++ b/validation/expand/builtin_constant_inline0.c
@@ -16,6 +16,7 @@ int foo(void)
 foo:
 .L0:
 	<entry-point>
+	# call      %r1 <- is_const, $42
 	ret.32      $42
 
 
diff --git a/validation/inline_base0.c b/validation/inline_base0.c
index 517ee972ee05..698c760ff0be 100644
--- a/validation/inline_base0.c
+++ b/validation/inline_base0.c
@@ -27,6 +27,7 @@ foo0:
 .L0:
 	<entry-point>
 	add.32      %r5 <- %arg1, %arg2
+	# call      %r6 <- add, %r1, %r2
 	ret.32      %r5
 
 
@@ -34,12 +35,14 @@ foo1:
 .L3:
 	<entry-point>
 	add.32      %r10 <- %arg1, $1
+	# call      %r11 <- add, %r8, $1
 	ret.32      %r10
 
 
 foo2:
 .L6:
 	<entry-point>
+	# call      %r13 <- add, $1, $2
 	ret.32      $3
 
 
diff --git a/validation/linear/builtin_unreachable0.c b/validation/linear/builtin_unreachable0.c
index 911ed7f97f95..4fc564739e76 100644
--- a/validation/linear/builtin_unreachable0.c
+++ b/validation/linear/builtin_unreachable0.c
@@ -14,12 +14,12 @@ foo:
 .L0:
 	<entry-point>
 	seteq.32    %r2 <- %arg1, $3
-	cbr         %r2, .L1, .L3
+	cbr         %r2, .L1, .L2
 
 .L1:
 	unreachable
 
-.L3:
+.L2:
 	ret.32      %arg1
 
 
diff --git a/validation/linear/builtin_unreachable1.c b/validation/linear/builtin_unreachable1.c
index 70f6674c6443..2fc1d728d7d9 100644
--- a/validation/linear/builtin_unreachable1.c
+++ b/validation/linear/builtin_unreachable1.c
@@ -16,9 +16,9 @@ int foo(int c)
 foo:
 .L0:
 	<entry-point>
-	cbr         %arg1, .L3, .L2
+	cbr         %arg1, .L1, .L2
 
-.L3:
+.L1:
 	ret.32      $1
 
 .L2:
diff --git a/validation/linear/call-inline.c b/validation/linear/call-inline.c
index dfd49b62c4b7..1ad785ee678b 100644
--- a/validation/linear/call-inline.c
+++ b/validation/linear/call-inline.c
@@ -13,6 +13,6 @@ int i3(void) { return (***fun)(); }		// C99,C11 6.5.3.2p4
  *
  * check-output-ignore
  * check-output-excludes: load
- * check-output-excludes: call
+ * check-output-excludes: \\tcall
  * check-output-pattern(5): ret\\..* \\$42
  */
diff --git a/validation/mem2reg/cond-expr.c b/validation/mem2reg/cond-expr.c
index 8acb00ac950f..2474d65d8c64 100644
--- a/validation/mem2reg/cond-expr.c
+++ b/validation/mem2reg/cond-expr.c
@@ -9,6 +9,6 @@ int foo(int a, int b, int c)
  * check-name: cond-expr
  * check-command: test-linearize -Wno-decl -fdump-ir=mem2reg $file
  * check-output-ignore
- * check-output-pattern(2): phi\\.
- * check-output-pattern(3): phisrc\\.
+ * check-output-pattern(1): phi\\.
+ * check-output-pattern(2): phisrc\\.
  */
diff --git a/validation/mem2reg/cond-expr5.c b/validation/mem2reg/cond-expr5.c
index a3ce5e3a955e..beef8f258cd9 100644
--- a/validation/mem2reg/cond-expr5.c
+++ b/validation/mem2reg/cond-expr5.c
@@ -15,7 +15,6 @@ int foo(int p, int q, int a)
  * check-output-ignore
  * check-output-excludes: load\\.
  * check-output-excludes: store\\.
- * check-output-excludes: phi\\..*, .*, .*
- * check-output-pattern(3): phi\\.
- * check-output-pattern(5): phisrc\\.
+ * check-output-pattern(2): phi\\.
+ * check-output-pattern(4): phisrc\\.
  */
diff --git a/validation/optim/cse-size.c b/validation/optim/cse-size.c
index e1a5d492a666..0c0c2d1425ed 100644
--- a/validation/optim/cse-size.c
+++ b/validation/optim/cse-size.c
@@ -13,6 +13,6 @@ static void foo(void)
  * check-command: test-linearize -Wno-decl $file
  *
  * check-output-ignore
- * check-output-pattern(0,1): phi\\.
+ * check-output-excludes: phi\\.
  * check-output-excludes: cbr
  */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/8] cfg: adjust phi-sources when merging BBs
  2020-11-16 22:29 ` [PATCH 4/8] cfg: adjust phi-sources when merging BBs Luc Van Oostenryck
@ 2020-11-16 22:53   ` Linus Torvalds
  2020-11-16 23:47     ` Luc Van Oostenryck
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2020-11-16 22:53 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Mon, Nov 16, 2020 at 2:30 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> +static void adjust_phisrc(struct basic_block *top, struct instruction *insn)

My only issue is that this is a really odd name, and calling convention.

"adjust"?

Wouldn't it be more sensible to call this "remove_phisrc()", because
that's what it does. It removes the matching phisrc instruction in the
specified basic block.

(I also think it might be a bit more obvious to do the get_phinode()
in the caller, and simply pass in the OP_PHI instruction, but maybe
you had reasons to do it that way).

             Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/8] cfg: adjust phi-sources when merging BBs
  2020-11-16 22:53   ` Linus Torvalds
@ 2020-11-16 23:47     ` Luc Van Oostenryck
  2020-11-17  0:16       ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-16 23:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Mon, Nov 16, 2020 at 02:53:21PM -0800, Linus Torvalds wrote:
> On Mon, Nov 16, 2020 at 2:30 PM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > +static void adjust_phisrc(struct basic_block *top, struct instruction *insn)
> 
> My only issue is that this is a really odd name, and calling convention.
> 
> "adjust"?
> 
> Wouldn't it be more sensible to call this "remove_phisrc()", because
> that's what it does. It removes the matching phisrc instruction in the
> specified basic block.

Yes, the name is horrible and doesn't mean much. I'll change it.
 
> (I also think it might be a bit more obvious to do the get_phinode()
> in the caller, and simply pass in the OP_PHI instruction, but maybe
> you had reasons to do it that way).

In fact, I wrote it first like that but then I've another piece
of code that need exactly the same info so I change it into
a separate function.

This code is a bit weird though, because it actively use the fact
that each phi-sources feeds a single phi-node while the underlying
data structures are designed so that phi-sources can be shared
between phi-nodes (but I think that allowing them the be effectively
shared would bring too much problems).

-- Luc

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/8] cfg: adjust phi-sources when merging BBs
  2020-11-16 23:47     ` Luc Van Oostenryck
@ 2020-11-17  0:16       ` Linus Torvalds
  2020-11-17  0:42         ` Luc Van Oostenryck
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2020-11-17  0:16 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Mon, Nov 16, 2020 at 3:47 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> This code is a bit weird though, because it actively use the fact
> that each phi-sources feeds a single phi-node while the underlying
> data structures are designed so that phi-sources can be shared
> between phi-nodes (but I think that allowing them the be effectively
> shared would bring too much problems).

Yeah, the phisource nodes are strange

 I was confused when I wrote the original phi code, and you fixed it
all up (to the point where I suspect none of my original confusion
exists ;^), but they are still a bit strange.

Could the phi_users list be just replaced by the target (OP_PHI)
pseudo (or instruction)?

           Linus

          Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/8] cfg: adjust phi-sources when merging BBs
  2020-11-17  0:16       ` Linus Torvalds
@ 2020-11-17  0:42         ` Luc Van Oostenryck
  0 siblings, 0 replies; 13+ messages in thread
From: Luc Van Oostenryck @ 2020-11-17  0:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Mon, Nov 16, 2020 at 04:16:01PM -0800, Linus Torvalds wrote:
> 
> Could the phi_users list be just replaced by the target (OP_PHI)
> pseudo (or instruction)?

Yes, trivially if set liveness is calculated. But this
information could be useful earlier, then it would need some
work to track it (not sure it would be worth, though).

-- Luc

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-11-17  0:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16 22:29 [PATCH 0/8] cfg: early CFG simplification Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 1/8] testcase: avoid UNDEF Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 2/8] cfg: add testcase for phi-adjusting during BB merge Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 3/8] cfg: extract merge_bb() from pack_basic_blocks() Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 4/8] cfg: adjust phi-sources when merging BBs Luc Van Oostenryck
2020-11-16 22:53   ` Linus Torvalds
2020-11-16 23:47     ` Luc Van Oostenryck
2020-11-17  0:16       ` Linus Torvalds
2020-11-17  0:42         ` Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 5/8] cfg: adjust phi-nodes " Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 6/8] cfg: add missing REPEAT_CFG_CLEANUP Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 7/8] cfg: call simplify_memops() unconditionally Luc Van Oostenryck
2020-11-16 22:29 ` [PATCH 8/8] cfg: early CFG simplification Luc Van Oostenryck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).