All of lore.kernel.org
 help / color / mirror / Atom feed
* 'context imbalance' warnings
@ 2020-11-09 21:16 Luc Van Oostenryck
  2020-11-09 21:47 ` Linus Torvalds
  0 siblings, 1 reply; 4+ messages in thread
From: Luc Van Oostenryck @ 2020-11-09 21:16 UTC (permalink / raw)
  To: linux-sparse, Linus Torvalds

Hi,

This is the continuation of last week thread.

I've added a few more simplifications for selects & compares on top
of those from last week. Alas, none of these makes any real difference:
in my usual tests I have just one warning less (total number is now 995):
   * fs/dcache.c:1226:24: warning: context imbalance in 'dentry_lru_isolate_shrink' - different lock contexts for basic block

But well, these simplifications are good and can help.

So, I've taken again a look at the origin of these warnings (I already
did this 2-3 years ago). Here is what I found for the 6 first ones:

1) intel_{start,stop}_scheduling()
   - take/release a lock conditionally
   - but intel_start_scheduling() returns void ...

2) get_locked_pte()
   - it's a conditional lock
   - some code assume it never fails
   - some code protect it by a [VM_]BUG_ON()
   - sparse complains

3) lock_vector_lock()
   - OK, but the annotation is missing, I'll send a fix

4) raw_spin_trylock()
   - on UP, raw_spin_trylock() takes the context twice ...
   - I'll see what can be fixed there

5) acpi_os_read_memory()
   the usage pattern is 'interesting':

	flag = 0;
	lock();
	addr = lookup();
	if (!addr) {
		unlock()
		addr = map();
		if (!addr)
			return ERROR;
		flag = true;
	}

	// sparse complains here

	... use addr ...

	if (flag)
		unmap(addr);
	else
		unlock();
	return 0;

   I see the pattern and I suppose the locking is correct but I
   think that sparse's warning is justified. Is it needed to do
   the early unlock() when the lookup fails?

6) clk_enable_lock() (@ drivers/clk/clk.c)
   This one is really interesting and shows one of Sparse's shortcoming

	if (!spin_trylock_irqsave(&lock, flags)) {
		if (some_test) {
			__acquire;
			local_save_flags(flags);
			return flags;
		}
		spin_lock_irqsave(&enable_lock, flags);
	}
	...
	return;

   So the code is correct but what sparse really sees is:

	flags = __raw_save_flags();
	cli;
	tmp = _raw_spin_try_lock();
	if (tmp)
		__acquire;
	else
		__raw_restore_flags(flags);

	// sparse complains here

	if (!tmp) {
		if (some_test) {
			__acquire;
			flags = __raw_save_flags();
			return flags;
		}
		flags = raw_spin_lock_irqsave();
	}
	...
	return flags;

   Now sparse could merge the two branches depending on tmp,
   as if the code would have been written as:

	flags = __raw_save_flags();
	cli;
	tmp = _raw_spin_try_lock();
	if (tmp)
		__acquire;
	else {
		__raw_restore_flags(flags);
		if (some_test) {
			__acquire;
			flags = __raw_save_flags();
			return flags;
		}
		flags = raw_spin_lock_irqsave();
	}
	...
	return flags;

   But the corresponding IR is:

	... %r9(flags) <- ...			// flags = __raw_save_flags();
	...					// cli();
	call.32	%r10(tmp) <- _raw_spin_trylock	// tmp = _raw_spin_try_lock();
	cbr	%r10(tmp), .L1, .L2		// if (tmp)
.L1:	context	1				//   __acquire;
	br	.L3
.L2:	<restore interrupts>			//   __raw_restore_flags(flags);
	br	.L3
.L3:	phisrc.32 %phi1(flags) <- %r9(flags)
	cbr	%r10(tmp), .L4, .L5		// if (!tmp)

   The presence of the phisrc at .L3 blocks the BB merging.
   But this is only part of the problem. First:
   1) The phisrc (and maybe the associated phi-node) could
      be moved/adjusted when merging the BBs
   2) phisrcs are not really needed anyway (they're just a
      mean to track the BB of each phi-node's arguments).

   But even without these phi-sources, some merging that could
   happen doesn't.  For example, code like:
	#define __acquire	__context__(X, 1)
	int  def(void);
	void use(int);
	int foo(int a)
	{
		int tmp = def();
		if (a)
			__acquire;
		else
			use(tmp);
		if (!a) {
			def();
			__acquire;
		}
		use(0);
		__release;
		return tmp;
	}

   produces:
	foo:
	.L0:	call.32     %r1 <- def
		cbr         %arg1, .L1, .L2
	.L1:	context     1
		br          .L3
	.L2:	call        use, %r1
		br          .L3
	.L3:	cbr         %arg1, .L5, .L4	;; false imbalance here
	.L4:	call.32     %r7 <- def
		context     1
		br          .L5
	.L5:	call        use, $0
		context     -1
		ret.32      %r1

   There is no phi-source here and the CBR at .L3 could be merged
   with the one at .L0, removing the false imbalance but it's not.
   I thought that sparse was doing this sort of branch simplification
   but it doesn't, or at least it doesn't in this (simple) situation.

   I'll begin to look at this. I'm not sure about a general solution
   but for a simple case like here (just a 'diamond') it seems simple
   enough to handle.

-- Luc

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 'context imbalance' warnings
  2020-11-09 21:16 'context imbalance' warnings Luc Van Oostenryck
@ 2020-11-09 21:47 ` Linus Torvalds
  2020-11-09 21:57   ` Linus Torvalds
  2020-11-09 22:13   ` Luc Van Oostenryck
  0 siblings, 2 replies; 4+ messages in thread
From: Linus Torvalds @ 2020-11-09 21:47 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Mon, Nov 9, 2020 at 1:16 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
>    There is no phi-source here and the CBR at .L3 could be merged
>    with the one at .L0, removing the false imbalance but it's not.
>    I thought that sparse was doing this sort of branch simplification
>    but it doesn't, or at least it doesn't in this (simple) situation.

The branch rewriting sparse does is very simplistic, afaik. It's also
confusingly written. I blame myself.

Honestly, in that example you quote:

                cbr         %arg1, .L1, .L2
        .L1:    context     1
                br          .L3
        .L2:    call        use, %r1
                br          .L3
        .L3:    cbr         %arg1, .L5, .L4     ;; false imbalance here

we never even try to combine the two cbr's, because they don't jump
directly to each other.

I think it would be easy to simplify if we just added some logic to
change "unconditional branch to conditional branch", and moved the cbr
up into L1 and L2, but I don't think we have any such logic (and it's
a simplification that might end up being the opposite of a
simplification - it would duplicate the conditional).

So we could try to remove L3 entirely, by moving it into both parents:

                cbr         %arg1, .L1, .L2
        .L1:    context     1
                cbr         %arg1, .L5, .L4
        .L2:    call        use, %r1
                cbr         %arg1, .L5, .L4

and at that point, the branch simplification we *do* have woudl see
that both of those duplicated conditional branches are now dominated
by the first one, and we'd end up with

                cbr         %arg1, .L1, .L2
        .L1:    context     1
                br         .L5
        .L2:    call        use, %r1
                br         .L4

and then we'd join L2 and L4 together and the end result would look nice.

But as it stands now, L3 isn't dominated by one side of the original
conditional branch (because we reach L3 from both sides), and the cbr
in L3 isn't something we can simplify without duplicating it and
moving it up into the parents.

Is duplicating the conditional branch worth it? In this case, clearly
yes. But in general? Maybe we could do it in cases like this, when the
*only* thing in a basic block is that conditional branch.

            Linus

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 'context imbalance' warnings
  2020-11-09 21:47 ` Linus Torvalds
@ 2020-11-09 21:57   ` Linus Torvalds
  2020-11-09 22:13   ` Luc Van Oostenryck
  1 sibling, 0 replies; 4+ messages in thread
From: Linus Torvalds @ 2020-11-09 21:57 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Sparse Mailing-list

On Mon, Nov 9, 2020 at 1:47 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Is duplicating the conditional branch worth it? In this case, clearly
> yes. But in general? Maybe we could do it in cases like this, when the
> *only* thing in a basic block is that conditional branch.

Actually, I think the correct thing to do would be not to move the
conditional branch into the parents, but instead:

 (a) if we see an unconditional branch to a bb with (only) a conditional one

then

 (b) we look if the current BB is dominated by the same conditional,
and if so, we follow the conditional one the right way.

and that would immediately get us to that

                cbr         %arg1, .L1, .L2
        .L1:    context     1
                br         .L5
        .L2:    call        use, %r1
                br         .L4

state, without that possibly expanding "duplicate the conditional
branches into the parent bb's".

But the current flow rewriting isn't smart enough to do even this
fairly simple thing.

               Linus

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 'context imbalance' warnings
  2020-11-09 21:47 ` Linus Torvalds
  2020-11-09 21:57   ` Linus Torvalds
@ 2020-11-09 22:13   ` Luc Van Oostenryck
  1 sibling, 0 replies; 4+ messages in thread
From: Luc Van Oostenryck @ 2020-11-09 22:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sparse Mailing-list

On Mon, Nov 09, 2020 at 01:47:24PM -0800, Linus Torvalds wrote:
> On Mon, Nov 9, 2020 at 1:16 PM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> >    There is no phi-source here and the CBR at .L3 could be merged
> >    with the one at .L0, removing the false imbalance but it's not.
> >    I thought that sparse was doing this sort of branch simplification
> >    but it doesn't, or at least it doesn't in this (simple) situation.
> 
> The branch rewriting sparse does is very simplistic, afaik. It's also
> confusingly written. I blame myself.
> 
> Honestly, in that example you quote:
> 
>                 cbr         %arg1, .L1, .L2
>         .L1:    context     1
>                 br          .L3
>         .L2:    call        use, %r1
>                 br          .L3
>         .L3:    cbr         %arg1, .L5, .L4     ;; false imbalance here
> 
> we never even try to combine the two cbr's, because they don't jump
> directly to each other.
> 
> I think it would be easy to simplify if we just added some logic to
> change "unconditional branch to conditional branch", and moved the cbr
> up into L1 and L2, but I don't think we have any such logic (and it's
> a simplification that might end up being the opposite of a
> simplification - it would duplicate the conditional).
> 
> So we could try to remove L3 entirely, by moving it into both parents:
> 
>                 cbr         %arg1, .L1, .L2
>         .L1:    context     1
>                 cbr         %arg1, .L5, .L4
>         .L2:    call        use, %r1
>                 cbr         %arg1, .L5, .L4

Mmm yes, I see.
 
> and at that point, the branch simplification we *do* have woudl see
> that both of those duplicated conditional branches are now dominated
> by the first one, and we'd end up with
> 
>                 cbr         %arg1, .L1, .L2
>         .L1:    context     1
>                 br         .L5
>         .L2:    call        use, %r1
>                 br         .L4
> 
> and then we'd join L2 and L4 together and the end result would look nice.

I was simply thinking doing this directly. In the case of a simple
'diamond' (with the top and bottom CBR on the same conditional),
there is no dependency/dominance problems and it's enough to change
the target of the BR in the two side BBs.

> But as it stands now, L3 isn't dominated by one side of the original
> conditional branch (because we reach L3 from both sides), and the cbr
> in L3 isn't something we can simplify without duplicating it and
> moving it up into the parents.
> 
> Is duplicating the conditional branch worth it? In this case, clearly
> yes. But in general? Maybe we could do it in cases like this, when the
> *only* thing in a basic block is that conditional branch.

I think the general solution is to look at the dominance tree:
if there is another conditional branch with the same condition and which
dominates the current, then it must be safe to duplicate because it
can be simplified away anyway. Otherwise, I would prefer to avoid
duplicating anything without some cost/benefit guarantee.

-- Luc

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-11-09 22:13 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-09 21:16 'context imbalance' warnings Luc Van Oostenryck
2020-11-09 21:47 ` Linus Torvalds
2020-11-09 21:57   ` Linus Torvalds
2020-11-09 22:13   ` Luc Van Oostenryck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.