* 'context imbalance' warnings
@ 2020-11-09 21:16 Luc Van Oostenryck
2020-11-09 21:47 ` Linus Torvalds
0 siblings, 1 reply; 4+ messages in thread
From: Luc Van Oostenryck @ 2020-11-09 21:16 UTC (permalink / raw)
To: linux-sparse, Linus Torvalds
Hi,
This is the continuation of last week thread.
I've added a few more simplifications for selects & compares on top
of those from last week. Alas, none of these makes any real difference:
in my usual tests I have just one warning less (total number is now 995):
* fs/dcache.c:1226:24: warning: context imbalance in 'dentry_lru_isolate_shrink' - different lock contexts for basic block
But well, these simplifications are good and can help.
So, I've taken again a look at the origin of these warnings (I already
did this 2-3 years ago). Here is what I found for the 6 first ones:
1) intel_{start,stop}_scheduling()
- take/release a lock conditionally
- but intel_start_scheduling() returns void ...
2) get_locked_pte()
- it's a conditional lock
- some code assume it never fails
- some code protect it by a [VM_]BUG_ON()
- sparse complains
3) lock_vector_lock()
- OK, but the annotation is missing, I'll send a fix
4) raw_spin_trylock()
- on UP, raw_spin_trylock() takes the context twice ...
- I'll see what can be fixed there
5) acpi_os_read_memory()
the usage pattern is 'interesting':
flag = 0;
lock();
addr = lookup();
if (!addr) {
unlock()
addr = map();
if (!addr)
return ERROR;
flag = true;
}
// sparse complains here
... use addr ...
if (flag)
unmap(addr);
else
unlock();
return 0;
I see the pattern and I suppose the locking is correct but I
think that sparse's warning is justified. Is it needed to do
the early unlock() when the lookup fails?
6) clk_enable_lock() (@ drivers/clk/clk.c)
This one is really interesting and shows one of Sparse's shortcoming
if (!spin_trylock_irqsave(&lock, flags)) {
if (some_test) {
__acquire;
local_save_flags(flags);
return flags;
}
spin_lock_irqsave(&enable_lock, flags);
}
...
return;
So the code is correct but what sparse really sees is:
flags = __raw_save_flags();
cli;
tmp = _raw_spin_try_lock();
if (tmp)
__acquire;
else
__raw_restore_flags(flags);
// sparse complains here
if (!tmp) {
if (some_test) {
__acquire;
flags = __raw_save_flags();
return flags;
}
flags = raw_spin_lock_irqsave();
}
...
return flags;
Now sparse could merge the two branches depending on tmp,
as if the code would have been written as:
flags = __raw_save_flags();
cli;
tmp = _raw_spin_try_lock();
if (tmp)
__acquire;
else {
__raw_restore_flags(flags);
if (some_test) {
__acquire;
flags = __raw_save_flags();
return flags;
}
flags = raw_spin_lock_irqsave();
}
...
return flags;
But the corresponding IR is:
... %r9(flags) <- ... // flags = __raw_save_flags();
... // cli();
call.32 %r10(tmp) <- _raw_spin_trylock // tmp = _raw_spin_try_lock();
cbr %r10(tmp), .L1, .L2 // if (tmp)
.L1: context 1 // __acquire;
br .L3
.L2: <restore interrupts> // __raw_restore_flags(flags);
br .L3
.L3: phisrc.32 %phi1(flags) <- %r9(flags)
cbr %r10(tmp), .L4, .L5 // if (!tmp)
The presence of the phisrc at .L3 blocks the BB merging.
But this is only part of the problem. First:
1) The phisrc (and maybe the associated phi-node) could
be moved/adjusted when merging the BBs
2) phisrcs are not really needed anyway (they're just a
mean to track the BB of each phi-node's arguments).
But even without these phi-sources, some merging that could
happen doesn't. For example, code like:
#define __acquire __context__(X, 1)
int def(void);
void use(int);
int foo(int a)
{
int tmp = def();
if (a)
__acquire;
else
use(tmp);
if (!a) {
def();
__acquire;
}
use(0);
__release;
return tmp;
}
produces:
foo:
.L0: call.32 %r1 <- def
cbr %arg1, .L1, .L2
.L1: context 1
br .L3
.L2: call use, %r1
br .L3
.L3: cbr %arg1, .L5, .L4 ;; false imbalance here
.L4: call.32 %r7 <- def
context 1
br .L5
.L5: call use, $0
context -1
ret.32 %r1
There is no phi-source here and the CBR at .L3 could be merged
with the one at .L0, removing the false imbalance but it's not.
I thought that sparse was doing this sort of branch simplification
but it doesn't, or at least it doesn't in this (simple) situation.
I'll begin to look at this. I'm not sure about a general solution
but for a simple case like here (just a 'diamond') it seems simple
enough to handle.
-- Luc
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: 'context imbalance' warnings
2020-11-09 21:16 'context imbalance' warnings Luc Van Oostenryck
@ 2020-11-09 21:47 ` Linus Torvalds
2020-11-09 21:57 ` Linus Torvalds
2020-11-09 22:13 ` Luc Van Oostenryck
0 siblings, 2 replies; 4+ messages in thread
From: Linus Torvalds @ 2020-11-09 21:47 UTC (permalink / raw)
To: Luc Van Oostenryck; +Cc: Sparse Mailing-list
On Mon, Nov 9, 2020 at 1:16 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> There is no phi-source here and the CBR at .L3 could be merged
> with the one at .L0, removing the false imbalance but it's not.
> I thought that sparse was doing this sort of branch simplification
> but it doesn't, or at least it doesn't in this (simple) situation.
The branch rewriting sparse does is very simplistic, afaik. It's also
confusingly written. I blame myself.
Honestly, in that example you quote:
cbr %arg1, .L1, .L2
.L1: context 1
br .L3
.L2: call use, %r1
br .L3
.L3: cbr %arg1, .L5, .L4 ;; false imbalance here
we never even try to combine the two cbr's, because they don't jump
directly to each other.
I think it would be easy to simplify if we just added some logic to
change "unconditional branch to conditional branch", and moved the cbr
up into L1 and L2, but I don't think we have any such logic (and it's
a simplification that might end up being the opposite of a
simplification - it would duplicate the conditional).
So we could try to remove L3 entirely, by moving it into both parents:
cbr %arg1, .L1, .L2
.L1: context 1
cbr %arg1, .L5, .L4
.L2: call use, %r1
cbr %arg1, .L5, .L4
and at that point, the branch simplification we *do* have woudl see
that both of those duplicated conditional branches are now dominated
by the first one, and we'd end up with
cbr %arg1, .L1, .L2
.L1: context 1
br .L5
.L2: call use, %r1
br .L4
and then we'd join L2 and L4 together and the end result would look nice.
But as it stands now, L3 isn't dominated by one side of the original
conditional branch (because we reach L3 from both sides), and the cbr
in L3 isn't something we can simplify without duplicating it and
moving it up into the parents.
Is duplicating the conditional branch worth it? In this case, clearly
yes. But in general? Maybe we could do it in cases like this, when the
*only* thing in a basic block is that conditional branch.
Linus
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: 'context imbalance' warnings
2020-11-09 21:47 ` Linus Torvalds
@ 2020-11-09 21:57 ` Linus Torvalds
2020-11-09 22:13 ` Luc Van Oostenryck
1 sibling, 0 replies; 4+ messages in thread
From: Linus Torvalds @ 2020-11-09 21:57 UTC (permalink / raw)
To: Luc Van Oostenryck; +Cc: Sparse Mailing-list
On Mon, Nov 9, 2020 at 1:47 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Is duplicating the conditional branch worth it? In this case, clearly
> yes. But in general? Maybe we could do it in cases like this, when the
> *only* thing in a basic block is that conditional branch.
Actually, I think the correct thing to do would be not to move the
conditional branch into the parents, but instead:
(a) if we see an unconditional branch to a bb with (only) a conditional one
then
(b) we look if the current BB is dominated by the same conditional,
and if so, we follow the conditional one the right way.
and that would immediately get us to that
cbr %arg1, .L1, .L2
.L1: context 1
br .L5
.L2: call use, %r1
br .L4
state, without that possibly expanding "duplicate the conditional
branches into the parent bb's".
But the current flow rewriting isn't smart enough to do even this
fairly simple thing.
Linus
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: 'context imbalance' warnings
2020-11-09 21:47 ` Linus Torvalds
2020-11-09 21:57 ` Linus Torvalds
@ 2020-11-09 22:13 ` Luc Van Oostenryck
1 sibling, 0 replies; 4+ messages in thread
From: Luc Van Oostenryck @ 2020-11-09 22:13 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Sparse Mailing-list
On Mon, Nov 09, 2020 at 01:47:24PM -0800, Linus Torvalds wrote:
> On Mon, Nov 9, 2020 at 1:16 PM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > There is no phi-source here and the CBR at .L3 could be merged
> > with the one at .L0, removing the false imbalance but it's not.
> > I thought that sparse was doing this sort of branch simplification
> > but it doesn't, or at least it doesn't in this (simple) situation.
>
> The branch rewriting sparse does is very simplistic, afaik. It's also
> confusingly written. I blame myself.
>
> Honestly, in that example you quote:
>
> cbr %arg1, .L1, .L2
> .L1: context 1
> br .L3
> .L2: call use, %r1
> br .L3
> .L3: cbr %arg1, .L5, .L4 ;; false imbalance here
>
> we never even try to combine the two cbr's, because they don't jump
> directly to each other.
>
> I think it would be easy to simplify if we just added some logic to
> change "unconditional branch to conditional branch", and moved the cbr
> up into L1 and L2, but I don't think we have any such logic (and it's
> a simplification that might end up being the opposite of a
> simplification - it would duplicate the conditional).
>
> So we could try to remove L3 entirely, by moving it into both parents:
>
> cbr %arg1, .L1, .L2
> .L1: context 1
> cbr %arg1, .L5, .L4
> .L2: call use, %r1
> cbr %arg1, .L5, .L4
Mmm yes, I see.
> and at that point, the branch simplification we *do* have woudl see
> that both of those duplicated conditional branches are now dominated
> by the first one, and we'd end up with
>
> cbr %arg1, .L1, .L2
> .L1: context 1
> br .L5
> .L2: call use, %r1
> br .L4
>
> and then we'd join L2 and L4 together and the end result would look nice.
I was simply thinking doing this directly. In the case of a simple
'diamond' (with the top and bottom CBR on the same conditional),
there is no dependency/dominance problems and it's enough to change
the target of the BR in the two side BBs.
> But as it stands now, L3 isn't dominated by one side of the original
> conditional branch (because we reach L3 from both sides), and the cbr
> in L3 isn't something we can simplify without duplicating it and
> moving it up into the parents.
>
> Is duplicating the conditional branch worth it? In this case, clearly
> yes. But in general? Maybe we could do it in cases like this, when the
> *only* thing in a basic block is that conditional branch.
I think the general solution is to look at the dominance tree:
if there is another conditional branch with the same condition and which
dominates the current, then it must be safe to duplicate because it
can be simplified away anyway. Otherwise, I would prefer to avoid
duplicating anything without some cost/benefit guarantee.
-- Luc
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-11-09 22:13 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-09 21:16 'context imbalance' warnings Luc Van Oostenryck
2020-11-09 21:47 ` Linus Torvalds
2020-11-09 21:57 ` Linus Torvalds
2020-11-09 22:13 ` Luc Van Oostenryck
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.