All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] sparse SSA construction
@ 2017-08-06 20:26 Luc Van Oostenryck
  2017-08-06 23:01 ` Christopher Li
  2017-08-15 13:41 ` Dibyendu Majumdar
  0 siblings, 2 replies; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-06 20:26 UTC (permalink / raw)
  To: linux-sparse; +Cc: Christopher Li, Luc Van Oostenryck

Since it seems to be some interest on the subject and
to avoid duplicated work, here is a rewrite of sparse's
construction of the SSA form.

It's not using one of the classical algorithm but is
using something newer, simpler and often faster.
It's main advantage beyond the simplicity is that you don't
need to first build the whole CFG & linearized code to
to directly throw it away (or heavily transform it) as
it builds the SSA form directly during the linearization.

It's not finished code but it's working well (and is
effectively a bit faster and use less memory).
What's interesting for sparse now is that:
- the phi-noes are correctly placed
- uninitialized variables can be handled much more easily.

I don't feel it's needed to patchbomb the ML for the moment
so I'll only give the URL to the repo:

  git://github.com/lucvoo/sparse.git sssa

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-06 20:26 [RFC] sparse SSA construction Luc Van Oostenryck
@ 2017-08-06 23:01 ` Christopher Li
  2017-08-06 23:44   ` Luc Van Oostenryck
  2017-08-15 13:41 ` Dibyendu Majumdar
  1 sibling, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-06 23:01 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

On Sun, Aug 6, 2017 at 4:26 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> Since it seems to be some interest on the subject and
> to avoid duplicated work, here is a rewrite of sparse's
> construction of the SSA form.
>
> It's not using one of the classical algorithm but is
> using something newer, simpler and often faster.
> It's main advantage beyond the simplicity is that you don't
> need to first build the whole CFG & linearized code to
> to directly throw it away (or heavily transform it) as
> it builds the SSA form directly during the linearization.

OK, great.

May I ask the question that, the SSA form you are building
from this codes. Does it violate the SSA dominance property
define as:
==================================
Φ-Functions are placed in all basic blocks of the
Dominance Frontier.

Dominance Property of SSA:
1. If x is used in a Phi-function in block N,
then the definition of x dominates *every* <==== notice "every"
predecessor of N.
2. If x is used in a non-Phi statement in N,
then the definition of x dominates N
==================================

In other word, does it produce the "usage before
define" IR similar to this?

        and.32      %r3 <- %r4, $-65
         or.32       %r4 <- %r3, $64

> It's not finished code but it's working well (and is
> effectively a bit faster and use less memory).

The correct behavior is the most important one.
Then performance.

> What's interesting for sparse now is that:
> - the phi-noes are correctly placed

In the dominance frontier, right? That is great.

> - uninitialized variables can be handled much more easily.

That I am very curios how it was done. See the above
question. I might not have time to go through your code
very quickly. I will do my best when I have some time.

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-06 23:01 ` Christopher Li
@ 2017-08-06 23:44   ` Luc Van Oostenryck
  2017-08-07  0:33     ` Christopher Li
  0 siblings, 1 reply; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-06 23:44 UTC (permalink / raw)
  To: Christopher Li; +Cc: Linux-Sparse

On Sun, Aug 06, 2017 at 07:01:29PM -0400, Christopher Li wrote:
> On Sun, Aug 6, 2017 at 4:26 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> > Since it seems to be some interest on the subject and
> > to avoid duplicated work, here is a rewrite of sparse's
> > construction of the SSA form.
> >
> > It's not using one of the classical algorithm but is
> > using something newer, simpler and often faster.
> > It's main advantage beyond the simplicity is that you don't
> > need to first build the whole CFG & linearized code to
> > to directly throw it away (or heavily transform it) as
> > it builds the SSA form directly during the linearization.
> 
> OK, great.
> 
> May I ask the question that, the SSA form you are building
> from this codes. Does it violate the SSA dominance property

Of course not. Unless there is a bug, that is.

> In other word, does it produce the "usage before
> define" IR similar to this?
> 
>         and.32      %r3 <- %r4, $-65
>          or.32       %r4 <- %r3, $64

It gives the code as I showed as reply to Dibyendu's mail.
For example, his code gives:
        foo:
        .L0:
                <entry-point>
                and.32      %r2 <- UNDEF, $-65
                or.32       %r3 <- %r2, $64
                lsr.32      %r4 <- %r3, $6
                cast.1      %r5 <- (32) %r4
                cast.32     %r6 <- (1) %r5
                setne.32    %r7 <- %r6, $1
                ret.32      %r7

with UNDEF being a new type of pseudo.

On purpose, nothing is done yet with the UNDEFs.

> That I am very curios how it was done. See the above
> question.

IMO, the best is to read the article. My code is
still very close to the article's pseudo-code.

Basically, it's a lazy approach. Things are done
when possible and are delayed if not. 
More specifically, you can do what is needed in a BB
once you have all the parents, otherwise you need to
delay things. The article call this 'BB is sealed'
and there is a 'seal' operation that need to be called
to process what was delayed.
Complications arise with gotos (and the article doesn't
talk about them) but I've added a solution for it
(but I would like to return to it later).

From what I've seen/heard this algo have been pretty well
received. Personnaly, I'm very happy with its simplicity
and its result. 

One thing that is missing is the more complex cases
with trivial phi-nodes.

Another thing I would like to change is the ptrmap
implementation I've added: it need to be hash-table based
(but it's only an implementation detail that matters
for performance on really big functions).

-- Luc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-06 23:44   ` Luc Van Oostenryck
@ 2017-08-07  0:33     ` Christopher Li
  2017-08-07  1:21       ` Luc Van Oostenryck
  0 siblings, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-07  0:33 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

On Sun, Aug 6, 2017 at 7:44 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>>
>> May I ask the question that, the SSA form you are building
>> from this codes. Does it violate the SSA dominance property
>
> Of course not. Unless there is a bug, that is.

I am very glad to heard that :-)

>
>> In other word, does it produce the "usage before
>> define" IR similar to this?
>>
>>         and.32      %r3 <- %r4, $-65
>>          or.32       %r4 <- %r3, $64
>
> It gives the code as I showed as reply to Dibyendu's mail.
> For example, his code gives:
>         foo:
>         .L0:
>                 <entry-point>
>                 and.32      %r2 <- UNDEF, $-65
>                 or.32       %r3 <- %r2, $64
>                 lsr.32      %r4 <- %r3, $6
>                 cast.1      %r5 <- (32) %r4
>                 cast.32     %r6 <- (1) %r5
>                 setne.32    %r7 <- %r6, $1
>                 ret.32      %r7
>
> with UNDEF being a new type of pseudo.

OK. So far so good.

>
> On purpose, nothing is done yet with the UNDEFs.
>
>> That I am very curios how it was done. See the above
>> question.
>
> IMO, the best is to read the article. My code is
> still very close to the article's pseudo-code.
>
> Basically, it's a lazy approach. Things are done
> when possible and are delayed if not.

So you are building the SSA for the memory variable
I assume?

Directly from the linearize process?

> More specifically, you can do what is needed in a BB
> once you have all the parents, otherwise you need to
> delay things. The article call this 'BB is sealed'
> and there is a 'seal' operation that need to be called
> to process what was delayed.
> Complications arise with gotos (and the article doesn't
> talk about them) but I've added a solution for it
> (but I would like to return to it later).

I am missing the obvious. Where is this article?

>
> From what I've seen/heard this algo have been pretty well
> received. Personnaly, I'm very happy with its simplicity
> and its result.

Great!

>
> One thing that is missing is the more complex cases
> with trivial phi-nodes.

Can you clarify?

>
> Another thing I would like to change is the ptrmap
> implementation I've added: it need to be hash-table based
> (but it's only an implementation detail that matters
> for performance on really big functions).

Which ptrmap? The one used for CSE?

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-07  0:33     ` Christopher Li
@ 2017-08-07  1:21       ` Luc Van Oostenryck
  2017-08-07  1:44         ` Christopher Li
  0 siblings, 1 reply; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-07  1:21 UTC (permalink / raw)
  To: Christopher Li; +Cc: Linux-Sparse

On Sun, Aug 06, 2017 at 08:33:51PM -0400, Christopher Li wrote:
> On Sun, Aug 6, 2017 at 7:44 PM, Luc Van Oostenryck wrote:
>
> >> That I am very curios how it was done. See the above
> >> question.
> >
> > IMO, the best is to read the article. My code is
> > still very close to the article's pseudo-code.
> >
> > Basically, it's a lazy approach. Things are done
> > when possible and are delayed if not.
> 
> So you are building the SSA for the memory variable
> I assume?

No, directly from symbol to pseudos.
It's the beauty of this approach.
 
> Directly from the linearize process?

Yes, only the add_load() & add_store() is modified and 
instead of storing stuff in memory, it's directly translated
to pseudos & phi-nodes are created when needed.

> > More specifically, you can do what is needed in a BB
> > once you have all the parents, otherwise you need to
> > delay things. The article call this 'BB is sealed'
> > and there is a 'seal' operation that need to be called
> > to process what was delayed.
> > Complications arise with gotos (and the article doesn't
> > talk about them) but I've added a solution for it
> > (but I would like to return to it later).
> 
> I am missing the obvious. Where is this article?

I've put the reference at the top of the new file sssa.c:
/*
 * This is an implementation of the SSA construction algorithm:
 *	"Simple and Efficient Construction of Static Single Assignment Form"
 * by Matthias Braun, Sebastian Buchwald, Sebastian Hack, Roland Leissa,
 *    Christoph Mallon and Andreas Zwinkau.
 * cfr. http://www.cdl.uni-saarland.de/papers/bbhlmz13cc.pdf
 */

> > From what I've seen/heard this algo have been pretty well
> > received. Personnaly, I'm very happy with its simplicity
> > and its result.
> 
> Great!
> 
> >
> > One thing that is missing is the more complex cases
> > with trivial phi-nodes.
> 
> Can you clarify?

It's a question of related cycles of phi-nodes.
Some can be simplified away, it's described in the article.
The simple case of a simple cycle is already handled.

> > Another thing I would like to change is the ptrmap
> > implementation I've added: it need to be hash-table based
> > (but it's only an implementation detail that matters
> > for performance on really big functions).
> 
> Which ptrmap? The one used for CSE?

No, no.
I needed a map between symbols and pseudos.
It's in a separate file: ptrmap.c

-- Luc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-07  1:21       ` Luc Van Oostenryck
@ 2017-08-07  1:44         ` Christopher Li
  0 siblings, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-07  1:44 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

On Sun, Aug 6, 2017 at 9:21 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:

>
> No, directly from symbol to pseudos.
> It's the beauty of this approach.

I have to read more when I have time.
>
>> Directly from the linearize process?
>
> Yes, only the add_load() & add_store() is modified and
> instead of storing stuff in memory, it's directly translated
> to pseudos & phi-nodes are created when needed.

How does it deal with pointer usage analyze?

>> > More specifically, you can do what is needed in a BB
>> > once you have all the parents, otherwise you need to
>> > delay things. The article call this 'BB is sealed'
>> > and there is a 'seal' operation that need to be called
>> > to process what was delayed.
>> > Complications arise with gotos (and the article doesn't
>> > talk about them) but I've added a solution for it
>> > (but I would like to return to it later).
>>
>> I am missing the obvious. Where is this article?
>
> I've put the reference at the top of the new file sssa.c:
> /*
>  * This is an implementation of the SSA construction algorithm:
>  *      "Simple and Efficient Construction of Static Single Assignment Form"
>  * by Matthias Braun, Sebastian Buchwald, Sebastian Hack, Roland Leissa,
>  *    Christoph Mallon and Andreas Zwinkau.
>  * cfr. http://www.cdl.uni-saarland.de/papers/bbhlmz13cc.pdf
>  */

Interesting. I will take a look.

>> > Another thing I would like to change is the ptrmap
>> > implementation I've added: it need to be hash-table based
>> > (but it's only an implementation detail that matters
>> > for performance on really big functions).
>>
>> Which ptrmap? The one used for CSE?
>
> No, no.
> I needed a map between symbols and pseudos.
> It's in a separate file: ptrmap.c

OK, there is a back end pointer in symbol->aux
you can use to store information relate to back end.
pseudo->priv is the similar deal.

Using this two pointers do you still need the ptrmap?

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-06 20:26 [RFC] sparse SSA construction Luc Van Oostenryck
  2017-08-06 23:01 ` Christopher Li
@ 2017-08-15 13:41 ` Dibyendu Majumdar
  2017-08-15 13:59   ` Christopher Li
  1 sibling, 1 reply; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-15 13:41 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse, Christopher Li

Hi Luc,

On 6 August 2017 at 21:26, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> Since it seems to be some interest on the subject and
> to avoid duplicated work, here is a rewrite of sparse's
> construction of the SSA form.
>
> It's not using one of the classical algorithm but is
> using something newer, simpler and often faster.
> It's main advantage beyond the simplicity is that you don't
> need to first build the whole CFG & linearized code to
> to directly throw it away (or heavily transform it) as
> it builds the SSA form directly during the linearization.
>
> It's not finished code but it's working well (and is
> effectively a bit faster and use less memory).
> What's interesting for sparse now is that:
> - the phi-noes are correctly placed
> - uninitialized variables can be handled much more easily.
>
> I don't feel it's needed to patchbomb the ML for the moment
> so I'll only give the URL to the repo:
>
>   git://github.com/lucvoo/sparse.git sssa

I have merged the new SSA implementation in your sssa-mini-clean
branch - which I understand is minimal set of changes needed for the
new SSA on top of RC5 - into my project dmrC.

I have been testing the changes  - and so far I am pleased to report
that after a small set of changes (described below) - all my existing
tests pass.

a) I had to disallow struct / union types from being treated as simple types.
b) I am setting UNDEF pseudos a value of 0 in LLVM backend - this is
to ensure I can do the same in other backends.
c) I found that CBR instructions can get a PSEUDO_VAL as the condition
which was not being handled correctly in my version of LLVM backend.
Somehow this did not occur in the past so I am not yet sure how this
is related to the changes.

The new implementation does not appear to suffer from the performance
degradation we saw after removing the single store shortcut.

Finally the new code seems simpler and elegant. I would suggest adding
some comments in ssa.c - perhaps copy the pseudo code from the paper
it is based on - to better explain what is going on.

Many thanks for this great piece of work.

I do recommend submitting a patch of sssa-mini-clean - as this is the
minimal set on top of RC5. By integrating this post 0.5.1 release some
amount of time can be spent testing and validating this before other
changes are done.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 13:41 ` Dibyendu Majumdar
@ 2017-08-15 13:59   ` Christopher Li
  2017-08-15 14:06     ` Dibyendu Majumdar
  0 siblings, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-15 13:59 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linux-Sparse

On Tue, Aug 15, 2017 at 9:41 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>> I don't feel it's needed to patchbomb the ML for the moment
>> so I'll only give the URL to the repo:
>>
>>   git://github.com/lucvoo/sparse.git sssa
>
> I have merged the new SSA implementation in your sssa-mini-clean
> branch - which I understand is minimal set of changes needed for the
> new SSA on top of RC5 - into my project dmrC.
>
> I have been testing the changes  - and so far I am pleased to report
> that after a small set of changes (described below) - all my existing
> tests pass.
>

Thanks for the report and testing.

> a) I had to disallow struct / union types from being treated as simple types.
> b) I am setting UNDEF pseudos a value of 0 in LLVM backend - this is
> to ensure I can do the same in other backends.
> c) I found that CBR instructions can get a PSEUDO_VAL as the condition
> which was not being handled correctly in my version of LLVM backend.
> Somehow this did not occur in the past so I am not yet sure how this
> is related to the changes.
>
> The new implementation does not appear to suffer from the performance
> degradation we saw after removing the single store shortcut.

That is good to hear. With the shortcut removed, sparse actually suffer
a lot of finding store/load domination for large graphs. The recursive
way of finding domination has no cache acceleration at all.
There is a lot of room to improvement.


>
> Finally the new code seems simpler and elegant. I would suggest adding
> some comments in ssa.c - perhaps copy the pseudo code from the paper
> it is based on - to better explain what is going on.

Yes, I think that is helpful. I am reading the paper right now.

This ssa conversion will need to break into smaller piece for review
and merge. The general recommended number of patch for the first
round is 15 or so at a time in the Linux kernel submmitting-patches.rst
documentation.


Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 13:59   ` Christopher Li
@ 2017-08-15 14:06     ` Dibyendu Majumdar
  2017-08-15 14:07       ` Christopher Li
  0 siblings, 1 reply; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-15 14:06 UTC (permalink / raw)
  To: Christopher Li; +Cc: Luc Van Oostenryck, Linux-Sparse

Hi Chris,

On 15 August 2017 at 14:59, Christopher Li <sparse@chrisli.org> wrote:
> This ssa conversion will need to break into smaller piece for review
> and merge. The general recommended number of patch for the first
> round is 15 or so at a time in the Linux kernel submmitting-patches.rst
> documentation.
>

The good news is that the minimal patch is quite small. I was
surprised by how small it is.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 14:06     ` Dibyendu Majumdar
@ 2017-08-15 14:07       ` Christopher Li
  2017-08-15 14:09         ` Dibyendu Majumdar
  0 siblings, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-15 14:07 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linux-Sparse

On Tue, Aug 15, 2017 at 10:06 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:

>
> The good news is that the minimal patch is quite small. I was
> surprised by how small it is.
>
That is good new. Which minimal patch branch are you referring to?
I know Luc has a few branches regarding the ssa conversion.

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 14:07       ` Christopher Li
@ 2017-08-15 14:09         ` Dibyendu Majumdar
  2017-08-15 14:18           ` Christopher Li
  2017-08-15 18:36           ` Linus Torvalds
  0 siblings, 2 replies; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-15 14:09 UTC (permalink / raw)
  To: Christopher Li; +Cc: Luc Van Oostenryck, Linux-Sparse

Hi Chris,

On 15 August 2017 at 15:07, Christopher Li <sparse@chrisli.org> wrote:
> On Tue, Aug 15, 2017 at 10:06 AM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>> The good news is that the minimal patch is quite small. I was
>> surprised by how small it is.
>>
> That is good new. Which minimal patch branch are you referring to?
> I know Luc has a few branches regarding the ssa conversion.
>

https://github.com/lucvoo/sparse/tree/sssa-mini-clean

This is built on top of RC5 so merge will be quite straightforward.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 14:09         ` Dibyendu Majumdar
@ 2017-08-15 14:18           ` Christopher Li
  2017-08-15 18:36           ` Linus Torvalds
  1 sibling, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-15 14:18 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linux-Sparse

On Tue, Aug 15, 2017 at 10:09 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>>
>
> https://github.com/lucvoo/sparse/tree/sssa-mini-clean
>
> This is built on top of RC5 so merge will be quite straightforward.

Sure, I am going to take a look at that after I am done with the
paper.

Thanks

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 14:09         ` Dibyendu Majumdar
  2017-08-15 14:18           ` Christopher Li
@ 2017-08-15 18:36           ` Linus Torvalds
  2017-08-15 20:14             ` Luc Van Oostenryck
  2017-08-15 20:37             ` Luc Van Oostenryck
  1 sibling, 2 replies; 36+ messages in thread
From: Linus Torvalds @ 2017-08-15 18:36 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Christopher Li, Luc Van Oostenryck, Linux-Sparse

On Tue, Aug 15, 2017 at 7:09 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>
> https://github.com/lucvoo/sparse/tree/sssa-mini-clean

I like it. But the commit messages are sorely lacking. And I'm not
talking just about the lack of sign-off's, I'm talking about the
messages just not explaining what is going on AT ALL.

But with that fixed, I heartily recommend merging the new ssa
construction code. It's based on a real paper, rather than being
ad-hoc.

                 Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 18:36           ` Linus Torvalds
@ 2017-08-15 20:14             ` Luc Van Oostenryck
  2017-08-15 20:43               ` Linus Torvalds
  2017-08-16 11:02               ` Dibyendu Majumdar
  2017-08-15 20:37             ` Luc Van Oostenryck
  1 sibling, 2 replies; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-15 20:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dibyendu Majumdar, Christopher Li, Linux-Sparse

On Tue, Aug 15, 2017 at 8:36 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>> https://github.com/lucvoo/sparse/tree/sssa-mini-clean
>
> I like it. But the commit messages are sorely lacking. And I'm not
> talking just about the lack of sign-off's, I'm talking about the
> messages just not explaining what is going on AT ALL.

I know, sorry for that, I don't like it myself (often I've the
impression to spend more time to wrote a decent commit
message than to write the corresponding code).
I'll work on that tomorrow (Europe here, already evening, sorry).

One of the thing I'm afraid won't be able to do (without a quite
a bit more work) is to do some nice small step-by-step commits.
It's completely changing the method, a step-by-step will be very
artificial and quite useless, I think.

> But with that fixed, I heartily recommend merging the new ssa
> construction code. It's based on a real paper, rather than being
> ad-hoc.

It's just too bad that the paper simply ignores gotos. The method
is really cool, fast, simple and clean, I love it. But I had to add
a few things, more akin to kludges than hacks IMO, to make it
work with gotos. But as is, it works pretty well, is fast and produce
correct SSA (also avoid few quadratic behaviours the current
simplify_one_symbol() has).

Improvement areas are:
- I feel we can do better with gotos and computed gotos
  it needs investigation
- the ptrmap thing need an hash based implementation
  but this is only needed for extreme cases. The trick will
  be to do something with dynamic size and which will not
  waste (too much)  memory. Nothing hard, but it will need
  some experiment  and I want first to collect some statistics
- there is an optimization that can be added for (what I call)
  'complex trivial phi-nodes' (it's clear in the paper, I think)

But none of this is blocking, just something that can easily
be added later.

The current situation is:
- As far as I saw the generated IR is correct (from a phi-node
  point of view).
- The performance is pretty nice (faster than the current method)
- There are some optimizations for free because the method
  do a kind of global-numbering.
- It does very well on a kernel allyesconfig. It just creates
  something like 8 or 12 more context/locking warnings I
  would like to understand (OTOH, knowing how broken
  the current SSA is, I'm surprised there isn't more differences).

For information, the current SSA construction is very broken,
both in simplify_one_symbol() (the main SSA construction)
and simplify_loads() (which also creates phi-nodes). They both
are 'load-driven' which is OK but they both put the phi-nodes
where the load is while phi-nodes need to placed where branches
meet. The loads are often placed in a BB following where the
join-point. Often this *seems* to be OK but in fact is very wrong
and create all sort of problems a bit everywhere

Best regards,
-- Luc Van Oostenryck

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 18:36           ` Linus Torvalds
  2017-08-15 20:14             ` Luc Van Oostenryck
@ 2017-08-15 20:37             ` Luc Van Oostenryck
  1 sibling, 0 replies; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-15 20:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dibyendu Majumdar, Christopher Li, Linux-Sparse

On Tue, Aug 15, 2017 at 8:36 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I like it. But the commit messages are sorely lacking. And I'm not
> talking just about the lack of sign-off's, I'm talking about the
> messages just not explaining what is going on AT ALL.

My 'real' patches are much more 'down-broken' (is that a word?)
but you need to have the whole to not have regressions in the
testsuite. So, in this ssa-mini[-clean], I just squashed a lot of
commits together.

I'm not really sure what can be done in this case (albeit all my
love for small, incremental changes).

-- Luc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 20:14             ` Luc Van Oostenryck
@ 2017-08-15 20:43               ` Linus Torvalds
  2017-08-15 21:43                 ` Luc Van Oostenryck
                                   ` (2 more replies)
  2017-08-16 11:02               ` Dibyendu Majumdar
  1 sibling, 3 replies; 36+ messages in thread
From: Linus Torvalds @ 2017-08-15 20:43 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Dibyendu Majumdar, Christopher Li, Linux-Sparse

On Tue, Aug 15, 2017 at 1:14 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> For information, the current SSA construction is very broken,
> both in simplify_one_symbol() (the main SSA construction)
> and simplify_loads() (which also creates phi-nodes). They both
> are 'load-driven' which is OK but they both put the phi-nodes
> where the load is while phi-nodes need to placed where branches
> meet. The loads are often placed in a BB following where the
> join-point. Often this *seems* to be OK but in fact is very wrong
> and create all sort of problems a bit everywhere

I do think you're too hung up about the placement of the phi nodes.

It may be an issue for simple models that just walk the linearized
code directly and try to turn it into code. But the original intent of
the existing phi node code was to just be as sources of the pseudo's -
the location should be largely immaterial.

IOW, you could think of the phi nodes as being "outside" the
instruction flow entirely, and just being the definition of the pseudo
they define. They have to be placed somewhere, but the "somewhere" is
somewhat arbitrary.

That said, I like your new ssa construction -independently- of that
issue - the old SSA constructor really was very ad-hoc. Look through
the git history if you don't believe me ;)

             Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 20:43               ` Linus Torvalds
@ 2017-08-15 21:43                 ` Luc Van Oostenryck
  2017-08-15 22:44                   ` Dibyendu Majumdar
  2017-08-16  5:15                   ` Christopher Li
  2017-08-16  4:23                 ` Christopher Li
  2017-08-16  6:41                 ` Luc Van Oostenryck
  2 siblings, 2 replies; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-15 21:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dibyendu Majumdar, Christopher Li, Linux-Sparse

On Tue, Aug 15, 2017 at 10:43 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Aug 15, 2017 at 1:14 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
>>
>> For information, the current SSA construction is very broken,
>> both in simplify_one_symbol() (the main SSA construction)
>> and simplify_loads() (which also creates phi-nodes). They both
>> are 'load-driven' which is OK but they both put the phi-nodes
>> where the load is while phi-nodes need to placed where branches
>> meet. The loads are often placed in a BB following where the
>> join-point. Often this *seems* to be OK but in fact is very wrong
>> and create all sort of problems a bit everywhere
>
> I do think you're too hung up about the placement of the phi nodes.
>
> It may be an issue for simple models that just walk the linearized
> code directly and try to turn it into code. But the original intent of
> the existing phi node code was to just be as sources of the pseudo's -
> the location should be largely immaterial.

Yes, what I thought for a long while,
But when things like try_ti_simplify_bb() come in play,
with pack_basic_blocks() on top the things become
*very* messy I often you can't make sense of it anymore.

This has also consequences on liveness (but it can't explain
the root of the problem there, i's month's I realized it was useless.

> IOW, you could think of the phi nodes as being "outside" the
> instruction flow entirely, and just being the definition of the pseudo
> they define. They have to be placed somewhere, but the "somewhere" is
> somewhat arbitrary.

It's true to a certain point, after it, you need to have them placed at
the confluence point. I can't explain better for now. I'll have an
example tomorrow.


> That said, I like your new ssa construction -independently- of that
> issue - the old SSA constructor really was very ad-hoc. Look through
> the git history if you don't believe me ;)

Oh I believe ve *and* I have view the logs.

Not directly related to that is that I really think we need to associate
a ph-node's source to it's parent BB.

So if you have something:

|      |      |
A    B    C
\      |     /
       *rx: phi, *phsra, %phisrb, %phisrcx ,

We really want something like:
|      |      |
A    B    C
\      |     /
       *rx: phi, A:*phsra, B:%phisrb, C:%phisrcx
It's essential for a number of things (I suspect a number
of SSA algorithm, for liveliness , writing a trivial simulator, ....)

-- Luc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 21:43                 ` Luc Van Oostenryck
@ 2017-08-15 22:44                   ` Dibyendu Majumdar
  2017-08-16  5:36                     ` Christopher Li
  2017-08-16  5:15                   ` Christopher Li
  1 sibling, 1 reply; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-15 22:44 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linus Torvalds, Christopher Li, Linux-Sparse

Hi,

I think it would be good to perform a detailed comparison / assessment
of where the problem lies currently versus the new solution. I get the
impression that the current problem is not well explained or
understood.

Here are two known issues at present. You can see the old linearized
output and the new linearized output - both produced with
simplifications switched off.

C code:
https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/bugs/simplifybug.c

Old linearized output:
https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/bugs/simplifybug_old.lin

New linearized output:
https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/bugs/simplifybug_new.lin


C code:
https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/set1/onebit.c

Old linearized output:
https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/set1/onebit_old.lin

New linearized output:
https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/set1/onebit_new.lin


The two situations where I see problems currently are exemplified
above. In the first case the outputs are clearly different - and here
the issue appears to be how undefined pseudos are handled. In the
second case, the original linear output seems identical to the new
output - so the issue probably lies in the single store shortcut.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 20:43               ` Linus Torvalds
  2017-08-15 21:43                 ` Luc Van Oostenryck
@ 2017-08-16  4:23                 ` Christopher Li
  2017-08-16  4:58                   ` Christopher Li
  2017-08-16  6:41                 ` Luc Van Oostenryck
  2 siblings, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-16  4:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Luc Van Oostenryck, Dibyendu Majumdar, Linux-Sparse

On Tue, Aug 15, 2017 at 4:43 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> I do think you're too hung up about the placement of the phi nodes.
>
> It may be an issue for simple models that just walk the linearized
> code directly and try to turn it into code. But the original intent of
> the existing phi node code was to just be as sources of the pseudo's -
> the location should be largely immaterial.
>
> IOW, you could think of the phi nodes as being "outside" the
> instruction flow entirely, and just being the definition of the pseudo
> they define. They have to be placed somewhere, but the "somewhere" is
> somewhat arbitrary.

I am actually with Luc on this one. The placement of the phi node
is critical if you want to do the minimal phi node implementation.
Even you are not doing the minimal phi node, you can't do less than
the minimal phi node.

Ignore the not reachable phi node for the following discussion.  if we
are missing one of the phi node where the minimal phi node place it
(aka dominance frontier) . We are doing some thing wrong, the SSA
dominance property has been violated.

To reason about this, we can reference back to the SSA dominance
property. When dominance of a definition ends, (in other words, dominance
frontier). It need to place a phi node to join the other definition of
the variable
from all predecessors. Without that particular phi node, all successor
of this join block will no longer able to distinguish how the multiple define
merge. We can have more phi node than this, but the phi node on the
dominance frontier *must not* be skipped. Skipped it cause SSA dominance
property being violated. The variable is not static any more.

If sparse only place the phi node on where the loads are, which a lot of
case happen to be the DF (dominance frontier). In the case that DF
does not have loads, that is where current sparse do wrong, sparse
does not place a phi node  there. Which violate the SSA property.

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16  4:23                 ` Christopher Li
@ 2017-08-16  4:58                   ` Christopher Li
  2017-08-16 10:40                     ` Dibyendu Majumdar
  0 siblings, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-16  4:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Luc Van Oostenryck, Dibyendu Majumdar, Linux-Sparse

On Wed, Aug 16, 2017 at 12:23 AM, Christopher Li <sparse@chrisli.org> wrote:
> If sparse only place the phi node on where the loads are, which a lot of
> case happen to be the DF (dominance frontier). In the case that DF
> does not have loads, that is where current sparse do wrong, sparse
> does not place a phi node  there. Which violate the SSA property.

I give an example for the above situation I am talking about:

int a, b, c, d;

void foo(void)
{
  int e, f;
  if (a)
     e = b;
  else
     e = c;
  if (b)
     c = a+b;
  d = e;
}

"if (b)" is where the dominance of "e = b" and "e =c" joined.
I don't do any loading of "e" there.
To insert some blocks before using "e", I use the "if (b)".

Now let's look at the test-linearize output:
foo:
.L0:
<entry-point>
load.32     %r1(a) <- 0[a]
cbr         %r1(a), .L1, .L2

.L1:
load.32     %r2(e) <- 0[b]
phisrc.32   %phi3(b) <- %r2(e)
phisrc.32   %phi4(e) <- %r2(e)
phisrc.32   %phi6(e) <- %r2(e)
phisrc.32   %phi7(e) <- %r2(e)
br          .L3

.L2:
load.32     %r3 <- 0[c]
phisrc.32   %phi5(e) <- %r3
br          .L3

.L3:
load.32     %r4(b) <- 0[b] <========  notice no phi node here.
          It is required because here it knows which edge
          it arrived. From L1, use %r2, from L2 use %r3.


cbr         %r4(b), .L4, .L5

.L4:
add.32      %r7 <- %r1(a), %r4(b)
store.32    %r7 -> 0[c]
br          .L5

.L5:
phi.32      %r8 <- %phi4(e), %phi5(e) <====== phi node here
store.32    %r8 -> 0[d]
ret

Notice how phi node is too late in .L5.

L5 does not have the edge information how
L1 and L2 joined. That information is at L3.

The phi source is not going to remedy this.
Because phi source is not going to know how
the phi source is going to join other phi source in
the later blocks.

The edge information is the one that is critically
missing here.

Actually, if we do SSA properly, we don't need the
phi source at all. We can directly use the source of
the phisource, which is %r2 and %r3.
if arrive in L3 from L1 use %r2, from L2 using %r3.


Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 21:43                 ` Luc Van Oostenryck
  2017-08-15 22:44                   ` Dibyendu Majumdar
@ 2017-08-16  5:15                   ` Christopher Li
  1 sibling, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-16  5:15 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linus Torvalds, Dibyendu Majumdar, Linux-Sparse

On Tue, Aug 15, 2017 at 5:43 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> Yes, what I thought for a long while,
> But when things like try_ti_simplify_bb() come in play,
> with pack_basic_blocks() on top the things become
> *very* messy I often you can't make sense of it anymore.

It is not only messy, it is plain wrong that SSA dominance property
has been violated.

>> IOW, you could think of the phi nodes as being "outside" the
>> instruction flow entirely, and just being the definition of the pseudo
>> they define. They have to be placed somewhere, but the "somewhere" is
>> somewhat arbitrary.
>
> It's true to a certain point, after it, you need to have them placed at
> the confluence point. I can't explain better for now. I'll have an
> example tomorrow.

I give one example in my other email. I think that arbitrary view is
actually wrong:

phi node is a function that select value depend on which edge the
control flow come from. Place it arbitrary cause the phi node
does not have access the edge information how the define joined.
It need to be place at the dominance frontier, which just barely
out of reach of the dominance of the define.

> Not directly related to that is that I really think we need to associate
> a ph-node's source to it's parent BB.

Exactly. Notice that parent ( predecessor) might not be the BB
that define the variable. It is where the definition not longer
dominance (dominance frontier).

Again, the simple way to explain this, it is the requirement of
the SSA dominance property.

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 22:44                   ` Dibyendu Majumdar
@ 2017-08-16  5:36                     ` Christopher Li
  0 siblings, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-16  5:36 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linus Torvalds, Linux-Sparse

On Tue, Aug 15, 2017 at 6:44 PM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
> Hi,
>
> I think it would be good to perform a detailed comparison / assessment
> of where the problem lies currently versus the new solution. I get the
> impression that the current problem is not well explained or
> understood.

I haven't look into the weather the new implementation does every thing
right yet. I can explain pretty well how the current implementation of SSA
is wrong. In short it is violating the SSA dominance property.
More specifically.
1) We have the "usage before define". The crazy programmer bug is
    trigger by this. Hopefully remove the single store short cut is going
    to help. My impression is that there is still more offender of "usage
    before define".

2) We don't place phi node at dominance frontier of the pseudo definition.

3) Uninitialized variable need to be treated as having implicate defined value
    of "uninitialzed" at entry point.

> Here are two known issues at present. You can see the old linearized
> output and the new linearized output - both produced with
> simplifications switched off.
>
> C code:
> https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/bugs/simplifybug.c
>
> Old linearized output:
> https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/bugs/simplifybug_old.lin
>
> New linearized output:
> https://github.com/dibyendumajumdar/dmr_c/blob/newssa/tests/bugs/simplifybug_new.lin

I haven't have time to take a closer look of the output.
The above 3 points should cover what is dong wrong with our SSA form.
That 3 points is also easier to reason than staring at the low level byte codes.

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 20:43               ` Linus Torvalds
  2017-08-15 21:43                 ` Luc Van Oostenryck
  2017-08-16  4:23                 ` Christopher Li
@ 2017-08-16  6:41                 ` Luc Van Oostenryck
  2 siblings, 0 replies; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-16  6:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dibyendu Majumdar, Christopher Li, Linux-Sparse

On Tue, Aug 15, 2017 at 01:43:15PM -0700, Linus Torvalds wrote:
> On Tue, Aug 15, 2017 at 1:14 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > For information, the current SSA construction is very broken,
> > both in simplify_one_symbol() (the main SSA construction)
> > and simplify_loads() (which also creates phi-nodes). They both
> > are 'load-driven' which is OK but they both put the phi-nodes
> > where the load is while phi-nodes need to placed where branches
> > meet. The loads are often placed in a BB following where the
> > join-point. Often this *seems* to be OK but in fact is very wrong
> > and create all sort of problems a bit everywhere
> 
> I do think you're too hung up about the placement of the phi nodes.
> 
> It may be an issue for simple models that just walk the linearized
> code directly and try to turn it into code. But the original intent of
> the existing phi node code was to just be as sources of the pseudo's -
> the location should be largely immaterial.

I somehow disagree here.
A:	B:	C:
Va = a;	Vb = b;	Vc = c;
  |	  |	  |
   \      |      / 
    \     |     / 
     \    |    / 
      \   |   / 
       \  |  / 
	M:
	  |
	  |
	  |
	  |
	N:
	  V = PHI(Va, Vb, Vc)

Now what must happen if B: has to be removed?

A:		C:
Va = a;		Vc = c;
  |	  	  |
   \             / 
    \           / 
     \         / 
      \       / 
       \     / 
	M:
	  |
	  |
	  |
	  |
	N:
	  V = PHI(Va, VOID, Vc)

where VOID represent here a non-present element. It's still 'correct'
You have to interpret the void as 'nothing here anymore'

One of the funnies begin if we now have to add a branch D to M

A:	C:	D:
Va = a;	Vc = c;	Vd = d;
  |	  |	  |
   \      |      / 
    \     |     / 
     \    |    / 
      \   |   / 
       \  |  / 
	M:
	  V' = PHI(Va, Vc, Vd)
	  |
	  |
	  |
	  |
	  |
	N:
	  V = PHI(Va, VOID, Vc)

What for V if we came from branch D?

You can also see things like:


     Vd   Vb   Ve 
       \  |  / 
	O:
     Va   |   Vc 
       \  |  / 
	M:
	  |
	  |
	  |
	N:
	  V = PHI(Va, Vc, Vd, Vb, Ve)

In others words,s a phi-nodes with much more sources/args than the first
branching parent has.

We also have (after BB packing)
     Va   Vb   Vc 
       \  |  / 
	M:
	  V = PHI(Va, Vb, Vc)
          .....
	  .....
	  F' = PHI(Fa, Fc)

THus a phi-node in the middle of a BB (it was at top first but
two BB have been joined).


At best, it's hard to see any kind of plausibly correct interpretation,
often you can't.
Thing becomes really messy when a new branch is inserted somewhere
in the middle of the chain.

But I also agree that there is two problmes here:
- having a correct SSA form (maybe not the same as what can always
  be found in the literature but we need one on which we can reason)
- once the SSA construction is done, we now have ti maintain it
  each time we make a change to the CFG. I doubt we do this
  correctly.

 
> IOW, you could think of the phi nodes as being "outside" the
> instruction flow entirely, and just being the definition of the pseudo
> they define.

Right, they are not really part of the normal instruction flow.
I think it's even best to *not* see them as part of the instruction.
They are more akin to arguments receive by a function at its start.
The phi-node are then: heer are the values we for this BB if
we come from this or this path.
They also must have the multi-assignement semantics (but in sparse,
we don't care because there is always  this layer of phi-source)
> They have to be placed somewhere, but the "somewhere" is
> somewhat arbitrary.

I disagree here.
IMO, the somwhere has to be at the top of the BB where others
BBs meet and there each phi-node has to have as much sources
as there the BB has parents.
This is the only sane semantic.

> That said, I like your new ssa construction -independently- of that
> issue - the old SSA constructor really was very ad-hoc. Look through
> the git history if you don't believe me ;)

Thank you. I would have liked that the ideas where mine though!
(in fact, when I foud the article, I had some related ideas in head
triggered by one of your replies to me wheer you said 'you can't
directly translate local var inti pseudo).


Best regards,
-- Luc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16  4:58                   ` Christopher Li
@ 2017-08-16 10:40                     ` Dibyendu Majumdar
  2017-08-16 13:17                       ` Christopher Li
  0 siblings, 1 reply; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-16 10:40 UTC (permalink / raw)
  To: Christopher Li; +Cc: Linus Torvalds, Luc Van Oostenryck, Linux-Sparse

Hi Chris,

On 16 August 2017 at 05:58, Christopher Li <sparse@chrisli.org> wrote:
> On Wed, Aug 16, 2017 at 12:23 AM, Christopher Li <sparse@chrisli.org> wrote:
>> If sparse only place the phi node on where the loads are, which a lot of
>> case happen to be the DF (dominance frontier). In the case that DF
>> does not have loads, that is where current sparse do wrong, sparse
>> does not place a phi node  there. Which violate the SSA property.
>
> I give an example for the above situation I am talking about:
>
> int a, b, c, d;
>
> void foo(void)
> {
>   int e, f;
>   if (a)
>      e = b;
>   else
>      e = c;
>   if (b)
>      c = a+b;
>   d = e;
> }
>
> "if (b)" is where the dominance of "e = b" and "e =c" joined.
> I don't do any loading of "e" there.
> To insert some blocks before using "e", I use the "if (b)".
>

> Now let's look at the test-linearize output:

I think the linear output you obtained was post simplification. We
need to look at the output prior to any simplifications. I believe it
looks like this:

foo:
.L0:
        <entry-point>
        load.32     %r1 <- 0[a]
        cbr         %r1, .L1, .L2

.L1:
        load.32     %r2 <- 0[b]
        store.32    %r2 -> 0[e]
        br          .L3

.L2:
        load.32     %r3 <- 0[c]
        store.32    %r3 -> 0[e]
        br          .L3

.L3:
        load.32     %r4 <- 0[b]
        cbr         %r4, .L4, .L5

.L4:
        load.32     %r5 <- 0[a]
        load.32     %r6 <- 0[b]
        add.32      %r7 <- %r5, %r6
        store.32    %r7 -> 0[c]
        br          .L5

.L5:
        load.32     %r8 <- 0[e]
        store.32    %r8 -> 0[d]
        br          .L6

.L6:
        ret

And here is the output from the new SSA construction:

foo:
.L0:
        <entry-point>
        load.32     %r1 <- 0[a]
        cbr         %r1, .L1, .L2

.L1:
        load.32     %r2 <- 0[b]
        phisrc.32   %phi1(e) <- %r2
        br          .L3

.L2:
        load.32     %r3 <- 0[c]
        phisrc.32   %phi2(e) <- %r3
        br          .L3

.L3:
        phi.32      %r9(e) <- %phi1(e), %phi2(e)
        load.32     %r4 <- 0[b]
        cbr         %r4, .L4, .L5

.L4:
        load.32     %r5 <- 0[a]
        load.32     %r6 <- 0[b]
        add.32      %r7 <- %r5, %r6
        store.32    %r7 -> 0[c]
        br          .L5

.L5:
        store.32    %r9(e) -> 0[d]
        br          .L6

.L6:
        ret

Finally here is what clang does after some phases (initially clang
starts with local vars).

*** IR Dump After SROA ***
; Function Attrs: nounwind
define void @foo() #0 {
entry:
  %0 = load i32, i32* @a, align 4, !tbaa !1
  %tobool = icmp ne i32 %0, 0
  br i1 %tobool, label %if.then, label %if.else

if.then:                                          ; preds = %entry
  %1 = load i32, i32* @b, align 4, !tbaa !1
  br label %if.end

if.else:                                          ; preds = %entry
  %2 = load i32, i32* @c, align 4, !tbaa !1
  br label %if.end

if.end:                                           ; preds = %if.else, %if.then
  %e.0 = phi i32 [ %1, %if.then ], [ %2, %if.else ]
  %3 = load i32, i32* @b, align 4, !tbaa !1
  %tobool1 = icmp ne i32 %3, 0
  br i1 %tobool1, label %if.then2, label %if.end3

if.then2:                                         ; preds = %if.end
  %4 = load i32, i32* @a, align 4, !tbaa !1
  %5 = load i32, i32* @b, align 4, !tbaa !1
  %add = add nsw i32 %4, %5
  store i32 %add, i32* @c, align 4, !tbaa !1
  br label %if.end3

if.end3:                                          ; preds = %if.then2, %if.end
  store i32 %e.0, i32* @d, align 4, !tbaa !1
  ret void
}

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-15 20:14             ` Luc Van Oostenryck
  2017-08-15 20:43               ` Linus Torvalds
@ 2017-08-16 11:02               ` Dibyendu Majumdar
  2017-08-16 12:00                 ` Luc Van Oostenryck
  1 sibling, 1 reply; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-16 11:02 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linus Torvalds, Christopher Li, Linux-Sparse

Hi Luc,

On 15 August 2017 at 21:14, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> It's just too bad that the paper simply ignores gotos. The method
> is really cool, fast, simple and clean, I love it. But I had to add
> a few things, more akin to kludges than hacks IMO, to make it
> work with gotos. But as is, it works pretty well, is fast and produce
> correct SSA (also avoid few quadratic behaviours the current
> simplify_one_symbol() has).
>

I am trying to understand above - why should there be a special
treatment for gotos? A goto is simply a 'br' instruction from one BB
to another, so is there something special about it?

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 11:02               ` Dibyendu Majumdar
@ 2017-08-16 12:00                 ` Luc Van Oostenryck
  2017-08-16 12:16                   ` Dibyendu Majumdar
  2017-08-16 12:17                   ` Christopher Li
  0 siblings, 2 replies; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-16 12:00 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Linus Torvalds, Christopher Li, Linux-Sparse

On Wed, Aug 16, 2017 at 1:02 PM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>
> I am trying to understand above - why should there be a special
> treatment for gotos? A goto is simply a 'br' instruction from one BB
> to another, so is there something special about it?

Very good question!

It's related to 'structured programming' vs. non-structured programming.
If you write code which do control flow only by using if-then-else,
do-while/until, while-do and C's for-loop then you know very well
at each step where the code can come from, same during the linearization
here in sparse, you know when your write the IR for the if-then-else or the loop
what parent BB's you will have, you know that no other ones will need to be
added later.
Once you use gotos, it's not true anymore: if you have a label, then you
will probably have some gotos going to this label, in other words, new parents
for the BB. With computed gotos it's even worse: any computed-goto can
possibly go to any label you have taken the address (it's GCCism, not standard
C but we have to support it).

This is not a problem for the method, just a little complication as we have then
to delay a few things until we know all the gotos have been issued, then we know
that no new parents can be added (to the BB corresponding to a label) and things
can move on. It was just that the articles, for the sake of simplicity
I suppose,
superbly ignored the question of the gotos. So it was not as simple as taking
the pseudo-code they gave and implement it, I had to find some solutions
for these gotos.

-- Luc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:00                 ` Luc Van Oostenryck
@ 2017-08-16 12:16                   ` Dibyendu Majumdar
  2017-08-16 12:23                     ` Christopher Li
  2017-08-16 12:28                     ` Luc Van Oostenryck
  2017-08-16 12:17                   ` Christopher Li
  1 sibling, 2 replies; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-16 12:16 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linus Torvalds, Christopher Li, Linux-Sparse

Hi Luc,

On 16 August 2017 at 13:00, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Wed, Aug 16, 2017 at 1:02 PM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>>
>> I am trying to understand above - why should there be a special
>> treatment for gotos? A goto is simply a 'br' instruction from one BB
>> to another, so is there something special about it?
>
> Very good question!
>
> It's related to 'structured programming' vs. non-structured programming.
> If you write code which do control flow only by using if-then-else,
> do-while/until, while-do and C's for-loop then you know very well
> at each step where the code can come from, same during the linearization
> here in sparse, you know when your write the IR for the if-then-else or the loop
> what parent BB's you will have, you know that no other ones will need to be
> added later.
> Once you use gotos, it's not true anymore: if you have a label, then you
> will probably have some gotos going to this label, in other words, new parents
> for the BB. With computed gotos it's even worse: any computed-goto can
> possibly go to any label you have taken the address (it's GCCism, not standard
> C but we have to support it).
>
> This is not a problem for the method, just a little complication as we have then
> to delay a few things until we know all the gotos have been issued, then we know
> that no new parents can be added (to the BB corresponding to a label) and things
> can move on. It was just that the articles, for the sake of simplicity
> I suppose,
> superbly ignored the question of the gotos. So it was not as simple as taking
> the pseudo-code they gave and implement it, I had to find some solutions
> for these gotos.
>

Is it the problem that you are constructing SSA while linearizing - so
that you do not have the gotos resolved? Would it be a better design
to finish linear stream, and then run SSA conversion? The advantage of
this would also be that you can reuse the SSA conversion phase later
on if some transformation of the linear stream introduced non-SSA
constructs.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:00                 ` Luc Van Oostenryck
  2017-08-16 12:16                   ` Dibyendu Majumdar
@ 2017-08-16 12:17                   ` Christopher Li
  1 sibling, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-16 12:17 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Dibyendu Majumdar, Linus Torvalds, Linux-Sparse

On Wed, Aug 16, 2017 at 8:00 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Wed, Aug 16, 2017 at 1:02 PM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>>
>> I am trying to understand above - why should there be a special
>> treatment for gotos? A goto is simply a 'br' instruction from one BB
>> to another, so is there something special about it?
>
> Very good question!
>
> It's related to 'structured programming' vs. non-structured programming.
> If you write code which do control flow only by using if-then-else,
> do-while/until, while-do and C's for-loop then you know very well
> at each step where the code can come from, same during the linearization
> here in sparse, you know when your write the IR for the if-then-else or the loop
> what parent BB's you will have, you know that no other ones will need to be
> added later.
> Once you use gotos, it's not true anymore: if you have a label, then you
> will probably have some gotos going to this label, in other words, new parents
> for the BB. With computed gotos it's even worse: any computed-goto can
> possibly go to any label you have taken the address (it's GCCism, not standard
> C but we have to support it).

My way to explaining myself is, the "goto" can lead to irreducible graph.
It is a whole new level of complexity when irreducible graph was considering
as valid input for this paper, which is *not* cover by the simple and elegant
case suggested by the paper.

>
> This is not a problem for the method, just a little complication as we have then
> to delay a few things until we know all the gotos have been issued, then we know
> that no new parents can be added (to the BB corresponding to a label) and things
> can move on. It was just that the articles, for the sake of simplicity
> I suppose,
> superbly ignored the question of the gotos. So it was not as simple as taking
> the pseudo-code they gave and implement it, I had to find some solutions
> for these gotos.

IMHO, not having the simple and efficient way of supporting "goto" and
irreducible
graph is a major shortcoming in the paper. Because those are valid input for the
C source file. It make the final solution as a whole not so much
simple and elegant
any more.


Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:16                   ` Dibyendu Majumdar
@ 2017-08-16 12:23                     ` Christopher Li
  2017-08-16 12:28                     ` Luc Van Oostenryck
  1 sibling, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-16 12:23 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linus Torvalds, Linux-Sparse

On Wed, Aug 16, 2017 at 8:16 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>
> Is it the problem that you are constructing SSA while linearizing - so
> that you do not have the gotos resolved? Would it be a better design

That is the paper is trying to construct SSA skipping the CFG stage.
When you have goto, you can't really skip CFG any more. That is why
it is fundamental hard to resolve for this paper.

> to finish linear stream, and then run SSA conversion? The advantage of

Yes, that is just the traditional way, the way that LLVM was doing.
The paper is trying to claim having a simpler and efficient solution than
LLVM. It is not true any more with considering of "goto" and irreducible
graph input.

> this would also be that you can reuse the SSA conversion phase later
> on if some transformation of the linear stream introduced non-SSA
> constructs.

That is exactly the traditional way. You start to reason against the
method  using in this paper. :-)

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:16                   ` Dibyendu Majumdar
  2017-08-16 12:23                     ` Christopher Li
@ 2017-08-16 12:28                     ` Luc Van Oostenryck
  2017-08-16 12:39                       ` Dibyendu Majumdar
  1 sibling, 1 reply; 36+ messages in thread
From: Luc Van Oostenryck @ 2017-08-16 12:28 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Linus Torvalds, Christopher Li, Linux-Sparse

On Wed, Aug 16, 2017 at 2:16 PM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>
> Is it the problem that you are constructing SSA while linearizing - so
> that you do not have the gotos resolved? Would it be a better design
> to finish linear stream, and then run SSA conversion?

The *big* advantage of this method is that you *do not* have to first
generate the whole thing *and then* transform it into SSA form.
The price is a slight complication in the handling of gotos.
The more classic approach  need special handling for gotos too.

-- Luc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:28                     ` Luc Van Oostenryck
@ 2017-08-16 12:39                       ` Dibyendu Majumdar
  2017-08-16 12:50                         ` Christopher Li
  0 siblings, 1 reply; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-16 12:39 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linus Torvalds, Christopher Li, Linux-Sparse

Hi Luc,

On 16 August 2017 at 13:28, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Wed, Aug 16, 2017 at 2:16 PM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>>
>> Is it the problem that you are constructing SSA while linearizing - so
>> that you do not have the gotos resolved? Would it be a better design
>> to finish linear stream, and then run SSA conversion?
>
> The *big* advantage of this method is that you *do not* have to first
> generate the whole thing *and then* transform it into SSA form.
> The price is a slight complication in the handling of gotos.
> The more classic approach  need special handling for gotos too.
>

It is an advantage and a disadvantage in the sense that it makes SSA
construction not reusable - or have I misunderstood? Looking at the
implementation I felt that it could be run as a second phase - is that
assumption wrong?

I am just trying to understand the issue as it seems the paper claims
it creates pruned SSA for all programs - not just reducible control
flow. Have you contacted the authors regarding the issues you are
having with gotos? Might be worth doing so.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:39                       ` Dibyendu Majumdar
@ 2017-08-16 12:50                         ` Christopher Li
  2017-08-16 12:57                           ` Dibyendu Majumdar
  0 siblings, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-16 12:50 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linus Torvalds, Linux-Sparse

On Wed, Aug 16, 2017 at 8:39 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
> It is an advantage and a disadvantage in the sense that it makes SSA
> construction not reusable - or have I misunderstood? Looking at the

Not sure I understand the question.

> implementation I felt that it could be run as a second phase - is that
> assumption wrong?

That assumption is correct. But not having the advantage the paper
claim any more.

>
> I am just trying to understand the issue as it seems the paper claims
> it creates pruned SSA for all programs - not just reducible control
> flow. Have you contacted the authors regarding the issues you are

Sorry I might mislead you. I said the paper solution for irreducable
graph is *not* simple *nor* efficient any more. The paper do suggest
a way to do it. But I consider that way complex (in execution time sense).
I haven't take a closer look at the code yet.

> having with gotos? Might be worth doing so.

My reading of the paper is yes. Having "goto" will interrupt that
CFG less method. It can also lead to irreducible graph. Which means
extra step to clean it up. That extra step is expensive in the O() terms.

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:50                         ` Christopher Li
@ 2017-08-16 12:57                           ` Dibyendu Majumdar
  2017-08-16 13:11                             ` Christopher Li
  0 siblings, 1 reply; 36+ messages in thread
From: Dibyendu Majumdar @ 2017-08-16 12:57 UTC (permalink / raw)
  To: Christopher Li; +Cc: Luc Van Oostenryck, Linus Torvalds, Linux-Sparse

Hi Chris,

On 16 August 2017 at 13:50, Christopher Li <sparse@chrisli.org> wrote:
> On Wed, Aug 16, 2017 at 8:39 AM, Dibyendu Majumdar
> <mobile@majumdar.org.uk> wrote:
>> It is an advantage and a disadvantage in the sense that it makes SSA
>> construction not reusable - or have I misunderstood? Looking at the
>
> Not sure I understand the question.

I meant that it would nice to save a function like SSA(linear input)
-> SSA output - because then you can rerun the SSA conversion several
times during optimization.

>
>> implementation I felt that it could be run as a second phase - is that
>> assumption wrong?
>
> That assumption is correct. But not having the advantage the paper
> claim any more.
>

I think there is still an advantage as the implementation is bottom
up, uses memoization / dynamic programming techniques, and hence might
actually be quite efficient. I think Luc is better placed to comment
on this.

>>
>> I am just trying to understand the issue as it seems the paper claims
>> it creates pruned SSA for all programs - not just reducible control
>> flow. Have you contacted the authors regarding the issues you are
>
> Sorry I might mislead you. I said the paper solution for irreducable
> graph is *not* simple *nor* efficient any more. The paper do suggest
> a way to do it. But I consider that way complex (in execution time sense).
> I haven't take a closer look at the code yet.
>
>> having with gotos? Might be worth doing so.
>
> My reading of the paper is yes. Having "goto" will interrupt that
> CFG less method. It can also lead to irreducible graph. Which means
> extra step to clean it up. That extra step is expensive in the O() terms.
>

But Luc's implementation handles gotos fine without the extra phase -
hence please please try out the implementation before arriving at
conclusions.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 12:57                           ` Dibyendu Majumdar
@ 2017-08-16 13:11                             ` Christopher Li
  2017-08-16 13:22                               ` Christopher Li
  0 siblings, 1 reply; 36+ messages in thread
From: Christopher Li @ 2017-08-16 13:11 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linus Torvalds, Linux-Sparse

On Wed, Aug 16, 2017 at 8:57 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:
>> Not sure I understand the question.
>
> I meant that it would nice to save a function like SSA(linear input)
> -> SSA output - because then you can rerun the SSA conversion several
> times during optimization.

OK, in that regard. I agree.
But it is not relative to the discussion we have there.
Before promoting memory variable to pseudo, it is still
valid SSA form.  After the prompting, we of course want
to stay in SSA.


>
> I think there is still an advantage as the implementation is bottom
> up, uses memoization / dynamic programming techniques, and hence might
> actually be quite efficient. I think Luc is better placed to comment
> on this.

May be. I haven't look at the code yet. The paper did not describe how
to play with "goto", so I don't no clue what Luc have to do to make the
goto work yet.

>
> But Luc's implementation handles gotos fine without the extra phase -
> hence please please try out the implementation before arriving at
> conclusions.

As I state very clear in the title, my thought as regarding the paper,
and paper only. I will take some time to read Luc's implementation
next.

I just want to get a confirm my thought on the paper is not too
far off. If I understand the paper incorrectly, it certainly will have negtive
impact on how I am going to read the code.

Thanks

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 10:40                     ` Dibyendu Majumdar
@ 2017-08-16 13:17                       ` Christopher Li
  0 siblings, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-16 13:17 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Linus Torvalds, Luc Van Oostenryck, Linux-Sparse

On Wed, Aug 16, 2017 at 6:40 AM, Dibyendu Majumdar
<mobile@majumdar.org.uk> wrote:

> I think the linear output you obtained was post simplification. We
> need to look at the output prior to any simplifications. I believe it
> looks like this:

Before prompting memory variable to SSA form, the SSA form
is correct.  That is my currently understanding. The incorrect SSA
form happen when we promote the memory variable into pseudo.

> Finally here is what clang does after some phases (initially clang
> starts with local vars).
>
> *** IR Dump After SROA ***
> ; Function Attrs: nounwind
> define void @foo() #0 {
> entry:
>   %0 = load i32, i32* @a, align 4, !tbaa !1
>   %tobool = icmp ne i32 %0, 0
>   br i1 %tobool, label %if.then, label %if.else
>
> if.then:                                          ; preds = %entry
>   %1 = load i32, i32* @b, align 4, !tbaa !1
>   br label %if.end
>
> if.else:                                          ; preds = %entry
>   %2 = load i32, i32* @c, align 4, !tbaa !1
>   br label %if.end
>
> if.end:                                           ; preds = %if.else, %if.then
>   %e.0 = phi i32 [ %1, %if.then ], [ %2, %if.else ] <================

Notice how clang place the phi node at where I said it
need to be, basically the L3 in my email.

Also notice clang does not have phi source. It don't need to.


Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] sparse SSA construction
  2017-08-16 13:11                             ` Christopher Li
@ 2017-08-16 13:22                               ` Christopher Li
  0 siblings, 0 replies; 36+ messages in thread
From: Christopher Li @ 2017-08-16 13:22 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Luc Van Oostenryck, Linus Torvalds, Linux-Sparse

On Wed, Aug 16, 2017 at 9:11 AM, Christopher Li <sparse@chrisli.org> wrote:
>> But Luc's implementation handles gotos fine without the extra phase -
>> hence please please try out the implementation before arriving at
>> conclusions.
>
> As I state very clear in the title, my thought as regarding the paper,
> and paper only. I will take some time to read Luc's implementation
> next.

Oops, sorry I mix up  the two email thread talk about SSA.
This thread is about the patch not paper. You are right I shouldn't
draw conclusion on it before reading it in more detail.

Let me restart.  Now I read the paper (kind of). I should go read the
implementation patches.

Chris

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2017-08-16 13:22 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-06 20:26 [RFC] sparse SSA construction Luc Van Oostenryck
2017-08-06 23:01 ` Christopher Li
2017-08-06 23:44   ` Luc Van Oostenryck
2017-08-07  0:33     ` Christopher Li
2017-08-07  1:21       ` Luc Van Oostenryck
2017-08-07  1:44         ` Christopher Li
2017-08-15 13:41 ` Dibyendu Majumdar
2017-08-15 13:59   ` Christopher Li
2017-08-15 14:06     ` Dibyendu Majumdar
2017-08-15 14:07       ` Christopher Li
2017-08-15 14:09         ` Dibyendu Majumdar
2017-08-15 14:18           ` Christopher Li
2017-08-15 18:36           ` Linus Torvalds
2017-08-15 20:14             ` Luc Van Oostenryck
2017-08-15 20:43               ` Linus Torvalds
2017-08-15 21:43                 ` Luc Van Oostenryck
2017-08-15 22:44                   ` Dibyendu Majumdar
2017-08-16  5:36                     ` Christopher Li
2017-08-16  5:15                   ` Christopher Li
2017-08-16  4:23                 ` Christopher Li
2017-08-16  4:58                   ` Christopher Li
2017-08-16 10:40                     ` Dibyendu Majumdar
2017-08-16 13:17                       ` Christopher Li
2017-08-16  6:41                 ` Luc Van Oostenryck
2017-08-16 11:02               ` Dibyendu Majumdar
2017-08-16 12:00                 ` Luc Van Oostenryck
2017-08-16 12:16                   ` Dibyendu Majumdar
2017-08-16 12:23                     ` Christopher Li
2017-08-16 12:28                     ` Luc Van Oostenryck
2017-08-16 12:39                       ` Dibyendu Majumdar
2017-08-16 12:50                         ` Christopher Li
2017-08-16 12:57                           ` Dibyendu Majumdar
2017-08-16 13:11                             ` Christopher Li
2017-08-16 13:22                               ` Christopher Li
2017-08-16 12:17                   ` Christopher Li
2017-08-15 20:37             ` Luc Van Oostenryck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.