* [OpenRISC] OpenRISC 1.3 spec
@ 2019-04-12 20:56 Stafford Horne
  2019-04-12 21:17 ` Richard Henderson
  2019-04-13  8:03 ` Richard Henderson
  0 siblings, 2 replies; 23+ messages in thread
From: Stafford Horne @ 2019-04-12 20:56 UTC (permalink / raw)
  To: openrisc

Hi All,

While Andrey and I have been working on adding FPU support to the GCC port, a
few issues have come up, along with a possible need for new optional
instructions.  Many of these were already documented by Rth back in 2015.

I propose to incorporate the following into the openrisc spec.

Use proposals

P6 https://openrisc.io/proposals/lfmadd - clarification on internal
rounding (possibly exclude new madd instructions)
P7 https://openrisc.io/proposals/lstod-ldtos - convert between double and
float
P9 https://openrisc.io/proposals/ladrp - New instruction for PIC code
(already in binutils)
P14 https://openrisc.io/proposals/orfpx64a32 - 64-bit fpu implemented in
marocchino (gcc/binutils patches under review)
P11 https://openrisc.io/proposals/lfsf - add support for 'unordered'
compares
P13 https://openrisc.io/proposals/corrections - various corrections
(possibly exclude correction for l.ext* being mandatory as it's not
implemented everywhere)

Additionally
Clarification on write back of FPU results after exceptions. As per IEEE
results should be written back.

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-12 20:56 [OpenRISC] OpenRISC 1.3 spec Stafford Horne
@ 2019-04-12 21:17 ` Richard Henderson
  2019-04-12 21:48   ` Stafford Horne
  2019-04-13  8:03 ` Richard Henderson
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Henderson @ 2019-04-12 21:17 UTC (permalink / raw)
  To: openrisc

On 4/12/19 10:56 AM, Stafford Horne wrote:
> P14 https://openrisc.io/proposals/orfpx64a32 - 64-bit fpu implemented in
> marocchino (gcc/bintuils patches under review)

I don't see an encoding for this in the proposal.

I'm a tad nervous about forcing the (weird) ABI into the ISA, by using reg+2
when reg >= 16.  Are there 3 bits in the instruction that could be used to flag
reg+1 vs reg+2?  Then it's a matter of using

	lf.add.d rd1, rd2, ra1, ra2, rb1, rb2

in the assembler, and having the assembler enforce rx2 = rx1+{1,2}, and setting
the appropriate bit in the instruction.

I realize this comment is coming in quite late to your development...


r~

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-12 21:17 ` Richard Henderson
@ 2019-04-12 21:48   ` Stafford Horne
  2019-04-13  8:11     ` Richard Henderson
  0 siblings, 1 reply; 23+ messages in thread
From: Stafford Horne @ 2019-04-12 21:48 UTC (permalink / raw)
  To: openrisc

Hi Richard,

On Fri, Apr 12, 2019 at 11:17:46AM -1000, Richard Henderson wrote:
> On 4/12/19 10:56 AM, Stafford Horne wrote:
> > P14 https://openrisc.io/proposals/orfpx64a32 - 64-bit fpu implemented in
> > marocchino (gcc/bintuils patches under review)
> 
> I don't see an encoding for this in the proposal.

Right, because it's proposing to just use the existing encodings.

> I'm a tad nervous about forcing the (weird) ABI into the ISA, by using reg+2
> when reg >= 16.  Are there 3 bits in the instruction that could be used to flag
> reg+1 vs reg+2?  Then it's a matter of using
> 
> 	lf.add.d rd1, rd2, ra1, ra2, rb1, rb2
> 
> in the assembler, and having the assembler enforce rx2 = rx1+{1,2}, and setting
> the appropriate bit in the instruction.

Thanks for your feedback, I think this can be done.  I looked through the
encodings and {10,9,8} look usable.

   31  26  21  16  11     8      0
  [ 0x32 | D | A | B | res | 0x10 ]    lf.add.d
  [ 0x32 | D | A | B | res | 0x11 ]    lf.sub.d
                      ...

   31  26    21  16  11     8      0
  [ 0x32 | res | A | B | res | 0x18 ]  lf.sfeq.d
  [ 0x32 | res | A | B | res | 0x19 ]  lf.sfne.d
                      ...

   31  26  21  16    11     8      0
  [ 0x32 | D | A | 0x0 | res | 0x15 ]  lf.ftoi.d
  [ 0x32 | D | A | 0x0 | res | 0x14 ]  lf.itof.d

I propose:

  bit-10 - 1 indicates if rd2 is +2
  bit-9  - 1 indicates if ra2 is +2
  bit-8  - 1 indicates if rb2 is +2

But, this does mean we can't use the register renaming trick in the Verilog design.
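
As a rough illustration of the proposal above, here is a minimal Python sketch
of how an assembler might enforce rx2 = rx1+{1,2} and set bits 10, 9 and 8.
The names `pair_offset_flag` and `encode_lf_d` are hypothetical, the field
positions are read off the diagrams above, and this is not the actual binutils
implementation.

```python
def pair_offset_flag(r1, r2):
    """Return 0 if r2 == r1 + 1, 1 if r2 == r1 + 2; reject anything else."""
    if not (0 <= r1 <= 31 and 0 <= r2 <= 31):
        raise ValueError("register number out of range")
    delta = r2 - r1
    if delta not in (1, 2):
        raise ValueError(f"r{r2} is not r{r1}+1 or r{r1}+2")
    return delta - 1


def encode_lf_d(op, rd1, rd2, ra1, ra2, rb1, rb2):
    """Pack a three-operand double instruction such as lf.add.d (op=0x10)."""
    word = 0x32 << 26                            # major opcode, bits 31:26
    word |= (rd1 << 21) | (ra1 << 16) | (rb1 << 11)  # D, A and B fields
    word |= pair_offset_flag(rd1, rd2) << 10     # bit 10: rd2 is rd1+2
    word |= pair_offset_flag(ra1, ra2) << 9      # bit 9:  ra2 is ra1+2
    word |= pair_offset_flag(rb1, rb2) << 8      # bit 8:  rb2 is rb1+2
    word |= op & 0xFF                            # minor opcode, bits 7:0
    return word


# lf.add.d r16,r18, r20,r22, r24,r26 (all three pairs use the +2 form)
print(hex(encode_lf_d(0x10, 16, 18, 20, 22, 24, 26)))
```

With all three pairs in the +2 form the three flag bits come out set; with
ordinary +1 pairs they stay clear, so existing encodings are unchanged.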

> I realize this comment is coming in quite late to your development...

It's not too late if it's before silicon ;)

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-12 20:56 [OpenRISC] OpenRISC 1.3 spec Stafford Horne
  2019-04-12 21:17 ` Richard Henderson
@ 2019-04-13  8:03 ` Richard Henderson
  2019-04-14  6:30   ` Stafford Horne
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Henderson @ 2019-04-13  8:03 UTC (permalink / raw)
  To: openrisc

On 4/12/19 10:56 AM, Stafford Horne wrote:
> I propose to incorporate the following into the openrisc spec.
> 
> Use proposals
> 
> P6 https://openrisc.io/proposals/lfmadd - clarification on internal rounding
> (possibly exclude new madd instructions)
> P7 https://openrisc.io/proposals/lstod-ldtos - convert between double and float
> P9 https://openrisc.io/proposals/ladrp - New instruction for PIC code (already
> in binutils)
> P14 https://openrisc.io/proposals/orfpx64a32 - 64-bit fpu implemented in
> marocchino (gcc/bintuils patches under review)
> P11 https://openrisc.io/proposals/lfsf - add support for 'unordered' compares 
> P13 https://openrisc.io/proposals/corrections - various corrections (possibly
> exclude correction for l.ext* being mandatory as its not implemented everywhere)

As long as we're making changes, I propose to adjust ORFPX32 for OR64 such that
the 32-bit result written back to the 64-bit register has the upper bits
written as 0xffffffff.

For RISC-V this is called "NaN-boxing".  It means that if you attempt to use an
f32 result as an f64 input you'll get an exception, since the input will be a
Signaling NaN.

It's not complete, as we have not defined a 32-bit load that fills in
0xffffffff_00000000, but it'll catch many incorrect uses.
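
To make the boxing idea concrete, here is a small Python sketch that models a
64-bit register holding a 32-bit result. The helper names are made up for
illustration; the sketch only shows that the boxed pattern reads back as a NaN
when reinterpreted as an f64, and it makes no claim about how an actual OR64
implementation would trap.

```python
import math
import struct

BOX = 0xFFFFFFFF00000000  # upper 32 bits all ones


def f32_to_bits(x):
    """Bit pattern of x rounded to IEEE single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def nan_box(bits32):
    """Value written back to a 64-bit register for a 32-bit FP result."""
    return BOX | (bits32 & 0xFFFFFFFF)


def is_nan_boxed(reg64):
    """True if the register holds a properly boxed 32-bit value."""
    return (reg64 >> 32) == 0xFFFFFFFF


def bits_to_f64(reg64):
    """Reinterpret the 64-bit register contents as a double."""
    return struct.unpack(">d", struct.pack(">Q", reg64))[0]


boxed = nan_box(f32_to_bits(1.5))
print(hex(boxed), is_nan_boxed(boxed), math.isnan(bits_to_f64(boxed)))
```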


r~

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-12 21:48   ` Stafford Horne
@ 2019-04-13  8:11     ` Richard Henderson
  2019-04-13  8:47       ` Stafford Horne
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Henderson @ 2019-04-13  8:11 UTC (permalink / raw)
  To: openrisc

On 4/12/19 11:48 AM, Stafford Horne wrote:
> Thanks for your feedback, I think this can be done.  I looked through the
> encodings and {10,9,8} look usable.
> 
>    31  26  21  16  11     8      0
>   [ 0x32 | D | A | B | res | 0x10 ]    lf.add.d
>   [ 0x32 | D | A | B | res | 0x11 ]    lf.sub.d
>                       ...
> 
>    31  26    21  16  11     8      0
>   [ 0x32 | res | A | B | res | 0x18 ]  lf.sfeq.d
>   [ 0x32 | res | A | B | res | 0x19 ]  lf.sfne.d
>                       ...
> 
>    31  26  21  16    11     8      0
>   [ 0x32 | D | A | 0x0 | res | 0x15 ]  lf.ftoi.d
>   [ 0x32 | D | A | 0x0 | res | 0x14 ]  lf.itof.d
> 
> I propose:
> 
>   bit-10 - 1 indicates if rd2 is +2
>   bit-9  - 1 indicates if ra2 is +2
>   bit-8  - 1 indicates if rb2 is +2

Thanks.  LGTM.

This does cancel the second half of P6, where
I was proposing 3 inputs and 1 output from lf.madd;
rd will have to serve as both input and output.

> Its not too late if its before silicon ;)

Hah!


r~

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-13  8:11     ` Richard Henderson
@ 2019-04-13  8:47       ` Stafford Horne
  2019-04-14  9:41         ` BAndViG
  0 siblings, 1 reply; 23+ messages in thread
From: Stafford Horne @ 2019-04-13  8:47 UTC (permalink / raw)
  To: openrisc

On Fri, Apr 12, 2019 at 10:11:41PM -1000, Richard Henderson wrote:
> On 4/12/19 11:48 AM, Stafford Horne wrote:
> > Thanks for your feedback, I think this can be done.  I looked through the
> > encodings and {10,9,8} look usable.
> > 
> >    31  26  21  16  11     8      0
> >   [ 0x32 | D | A | B | res | 0x10 ]    lf.add.d
> >   [ 0x32 | D | A | B | res | 0x11 ]    lf.sub.d
> >                       ...
> > 
> >    31  26    21  16  11     8      0
> >   [ 0x32 | res | A | B | res | 0x18 ]  lf.sfeq.d
> >   [ 0x32 | res | A | B | res | 0x19 ]  lf.sfne.d
> >                       ...
> > 
> >    31  26  21  16    11     8      0
> >   [ 0x32 | D | A | 0x0 | res | 0x15 ]  lf.ftoi.d
> >   [ 0x32 | D | A | 0x0 | res | 0x14 ]  lf.itof.d
> > 
> > I propose:
> > 
> >   bit-10 - 1 indicates if rd2 is +2
> >   bit-9  - 1 indicates if ra2 is +2
> >   bit-8  - 1 indicates if rb2 is +2
> 
> Thanks.  LGTM.

I was wondering: should these orfpx64a32 instructions have different
opcodes to distinguish between true 64-bit and 32-bit double instructions?
Otherwise we would not really be able to run these orfpx64a32 32-bit binaries on
64-bit CPUs if 64-bit CPUs ever get implemented.

I am kind of on the fence: on one hand, 64-bit OpenRISC doesn't look to even be
coming, but on the other hand, if it does, this would be an issue.

> This does cancel the second half of P6, where
> I was proposing 3 inputs and 1 output from lf.madd;
> rd will have to serve as both input and output.

Yes, we would take over the room for those in the instruction space.

-Stafford

> > Its not too late if its before silicon ;)
> 
> Hah!
> 
> 
> r~

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-13  8:03 ` Richard Henderson
@ 2019-04-14  6:30   ` Stafford Horne
  2019-04-14  6:48     ` Stafford Horne
  0 siblings, 1 reply; 23+ messages in thread
From: Stafford Horne @ 2019-04-14  6:30 UTC (permalink / raw)
  To: openrisc

On Fri, Apr 12, 2019 at 10:03:50PM -1000, Richard Henderson wrote:
> On 4/12/19 10:56 AM, Stafford Horne wrote:
> > I propose to incorporate the following into the openrisc spec.
> > 
> > Use proposals
> > 
> > P6 https://openrisc.io/proposals/lfmadd - clarification on internal rounding
> > (possibly exclude new madd instructions)
> > P7 https://openrisc.io/proposals/lstod-ldtos - convert between double and float
> > P9 https://openrisc.io/proposals/ladrp - New instruction for PIC code (already
> > in binutils)
> > P14 https://openrisc.io/proposals/orfpx64a32 - 64-bit fpu implemented in
> > marocchino (gcc/bintuils patches under review)
> > P11 https://openrisc.io/proposals/lfsf - add support for 'unordered' compares 
> > P13 https://openrisc.io/proposals/corrections - various corrections (possibly
> > exclude correction for l.ext* being mandatory as its not implemented everywhere)
> 
> As long as we're making changes, I propose to adjust ORFPX32 for OR64 such that
> the 32-bit result written back to the 64-bit register has the upper bits
> written as 0xffffffff.
> 
> For RISC-V this is called "NaN-boxing".  It means that if you attempt to use an
> f32 result as an f64 input you'll get an exception, since the input will be a
> Signaling NaN.

This makes sense.  I will add this in my updates as well.

> It's not complete, as we have not defined a 32-bit load that fills in
> 0xffffffff_00000000, but it'll get many incorrect uses.

We can add an instruction like `l.lwf` specifically for loading floats.

On 32-bit architectures it would be exactly the same as l.lwz, but on 64-bit it
can fill the upper bits with 0xffffffff.

Possible encoding:
   31    26    21    16     0
  [  0x1c  |  D  |  A  |  I  ]     l.lwf

I can add it to the spec as a class II instruction.  We don't really need to
implement it until someone implements OR64.  But it will be good to have for
completeness as you pointed out.
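
For illustration only, the intended difference between l.lwz and the proposed
l.lwf could be modelled like this in Python. The function `load_word` and its
parameters are hypothetical and simply restate the behaviour described above.

```python
def load_word(mem_word, xlen, float_load):
    """Register value after a 32-bit load on an xlen-bit CPU.

    float_load=False models l.lwz (zero-extend);
    float_load=True models the proposed l.lwf.
    """
    value = mem_word & 0xFFFFFFFF
    if xlen == 64 and float_load:
        value |= 0xFFFFFFFF00000000  # fill the upper half with ones
    return value


print(hex(load_word(0x3FC00000, 32, True)))   # same as l.lwz on 32-bit
print(hex(load_word(0x3FC00000, 64, True)))   # upper bits filled on 64-bit
```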

Choosing the opcode as 0x1c as per below (it's close to l.lwa).

...
0x1a - unused
0x1b - l.lwa
0x1c - unused   <-- add l.lwf here
0x1d - l.cust2
0x1e - l.cust3
0x1f - l.cust4

(LOADs)
0x20 - l.ld
0x21 - l.lwz
0x22 - l.lws
0x23 - l.lbz
0x24 - l.lbs
0x25 - l.lhz
0x26 - l.lhs
(ALU IMM)
0x27 - l.addi
...

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-14  6:30   ` Stafford Horne
@ 2019-04-14  6:48     ` Stafford Horne
  0 siblings, 0 replies; 23+ messages in thread
From: Stafford Horne @ 2019-04-14  6:48 UTC (permalink / raw)
  To: openrisc

On Sun, Apr 14, 2019 at 03:30:05PM +0900, Stafford Horne wrote:
> On Fri, Apr 12, 2019 at 10:03:50PM -1000, Richard Henderson wrote:
> > On 4/12/19 10:56 AM, Stafford Horne wrote:
> > > I propose to incorporate the following into the openrisc spec.
> > > 
> > > Use proposals
> > > 
> > > P6 https://openrisc.io/proposals/lfmadd - clarification on internal rounding
> > > (possibly exclude new madd instructions)
> > > P7 https://openrisc.io/proposals/lstod-ldtos - convert between double and float
> > > P9 https://openrisc.io/proposals/ladrp - New instruction for PIC code (already
> > > in binutils)
> > > P14 https://openrisc.io/proposals/orfpx64a32 - 64-bit fpu implemented in
> > > marocchino (gcc/bintuils patches under review)
> > > P11 https://openrisc.io/proposals/lfsf - add support for 'unordered' compares 
> > > P13 https://openrisc.io/proposals/corrections - various corrections (possibly
> > > exclude correction for l.ext* being mandatory as its not implemented everywhere)
> > 
> > As long as we're making changes, I propose to adjust ORFPX32 for OR64 such that
> > the 32-bit result written back to the 64-bit register has the upper bits
> > written as 0xffffffff.
> > 
> > For RISC-V this is called "NaN-boxing".  It means that if you attempt to use an
> > f32 result as an f64 input you'll get an exception, since the input will be a
> > Signaling NaN.
> 
> This makes sense.  I will add this in my updates as well.
> 
> > It's not complete, as we have not defined a 32-bit load that fills in
> > 0xffffffff_00000000, but it'll get many incorrect uses.
> 
> We can add an instruction like `l.lwf` specifically for loading floats.
> 
> On 32-bit architectures it would be exactly the same as l.lwz, but on 64-bit it
> can fill the upper bits with 0xffffffff.
> 
> Possible encoding:
>    31    26    21    16     0
>   [  0x1c  |  D  |  A  |  I  ]     l.lf

Actually, 0x1a would be better.  0x1c is reserved for l.cust1.

Note to me, another spec fix needed:
  - add l.cust1 to section 18. Machine Code reference

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-13  8:47       ` Stafford Horne
@ 2019-04-14  9:41         ` BAndViG
  2019-04-25 21:17           ` Stafford Horne
  0 siblings, 1 reply; 23+ messages in thread
From: BAndViG @ 2019-04-14  9:41 UTC (permalink / raw)
  To: openrisc

Hi, Stafford, Richard

> > > I propose:
> > >
> > >   bit-10 - 1 indicates if rd2 is +2
> > >   bit-9  - 1 indicates if ra2 is +2
> > >   bit-8  - 1 indicates if rb2 is +2
> >
> > Thanks.  LGTM.

> I was thinking, do you think these orfpx64a32 instructions should have 
> different
> opcodes to distinguish between true 64-bit and 32-bit double instructions?
> Otherwise we would not really be able to run these orfpx64a32 32-bit binaries 
> on
> 64-bit CPU's if 64-bit cpus ever get implemented.

Is it assumed that 64-bit and 32-bit OpenRISC CPUs should be binary compatible?
If not, I think it should be acceptable that:
- assembler instructions lf.*.d look different
- 64-bit CPU just ignores bits [10:8]

Also, let me remind you that I made an initial implementation of unordered
comparisons (currently for the MAROCCHINO pipeline). The Verilog sources are in
the branch
https://github.com/openrisc/or1k_marocchino/tree/fp_unordered_cmp
To keep full backward compatibility with the current opcodes, the proposed
variant differs from Richard's P11 and looks as follows:

lf.sfueq.s rA,rB (opc: 0x28)
lf.sfueq.d rA,rB (opc: 0x38)
lf.sfune.s rA,rB (opc: 0x29)
lf.sfune.d rA,rB (opc: 0x39)
lf.sfugt.s rA,rB (opc: 0x2A)
lf.sfugt.d rA,rB (opc: 0x3A)
lf.sfuge.s rA,rB (opc: 0x2B)
lf.sfuge.d rA,rB (opc: 0x3B)
lf.sfult.s rA,rB (opc: 0x2C)
lf.sfult.d rA,rB (opc: 0x3C)
lf.sfule.s rA,rB (opc: 0x2D)
lf.sfule.d rA,rB (opc: 0x3D)
lf.sfun.s rA,rB (opc: 0x2E)
lf.sfun.d rA,rB (opc: 0x3E)

Bit [5] of the opcode indicates that the comparison is unordered, and bit [4]
that the operands are doubles.
Additionally, I’ve added the fp_comparisons_table.odt document to the doc/readme/
folder, describing all implemented FP comparison instructions and their
relation to the IEEE standard.
I haven’t tested them yet, as there are no tools that support them yet.
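
A short Python sketch of how the opcode bits above could be decoded. The
names are hypothetical, and the comparison helpers encode only my assumption
about the intended semantics (true when either operand is a NaN, or when the
relation holds); the .odt document is the reference.

```python
import math

UNORDERED = 0x20  # opcode bit [5]
DOUBLE = 0x10     # opcode bit [4]


def classify(opc):
    """Return (is_unordered, is_double) for a comparison opcode."""
    return bool(opc & UNORDERED), bool(opc & DOUBLE)


def sfun(a, b):
    """Assumed lf.sfun semantics: the operands are unordered."""
    return math.isnan(a) or math.isnan(b)


def sfueq(a, b):
    """Assumed lf.sfueq semantics: unordered or equal."""
    return sfun(a, b) or a == b


print(classify(0x28), classify(0x38))          # lf.sfueq.s, lf.sfueq.d
print(sfueq(float("nan"), 1.0), sfueq(1.0, 2.0))
```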

Of course, as Stafford wrote, "It's not too late if it's before silicon ;)"

WBR
Andrey


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-14  9:41         ` BAndViG
@ 2019-04-25 21:17           ` Stafford Horne
  2019-04-26 22:22             ` Stafford Horne
  2019-05-07 15:28             ` Richard Henderson
  0 siblings, 2 replies; 23+ messages in thread
From: Stafford Horne @ 2019-04-25 21:17 UTC (permalink / raw)
  To: openrisc

Hello,

On Sun, Apr 14, 2019 at 12:41:56PM +0300, BAndViG wrote:
> Hi, Stafford, Richard
> 
> > > > I propose:
> > > >
> > > >   bit-10 - 1 indicates if rd2 is +2
> > > >   bit-9  - 1 indicates if ra2 is +2
> > > >   bit-8  - 1 indicates if rb2 is +2
> > >
> > > Thanks.  LGTM.

Sorry, it took time, I had visitors at home last week, and I needed to relearn
how cgen worked.

This is implemented in binutils now. See my patches here:

  - https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3

I have not squashed the commits because it makes it a bit easier for reviewing
what I did to get these flags working.

> > I was thinking, do you think these orfpx64a32 instructions should have
> > different
> > opcodes to distinguish between true 64-bit and 32-bit double instructions?
> > Otherwise we would not really be able to run these orfpx64a32 32-bit
> > binaries on
> > 64-bit CPU's if 64-bit cpus ever get implemented.
> 
> Is it assumed that 64-bit and 32-bit OpenRISC CPUs should be binary compatible?
> If not, I think it is should be normal that:
> - assembler instructions lf.*.d look different
> - 64-bit CPU just ignores bits [10:8]

The only concern I have with '64-bit CPU just ignores bits [10:8]' is when we
have 32-bit code that expects output in a 32-bit pair.  i.e.

  lf.ftoi.d  r13,r14, r15,r16
  l.sfne     r14, r0
  ...

In 64-bit it would translate to:

  lf.ftoi.d  r13, r15
  l.sfne     r14, r0
  ...

Which would make no sense as r14 would not ever be touched.  This kind of code
would not work on a 64-bit cpu.
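
The problem can be sketched with a toy register-file model in Python. The
function names are invented, and the choice of which register of the pair
holds the high word is an assumption made only for this example.

```python
def ftoi_d_pair32(regs, rd1, rd2, result64):
    """32-bit CPU: the 64-bit result is split across a register pair."""
    regs[rd1] = (result64 >> 32) & 0xFFFFFFFF  # assumed: high word first
    regs[rd2] = result64 & 0xFFFFFFFF


def ftoi_d_64(regs, rd1, result64):
    """64-bit CPU ignoring bits [10:8]: only rd1 is written."""
    regs[rd1] = result64 & 0xFFFFFFFFFFFFFFFF


regs32 = [0] * 32
ftoi_d_pair32(regs32, 13, 14, 7)
regs64 = [0] * 32
ftoi_d_64(regs64, 13, 7)

# The following "l.sfne r14, r0" sees different values on the two CPUs.
print(regs32[14] != regs32[0], regs64[14] != regs64[0])
```

On the 32-bit model r14 holds the low word of the result, while on the 64-bit
model it still holds its old value, so the flag set by the compare differs.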

> Also, let me remember here that I made initial implementation for unordered
> comparison (for MAROCCHINO pipe currently). The verilog sources are in
> branch
> https://github.com/openrisc/or1k_marocchino/tree/fp_unordered_cmp
> To keep full backward compatibility with current opcodes the proposed
> variant differs from Richard's P11 and looks as following:
> 
> lf.sfueq.s rA,rB (opc: 0x28)
> lf.sfueq.d rA,rB (opc: 0x38)
> lf.sfune.s rA,rB (opc: 0x29)
> lf.sfune.d rA,rB (opc: 0x39)
> lf.sfugt.s rA,rB (opc: 0x2A)
> lf.sfugt.d rA,rB (opc: 0x3A)
> lf.sfuge.s rA,rB (opc: 0x2B)
> lf.sfuge.d rA,rB (opc: 0x3B)
> lf.sfult.s rA,rB (opc: 0x2C)
> lf.sfult.d rA,rB (opc: 0x3C)
> lf.sfule.s rA,rB (opc: 0x2D)
> lf.sfule.d rA,rB (opc: 0x3D)
> lf.sfun.s rA,rB (opc: 0x2E)
> lf.sfun.d rA,rB (opc: 0x3E)
> 
> OPC's bit [5] indicates that comparison is unordered and bit [4] that
> operands are doubles.
> Additionally I’ve added fp_comparisons_table.odt document into doc/readme/
> folder with description of all implemented FP comparison instructions and
> their relation to IEEE standard.
> I haven’t tested them yet as there is no tool now.
> 
> Of course, as Staffor wrote "Its not too late if its before silicon ;)"

Right, let me look to add these to binutils then GCC in a second series of
patches.

> WBR
> Andrey
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-25 21:17           ` Stafford Horne
@ 2019-04-26 22:22             ` Stafford Horne
  2019-05-02 12:22               ` BAndViG
  2019-05-07 15:28             ` Richard Henderson
  1 sibling, 1 reply; 23+ messages in thread
From: Stafford Horne @ 2019-04-26 22:22 UTC (permalink / raw)
  To: openrisc

On Fri, Apr 26, 2019 at 06:17:02AM +0900, Stafford Horne wrote:
> Hello,
> 
> On Sun, Apr 14, 2019 at 12:41:56PM +0300, BAndViG wrote:
> > Hi, Stafford, Richard
> > 
> > > > > I propose:
> > > > >
> > > > >   bit-10 - 1 indicates if rd2 is +2
> > > > >   bit-9  - 1 indicates if ra2 is +2
> > > > >   bit-8  - 1 indicates if rb2 is +2
> > > >
> > > > Thanks.  LGTM.
> 
> Sorry, it took time, I had visitors at home last week, and I needed to relearn
> how cgen worked.
> 
> This is implemented in binutils now. See my patches here:
> 
>   - https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3
> 
> I have not squashed the commits because it makes it a bit easier for reviewing
> what I did to get these flags working.

I have the GCC patches up as well now.

 - https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2

Initial tests look fine.

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-26 22:22             ` Stafford Horne
@ 2019-05-02 12:22               ` BAndViG
  0 siblings, 0 replies; 23+ messages in thread
From: BAndViG @ 2019-05-02 12:22 UTC (permalink / raw)
  To: openrisc

> > On Sun, Apr 14, 2019 at 12:41:56PM +0300, BAndViG wrote:
> > > Hi, Stafford, Richard
> > >
> > > > > > I propose:
> > > > > >
> > > > > >   bit-10 - 1 indicates if rd2 is +2
> > > > > >   bit-9  - 1 indicates if ra2 is +2
> > > > > >   bit-8  - 1 indicates if rb2 is +2
> > > > >
> > > > > Thanks.  LGTM.
> >
> > Sorry, it took time, I had visitors at home last week, and I needed to 
> > relearn
> > how cgen worked.
> >
> > This is implemented in binutils now. See my patches here:
> >
> >   - https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3
> >
> > I have not squashed the commits because it makes it a bit easier for 
> > reviewing
> > what I did to get these flags working.
>
> I have the GCC patches up as well now.
>
>  - https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2
>
> Initial tests look fine.
>
> -Stafford

I've updated the FPU code in mor1kx with various changes. In particular,
OPTION_FTOI_ROUNDING was ported from MAROCCHINO (see the description in
Readme.md). I also created "fp_unordered_cmp" branches in both the mor1kx and
or1k_marocchino repos. The branches contain an initial implementation of
unordered comparisons, as I described in another post in this thread. The
MAROCCHINO branch also implements the offsets for a2, b2 and d2, copied from
your pull request. I hope the Verilog models are ready for your upcoming
GCC9-spec1.3 port.

WBR
Andrey 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-04-25 21:17           ` Stafford Horne
  2019-04-26 22:22             ` Stafford Horne
@ 2019-05-07 15:28             ` Richard Henderson
  2019-05-07 21:12               ` Stafford Horne
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Henderson @ 2019-05-07 15:28 UTC (permalink / raw)
  To: openrisc

On 4/25/19 2:17 PM, Stafford Horne wrote:
> This is implemented in binutils now. See my patches here:
> 
>   - https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3
> 
> I have not squashed the commits because it makes it a bit easier for reviewing
> what I did to get these flags working.

I've implemented this for qemu,

  https://github.com/rth7680/qemu/commits/tgt-or1k

although untested so far.  I need to regenerate my
cross-testing environment for or1k...


r~

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-07 15:28             ` Richard Henderson
@ 2019-05-07 21:12               ` Stafford Horne
  2019-05-08 18:05                 ` BAndViG
  0 siblings, 1 reply; 23+ messages in thread
From: Stafford Horne @ 2019-05-07 21:12 UTC (permalink / raw)
  To: openrisc

On Tue, May 07, 2019 at 08:28:45AM -0700, Richard Henderson wrote:
> On 4/25/19 2:17 PM, Stafford Horne wrote:
> > This is implemented in binutils now. See my patches here:
> > 
> >   - https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3
> > 
> > I have not squashed the commits because it makes it a bit easier for reviewing
> > what I did to get these flags working.
> 
> I've implemented this for qemu,
> 
>   https://github.com/rth7680/qemu/commits/tgt-or1k
> 
> although untested so far.  I need to regenerate my
> cross-testing environment for or1k...

This looks good. I like how you do (rD1 + rD1Offset + 1) instead of what I was
doing (rD1 + (rD1Offset ? 2 : 1 )).  I will fix my patches to use your method.

Also, just a reminder, the latest patches for GCC FPU support are up here.  I
have rebased to the 9.1.0 release.  Also, added a new REG CLASS for REG PAIRS to
fix an issue for when (rD1 + rD1Offset + 1) overflows.

  https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2
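
As a quick sanity check of the two formulas, and of which first registers can
legally start a pair, here is a small Python sketch. It is only an
illustration with hypothetical names; it does not reflect how the GCC register
class is actually defined.

```python
def second_reg_ternary(r1, offset):
    """Earlier form: r1 + (offset ? 2 : 1)."""
    return r1 + (2 if offset else 1)


def second_reg_sum(r1, offset):
    """Form used in the qemu patches: r1 + offset + 1."""
    return r1 + offset + 1


def valid_first_regs(offset):
    """First registers whose partner still fits in r0..r31."""
    return [r1 for r1 in range(32) if second_reg_sum(r1, offset) <= 31]


same = all(second_reg_ternary(r, o) == second_reg_sum(r, o)
           for r in range(32) for o in (0, 1))
print(same, valid_first_regs(0)[-1], valid_first_regs(1)[-1])
```

The two forms agree for every register and offset; the difference is only in
how the overflow cases (r31 with either offset, r30 with offset 1) have to be
excluded.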

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-07 21:12               ` Stafford Horne
@ 2019-05-08 18:05                 ` BAndViG
  2019-05-09 20:29                   ` Stafford Horne
  2019-05-09 21:47                   ` Richard Henderson
  0 siblings, 2 replies; 23+ messages in thread
From: BAndViG @ 2019-05-08 18:05 UTC (permalink / raw)
  To: openrisc

> From: Stafford Horne
> Sent: Wednesday, May 08, 2019 12:12 AM
> To: Richard Henderson
> Cc: BAndViG ; Openrisc
> Subject: Re: [OpenRISC] OpenRISC 1.3 spec

> On Tue, May 07, 2019 at 08:28:45AM -0700, Richard Henderson wrote:
> > On 4/25/19 2:17 PM, Stafford Horne wrote:
> > > This is implemented in binutils now. See my patches here:
> > >
> > >   - https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3
> > >
> > > I have not squashed the commits because it makes it a bit easier for 
> > > reviewing
> > > what I did to get these flags working.
> >
> > I've implemented this for qemu,
> >
> >   https://github.com/rth7680/qemu/commits/tgt-or1k
> >
> > although untested so far.  I need to regenerate my
> > cross-testing environment for or1k...

> This looks good, I like how you do (rD1 + rD1Offset + 1) instead of what I 
> was
> doing (rD1 + (rD1Offset ? 2 : 1 )).  I will fix my matches to use your 
> method.

Ah, I implemented a similar approach in MAROCCHINO independently :), see the
latest commit to the fp_unordered_cmp branch:
https://github.com/openrisc/or1k_marocchino/commit/313b256875c8b619f5b16db47d915e5dfaedfff7

> Also, just a reminder, the latest patches for GCC FPU support are up here.  I
> have rebased to the 9.1.0 release.  Also, added a new REG CLASS for REG PAIRS 
> to
> fix an issue for when (rD1 + rD1Offset + 1) overflows.

>   https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2

Btw, earlier you wrote "... on one end 64-bit openrisc doesn't looks to even be 
coming ...". Actually, I think it wouldn't be very difficult for me to create a
64-bit OpenRISC by refactoring some of MAROCCHINO's modules. But
is anybody interested in it?

Additionally, is anybody interested in little-endian support? I've been
thinking of implementing it as a parameter, like OPTION_ENDIAN = "BIG"/"LITTLE".
With this approach, SR[LEE]:
  - would be set at compile time in accordance with the OPTION_ENDIAN value
  - could not be changed by writing to SR

WBR
Andrey 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-08 18:05                 ` BAndViG
@ 2019-05-09 20:29                   ` Stafford Horne
  2019-05-09 21:47                   ` Richard Henderson
  1 sibling, 0 replies; 23+ messages in thread
From: Stafford Horne @ 2019-05-09 20:29 UTC (permalink / raw)
  To: openrisc

On Wed, May 08, 2019 at 09:05:58PM +0300, BAndViG wrote:
> > From: Stafford Horne
> > Sent: Wednesday, May 08, 2019 12:12 AM
> > To: Richard Henderson
> > Cc: BAndViG ; Openrisc
> > Subject: Re: [OpenRISC] OpenRISC 1.3 spec
> 
> > On Tue, May 07, 2019 at 08:28:45AM -0700, Richard Henderson wrote:
> > > On 4/25/19 2:17 PM, Stafford Horne wrote:
> > > > This is implemented in binutils now. See my patches here:
> > > >
> > > >   - https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3
> > > >
> > > > I have not squashed the commits because it makes it a bit easier for
> > > > reviewing
> > > > what I did to get these flags working.
> > >
> > > I've implemented this for qemu,
> > >
> > >   https://github.com/rth7680/qemu/commits/tgt-or1k
> > >
> > > although untested so far.  I need to regenerate my
> > > cross-testing environment for or1k...
> 
> > This looks good, I like how you do (rD1 + rD1Offset + 1) instead of what
> > I was
> > doing (rD1 + (rD1Offset ? 2 : 1 )).  I will fix my matches to use your
> > method.
> 
> Ah, I implemented similar approach in MAROCCHINO independently :), see
> latest commit to fp_unordered_cmp branch:
> https://github.com/openrisc/or1k_marocchino/commit/313b256875c8b619f5b16db47d915e5dfaedfff7

Nice.

> > Also, just a reminder, the latest patches for GCC FPU support are up here.  I
> > have rebased to the 9.1.0 release.  Also, added a new REG CLASS for REG
> > PAIRS to
> > fix an issue for when (rD1 + rD1Offset + 1) overflows.
> 
> >   https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2
> 
> Btw, earlier you wrote "... on one end 64-bit openrisc doesn't looks to even
> be coming ...". Actually I think it wouldn't be a very difficult for me to
> create 64-bit OpeRISC by some re-factoring of MAROCCHINO's modules. At the
> same time is anybody interested in it?
> 
> Additionally, is anybody interested in little endian support? I've been
> thinking to implement it as a parameter, like OPTION_ENDIAN =
> "BIG"/"LITTLE". With the approach SR[LEE]:
>  - should be set at compile time in according with OPTION_ENDIAN value
>  - couldn't be changed by writing into SR

I think it's possible; it requires work on binutils, the simulators and
GCC.  This work was started before, but I didn't continue it, for the sake of
simplicity.

BTW,

I have finished the first version of implementing unordered comparisons in
binutils.  Please have a look here:

  https://github.com/stffrdhrn/binutils-gdb/commits/orfpx64a32-3a

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-08 18:05                 ` BAndViG
  2019-05-09 20:29                   ` Stafford Horne
@ 2019-05-09 21:47                   ` Richard Henderson
  2019-05-10  7:56                     ` BAndViG
  1 sibling, 1 reply; 23+ messages in thread
From: Richard Henderson @ 2019-05-09 21:47 UTC (permalink / raw)
  To: openrisc

On 5/8/19 11:05 AM, BAndViG wrote:
> Ah, I implemented similar approach in MAROCCHINO independently :), see latest
> commit to fp_unordered_cmp branch:
> https://github.com/openrisc/or1k_marocchino/commit/313b256875c8b619f5b16db47d915e5dfaedfff7

In the commit above, you say

> If A1/B1/D1 address `> 30` than `invalid instruction` exception is raised.

But that doesn't handle D1=30, D1P=1.

Since you are using a (5-bit?) add-with-carry circuit, is it easy to raise the
invalid instruction if there is carry out of bit 4, instead of the hard-coded
comparison against 30?

Otherwise, I'd drop the invalid instruction and let D2 wrap around.
D1=31,D1P=1 would be a valid (but silly) instruction clobbering the stack
pointer (R1).  As would D1=31,D1P=0, overwriting the
ought-to-have-been-hardwired-but-isn't R0.  Both are certainly user bugs, but
so what -- Don't Do That Then.


r~

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-09 21:47                   ` Richard Henderson
@ 2019-05-10  7:56                     ` BAndViG
  2019-05-11 10:04                       ` Stafford Horne
  0 siblings, 1 reply; 23+ messages in thread
From: BAndViG @ 2019-05-10  7:56 UTC (permalink / raw)
  To: openrisc

> From: Richard Henderson
> Sent: Friday, May 10, 2019 12:47 AM
> To: BAndViG ; Stafford Horne
> Cc: Openrisc
> Subject: Re: [OpenRISC] OpenRISC 1.3 spec

> > On 5/8/19 11:05 AM, BAndViG wrote:
> > Ah, I implemented similar approach in MAROCCHINO independently :), see 
> > latest
> > commit to fp_unordered_cmp branch:
> > https://github.com/openrisc/or1k_marocchino/commit/313b256875c8b619f5b16db47d915e5dfaedfff7

> In the commit above, you say

> > If A1/B1/D1 address `> 30` than `invalid instruction` exception is raised.

Oh, it is just a misprint. It should be `>= 30`.

> But that doesn't handle D1=30, D1P=1.
> Since you are using a (5-bit?) add-with-carry circuit, is it easy to raise 
> the
> invalid instruction if there is carry out of bit 4, instead of the hard-coded
> comparison against 30?

A1/B1/D1 boundaries check is implemented in or1k_marocchino_decode.v, lines 
#348-351

//  # check legality of A1/B1/D1 addresses: they must be < r30
wire op_fp64_rfa1_adr_l = ~(&fetch_rfa1_adr_i[OPTION_RF_ADDR_WIDTH-1:1]);
wire op_fp64_rfb1_adr_l = ~(&fetch_rfb1_adr_i[OPTION_RF_ADDR_WIDTH-1:1]);
wire op_fp64_rfd1_adr_l = ~(&fetch_rfd1_adr_i[OPTION_RF_ADDR_WIDTH-1:1]);

And here the comment is correct :).
The `_l` suffix here means `legal`.
I check that bits [4:1] are not 4'b1111, because 5'd30 is 5'b11110 and
5'd31 is 5'b11111.

> Otherwise, I'd drop the invalid instruction and let D2 wrap around.
> D1=31,D1P=1 would be a valid (but silly) instruction clobbering the stack
> pointer (R1).  As would D1=31,D1P=0, overwriting the
> ought-to-have-been-hardwired-but-isn't R0.  Both are certainly user bugs, but
> so what -- Don't Do That Then.

:) As you correctly mentioned, the current GPR implementation in MAROCCHINO (and
in any pipe of mor1kx, if I remember correctly) allows writing into R0 just to
simplify the design. As a result a user already has enough room for bugs.
I've been thinking about variants of R0 write protection. R0 could be
zero-initialized at cpu_rst by dedicated circuits, and an `invalid instruction`
exception would be raised if an instruction tries to write to R0. However,
such behavior is incompatible with the current run-time initialization
sequences implemented in the OR1K toolchains. The circle is closed.

WBR
Andrey 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-10  7:56                     ` BAndViG
@ 2019-05-11 10:04                       ` Stafford Horne
  2019-05-12 19:58                         ` BAndViG
  0 siblings, 1 reply; 23+ messages in thread
From: Stafford Horne @ 2019-05-11 10:04 UTC (permalink / raw)
  To: openrisc

On Fri, May 10, 2019 at 10:56:05AM +0300, BAndViG wrote:
> > From: Richard Henderson
> > Sent: Friday, May 10, 2019 12:47 AM
> > To: BAndViG ; Stafford Horne
> > Cc: Openrisc
> > Subject: Re: [OpenRISC] OpenRISC 1.3 spec
> 
> > > On 5/8/19 11:05 AM, BAndViG wrote:
> > > Ah, I implemented similar approach in MAROCCHINO independently :), see
> > > latest
> > > commit to fp_unordered_cmp branch:
> > > https://github.com/openrisc/or1k_marocchino/commit/313b256875c8b619f5b16db47d915e5dfaedfff7
> 
> > In the commit above, you say
> 
> > > If A1/B1/D1 address `> 30` than `invalid instruction` exception is raised.
> 
> Oh, it is just misprint. It should be `>= 30`.
> 
> > But that doesn't handle D1=30, D1P=1.
> > Since you are using a (5-bit?) add-with-carry circuit, is it easy to
> > raise the
> > invalid instruction if there is carry out of bit 4, instead of the hard-coded
> > comparison against 30?
> 
> A1/B1/D1 boundaries check is implemented in or1k_marocchino_decode.v, lines
> #348-351
> 
> //  # check legality of A1/B1/D1 addresses: they must be < r30
> wire op_fp64_rfa1_adr_l = ~(&fetch_rfa1_adr_i[OPTION_RF_ADDR_WIDTH-1:1]);
> wire op_fp64_rfb1_adr_l = ~(&fetch_rfb1_adr_i[OPTION_RF_ADDR_WIDTH-1:1]);
> wire op_fp64_rfd1_adr_l = ~(&fetch_rfd1_adr_i[OPTION_RF_ADDR_WIDTH-1:1]);
> 
> And here, commentary is correct :).
> `_l` suffix here means `legal`.
> I check that [4:1] bits must not be 4'b1111, because 5'd30 is 5'b11110 and
> 5'd31 is all ones.
> 
> > Otherwise, I'd drop the invalid instruction and let D2 wrap around.
> > D1=31,D1P=1 would be a valid (but silly) instruction clobbering the stack
> > pointer (R1).  As would D1=31,D1P=0, overwriting the
> > ought-to-have-been-hardwired-but-isn't R0.  Both are certainly user bugs, but
> > so what -- Don't Do That Then.
> 
> :) As you correctly mentioned, current GPRs implementation in MAROCCHINO
> (and in any pipe of mor1kx if I remember correctly) allows writing into R0
> just to simplify design. As a result a user have got enough room for bugs
> already.
> I've been thinking about a variants for R0 write protection. R0 could be
> zero initialized at cpu_rst by dedicated circuits. And `invalid instruction`
> exception should be raised if an instruction tries to write to R0. At the
> same time such behavior is incompatible with current run-time initialization
> sequences implemented in OR1K tool chains. The circle is closed.

We still have the option to drop the validation.  Just as we don't have
validation for writing to r0, I think it's fine to say r31's pair register is
undefined and should be avoided (e.g. on some machines it might go into the
shadow reg space).

On the other hand, I have finished the GCC updates for unordered comparisons.
You can see the patch here, I built newlib with this enabled and was able to
shake out a few bugs.  It seems to work:

  - https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2

The new gcc argument is:

  -munordered-float

Example:

or1k-elf-gcc -o ordered ordered.c -Wall -pipe -O2 -munordered-float \
  -mdouble-float -mhard-float -mhard-mul -mhard-div -mcmov -msext -msfimm -l

#include <math.h>

int sfun(float a, float b) {
  return isunordered(a, b);
}
int sfult(float a, float b) {
  return !isgreaterequal(a, b);
}
int sfule(float a, float b) {
  return !isgreater(a, b);
}
int sfueq(float a, float b) {
  return !islessgreater(a, b);
}

/* the above outputs:
 *
000022c8 <sfun>:                                                                
    22c8:       c8 03 20 2e     lf.sfun.s r3,r4                                 
    22cc:       a9 60 00 01     l.ori r11,r0,0x1                                
    22d0:       44 00 48 00     l.jr r9                                         
    22d4:       e1 6b 00 0e     l.cmov r11,r11,r0                               
                                                                                
000022d8 <sfult>:                                                               
    22d8:       c8 03 20 2c     lf.sfult.s r3,r4                                
    22dc:       a9 60 00 01     l.ori r11,r0,0x1                                
    22e0:       44 00 48 00     l.jr r9                                         
    22e4:       e1 6b 00 0e     l.cmov r11,r11,r0                               
                                                                                
000022e8 <sfule>:                                                               
    22e8:       c8 03 20 2d     lf.sfule.s r3,r4                                
    22ec:       a9 60 00 01     l.ori r11,r0,0x1                                
    22f0:       44 00 48 00     l.jr r9                                         
    22f4:       e1 6b 00 0e     l.cmov r11,r11,r0                               
                                                                                
000022f8 <sfueq>:                                                               
    22f8:       c8 03 20 28     lf.sfueq.s r3,r4                                
    22fc:       a9 60 00 01     l.ori r11,r0,0x1                                
    2300:       44 00 48 00     l.jr r9                                         
    2304:       e1 6b 00 0e     l.cmov r11,r11,r0   
*/

int main() {
  return sfun (4.0, 4.2) ||
         sfult(4.2, 3.4) ||
         sfule(4.2, 3.4) ||
         sfueq(3.2, 3.2);
}

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-11 10:04                       ` Stafford Horne
@ 2019-05-12 19:58                         ` BAndViG
  2019-05-12 23:09                           ` Stafford Horne
  2019-06-06 22:11                           ` Stafford Horne
  0 siblings, 2 replies; 23+ messages in thread
From: BAndViG @ 2019-05-12 19:58 UTC (permalink / raw)
  To: openrisc

> From: Stafford Horne
> Sent: Saturday, May 11, 2019 1:04 PM

> > On Fri, May 10, 2019 at 10:56:05AM +0300, BAndViG wrote:
> > I've been thinking about a variants for R0 write protection. R0 could be
> > zero initialized at cpu_rst by dedicated circuits. And `invalid 
> > instruction`
> > exception should be raised if an instruction tries to write to R0. At the
> > same time such behavior is incompatible with current run-time 
> > initialization
> > sequences implemented in OR1K tool chains. The circle is closed.

> We still have the option to drop the validation.  Just as we don't have
> validation for writing to r0, I think its fine to say r31's pair register is
> undefined and should be avoided. (i.e. on some machines it might go into the
> shadow reg space)

On the one hand, I'm a kind of perfectionist and would prefer to implement such
protections. On the other hand, they cost noticeable space and timing. Not a
trivial choice for me :).

> On the other hand, I have finished the GCC updates for unordered comparisons.
> You can see the patch here, I built newlib with this enabled and was able to
> shake out a few bugs.  It seems to work:

>   - https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2

> The new gcc argument is:

>   -munordered-float

I've built two variants of the GCC9/newlib toolchain. One has the
"-mhard-float -munordered-float" options enabled by default, and the other has
"-mhard-float -mdouble-float -munordered-float" as its defaults. The first
variant was used to build single-precision Whetstone for mor1kx+FPU32 and the
second to build single- and double-precision Whetstone for MAROCCHINO. All
variants work.
We could merge the fp_unordered_cmp branches into master. Or should we postpone
the merge until your binutils/gcc patches are upstreamed?

WBR
Andrey 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-12 19:58                         ` BAndViG
@ 2019-05-12 23:09                           ` Stafford Horne
  2019-06-06 22:11                           ` Stafford Horne
  1 sibling, 0 replies; 23+ messages in thread
From: Stafford Horne @ 2019-05-12 23:09 UTC (permalink / raw)
  To: openrisc

On Mon, May 13, 2019, 4:58 AM BAndViG <bandvig@mail.ru> wrote:

> > From: Stafford Horne
> > Sent: Saturday, May 11, 2019 1:04 PM
>
> > > On Fri, May 10, 2019 at 10:56:05AM +0300, BAndViG wrote:
> > > I've been thinking about a variants for R0 write protection. R0 could
> be
> > > zero initialized at cpu_rst by dedicated circuits. And `invalid
> > > instruction`
> > > exception should be raised if an instruction tries to write to R0. At
> the
> > > same time such behavior is incompatible with current run-time
> > > initialization
> > > sequences implemented in OR1K tool chains. The circle is closed.
>
> > We still have the option to drop the validation.  Just as we don't have
> > validation for writing to r0, I think its fine to say r31's pair
> register is
> > undefined and should be avoided. (i.e. on some machines it might go into
> the
> > shadow reg space)
>
> On the one hand I'm a kind of perfectionist and would prefer to implement
> such
> protections. On the other hand they cost noticeable space and timing. Not
> trivial choice for me :).
>
> > On the other hand, I have finished the GCC updates for unordered
> comparisons.
> > You can see the patch here, I built newlib with this enabled and was
> able to
> > shake out a few bugs.  It seems to work:
>
> >   - https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2
>
> > The new gcc argument is:
>
> >   -munordered-float
>
> I've build two variants of GCC9/NewLIB tool chains. One has got
> "-mhard-float -munordered-float" options raised by default. And another
> one has
> got "-mhard-float -mdouble-float -munordered-float" default options. First
> variant was used to build single precision Whetstone for mor1kx+FPU32 and
> second to build single and double precision Whetstone for MAROCCHINO. All
> variants work.
> We could merge fp_unordered_cmp branches into master. Or should we
> postpone the
> merge till your binutils/gcc patches being upstreamed?
>

I think we can merge.  It will take time to get it all upstream.

Note I started updates to the spec.
- https://github.com/stffrdhrn/doc

There is still a lot to do, but if you want to look at how I wrote up the
lf.sfu* instructions, please let me know what you think.

-stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-05-12 19:58                         ` BAndViG
  2019-05-12 23:09                           ` Stafford Horne
@ 2019-06-06 22:11                           ` Stafford Horne
  2019-06-15  6:14                             ` Stafford Horne
  1 sibling, 1 reply; 23+ messages in thread
From: Stafford Horne @ 2019-06-06 22:11 UTC (permalink / raw)
  To: openrisc

On Sun, May 12, 2019 at 10:58:53PM +0300, BAndViG wrote:
>>From: Stafford Horne
>>Sent: Saturday, May 11, 2019 1:04 PM
>
>>> On Fri, May 10, 2019 at 10:56:05AM +0300, BAndViG wrote:
>>> I've been thinking about a variants for R0 write protection. R0 could be
>>> zero initialized at cpu_rst by dedicated circuits. And `invalid
>>> instruction`
>>> exception should be raised if an instruction tries to write to R0. At the
>>> same time such behavior is incompatible with current run-time
>>> initialization
>>> sequences implemented in OR1K tool chains. The circle is closed.
>
>>We still have the option to drop the validation.  Just as we don't have
>>validation for writing to r0, I think its fine to say r31's pair register is
>>undefined and should be avoided. (i.e. on some machines it might go into the
>>shadow reg space)
>
>On the one hand I'm a kind of perfectionist and would prefer to 
>implement such protections. On the other hand they cost noticeable 
>space and timing. Not trivial choice for me :).
>
>>On the other hand, I have finished the GCC updates for unordered comparisons.
>>You can see the patch here, I built newlib with this enabled and was able to
>>shake out a few bugs.  It seems to work:
>
>>  - https://github.com/stffrdhrn/gcc/commits/or1k-fpu-2
>
>>The new gcc argument is:
>
>>  -munordered-float
>
>I've build two variants of GCC9/NewLIB tool chains. One has got 
>"-mhard-float -munordered-float" options raised by default. And 
>another one has got "-mhard-float -mdouble-float -munordered-float" 
>default options. First variant was used to build single precision 
>Whetstone for mor1kx+FPU32 and second to build single and double 
>precision Whetstone for MAROCCHINO. All variants work.
>We could merge fp_unordered_cmp branches into master. Or should we 
>postpone the merge till your binutils/gcc patches being upstreamed?
>
>WBR
>Andrey


Hello Richard, Andrey, OpenRISCers,

This is the final review for spec version 1.3.  The pdf is here:

 - https://github.com/openrisc/doc/raw/master/openrisc-arch-1.3-rev1.pdf

To see a history of the changes so far check out these pull requests:

 - https://github.com/openrisc/doc/pull/2 - Spec Updates from Stafford
 - https://github.com/openrisc/doc/pull/3 - Spec Updates from Andrey

These PRs are all merged and the last thing we have is to merge the website and
news updates to make it official:

 - https://github.com/openrisc/openrisc.github.io/pull/13

I'll let it sit for a bit to see if we can collect any comments or feedback on
the final document.  So please speak up.

But anyway, if there are any issues after the fact we can always create a
revision 2.

-Stafford

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [OpenRISC] OpenRISC 1.3 spec
  2019-06-06 22:11                           ` Stafford Horne
@ 2019-06-15  6:14                             ` Stafford Horne
  0 siblings, 0 replies; 23+ messages in thread
From: Stafford Horne @ 2019-06-15  6:14 UTC (permalink / raw)
  To: openrisc

Hello,

The changes are now merged. Thanks for all of your help.

Read more at:
https://openrisc.io/2019/06/04/openrisc-arch1.3

-Stafford


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-06-15  6:14 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-12 20:56 [OpenRISC] OpenRISC 1.3 spec Stafford Horne
2019-04-12 21:17 ` Richard Henderson
2019-04-12 21:48   ` Stafford Horne
2019-04-13  8:11     ` Richard Henderson
2019-04-13  8:47       ` Stafford Horne
2019-04-14  9:41         ` BAndViG
2019-04-25 21:17           ` Stafford Horne
2019-04-26 22:22             ` Stafford Horne
2019-05-02 12:22               ` BAndViG
2019-05-07 15:28             ` Richard Henderson
2019-05-07 21:12               ` Stafford Horne
2019-05-08 18:05                 ` BAndViG
2019-05-09 20:29                   ` Stafford Horne
2019-05-09 21:47                   ` Richard Henderson
2019-05-10  7:56                     ` BAndViG
2019-05-11 10:04                       ` Stafford Horne
2019-05-12 19:58                         ` BAndViG
2019-05-12 23:09                           ` Stafford Horne
2019-06-06 22:11                           ` Stafford Horne
2019-06-15  6:14                             ` Stafford Horne
2019-04-13  8:03 ` Richard Henderson
2019-04-14  6:30   ` Stafford Horne
2019-04-14  6:48     ` Stafford Horne
