Re: Looking for architecture papers

From: Gustavo Romero <gromero@linux.vnet.ibm.com>
To: Raz <raziebe@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: Looking for architecture papers
Date: Mon, 8 Oct 2018 16:59:05 -0300
Message-ID: <3f5ace27-7692-6d95-6e34-611d3eee3ead@linux.vnet.ibm.com>
In-Reply-To: <CAPB=Z-pLCkXcsAXD5YWt5pwPzFzJrdZsdDye_bZjE0-ea-Ur6A@mail.gmail.com>

Hi Raz,

On 10/04/2018 04:41 AM, Raz wrote:
> Frankly, the more I read the more perplexed I get. For example,
> according to BOOK III-S, chapter 3,
> the MSR bits are differ from the ones described in
> arch/powerpc/include/asm/reg.h.
> Bit zero, is LE, but in the book it is 64-bit mode.
> 
> Would someone be kind to explain what I do not understand?

Yes, I know that can be confusing at first sight when one is used to, for
instance, x86.

x86 documents use LSB 0 notation, which means (as others already pointed out)
that the least significant bit of a value is marked as being bit 0.

On the other hand, Power documents use MSB 0 notation, which means that the most
significant bit of a value is marked as being bit 0 and, as a consequence, the
least significant bit of a 64-bit value in that notation is bit 63, not bit 0.
MSB 0 notation is also known as IBM bit notation/bit numbering.

Historically, LSB 0 notation tends to be used in docs about little-endian
architectures (for instance, x86), whilst MSB 0 notation tends to be used in docs
about big-endian architectures (for instance, Power, although Power is a little
different because it's now bi-endian).

However LSB 0 and MSB 0 are only different notations, so LSB 0 can be employed
on a big-endian architecture documentation, and vice versa.

It happens that kernel code is written in C, and for shifts, etc., LSB 0 notation
is more convenient than MSB 0 notation. So LSB 0 notation is used when creating a
mask (as in arch/powerpc/include/asm/reg.h), i.e. the bit position employed is
'63 - <bit position in PowerISA>'.
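
As a minimal sketch of that conversion (the helper names below are made up for
illustration and are not the kernel's macros), a single-bit mask can be built
from a PowerISA bit number like this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from the kernel sources. */

/* Convert a PowerISA (MSB 0) bit number of a 64-bit register to LSB 0. */
static inline unsigned int msb0_to_lsb0(unsigned int msb0_bit)
{
	assert(msb0_bit < 64);
	return 63 - msb0_bit;
}

/* Build a single-bit mask from a PowerISA (MSB 0) bit number. */
static inline uint64_t msb0_mask(unsigned int msb0_bit)
{
	return UINT64_C(1) << msb0_to_lsb0(msb0_bit);
}
```

With these, msb0_mask(0) is 0x8000000000000000 (the 64-bit mode bit, bit 0 in the
book) and msb0_mask(63) is 0x1 (LE, bit 63 in the book), which is why a header
written with LSB 0 shifts shows LE as bit 0.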

So, as another example, the following GCC macro '_TEXASR_EXTRACT_BITS' takes a
bit position 'BITNUM' as found in the PowerISA documentation, but then for the
right shift it uses '63 - BITNUM':

https://github.com/gcc-mirror/gcc/blob/master/gcc/config/rs6000/htmintrin.h#L44-L45
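
The same idea can be sketched as a plain C function. This is an illustration
only, not the text of the GCC macro: the name 'extract_field_msb0' and its
parameters are made up here, and it assumes 'last_bit' is the PowerISA number of
the rightmost bit of the field and 'size' is the field width in bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract a 'size'-bit field from a 64-bit value,
 * where 'last_bit' is the MSB 0 number of the field's rightmost bit.
 */
static inline uint64_t extract_field_msb0(uint64_t value,
					  unsigned int last_bit,
					  unsigned int size)
{
	assert(last_bit < 64 && size >= 1 && size <= last_bit + 1);

	/* 63 - last_bit is the LSB 0 position of the rightmost field bit. */
	uint64_t shifted = value >> (63 - last_bit);
	/* Avoid an undefined 64-bit shift when the field is the whole value. */
	uint64_t mask = (size == 64) ? UINT64_MAX : (UINT64_C(1) << size) - 1;

	return shifted & mask;
}
```

For example, extract_field_msb0(0xAB00000000000000, 7, 8) returns 0xAB, i.e. the
field that the documentation would call bits 0:7.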

I think it's also important to mention that in PowerISA the elements also follow
the MSB 0 notation. So when the instruction descriptions refer to byte, word, or
dword element 0 of a register, they mean the element "at the left tip", i.e. "the
most significant element", so to speak. For instance, take instruction "vperm":
the doc says 'index' takes bits 3:7 of [byte] element 'i' of 'VRC'. So for byte
element i=0 it means the most significant byte ("on the left tip") of vector
register operand 'VRC'. Moreover, the specified bits in that byte element, i.e.
bits 3:7, also follow MSB 0, so for those used to little-endian thinking they
are bits 4:0 (LSB 0 notation).

Now, if bits 4:0 = 0b00011 (decimal 3), we grab byte element 3 from 'src'
(256-bit). However, byte element 3 is also in MSB 0 notation, so it means the
byte at index 3 of 'src' when counting bytes from 0 from left to right, i.e.
the fourth byte from the left (which IMO indeed looks more natural, since we
count, for instance, Natural Numbers on the 'x' axis similarly).

Hence, one could say that the 'vperm' instruction in a certain sense has
"big-endian semantics" for the byte indices. The 'vpermr' instruction introduced
by PowerISA v3.0 is meant to cope with that, so 'vpermr' byte indices have
"little-endian semantics": for bits 3:7 in MSB 0 (or bits 4:0 in LSB 0 notation) =
0b00011 (decimal 3), on the 'vpermr' instruction it really means we must count
bytes from right to left, starting from 0 as in the LSB 0 notation, and grab the
byte at index 3 from the right.

So, for instance:

vr0            uint128 = 0x00000000000000000000000000000000
vr1            uint128 = 0x00102030405060708090a0b0c0d0e0f0
vr2            uint128 = 0x01112233445566779999aabbccddeeff
vr3            uint128 = 0x03000000000000000000000000000000

we have 'src' as:

MSB 0:             v--- byte 0, 1, 2, 3, ...
LSB 0:                                                                                  ...  3, 2, 1, byte 0 ---v
src = vr1 || vr2 = 00 10 20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0 01 11 22 33 44 55 66 77 99 99 AA BB CC DD EE FF

vperm   vr0, vr1, vr2, vr3 result is:
vr0            uint128 = 0x30000000000000000000000000000000
byte 3 in MSB 0  = 0x30 ---^ and 0x00 (byte 0 in MSB 0) copied to the remaining bytes

whilst with vpermr (PowerISA v3.0 / POWER9):
vpermr   vr0, vr1, vr2, vr3 result is:
vr0            uint128 = 0xccffffffffffffffffffffffffffffff
byte 3 in LSB 0  = 0xCC----^ and 0xFF (byte 0 in LSB 0) copied to the remaining bytes
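
The example above can be reproduced with a small software model. This is only a
sketch based on the description in this mail, not the pseudo-code from the ISA
document; the type and function names are made up, and it assumes index 0 of
each array holds byte element 0 in MSB 0 order (the leftmost byte in the uint128
values printed above).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model type: byte[0] is byte element 0 in MSB 0 order. */
typedef struct { uint8_t byte[16]; } vec128;

static void model_perm(vec128 *vrt, const vec128 *vra, const vec128 *vrb,
		       const vec128 *vrc, int reversed)
{
	uint8_t src[32];
	vec128 out;

	assert(vrt && vra && vrb && vrc);

	/* src = VRA || VRB, with byte element 0 first */
	memcpy(src, vra->byte, 16);
	memcpy(src + 16, vrb->byte, 16);

	for (int i = 0; i < 16; i++) {
		/* bits 3:7 (MSB 0) of a byte are bits 4:0 (LSB 0) */
		unsigned int index = vrc->byte[i] & 0x1f;

		/* vperm counts bytes from the left, vpermr from the right */
		out.byte[i] = reversed ? src[31 - index] : src[index];
	}
	*vrt = out;	/* written last so vrt may alias an input */
}

static void model_vperm(vec128 *vrt, const vec128 *vra, const vec128 *vrb,
			const vec128 *vrc)
{
	model_perm(vrt, vra, vrb, vrc, 0);
}

static void model_vpermr(vec128 *vrt, const vec128 *vra, const vec128 *vrb,
			 const vec128 *vrc)
{
	model_perm(vrt, vra, vrb, vrc, 1);
}
```

Feeding it vr1, vr2 and vr3 from the example gives 0x30 followed by fifteen 0x00
bytes for model_vperm, and 0xCC followed by fifteen 0xFF bytes for model_vpermr.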

Anyway, vperm/vpermr was just an example about notation not being restricted to
bits on Power ISA. So read the docs carefully :) GDB is always useful for checking
if one's understanding about a given Power instruction is correct.

HTH.

Regards,
Gustavo