All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] Fix QEMU PPC e500v1 efscmp* instructions
@ 2016-05-10 13:24 Imran, Talha
       [not found] ` <3052678E9DA67D4CB879A6625EE42075274B5912@EU-MBX-02.mgc.mentorg.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Imran, Talha @ 2016-05-10 13:24 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 979 bytes --]

Hi Everyone,

Please find attached a patch which fixes handling in QEMU PPC e500v1 for efscmp*
instructions. This was the cause of over 400 FAILs for this CPU while running
GCC testsuite, which have been fixed.

Value for Condition Registers (CR) being set in QEMU was different from
the value observed on hardware.

I have not managed to find a documentation describing the behaviour of
e500 cores for these instructions. However, the behaviour on
MPC8548-CDS target was observed by dumping registers to stdout, while
running executables from uboot.

These instructions are used by GCC only when compiling for te500v1 multilib; hence no
effect on other PPC CPUs (603, 7400 etc.)

A comparison of GCC v5.2.0 testsuite results summary (number of FAILs) is as follows:

CPU = te500v1
without patch: 699
with patch: 193

CPU = e500v2
without patch: 225
with patch: 225

Is this OK to commit? Comments and suggestions are welcome.

Thanks,
Talha Imran

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ppc-efscmpgt.patch --]
[-- Type: text/x-patch; name="ppc-efscmpgt.patch", Size: 1713 bytes --]

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index b67ebca..752c552 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1400,7 +1400,7 @@ static inline uint32_t efscmplt(CPUPPCState *env, uint32_t op1, uint32_t op2)
 
     u1.l = op1;
     u2.l = op2;
-    return float32_lt(u1.f, u2.f, &env->vec_status) ? 4 : 0;
+    return float32_lt(u1.f, u2.f, &env->vec_status) ? 6 : 0;
 }
 
 static inline uint32_t efscmpgt(CPUPPCState *env, uint32_t op1, uint32_t op2)
@@ -1409,7 +1409,7 @@ static inline uint32_t efscmpgt(CPUPPCState *env, uint32_t op1, uint32_t op2)
 
     u1.l = op1;
     u2.l = op2;
-    return float32_le(u1.f, u2.f, &env->vec_status) ? 0 : 4;
+    return float32_le(u1.f, u2.f, &env->vec_status) ? 0 : 6;
 }
 
 static inline uint32_t efscmpeq(CPUPPCState *env, uint32_t op1, uint32_t op2)
@@ -1418,7 +1418,7 @@ static inline uint32_t efscmpeq(CPUPPCState *env, uint32_t op1, uint32_t op2)
 
     u1.l = op1;
     u2.l = op2;
-    return float32_eq(u1.f, u2.f, &env->vec_status) ? 4 : 0;
+    return float32_eq(u1.f, u2.f, &env->vec_status) ? 6 : 0;
 }
 
 static inline uint32_t efststlt(CPUPPCState *env, uint32_t op1, uint32_t op2)
@@ -1442,7 +1442,7 @@ static inline uint32_t efststeq(CPUPPCState *env, uint32_t op1, uint32_t op2)
 #define HELPER_SINGLE_SPE_CMP(name)                                     \
     uint32_t helper_e##name(CPUPPCState *env, uint32_t op1, uint32_t op2) \
     {                                                                   \
-        return e##name(env, op1, op2) << 2;                             \
+        return e##name(env, op1, op2);                             \
     }
 /* efststlt */
 HELPER_SINGLE_SPE_CMP(fststlt);

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix QEMU PPC e500v1 efscmp* instructions
       [not found]   ` <3052678E9DA67D4CB879A6625EE42075274B5A3C@EU-MBX-02.mgc.mentorg.com>
@ 2016-05-16 21:06     ` Imran, Talha
  2016-05-18  8:04       ` Imran, Talha
  0 siblings, 1 reply; 3+ messages in thread
From: Imran, Talha @ 2016-05-16 21:06 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, qemu-trivial

[-- Attachment #1: Type: text/plain, Size: 2643 bytes --]

Adding some more basis to the patch:

I received some documentation from Freescale support relating to this:
Signal Processing Engine (SPE) Programming Environments Manual: http://cache.nxp.com/files/32bit/doc/ref_manual/SPEPEM.pdf.
Relevant info is on page #113.

According to the documentation, the value is CR (4 bits wide) is set as follows:
crD = undefined || cl || undefined || undefine

That is, if the condition is true, the crD[2] bit will be set to 1, else zero. The value for rest of the three bits is undefined.
Taking this information into account, we can set crD[1] bit as well, without violating architecture specification.

The existing logic in QEMU which interprets the value of CR (which is set by a number of different instructions) for branching decisions works well with all other instructions. This existing logic is also used for emulation of other CPUs which do not support efscmp* instructions.

Hence, I believe setting CR = 6 instead of CR = 4 is the most straightforward solution.

Comments and suggestions are welcome.

________________________________
From: Imran, Talha
Sent: Friday, May 13, 2016 9:26 PM
To: qemu-trivial@nongnu.org
Subject: [PATCH] Fix QEMU PPC e500v1 efscmp* instructions

re-post with attachment.

________________________________
From: Imran, Talha
Sent: Friday, May 13, 2016 9:04 AM
To: qemu-trivial@nongnu.org
Subject: [PATCH] Fix QEMU PPC e500v1 efscmp* instructions

ping

________________________________
From: Imran, Talha
Sent: Tuesday, May 10, 2016 6:24 PM
To: qemu-ppc@nongnu.org; qemu-devel@nongnu.org
Subject: [PATCH] Fix QEMU PPC e500v1 efscmp* instructions

Hi Everyone,

Please find attached a patch which fixes handling in QEMU PPC e500v1 for efscmp*
instructions. This was the cause of over 400 FAILs for this CPU while running
GCC testsuite, which have been fixed.

Value for Condition Registers (CR) being set in QEMU was different from
the value observed on hardware.

I have not managed to find a documentation describing the behaviour of
e500 cores for these instructions. However, the behaviour on
MPC8548-CDS target was observed by dumping registers to stdout, while
running executables from uboot.

These instructions are used by GCC only when compiling for te500v1 multilib; hence no
effect on other PPC CPUs (603, 7400 etc.)

A comparison of GCC v5.2.0 testsuite results summary (number of FAILs) is as follows:

CPU = te500v1
without patch: 699
with patch: 193

CPU = e500v2
without patch: 225
with patch: 225

Is this OK to commit? Comments and suggestions are welcome.

Thanks,
Talha Imran

[-- Attachment #2: ppc-efscmpgt.patch --]
[-- Type: application/octet-stream, Size: 1713 bytes --]

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index b67ebca..752c552 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1400,7 +1400,7 @@ static inline uint32_t efscmplt(CPUPPCState *env, uint32_t op1, uint32_t op2)
 
     u1.l = op1;
     u2.l = op2;
-    return float32_lt(u1.f, u2.f, &env->vec_status) ? 4 : 0;
+    return float32_lt(u1.f, u2.f, &env->vec_status) ? 6 : 0;
 }
 
 static inline uint32_t efscmpgt(CPUPPCState *env, uint32_t op1, uint32_t op2)
@@ -1409,7 +1409,7 @@ static inline uint32_t efscmpgt(CPUPPCState *env, uint32_t op1, uint32_t op2)
 
     u1.l = op1;
     u2.l = op2;
-    return float32_le(u1.f, u2.f, &env->vec_status) ? 0 : 4;
+    return float32_le(u1.f, u2.f, &env->vec_status) ? 0 : 6;
 }
 
 static inline uint32_t efscmpeq(CPUPPCState *env, uint32_t op1, uint32_t op2)
@@ -1418,7 +1418,7 @@ static inline uint32_t efscmpeq(CPUPPCState *env, uint32_t op1, uint32_t op2)
 
     u1.l = op1;
     u2.l = op2;
-    return float32_eq(u1.f, u2.f, &env->vec_status) ? 4 : 0;
+    return float32_eq(u1.f, u2.f, &env->vec_status) ? 6 : 0;
 }
 
 static inline uint32_t efststlt(CPUPPCState *env, uint32_t op1, uint32_t op2)
@@ -1442,7 +1442,7 @@ static inline uint32_t efststeq(CPUPPCState *env, uint32_t op1, uint32_t op2)
 #define HELPER_SINGLE_SPE_CMP(name)                                     \
     uint32_t helper_e##name(CPUPPCState *env, uint32_t op1, uint32_t op2) \
     {                                                                   \
-        return e##name(env, op1, op2) << 2;                             \
+        return e##name(env, op1, op2);                             \
     }
 /* efststlt */
 HELPER_SINGLE_SPE_CMP(fststlt);

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix QEMU PPC e500v1 efscmp* instructions
  2016-05-16 21:06     ` Imran, Talha
@ 2016-05-18  8:04       ` Imran, Talha
  0 siblings, 0 replies; 3+ messages in thread
From: Imran, Talha @ 2016-05-18  8:04 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel, qemu-trivial

[-- Attachment #1: Type: text/plain, Size: 2147 bytes --]

With more information at hand with the reference manual from Freescale
http://cache.nxp.com/files/32bit/doc/ref_manual/SPEPEM.pdf , I was able to
revisit my patch and figure out what is actually going on.

Before applying any patch, efscmp* instructions in QEMU set crD values as
(0b0100 << 2) = 0b10000 (consider the HELPER_SINGLE_SPE_CMP macro which left
shifts the value returned by efscmp* instruction by 2 bits). A value of 0b10000
is not correct according the to the reference manual.

The reference manual expects efscmp* instructions to return a value of 0b0100.
Please find attached an updated patch qemu-patch-5 which disables left shifting in
HELPER_SINGLE_SPE_CMP macro. This macro is used by efscmp* and efstst*
instructions only. efstst* instruction handlers call efscmp* handlers.

*Explanation:*
Traditionally, each crD (condition register nibble) consist of 4 bits, which is
set by comparisons as follows:
crD = W X Y Z
where
W = Less than
X = Greater than
Y = Equal to

However, efscmp* instructions being a special case return a binary result.
(efscmpeq will set the crD = 0bx1xx iff when op1 == op2 and 0bx0xx otherwise;
i.e. there is no notion of different crD values based on Less than, Greater
than and Equal to).

This effectively means that crD will store a "Greater than" comparison result
iff efscmp* instruction comparison is TRUE. Compiler exploits this feature by
checking for "Branch if Less than or Equal to" (ble instruction) OR "Branch if
Greater than" (bgt instruction) for Branch if FALSE OR Branch if TRUE
respectively after an efscmp* instruction. This can be seen in a assembly code
snippet below:

27          if (__real__ x != 3.0f || __imag__ x != 4.0f)
10000498:   lwz r10,8(r31)
1000049c:   lis r9,16448
100004a0:   efscmpeq cr7,r10,r9
100004a4:   ble- cr7,0x100004b8 <bar+60>  //jump to abort() call
100004a8:   lwz r10,12(r31)
100004ac:   lis r9,16512
100004b0:   efscmpeq cr7,r10,r9
100004b4:   bgt- cr7,0x100004bc <bar+64>  //skip abort() call
28            abort ();
100004b8:   bl 0x10000808 <abort>

Comments and suggestions are welcome.

-Talha

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: qemu-patch5.patch --]
[-- Type: text/x-diff; name="qemu-patch5.patch", Size: 701 bytes --]

Index: src/qemu-2.2-2015.17/target-ppc/fpu_helper.c
===================================================================
--- src/qemu-2.2-2015.17/target-ppc/fpu_helper.c	(revision 463967)
+++ src/qemu-2.2-2015.17/target-ppc/fpu_helper.c	(working copy)
@@ -1437,7 +1437,7 @@
 #define HELPER_SINGLE_SPE_CMP(name)                                     \
     uint32_t helper_e##name(CPUPPCState *env, uint32_t op1, uint32_t op2) \
     {                                                                   \
-        return e##name(env, op1, op2) << 2;                             \
+        return e##name(env, op1, op2);                             \
     }
 /* efststlt */
 HELPER_SINGLE_SPE_CMP(fststlt);

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-05-18  8:04 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-10 13:24 [Qemu-devel] [PATCH] Fix QEMU PPC e500v1 efscmp* instructions Imran, Talha
     [not found] ` <3052678E9DA67D4CB879A6625EE42075274B5912@EU-MBX-02.mgc.mentorg.com>
     [not found]   ` <3052678E9DA67D4CB879A6625EE42075274B5A3C@EU-MBX-02.mgc.mentorg.com>
2016-05-16 21:06     ` Imran, Talha
2016-05-18  8:04       ` Imran, Talha

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.