linux-kernel.vger.kernel.org archive mirror
* RFC: changing precision control setting in initial FPU context
@ 2001-03-03  7:12 Kevin Buhr
From: Kevin Buhr @ 2001-03-03  7:12 UTC
  To: linux-kernel; +Cc: adam, drepper

A question recently came up in "c.o.l.d.s"; actually, it was a comment
on Slashdot that had been cross-posted to 15 Usenet groups by some
ignoramus.  It concerned a snippet of C code that cast a double to int
in such a way as to get a different answer under i386 Linux than under
the i386 free BSDs and most non-i386 architectures.  In fact, the
exact same assembly, running under Linux and under FreeBSD on the same
machine, reportedly gave different results.

For those who might care,

#include <stdio.h>
int main()
{
        int a = 10;
        printf("%d %d\n", 
                /* now for some BAD CODE! */
                (int)( a*.3 +  a*.7),   /* first expression */
                (int)(10*.3 + 10*.7));  /* second expression */
        return 0;
}

when compiled under GCC *without optimization*, will print "9 10" on
i386 Linux and "10 10" most every place else.  (And, by the way, if
you sit down with a pencil and paper, you'll find that IEEE 754
arithmetic in 32-bit, 64-bit, or 80-bit precision tells us that
floor(10*.3 + 10*.7) == 10, not 9.)

It boils down to the fact that, under i386 Linux, the FPU control word
has its precision control (PC) set to 3 (for 80-bit extended
precision) while under i386 FreeBSD, NetBSD, and others, it's set to 2
(for 64-bit double precision).  On other architectures, I assume
there's usually no mismatch between the C "double" precision and the
FPU's default internal precision.
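
As a rough illustration of the PC field described above, here is a minimal sketch that decodes it from a raw x87 control word. The helper names are hypothetical. The only facts assumed are that PC occupies bits 8-9 of the control word and that fninit leaves the word at 0x037F; the value 0x027F used in the note below is just an example word with PC=2, not a claim about any particular OS.

```c
/* Bits 8-9 of the x87 control word hold the precision control (PC) field. */
#define X87_PC_SHIFT 8
#define X87_PC_MASK  0x3u

/* Hypothetical helper: extract the PC field from a raw control word. */
unsigned x87_pc_field(unsigned cw)
{
        return (cw >> X87_PC_SHIFT) & X87_PC_MASK;
}

/* Hypothetical helper: significand width in bits selected by a PC value. */
int x87_pc_significand_bits(unsigned pc)
{
        switch (pc) {
        case 0:  return 24;   /* single precision */
        case 2:  return 53;   /* double precision */
        case 3:  return 64;   /* extended precision */
        default: return 0;    /* PC=1 is reserved */
        }
}
```

With these, x87_pc_field(0x037F) yields 3 (a 64-bit significand, the "80-bit" format), while x87_pc_field(0x027F) yields 2 (a 53-bit significand, matching a C "double").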

<details>
   To be specific, under Linux, the first expression takes 64-bit
   versions of the constants 0.3 and 0.7 (each slightly less than the
   true values of 0.3 and 0.7), and does 80-bit multiplies and an add
   to get a number slightly less than 10.  This gets truncated to 9.
   On the other hand, under the BSDs, the 64-bit add rounds upward
   before the truncation, giving the answer "10".

   The second expression always produces 10 (and, with -O2, the first
   also produces 10), probably because GCC itself either does all the
   constant optimization arithmetic (including forming the constants
   0.3 and 0.7) in 80 bits or stores the interim results often enough
   in 64-bit registers to make it come out "right".
</details>
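
The two evaluation strategies described in the details above can be approximated without touching the control word. This is only a sketch: the function names are made up, "volatile" is used as a blunt way to force each intermediate to be rounded to a 64-bit double, and "long double" stands in for the 80-bit registers (which holds on x86 with GCC, but not on every platform).

```c
#include <float.h>

/* Nonzero if long double carries more significand bits than double. */
int long_double_is_wider(void)
{
        return LDBL_MANT_DIG > DBL_MANT_DIG;
}

/* Evaluate a*.3 + a*.7 with every intermediate rounded to a 64-bit double. */
int truncated_sum_double(int a)
{
        const double c3 = 0.3, c7 = 0.7;   /* 64-bit constants */
        volatile double p3 = a * c3;       /* the store rounds to double */
        volatile double p7 = a * c7;
        volatile double sum = p3 + p7;
        return (int)sum;                   /* cast truncates toward zero */
}

/* Same expression, but with the intermediates kept in long double. */
int truncated_sum_extended(int a)
{
        const double c3 = 0.3, c7 = 0.7;   /* still 64-bit constants */
        long double sum = (long double)a * c3 + (long double)a * c7;
        return (int)sum;
}
```

On an x86 Linux system where long double is the 80-bit format and PC=3, truncated_sum_double(10) should give 10 and truncated_sum_extended(10) should give 9, mirroring the "9 10" output. Where long double is no wider than double, or where the x87 runs with PC=2, both should give 10.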

Initially, I was quick to dismiss the whole thing as symptomatic of a
severe floating-point-related cluon shortage.  However, the more I
think about it, the better the case seems for changing the Linux
default:

1.  First, PC=3 is a dangerous setting.  A floating point program
    using "double"s, compiled with GCC without attention to
    FPU-related compilation options, won't do IEEE arithmetic running
    under this setting.  Instead, it will use a mixture of 80-bit and
    64-bit IEEE arithmetic depending rather unpredictably on compiler
    register allocations and optimizations.

2.  Second, PC=3 is a mostly *useless* setting for GCC-compiled
    programs.  There can obviously be no way to guarantee reliable
    IEEE 80-bit arithmetic in GCC-compiled code when "double"s are
    only 64 bits, so our only hope is to guarantee reliable IEEE
    64-bit arithmetic.  But, then we should have set PC=2 in the first
    place.

    Worse yet, I don't know of any compilation flags that *can*
    guarantee IEEE 64-bit arithmetic.  I would have thought
    -ffloat-store would do the trick, but it doesn't change the
    assembly generated for the above example, at least on my Debian
    potato build of gcc 2.95.2.

    The only use for PC=3 is in hand-assembled code (or perhaps using
    GCC "long double"s); in those cases, the people doing the coding
    (or the compiler) should know enough to set the required control
    word value.

3.  Finally, the setting is incompatible with other Unixish platforms.
    As mentioned, Free/NetBSD both use PC=2, and most non-IA-64 FPU
    architectures appear to use a floating point representation that
    matches their C "double" precision, which prevents these kinds of
    surprises.
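
For reference, a program that wants PC=2 today can set it itself. The following is a minimal sketch, assuming glibc on x86, where <fpu_control.h> provides _FPU_GETCW, _FPU_SETCW, _FPU_EXTENDED and _FPU_DOUBLE. The two function names and the HAVE_X87_CW macro are hypothetical, and on any other platform the sketch simply reports that it does not apply.

```c
#include <stdio.h>   /* any libc header; on glibc this also defines __GLIBC__ */

#if defined(__GLIBC__) && (defined(__i386__) || defined(__x86_64__))
#include <fpu_control.h>
#define HAVE_X87_CW 1
#else
#define HAVE_X87_CW 0
#endif

/* Return the current PC field (0..3), or -1 where this sketch does not apply. */
int current_precision_control(void)
{
#if HAVE_X87_CW
        fpu_control_t cw;
        _FPU_GETCW(cw);
        return (cw >> 8) & 3;              /* PC lives in bits 8-9 */
#else
        return -1;
#endif
}

/* Switch the x87 to 53-bit (double) precision; 0 on success, -1 if unsupported. */
int use_double_precision(void)
{
#if HAVE_X87_CW
        fpu_control_t cw;
        _FPU_GETCW(cw);
        cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;   /* clear PC, then set PC=2 */
        _FPU_SETCW(cw);
        return 0;
#else
        return -1;
#endif
}
```

Calling use_double_precision() at the top of main() affects only x87 instructions in the calling thread. Note that this narrows the significand but not the exponent range of the x87 registers, so it is not a complete emulation of 64-bit IEEE arithmetic.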

The case against, as I see it, boils down to this:

1.  The current setting is the hardware default, so it somehow "makes
    sense" to leave it.

2.  It could potentially break existing code.  Can anyone guess
    how badly?

3.  Implementation is a bit of a pain.  It requires both kernel and
    libc changes.

On the third point, Ulrich and Adam hashed out weirdness with the FPU
control word setting some time ago in the context of selecting IEEE or
POSIX error handling behavior with "-lieee" without thwarting the
kernel's lazy FPU context initialization scheme.

So, on a related note, is it reasonable to consider resurrecting the
"sys_setfpucw" idea at this point, to push the decision on the correct
initial control word up to the C library level where it belongs?  (For
those who don't remember the proposal, the idea is that the C library
can use "sys_setfpucw" to set the desired initial control word.  If
the C program actually executes an FPU instruction, the kernel will
use that saved control word to initialize the FPU context in
"init_fpu()"; otherwise, lazy FPU initialization works as expected.)
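
To make the proposal concrete, here is a small user-space model of the intended behaviour. It is not kernel code: "sys_setfpucw" was never more than a proposal in this thread, and the struct, field, and function names below are all hypothetical. The only hardware fact assumed is that fninit leaves the control word at 0x037F.

```c
#define FNINIT_DEFAULT_CW 0x037Fu   /* control word left by fninit (PC=3) */

/* Hypothetical per-task state; the field names are illustrative only. */
struct task_model {
        unsigned short initial_cw;  /* word requested via the proposed call */
        unsigned short fpu_cw;      /* control word in the live FPU context */
        int used_math;              /* nonzero once an FPU context exists */
};

/* Models exec of a new program image: no FPU context yet. */
void task_model_exec(struct task_model *t)
{
        t->initial_cw = FNINIT_DEFAULT_CW;
        t->fpu_cw = 0;
        t->used_math = 0;
}

/* Models the proposed sys_setfpucw: record the word, do not touch the FPU. */
void model_setfpucw(struct task_model *t, unsigned short cw)
{
        t->initial_cw = cw;
}

/* Models the first FPU instruction, which lazily builds the context. */
void model_first_fpu_use(struct task_model *t)
{
        if (!t->used_math) {
                t->fpu_cw = t->initial_cw;  /* saved word instead of the fninit default */
                t->used_math = 1;
        }
}
```

In this model, the C library's startup code would call model_setfpucw() once, and a program that never executes an FPU instruction never pays for an FPU context, which is the property the lazy scheme is meant to preserve.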

Comments, anyone?

Kevin <buhr@stat.wisc.edu>

* Re: RFC: changing precision control setting in initial FPU context
@ 2001-03-03 10:47 Adam J. Richter
From: Adam J. Richter @ 2001-03-03 10:47 UTC
  To: buhr; +Cc: drepper, linux-kernel


	IEEE-754 floating point is available under glibc-based systems,
including most current GNU/Linux distributions, by linking with -lieee.
Your example program produces the "10 10" result you wanted when linked
this way, even when compiled with -O2.

	When not linked with "-lieee", Linux personality ELF
x86 binaries start with Precision Control set to 3, just because that
is how the x86 fninit instruction sets it.

	I thought that libieee was also available at run time for
dynamic executables by doing something like
"LD_PRELOAD=/usr/lib/libieee.so my_dynamic_executable", so you could set
it in your .bashrc if you wanted, but that apparently is not the case,
at least under glibc-2.2.2.  I will have to try to figure out why this
is not available.

	I am a bit out of my depth when discussing the advantages of
occasional 80 bit precision over 64 bit, but I think that there are
situations where getting gratuitously more accurate results helps,
like getting faster convergence in some scientific numerical methods,
such as Newton's method.  (You'll still find the same point of
convergence if there is only one, but the program will run faster).
Another example would be things like 3D lighting calculations (used in
games?) where you want to produce the best images that you can within
that CPU budget.  I don't know of any sound encodings where a fully
optimized implementation would use floating point, but it's possible.
In general, I think most real uses of floating point are for "fast and
sloppy" purposes, and programs that want to use floating point and
care about exact reproducibility will link with "-lieee".

	On the other hand, if a GNU/Linux-x86 distribution did want to
change the initial floating point control word in Linux to PC=2, I think
you would still want old programs to run in their old PC=3 environment,
just in case one relied on it.  Your sys_setfpucw suggestion could do
that (setting the default floating point control word without flagging
the process as one that was definitely going to use floating point),
but I think a simpler approach would be to assign a different magic
number argument to setpersonality() for programs that expect to be
initialized with floating point precision control set to 2.

Adam J. Richter     __     ______________   4880 Stevens Creek Blvd, Suite 104
adam@yggdrasil.com     \ /                  San Jose, California 95129-1034
+1 408 261-6630         | g g d r a s i l   United States of America
fax +1 408 261-6631      "Free Software For The Rest Of Us."

Thread overview: 13+ messages
2001-03-03  7:12 RFC: changing precision control setting in initial FPU context Kevin Buhr
2001-03-03  9:31 ` Albert D. Cahalan
2001-03-03 10:26   ` Kevin Buhr
2001-03-03 20:04     ` Albert D. Cahalan
2001-03-03 21:00       ` Jason Riedy
2001-03-03 23:17         ` Kevin Buhr
2001-03-04  4:21           ` Jason Riedy
2001-03-03 22:34       ` Kevin Buhr
2001-03-03 10:47 Adam J. Richter
2001-03-03 23:29 ` Kevin Buhr
2001-03-03 23:37   ` Alan Cox
2001-03-04  0:27     ` Kevin Buhr
2001-03-04  0:45       ` Ulrich Drepper
