[Qemu-devel] [PATCH v1 00/19] re-factor softfloat and add fp16 functions

From: Alex Bennée @ 2017-12-11 12:56 UTC
  To: richard.henderson, peter.maydell, laurent, bharata, andrew,
	aleksandar.markovic
  Cc: qemu-devel, Alex Bennée

Hi,

In my previous run at this I'd simply taken the existing float32
functions and attempted to copy and paste the code, changing the
relevant constants. Apart from the usual typos and missed bits, there
were sections where SoftFloat pulls tricks because it knows the exact
bit positions of things. While I'm sure that is marginally faster, it
makes the code rather impenetrable to anyone not familiar with how
SoftFloat does things. One thing the last few months have taught me is
that the world is not awash with experts on the finer implementation
details of floating point maths.

After reviewing the last series Richard Henderson suggested a
different approach which pushes most of the code into common shared
functions. The majority of the work on the fractional bits is done in
64-bit resolution, which leaves plenty of spare bits for rounding for
everything from float16 to float64. This series is the result of that
work and of a coding sprint we did two weeks ago in Cambridge.
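
As a rough illustration of the shared-format idea, here is a minimal
sketch and not the code from the series. All names (FmtSketch,
PartsSketch, unpack_normal) and the choice of bit 62 for the implicit
bit are assumptions made for the example. It shows how float16,
float32 and float64 values can be unpacked into one representation
with a 64-bit fraction, so that a single routine can serve all three
widths:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-format description: field widths only. */
typedef struct {
    int frac_bits;   /* stored fraction bits: 10, 23 or 52 */
    int exp_bits;    /* exponent bits: 5, 8 or 11 */
} FmtSketch;

static const FmtSketch fmt16 = { 10, 5 };
static const FmtSketch fmt32 = { 23, 8 };
static const FmtSketch fmt64 = { 52, 11 };

/* Hypothetical common form shared by every width. */
typedef struct {
    bool sign;
    int32_t exp;     /* unbiased exponent */
    uint64_t frac;   /* implicit bit at IMPLICIT_BIT_POS, fraction below */
} PartsSketch;

#define IMPLICIT_BIT_POS 62   /* assumption: leaves headroom above and below */

/*
 * Unpack a normal number from its raw bits. Zero, denormals, infinity
 * and NaN are deliberately left out to keep the sketch short.
 */
static PartsSketch unpack_normal(uint64_t raw, const FmtSketch *f)
{
    uint64_t exp_mask = (1ULL << f->exp_bits) - 1;
    uint64_t biased = (raw >> f->frac_bits) & exp_mask;
    int32_t bias = (1 << (f->exp_bits - 1)) - 1;
    PartsSketch p;

    assert(biased != 0 && biased != exp_mask);   /* normal numbers only */
    p.sign = (raw >> (f->frac_bits + f->exp_bits)) & 1;
    p.exp = (int32_t)biased - bias;
    /* Left-align the stored fraction just below the implicit bit. */
    p.frac = (1ULL << IMPLICIT_BIT_POS)
           | ((raw & ((1ULL << f->frac_bits) - 1))
              << (IMPLICIT_BIT_POS - f->frac_bits));
    return p;
}
```

With this layout the value 1.5 unpacks to the same exponent and
fraction whether it arrives as float16, float32 or float64. The bits
below the narrowest fraction are what is left over for rounding.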

We've not touched anything that needs higher precision, which at the
moment means the float80 and 128-bit quad precision operations. They
would need similar decomposed routines to operate on the higher
precision fractional parts. I suspect we'd need to beef up our Int128
wrapper in the process so it can be done efficiently with 128-bit
maths.

This work is part of the larger chunk of adding half-precision ops to
the ARM front-end. However I've split the series up to make for a less
messy review. This tree can be found at:

  https://github.com/stsquad/qemu/tree/softfloat-refactor-and-fp16-v1

While I have been testing the half-precision stuff in the
ARM-specific tree, this series is all common code. It has however been
tested with ARM RISU, which exercises the float32/64 code paths quite
nicely.

Any additional testing appreciated.

Series Breakdown
----------------

The first five patches are simple helper functions that are mostly
inline and there for the benefit of architecture helper functions.
This includes the float16 constants in the fifth patch.
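
For readers unfamiliar with this kind of helper, the following is a
minimal sketch of what sign-manipulation helpers for half-precision
values can look like. The sketch_ prefix, the f16_bits typedef and the
function bodies are assumptions for illustration, not the code from
these patches. The point is that each operation only touches the sign
bit (bit 15) of the raw encoding:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t f16_bits;   /* hypothetical: raw IEEE half-precision bits */

/* Clear the sign bit: absolute value. */
static inline f16_bits sketch_float16_abs(f16_bits a)
{
    return a & 0x7fff;
}

/* Flip the sign bit: change sign. */
static inline f16_bits sketch_float16_chs(f16_bits a)
{
    return a ^ 0x8000;
}

/* Force the sign bit to the requested value. */
static inline f16_bits sketch_float16_set_sign(f16_bits a, bool sign)
{
    return (a & 0x7fff) | (sign ? 0x8000 : 0);
}
```

Because these are pure bit operations they leave the exponent and
fraction alone, so a NaN payload passes through unchanged.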

The next two patches fix a bug in NaN propagation which only showed
up when doing ARM "Reduction" operations in float16. Although the
minmax code is totally replaced later on, I wanted to fix it in place
first rather than add the fix when it was re-written.

The next two patches start preparing the ground for the new decomposed
functions and their public APIs. I've used macro expansion in a few
places to cut down on the repeated boiler-plate these APIs would
otherwise need. Most of the work is done in the static decompose_foo
functions.
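
To make the macro-expansion point concrete, here is a toy sketch under
stated assumptions. GEN_DOUBLE, PartsSketch, parts_double and the
sketch_fNN_double names are all hypothetical, and the operation
(doubling a normal number) was picked only because it is short. One
static worker operates on the unpacked form and a macro stamps out the
per-width wrapper around it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical common form: sign, biased exponent, stored fraction. */
typedef struct {
    bool sign;
    uint32_t exp;
    uint64_t frac;
} PartsSketch;

/* Shared worker: double a normal number by bumping its exponent. */
static PartsSketch parts_double(PartsSketch p)
{
    p.exp += 1;
    return p;
}

/*
 * Stamp out an unpack/operate/repack wrapper for one format. The
 * assert restricts the sketch to normal inputs whose doubled value is
 * still normal; zero, denormals, inf, NaN and overflow are not handled.
 */
#define GEN_DOUBLE(name, type, frac_bits, exp_bits)                     \
static type name(type a)                                                \
{                                                                       \
    PartsSketch p = {                                                   \
        .sign = (a >> ((frac_bits) + (exp_bits))) & 1,                  \
        .exp  = (a >> (frac_bits)) & ((1u << (exp_bits)) - 1),          \
        .frac = a & (((uint64_t)1 << (frac_bits)) - 1),                 \
    };                                                                  \
    assert(p.exp >= 1 && p.exp + 1 < ((1u << (exp_bits)) - 1));         \
    p = parts_double(p);                                                \
    return (type)(((uint64_t)p.sign << ((frac_bits) + (exp_bits)))      \
                  | ((uint64_t)p.exp << (frac_bits)) | p.frac);         \
}

GEN_DOUBLE(sketch_f16_double, uint16_t, 10, 5)
GEN_DOUBLE(sketch_f32_double, uint32_t, 23, 8)
GEN_DOUBLE(sketch_f64_double, uint64_t, 52, 11)
```

Adding another width is then a one-line GEN_DOUBLE invocation, while
the arithmetic itself lives in a single place.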

As you can see in the diffstat there is an overall code reduction even
though we have also added float16 support. For reference the previous
attempt added 1258 lines of code to implement a subset of the float16
functions. I think the code is also a lot easier to follow and reason
about.

Alex Bennée (19):
  fpu/softfloat: implement float16_squash_input_denormal
  include/fpu/softfloat: implement float16_abs helper
  include/fpu/softfloat: implement float16_chs helper
  include/fpu/softfloat: implement float16_set_sign helper
  include/fpu/softfloat: add some float16 contants
  fpu/softfloat: propagate signalling NaNs in MINMAX
  fpu/softfloat: improve comments on ARM NaN propagation
  fpu/softfloat: move the extract functions to the top of the file
  fpu/softfloat: define decompose structures
  fpu/softfloat: re-factor add/sub
  fpu/softfloat: re-factor mul
  fpu/softfloat: re-factor div
  fpu/softfloat: re-factor muladd
  fpu/softfloat: re-factor round_to_int
  fpu/softfloat: re-factor float to int/uint
  fpu/softfloat: re-factor int/uint to float
  fpu/softfloat: re-factor scalbn
  fpu/softfloat: re-factor minmax
  fpu/softfloat: re-factor compare

 fpu/softfloat-macros.h     |   44 +
 fpu/softfloat-specialize.h |  115 +-
 fpu/softfloat.c            | 6668 ++++++++++++++++++++------------------------
 include/fpu/softfloat.h    |   89 +-
 4 files changed, 3066 insertions(+), 3850 deletions(-)

-- 
2.15.1
