[Qemu-devel] [PATCH v4 00/22] re-factor softfloat and add fp16 functions

From: Alex Bennée @ 2018-02-06 16:47 UTC
  To: richard.henderson, peter.maydell, laurent, bharata, andrew
  Cc: qemu-devel, Alex Bennée

Hi,

The main change in v4 is applying __attribute__((flatten)) to some of
the public functions that show up in Emilio's dbt-benchmark. This
seems to be a cleaner solution than squashing inlines higher up the
chain, and it still leaves room for code re-use in the less widely
used functions. The results are an improvement over v3 by some margin:

                         NBench score; higher is better

    5 +-+-----------+-------------+------------+-------------+-----------+-+
      |                     ****### %%%%  +++                              |
  4.5 +-+...................*..*..#.%..%..****##..%%%%+ system-2.5       +-+
      |                     *  *  # %  %  *  * #  %  %      master         |
    4 +-+...................*..*..#.%..%..*..*.#..%..%softfloat-v3       +-+
  3.5 +-+...................*..*..#.%..%..*..*.#..%..%softfloat-%%%%.....+-+
      |                     *  *  # %  %  *  * #  %  %  * *  #  %  %       |
    3 +-+...................*..*..#.%..%..*..*.#..%..%..*.*..#..%..%.....+-+
      |                     *  *  #+%  %  *  * #$$$  %  * *  #  %  %       |
  2.5 +-+........####.......*..*..#$$..%..*..*.#..$..%..*.*..#..%..%.....+-+
      |       ****  #  %%%  *  *  # $  %  *  * #  $  %  * *  #$$$  %       |
    2 +-+.....*..*..#..%.%..*..*..#.$..%..*..*.#..$..%..*.*..#..$..%.....+-+
      |       *  *  #  % %  *  *  # $  %  *  * #  $  %  * *  #  $  %       |
  1.5 +-+.....*..*..#$$$.%..*..*..#.$..%..*..*.#..$..%..*.*..#..$..%.....+-+
    1 +-+.....*..*..#..$.%..*..*..#.$..%..*..*.#..$..%..*.*..#..$..%.....+-+
      |       *  *  #  $ %  *  *  # $  %  *  * #  $  %  * *  #  $  %       |
  0.5 +-+.....*..*..#..$.%..*..*..#.$..%..*..*.#..$..%..*.*..#..$..%.....+-+
      |       *  *  #  $ %  *  *  # $  %  *  * #  $  %  * *  #  $  %       |
    0 +-+-----****###$$$%%--****###$$%%%--****##$$$%%%--***###$$$%%%-----+-+
                 FOURIER   NEURAL NET  LU DECOMPOSITION    gmean

Slightly easier to read PNG:

    https://i.imgur.com/XEeL0bC.png
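
For anyone unfamiliar with the flatten attribute mentioned above, here
is a minimal sketch of the idea. All names in it (fmt_t, parts_t,
unpack, demo_f32_exponent, demo_f16_exponent) are hypothetical and
chosen for illustration only; this is not the softfloat code from the
series. It assumes GCC or Clang, where flatten asks the compiler to
inline, where possible, every call made inside the annotated function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format description; illustrative only, not QEMU's. */
typedef struct {
    int exp_bits;   /* width of the exponent field */
    int frac_bits;  /* width of the fraction field */
} fmt_t;

/* Hypothetical decomposed value. */
typedef struct {
    uint64_t frac;
    int exp;
    int sign;
} parts_t;

/* Generic helper shared by all formats: split raw bits into fields. */
static parts_t unpack(uint64_t raw, const fmt_t *fmt)
{
    parts_t p;

    assert(fmt->exp_bits + fmt->frac_bits < 64);
    p.frac = raw & ((UINT64_C(1) << fmt->frac_bits) - 1);
    p.exp = (int)((raw >> fmt->frac_bits) &
                  ((UINT64_C(1) << fmt->exp_bits) - 1));
    p.sign = (int)((raw >> (fmt->frac_bits + fmt->exp_bits)) & 1);
    return p;
}

static const fmt_t fmt16 = { 5, 10 };   /* half precision layout */
static const fmt_t fmt32 = { 8, 23 };   /* single precision layout */

/*
 * Assumed hot entry point: flatten requests that the calls in this
 * body (here, unpack) be inlined into it where possible.
 */
__attribute__((flatten))
int demo_f32_exponent(uint32_t raw)
{
    return unpack(raw, &fmt32).exp;
}

/*
 * Assumed less widely used entry point: no attribute, so the compiler
 * remains free to call the shared helper out of line.
 */
int demo_f16_exponent(uint16_t raw)
{
    return unpack(raw, &fmt16).exp;
}
```

With these definitions, demo_f32_exponent(0x3f800000) returns the
biased exponent 127 and demo_f16_exponent(0x3c00) returns 15. The
point of the sketch is only that the attribute goes on the public
function, so the shared helper stays a single piece of source.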

I think the series is ready to merge. Shall I submit a pull request
myself, or does it make sense for it to go via someone else? According
to MAINTAINERS, Peter and Aurelien are responsible for this code...

Alex Bennée (22):
  fpu/softfloat: implement float16_squash_input_denormal
  include/fpu/softfloat: remove USE_SOFTFLOAT_STRUCT_TYPES
  fpu/softfloat-types: new header to prevent excessive re-builds
  target/*/cpu.h: remove softfloat.h
  include/fpu/softfloat: implement float16_abs helper
  include/fpu/softfloat: implement float16_chs helper
  include/fpu/softfloat: implement float16_set_sign helper
  include/fpu/softfloat: add some float16 constants
  fpu/softfloat: improve comments on ARM NaN propagation
  fpu/softfloat: move the extract functions to the top of the file
  fpu/softfloat: define decompose structures
  fpu/softfloat: re-factor add/sub
  fpu/softfloat: re-factor mul
  fpu/softfloat: re-factor div
  fpu/softfloat: re-factor muladd
  fpu/softfloat: re-factor round_to_int
  fpu/softfloat: re-factor float to int/uint
  fpu/softfloat: re-factor int/uint to float
  fpu/softfloat: re-factor scalbn
  fpu/softfloat: re-factor minmax
  fpu/softfloat: re-factor compare
  fpu/softfloat: re-factor sqrt

 fpu/softfloat-macros.h          |   48 +
 fpu/softfloat-specialize.h      |  109 +-
 fpu/softfloat.c                 | 4545 ++++++++++++++++-----------------------
 include/fpu/softfloat-types.h   |  179 ++
 include/fpu/softfloat.h         |  202 +-
 include/qemu/bswap.h            |    2 +-
 target/alpha/cpu.h              |    2 -
 target/arm/cpu.c                |    1 +
 target/arm/cpu.h                |    2 -
 target/arm/helper-a64.c         |    1 +
 target/arm/helper.c             |    1 +
 target/arm/neon_helper.c        |    1 +
 target/hppa/cpu.c               |    1 +
 target/hppa/cpu.h               |    1 -
 target/hppa/op_helper.c         |    2 +-
 target/i386/cpu.h               |    4 -
 target/i386/fpu_helper.c        |    1 +
 target/m68k/cpu.c               |    2 +-
 target/m68k/cpu.h               |    1 -
 target/m68k/fpu_helper.c        |    1 +
 target/m68k/helper.c            |    1 +
 target/m68k/translate.c         |    2 +
 target/microblaze/cpu.c         |    1 +
 target/microblaze/cpu.h         |    2 +-
 target/microblaze/op_helper.c   |    1 +
 target/moxie/cpu.h              |    1 -
 target/nios2/cpu.h              |    1 -
 target/openrisc/cpu.h           |    1 -
 target/openrisc/fpu_helper.c    |    1 +
 target/ppc/cpu.h                |    1 -
 target/ppc/fpu_helper.c         |    1 +
 target/ppc/int_helper.c         |    1 +
 target/ppc/translate_init.c     |    1 +
 target/s390x/cpu.c              |    1 +
 target/s390x/cpu.h              |    2 -
 target/s390x/fpu_helper.c       |    1 +
 target/sh4/cpu.c                |    1 +
 target/sh4/cpu.h                |    2 -
 target/sh4/op_helper.c          |    1 +
 target/sparc/cpu.h              |    2 -
 target/sparc/fop_helper.c       |    1 +
 target/tricore/cpu.h            |    1 -
 target/tricore/fpu_helper.c     |    1 +
 target/tricore/helper.c         |    1 +
 target/unicore32/cpu.c          |    1 +
 target/unicore32/cpu.h          |    1 -
 target/unicore32/ucf64_helper.c |    1 +
 target/xtensa/cpu.h             |    1 -
 target/xtensa/op_helper.c       |    1 +
 49 files changed, 2199 insertions(+), 2941 deletions(-)
 create mode 100644 include/fpu/softfloat-types.h

-- 
2.15.1
