From: David Laight <David.Laight@ACULAB.COM>
To: 'Linus Torvalds' <torvalds@linux-foundation.org>,
	'Noah Goldstein' <goldstein.w.n@gmail.com>
Cc: 'kernel test robot' <lkp@intel.com>,
	"'x86@kernel.org'" <x86@kernel.org>,
	"'oe-kbuild-all@lists.linux.dev'" <oe-kbuild-all@lists.linux.dev>,
	"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
	"'edumazet@google.com'" <edumazet@google.com>,
	"'tglx@linutronix.de'" <tglx@linutronix.de>,
	"'mingo@redhat.com'" <mingo@redhat.com>,
	"'bp@alien8.de'" <bp@alien8.de>,
	"'dave.hansen@linux.intel.com'" <dave.hansen@linux.intel.com>,
	"'hpa@zytor.com'" <hpa@zytor.com>
Subject: RE: x86/csum: Remove unnecessary odd handling
Date: Fri, 5 Jan 2024 16:12:05 +0000
Message-ID: <8ff2151323cd440da25fd49061d9cc44@AcuMS.aculab.com>
In-Reply-To: <5354eeec562345f6a1de84f0b2081b75@AcuMS.aculab.com>

From: David Laight
> Sent: 05 January 2024 10:41
> 
> From: Linus Torvalds
> > Sent: 05 January 2024 00:33
> >
> > On Thu, 4 Jan 2024 at 15:36, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > Anyway, since I looked at the thing originally, and feel like I know
> > > the x86 side and understand the strange IP csum too, I just applied it
> > > directly.
> >
> > I ended up just applying my 40-byte cleanup thing too that I've been
> > keeping in my own tree since posting it (as the "Silly csum
> > improvement. Maybe" patch).
> 
> Interesting, I'm pretty sure trying to get two blocks of
>  'adc' scheduled in parallel like that doesn't work.
> 
> I got an adc every clock from this 'beast':
> +       /*
> +        * Align the byte count to a multiple of 16 then
> +        * add 64 bit words to alternating registers.
> +        * Finally reduce to 64 bits.
> +        */
> +       asm(    "       bt    $4, %[len]\n"
> +               "       jnc   10f\n"
> +               "       add   (%[buff], %[len]), %[sum_0]\n"
> +               "       adc   8(%[buff], %[len]), %[sum_1]\n"
> +               "       lea   16(%[len]), %[len]\n"
> +               "10:    jecxz 20f\n"
> +               "       adc   (%[buff], %[len]), %[sum_0]\n"
> +               "       adc   8(%[buff], %[len]), %[sum_1]\n"
> +               "       lea   32(%[len]), %[len_tmp]\n"
> +               "       adc   16(%[buff], %[len]), %[sum_0]\n"
> +               "       adc   24(%[buff], %[len]), %[sum_1]\n"
> +               "       mov   %[len_tmp], %[len]\n"
> +               "       jmp   10b\n"
> +               "20:    adc   %[sum_0], %[sum]\n"
> +               "       adc   %[sum_1], %[sum]\n"
> +               "       adc   $0, %[sum]\n"
> +           : [sum] "+&r" (sum), [sum_0] "+&r" (sum_0), [sum_1] "+&r" (sum_1),
> +               [len] "+&c" (len), [len_tmp] "=&r" (len_tmp)
> +           : [buff] "r" (buff)
> +           : "memory" );

I've got far too many x86 checksum functions lying around.

Actually you don't need all that.
Anything recent (probably Broadwell on) will execute:
	"10:    jecxz 20f\n"
	"       adc   (%[buff], %[len]), %[sum]\n"
	"       adc   8(%[buff], %[len]), %[sum]\n"
	"       lea   16(%[len]), %[len]\n"
	"       jmp   10b\n"
	"20:    adc   $0, %[sum]\n"
in two clocks per iteration - 8 bytes/clock.
Since it is trivial to handle 16n+8 buffers (e.g. as above),
that only leaves the C code to handle the final 0-7 bytes.
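
A rough portable C model of that split may help make the shape concrete:
one odd 8-byte word consumed up front, a loop taking two words per pass
with a negative offset counting up to zero, then the last 0-7 bytes in C.
This is only a sketch under stated assumptions, not the kernel's
csum_partial(): the names sketch_csum64(), sketch_fold16(), add64_carry()
and load64() are hypothetical, and plain C cannot keep a carry flag live
across statements, so each add folds its carry back in immediately rather
than deferring it to a final 'adc $0'. For checksum purposes the result
should be equivalent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One 'adc' step, modelled with an immediate end-around carry. */
static uint64_t add64_carry(uint64_t a, uint64_t b)
{
	a += b;
	return a + (a < b);	/* (a < b) is 1 exactly when the add wrapped */
}

/* Unaligned-safe 64-bit load in native byte order. */
static uint64_t load64(const unsigned char *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical helper: 64-bit ones'-complement sum of buff[0..len). */
uint64_t sketch_csum64(const unsigned char *buff, size_t len, uint64_t sum)
{
	size_t words = len & ~(size_t)7;	/* bytes in whole 64-bit words */
	const unsigned char *end = buff + words;
	ptrdiff_t off = -(ptrdiff_t)words;	/* negative, counts up to 0 */
	size_t tail = len & 7;

	/* Odd number of words: take one now so the loop can take two. */
	if (words & 8) {
		sum = add64_carry(sum, load64(end + off));
		off += 8;
	}

	/* Two words per pass, mirroring the two 'adc' lines in the asm loop. */
	while (off != 0) {
		sum = add64_carry(sum, load64(end + off));
		sum = add64_carry(sum, load64(end + off + 8));
		off += 16;
	}

	/* Final 0-7 bytes: zero-pad into one word and add it. */
	if (tail) {
		uint64_t last = 0;

		memcpy(&last, end, tail);
		sum = add64_carry(sum, last);
	}
	return sum;
}

/* Reduce the 64-bit sum to 16 bits (not inverted, native byte order). */
uint16_t sketch_fold16(uint64_t sum)
{
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

Calling sketch_fold16(sketch_csum64(buf, len, 0)) gives the un-inverted
16-bit sum. The model says nothing about throughput; the two clocks per
iteration figure applies only to the asm loop.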

> Maybe I'll sort out another patch...

Probably after the next rc1 is out.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Thread overview: 33+ messages
     [not found] <20230628020657.957880-1-goldstein.w.n@gmail.com>
2023-06-28  9:12 ` x86/csum: Remove unnecessary odd handling Borislav Petkov
2023-06-28 15:32   ` Noah Goldstein
2023-06-28 17:44     ` Linus Torvalds
2023-06-28 18:34       ` Noah Goldstein
2023-06-28 20:02         ` Linus Torvalds
2023-06-29 14:04   ` David Laight
2023-06-29 14:27   ` David Laight
2023-09-01 22:21 ` Noah Goldstein
2023-09-06 13:49   ` David Laight
2023-09-06 14:38   ` David Laight
2023-09-20 19:20     ` Noah Goldstein
2023-09-20 19:23 ` Noah Goldstein
2023-09-23  3:24   ` kernel test robot
2023-09-23 14:05     ` Noah Goldstein
2023-09-23 21:13       ` David Laight
2023-09-24 14:35         ` Noah Goldstein
2023-12-23 22:18           ` Noah Goldstein
2024-01-04 23:28             ` Noah Goldstein
2024-01-04 23:34               ` Dave Hansen
2024-01-04 23:36               ` Linus Torvalds
2024-01-05  0:33                 ` Linus Torvalds
2024-01-05 10:41                   ` David Laight
2024-01-05 16:12                     ` David Laight [this message]
2024-01-05 18:05                     ` Linus Torvalds
2024-01-05 23:52                       ` David Laight
2024-01-06  0:18                         ` Linus Torvalds
2024-01-06 10:26                           ` Eric Dumazet
2024-01-06 19:32                             ` Linus Torvalds
2024-01-07 12:11                             ` David Laight
2024-01-06 22:08                       ` David Laight
2024-01-07  1:09                         ` H. Peter Anvin
2024-01-07 11:44                           ` David Laight
2023-09-24 14:35 ` Noah Goldstein
