From: Noah Goldstein <goldstein.w.n@gmail.com>
To: Eric Dumazet <edumazet@google.com>
Cc: tglx@linutronix.de, mingo@redhat.com,
	Borislav Petkov <bp@alien8.de>,
	dave.hansen@linux.intel.com, X86 ML <x86@kernel.org>,
	hpa@zytor.com, peterz@infradead.org, alexanderduyck@fb.com,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] x86/lib: Optimize 8x loop and memory clobbers in csum_partial.c
Date: Sat, 27 Nov 2021 00:38:58 -0600
Message-ID: <CAFUsyfJVM_RO5Fy2PyerxiQipz+Nw4oXc0-gm9rR59AHpmqw8Q@mail.gmail.com>
In-Reply-To: <CANn89iKaTyrNZHCg7i46-db_zudXs0V-BvmwM-5fYLZT2yQr4Q@mail.gmail.com>

On Sat, Nov 27, 2021 at 12:03 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Nov 26, 2021 at 8:25 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > Modify the 8x loop so that it uses two independent
> > accumulators. Despite adding more instructions, the latency and
> > throughput of the loop are improved because the `adc` chains can now
> > take advantage of multiple execution units.
> >
> > Make the memory clobbers more precise. 'buff' is read only and we know
> > the exact usage range. There is no reason to write-clobber all memory.
> >
> > Relative performance changes on Tigerlake:
> >
> > Time Unit: Ref Cycles
> > Size Unit: Bytes
> >
> > size, lat old, lat new,    tput old,    tput new
> >    0,   4.961,   4.901,       4.887,       4.951
> >    8,   5.590,   5.620,       4.227,       4.252
> >   16,   6.182,   6.202,       4.233,       4.278
> >   24,   7.392,   7.380,       4.256,       4.279
> >   32,   7.371,   7.390,       4.550,       4.537
> >   40,   8.621,   8.601,       4.862,       4.836
> >   48,   9.406,   9.374,       5.206,       5.234
> >   56,  10.535,  10.522,       5.416,       5.447
> >   64,  10.000,   7.590,       6.946,       6.989
> >  100,  14.218,  12.476,       9.429,       9.441
> >  200,  22.115,  16.937,      13.088,      12.852
> >  300,  31.826,  24.640,      19.383,      18.230
> >  400,  39.016,  28.133,      23.223,      21.304
> >  500,  48.815,  36.186,      30.331,      27.104
> >  600,  56.732,  40.120,      35.899,      30.363
> >  700,  66.623,  48.178,      43.044,      36.400
> >  800,  73.259,  51.171,      48.564,      39.173
> >  900,  82.821,  56.635,      58.592,      45.162
> > 1000,  90.780,  63.703,      65.658,      48.718
> >
> > Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> >
> > tmp
>
> SGTM (not sure what this 'tmp' string means here :) )
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>

Poor rebasing practices :/

Fixed in v3 (that is the only change from v2).
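
For readers following the thread, here is a rough sketch of the
two-accumulator idea from the commit message. It is not the code from the
patch: it is plain portable C rather than inline asm, the names
(add64_with_carry, sum_one_chain, sum_two_chains, self_check) are made up
for illustration, and it skips both the tail bytes and the final fold to
16 bits. It only shows why splitting one carry chain into two leaves the
result unchanged while removing the dependency between the halves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit add with end-around carry: a portable stand-in for add + adc $0. */
static uint64_t add64_with_carry(uint64_t a, uint64_t b)
{
	uint64_t sum = a + b;

	return sum + (sum < b);	/* wrapped iff sum < b; fold the carry back in */
}

/* Baseline: every add depends on the previous one (a single chain). */
static uint64_t sum_one_chain(const unsigned char *buff, size_t len)
{
	uint64_t acc = 0, w;
	size_t i;

	for (i = 0; i + 8 <= len; i += 8) {
		memcpy(&w, buff + i, sizeof(w));
		acc = add64_with_carry(acc, w);
	}
	return acc;
}

/* 64 bytes per iteration, split across two independent accumulators. */
static uint64_t sum_two_chains(const unsigned char *buff, size_t len)
{
	uint64_t acc0 = 0, acc1 = 0, w[8];
	size_t i = 0, j;

	for (; i + 64 <= len; i += 64) {
		memcpy(w, buff + i, sizeof(w));
		for (j = 0; j < 4; j++) {
			acc0 = add64_with_carry(acc0, w[j]);     /* chain 0 */
			acc1 = add64_with_carry(acc1, w[j + 4]); /* chain 1 */
		}
	}
	/* Remaining whole 8-byte words go through chain 0. */
	for (; i + 8 <= len; i += 8) {
		memcpy(w, buff + i, sizeof(w[0]));
		acc0 = add64_with_carry(acc0, w[0]);
	}
	return add64_with_carry(acc0, acc1);	/* merge the two chains */
}

/* Both variants must agree; trailing bytes (len % 8) are ignored by both. */
static void self_check(void)
{
	unsigned char buf[200];
	size_t i;

	for (i = 0; i < sizeof(buf); i++)
		buf[i] = (unsigned char)(i * 37 + 11);
	assert(sum_one_chain(buf, sizeof(buf)) == sum_two_chains(buf, sizeof(buf)));
	assert(add64_with_carry(UINT64_MAX, 1) == 1);
}
```

In sum_two_chains the adds into acc0 never wait on acc1 and vice versa, so
a core with more than one suitable execution unit can overlap them. That
matches the table above, where the latency gain only appears from 64 bytes
up, i.e. once the 8x loop actually runs.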
