Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / Atom feed
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: "huanglingyan (A)" <huanglingyan2@huawei.com>
Cc: Zhangshaokun <zhangshaokun@hisilicon.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v3] arm64: lib: accelerate do_csum with NEON instruction
Date: Fri, 18 Jan 2019 12:14:20 +0100
Message-ID: <CAKv+Gu-MUDT-pAE4kwHbCsW2MSYBCDB3N1reRgeFL1EwiNQvxQ@mail.gmail.com> (raw)
In-Reply-To: <58c28adf-a01a-bb36-4def-866375e93aac@huawei.com>

On Fri, 18 Jan 2019 at 02:07, huanglingyan (A) <huanglingyan2@huawei.com> wrote:
>
>
> On 2019/1/17 0:46, Will Deacon wrote:
> > On Wed, Jan 09, 2019 at 10:03:05AM +0800, huanglingyan (A) wrote:
> >> On 2019/1/8 21:54, Will Deacon wrote:
> >>> [re-adding Ard and LAKML -- not sure why the headers are so munged]
> >>>
> >>> On Mon, Jan 07, 2019 at 10:38:55AM +0800, huanglingyan (A) wrote:
> >>>> On 2019/1/6 16:26, Ard Biesheuvel wrote:
> >>>>     Please change this into
> >>>>
> >>>>     if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) &&
> >>>>         len >= CSUM_NEON_THRESHOLD &&
> >>>>         may_use_simd()) {
> >>>>             kernel_neon_begin();
> >>>>             res = do_csum_neon(buff, len);
> >>>>             kernel_neon_end();
> >>>>         }
> >>>>
> >>>>     and drop the intermediate do_csum_arm()
> >>>>
> >>>>
> >>>>         +               return do_csum_arm(buff, len);
> >>>>         +#endif  /* CONFIG_KERNEL_MODE_NEON */
> >>>>
> >>>>     No else? What happens if len < CSUM_NEON_THRESHOLD ?
> >>>>
> >>>>
> >>>>         +#undef do_csum
> >>>>
> >>>>     Can we drop this?
> >>>>
> >>>> Using NEON instructions will bring some costs. The spending maybe introduced
> >>>> when reservering/restoring
> >>>> neon registers with kernel_neon_begin()/kernel_neon_end(). Therefore NEON code
> >>>> is Only used when
> >>>> the length exceeds CSUM_NEON_THRESHOLD. General do csum() codes in lib/
> >>>> checksum.c will be used in
> >>>> shorter length. To achieve this goal, I use the "#undef do_csum" in else clause
> >>>> to have the oppotunity to
> >>>> utilize the general codes.
> >>> I don't think that's how it works :/
> >>>
> >>> Before we get deeper into the implementation, please could you justify the
> >>> need for a CPU-optimised checksum implementation at all? I thought this was
> >>> usually offloaded to the NIC?
> >>>
> >>> Will
> >>>
> >>> .
> >> This problem is introduced when testing Intel x710 network card on my ARM server.
> >> Ip forward is set for ease of testing. Then send lots of packages to server by Tesgine
> >> machine and then receive.
> > In the marketing blurb, that card boasts:
> >
> >   `Tx/Rx IP, SCTP, TCP, and UDP checksum offloading (IPv4, IPv6) capabilities'
> >
> > so we shouldn't need to run this on the CPU. Again, I'm not keen to optimise
> > this given that it /really/ shouldn't be used on arm64 machines that care
> > about network performance.
> >
> > Will
> >
> > .
> Yeah, you are right. Checksum is usually done in network card which is told by
> someone familiar with NIC. However, it may be used in testing scenaries and
> some primary network cards. I think it's no harm to optimize this code while
> other ARCHs have their own optimized versions.
> >
>

I disagree. If this code path is never exercised, we should not
include it. We can revisit this decision when there is a use case
where the checksumming performance is an actual bottleneck.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply index

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-06  1:55 Lingyan Huang
2019-01-06  8:26 ` Ard Biesheuvel
     [not found]   ` <9129b882-60f3-8046-0cb9-e0b2452a118d@huawei.com>
2019-01-08 13:54     ` Will Deacon
2019-01-09  2:03       ` huanglingyan (A)
2019-01-10  4:08         ` 胡海
2019-01-10  8:14           ` huanglingyan (A)
2019-01-16 16:46         ` Will Deacon
2019-01-18  1:07           ` huanglingyan (A)
2019-01-18 11:14             ` Ard Biesheuvel [this message]
2019-02-12  2:26               ` huanglingyan (A)
2019-02-12  7:07                 ` Ard Biesheuvel
2019-02-13  8:42                   ` huanglingyan (A)
2019-02-13  9:15                     ` Ard Biesheuvel
2019-02-13 17:55                       ` Ard Biesheuvel
2019-02-14  9:57                         ` huanglingyan (A)
2019-02-18  8:49                           ` huanglingyan (A)
2019-02-18  9:03                             ` Ard Biesheuvel
2019-01-09 14:58 ` Dave Martin
2019-01-10  8:03   ` huanglingyan (A)
2019-01-10 13:53     ` Dave Martin
     [not found] <1f065749-6676-6489-14ae-fdcfeeb3389c@huawei.com>
2019-01-07  6:11 ` huanglingyan (A)

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKv+Gu-MUDT-pAE4kwHbCsW2MSYBCDB3N1reRgeFL1EwiNQvxQ@mail.gmail.com \
    --to=ard.biesheuvel@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=huanglingyan2@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=will.deacon@arm.com \
    --cc=zhangshaokun@hisilicon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ARM-Kernel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-arm-kernel/0 linux-arm-kernel/git/0.git
	git clone --mirror https://lore.kernel.org/linux-arm-kernel/1 linux-arm-kernel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-arm-kernel linux-arm-kernel/ https://lore.kernel.org/linux-arm-kernel \
		linux-arm-kernel@lists.infradead.org
	public-inbox-index linux-arm-kernel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-arm-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git