From: Bin Meng <bmeng.cn@gmail.com>
To: Gary Guo <gary@garyguo.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>, Nick Hu <nickhu@andestech.com>,
	Nylon Chen <nylon7@andestech.com>,
	linux-riscv <linux-riscv@lists.infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] riscv: fix memmove and optimise memcpy when misalign
Date: Thu, 13 May 2021 16:13:53 +0800
Message-ID: <CAEUhbmXPAEjozYuoQpATzjS4E6Yo2eRpF6DJzxO8Cumdm57J3A@mail.gmail.com>
In-Reply-To: <20210216225555.4976-1-gary@garyguo.net>

On Wed, Feb 17, 2021 at 7:00 AM Gary Guo <gary@garyguo.net> wrote:
>
> 04091d6 introduces an assembly version of memmove but
> it does not take misalignment into account (it checks if
> length is a multiple of machine word size but pointers
> also need to be aligned). As a result it will generate
> misaligned load/store for the majority of cases and causes
> significant performance regression on hardware that traps
> misaligned load/store and emulates them using firmware.
>
> The current behaviour of memcpy is that it checks if both
> src and dest pointers are co-aligned (aka congruent
> modulo SZ_REG). If co-aligned, it will copy data word-by-word
> after first aligning pointers to word boundary. If src
> and dest are not co-aligned, however, byte-wise copy will
> be performed.
>
> This patch fixes memmove and optimises memcpy for
> misaligned cases. It will first align the destination pointer
> to a word boundary regardless of whether src and dest are
> co-aligned or not. If they indeed are, then word-wise copy
> is performed. If they are not co-aligned, then it will
> load two adjacent words from src and use shifts to assemble
> a full machine word. Some additional assembly-level
> micro-optimisation is also performed to ensure more
> instructions can be compressed (e.g. prefer a0 to t6).
>
> In my testing this speeds up memcpy 4~5x when src and dest
> are not co-aligned (which is quite common in networking),
> and speeds up memmove 1000+x by avoiding trapping to firmware.
>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> ---
>  arch/riscv/lib/memcpy.S  | 223 ++++++++++++++++++++++++---------------
>  arch/riscv/lib/memmove.S | 176 ++++++++++++++++++++----------
>  2 files changed, 257 insertions(+), 142 deletions(-)
>
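
For readers following the quoted description, here is a minimal C
sketch of the "align dest, then assemble words with shifts" idea. It is
not the assembly from the patch: the names sketch_memcpy, load_word,
store_word and word_t are hypothetical, a little-endian machine is
assumed, and the aligned word accesses are written with memcpy() so the
sketch stays within C aliasing rules.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t word_t;          /* assumed stand-in for one machine word */
#define W sizeof(word_t)

/* Word load/store via memcpy; compilers typically emit a single access. */
static word_t load_word(const unsigned char *p)
{
	word_t w;
	memcpy(&w, p, W);
	return w;
}

static void store_word(unsigned char *p, word_t w)
{
	memcpy(p, &w, W);
}

/* Hypothetical illustration only; assumes little-endian byte order. */
static void *sketch_memcpy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	/* 1. Byte-copy until the destination is word-aligned. */
	while (n > 0 && (uintptr_t)d % W != 0) {
		*d++ = *s++;
		n--;
	}

	size_t off = (uintptr_t)s % W;  /* source misalignment in bytes */

	if (off == 0) {
		/* 2a. Co-aligned: plain word-wise copy. */
		while (n >= W) {
			store_word(d, load_word(s));
			d += W; s += W; n -= W;
		}
	} else if (n >= 2 * W) {
		/* 2b. Not co-aligned: merge two adjacent aligned source words. */
		const unsigned char *next_p = s + (W - off); /* next aligned word */
		word_t cur = 0;

		/* Prime cur with the source bytes of the first, partial word. */
		memcpy((unsigned char *)&cur + off, s, W - off);

		/* n >= 2*W keeps the aligned load at next_p inside the source. */
		while (n >= 2 * W) {
			word_t next = load_word(next_p);

			store_word(d, (cur >> (8 * off)) | (next << (8 * (W - off))));
			cur = next;
			next_p += W; d += W; s += W; n -= W;
		}
	}

	/* 3. Copy whatever is left byte by byte. */
	while (n > 0) {
		*d++ = *s++;
		n--;
	}
	return dest;
}
```

Every word-sized access in the sketch lands on an aligned address, both
the stores to dest and the loads from src, which is the property that
avoids the trap-and-emulate path on hardware without misaligned access
support.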

It looks like this patch remains unapplied.

This patch fixes a boot failure of U-Boot SPL on the SiFive Unleashed
board. The SPL was built from the latest U-Boot sources, which recently
imported the assembly versions of mem* from the Linux kernel.
The misaligned load happens in the original memmove() implementation,
which does not handle alignment correctly. With this patch, the U-Boot
SPL boots again.
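
To make the alignment problem concrete, here is a small C sketch of the
two checks as the quoted commit message describes them. It is an
illustration only, not the code from memmove.S: the function names are
hypothetical, and SZ_REG is defined locally as an assumed stand-in for
the machine word size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_REG sizeof(uintptr_t)   /* assumed: bytes per machine word */

/* Length-only check: what the quoted message says the old memmove did. */
static bool word_copy_ok_length_only(const void *dest, const void *src,
				     size_t n)
{
	(void)dest;
	(void)src;
	return n % SZ_REG == 0;
}

/* Check that also requires both pointers to be word-aligned. */
static bool word_copy_ok_aligned(const void *dest, const void *src,
				 size_t n)
{
	return n % SZ_REG == 0 &&
	       (uintptr_t)dest % SZ_REG == 0 &&
	       (uintptr_t)src % SZ_REG == 0;
}
```

With a length that is a multiple of SZ_REG but a pointer that is off by
one byte, the first check still selects the word-wise path, so every
word access is misaligned. The second check rejects that case.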

Tested-by: Bin Meng <bmeng.cn@gmail.com>

Regards,
Bin

Thread overview: 16+ messages
2021-02-16 22:55 [PATCH] riscv: fix memmove and optimise memcpy when misalign Gary Guo
2021-02-16 22:55 ` Gary Guo
2021-05-13  8:13 ` Bin Meng [this message]
2021-05-13  8:13   ` Bin Meng
2021-05-22 22:22 ` Gary Guo
2021-05-22 22:22   ` Gary Guo
2021-05-23  1:47   ` Palmer Dabbelt
2021-05-23  1:47     ` Palmer Dabbelt
2021-05-23 17:12     ` David Laight
2021-05-23 17:12       ` David Laight
2021-05-25 14:34       ` Gary Guo
2021-05-25 14:34         ` Gary Guo
2021-06-15 13:40         ` Bin Meng
2021-06-15 13:40           ` Bin Meng
2021-06-15 14:08           ` David Laight
2021-06-15 14:08             ` David Laight