linux-kernel.vger.kernel.org archive mirror
From: Akira Tsukamoto <akira.tsukamoto@gmail.com>
To: Andreas Schwab <schwab@linux-m68k.org>,
	Palmer Dabbelt <palmer@dabbelt.com>
Cc: akira.tsukamoto@gmail.com,
	Paul Walmsley <paul.walmsley@sifive.com>,
	linux@roeck-us.net, geert@linux-m68k.org,
	qiuwenbo@kylinos.com.cn, aou@eecs.berkeley.edu,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] riscv: __asm_copy_to-from_user: Improve using word copy if size < 9*SZREG
Date: Fri, 20 Aug 2021 15:42:20 +0900
Message-ID: <ebfbcb26-5f17-2cc2-4845-6cdb24326338@gmail.com>
In-Reply-To: <87zgthjjun.fsf@igel.home>

Hi Andreas,

On 8/17/2021 4:00 AM, Andreas Schwab wrote:
> On Aug 16 2021, Palmer Dabbelt wrote:
> 
>> On Fri, 30 Jul 2021 06:52:44 PDT (-0700), akira.tsukamoto@gmail.com wrote:
>>> Reduce the number of slow byte_copy when the size is in between
>>> 2*SZREG to 9*SZREG by using none unrolled word_copy.
>>>
>>> Without it any size smaller than 9*SZREG will be using slow byte_copy
>>> instead of none unrolled word_copy.
>>>
>>> Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com>
>>> ---
>>>  arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++----
>>>  1 file changed, 42 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
>>> index 63bc691cff91..6a80d5517afc 100644
>>> --- a/arch/riscv/lib/uaccess.S
>>> +++ b/arch/riscv/lib/uaccess.S
>>> @@ -34,8 +34,10 @@ ENTRY(__asm_copy_from_user)
>>>  	/*
>>>  	 * Use byte copy only if too small.
>>>  	 * SZREG holds 4 for RV32 and 8 for RV64
>>> +	 * a3 - 2*SZREG is minimum size for word_copy
>>> +	 *      1*SZREG for aligning dst + 1*SZREG for word_copy
>>>  	 */
>>> -	li	a3, 9*SZREG /* size must be larger than size in word_copy */
>>> +	li	a3, 2*SZREG
>>>  	bltu	a2, a3, .Lbyte_copy_tail
>>>
>>>  	/*
>>> @@ -66,9 +68,40 @@ ENTRY(__asm_copy_from_user)
>>>  	andi	a3, a1, SZREG-1
>>>  	bnez	a3, .Lshift_copy
>>>
>>> +.Lcheck_size_bulk:
>>> +	/*
>>> +	 * Evaluate the size if possible to use unrolled.
>>> +	 * The word_copy_unlrolled requires larger than 8*SZREG
>>> +	 */
>>> +	li	a3, 8*SZREG
>>> +	add	a4, a0, a3
>>> +	bltu	a4, t0, .Lword_copy_unlrolled
>>> +
>>>  .Lword_copy:
>>> -        /*
>>> -	 * Both src and dst are aligned, unrolled word copy
>>> +	/*
>>> +	 * Both src and dst are aligned
>>> +	 * None unrolled word copy with every 1*SZREG iteration
>>> +	 *
>>> +	 * a0 - start of aligned dst
>>> +	 * a1 - start of aligned src
>>> +	 * t0 - end of aligned dst
>>> +	 */
>>> +	bgeu	a0, t0, .Lbyte_copy_tail /* check if end of copy */
>>> +	addi	t0, t0, -(SZREG) /* not to over run */
>>> +1:
>>> +	REG_L	a5, 0(a1)
>>> +	addi	a1, a1, SZREG
>>> +	REG_S	a5, 0(a0)
>>> +	addi	a0, a0, SZREG
>>> +	bltu	a0, t0, 1b
>>> +
>>> +	addi	t0, t0, SZREG /* revert to original value */
>>> +	j	.Lbyte_copy_tail
>>> +
>>> +.Lword_copy_unlrolled:
>>> +	/*
>>> +	 * Both src and dst are aligned
>>> +	 * Unrolled word copy with every 8*SZREG iteration
>>>  	 *
>>>  	 * a0 - start of aligned dst
>>>  	 * a1 - start of aligned src
>>> @@ -97,7 +130,12 @@ ENTRY(__asm_copy_from_user)
>>>  	bltu	a0, t0, 2b
>>>
>>>  	addi	t0, t0, 8*SZREG /* revert to original value */
>>> -	j	.Lbyte_copy_tail
>>> +
>>> +	/*
>>> +	 * Remaining might large enough for word_copy to reduce slow byte
>>> +	 * copy
>>> +	 */
>>> +	j	.Lcheck_size_bulk
>>>
>>>  .Lshift_copy:
>>
>> I'm still not convinced that going all the way to such a large unrolling
>> factor is a net win, but this at least provides a much smoother cost 
>> curve.
>>
>> That said, this is causing my 32-bit configs to hang.
> 
> It's missing fixups for the loads in the loop.
> 
> diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
> index a835df6bd68f..12ed1f76bd1f 100644
> --- a/arch/riscv/lib/uaccess.S
> +++ b/arch/riscv/lib/uaccess.S
> @@ -89,9 +89,9 @@ ENTRY(__asm_copy_from_user)
>  	bgeu	a0, t0, .Lbyte_copy_tail /* check if end of copy */
>  	addi	t0, t0, -(SZREG) /* not to over run */
>  1:
> -	REG_L	a5, 0(a1)
> +	fixup REG_L	a5, 0(a1), 10f
>  	addi	a1, a1, SZREG
> -	REG_S	a5, 0(a0)
> +	fixup REG_S	a5, 0(a0), 10f
>  	addi	a0, a0, SZREG
>  	bltu	a0, t0, 1b

Thanks, our messages crossed.
I made the same changes after Qiu's comment, and I am contacting him so
that I can also try it on my side and confirm whether any other changes
are required.

Please give me a little more time.

Akira
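
For readers following the size thresholds in the quoted patch, here is a
minimal user-space sketch in C of the dispatch it describes: byte copy when
the size is below 2*SZREG, the 8-word unrolled loop while more than 8*SZREG
remains, and the one-word loop for what is left before the byte tail. This is
an illustrative model only, not a translation of uaccess.S. The names
copy_model and copy_stats are hypothetical, SZREG is taken from the host word
size, the loop bounds are the model's own, the .Lshift_copy path for a
misaligned src is replaced by a plain byte copy, and there is no fault
handling (the fixup entries discussed above have no equivalent here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the kernel's SZREG (4 on RV32, 8 on RV64): host word size. */
#define SZREG sizeof(uintptr_t)

/* Hypothetical counters: how many bytes each path handled. */
struct copy_stats {
	size_t byte_bytes;
	size_t word_bytes;
	size_t unrolled_bytes;
};

/* Simplified model of the size dispatch; assumes non-overlapping buffers. */
static void copy_model(unsigned char *dst, const unsigned char *src,
		       size_t n, struct copy_stats *st)
{
	unsigned char *end = dst + n;

	assert(st != NULL);
	memset(st, 0, sizeof(*st));

	if (n >= 2 * SZREG) {
		/* Byte copy until dst is word aligned (at most SZREG-1 bytes). */
		while ((uintptr_t)dst % SZREG != 0) {
			*dst++ = *src++;
			st->byte_bytes++;
		}

		if ((uintptr_t)src % SZREG == 0) {
			/* More than 8*SZREG left: eight words per iteration. */
			while ((size_t)(end - dst) > 8 * SZREG) {
				memcpy(dst, src, 8 * SZREG);
				dst += 8 * SZREG;
				src += 8 * SZREG;
				st->unrolled_bytes += 8 * SZREG;
			}
			/* Non-unrolled loop: one word per iteration. */
			while ((size_t)(end - dst) >= SZREG) {
				memcpy(dst, src, SZREG);
				dst += SZREG;
				src += SZREG;
				st->word_bytes += SZREG;
			}
		}
		/* Misaligned src: the kernel uses .Lshift_copy; omitted here. */
	}

	/* Byte tail for whatever remains. */
	while (dst < end) {
		*dst++ = *src++;
		st->byte_bytes++;
	}
}
```

With both buffers word aligned, a copy of 3*SZREG + 1 bytes in this model
moves three words in the one-word loop and a single tail byte, where a
9*SZREG threshold would have sent the whole request to the byte loop.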
