All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Dmitry Vyukov <dvyukov@google.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Alexander Potapenko <glider@google.com>,
	Marco Elver <elver@google.com>,
	Evgenii Stepanov <eugenis@google.com>,
	Branislav Rankov <Branislav.Rankov@arm.com>,
	Andrey Konovalov <andreyknvl@google.com>
Subject: Re: [PATCH v3 4/4] arm64: mte: Optimize mte_assign_mem_tag_range()
Date: Sat, 16 Jan 2021 14:22:08 +0000	[thread overview]
Message-ID: <4b1a5cdf-e1bf-3a7e-593f-0089cedbbc03@arm.com> (raw)
In-Reply-To: <20210115154520.GD44111@C02TD0UTHF1T.local>

Hi Mark,

On 1/15/21 3:45 PM, Mark Rutland wrote:
> On Fri, Jan 15, 2021 at 12:00:43PM +0000, Vincenzo Frascino wrote:
>> mte_assign_mem_tag_range() is called on production KASAN HW hot
>> paths. It makes sense to optimize it in an attempt to reduce the
>> overhead.
>>
>> Optimize mte_assign_mem_tag_range() based on the indications provided at
>> [1].
> 
> ... what exactly is the optimization?
> 
> I /think/ you're just trying to have it inlined, but you should mention
> that explicitly.
> 

Good point, I will change it in the next version. I used "Optimize" as a
continuation of the topic in the previous thread but you are right it is not
immediately obvious.

>>
>> [1] https://lore.kernel.org/r/CAAeHK+wCO+J7D1_T89DG+jJrPLk3X9RsGFKxJGd0ZcUFjQT-9Q@mail.gmail.com/
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>> ---
>>  arch/arm64/include/asm/mte.h | 26 +++++++++++++++++++++++++-
>>  arch/arm64/lib/mte.S         | 15 ---------------
>>  2 files changed, 25 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
>> index 1a715963d909..9730f2b07b79 100644
>> --- a/arch/arm64/include/asm/mte.h
>> +++ b/arch/arm64/include/asm/mte.h
>> @@ -49,7 +49,31 @@ long get_mte_ctrl(struct task_struct *task);
>>  int mte_ptrace_copy_tags(struct task_struct *child, long request,
>>  			 unsigned long addr, unsigned long data);
>>  
>> -void mte_assign_mem_tag_range(void *addr, size_t size);
>> +static inline void mte_assign_mem_tag_range(void *addr, size_t size)
>> +{
>> +	u64 _addr = (u64)addr;
>> +	u64 _end = _addr + size;
>> +
>> +	/*
>> +	 * This function must be invoked from an MTE enabled context.
>> +	 *
>> +	 * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> +	 * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> +	 */
>> +	do {
>> +		/*
>> +		 * 'asm volatile' is required to prevent the compiler to move
>> +		 * the statement outside of the loop.
>> +		 */
>> +		asm volatile(__MTE_PREAMBLE "stg %0, [%0]"
>> +			     :
>> +			     : "r" (_addr)
>> +			     : "memory");
>> +
>> +		_addr += MTE_GRANULE_SIZE;
>> +	} while (_addr < _end);
> 
> Is there any chance that this can be used for the last bytes of the
> virtual address space? This might need to change to `_addr == _end` if
> that is possible, otherwise it'll terminate early in that case.
> 

Theoretically it is a possibility. I will change the condition and add a note
for that.

>> +}
> 
> What does the code generation look like for this, relative to the
> assembly version?
> 

The assembly looks like this:

 390:   8b000022        add     x2, x1, x0
 394:   aa0003e1        mov     x1, x0
 398:   d9200821        stg     x1, [x1]
 39c:   91004021        add     x1, x1, #0x10
 3a0:   eb01005f        cmp     x2, x1
 3a4:   54ffffa8        b.hi    398 <mte_set_mem_tag_range+0x48>

You can see the handcrafted one below.

> Thanks,
> Mark.
> 
>> +
>>  
>>  #else /* CONFIG_ARM64_MTE */
>>  
>> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
>> index 9e1a12e10053..a0a650451510 100644
>> --- a/arch/arm64/lib/mte.S
>> +++ b/arch/arm64/lib/mte.S
>> @@ -150,18 +150,3 @@ SYM_FUNC_START(mte_restore_page_tags)
>>  	ret
>>  SYM_FUNC_END(mte_restore_page_tags)
>>  
>> -/*
>> - * Assign allocation tags for a region of memory based on the pointer tag
>> - *   x0 - source pointer
>> - *   x1 - size
>> - *
>> - * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> - * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> - */
>> -SYM_FUNC_START(mte_assign_mem_tag_range)
>> -1:	stg	x0, [x0]
>> -	add	x0, x0, #MTE_GRANULE_SIZE
>> -	subs	x1, x1, #MTE_GRANULE_SIZE
>> -	b.gt	1b
>> -	ret
>> -SYM_FUNC_END(mte_assign_mem_tag_range)
>> -- 
>> 2.30.0
>>

-- 
Regards,
Vincenzo

WARNING: multiple messages have this Message-ID (diff)
From: Vincenzo Frascino <vincenzo.frascino@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>,
	Marco Elver <elver@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Evgenii Stepanov <eugenis@google.com>,
	linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
	Alexander Potapenko <glider@google.com>,
	linux-arm-kernel@lists.infradead.org,
	Andrey Konovalov <andreyknvl@google.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Will Deacon <will@kernel.org>, Dmitry Vyukov <dvyukov@google.com>
Subject: Re: [PATCH v3 4/4] arm64: mte: Optimize mte_assign_mem_tag_range()
Date: Sat, 16 Jan 2021 14:22:08 +0000	[thread overview]
Message-ID: <4b1a5cdf-e1bf-3a7e-593f-0089cedbbc03@arm.com> (raw)
In-Reply-To: <20210115154520.GD44111@C02TD0UTHF1T.local>

Hi Mark,

On 1/15/21 3:45 PM, Mark Rutland wrote:
> On Fri, Jan 15, 2021 at 12:00:43PM +0000, Vincenzo Frascino wrote:
>> mte_assign_mem_tag_range() is called on production KASAN HW hot
>> paths. It makes sense to optimize it in an attempt to reduce the
>> overhead.
>>
>> Optimize mte_assign_mem_tag_range() based on the indications provided at
>> [1].
> 
> ... what exactly is the optimization?
> 
> I /think/ you're just trying to have it inlined, but you should mention
> that explicitly.
> 

Good point, I will change it in the next version. I used "Optimize" as a
continuation of the topic in the previous thread but you are right it is not
immediately obvious.

>>
>> [1] https://lore.kernel.org/r/CAAeHK+wCO+J7D1_T89DG+jJrPLk3X9RsGFKxJGd0ZcUFjQT-9Q@mail.gmail.com/
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>> ---
>>  arch/arm64/include/asm/mte.h | 26 +++++++++++++++++++++++++-
>>  arch/arm64/lib/mte.S         | 15 ---------------
>>  2 files changed, 25 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
>> index 1a715963d909..9730f2b07b79 100644
>> --- a/arch/arm64/include/asm/mte.h
>> +++ b/arch/arm64/include/asm/mte.h
>> @@ -49,7 +49,31 @@ long get_mte_ctrl(struct task_struct *task);
>>  int mte_ptrace_copy_tags(struct task_struct *child, long request,
>>  			 unsigned long addr, unsigned long data);
>>  
>> -void mte_assign_mem_tag_range(void *addr, size_t size);
>> +static inline void mte_assign_mem_tag_range(void *addr, size_t size)
>> +{
>> +	u64 _addr = (u64)addr;
>> +	u64 _end = _addr + size;
>> +
>> +	/*
>> +	 * This function must be invoked from an MTE enabled context.
>> +	 *
>> +	 * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> +	 * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> +	 */
>> +	do {
>> +		/*
>> +		 * 'asm volatile' is required to prevent the compiler to move
>> +		 * the statement outside of the loop.
>> +		 */
>> +		asm volatile(__MTE_PREAMBLE "stg %0, [%0]"
>> +			     :
>> +			     : "r" (_addr)
>> +			     : "memory");
>> +
>> +		_addr += MTE_GRANULE_SIZE;
>> +	} while (_addr < _end);
> 
> Is there any chance that this can be used for the last bytes of the
> virtual address space? This might need to change to `_addr == _end` if
> that is possible, otherwise it'll terminate early in that case.
> 

Theoretically it is a possibility. I will change the condition and add a note
for that.

>> +}
> 
> What does the code generation look like for this, relative to the
> assembly version?
> 

The assembly looks like this:

 390:   8b000022        add     x2, x1, x0
 394:   aa0003e1        mov     x1, x0
 398:   d9200821        stg     x1, [x1]
 39c:   91004021        add     x1, x1, #0x10
 3a0:   eb01005f        cmp     x2, x1
 3a4:   54ffffa8        b.hi    398 <mte_set_mem_tag_range+0x48>

You can see the handcrafted one below.

> Thanks,
> Mark.
> 
>> +
>>  
>>  #else /* CONFIG_ARM64_MTE */
>>  
>> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
>> index 9e1a12e10053..a0a650451510 100644
>> --- a/arch/arm64/lib/mte.S
>> +++ b/arch/arm64/lib/mte.S
>> @@ -150,18 +150,3 @@ SYM_FUNC_START(mte_restore_page_tags)
>>  	ret
>>  SYM_FUNC_END(mte_restore_page_tags)
>>  
>> -/*
>> - * Assign allocation tags for a region of memory based on the pointer tag
>> - *   x0 - source pointer
>> - *   x1 - size
>> - *
>> - * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
>> - * size must be non-zero and MTE_GRANULE_SIZE aligned.
>> - */
>> -SYM_FUNC_START(mte_assign_mem_tag_range)
>> -1:	stg	x0, [x0]
>> -	add	x0, x0, #MTE_GRANULE_SIZE
>> -	subs	x1, x1, #MTE_GRANULE_SIZE
>> -	b.gt	1b
>> -	ret
>> -SYM_FUNC_END(mte_assign_mem_tag_range)
>> -- 
>> 2.30.0
>>

-- 
Regards,
Vincenzo

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-01-16 17:23 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-15 12:00 [PATCH v3 0/4] arm64: ARMv8.5-A: MTE: Add async mode support Vincenzo Frascino
2021-01-15 12:00 ` Vincenzo Frascino
2021-01-15 12:00 ` [PATCH v3 1/4] kasan, arm64: Add KASAN light mode Vincenzo Frascino
2021-01-15 12:00   ` Vincenzo Frascino
2021-01-15 15:08   ` Mark Rutland
2021-01-15 15:08     ` Mark Rutland
2021-01-16 13:47     ` Vincenzo Frascino
2021-01-16 13:47       ` Vincenzo Frascino
2021-01-16 14:09       ` Andrey Konovalov
2021-01-16 14:09         ` Andrey Konovalov
2021-01-18 10:24       ` Mark Rutland
2021-01-18 10:24         ` Mark Rutland
2021-01-15 18:59   ` Andrey Konovalov
2021-01-15 18:59     ` Andrey Konovalov
2021-01-16 13:40     ` Vincenzo Frascino
2021-01-16 13:40       ` Vincenzo Frascino
2021-01-16 13:59       ` Andrey Konovalov
2021-01-16 13:59         ` Andrey Konovalov
2021-01-16 14:06         ` Vincenzo Frascino
2021-01-16 14:06           ` Vincenzo Frascino
2021-01-15 12:00 ` [PATCH v3 2/4] arm64: mte: Add asynchronous mode support Vincenzo Frascino
2021-01-15 12:00   ` Vincenzo Frascino
2021-01-15 15:13   ` Mark Rutland
2021-01-15 15:13     ` Mark Rutland
2021-01-16 13:49     ` Vincenzo Frascino
2021-01-16 13:49       ` Vincenzo Frascino
2021-01-15 12:00 ` [PATCH v3 3/4] arm64: mte: Enable async tag check fault Vincenzo Frascino
2021-01-15 12:00   ` Vincenzo Frascino
2021-01-15 15:37   ` Mark Rutland
2021-01-15 15:37     ` Mark Rutland
2021-01-18 12:57   ` Catalin Marinas
2021-01-18 12:57     ` Catalin Marinas
2021-01-18 13:37     ` Vincenzo Frascino
2021-01-18 13:37       ` Vincenzo Frascino
2021-01-18 14:14       ` Mark Rutland
2021-01-18 14:14         ` Mark Rutland
2021-01-18 14:48         ` Vincenzo Frascino
2021-01-18 14:48           ` Vincenzo Frascino
2021-01-18 15:39           ` Vincenzo Frascino
2021-01-18 15:39             ` Vincenzo Frascino
2021-01-18 15:40       ` Vincenzo Frascino
2021-01-18 15:40         ` Vincenzo Frascino
2021-01-15 12:00 ` [PATCH v3 4/4] arm64: mte: Optimize mte_assign_mem_tag_range() Vincenzo Frascino
2021-01-15 12:00   ` Vincenzo Frascino
2021-01-15 15:45   ` Mark Rutland
2021-01-15 15:45     ` Mark Rutland
2021-01-16 14:22     ` Vincenzo Frascino [this message]
2021-01-16 14:22       ` Vincenzo Frascino
2021-01-17 12:27       ` Vincenzo Frascino
2021-01-17 12:27         ` Vincenzo Frascino
2021-01-18 10:41         ` Mark Rutland
2021-01-18 10:41           ` Mark Rutland
2021-01-18 11:00           ` Vincenzo Frascino
2021-01-18 11:00             ` Vincenzo Frascino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4b1a5cdf-e1bf-3a7e-593f-0089cedbbc03@arm.com \
    --to=vincenzo.frascino@arm.com \
    --cc=Branislav.Rankov@arm.com \
    --cc=andreyknvl@google.com \
    --cc=aryabinin@virtuozzo.com \
    --cc=catalin.marinas@arm.com \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=eugenis@google.com \
    --cc=glider@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.