From: "liuyuntao (F)" <liuyuntao12@huawei.com>
To: Arnd Bergmann <arnd@arndb.de>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Fangrui Song <maskray@google.com>
Cc: Russell King <linux@armlinux.org.uk>, Andrew Davis <afd@ti.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Jonathan Corbet <corbet@lwn.net>, Mike Rapoport <rppt@kernel.org>,
	Rob Herring <robh@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Walleij <linus.walleij@linaro.org>, <llvm@lists.linux.dev>
Subject: Re: [PATCH-next v2] arm32: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
Date: Sat, 9 Mar 2024 21:24:26 +0800
Message-ID: <346e15e5-49e9-4a7f-b163-c3316225baab@huawei.com>
In-Reply-To: <2a90581c-f1df-4d6b-8f0b-8e7cbf150ed9@app.fastmail.com>



On 2024/3/9 16:20, Arnd Bergmann wrote:
> On Sat, Mar 9, 2024, at 07:14, liuyuntao (F) wrote:
>> On 2024/3/8 21:15, Arnd Bergmann wrote:
>>> On Thu, Mar 7, 2024, at 16:12, Yuntao Liu wrote:
>>
>> Thanks for the tests, CONFIG_LD_DEAD_CODE_DATA_ELIMINATION and
>> CONFIG_TRIM_UNUSED_KSYMS do indeed result in a significant improvement.
>> I found that arm32 still doesn't support CONFIG_LTO_CLANG. I've done
>> some work on it, but without success. I'd like to learn more about the
>> CONFIG_LTO_CLANG patch. Do you have any relevant links?
> 
> I did not try to get it to boot and gave up when I did not see
> any size improvement. I think there were previous attempts to
> do it elsewhere, which I did not try to find.
> 

I tested this patch. The size improvement was tiny, between roughly 2
and 18 parts per ten thousand depending on the configuration; the
compilation time increased by about a quarter; and the kernel did not
boot.

Strangely, LTO actually increased the compilation time significantly,
which seems contrary to its purpose.

Size:

             base       +trim      +dce       +trim+dce
no lto       5995384    5858720    5841024    5299032
lto          5990040    5854544    5839992    5289576
shrink       8.9‱       7.1‱       1.7‱       17.8‱


Compilation time:

             base       +trim      +dce       +trim+dce
no lto       34.616     33.03      36.093     32.211
lto          46.881     45.324     47.247     43.246
increase     26.20%     27.10%     23.60%     25.50%

(increase = (lto - no lto) / lto)



> The patch below makes it build, but it still requires disabling
> CONFIG_THUMB2_KERNEL, which totally defeats the purpose of shrinking
> the kernel as it adds some 40% size overhead in the vmlinux.
> There are probably also runtime bugs that get introduced by this.
> 
>       Arnd
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index de78ceb821df..7ebfda4839e8 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -2,6 +2,8 @@
>   config ARM
>   	bool
>   	default y
> +	select ARCH_SUPPORTS_LTO_CLANG
> +	select ARCH_SUPPORTS_LTO_CLANG_THIN
>   	select ARCH_32BIT_OFF_T
>   	select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE if HAVE_KRETPROBES && FRAME_POINTER && !ARM_UNWIND
>   	select ARCH_HAS_BINFMT_FLAT
> diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
> index 726ecabcef09..f2ddce451ab9 100644
> --- a/arch/arm/boot/compressed/Makefile
> +++ b/arch/arm/boot/compressed/Makefile
> @@ -9,6 +9,8 @@ OBJS		=
>   
>   HEAD	= head.o
>   OBJS	+= misc.o decompress.o
> +CFLAGS_REMOVE_misc.o += $(CC_FLAGS_LTO)
> +CFLAGS_REMOVE_decompress.o += $(CC_FLAGS_LTO)

Wow, I've encountered this issue before and didn't think to solve it
this way. You really have a thorough understanding of these parameters.

On a side note, if CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is enabled but
only a few rodata sections are removed and no functions are eliminated,
are there any compiler or linker options that can control this behavior?

Thanks.

>   ifeq ($(CONFIG_DEBUG_UNCOMPRESS),y)
>   OBJS	+= debug.o
>   AFLAGS_head.o += -DDEBUG
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index d19d140a10c7..aee9e13023a8 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -38,15 +38,14 @@ EXPORT_SYMBOL(arm_heavy_mb);
>   static void flush_pfn_alias(unsigned long pfn, unsigned long vaddr)
>   {
>   	unsigned long to = FLUSH_ALIAS_START + (CACHE_COLOUR(vaddr) << PAGE_SHIFT);
> -	const int zero = 0;
>   
>   	set_top_pte(to, pfn_pte(pfn, PAGE_KERNEL));
>   
> -	asm(	"mcrr	p15, 0, %1, %0, c14\n"
> -	"	mcr	p15, 0, %2, c7, c10, 4"
> +	asm("mcrr	p15, 0, %1, %0, c14"
>   	    :
> -	    : "r" (to), "r" (to + PAGE_SIZE - 1), "r" (zero)
> +	    : "r" (to), "r" (to + PAGE_SIZE - 1)
>   	    : "cc");
> +	dsb();
>   }
>   
>   static void flush_icache_alias(unsigned long pfn, unsigned long vaddr, unsigned long len)
> @@ -68,11 +67,11 @@ void flush_cache_mm(struct mm_struct *mm)
>   	}
>   
>   	if (cache_is_vipt_aliasing()) {
> -		asm(	"mcr	p15, 0, %0, c7, c14, 0\n"
> -		"	mcr	p15, 0, %0, c7, c10, 4"
> +		asm("mcr	p15, 0, %0, c7, c14, 0"
>   		    :
>   		    : "r" (0)
>   		    : "cc");
> +		dsb();
>   	}
>   }
>   
> @@ -84,11 +83,11 @@ void flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned
>   	}
>   
>   	if (cache_is_vipt_aliasing()) {
> -		asm(	"mcr	p15, 0, %0, c7, c14, 0\n"
> -		"	mcr	p15, 0, %0, c7, c10, 4"
> +		asm("mcr	p15, 0, %0, c7, c14, 0"
>   		    :
>   		    : "r" (0)
>   		    : "cc");
> +		dsb();
>   	}
>   
>   	if (vma->vm_flags & VM_EXEC)



