From: Harry Wentland <harry.wentland@amd.com>
To: Nathan Chancellor <nathan@kernel.org>,
	Arnd Bergmann <arnd@kernel.org>,
	"Siqueira, Rodrigo" <Rodrigo.Siqueira@amd.com>
Cc: clang-built-linux <llvm@lists.linux.dev>,
	"David Airlie" <airlied@linux.ie>,
	"Pan, Xinhui" <Xinhui.Pan@amd.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
	"Christian König" <christian.koenig@amd.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Subject: Re: mainline build failure for x86_64 allmodconfig with clang
Date: Fri, 5 Aug 2022 11:32:33 -0400
Message-ID: <9fb73284-7572-5703-93d3-f83a43535baf@amd.com>
In-Reply-To: <YuwvfsztWaHvquwC@dev-arch.thelio-3990X>



On 2022-08-04 16:43, Nathan Chancellor wrote:
> On Thu, Aug 04, 2022 at 09:24:41PM +0200, Arnd Bergmann wrote:
>> On Thu, Aug 4, 2022 at 8:52 PM Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> On Thu, Aug 4, 2022 at 11:37 AM Sudip Mukherjee (Codethink)
>>> <sudipm.mukherjee@gmail.com> wrote:
>>>>
>>>> git bisect points to 3876a8b5e241 ("drm/amd/display: Enable building new display engine with KCOV enabled").
>>>
>>> Ahh. So that was presumably why it was disabled before - because it
>>> presumably does disgusting things that make KCOV generate even bigger
>>> stack frames than it already has.
>>>
>>> Those functions do seem to have fairly big stack footprints already (I
>>> didn't try to look into why, I assume it's partly due to aggressive
>>> inlining, and probably some automatic structures on stack). But gcc
>>> doesn't seem to make it all that much worse with KCOV (and my clang
>>> build doesn't enable KCOV).
>>>
>>> So it's presumably some KCOV-vs-clang thing. Nathan?
> 
> Looks like Arnd beat me to it :)
> 
>> The dependency was originally added to avoid a link failure in 9d1d02ff3678
>>  ("drm/amd/display: Don't build DCN1 when kcov is enabled") after I reported the
>> problem in https://lists.freedesktop.org/archives/dri-devel/2018-August/186131.html
>>
>> The commit from the bisection just turns off KCOV for the entire directory
>> to avoid the link failure, so it's not actually a problem with KCOV vs clang,
>> but I think a problem with clang vs badly written code that was obscured
>> in allmodconfig builds prior to this.
> 
> Right, I do think the sanitizers make things worse here too, as those get
> enabled with allmodconfig. I ran some really quick tests with allmodconfig and
> a few instrumentation options flipped on/off:
> 
> allmodconfig (CONFIG_KASAN=y, CONFIG_KCSAN=n, CONFIG_KCOV=y, and CONFIG_UBSAN=y):
> 
> warning: stack frame size (2216) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2184) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2176) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> 
> allmodconfig + CONFIG_KASAN=n:
> 
> warning: stack frame size (2112) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> 
> allmodconfig + CONFIG_KCOV=n:
> 
> warning: stack frame size (2216) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2184) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2176) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> 
> allmodconfig + CONFIG_UBSAN=n:
> 
> warning: stack frame size (2584) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2680) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2352) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> 
> allmodconfig + CONFIG_KASAN=n + CONFIG_KCSAN=y + CONFIG_UBSAN=n:
> 
> warning: stack frame size (2504) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2600) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> warning: stack frame size (2264) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> 
> allmodconfig + CONFIG_KASAN=n + CONFIG_KCSAN=n + CONFIG_UBSAN=n:
> 
> warning: stack frame size (2072) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
> 
> There might be other debugging configurations that make this worse too,
> as I don't see those warnings on my distribution configuration.
> 
>> The dml30_ModeSupportAndSystemConfigurationFull() function exercises
>> a few paths in the compiler that are otherwise rare. One thing it does is to
>> pass up to 60 arguments to other functions, and it heavily uses float and
>> double variables. Both of these make it rather fragile when it comes to
>> unusual compiler options, so the files keep coming up whenever a new
>> instrumentation feature gets added. There is probably some other flag
>> in allmodconfig that we can disable to improve this again, but I have not
>> checked this time.
> 
> I do notice that these files build with a non-configurable
> -Wframe-larger-than value:
> 
> $ rg frame_warn_flag drivers/gpu/drm/amd/display/dc/dml/Makefile
> 54:frame_warn_flag := -Wframe-larger-than=2048

Tbh, I was looking at the history and I can't find a good reason this
was added. It should be safe to drop this. I would much rather use
the CONFIG_FRAME_WARN value than override it.

AFAIK most builds use 2048 by default anyways.

> 70:CFLAGS_$(AMDDALPATH)/dc/dml/dcn30/display_mode_vba_30.o := $(dml_ccflags) $(frame_warn_flag)
> 72:CFLAGS_$(AMDDALPATH)/dc/dml/dcn31/display_mode_vba_31.o := $(dml_ccflags) $(frame_warn_flag)
> 76:CFLAGS_$(AMDDALPATH)/dc/dml/dcn32/display_mode_vba_32.o := $(dml_ccflags) $(frame_warn_flag)
> 
> I suppose that could just be bumped as a quick workaround? Two of those
> files have a comment that implies modifying them in non-trivial ways is
> not recommended.
> 
> /*
>  * NOTE:
>  *   This file is gcc-parsable HW gospel, coming straight from HW engineers.
>  *
>  * It doesn't adhere to Linux kernel style and sometimes will do things in odd
>  * ways. Unless there is something clearly wrong with it the code should
>  * remain as-is as it provides us with a guarantee from HW that it is correct.
>  */
> 
> I do note that commit 1b54a0121dba ("drm/amd/display: Reduce stack size
> in the mode support function") did have a workaround for GCC. It appears
> clang will still inline mode_support_configuration(). If I mark it as
> 'noinline', the warning disappears in that file.
> 

That'd be the best quick fix. I guess if we split out functions to fix
stack usage we should mark them as 'noinline' in the future to avoid
aggressive compiler optimizations.
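
A minimal sketch of that idea, not code from the driver: a helper that
owns the large temporaries is marked so the compiler cannot fold its
frame back into the caller. Everything named here (noinline_sketch,
struct scratch, sum_scaled(), mode_support_sketch()) is hypothetical and
only for illustration; kernel code would use the kernel's own 'noinline'
annotation rather than spelling out the attribute.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's 'noinline' annotation. */
#define noinline_sketch __attribute__((__noinline__))

#define SCRATCH_LEN 256

/* 256 doubles = 2048 bytes of temporaries (assuming 8-byte doubles). */
struct scratch {
	double v[SCRATCH_LEN];
};

/*
 * The large buffer lives in this helper's frame. Without the attribute,
 * the compiler is free to inline the helper, and the buffer would then
 * count against the caller's stack frame instead.
 */
static noinline_sketch double sum_scaled(const double *in, size_t n, double k)
{
	struct scratch s;
	double total = 0.0;
	size_t i;

	if (n > SCRATCH_LEN)
		n = SCRATCH_LEN;

	for (i = 0; i < n; i++)
		s.v[i] = in[i] * k;
	for (i = 0; i < n; i++)
		total += s.v[i];

	return total;
}

/*
 * The caller keeps only small locals. Each call to the helper sets up
 * and tears down its own frame, so the buffers are never live at the
 * same time in one frame.
 */
double mode_support_sketch(const double *in, size_t n)
{
	return sum_scaled(in, n, 1.0) + sum_scaled(in, n, 2.0);
}
```

The trade-off is a real call per helper invocation in exchange for a
bounded per-function frame, which is what -Wframe-larger-than checks.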

Harry

> Cheers,
> Nathan

