linux-kernel.vger.kernel.org archive mirror
* [PATCH] numa: mark __next_node() as __always_inline to fix section mismatch
@ 2021-12-06 16:17 Alexander Lobakin
  2021-12-06 19:43 ` Nick Desaulniers
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Lobakin @ 2021-12-06 16:17 UTC
  To: Andrew Morton
  Cc: Alexander Lobakin, Arnd Bergmann, Nathan Chancellor,
	Nick Desaulniers, Zhen Lei, linux-kernel, llvm

Clang 13 uninlines __next_node(), which triggers the following modpost
warning because the function is used in init code (amd_numa_init(),
sched_init_numa() etc.):

WARNING: modpost: vmlinux.o(.text+0x927ee): Section mismatch
in reference from the function __next_node() to the variable
.init.data:numa_nodes_parsed
The function __next_node() references
the variable __initdata numa_nodes_parsed.
This is often because __next_node lacks a __initdata
annotation or the annotation of numa_nodes_parsed is wrong.

Mark __next_node() as __always_inline so it won't get uninlined.
bloat-o-meter on x86_64 binaries reports:

scripts/bloat-o-meter -c vmlinux.baseline vmlinux
add/remove: 1/1 grow/shrink: 2/7 up/down: 446/-2166 (-1720)
Function                                     old     new   delta
apply_wqattrs_cleanup                          -     410    +410
amd_numa_init                                814     842     +28
sched_init_numa                             1338    1346      +8
find_next_bit                                 38      19     -19
__next_node                                   45       -     -45
apply_wqattrs_prepare                       1069     799    -270
wq_nice_store                                688     414    -274
wq_numa_store                                805     433    -372
wq_cpumask_store                             789     402    -387
apply_workqueue_attrs                        538     147    -391
workqueue_set_unbound_cpumask                947     539    -408
Total: Before=14422603, After=14420883, chg -0.01%

So it's a win-win: it resolves the section mismatch and
saves some text size (-1.7 KB is quite nice).

Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
 include/linux/nodemask.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 567c3ddba2c4..55ba2c56f39b 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -266,7 +266,7 @@ static inline int __first_node(const nodemask_t *srcp)
 }
 
 #define next_node(n, src) __next_node((n), &(src))
-static inline int __next_node(int n, const nodemask_t *srcp)
+static __always_inline int __next_node(int n, const nodemask_t *srcp)
 {
 	return min_t(int,MAX_NUMNODES,find_next_bit(srcp->bits, MAX_NUMNODES, n+1));
 }
-- 
2.33.1



* Re: [PATCH] numa: mark __next_node() as __always_inline to fix section mismatch
  2021-12-06 16:17 [PATCH] numa: mark __next_node() as __always_inline to fix section mismatch Alexander Lobakin
@ 2021-12-06 19:43 ` Nick Desaulniers
  2021-12-06 20:57   ` Alexander Lobakin
  2021-12-07  0:41   ` Nick Desaulniers
  0 siblings, 2 replies; 5+ messages in thread
From: Nick Desaulniers @ 2021-12-06 19:43 UTC
  To: Alexander Lobakin
  Cc: Andrew Morton, Arnd Bergmann, Nathan Chancellor, Zhen Lei,
	linux-kernel, llvm

Thanks for the patch.  See this thread:
https://github.com/ClangBuiltLinux/linux/issues/1302

There are many more instances of these, depending on the config.  Something like
https://github.com/ClangBuiltLinux/linux/issues/1302#issuecomment-807260475
would be more appropriate for fixing all instances, but I think this
is more an issue with the inline cost model in LLVM.

I need to finish off https://reviews.llvm.org/D111456, and request
that https://reviews.llvm.org/D111272, which landed in clang-14, get
backported to the 13.0.1 release, which should also help.

-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] numa: mark __next_node() as __always_inline to fix section mismatch
  2021-12-06 19:43 ` Nick Desaulniers
@ 2021-12-06 20:57   ` Alexander Lobakin
  2021-12-07  0:41   ` Nick Desaulniers
  1 sibling, 0 replies; 5+ messages in thread
From: Alexander Lobakin @ 2021-12-06 20:57 UTC
  To: Nick Desaulniers
  Cc: Alexander Lobakin, Andrew Morton, Arnd Bergmann,
	Nathan Chancellor, Zhen Lei, linux-kernel, llvm

From: Nick Desaulniers <ndesaulniers@google.com>
Date: Mon, 6 Dec 2021 11:43:47 -0800

> Thanks for the patch.  See this thread:
> https://github.com/ClangBuiltLinux/linux/issues/1302
> 
> There's a lot more instances of these based on config.  Something like
> https://github.com/ClangBuiltLinux/linux/issues/1302#issuecomment-807260475
> would be more appropriate for fixing all instances, but I think this
> is more so an issue with the inline cost model in LLVM.
> 
> I need to finish off https://reviews.llvm.org/D111456, and request
> that https://reviews.llvm.org/D111272 which landed in clang-14 get
> backported to the 13.0.1 release which should also help.

Oh, I see. Sorry for the redundant posting; the patch is not applicable then.
We'll wait for this Clang/LLVM work to be finished, thanks!

Al


* Re: [PATCH] numa: mark __next_node() as __always_inline to fix section mismatch
  2021-12-06 19:43 ` Nick Desaulniers
  2021-12-06 20:57   ` Alexander Lobakin
@ 2021-12-07  0:41   ` Nick Desaulniers
  2021-12-07 11:27     ` Alexander Lobakin
  1 sibling, 1 reply; 5+ messages in thread
From: Nick Desaulniers @ 2021-12-07  0:41 UTC
  To: Alexander Lobakin
  Cc: Andrew Morton, Arnd Bergmann, Nathan Chancellor, Zhen Lei,
	linux-kernel, llvm

On Mon, Dec 6, 2021 at 12:57 PM Alexander Lobakin
<alexandr.lobakin@intel.com> wrote:
>
> Oh, I see. Sorry for the redundant posting; the patch is not applicable then.

No worries; it's a complex issue.  I appreciate that you took the time
to test with clang, understand the issue, and send a patch.
++beers_owed;

I can add you to our GitHub org if you'd like to be
cc'ed on issues there; just ping me privately off-thread with your
GitHub account and I'll add you.
https://github.com/ClangBuiltLinux

-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] numa: mark __next_node() as __always_inline to fix section mismatch
  2021-12-07  0:41   ` Nick Desaulniers
@ 2021-12-07 11:27     ` Alexander Lobakin
  0 siblings, 0 replies; 5+ messages in thread
From: Alexander Lobakin @ 2021-12-07 11:27 UTC
  To: Nick Desaulniers
  Cc: Alexander Lobakin, Andrew Morton, Arnd Bergmann,
	Nathan Chancellor, Zhen Lei, linux-kernel, llvm

From: Nick Desaulniers <ndesaulniers@google.com>
Date: Mon, 6 Dec 2021 16:41:00 -0800

> No worries; it's a complex issue.  I appreciate that you took the time
> to test with clang, understand the issue, and send a patch.
> ++beers_owed;

Cool, thank you! :D Open source beer is shared across all
contributors I guess :P

> If you'd like, I can add you to our github org if you'd like to be
> cc'ed on issues there; just ping me privately off thread with your
> github account and I'll add you.
> https://github.com/ClangBuiltLinux

I think my private (non-Intel) account has already been added to it (it was
probably you who added me after my comments on ClangCFI x86 or so), so
I'll just be watching for it there, thanks!

Al

