linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
@ 2022-05-15 20:36 Jiri Olsa
  2022-05-15 20:36 ` [PATCH bpf-next 2/2] selftests/bpf: Remove filter for unsafe functions in kprobe_multi test Jiri Olsa
                   ` (5 more replies)
  0 siblings, 6 replies; 31+ messages in thread
From: Jiri Olsa @ 2022-05-15 20:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Paul E. McKenney
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt

Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
in rcu 'not watching' context and if there's tracer attached to
them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
warning like:

  [    3.017540] WARNING: suspicious RCU usage
  ...
  [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
  [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
  [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
  [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
  [    3.018371]  fprobe_handler.part.0+0xab/0x150
  [    3.018374]  0xffffffffa00080c8
  [    3.018393]  ? arch_cpu_idle+0x5/0x10
  [    3.018398]  arch_cpu_idle+0x5/0x10
  [    3.018399]  default_idle_call+0x59/0x90
  [    3.018401]  do_idle+0x1c3/0x1d0

The call path is following:

default_idle_call
  rcu_idle_enter
  arch_cpu_idle
  rcu_idle_exit

The arch_cpu_idle and rcu_idle_exit are the only ones from above
path that are traceble and cause this problem on my setup.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/kernel/process.c | 2 +-
 kernel/rcu/tree.c         | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b370767f5b19..1345cb0124a6 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
 /*
  * Called from the generic idle code.
  */
-void arch_cpu_idle(void)
+void noinstr arch_cpu_idle(void)
 {
 	x86_idle();
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a4b8189455d5..20d529722f51 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
  * If you add or remove a call to rcu_idle_exit(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_exit(void)
+void noinstr rcu_idle_exit(void)
 {
 	unsigned long flags;
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH bpf-next 2/2] selftests/bpf: Remove filter for unsafe functions in kprobe_multi test
  2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
@ 2022-05-15 20:36 ` Jiri Olsa
  2022-05-16  4:25 ` [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Paul E. McKenney
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 31+ messages in thread
From: Jiri Olsa @ 2022-05-15 20:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Paul E. McKenney
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt

Removing filter for arch_cpu_idle and rcu_* functions.

Attaching to them was causing RCU warnings [1]:

  [    3.017540] WARNING: suspicious RCU usage
  ...
  [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
  [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
  [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
  [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
  [    3.018371]  fprobe_handler.part.0+0xab/0x150
  [    3.018374]  0xffffffffa00080c8
  [    3.018393]  ? arch_cpu_idle+0x5/0x10
  [    3.018398]  arch_cpu_idle+0x5/0x10
  [    3.018399]  default_idle_call+0x59/0x90
  [    3.018401]  do_idle+0x1c3/0x1d0

With previous fix the functions causing that are no longer attachable.

[1] https://lore.kernel.org/bpf/CAEf4BzYtXWvBWzmadhLGqwf8_e2sruK6999th6c=b=O0WLkHOA@mail.gmail.com/
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 .../selftests/bpf/prog_tests/kprobe_multi_test.c       | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
index 586dc52d6fb9..8bb974c263c3 100644
--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
@@ -354,16 +354,6 @@ static int get_syms(char ***symsp, size_t *cntp)
 			continue;
 		if (sscanf(buf, "%ms$*[^\n]\n", &name) != 1)
 			continue;
-		/*
-		 * We attach to almost all kernel functions and some of them
-		 * will cause 'suspicious RCU usage' when fprobe is attached
-		 * to them. Filter out the current culprits - arch_cpu_idle
-		 * and rcu_* functions.
-		 */
-		if (!strcmp(name, "arch_cpu_idle"))
-			continue;
-		if (!strncmp(name, "rcu_", 4))
-			continue;
 		err = hashmap__add(map, name, NULL);
 		if (err) {
 			free(name);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
  2022-05-15 20:36 ` [PATCH bpf-next 2/2] selftests/bpf: Remove filter for unsafe functions in kprobe_multi test Jiri Olsa
@ 2022-05-16  4:25 ` Paul E. McKenney
  2022-05-16 11:49   ` Frederic Weisbecker
  2022-05-17  2:39 ` kernel test robot
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2022-05-16  4:25 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt

On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> in rcu 'not watching' context and if there's tracer attached to
> them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> warning like:
> 
>   [    3.017540] WARNING: suspicious RCU usage
>   ...
>   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
>   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
>   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
>   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
>   [    3.018371]  fprobe_handler.part.0+0xab/0x150
>   [    3.018374]  0xffffffffa00080c8
>   [    3.018393]  ? arch_cpu_idle+0x5/0x10
>   [    3.018398]  arch_cpu_idle+0x5/0x10
>   [    3.018399]  default_idle_call+0x59/0x90
>   [    3.018401]  do_idle+0x1c3/0x1d0
> 
> The call path is following:
> 
> default_idle_call
>   rcu_idle_enter
>   arch_cpu_idle
>   rcu_idle_exit
> 
> The arch_cpu_idle and rcu_idle_exit are the only ones from above
> path that are traceble and cause this problem on my setup.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

From an RCU viewpoint:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

[ I considered asking for an instrumentation_on() in rcu_idle_exit(),
but there is no point given that local_irq_restore() isn't something
you instrument anyway. ]

> ---
>  arch/x86/kernel/process.c | 2 +-
>  kernel/rcu/tree.c         | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index b370767f5b19..1345cb0124a6 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
>  /*
>   * Called from the generic idle code.
>   */
> -void arch_cpu_idle(void)
> +void noinstr arch_cpu_idle(void)
>  {
>  	x86_idle();
>  }
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a4b8189455d5..20d529722f51 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
>   * If you add or remove a call to rcu_idle_exit(), be sure to test with
>   * CONFIG_RCU_EQS_DEBUG=y.
>   */
> -void rcu_idle_exit(void)
> +void noinstr rcu_idle_exit(void)
>  {
>  	unsigned long flags;
>  
> -- 
> 2.35.3
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-16  4:25 ` [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Paul E. McKenney
@ 2022-05-16 11:49   ` Frederic Weisbecker
  2022-05-16 13:39     ` Paul E. McKenney
  2022-05-17 10:13     ` Jiri Olsa
  0 siblings, 2 replies; 31+ messages in thread
From: Frederic Weisbecker @ 2022-05-16 11:49 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt

On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > in rcu 'not watching' context and if there's tracer attached to
> > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > warning like:
> > 
> >   [    3.017540] WARNING: suspicious RCU usage
> >   ...
> >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> >   [    3.018374]  0xffffffffa00080c8
> >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> >   [    3.018398]  arch_cpu_idle+0x5/0x10
> >   [    3.018399]  default_idle_call+0x59/0x90
> >   [    3.018401]  do_idle+0x1c3/0x1d0
> > 
> > The call path is following:
> > 
> > default_idle_call
> >   rcu_idle_enter
> >   arch_cpu_idle
> >   rcu_idle_exit
> > 
> > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > path that are traceble and cause this problem on my setup.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> 
> From an RCU viewpoint:
> 
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> 
> [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> but there is no point given that local_irq_restore() isn't something
> you instrument anyway. ]

So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
it is instrumentable by the function (graph)  tracers and the irqsoff tracer.

Also it calls into lockdep that might make use of RCU.

That's why rcu_idle_exit() is not noinstr yet. See this patch:

https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/

Thanks.

> 
> > ---
> >  arch/x86/kernel/process.c | 2 +-
> >  kernel/rcu/tree.c         | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index b370767f5b19..1345cb0124a6 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> >  /*
> >   * Called from the generic idle code.
> >   */
> > -void arch_cpu_idle(void)
> > +void noinstr arch_cpu_idle(void)
> >  {
> >  	x86_idle();
> >  }
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index a4b8189455d5..20d529722f51 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> >   * If you add or remove a call to rcu_idle_exit(), be sure to test with
> >   * CONFIG_RCU_EQS_DEBUG=y.
> >   */
> > -void rcu_idle_exit(void)
> > +void noinstr rcu_idle_exit(void)
> >  {
> >  	unsigned long flags;
> >  
> > -- 
> > 2.35.3
> > 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-16 11:49   ` Frederic Weisbecker
@ 2022-05-16 13:39     ` Paul E. McKenney
  2022-05-17 10:13     ` Jiri Olsa
  1 sibling, 0 replies; 31+ messages in thread
From: Paul E. McKenney @ 2022-05-16 13:39 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt

On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > in rcu 'not watching' context and if there's tracer attached to
> > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > warning like:
> > > 
> > >   [    3.017540] WARNING: suspicious RCU usage
> > >   ...
> > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > >   [    3.018374]  0xffffffffa00080c8
> > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > >   [    3.018399]  default_idle_call+0x59/0x90
> > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > 
> > > The call path is following:
> > > 
> > > default_idle_call
> > >   rcu_idle_enter
> > >   arch_cpu_idle
> > >   rcu_idle_exit
> > > 
> > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > path that are traceble and cause this problem on my setup.
> > > 
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > 
> > From an RCU viewpoint:
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > but there is no point given that local_irq_restore() isn't something
> > you instrument anyway. ]
> 
> So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> 
> Also it calls into lockdep that might make use of RCU.
> 
> That's why rcu_idle_exit() is not noinstr yet. See this patch:
> 
> https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/

Ah, I should have looked at the context-tracking series again!

And I have to ask...  How much debugging capability are we really losing
by continuing to use the raw versions of local_irq_{save,restore}()?

							Thanx, Paul

> Thanks.
> 
> > 
> > > ---
> > >  arch/x86/kernel/process.c | 2 +-
> > >  kernel/rcu/tree.c         | 2 +-
> > >  2 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > > index b370767f5b19..1345cb0124a6 100644
> > > --- a/arch/x86/kernel/process.c
> > > +++ b/arch/x86/kernel/process.c
> > > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> > >  /*
> > >   * Called from the generic idle code.
> > >   */
> > > -void arch_cpu_idle(void)
> > > +void noinstr arch_cpu_idle(void)
> > >  {
> > >  	x86_idle();
> > >  }
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index a4b8189455d5..20d529722f51 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> > >   * If you add or remove a call to rcu_idle_exit(), be sure to test with
> > >   * CONFIG_RCU_EQS_DEBUG=y.
> > >   */
> > > -void rcu_idle_exit(void)
> > > +void noinstr rcu_idle_exit(void)
> > >  {
> > >  	unsigned long flags;
> > >  
> > > -- 
> > > 2.35.3
> > > 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
  2022-05-15 20:36 ` [PATCH bpf-next 2/2] selftests/bpf: Remove filter for unsafe functions in kprobe_multi test Jiri Olsa
  2022-05-16  4:25 ` [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Paul E. McKenney
@ 2022-05-17  2:39 ` kernel test robot
  2022-05-17  2:54 ` Yonghong Song
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2022-05-17  2:39 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Paul E. McKenney
  Cc: kbuild-all, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt

Hi Jiri,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-randconfig-c002-20220516 (https://download.01.org/0day-ci/archive/20220517/202205171050.9msZot6F-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/0b6fee32d730f621f2bfc4d8d9f0729814398415
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
        git checkout 0b6fee32d730f621f2bfc4d8d9f0729814398415
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> vmlinux.o: warning: objtool: arch_cpu_idle()+0x0: call to {dynamic}() leaves .noinstr.text section
>> vmlinux.o: warning: objtool: rcu_idle_exit()+0x16: call to trace_hardirqs_off() leaves .noinstr.text section
   vmlinux.o: warning: objtool: enter_from_user_mode()+0x18: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode()+0x1d: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare()+0x18: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: irqentry_enter_from_user_mode()+0x18: call to __kcsan_check_access() leaves .noinstr.text section

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
                   ` (2 preceding siblings ...)
  2022-05-17  2:39 ` kernel test robot
@ 2022-05-17  2:54 ` Yonghong Song
  2022-05-17  7:49   ` Jiri Olsa
  2022-05-17  9:19 ` kernel test robot
  2023-05-20  9:47 ` Ze Gao
  5 siblings, 1 reply; 31+ messages in thread
From: Yonghong Song @ 2022-05-17  2:54 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Paul E. McKenney
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Steven Rostedt



On 5/15/22 1:36 PM, Jiri Olsa wrote:
> Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> in rcu 'not watching' context and if there's tracer attached to
> them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> warning like:
> 
>    [    3.017540] WARNING: suspicious RCU usage
>    ...
>    [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
>    [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
>    [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
>    [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
>    [    3.018371]  fprobe_handler.part.0+0xab/0x150
>    [    3.018374]  0xffffffffa00080c8
>    [    3.018393]  ? arch_cpu_idle+0x5/0x10
>    [    3.018398]  arch_cpu_idle+0x5/0x10
>    [    3.018399]  default_idle_call+0x59/0x90
>    [    3.018401]  do_idle+0x1c3/0x1d0
> 
> The call path is following:
> 
> default_idle_call
>    rcu_idle_enter
>    arch_cpu_idle
>    rcu_idle_exit
> 
> The arch_cpu_idle and rcu_idle_exit are the only ones from above
> path that are traceble and cause this problem on my setup.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   arch/x86/kernel/process.c | 2 +-
>   kernel/rcu/tree.c         | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index b370767f5b19..1345cb0124a6 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
>   /*
>    * Called from the generic idle code.
>    */
> -void arch_cpu_idle(void)
> +void noinstr arch_cpu_idle(void)

noinstr includes a lot of attributes:

#define noinstr                                                         \
         noinline notrace __attribute((__section__(".noinstr.text")))    \
         __no_kcsan __no_sanitize_address __no_profile 
__no_sanitize_coverage

should we use notrace here?

>   {
>   	x86_idle();
>   }
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a4b8189455d5..20d529722f51 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
>    * If you add or remove a call to rcu_idle_exit(), be sure to test with
>    * CONFIG_RCU_EQS_DEBUG=y.
>    */
> -void rcu_idle_exit(void)
> +void noinstr rcu_idle_exit(void)
>   {
>   	unsigned long flags;
>   

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-17  2:54 ` Yonghong Song
@ 2022-05-17  7:49   ` Jiri Olsa
  2022-05-19  0:00     ` Masami Hiramatsu
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Olsa @ 2022-05-17  7:49 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Paul E. McKenney, netdev, bpf, lkml,
	Martin KaFai Lau, Song Liu, John Fastabend, KP Singh,
	Steven Rostedt

On Mon, May 16, 2022 at 07:54:37PM -0700, Yonghong Song wrote:
> 
> 
> On 5/15/22 1:36 PM, Jiri Olsa wrote:
> > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > in rcu 'not watching' context and if there's tracer attached to
> > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > warning like:
> > 
> >    [    3.017540] WARNING: suspicious RCU usage
> >    ...
> >    [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> >    [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> >    [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> >    [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> >    [    3.018371]  fprobe_handler.part.0+0xab/0x150
> >    [    3.018374]  0xffffffffa00080c8
> >    [    3.018393]  ? arch_cpu_idle+0x5/0x10
> >    [    3.018398]  arch_cpu_idle+0x5/0x10
> >    [    3.018399]  default_idle_call+0x59/0x90
> >    [    3.018401]  do_idle+0x1c3/0x1d0
> > 
> > The call path is following:
> > 
> > default_idle_call
> >    rcu_idle_enter
> >    arch_cpu_idle
> >    rcu_idle_exit
> > 
> > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > path that are traceble and cause this problem on my setup.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   arch/x86/kernel/process.c | 2 +-
> >   kernel/rcu/tree.c         | 2 +-
> >   2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index b370767f5b19..1345cb0124a6 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> >   /*
> >    * Called from the generic idle code.
> >    */
> > -void arch_cpu_idle(void)
> > +void noinstr arch_cpu_idle(void)
> 
> noinstr includes a lot of attributes:
> 
> #define noinstr                                                         \
>         noinline notrace __attribute((__section__(".noinstr.text")))    \
>         __no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
> 
> should we use notrace here?

hm right, so notrace should be enough for our case (kprobe_multi)
which is based on ftrace/fprobe jump

noinstr (among other things) adds the function also the kprobes
blacklist, which will prevent standard kprobes to attach

ASAICS standard kprobes use rcu in probe path as well, like in
opt_pre_handler function

so I think we should go with noinstr

jirka

> 
> >   {
> >   	x86_idle();
> >   }
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index a4b8189455d5..20d529722f51 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> >    * If you add or remove a call to rcu_idle_exit(), be sure to test with
> >    * CONFIG_RCU_EQS_DEBUG=y.
> >    */
> > -void rcu_idle_exit(void)
> > +void noinstr rcu_idle_exit(void)
> >   {
> >   	unsigned long flags;

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
                   ` (3 preceding siblings ...)
  2022-05-17  2:54 ` Yonghong Song
@ 2022-05-17  9:19 ` kernel test robot
  2023-05-20  9:47 ` Ze Gao
  5 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2022-05-17  9:19 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Paul E. McKenney
  Cc: kbuild-all, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt

Hi Jiri,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-randconfig-a014-20220516 (https://download.01.org/0day-ci/archive/20220517/202205171711.hqxFhp5l-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/0b6fee32d730f621f2bfc4d8d9f0729814398415
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Jiri-Olsa/cpuidle-rcu-Making-arch_cpu_idle-and-rcu_idle_exit-noinstr/20220516-043752
        git checkout 0b6fee32d730f621f2bfc4d8d9f0729814398415
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   vmlinux.o: warning: objtool: vmx_l1d_flush()+0x13: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: vmx_vcpu_enter_exit()+0x29: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: vmx_update_host_rsp()+0x3e: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
>> vmlinux.o: warning: objtool: rcu_idle_exit()+0x25: call to trace_hardirqs_on() leaves .noinstr.text section
   vmlinux.o: warning: objtool: enter_from_user_mode()+0x1c: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode()+0x21: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare()+0x1c: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: exit_to_user_mode()+0x1b: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: syscall_exit_to_user_mode()+0x36: call to static_key_count() leaves .noinstr.text section
   vmlinux.o: warning: objtool: irqentry_enter_from_user_mode()+0x1c: call to __kcsan_check_access() leaves .noinstr.text section
   vmlinux.o: warning: objtool: irqentry_exit_to_user_mode()+0x22: call to static_key_count() leaves .noinstr.text section

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-16 11:49   ` Frederic Weisbecker
  2022-05-16 13:39     ` Paul E. McKenney
@ 2022-05-17 10:13     ` Jiri Olsa
  2022-05-18 16:21       ` Paul E. McKenney
  1 sibling, 1 reply; 31+ messages in thread
From: Jiri Olsa @ 2022-05-17 10:13 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, netdev, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt

On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > in rcu 'not watching' context and if there's tracer attached to
> > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > warning like:
> > > 
> > >   [    3.017540] WARNING: suspicious RCU usage
> > >   ...
> > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > >   [    3.018374]  0xffffffffa00080c8
> > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > >   [    3.018399]  default_idle_call+0x59/0x90
> > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > 
> > > The call path is following:
> > > 
> > > default_idle_call
> > >   rcu_idle_enter
> > >   arch_cpu_idle
> > >   rcu_idle_exit
> > > 
> > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > path that are traceble and cause this problem on my setup.
> > > 
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > 
> > From an RCU viewpoint:
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > but there is no point given that local_irq_restore() isn't something
> > you instrument anyway. ]
> 
> So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> 
> Also it calls into lockdep that might make use of RCU.
> 
> That's why rcu_idle_exit() is not noinstr yet. See this patch:
> 
> https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/

I see, could we mark it at least with notrace meanwhile?

jirka

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-17 10:13     ` Jiri Olsa
@ 2022-05-18 16:21       ` Paul E. McKenney
  2022-05-19 11:33         ` Jiri Olsa
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2022-05-18 16:21 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Frederic Weisbecker, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, netdev, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt

On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > in rcu 'not watching' context and if there's tracer attached to
> > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > warning like:
> > > > 
> > > >   [    3.017540] WARNING: suspicious RCU usage
> > > >   ...
> > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > >   [    3.018374]  0xffffffffa00080c8
> > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > 
> > > > The call path is following:
> > > > 
> > > > default_idle_call
> > > >   rcu_idle_enter
> > > >   arch_cpu_idle
> > > >   rcu_idle_exit
> > > > 
> > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > path that are traceble and cause this problem on my setup.
> > > > 
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > 
> > > From an RCU viewpoint:
> > > 
> > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > 
> > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > but there is no point given that local_irq_restore() isn't something
> > > you instrument anyway. ]
> > 
> > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > 
> > Also it calls into lockdep that might make use of RCU.
> > 
> > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > 
> > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> 
> I see, could we mark it at least with notrace meanwhile?

For the RCU part, how about as follows?

If this approach is reasonable, my guess would be that Frederic will pull
it into his context-tracking series, perhaps using a revert of this patch
to maintain sanity in the near term.

If this approach is unreasonable, well, that is Murphy for you!

For the x86 idle part, my feeling is still that the rcu_idle_enter()
and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
an ongoing process as the idle loop continues to be dug deeper?

							Thanx, Paul

------------------------------------------------------------------------

commit cd338be719a0a692e0d50e1a8438e1f6c7165d9c
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue May 17 21:00:04 2022 -0700

    rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()
    
    This commit applies the "noinstr" tag to the rcu_idle_enter() and
    rcu_idle_exit() functions, which are invoked from portions of the idle
    loop that cannot be instrumented.  These tags require reworking the
    rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
    invoke in order to cause them to use normal assertions rather than
    lockdep.  In addition, within rcu_idle_exit(), the raw versions of
    local_irq_save() and local_irq_restore() are used, again to avoid issues
    with lockdep in uninstrumented code.
    
    This patch is based in part on an earlier patch by Jiri Olsa, discussions
    with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
    Gleixner, and off-list discussions with Yonghong Song.
    
    Link: https://lore.kernel.org/lkml/20220515203653.4039075-1-jolsa@kernel.org/
    Reported-by: Jiri Olsa <jolsa@kernel.org>
    Reported-by: Alexei Starovoitov <ast@kernel.org>
    Reported-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Yonghong Song <yhs@fb.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 222d59299a2af..02233b17cce0e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -635,8 +635,8 @@ static noinstr void rcu_eqs_enter(bool user)
 		return;
 	}
 
-	lockdep_assert_irqs_disabled();
 	instrumentation_begin();
+	lockdep_assert_irqs_disabled();
 	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rcu_preempt_deferred_qs(current);
@@ -663,9 +663,9 @@ static noinstr void rcu_eqs_enter(bool user)
  * If you add or remove a call to rcu_idle_enter(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_enter(void)
+void noinstr rcu_idle_enter(void)
 {
-	lockdep_assert_irqs_disabled();
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
 	rcu_eqs_enter(false);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -865,7 +865,7 @@ static void noinstr rcu_eqs_exit(bool user)
 	struct rcu_data *rdp;
 	long oldval;
 
-	lockdep_assert_irqs_disabled();
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
 	rdp = this_cpu_ptr(&rcu_data);
 	oldval = rdp->dynticks_nesting;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
@@ -900,13 +900,13 @@ static void noinstr rcu_eqs_exit(bool user)
  * If you add or remove a call to rcu_idle_exit(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_idle_exit(void)
+void noinstr rcu_idle_exit(void)
 {
 	unsigned long flags;
 
-	local_irq_save(flags);
+	raw_local_irq_save(flags);
 	rcu_eqs_exit(false);
-	local_irq_restore(flags);
+	raw_local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_exit);
 

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-17  7:49   ` Jiri Olsa
@ 2022-05-19  0:00     ` Masami Hiramatsu
  0 siblings, 0 replies; 31+ messages in thread
From: Masami Hiramatsu @ 2022-05-19  0:00 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Yonghong Song, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Paul E. McKenney, netdev, bpf,
	lkml, Martin KaFai Lau, Song Liu, John Fastabend, KP Singh,
	Steven Rostedt

On Tue, 17 May 2022 09:49:33 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Mon, May 16, 2022 at 07:54:37PM -0700, Yonghong Song wrote:
> > 
> > 
> > On 5/15/22 1:36 PM, Jiri Olsa wrote:
> > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > in rcu 'not watching' context and if there's tracer attached to
> > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > warning like:
> > > 
> > >    [    3.017540] WARNING: suspicious RCU usage
> > >    ...
> > >    [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > >    [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > >    [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > >    [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > >    [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > >    [    3.018374]  0xffffffffa00080c8
> > >    [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > >    [    3.018398]  arch_cpu_idle+0x5/0x10
> > >    [    3.018399]  default_idle_call+0x59/0x90
> > >    [    3.018401]  do_idle+0x1c3/0x1d0
> > > 
> > > The call path is following:
> > > 
> > > default_idle_call
> > >    rcu_idle_enter
> > >    arch_cpu_idle
> > >    rcu_idle_exit
> > > 
> > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > path that are traceble and cause this problem on my setup.
> > > 
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >   arch/x86/kernel/process.c | 2 +-
> > >   kernel/rcu/tree.c         | 2 +-
> > >   2 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > > index b370767f5b19..1345cb0124a6 100644
> > > --- a/arch/x86/kernel/process.c
> > > +++ b/arch/x86/kernel/process.c
> > > @@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
> > >   /*
> > >    * Called from the generic idle code.
> > >    */
> > > -void arch_cpu_idle(void)
> > > +void noinstr arch_cpu_idle(void)
> > 
> > noinstr includes a lot of attributes:
> > 
> > #define noinstr                                                         \
> >         noinline notrace __attribute((__section__(".noinstr.text")))    \
> >         __no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
> > 
> > should we use notrace here?
> 
> hm right, so notrace should be enough for our case (kprobe_multi)
> which is based on ftrace/fprobe jump
> 
> noinstr (among other things) adds the function also the kprobes
> blacklist, which will prevent standard kprobes to attach
> 
> ASAICS standard kprobes use rcu in probe path as well, like in
> opt_pre_handler function
> 
> so I think we should go with noinstr

Yes, I agree that noinstr is preferrable for these functions.

Thank you!

> 
> jirka
> 
> > 
> > >   {
> > >   	x86_idle();
> > >   }
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index a4b8189455d5..20d529722f51 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
> > >    * If you add or remove a call to rcu_idle_exit(), be sure to test with
> > >    * CONFIG_RCU_EQS_DEBUG=y.
> > >    */
> > > -void rcu_idle_exit(void)
> > > +void noinstr rcu_idle_exit(void)
> > >   {
> > >   	unsigned long flags;


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-18 16:21       ` Paul E. McKenney
@ 2022-05-19 11:33         ` Jiri Olsa
  2022-05-19 13:54           ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Olsa @ 2022-05-19 11:33 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Jiri Olsa, Frederic Weisbecker, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, netdev, bpf,
	lkml, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt

On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > warning like:
> > > > > 
> > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > >   ...
> > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > >   [    3.018374]  0xffffffffa00080c8
> > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > 
> > > > > The call path is following:
> > > > > 
> > > > > default_idle_call
> > > > >   rcu_idle_enter
> > > > >   arch_cpu_idle
> > > > >   rcu_idle_exit
> > > > > 
> > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > path that are traceble and cause this problem on my setup.
> > > > > 
> > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > 
> > > > From an RCU viewpoint:
> > > > 
> > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > 
> > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > but there is no point given that local_irq_restore() isn't something
> > > > you instrument anyway. ]
> > > 
> > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > 
> > > Also it calls into lockdep that might make use of RCU.
> > > 
> > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > 
> > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > 
> > I see, could we mark it at least with notrace meanwhile?
> 
> For the RCU part, how about as follows?
> 
> If this approach is reasonable, my guess would be that Frederic will pull
> it into his context-tracking series, perhaps using a revert of this patch
> to maintain sanity in the near term.
> 
> If this approach is unreasonable, well, that is Murphy for you!

I checked and it works in my test ;-)

> 
> For the x86 idle part, my feeling is still that the rcu_idle_enter()
> and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> an ongoing process as the idle loop continues to be dug deeper?

for arch_cpu_idle with noinstr I'm getting this W=1 warning:

vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section

we could have it with notrace if that's a problem

thanks,
jirka

> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit cd338be719a0a692e0d50e1a8438e1f6c7165d9c
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Tue May 17 21:00:04 2022 -0700
> 
>     rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()
>     
>     This commit applies the "noinstr" tag to the rcu_idle_enter() and
>     rcu_idle_exit() functions, which are invoked from portions of the idle
>     loop that cannot be instrumented.  These tags require reworking the
>     rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
>     invoke in order to cause them to use normal assertions rather than
>     lockdep.  In addition, within rcu_idle_exit(), the raw versions of
>     local_irq_save() and local_irq_restore() are used, again to avoid issues
>     with lockdep in uninstrumented code.
>     
>     This patch is based in part on an earlier patch by Jiri Olsa, discussions
>     with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
>     Gleixner, and off-list discussions with Yonghong Song.
>     
>     Link: https://lore.kernel.org/lkml/20220515203653.4039075-1-jolsa@kernel.org/
>     Reported-by: Jiri Olsa <jolsa@kernel.org>
>     Reported-by: Alexei Starovoitov <ast@kernel.org>
>     Reported-by: Andrii Nakryiko <andrii@kernel.org>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>     Reviewed-by: Yonghong Song <yhs@fb.com>
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 222d59299a2af..02233b17cce0e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -635,8 +635,8 @@ static noinstr void rcu_eqs_enter(bool user)
>  		return;
>  	}
>  
> -	lockdep_assert_irqs_disabled();
>  	instrumentation_begin();
> +	lockdep_assert_irqs_disabled();
>  	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	rcu_preempt_deferred_qs(current);
> @@ -663,9 +663,9 @@ static noinstr void rcu_eqs_enter(bool user)
>   * If you add or remove a call to rcu_idle_enter(), be sure to test with
>   * CONFIG_RCU_EQS_DEBUG=y.
>   */
> -void rcu_idle_enter(void)
> +void noinstr rcu_idle_enter(void)
>  {
> -	lockdep_assert_irqs_disabled();
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
>  	rcu_eqs_enter(false);
>  }
>  EXPORT_SYMBOL_GPL(rcu_idle_enter);
> @@ -865,7 +865,7 @@ static void noinstr rcu_eqs_exit(bool user)
>  	struct rcu_data *rdp;
>  	long oldval;
>  
> -	lockdep_assert_irqs_disabled();
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
>  	rdp = this_cpu_ptr(&rcu_data);
>  	oldval = rdp->dynticks_nesting;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
> @@ -900,13 +900,13 @@ static void noinstr rcu_eqs_exit(bool user)
>   * If you add or remove a call to rcu_idle_exit(), be sure to test with
>   * CONFIG_RCU_EQS_DEBUG=y.
>   */
> -void rcu_idle_exit(void)
> +void noinstr rcu_idle_exit(void)
>  {
>  	unsigned long flags;
>  
> -	local_irq_save(flags);
> +	raw_local_irq_save(flags);
>  	rcu_eqs_exit(false);
> -	local_irq_restore(flags);
> +	raw_local_irq_restore(flags);
>  }
>  EXPORT_SYMBOL_GPL(rcu_idle_exit);
>  

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-19 11:33         ` Jiri Olsa
@ 2022-05-19 13:54           ` Paul E. McKenney
  2022-05-23 13:12             ` Jiri Olsa
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2022-05-19 13:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Frederic Weisbecker, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, netdev, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt

On Thu, May 19, 2022 at 01:33:16PM +0200, Jiri Olsa wrote:
> On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> > On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > > warning like:
> > > > > > 
> > > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > > >   ...
> > > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > > >   [    3.018374]  0xffffffffa00080c8
> > > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > > 
> > > > > > The call path is following:
> > > > > > 
> > > > > > default_idle_call
> > > > > >   rcu_idle_enter
> > > > > >   arch_cpu_idle
> > > > > >   rcu_idle_exit
> > > > > > 
> > > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > > path that are traceble and cause this problem on my setup.
> > > > > > 
> > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > 
> > > > > From an RCU viewpoint:
> > > > > 
> > > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > 
> > > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > > but there is no point given that local_irq_restore() isn't something
> > > > > you instrument anyway. ]
> > > > 
> > > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > > 
> > > > Also it calls into lockdep that might make use of RCU.
> > > > 
> > > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > > 
> > > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > > 
> > > I see, could we mark it at least with notrace meanwhile?
> > 
> > For the RCU part, how about as follows?
> > 
> > If this approach is reasonable, my guess would be that Frederic will pull
> > it into his context-tracking series, perhaps using a revert of this patch
> > to maintain sanity in the near term.
> > 
> > If this approach is unreasonable, well, that is Murphy for you!
> 
> I checked and it works in my test ;-)

Whew!!!  One piece of the problem might be solved, then.  ;-)

> > For the x86 idle part, my feeling is still that the rcu_idle_enter()
> > and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> > an ongoing process as the idle loop continues to be dug deeper?
> 
> for arch_cpu_idle with noinstr I'm getting this W=1 warning:
> 
> vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
> 
> we could have it with notrace if that's a problem

I would be happy to queue the arch_cpu_idle() portion of your patch on
-rcu, if that would move things forward.  I suspect that additional
x86_idle() surgery is required, but maybe I am just getting confused
about what the x86_idle() function pointer can point to.  But it looks
to me like these need further help:

o	static void amd_e400_idle(void)
	Plus things it calls, like tick_broadcast_enter() and
	tick_broadcast_exit().

o	static __cpuidle void mwait_idle(void)

So it might not be all that much additional work, even if I have avoided
confusion about what the x86_idle() function pointer can point to.  But
I do not trust my ability to test this accurately.

Thoughts?

							Thanx, Paul

> thanks,
> jirka
> 
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit cd338be719a0a692e0d50e1a8438e1f6c7165d9c
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Tue May 17 21:00:04 2022 -0700
> > 
> >     rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()
> >     
> >     This commit applies the "noinstr" tag to the rcu_idle_enter() and
> >     rcu_idle_exit() functions, which are invoked from portions of the idle
> >     loop that cannot be instrumented.  These tags require reworking the
> >     rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions
> >     invoke in order to cause them to use normal assertions rather than
> >     lockdep.  In addition, within rcu_idle_exit(), the raw versions of
> >     local_irq_save() and local_irq_restore() are used, again to avoid issues
> >     with lockdep in uninstrumented code.
> >     
> >     This patch is based in part on an earlier patch by Jiri Olsa, discussions
> >     with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas
> >     Gleixner, and off-list discussions with Yonghong Song.
> >     
> >     Link: https://lore.kernel.org/lkml/20220515203653.4039075-1-jolsa@kernel.org/
> >     Reported-by: Jiri Olsa <jolsa@kernel.org>
> >     Reported-by: Alexei Starovoitov <ast@kernel.org>
> >     Reported-by: Andrii Nakryiko <andrii@kernel.org>
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >     Reviewed-by: Yonghong Song <yhs@fb.com>
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 222d59299a2af..02233b17cce0e 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -635,8 +635,8 @@ static noinstr void rcu_eqs_enter(bool user)
> >  		return;
> >  	}
> >  
> > -	lockdep_assert_irqs_disabled();
> >  	instrumentation_begin();
> > +	lockdep_assert_irqs_disabled();
> >  	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
> >  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> >  	rcu_preempt_deferred_qs(current);
> > @@ -663,9 +663,9 @@ static noinstr void rcu_eqs_enter(bool user)
> >   * If you add or remove a call to rcu_idle_enter(), be sure to test with
> >   * CONFIG_RCU_EQS_DEBUG=y.
> >   */
> > -void rcu_idle_enter(void)
> > +void noinstr rcu_idle_enter(void)
> >  {
> > -	lockdep_assert_irqs_disabled();
> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
> >  	rcu_eqs_enter(false);
> >  }
> >  EXPORT_SYMBOL_GPL(rcu_idle_enter);
> > @@ -865,7 +865,7 @@ static void noinstr rcu_eqs_exit(bool user)
> >  	struct rcu_data *rdp;
> >  	long oldval;
> >  
> > -	lockdep_assert_irqs_disabled();
> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
> >  	rdp = this_cpu_ptr(&rcu_data);
> >  	oldval = rdp->dynticks_nesting;
> >  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
> > @@ -900,13 +900,13 @@ static void noinstr rcu_eqs_exit(bool user)
> >   * If you add or remove a call to rcu_idle_exit(), be sure to test with
> >   * CONFIG_RCU_EQS_DEBUG=y.
> >   */
> > -void rcu_idle_exit(void)
> > +void noinstr rcu_idle_exit(void)
> >  {
> >  	unsigned long flags;
> >  
> > -	local_irq_save(flags);
> > +	raw_local_irq_save(flags);
> >  	rcu_eqs_exit(false);
> > -	local_irq_restore(flags);
> > +	raw_local_irq_restore(flags);
> >  }
> >  EXPORT_SYMBOL_GPL(rcu_idle_exit);
> >  

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-19 13:54           ` Paul E. McKenney
@ 2022-05-23 13:12             ` Jiri Olsa
  2022-05-24 17:33               ` Paul E. McKenney
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Olsa @ 2022-05-23 13:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Jiri Olsa, Frederic Weisbecker, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, netdev, bpf,
	lkml, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt

On Thu, May 19, 2022 at 06:54:39AM -0700, Paul E. McKenney wrote:
> On Thu, May 19, 2022 at 01:33:16PM +0200, Jiri Olsa wrote:
> > On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> > > On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > > > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > > > warning like:
> > > > > > > 
> > > > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > > > >   ...
> > > > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > > > >   [    3.018374]  0xffffffffa00080c8
> > > > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > > > 
> > > > > > > The call path is following:
> > > > > > > 
> > > > > > > default_idle_call
> > > > > > >   rcu_idle_enter
> > > > > > >   arch_cpu_idle
> > > > > > >   rcu_idle_exit
> > > > > > > 
> > > > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > > > path that are traceble and cause this problem on my setup.
> > > > > > > 
> > > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > 
> > > > > > From an RCU viewpoint:
> > > > > > 
> > > > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > > 
> > > > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > > > but there is no point given that local_irq_restore() isn't something
> > > > > > you instrument anyway. ]
> > > > > 
> > > > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > > > 
> > > > > Also it calls into lockdep that might make use of RCU.
> > > > > 
> > > > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > > > 
> > > > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > > > 
> > > > I see, could we mark it at least with notrace meanwhile?
> > > 
> > > For the RCU part, how about as follows?
> > > 
> > > If this approach is reasonable, my guess would be that Frederic will pull
> > > it into his context-tracking series, perhaps using a revert of this patch
> > > to maintain sanity in the near term.
> > > 
> > > If this approach is unreasonable, well, that is Murphy for you!
> > 
> > I checked and it works in my test ;-)
> 
> Whew!!!  One piece of the problem might be solved, then.  ;-)
> 
> > > For the x86 idle part, my feeling is still that the rcu_idle_enter()
> > > and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> > > an ongoing process as the idle loop continues to be dug deeper?
> > 
> > for arch_cpu_idle with noinstr I'm getting this W=1 warning:
> > 
> > vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
> > 
> > we could have it with notrace if that's a problem
> 
> I would be happy to queue the arch_cpu_idle() portion of your patch on
> -rcu, if that would move things forward.  I suspect that additional
> x86_idle() surgery is required, but maybe I am just getting confused
> about what the x86_idle() function pointer can point to.  But it looks
> to me like these need further help:
> 
> o	static void amd_e400_idle(void)
> 	Plus things it calls, like tick_broadcast_enter() and
> 	tick_broadcast_exit().
> 
> o	static __cpuidle void mwait_idle(void)
> 
> So it might not be all that much additional work, even if I have avoided
> confusion about what the x86_idle() function pointer can point to.  But
> I do not trust my ability to test this accurately.

same here ;-) you're right, there will be other places based
on x86_idle function pointer.. I'll check it, but perhaps we
could address that when someone reports that

jirka

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
  2022-05-23 13:12             ` Jiri Olsa
@ 2022-05-24 17:33               ` Paul E. McKenney
  0 siblings, 0 replies; 31+ messages in thread
From: Paul E. McKenney @ 2022-05-24 17:33 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Frederic Weisbecker, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, netdev, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt

On Mon, May 23, 2022 at 03:12:28PM +0200, Jiri Olsa wrote:
> On Thu, May 19, 2022 at 06:54:39AM -0700, Paul E. McKenney wrote:
> > On Thu, May 19, 2022 at 01:33:16PM +0200, Jiri Olsa wrote:
> > > On Wed, May 18, 2022 at 09:21:18AM -0700, Paul E. McKenney wrote:
> > > > On Tue, May 17, 2022 at 12:13:45PM +0200, Jiri Olsa wrote:
> > > > > On Mon, May 16, 2022 at 01:49:22PM +0200, Frederic Weisbecker wrote:
> > > > > > On Sun, May 15, 2022 at 09:25:35PM -0700, Paul E. McKenney wrote:
> > > > > > > On Sun, May 15, 2022 at 10:36:52PM +0200, Jiri Olsa wrote:
> > > > > > > > Making arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
> > > > > > > > in rcu 'not watching' context and if there's tracer attached to
> > > > > > > > them, which uses rcu (e.g. kprobe multi interface) it will hit RCU
> > > > > > > > warning like:
> > > > > > > > 
> > > > > > > >   [    3.017540] WARNING: suspicious RCU usage
> > > > > > > >   ...
> > > > > > > >   [    3.018363]  kprobe_multi_link_handler+0x68/0x1c0
> > > > > > > >   [    3.018364]  ? kprobe_multi_link_handler+0x3e/0x1c0
> > > > > > > >   [    3.018366]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > > >   [    3.018367]  ? arch_cpu_idle_dead+0x10/0x10
> > > > > > > >   [    3.018371]  fprobe_handler.part.0+0xab/0x150
> > > > > > > >   [    3.018374]  0xffffffffa00080c8
> > > > > > > >   [    3.018393]  ? arch_cpu_idle+0x5/0x10
> > > > > > > >   [    3.018398]  arch_cpu_idle+0x5/0x10
> > > > > > > >   [    3.018399]  default_idle_call+0x59/0x90
> > > > > > > >   [    3.018401]  do_idle+0x1c3/0x1d0
> > > > > > > > 
> > > > > > > > The call path is following:
> > > > > > > > 
> > > > > > > > default_idle_call
> > > > > > > >   rcu_idle_enter
> > > > > > > >   arch_cpu_idle
> > > > > > > >   rcu_idle_exit
> > > > > > > > 
> > > > > > > > The arch_cpu_idle and rcu_idle_exit are the only ones from above
> > > > > > > > path that are traceble and cause this problem on my setup.
> > > > > > > > 
> > > > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > > 
> > > > > > > From an RCU viewpoint:
> > > > > > > 
> > > > > > > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > > > 
> > > > > > > [ I considered asking for an instrumentation_on() in rcu_idle_exit(),
> > > > > > > but there is no point given that local_irq_restore() isn't something
> > > > > > > you instrument anyway. ]
> > > > > > 
> > > > > > So local_irq_save() in the beginning of rcu_idle_exit() is unsafe because
> > > > > > it is instrumentable by the function (graph)  tracers and the irqsoff tracer.
> > > > > > 
> > > > > > Also it calls into lockdep that might make use of RCU.
> > > > > > 
> > > > > > That's why rcu_idle_exit() is not noinstr yet. See this patch:
> > > > > > 
> > > > > > https://lore.kernel.org/lkml/20220503100051.2799723-4-frederic@kernel.org/
> > > > > 
> > > > > I see, could we mark it at least with notrace meanwhile?
> > > > 
> > > > For the RCU part, how about as follows?
> > > > 
> > > > If this approach is reasonable, my guess would be that Frederic will pull
> > > > it into his context-tracking series, perhaps using a revert of this patch
> > > > to maintain sanity in the near term.
> > > > 
> > > > If this approach is unreasonable, well, that is Murphy for you!
> > > 
> > > I checked and it works in my test ;-)
> > 
> > Whew!!!  One piece of the problem might be solved, then.  ;-)
> > 
> > > > For the x86 idle part, my feeling is still that the rcu_idle_enter()
> > > > and rcu_idle_exit() need to be pushed deeper into the code.  Perhaps
> > > > an ongoing process as the idle loop continues to be dug deeper?
> > > 
> > > for arch_cpu_idle with noinstr I'm getting this W=1 warning:
> > > 
> > > vmlinux.o: warning: objtool: arch_cpu_idle()+0xb: call to {dynamic}() leaves .noinstr.text section
> > > 
> > > we could have it with notrace if that's a problem
> > 
> > I would be happy to queue the arch_cpu_idle() portion of your patch on
> > -rcu, if that would move things forward.  I suspect that additional
> > x86_idle() surgery is required, but maybe I am just getting confused
> > about what the x86_idle() function pointer can point to.  But it looks
> > to me like these need further help:
> > 
> > o	static void amd_e400_idle(void)
> > 	Plus things it calls, like tick_broadcast_enter() and
> > 	tick_broadcast_exit().
> > 
> > o	static __cpuidle void mwait_idle(void)
> > 
> > So it might not be all that much additional work, even if I have avoided
> > confusion about what the x86_idle() function pointer can point to.  But
> > I do not trust my ability to test this accurately.
> 
> same here ;-) you're right, there will be other places based
> on x86_idle function pointer.. I'll check it, but perhaps we
> could address that when someone reports that
> 
> jirka

Any thoughts on the correct approach?  One extreme would be to
mark all sorts of things noinstr.  Another extreme would be to
enclose all sorts of things in RCU_NONIDLE().  Yet another extreme
would be to push the rcu_idle_enter() and rcu_idle_exit() calls
still deeper into the idle loop.

Or does Peter's recent series somehow cover all of this?

https://lore.kernel.org/all/20220519212750.656413111@infradead.org/

							Thanx, Paul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* (no subject)
  2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
                   ` (4 preceding siblings ...)
  2022-05-17  9:19 ` kernel test robot
@ 2023-05-20  9:47 ` Ze Gao
  2023-05-21  3:58   ` Yonghong Song
  2023-05-21  8:08   ` Jiri Olsa
  5 siblings, 2 replies; 31+ messages in thread
From: Ze Gao @ 2023-05-20  9:47 UTC (permalink / raw)
  To: jolsa
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao


Hi Jiri,

Would you like to consider to add rcu_is_watching check in
to solve this from the viewpoint of kprobe_multi_link_prog_run
itself? And accounting of missed runs can be added as well
to imporve observability.

Regards,
Ze


-----------------
From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
From: Ze Gao <zegao@tencent.com>
Date: Sat, 20 May 2023 17:32:05 +0800
Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching

From the perspective of kprobe_multi_link_prog_run, any traceable
functions can be attached while bpf progs need specical care and
ought to be under rcu protection. To solve the likely rcu lockdep
warns once for good, when (future) functions in idle path were
attached accidentally, we better paying some cost to check at least
in kernel-side, and return when rcu is not watching, which helps
to avoid any unpredictable results.

Signed-off-by: Ze Gao <zegao@tencent.com>
---
 kernel/trace/bpf_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 9a050e36dc6c..3e6ea7274765 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
 	struct bpf_run_ctx *old_run_ctx;
 	int err;
 
-	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
+	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
 		err = 0;
 		goto out;
 	}
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re:
  2023-05-20  9:47 ` Ze Gao
@ 2023-05-21  3:58   ` Yonghong Song
  2023-05-21 15:10     ` Re: Ze Gao
  2023-05-21  8:08   ` Jiri Olsa
  1 sibling, 1 reply; 31+ messages in thread
From: Yonghong Song @ 2023-05-21  3:58 UTC (permalink / raw)
  To: Ze Gao, jolsa
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao



On 5/20/23 2:47 AM, Ze Gao wrote:
> 
> Hi Jiri,
> 
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run
> itself? And accounting of missed runs can be added as well
> to imporve observability.
> 
> Regards,
> Ze
> 
> 
> -----------------
>  From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> 
>  From the perspective of kprobe_multi_link_prog_run, any traceable
> functions can be attached while bpf progs need specical care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warns once for good, when (future) functions in idle path were
> attached accidentally, we better paying some cost to check at least
> in kernel-side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.

kprobe_multi/fprobe share the same set of attachments with fentry.
Currently, fentry does not filter with !rcu_is_watching, maybe
because this is an extreme corner case. Not sure whether it is
worthwhile or not.

Maybe if you can give a concrete example (e.g., attachment point)
with current code base to show what the issue you encountered and
it will make it easier to judge whether adding !rcu_is_watching()
is necessary or not.

> 
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
>   kernel/trace/bpf_trace.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
>   	struct bpf_run_ctx *old_run_ctx;
>   	int err;
>   
> -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
>   		err = 0;
>   		goto out;
>   	}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-20  9:47 ` Ze Gao
  2023-05-21  3:58   ` Yonghong Song
@ 2023-05-21  8:08   ` Jiri Olsa
  2023-05-21 10:09     ` Re: Masami Hiramatsu
  1 sibling, 1 reply; 31+ messages in thread
From: Jiri Olsa @ 2023-05-21  8:08 UTC (permalink / raw)
  To: Ze Gao
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao

On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> 
> Hi Jiri,
> 
> Would you like to consider to add rcu_is_watching check in
> to solve this from the viewpoint of kprobe_multi_link_prog_run

I think this was discussed in here:
  https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/

and was considered a bug, there's fix mentioned later in the thread

there's also this recent patchset:
  https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/

that solves related problems

> itself? And accounting of missed runs can be added as well
> to imporve observability.

right, we count fprobe->nmissed but it's not exposed, we should allow
to get 'missed' stats from both fprobe and kprobe_multi later, which
is missing now, will check

thanks,
jirka

> 
> Regards,
> Ze
> 
> 
> -----------------
> From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> 
> From the perspective of kprobe_multi_link_prog_run, any traceable
> functions can be attached while bpf progs need specical care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warns once for good, when (future) functions in idle path were
> attached accidentally, we better paying some cost to check at least
> in kernel-side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.
> 
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
>  kernel/trace/bpf_trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
>  	struct bpf_run_ctx *old_run_ctx;
>  	int err;
>  
> -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
>  		err = 0;
>  		goto out;
>  	}
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-21  8:08   ` Jiri Olsa
@ 2023-05-21 10:09     ` Masami Hiramatsu
  2023-05-21 14:19       ` Re: Ze Gao
  0 siblings, 1 reply; 31+ messages in thread
From: Masami Hiramatsu @ 2023-05-21 10:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ze Gao, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
	Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

On Sun, 21 May 2023 10:08:46 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > 
> > Hi Jiri,
> > 
> > Would you like to consider to add rcu_is_watching check in
> > to solve this from the viewpoint of kprobe_multi_link_prog_run
> 
> I think this was discussed in here:
>   https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> 
> and was considered a bug, there's fix mentioned later in the thread
> 
> there's also this recent patchset:
>   https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> 
> that solves related problems

I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
check is required if the kprobe_multi_link_prog_run() uses any RCU API.
E.g. rethook_try_get() is also checks rcu_is_watching() because it uses
call_rcu().

Thank you,

> 
> > itself? And accounting of missed runs can be added as well
> > to imporve observability.
> 
> right, we count fprobe->nmissed but it's not exposed, we should allow
> to get 'missed' stats from both fprobe and kprobe_multi later, which
> is missing now, will check
> 
> thanks,
> jirka
> 
> > 
> > Regards,
> > Ze
> > 
> > 
> > -----------------
> > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > From: Ze Gao <zegao@tencent.com>
> > Date: Sat, 20 May 2023 17:32:05 +0800
> > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> > 
> > From the perspective of kprobe_multi_link_prog_run, any traceable
> > functions can be attached while bpf progs need specical care and
> > ought to be under rcu protection. To solve the likely rcu lockdep
> > warns once for good, when (future) functions in idle path were
> > attached accidentally, we better paying some cost to check at least
> > in kernel-side, and return when rcu is not watching, which helps
> > to avoid any unpredictable results.
> > 
> > Signed-off-by: Ze Gao <zegao@tencent.com>
> > ---
> >  kernel/trace/bpf_trace.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 9a050e36dc6c..3e6ea7274765 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> >  	struct bpf_run_ctx *old_run_ctx;
> >  	int err;
> >  
> > -	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > +	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> >  		err = 0;
> >  		goto out;
> >  	}
> > -- 
> > 2.40.1
> > 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-21 10:09     ` Re: Masami Hiramatsu
@ 2023-05-21 14:19       ` Ze Gao
  0 siblings, 0 replies; 31+ messages in thread
From: Ze Gao @ 2023-05-21 14:19 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau, Song Liu,
	Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao

On Sun, May 21, 2023 at 6:09 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Sun, 21 May 2023 10:08:46 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > >
> > > Hi Jiri,
> > >
> > > Would you like to consider to add rcu_is_watching check in
> > > to solve this from the viewpoint of kprobe_multi_link_prog_run
> >
> > I think this was discussed in here:
> >   https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> >
> > and was considered a bug, there's fix mentioned later in the thread
> >
> > there's also this recent patchset:
> >   https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> >
> > that solves related problems
>
> I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
> check is required if the kprobe_multi_link_prog_run() uses any RCU API.
> E.g. rethook_try_get() is also checks rcu_is_watching() because it uses
> call_rcu().

Yes, that's my point!

Regards,
Ze

>
> >
> > > itself? And accounting of missed runs can be added as well
> > > to imporve observability.
> >
> > right, we count fprobe->nmissed but it's not exposed, we should allow
> > to get 'missed' stats from both fprobe and kprobe_multi later, which
> > is missing now, will check
> >
> > thanks,
> > jirka
> >
> > >
> > > Regards,
> > > Ze
> > >
> > >
> > > -----------------
> > > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > > From: Ze Gao <zegao@tencent.com>
> > > Date: Sat, 20 May 2023 17:32:05 +0800
> > > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> > >
> > > From the perspective of kprobe_multi_link_prog_run, any traceable
> > > functions can be attached while bpf progs need specical care and
> > > ought to be under rcu protection. To solve the likely rcu lockdep
> > > warns once for good, when (future) functions in idle path were
> > > attached accidentally, we better paying some cost to check at least
> > > in kernel-side, and return when rcu is not watching, which helps
> > > to avoid any unpredictable results.
> > >
> > > Signed-off-by: Ze Gao <zegao@tencent.com>
> > > ---
> > >  kernel/trace/bpf_trace.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > index 9a050e36dc6c..3e6ea7274765 100644
> > > --- a/kernel/trace/bpf_trace.c
> > > +++ b/kernel/trace/bpf_trace.c
> > > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> > >     struct bpf_run_ctx *old_run_ctx;
> > >     int err;
> > >
> > > -   if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > > +   if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> > >             err = 0;
> > >             goto out;
> > >     }
> > > --
> > > 2.40.1
> > >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-21  3:58   ` Yonghong Song
@ 2023-05-21 15:10     ` Ze Gao
  2023-05-21 20:26       ` Re: Jiri Olsa
  0 siblings, 1 reply; 31+ messages in thread
From: Ze Gao @ 2023-05-21 15:10 UTC (permalink / raw)
  To: Yonghong Song
  Cc: jolsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
	Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

> kprobe_multi/fprobe share the same set of attachments with fentry.
> Currently, fentry does not filter with !rcu_is_watching, maybe
> because this is an extreme corner case. Not sure whether it is
> worthwhile or not.

Agreed, it's rare, especially after Peter's patches which push narrow
down rcu eqs regions
in the idle path and reduce the chance of any traceable functions
happening in between.

However, from RCU's perspective, we ought to check if rcu_is_watching
theoretically
when there's a chance our code will run in the idle path and also we
need rcu to be alive,
And also we cannot simply make assumptions for any future changes in
the idle path.
You know, just like what was hit in the thread.

> Maybe if you can give a concrete example (e.g., attachment point)
> with current code base to show what the issue you encountered and
> it will make it easier to judge whether adding !rcu_is_watching()
> is necessary or not.

I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
traceable but not on the latest version
so far. But as I state above, in theory we need it. So here is a
gentle ping :) .

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-21 15:10     ` Re: Ze Gao
@ 2023-05-21 20:26       ` Jiri Olsa
  2023-05-22  1:36         ` Re: Masami Hiramatsu
  2023-05-22  2:07         ` Re: Ze Gao
  0 siblings, 2 replies; 31+ messages in thread
From: Jiri Olsa @ 2023-05-21 20:26 UTC (permalink / raw)
  To: Ze Gao
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > kprobe_multi/fprobe share the same set of attachments with fentry.
> > Currently, fentry does not filter with !rcu_is_watching, maybe
> > because this is an extreme corner case. Not sure whether it is
> > worthwhile or not.
> 
> Agreed, it's rare, especially after Peter's patches which push narrow
> down rcu eqs regions
> in the idle path and reduce the chance of any traceable functions
> happening in between.
> 
> However, from RCU's perspective, we ought to check if rcu_is_watching
> theoretically
> when there's a chance our code will run in the idle path and also we
> need rcu to be alive,
> And also we cannot simply make assumptions for any future changes in
> the idle path.
> You know, just like what was hit in the thread.
> 
> > Maybe if you can give a concrete example (e.g., attachment point)
> > with current code base to show what the issue you encountered and
> > it will make it easier to judge whether adding !rcu_is_watching()
> > is necessary or not.
> 
> I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> traceable but not on the latest version
> so far. But as I state above, in theory we need it. So here is a
> gentle ping :) .

hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]

I might be missing something, but it seems like we don't need another
rcu_is_watching call on kprobe_multi level

jirka


[1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
[2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-21 20:26       ` Re: Jiri Olsa
@ 2023-05-22  1:36         ` Masami Hiramatsu
  2023-05-22  2:07         ` Re: Ze Gao
  1 sibling, 0 replies; 31+ messages in thread
From: Masami Hiramatsu @ 2023-05-22  1:36 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ze Gao, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Sun, 21 May 2023 22:26:37 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > > kprobe_multi/fprobe share the same set of attachments with fentry.
> > > Currently, fentry does not filter with !rcu_is_watching, maybe
> > > because this is an extreme corner case. Not sure whether it is
> > > worthwhile or not.
> > 
> > Agreed, it's rare, especially after Peter's patches which push narrow
> > down rcu eqs regions
> > in the idle path and reduce the chance of any traceable functions
> > happening in between.
> > 
> > However, from RCU's perspective, we ought to check if rcu_is_watching
> > theoretically
> > when there's a chance our code will run in the idle path and also we
> > need rcu to be alive,
> > And also we cannot simply make assumptions for any future changes in
> > the idle path.
> > You know, just like what was hit in the thread.
> > 
> > > Maybe if you can give a concrete example (e.g., attachment point)
> > > with current code base to show what the issue you encountered and
> > > it will make it easier to judge whether adding !rcu_is_watching()
> > > is necessary or not.
> > 
> > I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> > traceable but not on the latest version
> > so far. But as I state above, in theory we need it. So here is a
> > gentle ping :) .
> 
> hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
> which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]
> 
> I might be missing something, but it seems like we don't need another
> rcu_is_watching call on kprobe_multi level

Good point! OK, then it seems we don't need it. The rethook continues to
use the rcu_is_watching() because it is also used from kprobes, but the
kprobe_multi doesn't need it.

Thank you,

> 
> jirka
> 
> 
> [1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
> [2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-21 20:26       ` Re: Jiri Olsa
  2023-05-22  1:36         ` Re: Masami Hiramatsu
@ 2023-05-22  2:07         ` Ze Gao
  2023-05-23  4:38           ` Re: Yonghong Song
                             ` (2 more replies)
  1 sibling, 3 replies; 31+ messages in thread
From: Ze Gao @ 2023-05-22  2:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

Oops, I missed that. Thanks for pointing that out, which I thought is
conditional use of rcu_is_watching before.

One last point, I think we should double check on this
     "fentry does not filter with !rcu_is_watching"
as quoted from Yonghong and argue whether it needs
the same check for fentry as well.

Regards,
Ze

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-22  2:07         ` Re: Ze Gao
@ 2023-05-23  4:38           ` Yonghong Song
  2023-05-23  5:30           ` Re: Masami Hiramatsu
  2023-05-23 14:10           ` kprobes and rcu_is_watching() Steven Rostedt
  2 siblings, 0 replies; 31+ messages in thread
From: Yonghong Song @ 2023-05-23  4:38 UTC (permalink / raw)
  To: Ze Gao, Jiri Olsa
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
	John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
	Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
	linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev,
	paulmck, songliubraving, Ze Gao



On 5/21/23 7:07 PM, Ze Gao wrote:
> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
> 
> One last point, I think we should double check on this
>       "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.

I would suggest that we address rcu_is_watching issue for fentry
only if we do have a reproducible case to show something goes wrong...

> 
> Regards,
> Ze

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-22  2:07         ` Re: Ze Gao
  2023-05-23  4:38           ` Re: Yonghong Song
@ 2023-05-23  5:30           ` Masami Hiramatsu
  2023-05-23  6:59             ` Re: Paul E. McKenney
  2023-05-23 14:10           ` kprobes and rcu_is_watching() Steven Rostedt
  2 siblings, 1 reply; 31+ messages in thread
From: Masami Hiramatsu @ 2023-05-23  5:30 UTC (permalink / raw)
  To: Ze Gao
  Cc: Jiri Olsa, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
	songliubraving, Ze Gao

On Mon, 22 May 2023 10:07:42 +0800
Ze Gao <zegao2021@gmail.com> wrote:

> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
> 
> One last point, I think we should double check on this
>      "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.

rcu_is_watching() comment says;

 * if the current CPU is not in its idle loop or is in an interrupt or
 * NMI handler, return true.

Thus it returns *fault* if the current CPU is in the idle loop and not
any interrupt(including NMI) context. This means if any tracable function
is called from idle loop, it can be !rcu_is_watching(). I meant, this is
'context' based check, thus fentry can not filter out that some commonly
used functions is called from that context but it can be detected.

Thank you,

> 
> Regards,
> Ze


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-23  5:30           ` Re: Masami Hiramatsu
@ 2023-05-23  6:59             ` Paul E. McKenney
  2023-05-25  0:13               ` Re: Masami Hiramatsu
  0 siblings, 1 reply; 31+ messages in thread
From: Paul E. McKenney @ 2023-05-23  6:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
	KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
	Ze Gao

On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> On Mon, 22 May 2023 10:07:42 +0800
> Ze Gao <zegao2021@gmail.com> wrote:
> 
> > Oops, I missed that. Thanks for pointing that out, which I thought is
> > conditional use of rcu_is_watching before.
> > 
> > One last point, I think we should double check on this
> >      "fentry does not filter with !rcu_is_watching"
> > as quoted from Yonghong and argue whether it needs
> > the same check for fentry as well.
> 
> rcu_is_watching() comment says;
> 
>  * if the current CPU is not in its idle loop or is in an interrupt or
>  * NMI handler, return true.
> 
> Thus it returns *fault* if the current CPU is in the idle loop and not
> any interrupt(including NMI) context. This means if any tracable function
> is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> 'context' based check, thus fentry can not filter out that some commonly
> used functions is called from that context but it can be detected.

It really does return false (rather than faulting?) if the current CPU
is deep within the idle loop.

In addition, the recent x86/entry rework (thank you Peter and
Thomas!) mean that the "idle loop" is quite restricted, as can be
seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
For example, in default_idle_call(), these are immediately before and
after the call to arch_cpu_idle().

Would the following help?  Or am I missing your point?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1449cb69a0e0..fae9b4e29c93 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
 /**
  * rcu_is_watching - see if RCU thinks that the current CPU is not idle
  *
- * Return true if RCU is watching the running CPU, which means that this
- * CPU can safely enter RCU read-side critical sections.  In other words,
- * if the current CPU is not in its idle loop or is in an interrupt or
- * NMI handler, return true.
+ * Return @true if RCU is watching the running CPU and @false otherwise.
+ * An @true return means that this CPU can safely enter RCU read-side
+ * critical sections.
+ *
+ * More specifically, if the current CPU is not deep within its idle
+ * loop, return @true.  Note that rcu_is_watching() will return @true if
+ * invoked from an interrupt or NMI handler, even if that interrupt or
+ * NMI interrupted the CPU while it was deep within its idle loop.
  *
  * Make notrace because it can be called by the internal functions of
  * ftrace, and making this notrace removes unnecessary recursion calls.

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: kprobes and rcu_is_watching()
  2023-05-22  2:07         ` Re: Ze Gao
  2023-05-23  4:38           ` Re: Yonghong Song
  2023-05-23  5:30           ` Re: Masami Hiramatsu
@ 2023-05-23 14:10           ` Steven Rostedt
  2023-05-24  3:51             ` Ze Gao
  2 siblings, 1 reply; 31+ messages in thread
From: Steven Rostedt @ 2023-05-23 14:10 UTC (permalink / raw)
  To: Ze Gao
  Cc: Jiri Olsa, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

[ Added a subject, as I always want to delete these emails as spam! ]

On Mon, 22 May 2023 10:07:42 +0800
Ze Gao <zegao2021@gmail.com> wrote:

> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
> 
> One last point, I think we should double check on this
>      "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.
> 

Note that trace_test_and_set_recursion() (which is used by
ftrace_test_recursion_trylock()) checks for rcu_is_watching() and
returns false if it isn't (and the trylock will fail).

-- Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: kprobes and rcu_is_watching()
  2023-05-23 14:10           ` kprobes and rcu_is_watching() Steven Rostedt
@ 2023-05-24  3:51             ` Ze Gao
  0 siblings, 0 replies; 31+ messages in thread
From: Ze Gao @ 2023-05-24  3:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jiri Olsa, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
	Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
	Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
	kpsingh, netdev, paulmck, songliubraving, Ze Gao

Thanks Steven, I think we've come to a consensus on this.

The question here is whether bpf tracing fentry i.e.,
__bpf_prog_enter{_sleepable}
needs to check rcu_is_watching as well before using rcu related
calls. And Yonghong suggested making a change when there is
indeed some bad case occurring since it's rare the tracee is in the idle path.


Regards,
Ze

On Tue, May 23, 2023 at 10:10 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> [ Added a subject, as I always want to delete these emails as spam! ]
>
> On Mon, 22 May 2023 10:07:42 +0800
> Ze Gao <zegao2021@gmail.com> wrote:
>
> > Oops, I missed that. Thanks for pointing that out, which I thought is
> > conditional use of rcu_is_watching before.
> >
> > One last point, I think we should double check on this
> >      "fentry does not filter with !rcu_is_watching"
> > as quoted from Yonghong and argue whether it needs
> > the same check for fentry as well.
> >
>
> Note that trace_test_and_set_recursion() (which is used by
> ftrace_test_recursion_trylock()) checks for rcu_is_watching() and
> returns false if it isn't (and the trylock will fail).
>
> -- Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re:
  2023-05-23  6:59             ` Re: Paul E. McKenney
@ 2023-05-25  0:13               ` Masami Hiramatsu
  0 siblings, 0 replies; 31+ messages in thread
From: Masami Hiramatsu @ 2023-05-25  0:13 UTC (permalink / raw)
  To: paulmck
  Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
	KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
	Steven Rostedt, Yonghong Song, bpf, linux-kernel,
	linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
	Ze Gao

On Mon, 22 May 2023 23:59:28 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> > On Mon, 22 May 2023 10:07:42 +0800
> > Ze Gao <zegao2021@gmail.com> wrote:
> > 
> > > Oops, I missed that. Thanks for pointing that out, which I thought is
> > > conditional use of rcu_is_watching before.
> > > 
> > > One last point, I think we should double check on this
> > >      "fentry does not filter with !rcu_is_watching"
> > > as quoted from Yonghong and argue whether it needs
> > > the same check for fentry as well.
> > 
> > rcu_is_watching() comment says;
> > 
> >  * if the current CPU is not in its idle loop or is in an interrupt or
> >  * NMI handler, return true.
> > 
> > Thus it returns *fault* if the current CPU is in the idle loop and not
> > any interrupt(including NMI) context. This means if any tracable function
> > is called from idle loop, it can be !rcu_is_watching(). I meant, this is
> > 'context' based check, thus fentry can not filter out that some commonly
> > used functions is called from that context but it can be detected.
> 
> It really does return false (rather than faulting?) if the current CPU
> is deep within the idle loop.
> 
> In addition, the recent x86/entry rework (thank you Peter and
> Thomas!) mean that the "idle loop" is quite restricted, as can be
> seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
> For example, in default_idle_call(), these are immediately before and
> after the call to arch_cpu_idle().

Thanks! I also found that the default_idle_call() is enough small and
it seems not happening on fentry because there are no commonly used
functions on that path.

> 
> Would the following help?  Or am I missing your point?

Yes, thank you for the update!

> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1449cb69a0e0..fae9b4e29c93 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
>  /**
>   * rcu_is_watching - see if RCU thinks that the current CPU is not idle
>   *
> - * Return true if RCU is watching the running CPU, which means that this
> - * CPU can safely enter RCU read-side critical sections.  In other words,
> - * if the current CPU is not in its idle loop or is in an interrupt or
> - * NMI handler, return true.
> + * Return @true if RCU is watching the running CPU and @false otherwise.
> + * An @true return means that this CPU can safely enter RCU read-side
> + * critical sections.
> + *
> + * More specifically, if the current CPU is not deep within its idle
> + * loop, return @true.  Note that rcu_is_watching() will return @true if
> + * invoked from an interrupt or NMI handler, even if that interrupt or
> + * NMI interrupted the CPU while it was deep within its idle loop.
>   *
>   * Make notrace because it can be called by the internal functions of
>   * ftrace, and making this notrace removes unnecessary recursion calls.


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2023-05-25  0:13 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
2022-05-15 20:36 ` [PATCH bpf-next 2/2] selftests/bpf: Remove filter for unsafe functions in kprobe_multi test Jiri Olsa
2022-05-16  4:25 ` [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Paul E. McKenney
2022-05-16 11:49   ` Frederic Weisbecker
2022-05-16 13:39     ` Paul E. McKenney
2022-05-17 10:13     ` Jiri Olsa
2022-05-18 16:21       ` Paul E. McKenney
2022-05-19 11:33         ` Jiri Olsa
2022-05-19 13:54           ` Paul E. McKenney
2022-05-23 13:12             ` Jiri Olsa
2022-05-24 17:33               ` Paul E. McKenney
2022-05-17  2:39 ` kernel test robot
2022-05-17  2:54 ` Yonghong Song
2022-05-17  7:49   ` Jiri Olsa
2022-05-19  0:00     ` Masami Hiramatsu
2022-05-17  9:19 ` kernel test robot
2023-05-20  9:47 ` Ze Gao
2023-05-21  3:58   ` Yonghong Song
2023-05-21 15:10     ` Re: Ze Gao
2023-05-21 20:26       ` Re: Jiri Olsa
2023-05-22  1:36         ` Re: Masami Hiramatsu
2023-05-22  2:07         ` Re: Ze Gao
2023-05-23  4:38           ` Re: Yonghong Song
2023-05-23  5:30           ` Re: Masami Hiramatsu
2023-05-23  6:59             ` Re: Paul E. McKenney
2023-05-25  0:13               ` Re: Masami Hiramatsu
2023-05-23 14:10           ` kprobes and rcu_is_watching() Steven Rostedt
2023-05-24  3:51             ` Ze Gao
2023-05-21  8:08   ` Jiri Olsa
2023-05-21 10:09     ` Re: Masami Hiramatsu
2023-05-21 14:19       ` Re: Ze Gao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).