All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sven Schnelle <svens@linux.ibm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yinan Liu <yinan@linux.alibaba.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Sachin Sant <sachinp@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Russell King <linux@armlinux.org.uk>,
	linux-arm-kernel@lists.infradead.org, hca@linux.ibm.com,
	linux-s390@vgr.kernel.org
Subject: ftrace hangs waiting for rcu (was: Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting)
Date: Thu, 27 Jan 2022 19:42:35 +0100	[thread overview]
Message-ID: <yt9dy231yq90.fsf_-_@linux.ibm.com> (raw)
In-Reply-To: <YfLjIOlGfFmbh1Zv@FVFF77S0Q05N> (Mark Rutland's message of "Thu, 27 Jan 2022 18:23:28 +0000")

Hi Mark,

Mark Rutland <mark.rutland@arm.com> writes:

> * I intermittently see a hang when running the tests. I previously hit that
>   when originally trying to bisect this issue (and IIRC that bisected down to
>   some RCU changes, but I need to re-run that). When the tests hang I
>   magic-srsrq + L tells me:
>
>   [  271.938438] sysrq: Show Blocked State
>   [  271.939245] task:ftracetest      state:D stack:    0 pid: 5687 ppid:  5627 flags:0x00000200
>   [  271.940961] Call trace:
>   [  271.941472]  __switch_to+0x104/0x160
>   [  271.942213]  __schedule+0x2b0/0x6e0
>   [  271.942933]  schedule+0x5c/0xf0
>   [  271.943586]  schedule_timeout+0x184/0x1c4
>   [  271.944410]  wait_for_completion+0x8c/0x12c
>   [  271.945274]  __wait_rcu_gp+0x184/0x190
>   [  271.946047]  synchronize_rcu_tasks_rude+0x48/0x70
>   [  271.947007]  update_ftrace_function+0xa4/0xec
>   [  271.947897]  __unregister_ftrace_function+0xa4/0xf0
>   [  271.948898]  unregister_ftrace_function+0x34/0x70
>   [  271.949857]  wakeup_tracer_reset+0x4c/0x100
>   [  271.950713]  tracing_set_tracer+0xd0/0x2b0
>   [  271.951552]  tracing_set_trace_write+0xe8/0x150
>   [  271.952477]  vfs_write+0xfc/0x284
>   [  271.953171]  ksys_write+0x7c/0x110
>   [  271.953874]  __arm64_sys_write+0x2c/0x40
>   [  271.954678]  invoke_syscall+0x5c/0x130
>   [  271.955442]  el0_svc_common.constprop.0+0x108/0x130
>   [  271.956435]  do_el0_svc+0x74/0x90
>   [  271.957124]  el0_svc+0x2c/0x90
>   [  271.957757]  el0t_64_sync_handler+0xa8/0x12c
>   [  271.958629]  el0t_64_sync+0x1a0/0x1a4

that's interesting. On s390 i'm seeing the same problem in CI, but with
the startup ftrace tests. So that's likely not arm64 spacific.

On s390, the last messages from ftrace are [    5.663568] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    5.667099] futex hash table entries: 65536 (order: 12, 16777216 bytes, vmalloc)
[    5.739549] Running postponed tracer tests:
[    5.740662] Testing tracer function: PASSED
[    6.194635] Testing dynamic ftrace: PASSED
[    6.471213] Testing dynamic ftrace ops #1: 
[    6.558445] (1 0 1 0 0) 
[    6.558458] (1 1 2 0 0) 
[    6.699135] (2 1 3 0 764347) 
[    6.699252] (2 2 4 0 766466) 
[    6.759857] (3 2 4 0 1159604)
[..] hangs here

The backtrace looks like this, which is very similar to the one above:

crash> bt 1
PID: 1      TASK: 80e68100          CPU: 133  COMMAND: "swapper/0"
 #0 [380004df808] __schedule at cda39f0e
 #1 [380004df880] schedule at cda3a488
 #2 [380004df8b0] schedule_timeout at cda41ef6
 #3 [380004df978] wait_for_completion at cda3bd0a
 #4 [380004df9d8] __wait_rcu_gp at ccdddd92
 #5 [380004dfa30] synchronize_rcu_tasks_generic at ccdde0aa
 #6 [380004dfad8] ftrace_shutdown at cce7b050
 #7 [380004dfb18] unregister_ftrace_function at cce7b192
 #8 [380004dfb50] trace_selftest_ops at cda1e0fa
 #9 [380004dfba0] run_tracer_selftest at cda1e4f2
#10 [380004dfc00] trace_selftest_startup_function at ce74355c
#11 [380004dfc58] run_tracer_selftest at cda1e2fc
#12 [380004dfc98] init_trace_selftests at ce742d30
#13 [380004dfcd0] do_one_initcall at cccdca16
#14 [380004dfd68] do_initcalls at ce72e776
#15 [380004dfde0] kernel_init_freeable at ce72ea60
#16 [380004dfe50] kernel_init at cda333fe
#17 [380004dfe68] __ret_from_fork at cccdf920
#18 [380004dfe98] ret_from_fork at cda444ca

I didn't had success reproducing it so far, but it is good to know that
this also happens when running the ftrace testsuite.

I have several crashdumps, so i could try to pull out some information
if someone tells me what to look for.

Thanks,
Sven

WARNING: multiple messages have this Message-ID (diff)
From: Sven Schnelle <svens@linux.ibm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yinan Liu <yinan@linux.alibaba.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Sachin Sant <sachinp@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Russell King <linux@armlinux.org.uk>,
	linux-arm-kernel@lists.infradead.org, hca@linux.ibm.com,
	linux-s390@vgr.kernel.org
Subject: ftrace hangs waiting for rcu (was: Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting)
Date: Thu, 27 Jan 2022 19:42:35 +0100	[thread overview]
Message-ID: <yt9dy231yq90.fsf_-_@linux.ibm.com> (raw)
In-Reply-To: <YfLjIOlGfFmbh1Zv@FVFF77S0Q05N> (Mark Rutland's message of "Thu,  27 Jan 2022 18:23:28 +0000")

Hi Mark,

Mark Rutland <mark.rutland@arm.com> writes:

> * I intermittently see a hang when running the tests. I previously hit that
>   when originally trying to bisect this issue (and IIRC that bisected down to
>   some RCU changes, but I need to re-run that). When the tests hang I
>   magic-srsrq + L tells me:
>
>   [  271.938438] sysrq: Show Blocked State
>   [  271.939245] task:ftracetest      state:D stack:    0 pid: 5687 ppid:  5627 flags:0x00000200
>   [  271.940961] Call trace:
>   [  271.941472]  __switch_to+0x104/0x160
>   [  271.942213]  __schedule+0x2b0/0x6e0
>   [  271.942933]  schedule+0x5c/0xf0
>   [  271.943586]  schedule_timeout+0x184/0x1c4
>   [  271.944410]  wait_for_completion+0x8c/0x12c
>   [  271.945274]  __wait_rcu_gp+0x184/0x190
>   [  271.946047]  synchronize_rcu_tasks_rude+0x48/0x70
>   [  271.947007]  update_ftrace_function+0xa4/0xec
>   [  271.947897]  __unregister_ftrace_function+0xa4/0xf0
>   [  271.948898]  unregister_ftrace_function+0x34/0x70
>   [  271.949857]  wakeup_tracer_reset+0x4c/0x100
>   [  271.950713]  tracing_set_tracer+0xd0/0x2b0
>   [  271.951552]  tracing_set_trace_write+0xe8/0x150
>   [  271.952477]  vfs_write+0xfc/0x284
>   [  271.953171]  ksys_write+0x7c/0x110
>   [  271.953874]  __arm64_sys_write+0x2c/0x40
>   [  271.954678]  invoke_syscall+0x5c/0x130
>   [  271.955442]  el0_svc_common.constprop.0+0x108/0x130
>   [  271.956435]  do_el0_svc+0x74/0x90
>   [  271.957124]  el0_svc+0x2c/0x90
>   [  271.957757]  el0t_64_sync_handler+0xa8/0x12c
>   [  271.958629]  el0t_64_sync+0x1a0/0x1a4

that's interesting. On s390 i'm seeing the same problem in CI, but with
the startup ftrace tests. So that's likely not arm64 spacific.

On s390, the last messages from ftrace are [    5.663568] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    5.667099] futex hash table entries: 65536 (order: 12, 16777216 bytes, vmalloc)
[    5.739549] Running postponed tracer tests:
[    5.740662] Testing tracer function: PASSED
[    6.194635] Testing dynamic ftrace: PASSED
[    6.471213] Testing dynamic ftrace ops #1: 
[    6.558445] (1 0 1 0 0) 
[    6.558458] (1 1 2 0 0) 
[    6.699135] (2 1 3 0 764347) 
[    6.699252] (2 2 4 0 766466) 
[    6.759857] (3 2 4 0 1159604)
[..] hangs here

The backtrace looks like this, which is very similar to the one above:

crash> bt 1
PID: 1      TASK: 80e68100          CPU: 133  COMMAND: "swapper/0"
 #0 [380004df808] __schedule at cda39f0e
 #1 [380004df880] schedule at cda3a488
 #2 [380004df8b0] schedule_timeout at cda41ef6
 #3 [380004df978] wait_for_completion at cda3bd0a
 #4 [380004df9d8] __wait_rcu_gp at ccdddd92
 #5 [380004dfa30] synchronize_rcu_tasks_generic at ccdde0aa
 #6 [380004dfad8] ftrace_shutdown at cce7b050
 #7 [380004dfb18] unregister_ftrace_function at cce7b192
 #8 [380004dfb50] trace_selftest_ops at cda1e0fa
 #9 [380004dfba0] run_tracer_selftest at cda1e4f2
#10 [380004dfc00] trace_selftest_startup_function at ce74355c
#11 [380004dfc58] run_tracer_selftest at cda1e2fc
#12 [380004dfc98] init_trace_selftests at ce742d30
#13 [380004dfcd0] do_one_initcall at cccdca16
#14 [380004dfd68] do_initcalls at ce72e776
#15 [380004dfde0] kernel_init_freeable at ce72ea60
#16 [380004dfe50] kernel_init at cda333fe
#17 [380004dfe68] __ret_from_fork at cccdf920
#18 [380004dfe98] ret_from_fork at cda444ca

I didn't had success reproducing it so far, but it is good to know that
this also happens when running the ftrace testsuite.

I have several crashdumps, so i could try to pull out some information
if someone tells me what to look for.

Thanks,
Sven

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Sven Schnelle <svens@linux.ibm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Kees Cook <keescook@chromium.org>,
	hca@linux.ibm.com, LKML <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	linux-s390@vgr.kernel.org, Sachin Sant <sachinp@linux.ibm.com>,
	Russell King <linux@armlinux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yinan Liu <yinan@linux.alibaba.com>,
	linuxppc-dev@lists.ozlabs.org, Ard Biesheuvel <ardb@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: ftrace hangs waiting for rcu (was: Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting)
Date: Thu, 27 Jan 2022 19:42:35 +0100	[thread overview]
Message-ID: <yt9dy231yq90.fsf_-_@linux.ibm.com> (raw)
In-Reply-To: <YfLjIOlGfFmbh1Zv@FVFF77S0Q05N> (Mark Rutland's message of "Thu,  27 Jan 2022 18:23:28 +0000")

Hi Mark,

Mark Rutland <mark.rutland@arm.com> writes:

> * I intermittently see a hang when running the tests. I previously hit that
>   when originally trying to bisect this issue (and IIRC that bisected down to
>   some RCU changes, but I need to re-run that). When the tests hang I
>   magic-srsrq + L tells me:
>
>   [  271.938438] sysrq: Show Blocked State
>   [  271.939245] task:ftracetest      state:D stack:    0 pid: 5687 ppid:  5627 flags:0x00000200
>   [  271.940961] Call trace:
>   [  271.941472]  __switch_to+0x104/0x160
>   [  271.942213]  __schedule+0x2b0/0x6e0
>   [  271.942933]  schedule+0x5c/0xf0
>   [  271.943586]  schedule_timeout+0x184/0x1c4
>   [  271.944410]  wait_for_completion+0x8c/0x12c
>   [  271.945274]  __wait_rcu_gp+0x184/0x190
>   [  271.946047]  synchronize_rcu_tasks_rude+0x48/0x70
>   [  271.947007]  update_ftrace_function+0xa4/0xec
>   [  271.947897]  __unregister_ftrace_function+0xa4/0xf0
>   [  271.948898]  unregister_ftrace_function+0x34/0x70
>   [  271.949857]  wakeup_tracer_reset+0x4c/0x100
>   [  271.950713]  tracing_set_tracer+0xd0/0x2b0
>   [  271.951552]  tracing_set_trace_write+0xe8/0x150
>   [  271.952477]  vfs_write+0xfc/0x284
>   [  271.953171]  ksys_write+0x7c/0x110
>   [  271.953874]  __arm64_sys_write+0x2c/0x40
>   [  271.954678]  invoke_syscall+0x5c/0x130
>   [  271.955442]  el0_svc_common.constprop.0+0x108/0x130
>   [  271.956435]  do_el0_svc+0x74/0x90
>   [  271.957124]  el0_svc+0x2c/0x90
>   [  271.957757]  el0t_64_sync_handler+0xa8/0x12c
>   [  271.958629]  el0t_64_sync+0x1a0/0x1a4

that's interesting. On s390 i'm seeing the same problem in CI, but with
the startup ftrace tests. So that's likely not arm64 spacific.

On s390, the last messages from ftrace are [    5.663568] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    5.667099] futex hash table entries: 65536 (order: 12, 16777216 bytes, vmalloc)
[    5.739549] Running postponed tracer tests:
[    5.740662] Testing tracer function: PASSED
[    6.194635] Testing dynamic ftrace: PASSED
[    6.471213] Testing dynamic ftrace ops #1: 
[    6.558445] (1 0 1 0 0) 
[    6.558458] (1 1 2 0 0) 
[    6.699135] (2 1 3 0 764347) 
[    6.699252] (2 2 4 0 766466) 
[    6.759857] (3 2 4 0 1159604)
[..] hangs here

The backtrace looks like this, which is very similar to the one above:

crash> bt 1
PID: 1      TASK: 80e68100          CPU: 133  COMMAND: "swapper/0"
 #0 [380004df808] __schedule at cda39f0e
 #1 [380004df880] schedule at cda3a488
 #2 [380004df8b0] schedule_timeout at cda41ef6
 #3 [380004df978] wait_for_completion at cda3bd0a
 #4 [380004df9d8] __wait_rcu_gp at ccdddd92
 #5 [380004dfa30] synchronize_rcu_tasks_generic at ccdde0aa
 #6 [380004dfad8] ftrace_shutdown at cce7b050
 #7 [380004dfb18] unregister_ftrace_function at cce7b192
 #8 [380004dfb50] trace_selftest_ops at cda1e0fa
 #9 [380004dfba0] run_tracer_selftest at cda1e4f2
#10 [380004dfc00] trace_selftest_startup_function at ce74355c
#11 [380004dfc58] run_tracer_selftest at cda1e2fc
#12 [380004dfc98] init_trace_selftests at ce742d30
#13 [380004dfcd0] do_one_initcall at cccdca16
#14 [380004dfd68] do_initcalls at ce72e776
#15 [380004dfde0] kernel_init_freeable at ce72ea60
#16 [380004dfe50] kernel_init at cda333fe
#17 [380004dfe68] __ret_from_fork at cccdf920
#18 [380004dfe98] ret_from_fork at cda444ca

I didn't had success reproducing it so far, but it is good to know that
this also happens when running the ftrace testsuite.

I have several crashdumps, so i could try to pull out some information
if someone tells me what to look for.

Thanks,
Sven

  reply	other threads:[~2022-01-27 18:43 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-27 16:42 [PATCH] ftrace: Have architectures opt-in for mcount build time sorting Steven Rostedt
2022-01-27 16:42 ` Steven Rostedt
2022-01-27 16:42 ` Steven Rostedt
2022-01-27 18:23 ` Mark Rutland
2022-01-27 18:23   ` Mark Rutland
2022-01-27 18:23   ` Mark Rutland
2022-01-27 18:42   ` Sven Schnelle [this message]
2022-01-27 18:42     ` ftrace hangs waiting for rcu (was: Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting) Sven Schnelle
2022-01-27 18:42     ` Sven Schnelle
2022-01-28 15:42     ` Mark Rutland
2022-01-28 15:42       ` Mark Rutland
2022-01-28 15:42       ` Mark Rutland
2022-01-28 16:08       ` ftrace hangs waiting for rcu Sven Schnelle
2022-01-28 16:08         ` Sven Schnelle
2022-01-28 16:08         ` Sven Schnelle
2022-01-28 16:11         ` Mark Rutland
2022-01-28 16:11           ` Mark Rutland
2022-01-28 16:11           ` Mark Rutland
2022-01-28 16:15           ` Paul E. McKenney
2022-01-28 16:15             ` Paul E. McKenney
2022-01-28 16:15             ` Paul E. McKenney
2022-01-28 17:47             ` Paul E. McKenney
2022-01-28 17:47               ` Paul E. McKenney
2022-01-28 17:47               ` Paul E. McKenney
2022-01-28 16:17           ` Sven Schnelle
2022-01-28 16:17             ` Sven Schnelle
2022-01-28 16:17             ` Sven Schnelle
2022-01-28 21:11 ` [PATCH] ftrace: Have architectures opt-in for mcount build time sorting Joe Lawrence
2022-01-28 21:11   ` Joe Lawrence
2022-01-28 21:11   ` Joe Lawrence
2022-01-28 21:39   ` Steven Rostedt
2022-01-28 21:39     ` Steven Rostedt
2022-01-28 21:39     ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yt9dy231yq90.fsf_-_@linux.ibm.com \
    --to=svens@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=hca@linux.ibm.com \
    --cc=keescook@chromium.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vgr.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sachinp@linux.ibm.com \
    --cc=yinan@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.