All of lore.kernel.org
 help / color / mirror / Atom feed
From: 王贇 <yun.wang@linux.alibaba.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	"open list:PERFORMANCE EVENTS SUBSYSTEM" 
	<linux-perf-users@vger.kernel.org>,
	"open list:PERFORMANCE EVENTS SUBSYSTEM" 
	<linux-kernel@vger.kernel.org>,
	"open list:BPF (Safe dynamic programs and tools)" 
	<netdev@vger.kernel.org>,
	"open list:BPF (Safe dynamic programs and tools)" 
	<bpf@vger.kernel.org>
Subject: [RFC PATCH] perf: fix panic by mark recursion inside perf_log_throttle
Date: Thu, 9 Sep 2021 11:13:21 +0800	[thread overview]
Message-ID: <ff979a43-045a-dc56-64d1-2c31dd4db381@linux.alibaba.com> (raw)

When running with ftrace function enabled, we observed panic
as below:

  traps: PANIC: double fault, error_code: 0x0
  [snip]
  RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
  [snip]
  Call Trace:
   <NMI>
   perf_trace_buf_alloc+0x26/0xd0
   perf_ftrace_function_call+0x18f/0x2e0
   kernelmode_fixup_or_oops+0x5/0x120
   __bad_area_nosemaphore+0x1b8/0x280
   do_user_addr_fault+0x410/0x920
   exc_page_fault+0x92/0x300
   asm_exc_page_fault+0x1e/0x30
  RIP: 0010:__get_user_nocheck_8+0x6/0x13
   perf_callchain_user+0x266/0x2f0
   get_perf_callchain+0x194/0x210
   perf_callchain+0xa3/0xc0
   perf_prepare_sample+0xa5/0xa60
   perf_event_output_forward+0x7b/0x1b0
   __perf_event_overflow+0x67/0x120
   perf_swevent_overflow+0xcb/0x110
   perf_swevent_event+0xb0/0xf0
   perf_tp_event+0x292/0x410
   perf_trace_run_bpf_submit+0x87/0xc0
   perf_trace_lock_acquire+0x12b/0x170
   lock_acquire+0x1bf/0x2e0
   perf_output_begin+0x70/0x4b0
   perf_log_throttle+0xe2/0x1a0
   perf_event_nmi_handler+0x30/0x50
   nmi_handle+0xba/0x2a0
   default_do_nmi+0x45/0xf0
   exc_nmi+0x155/0x170
   end_repeat_nmi+0x16/0x55

According to the trace we know the story is like this, the NMI
triggered perf IRQ throttling and call perf_log_throttle(),
which triggered the swevent overflow, and the overflow process
do perf_callchain_user() which triggered a user PF, and the PF
process triggered perf ftrace which finally lead into a suspected
stack overflow.

This patch marking the context as recursion during perf_log_throttle()
, so no more swevent during the process and no more panic.

Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
---
 kernel/events/core.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 744e872..6063443 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8716,6 +8716,7 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 	struct perf_output_handle handle;
 	struct perf_sample_data sample;
 	int ret;
+	int rctx;

 	struct {
 		struct perf_event_header	header;
@@ -8738,14 +8739,17 @@ static void perf_log_throttle(struct perf_event *event, int enable)

 	perf_event_header__init_id(&throttle_event.header, &sample, event);

+	rctx = perf_swevent_get_recursion_context();
 	ret = perf_output_begin(&handle, &sample, event,
 				throttle_event.header.size);
-	if (ret)
-		return;
+	if (!ret) {
+		perf_output_put(&handle, throttle_event);
+		perf_event__output_id_sample(event, &handle, &sample);
+		perf_output_end(&handle);
+	}

-	perf_output_put(&handle, throttle_event);
-	perf_event__output_id_sample(event, &handle, &sample);
-	perf_output_end(&handle);
+	if (rctx >= 0)
+		perf_swevent_put_recursion_context(rctx);
 }

 /*
-- 
1.8.3.1


             reply	other threads:[~2021-09-09  3:13 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-09  3:13 王贇 [this message]
2021-09-09  6:10 ` [RFC PATCH] perf: fix panic by mark recursion inside perf_log_throttle 王贇
2021-09-10 15:38 ` Peter Zijlstra
2021-09-13  3:00   ` 王贇
2021-09-13  3:21     ` 王贇
2021-09-13 10:24     ` Peter Zijlstra
2021-09-13 10:36       ` Peter Zijlstra
2021-09-14  2:02         ` 王贇
2021-09-14  1:58       ` 王贇
2021-09-14 10:28         ` Peter Zijlstra
2021-09-15  1:51           ` 王贇
2021-09-15 15:17             ` [PATCH] x86/dumpstack/64: Add guard pages to stack_info Peter Zijlstra
2021-09-16  3:34               ` 王贇
2021-09-16  3:47               ` 王贇
2021-09-16  8:00                 ` Peter Zijlstra
2021-09-16  8:03                   ` Peter Zijlstra
2021-09-16 10:02                     ` Peter Zijlstra
2021-09-17  2:15                       ` 王贇
2021-09-17  3:02                       ` 王贇
2021-09-17 10:21                         ` Peter Zijlstra
2021-09-17 16:40                           ` Peter Zijlstra
2021-09-18  2:30                             ` 王贇
2021-09-18  6:56                               ` Peter Zijlstra
2021-09-18  2:38                             ` 王贇
2021-09-13  3:30 ` [PATCH] perf: fix panic by disable ftrace on fault.c 王贇
2021-09-13 14:49   ` Dave Hansen
2021-09-14  1:52     ` 王贇
2021-09-14  3:02       ` 王贇
2021-09-14  7:23         ` 王贇
2021-09-14 16:16           ` Dave Hansen
2021-09-15  1:56             ` 王贇
2021-09-15  3:27               ` Dave Hansen
2021-09-15  7:22                 ` 王贇
2021-09-15  7:34                   ` 王贇
2021-09-15 15:19                     ` [PATCH] x86: Increase exception stack sizes Peter Zijlstra
2021-09-16  3:42                       ` 王贇
2021-09-21  7:28                       ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-09-21 12:41                       ` tip-bot2 for Peter Zijlstra
2021-09-14  2:08     ` [PATCH] perf: fix panic by disable ftrace on fault.c 王贇

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ff979a43-045a-dc56-64d1-2c31dd4db381@linux.alibaba.com \
    --to=yun.wang@linux.alibaba.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.