linux-kernel.vger.kernel.org archive mirror
* [for-next][PATCH 00/25] tracing: Updates for 6.2
@ 2022-12-10 13:57 Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 01/25] tracing/user_events: Fix call print_fmt leak Steven Rostedt
                   ` (24 more replies)
  0 siblings, 25 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton

  git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
trace/for-next

Head SHA1: d3f56476437f78d2cdea60ead59406b1da278584


Bagas Sanjaya (1):
      Documentation/osnoise: Escape underscore of NO_ prefix

Beau Belgrave (1):
      tracing/user_events: Fix call print_fmt leak

Daniel Bristot de Oliveira (4):
      tracing/osnoise: Make osnoise_options static
      tracing/osnoise: Add PANIC_ON_STOP option
      tracing/osnoise: Add preempt and/or irq disabled options
      Documentation/osnoise: Add osnoise/options documentation

David Howells (1):
      tracing: Fix some checker warnings

Masami Hiramatsu (Google) (5):
      tracing: Add .percent suffix option to histogram values
      tracing: Add .graph suffix option to histogram value
      tracing: Add nohitcount option for suppressing display of raw hitcount
      tracing: docs: Update histogram doc for .percent/.graph and 'nohitcount'
      tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE

Ross Zwisler (1):
      tracing: remove unnecessary trace_trigger ifdef

Song Chen (1):
      trace/kprobe: remove duplicated calls of ring_buffer_event_data

Steven Rostedt (3):
      x86/mm/kmmio: Switch to arch_spin_lock()
      x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
      ring-buffer: Handle resize in early boot up

Steven Rostedt (Google) (3):
      tracing: Update MAINTAINERS file for new patchwork and mailing list
      ftrace/x86: Add back ftrace_expected for ftrace bug reports
      tracing/probes: Handle system names with hyphens

Tom Zanussi (1):
      tracing: Allow multiple hitcount values in histograms

Zheng Yejian (4):
      tracing/hist: Fix wrong return value in parse_action_params()
      tracing/hist: Fix out-of-bound write on 'action_data.var_ref_idx'
      tracing: Fix issue of missing one synthetic field
      tracing/hist: Fix issue of losting command info in error_log

----
 Documentation/trace/histogram.rst      |  10 +-
 Documentation/trace/osnoise-tracer.rst |  22 +++-
 MAINTAINERS                            |   9 ++
 arch/x86/kernel/ftrace.c               |   2 +
 arch/x86/mm/kmmio.c                    |  37 ++++---
 include/linux/trace_events.h           |   3 +-
 include/linux/trace_seq.h              |   3 +-
 kernel/trace/Kconfig                   |   2 +
 kernel/trace/ring_buffer.c             |  32 ++++--
 kernel/trace/trace.c                   |  30 ++++--
 kernel/trace/trace.h                   |  29 +++--
 kernel/trace/trace_events.c            |   6 --
 kernel/trace/trace_events_hist.c       | 190 ++++++++++++++++++++++++++++-----
 kernel/trace/trace_events_synth.c      |   2 +-
 kernel/trace/trace_events_user.c       |   1 +
 kernel/trace/trace_kprobe.c            |   2 -
 kernel/trace/trace_osnoise.c           |  56 ++++++++--
 kernel/trace/trace_output.c            |   5 +-
 kernel/trace/trace_probe.c             |   2 +-
 19 files changed, 353 insertions(+), 90 deletions(-)


* [for-next][PATCH 01/25] tracing/user_events: Fix call print_fmt leak
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 02/25] tracing: Update MAINTAINERS file for new patchwork and mailing list Steven Rostedt
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Beau Belgrave

From: Beau Belgrave <beaub@linux.microsoft.com>

If user_event_trace_register() fails within user_event_parse(), the
call's print_fmt member is not freed. Add a kfree() call to fix this.
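
A standalone userspace illustration of the leak class being fixed (hypothetical
names, not the kernel code): the error path frees the container but, without
the added line, not a member that was allocated earlier.

  #include <stdlib.h>
  #include <string.h>

  struct call { char *print_fmt; };

  /* Simplified parse path: allocate a container plus a member, then fail. */
  static int parse(void)
  {
          struct call *call = calloc(1, sizeof(*call));

          if (!call)
                  return -1;
          call->print_fmt = strdup("\"id: %u\", id");

          /* The "register" step fails, so unwind everything. */
          free(call->print_fmt);  /* analogous to the kfree() this patch adds */
          free(call);
          return -1;
  }

  int main(void)
  {
          parse();
          return 0;
  }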

Link: https://lkml.kernel.org/r/20221123183248.554-1-beaub@linux.microsoft.com

Fixes: aa3b2b4c6692 ("user_events: Add print_fmt generation support for basic types")
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 539b08ae7020..9cb53182bb31 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1359,6 +1359,7 @@ static int user_event_parse(struct user_event_group *group, char *name,
 put_user:
 	user_event_destroy_fields(user);
 	user_event_destroy_validators(user);
+	kfree(user->call.print_fmt);
 	kfree(user);
 	return ret;
 }
-- 
2.35.1




* [for-next][PATCH 02/25] tracing: Update MAINTAINERS file for new patchwork and mailing list
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 01/25] tracing/user_events: Fix call print_fmt leak Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 03/25] ftrace/x86: Add back ftrace_expected for ftrace bug reports Steven Rostedt
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The tracing subsystem now has its own mailing list (although patches
should also be sent to LKML) as well as a new patchwork entry for
kernel-related tracing patches.

Update the MAINTAINERS file to reflect the changes.

Link: https://lore.kernel.org/linux-trace-kernel/20221017140513.14b9ce2e@gandalf.local.home

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2585e7edc335..d12576150a70 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8461,6 +8461,9 @@ FUNCTION HOOKS (FTRACE)
 M:	Steven Rostedt <rostedt@goodmis.org>
 M:	Masami Hiramatsu <mhiramat@kernel.org>
 R:	Mark Rutland <mark.rutland@arm.com>
+L:	linux-kernel@vger.kernel.org
+L:	linux-trace-kernel@vger.kernel.org
+Q:	https://patchwork.kernel.org/project/linux-trace-kernel/list/
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
 F:	Documentation/trace/ftrace*
@@ -11483,6 +11486,9 @@ M:	Naveen N. Rao <naveen.n.rao@linux.ibm.com>
 M:	Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
 M:	"David S. Miller" <davem@davemloft.net>
 M:	Masami Hiramatsu <mhiramat@kernel.org>
+L:	linux-kernel@vger.kernel.org
+L:	linux-trace-kernel@vger.kernel.org
+Q:	https://patchwork.kernel.org/project/linux-trace-kernel/list/
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
 F:	Documentation/trace/kprobes.rst
@@ -20862,6 +20868,9 @@ F:	drivers/hwmon/pmbus/tps546d24.c
 TRACING
 M:	Steven Rostedt <rostedt@goodmis.org>
 M:	Masami Hiramatsu <mhiramat@kernel.org>
+L:	linux-kernel@vger.kernel.org
+L:	linux-trace-kernel@vger.kernel.org
+Q:	https://patchwork.kernel.org/project/linux-trace-kernel/list/
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
 F:	Documentation/trace/*
-- 
2.35.1




* [for-next][PATCH 03/25] ftrace/x86: Add back ftrace_expected for ftrace bug reports
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 01/25] tracing/user_events: Fix call print_fmt leak Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 02/25] tracing: Update MAINTAINERS file for new patchwork and mailing list Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 04/25] tracing: Allow multiple hitcount values in histograms Steven Rostedt
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Peter Zijlstra, Thomas Gleixner,
	x86, Borislav Petkov, Ingo Molnar, stable

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

After someone reported a bug with a failed modification due to the
expected value not matching what was found, it came to my attention that
ftrace_expected is no longer set when that happens. This makes debugging
the issue a bit more difficult.

Set ftrace_expected to the expected code before calling ftrace_bug, so
that it shows what was expected and why it failed.

Link: https://lore.kernel.org/all/CA+wXwBQ-VhK+hpBtYtyZP-NiX4g8fqRRWithFOHQW-0coQ3vLg@mail.gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20221209105247.01d4e51d@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 768ae4406a5c ("x86/ftrace: Use text_poke()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 arch/x86/kernel/ftrace.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index bd165004776d..e07234ec7e23 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -217,7 +217,9 @@ void ftrace_replace_code(int enable)
 
 		ret = ftrace_verify_code(rec->ip, old);
 		if (ret) {
+			ftrace_expected = old;
 			ftrace_bug(ret, rec);
+			ftrace_expected = NULL;
 			return;
 		}
 	}
-- 
2.35.1




* [for-next][PATCH 04/25] tracing: Allow multiple hitcount values in histograms
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (2 preceding siblings ...)
  2022-12-10 13:57 ` [for-next][PATCH 03/25] ftrace/x86: Add back ftrace_expected for ftrace bug reports Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 05/25] tracing: Add .percent suffix option to histogram values Steven Rostedt
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi

From: Tom Zanussi <zanussi@kernel.org>

The hitcount is treated specially in the histograms - since it's
always expected to be there regardless of whether the user specified
anything or not, it's always added as the first histogram value.

Currently the code doesn't allow it to be added more than once as a
value, which is inconsistent with all the other possible values.  It
would seem to be a pointless thing to want to do, but other features
being added such as percent and graph modifiers don't work properly
with the current hitcount restrictions.

Fix this by allowing multiple hitcounts to be added.

Link: https://lore.kernel.org/linux-trace-kernel/166610812248.56030.16754785928712505251.stgit@devnote2

Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
---
 kernel/trace/trace_events_hist.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 1c82478e8dff..31d58ddcc1d9 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1356,6 +1356,8 @@ static const char *hist_field_name(struct hist_field *field,
 			field_name = field->name;
 	} else if (field->flags & HIST_FIELD_FL_TIMESTAMP)
 		field_name = "common_timestamp";
+	else if (field->flags & HIST_FIELD_FL_HITCOUNT)
+		field_name = "hitcount";
 
 	if (field_name == NULL)
 		field_name = "";
@@ -2328,6 +2330,8 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
 			hist_data->attrs->ts_in_usecs = true;
 	} else if (strcmp(field_name, "common_cpu") == 0)
 		*flags |= HIST_FIELD_FL_CPU;
+	else if (strcmp(field_name, "hitcount") == 0)
+		*flags |= HIST_FIELD_FL_HITCOUNT;
 	else {
 		field = trace_find_event_field(file->event_call, field_name);
 		if (!field || !field->size) {
@@ -4328,8 +4332,8 @@ static int create_var_field(struct hist_trigger_data *hist_data,
 static int create_val_fields(struct hist_trigger_data *hist_data,
 			     struct trace_event_file *file)
 {
+	unsigned int i, j = 1, n_hitcount = 0;
 	char *fields_str, *field_str;
-	unsigned int i, j = 1;
 	int ret;
 
 	ret = create_hitcount_val(hist_data);
@@ -4346,8 +4350,10 @@ static int create_val_fields(struct hist_trigger_data *hist_data,
 		if (!field_str)
 			break;
 
-		if (strcmp(field_str, "hitcount") == 0)
-			continue;
+		if (strcmp(field_str, "hitcount") == 0) {
+			if (!n_hitcount++)
+				continue;
+		}
 
 		ret = create_val_field(hist_data, j++, file, field_str);
 		if (ret)
-- 
2.35.1




* [for-next][PATCH 05/25] tracing: Add .percent suffix option to histogram values
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (3 preceding siblings ...)
  2022-12-10 13:57 ` [for-next][PATCH 04/25] tracing: Allow multiple hitcount values in histograms Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 06/25] tracing: Add .graph suffix option to histogram value Steven Rostedt
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi

From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>

Add a .percent suffix option to show histogram values as percentages.
This feature is useful when we need to understand the overall trend
for histograms of large values.
E.g. this shows the runtime percentage for each task.

------
  # cd /sys/kernel/debug/tracing/
  # echo hist:keys=pid:vals=hitcount,runtime.percent:sort=pid > \
    events/sched/sched_stat_runtime/trigger
  # sleep 10
  # cat events/sched/sched_stat_runtime/hist
 # event histogram
 #
 # trigger info: hist:keys=pid:vals=hitcount,runtime.percent:sort=pid:size=2048 [active]
 #

 { pid:          8 } hitcount:          7  runtime (%):   4.14
 { pid:         14 } hitcount:          5  runtime (%):   3.69
 { pid:         16 } hitcount:         11  runtime (%):   3.41
 { pid:         61 } hitcount:         41  runtime (%):  19.75
 { pid:         65 } hitcount:          4  runtime (%):   1.48
 { pid:         70 } hitcount:          6  runtime (%):   3.60
 { pid:         72 } hitcount:          2  runtime (%):   1.10
 { pid:        144 } hitcount:         10  runtime (%):  32.01
 { pid:        151 } hitcount:          8  runtime (%):  22.66
 { pid:        152 } hitcount:          2  runtime (%):   8.10

 Totals:
     Hits: 96
     Entries: 10
     Dropped: 0
-----
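
For reference, the percentage is kept in fixed point (hundredths of a percent)
so it can be printed without floating point. A minimal userspace sketch of the
same arithmetic as the patch's __get_percentage(), using plain division in
place of the kernel's div64_ul()/div64_u64() helpers:

  #include <limits.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Return val/total scaled to hundredths of a percent; UINT_MAX flags an error. */
  static unsigned int get_percentage(uint64_t val, uint64_t total)
  {
          if (!total)
                  return val ? UINT_MAX : 0;

          if (val < UINT64_MAX / 10000)   /* safe to scale val up first */
                  return (unsigned int)(val * 10000 / total);

          total /= 10000;                 /* val is huge: scale total down instead */
          if (!total)
                  return UINT_MAX;

          return (unsigned int)(val / total);
  }

  int main(void)
  {
          unsigned int pc = get_percentage(1975, 10000);

          /* Prints " 19.75", matching the "runtime (%)" column style above. */
          printf("%3u.%02u\n", pc / 100, pc % 100);
          return 0;
  }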

Link: https://lore.kernel.org/linux-trace-kernel/166610813077.56030.4238090506973562347.stgit@devnote2

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
---
 kernel/trace/trace.c             |  3 +-
 kernel/trace/trace_events_hist.c | 90 +++++++++++++++++++++++++++-----
 2 files changed, 78 insertions(+), 15 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 93a75a97118f..08e9568849b1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5724,7 +5724,8 @@ static const char readme_msg[] =
 	"\t            .syscall    display a syscall id as a syscall name\n"
 	"\t            .log2       display log2 value rather than raw number\n"
 	"\t            .buckets=size  display values in groups of size rather than raw number\n"
-	"\t            .usecs      display a common_timestamp in microseconds\n\n"
+	"\t            .usecs      display a common_timestamp in microseconds\n"
+	"\t            .percent    display a number of percentage value\n\n"
 	"\t    The 'pause' parameter can be used to pause an existing hist\n"
 	"\t    trigger or to start a hist trigger but not log any events\n"
 	"\t    until told to do so.  'continue' can be used to start or\n"
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 31d58ddcc1d9..35b0e956f06e 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -506,6 +506,7 @@ enum hist_field_flags {
 	HIST_FIELD_FL_ALIAS		= 1 << 16,
 	HIST_FIELD_FL_BUCKET		= 1 << 17,
 	HIST_FIELD_FL_CONST		= 1 << 18,
+	HIST_FIELD_FL_PERCENT		= 1 << 19,
 };
 
 struct var_defs {
@@ -1707,6 +1708,8 @@ static const char *get_hist_field_flags(struct hist_field *hist_field)
 		flags_str = "buckets";
 	else if (hist_field->flags & HIST_FIELD_FL_TIMESTAMP_USECS)
 		flags_str = "usecs";
+	else if (hist_field->flags & HIST_FIELD_FL_PERCENT)
+		flags_str = "percent";
 
 	return flags_str;
 }
@@ -2315,6 +2318,10 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
 			if (ret || !(*buckets))
 				goto error;
 			*flags |= HIST_FIELD_FL_BUCKET;
+		} else if (strncmp(modifier, "percent", 7) == 0) {
+			if (*flags & (HIST_FIELD_FL_VAR | HIST_FIELD_FL_KEY))
+				goto error;
+			*flags |= HIST_FIELD_FL_PERCENT;
 		} else {
  error:
 			hist_err(tr, HIST_ERR_BAD_FIELD_MODIFIER, errpos(modifier));
@@ -5291,33 +5298,69 @@ static void hist_trigger_print_key(struct seq_file *m,
 	seq_puts(m, "}");
 }
 
+/* Get the 100 times of the percentage of @val in @total */
+static inline unsigned int __get_percentage(u64 val, u64 total)
+{
+	if (!total)
+		goto div0;
+
+	if (val < (U64_MAX / 10000))
+		return (unsigned int)div64_ul(val * 10000, total);
+
+	total = div64_u64(total, 10000);
+	if (!total)
+		goto div0;
+
+	return (unsigned int)div64_ul(val, total);
+div0:
+	return val ? UINT_MAX : 0;
+}
+
+static void hist_trigger_print_val(struct seq_file *m, unsigned int idx,
+				   const char *field_name, unsigned long flags,
+				   u64 *totals, struct tracing_map_elt *elt)
+{
+	u64 val = tracing_map_read_sum(elt, idx);
+	unsigned int pc;
+
+	if (flags & HIST_FIELD_FL_PERCENT) {
+		pc = __get_percentage(val, totals[idx]);
+		if (pc == UINT_MAX)
+			seq_printf(m, " %s (%%):[ERROR]", field_name);
+		else
+			seq_printf(m, " %s (%%): %3u.%02u", field_name,
+					pc / 100, pc % 100);
+	} else if (flags & HIST_FIELD_FL_HEX) {
+		seq_printf(m, " %s: %10llx", field_name, val);
+	} else {
+		seq_printf(m, " %s: %10llu", field_name, val);
+	}
+}
+
 static void hist_trigger_entry_print(struct seq_file *m,
 				     struct hist_trigger_data *hist_data,
+				     u64 *totals,
 				     void *key,
 				     struct tracing_map_elt *elt)
 {
 	const char *field_name;
-	unsigned int i;
+	unsigned int i = HITCOUNT_IDX;
+	unsigned long flags;
 
 	hist_trigger_print_key(m, hist_data, key, elt);
 
-	seq_printf(m, " hitcount: %10llu",
-		   tracing_map_read_sum(elt, HITCOUNT_IDX));
+	/* At first, show the raw hitcount always */
+	hist_trigger_print_val(m, i, "hitcount", 0, totals, elt);
 
 	for (i = 1; i < hist_data->n_vals; i++) {
 		field_name = hist_field_name(hist_data->fields[i], 0);
+		flags = hist_data->fields[i]->flags;
 
-		if (hist_data->fields[i]->flags & HIST_FIELD_FL_VAR ||
-		    hist_data->fields[i]->flags & HIST_FIELD_FL_EXPR)
+		if (flags & HIST_FIELD_FL_VAR || flags & HIST_FIELD_FL_EXPR)
 			continue;
 
-		if (hist_data->fields[i]->flags & HIST_FIELD_FL_HEX) {
-			seq_printf(m, "  %s: %10llx", field_name,
-				   tracing_map_read_sum(elt, i));
-		} else {
-			seq_printf(m, "  %s: %10llu", field_name,
-				   tracing_map_read_sum(elt, i));
-		}
+		seq_puts(m, " ");
+		hist_trigger_print_val(m, i, field_name, flags, totals, elt);
 	}
 
 	print_actions(m, hist_data, elt);
@@ -5330,7 +5373,8 @@ static int print_entries(struct seq_file *m,
 {
 	struct tracing_map_sort_entry **sort_entries = NULL;
 	struct tracing_map *map = hist_data->map;
-	int i, n_entries;
+	int i, j, n_entries;
+	u64 *totals = NULL;
 
 	n_entries = tracing_map_sort_entries(map, hist_data->sort_keys,
 					     hist_data->n_sort_keys,
@@ -5338,11 +5382,29 @@ static int print_entries(struct seq_file *m,
 	if (n_entries < 0)
 		return n_entries;
 
+	for (j = 0; j < hist_data->n_vals; j++) {
+		if (!(hist_data->fields[j]->flags & HIST_FIELD_FL_PERCENT))
+			continue;
+		if (!totals) {
+			totals = kcalloc(hist_data->n_vals, sizeof(u64),
+					 GFP_KERNEL);
+			if (!totals) {
+				n_entries = -ENOMEM;
+				goto out;
+			}
+		}
+		for (i = 0; i < n_entries; i++)
+			totals[j] += tracing_map_read_sum(
+					sort_entries[i]->elt, j);
+	}
+
 	for (i = 0; i < n_entries; i++)
-		hist_trigger_entry_print(m, hist_data,
+		hist_trigger_entry_print(m, hist_data, totals,
 					 sort_entries[i]->key,
 					 sort_entries[i]->elt);
 
+	kfree(totals);
+out:
 	tracing_map_destroy_sort_entries(sort_entries, n_entries);
 
 	return n_entries;
-- 
2.35.1




* [for-next][PATCH 06/25] tracing: Add .graph suffix option to histogram value
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (4 preceding siblings ...)
  2022-12-10 13:57 ` [for-next][PATCH 05/25] tracing: Add .percent suffix option to histogram values Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 07/25] tracing: Add nohitcount option for suppressing display of raw hitcount Steven Rostedt
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi

From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>

Add the .graph suffix, which shows a bar graph of the histogram value.

For example, the output below shows a bar graph of the
runtime histogram for each task.

------
  # cd /sys/kernel/debug/tracing/
  # echo hist:keys=pid:vals=runtime.graph:sort=pid > \
   events/sched/sched_stat_runtime/trigger
  # sleep 10
  # cat events/sched/sched_stat_runtime/hist
 # event histogram
 #
 # trigger info: hist:keys=pid:vals=hitcount,runtime.graph:sort=pid:size=2048 [active]
 #

 { pid:         14 } hitcount:          2  runtime:
 { pid:         16 } hitcount:          8  runtime:
 { pid:         26 } hitcount:          1  runtime:
 { pid:         57 } hitcount:          3  runtime:
 { pid:         61 } hitcount:         20  runtime: ###
 { pid:         66 } hitcount:          2  runtime:
 { pid:         70 } hitcount:          3  runtime:
 { pid:         72 } hitcount:          2  runtime:
 { pid:        145 } hitcount:         14  runtime: ####################
 { pid:        152 } hitcount:          5  runtime: #######
 { pid:        153 } hitcount:          2  runtime: ####

 Totals:
     Hits: 62
     Entries: 11
     Dropped: 0
-------
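
The bar is simply the value scaled against the largest value in the column and
rendered as a fixed-width run of '#' characters (the kernel code in the diff
below routes this through __get_percentage() to stay overflow safe for 64-bit
sums). A simplified userspace sketch of the idea:

  #include <stdint.h>
  #include <stdio.h>

  #define BAR_CHAR '#'

  /* Fill buf (size characters plus a NUL) with a bar proportional to val/max. */
  static const char *fill_bar_str(char *buf, int size, uint64_t val, uint64_t max)
  {
          int len = max ? (int)(val * size / max) : 0;
          int i;

          for (i = 0; i < len && i < size; i++)
                  buf[i] = BAR_CHAR;
          while (i < size)
                  buf[i++] = ' ';
          buf[size] = '\0';
          return buf;
  }

  int main(void)
  {
          char bar[21];

          printf("[%s]\n", fill_bar_str(bar, 20, 14, 14)); /* full bar  */
          printf("[%s]\n", fill_bar_str(bar, 20, 5, 14));  /* ~1/3 bar  */
          printf("[%s]\n", fill_bar_str(bar, 20, 0, 14));  /* empty bar */
          return 0;
  }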

Link: https://lore.kernel.org/linux-trace-kernel/166610813953.56030.10944148382315789485.stgit@devnote2

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
---
 kernel/trace/trace.c             |  3 +-
 kernel/trace/trace_events_hist.c | 77 +++++++++++++++++++++++++-------
 2 files changed, 63 insertions(+), 17 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 08e9568849b1..55aec4616d8b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5725,7 +5725,8 @@ static const char readme_msg[] =
 	"\t            .log2       display log2 value rather than raw number\n"
 	"\t            .buckets=size  display values in groups of size rather than raw number\n"
 	"\t            .usecs      display a common_timestamp in microseconds\n"
-	"\t            .percent    display a number of percentage value\n\n"
+	"\t            .percent    display a number of percentage value\n"
+	"\t            .graph      display a bar-graph of a value\n\n"
 	"\t    The 'pause' parameter can be used to pause an existing hist\n"
 	"\t    trigger or to start a hist trigger but not log any events\n"
 	"\t    until told to do so.  'continue' can be used to start or\n"
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 35b0e956f06e..946b2b8f0f2c 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -507,6 +507,7 @@ enum hist_field_flags {
 	HIST_FIELD_FL_BUCKET		= 1 << 17,
 	HIST_FIELD_FL_CONST		= 1 << 18,
 	HIST_FIELD_FL_PERCENT		= 1 << 19,
+	HIST_FIELD_FL_GRAPH		= 1 << 20,
 };
 
 struct var_defs {
@@ -1710,6 +1711,8 @@ static const char *get_hist_field_flags(struct hist_field *hist_field)
 		flags_str = "usecs";
 	else if (hist_field->flags & HIST_FIELD_FL_PERCENT)
 		flags_str = "percent";
+	else if (hist_field->flags & HIST_FIELD_FL_GRAPH)
+		flags_str = "graph";
 
 	return flags_str;
 }
@@ -2322,6 +2325,10 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
 			if (*flags & (HIST_FIELD_FL_VAR | HIST_FIELD_FL_KEY))
 				goto error;
 			*flags |= HIST_FIELD_FL_PERCENT;
+		} else if (strncmp(modifier, "graph", 5) == 0) {
+			if (*flags & (HIST_FIELD_FL_VAR | HIST_FIELD_FL_KEY))
+				goto error;
+			*flags |= HIST_FIELD_FL_GRAPH;
 		} else {
  error:
 			hist_err(tr, HIST_ERR_BAD_FIELD_MODIFIER, errpos(modifier));
@@ -5316,20 +5323,52 @@ static inline unsigned int __get_percentage(u64 val, u64 total)
 	return val ? UINT_MAX : 0;
 }
 
+#define BAR_CHAR '#'
+
+static inline const char *__fill_bar_str(char *buf, int size, u64 val, u64 max)
+{
+	unsigned int len = __get_percentage(val, max);
+	int i;
+
+	if (len == UINT_MAX) {
+		snprintf(buf, size, "[ERROR]");
+		return buf;
+	}
+
+	len = len * size / 10000;
+	for (i = 0; i < len && i < size; i++)
+		buf[i] = BAR_CHAR;
+	while (i < size)
+		buf[i++] = ' ';
+	buf[size] = '\0';
+
+	return buf;
+}
+
+struct hist_val_stat {
+	u64 max;
+	u64 total;
+};
+
 static void hist_trigger_print_val(struct seq_file *m, unsigned int idx,
 				   const char *field_name, unsigned long flags,
-				   u64 *totals, struct tracing_map_elt *elt)
+				   struct hist_val_stat *stats,
+				   struct tracing_map_elt *elt)
 {
 	u64 val = tracing_map_read_sum(elt, idx);
 	unsigned int pc;
+	char bar[21];
 
 	if (flags & HIST_FIELD_FL_PERCENT) {
-		pc = __get_percentage(val, totals[idx]);
+		pc = __get_percentage(val, stats[idx].total);
 		if (pc == UINT_MAX)
 			seq_printf(m, " %s (%%):[ERROR]", field_name);
 		else
 			seq_printf(m, " %s (%%): %3u.%02u", field_name,
 					pc / 100, pc % 100);
+	} else if (flags & HIST_FIELD_FL_GRAPH) {
+		seq_printf(m, " %s: %20s", field_name,
+			   __fill_bar_str(bar, 20, val, stats[idx].max));
 	} else if (flags & HIST_FIELD_FL_HEX) {
 		seq_printf(m, " %s: %10llx", field_name, val);
 	} else {
@@ -5339,7 +5378,7 @@ static void hist_trigger_print_val(struct seq_file *m, unsigned int idx,
 
 static void hist_trigger_entry_print(struct seq_file *m,
 				     struct hist_trigger_data *hist_data,
-				     u64 *totals,
+				     struct hist_val_stat *stats,
 				     void *key,
 				     struct tracing_map_elt *elt)
 {
@@ -5350,7 +5389,7 @@ static void hist_trigger_entry_print(struct seq_file *m,
 	hist_trigger_print_key(m, hist_data, key, elt);
 
 	/* At first, show the raw hitcount always */
-	hist_trigger_print_val(m, i, "hitcount", 0, totals, elt);
+	hist_trigger_print_val(m, i, "hitcount", 0, stats, elt);
 
 	for (i = 1; i < hist_data->n_vals; i++) {
 		field_name = hist_field_name(hist_data->fields[i], 0);
@@ -5360,7 +5399,7 @@ static void hist_trigger_entry_print(struct seq_file *m,
 			continue;
 
 		seq_puts(m, " ");
-		hist_trigger_print_val(m, i, field_name, flags, totals, elt);
+		hist_trigger_print_val(m, i, field_name, flags, stats, elt);
 	}
 
 	print_actions(m, hist_data, elt);
@@ -5374,7 +5413,8 @@ static int print_entries(struct seq_file *m,
 	struct tracing_map_sort_entry **sort_entries = NULL;
 	struct tracing_map *map = hist_data->map;
 	int i, j, n_entries;
-	u64 *totals = NULL;
+	struct hist_val_stat *stats = NULL;
+	u64 val;
 
 	n_entries = tracing_map_sort_entries(map, hist_data->sort_keys,
 					     hist_data->n_sort_keys,
@@ -5382,28 +5422,33 @@ static int print_entries(struct seq_file *m,
 	if (n_entries < 0)
 		return n_entries;
 
+	/* Calculate the max and the total for each field if needed. */
 	for (j = 0; j < hist_data->n_vals; j++) {
-		if (!(hist_data->fields[j]->flags & HIST_FIELD_FL_PERCENT))
+		if (!(hist_data->fields[j]->flags &
+			(HIST_FIELD_FL_PERCENT | HIST_FIELD_FL_GRAPH)))
 			continue;
-		if (!totals) {
-			totals = kcalloc(hist_data->n_vals, sizeof(u64),
-					 GFP_KERNEL);
-			if (!totals) {
+		if (!stats) {
+			stats = kcalloc(hist_data->n_vals, sizeof(*stats),
+				       GFP_KERNEL);
+			if (!stats) {
 				n_entries = -ENOMEM;
 				goto out;
 			}
 		}
-		for (i = 0; i < n_entries; i++)
-			totals[j] += tracing_map_read_sum(
-					sort_entries[i]->elt, j);
+		for (i = 0; i < n_entries; i++) {
+			val = tracing_map_read_sum(sort_entries[i]->elt, j);
+			stats[j].total += val;
+			if (stats[j].max < val)
+				stats[j].max = val;
+		}
 	}
 
 	for (i = 0; i < n_entries; i++)
-		hist_trigger_entry_print(m, hist_data, totals,
+		hist_trigger_entry_print(m, hist_data, stats,
 					 sort_entries[i]->key,
 					 sort_entries[i]->elt);
 
-	kfree(totals);
+	kfree(stats);
 out:
 	tracing_map_destroy_sort_entries(sort_entries, n_entries);
 
-- 
2.35.1




* [for-next][PATCH 07/25] tracing: Add nohitcount option for suppressing display of raw hitcount
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (5 preceding siblings ...)
  2022-12-10 13:57 ` [for-next][PATCH 06/25] tracing: Add .graph suffix option to histogram value Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 08/25] tracing: docs: Update histogram doc for .percent/.graph and nohitcount Steven Rostedt
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi

From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>

Add a 'nohitcount' ('NOHC' for short) option for suppressing display of
the raw hitcount column in the histogram.
Note that when you specify this nohitcount option, you must specify at
least one value other than the raw 'hitcount'.

  # cd /sys/kernel/debug/tracing/
  # echo hist:keys=pid:vals=runtime.percent,runtime.graph:sort=pid:NOHC > \
        events/sched/sched_stat_runtime/trigger
  # sleep 10
  # cat events/sched/sched_stat_runtime/hist
 # event histogram
 #
 # trigger info: hist:keys=pid:vals=runtime.percent,runtime.graph:sort=pid:size=2048:nohitcount  [active]
 #

 { pid:          8 }  runtime (%):   3.02  runtime: #
 { pid:         14 }  runtime (%):   2.25  runtime:
 { pid:         16 }  runtime (%):   2.25  runtime:
 { pid:         26 }  runtime (%):   0.17  runtime:
 { pid:         61 }  runtime (%):  11.52  runtime: ####
 { pid:         67 }  runtime (%):   1.56  runtime:
 { pid:         68 }  runtime (%):   0.84  runtime:
 { pid:         76 }  runtime (%):   0.92  runtime:
 { pid:        117 }  runtime (%):   2.50  runtime: #
 { pid:        146 }  runtime (%):  49.88  runtime: ####################
 { pid:        157 }  runtime (%):  16.63  runtime: ######
 { pid:        158 }  runtime (%):   8.38  runtime: ###

Link: https://lore.kernel.org/linux-trace-kernel/166610814787.56030.4980636083486339906.stgit@devnote2

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
---
 kernel/trace/trace.c             |  3 +++
 kernel/trace/trace_events_hist.c | 34 ++++++++++++++++++++++++--------
 2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 55aec4616d8b..948f321b9df1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5678,6 +5678,7 @@ static const char readme_msg[] =
 	"\t            [:size=#entries]\n"
 	"\t            [:pause][:continue][:clear]\n"
 	"\t            [:name=histname1]\n"
+	"\t            [:nohitcount]\n"
 	"\t            [:<handler>.<action>]\n"
 	"\t            [if <filter>]\n\n"
 	"\t    Note, special fields can be used as well:\n"
@@ -5734,6 +5735,8 @@ static const char readme_msg[] =
 	"\t    The 'clear' parameter will clear the contents of a running\n"
 	"\t    hist trigger and leave its current paused/active state\n"
 	"\t    unchanged.\n\n"
+	"\t    The 'nohitcount' (or NOHC) parameter will suppress display of\n"
+	"\t    raw hitcount in the histogram.\n\n"
 	"\t    The enable_hist and disable_hist triggers can be used to\n"
 	"\t    have one event conditionally start and stop another event's\n"
 	"\t    already-attached hist trigger.  The syntax is analogous to\n"
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 946b2b8f0f2c..a0cd118af527 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -69,7 +69,8 @@
 	C(INVALID_STR_OPERAND,	"String type can not be an operand in expression"), \
 	C(EXPECT_NUMBER,	"Expecting numeric literal"),		\
 	C(UNARY_MINUS_SUBEXPR,	"Unary minus not supported in sub-expressions"), \
-	C(DIVISION_BY_ZERO,	"Division by zero"),
+	C(DIVISION_BY_ZERO,	"Division by zero"),			\
+	C(NEED_NOHC_VAL,	"Non-hitcount value is required for 'nohitcount'"),
 
 #undef C
 #define C(a, b)		HIST_ERR_##a
@@ -526,6 +527,7 @@ struct hist_trigger_attrs {
 	bool		cont;
 	bool		clear;
 	bool		ts_in_usecs;
+	bool		no_hitcount;
 	unsigned int	map_bits;
 
 	char		*assignment_str[TRACING_MAP_VARS_MAX];
@@ -1550,7 +1552,10 @@ parse_hist_trigger_attrs(struct trace_array *tr, char *trigger_str)
 			ret = parse_assignment(tr, str, attrs);
 			if (ret)
 				goto free;
-		} else if (strcmp(str, "pause") == 0)
+		} else if (strcmp(str, "nohitcount") == 0 ||
+			   strcmp(str, "NOHC") == 0)
+			attrs->no_hitcount = true;
+		else if (strcmp(str, "pause") == 0)
 			attrs->pause = true;
 		else if ((strcmp(str, "cont") == 0) ||
 			 (strcmp(str, "continue") == 0))
@@ -4377,6 +4382,12 @@ static int create_val_fields(struct hist_trigger_data *hist_data,
 	if (fields_str && (strcmp(fields_str, "hitcount") != 0))
 		ret = -EINVAL;
  out:
+	/* There is only raw hitcount but nohitcount suppresses it. */
+	if (j == 1 && hist_data->attrs->no_hitcount) {
+		hist_err(hist_data->event_file->tr, HIST_ERR_NEED_NOHC_VAL, 0);
+		ret = -ENOENT;
+	}
+
 	return ret;
 }
 
@@ -5388,13 +5399,13 @@ static void hist_trigger_entry_print(struct seq_file *m,
 
 	hist_trigger_print_key(m, hist_data, key, elt);
 
-	/* At first, show the raw hitcount always */
-	hist_trigger_print_val(m, i, "hitcount", 0, stats, elt);
+	/* At first, show the raw hitcount if !nohitcount */
+	if (!hist_data->attrs->no_hitcount)
+		hist_trigger_print_val(m, i, "hitcount", 0, stats, elt);
 
 	for (i = 1; i < hist_data->n_vals; i++) {
 		field_name = hist_field_name(hist_data->fields[i], 0);
 		flags = hist_data->fields[i]->flags;
-
 		if (flags & HIST_FIELD_FL_VAR || flags & HIST_FIELD_FL_EXPR)
 			continue;
 
@@ -5839,6 +5850,7 @@ static int event_hist_trigger_print(struct seq_file *m,
 	struct hist_trigger_data *hist_data = data->private_data;
 	struct hist_field *field;
 	bool have_var = false;
+	bool show_val = false;
 	unsigned int i;
 
 	seq_puts(m, HIST_PREFIX);
@@ -5869,12 +5881,16 @@ static int event_hist_trigger_print(struct seq_file *m,
 			continue;
 		}
 
-		if (i == HITCOUNT_IDX)
+		if (i == HITCOUNT_IDX) {
+			if (hist_data->attrs->no_hitcount)
+				continue;
 			seq_puts(m, "hitcount");
-		else {
-			seq_puts(m, ",");
+		} else {
+			if (show_val)
+				seq_puts(m, ",");
 			hist_field_print(m, field);
 		}
+		show_val = true;
 	}
 
 	if (have_var) {
@@ -5925,6 +5941,8 @@ static int event_hist_trigger_print(struct seq_file *m,
 	seq_printf(m, ":size=%u", (1 << hist_data->map->map_bits));
 	if (hist_data->enable_timestamps)
 		seq_printf(m, ":clock=%s", hist_data->attrs->clock);
+	if (hist_data->attrs->no_hitcount)
+		seq_puts(m, ":nohitcount");
 
 	print_actions_spec(m, hist_data);
 
-- 
2.35.1




* [for-next][PATCH 08/25] tracing: docs: Update histogram doc for .percent/.graph and nohitcount
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (6 preceding siblings ...)
  2022-12-10 13:57 ` [for-next][PATCH 07/25] tracing: Add nohitcount option for suppressing display of raw hitcount Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:57 ` [for-next][PATCH 09/25] trace/kprobe: remove duplicated calls of ring_buffer_event_data Steven Rostedt
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi

From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>

Update histogram document for .percent/.graph suffixes and 'nohitcount'
option.

Link: https://lore.kernel.org/linux-trace-kernel/166610815604.56030.4124933216911828519.stgit@devnote2

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
---
 Documentation/trace/histogram.rst | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst
index 87bd772836c0..f95459aa984f 100644
--- a/Documentation/trace/histogram.rst
+++ b/Documentation/trace/histogram.rst
@@ -25,7 +25,7 @@ Documentation written by Tom Zanussi
 
         hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>]
           [:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue]
-          [:clear][:name=histname1][:<handler>.<action>] [if <filter>]
+          [:clear][:name=histname1][:nohitcount][:<handler>.<action>] [if <filter>]
 
   When a matching event is hit, an entry is added to a hash table
   using the key(s) and value(s) named.  Keys and values correspond to
@@ -79,6 +79,8 @@ Documentation written by Tom Zanussi
 	.log2          display log2 value rather than raw number
 	.buckets=size  display grouping of values rather than raw number
 	.usecs         display a common_timestamp in microseconds
+        .percent       display a number of percentage value
+        .graph         display a bar-graph of a value
 	=============  =================================================
 
   Note that in general the semantics of a given field aren't
@@ -137,6 +139,12 @@ Documentation written by Tom Zanussi
   existing trigger, rather than via the '>' operator, which will cause
   the trigger to be removed through truncation.
 
+  The 'nohitcount' (or NOHC) parameter will suppress display of
+  raw hitcount in the histogram. This option requires at least one
+  value field which is not a 'raw hitcount'. For example,
+  'hist:...:vals=hitcount:nohitcount' is rejected, but
+  'hist:...:vals=hitcount.percent:nohitcount' is OK.
+
 - enable_hist/disable_hist
 
   The enable_hist and disable_hist triggers can be used to have one
-- 
2.35.1




* [for-next][PATCH 09/25] trace/kprobe: remove duplicated calls of ring_buffer_event_data
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (7 preceding siblings ...)
  2022-12-10 13:57 ` [for-next][PATCH 08/25] tracing: docs: Update histogram doc for .percent/.graph and nohitcount Steven Rostedt
@ 2022-12-10 13:57 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 10/25] tracing/probes: Handle system names with hyphens Steven Rostedt
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Song Chen

From: Song Chen <chensong_2000@189.cn>

__kprobe_trace_func() calls ring_buffer_event_data() to get the ring
buffer event's data, but that has already been done by the preceding
call to trace_event_buffer_reserve(). The same applies to
__kretprobe_trace_func().

This patch removes those duplicated calls.
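
For context, trace_event_buffer_reserve() already derives the entry pointer
from the reserved event before returning it. An abbreviated sketch of its tail
(paraphrased from kernel/trace/trace_events.c, not a verbatim copy):

  void *trace_event_buffer_reserve(struct trace_event_buffer *fbuffer,
                                   struct trace_event_file *trace_file,
                                   unsigned long len)
  {
          /* ... recursion protection and the actual ring buffer reservation ... */
          if (!fbuffer->event)
                  return NULL;

          /*
           * The payload pointer is computed here, so callers get it back both
           * as the return value and in fbuffer->entry, which is what makes the
           * extra ring_buffer_event_data() calls removed below redundant.
           */
          fbuffer->entry = ring_buffer_event_data(fbuffer->event);
          return fbuffer->entry;
  }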

Link: https://lore.kernel.org/all/1666145478-4706-1-git-send-email-chensong_2000@189.cn/

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Song Chen <chensong_2000@189.cn>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/trace_kprobe.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5a75b039e586..ee77c8203bd5 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1344,7 +1344,6 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
 		return;
 
 	fbuffer.regs = regs;
-	entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
 	entry->ip = (unsigned long)tk->rp.kp.addr;
 	store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize);
 
@@ -1385,7 +1384,6 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
 		return;
 
 	fbuffer.regs = regs;
-	entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
 	entry->func = (unsigned long)tk->rp.kp.addr;
 	entry->ret_ip = get_kretprobe_retaddr(ri);
 	store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize);
-- 
2.35.1




* [for-next][PATCH 10/25] tracing/probes: Handle system names with hyphens
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (8 preceding siblings ...)
  2022-12-10 13:57 ` [for-next][PATCH 09/25] trace/kprobe: remove duplicated calls of ring_buffer_event_data Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 11/25] tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE Steven Rostedt
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, stable, Rafael Mendonca

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

When creating probe names, a check is done to make sure they match basic
C variable naming standards: the name must start with an alphabetic
character or an underscore, and the remaining characters must be
alphanumeric or underscores.

But system names do not have any true naming conventions, as they are
created by the TRACE_SYSTEM macro and nothing tests to see what they are.
The "xhci-hcd" trace events has a '-' in the system name. When trying to
attach a eprobe to one of these trace points, it fails because the system
name does not follow the variable naming convention because of the
hyphen, and the eprobe checks fail on this.

Allow hyphens in the system name so that eprobes can attach to the
"xhci-hcd" trace events.

Link: https://lore.kernel.org/all/Y3eJ8GiGnEvVd8%2FN@macondo/
Link: https://lore.kernel.org/linux-trace-kernel/20221122122345.160f5077@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 5b7a96220900e ("tracing/probe: Check event/group naming rule at parsing")
Reported-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace.h       | 19 ++++++++++++++++---
 kernel/trace/trace_probe.c |  2 +-
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 48643f07bc01..8f37ff032b4f 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1954,17 +1954,30 @@ static __always_inline void trace_iterator_reset(struct trace_iterator *iter)
 }
 
 /* Check the name is good for event/group/fields */
-static inline bool is_good_name(const char *name)
+static inline bool __is_good_name(const char *name, bool hash_ok)
 {
-	if (!isalpha(*name) && *name != '_')
+	if (!isalpha(*name) && *name != '_' && (!hash_ok || *name != '-'))
 		return false;
 	while (*++name != '\0') {
-		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
+		if (!isalpha(*name) && !isdigit(*name) && *name != '_' &&
+		    (!hash_ok || *name != '-'))
 			return false;
 	}
 	return true;
 }
 
+/* Check the name is good for event/group/fields */
+static inline bool is_good_name(const char *name)
+{
+	return __is_good_name(name, false);
+}
+
+/* Check the name is good for system */
+static inline bool is_good_system_name(const char *name)
+{
+	return __is_good_name(name, true);
+}
+
 /* Convert certain expected symbols into '_' when generating event names */
 static inline void sanitize_event_name(char *name)
 {
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 36dff277de46..bb2f95d7175c 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -246,7 +246,7 @@ int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
 			return -EINVAL;
 		}
 		strlcpy(buf, event, slash - event + 1);
-		if (!is_good_name(buf)) {
+		if (!is_good_system_name(buf)) {
 			trace_probe_log_err(offset, BAD_GROUP_NAME);
 			return -EINVAL;
 		}
-- 
2.35.1




* [for-next][PATCH 11/25] tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (9 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 10/25] tracing/probes: Handle system names with hyphens Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 12/25] x86/mm/kmmio: Switch to arch_spin_lock() Steven Rostedt
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Daniel Bristot de Oliveira,
	stable, David Howells, kernel test robot

From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>

Both CONFIG_OSNOISE_TRACER and CONFIG_HWLAT_TRACER partially enable the
CONFIG_TRACER_MAX_TRACE code, but that is complicated and has
introduced a bug: the tracing_max_lat_fops data structure is declared
outside of the #ifdefs, but since it is only defined when
CONFIG_TRACER_MAX_TRACE=y or CONFIG_HWLAT_TRACER=y, building with only
CONFIG_OSNOISE_TRACER=y turns that declaration into a definition(!).
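
The underlying C gotcha is that a file-scope declaration without an initializer
is a tentative definition, so compiling out the "real" definition silently
leaves a zero-filled object instead of producing a build error. A minimal
userspace illustration (hypothetical config macro, not the kernel code):

  #include <stdio.h>

  struct file_operations { int (*open)(void); };

  /* Meant as a forward declaration, but it is really a tentative definition. */
  static const struct file_operations tracing_max_lat_fops;

  #ifdef CONFIG_TRACER_MAX_TRACE          /* not defined in this build */
  static int max_lat_open(void) { return 0; }
  static const struct file_operations tracing_max_lat_fops = {
          .open = max_lat_open,
  };
  #endif

  int main(void)
  {
          /* With the #ifdef compiled out, .open is silently NULL. */
          printf("open = %p\n", (void *)tracing_max_lat_fops.open);
          return 0;
  }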

To fix this issue, and to avoid repeating similar problems, make
CONFIG_OSNOISE_TRACER and CONFIG_HWLAT_TRACER always select
CONFIG_TRACER_MAX_TRACE. This has three benefits:
- Fix the tracing_max_lat_fops bug
- Simplify the #ifdefs
- CONFIG_TRACER_MAX_TRACE code is either fully enabled or not at all.

Link: https://lore.kernel.org/linux-trace-kernel/167033628155.4111793.12185405690820208159.stgit@devnote3

Fixes: 424b650f35c7 ("tracing: Fix missing osnoise tracer on max_latency")
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: stable@vger.kernel.org
Reported-by: David Howells <dhowells@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/all/166992525941.1716618.13740663757583361463.stgit@warthog.procyon.org.uk/ (original thread and v1)
Link: https://lore.kernel.org/all/202212052253.VuhZ2ulJ-lkp@intel.com/T/#u (v1 error report)
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/Kconfig |  2 ++
 kernel/trace/trace.c | 23 +++++++++++++----------
 kernel/trace/trace.h |  8 +++-----
 3 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index e9e95c790b8e..93d724996283 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -375,6 +375,7 @@ config SCHED_TRACER
 config HWLAT_TRACER
 	bool "Tracer to detect hardware latencies (like SMIs)"
 	select GENERIC_TRACER
+	select TRACER_MAX_TRACE
 	help
 	 This tracer, when enabled will create one or more kernel threads,
 	 depending on what the cpumask file is set to, which each thread
@@ -410,6 +411,7 @@ config HWLAT_TRACER
 config OSNOISE_TRACER
 	bool "OS Noise tracer"
 	select GENERIC_TRACER
+	select TRACER_MAX_TRACE
 	help
 	  In the context of high-performance computing (HPC), the Operating
 	  System Noise (osnoise) refers to the interference experienced by an
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 948f321b9df1..664619b3f1e1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1421,6 +1421,7 @@ int tracing_snapshot_cond_disable(struct trace_array *tr)
 	return false;
 }
 EXPORT_SYMBOL_GPL(tracing_snapshot_cond_disable);
+#define free_snapshot(tr)	do { } while (0)
 #endif /* CONFIG_TRACER_SNAPSHOT */
 
 void tracer_tracing_off(struct trace_array *tr)
@@ -1692,6 +1693,8 @@ static ssize_t trace_seq_to_buffer(struct trace_seq *s, void *buf, size_t cnt)
 }
 
 unsigned long __read_mostly	tracing_thresh;
+
+#ifdef CONFIG_TRACER_MAX_TRACE
 static const struct file_operations tracing_max_lat_fops;
 
 #ifdef LATENCY_FS_NOTIFY
@@ -1748,18 +1751,14 @@ void latency_fsnotify(struct trace_array *tr)
 	irq_work_queue(&tr->fsnotify_irqwork);
 }
 
-#elif defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)	\
-	|| defined(CONFIG_OSNOISE_TRACER)
+#else /* !LATENCY_FS_NOTIFY */
 
 #define trace_create_maxlat_file(tr, d_tracer)				\
 	trace_create_file("tracing_max_latency", TRACE_MODE_WRITE,	\
 			  d_tracer, &tr->max_latency, &tracing_max_lat_fops)
 
-#else
-#define trace_create_maxlat_file(tr, d_tracer)	 do { } while (0)
 #endif
 
-#ifdef CONFIG_TRACER_MAX_TRACE
 /*
  * Copy the new maximum trace into the separate maximum-trace
  * structure. (this way the maximum trace is permanently saved,
@@ -1834,14 +1833,15 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
 		ring_buffer_record_off(tr->max_buffer.buffer);
 
 #ifdef CONFIG_TRACER_SNAPSHOT
-	if (tr->cond_snapshot && !tr->cond_snapshot->update(tr, cond_data))
-		goto out_unlock;
+	if (tr->cond_snapshot && !tr->cond_snapshot->update(tr, cond_data)) {
+		arch_spin_unlock(&tr->max_lock);
+		return;
+	}
 #endif
 	swap(tr->array_buffer.buffer, tr->max_buffer.buffer);
 
 	__update_max_tr(tr, tsk, cpu);
 
- out_unlock:
 	arch_spin_unlock(&tr->max_lock);
 }
 
@@ -1888,6 +1888,7 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
 	__update_max_tr(tr, tsk, cpu);
 	arch_spin_unlock(&tr->max_lock);
 }
+
 #endif /* CONFIG_TRACER_MAX_TRACE */
 
 static int wait_on_pipe(struct trace_iterator *iter, int full)
@@ -6577,7 +6578,7 @@ tracing_thresh_write(struct file *filp, const char __user *ubuf,
 	return ret;
 }
 
-#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)
+#ifdef CONFIG_TRACER_MAX_TRACE
 
 static ssize_t
 tracing_max_lat_read(struct file *filp, char __user *ubuf,
@@ -7592,7 +7593,7 @@ static const struct file_operations tracing_thresh_fops = {
 	.llseek		= generic_file_llseek,
 };
 
-#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)
+#ifdef CONFIG_TRACER_MAX_TRACE
 static const struct file_operations tracing_max_lat_fops = {
 	.open		= tracing_open_generic,
 	.read		= tracing_max_lat_read,
@@ -9606,7 +9607,9 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
 
 	create_trace_options_dir(tr);
 
+#ifdef CONFIG_TRACER_MAX_TRACE
 	trace_create_maxlat_file(tr, d_tracer);
+#endif
 
 	if (ftrace_create_function_files(tr, d_tracer))
 		MEM_FAIL(1, "Could not allocate function filter files");
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 8f37ff032b4f..9dc920b01c17 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -308,8 +308,7 @@ struct trace_array {
 	struct array_buffer	max_buffer;
 	bool			allocated_snapshot;
 #endif
-#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER) \
-	|| defined(CONFIG_OSNOISE_TRACER)
+#ifdef CONFIG_TRACER_MAX_TRACE
 	unsigned long		max_latency;
 #ifdef CONFIG_FSNOTIFY
 	struct dentry		*d_max_latency;
@@ -688,12 +687,11 @@ void update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
 		   void *cond_data);
 void update_max_tr_single(struct trace_array *tr,
 			  struct task_struct *tsk, int cpu);
-#endif /* CONFIG_TRACER_MAX_TRACE */
 
-#if (defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER) \
-	|| defined(CONFIG_OSNOISE_TRACER)) && defined(CONFIG_FSNOTIFY)
+#ifdef CONFIG_FSNOTIFY
 #define LATENCY_FS_NOTIFY
 #endif
+#endif /* CONFIG_TRACER_MAX_TRACE */
 
 #ifdef LATENCY_FS_NOTIFY
 void latency_fsnotify(struct trace_array *tr);
-- 
2.35.1




* [for-next][PATCH 12/25] x86/mm/kmmio: Switch to arch_spin_lock()
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (10 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 11/25] tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace() Steven Rostedt
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Karol Herbst, Pekka Paalanen,
	Dave Hansen, Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
	Borislav Petkov, Thomas Gleixner

From: Steven Rostedt <rostedt@goodmis.org>

The mmiotrace tracer is "special". Its purpose is to help reverse engineer
binary drivers: the memory allocated by the driver is removed so that when
the driver goes to access it, a fault occurs. The mmiotracer then records
what the driver was doing and does the work on its behalf by single
stepping through the process.

But to achieve this ability, it must do some special things. One is that
it needs to grab a lock while in the breakpoint handler. That handler is
considered an NMI state, so lockdep warns that the lock is being held in
both NMI state (really the breakpoint handler) and normal context.

As the breakpoint/NMI state only happens when the driver is accessing
memory, there's no concern of a race condition against the setup and
tear-down of mmiotracer.

To make lockdep and mmiotrace work together, convert the locks used in the
breakpoint handler into arch_spin_lock().
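
The conversion is mechanical: a raw arch_spinlock_t is invisible to lockdep,
so the interrupt disabling that spin_lock_irqsave() used to provide now has to
be done explicitly around it. A sketch of the resulting pattern (kernel-only
code, abbreviated from the diff below):

  static arch_spinlock_t kmmio_lock = __ARCH_SPIN_LOCK_UNLOCKED;

  static void kmmio_locked_section(void)
  {
          unsigned long flags;

          local_irq_save(flags);         /* was: spin_lock_irqsave(&kmmio_lock, flags) */
          arch_spin_lock(&kmmio_lock);

          /* ... walk or modify the kmmio fault page lists ... */

          arch_spin_unlock(&kmmio_lock); /* was: spin_unlock_irqrestore(&kmmio_lock, flags) */
          local_irq_restore(flags);
  }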

Link: https://lkml.kernel.org/r/20221206191229.656244029@goodmis.org
Link: https://lore.kernel.org/lkml/20221201213126.620b7dd3@gandalf.local.home/
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Karol Herbst <karolherbst@gmail.com>
Cc: Pekka Paalanen <ppaalanen@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 arch/x86/mm/kmmio.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index d3efbc5b3449..edb486450158 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -62,7 +62,13 @@ struct kmmio_context {
 	int active;
 };
 
-static DEFINE_SPINLOCK(kmmio_lock);
+/*
+ * The kmmio_lock is taken in int3 context, which is treated as NMI context.
+ * This causes lockdep to complain about it bein in both NMI and normal
+ * context. Hide it from lockdep, as it should not have any other locks
+ * taken under it, and this is only enabled for debugging mmio anyway.
+ */
+static arch_spinlock_t kmmio_lock = __ARCH_SPIN_LOCK_UNLOCKED;
 
 /* Protected by kmmio_lock */
 unsigned int kmmio_count;
@@ -346,10 +352,10 @@ static int post_kmmio_handler(unsigned long condition, struct pt_regs *regs)
 		ctx->probe->post_handler(ctx->probe, condition, regs);
 
 	/* Prevent racing against release_kmmio_fault_page(). */
-	spin_lock(&kmmio_lock);
+	arch_spin_lock(&kmmio_lock);
 	if (ctx->fpage->count)
 		arm_kmmio_fault_page(ctx->fpage);
-	spin_unlock(&kmmio_lock);
+	arch_spin_unlock(&kmmio_lock);
 
 	regs->flags &= ~X86_EFLAGS_TF;
 	regs->flags |= ctx->saved_flags;
@@ -440,7 +446,8 @@ int register_kmmio_probe(struct kmmio_probe *p)
 	unsigned int l;
 	pte_t *pte;
 
-	spin_lock_irqsave(&kmmio_lock, flags);
+	local_irq_save(flags);
+	arch_spin_lock(&kmmio_lock);
 	if (get_kmmio_probe(addr)) {
 		ret = -EEXIST;
 		goto out;
@@ -460,7 +467,9 @@ int register_kmmio_probe(struct kmmio_probe *p)
 		size += page_level_size(l);
 	}
 out:
-	spin_unlock_irqrestore(&kmmio_lock, flags);
+	arch_spin_unlock(&kmmio_lock);
+	local_irq_restore(flags);
+
 	/*
 	 * XXX: What should I do here?
 	 * Here was a call to global_flush_tlb(), but it does not exist
@@ -494,7 +503,8 @@ static void remove_kmmio_fault_pages(struct rcu_head *head)
 	struct kmmio_fault_page **prevp = &dr->release_list;
 	unsigned long flags;
 
-	spin_lock_irqsave(&kmmio_lock, flags);
+	local_irq_save(flags);
+	arch_spin_lock(&kmmio_lock);
 	while (f) {
 		if (!f->count) {
 			list_del_rcu(&f->list);
@@ -506,7 +516,8 @@ static void remove_kmmio_fault_pages(struct rcu_head *head)
 		}
 		f = *prevp;
 	}
-	spin_unlock_irqrestore(&kmmio_lock, flags);
+	arch_spin_unlock(&kmmio_lock);
+	local_irq_restore(flags);
 
 	/* This is the real RCU destroy call. */
 	call_rcu(&dr->rcu, rcu_free_kmmio_fault_pages);
@@ -540,14 +551,16 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
 	if (!pte)
 		return;
 
-	spin_lock_irqsave(&kmmio_lock, flags);
+	local_irq_save(flags);
+	arch_spin_lock(&kmmio_lock);
 	while (size < size_lim) {
 		release_kmmio_fault_page(addr + size, &release_list);
 		size += page_level_size(l);
 	}
 	list_del_rcu(&p->list);
 	kmmio_count--;
-	spin_unlock_irqrestore(&kmmio_lock, flags);
+	arch_spin_unlock(&kmmio_lock);
+	local_irq_restore(flags);
 
 	if (!release_list)
 		return;
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (11 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 12/25] x86/mm/kmmio: Switch to arch_spin_lock() Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 17:47   ` Paul E. McKenney
  2022-12-10 13:58 ` [for-next][PATCH 14/25] tracing/hist: Fix wrong return value in parse_action_params() Steven Rostedt
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Karol Herbst, Pekka Paalanen,
	Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Paul E. McKenney

From: Steven Rostedt <rostedt@goodmis.org>

The mmiotrace tracer is "special". The purpose is to help reverse engineer
binary drivers by removing the memory allocated by the driver and when the
driver goes to access it, a fault occurs, the mmiotracer will record what
the driver was doing and then do the work on its behalf by single stepping
through the process.

But to achieve this ability, it must do some special things. One is to
take the rcu_read_lock() when the fault occurs, and then release it in the
breakpoint that is single stepping. This makes lockdep unhappy, as it
changes the state of RCU from within an exception that is not contained in
that exception, and we get a nasty splat from lockdep.

Instead, switch to rcu_read_lock_sched_notrace() as the RCU sched variant
has the same grace period as normal RCU. This is basically the same as
rcu_read_lock() but does not make lockdep complain about it.

Note, the preempt_disable() is still needed as it uses preempt_enable_no_resched().
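
A rough sketch of the resulting pairing across the two exceptions (the
function names are placeholders; the real call sites are in the hunks
below):

  #include <linux/preempt.h>
  #include <linux/rcupdate.h>

  static void example_fault_side(void)
  {
          preempt_disable();
          /* sched-RCU read side that does not trigger the lockdep splat */
          rcu_read_lock_sched_notrace();
          /* ... arm single stepping and return from the fault ... */
  }

  static void example_breakpoint_side(void)
  {
          /* ... single stepping finished ... */
          rcu_read_unlock_sched_notrace();
          /* pairs with the preempt_disable() done on the fault side */
          preempt_enable_no_resched();
  }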

Link: https://lore.kernel.org/linux-trace-kernel/20221209134144.04f33626@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Karol Herbst <karolherbst@gmail.com>
Cc: Pekka Paalanen <ppaalanen@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 arch/x86/mm/kmmio.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index edb486450158..853c49877c16 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -254,7 +254,7 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr)
 	 * again.
 	 */
 	preempt_disable();
-	rcu_read_lock();
+	rcu_read_lock_sched_notrace();
 
 	faultpage = get_kmmio_fault_page(page_base);
 	if (!faultpage) {
@@ -323,7 +323,7 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr)
 	return 1; /* fault handled */
 
 no_kmmio:
-	rcu_read_unlock();
+	rcu_read_unlock_sched_notrace();
 	preempt_enable_no_resched();
 	return ret;
 }
@@ -363,7 +363,7 @@ static int post_kmmio_handler(unsigned long condition, struct pt_regs *regs)
 	/* These were acquired in kmmio_handler(). */
 	ctx->active--;
 	BUG_ON(ctx->active);
-	rcu_read_unlock();
+	rcu_read_unlock_sched_notrace();
 	preempt_enable_no_resched();
 
 	/*
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 14/25] tracing/hist: Fix wrong return value in parse_action_params()
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (12 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace() Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 15/25] tracing/hist: Fix out-of-bound write on action_data.var_ref_idx Steven Rostedt
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, zanussi, stable, Zheng Yejian

From: Zheng Yejian <zhengyejian1@huawei.com>

When the number of synth fields is more than SYNTH_FIELDS_MAX,
parse_action_params() should return -EINVAL.

Link: https://lore.kernel.org/linux-trace-kernel/20221207034635.2253990-1-zhengyejian1@huawei.com

Cc: <mhiramat@kernel.org>
Cc: <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: c282a386a397 ("tracing: Add 'onmatch' hist trigger action support")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_hist.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index a0cd118af527..b4ad86c22b43 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -3609,6 +3609,7 @@ static int parse_action_params(struct trace_array *tr, char *params,
 	while (params) {
 		if (data->n_params >= SYNTH_FIELDS_MAX) {
 			hist_err(tr, HIST_ERR_TOO_MANY_PARAMS, 0);
+			ret = -EINVAL;
 			goto out;
 		}
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 15/25] tracing/hist: Fix out-of-bound write on action_data.var_ref_idx
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (13 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 14/25] tracing/hist: Fix wrong return value in parse_action_params() Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 16/25] tracing: Fix issue of missing one synthetic field Steven Rostedt
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, zanussi, stable, Zheng Yejian

From: Zheng Yejian <zhengyejian1@huawei.com>

When generating a synthetic event with many params and then creating a
trace action for it [1], a kernel panic happened [2].

This is because in trace_action_create() 'data->n_params' can be up to
SYNTH_FIELDS_MAX (currently 64), and the array 'data->var_ref_idx' keeps
indices into the array 'hist_data->var_refs' for each synthetic event
param, but the length of 'data->var_ref_idx' is TRACING_MAP_VARS_MAX
(currently 16), so an out-of-bound write happens when 'data->n_params' is
more than 16. In this case, 'data->match_data.event' is overwritten and
eventually causes the panic.

To solve the issue, adjust the length of 'data->var_ref_idx' to be
SYNTH_FIELDS_MAX and add sanity checks to avoid out-of-bound write.
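
As a rough illustration of the overflow (array lengths as described
above; the struct and function names here are made up):

  #define TRACING_MAP_VARS_MAX    16
  #define SYNTH_FIELDS_MAX        64

  struct example_action_data {
          unsigned int    n_params;                          /* can reach 64 */
          unsigned int    var_ref_idx[TRACING_MAP_VARS_MAX]; /* only 16 slots */
          void            *match_data_event;                 /* gets clobbered */
  };

  static void example_fill(struct example_action_data *data)
  {
          unsigned int i;

          /* writes past the array end once i reaches TRACING_MAP_VARS_MAX */
          for (i = 0; i < data->n_params; i++)
                  data->var_ref_idx[i] = i;
  }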

[1]
 # cd /sys/kernel/tracing/
 # echo "my_synth_event int v1; int v2; int v3; int v4; int v5; int v6;\
int v7; int v8; int v9; int v10; int v11; int v12; int v13; int v14;\
int v15; int v16; int v17; int v18; int v19; int v20; int v21; int v22;\
int v23; int v24; int v25; int v26; int v27; int v28; int v29; int v30;\
int v31; int v32; int v33; int v34; int v35; int v36; int v37; int v38;\
int v39; int v40; int v41; int v42; int v43; int v44; int v45; int v46;\
int v47; int v48; int v49; int v50; int v51; int v52; int v53; int v54;\
int v55; int v56; int v57; int v58; int v59; int v60; int v61; int v62;\
int v63" >> synthetic_events
 # echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="bash"' >> \
events/sched/sched_waking/trigger
 # echo "hist:keys=next_pid:onmatch(sched.sched_waking).my_synth_event(\
pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,\
pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,\
pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,pid,\
pid,pid,pid,pid,pid,pid,pid,pid,pid)" >> events/sched/sched_switch/trigger

[2]
BUG: unable to handle page fault for address: ffff91c900000000
PGD 61001067 P4D 61001067 PUD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 2 PID: 322 Comm: bash Tainted: G        W          6.1.0-rc8+ #229
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:strcmp+0xc/0x30
Code: 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee
c3 cc cc cc cc 0f 1f 00 31 c0 eb 08 48 83 c0 01 84 d2 74 13 <0f> b6 14
07 3a 14 06 74 ef 19 c0 83 c8 01 c3 cc cc cc cc 31 c3
RSP: 0018:ffff9b3b00f53c48 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffffffba958a68 RCX: 0000000000000000
RDX: 0000000000000010 RSI: ffff91c943d33a90 RDI: ffff91c900000000
RBP: ffff91c900000000 R08: 00000018d604b529 R09: 0000000000000000
R10: ffff91c9483eddb1 R11: ffff91ca483eddab R12: ffff91c946171580
R13: ffff91c9479f0538 R14: ffff91c9457c2848 R15: ffff91c9479f0538
FS:  00007f1d1cfbe740(0000) GS:ffff91c9bdc80000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff91c900000000 CR3: 0000000006316000 CR4: 00000000000006e0
Call Trace:
 <TASK>
 __find_event_file+0x55/0x90
 action_create+0x76c/0x1060
 event_hist_trigger_parse+0x146d/0x2060
 ? event_trigger_write+0x31/0xd0
 trigger_process_regex+0xbb/0x110
 event_trigger_write+0x6b/0xd0
 vfs_write+0xc8/0x3e0
 ? alloc_fd+0xc0/0x160
 ? preempt_count_add+0x4d/0xa0
 ? preempt_count_add+0x70/0xa0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f1d1d0cf077
Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e
fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74
RSP: 002b:00007ffcebb0e568 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000143 RCX: 00007f1d1d0cf077
RDX: 0000000000000143 RSI: 00005639265aa7e0 RDI: 0000000000000001
RBP: 00005639265aa7e0 R08: 000000000000000a R09: 0000000000000142
R10: 000056392639c017 R11: 0000000000000246 R12: 0000000000000143
R13: 00007f1d1d1ae6a0 R14: 00007f1d1d1aa4a0 R15: 00007f1d1d1a98a0
 </TASK>
Modules linked in:
CR2: ffff91c900000000
---[ end trace 0000000000000000 ]---
RIP: 0010:strcmp+0xc/0x30
Code: 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee
c3 cc cc cc cc 0f 1f 00 31 c0 eb 08 48 83 c0 01 84 d2 74 13 <0f> b6 14
07 3a 14 06 74 ef 19 c0 83 c8 01 c3 cc cc cc cc 31 c3
RSP: 0018:ffff9b3b00f53c48 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffffffba958a68 RCX: 0000000000000000
RDX: 0000000000000010 RSI: ffff91c943d33a90 RDI: ffff91c900000000
RBP: ffff91c900000000 R08: 00000018d604b529 R09: 0000000000000000
R10: ffff91c9483eddb1 R11: ffff91ca483eddab R12: ffff91c946171580
R13: ffff91c9479f0538 R14: ffff91c9457c2848 R15: ffff91c9479f0538
FS:  00007f1d1cfbe740(0000) GS:ffff91c9bdc80000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff91c900000000 CR3: 0000000006316000 CR4: 00000000000006e0

Link: https://lore.kernel.org/linux-trace-kernel/20221207035143.2278781-1-zhengyejian1@huawei.com

Cc: <mhiramat@kernel.org>
Cc: <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: d380dcde9a07 ("tracing: Fix now invalid var_ref_vals assumption in trace action")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_hist.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index b4ad86c22b43..8264b28d5a57 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -621,7 +621,7 @@ struct action_data {
 	 * event param, and is passed to the synthetic event
 	 * invocation.
 	 */
-	unsigned int		var_ref_idx[TRACING_MAP_VARS_MAX];
+	unsigned int		var_ref_idx[SYNTH_FIELDS_MAX];
 	struct synth_event	*synth_event;
 	bool			use_trace_keyword;
 	char			*synth_event_name;
@@ -2186,7 +2186,9 @@ static struct hist_field *create_var_ref(struct hist_trigger_data *hist_data,
 			return ref_field;
 		}
 	}
-
+	/* Sanity check to avoid out-of-bound write on 'hist_data->var_refs' */
+	if (hist_data->n_var_refs >= TRACING_MAP_VARS_MAX)
+		return NULL;
 	ref_field = create_hist_field(var_field->hist_data, NULL, flags, NULL);
 	if (ref_field) {
 		if (init_var_ref(ref_field, var_field, system, event_name)) {
@@ -3946,6 +3948,10 @@ static int trace_action_create(struct hist_trigger_data *hist_data,
 
 	lockdep_assert_held(&event_mutex);
 
+	/* Sanity check to avoid out-of-bound write on 'data->var_ref_idx' */
+	if (data->n_params > SYNTH_FIELDS_MAX)
+		return -EINVAL;
+
 	if (data->use_trace_keyword)
 		synth_event_name = data->synth_event_name;
 	else
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 16/25] tracing: Fix issue of missing one synthetic field
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (14 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 15/25] tracing/hist: Fix out-of-bound write on action_data.var_ref_idx Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 17/25] tracing/hist: Fix issue of losting command info in error_log Steven Rostedt
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, zanussi, stable, Zheng Yejian

From: Zheng Yejian <zhengyejian1@huawei.com>

The maximum number of synthetic fields supported is defined as
SYNTH_FIELDS_MAX, whose value is currently 64, but generating a synthetic
event with 64 fields actually fails when executing something like:

  # echo "my_synth_event int v1; int v2; int v3; int v4; int v5; int v6;\
   int v7; int v8; int v9; int v10; int v11; int v12; int v13; int v14;\
   int v15; int v16; int v17; int v18; int v19; int v20; int v21; int v22;\
   int v23; int v24; int v25; int v26; int v27; int v28; int v29; int v30;\
   int v31; int v32; int v33; int v34; int v35; int v36; int v37; int v38;\
   int v39; int v40; int v41; int v42; int v43; int v44; int v45; int v46;\
   int v47; int v48; int v49; int v50; int v51; int v52; int v53; int v54;\
   int v55; int v56; int v57; int v58; int v59; int v60; int v61; int v62;\
   int v63; int v64" >> /sys/kernel/tracing/synthetic_events

Correct the field counting to fix it: check the limit before storing the
field, so that a valid 64th field no longer trips the SYNTH_FIELDS_MAX
check.

Link: https://lore.kernel.org/linux-trace-kernel/20221207091557.3137904-1-zhengyejian1@huawei.com

Cc: <mhiramat@kernel.org>
Cc: <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: c9e759b1e845 ("tracing: Rework synthetic event command parsing")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_synth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index c3b582d19b62..67592eed0be8 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -1282,12 +1282,12 @@ static int __create_synth_event(const char *name, const char *raw_fields)
 				goto err_free_arg;
 			}
 
-			fields[n_fields++] = field;
 			if (n_fields == SYNTH_FIELDS_MAX) {
 				synth_err(SYNTH_ERR_TOO_MANY_FIELDS, 0);
 				ret = -EINVAL;
 				goto err_free_arg;
 			}
+			fields[n_fields++] = field;
 
 			n_fields_this_loop++;
 		}
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 17/25] tracing/hist: Fix issue of losting command info in error_log
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (15 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 16/25] tracing: Fix issue of missing one synthetic field Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 18/25] ring-buffer: Handle resize in early boot up Steven Rostedt
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, zanussi, Zheng Yejian

From: Zheng Yejian <zhengyejian1@huawei.com>

When some crafted invalid 'trigger' command is input, the command info
in 'error_log' is lost [1].

The root cause is that there is a path where event_hist_trigger_parse()
is recursively called once and 'last_cmd', which saves the original
command, is cleared; later calls of hist_err() then no longer record the
original command info:

  event_hist_trigger_parse() {
    last_cmd_set()  // <1> 'last_cmd' save origin command here at first
    create_actions() {
      onmatch_create() {
        action_create() {
          trace_action_create() {
            trace_action_create_field_var() {
              create_field_var_hist() {
                event_hist_trigger_parse() {  // <2> recursely called once
                  hist_err_clear()  // <3> 'last_cmd' is cleared here
                }
                hist_err()  // <4> No longer find origin command!!!

Since 'glob' is an empty string in the recursive call, we can check for
that and bypass the call of hist_err_clear() to solve it.

[1]
 # cd /sys/kernel/tracing
 # echo "my_synth_event int v1; int v2; int v3;" >> synthetic_events
 # echo 'hist:keys=pid' >> events/sched/sched_waking/trigger
 # echo "hist:keys=next_pid:onmatch(sched.sched_waking).my_synth_event(\
pid,pid1)" >> events/sched/sched_switch/trigger
 # cat error_log
[  8.405018] hist:sched:sched_switch: error: Couldn't find synthetic event
  Command:
hist:keys=next_pid:onmatch(sched.sched_waking).my_synth_event(pid,pid1)
                                                          ^
[  8.816902] hist:sched:sched_switch: error: Couldn't find field
  Command:
hist:keys=next_pid:onmatch(sched.sched_waking).my_synth_event(pid,pid1)
                          ^
[  8.816902] hist:sched:sched_switch: error: Couldn't parse field variable
  Command:
hist:keys=next_pid:onmatch(sched.sched_waking).my_synth_event(pid,pid1)
                          ^
[  8.999880] : error: Couldn't find field
  Command:
           ^
[  8.999880] : error: Couldn't parse field variable
  Command:
           ^
[  8.999880] : error: Couldn't find field
  Command:
           ^
[  8.999880] : error: Couldn't create histogram for field
  Command:
           ^

Link: https://lore.kernel.org/linux-trace-kernel/20221207135326.3483216-1-zhengyejian1@huawei.com

Cc: <mhiramat@kernel.org>
Cc: <zanussi@kernel.org>
Fixes: f404da6e1d46 ("tracing: Add 'last error' error facility for hist triggers")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_hist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 8264b28d5a57..fcaf226b7744 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -6576,7 +6576,7 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
 	if (se)
 		se->ref++;
  out:
-	if (ret == 0)
+	if (ret == 0 && glob[0])
 		hist_err_clear();
 
 	return ret;
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 18/25] ring-buffer: Handle resize in early boot up
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (16 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 17/25] tracing/hist: Fix issue of losting command info in error_log Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 19/25] tracing: remove unnecessary trace_trigger ifdef Steven Rostedt
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, Ross Zwisler

From: Steven Rostedt <rostedt@goodmis.org>

With the new command line option that allows trace event triggers to be
added at boot, the "snapshot" trigger will allocate the snapshot buffer
very early, when interrupts can not be enabled. Allocating the ring buffer
is not the problem; resizing it is, as the resize code does synchronization
that can not be performed at early boot.

To handle this, first change the raw_spin_lock_irq() in rb_insert_pages()
to raw_spin_lock_irqsave(), such that the unlocking of that spin lock will
not enable interrupts.
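
A tiny sketch of that first step (the lock name is a placeholder for
cpu_buffer->reader_lock): the irqsave/irqrestore pair restores whatever
IRQ state was in effect on entry instead of unconditionally enabling
interrupts on unlock.

  #include <linux/spinlock.h>

  static DEFINE_RAW_SPINLOCK(example_reader_lock);

  static void example_resize_step(void)
  {
          unsigned long flags;

          /* If IRQs were already off (early boot), they stay off afterwards */
          raw_spin_lock_irqsave(&example_reader_lock, flags);
          /* ... move pages around ... */
          raw_spin_unlock_irqrestore(&example_reader_lock, flags);
  }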

Next, where it calls schedule_work_on(), disable migration and check if
the CPU to update is the current CPU, and if so, perform the work
directly, otherwise re-enable migration and call the schedule_work_on() to
the CPU that is being updated. The rb_insert_pages() just needs to be run
on the CPU that it is updating, and does not need preemption nor
interrupts disabled when calling it.
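
And a sketch of the second step (do_update() stands in for the existing
update handler; the real code is in the hunks below):

  #include <linux/preempt.h>
  #include <linux/smp.h>
  #include <linux/workqueue.h>

  static void do_update(struct work_struct *work);  /* placeholder */

  static void example_update_on_cpu(int cpu, struct work_struct *work)
  {
          migrate_disable();
          if (cpu == smp_processor_id()) {
                  /* Already on the target CPU: do the update directly */
                  do_update(work);
                  migrate_enable();
          } else {
                  migrate_enable();
                  /* Let the target CPU do it from a worker */
                  schedule_work_on(cpu, work);
          }
  }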

Link: https://lore.kernel.org/lkml/Y5J%2FCajlNh1gexvo@google.com/
Link: https://lore.kernel.org/linux-trace-kernel/20221209101151.1fec1167@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>

Fixes: a01fdc897fa5 ("tracing: Add trace_trigger kernel command line option")
Reported-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 843818ee4814..c366a0a9ddba 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2062,8 +2062,10 @@ rb_insert_pages(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	struct list_head *pages = &cpu_buffer->new_pages;
 	int retries, success;
+	unsigned long flags;
 
-	raw_spin_lock_irq(&cpu_buffer->reader_lock);
+	/* Can be called at early boot up, where interrupts must not be enabled */
+	raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
 	/*
 	 * We are holding the reader lock, so the reader page won't be swapped
 	 * in the ring buffer. Now we are racing with the writer trying to
@@ -2120,7 +2122,7 @@ rb_insert_pages(struct ring_buffer_per_cpu *cpu_buffer)
 	 * tracing
 	 */
 	RB_WARN_ON(cpu_buffer, !success);
-	raw_spin_unlock_irq(&cpu_buffer->reader_lock);
+	raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	/* free pages if they weren't inserted */
 	if (!success) {
@@ -2248,8 +2250,16 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 				rb_update_pages(cpu_buffer);
 				cpu_buffer->nr_pages_to_update = 0;
 			} else {
-				schedule_work_on(cpu,
-						&cpu_buffer->update_pages_work);
+				/* Run directly if possible. */
+				migrate_disable();
+				if (cpu != smp_processor_id()) {
+					migrate_enable();
+					schedule_work_on(cpu,
+							 &cpu_buffer->update_pages_work);
+				} else {
+					update_pages_handler(&cpu_buffer->update_pages_work);
+					migrate_enable();
+				}
 			}
 		}
 
@@ -2298,9 +2308,17 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
 		if (!cpu_online(cpu_id))
 			rb_update_pages(cpu_buffer);
 		else {
-			schedule_work_on(cpu_id,
-					 &cpu_buffer->update_pages_work);
-			wait_for_completion(&cpu_buffer->update_done);
+			/* Run directly if possible. */
+			migrate_disable();
+			if (cpu_id == smp_processor_id()) {
+				rb_update_pages(cpu_buffer);
+				migrate_enable();
+			} else {
+				migrate_enable();
+				schedule_work_on(cpu_id,
+						 &cpu_buffer->update_pages_work);
+				wait_for_completion(&cpu_buffer->update_done);
+			}
 		}
 
 		cpu_buffer->nr_pages_to_update = 0;
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 19/25] tracing: remove unnecessary trace_trigger ifdef
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (17 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 18/25] ring-buffer: Handle resize in early boot up Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 20/25] tracing/osnoise: Make osnoise_options static Steven Rostedt
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Mathieu Desnoyers,
	Joel Fernandes, Tom Zanussi, Ross Zwisler

From: Ross Zwisler <zwisler@chromium.org>

The trace_trigger command line option introduced by
commit a01fdc897fa5 ("tracing: Add trace_trigger kernel command line option")
doesn't need to depend on the CONFIG_HIST_TRIGGERS kernel config option.

This code doesn't depend on the histogram code, and the run-time
selection of triggers is usable without CONFIG_HIST_TRIGGERS.

Link: https://lore.kernel.org/linux-trace-kernel/20221209003310.1737039-1-zwisler@google.com

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Fixes: a01fdc897fa5 ("tracing: Add trace_trigger kernel command line option")
Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 3bfaf560ecc4..33e0b4f8ebe6 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2796,7 +2796,6 @@ trace_create_new_event(struct trace_event_call *call,
 	return file;
 }
 
-#ifdef CONFIG_HIST_TRIGGERS
 #define MAX_BOOT_TRIGGERS 32
 
 static struct boot_triggers {
@@ -2832,7 +2831,6 @@ static __init int setup_trace_triggers(char *str)
 	return 1;
 }
 __setup("trace_trigger=", setup_trace_triggers);
-#endif
 
 /* Add an event to a trace directory */
 static int
@@ -2850,7 +2848,6 @@ __trace_add_new_event(struct trace_event_call *call, struct trace_array *tr)
 		return event_define_fields(call);
 }
 
-#ifdef CONFIG_HIST_TRIGGERS
 static void trace_early_triggers(struct trace_event_file *file, const char *name)
 {
 	int ret;
@@ -2868,9 +2865,6 @@ static void trace_early_triggers(struct trace_event_file *file, const char *name
 			       bootup_triggers[i].event);
 	}
 }
-#else
-static inline void trace_early_triggers(struct trace_event_file *file, const char *name) { }
-#endif
 
 /*
  * Just create a descriptor for early init. A descriptor is required
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 20/25] tracing/osnoise: Make osnoise_options static
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (18 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 19/25] tracing: remove unnecessary trace_trigger ifdef Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 21/25] tracing: Fix some checker warnings Steven Rostedt
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, kernel test robot,
	Daniel Bristot de Oliveira

From: Daniel Bristot de Oliveira <bristot@kernel.org>

Make osnoise_options static, as reported by the kernel test robot.

Link: https://lkml.kernel.org/r/63255826485400d7a2270e9c5e66111079671e7a.1670228712.git.bristot@kernel.org

Reported-by: kernel test robot <lkp@intel.com>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_osnoise.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 3f10dd1f2f1c..8ba82c71268f 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -59,8 +59,8 @@ enum osnoise_options_index {
 
 static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD" };
 
-#define OSN_DEFAULT_OPTIONS	0x2
-unsigned long osnoise_options	= OSN_DEFAULT_OPTIONS;
+#define OSN_DEFAULT_OPTIONS		0x2
+static unsigned long osnoise_options	= OSN_DEFAULT_OPTIONS;
 
 /*
  * trace_array of the enabled osnoise/timerlat instances.
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 21/25] tracing: Fix some checker warnings
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (19 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 20/25] tracing/osnoise: Make osnoise_options static Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 22/25] Documentation/osnoise: Escape underscore of NO_ prefix Steven Rostedt
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Masami Hiramatsu, Andrew Morton, David Howells

From: David Howells <dhowells@redhat.com>

Fix some checker warnings in the trace code by adding __printf attributes
to a number of trace functions and their declarations.
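
For reference, a minimal sketch of what the attribute buys (the function
here is made up): __printf(n, m) tells the compiler which argument is the
format string and where the variadic arguments start, so mismatched
format specifiers are flagged at build time.

  #include <linux/compiler_attributes.h>

  /* Argument 2 is the format string, variadic arguments start at 3 */
  __printf(2, 3)
  static void example_log(int level, const char *fmt, ...);

  /* example_log(1, "%s", 42) now triggers a -Wformat warning */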

Changes:
========
ver #2)
 - Dropped the fix for the unconditional tracing_max_lat_fops decl[1].

Link: https://lore.kernel.org/r/20221205180617.9b9d3971cbe06ee536603523@kernel.org/ [1]
Link: https://lore.kernel.org/r/166992525941.1716618.13740663757583361463.stgit@warthog.procyon.org.uk/ # v1
Link: https://lkml.kernel.org/r/167023571258.382307.15314866482834835192.stgit@warthog.procyon.org.uk

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/linux/trace_events.h | 3 ++-
 include/linux/trace_seq.h    | 3 ++-
 kernel/trace/trace.h         | 2 +-
 kernel/trace/trace_output.c  | 5 +++--
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index bb2053246d6a..4342e996bcdb 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -234,7 +234,8 @@ void tracing_record_taskinfo_sched_switch(struct task_struct *prev,
 void tracing_record_cmdline(struct task_struct *task);
 void tracing_record_tgid(struct task_struct *task);
 
-int trace_output_call(struct trace_iterator *iter, char *name, char *fmt, ...);
+int trace_output_call(struct trace_iterator *iter, char *name, char *fmt, ...)
+	 __printf(3, 4);
 
 struct event_filter;
 
diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
index 5a2c650d9e1c..0c4c7587d6c3 100644
--- a/include/linux/trace_seq.h
+++ b/include/linux/trace_seq.h
@@ -97,7 +97,8 @@ extern int trace_seq_hex_dump(struct trace_seq *s, const char *prefix_str,
 			      const void *buf, size_t len, bool ascii);
 
 #else /* CONFIG_TRACING */
-static inline void trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
+static inline __printf(2, 3)
+void trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
 {
 }
 static inline void
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 9dc920b01c17..e46a49269be2 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -614,7 +614,7 @@ void trace_buffer_unlock_commit_nostack(struct trace_buffer *buffer,
 bool trace_is_tracepoint_string(const char *str);
 const char *trace_event_format(struct trace_iterator *iter, const char *fmt);
 void trace_check_vprintf(struct trace_iterator *iter, const char *fmt,
-			 va_list ap);
+			 va_list ap) __printf(2, 0);
 
 int trace_empty(struct trace_iterator *iter);
 
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index f0ba97121345..57a13b61f186 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -322,8 +322,9 @@ void trace_event_printf(struct trace_iterator *iter, const char *fmt, ...)
 }
 EXPORT_SYMBOL(trace_event_printf);
 
-static int trace_output_raw(struct trace_iterator *iter, char *name,
-			    char *fmt, va_list ap)
+static __printf(3, 0)
+int trace_output_raw(struct trace_iterator *iter, char *name,
+		     char *fmt, va_list ap)
 {
 	struct trace_seq *s = &iter->seq;
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 22/25] Documentation/osnoise: Escape underscore of NO_ prefix
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (20 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 21/25] tracing: Fix some checker warnings Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 23/25] tracing/osnoise: Add PANIC_ON_STOP option Steven Rostedt
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Jonathan Corbet, Ammar Faizi,
	GNU/Weeb Mailing List, kernel test robot, Bagas Sanjaya,
	Daniel Bristot de Oliveira

From: Bagas Sanjaya <bagasdotme@gmail.com>

The kernel test robot reported an unknown target name warning:

Documentation/trace/osnoise-tracer.rst:112: WARNING: Unknown target name: "no".

Because of the unescaped underscore, the NO_ prefix is rendered as link
text instead, which points to a non-existent link target.

Escape the prefix underscore to fix the warning.

Link: https://lkml.kernel.org/r/20221125034300.24168-1-bagasdotme@gmail.com

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Cc: GNU/Weeb Mailing List <gwml@vger.gnuweeb.org>
Link: https://lore.kernel.org/linux-doc/202211240447.HxRNftE5-lkp@intel.com/
Fixes: 67543cd6b8eee5 ("Documentation/osnoise: Add osnoise/options documentation")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 Documentation/trace/osnoise-tracer.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/trace/osnoise-tracer.rst b/Documentation/trace/osnoise-tracer.rst
index 3c675ed82b27..fdd562d7c22d 100644
--- a/Documentation/trace/osnoise-tracer.rst
+++ b/Documentation/trace/osnoise-tracer.rst
@@ -111,7 +111,7 @@ The tracer has a set of options inside the osnoise directory, they are:
    be used, which is currently 5 us.
  - osnoise/options: a set of on/off options that can be enabled by
    writing the option name to the file or disabled by writing the option
-   name preceded with the 'NO_' prefix. For example, writing
+   name preceded with the 'NO\_' prefix. For example, writing
    NO_OSNOISE_WORKLOAD disables the OSNOISE_WORKLOAD option. The
    special DEAFAULTS option resets all options to the default value.
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 23/25] tracing/osnoise: Add PANIC_ON_STOP option
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (21 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 22/25] Documentation/osnoise: Escape underscore of NO_ prefix Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 24/25] tracing/osnoise: Add preempt and/or irq disabled options Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 25/25] Documentation/osnoise: Add osnoise/options documentation Steven Rostedt
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Juri Lelli, Clark Williams,
	Bagas Sanjaya, Daniel Bristot de Oliveira, Jonathan Corbet

From: Daniel Bristot de Oliveira <bristot@kernel.org>

Often the latency observed in a CPU is not caused by the work being done
in the CPU itself, but by work done on another CPU that causes the
hardware to stall all CPUs. In this case, it is interesting to know
what is happening on ALL CPUs, and the best way to do this is via
crash dump analysis.

Add the PANIC_ON_STOP option to osnoise/timerlat tracers. The default
behavior is having this option off. When enabled by the user, the system
will panic after hitting a stop tracing condition.
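
A sketch of the check added in the stop-tracing path (the option bit and
message mirror the hunk below; per the osnoise documentation, the option
is turned on by writing PANIC_ON_STOP to osnoise/options):

  #include <linux/bitops.h>
  #include <linux/kernel.h>

  enum { OSN_PANIC_ON_STOP = 2 };         /* value per the enum in the hunk */
  static unsigned long example_options;   /* stands in for osnoise_options */

  static void example_stop_tracing(int cpu)
  {
          if (test_bit(OSN_PANIC_ON_STOP, &example_options))
                  panic("tracer hit stop condition on CPU %d\n", cpu);
  }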

This option was motivated by a real scenario that Juri Lelli and I
were debugging.

Link: https://lkml.kernel.org/r/249ce4287c6725543e6db845a6e0df621dc67db5.1670623111.git.bristot@kernel.org

Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_osnoise.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 8ba82c71268f..5a7613942223 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -54,10 +54,11 @@
 enum osnoise_options_index {
 	OSN_DEFAULTS = 0,
 	OSN_WORKLOAD,
+	OSN_PANIC_ON_STOP,
 	OSN_MAX
 };
 
-static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD" };
+static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD", "PANIC_ON_STOP" };
 
 #define OSN_DEFAULT_OPTIONS		0x2
 static unsigned long osnoise_options	= OSN_DEFAULT_OPTIONS;
@@ -1270,6 +1271,9 @@ static __always_inline void osnoise_stop_tracing(void)
 		trace_array_printk_buf(tr->array_buffer.buffer, _THIS_IP_,
 				"stop tracing hit on cpu %d\n", smp_processor_id());
 
+		if (test_bit(OSN_PANIC_ON_STOP, &osnoise_options))
+			panic("tracer hit stop condition on CPU %d\n", smp_processor_id());
+
 		tracer_tracing_off(tr);
 	}
 	rcu_read_unlock();
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 24/25] tracing/osnoise: Add preempt and/or irq disabled options
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (22 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 23/25] tracing/osnoise: Add PANIC_ON_STOP option Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  2022-12-10 13:58 ` [for-next][PATCH 25/25] Documentation/osnoise: Add osnoise/options documentation Steven Rostedt
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Suggested-by: Clark Williams,
	Juri Lelli, Bagas Sanjaya, Daniel Bristot de Oliveira,
	Jonathan Corbet

From: Daniel Bristot de Oliveira <bristot@kernel.org>

The osnoise workload runs with preemption and IRQs enabled in such
a way as to allow all sorts of noise to disturb osnoise's execution.
hwlat tracer has a similar workload but works with irq disabled,
allowing only NMIs and the hardware to generate noise.

While thinking about adding an options file to the hwlat tracer to
allow the system to panic, and about other features I was thinking of
adding, like having a tracepoint at each noise detection, it came to
my mind that it is easier to make osnoise also do hardware latency
detection than to make hwlat "feature compatible" with osnoise.

Other points are:
 - osnoise already has an independent cpu file.
 - osnoise has a more intuitive interface, e.g., runtime/period vs.
   window/width (and people often need help remembering what it is).
 - osnoise: tracepoints
 - osnoise stop options
 - osnoise options file itself

Moreover, the user-space side (in rtla) is simplified by reusing the
existing osnoise code.

Finally, people have been asking me about using osnoise for hw latency
detection, and I have to explain that it was sufficient but not
necessary. These options make it sufficient and necessary.
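
A condensed sketch of how the two new options gate the workload's
execution context, summarizing the hunks below (the sampling loop and
error handling are elided):

  bool disable_irq = test_bit(OSN_IRQ_DISABLE, &osnoise_options);
  /* Disabling preemption only matters if IRQs stay enabled */
  bool disable_preemption = !disable_irq &&
                            test_bit(OSN_PREEMPT_DISABLE, &osnoise_options);

  if (disable_irq)
          local_irq_disable();
  if (disable_preemption)
          preempt_disable();

  /* ... run the osnoise sampling loop ... */

  if (disable_preemption)
          preempt_enable();
  if (disable_irq)
          local_irq_enable();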

Adding a Suggested-by Clark, as he often asked me about this
possibility.

Link: https://lkml.kernel.org/r/d9c6c19135497054986900f94c8e47410b15316a.1670623111.git.bristot@kernel.org

Cc: Suggested-by: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_osnoise.c | 48 ++++++++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 5a7613942223..94c1b5eb1dc0 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -55,10 +55,17 @@ enum osnoise_options_index {
 	OSN_DEFAULTS = 0,
 	OSN_WORKLOAD,
 	OSN_PANIC_ON_STOP,
+	OSN_PREEMPT_DISABLE,
+	OSN_IRQ_DISABLE,
 	OSN_MAX
 };
 
-static const char * const osnoise_options_str[OSN_MAX] = { "DEFAULTS", "OSNOISE_WORKLOAD", "PANIC_ON_STOP" };
+static const char * const osnoise_options_str[OSN_MAX] = {
+							"DEFAULTS",
+							"OSNOISE_WORKLOAD",
+							"PANIC_ON_STOP",
+							"OSNOISE_PREEMPT_DISABLE",
+							"OSNOISE_IRQ_DISABLE" };
 
 #define OSN_DEFAULT_OPTIONS		0x2
 static unsigned long osnoise_options	= OSN_DEFAULT_OPTIONS;
@@ -1308,18 +1315,26 @@ static void notify_new_max_latency(u64 latency)
  */
 static int run_osnoise(void)
 {
+	bool disable_irq = test_bit(OSN_IRQ_DISABLE, &osnoise_options);
 	struct osnoise_variables *osn_var = this_cpu_osn_var();
 	u64 start, sample, last_sample;
 	u64 last_int_count, int_count;
 	s64 noise = 0, max_noise = 0;
 	s64 total, last_total = 0;
 	struct osnoise_sample s;
+	bool disable_preemption;
 	unsigned int threshold;
 	u64 runtime, stop_in;
 	u64 sum_noise = 0;
 	int hw_count = 0;
 	int ret = -1;
 
+	/*
+	 * Disabling preemption is only required if IRQs are enabled,
+	 * and the options is set on.
+	 */
+	disable_preemption = !disable_irq && test_bit(OSN_PREEMPT_DISABLE, &osnoise_options);
+
 	/*
 	 * Considers the current thread as the workload.
 	 */
@@ -1335,6 +1350,15 @@ static int run_osnoise(void)
 	 */
 	threshold = tracing_thresh ? : 5000;
 
+	/*
+	 * Apply PREEMPT and IRQ disabled options.
+	 */
+	if (disable_irq)
+		local_irq_disable();
+
+	if (disable_preemption)
+		preempt_disable();
+
 	/*
 	 * Make sure NMIs see sampling first
 	 */
@@ -1422,16 +1446,21 @@ static int run_osnoise(void)
 		 * cond_resched()
 		 */
 		if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
-			local_irq_disable();
+			if (!disable_irq)
+				local_irq_disable();
+
 			rcu_momentary_dyntick_idle();
-			local_irq_enable();
+
+			if (!disable_irq)
+				local_irq_enable();
 		}
 
 		/*
 		 * For the non-preemptive kernel config: let threads runs, if
-		 * they so wish.
+		 * they so wish, unless set not to do so.
 		 */
-		cond_resched();
+		if (!disable_irq && !disable_preemption)
+			cond_resched();
 
 		last_sample = sample;
 		last_int_count = int_count;
@@ -1450,6 +1479,15 @@ static int run_osnoise(void)
 	 */
 	barrier();
 
+	/*
+	 * Return to the preemptive state.
+	 */
+	if (disable_preemption)
+		preempt_enable();
+
+	if (disable_irq)
+		local_irq_enable();
+
 	/*
 	 * Save noise info.
 	 */
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [for-next][PATCH 25/25] Documentation/osnoise: Add osnoise/options documentation
  2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
                   ` (23 preceding siblings ...)
  2022-12-10 13:58 ` [for-next][PATCH 24/25] tracing/osnoise: Add preempt and/or irq disabled options Steven Rostedt
@ 2022-12-10 13:58 ` Steven Rostedt
  24 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Daniel Bristot de Oliveira,
	Juri Lelli, Clark Williams, Jonathan Corbet, Bagas Sanjaya

From: Daniel Bristot de Oliveira <bristot@kernel.org>

Add the documentation about the osnoise/options file, the options,
and some additional explanation about the OSNOISE_WORKLOAD option.

Link: https://lkml.kernel.org/r/fde5567a4bae364f67fd1e9a644d1d62862618a6.1670623111.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 Documentation/trace/osnoise-tracer.rst | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/trace/osnoise-tracer.rst b/Documentation/trace/osnoise-tracer.rst
index fdd562d7c22d..140ef2533d26 100644
--- a/Documentation/trace/osnoise-tracer.rst
+++ b/Documentation/trace/osnoise-tracer.rst
@@ -92,8 +92,8 @@ Note that the example above shows a high number of HW noise samples.
 The reason being is that this sample was taken on a virtual machine,
 and the host interference is detected as a hardware interference.
 
-Tracer options
----------------------
+Tracer Configuration
+--------------------
 
 The tracer has a set of options inside the osnoise directory, they are:
 
@@ -115,6 +115,22 @@ The tracer has a set of options inside the osnoise directory, they are:
    NO_OSNOISE_WORKLOAD disables the OSNOISE_WORKLOAD option. The
    special DEAFAULTS option resets all options to the default value.
 
+Tracer Options
+--------------
+
+The osnoise/options file exposes a set of on/off configuration options for
+the osnoise tracer. These options are:
+
+ - DEFAULTS: reset the options to the default value.
+ - OSNOISE_WORKLOAD: do not dispatch osnoise workload (see dedicated
+   section below).
+ - PANIC_ON_STOP: call panic() if the tracer stops. This option serves to
+   capture a vmcore.
+ - OSNOISE_PREEMPT_DISABLE: disable preemption while running the osnoise
+   workload, allowing only IRQ and hardware-related noise.
+ - OSNOISE_IRQ_DISABLE: disable IRQs while running the osnoise workload,
+   allowing only NMIs and hardware-related noise, like hwlat tracer.
+
 Additional Tracing
 ------------------
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 13:58 ` [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace() Steven Rostedt
@ 2022-12-10 17:47   ` Paul E. McKenney
  2022-12-10 18:34     ` Steven Rostedt
  0 siblings, 1 reply; 35+ messages in thread
From: Paul E. McKenney @ 2022-12-10 17:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Andrew Morton, Karol Herbst,
	Pekka Paalanen, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov

On Sat, Dec 10, 2022 at 08:58:03AM -0500, Steven Rostedt wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
> 
> The mmiotrace tracer is "special". The purpose is to help reverse engineer
> binary drivers by removing the memory allocated by the driver and when the
> driver goes to access it, a fault occurs, the mmiotracer will record what
> the driver was doing and then do the work on its behalf by single stepping
> through the process.
> 
> But to achieve this ability, it must do some special things. One is to
> take the rcu_read_lock() when the fault occurs, and then release it in the
> breakpoint that is single stepping. This makes lockdep unhappy, as it
> changes the state of RCU from within an exception that is not contained in
> that exception, and we get a nasty splat from lockdep.
> 
> Instead, switch to rcu_read_lock_sched_notrace() as the RCU sched variant
> has the same grace period as normal RCU. This is basically the same as
> rcu_read_lock() but does not make lockdep complain about it.
> 
> Note, the preempt_disable() is still needed as it uses preempt_enable_no_resched().
> 
> Link: https://lore.kernel.org/linux-trace-kernel/20221209134144.04f33626@gandalf.local.home
> 
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Karol Herbst <karolherbst@gmail.com>
> Cc: Pekka Paalanen <ppaalanen@gmail.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Executable code can be the best form of comment.  ;-)

This does mess with preempt_count() redundantly, but the overhead from
that should be way down in the noise.

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  arch/x86/mm/kmmio.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
> index edb486450158..853c49877c16 100644
> --- a/arch/x86/mm/kmmio.c
> +++ b/arch/x86/mm/kmmio.c
> @@ -254,7 +254,7 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr)
>  	 * again.
>  	 */
>  	preempt_disable();
> -	rcu_read_lock();
> +	rcu_read_lock_sched_notrace();
>  
>  	faultpage = get_kmmio_fault_page(page_base);
>  	if (!faultpage) {
> @@ -323,7 +323,7 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr)
>  	return 1; /* fault handled */
>  
>  no_kmmio:
> -	rcu_read_unlock();
> +	rcu_read_unlock_sched_notrace();
>  	preempt_enable_no_resched();
>  	return ret;
>  }
> @@ -363,7 +363,7 @@ static int post_kmmio_handler(unsigned long condition, struct pt_regs *regs)
>  	/* These were acquired in kmmio_handler(). */
>  	ctx->active--;
>  	BUG_ON(ctx->active);
> -	rcu_read_unlock();
> +	rcu_read_unlock_sched_notrace();
>  	preempt_enable_no_resched();
>  
>  	/*
> -- 
> 2.35.1
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 17:47   ` Paul E. McKenney
@ 2022-12-10 18:34     ` Steven Rostedt
  2022-12-10 21:34       ` Paul E. McKenney
  2022-12-10 23:30       ` Thomas Gleixner
  0 siblings, 2 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 18:34 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Masami Hiramatsu, Andrew Morton, Karol Herbst,
	Pekka Paalanen, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov

On Sat, 10 Dec 2022 09:47:53 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > Note, the preempt_disable() is still needed as it uses preempt_enable_no_resched().
> > 

 ...

> Executable code can be the best form of comment.  ;-)
> 
> This does mess with preempt_count() redundantly, but the overhead from
> that should be way down in the noise.

I was going to remove it, but then I realized that it would be a functional
change, as from the comment above, it uses "preempt_enable_no_resched(),
which there is not a rcu_read_unlock_sched() variant.

> 
> Acked-by: Paul E. McKenney <paulmck@kernel.org>

Thanks! I'll add this to the commit.

-- Steve

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 18:34     ` Steven Rostedt
@ 2022-12-10 21:34       ` Paul E. McKenney
  2022-12-10 22:32         ` Steven Rostedt
  2022-12-10 23:30       ` Thomas Gleixner
  1 sibling, 1 reply; 35+ messages in thread
From: Paul E. McKenney @ 2022-12-10 21:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Andrew Morton, Karol Herbst,
	Pekka Paalanen, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov

On Sat, Dec 10, 2022 at 01:34:25PM -0500, Steven Rostedt wrote:
> On Sat, 10 Dec 2022 09:47:53 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > > Note, the preempt_disable() is still needed as it uses preempt_enable_no_resched().
> > > 
> 
>  ...
> 
> > Executable code can be the best form of comment.  ;-)
> > 
> > This does mess with preempt_count() redundantly, but the overhead from
> > that should be way down in the noise.
> 
> I was going to remove it, but then I realized that it would be a functional
> change, as from the comment above, it uses "preempt_enable_no_resched(),
> which there is not a rcu_read_unlock_sched() variant.

If this happens often enough, it might be worth adding something like
rcu_read_unlock_sched_no_resched(), but we clearly are not there yet.
Especially not with a name like that!  ;-)

							Thanx, Paul

> > Acked-by: Paul E. McKenney <paulmck@kernel.org>
> 
> Thanks! I'll add this to the commit.
> 
> -- Steve

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 21:34       ` Paul E. McKenney
@ 2022-12-10 22:32         ` Steven Rostedt
  2022-12-11  5:52           ` Paul E. McKenney
  0 siblings, 1 reply; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 22:32 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Masami Hiramatsu, Andrew Morton, Karol Herbst,
	Pekka Paalanen, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov

On Sat, 10 Dec 2022 13:34:12 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > I was going to remove it, but then I realized that it would be a functional
> > change: as the comment above notes, it uses preempt_enable_no_resched(),
> > for which there is no rcu_read_unlock_sched() equivalent.
> 
> If this happens often enough, it might be worth adding something like
> rcu_read_unlock_sched_no_resched(), but we clearly are not there yet.
> Especially not with a name like that!  ;-)

Please don't ;-)

This is only to handle the bizarre case that mmio tracing does. Remember,
this tracer is only for those that want to reverse engineer a binary
driver. It's not even SMP safe! When you enable it, it shuts down all but
one CPU. This is actually the reason I worked so hard to keep it working
with lockdep. The shutting down of CPUs has caught so many bugs in other
parts of the kernel! ;-)

Thus, anything the mmio tracer does is considered niche, and not
something to care much about.

-- Steve

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 18:34     ` Steven Rostedt
  2022-12-10 21:34       ` Paul E. McKenney
@ 2022-12-10 23:30       ` Thomas Gleixner
  2022-12-10 23:55         ` Steven Rostedt
  1 sibling, 1 reply; 35+ messages in thread
From: Thomas Gleixner @ 2022-12-10 23:30 UTC (permalink / raw)
  To: Steven Rostedt, Paul E. McKenney
  Cc: linux-kernel, Masami Hiramatsu, Andrew Morton, Karol Herbst,
	Pekka Paalanen, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Ingo Molnar, Borislav Petkov

On Sat, Dec 10 2022 at 13:34, Steven Rostedt wrote:
> On Sat, 10 Dec 2022 09:47:53 -0800 "Paul E. McKenney" <paulmck@kernel.org> wrote:
>> This does mess with preempt_count() redundantly, but the overhead from
>> that should be way down in the noise.
>
> I was going to remove it, but then I realized that it would be a functional
> change: as the comment above notes, it uses preempt_enable_no_resched(),
> for which there is no rcu_read_unlock_sched() equivalent.

preempt_enable_no_resched() in this context is simply garbage.

preempt_enable_no_resched() tries to avoid the overhead of checking
whether rescheduling is due after decrementing preempt_count(), because
the code which uses it claims to know that it is _not_ the outermost
section bringing preempt_count() back to the preemptible state.

I concede that there are hot paths which actually can benefit, but this
code has exactly _ZERO_ benefit from that. Taking that tracing exception
and handling it is orders of magnitude more expensive than a regular
preempt_enable().

So just get rid of it and don't proliferate cargo cult programming.
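
For reference, the difference being pointed at is roughly this
(simplified from include/linux/preempt.h on a preemptible kernel):

	#define preempt_enable() \
	do { \
		barrier(); \
		if (unlikely(preempt_count_dec_and_test())) \
			__preempt_schedule(); \
	} while (0)

	#define preempt_enable_no_resched() \
	do { \
		barrier(); \
		preempt_count_dec(); \
	} while (0)

All the _no_resched() variant saves is the preempt_count_dec_and_test()
check, which is noise next to the cost of taking and handling the
tracing exception in the first place.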

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 23:30       ` Thomas Gleixner
@ 2022-12-10 23:55         ` Steven Rostedt
  2022-12-12 10:51           ` Thomas Gleixner
  0 siblings, 1 reply; 35+ messages in thread
From: Steven Rostedt @ 2022-12-10 23:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul E. McKenney, linux-kernel, Masami Hiramatsu, Andrew Morton,
	Karol Herbst, Pekka Paalanen, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Ingo Molnar, Borislav Petkov

On Sun, 11 Dec 2022 00:30:36 +0100
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Sat, Dec 10 2022 at 13:34, Steven Rostedt wrote:
> > On Sat, 10 Dec 2022 09:47:53 -0800 "Paul E. McKenney" <paulmck@kernel.org> wrote:  
> >> This does mess with preempt_count() redundantly, but the overhead from
> >> that should be way down in the noise.  
> >
> > I was going to remove it, but then I realized that it would be a functional
> > change: as the comment above notes, it uses preempt_enable_no_resched(),
> > for which there is no rcu_read_unlock_sched() equivalent.
> 
> preempt_enable_no_resched() in this context is simply garbage.
> 
> preempt_enable_no_resched() tries to avoid the overhead of checking
> whether rescheduling is due after decrementing preempt_count(), because
> the code which uses it claims to know that it is _not_ the outermost
> section bringing preempt_count() back to the preemptible state.
> 
> I concede that there are hot paths which actually can benefit, but this
> code has exactly _ZERO_ benefit from that. Taking that tracing exception
> and handling it is orders of magnitude more expensive than a regular
> preempt_enable().
> 
> So just get rid of it and don't proliferate cargo cult programming.
> 

The point of the patch is to just fix the lockdep issue. I'm happy to
remove that "no_resched" (I was planning to), but that would be a separate
change, with a different purpose, and thus a separate patch.

-- Steve


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 22:32         ` Steven Rostedt
@ 2022-12-11  5:52           ` Paul E. McKenney
  0 siblings, 0 replies; 35+ messages in thread
From: Paul E. McKenney @ 2022-12-11  5:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Masami Hiramatsu, Andrew Morton, Karol Herbst,
	Pekka Paalanen, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov

On Sat, Dec 10, 2022 at 05:32:27PM -0500, Steven Rostedt wrote:
> On Sat, 10 Dec 2022 13:34:12 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > > I was going to remove it, but then I realized that it would be a functional
> > > change: as the comment above notes, it uses preempt_enable_no_resched(),
> > > for which there is no rcu_read_unlock_sched() equivalent.
> > 
> > If this happens often enough, it might be worth adding something like
> > rcu_read_unlock_sched_no_resched(), but we clearly are not there yet.
> > Especially not with a name like that!  ;-)
> 
> Please don't ;-)
> 
> This is only to handle the bizarre case that mmio tracing does. Remember,
> this tracer is only for those that want to reverse engineer a binary
> driver. It's not even SMP safe! When you enable it, it shuts down all but
> one CPU. This is actually the reason I worked so hard to keep it working
> with lockdep. The shutting down of CPUs has caught so many bugs in other
> parts of the kernel! ;-)
> 
> Thus, anything the mmio tracer does is considered niche, and not
> something to care much about.

Agreed, as I said, we are clearly not there yet.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-10 23:55         ` Steven Rostedt
@ 2022-12-12 10:51           ` Thomas Gleixner
  2022-12-12 15:42             ` Steven Rostedt
  0 siblings, 1 reply; 35+ messages in thread
From: Thomas Gleixner @ 2022-12-12 10:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, linux-kernel, Masami Hiramatsu, Andrew Morton,
	Karol Herbst, Pekka Paalanen, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Ingo Molnar, Borislav Petkov

On Sat, Dec 10 2022 at 18:55, Steven Rostedt wrote:
> On Sun, 11 Dec 2022 00:30:36 +0100
> Thomas Gleixner <tglx@linutronix.de> wrote:
>> I concede that there are hot paths which actually can benefit, but this
>> code has exactly _ZERO_ benefit from that. Taking that tracing exception
>> and handling it is orders of magnitude more expensive than a regular
>> preempt_enable().
>> 
>> So just get rid of it and don't proliferate cargo cult programming.
>> 
> The point of the patch is to just fix the lockdep issue. I'm happy to
> remove that "no_resched" (I was planning to), but that would be a separate
> change, with a different purpose, and thus a separate patch.

Right, but please make that part of the series.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace()
  2022-12-12 10:51           ` Thomas Gleixner
@ 2022-12-12 15:42             ` Steven Rostedt
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-12-12 15:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul E. McKenney, linux-kernel, Masami Hiramatsu, Andrew Morton,
	Karol Herbst, Pekka Paalanen, Dave Hansen, Andy Lutomirski,
	Peter Zijlstra, Ingo Molnar, Borislav Petkov

On Mon, 12 Dec 2022 11:51:51 +0100
Thomas Gleixner <tglx@linutronix.de> wrote:

> Right, but please make that part of the series.

I just pushed out a patch to do this.

  https://lore.kernel.org/all/20221212103703.7129cc5d@gandalf.local.home/

Feel free to ack it.

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-12-12 15:42 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-10 13:57 [for-next][PATCH 00/25] tracing: Updates for 6.2 Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 01/25] tracing/user_events: Fix call print_fmt leak Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 02/25] tracing: Update MAINTAINERS file for new patchwork and mailing list Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 03/25] ftrace/x86: Add back ftrace_expected for ftrace bug reports Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 04/25] tracing: Allow multiple hitcount values in histograms Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 05/25] tracing: Add .percent suffix option to histogram values Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 06/25] tracing: Add .graph suffix option to histogram value Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 07/25] tracing: Add nohitcount option for suppressing display of raw hitcount Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 08/25] tracing: docs: Update histogram doc for .percent/.graph and nohitcount Steven Rostedt
2022-12-10 13:57 ` [for-next][PATCH 09/25] trace/kprobe: remove duplicated calls of ring_buffer_event_data Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 10/25] tracing/probes: Handle system names with hyphens Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 11/25] tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 12/25] x86/mm/kmmio: Switch to arch_spin_lock() Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 13/25] x86/mm/kmmio: Use rcu_read_lock_sched_notrace() Steven Rostedt
2022-12-10 17:47   ` Paul E. McKenney
2022-12-10 18:34     ` Steven Rostedt
2022-12-10 21:34       ` Paul E. McKenney
2022-12-10 22:32         ` Steven Rostedt
2022-12-11  5:52           ` Paul E. McKenney
2022-12-10 23:30       ` Thomas Gleixner
2022-12-10 23:55         ` Steven Rostedt
2022-12-12 10:51           ` Thomas Gleixner
2022-12-12 15:42             ` Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 14/25] tracing/hist: Fix wrong return value in parse_action_params() Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 15/25] tracing/hist: Fix out-of-bound write on action_data.var_ref_idx Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 16/25] tracing: Fix issue of missing one synthetic field Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 17/25] tracing/hist: Fix issue of losting command info in error_log Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 18/25] ring-buffer: Handle resize in early boot up Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 19/25] tracing: remove unnecessary trace_trigger ifdef Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 20/25] tracing/osnoise: Make osnoise_options static Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 21/25] tracing: Fix some checker warnings Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 22/25] Documentation/osnoise: Escape underscore of NO_ prefix Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 23/25] tracing/osnoise: Add PANIC_ON_STOP option Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 24/25] tracing/osnoise: Add preempt and/or irq disabled options Steven Rostedt
2022-12-10 13:58 ` [for-next][PATCH 25/25] Documentation/osnoise: Add osnoise/options documentation Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).